WO2005045704A1 - Embedding a session description message in a real-time control protocol (RTCP) message

Info

Publication number
WO2005045704A1
Authority
WO
WIPO (PCT)
Prior art keywords
field
stream
length
streams
message
Prior art date
Application number
PCT/US2004/024065
Other languages
French (fr)
Inventor
Anders E. Klemets
Eduardo P. Oliveira
James M. Alkove
Original Assignee
Microsoft Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/693,430 (US7586938B2)
Application filed by Microsoft Corporation
Priority to BR0406609-0A (BRPI0406609A)
Priority to EP04779232A (EP1676216B1)
Priority to AU2004287133A (AU2004287133B2)
Priority to MXPA05007090A (MXPA05007090A)
Priority to CA2512191A (CA2512191C)
Priority to JP2006536573A (JP4603551B2)
Publication of WO2005045704A1
Priority to HK06112656.4A (HK1092237A1)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40: Business processes related to the transportation industry
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60: Network streaming of media packets
    • H04L65/65: Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]

Definitions

  • This invention relates to streaming media and data transfers, and particularly to embedding a session description message in an RTCP message.
  • Streaming, such as the streaming of audio, video, and/or text, is becoming increasingly popular.
  • The term "streaming" is typically used to indicate that the data representing the content is provided over a network to a client computer on an as-needed basis rather than being pre-delivered in its entirety before playback.
  • The client computer renders streaming content as it is received from a network server, rather than waiting for an entire "file" to be delivered.
  • The widespread availability of streaming multimedia content enables a variety of informational content that was not previously available over the Internet or other computer networks. Live content is one significant example of such content.
  • Using streaming multimedia, audio, video, or audio/visual coverage of noteworthy events can be broadcast over the Internet as the events unfold.
  • SDP Session Description Protocol
  • RFC Network Working Group Request for Comments
  • SDP has been developed as an application level protocol intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation.
  • SDP can be used in accordance with other protocols, such as the Real-Time Streaming Protocol (RTSP) or the HyperText Transfer Protocol (HTTP), to describe and/or negotiate properties of a multimedia session used for delivery of streaming data.
  • RTSP Real-Time Streaming Protocol
  • HTTP HyperText Transfer Protocol
  • the network server may be streaming a series of multimedia presentations to the client computer, such as presentations listed in a play list.
  • When the network server switches from streaming one presentation to the next, it is often difficult for the SDP information for the next presentation to be made available to the client computer.
  • Content streaming may also be multicast.
  • Conventional approaches to multicasting of streaming content typically involve providing content to be multicast to a server, which then multicasts the content over a network (i.e., without feedback from the clients receiving the streams).
  • The server typically multicasts the content in several streams having different formats (e.g., bit rates, languages, encoding schemes, etc.). Clients attached to the network can then receive the stream(s) appropriate for their resources.
  • one multicast approach requires that the server provide a file that provides "multicast information" that allows clients to open streams of content. Maintaining and publishing this file is typically a manual process that has a relatively high administrative cost. Further, if not properly maintained and published, clients may encounter problems, which can lead to customer dissatisfaction. Another problem with this approach is that clients must keep their "multicast information" up-to-date so that they can properly access the content. This problem is exacerbated for clients that do not have a suitable back channel to request updates (e.g., clients with unidirectional satellite links).
  • an RTCP message that embeds a session description message includes at least three fields.
  • the first field contains data identifying the RTCP message as being a type that embeds a session description message.
  • the second field contains data that is the session description message for a media presentation.
  • the third field contains data identifying a length of the RTCP message, generated by summing the length of the first field, the length of the second field, and the length of the third field.
  • the RTCP message is created at a device, such as a server device.
  • the session description message embedded within the RTCP message is associated with one of a plurality of pieces of media content in a play list of media content being streamed from the device to the recipient.
  • multimedia presentations are multicast using an announcement channel that includes presentation description information along with multiple channels for multiple streams of multimedia data to accommodate clients of different multimedia resources. Clients can use the announcement channel to select channel(s) appropriate for their multimedia resources.
  • the channels are created in a predetermined manner (e.g., preselected logical addresses, preselected ports of an IP address, etc.) so that clients can immediately join a channel without (or concurrently with) joining the announcement channel to reduce startup latency.
  • an acceleration channel may be created that provides blocks of data containing the current unit of the multimedia presentation along with a preselected number of previous units at a bit rate that is "faster than real-time" (i.e., at a rate that is faster than the bit-rate of the multimedia streams). This feature allows clients with suitable resources to more quickly buffer sufficient data to begin presenting the multimedia data to users.
  • The acceleration channel need not be "faster than real-time" so that a client may concurrently join both the acceleration channel and another channel that multicasts multimedia data so that, in effect, the client receives the multimedia data at a rate that is "faster than real-time."
  • FIG. 1 illustrates an example network environment that can be used to stream media using the session description message embedded in an RTCP message as described herein.
  • FIG. 2 illustrates example client and server devices that can stream media content using the session description message embedded in an RTCP message as described herein.
  • FIG. 3 illustrates example client and server devices in a multicast environment that can stream media content using the session description message embedded in an RTCP message as described herein.
  • FIG. 4 illustrates example client and server devices in a server-side play list environment that can stream media content using the session description message embedded in an RTCP message as described herein.
  • FIG. 5 illustrates an example format of an RTCP message having an embedded session description message.
  • FIG. 6 illustrates an example session description message format.
  • FIG. 7 is a flowchart illustrating an example process for embedding session description messages in an RTCP message when using a play list.
  • FIG. 8 is a flowchart illustrating an example process for receiving session description messages in an RTCP message when using a play list.
  • Embedding a session description message in a Real-Time Control Protocol (RTCP) message is discussed herein.
  • a multimedia or single media presentation is streamed from a media content source, such as a server device, to a recipient, such as a client device, using Real-Time Transport Protocol (RTP) packets.
  • RTP Real-Time Transport Protocol
  • Control information regarding the presentation being streamed is also sent from the media content source to the recipient using RTCP messages.
  • Embedded within at least some of the RTCP messages is a session description message that describes the presentation being streamed.
  • the media content source can be any source of media content, an example of which is a server device.
  • FIG. 1 illustrates an example network environment 100 that can be used to stream media using the session description message embedded in an RTCP message as described herein.
  • Multiple (a) client computing devices 102(1), 102(2), . . ., 102(a) are coupled to multiple (b) server computing devices 104(1), 104(2), . . ., 104(b) via a network 106.
  • Network 106 is intended to represent any of a variety of conventional network topologies and types (including wired and/or wireless networks), employing any of a variety of conventional network protocols (including public and/or proprietary protocols).
  • Network 106 may include, for example, the Internet as well as possibly at least portions of one or more local area networks (LANs).
  • Computing devices 102 and 104 can each be any of a variety of conventional computing devices, including desktop PCs, workstations, mainframe computers, Internet appliances, gaming consoles, handheld PCs, cellular telephones, personal digital assistants (PDAs), etc.
  • PDAs personal digital assistants
  • One or more of devices 102 and 104 can be the same types of devices, or alternatively different types of devices.
  • Server devices 104 can make any of a variety of data available for streaming to clients 102.
  • the term "streaming" is used to indicate that the data representing the media is provided over a network to a client device and that playback of the content can begin prior to the content being delivered in its entirety (e.g., providing the data on an as-needed basis rather than pre-delivering the data in its entirety before playback).
  • the data may be publicly available or alternatively restricted (e.g., restricted to only certain users, available only if the appropriate fee is paid, etc.).
  • The data may be any of a variety of one or more types of content, such as audio, video, text, animation, etc. Additionally, the data may be pre-recorded or alternatively "live" (e.g., a digital representation of a concert being captured as the concert is performed and made available for streaming shortly after capture).
  • a client device 102 may receive streaming media from a server 104 that stores the streaming media content as a file, or alternatively from a server 104 that receives the streaming media from some other source.
  • server 104 may receive the streaming media from another server that stores the streaming media content as a file, or may receive the streaming media from some other source (e.g., an encoder that is encoding a "live" event).
  • streaming media refers to streaming one or more media streams from one device to another (e.g., from a server device 104 to a client device 102).
  • the media streams can include any of a variety of types of content, such as one or more of audio, video, text, and so forth.
  • FIG. 2 illustrates example client and server devices that can stream media content using the session description message embedded in an RTCP message as described herein.
  • Multiple different protocols are typically followed at both client device 102 and server device 104 in order to stream media content from server device 104 to client device 102. These different protocols can be responsible for different aspects of the streaming process.
  • one or more additional devices e.g., firewalls, routers, gateways, bridges, etc.
  • an application level protocol 150, a transport protocol 152, and one or more delivery channel protocols 154 are used as part of the streaming process. Additional protocols not shown in FIG.
  • Application level protocol 150 is a protocol at the application level for control of the delivery of data with real-time properties.
  • Application level protocol 150 provides a framework, optionally extensible, to enable controlled, on-demand delivery of real-time data, such as streaming audio and video content.
  • Application level protocol 150 is a control protocol for initiating and directing delivery of streaming multimedia from media servers.
  • Examples of application level protocol 150 include the Real-Time Streaming Protocol (RTSP) as described in Network Working Group Request for Comments (RFC) 2326, April 1998, and the HyperText Transfer Protocol (HTTP) as described in Network Working Group Request for Comments (RFC) 1945, May 1996 or Network Working Group Request for Comments (RFC) 2068, January 1997.
  • Application level protocol 150 uses transport protocol 152 for the delivery of real-time data, such as streaming audio and video.
  • Transport protocol 152 defines a packet format for media streams.
  • Transport protocol 152 provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services.
  • Examples of transport protocol 152 include the Real-Time Transport Protocol (RTP) and the Real-Time Control Protocol (RTCP) as described in Network Working Group Request for Comments (RFC) 3550, July 2003. Other versions, such as future draft or standardized versions, of RTP and RTCP may also be used. RTP does not address resource reservation and does not guarantee quality-of-service for real-time services.
  • the data transport is augmented by a control protocol (RTCP) to allow monitoring of the data delivery in a manner scalable to large multicast networks, and to provide some control and identification functionality.
  • the RTCP protocol groups one or more control messages together into a unit referred to as an RTCP packet. Embedded within one or more of the RTCP packets is a control message that includes a session description message.
  • the session description message describes properties of the multimedia presentation being streamed from server device 104 to client device 102.
  • the streaming media from server device 104 to client device 102 thus includes the session description message.
  • the transport protocol 152 uses delivery channel protocol(s) 154 for the transport connections.
  • Delivery channel protocol(s) 154 include one or more channels for transporting packets of data from server device 104 to client device 102. Each channel is typically used to send data packets for a single media stream, although in alternate embodiments a single channel may be used to send data packets for multiple media streams. Examples of delivery channel protocols 154 include Transmission Control Protocol (TCP) packets and User Datagram Protocol (UDP) packets. TCP ensures the delivery of data packets, whereas UDP does not ensure the delivery of data packets.
  • TCP Transmission Control Protocol
  • UDP User Datagram Protocol
  • FIG. 3 illustrates example client and server devices in a multicast environment that can stream media content using the session description message embedded in an RTCP message as described herein.
  • the protocols 150, 152, and 154 of FIG. 2 are included in the client and server devices of FIG. 3, but have not been illustrated.
  • one or more additional devices e.g., firewalls, routers, gateways, bridges, etc.
  • a streaming module 182 of server device 104 streams the same multimedia presentation to each of multiple (x) client devices 102(1), 102(2), . .
  • Each client device 102 has a streaming media player 184 that receives the streamed multimedia presentation and processes the received stream at the client device 102, typically playing back the multimedia presentation at the client device 102.
  • the same data is streamed to each client device 102 at approximately the same time, allowing server device 104 to stream only one occurrence of the same multimedia presentation at a time, with the various client devices 102 listening in to this one occurrence being streamed.
  • the streaming media 186 includes RTCP messages having one or more session description messages embedded therein. The same session description message may be broadcast multiple times during the streaming of the multimedia presentation, thereby allowing new client devices 102 to listen in to the streaming media after streaming has begun but still receive a session description message describing the multimedia presentation.
  • FIG. 4 illustrates example client and server devices in a server-side play list environment that can stream media content using the session description message embedded in an RTCP message as described herein.
  • the protocols 150, 152, and 154 of FIG. 2 are included in the client and server devices of FIG.
  • a streaming module 202 of server device 104 streams a multimedia presentation as streaming media 204 to a streaming media player 206 of client device 102.
  • Streaming media player 206 receives the streamed multimedia presentation and processes the received stream at the client device 102, typically playing back the multimedia presentation at the client device 102.
  • Server device 104 includes a play list 208 that identifies multiple (y) pieces of media content 210(1), 210(2), . . ., 210(y).
  • a play list 208 includes multiple entries, each entry identifying one of the multiple pieces of media content 210.
  • play list 208 may identify a single piece of media content, although in such situations the single piece of media content could simply be referenced by itself rather than through the use of a play list.
  • a client device 102 is able to select a single resource for playback, that resource identifying play list 208.
  • Streaming module 202 accesses the identified play list 208, and then accesses the individual pieces of media content 210 and streams those pieces 210 to client device 102.
  • the client device 102 is able to access a single resource, yet have multiple different pieces of media content streamed from server device 104.
  • Each piece of media content 210 includes one or more media streams. Different pieces of media content 210 can include different numbers of media streams. Each piece of media content 210 is typically a multimedia presentation. The manner in which a "piece" of content is defined can vary by implementation and based on the type of media. For example, for musical audio and/or video content each song can be a piece of content. Content may be separated into pieces along natural boundaries (e.g., different songs), or alternatively in other arbitrary manners (e.g., every five minutes of content is a piece).
  • In FIGS. 2, 3, and 4, at the transport level, the data to be streamed from server device 104 to client device 102 is embedded in RTP packets. Control information related to the data being streamed and the RTP packets is embedded in one or more control messages within an RTCP packet. Typically, an RTCP packet consists of several messages of different types. The first message in the RTCP packet is either a Receiver Report or a Sender Report.
  • the second message is an SDES (Source Description) message.
  • the SDES message contains one or more textual meta-data items.
  • the SDES message contains a CNAME (canonical name) item.
  • the CNAME item is a persistent transport-level identifier of the media content source and provides a mapping between the RTP synchronization source (SSRC) number and a textual string.
  • the SSRC is a source of a stream of RTP (and RTCP) packets.
  • the CNAME is used so that a sender or receiver that is participating in multiple RTP sessions that belong to the same presentation may use different SSRC values in each RTP session, but keep the CNAME value the same.
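  • The role of the CNAME item can be illustrated with a minimal Python sketch, in which a receiver groups the SSRC values seen in different RTP sessions under the CNAME they share. The function name, the tuple layout, and the sample values are hypothetical and are not taken from the description.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def group_ssrcs_by_cname(
    sdes_items: Iterable[Tuple[str, int, str]]
) -> Dict[str, Dict[str, int]]:
    """Map each CNAME to the SSRC it uses in every RTP session.

    Each input item is (session_id, ssrc, cname); this layout is an
    assumption made for illustration only.
    """
    groups: Dict[str, Dict[str, int]] = defaultdict(dict)
    for session_id, ssrc, cname in sdes_items:
        # The SSRC may differ per session, but the CNAME stays the same.
        groups[cname][session_id] = ssrc
    return dict(groups)


items = [
    ("audio", 0x1111, "server@example"),
    ("video", 0x2222, "server@example"),
    ("audio", 0x3333, "other@example"),
]
groups = group_ssrcs_by_cname(items)
```

  • Here the audio and video sessions of one source end up under a single key even though their SSRC values differ.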
  • An additional type of message that can be included in an RTCP packet is a control message having embedded therein a session description message.
  • the session description message describes properties of the multimedia presentation being streamed from server device 104 to client device 102.
  • Different media formats or protocols can be used for such session description messages.
  • An example of such a media format is the Session Description Protocol (SDP), Network Working Group Request for Comments (RFC) 2327, April 1998.
  • the session description message discussed herein is a message in accordance with the SDP format described in RFC 2327.
  • Although different formats can be used to describe properties of the multimedia presentation, one or more session description messages are sent from server device 104 to client device 102 that include identifier(s) of the properties.
  • a single session description message may be sent by server device 104 for a particular multimedia presentation, or alternatively multiple session description messages may be sent. If multiple session description messages are sent, the multiple messages may include the same information, different information, or overlapping information.
  • a session description message includes, for example, one or more of: an identification of various channels used to multicast the multimedia presentation; descriptions of each media stream available in the multimedia presentation (e.g., indicating the type of stream (e.g., video or audio), a bit-rate of each media stream, a language used in the stream, etc.); error correction information; security/authentication information; encryption information; or digital rights management (DRM) information; etc. It should be noted that in certain situations a session description message can be separated or fragmented across multiple RTCP control messages.
  • FIG. 5 illustrates an example format of an RTCP control message 250 having an embedded session description message.
  • RTCP message 250 is discussed below as including multiple fields (also referred to as portions), each storing various data. It is to be appreciated that these fields can be arranged in different orders than the order in which they are discussed below and shown in FIG. 5.
  • RTCP message 250 includes all of the fields shown in FIG. 5. In alternate embodiments, RTCP message 250 includes fewer than all of the fields shown in FIG. 5, or may include additional fields not shown in FIG. 5.
  • the fields of RTCP message 250 can be viewed as being grouped into three groups: a header 290, an RTP-State block 292, and the session description message 284. Header 290 includes various information about RTCP message 250.
  • RTP-State block 292 is optional, and when included is used to identify RTP-specific information about a stream of the multimedia presentation that is described in the session description message (e.g., to specify the SSRC and initial RTP sequence number of a stream in the session description message).
  • one RTP-State block 292 is associated with and included in RTCP message 250 for each media stream in the multimedia presentation.
  • Session description message 284 is the session description message embedded within RTCP message 250.
  • V (version) field 252 is a 2-bit field that identifies the version of RTP being used, which is the same in RTCP packets as in RTP packets. For example, the version defined by RFC 3550 is 2.
  • P (padding) field 254 is a single bit that, when set (e.g., to a value of 1), indicates that RTCP message 250 contains some additional padding at the end which is not part of the control information. This padding is included in the length field 262, but otherwise should be ignored. The amount of padding is included within the padding itself. In certain implementations, the additional padding is in octets, and the last octet of the padding is a count of how many padding octets are included (including itself) and thus should be ignored.
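  • The padding rule can be sketched in Python as follows, assuming the implementation in which padding is counted in octets and the last padding octet holds the count (including itself). The function names and the 4-octet alignment default are assumptions made for this sketch.

```python
def add_rtcp_padding(message: bytes, multiple: int = 4) -> bytes:
    """Pad to a multiple of `multiple` octets; the last octet is the count."""
    remainder = len(message) % multiple
    if remainder == 0:
        return message  # already aligned, so the P bit would stay clear
    pad_len = multiple - remainder
    # Zero-filled padding, with the final octet holding the padding length.
    return message + bytes(pad_len - 1) + bytes([pad_len])


def strip_rtcp_padding(message: bytes, padding_bit: bool) -> bytes:
    """Return the message without its trailing padding octets."""
    if not padding_bit:
        return message
    if not message:
        raise ValueError("padding bit set on an empty message")
    pad_len = message[-1]
    if pad_len == 0 or pad_len > len(message):
        raise ValueError(f"invalid padding count: {pad_len}")
    return message[:-pad_len]


padded = add_rtcp_padding(b"abcde")
restored = strip_rtcp_padding(padded, padding_bit=True)
```

  • The receiver never needs to know the padding length in advance, because the count travels inside the padding itself.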
  • C (compression) field 256 is a single bit that, when set (e.g., has a value of
  • Res (reserved) field 258 is a 4-bit reserved field. In certain implementations, Res field 258 should be set to zero.
  • PT (payload type) header field 260 is a 7-bit field set to a value (e.g., 141) to indicate that RTCP message 250 embeds a session description message.
  • Length field 262 is a 16-bit field that identifies the length of RTCP message 250. This length can be generated by summing the lengths of the various fields in RTCP message 250, including any headers and any padding. In certain implementations, the length is identified in 32-bit quantities minus one.
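  • A minimal Python sketch of the header fields and the length arithmetic follows. It assumes the fields appear in the order discussed (V, P, C, Res, PT, Length) and that the length is expressed in 32-bit quantities minus one. It also assumes that the PT field occupies a full octet, because the example value 141 does not fit in 7 bits; that layout, like the function names, is an assumption of this sketch.

```python
import struct

SDP_IN_RTCP_PT = 141  # example payload type value given in the text


def length_field(total_octets: int) -> int:
    """Convert a total message size in octets to 32-bit words minus one."""
    if total_octets < 4 or total_octets % 4 != 0:
        raise ValueError("message size must be a positive multiple of 4")
    return total_octets // 4 - 1


def pack_header(total_octets: int, padding: bool = False,
                compression: bool = False, version: int = 2,
                pt: int = SDP_IN_RTCP_PT) -> bytes:
    """Pack V (2 bits), P, C, Res (4 bits, zero), PT, and Length."""
    first = (version << 6) | (int(padding) << 5) | (int(compression) << 4)
    return struct.pack("!BBH", first, pt, length_field(total_octets))


def unpack_header(data: bytes) -> dict:
    """Decode the first four octets of a message into named fields."""
    first, pt, length = struct.unpack("!BBH", data[:4])
    return {
        "version": first >> 6,
        "padding": bool(first & 0x20),
        "compression": bool(first & 0x10),
        "reserved": first & 0x0F,
        "pt": pt,
        "total_octets": (length + 1) * 4,
    }


header = pack_header(32)
fields = unpack_header(pack_header(16, padding=True))
```

  • For a 32-octet message the length field holds 7, since 32 / 4 - 1 = 7.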
  • SDPMsgHash (SDP message hash) field 264 is a 16-bit field used to identify the session description message included in RTCP message 250 and an address
  • the identifier in field 264 is calculated as a check-sum over the session description message and the address, so that if either changes, the value of the identifier in field 264 is also changed.
  • the value of SDPMsgHash field 264 is calculated in the same manner as the "msg id hash" field described in the Session Announcement Protocol (SAP), Network Working Group Request for Comments (RFC) 2974, October 2000. If the session description message is fragmented across multiple RTCP messages, as discussed below, the value of SDPMsgHash field 264 of each fragment should be identical.
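  • The text does not spell out the checksum algorithm, so the following Python sketch uses CRC-32 folded to 16 bits purely as a stand-in. It is not presented as the method of RFC 2974 or of the described implementation; the function name and the way the address is combined with the message are assumptions.

```python
import zlib


def sdp_msg_hash(sdp: bytes, sender_address: str) -> int:
    """Return a 16-bit identifier computed over the address and the SDP.

    CRC-32 folded to 16 bits is a placeholder checksum chosen for this
    sketch; the source does not name a specific algorithm.
    """
    # A zero byte separates the two inputs so their boundary is unambiguous.
    crc = zlib.crc32(sender_address.encode("ascii") + b"\x00" + sdp)
    return (crc >> 16) ^ (crc & 0xFFFF)


sdp = b"v=0\r\ns=Example\r\n"
identifier = sdp_msg_hash(sdp, "192.0.2.1")
```

  • Every fragment of one session description message would carry this same value, which lets a receiver match fragments to each other.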
  • F (more fragments) field 266 is a single bit that, when set (e.g., has a value of 1), indicates that the session description message has been fragmented into multiple RTCP messages, and that the current RTCP message does not contain the last fragment of the session description message. If F field 266 is not set (e.g., has a value of 0), then the session description message has not been fragmented (the complete session description message is included in RTCP message 250), or the session description message has been fragmented and RTCP message 250 contains the last fragment of the session description message.
  • FragSeqNum (fragment sequence number) field 268 is a 15-bit field used to identify different fragments of a session description message.
  • The fragments of a session description message are assigned identifiers in some manner known to both server device 104 and client device 102. For example, the identifiers may be assigned sequentially starting with the value of 0, so the first fragment has a value 0, the second a value 1, the third a value 2, and so forth. If RTCP message 250 does not contain a fragment of a session description message (i.e., RTCP message 250 contains a complete session description message), then FragSeqNum field 268 should be set to 0.
  • NumRtpState (number RTP state) field 270 is a 16-bit field used to specify the number of RTP-State blocks contained in RTCP message 250. Each RTP-State block is 14 bytes in size.
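  • The F bit and FragSeqNum field can be sketched in Python as follows, assuming sequential numbering from 0 and assuming that the F bit is the most significant bit of a 16-bit word shared with FragSeqNum. The function names and the tuple layout are hypothetical.

```python
import struct
from typing import Iterable, List, Tuple

Fragment = Tuple[bool, int, bytes]  # (more_fragments, frag_seq_num, data)


def pack_fragment_word(more_fragments: bool, frag_seq_num: int) -> bytes:
    """Pack the 1-bit F field and the 15-bit FragSeqNum into two octets."""
    if not 0 <= frag_seq_num < (1 << 15):
        raise ValueError("FragSeqNum must fit in 15 bits")
    return struct.pack("!H", (int(more_fragments) << 15) | frag_seq_num)


def fragment_sdp(sdp: bytes, max_chunk: int) -> List[Fragment]:
    """Split a session description; F is set on all but the last fragment."""
    chunks = [sdp[i:i + max_chunk] for i in range(0, len(sdp), max_chunk)]
    chunks = chunks or [b""]
    last = len(chunks) - 1
    return [(i != last, i, chunk) for i, chunk in enumerate(chunks)]


def reassemble_sdp(fragments: Iterable[Fragment]) -> bytes:
    """Rebuild the session description from fragments in any order."""
    ordered = sorted(fragments, key=lambda f: f[1])
    if [f[1] for f in ordered] != list(range(len(ordered))):
        raise ValueError("missing or duplicate fragment")
    if ordered[-1][0] or any(not f[0] for f in ordered[:-1]):
        raise ValueError("F bits are inconsistent with fragment order")
    return b"".join(f[2] for f in ordered)


original = b"v=0\r\ns=demo\r\n"
frags = fragment_sdp(original, max_chunk=5)
rebuilt = reassemble_sdp(reversed(frags))
```

  • An unfragmented message is simply the single-fragment case: F is not set and FragSeqNum is 0.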
  • the "NumRtpState" field is set to 0 when no RTP-State blocks are present. In the illustrated example of RTCP message 250, there is one RTP-State block 292. If there are multiple RTP-State blocks, then a field 272, 274,
  • A field 272 is a 1-bit field that is not set (e.g., has a value of 0) if PT field 274 of RTP-State block 292 contains a valid RTP Payload Type number. If A field 272 is not set, the information in RTP-State block 292 only applies to the RTP Payload Type number identified in PT field 274 and the SDP Flow ID identified in Flow ID field 276. If A field 272 is set (e.g., has a value of 1), then PT field 274 should be ignored, and the RTP-State block 292 applies to all RTP packets for the SDP Flow ID identified in Flow ID field 276, irrespective of the RTP Payload Type used.
  • PT field 274 is a 7-bit field specifying the RTP Payload Type number for the information in RTP-State block 292. If A field 272 is set (e.g., has a value of 1), then PT field 274 is not used and should be set to 0.
  • Flow ID field 276 is a 24-bit field that identifies the SDP Flow ID to which the information in RTP-State block 292 refers.
  • SSRC (synchronization source) field 278 is a 32-bit field which specifies the RTP SSRC field value used for the media stream which is identified by Flow ID field 276. If A field 272 is not set (e.g., has a value of 0), then SSRC field 278 only applies to RTP packets for this media stream that use the RTP Payload Type given by PT field 274.
  • RtpTime (RTP time) field 280 is a 32-bit field that specifies the value of the RTP Timestamp field that an RTP packet would have, if that packet was sent at exactly the beginning of the media stream identified by Flow ID field 276.
  • RtpTime field 280 is the value of the RTP Timestamp field of a packet that would be sent at exactly time T, even if no such RTP packet actually exists for the media stream identified by RTP-State block 292.
  • RtpSeq (RTP sequence) field 282 is a 16-bit field that gives the value of the
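  • The 14-byte RTP-State block can be sketched in Python from the field widths given above: A (1 bit), PT (7 bits), Flow ID (24 bits), SSRC (32 bits), RtpTime (32 bits), and RtpSeq (16 bits). The field order, the placement of the A bit as the most significant bit of the first octet, and the class and attribute names are assumptions of this sketch.

```python
import struct
from dataclasses import dataclass
from typing import Optional

RTP_STATE_FORMAT = "!B3sIIH"  # A+PT, Flow ID, SSRC, RtpTime, RtpSeq
RTP_STATE_SIZE = struct.calcsize(RTP_STATE_FORMAT)  # 14 octets


@dataclass
class RtpState:
    flow_id: int
    ssrc: int
    rtp_time: int
    rtp_seq: int
    payload_type: Optional[int] = None  # None means "all payload types"

    def pack(self) -> bytes:
        if self.payload_type is None:
            first = 0x80  # A bit set, PT bits left at 0
        else:
            first = self.payload_type & 0x7F  # A bit clear, PT is valid
        return struct.pack(RTP_STATE_FORMAT, first,
                           self.flow_id.to_bytes(3, "big"),
                           self.ssrc, self.rtp_time, self.rtp_seq)

    @classmethod
    def unpack(cls, data: bytes) -> "RtpState":
        first, flow, ssrc, rtp_time, rtp_seq = struct.unpack(
            RTP_STATE_FORMAT, data[:RTP_STATE_SIZE])
        # When the A bit is set, the PT bits are ignored.
        payload_type = None if first & 0x80 else first & 0x7F
        return cls(int.from_bytes(flow, "big"), ssrc, rtp_time, rtp_seq,
                   payload_type)


state = RtpState(flow_id=1, ssrc=0x11223344, rtp_time=90000, rtp_seq=7,
                 payload_type=96)
block = state.pack()
```

  • The packed size works out to 1 + 3 + 4 + 4 + 2 = 14 octets, matching the stated block size.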
  • SDP data field 284 is the session description message embedded in RTCP message 250. In situations where the session description message is fragmented, SDP data field 284 contains only a portion of the session description message (e.g., a single fragment of the session description message). In certain implementations, the session description message is a complete SDP description in UTF-8 format.
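  • Putting the fields together, a complete message could be assembled and parsed roughly as in the Python sketch below. It assumes the fields are laid out in the order discussed (header, SDPMsgHash, F and FragSeqNum, NumRtpState, RTP-State blocks, SDP data, padding), an 8-bit PT field, an unset compression bit, and padding to a 32-bit boundary. The text notes that other field orders are possible, so this is one illustrative layout rather than the format of FIG. 5.

```python
import struct
from typing import Sequence


def build_message(sdp: bytes, msg_hash: int,
                  rtp_state_blocks: Sequence[bytes] = (),
                  more_fragments: bool = False, frag_seq_num: int = 0,
                  pt: int = 141) -> bytes:
    """Assemble header, fixed fields, RTP-State blocks, SDP, and padding."""
    if any(len(block) != 14 for block in rtp_state_blocks):
        raise ValueError("each RTP-State block must be 14 bytes")
    frag_word = (int(more_fragments) << 15) | frag_seq_num
    body = struct.pack("!HHH", msg_hash, frag_word, len(rtp_state_blocks))
    body += b"".join(rtp_state_blocks) + sdp
    unpadded = 4 + len(body)
    pad_len = -unpadded % 4  # octets needed to reach a 32-bit boundary
    padding = bytes(pad_len - 1) + bytes([pad_len]) if pad_len else b""
    total = unpadded + pad_len
    first = (2 << 6) | (0x20 if pad_len else 0)  # V=2, P bit if padded
    return struct.pack("!BBH", first, pt, total // 4 - 1) + body + padding


def parse_message(data: bytes) -> dict:
    """Decode a message produced by build_message."""
    first, pt, length = struct.unpack("!BBH", data[:4])
    total = (length + 1) * 4
    if total > len(data):
        raise ValueError("message is truncated")
    message = data[:total]
    if first & 0x20:
        message = message[:-message[-1]]  # drop padding using its count
    msg_hash, frag_word, num_states = struct.unpack("!HHH", message[4:10])
    states_end = 10 + 14 * num_states
    return {
        "pt": pt,
        "msg_hash": msg_hash,
        "more_fragments": bool(frag_word >> 15),
        "frag_seq_num": frag_word & 0x7FFF,
        "rtp_state_blocks": [message[10 + 14 * i:24 + 14 * i]
                             for i in range(num_states)],
        "sdp": message[states_end:],
    }


msg = build_message(b"v=0\r\n", 0x1234)
parsed = parse_message(msg)
```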
  • FIG. 6 illustrates an example session description message format. Although illustrated as a specific example in FIG.
  • Session description message 320 includes a session level description portion 322 and zero or more media level description portions 324.
  • Session level description portion 322 includes one or more fields having data that applies to the whole session and all media streams that are part of the session.
  • Each media level description portion 324, on the other hand, includes one or more fields having data that applies only to a single media stream.
  • The data fields in media level description portion 324 describe properties for particular media streams. These properties may be in addition to properties described in session level description portion 322, or in place of properties described in session level description portion 322.
  • Session description message 320 begins with a particular field, referred to as the protocol version field.
  • media level description portions 324 each start with a particular field, referred to as a media name and transport address field.
  • multiple fields of the same type may be included in a session description message (e.g., a single session description message may have two or more attribute fields).
  • Table I below illustrates example fields that may be included in session level description portion 322.
  • Table I includes a name for each example field, an abbreviation or type for each example field, and a brief discussion of each example field.
  • the protocol version field, the owner/creator and session identifier field, the session name field, and the time description field are required whereas all other fields in Table I are optional.
  • Table II below illustrates the time description field in additional detail. Table II includes a name for each field in the time description field, an abbreviation or type for each field in the time description field, and a brief discussion of each field in the time description field. The time the session is active field is required whereas the zero or more repeat times field is optional.
  • Table III below illustrates example fields that may be included in a media level description portion 324.
  • Table III includes a name for each example field, an abbreviation or type for each example field, and a brief discussion of each example field.
  • the media announcement field is required whereas all other fields in Table III are optional.
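  • The division of a session description message into a session level portion and media level portions can be sketched as follows. This is a minimal illustration, not a complete SDP parser. It assumes the single-letter field types of the SDP specification (v for protocol version, o for owner/creator, s for session name, t for time description, m for the media field that opens each media level portion); the function names are hypothetical.

```python
REQUIRED_SESSION_FIELDS = ("v", "o", "s", "t")


def split_session_description(sdp_text):
    """Split SDP text into a session level portion and media level portions.

    Each portion is a list of (type, value) pairs, where type is the
    single character that precedes '=' on the line.
    """
    session, media = [], []
    current = session
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # this sketch simply skips blank or malformed lines
        field_type, value = line.split("=", 1)
        if field_type == "m":
            current = []  # a media field starts a new media level portion
            media.append(current)
        current.append((field_type, value))
    return session, media


def missing_required_fields(session):
    """Return the required session level field types that are absent."""
    present = {field_type for field_type, _ in session}
    return [t for t in REQUIRED_SESSION_FIELDS if t not in present]


EXAMPLE = """v=0
o=- 1 1 IN IP4 192.0.2.1
s=Example
t=0 0
a=tool:example
m=audio 5004 RTP/AVP 96
a=control:audio
m=video 5006 RTP/AVP 97
"""
session_part, media_parts = split_session_description(EXAMPLE)
```

  • Repeated fields of the same type (for example, two attribute fields) are preserved because each portion is kept as an ordered list rather than a dictionary.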
  • FIG. 7 is a flowchart illustrating an example process 350 for embedding session description messages in an RTCP message when using a server-side play list.
  • FIG. 7 shows acts performed by a media content source, such as a server device 104 (e.g., of FIGS. 1, 2, 3, or 4).
  • the next piece of media content in the play list is identified (act 352).
  • the next piece is the first piece identified in the play list.
  • the next piece of media content is the piece that follows the piece whose end was reached.
  • this next piece may be in the order defined by the play list, or the user may be able to navigate to a different piece within the play list (e.g., the user may be able to request that a particular piece in the play list be skipped or jumped over).
  • Information describing the identified piece of media content is then obtained (act 354).
  • This information can be obtained in one or more different manners.
  • One manner in which this information can be obtained is retrieval from a file or record.
  • at least some of the information is stored in a file or record associated with the identified piece of media content. This file or record is accessed in act 354 to retrieve the information stored therein.
  • Another manner in which this information can be obtained is receipt from a human user.
  • At least some of the information is received from a human user. These user inputs are used in act 354 as at least some of the information to be included in the session description message.
  • Another manner in which this information can be obtained is automatic detection. In certain embodiments, at least some of the information can be obtained automatically by a computing device by analyzing the source of the identified piece of media content or the identified piece of media content itself. This automatically detected information is used in act 354 as at least some of the information to be included in the session description message.
  • An RTCP message having a session description message that includes the obtained information is then created (act 356). In certain embodiments, this RTCP message is in the form of RTCP message 250 of FIG. 5 discussed above.
  • the created RTCP message is then sent to the intended recipient of the next piece of media content (act 358).
  • the intended recipient of the next piece of media content is the device to which the media content is being streamed (e.g., client device 102 of FIGS. 1, 2, 3, or 4).
  • the created RTCP message is included in an RTCP packet that is included as part of the streaming media being streamed to the intended recipient. It should be noted that situations can arise where the number of media streams being streamed for two different pieces of media content identified in a play list are different.
  • the first piece of media content identified in a play list may have two streams (e.g., an audio stream and a video stream), while the second piece of media content identified in a play list may have three streams (e.g., an audio stream, a video stream, and a text subtitle stream).
  • each media stream is typically using a different UDP channel that is received at the recipient on a different UDP port. If the recipient only opened two ports for the first piece of media content (e.g., one port for the audio stream and one port for the video stream), there would be no port available for the recipient to receive the text subtitle stream of the second piece of media content. Such situations can be resolved in different manners.
  • such situations are resolved by streaming the additional media stream(s) over an open HTTP connection using TCP.
  • An indication is included in RTCP message 250 (e.g., as an additional RTP-State block 292 for each additional media stream) that the additional media stream(s) is being streamed in this manner.
  • such situations are resolved by having the recipient open one or two extra ports, often referred to as wildcard ports. Each of these wildcard ports can be used to receive any media stream that the server device sends to the recipient.
  • An indication is included in RTCP message 250 (e.g., as an additional RTP-State block 292 for each additional media stream) of which of the wildcard ports the additional media stream(s) is being streamed to.
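  • The two resolutions described above can be sketched together as a simple assignment routine. The text presents streaming over HTTP and wildcard ports as alternatives; combining them into one fallback order, along with the function name and the tuple format of the result, is an assumption made only for illustration.

```python
def assign_transports(stream_names, open_ports, wildcard_ports):
    """Map each stream name to ("udp", port) or ("http-tcp", None).

    Streams take the already-open UDP ports first, then any wildcard
    ports; streams left over fall back to the open HTTP connection (TCP).
    """
    available = list(open_ports) + list(wildcard_ports)
    plan = {}
    for index, name in enumerate(stream_names):
        if index < len(available):
            plan[name] = ("udp", available[index])
        else:
            plan[name] = ("http-tcp", None)
    return plan


# Two ports were opened for the first piece; the second piece has three streams.
no_wildcards = assign_transports(["audio", "video", "subtitle"], [5004, 5006], [])
one_wildcard = assign_transports(["audio", "video", "subtitle"], [5004, 5006], [6000])
```

  • In either case the server would still need to tell the recipient which transport each additional stream uses, which is the role of the indication carried in RTCP message 250.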
  • FIG. 8 is a flowchart illustrating an example process 380 for receiving session description messages in an RTCP message when using a server-side play list.
  • FIG. 8 shows acts performed by a recipient of streaming media, such as a client device 102 (e.g., of FIGS. 1, 2, 3, or 4).
  • the media content source is, for example, a server device 104 of FIGS. 1, 2, 3, or 4.
  • a session description message for a next piece of media content in the play list is extracted from the RTCP message (act 384).
  • this next piece of media content is the first piece of media content in the play list.
  • the next piece of media content is the next piece identified in the play list.
  • this next piece may be in the order defined by the play list, or the user may be able to navigate to a different piece within the play list (e.g., the user may be able to request that a particular piece in the play list be skipped or jumped over).
  • the session description message for the next piece of media content is typically received prior to playback of the current piece of media content being finished (to allow client device 102 to immediately begin playback of the next piece of media content when playback of the current piece of media content is finished).
  • the extracted session description message is then used in processing of the next piece of media content (act 386). This processing typically includes playback of the next piece of media content at client device 102.
  • system 500 includes a content source 502, a server 504, and clients 506 1 -506 x that are connected to server 504 via a network 508.
  • Network 508 can be any suitable type of wired (including optical fiber) or wireless network (e.g., RF or free space optical).
  • network 508 is the Internet, but in other embodiments network 508 can be a local area network (LAN), a campus area network, etc.
  • server 504 includes an announcement generator 510. As will be described in more detail below, embodiments of announcement generator 510 generate streams containing information regarding multimedia presentations to be multicast over network 508.
  • FIG. 10 illustrates server operational flow of system 500 of FIG. 9 in multicasting a multimedia presentation, according to one embodiment.
  • server 504 operates as follows to multicast a multimedia presentation.
  • server 504 receives a multimedia presentation via a connection 512.
  • server 504 receives the multimedia presentation from content source 502 via a link 512.
  • content source 502 provides multimedia content to be multicast over network 508.
  • the multimedia content can be generated in any suitable manner.
  • the multimedia content may be previously recorded/generated content that is then stored in a datastore (not shown), or a live performance that is captured (e.g., using a video camera, microphone, etc.) and encoded (encoder not shown).
  • the multimedia presentation will include multiple streams.
  • the multimedia presentation may include a video stream, an audio stream, another video stream encoded at a lower bit rate and another audio stream encoded at a lower bit rate.
  • the multimedia presentation may have more or fewer streams than those described in this example application.
  • server 504 receives the multimedia presentation in the form of one or more streams in block 524.
  • server 504 forms an announcement stream and multicasts the announcement stream over network 508 via a link 514.
  • announcement generator 510 of server 504 forms the announcement stream.
  • announcement generator 510 may be configured by an administrator, while in other embodiments announcement generator 510 may be configured to process stream(s) received in block 524 and extract information from the stream(s) to form the announcement stream.
  • server 504 multicasts the announcement stream on a dedicated announcement channel (i.e., a channel without announcement information related to other multimedia presentations).
  • a channel can be a logical address such as a multicast Internet protocol (IP) address and port.
  • a client can join a channel by listening to the logical address and port associated with the channel.
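  • Joining a channel in this sense corresponds to ordinary IPv4 multicast group membership. The following minimal sketch uses the standard Python socket module; the helper names are hypothetical, and the sketch assumes an IPv4 channel address and the default network interface.

```python
import socket
import struct


def membership_request(group_ip, interface_ip="0.0.0.0"):
    """Build the 8-byte ip_mreq structure used to join an IPv4 multicast group."""
    return struct.pack("4s4s", socket.inet_aton(group_ip), socket.inet_aton(interface_ip))


def join_channel(group_ip, port):
    """Open a UDP socket that listens on the channel's address and port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Ask the network stack to deliver datagrams sent to the multicast group.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group_ip))
    return sock
```

  • A client would then call, for example, join_channel("231.0.0.1", 5000) and read datagrams with recvfrom; leaving the channel is the reverse operation with IP_DROP_MEMBERSHIP.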
  • Clients may learn of the logical address in any suitable manner such as but not limited to email, invitations, website postings, and conventional Session Announcement Protocol (SAP) multicasts (e.g., as defined in Specification, IETF RFC-2974, entitled "Session Announcement Protocol").
  • the SAP multicast need not include the detailed presentation description information that would be provided in an "in-line" announcement stream (described in more detail below).
  • the announcement stream is multicast "in-line" with a stream containing multimedia data.
  • the stream of multimedia data can be multicast using packets according to the Real-time Transport Protocol (RTP) and the announcement stream can be multicast using packets according to the Real-time Transport Control Protocol (RTCP).
  • the RTP is defined in Request For Comments (RFC) 3550, Internet Engineering Task Force (IETF), July, 2003 (which includes the specification of the RTCP as well).
  • RTP is extended to support announcement data in RTCP packets.
  • the announcement data can be sent "in-line" in the same RTP packets (or other protocol packets/datagrams) as the multimedia data.
  • the announcement channel can be out-of-band (e.g., when the announcement channel is multicast using SAP).
  • the announcement stream contains information that describes the multimedia presentation such as, for example, identification of various channels used to multicast the multimedia presentation, descriptions of the stream (e.g., indicating the type of stream (e.g., video or audio); bit-rate of the stream; language used in the stream, etc.) being transported by each of the channels, error correction information; security/authentication information; encryption information; digital rights management (DRM) information, etc.
  • the announcement stream is repeatedly multicast during the multimedia presentation so that clients joining at different times may receive the multimedia presentation description information. A client receiving this presentation description information via the announcement stream can then determine which channel(s) are suitable to join in view of its resources.
  • server 504 multicasts stream(s) selected from the stream(s) of the multimedia presentation received in block 524. In some scenarios, server 504 multicasts all of the streams received in block 524. In some embodiments, an administrator can configure server 504 to multicast particular streams in preselected channels. In one embodiment, server 504 supports at least an announcement channel, a video channel and an audio channel. More typically, server 504 will also support additional channels of video and audio streams of different bit rates to accommodate clients having different resources available to process the multimedia presentation. For example, as shown in FIG. 11, server 504 may be configured to support an announcement channel 532, an acceleration channel 534 (described below in conjunction with FIGS.
  • a high quality video channel 536 a high quality audio channel 538
  • an application channel 540 can be used to multicast data used by applications expected to be running locally on the clients (e.g., a media player, or other applications that may require a plug-in to use the multicasted application data such as Microsoft PowerPoint® data).
  • server 504 may map a stream into only one channel, multiple channels or no channels.
  • server 504 may be configured to map the Spanish language stream into all of channels 542 1 -542 N, or only channel 542 1, or to no channel at all.
  • the "layout" of the channels is preselected.
  • the channels may be a set of sequential IP addresses in the range of IP addresses assigned for multicasting (i.e., the range of IP address 224.0.0.0 to IP address 239.255.255.255).
  • announcement channel 532 may be assigned to IP address 231.0.0.1
  • acceleration channel 534 may be assigned to IP address 231.0.0.2
  • high quality video channel 536 may be assigned to IP address 231.0.0.3, and so on.
  • the channels may be a set of sequential ports of a group IP address.
  • announcement channel 532 may be assigned to port 231.0.0.1:5000
  • acceleration channel 534 may be assigned to port 231.0.0.1:5002
  • high quality video channel 536 may be assigned to port 231.0.0.1:5004, and so on (so that ports 5001, 5003 and 5005 can be used for the RTCP packets).
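  • Both preselected layouts (sequential IP addresses, or sequential even ports of one group IP address) can be sketched with the standard ipaddress module. The function names and the list of channel names are illustrative assumptions; the multicast range check follows the range stated above.

```python
import ipaddress

CHANNEL_NAMES = ["announcement", "acceleration", "high quality video"]


def layout_by_address(first_address, names=CHANNEL_NAMES):
    """Assign each channel the next sequential multicast IP address."""
    base = ipaddress.IPv4Address(first_address)
    layout = {}
    for index, name in enumerate(names):
        address = base + index
        if not address.is_multicast:
            raise ValueError(f"{address} is outside 224.0.0.0-239.255.255.255")
        layout[name] = str(address)
    return layout


def layout_by_port(group_address, first_port, names=CHANNEL_NAMES):
    """Assign each channel the next even port of one group IP address.

    The odd port following each channel is left free for RTCP packets.
    """
    return {name: (group_address, first_port + 2 * index) for index, name in enumerate(names)}


by_address = layout_by_address("231.0.0.1")
by_port = layout_by_port("231.0.0.1", 5000)
```

  • Because the layout is a pure function of the first logical address, a client that knows only that address can compute where every other channel is expected to be.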
  • the approaches used by the above embodiments of system 500 have several advantages.
  • a client can more quickly obtain the presentation description information, thereby advantageously reducing start up latency.
  • a conventional SAP multicast approach typically has a larger start up latency because SAP multicasts generally announce a large number of multicasts, which tends to reduce the frequency at which announcements for a particular multimedia presentation are multicast (which in turn tends to increase start up latency).
  • these embodiments of system 500 do not require that clients have a back channel to server 504, thereby providing more flexibility in delivering multimedia presentations to a desired audience.
  • system 500 eliminates the need for the server to provide a "multicast information" file required in the previously-described conventional system, and, thus, the costs involved in maintaining and publishing this file. Still further, because the streams being multicast in the set of channels are
  • clients may choose to join particular channels without waiting to receive and process the multimedia presentation description information from announcement channel 532.
  • an aggressive client typically a client with relatively large resources
  • a client with large resources can be a client having a computing platform with high speed CPU and large buffering resources, and is connected to a high-speed computer network with a relatively large amount of available bandwidth. The high speed CPU and large buffering resources significantly reduce the risk of losing data.
  • FIG. 12 illustrates operational flow of client 506 1 (FIG. 9) in receiving a multimedia presentation that is being multicast by server 504 (FIG. 9), according to one embodiment.
  • Clients 506 2 -506 x (FIG. 9) can operate in a substantially identical manner.
  • client 506 1 operates as follows in receiving the multimedia presentation.
  • client 506 1, having already received the logical address of the announcement channel of a multimedia presentation, joins announcement channel 532.
  • server 504 repeatedly multicasts presentation description information on a dedicated announcement channel.
  • client 506 1 can relatively quickly receive the presentation description information compared to conventional systems that generally multicast description information of a relatively large number of multimedia and/or other types of presentations.
  • client 506 1 then joins one or more of the channels that provide multimedia data streams, which are described in the received announcement stream.
  • client 506 1 can determine which channel(s) to join to have an optimal experience using the resources available to client 506 1.
  • Client 506 1 then can receive the selected stream(s) of the multimedia presentation.
  • FIG. 12A illustrates operational flow of client 506 1 (FIG. 9) in receiving a multimedia presentation that is being multicast by server 504 (FIG. 9), according to another embodiment.
  • Clients 506 2 -506 x (FIG. 9) can operate in a substantially identical manner. Referring to FIGS. 9, 11 and 12A, client 506 1 operates as follows in receiving the multimedia presentation. In this embodiment, client 506 1
  • client 506 1 also joins one or more preselected channels of the multimedia presentation in addition to announcement channel 532.
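  • How a client might choose channels from the presentation description information can be sketched as follows. The text does not specify a selection algorithm; the greedy rule below (highest bit rate that still fits the remaining bandwidth, one channel per media type) and the dictionary keys are assumptions made for illustration.

```python
def select_channels(described_channels, available_bps):
    """Pick, per media type, the highest bit-rate channel that fits the budget.

    described_channels is a list of dicts with the hypothetical keys
    "address", "media" and "bitrate_bps", standing in for the per-channel
    descriptions carried in the announcement stream.
    """
    chosen = {}
    remaining = available_bps
    # Consider higher bit rates first so better quality wins when it fits.
    for channel in sorted(described_channels, key=lambda c: c["bitrate_bps"], reverse=True):
        if channel["media"] in chosen:
            continue  # already have a channel for this media type
        if channel["bitrate_bps"] <= remaining:
            chosen[channel["media"]] = channel
            remaining -= channel["bitrate_bps"]
    return chosen


ANNOUNCED = [
    {"address": "231.0.0.3", "media": "video", "bitrate_bps": 2_000_000},
    {"address": "231.0.0.4", "media": "audio", "bitrate_bps": 128_000},
    {"address": "231.0.0.5", "media": "video", "bitrate_bps": 300_000},
    {"address": "231.0.0.6", "media": "audio", "bitrate_bps": 32_000},
]
constrained = select_channels(ANNOUNCED, 500_000)
well_provisioned = select_channels(ANNOUNCED, 3_000_000)
```

  • A real client would likely weigh CPU and buffering resources as well as bandwidth; bandwidth alone is used here to keep the sketch short.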
  • server 504 can be configured to multicast streams in preselected channels in a predetermined manner.
  • client 506 1 can take advantage of the preselected channel assignments to join desired channels without having to receive the presentation description information from announcement channel 532.
  • client 506 1 has relatively large resources to receive and process multimedia presentations, capable of handling typical high quality video and high quality audio streams.
  • client 506 1 can be configured to immediately join channels 536 and 538 to receive high quality video and high quality audio streams to reduce start up latency with a relatively high expectation that client 506 1 can properly process the streams.
  • client 506 1 determines whether it can optimally process the stream(s) received from the channel(s) it joined in block 570 in view of the resources available to client 506 1.
  • client 506 1 uses the presentation description information received from announcement channel 532 to determine whether its resources can handle the streams received on these channels.
  • the streams of channels joined in block 570 may have bit rates (which will be described in the announcement stream) that are too great for client 506 1 to process without losing data (which can result in choppy audio playback for audio streams or blocky video playback for video streams).
  • client 506 1 determines in block 572 that it can optimally process the stream(s) of the preselected channel(s)
  • client 506 1 continues to receive the stream(s) from the channel(s) client 506 1 joined in block 570 until the multicasted multimedia presentation terminates.
  • client 506 1 in block 572 determines that it cannot optimally process the stream(s) of the preselected channel(s)
  • the operational flow proceeds to block 564 (previously described in conjunction with FIG. 12).
  • FIG. 13 illustrates some of the components of server 504 (FIG. 9), according to one embodiment.
  • server 504 in addition to announcement generator 510 (described above in conjunction with FIG. 9), server 504 includes a configuration controller 582, a configurable stream mapper 584, a source interface 586 and a network interface 588.
  • Source interface 586 is configured to receive one or more multimedia streams from content source 502 (FIG. 9) via link 512.
  • Configurable stream mapper 584 is configured to receive the streams from source interface 586, an announcement stream from announcement generator 510, and control information from configuration controller 582.
  • configurable stream mapper 584 functions like a switch in mapping or directing one or more of the streams received from source interface 586 to multicast channel(s).
  • Network interface 588 multicasts the selected streams over network 508 (FIG. 9).
  • configuration controller 582 configures configurable stream mapper 584 to map the received stream(s) of the multimedia presentation into channel(s).
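  • The switch-like behavior of the configurable stream mapper, in which a stream may be mapped into one channel, multiple channels, or no channel, can be modeled with a small routing table. The class, its method names, and the channel labels are hypothetical and serve only to illustrate the mapping.

```python
class StreamMapper:
    """Toy model of a configurable stream mapper acting like a switch."""

    def __init__(self):
        self._routes = {}  # stream name -> set of channel names

    def configure(self, stream, channels):
        """Set the channels for a stream; an empty list maps it to no channel."""
        self._routes[stream] = set(channels)

    def route(self, stream, payload):
        """Return (channel, payload) pairs for one unit of stream data."""
        return [(channel, payload) for channel in sorted(self._routes.get(stream, ()))]


mapper = StreamMapper()
mapper.configure("spanish-audio", ["542-1", "542-2"])  # mapped into multiple channels
mapper.configure("english-audio", [])                  # mapped into no channel
```

  • In this model the configuration controller's role reduces to calling configure, and the network interface would transmit each pair returned by route.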
  • configuration controller 582 directs announcement generator 510 in generating announcements. Operational flow of one embodiment of configuration controller 582 is described below in conjunction with FIGS. 13 and 14.
  • FIG. 14 illustrates operational flow of configuration controller 582 (FIG. 13) in multicasting a multimedia presentation, according to one embodiment. Referring to FIGS. 13 and 14, one embodiment of configuration controller 582 operates to multicast a multimedia presentation as described below.
  • this embodiment of configuration controller 582 receives configuration information from an administrator. The administrator can manually provide configuration information to configuration controller 582 of server 504. This configuration information may define each of the channels in terms of logical address, and include presentation description information (previously described).
  • the presentation information may include the media type(s) of the stream(s) of the multimedia presentation to be multicast, the bit-rate(s) of the stream(s); the language(s), error correction information; security/authentication information; encryption information; digital rights management (DRM) information, etc.
  • configuration controller 582 may be configured to extract the presentation description information from the streams themselves (e.g., from header or metadata information included in the streams) after being received from content source 502 (FIG. 9) via source interface 586.
  • configuration controller 582 configures stream mapper 584 to map the announcement stream from announcement generator 510 and the multimedia data stream(s) from source interface 586 to the channels as described in the presentation description information.
  • This announcement stream is repetitively multicast over the announcement channel by server 504 during the multicast of the multimedia presentation.
  • configuration controller 582 provides presentation description information for the stream(s) to announcement generator 510.
  • announcement generator 510 forms the announcement stream that includes the presentation description information.
  • the "layout" of the channels may be preselected.
  • a client would be given a logical address (e.g., a URL) for joining a multicast multimedia presentation.
  • that first logical address is preselected to carry the announcement stream in one embodiment.
  • the next sequential logical address is preselected to carry the acceleration channel, while the next sequential logical address is preselected to carry a high quality video stream, and so on as shown in the embodiment of FIG. 11.
  • Configuration controller 582 configures stream mapper 584 to map the announcement stream and the multimedia data streams according to the preselected channel layout.
  • FIG. 15 illustrates some of the components of server 504 (FIG. 9), according to another embodiment. This alternative embodiment of server 504 is substantially similar to the embodiment of FIG. 13, except that this embodiment includes an accelerated stream generator 702.
  • accelerated stream generator 702 is configured to form a stream in which each unit of multimedia data that is multicast contains a current subunit of multimedia data and a preselected number of previous subunits of data.
  • an accelerated stream may be multicast so that a datagram contains the current frame(s) of the multimedia presentation and the frames of the previous five seconds.
  • accelerated stream generator 702 provides the accelerated stream to configurable stream mapper 584 to be mapped into a dedicated acceleration channel such as acceleration channel 534 (FIG. 11).
  • an acceleration channel datagram need not include the current frame(s).
  • FIG. 16 illustrates operational flow of server 504 with accelerated stream generator 702 (FIG. 15), according to one embodiment.
  • accelerated stream generator 702 forms a unit of multimedia data for multicast over network 508 (FIG. 9).
  • accelerated stream generator 702 forms the unit using a current subunit of the multimedia presentation data and the previous Z subunits of multimedia presentation data.
  • the unit may be a datagram or packet, and the subunits may be frames of multimedia data.
  • Z is selected to ensure that the unit (i.e., packet or datagram) will contain a key frame needed to render or decode the multimedia data. In other embodiments, Z is selected without regard to whether the unit will be ensured of having a key frame.
  • In a block 804, the unit of multimedia data formed in block 802 is multicast over network 508 (FIG. 9).
  • accelerated stream generator 702 provides the unit of multimedia data to configurable stream mapper 584, which then maps the block to the acceleration channel.
  • Server 504 then multicasts the unit of multimedia data over network 508 (FIG. 9) via network interface 588.
  • server 504 multicasts the unit at a rate that is "faster than real time" (i.e., at a bit rate that is faster than the bit rate of the underlying multimedia data). This approach advantageously allows a client having relatively large resources to join the acceleration channel and quickly fill the buffer of its multimedia player in receiving the unit so that rendering or playback can begin more quickly.
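  • The benefit of multicasting "faster than real time" can be made concrete with a short calculation. The function below is an illustrative sketch under simple assumptions (constant bit rates, no packet loss, playback not yet started); its name and parameters are not taken from the described embodiments.

```python
def buffer_fill_seconds(buffer_media_seconds, media_bps, send_bps):
    """Time needed to receive enough data to fill a player buffer.

    buffer_media_seconds: amount of media, in seconds, the player buffers
    media_bps: bit rate of the underlying multimedia data
    send_bps: bit rate at which the data is actually multicast
    """
    if send_bps <= 0:
        raise ValueError("send_bps must be positive")
    return buffer_media_seconds * media_bps / send_bps


real_time = buffer_fill_seconds(5, 1_000_000, 1_000_000)    # 5.0 seconds
accelerated = buffer_fill_seconds(5, 1_000_000, 4_000_000)  # 1.25 seconds
```

  • Under these assumptions, sending at four times the media bit rate cuts the time to fill a five-second buffer from 5 seconds to 1.25 seconds.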
  • the multicasted unit of multimedia data includes a key frame.
  • the rate at which server 504 multicasts the unit need not be "faster than real time". This approach may be used in applications in which the client concurrently joins both the acceleration channel and another channel that multicasts multimedia data so that, in effect, the client receives the multimedia data at a rate that is "faster than real-time." If more multimedia data is to be multicasted, the operational flow returns to block 802, as represented in decision block 806. Thus, for example, using the above example of multimedia frames transported in datagrams, the next datagram would include the next frame of multimedia data, plus the frame added in the previous datagram, plus the previous (Z-1) frames.
  • each unit represents a sliding window of the current subunit (e.g., frame) and the previous Z frames, with Z selected to be large enough to ensure that each unit has enough information to minimize the time needed to allow the client's multimedia player to start rendering/playback of the multimedia presentation.
  • Z may be selected to ensure that each unit has a key frame.
  • units of video and audio data are multicasted in an alternating manner on the same channel if the multimedia presentation includes both audio and video streams.
  • separate acceleration channels may be used for audio and video streams.
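  • The sliding window described above can be sketched with a bounded queue. The generator below emits one unit per incoming subunit, each unit holding the current subunit plus up to Z previous subunits; the helper that locates the most recent key frame, and the use of strings to stand in for frames, are assumptions made for illustration.

```python
from collections import deque


def accelerated_units(subunits, z):
    """Yield units holding the current subunit plus up to z previous ones."""
    window = deque(maxlen=z + 1)  # the oldest subunit drops out automatically
    for subunit in subunits:
        window.append(subunit)
        yield list(window)


def first_decodable(unit, is_key_frame):
    """Return the subunits from the most recent key frame onward, if any."""
    for index in range(len(unit) - 1, -1, -1):
        if is_key_frame(unit[index]):
            return unit[index:]
    return []


# "K" marks a key frame and "P" a dependent frame in this toy stream.
frames = ["K0", "P1", "P2", "K3", "P4"]
units = list(accelerated_units(frames, 2))
```

  • With Z chosen at least as large as the key frame interval, every full unit contains a key frame, so first_decodable never returns an empty list once the window has filled.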
  • FIG. 17 illustrates client operational flow in receiving an accelerated stream, according to one embodiment.
  • a client (e.g., one of clients 506 1 -506 x of FIG. 9) joins the acceleration channel.
  • the acceleration channel is part of the preselected channel layout and the client can join it either concurrently or without joining the announcement channel.
  • the acceleration channel can be advantageously used by a client having relatively large resources for receiving and processing multimedia presentations so that the client may reduce start up latency.
  • the client receives one or more units of multimedia data from the acceleration channel.
  • each unit of multimedia data is generated as described above in conjunction with FIG. 16.
  • the client can then process each unit of multimedia data to relatively quickly begin the rendering or playback process.
  • the client receives a unit of video data and a unit of audio data, with the video data containing a key frame so that the client can begin the rendering/playback process as soon as possible.
  • a unit need not have a key frame in other embodiments.
  • the client can then join a non-accelerated channel such as high quality video channel 536 and high quality audio channel 538.
  • the non-accelerated channels that the client joins are preselected using the above-described preselected channel layout.
  • the client joins channel(s) based on the presentation description information contained in the announcement stream.
  • the client quits the acceleration channel.
  • the client quits the acceleration channel immediately after receiving the unit or units of multimedia data needed to begin the rendering/playback process or processes.
  • blocks 902 and 906 are performed in parallel so that the operational flow is that the client joins accelerated and non-accelerated channels concurrently.
  • Block 904 is performed sequentially after block 902, with blocks 904 and 906 proceeding to block 908.
  • FIG. 17A illustrates an example scenario in which a client may join the acceleration channel and some preselected channels, and then join other channels (e.g., based on announcement information received from the announcement channel).
  • the client joins the acceleration channel (i.e., block 902) concurrently with joining one or more preselected non-accelerated channels (i.e., block 906).
  • the client receives one or more units of multimedia data from the acceleration channel (i.e., block 904) as well as multimedia and announcement data from the non-accelerated channel(s).
  • the client may decide to quit the preselected channel(s) and join other non-accelerated channels (i.e., blocks 572, 564 and 574).
  • the various multicasting embodiments described above may be implemented in computer environments of the server and clients.
  • An example computer environment suitable for use in the server and clients is described below in conjunction with FIG. 18.
  • FIG. 18 illustrates a general computer environment 1000, which can be used to implement the techniques described herein.
  • the computer environment 1000 is only one example of a computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the computer and network architectures.
  • Computer environment 1000 includes a general-purpose computing device in the form of a computer 1002.
  • the components of computer 1002 can include, but are not limited to, one or more processors or processing units 1004, system memory 1006, and system bus 1008 that couples various system components including processor 1004 to system memory 1006.
  • System bus 1008 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
  • such architectures can include an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, a Peripheral Component Interconnects (PCI) bus also known as a Mezzanine bus, a PCI Express bus, a Universal Serial Bus (USB), a Secure Digital (SD) bus, or an IEEE 1394, i.e., FireWire, bus.
  • Computer 1002 may include a variety of computer readable media. Such media can be any available media that is accessible by computer 1002 and includes both volatile and non-volatile media, removable and non-removable media.
  • System memory 1006 includes computer readable media in the form of volatile memory, such as random access memory (RAM) 1010; and/or non-volatile memory, such as read only memory (ROM) 1012 or flash RAM.
  • BIOS Basic input/output system
  • RAM 1010 typically contains data and/or program modules that are immediately accessible to and/or presently operated on by processing unit 1004.
  • Computer 1002 may also include other removable/non-removable, volatile/non-volatile computer storage media.
  • Hard disk drive 1016 for reading from and writing to a non-removable, non-volatile magnetic media (not shown), magnetic disk drive 1018 for reading from and writing to removable, non-volatile magnetic disk 1020 (e.g., a "floppy disk"), and optical disk drive 1022 for reading from and/or writing to a removable, non-volatile optical disk 1024 such as a CD-ROM, DVD-ROM, or other optical media.
  • Hard disk drive 1016, magnetic disk drive 1018, and optical disk drive 1022 are each connected to system bus 1008 by one or more data media interfaces 1025.
  • hard disk drive 1016, magnetic disk drive 1018, and optical disk drive 1022 can be connected to the system bus 1008 by one or more interfaces (not shown).
  • the disk drives and their associated computer-readable media provide non-volatile storage of computer readable instructions, data structures, program modules, and other data for computer 1002.
  • a hard disk 1016, removable magnetic disk 1020, and removable optical disk 1024 it is appreciated that other types of computer readable media which can store data that is accessible by a computer, such as magnetic cassettes or other magnetic storage devices, flash memory cards, CD-ROM, digital versatile disks (DVD) or other optical storage, random access memories (RAM), read only memories (ROM), electrically erasable programmable read-only memory (EEPROM), and the like, can also be utilized to implement the example computing system and environment.
  • Any number of program modules can be stored on hard disk 1016, magnetic disk 1020, optical disk 1024, ROM 1012, and/or RAM 1010, including by way of example, operating system 1026, one or more application programs 1028, other program modules 1030, and program data 1032. Each of such operating system 1026, one or more application programs 1028, other program modules 1030, and program data 1032 (or some combination thereof) may implement all or part of the resident components that support the distributed file system.
  • a user can enter commands and information into computer 1002 via input devices such as keyboard 1034 and a pointing device 1036 (e.g., a "mouse").
  • Other input devices 1038 may include a microphone, joystick, game pad, satellite dish, serial port, scanner, and/or the like. These and other input devices are connected to processing unit 1004 via input/output interfaces 1040 that are coupled to system bus 1008, but may be connected by other interface and bus structures, such as a parallel port, game port, or a universal serial bus (USB). Monitor 1042 or other type of display device can also be connected to the system bus 1008 via an interface, such as video adapter 1044. In addition to monitor 1042, other output peripheral devices can include components such as speakers (not shown) and printer 1046, which can be connected to computer 1002 via I/O interfaces 1040.
  • Computer 1002 can operate in a networked environment using logical connections to one or more remote computers, such as remote computing device 1048.
  • remote computing device 1048 can be a PC, portable computer, a server, a router, a network computer, a peer device or other common network node, and the like.
  • Remote computing device 1048 is illustrated as a portable computer that can include many or all of the elements and features described herein relative to computer 1002.
  • computer 1002 can operate in a non-networked environment as well.
  • Logical connections between computer 1002 and remote computer 1048 are depicted as a local area network (LAN) 1050 and a general wide area network (WAN) 1052.
  • Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.
  • When implemented in a LAN networking environment, computer 1002 is connected to local network 1050 via network interface or adapter 1054.
  • When implemented in a WAN networking environment, computer 1002 typically includes modem 1056 or other means for establishing communications over wide network 1052.
  • Modem 1056, which can be internal or external to computer 1002, can be connected to system bus 1008 via I/O interfaces 1040 or other appropriate mechanisms. It is to be appreciated that the illustrated network connections are examples and that other means of establishing at least one communication link between computers 1002 and 1048 can be employed.
  • program modules depicted relative to computer 1002, or portions thereof, may be stored in a remote memory storage device.
  • remote application programs 1058 reside on a memory device of remote computer 1048.
  • applications or programs and other executable program components such as the operating system are illustrated herein as discrete blocks, although it is recognized that such programs and components reside at various times in different storage components of computing device 1002, and are executed by at least one data processor of the computer.
  • Various modules and techniques may be described herein in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices.
  • program modules include routines, programs, objects, components, data structures, etc.
  • Computer readable media can be any available media that can be accessed by a computer.
  • Computer readable media may comprise “computer storage media” and “communications media.”
  • “Computer storage media” includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
  • Communication media typically embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as carrier wave or other transport mechanism. Communication media also includes any information delivery media.
  • modulated data signal means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above are also included within the scope of computer readable media.
  • the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Primary Health Care (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Strategic Management (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Computer And Data Communications (AREA)

Abstract

Embedded within at least some Real-Time Control Protocol (RTCP) messages sent from a media content source to a recipient is a session description message (186) that describes a media presentation (186) being streamed to the recipient (102). The session description message (186) can be associated, for example, with one of a plurality of pieces of media content in a play list of media content (186) being streamed from the device (104) to the recipient (102). In accordance with certain aspects, an RTCP message that embeds a session description message (186) includes at least three fields: a first field containing data identifying the RTCP message as being a type that embeds a session description message (186); a second field containing data that is the session description message (186) for a media presentation; and a third field containing data identifying a length of the RTCP message, generated by summing the length of the first, second, and third fields.

Description

EMBEDDING A SESSION DESCRIPTION MESSAGE IN A REAL-TIME CONTROL PROTOCOL (RTCP) MESSAGE
TECHNICAL FIELD
This invention relates to streaming media and data transfers, and particularly to embedding a session description message in an RTCP message.
BACKGROUND OF THE INVENTION
Content streaming, such as the streaming of audio, video, and/or text, is becoming increasingly popular. The term "streaming" is typically used to indicate that the data representing the content is provided over a network to a client computer on an as-needed basis rather than being pre-delivered in its entirety before playback. Thus, the client computer renders streaming content as it is received from a network server, rather than waiting for an entire "file" to be delivered.
The widespread availability of streaming multimedia content enables a variety of informational content that was not previously available over the Internet or other computer networks. Live content is one significant example of such content. Using streaming multimedia, audio, video, or audio/visual coverage of noteworthy events can be broadcast over the Internet as the events unfold. Similarly, television and radio stations can transmit their live content over the Internet.
The Session Description Protocol (SDP), Network Working Group Request for Comments (RFC) 2327, is a text-based format used to describe properties of a multimedia presentation, referred to as a "session", and properties of one or more media streams contained within the presentation. SDP has been developed as an application level protocol intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. SDP can be used in accordance with other protocols, such as the Real-Time Streaming Protocol (RTSP) or the HyperText Transfer Protocol (HTTP), to describe and/or negotiate properties of a multimedia session used for delivery of streaming data.
However, in many situations it is difficult to get the SDP information from the network server to the client computer. For example, the network server may be streaming a series of multimedia presentations to the client computer, such as presentations listed in a play list.
When the network server switches from streaming one presentation to the next, it is oftentimes difficult for the SDP information for the next presentation to be made available to the client computer. Thus, it would be beneficial to have an additional mechanism by which the SDP information can be made available to the client computer.
Content streaming may also be multicast. Conventional approaches to multicasting of streaming content typically involve providing content to be multicast to a server, which then multicasts the content over a network (i.e., without feedback from the clients receiving the streams). The server typically multicasts the content in several streams having different formats (e.g., bit rates, languages, encoding schemes, etc.). Clients attached to the network can then receive the stream(s) appropriate for their resources. To allow clients to select which stream(s) to receive, one multicast approach requires that the server provide a file that provides "multicast information" that allows clients to open streams of content. Maintaining and publishing this file is typically a manual process that has a relatively high administrative cost. Further, if not properly maintained and published, clients may encounter problems, which can lead to customer dissatisfaction. Another problem with this approach is that clients must keep their "multicast information" up-to-date so that they can properly access the content. This problem is exacerbated for clients that do not have a suitable back channel to request updates (e.g., clients with unidirectional satellite links).
SUMMARY OF THE INVENTION
Embedding a session description message in a Real-Time Control Protocol (RTCP) message is discussed herein. Embedded within at least some of the RTCP messages sent from a media content source to a recipient is a session description message that describes the media presentation being streamed to the recipient. In accordance with certain aspects, an RTCP message that embeds a session description message includes at least three fields. The first field contains data identifying the RTCP message as being a type that embeds a session description message. The second field contains data that is the session description message for a media presentation. The third field contains data identifying a length of the RTCP message, generated by summing the length of the first field, the length of the second field, and the length of the third field. In accordance with other aspects, the RTCP message is created at a device, such as a server device. The session description message embedded within the RTCP message is associated with one of a plurality of pieces of media content in a play list of media content being streamed from the device to the recipient. In accordance with other aspects, multimedia presentations are multicast using an announcement channel that includes presentation description information along with multiple channels for multiple streams of multimedia data to accommodate clients of different multimedia resources. Clients can use the announcement channel to select channel(s) appropriate for their multimedia resources. In accordance with other aspects, the channels are created in a predetermined manner (e.g., preselected logical addresses, preselected ports of an IP address, etc.) so that clients can immediately join a channel without (or concurrently with) joining the announcement channel to reduce startup latency. 
In accordance with other aspects, an acceleration channel may be created that provides blocks of data containing the current unit of the multimedia presentation along with a preselected number of previous units at a bit rate that is "faster than real-time" (i.e., at a rate that is faster than the bit-rate of the multimedia streams). This feature allows clients with suitable resources to more quickly buffer sufficient data to begin presenting the multimedia data to users. Alternatively, the acceleration channel need not be "faster than real time" so that a client may concurrently join both the acceleration channel and another channel that multicasts multimedia data so that, in effect, the client receives the multimedia data at a rate that is "faster than real-time."
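The effect of a "faster than real-time" acceleration channel on startup latency can be illustrated with simple arithmetic. The following Python sketch is only an illustration under stated assumptions: the function name, the bit rates, and the buffer size are hypothetical, and the model ignores packet loss, protocol overhead, and any overlap between the data carried on the two channels.

```python
def startup_delay_seconds(buffer_seconds, stream_kbps, receive_kbps):
    """Time needed to receive `buffer_seconds` of content.

    buffer_seconds: amount of content (in playback seconds) the client
        wants buffered before it starts presenting (hypothetical value).
    stream_kbps: bit rate of the multimedia stream itself.
    receive_kbps: total rate at which the client receives stream data.
    """
    if receive_kbps <= 0:
        raise ValueError("receive_kbps must be positive")
    return buffer_seconds * stream_kbps / receive_kbps


if __name__ == "__main__":
    # Real-time channel only: receive rate equals the stream bit rate.
    print(startup_delay_seconds(5, 300, 300))
    # Acceleration channel at twice the stream bit rate.
    print(startup_delay_seconds(5, 300, 600))
    # Acceleration channel at real-time rate joined concurrently with a
    # regular channel: modeled here as the sum of the two receive rates.
    print(startup_delay_seconds(5, 300, 300 + 300))
```

Under this simplified model, doubling the receive rate halves the time needed to fill the buffer, whether the extra rate comes from a faster acceleration channel or from joining two channels at once.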
BRIEF DESCRIPTION OF THE DRAWINGS
The same numbers are used throughout the document to reference like components and/or features.
FIG. 1 illustrates an example network environment that can be used to stream media using the session description message embedded in an RTCP message as described herein.
FIG. 2 illustrates example client and server devices that can stream media content using the session description message embedded in an RTCP message as described herein.
FIG. 3 illustrates example client and server devices in a multicast environment that can stream media content using the session description message embedded in an RTCP message as described herein.
FIG. 4 illustrates example client and server devices in a server-side play list environment that can stream media content using the session description message embedded in an RTCP message as described herein.
FIG. 5 illustrates an example format of an RTCP message having an embedded session description message.
FIG. 6 illustrates an example session description message format.
FIG. 7 is a flowchart illustrating an example process for embedding session description messages in an RTCP message when using a play list.
FIG. 8 is a flowchart illustrating an example process for receiving session description messages in an RTCP message when using a play list.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Embedding a session description message in a Real-Time Control Protocol (RTCP) message is discussed herein. A multimedia or single media presentation is streamed from a media content source, such as a server device, to a recipient, such as a client device, using Real-Time Transport Protocol (RTP) packets. Control information regarding the presentation being streamed is also sent from the media content source to the recipient using RTCP messages. Embedded within at least some of the RTCP messages is a session description message that describes the presentation being streamed. In the discussions herein, reference is made to multimedia presentations being streamed from a media content source to a recipient. The media content source can be any source of media content, an example of which is a server device. A recipient can be any recipient of media content, an example of which is a client device. Additionally, it is to be appreciated that although the discussion herein may refer to multimedia presentations being streamed, single media presentations may also be streamed in the same manner as discussed herein regarding multimedia presentations. FIG. 1 illustrates an example network environment 100 that can be used to stream media using the session description message embedded in an RTCP message as described herein. In environment 100, multiple (a) client computing devices 102(1), 102(2), . . ., 102(a) are coupled to multiple (b) server computing devices 104(1), 104(2), . . ., 104(b) via a network 106. Network 106 is intended to represent any of a variety of conventional network topologies and types (including wired and/or wireless networks), employing any of a variety of conventional network protocols (including public and/or proprietary protocols). Network 106 may include, for example, the Internet as well as possibly at least portions of one or more local area networks (LANs). 
Computing devices 102 and 104 can each be any of a variety of conventional computing devices, including desktop PCs, workstations, mainframe computers, Internet appliances, gaming consoles, handheld PCs, cellular telephones, personal digital assistants (PDAs), etc. One or more of devices 102 and 104 can be the same types of devices, or alternatively different types of devices. Server devices 104 can make any of a variety of data available for streaming to clients 102. The term "streaming" is used to indicate that the data representing the media is provided over a network to a client device and that playback of the content can begin prior to the content being delivered in its entirety (e.g., providing the data on an as-needed basis rather than pre-delivering the data in its entirety before playback). The data may be publicly available or alternatively restricted (e.g., restricted to only certain users, available only if the appropriate fee is paid, etc.). The data may be any of a variety of one or more types of content, such as audio, video, text, animation, etc. Additionally, the data may be pre-recorded or alternatively "live" (e.g., a digital representation of a concert being captured as the concert is performed and made available for streaming shortly after capture). A client device 102 may receive streaming media from a server 104 that stores the streaming media content as a file, or alternatively from a server 104 that receives the streaming media from some other source. For example, server 104 may receive the streaming media from another server that stores the streaming media content as a file, or may receive the streaming media from some other source (e.g., an encoder that is encoding a "live" event). As used herein, streaming media refers to streaming one or more media streams from one device to another (e.g., from a server device 104 to a client device 102). 
The media streams can include any of a variety of types of content, such as one or more of audio, video, text, and so forth. FIG. 2 illustrates example client and server devices that can stream media content using the session description message embedded in an RTCP message as described herein. Multiple different protocols are typically followed at both client device 102 and server device 104 in order to stream media content from server device 104 to client device 102. These different protocols can be responsible for different aspects of the streaming process. Although not shown in FIG. 2, one or more additional devices (e.g., firewalls, routers, gateways, bridges, etc.) may be situated between client device 102 and server device 104. In the example of FIG. 2, an application level protocol 150, a transport protocol 152, and one or more delivery channel protocols 154 are used as part of the streaming process. Additional protocols not shown in FIG. 2 may also be employed (e.g., there may be an additional protocol(s) between application level protocol 150 and transport protocol 152). Application level protocol 150 is a protocol at the application level for control of the delivery of data with real-time properties. Application level protocol 150 provides a framework, optionally extensible, to enable controlled, on-demand delivery of real-time data, such as streaming audio and video content. Application level protocol 150 is a control protocol for initiating and directing delivery of streaming multimedia from media servers. Examples of application level protocol 150 include the Real-Time Streaming Protocol (RTSP) as described in Network Working Group Request for Comments (RFC) 2326, April 1998, and the HyperText Transfer Protocol (HTTP) as described in Network Working Group Request for Comments (RFC) 1945, May 1996 or Network Working Group Request for Comments (RFC) 2068, January 1997.
Application level protocol 150 uses transport protocol 152 for the delivery of real-time data, such as streaming audio and video. Transport protocol 152 defines a packet format for media streams. Transport protocol 152 provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. Examples of transport protocol 152 include the Real-Time Transport Protocol (RTP) and the Real-Time Control Protocol (RTCP) as described in Network Working Group Request for Comments (RFC) 3550, July 2003. Other versions, such as future draft or standardized versions, of RTP and RTCP may also be used. RTP does not address resource reservation and does not guarantee quality-of-service for real-time services. The data transport is augmented by a control protocol (RTCP) to allow monitoring of the data delivery in a manner scalable to large multicast networks, and to provide some control and identification functionality. The RTCP protocol groups one or more control messages together into a unit referred to as an RTCP packet. Embedded within one or more of the RTCP packets is a control message that includes a session description message. The session description message describes properties of the multimedia presentation being streamed from server device 104 to client device 102. The streaming media from server device 104 to client device 102 thus includes the session description message. The transport protocol 152 uses delivery channel protocol(s) 154 for the transport connections. Delivery channel protocol(s) 154 include one or more channels for transporting packets of data from server device 104 to client device 102. Each channel is typically used to send data packets for a single media stream, although in alternate embodiments a single channel may be used to send data packets for multiple media streams.
Examples of delivery channel protocols 154 include Transmission Control Protocol (TCP) packets and User Datagram Protocol (UDP) packets. TCP ensures the delivery of data packets, whereas UDP does not ensure the delivery of data packets. Typically, delivery of data packets using TCP is more reliable, but also more time-consuming, than delivery of data packets using UDP. FIG. 3 illustrates example client and server devices in a multicast environment that can stream media content using the session description message embedded in an RTCP message as described herein. In certain embodiments, the protocols 150, 152, and 154 of FIG. 2 are included in the client and server devices of FIG. 3, but have not been illustrated. Furthermore, although not shown in FIG. 3, one or more additional devices (e.g., firewalls, routers, gateways, bridges, etc.) may be situated between client device 102 and server device 104. A streaming module 182 of server device 104 streams the same multimedia presentation to each of multiple (x) client devices 102(1), 102(2), . . ., 102(x). Each client device 102 has a streaming media player 184 that receives the streamed multimedia presentation and processes the received stream at the client device 102, typically playing back the multimedia presentation at the client device 102. The same data is streamed to each client device 102 at approximately the same time, allowing server device 104 to stream only one occurrence of the same multimedia presentation at a time, with the various client devices 102 listening in to this one occurrence being streamed. The streaming media 186 includes RTCP messages having one or more session description messages embedded therein. 
The same session description message may be broadcast multiple times during the streaming of the multimedia presentation, thereby allowing new client devices 102 to listen in to the streaming media after streaming has begun but still receive a session description message describing the multimedia presentation. By embedding the session description messages within the RTCP messages of the streaming media 186, client devices 102 do not need to listen in to a separate stream or broadcast, potentially from a device other than server device 104, to receive the session description messages. Co-pending Application No. 10/693,430, filed October 24, 2003, entitled "Methods and Systems for Self-Describing Multicasting of Multimedia Presentations", which is hereby incorporated by reference, describes an example of such a multicasting environment. FIG. 4 illustrates example client and server devices in a server-side play list environment that can stream media content using the session description message embedded in an RTCP message as described herein. In certain embodiments, the protocols 150, 152, and 154 of FIG. 2 are included in the client and server devices of FIG. 4, but have not been illustrated. Furthermore, although not shown in FIG. 4, one or more additional devices (e.g., firewalls, routers, gateways, bridges, etc.) may be situated between client device 102 and server device 104. A streaming module 202 of server device 104 streams a multimedia presentation as streaming media 204 to a streaming media player 206 of client device 102. Streaming media player 206 receives the streamed multimedia presentation and processes the received stream at the client device 102, typically playing back the multimedia presentation at the client device 102. Server device 104 includes a play list 208 that identifies multiple (y) pieces of media content 210(1), 210(2), . . ., 210(y).
In certain implementations, a play list 208 includes multiple entries, each entry identifying one of the multiple pieces of media content 210. Alternatively, play list 208 may identify a single piece of media content, although in such situations the single piece of media content could simply be referenced by itself rather than through the use of a play list. A client device 102 is able to select a single resource for playback, that resource identifying play list 208. Streaming module 202 accesses the identified play list 208, and then accesses the individual pieces of media content 210 and streams those pieces 210 to client device 102. Thus, the client device 102 is able to access a single resource, yet have multiple different pieces of media content streamed from server device 104. As the play list 208 is accessed by and referred to by server device 104 to identify the pieces of media content, rather than by client device 102, the play list 208 can also be referred to as a server-side play list. Each piece of media content 210 includes one or more media streams. Different pieces of media content 210 can include different numbers of media streams. Each piece of media content 210 is typically a multimedia presentation. The manner in which a "piece" of content is defined can vary by implementation and based on the type of media. For example, for musical audio and/or video content each song can be a piece of content. Content may be separated into pieces along natural boundaries (e.g., different songs), or alternatively in other arbitrary manners (e.g., every five minutes of content is a piece). For stored content, different pieces of content can be stored as multiple files or alternatively as the same file. Although illustrated as two separate drawings in FIGS. 3 and 4, it is to be appreciated that pieces of media content referenced by a server-side play list as illustrated in FIG. 4 can be multicast as illustrated in FIG. 3. Referring to FIGS. 
2, 3, and 4, at the transport level the data to be streamed from server device 104 to client device 102 is embedded in RTP packets. Control information related to the data being streamed and the RTP packets is embedded in one or more control messages within an RTCP packet. Typically, an RTCP packet consists of several messages of different types. The first message in the RTCP packet is either a Receiver Report or a Sender Report. The second message is an SDES (Source Description) message. The SDES message contains one or more textual meta-data items. The SDES message contains a CNAME (canonical name) item. The CNAME item is a persistent transport-level identifier of the media content source and provides a mapping between the RTP synchronization source (SSRC) number and a textual string. The SSRC is a source of a stream of RTP (and RTCP) packets. The CNAME is used so that a sender or receiver that is participating in multiple RTP sessions that belong to the same presentation may use different SSRC values in each RTP session, but keep the CNAME value the same. An additional type of message that can be included in an RTCP packet is a control message having embedded therein a session description message. The session description message describes properties of the multimedia presentation being streamed from server device 104 to client device 102. Different media formats or protocols can be used for such session description messages. An example of such a media format is the Session Description Protocol (SDP), Network Working Group Request for Comments (RFC) 2327, April 1998. In certain embodiments, the session description message discussed herein is a message in accordance with the SDP format described in RFC 2327. Although different formats can be used to describe properties of the multimedia presentation, one or more session description messages are sent from server device 104 to client device 102 that include identifier(s) of the properties.
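To make the SDP format mentioned above more concrete, the following Python sketch splits an SDP text into its session-level lines and its per-media sections. It is a minimal illustration rather than a complete RFC 2327 parser: the function name `parse_sdp`, the constant `EXAMPLE_SDP`, and every value in the sample description (addresses, ports, bit rates) are hypothetical and are not taken from this document.

```python
# Hypothetical SDP text describing one audio and one video stream.
EXAMPLE_SDP = """\
v=0
o=- 0 0 IN IP4 192.0.2.1
s=Example presentation
c=IN IP4 192.0.2.1
t=0 0
m=audio 5004 RTP/AVP 0
b=AS:64
m=video 5006 RTP/AVP 31
b=AS:300
"""


def parse_sdp(text):
    """Split SDP text into (session_lines, media_sections).

    Each SDP line has the form <type>=<value>, where <type> is a single
    character. Lines before the first "m=" line describe the session;
    each "m=" line starts a new media section that owns the lines after it.
    """
    session = []
    media = []
    current = session
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if len(line) < 2 or line[1] != "=":
            raise ValueError(f"not an SDP line: {line!r}")
        key, value = line[0], line[2:]
        if key == "m":
            # Start a new media section headed by its "m=" line.
            current = [(key, value)]
            media.append(current)
        else:
            current.append((key, value))
    return session, media


if __name__ == "__main__":
    session, media = parse_sdp(EXAMPLE_SDP)
    print("session-level lines:", session)
    for section in media:
        print("media section:", section)
```

A recipient that extracts a session description message from an RTCP control message could apply this kind of split to find the description of each media stream in the presentation.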
A single session description message may be sent by server device 104 for a particular multimedia presentation, or alternatively multiple session description messages may be sent. If multiple session description messages are sent, the multiple messages may include the same information, different information, or overlapping information. A session description message includes, for example, one or more of: an identification of various channels used to multicast the multimedia presentation; descriptions of each media stream available in the multimedia presentation (e.g., indicating the type of stream (e.g., video or audio), a bit-rate of each media stream, a language used in the stream, etc.); error correction information; security/authentication information; encryption information; or digital rights management (DRM) information; etc. It should be noted that in certain situations a session description message can be separated or fragmented across multiple RTCP control messages. Such situations can arise, for example, when the session description message is very large. Each of these RTCP control messages is included in a different RTCP packet, and each contains a portion or fragment of the entire session description message. Client device 102, upon receiving all of the portions or fragments, can combine them together to recreate the session description message. FIG. 5 illustrates an example format of an RTCP control message 250 having an embedded session description message. RTCP message 250 is discussed below as including multiple fields (also referred to as portions), each storing various data. It is to be appreciated that these fields can be arranged in different orders than the order in which they are discussed below and shown in FIG. 5. 
Additionally, although sizes or lengths of these fields (e.g., in bits) are discussed below, it is to be appreciated that these are only examples and the fields may alternatively be larger or smaller than these example sizes or lengths. In certain embodiments, RTCP message 250 includes all of the fields shown in FIG. 5. In alternate embodiments, RTCP message 250 includes fewer than all of the fields shown in FIG. 5, or may include additional fields not shown in FIG. 5. The fields of RTCP message 250 can be viewed as being grouped into three groups: a header 290, an RTP-State block 292, and the session description message 284. Header 290 includes various information about RTCP message 250. RTP-State block 292 is optional, and when included is used to identify RTP-specific information about a stream of the multimedia presentation that is described in the session description message (e.g., to specify the SSRC and initial RTP sequence number of a stream in the session description message). Typically, one RTP-State block 292 is associated with and included in RTCP message 250 for each media stream in the multimedia presentation. Session description message 284 is the session description message embedded within RTCP message 250.
V (version) field 252 is a 2-bit field that identifies the version of RTP being used, which is the same in RTCP packets as in RTP packets. For example, the version defined by RFC 3550 is 2.
P (padding) field 254 is a single bit that, when set (e.g., to a value of 1), indicates that RTCP message 250 contains some additional padding at the end which is not part of the control information. This padding is included in the length field 262, but otherwise should be ignored. The amount of padding is included within the padding itself. In certain implementations, the additional padding is in octets, and the last octet of the padding is a count of how many padding octets are included (including itself) and thus should be ignored. 
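The padding rule for P field 254 can be illustrated with a minimal Python sketch. The function name `strip_rtcp_padding` and its arguments are hypothetical and do not appear in the specification. The sketch assumes the implementation in which padding is counted in octets and the last octet holds the count, itself included.

```python
def strip_rtcp_padding(message: bytes, padding_flag: bool) -> bytes:
    """Return the RTCP message without its trailing padding octets.

    `padding_flag` stands for the P bit (field 254); the caller is assumed
    to have extracted it from the header already.
    """
    if not padding_flag:
        # P bit not set: nothing to remove.
        return message
    if not message:
        raise ValueError("padding flag set on an empty message")
    # The last octet counts all padding octets, including itself.
    pad_count = message[-1]
    if pad_count == 0 or pad_count > len(message):
        raise ValueError(f"invalid padding count {pad_count}")
    return message[:-pad_count]
```

A receiver written this way would apply the function before interpreting the remaining octets, since the padding is counted in length field 262 but carries no control information.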
C (compression) field 256 is a single bit that, when set (e.g., has a value of 1), indicates that the data in SDP data field 284 has been compressed. Different types of compression can be used, such as using Zlib compression as discussed in ZLIB Compressed Data Format Specification version 3.3, Network Working Group Request for Comments (RFC) 1950, May 1996.
Res (reserved) field 258 is a 4-bit reserved field. In certain implementations, Res field 258 should be set to zero.
PT (payload type) header field 260 is a 7-bit field set to a value (e.g., 141) to indicate that RTCP message 250 embeds a session description message.
Length field 262 is a 16-bit field that identifies the length of RTCP message 250. This length can be generated by summing the lengths of the various fields in RTCP message 250, including any headers and any padding. In certain implementations, the length is identified in 32-bit quantities minus one.
SDPMsgHash (SDP message hash) field 264 is a 16-bit field used to identify the session description message included in RTCP message 250 and an address
(e.g., IP address) of the sender (e.g., server device 104). In certain implementations, the identifier in field 264 is calculated as a check-sum over the session description message and the address, so that if either changes, the value of the identifier in field 264 is also changed. In certain implementations, the value of SDPMsgHash field 264 is calculated in the same manner as the "msg id hash" field described in the Session Announcement Protocol (SAP), Network Working Group Request for Comments (RFC) 2974, October 2000. If the session description message is fragmented across multiple RTCP messages, as discussed below, the value of SDPMsgHash field 264 of each fragment should be identical. F (more fragments) field 266 is a single bit that, when set (e.g., has a value of 1), indicates that the session description message has been fragmented into multiple RTCP messages, and that the current RTCP message does not contain the last fragment of the session description message. If F field 266 is not set (e.g., has a value of 0), then the session description message has not been fragmented (the complete session description message is included in RTCP message 250), or the session description message has been fragmented and RTCP message 250 contains the last fragment of the session description message. FragSeqNum (fragment sequence number) field 268 is a 15-bit field used to identify different fragments of a session description message. The fragments of a session description message are assigned identifiers in some manner known to both server device 104 and client device 102. For example, the identifiers may be assigned sequentially starting with the value of 0, so the first fragment has a value 0, the second a value 1, the third a value 2, and so forth. If RTCP message 250 does not contain a fragment of a session description message (i.e., RTCP message 250 contains a complete session description message), then FragSeqNum field 268 should be set to 0. 
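The interplay of SDPMsgHash field 264, F field 266, and FragSeqNum field 268 can be sketched as a small reassembly routine. This is an illustration only: the class and function names are hypothetical, fragments are assumed to be numbered sequentially from 0 (the example numbering given above), and the F bit is assumed to be the most significant bit of the 16-bit word it shares with FragSeqNum.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


def pack_fragment_word(more_fragments: bool, frag_seq_num: int) -> int:
    """Combine the F bit and the 15-bit FragSeqNum into one 16-bit value."""
    if not 0 <= frag_seq_num < (1 << 15):
        raise ValueError("FragSeqNum must fit in 15 bits")
    return (int(more_fragments) << 15) | frag_seq_num


def unpack_fragment_word(word: int) -> Tuple[bool, int]:
    """Split a 16-bit value into (F bit, FragSeqNum)."""
    return bool((word >> 15) & 1), word & 0x7FFF


@dataclass
class FragmentBuffer:
    """Collects the fragments that share one SDPMsgHash value."""
    fragments: Dict[int, bytes] = field(default_factory=dict)
    last_seq: Optional[int] = None  # FragSeqNum of the fragment with F == 0

    def add(self, more_fragments: bool, frag_seq_num: int,
            payload: bytes) -> Optional[bytes]:
        self.fragments[frag_seq_num] = payload
        if not more_fragments:
            # F not set: this is the last (or the only) fragment.
            self.last_seq = frag_seq_num
        if self.last_seq is None:
            return None
        wanted = range(self.last_seq + 1)
        if any(i not in self.fragments for i in wanted):
            return None  # still waiting for earlier fragments
        return b"".join(self.fragments[i] for i in wanted)


class Reassembler:
    """Rebuilds session description messages from RTCP fragments."""

    def __init__(self) -> None:
        self._buffers: Dict[int, FragmentBuffer] = {}

    def receive(self, sdp_msg_hash: int, more_fragments: bool,
                frag_seq_num: int, payload: bytes) -> Optional[bytes]:
        buf = self._buffers.setdefault(sdp_msg_hash, FragmentBuffer())
        complete = buf.add(more_fragments, frag_seq_num, payload)
        if complete is not None:
            del self._buffers[sdp_msg_hash]
        return complete
```

An unfragmented message (F not set, FragSeqNum 0) passes straight through `receive`, while fragments that arrive out of order are held until every sequence number up to the last fragment is present.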
NumRtpState (number RTP state) field 270 is a 16-bit field used to specify the number of RTP-State blocks contained in RTCP message 250. Each RTP-State block is 14 bytes in size. The "NumRtpState" field is set to 0 when no RTP-State blocks are present. In the illustrated example of RTCP message 250, there is one RTP-State block 292. If there are multiple RTP-State blocks, then fields 272, 274, 276, 278, 280, and 282 are included for each of the multiple RTP-State blocks. If a session description message is fragmented into multiple RTCP messages 250, then only the RTCP message 250 containing the first fragment of the session description message should contain RTP-State block(s).
A field 272 is a 1-bit field that is not set (e.g., has a value of 0) if PT field 274 contains a valid RTP Payload Type number. If A field 272 is not set, the information in RTP-State block 292 only applies to the RTP Payload Type number identified in PT field 274 and the SDP Flow ID identified in Flow ID field 276. If A field 272 is set (e.g., has a value of 1), then PT field 274 should be ignored, and the RTP-State block 292 applies to all RTP packets for the SDP Flow ID identified in Flow ID field 276, irrespective of the RTP Payload Type used.
PT field 274 is a 7-bit field specifying the RTP Payload Type number for the information in RTP-State block 292. If A field 272 is set (e.g., has a value of 1), then PT field 274 is not used and should be set to 0.
Flow ID field 276 is a 24-bit field that identifies the SDP Flow ID to which the information in RTP-State block 292 refers. Each media stream is streamed over a different RTP session. These RTP sessions are assigned a number using the "a=mid:" attribute as described in the Grouping of Media Lines in the Session Description Protocol (SDP), Network Working Group Request for Comments (RFC) 3388, December 2002. Flow ID field 276 identifies a particular "m=" entry in the session description message, which is the same as the value for the "a=mid" attribute (in accordance with RFC 3388) of the "m=" entry.
SSRC (synchronization source) field 278 is a 32-bit field which specifies the
RTP SSRC field value used for the media stream which is identified by Flow ID field 276. If A field 272 is not set (e.g., has a value of 0), then SSRC field 278 only applies to RTP packets for this media stream that use the RTP Payload Type given by PT field 274. RtpTime (RTP time) field 280 is a 32-bit field that specifies the value of the RTP Timestamp field that an RTP packet would have, if that packet was sent at exactly the beginning of the media stream identified by Flow ID field 276. For example, if the timeline of the media presentation begins at time T, the value of RtpTime field 280 is the value of the RTP Timestamp field of a packet that would be sent at exactly time T, even if no such RTP packet actually exists for the media stream identified by Rtp-State block 292. RtpSeq (RTP sequence) field 282 is a 16-bit field that gives the value of the
RTP sequence number field of the first RTP packet that is sent for the media stream identified by Flow ID field 276. If A field 272 is not set (e.g., has a value of 0), then RtpSeq field 282 only applies to RTP packets for this media stream that use the RTP Payload Type given by PT field 274.
SDP data field 284 is the session description message embedded in RTCP message 250. In situations where the session description message is fragmented, SDP data field 284 contains only a portion of the session description message (e.g., a single fragment of the session description message). In certain implementations, the session description message is a complete SDP description in UTF-8 format.
FIG. 6 illustrates an example session description message format. Although illustrated as a specific example in FIG. 6, the session description message could have a format with fields or portions in different orders, or alternatively spread across different messages. Session description message 320 includes a session level description portion 322 and zero or more media level description portions 324. Session level description portion 322 includes one or more fields having data that applies to the whole session and all media streams that are part of the session. Each media level description portion 324, on the other hand, includes one or more fields having data that applies only to a single media stream. The data fields in media level description portion 324 describe properties for particular media streams. These properties may be in addition to properties described in session level description portion 322, or in place of properties described in session level description portion 322. For example, one or more properties in a particular media level description portion 324 may override, for the particular media stream associated with that particular media level description portion 324, properties identified in session level description portion 322. 
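The field widths given earlier for RTP-State block 292 (1-bit A, 7-bit PT, 24-bit Flow ID, 32-bit SSRC, 32-bit RtpTime, 16-bit RtpSeq) add up to the stated 14 bytes. The following Python sketch packs and unpacks such a block. It is not taken from the specification: the class name `RtpState` is hypothetical, the fields are assumed to appear in the order discussed, network byte order is assumed, and the A bit is assumed to be the most significant bit of the first octet.

```python
import struct
from dataclasses import dataclass

RTP_STATE_SIZE = 14  # bytes, as stated for each RTP-State block


@dataclass(frozen=True)
class RtpState:
    applies_to_all_payload_types: bool  # A field 272
    payload_type: int                   # PT field 274 (7 bits)
    flow_id: int                        # Flow ID field 276 (24 bits)
    ssrc: int                           # SSRC field 278 (32 bits)
    rtp_time: int                       # RtpTime field 280 (32 bits)
    rtp_seq: int                        # RtpSeq field 282 (16 bits)

    def pack(self) -> bytes:
        # When A is set, PT is not used and is written as 0.
        pt = 0 if self.applies_to_all_payload_types else self.payload_type
        if not 0 <= pt < (1 << 7):
            raise ValueError("PT must fit in 7 bits")
        if not 0 <= self.flow_id < (1 << 24):
            raise ValueError("Flow ID must fit in 24 bits")
        first_word = (
            (int(self.applies_to_all_payload_types) << 31)
            | (pt << 24)
            | self.flow_id
        )
        # "!" selects network byte order with no alignment padding:
        # 4 + 4 + 4 + 2 = 14 bytes.
        return struct.pack("!IIIH", first_word, self.ssrc,
                           self.rtp_time, self.rtp_seq)

    @classmethod
    def unpack(cls, data: bytes) -> "RtpState":
        if len(data) != RTP_STATE_SIZE:
            raise ValueError("an RTP-State block is exactly 14 bytes")
        first_word, ssrc, rtp_time, rtp_seq = struct.unpack("!IIIH", data)
        return cls(
            applies_to_all_payload_types=bool(first_word >> 31),
            payload_type=(first_word >> 24) & 0x7F,
            flow_id=first_word & 0xFFFFFF,
            ssrc=ssrc,
            rtp_time=rtp_time,
            rtp_seq=rtp_seq,
        )
```

A message carrying NumRtpState blocks would then hold NumRtpState consecutive 14-byte slices, each of which can be handed to `RtpState.unpack`.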
Session description message 320 and the structure of message 320 are discussed in additional detail below specifically with respect to SDP. It is to be appreciated that these specific structures are only examples, and that the session description message can take different forms. Session level description portion 322 begins with a particular field, referred to as the protocol version field. Similarly, media level description portions 324 each start with a particular field, referred to as a media name and transport address field. In certain embodiments, multiple fields of the same type may be included in a session description message (e.g., a single session description message may have two or more attribute fields). Table I below illustrates example fields that may be included in session level description portion 322. Table I includes a name for each example field, an abbreviation or type for each example field, and a brief discussion of each example field. In certain embodiments, the protocol version field, the owner/creator and session identifier field, the session name field, and the time description field are required whereas all other fields in Table I are optional.
Table I
Figure imgf000022_0001
Table II below illustrates the time description field in additional detail. Table II includes a name for each field in the time description field, an abbreviation or type for each field in the time description field, and a brief discussion of each field in the time description field. The time the session is active field is required whereas the zero or more repeat times field is optional.
Table II
Figure imgf000023_0001
Table III below illustrates example fields that may be included in a media level description portion 324. Table III includes a name for each example field, an abbreviation or type for each example field, and a brief discussion of each example field. In certain embodiments, the media announcement field is required whereas all other fields in Table III are optional.
Table III
Figure imgf000024_0001
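The division of an SDP session description into a session level portion and media level portions can be sketched in Python as follows. The sketch is illustrative and simplified. The function names and the sample description (addresses, ports, payload types) are hypothetical. The type letters `v=`, `o=`, `s=`, `t=`, `m=`, and `a=` and the `a=lang` attribute come from RFC 2327, and `a=mid` from RFC 3388. Attributes are collapsed into a dictionary, so repeated attributes of the same name are not preserved.

```python
from typing import Dict, List, Optional, Tuple

Field = Tuple[str, str]  # (type letter, value), e.g. ("m", "audio 5004 RTP/AVP 96")

EXAMPLE_SDP = """\
v=0
o=- 2890844526 2890842807 IN IP4 192.0.2.1
s=Example play list entry
t=0 0
a=lang:en
m=audio 5004 RTP/AVP 96
a=mid:1
m=video 5006 RTP/AVP 97
a=mid:2
a=lang:es
"""


def split_sdp(text: str) -> Tuple[List[Field], List[List[Field]]]:
    """Split an SDP description into session level and media level fields."""
    session: List[Field] = []
    media: List[List[Field]] = []
    current = session
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if sep != "=" or len(key) != 1:
            raise ValueError(f"not an SDP field: {line!r}")
        if key == "m":
            # Each media level portion starts with the media announcement.
            current = []
            media.append(current)
        current.append((key, value))
    if not session or session[0][0] != "v":
        raise ValueError("session level portion must start with v=")
    return session, media


def attributes(fields: List[Field]) -> Dict[str, str]:
    """Collect a= fields as name -> value (flag attributes map to '')."""
    result: Dict[str, str] = {}
    for key, value in fields:
        if key == "a":
            name, _, attr_value = value.partition(":")
            result[name] = attr_value
    return result


def effective_attributes(session: List[Field],
                         media_portion: List[Field]) -> Dict[str, str]:
    """Session level attributes, overridden by media level ones."""
    merged = attributes(session)
    merged.update(attributes(media_portion))
    return merged


def find_media_by_mid(media: List[List[Field]],
                      mid: str) -> Optional[List[Field]]:
    """Find the m= entry whose a=mid value matches (cf. Flow ID field 276)."""
    for portion in media:
        if attributes(portion).get("mid") == mid:
            return portion
    return None
```

In the sample description, the audio entry inherits the session level `lang` value, while the video entry replaces it with its own, which mirrors the override behavior described for media level description portions.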
FIG. 7 is a flowchart illustrating an example process 350 for embedding session description messages in an RTCP message when using a server-side play list. FIG. 7 shows acts performed by a media content source, such as a server device 104 (e.g., of FIGS. 1, 2, 3, or 4). Initially, the next piece of media content in the play list is identified (act 352). When playback of the pieces of media content begins, the next piece is the first piece identified in the play list. Additionally, each time the end of one piece of content is reached (e.g., the entire piece of content has been streamed to client device 102, even though play back of the piece at client device 102 has most likely not been completed yet), the next piece of media content is the piece that follows the piece whose end was reached. It should be noted that this next piece may be in the order defined by the play list, or the user may be able to navigate to a different piece within the play list (e.g., the user may be able to request that a particular piece in the play list be skipped or jumped over). Information describing the identified piece of media content is then obtained (act 354). This information can be obtained in one or more different manners. One manner in which this information can be obtained is retrieval from a file or record. In certain embodiments, at least some of the information is stored in a file or record associated with the identified piece of media content. This file or record is accessed in act 354 to retrieve the information stored therein. Another manner in which this information can be obtained is receipt from a human user. In certain embodiments, at least some of the information is received from a human user. These user inputs are used in act 354 as at least some of the information to be included in the session description message. Another manner in which this information can be obtained is automatic detection. 
In certain embodiments, at least some of the information can be obtained automatically by a computing device by analyzing the source of the identified piece of media content or the identified piece of media content itself. This automatically detected information is used in act 354 as at least some of the information to be included in the session description message. An RTCP message having a session description message that includes the obtained information is then created (act 356). In certain embodiments, this RTCP message is in the form of RTCP message 250 of FIG. 5 discussed above. The created RTCP message is then sent to the intended recipient of the next piece of media content (act 358). The intended recipient of the next piece of media content is the device to which the media content is being streamed (e.g., client device 102 of FIGS. 1, 2, 3, or 4). The created RTCP message is included in an RTCP packet that is included as part of the streaming media being streamed to the intended recipient. It should be noted that situations can arise where the number of media streams being streamed for two different pieces of media content identified in a play list are different. For example, the first piece of media content identified in a play list may have two streams (e.g., an audio stream and a video stream), while the second piece of media content identified in a play list may have three streams (e.g., an audio stream, a video stream, and a text subtitle stream). Additionally, when streaming media using UDP, each media stream is typically using a different UDP channel that is received at the recipient on a different UDP port. If the recipient only opened two ports for the first piece of media content (e.g., one port for the audio stream and one port for the video stream), there would be no port available for the recipient to receive the text subtitle stream of the second piece of media content. Such situations can be resolved in different manners. 
In certain embodiments, such situations are resolved by streaming the additional media stream(s) over an open HTTP connection using TCP. An indication is included in RTCP message 250 (e.g., as an additional RTP-State block 292 for each additional media stream) that the additional media stream(s) is being streamed in this manner. In other embodiments, such situations are resolved by having the recipient open one or two extra ports, often referred to as wildcard ports. Each of these wildcard ports can be used to receive any media stream that the server device sends to the recipient. An indication is included in RTCP message 250 (e.g., as an additional RTP-State block 292 for each additional media stream) of which of the wildcard ports the additional media stream(s) is being streamed to. In other embodiments, such situations are resolved by the server device sending the session description message to the recipient (e.g., in an RTCP message 250) that identifies all of the media streams available for the second piece of media content. The server device then waits for the recipient to select which of the media streams the recipient desires to receive. The recipient will make a selection (e.g., automatically or based on user input at the recipient), and send to the server device an indication of which media stream(s) were selected and which ports the selected media stream(s) are to be streamed to. FIG. 8 is a flowchart illustrating an example process 380 for receiving session description messages in an RTCP message when using a server-side play list. FIG. 8 shows acts performed by a recipient of streaming media, such as a client device 102 (e.g., of FIGS. 1, 2, 3, or 4). Initially, an RTCP message is received from a media content source (act 382). The media content source is, for example, a server device 104 of FIGS. 1, 2, 3, or 4. A session description message for a next piece of media content in the play list is extracted from the RTCP message (act 384). 
When streaming of the pieces of media content in the play list is just beginning, this next piece of media content is the first piece of media content in the play list. After streaming of at least one of the pieces of media content has begun, the next piece of media content is the next piece identified in the play list. It should be noted that this next piece may be in the order defined by the play list, or the user may be able to navigate to a different piece within the play list (e.g., the user may be able to request that a particular piece in the play list be skipped or jumped over). It should also be noted that the session description message for the next piece of media content is typically received prior to playback of the current piece of media content being finished (to allow client device 102 to immediately begin playback of the next piece of media content when playback of the current piece of media content is finished). The extracted session description message is then used in processing of the next piece of media content (act 386). This processing typically includes playback of the next piece of media content at client device 102.
FIG. 9 illustrates a system 500 for multicasting multimedia presentations, according to one embodiment. In this embodiment, system 500 includes a content source 502, a server 504, and clients 506-1 through 506-X that are connected to server 504 via a network 508. Network 508 can be any suitable type of wired (including optical fiber) or wireless network (e.g., RF or free space optical). In one embodiment, network 508 is the Internet, but in other embodiments network 508 can be a local area network (LAN), a campus area network, etc. In this embodiment, server 504 includes an announcement generator 510. As will be described in more detail below, embodiments of announcement generator 510 generate streams containing information regarding multimedia presentations to be multicast over network 508. 
The operation of this embodiment of system 500 in multicasting multimedia presentations is described below in conjunction with FIG. 10 through FIG. 12.
FIG. 10 illustrates server operational flow of system 500 of FIG. 9 in multicasting a multimedia presentation, according to one embodiment. Referring to FIGS. 9 and 10, server 504 operates as follows to multicast a multimedia presentation. In a block 524, server 504 receives a multimedia presentation. In this embodiment, server 504 receives the multimedia presentation from content source 502 via a link 512. In particular, content source 502 provides multimedia content to be multicast over network 508. The multimedia content can be generated in any suitable manner. For example, the multimedia content may be previously recorded/generated content that is then stored in a datastore (not shown), or a live performance that is captured (e.g., using a video camera, microphone, etc.) and encoded (encoder not shown). In a typical application, the multimedia presentation will include multiple streams. For example, the multimedia presentation may include a video stream, an audio stream, another video stream encoded at a lower bit rate and another audio stream encoded at a lower bit rate. In other applications, the multimedia presentation may have more or fewer streams than those described in this example application. Thus, in this embodiment, server 504 receives the multimedia presentation in the form of one or more streams in block 524. In a block 526, server 504 forms an announcement stream and multicasts the announcement stream over network 508 via a link 514. In this embodiment, announcement generator 510 of server 504 forms the announcement stream. 
In some embodiments, announcement generator 510 may be configured by an administrator, while in other embodiments announcement generator 510 may be configured to process stream(s) received in block 524 and extract information from the stream(s) to form the announcement stream. In some embodiments, server 504 multicasts the announcement stream on a dedicated announcement channel (i.e., a channel without announcement information related to other multimedia presentations). As used in this context, a channel can be a logical address such as a multicast Internet protocol (IP) address and port. Thus, a client can join a channel by listening to the logical address and port associated with the channel. Clients may learn of the logical address in any suitable manner such as but not limited to email, invitations, website postings, and conventional Session Announcement Protocol (SAP) multicasts (e.g., as defined in Specification, IETF RFC-2974, entitled "Session Announcement Protocol"). In embodiments using SAP multicasts to announce a multimedia presentation, the SAP multicast need not include the detailed presentation description information that would be provided in an "in-line" announcement stream (described in more detail below). In some embodiments, the announcement stream is multicast "in-line" with a stream containing multimedia data. For example, the stream of multimedia data can be multicast using packets according to the Real-time Transport Protocol (RTP) and the announcement stream can be multicast using packets according to the Real-time Transport Control Protocol (RTCP). In one embodiment, the RTP is defined in Request For Comments (RFC) 3550, Internet Engineering Task Force (IETF), July, 2003 (which includes the specification of the RTCP as well). In this embodiment, the RTP is extended to support announcement data in RTCP packets. 
In a further refinement, the announcement data can be sent "in-line" in the same RTP packets (or other protocol packets/datagrams) as the multimedia data. In other embodiments, the announcement channel can be out-of-band (e.g., when the announcement channel is multicast using SAP). The announcement stream contains information that describes the multimedia presentation such as, for example, identification of various channels used to multicast the multimedia presentation; descriptions of the stream (e.g., indicating the type of stream (e.g., video or audio), bit-rate of the stream, language used in the stream, etc.) being transported by each of the channels; error correction information; security/authentication information; encryption information; digital rights management (DRM) information, etc. In one embodiment, the announcement stream is repeatedly multicast during the multimedia presentation so that clients joining at different times may receive the multimedia presentation description information. A client receiving this presentation description information via the announcement stream can then determine which channel(s) are suitable to join in view of its resources. In a block 528, server 504 multicasts stream(s) selected from the stream(s) of the multimedia presentation received in block 524. In some scenarios, server 504 multicasts all of the streams received in block 524. In some embodiments, an administrator can configure server 504 to multicast particular streams in preselected channels. In one embodiment, server 504 supports at least an announcement channel, a video channel and an audio channel. More typically, server 504 will also support additional channels of video and audio streams of different bit rates to accommodate clients having different resources available to process the multimedia presentation. For example, as shown in FIG. 
11, server 504 may be configured to support an announcement channel 532, an acceleration channel 534 (described below in conjunction with FIGS. 15 and 16), a high quality video channel 536, a high quality audio channel 538, an application channel 540, alternative language channels 542-1 through 542-N, and alternative bit rate channels 544 (for audio and/or video streams). In one embodiment, application channel 540 can be used to multicast data used by applications expected to be running locally on the clients (e.g., a media player, or other applications that may require a plug-in to use the multicasted application data such as Microsoft PowerPoint® data). Depending on the streams provided by content source 502, the configuration of server 504, and the preselected definition of the channels, server 504 may map a stream into only one channel, multiple channels or no channels. For example, if the multimedia presentation from content source 502 includes an English language stream and a Spanish language stream, server 504 may be configured to map the Spanish language stream into all of channels 542-1 through 542-N, or only channel 542-1, or to no channel at all. In some embodiments, the "layout" of the channels is preselected. For example, in embodiments in which each channel has its own IP address, the channels may be a set of sequential IP addresses in the range of IP addresses assigned for multicasting (i.e., the range of IP address 224.0.0.0 to IP address 239.255.255.255). Thus, announcement channel 532 may be assigned to IP address 231.0.0.1, acceleration channel 534 may be assigned to IP address 231.0.0.2, and high quality video channel 536 may be assigned to IP address 231.0.0.3, and so on. Similarly, the channels may be a set of sequential ports of a group IP address. 
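A preselected channel layout of either kind (sequential multicast addresses, or sequential ports on one group address) can be derived mechanically from a base value. The Python sketch below is illustrative only. The function names and the `CHANNEL_NAMES` ordering are hypothetical, and the port variant assumes the common RTP convention of an even port for RTP paired with the next odd port for RTCP.

```python
import ipaddress
from typing import Dict, List, Tuple

# Hypothetical ordering, following the order in which the channels are listed.
CHANNEL_NAMES: List[str] = [
    "announcement",        # channel 532
    "acceleration",        # channel 534
    "high_quality_video",  # channel 536
    "high_quality_audio",  # channel 538
    "application",         # channel 540
]


def layout_by_address(base: str, names: List[str]) -> Dict[str, str]:
    """Assign each channel the next sequential multicast IP address."""
    first = ipaddress.IPv4Address(base)
    layout: Dict[str, str] = {}
    for offset, name in enumerate(names):
        address = first + offset
        if not address.is_multicast:  # outside 224.0.0.0 to 239.255.255.255
            raise ValueError(f"{address} is not a multicast address")
        layout[name] = str(address)
    return layout


def layout_by_port(group: str, base_port: int,
                   names: List[str]) -> Dict[str, Tuple[str, int, int]]:
    """Assign each channel an (address, RTP port, RTCP port) triple."""
    if not ipaddress.IPv4Address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    if base_port % 2 != 0:
        raise ValueError("base port is assumed to be even (RTP port)")
    layout: Dict[str, Tuple[str, int, int]] = {}
    for index, name in enumerate(names):
        rtp_port = base_port + 2 * index
        if rtp_port + 1 > 65535:
            raise ValueError("port range exhausted")
        layout[name] = (group, rtp_port, rtp_port + 1)
    return layout
```

Because both layouts are deterministic, a client that knows the base address or port and the channel ordering could compute where a given channel is multicast without first reading the announcement stream.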
Thus, in an RTP-based embodiment that uses sequential ports of a group IP address, announcement channel 532 may be assigned to port 231.0.0.1:5000, acceleration channel 534 may be assigned to port 231.0.0.1:5002, high quality video channel 536 may be assigned to port 231.0.0.1:5004, and so on (so that ports 5001, 5003 and 5005 can be used for the RTCP packets).
The approaches used by the above embodiments of system 500 have several advantages. For example, because the announcement stream is multicast on a dedicated channel, a client can more quickly obtain the presentation description information, thereby advantageously reducing start up latency. In contrast, a conventional SAP multicast approach typically has a larger start up latency because SAP multicasts generally announce a large number of multicasts, which tends to reduce the frequency at which announcements for a particular multimedia presentation are multicast (which in turn tends to increase start up latency). Further, these embodiments of system 500 do not require that clients have a back channel to server 504, thereby providing more flexibility in delivering multimedia presentations to a desired audience. In addition, these embodiments of system 500 eliminate the need for the server to provide a "multicast information" file required in the previously-described conventional system, and, thus, the costs involved in maintaining and publishing this file. Still further, because the streams being multicast in the set of channels are "predictable" in some embodiments, clients may choose to join particular channels without waiting to receive and process the multimedia presentation description information from announcement channel 532. For example, an aggressive client (typically a client with relatively large resources) may choose to join high quality video and high quality audio channels 536 and 538 concurrently with or instead of joining announcement channel 532, thereby reducing start up latency if the client indeed has sufficient resources to process the streams without losing data. For example, a client with large resources can be a client having a computing platform with a high speed CPU and large buffering resources, and that is connected to a high-speed computer network with a relatively large amount of available bandwidth. The high speed CPU and large buffering resources significantly reduce the risk of losing data.
FIG. 12 illustrates operational flow of client 506-1 (FIG. 9) in receiving a multimedia presentation that is being multicast by server 504 (FIG. 9), according to one embodiment. Clients 506-2 through 506-X (FIG. 9) can operate in a substantially identical manner. Referring to FIGS. 9, 11 and 12, client 506-1 operates as follows in receiving the multimedia presentation. In a block 562, client 506-1, having already received the logical address of the announcement channel of a multimedia presentation, joins announcement channel 532. As previously described for one embodiment, server 504 repeatedly multicasts presentation description information on a dedicated announcement channel. Thus, client 506-1 can relatively quickly receive the presentation description information compared to conventional systems that generally multicast description information of a relatively large number of multimedia and/or other types of presentations. In a block 564, client 506-1 then joins one or more of the channels that provide multimedia data streams, which are described in the received announcement stream. 
In one embodiment, client 506₁ can determine which channel(s) to join to have an optimal experience using the resources available to client 506₁. Client 506₁ then can receive the selected stream(s) of the multimedia presentation.

FIG. 12A illustrates operational flow of client 506₁ (FIG. 9) in receiving a multimedia presentation that is being multicast by server 504 (FIG. 9), according to another embodiment. Clients 506₂-506ₓ (FIG. 9) can operate in a substantially identical manner. Referring to FIGS. 9, 11 and 12A, client 506₁ operates as follows in receiving the multimedia presentation.

In this embodiment, client 506₁ substantially concurrently performs block 562 (to join announcement channel 532 as described above) and a block 570. In block 570, client 506₁ also joins one or more preselected channels of the multimedia presentation in addition to announcement channel 532. As previously described for one embodiment, server 504 can be configured to multicast streams in preselected channels in a predetermined manner. In this embodiment, client 506₁ can take advantage of the preselected channel assignments to join desired channels without having to receive the presentation description information from announcement channel 532. For example, in one scenario, client 506₁ has relatively large resources to receive and process multimedia presentations and is capable of handling typical high quality video and high quality audio streams. With these resources, client 506₁ can be configured to immediately join channels 536 and 538 to receive high quality video and high quality audio streams to reduce start up latency with a relatively high expectation that client 506₁ can properly process the streams. In a decision block 572, client 506₁ determines whether it can optimally process the stream(s) received from the channel(s) it joined in block 570 in view of the resources available to client 506₁.
In one embodiment, client 506₁ uses the presentation description information received from announcement channel 532 to determine whether its resources can handle the streams received on these channels. For example, the streams of channels joined in block 570 may have bit rates (which will be described in the announcement stream) that are too great for client 506₁ to process without losing data (which can result in choppy audio playback for audio streams or blocky video playback for video streams). If client 506₁ determines in block 572 that it can optimally process the stream(s) of the preselected channel(s), client 506₁ continues to receive the stream(s) from the channel(s) client 506₁ joined in block 570 until the multicasted multimedia presentation terminates. However, in this embodiment, if client 506₁ in block 572 determines that it cannot optimally process the stream(s) of the preselected channel(s), the operational flow proceeds to block 564 (previously described in conjunction with FIG. 12). In block 564, using the presentation description information received in block 562, client 506₁ joins one or more other channels that carry multimedia data streams that client 506₁ can optimally process. In a block 574, in this embodiment client 506₁ may quit the preselected channels joined in block 570. Client 506₁ continues to receive the stream(s) from the channels joined in block 564 until the multicasted multimedia presentation terminates or client 506₁ chooses to leave the channels.

FIG. 13 illustrates some of the components of server 504 (FIG. 9), according to one embodiment. In this embodiment, in addition to announcement generator 510 (described above in conjunction with FIG. 9), server 504 includes a configuration controller 582, a configurable stream mapper 584, a source interface 586 and a network interface 588. In some embodiments, these elements are software modules or components that can be executed by a computing environment of server 504.
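The client-side decision of FIG. 12A described above (decision block 572 followed by blocks 564 and 574) can be illustrated with a short sketch. This is only one possible reading, not the method of the embodiment: it assumes that "optimally process" reduces to comparing advertised bit rates against a single capacity figure, and the names `StreamInfo`, `can_process`, `select_channels`, `resolve`, the channel labels, and all bit-rate values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class StreamInfo:
    channel: str     # hypothetical channel label
    media_type: str  # e.g. "audio" or "video"
    bit_rate: int    # bits per second, as advertised in the announcement stream


def can_process(joined: List[str], announced: Dict[str, StreamInfo],
                capacity_bps: int) -> bool:
    """Block 572 (assumed test): the joined streams fit within the client's capacity."""
    return sum(announced[name].bit_rate for name in joined) <= capacity_bps


def select_channels(announced: Dict[str, StreamInfo], capacity_bps: int) -> List[str]:
    """Block 564 (assumed policy): per media type, pick the richest stream that still fits."""
    budget = capacity_bps
    chosen: List[str] = []
    for media_type in sorted({s.media_type for s in announced.values()}):
        candidates = sorted(
            (s for s in announced.values() if s.media_type == media_type),
            key=lambda s: s.bit_rate,
            reverse=True,
        )
        for stream in candidates:
            if stream.bit_rate <= budget:
                chosen.append(stream.channel)
                budget -= stream.bit_rate
                break
    return chosen


def resolve(preselected: List[str], announced: Dict[str, StreamInfo],
            capacity_bps: int) -> List[str]:
    """Keep the preselected channels if they fit; otherwise quit them (block 574)
    and return the channels chosen from the announcement (block 564)."""
    if can_process(preselected, announced, capacity_bps):
        return preselected
    return select_channels(announced, capacity_bps)


# Hypothetical announcement contents; the bit rates are made up for illustration.
announced = {
    "hq_video": StreamInfo("hq_video", "video", 2_000_000),
    "lq_video": StreamInfo("lq_video", "video", 300_000),
    "hq_audio": StreamInfo("hq_audio", "audio", 128_000),
    "lq_audio": StreamInfo("lq_audio", "audio", 32_000),
}
preselected = ["hq_video", "hq_audio"]

print(resolve(preselected, announced, 3_000_000))
print(resolve(preselected, announced, 500_000))
```

A client with ample capacity keeps the preselected high quality channels, while a constrained client falls back to channels chosen from the announced bit rates. A real client would likely also weigh CPU and buffer resources, which this sketch ignores.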
Source interface 586 is configured to receive one or more multimedia streams from content source 502 (FIG. 9) via link 512. Configurable stream mapper 584 is configured to receive the streams from source interface 586, an announcement stream from announcement generator 510, and control information from configuration controller 582. In this embodiment, configurable stream mapper 584 functions like a switch in mapping or directing one or more of the streams received from source interface 586 to multicast channel(s). Network interface 588 multicasts the selected streams over network 508 (FIG. 9). In some embodiments, configuration controller 582 configures configurable stream mapper 584 to map the received stream(s) of the multimedia presentation into channel(s). In addition, in some embodiments configuration controller 582 directs announcement generator 510 in generating announcements. Operational flow of one embodiment of configuration controller 582 is described below in conjunction with FIGS. 13 and 14.

FIG. 14 illustrates operational flow of configuration controller 582 (FIG. 13) in multicasting a multimedia presentation, according to one embodiment. Referring to FIGS. 13 and 14, one embodiment of configuration controller 582 operates to multicast a multimedia presentation as described below.

In a block 602, this embodiment of configuration controller 582 receives configuration information from an administrator. The administrator can manually provide configuration information to configuration controller 582 of server 504. This configuration information may define each of the channels in terms of logical address, and include presentation description information (previously described).
For example, the presentation information may include the media type(s) of the stream(s) of the multimedia presentation to be multicast; the bit-rate(s) of the stream(s); the language(s); error correction information; security/authentication information; encryption information; digital rights management (DRM) information; etc. In alternative embodiments, configuration controller 582 may be configured to extract the presentation description information from the streams themselves (e.g., from header or metadata information included in the streams) after being received from content source 502 (FIG. 9) via source interface 586. In a block 604, configuration controller 582 configures stream mapper 584 to map the announcement stream from announcement generator 510 and the multimedia data stream(s) from source interface 586 to the channels as described in the presentation description information. This announcement stream is repetitively multicast over the announcement channel by server 504 during the multicast of the multimedia presentation. In a block 606, configuration controller 582 provides presentation description information for the stream(s) to announcement generator 510. As previously described, announcement generator 510 forms the announcement stream that includes the presentation description information.

As previously described, the "layout" of the channels may be preselected.
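A preselected layout of this kind can be sketched as a small table builder. The sketch reuses the example address and ports given earlier for the RTP-based embodiment (231.0.0.1 with RTP on ports 5000, 5002, 5004 and RTCP on the odd ports in between); the `Channel` type, the `preselected_layout` function, and the channel labels are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Channel:
    name: str       # hypothetical label for the channel
    group: str      # multicast group address
    rtp_port: int   # even port that carries the RTP packets
    rtcp_port: int  # following odd port that carries the RTCP packets


def preselected_layout(group: str, base_port: int, names: List[str]) -> List[Channel]:
    """Assign consecutive even RTP ports to the channels in a fixed, known order."""
    if base_port % 2 != 0:
        raise ValueError("this sketch assumes an even base RTP port")
    return [
        Channel(name, group, base_port + 2 * i, base_port + 2 * i + 1)
        for i, name in enumerate(names)
    ]


layout = preselected_layout(
    "231.0.0.1", 5000, ["announcement", "acceleration", "high_quality_video"]
)
for ch in layout:
    print(f"{ch.name}: RTP {ch.group}:{ch.rtp_port}, RTCP {ch.group}:{ch.rtcp_port}")
```

Because the order is fixed, a client that knows only the first logical address can derive the others without first reading the announcement stream, which is the property the preselected layout relies on.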
For example, a client would be given a logical address (e.g., a URL) for joining a multicast multimedia presentation. In one embodiment, that first logical address is preselected to carry the announcement stream. In this example, the next sequential logical address is preselected to carry the acceleration channel, while the one after that is preselected to carry a high quality video stream, and so on as shown in the embodiment of FIG. 11. Configuration controller 582 configures stream mapper 584 to map the announcement stream and the multimedia data streams according to the preselected channel layout.

FIG. 15 illustrates some of the components of server 504 (FIG. 9), according to another embodiment. This alternative embodiment of server 504 is substantially similar to the embodiment of FIG. 13, except that this embodiment includes an accelerated stream generator 702. In one embodiment, accelerated stream generator 702 is configured to form a stream in which each unit of multimedia data that is multicast contains a current subunit of multimedia data and a preselected number of previous subunits of data. For example, an accelerated stream may be multicast so that a datagram contains the current frame(s) of the multimedia presentation and the frames of the previous five seconds. In this embodiment, accelerated stream generator 702 provides the accelerated stream to configurable stream mapper 584 to be mapped into a dedicated acceleration channel such as acceleration channel 534 (FIG. 11). However, in other embodiments, an acceleration channel datagram need not include the current frame(s).

FIG. 16 illustrates operational flow of server 504 with accelerated stream generator 702 (FIG. 15), according to one embodiment. Referring to FIGS. 15 and 16, this embodiment of server 504 operates as described below. In a block 802, accelerated stream generator 702 forms a unit of multimedia data for multicast over network 508 (FIG. 9).
In this embodiment, accelerated stream generator 702 forms the unit using a current subunit of the multimedia presentation data and the previous Z subunits of multimedia presentation data. As previously mentioned, the unit may be a datagram or packet, and the subunits may be frames of multimedia data. In one embodiment, Z is selected to ensure that the unit (i.e., packet or datagram) will contain a key frame needed to render or decode the multimedia data. In other embodiments, Z is selected without regard to whether the unit will be ensured of having a key frame. In a block 804, the unit of multimedia data formed in block 802 is multicast over network 508 (FIG. 9). In this embodiment, accelerated stream generator 702 provides the unit of multimedia data to configurable stream mapper 584, which then maps the unit to the acceleration channel. Server 504 then multicasts the unit of multimedia data over network 508 (FIG. 9) via network interface 588. In one embodiment, server 504 multicasts the unit at a rate that is "faster than real time" (i.e., at a bit rate that is faster than the bit rate of the underlying multimedia data). This approach advantageously allows a client having relatively large resources to join the acceleration channel and quickly fill the buffer of its multimedia player in receiving the unit so that rendering or playback can begin more quickly. This feature is enhanced in embodiments in which the multicasted unit of multimedia data includes a key frame. Alternatively, the rate at which server 504 multicasts the unit need not be "faster than real time". This approach may be used in applications in which the client concurrently joins both the acceleration channel and another channel that multicasts multimedia data so that, in effect, the client receives the multimedia data at a rate that is "faster than real time." If more multimedia data is to be multicasted, the operational flow returns to block 802, as represented in decision block 806.
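The loop of blocks 802 through 806 can be sketched as a generator that keeps the most recent Z + 1 subunits and emits them as one unit each time a new subunit arrives. This is a minimal sketch rather than the generator of the embodiment: the function name `accelerated_units`, the use of strings as stand-ins for frames, and the choice to emit nothing until Z earlier subunits are available are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List


def accelerated_units(subunits: Iterable[str], z: int) -> Iterator[List[str]]:
    """Yield one unit per incoming subunit: the previous z subunits plus the current one."""
    if z < 0:
        raise ValueError("z must be non-negative")
    window: Deque[str] = deque(maxlen=z + 1)  # the oldest subunit drops out automatically
    for subunit in subunits:
        window.append(subunit)       # block 802: add the current subunit
        if len(window) == z + 1:     # assumed: wait until z earlier subunits exist
            yield list(window)       # block 804: hand the unit over for multicast
        # block 806: the loop continues while more multimedia data remains


frames = ["f0", "f1", "f2", "f3"]
for unit in accelerated_units(frames, z=2):
    print(unit)
```

Each emitted unit overlaps the one before it by Z subunits, so a client that receives any single unit obtains the recent history along with the current subunit.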
Thus, for example, using the above example of multimedia frames transported in datagrams, the next datagram would include the next frame of multimedia data, plus the frame added in the previous datagram, plus the previous (Z-1) frames. Thus, in this embodiment, each unit (e.g., datagram) represents a sliding window of the current subunit (e.g., frame) and the previous Z frames, with Z selected to be large enough to ensure that each unit has enough information to minimize the time needed to allow the client's multimedia player to start rendering/playback of the multimedia presentation. As previously mentioned, in some embodiments Z may be selected to ensure that each unit has a key frame. In one embodiment, units of video and audio data are multicasted in an alternating manner on the same channel if the multimedia presentation includes both audio and video streams. In other embodiments, separate acceleration channels may be used for audio and video streams. At the start of a multimedia presentation, one embodiment of accelerated stream generator 702 waits until at least Z subunits of multimedia data have been multicasted in the non-accelerated channel(s) before forming a unit of data in block 802.

FIG. 17 illustrates client operational flow in receiving an accelerated stream, according to one embodiment. In a block 902, a client (e.g., one of clients 506₁-506ₓ of FIG. 9) joins the acceleration channel. In some scenarios, the acceleration channel is part of the preselected channel layout and the client can join it either concurrently with or without joining the announcement channel. As previously described, the acceleration channel can be advantageously used by a client having relatively large resources for receiving and processing multimedia presentations so that the client may reduce start up latency. In a block 904, the client receives one or more units of multimedia data from the acceleration channel.
In one embodiment, each unit of multimedia data is generated as described above in conjunction with FIG. 16. The client can then process each unit of multimedia data to relatively quickly begin the rendering or playback process. In one scenario, the client receives a unit of video data and a unit of audio data, with the video data containing a key frame so that the client can begin the rendering/playback process as soon as possible. As previously described, a unit need not have a key frame in other embodiments. In a block 906, the client can then join one or more non-accelerated channels such as high quality video channel 536 and high quality audio channel 538. In one embodiment, the non-accelerated channels that the client joins are preselected using the above-described preselected channel layout. In other embodiments, the client joins channel(s) based on the presentation description information contained in the announcement stream. In a block 908, the client quits the acceleration channel. In one embodiment, the client quits the acceleration channel immediately after receiving the unit or units of multimedia data needed to begin the rendering/playback process or processes.

Although blocks 902 through 908 are described as being performed sequentially, in the flow chart of FIG. 17 (as well as the other flow charts described herein) the blocks may be performed in orders different from that shown, or with some blocks being performed more than once or with some blocks being performed concurrently or a combination thereof. For example, in some embodiments, blocks 902 and 906 are performed in parallel so that the operational flow is that the client joins accelerated and non-accelerated channels concurrently. Block 904 is performed sequentially after block 902, with blocks 904 and 906 proceeding to block 908.

FIG.
17A illustrates an example scenario in which a client may join the acceleration channel and some preselected channels, and then join other channels (e.g., based on announcement information received from the announcement channel). In this example, the client joins the acceleration channel (i.e., block 902) concurrently with joining one or more preselected non-accelerated channels (i.e., block 906). Then the client receives one or more units of multimedia data from the acceleration channel (i.e., block 904) as well as multimedia and announcement data from the non-accelerated channel(s). As a result of joining the announcement channel, the client may decide to quit the preselected channel(s) and join other non-accelerated channels (i.e., blocks 572, 564 and 574).

The various multicasting embodiments described above may be implemented in computer environments of the server and clients. An example computer environment suitable for use in the server and clients is described below in conjunction with FIG. 18.

FIG. 18 illustrates a general computer environment 1000, which can be used to implement the techniques described herein. The computer environment 1000 is only one example of a computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the computer and network architectures. Neither should the computer environment 1000 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the example computer environment 1000.

Computer environment 1000 includes a general-purpose computing device in the form of a computer 1002. The components of computer 1002 can include, but are not limited to, one or more processors or processing units 1004, system memory 1006, and system bus 1008 that couples various system components including processor 1004 to system memory 1006.
System bus 1008 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures can include an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, a Peripheral Component Interconnect (PCI) bus also known as a Mezzanine bus, a PCI Express bus, a Universal Serial Bus (USB), a Secure Digital (SD) bus, or an IEEE 1394 (i.e., FireWire) bus.

Computer 1002 may include a variety of computer readable media. Such media can be any available media that is accessible by computer 1002 and includes both volatile and non-volatile media, removable and non-removable media.

System memory 1006 includes computer readable media in the form of volatile memory, such as random access memory (RAM) 1010; and/or non-volatile memory, such as read only memory (ROM) 1012 or flash RAM. Basic input/output system (BIOS) 1014, containing the basic routines that help to transfer information between elements within computer 1002, such as during start-up, is stored in ROM 1012 or flash RAM. RAM 1010 typically contains data and/or program modules that are immediately accessible to and/or presently operated on by processing unit 1004.

Computer 1002 may also include other removable/non-removable, volatile/non-volatile computer storage media. By way of example, FIG. 18 illustrates hard disk drive 1016 for reading from and writing to a non-removable, non-volatile magnetic media (not shown), magnetic disk drive 1018 for reading from and writing to removable, non-volatile magnetic disk 1020 (e.g., a "floppy disk"), and optical disk drive 1022 for reading from and/or writing to a removable, non-volatile optical disk 1024 such as a CD-ROM, DVD-ROM, or other optical media.
Hard disk drive 1016, magnetic disk drive 1018, and optical disk drive 1022 are each connected to system bus 1008 by one or more data media interfaces 1025. Alternatively, hard disk drive 1016, magnetic disk drive 1018, and optical disk drive 1022 can be connected to the system bus 1008 by one or more interfaces (not shown).

The disk drives and their associated computer-readable media provide non-volatile storage of computer readable instructions, data structures, program modules, and other data for computer 1002. Although the example illustrates a hard disk 1016, removable magnetic disk 1020, and removable optical disk 1024, it is appreciated that other types of computer readable media which can store data that is accessible by a computer, such as magnetic cassettes or other magnetic storage devices, flash memory cards, CD-ROM, digital versatile disks (DVD) or other optical storage, random access memories (RAM), read only memories (ROM), electrically erasable programmable read-only memory (EEPROM), and the like, can also be utilized to implement the example computing system and environment.

Any number of program modules can be stored on hard disk 1016, magnetic disk 1020, optical disk 1024, ROM 1012, and/or RAM 1010, including by way of example, operating system 1026, one or more application programs 1028, other program modules 1030, and program data 1032. Each of such operating system 1026, one or more application programs 1028, other program modules 1030, and program data 1032 (or some combination thereof) may implement all or part of the resident components that support the distributed file system.

A user can enter commands and information into computer 1002 via input devices such as keyboard 1034 and a pointing device 1036 (e.g., a "mouse"). Other input devices 1038 (not shown specifically) may include a microphone, joystick, game pad, satellite dish, serial port, scanner, and/or the like.
These and other input devices are connected to processing unit 1004 via input/output interfaces 1040 that are coupled to system bus 1008, but may be connected by other interface and bus structures, such as a parallel port, game port, or a universal serial bus (USB).

Monitor 1042 or other type of display device can also be connected to the system bus 1008 via an interface, such as video adapter 1044. In addition to monitor 1042, other output peripheral devices can include components such as speakers (not shown) and printer 1046, which can be connected to computer 1002 via I/O interfaces 1040.

Computer 1002 can operate in a networked environment using logical connections to one or more remote computers, such as remote computing device 1048. By way of example, remote computing device 1048 can be a PC, portable computer, a server, a router, a network computer, a peer device or other common network node, and the like. Remote computing device 1048 is illustrated as a portable computer that can include many or all of the elements and features described herein relative to computer 1002. Alternatively, computer 1002 can operate in a non-networked environment as well.

Logical connections between computer 1002 and remote computer 1048 are depicted as a local area network (LAN) 1050 and a general wide area network (WAN) 1052. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.

When implemented in a LAN networking environment, computer 1002 is connected to local network 1050 via network interface or adapter 1054. When implemented in a WAN networking environment, computer 1002 typically includes modem 1056 or other means for establishing communications over wide network 1052. Modem 1056, which can be internal or external to computer 1002, can be connected to system bus 1008 via I/O interfaces 1040 or other appropriate mechanisms.
It is to be appreciated that the illustrated network connections are examples and that other means of establishing at least one communication link between computers 1002 and 1048 can be employed.

In a networked environment, such as that illustrated with computing environment 1000, program modules depicted relative to computer 1002, or portions thereof, may be stored in a remote memory storage device. By way of example, remote application programs 1058 reside on a memory device of remote computer 1048. For purposes of illustration, applications or programs and other executable program components such as the operating system are illustrated herein as discrete blocks, although it is recognized that such programs and components reside at various times in different storage components of computing device 1002, and are executed by at least one data processor of the computer.

Various modules and techniques may be described herein in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. These program modules and the like may be executed as native code or may be downloaded and executed, such as in a virtual machine or other just-in-time compilation execution environment. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.

An implementation of these modules and techniques may be stored on or transmitted across some form of computer readable media. Computer readable media can be any available media that can be accessed by a computer. By way of example, and not limitation, computer readable media may comprise "computer storage media" and "communications media."

"Computer storage media" includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.

"Communication media" typically embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism. Communication media also includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. As a non-limiting example only, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above are also included within the scope of computer readable media.

Reference has been made throughout this specification to "one embodiment," "an embodiment," or "an example embodiment" meaning that a particular described feature, structure, or characteristic is included in at least one embodiment of the present invention. Thus, usage of such phrases may refer to more than just one embodiment. Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
One skilled in the relevant art may recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, resources, materials, etc. In other instances, well known structures, resources, or operations have not been shown or described in detail merely to avoid obscuring aspects of the invention.

While example embodiments and applications have been illustrated and described, it is to be understood that the invention is not limited to the precise configuration and resources described above. Various modifications, changes, and variations apparent to those skilled in the art may be made in the arrangement, operation, and details of the methods and systems of the present invention disclosed herein without departing from the scope of the claimed invention.

Although the description above uses language that is specific to structural features and/or methodological acts, it is to be understood that the invention defined in the appended claims is not limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the invention.

Claims

1. One or more computer readable media having stored thereon a Real-Time Control Protocol (RTCP) message, comprising: a first field containing data identifying the RTCP message as being a type that embeds a session description message; a second field containing data that is the session description message for a media presentation; and a third field containing data identifying a length of the RTCP message, generated by summing a length of the first field, a length of the second field, and a length of the third field.
2. One or more computer readable media as recited in claim 1, wherein the session description message is a Session Description Protocol (SDP) session description message.
3. One or more computer readable media as recited in claim 1, wherein the media presentation comprises a multimedia presentation.
4. One or more computer readable media as recited in claim 1, further comprising: one or more RTP-State blocks, each RTP-State block identifying RTP-specific information about a media stream of the media presentation; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the one or more RTP-State blocks.
5. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data that identifies a version of RTP (Real-Time Transport Protocol) being used to stream the media presentation; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
6. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying whether additional padding is included in the RTCP message; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
7. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying whether the data in the second field has been compressed; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
8. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying the session description message and an address of a sender of the session description message; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
9. One or more computer readable media as recited in claim 8, wherein the data identifying the session description message and the address of the sender of the session description message comprises a check-sum calculated over the session description message and the address of the sender of the session description message.
10. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying a number of RTP-State blocks contained in the RTCP message; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
11. One or more computer readable media as recited in claim 10, wherein the fourth field contains a value of zero to indicate that no RTP-State blocks are contained in the RTCP message.
12. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying whether data in an RTP-State block of the RTCP message applies to all RTP packets having a particular SDP Flow ID or only to RTP packets having a particular RTP Payload Type number; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
13. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying an RTP Payload Type number for an RTP-State block of the RTCP message; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
14. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying a media stream of the media presentation to which an RTP-State block of the RTCP message refers; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
15. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying a source of a media stream of the media presentation to which an RTP-State block of the RTCP message refers; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
16. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying a value of an RTP Timestamp field that an RTP packet for a media stream of the media presentation would have if the RTP packet was sent at the beginning of the media stream; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
17. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data identifying a value of an RTP sequence number field of a first RTP packet that is sent for a media stream of the media presentation; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, and a length of the fourth field.
18. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data that indicates that the RTCP message contains a fragment of the session description message; a fifth field containing data that identifies the fragment; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, a length of the fourth field, and a length of the fifth field.
19. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data that identifies a version of RTP (Real-Time Transport Protocol) being used to stream the media presentation; a fifth field containing data identifying whether additional padding is included in the RTCP message; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, a length of the fourth field, and a length of the fifth field.
20. One or more computer readable media as recited in claim 1, further comprising: a fourth field containing data that identifies a version of RTP (Real-Time Transport Protocol) being used to stream the media presentation; a fifth field containing data identifying whether additional padding octets are included in the RTCP message; a sixth field containing data identifying whether the data in the second field has been compressed; a seventh field containing data identifying the session description message and an address of a sender of the session description message; an eighth field containing data identifying a number of RTP-State blocks contained in the RTCP message; a ninth field containing data identifying whether data in an RTP-State block of the RTCP message applies to all RTP packets having a particular SDP Flow ID or only to RTP packets having a particular RTP Payload Type number; a tenth field containing data identifying an RTP Payload Type number for the RTP-State block of the RTCP message; an eleventh field containing data identifying a media stream of the media presentation to which the RTP-State block of the RTCP message refers; a twelfth field containing data identifying a source of the media stream of the media presentation to which the RTP-State block of the RTCP message refers; a thirteenth field containing data identifying a value of an RTP Timestamp field that an RTP packet for the media stream of the media presentation would have if the RTP packet was sent at the beginning of the media presentation; a fourteenth field containing data identifying a value of an RTP sequence number field of a first RTP packet that is sent for the media stream of the media presentation; a fifteenth field containing data that indicates that the RTCP message contains a fragment of the session description message; a sixteenth field containing data that identifies the fragment; and the third field containing data identifying the length of the RTCP message, generated by summing the length of the first field, the length of the second field, the length of the third field, a length of the fourth field, a length of the fifth field, a length of the sixth field, a length of the seventh field, a length of the eighth field, a length of the ninth field, a length of the tenth field, a length of the eleventh field, a length of the twelfth field, a length of the thirteenth field, a length of the fourteenth field, a length of the fifteenth field, and a length of the sixteenth field.
21. One or more computer readable media having stored thereon a plurality of instructions that, when executed by one or more processors of a device, causes the one or more processors to: receive, from a media content source, a Real-Time Control Protocol (RTCP) message; extract, from the RTCP message, a session description message associated with one of a plurality of pieces of media content in a play list of media content being streamed from the media content source to the device; and process the one of the plurality of pieces of media content based at least in part on the session description message.
22. One or more computer readable media as recited in claim 21, wherein the session description message is a Session Description Protocol (SDP) session description message.
23. One or more computer readable media as recited in claim 21, wherein the RTCP message is part of an RTCP packet.
24. One or more computer readable media as recited in claim 21, wherein the instructions that cause the one or more processors to process the one of the plurality of pieces of media content based at least in part on the session description message cause the one or more processors to play back the one of the plurality of pieces of media content at the device.
25. One or more computer readable media as recited in claim 21, wherein the instructions further cause the one or more processors to repeat the receipt, extraction, and processing for each of the other pieces of media content in the plurality of pieces of media content.
26. One or more computer readable media as recited in claim 21, wherein the RTCP message comprises: a first field containing data identifying the RTCP message as being a type that embeds the session description message; a second field containing data that is the session description message; and a third field containing data identifying a length of the RTCP message, generated by summing the length of the first field, the length of the second field, and the length of the third field.
27. A method, implemented in a device, the method comprising: creating a Real-Time Control Protocol (RTCP) message that includes a session description message, the session description message being associated with one of a plurality of pieces of media content in a play list of media content being streamed from the device to a client device; and sending the RTCP message to the client device.
28. A method as recited in claim 27, further comprising: creating a second RTCP message that includes a second session description message, the second session description message being associated with a second of the plurality of pieces of media content in the play list; and sending the second RTCP message to the client device.
29. A method as recited in claim 27, wherein the device comprises a server device.
30. A method as recited in claim 27, wherein the session description message is a Session Description Protocol (SDP) session description message.
31. A method as recited in claim 27, wherein the RTCP message is part of an RTCP packet.
32. A method as recited in claim 27, wherein the RTCP message comprises: a first field containing data identifying the RTCP message as being a type that embeds the session description message; a second field containing data that is the session description message; and a third field containing data identifying a length of the RTCP message, generated by summing the length of the first field, the length of the second field, and the length of the third field.
33. A system comprising: a server device; a client device; the server device being configured to: create a Real-Time Control Protocol (RTCP) message that includes a session description message, the session description message being associated with one of a plurality of pieces of media content in a play list of media content being streamed from the server device to the client device; and send the RTCP message to the client device; and the client device being configured to: receive, from the server device, the RTCP message; extract, from the RTCP message, the session description message; and process the one of the plurality of pieces of media content based at least in part on the session description message.
34. A system as recited in claim 33, wherein the RTCP message comprises: a first field containing data identifying the RTCP message as being a type that embeds the session description message; a second field containing data that is the session description message; and a third field containing data identifying a length of the RTCP message, generated by summing the length of the first field, the length of the second field, and the length of the third field.
35. A method, comprising: receiving data of a multimedia presentation, wherein the data includes a first plurality of streams; and multicasting a second plurality of streams that includes a dedicated announcement stream and a first stream selected from the first plurality of streams, wherein the announcement stream includes presentation description information of the multimedia presentation.
36. The method of claim 35, wherein the second plurality of streams are multicast on different channels.
37. The method of claim 36, wherein the second plurality of streams is multicast on predetermined different channels.
38. The method of claim 37, wherein the predetermined different channels comprise predetermined logical addresses.
39. The method of claim 38, wherein the predetermined logical addresses are predetermined internet protocol (IP) addresses with predetermined ports.
40. The method of claim 37, wherein the predetermined different channels comprise predetermined ports of a logical address.
41. The method of claim 35, wherein the second plurality of streams further comprises a second stream that includes a plurality of units of data of the multimedia presentation, the plurality of units each comprising a preselected number of previous subunits of data of the multimedia presentation.
42. The method of claim 41 wherein each unit of the plurality of units includes a key frame.
43. The method of claim 35, wherein the second plurality of streams further comprises multiple streams of video data having different bit rates.
44. The method of claim 35, wherein the second plurality of streams further comprises multiple streams of audio data having different bit rates.
45. The method of claim 35, wherein the second plurality of streams further comprises multiple streams of multimedia data in different languages.
46. The method of claim 35, wherein the second plurality of streams further comprises a stream of data to be used by an application running on a client receiving the second plurality of streams.
47. The method of claim 35, wherein the announcement stream includes error correction information.
48. The method of claim 35, wherein the announcement stream includes security information.
49. The method of claim 35, wherein the announcement stream is multicast on an out-of-band channel.
50. The method of claim 35, wherein the announcement stream is multicast on an in-band channel.
51. The method of claim 50, wherein the announcement stream is multicast to conform to a real-time transport control protocol (RTCP), the announcement stream being interspersed in-band within a stream of multimedia presentation data that are multicast to conform to a real-time transport protocol (RTP).
52. The method of claim 50, wherein the announcement stream is multicast so that announcement stream data is included in a packet containing multimedia presentation data.
53. A computer-accessible medium having computer-executable instructions to perform operations comprising: receiving data of a multimedia presentation, wherein the data includes a first plurality of streams; and multicasting a second plurality of streams that includes a dedicated announcement stream and a first stream selected from the first plurality of streams, wherein the announcement stream includes presentation description information of the multimedia presentation.
54. The computer-accessible medium of claim 53, wherein the second plurality of streams are multicast on different channels.
55. The computer-accessible medium of claim 54, wherein the second plurality of streams is multicast on predetermined different channels.
56. The computer-accessible medium of claim 55, wherein the predetermined different channels comprise predetermined logical addresses.
57. The computer-accessible medium of claim 56, wherein the predetermined logical addresses are predetermined Internet protocol (IP) addresses with predetermined ports.
58. The computer-accessible medium of claim 55, wherein the predetermined different channels comprise predetermined ports of a logical address.
59. The computer-accessible medium of claim 53, wherein the second plurality of streams further comprises a second stream that includes a plurality of units of data of the multimedia presentation, the plurality of units each comprising a preselected number of previous subunits of data of the multimedia presentation.
60. The computer-accessible medium of claim 59 wherein each unit of the plurality of units includes a key frame.
61. The computer-accessible medium of claim 53, wherein the second plurality of streams further comprises multiple streams of video data having different bit rates.
62. The computer-accessible medium of claim 53, wherein the second plurality of streams further comprises multiple streams of audio data having different bit rates.
63. The computer-accessible medium of claim 53, wherein the second plurality of streams further comprises multiple streams of multimedia data in different languages.
64. The computer-accessible medium of claim 53, wherein the second plurality of streams further comprises a stream of data to be used by an application running on a client receiving the second plurality of streams.
65. The computer-accessible medium of claim 53, wherein the announcement stream includes error correction information.
66. The computer-accessible medium of claim 53, wherein the announcement stream includes security information.
67. A computer-accessible medium having computer-executable instructions to perform operations comprising: receiving data of a multimedia presentation, wherein the data includes a first plurality of streams; and multicasting a second plurality of streams that includes a first stream selected from the first plurality of streams and a second stream that includes a plurality of units of data of the multimedia presentation, the plurality of units each comprising a preselected number of previous subunits of data of the multimedia presentation.
68. The computer-accessible medium of claim 67, wherein each unit of the plurality of units includes a key frame.
69. The computer-accessible medium of claim 67, wherein the plurality of units of the second stream each includes enough data to reduce the amount of time needed by a multimedia player to begin playback of the multimedia presentation.
70. A method comprising: receiving data of a multimedia presentation, wherein the data includes a first plurality of streams; and multicasting a second plurality of streams that includes a first stream selected from the first plurality of streams and a second stream that includes a plurality of units of data of the multimedia presentation, the plurality of units each comprising a preselected number of previous subunits of data of the multimedia presentation.
71. The method of claim 70, wherein each unit of the plurality of units includes a key frame.
72. The method of claim 70, wherein the plurality of units of the second stream each includes enough data to reduce the amount of time needed by a multimedia player to begin playback of the multimedia presentation.
73. A method comprising: receiving data of a multimedia presentation, wherein the data includes a first plurality of streams; and multicasting a second plurality of streams that includes first and second streams related to information contained in the first plurality of streams, wherein the first and second streams are multicast in preselected channels.
74. The method of claim 73, wherein the predetermined different channels comprise predetermined logical addresses.
75. The method of claim 73, wherein the predetermined different channels comprise predetermined ports of an Internet protocol (IP) address.
76. The method of claim 73, wherein the first stream is an announcement stream containing presentation description information.
77. A computer-accessible medium having computer-executable instructions to perform operations comprising: receiving data of a multimedia presentation, wherein the data includes a first plurality of streams; and multicasting a second plurality of streams that includes first and second streams related to information contained in the first plurality of streams, wherein the first and second streams are multicast in preselected channels.
78. The computer-accessible medium of claim 77, wherein the predetermined different channels comprise predetermined logical addresses.
79. The computer-accessible medium of claim 78, wherein the predetermined logical addresses are predetermined Internet protocol (IP) addresses with predetermined ports.
80. The computer-accessible medium of claim 77, wherein the predetermined different channels comprise predetermined ports of a logical address.
81. The computer-accessible medium of claim 78, wherein the first stream is an announcement stream containing presentation description information.
82. A method, comprising: receiving a first stream from a preselected first channel, wherein the first stream comprises presentation description information related to a multimedia presentation being multicast; concurrently with receiving the first stream on the preselected first channel, receiving a second stream on a second preselected channel, wherein the second stream comprises a stream of multimedia data of the multimedia presentation being multicast.
83. The method of claim 82, further comprising: terminating reception of the second stream; and selectively receiving a third stream on a third channel selected in response to presentation description information received from the first stream, wherein the third stream comprises another stream of multimedia data of the multimedia presentation being multicast.
84. The method of claim 82, further comprising continuing reception of the second stream in response to presentation description information received from the first stream indicating that the second stream meets preselected criteria.
85. A computer-accessible medium having computer-executable instructions to perform operations comprising: receiving a first stream from a preselected first channel, wherein the first stream comprises presentation description information related to a multimedia presentation being multicast; concurrently with receiving the first stream on the preselected first channel, receiving a second stream on a second preselected channel, wherein the second stream comprises a stream of multimedia data of the multimedia presentation being multicast.
86. The computer-accessible medium of claim 85, wherein the operations further comprise: terminating reception of the second stream; and selectively receiving a third stream on a third channel selected in response to presentation description information received from the first stream, wherein the third stream comprises another stream of multimedia data of the multimedia presentation being multicast.
87. The computer-accessible medium of claim 85, wherein the operations further comprise: continuing reception of the second stream in response to presentation description information received from the first stream indicating that the second stream meets preselected criteria.
88. A method, comprising: receiving a unit of data from a preselected first channel, wherein the first channel transports a plurality of units of data of a multimedia presentation being multicast, wherein the plurality of units each comprise a preselected number of previous subunits of data of the multimedia presentation being multicast; terminating reception of data from the first preselected channel; and receiving a second stream on a second channel, wherein the second stream comprises a stream of multimedia of the multimedia presentation being multicast.
89. The method of claim 88, wherein the second channel is selected in response to presentation description information received from an announcement channel.
90. The method of claim 88, wherein the second channel is preselected.
91. A computer-accessible medium having computer-executable instructions to perform operations comprising: receiving a unit of data from a preselected first channel, wherein the first channel transports a plurality of units of data of a multimedia presentation being multicast, wherein the plurality of units each comprise a preselected number of previous subunits of data of the multimedia presentation being multicast; terminating reception of data from the first preselected channel; and receiving a second stream on a second channel, wherein the second stream comprises a stream of multimedia of the multimedia presentation being multicast.
92. The computer-accessible medium of claim 91, wherein the second channel is selected in response to presentation description information received from an announcement channel.
93. The computer-accessible medium of claim 91, wherein the second channel is preselected.
94. A system, comprising: a first interface to receive a first plurality of streams of a multimedia presentation; an announcement generator to provide an announcement stream containing presentation description information regarding the multimedia presentation; a mapper to map the announcement stream and a first stream selected from the first plurality of streams to a plurality of channels; and a second interface to multicast a second plurality of streams over a network, wherein the second plurality of streams comprises the mapped announcement stream and the mapped first stream.
95. The system of claim 94, wherein the second plurality of streams is multicast on predetermined different channels.
96. The system of claim 95, wherein the predetermined different channels comprise predetermined logical addresses.
97. The system of claim 96, wherein the predetermined logical addresses each comprise an Internet protocol (IP) address and a port.
98. The system of claim 95, wherein the predetermined different channels comprise predetermined ports of a logical address.
99. The system of claim 95, wherein the second plurality of streams further comprises a second stream that when multicast includes a plurality of units of data of the multimedia presentation, the plurality of units each comprising a preselected number of previous subunits of data of the multimedia presentation.
100. The system of claim 99 wherein the second stream includes a key frame.
101. The system of claim 95, wherein the second plurality of streams further comprises streams of video data having different bit rates selected from the first plurality of streams.
102. The system of claim 95, wherein the second plurality of streams further comprises multiple streams of audio data having different bit rates selected from the first plurality of streams.
103. The system of claim 95, wherein the second plurality of streams further comprises multiple streams of multimedia data in different languages selected from the first plurality of streams.
104. The system of claim 95, wherein the second plurality of streams further comprises a stream of data to be used by an application running on a client receiving the second plurality of streams.
105. The system of claim 95, wherein the announcement stream includes error correction information.
106. The system of claim 95, wherein the announcement stream includes security information.
107. The system of claim 95, wherein the announcement stream is multicast on an out-of-band channel.
108. The system of claim 95, wherein the announcement stream is multicast on an in-band channel.
109. The system of claim 108, wherein the announcement stream is multicast to conform to a real-time transport control protocol (RTCP), the announcement stream being interspersed in-band within a stream of multimedia presentation data that are multicast to conform to a real-time transport protocol (RTP).
110. The system of claim 108, wherein the announcement stream is multicast so that announcement stream data is included in a packet containing multimedia presentation data.
111. A computer-accessible medium containing components as recited in claim 95.
112. A system, comprising: means for receiving a first plurality of streams of a multimedia presentation; means for generating an announcement stream containing presentation description information regarding the multimedia presentation; means for mapping the announcement stream and a first stream selected from the first plurality of streams to one or more channels of a plurality of channels; and means for multicasting the second plurality of streams over a network, wherein the second plurality of streams comprises the mapped announcement and first streams.
113. The system of claim 112, wherein the second plurality of streams is multicast on predetermined different channels.
114. The system of claim 113, further comprising: means for providing a plurality of units of data of the multimedia presentation to the means for mapping, the plurality of units of data to be multicast as part of the second plurality of streams, wherein the plurality of units when multicast each comprises a preselected number of previous subunits of data of the multimedia presentation.
115. A computer-accessible medium containing components as recited in claim 112.
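
The three-field RTCP message recited in claims 26, 32, and 34 (a type field, the embedded session description, and a length field whose value is the sum of the field lengths) can be illustrated with a minimal sketch. The byte layout below is an assumption made only for illustration: the one-byte type code `MSG_TYPE_SDP`, the two-byte length counted in bytes, and the helper names `build_message` and `extract_sdp` are hypothetical and are not taken from the claims or from the RTCP specification.

```python
import struct

# Hypothetical layout, for illustration only:
#   first field  : 1-byte type code marking "this message embeds an SDP description"
#   third field  : 2-byte total message length, in bytes
#   second field : the session description message itself (UTF-8 text)
MSG_TYPE_SDP = 0xF0                 # assumed value, not defined by the claims
HEADER = struct.Struct("!BH")       # network byte order, no padding (3 bytes)


def build_message(sdp: str) -> bytes:
    """Server side: embed a session description in a single message."""
    payload = sdp.encode("utf-8")
    # Length = length of first field + length of second field + length of third field.
    total = 1 + len(payload) + 2
    if total > 0xFFFF:
        raise ValueError("session description too large for a 2-byte length field")
    return HEADER.pack(MSG_TYPE_SDP, total) + payload


def extract_sdp(message: bytes) -> str:
    """Client side: validate the type and length fields, then return the SDP text."""
    if len(message) < HEADER.size:
        raise ValueError("message shorter than its header")
    msg_type, total = HEADER.unpack_from(message)
    if msg_type != MSG_TYPE_SDP:
        raise ValueError("message does not embed a session description")
    if total != len(message):
        raise ValueError("length field does not match message size")
    return message[HEADER.size:total].decode("utf-8")


if __name__ == "__main__":
    sdp_text = "v=0\r\ns=Example presentation\r\n"
    msg = build_message(sdp_text)
    print(len(msg), extract_sdp(msg) == sdp_text)
```

A receiver following claim 21 would call something like `extract_sdp` on each such message and use the result to process the corresponding piece of media content in the play list. The additional fields of claims 8 through 20 would each add their own length to the sum stored in the length field.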
PCT/US2004/024065 2003-10-24 2004-07-28 Embedding a session dession description message in a real-time control protocol (rtcp) message WO2005045704A1 (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
BR0406609-0A BRPI0406609A (en) 2003-10-24 2004-07-28 Embedding a session decoding message in a realtime control protocol (rtcp) message
EP04779232A EP1676216B1 (en) 2003-10-24 2004-07-28 Embedding a session description (SDP) message in a real-time control protocol (RTCP) message
AU2004287133A AU2004287133B2 (en) 2003-10-24 2004-07-28 Embedding a session dession description message in a real-time control protocol (RTCP) message
MXPA05007090A MXPA05007090A (en) 2003-10-24 2004-07-28 Embedding a session dession description message in a real-time control protocol (rtcp) message.
CA2512191A CA2512191C (en) 2003-10-24 2004-07-28 Embedding a session description message in a real-time control protocol (rtcp) message
JP2006536573A JP4603551B2 (en) 2003-10-24 2004-07-28 Embedding a session description message in a Realtime Control Protocol (RTCP) message
HK06112656.4A HK1092237A1 (en) 2003-10-24 2006-11-17 Embedding a session description (sdp) message in a real-time control protocol (rtcp) message

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US10/693,430 US7586938B2 (en) 2003-10-24 2003-10-24 Methods and systems for self-describing multicasting of multimedia presentations
US10/693,430 2003-10-24
US10/836,973 US7492769B2 (en) 2003-10-24 2004-04-30 Embedding a session description message in a real-time control protocol (RTCP) message
US10/836,973 2004-04-30

Publications (1)

Publication Number Publication Date
WO2005045704A1 (en) 2005-05-19

Family

ID=34577154

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/024065 WO2005045704A1 (en) 2003-10-24 2004-07-28 Embedding a session dession description message in a real-time control protocol (rtcp) message

Country Status (9)

Country Link
EP (1) EP1676216B1 (en)
JP (1) JP4603551B2 (en)
KR (1) KR101117874B1 (en)
AU (1) AU2004287133B2 (en)
BR (1) BRPI0406609A (en)
CA (1) CA2512191C (en)
MX (1) MXPA05007090A (en)
RU (1) RU2372647C2 (en)
WO (1) WO2005045704A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008047184A1 (en) * 2006-10-20 2008-04-24 Sony Ericsson Mobile Communications Ab Sharing multimedia content in a peer-to-peer configuration
EP1962474A1 (en) 2006-12-29 2008-08-27 Intel Corporation Method and apparatus for mutually-shared media experiences
US20100189256A1 (en) * 2007-07-02 2010-07-29 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for storing and reading a file having a media data container and metadata container
EP2769524A1 (en) * 2011-10-21 2014-08-27 Telefonaktiebolaget L M Ericsson (publ) Real-time communications methods providing pause and resume functionality and related devices
CN112368993A (en) * 2018-07-05 2021-02-12 三星电子株式会社 Method and apparatus for providing multimedia service in electronic device

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101779458B (en) * 2007-08-14 2013-01-16 日本放送协会 Video distribution device and video distribution program
US9615119B2 (en) 2010-04-02 2017-04-04 Samsung Electronics Co., Ltd. Method and apparatus for providing timeshift service in digital broadcasting system and system thereof
KR101705898B1 (en) * 2010-04-02 2017-02-13 삼성전자주식회사 Method and system for providing timeshift service in digital broadcasting system
JP5897603B2 (en) * 2011-01-19 2016-03-30 サムスン エレクトロニクス カンパニー リミテッド Control message composition apparatus and method in broadcast system
RU2485695C2 (en) * 2011-07-26 2013-06-20 Государственное образовательное учреждение высшего профессионального образования Академия Федеральной службы охраны Российской Федерации (Академия ФСО России) Method to verify virtual connection for transfer of multimedia data with specified characteristics
PT2704391T (en) * 2012-08-27 2019-08-07 Broadpeak System and method for delivering an audio-visual content to a client device
KR101448550B1 (en) * 2012-11-21 2014-10-13 서울대학교산학협력단 Apparatus and Method for Traffic Classificaiton
JP2015012305A (en) * 2013-06-26 2015-01-19 ソニー株式会社 Content supply apparatus, content supply method, program, terminal apparatus, and content supply system
CN105359536B (en) * 2013-07-17 2020-07-24 索尼公司 Content providing device and method, terminal device, and content providing system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6275471B1 (en) * 1998-05-12 2001-08-14 Panasonic Technologies, Inc. Method for reliable real-time multimedia streaming
US20030065917A1 (en) * 2001-09-26 2003-04-03 General Instrument Corporation Encryption of streaming control protocols and their headers

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001028909A1 (en) 1999-10-21 2001-04-26 Mitsubishi Denki Kabushiki Kaisha Elevator group controller
JP2002141964A (en) * 2000-08-24 2002-05-17 Matsushita Electric Ind Co Ltd Transmission reception method and its system
US7031311B2 (en) * 2001-07-23 2006-04-18 Acme Packet, Inc. System and method for providing rapid rerouting of real-time multi-media flows
CN100450176C (en) * 2001-12-11 2009-01-07 艾利森电话股份有限公司 Method of rights management for streaming media

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP1676216A4 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008047184A1 (en) * 2006-10-20 2008-04-24 Sony Ericsson Mobile Communications Ab Sharing multimedia content in a peer-to-peer configuration
US9318152B2 (en) 2006-10-20 2016-04-19 Sony Corporation Super share
EP1962474A1 (en) 2006-12-29 2008-08-27 Intel Corporation Method and apparatus for mutually-shared media experiences
US20100189256A1 (en) * 2007-07-02 2010-07-29 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for storing and reading a file having a media data container and metadata container
RU2459378C2 (en) * 2007-07-02 2012-08-20 Фраунхофер-Гезелльшафт цур Фёрдерунг дер ангевандтен Device and method to process and read file having storage of media data and storage of metadata
US8462946B2 (en) 2007-07-02 2013-06-11 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for storing and reading a file having a media data container and metadata container
RU2492587C2 (en) * 2007-07-02 2013-09-10 Фраунхофер-Гезелльшафт цур Фёрдерунг дер ангевандтен Форшунг Е.Ф. Apparatus and method for storing and reading file, having media data storage and metadata storage
US9236091B2 (en) 2007-07-02 2016-01-12 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for processing and reading a file having a media data container and a metadata container
EP2769524A1 (en) * 2011-10-21 2014-08-27 Telefonaktiebolaget L M Ericsson (publ) Real-time communications methods providing pause and resume functionality and related devices
CN112368993A (en) * 2018-07-05 2021-02-12 三星电子株式会社 Method and apparatus for providing multimedia service in electronic device
US11496529B2 (en) 2018-07-05 2022-11-08 Samsung Electronics Co., Ltd. Method and device for providing multimedia service in electronic device
CN112368993B (en) * 2018-07-05 2023-06-09 三星电子株式会社 Method and apparatus for providing multimedia service in electronic device

Also Published As

Publication number Publication date
KR20060105424A (en) 2006-10-11
RU2005120669A (en) 2006-01-20
AU2004287133A2 (en) 2005-05-19
CA2512191A1 (en) 2005-05-19
BRPI0406609A (en) 2005-12-06
JP4603551B2 (en) 2010-12-22
CA2512191C (en) 2013-12-31
KR101117874B1 (en) 2012-03-08
EP1676216B1 (en) 2012-10-24
EP1676216A1 (en) 2006-07-05
RU2372647C2 (en) 2009-11-10
AU2004287133B2 (en) 2010-05-13
EP1676216A4 (en) 2010-10-06
AU2004287133A1 (en) 2005-05-19
JP2007509573A (en) 2007-04-12
MXPA05007090A (en) 2005-10-18

Similar Documents

Publication Publication Date Title
EP2365449B1 (en) Embedding a session description message in a real-time control protocol (RTCP) message
CA2508888C (en) Session description message extensions
AU2004202538B2 (en) RTP payload format
CA2512191C (en) Embedding a session description message in a real-time control protocol (rtcp) message
WO2005111837A1 (en) Fast startup for streaming media

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

WWE WIPO information: entry into national phase

Ref document number: 2005/04757

Country of ref document: ZA

Ref document number: 200504757

Country of ref document: ZA

WWE WIPO information: entry into national phase

Ref document number: 2004779232

Country of ref document: EP

Ref document number: 2741/DELNP/2005

Country of ref document: IN

WWE WIPO information: entry into national phase

Ref document number: 2004287133

Country of ref document: AU

WWE WIPO information: entry into national phase

Ref document number: PA/A/2005/007090

Country of ref document: MX

ENP Entry into the national phase

Ref document number: 2005120669

Country of ref document: RU

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: 2512191

Country of ref document: CA

Ref document number: 2006536573

Country of ref document: JP

WWE WIPO information: entry into national phase

Ref document number: 1020057012410

Country of ref document: KR

121 EP: the EPO has been informed by WIPO that EP was designated in this application
ENP Entry into the national phase

Ref document number: 2004287133

Country of ref document: AU

Date of ref document: 20040728

Kind code of ref document: A

WWP WIPO information: published in national office

Ref document number: 2004287133

Country of ref document: AU

WWE WIPO information: entry into national phase

Ref document number: 20048032793

Country of ref document: CN

ENP Entry into the national phase

Ref document number: PI0406609

Country of ref document: BR

WWP WIPO information: published in national office

Ref document number: 2004779232

Country of ref document: EP

WWP WIPO information: published in national office

Ref document number: 1020057012410

Country of ref document: KR