WO2018005835A1

WO2018005835A1 - Systems and methods for fast channel change

Info

Publication number: WO2018005835A1
Application number: PCT/US2017/040060
Authority: WO
Inventors: Kumar Ramaswamy; Jeffrey Allen Cooper; John Richardson
Original assignee: Vid Scale, Inc.
Priority date: 2016-07-01
Filing date: 2017-06-29
Publication date: 2018-01-04

Abstract

Systems and methods described herein relate to providing fast switching between different available video streams. In an exemplary embodiment, a user viewing a selected channel of video content receives a manifest file (such as a DASH MPD) that identifies various representations of the selected channel. The manifest file also identifies channel-change streams for one or more alternate channels. The channel-change streams may have a shorter segment size than regular streaming content. While displaying the selected content, a client also retrieves the channel-change streams of the alternate channels. If the client changes to one of the alternate channels, the client displays the appropriate channel-change stream while a regular representation of the alternate channel is being retrieved.

Description

SYSTEMS AND METHODS FOR FAST CHANNEL CHANGE

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] The present application is a non-provisional filing of, and claims benefit under 35 U.S.C. §119(e) from, U.S. Provisional Patent Application Serial No. 62/357,863, entitled "SYSTEM AND METHOD FOR FAST STREAM SWITCHING USING PARALLEL PROCESSING IN CLIENT PLAYER," filed July 1, 2016, the entirety of which is incorporated herein by reference, and from U.S. Provisional Patent Application Serial No. 62/383,371, entitled "SYSTEMS AND METHODS FOR FAST CHANNEL CHANGE," filed September 2, 2016, the entirety of which is incorporated herein by reference.

BACKGROUND

[0002] In traditional client devices for receiving digital video streams, the client device includes either a dedicated hardware decoder or a software decoder running on a central processing unit (CPU). The decoder may be, e.g. a MPEG2, H.264 or other codec decoder for video, and MP3, AAC, or other codec for audio. Example client devices are set top boxes, PCs, laptops, tablets, or smartphones. The client decodes an incoming compressed audio/video stream and then presents it to the display, where the display may be attached via a video cable or may be integrated as part of the client device, such as in a smartphone or tablet.

[0003] There is a latency in compressed video/audio systems that causes a delay from the time a user chooses a signal (e.g. a channel) until the time the picture and sound are presented on the display device and speakers. This "stream switch time" or "channel change time" latency depends on the architecture of the broadcast system. Traditional broadcast systems such as cable, satellite, or terrestrial broadcast have relatively low latencies since the content is pushed from the headend to the receivers all the time, and the client receiver only needs to switch to a different incoming signal. There is a small delay due to client decoder/di splay processing and buffering before the decoder, but typical systems can switch channels in 1-2 seconds.

[0004] The stream switch time results in large part from the way video is encoded. In the most common video coding schemes, information that encodes an entire frame of video, independent of other frames, is sent relatively infrequently. Instead, most frames of video are encoded using information that represents the difference between that frame and one or more preceding and/or following frames. This use of differential coding prevents random access into a compressed stream. To enable channel change, random access points coded without reference to other frames may be periodically used. In adaptive bit rate (ABR) streaming systems, long intervals of as much as ten seconds may be used between random access points to improve efficiency in coding, since multiple user random access does not need to be supported for individual streams.

[0005] In traditional ABR streaming systems, a decoder responds to a channel change event by identifying a random access point to access the stream, accessing the new content at a low- bitrate ABR representation, and ramping the ABR bitrate up to the network capabilities.

[0006] In OTT (over the top) systems that use protocols such as HLS (HTTP live streaming), DASH (dynamic adaptive streaming over HTTP), or Silverlight on the internet, the stream switching time is generally longer due to the protocols. The client requests from the headend a different channel or stream, and then the headend responds. In addition, since the transmission medium is the internet, the transmission is bursty. The OTT protocols provide resilience to the bursty/lossy characteristics of the internet. This results in a stream change latency from 5-10 seconds or even longer.

SUMMARY

[0007] Systems and methods described herein relate to providing fast switching between different available video streams. For example, in a system where different high resolution sections of video (e.g. multiple zoomed views of the video) are available, systems and methods are proposed to enable very fast switch time for the user experience. Systems and methods disclosed herein further provide for rapid switching back to the previous stream.

Video Client for Rapid Stream Switching.

[0008] In an exemplary embodiment, a client device includes a plurality of decoders and a communication interface through which a plurality of encoded video streams may be retrieved. The client device may be retrieving a first-bitrate representation of first video content at a first bitrate. The first bitrate may be selected using adaptive bitrate (ABR) adaptation techniques to select a bitrate based on network conditions. While the client device is retrieving the first-bitrate representation of the first video content, the client device also retrieves a second-bitrate representation of second video content at a second bitrate. The second video content may be, for example, content representing an adjacent channel or content that is associated with the first content (e.g. the second content may be a zoomed-in version of a portion of the first content).

[0009] The second bitrate is lower than the first bitrate and may be the lowest-available bitrate for the second content. Using the decoders, the client device decodes both the first-bitrate representation of the first video content and the second-bitrate representation of the second video content. The client device also causes display of the decoded first-bitrate representation of the first video content, e.g. by displaying the video on a built-in display or outputting decoded video to an external display, such as a television or computer monitor.

[0010] While the first video content is being displayed, the client device receives an instruction from a user to switch to the second video content. In response to the instruction to switch to the second content, the client device switches from causing display of the decoded first-bitrate representation of the first video content to causing display of the decoded second-bitrate representation of the second content. The second content is thus displayed promptly (e.g. within one or two video frames) in response to the user's instruction, appearing almost instantaneous to the user. However the promptly-displayed representation of the second content is at a relatively low bitrate. Consequently, the user device also, in response to the instruction, retrieves a third- bitrate representation of the second video content at a third bitrate that is higher than the second bitrate. The user device decodes the third-bitrate representation of the second video content and subsequently switches from the lower-bitrate representation of the second content to the higher- bitrate representation once it is feasible to do so (e.g. when a sufficient amount of data has been buffered, or when a random-access point is received in the higher-bitrate representation).

[0011] Once the client device has switched to display of the second video content, the device may cease to retrieve the first content altogether, or the device may retrieve a low-bitrate representation of the first content (thus enabling rapid switching back to the first content, if the user desires).

[0012] In some embodiments, the client device receives a manifest (e.g. a media presentation description (MPD) in the case of MPEG-DASH) that identifies the bitrates of various representations of the second video content. The second bitrate (with which the second content is initially retrieved and decoded) may be selected so as to be the lowest bitrate identified in the manifest that is compatible with a decoder of the client device. The third bitrate (with which the second content is retrieved, decoded, and displayed after the client instruction) in some embodiments is selected to be the highest compatible bitrate identified in the manifest that is no greater than the first bitrate. In some embodiments, the third bitrate is selected to be the highest compatible bitrate identified in the manifest that is less than the first bitrate. In some embodiments, the third bitrate is equal to the first bitrate.

[0013] In some embodiments, the bitrate selected as the third bitrate depends on whether or not the first and second content are retrieved from the same network domain. If the first and second content are retrieved from the same domain, then network conditions are likely to be similar and the client device may retrieve the second content at a third bitrate that is substantially the same as the first bitrate (e.g. equal to the first bitrate, or the highest available bitrate lower than the first bitrate). On the other hand, if the first and second content are retrieved from different network domains, the client may perform ordinary ABR adaptation techniques to select the third bitrate. For example, the third bitrate may be the next-higher available bitrate above the second bitrate, and the client device may increase the bitrate for the second content in a stepwise fashion until further increases would not be compatible with current network conditions.

[0014] As an alternative to determining whether the first and second content are retrieved from the same network domain, other conditions may be imposed to determine whether retrieval of the first and second content are likely to be under substantially similar network conditions.

[0015] In one exemplary method, a client device causes display of first video content. While the device is causing display of the first video content, the device retrieves and decodes a first representation of second video content. In response to a user instruction, the client device (a) initially switches to display of the first representation of the second video content, (b) retrieves a second representation of the second video content at a higher bitrate than the first representation, and (c) subsequently switches to display of the second representation.

Content Coding for Enhancement of Stream Switching.

[0016] One exemplary embodiment provides a method of enabling fast stream change in an adaptive bitrate streaming system with multiple media streams being available for viewing. Some such embodiments operate in systems wherein the multiple media streams encode different representations of the same content. In an exemplary method, a first representation of the content is encoded at a first segment length, wherein the first representation is for a first view of the content. A second representation of the content is encoded at a second segment length, wherein the second segment length is temporally shorter than the first segment length and wherein the second representation is for a second view of the content. A manifest file is generated (e.g. a DASH MPD), wherein the manifest file identifies the first representation and the second representation. The information in the manifest file that identifies the first representation may be, for example, a URL or a template from which a URL can be generated. The manifest file may be delivered to a client.

[0017] In some embodiments, a third representation of the content is also encoded, wherein the third representation is for the second view of the content and further wherein the second representation is used for transitioning (e.g. fast stream change) from the first representation to the third representation.

[0018] In some embodiments, the generation of the manifest file includes selection of the second representation for inclusion in the manifest file based on a prediction that a particular streaming client is likely to request the second view of the content. [0019] In another exemplary embodiment, a method is provided for enabling fast channel change in an adaptive bitrate streaming system with multiple media streams being available for viewing, wherein the multiple media streams encode multiple content channels containing different content. A first media stream is encoded at a first segment length, wherein the first media stream is for a first channel of content. A second media stream is encoded with a second segment length wherein the second segment length is temporally shorter than the first segment length and wherein the second media stream is for transitioning from the first channel to a second channel. A manifest file (e.g. DASH MPD) is generated, where the manifest file identifies the first media stream and the second media stream. The manifest file may be delivered to a client.

[0020] In some embodiments, a third media stream is encoded, wherein the third media stream is for the second channel and further wherein the second media stream is used for transitioning (e.g. fast channel change) from the first media stream to the third media stream.

[0021] In some embodiments, the generation of the manifest file includes selection of the second representation for inclusion in the manifest file based on a prediction that a particular streaming client is likely to request the second channel.

BRIEF DESCRIPTION OF THE DRAWINGS

[0022] FIG. 1 is a block diagram illustrating the functional architecture of a system for adaptive bit rate distribution (ABR) of streaming video using traditional ABR streams and zoom coded streams.

[0023] FIG. 2 is a block diagram illustrating the functional architecture of an over-the-top (OTT) system including client components.

[0024] FIG. 3 is a flow chart illustrating steps performed during a stream switch in a system such as that of FIG. 2.

[0025] FIG. 4 is a functional block diagram of a system architecture for delivering video content with secondary streams.

[0026] FIG. 5 is a schematic illustration of file structure in secondary streams and primary streams, illustrating the use of a common group of pictures (GOP) structure.

[0027] FIG. 6 is a schematic illustration of file structure in secondary streams and primary streams, illustrating the use of relatively short segments.

[0028] FIG. 7 is a schematic illustration of file structure in secondary streams and primary streams, illustrating coding of secondary streams with a higher frequency of intra pictures as compared to primary streams. [0029] FIG. 8 is a message flow diagram illustrating a method of delivering streaming content with secondary streams.

[0030] FIG. 9 is a functional block diagram of a system architecture for delivering zoom coded content with secondary streams.

[0031] FIG. 10 is a schematic timeline illustrating a channel -change event performed in some embodiments.

[0032] FIG. 11 is a block diagram illustrating the functional architecture of an over-the-top (OTT) system including a client with multiple parallel decoders according to an embodiment.

[0033] FIG. 12 is a flow chart illustrating steps performed during a stream switch in an exemplary system such as that of FIG. 11.

[0034] FIG. 13 is a schematic timeline illustrating the downloading, decoding, and display, of different streaming video channels according to an exemplary embodiment.

[0035] FIG. 14 is a schematic timeline illustrating the downloading, decoding, and display, of different streaming video channels according to another exemplary embodiment.

[0036] FIG. 15 is a flow chart illustrating a stream switching method according to some embodiments.

[0037] FIG. 16 is a schematic illustration of a display illustrating an exemplary video stream display layout according to an embodiment.

[0038] FIG. 17 is a block diagram of the functional architecture of a block-based decoder that may be used in some embodiments.

[0039] FIG. 18 is a block diagram of the functional architecture of an exemplary wireless transmit/receive unit (WTRU) that may be employed as a client device in some embodiments.

[0040] FIG. 19 is a block diagram of the functional architecture of an exemplary network entity that may be employed as a server in some embodiments.

DETAILED DESCRIPTION

[0041] Transition between normal full-view videos and selected zoomed streams representing regions of interest benefit from low latency because the full-view and zoomed view are of the same ongoing video. In such applications, switching between different views of the same content preferably has a very low switching delay. However, the transition between ABR streams generally involves a delay caused by retrieving an appropriate access point in the target ABR stream. Potential access points may be 4-10 seconds apart, imposing significant switching delay. [0042] In exemplary embodiments disclosed herein, custom low-bitrate secondary streams are prepared for use as a low latency bridge in the stream change gap in video playback while a full resolution zoom stream is fetched in parallel. Candidate secondary streams may automatically be streamed along with the main channel to prepare for a switch into a different stream. Alternately a secondary stream may be delivered with low latency on request.

[0043] In exemplary embodiments, information identifying one or more secondary streams is provided in a manifest (such as an MPD) associated with a channel that is currently being viewed. In this way, when a channel change is initiated, there is no requirement for the client to wait to receive a manifest for the newly-selected channel before displaying the secondary stream. In some embodiments, availability of one or more secondary streams is advertised in a manifest of the current stream being viewed. In some embodiments, one or more of the secondary streams are requested in response to initiation of a channel change. In some embodiments, one or more of the secondary streams are requested in response to a determination that a channel change is likely to occur shortly. Secondary streams preferably are encoded with low bitrate and low access latency.

[0044] Exemplary systems and methods disclosed herein enable fast switching between a main stream and alternative channel streams. The alternative channel streams may be, for example, different broadcast television channels. In some embodiments, the alternative channel streams may represent, for example, a zoomed, highlighted, or otherwise enhanced version of the main channel stream and/or of one or more particular regions of the main channel stream. Streams including one or more such enhancements are referred to herein as zoom coded streams. It is desirable to provide a substantially seamless and responsive user experience when the end customer requests changes between main and alternative streams (such as zoom coded streams). Similarly, the switch from the alternative stream back to the main stream is preferably also substantially seamless and fast.

[0045] An exemplary functional architecture of a zoom coding system is illustrated in FIG. 1. Traditionally, an input full-resolution stream 100 (4K resolution, for example) may be processed and delivered at a lower resolution, such as high definition (HD) to an end consumer. In FIG. 1, traditional processing is represented in the components labeled "Traditional ABR Streams" 106. Using traditional adaptive bit rate (ABR) coding, an adaptive bit rate encoder 104 may produce ABR streams 106 that are published to a streaming server 108, and the streaming server in turn delivers customized streams to end customers 110.

[0046] An exemplary zoom coding encoder 102 receives the full resolution input video stream

100 and with a variety of techniques produces, for example, cropped portions of the original sequence at the native resolution. A cropped portion may be, for example, video of a particular player in a sporting event. These cropped portions may in turn be encoded using traditional ABR techniques. A user is presented with the choice of watching the normal program (e.g. the traditional streams delivered using ABR techniques) and in addition, zoom coded streams that may represent zoomed portions of the original program. Once the user makes a choice to view a zoom coded stream, the client may request a representation of the program with the appropriate bitrate from the streaming server. The streaming server may then deliver the appropriate stream to the end client.

[0047] In general, for a given video sequence, it is possible to create any number of zoom coded streams by tracking any number of objects. Technology for tracking objects is well known, with many classes of techniques being available. Exemplary techniques that may be used include those describe in, for example, "Object Tracking - A Survey", A. Yilmaz, O. Javed, M. Shah, ACM Computing Surveys, Vol. 38, No. 4, Article 13, (December 2006). An encoder, based on the type of content, may choose from the available techniques to track moving objects of interest and hence a region of interest.

[0048] In such embodiments, when a customer requests a switch to a zoom coded stream, the request is sent to the streaming server 108, which then starts delivering the appropriate zoom coded stream. However, since the zoom coded streams are ABR encoded, it may take a specified amount to time (e.g. 15-30 seconds in a standard ABR implementation). This kind of response time (the time taken for the screen to have the appropriate zoom coded video after the user requests the same) may not be acceptable to customers who are more familiar with channel changes in a broadcast or cable TV environment, which can be accomplished in 1 to 2 seconds.

[0049] Zoom coding systems and methods provide client devices the ability to switch between different views of the content via an OTT distribution. However, to make the user experience valuable, the switching time between views is preferably fast, less than a second, and if possible nearly instantaneous. Systems and methods described herein provide mechanisms for rapid (and in some embodiments nearly instantaneous) OTT stream switching.

[0050] A functional architecture of a traditional OTT system is shown in FIG. 2. FIG. 2 illustrates a plurality of available streams 200 at the headend. Each stream is available in different representations. Each representation has a different bit rate or resolution to provide adaptability to

IP network conditions. An IP network 202, such as the internet or private Ethernet network is provided to communicate selected representations. A client receiver 204 is provided. The client receiver includes a decoder 206 (video and audio decoder), scaler 208 (scaling of the picture for the display), and display renderer 210 which formats the pixels for the display system. A controller component 212 responds to user input from, e.g., a remote control or keyboard. The controller component provides signaling inside the client device for choosing head end signals and display components. A display 214 is provided, which may be, for example, a television screen, a PC screen, or an integrated display in a tablet or smartphone.

[0051] In the example of FIG. 2, four streams are available, each of which has a number N of different representations. The four streams may correspond to different channels or programs, or may represent different views of the same content, for example. Of course, embodiments disclosed herein may be implemented with different numbers of streams or representations or different OTT protocols. DASH, HLS, and Silverlight are exemplary OTT protocols that can be used.

[0052] In the traditional architecture illustrated in FIG. 2, when the user requests a change in channel (or stream selection), a channel-change process is performed as illustrated in the flow chart of FIG. 3. In step 302, a user selects a different stream (e.g. a different television channel). In step 304, the user' s controller issues a request for the new stream. In step 306, after some round- trip delay, the user's controller begins receiving the new stream. In step 308, after a sufficient amount of the new stream has been buffered, the user's decoder begins decoding the new stream, and in step 310, the video content encoded in the new stream is finally displayed to the user.

[0053] The process illustrated in FIG. 3 may result in a 5-15 second delay. A large portion of this latency is a result of the nature of IP networks. IP networks (especially the internet) are bursty and lossy networks. Therefore, OTT protocols transfer video and audio in time segment packages. Before the client can decode/display the content from a new stream, a significant amount of latency time is spent buffering up the incoming OTT video/audio packages of the content. Fast switching between streams is not feasible under such conditions.

[0054] In known adaptive bit rate (ABR) systems, a client downloads a manifest file that describes the available representations of video content (e.g. different representations with different bit rates to be used in different network conditions). When a user requests a channel change (for example from an available broadcast bouquet and using a program guide), the client application requests from the server one of the representations that is identified in the manifest file. Typically, the client starts by requesting the lowest bit rate representation and works its way up to the best quality that can be supported by the network bandwidth. Incidentally, the lowest bit rate representation can in general also be downloaded the fastest, helping get the channel decoded and presented the fastest. However, in most ABR systems, even the lowest bit rate representation may not have the characteristics that are desirable to affect a fast channel change. Rapid Stream Switching.

[0055] Systems and methods are described herein to improve the channel switch time to switch from the main channel to an alternate channel (e.g. a zoom channel) or from an alternate channel back to the main channel or to another alternate channel.

[0056] Exemplary systems and methods disclosed herein make use of one or more streams referred to herein as secondary streams. The secondary stream facilitates a fast channel change. In an exemplary embodiment, the secondary stream has one or more of the following properties: (a) very short segment lengths; (b) low bit rate and correspondingly lower spatial and/or temporal resolution; (c) high rate of intra frames; (d) intra-refresh coding techniques.

[0057] In an exemplary embodiment, when a specific channel is being streamed to a client, a set of secondary streams is substantially continuously downloaded to the client in the background. A selection of which secondary streams are downloaded may be made based on, for example, viewing habits of a particular user. Viewing habits may be tracked over time or programmed in by the user. In some embodiments, the secondary streams being sent include secondary streams for the next few channels in a program guide around the channel being watched. In some embodiments, the secondary streams being sent include secondary streams for the last channel that the user watched. In some embodiments, the secondary streams being sent may include a secondary stream for a channel viewed most frequently by the user (or, for example, the channel viewed most frequently at that particular time of day, or that particular combination of time of day and day of the week). Other criteria may alternatively be used for selecting which secondary stream or streams are sent to a particular client. The number of different secondary streams sent to a client may be limited in order to avoid imposing an undue burden on the network.

[0058] FIG. 4 is a schematic block diagram of an exemplary system architecture. As illustrated in FIG. 4, different representations (e.g. at different bit rates) of each channel of content are provided to a streaming server 402. The streaming server is also provided with a secondary stream for each channel of content. Each client device 404 receives from the streaming server a particular requested representation of the channel being viewed at that client device. Each client device also receives a set of secondary streams for other channels (or, for example, secondary streams for zoom coded versions of the channel being viewed).

[0059] At the respective client devices, the secondary streams may be continuously decoded and available to be selected for display nearly instantaneously when the user requests that particular channel. Alternatively, the secondary streams may be stored to be decoded only if that channel is selected at the client device (e.g. by a channel-change input from a user). The latter option saves CPU resources on the client device.

[0060] In cases where a channel change is initiated at a client device from a currently-viewed channel to a newly-selected channel, for which a secondary stream is being received, the channel change can be effected rapidly. The channel change may appear nearly instantaneous in cases where the secondary stream of the new channel was already being decoded. In cases where decoding of the secondary stream does not start until the new channel is selected, switching to display of the new channel may still be relatively rapid. In some embodiments, when the secondary stream of the newly-selected channel is not being received, the secondary stream may be requested in response to, e.g. a channel -change request from a user and may allow for relatively rapid display of the newly-selected channel.

[0061] In some embodiments, the GOP (group of pictures) structure of the secondary stream can be selected to allow entry point access at sub-segment locations. For example, if segments are two seconds long, the secondary stream may include intra frames at quarter-second intervals. When the switch to secondary stream is initiated, then the client latency to switch may be this quarter second instead of the full segment duration of two seconds. An example of a secondary stream with a common GOP structure is illustrated in FIG. 5.

[0062] In some embodiments, intra refresh coding is used for coding of the secondary stream. In intra refresh coding, macroblocks are systematically inserted in inter frames such that a client picture is refreshed within X frames.

[0063] As illustrated in FIG. 6, secondary streams with a relatively short temporal segment length may be used to permit rapid stream switching.

[0064] As illustrated in FIG. 7, secondary streams may be coded with a relatively shorter segment length (e.g. in terms of time spanned by the segments), when compared to the segment length of the main content streams. This may allow a channel change to be accomplished without waiting for the current segment of main content to be entirely played out.

[0065] With respect to FIG. 7, in response to selection of a new channel (e.g. immediately upon selection, or after a certain number of secondary streams are decoded and displayed), one of the regular primary representations is requested. In some embodiments, the lowest bit rate representation is requested. In exemplary embodiments, the segment boundary of the standard stream and the secondary stream are aligned in the process of switching from a secondary stream to a regular representation. For example, the secondary stream segment length may be encoded to have a temporal length of one second while segments of the standard representations of that program are encoded to have a temporal length of ten seconds. In this case, after the switch to the secondary stream of the new channel has occurred, the client may wait until the next ten-second time boundary before switching from display of the secondary stream to display of the regular representation. Such a method may be employed when, for example, the segment length of the secondary stream is smaller than or equal to the normal segment length.

[0066] In some embodiments, the lowest-bitrate representation of a channel is used as the secondary stream. While this may have some advantages in terms of simplicity, it may not have all the desirable characteristics of the specifically designed secondary streams disclosed herein.

[0067] A message flow diagram of an exemplary event sequence in some embodiments is illustrated in FIG. 8. FIG. 8 illustrates the packaging and delivery of different representations of video content to client devices, including the delivery of secondary streams. Independent of the mechanism by which a client chooses the secondary streams, the server creates an appropriate media presentation description (MPD) for each user. The MPD contains information identifying a subset of channels available in the bouquet that would be considered likely channels that the end user may watch next. The number of secondary streams requested and processed by the end user may be programmable or may be set adaptively. It may be noted that there may be differentiated users with different sets of rights. A User Access Rights server may be used to inform the head end of the available service category for each user that signs up and requests a piece of content for consumption.

[0068] As illustrated in FIG. 8, one or more content sources 802 provide various video programs 804a, 804b (possibly among others) to an encoder 806. Encoder 806 encodes each video program 804a, 804b into a respective set of adaptive bitrate streams 808a, 808b. Each of these sets of streams includes a secondary (or channel-change) stream, which may be the lowest-bitrate stream in each set. A transport packager 810 segments the streams and generates a manifest (e.g. an MPD) for each program. The segments (812a, 812b) and corresponding manifests (814a, 814b) are made available over a network, for example being distributed to a plurality of edge streaming servers (e.g. 818) through an origin server 816.

[0069] Continuing in FIG. 8, a video client 820 issues a request 822 for particular video content to a web server 824. The web server 824 may redirect (826) the client to an edge streaming server

818. The client requests (828) the content from the edge streaming server. The request may provide a user identifier, and the edge streaming server may confirm (communications 832) with a user rights access server (830) that the user is authorized to access the content. In step 834, a manifest file is delivered to the client. Based on the manifest file, the client requests appropriate streams in step 836. The streams 838 provided in response to the request may include a primary content stream along with one or more low-bitrate secondary streams. At 840, the client receives a user input indicating selection of content that corresponds to one of the secondary streams. In response, the client substantially instantaneously (e.g. within one or two frames after the input) switches to display of the relevant secondary stream. The client then operates to obtain a higher-bitrate version of the selected content. This may involve communication 842 (e.g. redirection) with the web server to obtain an address at which the new content is available. The client requests (844) a higher- bitrate representation of the selected content, and the client continues to display the lower-bitrate version of the selected content until a sufficient amount of higher-bitrate representation 846 of the selected content has been received. Once a sufficient amount of the higher-bitrate representation 846 has been received, the client seamlessly switches to display of the higher-bitrate representation.

[0070] In exemplary embodiments, at least some of the channels to which a client may switch are streams representing zoom coded content. In some embodiments, unlike a general channel lineup, there may be only a few zoom coded streams that are created and available for viewing. In such embodiments, all of the secondary streams for zoom coded content may be constantly downloaded and either continuously decoded or decoded only when a switch to that specific zoom coded stream is requested. In parallel, the client may request that the server start delivering a normal representation of the zoom coded content being requested. This normal representation is buffered, decoded and ready to be played out when the next segment is due to be played.

[0071] Various enhancements may be implemented using zoom coded streams. In some embodiments, a zoom coded stream may implement a replay feature by allowing a client to repeat playback starting from a particular previous time (e.g. x seconds earlier). To implement a "zoom forward" feature, a client may simply play the zoom stream. In an exemplary zoom forward feature, the main stream and the zoom stream (of the stream that is being switched into from a current stream) are frame synchronized so as to prevent jumps in time (either forward or backward) when the zoom forward effect is requested.

[0072] FIG. 9 is a schematic block diagram of the architecture of an exemplary system for rapid channel change in which clients is provided with the ability to switch among different zoom coded streams related to a main stream. In the illustrated embodiment, the zoom coded streams are zoomed-in views of particular regions of interest (ROIs) from the main stream, although other types of zoom coded streams (e.g. streams with enhancements other than spatial zooming, such as increased frame rate or an increased bit depth) may be used.

[0073] Some embodiments are implemented using MPEG DASH. A simplified example of an

MPD (a "pseudo MPD") that may be used in some embodiments is described below. The exemplary MPD identifies different content representations for secondary streams and zoomed streams within a single DASH period and adaptation set. Other locations and methods for sharing this data, such as other types of manifest file, may alternatively be used.

[0074] A pseudo MPD describing a primary stream a zoom stream and a fast channel change stream is described below. Descriptions of the streams referenced in the MPD are further provided below. Note that the exemplary secondary stream has short segment length as well as low resolution and bitrate. The deriving of segment numbers for the individual duration and segment length may be performed through Segment Templates illustrated below. Parameters to segment template allow specification of short segments for the secondary stream 'ccl ' (the "cc" representing "channel change").

[0075] This example includes three different representations for media of total length 300 seconds. For the primary view 'primary', a representation with 30 segments of length 10 second and video frame resolution of 1920x1080 and 6.8 Mbs is used. For a zoom view 'zooml ' a representation with 30 segments of length 10 second and video frame resolution of 1920x1080 and 6.8 Mbs is used. For the secondary stream corresponding to zooml, namely 'ccl ', a representation with short segment size of 1 second and lower resolution and bitrate is used. The relevant representations may be identified as follows in a DASH MPD.

Representation id="primary" bandwidth="6800000" width="1920" height="1080">

Representation id="zooml" bandwidth="6800000" width="1920" height="1080">

Representation id="ccl" bandwidth="500000" width="360" height="240">

In the context of DASH MPD, the above-listed elements may appear as follows.

<!— main video source -->

<BaseURL>video/</BaseURL>

<!- Main 1080p Representation at 6.8 Mbps and 10 second segments -->

Representation id= " primary" bandwidth= "6800000" width= " 1920" height= " 1080" >

<BaseURL>primary/</BaseURL>

</SegmentTimeline>

</SegmentTemplate>

</Representation> <!-Zooml 1080p Representation at 6.8 Mbps and 10 second segments -->

Representation id="zooml" bandwidth="6800000" width="1920" height="1080">

<BaseURL>Zooml/</BaseURL>

</SegmentTimeline>

</SegmentTemplate>

</Representation>

<!- Channel Change Representation at 0.5 Mbps and 1 second segments -->

Representation id="ccl" bandwidth="500000" width="360" height="240">

</SegmentTimeline>

</SegmentTemplate>

</Representation>

</AdaptationSet>

[0076] A further exemplary embodiment is illustrated in FIG. 10. In the embodiment of FIG. 10, a client device is receiving, decoding, and causing display of a selected representation of content in Channel 1. While the client is receiving and decoding the representation of Channel 1, the client is also receiving the secondary stream for a different channel, Channel 2. Information identifying the secondary stream of Channel 2 (such as a URL) may be conveyed in, for example, the MPD of Channel 1.

[0077] In the exemplary embodiment of FIG. 10, the client initiates a channel change (step 1002) at time . This may be in response to, for example, a user initiating a channel change through a remote control (e.g. an up or down arrow), or a user selecting a region of interest in a zoom coded video. At time t₂, a random-access point is reached in the secondary stream. In this case, the random-access point is the beginning of segment p+2 of the secondary stream, although the random-access point may be a different type of random access point, such as an intra frame within a segment. The client then causes display of the secondary stream (step 1004) starting with segment p+2. Also in response to the initiation of the channel change, the client requests a regular ABR representation of Channel 2. For example, the client may request segment m+1 of Channel 2, which may be the next segment with a start time occurring after the client initiated the channel change. At time t₃, a random access point is reached in the regular Channel 2 stream. In this case, the random-access point is the start of segment m+1 of Channel 2. The client then causes display of the content in the regular representation of Channel 2 (step 1006). Thus, from the perspective of a viewer, a channel change is requested at time h, the content of Channel 2 appears on screen at time h (though potentially with a relatively low quality), and a higher-quality version of Channel 2 appears at time h. As can be seen from FIG. 10, the latency between times and h can be reduced when the temporal frequency of random-access points (e.g. segment starts and intra frames) in the secondary stream is increased.

[0078] In a variation of the embodiment of FIG. 10, the client is decoding (but not displaying) the secondary stream of Channel 2 while the client is causing display of Channel 1. In such embodiments, when the client initiates a channel change (at time h), the client may quickly switch to display of segment p+1 of the secondary stream, which was already being decoded. In response to initiation of the channel change, the client also requests and receives a regular (primary) representation of Channel 2. The client continues to decode and display the secondary stream (including segment p+2) until a random access point in Channel 2 is reached, such as the start of Segment m+1, at which time the client causes display of the regular representation of Channel 2. An example of a client device that may be used with this embodiment is the client device of FIG. 11, which includes multiple decoders capable of operating in parallel.

[0079] Simultaneous decoding of multiple streams may add complexity to the client. However, in some embodiments, the decoder may be implemented using a software function running on a CPU. Such software based clients may have enough CPU power to decode multiple streams simultaneously. A software-based implementation may operate to decode multiple streams simultaneously, depending on the resolution and bit rate.

[0080] A client may use various techniques to obtain information regarding the availability of different representations. For example, the client may retrieve a respective media presentation description (MPD) file or other manifest file corresponding to each channel from a streaming media server. Alternately the client may receive a combined manifest file or MPD which defines the various representations for multiple different channels or multiple related content views. Either type of MPD may specify the secondary streams (or streams appropriate for use as secondary streams) in addition to other streams and representations.

[0081] Exemplary embodiments may be employed in conjunction with zoom coding systems. For example, different streams may not be different channels, but may instead be video streams related to another video asset. For example, the different streams may include a primary stream (e.g. a video of a sporting event) along with secondary streams representing high-resolution sections of the primary stream (e.g. zoomed video of a particular player or of a game ball), or different versions of the primary stream with different frame rates or bit depths, for example. [0082] Different techniques may be used for selecting which secondary streams— and how many secondary streams— are selected to be retrieved at a low bitrate to allow for rapid stream switching. In general, it is desirable for the secondary streams to be the streams that are the most likely to be requested by a viewer of the primary stream. For example, in an embodiment where each stream corresponds to a traditional numbered broadcast or cable channel, and were the viewer is watching a stream corresponding to channel N, the secondary streams may be streams corresponding to channels N+l and N-l . Alternatively, if the client device determines that the user has been consistently "surfing" through the channels in an upward (or downward) direction, then when the client device is displaying channel N as a primary stream, it may be retrieving channels N+l and N+2 (or N-l and N-2) as secondary streams. Where a primary stream is being displayed along with a display (e.g. a "thumbnail" still image or video) of one or more other "recommended" streams, the client device may retrieve and decode the recommended streams as secondary streams. Other techniques may be used for selecting the identity and number of secondary streams.

[0083] In some embodiments, including zoom coding applications, the client may display more than one stream on screen at a time. For example, to provide the user visual information about other available zoom streams, the client may display on screen a small scaled-down version of one or more other available streams.

[0084] As one example involving a sporting event, Stream 1 may depict the entire playing field, Stream 2 may depict a zoomed-in region centered on player 1, Stream 3 may depict a zoomed-in region centered on player 2, and Stream 4 may depict a zoomed-in region centered on the ball. The display at the client may be broken into any style of mosaic pictures of these different streams. For example, FIG. 16 shows Stream 1 using the majority of the screen, while Streams 2,3,4 are shown as smaller pictures. In this case, Stream 1 may be considered a 'primary stream' which is retrieved, decoded and displayed at a relatively high bit rate, and Streams 2,3,4 may be received as secondary streams which are retrieved, decoded and displayed at lower bit rates and possibly using a smaller segment size. When the user decides he wants to see the zoom of player 2, he could select stream 3, in response to which the client may swap the positions of stream 1 and stream 3 on the screen. Using the techniques described herein, the swap could be achieved with relatively low latency, with the secondary stream version of Stream 3 immediately scaled and displayed in the larger area previously occupied by Stream 1. The client may then request a higher bit rate version of the zoomed Player-2 view in order to transition Stream 3 to a higher quality version appropriate for display in the larger area of the screen. At the same time, the client may request a secondary stream of the entire field view, in order to transition Stream 1 to a lower bitrate, lower quality version which may be sufficient for display in the smaller screen area formerly occupied by Stream 3.

[0085] In exemplary embodiments described above, multiple decoders are used to decode the primary stream and the secondary streams in parallel. Alternative embodiments may be implemented using only a single decoder. In some such embodiments using a single decoder, the client requests and retrieves both the primary stream and the secondary streams as described above, but the client decodes only the primary stream and causes display of that primary stream. The client in such embodiments buffers the secondary streams without decoding those streams. In response to an indication to change streams (e.g. a user input requesting a channel change), the client stops decoding and displaying the primary streams and instead begins decoding and displaying the buffered secondary stream that corresponds with the newly-selected channel. Also in response to the indication to change streams, the client requests a higher bitrate version of the newly-selected channel. When the higher bitrate version of the newly-selected channel is received, the client switches to decoding and causing display of the higher bitrate version. The client may further retrieve and buffer a lower-bitrate secondary stream corresponding to the original channel, allowing a user to quickly switch back to the original channel if desired.

[0086] Exemplary embodiments disclosed herein provide enhanced (e.g. near instantaneous) stream switching using an architecture as depicted in FIG. 11. In the example of FIG. 11, multiple decoders (in this example, four decoders) are provided in the client and are decoding different streams simultaneously. When the user requests a stream switch, the only latency is the time to switch the display renderer from an initial stream to a new stream. Since the pixels are present at the input to the display renderer at all times, the switch time may be near instantaneous, as low as one or two frame times, typically 1/30^th of a second (33 milliseconds) for 60 fps video. To a person viewing, the switch would appear to be a nearly instantaneous response to a key press on a remote control or keyboard.

[0087] FIG. 12 is a flow chart illustrating an exemplary method that may be performed in the system of FIG. 11.

[0088] Simultaneous decoding may use more IP network bandwidth than decoding of just a single stream. To minimize the impact of the increased bandwidth, streams that are not on screen but are being decoded can be decoded using the lowest bitrate available for that stream, where multiple representations with different bitrates are available for each stream. After a switch takes place (e.g. while the client is displaying a low bitrate version of the stream that has been switched to), the client may request a higher bitrate representation for the stream that has been switched to, while the previous stream (e.g. the originally displayed stream that is now being switched to the background) may be switched to a lower bit rate representation.

Rapid Bitrate Adaptation.

[0089] Exemplary embodiments disclosed herein provide for recovery to the optimal channel bandwidth after a channel switch is made. In addition to the various content representations intended for ongoing display, a server may make available one or more alternative representations intended for use during a transition between one channel or content view and a different channel or content view. The alternative representations may be referred to as secondary streams. Such streams may be provided at a low bit rate, and/or with a small segment size, compared to higher bit rates and larger segment sizes which may be used for other content representations available from the server. In an exemplary adaptive bit rate (ABR) decoder, a first (or primary) stream (e.g. a first channel or content view) and one or more of the secondary streams are streamed and/or decoded simultaneously. The client may switch nearly instantaneously to the selected channel since a decoded version of the selected channel is available at the client. This decoded version may, however, be a very low bit rate stream with a low video quality. With a traditional ABR system, the client would then start up its ABR adaptation process and slowly works its way up to the optimal quality based on available bandwidth. In an exemplary embodiment, however, the process of recovery is accelerated by using information from the first or primary channel's last used bandwidth setting. Based on that, the client requests the representation that has a bandwidth requirement which is the same as or just below that of the last presumed available bandwidth of the first (or primary) stream. This intelligent ABR request circumvents the sometimes slow ramp up process of the ABR system due to its adaptation. This is also useful in a zoom coding system where the zoom channel switch tends to be a relatively short term effect, and where a ramp up to the right bit rate using the normal client algorithm would be too slow and would detract from the customer experience.

[0090] As an example, consider a situation in which Channel 1 (first or primary channel) has four representations with the following bit rates: 3.5 Mbps, 2 Mbps, 1.2 Mbps and 500 Kbps. An adjacent channel, Channel 2, to which the customer is switching to also has four representations with the same respective bit rates as the primary channel. Consider a case in which the user is currently tuned to Channel 1 and in which the network conditions allow Channel 1 to operate in a steady state at 3.5 Mbps. Assume the customer switches to Channel 2. In a normal scenario, assuming that the lowest bitrate representation was being used as a secondary stream, the player would already have been retrieving and decoding the 500 Kbps representation corresponding to

Channel 2 even prior to the switch. Now a normal client device may be expected to switch its way up to 3.5 Mbps (assuming network conditions have not changed much during the switch). In some exemplary embodiments, the client device promptly requests the 2.0 Mbps representation of Channel 2, where 2.0 Mbps is the highest-bitrate representation that is lower than the 3.5 Mbps of the representation that was being retrieved for Channel 1. In this example, the client does not make any request for the 1.2Mbps representation. In other exemplary embodiments, the client device may promptly request the 3.5Mbps representation after executing the switch, since this is equal in bitrate to the representation at which Channel 1 was being retrieved at the time of the switch. In this case, the client need not make any step-up requests for the 1.2Mbps and 2.0Mbps representations of the Channel 2 content.

[0091] In exemplary embodiments, gains in switching time may be improved when additional representations are available. Using embodiments disclosed herein, the optimal operating point is reached sooner. These systems and methods may further be used for switching to zoom coded streams.

[0092] FIG. 13 is a timeline illustrating an exemplary embodiment. The timeline shows four streams (CH 0, CH 1, CH 2, CH 3) being received and decoded by the client. Initially, the client is displaying a 3.5 Mbps representation of Channel 0. The client (e.g. as a result of user input) may choose a different stream to place on screen (step 1302). When a user selects a different stream, in this case Channel 2, the lowest bitrate representation (500 Kbps) is displayed since this representation was already being received and/or decoded. In response to the switch to Channel 2, the client requests a higher bitrate representation (3.5 Mbps) of Channel 2. When a sufficient amount of data has been received for the 3.5 Mbps representation of Channel 2, the client switches seamlessly from display of the 500Kbps representation to the 3.5 Mbps representation of Channel 2, such that a user may perceive an increase in quality but would not perceive any temporal jump in the content. To minimize the duration that the low bit rate representation is used after the switch, the OTT package format may use short segment durations to allow the client to quickly switch which representation is used. Throughout the foregoing, 500 Kbps representations of Channels 1 and 3 are being retrieved and decoded to enable rapid switching in case the user decides to change to one of those channels. For example, at step 1304, the user switches to Channel 1. In response, the client switches to display of the 500 Kbps representation of Chanel 1 and also requests the higher bitrate representation (3.5 Mbps) of Channel 1. When a sufficient amount of data has been received for the 3.5 Mbps representation of Channel 1, the client switches seamlessly from display of the 500Kbps representation to the 3.5 Mbps representation of Channel 1, such that the user may perceive an increase in quality but would not perceive any temporal jump in the content. [0093] In the embodiment of FIG. 13, the client upon a channel change promptly requests the new stream at the bitrate that was being used for the previous channel. In an alternative embodiment illustrated in FIG. 14, the client upon a channel change requests the new stream at the highest bitrate that is lower than the bitrate in use for the currently- displayed stream. For example, Channel 2 may have representations available at 500 Kbps, 1.2 Mbps, 2.0 Mbps, and 3.5 In the example of FIG. 14, the representation of Channel 0 at 3.5 Mbps is initially being displayed. Upon a user instruction to switch to Channel 2 (step 1402), the client promptly requests a representation of Channel 2 at 2 Mbps, which is the highest-bitrate representation with a bitrate lower than 3.5 Mbps. If network conditions permit after viewing of Channel 2 at 2 Mbps, the client may then request the 3.5 Mbps representation of Channel 2. In alternative embodiments, the client may promptly request the 3.5 Mbps representation of Channel 2 without first requesting the 2 Mbps representation of Channel 2.

[0094] In an exemplary method, as illustrated in FIG. 15, a client device receives manifest files (e.g. MPDs) for at least first video content and second video content. The manifest files in this example identify, for each stream, a plurality of representations having different bitrates. In this example, the user wishes initially to view the first content. Based on network conditions, the user's client device adaptively selects a bitrate at which the first content is retrieved, decoded, and displayed. The adaptive selection of a bitrate for the first content may include first retrieving the content at the lowest available bitrate and, if network conditions permit, retrieving representations of the content at increasingly high bitrates until no higher bitrate is available or until network conditions would not permit a further increase of bitrate. The bitrate selected adaptively for retrieval of the first content is referred to in FIG. 15 as bitrate B.

[0095] In the example of FIG. 15, the client device retrieves a low-bitrate representation of the second video content. For example, the client device may retrieve and decode the representation of the second content that has the lowest available bitrate that is identified in the manifest and that is decodable by the client device. This lowest available bitrate version may be used as a secondary stream for the second content.

[0096] The client device decodes the first content and causes the first content to be displayed, e.g. by displaying the decoded video on a built-in display of the client device or by sending the decoded video to a separate display device, such as a television screen or computer monitor. The client device also decodes the low-bitrate representation of the second content in parallel. While the decoded video for the second content is thus available, that decoded video is initially not displayed (or, in some embodiments, is displayed only in a smaller format, such as picture-in- picture or as illustrated in FIG. 16). [0097] A user then instructs the client device to switch to display the second content. This instruction may be received in a variety of ways, such as the user pressing a button on a remote- control, keypad, or touch screen. In response to this instruction, the client device promptly switches from causing display of the decoded video of the first content to causing display of the decoded video of the second content.

[0098] Because the representation of the second content was already being retrieved and decoded, the switch to the second content appears nearly instantaneous (e.g. with a delay which may be no more than one or a few frames of video) to the user. However, at this point, the second stream may have a relatively low image quality because it was being retrieved at a relatively low bitrate. In some embodiments, such as that illustrated in FIG. 15, the client device operates to determine whether the bitrate adaptation process can be expedited. To do this, in the embodiment of FIG. 15, the client device further determines whether the representations of the first and second content are being retrieved from the same network domain. If the representations of the first and second content are being retrieved from the same domain, this serves as an indication that the network conditions for retrieving the first and second content are likely to be similar and are likely to support similar bitrates. Thus, if the first and second content are being retrieved from the same domain, the client device promptly requests a representation of the second content at bitrate B (which had been selected for delivery of the first content), or, if bitrate B is not available, at the greatest bitrate less than B. The client device displays the second content at bitrate B once sufficient data has been received to decode and display that stream. It should be noted that bitrate adaptation may continue to be performed by the client device after the initial request for the second content at bitrate B. That is, changing network conditions may lead to the client device requesting a representation of the second content at bitrates greater than or less than B. However, the initial request for the second content at bitrate B is expected to result in a more rapid convergence on an optimal bitrate for the delivery of the second content.

[0099] On the other hand, if the client device determines that the first and second content are not retrieved from the same domain, the client may perform an ordinary adaptive bitrate procedure to select a bitrate at which to receive the second content, for example by gradually ramping up the requested bitrate until network conditions will not accommodate further increases.

[0100] In alternative embodiments, the second content is requested at a representation having bitrate B (or the highest available bitrate lower than B) regardless of whether the first and second content are retrieved from the same domain. This may be particularly beneficial where, for example, bitrate limitations imposed by the network are predominantly limitations arising closer to the client device, e.g. limitations in the bitrate of the client device's connection with a corresponding access point. In such cases, the optimum bitrate for delivery of the first content may be expected to be close to the optimum bitrate for delivery of the second content, even if the two are retrieved from separate network domains.

Exemplary Decoder.

[0101] FIG. 17 is a functional block diagram of a block-based video decoder 1700. Each of the decoders within the single-decoder client of FIG. 2 or the multi -decoder client of FIG. 11 may be implemented using the functional architecture of decoder 1700. In the embodiment of FIG. 17, a received video bitstream 1702 is unpacked and entropy decoded at entropy decoding unit 1708. The coding mode and prediction information are sent to either the spatial prediction unit 1760 (if intra coded) or the temporal prediction unit 1762 (if inter coded) to form the prediction block. The residual transform coefficients are sent to inverse quantization unit 1710 and inverse transform unit 1712 to reconstruct the residual block. The prediction block and the residual block are then added together at 1726. The reconstructed block may further go through in-loop filtering at loop filter 1766 before it is stored in reference picture store 1764. The reconstructed video may then be sent out to drive a display device, as well as used to predict future video blocks. In the case of a client having multiple decoders, such as the client of FIG. 11, one or more of the decoder components may be shared among the decoders.

[0102] Note that various hardware elements of one or more of the described embodiments are referred to as "modules" that carry out (i.e., perform, execute, and the like) various functions that are described herein in connection with the respective modules. As used herein, a module includes hardware (e.g., one or more processors, one or more microprocessors, one or more microcontrollers, one or more microchips, one or more application-specific integrated circuits (ASICs), one or more field programmable gate arrays (FPGAs), one or more memory devices) deemed suitable by those of skill in the relevant art for a given implementation. Each described module may also include instructions executable for carrying out the one or more functions described as being carried out by the respective module, and it is noted that those instructions could take the form of or include hardware (i.e., hardwired) instructions, firmware instructions, software instructions, and/or the like, and may be stored in any suitable non-transitory computer- readable medium or media, such as commonly referred to as RAM, ROM, etc.

Exemplary Client Device.

[0103] Exemplary embodiments disclosed herein are implemented using one or more wired and/or wireless network nodes, such as a wireless transmit/receive unit (WTRU) or other network entity. [0104] FIG. 18 is a system diagram of an exemplary WTRU 1802, which may be employed as a client device in embodiments described herein. As shown in FIG. 18, the WTRU 1802 may include a processor 1818, a communication interface 1819 including a transceiver 1820, a transmit/receive element 1822, a speaker/microphone 1824, a keypad 1826, a display/touchpad 1828, a non-removable memory 1830, a removable memory 1832, a power source 1834, a global positioning system (GPS) chipset 1836, and sensors 1838. It will be appreciated that the WTRU 1802 may include any sub-combination of the foregoing elements while remaining consistent with an embodiment.

[0105] The processor 1818 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like. The processor 1818 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 1802 to operate in a wireless environment. The processor 1818 may be coupled to the transceiver 1810, which may be coupled to the transmit/receive element 1122. While FIG. 18 depicts the processor 1818 and the transceiver 1820 as separate components, it will be appreciated that the processor 1818 and the transceiver 1820 may be integrated together in an electronic package or chip.

[0106] The transmit/receive element 1822 may be configured to transmit signals to, or receive signals from, a base station over the air interface 1816. For example, in one embodiment, the transmit/receive element 1822 may be an antenna configured to transmit and/or receive RF signals. In another embodiment, the transmit/receive element 1822 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, as examples. In yet another embodiment, the transmit/receive element 1822 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 1822 may be configured to transmit and/or receive any combination of wireless signals.

[0107] In addition, although the transmit/receive element 1822 is depicted in FIG. 18 as a single element, the WTRU 1802 may include any number of transmit/receive elements 1822. More specifically, the WTRU 1802 may employ MTMO technology. Thus, in one embodiment, the WTRU 1802 may include two or more transmit/receive elements 1822 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 1816.

[0108] The transceiver 1820 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 1822 and to demodulate the signals that are received by the transmit/receive element 1822. As noted above, the WTRU 1802 may have multi-mode capabilities. Thus, the transceiver 1820 may include multiple transceivers for enabling the WTRU 1802 to communicate via multiple RATs, such as UTRA and IEEE 802.11, as examples.

[0109] The processor 1818 of the WTRU 1802 may be coupled to, and may receive user input data from, the speaker/microphone 1824, the keypad 1826, and/or the display/touchpad 1828 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit). The processor 1818 may also output user data to the speaker/microphone 1824, the keypad 1826, and/or the display/touchpad 1828. In addition, the processor 1818 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 1830 and/or the removable memory 1832. The non-removable memory 1830 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device. The removable memory 1832 may include a subscriber identity module (SEVI) card, a memory stick, a secure digital (SD) memory card, and the like. In other embodiments, the processor 1818 may access information from, and store data in, memory that is not physically located on the WTRU 1802, such as on a server or a home computer (not shown).

[0110] The processor 1818 may receive power from the power source 1834, and may be configured to distribute and/or control the power to the other components in the WTRU 1802. The power source 1834 may be any suitable device for powering the WTRU 1802. As examples, the power source 1834 may include one or more dry cell batteries (e.g., nickel -cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), and the like), solar cells, fuel cells, and the like.

[0111] The processor 1818 may also be coupled to the GPS chipset 1836, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 1802. In addition to, or in lieu of, the information from the GPS chipset 1836, the WTRU 1802 may receive location information over the air interface 1816 from a base station and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 1802 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.

[0112] The processor 1818 may further be coupled to other peripherals 1838, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity. For example, the peripherals 1838 may include sensors such as an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.

Exemplary Server.

[0113] FIG. 19 depicts an exemplary network entity 1990 that may be used in embodiments of the present disclosure, for example as a server for generating or delivering manifest files and/or video streams according to methods disclosed herein. As depicted in FIG. 19, network entity 1990 includes a communication interface 1992, a processor 1994, and non-transitory data storage 1996, all of which are communicatively linked by a bus, network, or other communication path 1998.

[0114] Communication interface 1992 may include one or more wired communication interfaces and/or one or more wireless-communication interfaces. With respect to wired communication, communication interface 1992 may include one or more interfaces such as Ethernet interfaces, as an example. With respect to wireless communication, communication interface 1992 may include components such as one or more antennae, one or more transceivers/chipsets designed and configured for one or more types of wireless (e.g., LTE) communication, and/or any other components deemed suitable by those of skill in the relevant art. And further with respect to wireless communication, communication interface 1992 may be equipped at a scale and with a configuration appropriate for acting on the network side— as opposed to the client side— of wireless communications (e.g., LTE communications, Wi-Fi communications, and the like). Thus, communication interface 1992 may include the appropriate equipment and circuitry (perhaps including multiple transceivers) for serving multiple mobile stations, UEs, or other access terminals in a coverage area.

[0115] Processor 1994 may include one or more processors of any type deemed suitable by those of skill in the relevant art, some examples including a general-purpose microprocessor and a dedicated DSP.

[0116] Data storage 1996 may take the form of any non-transitory computer-readable medium or combination of such media, some examples including flash memory, read-only memory (ROM), and random-access memory (RAM) to name but a few, as any one or more types of non- transitory data storage deemed suitable by those of skill in the relevant art could be used. As depicted in FIG. 19, data storage 1996 contains program instructions 1997 executable by processor 1994 for carrying out various combinations of the various network-entity functions described herein. [0117] Although features and elements are described above in particular combinations, one of ordinary skill in the art will appreciate that each feature or element can be used alone or in any combination with the other features and elements. In addition, the methods described herein may be implemented in a computer program, software, or firmware incorporated in a computer- readable medium for execution by a computer or processor. Examples of computer-readable storage media include, but are not limited to, a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD- ROM disks, and digital versatile disks (DVDs). A processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.

Claims

1. A video client method comprising:

retrieving a first-bitrate representation of first video content at a first bitrate;

while retrieving the first-bitrate representation of the first video content, retrieving a second-bitrate representation of second video content at a second bitrate lower than the first bitrate;

causing display of the first-bitrate representation of the first video content;

during display of the first video content, receiving from a user an instruction to switch to the second video content;

in response to the instruction to switch to the second content:

initially switching from causing display of the first-bitrate representation of the first video content to causing display of the second-bitrate representation of the second content;

retrieving a third-bitrate representation of the second video content at a third bitrate higher than the second bitrate; and

subsequently switching from causing display of the second-bitrate representation of the second content to causing display of the third-bitrate representation of the second content.

2. The method of claim 1, further comprising, in response to the instruction to switch to the second content, retrieving a fourth-bitrate representation of first video content at a fourth bitrate lower than the first bitrate.

3. The method of any of claims 1-2, wherein the third bitrate is equal to the first bitrate.

4. The method of any of claims 1-3, further comprising receiving a manifest of the second video content, wherein the manifest identifies representations of the second content at a plurality of bitrates.

5. The method of claim 4, wherein the second-bitrate representation is selected to be the lowest bitrate representation of the second video content identified in the manifest.

6. The method of claim 4, wherein the third-bitrate representation is selected to be the representation of the second video content identified in the manifest having the highest bitrate that is no greater than the first bitrate.

7. The method of claim 4, wherein the third-bitrate representation is selected to be the representation of the second video content identified in the manifest having the highest bitrate that is less than the first bitrate.

8. The method of any of claims 1-7, wherein the initially switching is performed within two video frames after receiving the instruction to switch.

9. The method of any of claims 1-8, further comprising initiating decoding of the second-bitrate representation of second video content before receiving the instruction to switch to the second content.

10. The method of any of claims 1-4 or 8-9, further comprising determining whether the first video content and the second video content are served from the same network domain;

wherein, (a) if the first video content and second video content are served from the same network domain, the third bitrate is substantially equal to the first bitrate, and (b) if the first video content and second video content are not served from the same network domain, the third bitrate is between the first bitrate and the second bitrate.

11. The method of claim 10, wherein the third bitrate is the lowest available bitrate for the second video content that is higher than the second bitrate.

12. The method of any of claims 1-11, wherein the subsequently switching is performed in response to a determination that a sufficient amount of the third-bitrate representation of the second content has been buffered.

13. The method of any of claims 1-12, wherein the subsequently switching is performed at the end of a media segment of the second-bitrate representation.

14. The method of any of claims 1-12, wherein the subsequently switching is performed when a random access point of the third-bitrate representation is reached.

15. The method of any of claims 1-14, wherein the second video content is not displayed until after the instruction to switch.

16. The method of any of claims 1-15, wherein the subsequently switching is performed seamlessly.

17. The method of any of claims 1-16, further comprising, in response to the instruction to switch to the second content, stopping the retrieving of the first-bitrate representation of the first video content.

18. A video client comprising a processor and a non-transitory computer storage medium storing instructions operative to perform functions comprising:

causing display of the first-bitrate representation of the first video content;

in response to the instruction to switch to the second content: