EP3466081A1 - Catching up to the live playhead in a live stream - Google Patents

Catching up to the live playhead in a live stream

Info

Publication number
EP3466081A1
Authority
EP
European Patent Office
Prior art keywords
content
threshold
delay
live
client device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP17726528.7A
Other languages
German (de)
English (en)
Inventor
Euan Mcleod
Marc Joliveau
Stefan Christian Richter
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Amazon Technologies Inc
Original Assignee
Amazon Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/170,169 (US10091265B2)
Priority claimed from US15/170,164 (US10530825B2)
Application filed by Amazon Technologies Inc
Publication of EP3466081A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/24 Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
    • H04N21/2401 Monitoring of the client buffer
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/24 Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
    • H04N21/2402 Monitoring of the downstream path of the transmission network, e.g. bandwidth available
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44004 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving video buffer management, e.g. video decoder buffer or video display buffer
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440281 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44209 Monitoring of downstream path of the transmission network originating from a server, e.g. bandwidth variations of a wireless network
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/633 Control signals issued by server directed to the network components or client
    • H04N21/6332 Control signals issued by server directed to the network components or client directed to client
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/23439 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements for generating different versions
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/414 Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
    • H04N21/41407 Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance embedded in a portable device, e.g. video client on a mobile phone, PDA, laptop
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Definitions

  • Live streaming content includes channels or feeds with scheduled content (e.g., premium movie channels) and live broadcasts (e.g., sporting events, news, etc.).
  • live streaming content often does not have a distinct end point and may continue indefinitely.
  • VOD content may be buffered in client devices well in advance of the client playhead (i.e., the content fragment currently being rendered by the client). This is typically not the case for live content because of the constraint that the delay between the live playhead (i.e., the latest content fragment available) and the client playhead be as low as possible.
  • FIG. 1 is an illustration of a delay between the live and client playheads of a live content stream.
  • FIG. 2 is a simplified diagram of a computing environment in which
  • FIG. 3 is a simplified diagram of an example of a client device that may be used with implementations enabled by the present disclosure.
  • FIG. 4 is a flowchart illustrating operation of a particular implementation.
  • FIG. 5 is a flowchart illustrating operation of a particular implementation.
  • This disclosure describes techniques for reducing the delay between the live playhead of live streaming content and the client playhead of a client device consuming the live stream.
  • an increased playback speed is used by the media player on the client device so that the delay is gradually reduced.
  • the increase in the playback speed is preferably small enough so that the faster playback is not perceptible to a human viewer.
  • the media player jumps forward in the stream, skipping content that is considered expendable, e.g., black frames, slate frames (e.g., an image with "please stand by" or "we'll be right back"), low-value advertising content, etc.
  • the client playhead may be brought closer in time to the live playhead. An example will be instructive.
  • Live streaming content is sometimes annotated with metadata in real time by human operators as the content is being generated.
  • segments of the content may be identified by annotators as being expendable. For example, when the cameras at the event focus on the crowd or an aerial view of the surrounding geography rather than the pitch for a few seconds, or when play is stopped for an injury, such segments of the content can be identified by a human annotator as expendable.
  • When the media player on client device 102 reaches such a point in the live stream, it can skip ahead to the next fragments in the stream that are not identified as expendable. In this way, the delay between the live playhead and the client playhead of device 102 can be shortened, reducing the likelihood of viewer frustration.
  • the playback speed on device 102 can be increased for a period of time in a way that is not noticeable to the viewer to allow for a more gradual reduction in the delay between the live and client playheads. And if both client devices 102 and 104 are attempting to minimize this delay in similar fashion, it is much more likely that the respective viewers will be having viewer experiences that are more closely synchronized in time.
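  • The skip-ahead behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Fragment` type, its `expendable` flag, and the `skip_expendable` helper are hypothetical names, and each fragment is assumed to carry an annotation and a duration in seconds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Fragment:
    index: int         # position in the live stream
    duration: float    # seconds of content in this fragment
    expendable: bool   # hypothetical annotation flag (e.g., crowd shot, slate)


def skip_expendable(fragments: List[Fragment], start: int) -> Tuple[int, float]:
    """Return (next index to play, seconds skipped) starting at `start`.

    Walks forward past consecutive expendable fragments. If every remaining
    fragment is expendable, the returned index equals len(fragments).
    """
    i = start
    skipped = 0.0
    while i < len(fragments) and fragments[i].expendable:
        skipped += fragments[i].duration
        i += 1
    return i, skipped


stream = [
    Fragment(0, 2.0, False),
    Fragment(1, 2.0, True),
    Fragment(2, 2.0, True),
    Fragment(3, 2.0, False),
]
next_index, seconds_gained = skip_expendable(stream, 1)
print(next_index, seconds_gained)  # 3 4.0
```

Each second of skipped content reduces the delay between the playheads by roughly the same amount, which is why skipping closes the gap faster than a small speed increase does.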
  • FIG. 2 illustrates an example of a computing environment in which a video content service 202 provides live streaming content (e.g., audio or video) via network 204 to a variety of client devices (206-1 through 206-5) in accordance with the techniques described herein.
  • Content service 202 may conform to any of a wide variety of architectures such as, for example, a services platform deployed at one or more co-locations, each implemented with one or more servers 203.
  • Network 204 represents any subset or combination of a wide variety of network environments including, for example, TCP/IP-based networks, telecommunications networks, wireless networks, satellite networks, cable networks, public networks, private networks, wide area networks, local area networks, the Internet, the World Wide Web, intranets, extranets, etc.
  • Client devices 206 may be any suitable device capable of connecting to network 204 and consuming live streaming content provided by service 202.
  • Such devices may include, for example, mobile devices (e.g., cell phones, smart phones, and tablets), personal computers (e.g., laptops and desktops), set top boxes (e.g., for cable and satellite systems), smart televisions, gaming consoles, wearable computing devices (e.g., smart watches or smart glasses), etc.
  • At least some of the examples described herein contemplate implementations based on computing models that enable ubiquitous, convenient, on-demand network access to a shared pool of computing resources (e.g., networks, servers, storage, applications, and services).
  • such computing resources may be integrated with and/or under the control of the same entity controlling content service 202.
  • such resources may be independent of content service 202, e.g., on a platform under control of a separate provider of computing resources with which content service 202 connects to consume computing resources as needed.
  • the computer program instructions on which various implementations are based may correspond to any of a wide variety of programming languages, software tools and data formats, may be stored in any type of non-transitory computer-readable storage media or memory device(s), and may be executed according to a variety of computing models including, for example, a client/server model, a peer-to-peer model, on a stand-alone computing device, or according to a distributed computing model in which various functionalities may be effected or employed at different locations.
  • Content service 202 is described as if it were integrated with the platform(s) that provides the live streaming content to client devices. However, it will be understood that content service 202 may provide access to live streaming content in conjunction with one or more content delivery networks (e.g., CDN 214) that may or may not be independent of content service 202. In addition, the source of the live content may or may not be independent of content service 202 (e.g., as represented by content provider server 216). The range of variations known to those of skill in the art is contemplated to be within the scope of this disclosure.
  • Some of the implementations enabled by the present disclosure contemplate logic resident on the client devices consuming live streaming content from content service 202; such logic being configured to make decisions in conjunction with consuming the video content such as, for example, monitoring the delay between playheads, increasing playback speed, and/or skipping expendable content.
  • the logic might be part of an existing algorithm or module on the client device or implemented to work in conjunction with such an algorithm or module.
  • the logic might be implemented, for example, in a media player on the client device or as a separate application or module resident on the client device.
  • content service 202 may include logic that facilitates at least some aspects of monitoring and reducing the delay between playheads as described herein (e.g., as represented by playhead catchup logic 211).
  • such logic might be used to associate metadata with fragments or segments of the content that identify expendable content that can potentially be skipped to allow for reduction of the delay between the live playhead and client playheads. As discussed below, this may be done manually (e.g., by human operators), using existing content metadata, or by real-time analysis of the frames or fragments of the content.
  • Such logic might also be configured to determine whether and how far a particular client is behind the live playhead and/or take steps or send instructions to the client (e.g., to initiate higher-speed playback or skipping of content) to support getting the client's playhead closer to the live playhead.
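  • One way the "real-time analysis of the frames" option might flag black frames as expendable is by thresholding average brightness. The sketch below is an assumption for illustration only: the frame representation (rows of 8-bit luma samples), the threshold values, and the function names are all hypothetical and do not come from the disclosure.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # rows of 8-bit luma samples (assumed representation)


def is_black_frame(frame: Frame, luma_threshold: float = 16.0) -> bool:
    """True when the mean luma of the frame is at or below the threshold."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return count > 0 and total / count <= luma_threshold


def fragment_is_expendable(frames: List[Frame], min_black_ratio: float = 0.9) -> bool:
    """Mark a fragment expendable when most of its frames are black."""
    if not frames:
        return False
    black_count = sum(1 for f in frames if is_black_frame(f))
    return black_count / len(frames) >= min_black_ratio


black = [[0, 2], [1, 3]]
bright = [[200, 180], [190, 210]]
print(fragment_is_expendable([black] * 10))     # True
print(fragment_is_expendable([black, bright]))  # False
```

The resulting flag could then be attached to the fragment as metadata, in the same way a human annotator's label would be.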
  • content service 202 may also include a variety of information related to the live streaming content (e.g., other associated metadata and manifests) in data store 212 to which service 202 provides access. Alternatively, such information about the live streaming content, as well as the live streaming content itself may be provided and/or hosted by one or more separate platforms, e.g., CDN 214. It should be noted that, while logic 210 and 211, and data store 212 are shown as integrated with content service 202, implementations are contemplated in which some or all of these operate remotely from the associated content service, and/or are under the control of an independent entity. From these examples, those of skill in the art will understand the diversity of use cases to which the techniques described herein are applicable.
  • A block diagram of an example of a client device 300 suitable for use with various implementations is shown in FIG. 3.
  • Device 300 includes one or more single or multi-core processors 302 configured to execute stored instructions (e.g., in device memory 320).
  • Device 300 may also include one or more input/output (I/O) interface(s) 304 to allow the device to communicate with other devices.
  • I/O interfaces 304 may include, for example, an inter-integrated circuit (I2C) interface, a serial peripheral interface (SPI) bus, a universal serial bus (USB), an RS-232 interface, a media device interface, and so forth.
  • Device 300 may also include one or more communication interfaces 308 configured to provide communications between the device and other devices. Such communication interface(s) 308 may be used to connect to cellular networks, personal area networks (PANs), local area networks (LANs), wide area networks (WANs), and so forth. For example, communications interfaces 308 may include radio frequency modules for a 3G or 4G cellular network, a WiFi LAN and a Bluetooth PAN. Device 300 also includes one or more buses or other internal communications hardware or software (not shown) that allow for the transfer of data and instructions between the various modules and components of the device.
  • Device 300 also includes one or more memories (e.g., memory 310).
  • Memory 310 includes non-transitory computer-readable storage media that may be any of a wide variety of types of volatile and non-volatile storage media including, for example, electronic storage media, magnetic storage media, optical storage media, quantum storage media, mechanical storage media, and so forth.
  • Memory 310 provides storage for computer readable instructions, data structures, program modules and other data for the operation of device 300.
  • The term "module" when used in connection with software or firmware functionality may refer to code or computer program instructions that are integrated to varying degrees with the code or computer program instructions of other such "modules." The distinct nature of the different modules described and depicted herein is used for explanatory purposes and should not be used to limit the scope of this disclosure.
  • Memory 310 includes at least one operating system (OS) module 312 configured to manage hardware resources such as I/O interfaces 304 and provide various services to applications or modules executing on processor(s) 302.
  • Memory 310 also includes a user interface module 316, a content rendering module 318, and other modules.
  • Memory 310 also includes device memory 320 to store a wide variety of instructions and information using any of a variety of formats including, for example, flat files, databases, linked lists, trees, or other data structures. Such information includes content for rendering and display on display 306(1) including, for example, any type of video content.
  • a portion of device memory 320 may be distributed across one or more other devices including servers, network attached storage devices, and so forth.
  • the logic or computer program instructions used to support reducing the delay between live and client playheads as described herein may be implemented in a variety of ways. For example, at least some of this functionality may be implemented as part of the code of a media player operating on device 300. Alternatively, modules 319 and 321 may be implemented separately from and interact with the device's media player, web browser, mobile app, decoder, etc.
  • implementations are contemplated in which at least a portion of the logic or computer program instructions may reside on a separate platform, e.g., service 202, CDN 214, etc.; potentially working in conjunction with the client-side logic to reduce the delay between the respective playheads. Suitable variations and alternatives will be apparent to those of skill in the art. It will also be understood that device 300 of FIG. 3 is merely an example of a device with which various implementations enabled by the present disclosure may be practiced, and that a wide variety of other device types may also be used (e.g., devices 206-1 to 206-5). The scope of this disclosure should therefore not be limited by reference to device-specific details.
  • The delivery of live streaming content to a client device according to a particular implementation is illustrated in the flow chart of FIG. 4.
  • H.265 encoding, also commonly referred to as HEVC (High Efficiency Video Coding)
  • FIG. 4 also assumes a media player on the client device that includes logic (e.g., modules 319 and 321) configured to manage at least some aspects of reducing the delay between the live and client playheads as described herein.
  • When a user wants to connect with a content service using a client device, the connection is typically achieved through some kind of login process to the service in a user interface presented on the client device.
  • Content playback is provided, for example, via a resident media player, web browser, or mobile app.
  • Access to content over the Internet is typically governed by a DRM system such as Google's Widevine, Microsoft's PlayReady, Apple's FairPlay, or Sony's OpenMG to name a few representative examples.
  • Live streaming content is typically delivered in an encrypted stream using any of a variety of encryption technologies including, for example, various Advanced Encryption Standard (AES) and Elliptic Curve Cryptography (ECC) encryption techniques.
  • the live stream may also be delivered using an adaptive bit rate streaming technique such as, for example, MPEG-DASH (Dynamic Adaptive Streaming over HTTP), Apple's HLS (HTTP Live Streaming), or Microsoft's Smooth Streaming, to name a few representative examples.
  • a request for the content is sent to the corresponding content service (404).
  • the content service provides the client device with the information the client device needs to acquire a stream of the content (406). This may include, for example, DRM licenses, a decryption key, content metadata, and information about where the client can request the fragments of the selected content at various resolutions (e.g., a manifest).
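  • The information returned at step 406 could be modeled on the client as a simple record. The sketch below is hypothetical: the `StreamInfo` type, its field names, the URL-template shape of the manifest, and the example host are all assumptions made for illustration, not a format described in the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StreamInfo:
    """Hypothetical container for what the service returns at step 406."""
    drm_license: bytes
    decryption_key: bytes
    metadata: Dict[str, str] = field(default_factory=dict)
    # Assumed manifest shape: resolution label -> fragment URL template.
    manifest: Dict[str, str] = field(default_factory=dict)

    def fragment_url(self, resolution: str, fragment_index: int) -> str:
        """Build the request URL for one fragment at the chosen resolution."""
        return self.manifest[resolution].format(index=fragment_index)


info = StreamInfo(
    drm_license=b"license-bytes",
    decryption_key=b"key-bytes",
    metadata={"title": "Example live event"},
    manifest={"720p": "https://cdn.example.com/live/720p/{index}.mp4"},
)
print(info.fragment_url("720p", 42))
```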
  • the client device then acquires a stream of the live content using the information received from the content service (408).
  • the delay between the live playhead and the client playhead is tracked (410). This may be done in a variety of ways. For example, logic on the client device can count the cumulative amount of time required to recover from rebuffering events. In a simpler approach, a fixed amount of time could be added to the delay for each rebuffering event. In another example, time stamps associated with the recently requested fragments and representative of or close in time to the live playhead could be compared to a local clock on the client device.
  • the time reference used by logic on the client device to determine the delay could be a time stamp associated with one or more fragments acquired at the beginning of the session.
  • logic on the client could compare the difference between such a time stamp and that of a later fragment with actual time elapsed on the client device (e.g., as determined by a local clock) to determine the extent to which the delay has grown over time.
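  • The client-side tracking options described above can be sketched as follows. This is an illustrative sketch only: `RebufferDelayTracker` and `delay_growth` are hypothetical names, all times are assumed to be in seconds, and fragment time stamps are assumed to be on a scale that advances at the same rate as the local clock.

```python
from dataclasses import dataclass


@dataclass
class RebufferDelayTracker:
    """Accumulates delay from rebuffering events (two variants from the text)."""
    fixed_penalty: float = 0.0  # seconds added per event; 0 means use measured time
    delay: float = 0.0

    def on_rebuffer(self, recovery_seconds: float) -> None:
        # Simpler variant: add a fixed amount per event.
        # Otherwise: add the measured time needed to recover.
        if self.fixed_penalty > 0:
            self.delay += self.fixed_penalty
        else:
            self.delay += recovery_seconds


def delay_growth(first_ts: float, later_ts: float,
                 clock_at_first: float, clock_at_later: float) -> float:
    """Growth in delay since the session-start reference fragment.

    Local time elapsed minus content time elapsed: if 60 s passed on the
    local clock but fragment time stamps advanced only 55 s, the client
    fell 5 s further behind the live playhead.
    """
    return (clock_at_later - clock_at_first) - (later_ts - first_ts)


tracker = RebufferDelayTracker()
tracker.on_rebuffer(1.5)
tracker.on_rebuffer(2.0)
print(tracker.delay)                          # 3.5
print(delay_growth(100.0, 155.0, 0.0, 60.0))  # 5.0
```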
  • server-side logic could determine the delay for a particular client by comparing the time stamp for a recently requested fragment by that client with the time stamp for the fragment most recently made available by the content service, or the latest fragment requested by any client consuming the live stream.
  • Where server-side logic determines the delay for a particular client device, the server-side logic could also determine whether the delay exceeds a threshold and, when that occurs, transmit a message or an instruction to the client to initiate use of one or both of the catch-up mechanisms.
  • the server-side logic could periodically transmit the delay to the client device for decision making on the client.
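  • A server-side version of the delay check might look like the sketch below. The function names and the shape of the instruction message are hypothetical, and time stamps are assumed to be in seconds on a common scale.

```python
from typing import Optional


def server_side_delay(client_fragment_ts: float, latest_fragment_ts: float) -> float:
    """Approximate delay: newest available fragment time minus the
    time stamp of the fragment most recently requested by the client."""
    return max(0.0, latest_fragment_ts - client_fragment_ts)


def catch_up_instruction(delay: float, threshold: float) -> Optional[dict]:
    """Return a message for the client when the delay exceeds the threshold,
    otherwise None (no action needed)."""
    if delay > threshold:
        return {"action": "catch_up", "delay_seconds": delay}
    return None


delay = server_side_delay(client_fragment_ts=1000.0, latest_fragment_ts=1012.0)
print(catch_up_instruction(delay, threshold=10.0))
print(catch_up_instruction(delay, threshold=15.0))
```

In the alternative where the client makes the decision, the server would send only the `delay` value and the threshold comparison would run on the client.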
  • the delay value being tracked may only be an
  • both client and server-side logic might use time references that are suitable proxies for one or both of the playheads without departing from the scope of this disclosure.
  • the time stamp associated with a fragment most recently requested by a client device will likely differ from the time the fragment is actually rendered and displayed by the client device, but may otherwise be suitable for purposes of determining a reliable approximation of the actual delay.
  • the time stamp associated with the fragment most recently made available by the content service might be earlier than the time at which the fragment becomes available to some client devices.
  • a delay between the live playhead and the client playhead for a particular client may be tracked, determined, or approximated for use as described herein.
  • the scope of the present disclosure should therefore not be limited by reference to such examples.
  • the threshold may be selected to keep the client device acceptably close to the live playhead while ensuring a particular level of content quality.
  • the threshold might also be selected, at least in part, to ensure compliance with any applicable service level agreement(s).
  • the logic initiated may be configured to increase the playback speed of the client device's media player, skip playback of expendable content, or use a combination of these "catch-up mechanisms."
  • the playback speed of the media player on the client device is increased (502).
  • the amount by which the playback speed is increased may be relatively small, e.g., 2-5%, and potentially as much as about 15%.
  • the playback speed could be increased to 32 or 33 frames per second.
  • the increase in playback speed is imperceptible to most human viewers and may be empirically determined, e.g., using viewer assessment by human subjects.
  • the increase in playback speed may be at least partially dependent on the type of content. For example, humans may more readily distinguish an increase in playback speed for musical content than for content that is primarily visual in nature. So, for content that includes musical content (e.g., pure audio content, or video with significant musical content), the increase in playback speed may be lower than for some video content. And although it is preferable that the increase in playback speed be imperceptible to some, most, or all human viewers, implementations are contemplated in which this does not need to be the case.
  • the increased playback speed may be a constant speed.
  • implementations are contemplated in which the playback speed may vary dynamically. For example, if the increased playback speed is not successful in reducing the delay between the live and the client playheads after some programmable period of time, the playback speed might be further increased.
  • the playback speed for different types of content may be different. For example, segments of the content that include musical content could be identified or detected (e.g., from content metadata) and the playback speed could be reduced for those segments while playback of segments not including musical content could be at a higher rate.
  • different playback speeds might be used for different ranges of delay so as to enable faster catch-up for larger delays.
  • Increased playback speed or content skipping might also affect or interact with the operation of other logic on the client device such as, for example, adaptive bit rate selection logic.
  • adaptive bit rate selection logic might be configured to request fragments at reduced quality when the media player is operating at a higher playback speed so that the high-speed playback can continue even if available bandwidth is low.
  • increased playback speed and/or content skipping might be disabled where the content quality attainable by the adaptive bit rate selection logic is or would be negatively affected by the operation of the catch-up mechanisms (e.g., the video quality drops below a threshold).
  • the available bandwidth may be checked before initiating use of a catch-up mechanism to ensure playback quality. For example, if available bandwidth is below a certain level, increased playback speed may be disallowed or, if already started, suspended.
  • skipping of expendable content in the live stream is initiated (504).
  • expendable content may correspond to any of a variety of breaks between the most relevant or interesting segments of the content such as, for example, black frames, slate frames, credits, opening or closing montages, commercial breaks, etc.
  • such segments of content may be identified with reference to metadata that is introduced into the live stream (e.g., as metadata tags) by human operators in substantially real time. That is, human operators may view and annotate the live content (e.g., as it is received from the live content source) for a wide variety of purposes such as, for example, dynamic insertion of advertisements, providing additional descriptive content (e.g., sports play-by-play), rating of content for different viewing audiences, etc. According to some implementations, human operators annotate the live content by identifying expendable content, i.e., content for which playback may be skipped on client devices that are sufficiently behind the live playhead.
  • expendable content may be identified in a variety of other ways.
  • expendable content might be identified using information about the content that is provided by the content provider.
  • content providers often provide information (e.g., content stream metadata) about events or breaks in content (e.g., commercial breaks, breaks between blocks of content, the beginning or end of scheduled content, the beginning of important live content, etc.) that may present opportunities for content skipping.
  • events or breaks might include a fade to black, a few black frames, or content that is less important to viewers (e.g., commercial breaks, credits, etc.).
  • Such information for catching up to the live playhead may be advantageous in that there is a relatively high degree of reliability in the timing of such events as they are explicitly identified by the content provider. Further, for some types of live streams (e.g., streams of scheduled content), such events or breaks may be relatively far out into the future and thus may be communicated to the client well in advance.
  • the identification of expendable content may be based on real-time or near-real-time video inspection and analysis. For example, video fragments, GOPs, and individual video frames can be analyzed to determine whether they are black frames, or correspond to scenes in which the display images do not appreciably change for an extended period of time. As should be appreciated, such an approach may be particularly important for live streams that do not follow a strict schedule, e.g., live sporting events in which commercial breaks or the end of the program is determined by play on the field.
  • Identification of expendable content may be done by the client (e.g., content skipping module 321) with reference to either or both of information from the content provider (e.g., in stream metadata), or by inspection of the fragments or frames of the current stream as they are received.
  • the client might be configured to identify low-complexity or static content (e.g., by virtue of the relationships or dependencies among frames in a GOP). This might be done instead of or in addition to identification of expendable content on the server side (e.g., by playhead catchup logic 211).
  • the delay between the live and client playheads is monitored to determine whether operation of one or both of the catch-up mechanisms should be terminated.
  • the threshold used may be chosen to get the client playhead as close as possible to the live playhead without negatively affecting the user experience in terms of content quality and/or an unacceptably high rate of rebuffering events.
  • the threshold might be the same as the one used to initiate operation of the catch-up mechanisms (e.g., 412 of FIG. 4).
  • some level of hysteresis might be built into the system, using a lower threshold for termination of higher-speed playback or content skipping to ensure that the client isn't rapidly switching between normal playback and use of the catch-up mechanisms.
  • Either or both thresholds might be dynamic in nature, depending, for example, on available bandwidth or a current state of an adaptive bit rate algorithm.
  • different thresholds may apply to initiation and termination of each mechanism. That is, increasing playback speed may allow for a finer control of the reduction of the delay between the live and client playheads as compared to the cruder but faster control represented by the skipping of content. Implementations are therefore contemplated in which the threshold(s) associated with increased playback speed is lower than the threshold(s) associated with content skipping. For example, the increase in playback speed might be initiated when the delay reaches 30 seconds, but content skipping might not be initiated until the delay reaches a minute or more. This allows for a more brute force approach (represented by content skipping) for longer delays, while allowing for a more fine-grained and precise approach (represented by higher-speed playback) for shorter delays.
  • although the catch-up mechanisms may be used in combination, they may or may not be used simultaneously.
  • the different mechanisms might be used alternatively, e.g., with content skipping being used until the delay has decreased sufficiently to allow for further reduction using higher-speed playback. Variations on this theme within the scope of the present disclosure will be understood by those of skill in the art.
  • Embodiments disclosed herein may include a computer program product, having one or more non-transitory computer-readable media having computer program instructions stored therein, the computer program instructions being configured such that, when executed by one or more computing devices, the computer program instructions cause the one or more computing devices to at least one of, acquire a stream of live video content for playback on a client device, determine that a delay between a live playhead of the live video content and a client playhead associated with the playback of the live video content on the client device at a first frame rate exceeds a threshold, increase a playback speed of the live video content on the client device to a second frame rate in a manner that is substantially imperceptible to a human viewer, thereby reducing the delay, identify one or more expendable portions of the live video content, skip playback of at least one of the one or more expendable portions of the live video content, thereby reducing the delay, determine that the delay is below the threshold, and/or decrease the playback speed of the live video content on the client device to
  • the one or more processors may be configured to identify the one or more expendable portions of the live video content (1) using first content metadata associated with the live video content by a content provider in conjunction with generation of the live video content, (2) using second content metadata associated with the live video content by a human operator after generation of the live video content, or (3) by analyzing the live video content substantially in real time.
  • the one or more processors may further be configured to determine the delay between the live playhead and the client playhead by determining a cumulative time for recovering from rebuffering events occurring on the client device.
  • Embodiments disclosed herein may include a client device including a memory, an output device, and one or more processors configured, in conjunction with the memory, to one or more of acquire a stream of content for playback on the output device, determine that a delay between a live playhead of the content and a client playhead associated with the playback of the content exceeds a first threshold, and/or increase a playback speed of the content.
  • the one or more processors may be further configured to determine that the delay is below a second threshold, the second threshold being lower than the first threshold, and/or decrease the playback speed of the content.
  • the one or more processors may be further configured to at least one of determine that the delay exceeds a second threshold, the second threshold being higher than the first threshold, and/or further increase the playback speed of the content.
  • the one or more processors may be further configured to identify one or more expendable portions of the content, and/or skip playback of at least one of the one or more expendable portions of the content.
  • the one or more processors may be further configured to determine that the delay exceeds a second threshold, the second threshold being different from the first threshold, and the one or more processors may be configured to skip playback of at least one of the one or more expendable portions of the content in response to determining that the delay exceeds the second threshold.
  • the one or more processors may be further configured to determine that there is sufficient available bandwidth for increasing the playback speed of the content.
  • the one or more processors may be further configured to determine that available bandwidth or a playback quality of the content has dropped below a corresponding threshold, and/or decrease the playback speed of the content.
  • the one or more processors may be further configured to determine the delay based on one or more rebuffering events or using time stamps associated with fragments or frames of the content.
  • Embodiments disclosed herein may include a computer-implemented method including at least one of, acquiring a stream of content for playback, determining that a delay between a live playhead of the content and a client playhead associated with the playback of the content exceeds a delay threshold, and/or increasing a playback speed of the content.
  • the method may further include determining that the delay is below a second threshold, the second threshold being lower than the first threshold, and/or decreasing the playback speed of the content.
  • the method may further include determining that the delay exceeds a second threshold, the second threshold being higher than the first threshold, and/or further increasing the playback speed of the content.
  • the method may further include identifying one or more expendable portions of the content, and/or skipping playback of at least one of the one or more expendable portions of the content.
  • the method may further include determining that the delay exceeds a second threshold, the second threshold being different from the first threshold, and wherein skipping playback of at least one of the one or more expendable portions of the content may occur in response to determining that the delay exceeds the second threshold.
  • the method may further include determining that there is sufficient available bandwidth for increasing the playback speed of the content.
  • the method may further include determining that available bandwidth or a playback quality of the content has dropped below a corresponding threshold, and/or decreasing the playback speed of the content.
  • the method may further include determining the delay based on one or more rebuffering events or using time stamps associated with fragments or frames of the content.
  • Embodiments disclosed herein may include a computer-implemented method including at least one of receiving content from a live content source, encoding the content for streaming to client devices, streaming the content to the client devices, determining a delay between a live playhead of the content and a client playhead associated with the playback of the content on a first client device, and/or instructing the first client device to increase a playback speed of the content.
  • determining the delay may include comparing a first time stamp associated with a first fragment of the content requested by the first client device with a second time stamp associated with a second fragment of the content available for streaming to the client devices, or with a third time stamp associated with a third fragment requested by one or more other client devices.
  • Embodiments disclosed herein may include a computer program product, including one or more non-transitory computer-readable media having computer program instructions stored therein, the computer program instructions being configured such that, when executed by one or more computing devices, the computer program instructions cause the one or more computing devices to at least one of, acquire a stream of live video content for playback on a client device, the stream of the live video content including a plurality of video fragments, identify one or more rebuffering events on the client device, determine a delay between a live playhead of the live video content and a client playhead associated with the playback of the live video content on the client device based on the one or more rebuffering events, determine that the delay exceeds a threshold, identify one or more expendable portions of the live video content using content metadata associated with at least some of the video fragments of the live video content, and/or skip playback of at least one of the one or more expendable portions of the live video content, thereby reducing the delay.
  • the content metadata may be associated with the live video content by a human operator after generation of the live video content.
  • the one or more processors may be configured to determine the delay between the live playhead and the client playhead by determining a cumulative time for recovering from the one or more rebuffering events.
  • Embodiments disclosed herein may include a client device, including memory, an output device, and one or more processors configured, in conjunction with the memory, to one or more of, acquire a stream of content for playback on the output device, determine that a delay between a live playhead of the content and a client playhead associated with the playback of the content exceeds a first threshold, identify an expendable portion of the content, and/or skip playback of the expendable portion of the content.
  • the one or more processors may be configured to identify the expendable portion of the content (1) using first content metadata associated with one or more fragments of the content by a content provider of the content, (2) using second content metadata associated with one or more fragments of the content by a human operator after generation of the content, and/or (3) by analyzing one or more frames or fragments of the content substantially in real time.
  • the one or more processors may be further configured to increase a playback speed of the content.
  • the one or more processors may be configured to increase the playback speed of the content in response to the delay exceeding a second threshold, the second threshold being lower than the first threshold.
  • the one or more processors may be further configured to one or more of determine that the delay has dropped below a second threshold, the second threshold being lower than the first threshold, and/or terminate skipping of expendable content.
  • the one or more processors may be further configured to determine the delay based on one or more rebuffering events or using time stamps associated with fragments or frames of the content.
  • Embodiments disclosed herein may include a computer-implemented method including one or more of acquiring a stream of content for playback, determining that a delay between a live playhead of the content and a client playhead associated with the playback of the content exceeds a first threshold, identifying an expendable portion of the content, and/or skipping playback of the expendable portion of the content.
  • identifying the expendable portion of the content may include (1) using first content metadata associated with one or more fragments of the content by a content provider of the content, (2) using second content metadata associated with one or more fragments of the content by a human operator after generation of the content, and/or (3) analyzing one or more frames or fragments of the content substantially in real time.
  • the expendable portion of the content may include one or more of black frames, slate frames, credits, an opening montage, a closing montage, a commercial break, a break in action, a replay review, a time out, and/or substantially static content.
  • the method may further include increasing a playback speed of the content.
  • increasing the playback speed of the content may occur in response to the delay exceeding a second threshold, the second threshold being different than the first threshold.
  • the method may further include one or more of determining that the delay has dropped below a second threshold, the second threshold being lower than the first threshold, and/or terminating skipping of expendable content.
  • the method may further include determining the delay based on one or more rebuffering events or using time stamps associated with fragments or frames of the content.
  • Embodiments disclosed herein may include a computer-implemented method, including one or more of receiving content from a live content source, encoding the content for streaming to client devices, including associating metadata with portions of the content, the metadata identifying the portions of the content with which the metadata are associated as expendable content, and/or streaming the content to the client devices, at least some of the client devices being configured to use the metadata to skip playback of the portions of the content identified as expendable content.
  • associating the metadata with portions of the content may include (1) using first content metadata associated with the content before the content was received, (2) using second content metadata associated with the content by a human operator after the content was received, and/or (3) analyzing the content substantially in real time.
  • the method may further include determining a delay between a live playhead of the content and a client playhead associated with the playback of the content on a first client device.
  • determining the delay may include comparing a first time stamp associated with a first fragment of the content requested by the first client device with a second time stamp associated with a second fragment of the content available for streaming to the client devices, or with a third time stamp associated with a third fragment requested by one or more other client devices.
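
The client-side behavior described above (delay tracked from cumulative rebuffering recovery time, separate initiation thresholds for higher-speed playback and content skipping, and a lower termination threshold for hysteresis) can be illustrated with a minimal Python sketch. This is only one possible reading, not the applicant's implementation: the class name `CatchUpController`, its method names, and the 10-second termination threshold are assumptions introduced for illustration, while the 30-second and 60-second initiation thresholds and the 5% speed-up are taken from the example values in the text.

```python
from dataclasses import dataclass


@dataclass
class CatchUpController:
    """Hypothetical client-side catch-up logic; names and defaults are assumptions."""

    speed_start_s: float = 30.0  # initiate faster playback (example value from the text)
    skip_start_s: float = 60.0   # initiate content skipping ("a minute or more")
    stop_s: float = 10.0         # assumed lower termination threshold (hysteresis)
    speed_factor: float = 1.05   # 5% faster, within the 2-5% range in the text
    delay_s: float = 0.0         # tracked (approximate) delay behind the live playhead
    speeding: bool = False
    skipping: bool = False

    @property
    def playback_rate(self) -> float:
        """Rate the media player should currently use."""
        return self.speed_factor if self.speeding else 1.0

    def on_rebuffer(self, recovery_s: float) -> None:
        """Add the time needed to recover from a rebuffering event to the delay."""
        self.delay_s += recovery_s
        self._update()

    def on_playback(self, wall_s: float) -> None:
        """Account for wall_s seconds of wall-clock playback at the current rate."""
        # At rate r the client consumes r * wall_s of content while the live
        # playhead advances by wall_s, so the delay shrinks by (r - 1) * wall_s.
        self.delay_s = max(0.0, self.delay_s - wall_s * (self.playback_rate - 1.0))
        self._update()

    def on_skip(self, skipped_s: float) -> None:
        """Account for an expendable portion of skipped_s seconds being skipped."""
        self.delay_s = max(0.0, self.delay_s - skipped_s)
        self._update()

    def _update(self) -> None:
        # Skipping is the coarse mechanism: it is used for large delays and is
        # dropped once faster playback alone can close the remaining gap.
        if self.delay_s >= self.skip_start_s:
            self.skipping = True
        elif self.delay_s < self.speed_start_s:
            self.skipping = False
        # Faster playback starts at speed_start_s but only ends at the lower
        # stop_s, so the player does not flip rapidly between the two modes.
        if self.delay_s >= self.speed_start_s:
            self.speeding = True
        elif self.delay_s <= self.stop_s:
            self.speeding = False
```

With these assumed values, a client that has fallen 35 seconds behind plays at 1.05x and keeps doing so while the delay is between 10 and 30 seconds. Content skipping only engages once the delay reaches 60 seconds.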

Abstract

The disclosed techniques reduce the delay between the live playhead of the content of a live stream and the client playhead of a client device consuming the live stream. In one technique, the media player uses an increased playback speed on the client device so as to gradually reduce the delay. In another technique, the media player jumps ahead in the stream, skipping content that is not deemed essential.
EP17726528.7A 2016-06-01 2017-05-23 Catching up to the live playhead in live streaming Withdrawn EP3466081A1 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US15/170,169 US10091265B2 (en) 2016-06-01 2016-06-01 Catching up to the live playhead in live streaming
US15/170,164 US10530825B2 (en) 2016-06-01 2016-06-01 Catching up to the live playhead in live streaming
PCT/US2017/034066 WO2017210027A1 (fr) 2016-06-01 2017-05-23 Catching up to the live playhead in live streaming

Publications (1)

Publication Number Publication Date
EP3466081A1 (fr) 2019-04-10

Family

ID=58794268

Family Applications (1)

Application Number Title Priority Date Filing Date
EP17726528.7A Withdrawn EP3466081A1 (fr) 2016-06-01 2017-05-23 Catching up to the live playhead in live streaming

Country Status (2)

Country Link
EP (1) EP3466081A1 (fr)
WO (1) WO2017210027A1 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10530825B2 (en) 2016-06-01 2020-01-07 Amazon Technologies, Inc. Catching up to the live playhead in live streaming
CN110113626B (zh) * 2019-05-13 2021-05-07 北京奇艺世纪科技有限公司 Method and device for playing back live video
EP3767962A1 (fr) * 2019-07-19 2021-01-20 THEO Technologies Media client with adaptive buffer size and related method
CN114885209B (zh) * 2022-04-08 2023-06-16 车智互联(北京)科技有限公司 Live streaming data processing method, computing device, and readable storage medium
CN114710637A (zh) * 2022-04-18 2022-07-05 深圳创维-Rgb电子有限公司 Low-latency processing method, apparatus, device, and medium for web-based surveillance video streams

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9237387B2 (en) * 2009-10-06 2016-01-12 Microsoft Technology Licensing, Llc Low latency cacheable media streaming
US8402155B2 (en) * 2010-04-01 2013-03-19 Xcira, Inc. Real-time media delivery with automatic catch-up

Also Published As

Publication number Publication date
WO2017210027A1 (fr) 2017-12-07

Similar Documents

Publication Publication Date Title
US10530825B2 (en) Catching up to the live playhead in live streaming
JP7284906B2 (ja) Delivery and playback of media content
US8990843B2 (en) Eye tracking based defocusing
US10091265B2 (en) Catching up to the live playhead in live streaming
US9060207B2 (en) Adaptive video streaming over a content delivery network
US9204061B2 (en) Switching content
CN110178377B (zh) Initial bitrate selection for a video delivery session
EP3466081A1 (fr) Catching up to the live playhead in live streaming
US10904639B1 (en) Server-side fragment insertion and delivery
US10638180B1 (en) Media timeline management
US20210385540A1 (en) Systems and methods for real-time adaptive bitrate transcoding and transmission of transcoded media
US20150172161A1 (en) Real-time processing capability based quality adaptation
CN108881931B (zh) Data buffering method and network device
CN110582012B (zh) Video switching method, video processing method, apparatus, and storage medium
US11930066B2 (en) Method to insert program boundaries in linear video for adaptive bitrate streaming
US9680904B2 (en) Adaptive buffers for media players
US9742749B1 (en) Live stream encryption
CN113141514A (zh) Media stream transmission method, system, apparatus, device, and storage medium
US9866459B1 (en) Origin failover for live streaming
US10356159B1 (en) Enabling playback and request of partial media fragments
US10433023B1 (en) Heuristics for streaming live content
US10313759B1 (en) Enabling playback and request of partial media fragments
US20140201368A1 (en) Method and apparatus for enforcing behavior of dash or other clients
US11005908B1 (en) Supporting high efficiency video coding with HTTP live streaming
US10893331B1 (en) Subtitle processing for devices with limited memory

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20190101

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20200109

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

GRAJ Information related to disapproval of communication of intention to grant by the applicant or resumption of examination proceedings by the epo deleted

Free format text: ORIGINAL CODE: EPIDOSDIGR1

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

INTC Intention to grant announced (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20220404