CN111316652A

CN111316652A - Personalized content stream using aligned encoded content segments

Info

Publication number: CN111316652A
Application number: CN201880054202.0A
Authority: CN
Inventors: Olaf Nielsen; Evan Gerald Statton
Original assignee: Amazon Technologies Inc
Current assignee: Amazon Technologies Inc
Priority date: 2017-09-05
Filing date: 2018-08-23
Publication date: 2020-06-19
Also published as: US20190075148A1; WO2019050693A1; EP3679717A1; US10911512B2

Abstract

Systems and methods are described for implementing a personalized content stream whose content may be dynamically changed by a user such that a plurality of base content may be seamlessly included within the personalized content stream. Multiple input content streams are encoded to include temporally aligned splice point frames that break the inter-frame dependencies before and after a given splice point frame. Temporally aligned segments are then generated from the encoded stream. Thereafter, the user may select segments of any of the input content streams to be included within the personalized content stream, and the output device may decode segments generated from different input content streams as part of the personalized content stream without introducing errors into the content stream due to the segments being from different input content streams. Thus, the user may dynamically change the content of the personalized content stream based on their preferences.

Description

Personalized content stream using aligned encoded content segments

Background

Generally, computing devices exchange data using a communication network or a series of communication networks. Companies and organizations operate computer networks that interconnect several computing devices to support operations or provide services to third parties. Computing systems may be located in a single geographic location or in multiple different geographic locations (e.g., interconnected via a private or public communication network). In particular, a data center or data processing center, collectively referred to herein as a "data center," may include several interconnected computing systems to provide computing resources to users of the data center. The data center may be a private data center operated on behalf of an organization, or a public data center operated on behalf of, or for the benefit of, the general public.

A service provider or content creator (such as an enterprise, artist, media distribution service, etc.) may employ interconnected computing devices (e.g., within a data center) to deliver content to users or clients. In some cases, these computing devices may support traditional content distribution systems, such as by creating, modifying, or distributing streaming television or radio content. In other cases, these computing devices may be used to duplicate or replace previous content distribution systems. For example, a data center may provide network-based streaming audio or video content in a manner similar to a conventional television or radio network. This content is sometimes referred to as "internet television" or "internet radio," respectively. The content provided by these distribution systems (e.g., both traditional content and computing network-based content) may be pre-recorded or real-time. Typically, where computing devices are used to facilitate traditional or network-based distribution systems, specialized software is used to replace or duplicate the functionality of specialized hardware devices. For example, software applications may be used to encode and packetize data streams containing real-time video content, thereby reducing or eliminating the need for dedicated hardware to perform these functions. Due to the flexibility of software-based solutions, a single computing device may be used to generate content for both traditional and network-based distribution systems.

Drawings

FIG. 1 is an illustration of a personalized content stream output on an output device and a view selection device that may be used to modify base content within the personalized content stream;

FIG. 2 is a block diagram showing an illustrative logical environment in which a streaming content delivery system may operate to provide personalized content streams, in accordance with some embodiments of the present disclosure;

FIG. 3 is a block diagram showing an illustrative configuration of a server that may implement one embodiment of elements of the streaming content delivery system of FIG. 1;

FIG. 4 is an illustrative visualization of a plurality of personalized content streams that may be provided by the streaming content delivery system of FIG. 1 to different output computing devices based on an input stream representing a plurality of base content;

FIG. 5 is a block diagram showing illustrative interactions for processing an input stream representing a plurality of base content at the streaming content delivery system of FIG. 1 to enable mixing of segments of base content within a personalized content stream;

FIGS. 6A-6C are block diagrams showing illustrative interactions with the streaming content delivery system of FIG. 1 to provide a personalized content stream to an output device; and

FIG. 7 is a flow diagram showing an illustrative routine for providing a personalized content stream including segments representing a plurality of base content.

Detailed Description

In general, aspects of the present disclosure are directed to providing a video stream that is dynamically customizable on a per-user basis, such that a user can change which video content is included in the stream while the stream is ongoing, and such that the change in the video content of the stream results in seamless playback on an output device. As described herein, streaming content may include any content that is delivered to an output device continuously or repeatedly during output of the stream. Streaming content may be contrasted with downloaded content, which typically requires that the entirety of the content (e.g., a complete video file) be obtained before output of the content can begin. In one embodiment, the streaming content is real-time content (e.g., content that is both recorded and presented "on the fly," possibly with a small delay, such as one introduced to allow content review or filtering). The actual primary content (e.g., video or audio content) within the stream is typically pre-selected by the provider of the stream and included in the stream. For example, a content provider (e.g., a streaming content service) may choose to include real-time events, such as sporting events or concerts, within a content stream. As such, while a viewer may be able to select from a plurality of different content streams (e.g., each including different content), the viewer is typically unable to modify the base content included within the streams. Furthermore, switching the viewing device between different streams negatively impacts the viewing experience, as such switching is typically not seamless. Instead, the viewing device typically must obtain information about the new stream, retrieve one or more segments of the stream (e.g., into a buffer), and otherwise prepare to output the new stream. This process may result in a large delay when switching streams, thereby discouraging the user from making frequent stream switches. 
Such delays can be particularly problematic in situations where a user wishes to switch between multiple streams of related content, such as multiple different recordings of a real-time sporting event (e.g., from the same or different content providers). Some embodiments of the present application address these and other problems by providing a content stream that is dynamically customizable on a per-user basis so that a user can select base content to be included within the content stream while the stream is ongoing. For example, in accordance with an embodiment of the present disclosure, a user may begin consuming a stream depicting a first view of a real-time event (e.g., a sporting event), and during output of the stream, dynamically change the content of the stream such that it depicts a second view of the real-time event. The stream may be personalized for the user such that changes in content do not necessarily affect the stream provided to other users. Thus, each viewer of a real-time event may be enabled to dynamically change their view of the event, as shown in the personalized video stream. The streams may be provided by a streaming content delivery service that operates to change the base content of the streams in a seamless manner, such that the viewing device is not required to switch streams or incur the delay normally required to switch streams. In some cases, the streaming content delivery service may modify or generate the base content such that the viewing device is unaware that any changes have occurred with respect to the video stream. Furthermore, the streaming content delivery service may provide personalized content streams to viewing users by dynamically recombining various different encoded content without the need to re-encode such content into each personalized stream. Thus, the computational resources required to provide a personalized content stream may be reduced or minimized. 
Accordingly, embodiments of the present disclosure provide a resource-efficient streaming content delivery service that enables users to seamlessly change the underlying content provided within a personalized content stream.

As an illustrative example, consider a real-time event such as a sporting event, where a content provider (such as a video production company) films the event from multiple angles, producing multiple content streams, each corresponding to a different view. Typically, a director or other employee of the provider will determine which of the multiple views to include in the final output content stream at any given time, and then provide the final output stream to the viewing user. The user may not be able to view the event from a different angle at all, or may be forced to switch to a different output stream (e.g., an output stream of a different provider) to view the event, at which point the user will be required to view the event from whatever angle is included within the different output stream. However, according to embodiments of the present disclosure, the streaming content delivery system may make multiple views of such events available simultaneously. For example, a streaming content delivery system may obtain three different content streams, each representing a shot of an event taken from a different perspective, and make each such stream available to an end user. In addition, the streaming content delivery system may enable a user to mix and match segments of each content stream together to form a personalized content stream.

For example, referring to FIG. 1, a user may view a personalized content stream on a viewing device 60 such as a television (which may represent, for example, a "smart" television having an integrated computing system, network interface, or the like, or a non-computerized television having a network-connected computing device such as a personal video playback device, a home theater computing device, or a game console). The personalized content stream may be provided by a streaming content delivery system and may initially depict a first view of a given event, such as a view of a baseball game viewed from behind home plate. The user may also be associated with an input device 52, such as a computerized telephone or tablet computer, enabling the user to change the view depicted on the viewing device 60 by dynamically changing the segments included in the personalized content stream. Illustratively, the input device 52 may depict a graphical user interface (e.g., as provided by an application executing on the input device 52) that includes elements 54-58, each of which provides a graphical representation of one of the views of the event available for the personalized content stream. The graphical representation may include, for example, a low resolution (e.g., "thumbnail") version of the content stream depicting the view of the event, or a periodically refreshed image (e.g., screenshot) depicting the view of the event, as available at the streaming content delivery system. Each element 54-58 is selectable by a user to request a change in the personalized content stream output on the viewing device 60 to depict the view corresponding to the selected element 54-58. 
Thus, by selecting element 54 on input device 52, the user may cause viewing device 60 to output a first view of the event ("view A"); by selecting element 56, the user may cause viewing device 60 to output a second view of the event ("view B"); and by selecting element 58, the user may cause viewing device 60 to output a third view of the event ("view C"). Although the graphical elements are depicted in FIG. 1 as being selectable to modify the view, other inputs on the input device 52 may also be used. For example, input device 52 may include a gyroscope such that a particular orientation of device 52 causes a particular view to be selected. Thus, users are enabled to dynamically select their event views. Further, rather than requiring a change in the operation of the viewing device 60, the output of the viewing device 60 may be controlled at the streaming content delivery system by dynamically changing the segments included within the personalized content stream as displayed on the viewing device 60. Thus, different views of the event can be seamlessly interchanged on the viewing device 60 without the delays or interruptions typical of switching content streams.

In some embodiments, seamless interchange of segments within a personalized content stream may be facilitated by synchronizing or aligning aspects of the content stream. For example, a streaming content delivery system may be configured to generate or modify segments of different input content streams (e.g., different views of an event) such that the segments contain specialized "splice point" frames at a common location across the different input content streams (e.g., at the same time). The splice point frame may represent a specialized content frame that breaks the inter-frame dependency of the content stream at a particular point so that any subsequent frame can be decoded without reference to any frame before the splice point frame (and so that the frame before the splice point frame can be decoded without reference to any frame after the splice point frame). One example of such a specialized content frame is the IDR frame defined by the H.264 video coding format, which is known in the art. IDR frames, when inserted into an H.264-compatible video stream, ensure that all subsequent frames of the video stream can be decoded without reference to any frame of the video stream that precedes the IDR frame. By including splice point frames at a common location between segments of different input content streams, the streaming content delivery system can interchange the different input content streams within the output content stream without interrupting decoding of the output content stream. For example, at a common splice point location between two input content streams, the streaming content delivery system may stop including a first of the two input content streams in the output content stream and start including a second of the two input content streams in the output content stream. 
Because the splice point frame ensures that there are no inter-frame dependencies between frames of the two input content streams, the downstream decoding device will not experience decoding errors in the output content stream due to broken inter-frame dependencies. An exemplary mechanism for aligning splice point frames within different encoded content streams is described in more detail in U.S. patent application No. 15/614,345 ("the '345 application"), filed June 5, 2017, and entitled "OUTPUT SWITCHING FOR ENCODED CONTENT STREAMS," the entire contents of which are incorporated herein by reference. For example, as disclosed in the '345 application, a system may include an output switching controller configured to coordinate operations of content encoders and a content packager such that the content packager is enabled at a given point in time to switch between packaging a first encoded content stream and a second encoded content stream without decoding the content streams and without introducing errors into the output stream. In particular, the output switching controller may be configured to instruct each of the content encoders to insert a splice point frame into their respective encoded content streams at a given point in time, and to ensure that subsequent frames of the encoded content streams do not reference any frames preceding the splice point frame. The splice point frame thus acts as a "switching point" so that a viewing device can view the encoded content stream starting from a point corresponding to the splice point frame without generating errors due to loss of inter-frame dependencies. In one embodiment, the splice point frame is an IDR frame that satisfies the H.264 video coding format. The output switching controller may be further configured to indicate to each content packager the time at which a splice point frame is expected within the encoded content streams. 
Thus, the content packager can switch from packaging the first encoded content stream into the output stream to packaging the second encoded content stream into the output stream at the point in time. Because the content packager switches to the second encoded content stream at a time corresponding to the splice point frame, subsequent decoding of the content stream can occur without experiencing errors due to loss of inter-frame dependencies. In this way, the content packager is enabled to switch the output stream to any number of encoded content streams without introducing errors in the output stream and without the need to decode and re-encode the content stream.
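The switching behavior described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the mechanism of the '345 application: the `Segment` type, the `starts_with_splice_point` flag, and the `build_personalized_stream` function are hypothetical names introduced here, and segments are assumed to be already encoded and aligned on a shared index across input streams.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Segment:
    """Hypothetical encoded segment; payload bytes are omitted for brevity."""
    source: str                            # input stream identifier, e.g. "view_a"
    index: int                             # position on the shared, aligned timeline
    starts_with_splice_point: bool = True  # e.g. first frame is an IDR frame


def build_personalized_stream(
    inputs: Dict[str, List[Segment]],
    selections: Dict[int, str],
    default_source: str,
) -> List[Segment]:
    """Assemble an output stream from aligned segments without re-encoding.

    `selections` maps a segment index to the input stream the user asked for
    from that index onward. A switch only takes effect at a segment that
    begins with a splice point; otherwise it is deferred to the next one.
    """
    current = default_source
    pending = default_source
    length = min(len(segments) for segments in inputs.values())
    output: List[Segment] = []
    for i in range(length):
        pending = selections.get(i, pending)
        if pending != current and inputs[pending][i].starts_with_splice_point:
            current = pending  # safe switch: no dependencies cross this boundary
        output.append(inputs[current][i])
    return output


# Assumed example data: two views, four aligned segments each.
views = {
    name: [Segment(name, i) for i in range(4)]
    for name in ("view_a", "view_b")
}
stream = build_personalized_stream(views, {2: "view_b"}, "view_a")
print([segment.source for segment in stream])
# prints ['view_a', 'view_a', 'view_b', 'view_b']
```

Because the sketch only selects among existing segments, the cost of personalization is independent of the cost of encoding, which mirrors the resource argument made in the text.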

In some cases, the streaming content delivery system may also cause other attributes of different encoded content streams to align or synchronize. For example, the streaming content delivery system may align or synchronize timestamps or time codes of encoded content streams representing the input content streams such that interchanging input content streams within a personalized content stream does not cause a break in the continuity of the timestamps or time codes within the personalized content stream. An exemplary mechanism for aligning or synchronizing attributes of encoded content streams, such as timestamps or time codes, is described in more detail in U.S. patent application No. 15/194,347 ("the '347 application"), filed June 27, 2016, and entitled "SYNCHRONIZATION OF MULTIPLE ENCODERS FOR STREAMING CONTENT," the entire contents of which are incorporated herein by reference. For example, as disclosed in the '347 application, to ensure interchangeability of content output by a content encoder, the content encoder may be configured to detect potential desynchronization between content encoders within an encoder pool and exchange state information using a synchronization protocol, thereby enabling the content encoder to reestablish synchronization and thus provide interchangeable output. The state information may include, for example, information mapping the input timecodes to output timestamps for encoding the content stream, as well as the number of groups of pictures (GOPs) or count of frames passed for a given input timecode and output timestamp. More specifically, in one embodiment, each encoder within a pool may be configured to periodically transmit information regarding its encoding status to each other encoder. Upon receiving the encoding status from another encoder, the receiving encoder may verify that the received encoding status matches its own status (e.g., that the two outputs are interchangeable). 
In the event that the received encoding state does not match the current state of the encoder, the encoder may determine whether the received state is authoritative (indicating that its own state has become desynchronized from the pool), and if so, modify its output to resynchronize its state with the state of the pool. Any number of consistency assurance protocols may be utilized to identify authoritative information and ensure consistency within the pool. For example, each encoder may be configured to identify the "oldest" state information (the earliest timestamp or the largest number of GOPs applied to a given timecode) as authoritative. In some embodiments, a single component, such as a content introducer, may act as a controller to determine the authoritative encoding status of the pool. Thus, the introducer may periodically receive status information from each encoder and use the received status information to detect whether any encoders have become desynchronized. In the event that encoders have become desynchronized, the introducer may transmit instructions to the desynchronized encoders to reestablish synchronization. Illustratively, the content encoder may utilize the authoritative status information to determine segment boundaries for its own encoded content, and may use those segment boundaries for the content. For video, a segment boundary may specify the alignment of GOPs within a video stream. The size of each GOP typically depends on the configuration of the encoder, and the output of the same or interchangeable content may depend on the use of the same segment boundaries. By utilizing the authoritative status information, the encoder can determine where the next GOP segment should begin. Illustratively, the desynchronized content encoder may calculate, from the state information, the next video frame at which to begin a GOP according to the following equation:

NextGOPFrame = Timecode_n + GOPSize - ((Timecode_n - Timecode_i) mod GOPSize)

wherein:

NextGOPFrame represents the next timecode at which a GOP will start;

Timecode_n represents any timecode within the video (e.g., the current timecode of the desynchronized encoder 114);

Timecode_i represents a timecode at which a GOP is known to have started (e.g., the latest timecode indicated within the authoritative status information as corresponding to a GOP); and

GOPSize represents the number of frames within a GOP.

Note that the equation assumes Timecode_n is not equal to a timecode at which a GOP will start. In the event that Timecode_n represents a timecode at which a GOP will start (e.g., (Timecode_n - Timecode_i) mod GOPSize = 0), NextGOPFrame equals Timecode_n. Further, the equation assumes that operations such as addition occur with appropriate conversion between different units, such as timecodes and frames, based on the relationship between these units (e.g., 30 frames per second).
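As a worked illustration of the equation and its boundary case, the following Python sketch computes the next GOP start. The function names are hypothetical, all values are assumed to be already expressed in frames, and the timecode helper assumes non-drop-frame `HH:MM:SS:FF` timecodes at an integer frame rate.

```python
def timecode_to_frames(timecode: str, fps: int = 30) -> int:
    """Convert a non-drop-frame "HH:MM:SS:FF" timecode to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def next_gop_frame(timecode_n: int, timecode_i: int, gop_size: int) -> int:
    """Return the frame at which the next GOP should start.

    timecode_n: any frame position within the video (e.g. the current one).
    timecode_i: a frame position at which a GOP is known to have started.
    gop_size:   number of frames per GOP.
    """
    if gop_size <= 0:
        raise ValueError("gop_size must be positive")
    offset = (timecode_n - timecode_i) % gop_size
    if offset == 0:
        # Boundary case from the text: timecode_n is itself a GOP start.
        return timecode_n
    return timecode_n + gop_size - offset


# A GOP is known to start at frame 10 and GOPs are 30 frames long, so GOPs
# start at frames 10, 40, 70, 100, 130, ...
current = timecode_to_frames("00:00:03:15")  # 3 s * 30 fps + 15 = frame 105
print(next_gop_frame(current, 10, 30))       # prints 130
print(next_gop_frame(100, 10, 30))           # prints 100 (already on a boundary)
```

Python's `%` operator returns a non-negative result for a positive modulus, so the sketch also behaves sensibly when `timecode_n` precedes `timecode_i`.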

In some cases, either or both of the introducer and the encoder may be configured to receive and respond to requests for authoritative status. For example, when an encoder joins a pool, the encoder may be configured to transmit a request for an authoritative state to either or both of the introducer and other encoders in the pool, and synchronize its own state with the authoritative state.
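One way to picture the "oldest state is authoritative" rule is the Python sketch below. It is an assumption-laden illustration rather than the protocol of the '347 application: the `EncoderState` fields, the tie-breaking order, and both function names are hypothetical, and every reported state is assumed to refer to the same input timecode.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class EncoderState:
    """Hypothetical state report exchanged between encoders in a pool."""
    encoder_id: str
    input_timecode: int    # input timecode, in frames
    output_timestamp: int  # output timestamp mapped to that timecode
    gop_count: int         # GOPs emitted up to that timecode


def authoritative_state(states: Iterable[EncoderState]) -> EncoderState:
    """Pick the "oldest" state: earliest timestamp, then largest GOP count."""
    return min(states, key=lambda s: (s.output_timestamp, -s.gop_count))


def desynchronized_encoders(states: List[EncoderState]) -> List[str]:
    """List encoders whose reported state differs from the authoritative one."""
    reference = authoritative_state(states)
    return [
        s.encoder_id
        for s in states
        if (s.output_timestamp, s.gop_count)
        != (reference.output_timestamp, reference.gop_count)
    ]


# Assumed example reports, all for input timecode 300.
pool = [
    EncoderState("enc-1", 300, 1000, 10),
    EncoderState("enc-2", 300, 1000, 10),
    EncoderState("enc-3", 300, 1200, 9),
]
print(authoritative_state(pool).encoder_id)  # prints enc-1
print(desynchronized_encoders(pool))         # prints ['enc-3']
```

An encoder joining the pool could run the same comparison on the states it receives and adopt the authoritative one before producing output.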

When the attributes of the different input content streams are aligned or synchronized, any interchange of base content that has occurred in the personalized content stream may not be apparent from the point of view of the viewing device that outputs the personalized content stream. Thus, the personalized content stream may allow for seamless transitions between different input content streams. Furthermore, as will be described below, the input content stream may be processed by the streaming content delivery system in an encoded format and interchanged within the personalized content stream without the need to decode or re-encode the input content stream. Thus, the streaming content delivery system may operate in a computationally efficient manner relative to systems that attempt to encode personalized content streams separately.

While the exemplary embodiments are described above with respect to input content streams representing different views of a common event, some embodiments of the present disclosure may enable the generation of personalized content streams based on input content streams representing different events or different types of content (such as different real-time events, television programs, etc.). For example, a personalized content stream may be used as a personalized "channel" that provides a content stream selected by a user from a variety of television programs, movies, fan-made content, and the like. Further, while examples may be discussed with respect to video content, in some cases, embodiments of the present disclosure may enable personalized content streams that include other types of content (such as audio content). For example, different input audio streams may be intermixed within the personalized content stream by changing the base input audio stream at an Encoder Boundary Point (EBP) within the audio stream, which divides the audio stream into segments, so that segments of different audio streams may be intermixed within the personalized content stream without introducing errors in the personalized content stream. Accordingly, the examples provided herein are intended to be illustrative in nature.

As will be appreciated by those skilled in the art in light of the present disclosure, embodiments disclosed herein improve the ability of computing systems, such as content streaming systems, to deliver content to users. In particular, aspects of the present disclosure improve the following capabilities of a content stream delivery system: providing a customized content stream on a per user basis and dynamically and seamlessly changing which encoded content is included as input to the output content stream without introducing errors in the output content stream and without requiring decoding or re-encoding of the input content stream. Furthermore, the presently disclosed embodiments solve technical problems inherent within computing systems; in particular, computing devices have limited ability to transmit information, limited ability to decode and encode content, and inherent errors that occur when attempting to combine encoded information under existing systems. These technical problems are addressed by various technical solutions described herein, including generating a personalized content stream by selectively including different segments of an input content stream in the personalized content stream, each segment being seamlessly interchangeable with other segments.

The foregoing aspects and many of the attendant advantages of this disclosure will become more readily appreciated as the same become better understood by reference to the following description, when taken in conjunction with the accompanying drawings.

FIG. 2 is a block diagram showing an illustrative environment 100 including a plurality of content provider computing devices 108, output computing devices 102, view selection computing devices 104, and a streaming content delivery system 110 in communication via a network 106. Although the content provider computing devices 108, the output computing devices 102, and the view selection computing devices 104 are shown as grouped in FIG. 2, the content provider computing devices 108, the output computing devices 102, and the view selection computing devices 104 may be geographically remote and independently owned or operated. For example, the output computing devices 102 and the view selection computing devices 104 may be used by numerous users accessing the streaming content delivery system 110 from various global, continental, or regional locations. Further, the content provider computing devices 108 may represent numerous related or distinct parties associated with the streaming content delivery system 110 to provide streaming content to the output computing devices 102. Thus, the grouping of the content provider computing devices 108, the output computing devices 102, and the view selection computing devices 104 within FIG. 2 is intended to represent a logical grouping, rather than a physical grouping. Similarly, each of the components of the streaming content delivery system 110 may be located within geographically dispersed areas. For example, the streaming content delivery system 110 may contain points of presence ("POPs") 112 at a variety of different global, continental, or regional locations to provide a wide geographic presence for the streaming content delivery system 110. Although shown as distinct, two or more of the content provider computing devices 108, the output computing devices 102, the view selection computing devices 104, and the streaming content delivery system 110 may be operated by a common entity or by a common computing device. 
In some cases, two or more of the content provider computing device 108, the output computing device 102, and the view selection computing device 104 may represent a single physical device. For example, a single physical computing device may serve as both the output computing device 102 to output the personalized content stream and the view selection computing device 104 to select the view represented within the personalized content stream.

The network 106 may be any wired network, wireless network, or combination thereof. Additionally, the network 106 may be a personal area network, a local area network, a wide area network, a cable television network, a satellite network, a cellular telephone network, or a combination thereof. In the exemplary environment of FIG. 2, the network 106 is a Global Area Network (GAN), such as the Internet. Protocols and components for communicating via the Internet or any of the other above-mentioned types of communication networks are well known to those skilled in the art of computer communications and, therefore, need not be described in greater detail herein. Although each of the content provider computing device 108, the output computing device 102, the view selection computing device 104, and the streaming content delivery system 110 is depicted as having a single connection to the network 106, various components of the content provider computing device 108, the output computing device 102, the view selection computing device 104, and the streaming content delivery system 110 may be connected to the network 106 at different points. Thus, communication time and capabilities may vary between the components of FIG. 2.

The output computing device 102 may include any number of different computing devices capable of outputting the streaming content provided by the streaming content delivery system 110. For example, the respective output computing devices 102 may correspond to a laptop or tablet computer, a personal computer, a wearable computer, a server, a Personal Digital Assistant (PDA), a hybrid PDA/mobile phone, a mobile phone, an e-book reader, a set-top box, a camera, a digital media player, and so forth. Each output computing device 102 may include hardware and/or software that enables the reception and output of streaming content, including dedicated playback hardware, dedicated software (e.g., specially programmed applications), and general purpose software (e.g., web browsers) that enable the output of streaming content (e.g., by directly downloading the content, downloading web pages that include the content, etc.).

Similarly, the view selection computing devices 104 may include any number of different computing devices capable of communicating with the streaming content delivery system 110 to inform the streaming content delivery system 110 of content that the user desires to include within the personalized content stream. For example, each view selection computing device 104 may correspond to a laptop or tablet computer, a personal computer, a wearable computer, a server, a Personal Digital Assistant (PDA), a hybrid PDA/mobile phone, a mobile phone, an e-book reader, a set-top box, a camera, a digital media player, and so forth. In one embodiment, the view selection computing device 104 includes hardware and/or software that enables a user interface to be received and output, thereby enabling selection from a plurality of content that may be included in the personalized content stream. Such software may include, for example, a specially programmed application or a web browser capable of outputting a graphical user interface. In another embodiment, the view selection computing device 104 includes a dedicated interface, such as a plurality of physical buttons, to enable selection from a plurality of content that may be included in the personalized content stream. In other embodiments, selection of the plurality of content may be accomplished using other inputs of the view selection computing device 104. For example, a gyroscope within the view selection computing device 104 may enable view selection based on the orientation of the device 104 in real space.

The content provider computing device 104 may comprise any computing device owned or operated by an entity that provides content to the streaming content delivery system 110 for subsequent transmission within a content stream to an output computing device (which may include one or more client computing devices 102). For example, the various content provider computing devices 104 may correspond to laptop or tablet computers, personal computers, wearable computers, servers, Personal Digital Assistants (PDAs), hybrid PDA/mobile phones, e-book readers, set-top boxes, cameras, digital media players, and so forth. The content provider computing device 104 may include a server hosting streaming audio, video, text, multimedia, or other encoded content. In some cases, the content provider computing device 104 may be associated with a recording device (such as a camera) that records the real-time event. The content provider computing device 108 may transmit the content to the streaming content delivery system 110 over a network in an encoded or unencoded (e.g., "raw") format. In some embodiments, the content provider computing device 108 may be operated by individual users. For example, the content provider computing device 108 may correspond to a client computing device that executes software to record a current program (such as a video game) displayed on the client computing device and transmit the recording of the program to a streaming content delivery system.

The streaming content delivery system 110 may include various components and devices configured to enable the output computing device 102 to access streaming content provided to the streaming content delivery system 110 by the content provider computing device 104. In particular, the streaming content delivery system 110 may include several POPs 112 configured to host streaming content or to serve as cache points for streaming content hosted by the streaming content delivery system 110. Each POP 112 may include a variety of computing devices configured to supply content to the output computing device 102. Thus, each POP 112 may include any number of processors, data stores, or networking components that cooperate to facilitate retrieval and delivery of streaming content to the output computing device 102. The POPs 112 may communicate with other components of the streaming content delivery system 110 via an internal network of the streaming content delivery system 110, which may include any wired network, wireless network, or combination thereof, and may be a personal area network, a local area network, a wide area network, a cable television network, a satellite network, a cellular telephone network, or a combination thereof. In some cases, the internal network may be implemented at least in part by network 106 (e.g., as a virtual private network or "VPN"). Illustratively, each POP 112 may hold a limited selection of content segments (e.g., the most recently requested n content segments) in a local cache data store so that these content segments can be quickly transmitted to the output computing device 102. When the local cache data store does not include a requested content segment, the POP 112 can be configured to retrieve the content segment from a remote data store, such as the content data store 118 of the streaming content delivery system 110 or a data store within the system of the content provider 104 (not shown in fig. 1), and return the content segment to the requesting output computing device 102.

According to embodiments of the present disclosure, the streaming content delivery system 110 may include a content introducer service 116 configured to receive an input content stream from the content provider computing device 108, to modify the content of the stream as necessary to enable seamless interchange of portions of the input content stream in the personalized output content stream, and to store the content in a content data store 118, which data store 118 may correspond to any persistent or substantially persistent data storage device, such as a Hard Disk Drive (HDD), a solid state drive (SSD), a network attached storage device (NAS), a tape drive, or any combination thereof. Illustratively, the content introducer service 116 may include one or more content encoders configured to receive content, such as a content stream, from the content provider computing device 108 and encode or re-encode the content into one or more encodings suitable for delivery to the output computing device 102. For example, the content introducer service 116 may receive content streams at relatively high quality from content provider computing devices 108 and re-encode the content into a variety of relatively low quality content streams (e.g., 'HD' quality, 'SD' quality, etc.) for delivery to output computing devices. The content introducer service 116 may also include a content packager computing device configured to package the encoded content into segments suitable for transmission within the personalized content stream to the output computing device 102. Further details regarding the operation of such encoding and packaging computing devices are discussed in the '345 application and the '347 application, both of which are incorporated by reference above. For example, as discussed in the '347 application, the introducer may pass the content stream on to one or more content encoders that may encode the content into one or more formats accepted by the content distribution system or content output device. To provide redundant or cooperative encoding of content (e.g., to provide a resilient or adaptive quality stream), multiple content encoders may be configured to encode a content stream received from an introducer according to the same parameters or interchangeable parameters. After encoding the content, each content encoder may provide the encoded content to one or more content packagers, which may package the content into a container format accepted by the content distribution system and/or the content output device.

The streaming content delivery system 110 may also include a stream personalization service 114 configured to provide personalized content streams to the output computing device 102 based on segments of the stored input content streams (e.g., as stored in the content data store 118). Although the operation of the stream personalization service 114 will be described in greater detail below, the stream personalization service 114 is generally operable to select segments from one or more input content or content streams for inclusion in a personalized content stream, and to provide information to the output computing device 102 to enable the device 102 to retrieve the selected segments from the streaming content delivery system 110 (e.g., via interaction with the POP 112). In one embodiment, the information provided to the output computing device 102 includes a manifest file referencing segments to be included in the personalized content stream, such as a manifest file generated according to the Hypertext Transfer Protocol ("HTTP") Live Streaming ("HLS") protocol, or a Media Presentation Description (MPD) file generated according to the MPEG Dynamic Adaptive Streaming over HTTP ("MPEG-DASH") protocol. The manifest for either protocol may be generated following the Common Media Application Format (CMAF) standard. Thus, using the references within this manifest file, the output computing device 102 may retrieve the content segments from the POP 112 to form a personalized content stream. As discussed in more detail below, the stream personalization service 114 may be configured to dynamically select segments to be included within a personalized content stream based on view selection information provided by the view selection computing device 104, thereby enabling a user to interact with the view selection computing device 104 to modify base content within the personalized content stream.
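As a minimal sketch of the kind of manifest described above, the following Python builds an HLS media playlist from a list of segment references. The function name, the example URIs, and the assumption that every segment has the same duration are hypothetical and are not taken from the disclosure; only the playlist tags are standard HLS tags.

```python
import math


def build_hls_playlist(segment_uris, segment_duration, media_sequence=0):
    """Return the text of an HLS media playlist referencing the given segments.

    Assumption for illustration: all segments share one duration (in seconds).
    """
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{math.ceil(segment_duration)}",
        f"#EXT-X-MEDIA-SEQUENCE:{media_sequence}",
    ]
    for uri in segment_uris:
        # Each segment is announced by an EXTINF tag followed by its URI.
        lines.append(f"#EXTINF:{segment_duration:.3f},")
        lines.append(uri)
    return "\n".join(lines) + "\n"


playlist = build_hls_playlist(
    [
        "http://example.tld/segments/in_stream_A/segment_1.ts",
        "http://example.tld/segments/in_stream_A/segment_2.ts",
    ],
    segment_duration=10.0,
)
```

No end-of-list tag is emitted, since a personalized stream of a real-time event would be expected to keep growing.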

Those skilled in the art will appreciate that the streaming content delivery system 110 may have fewer or more components than shown in fig. 2. In addition, the streaming content delivery system 110 may include various web services and/or peer-to-peer network configurations. Thus, the depiction of the streaming content delivery system 110 in fig. 2 should be considered illustrative. For example, in some embodiments, components of the streaming content delivery system 110 (such as the stream personalization service 114 or the content introducer service 116) may be executed by one or more virtual machines implemented in a hosted computing environment. The hosted computing environment may include one or more rapidly provisioned and released computing resources, which may include computing devices, networking devices, and/or storage devices. A hosted computing environment may also be referred to as a cloud computing environment.

Fig. 3 illustrates one embodiment of an architecture of a server 300, which server 300 may implement one or more of the elements of the streaming content delivery system 110, such as the streaming personalization service 114. The overall architecture of the server 300 shown in fig. 3 includes an arrangement of computer hardware and software components that may be used to implement aspects of the present disclosure. As shown, the server 300 includes a processing unit 304, a network interface 306, a computer-readable media drive 307, an input/output device interface 320, a display 302, and an input device 324, all of which may communicate with each other by way of a communication bus. The network interface 306 may provide a connection to one or more networks or computing systems, such as the network 106 of FIG. 2. Processing unit 304 may thus receive information and instructions from other computing systems or services via a network. The processing unit 304 may also communicate to and from the memory 310 and also provide output information for the optional display 302 via the input/output device interface 320. The input/output device interface 320 may also accept input from an optional input device 324, such as a keyboard, mouse, digital pen, or the like. In some embodiments, server 300 may include more (or fewer) components than those shown in fig. 3. For example, some embodiments of the server 300 may omit the display 302 and the input device 324, while providing input/output functionality through one or more alternative communication channels (e.g., via the network interface 306).

Memory 310 may include computer program instructions that are executed by processing unit 304 to implement one or more embodiments. Memory 310 typically includes RAM, ROM, and/or other persistent or non-transitory memory. Memory 310 may store an operating system 314, which provides computer program instructions for use by processing unit 304 in the general management and operation of server 300. Memory 310 may also include computer program instructions and other information for implementing aspects of the present disclosure. For example, in one embodiment, memory 310 includes user interface software 312, which generates a user interface (and/or instructions thereof) for display on a computing device, e.g., via a navigation interface such as a web browser installed on the computing device. Additionally, the memory 310 may include or be in communication with one or more secondary data stores, such as data store 320, which may correspond to any persistent or substantially persistent data storage, such as a Hard Disk Drive (HDD), a solid state drive (SSD), a Network Attached Storage (NAS), a tape drive, or any combination thereof.

In addition to the user interface module 312, the memory 310 may also include stream personalization software 316 that may be executed by the processing unit 304. In one embodiment, stream personalization software 316 implements various aspects of the present disclosure, such as selecting segments to include within a personalized content stream based on user-provided view selection information and providing a manifest file or other information to output content device 103, thereby enabling device 103 to retrieve the selected segments to output the personalized content stream. Although the stream personalization software 316 is shown in fig. 3 as part of the server 300, in other embodiments all or a portion of the software may be implemented by an alternative computing device within the streaming content delivery system 110, such as a virtual computing device within a hosted computing environment. Further, while fig. 3 is described with respect to software implementing the stream personalization service 114, the software within memory 310 may additionally or alternatively include instructions for implementing other components of the present disclosure (such as the content introducer service 116).

With respect to fig. 4, an illustrative example is shown in which three content provider computing devices 108A-108C each provide a respective input video content stream, labeled as video streams A, B and C, respectively. Each video content stream may represent a different view of a common real-time event or may represent a different real-time event. For example, each video content stream may represent a feed of a different camera in a single sporting event, or each video content stream may represent a recording of a different sporting event. For ease of description, each of the streams of video content is shown as starting with a common timestamp (of zero). However, there is not necessarily a common timestamp between the video content streams.

Each video content stream is received at the streaming content delivery system 110, which alters or modifies the streams as needed to enable seamless interchange of segments of the input video content streams within a personalized content stream. For example, the content introducer service 116 may re-encode the input video content streams into one or more encoded streams (e.g., at multiple bit rates, resolutions, etc.) in a format suitable for display on the output computing device 102A. In addition, the content introducer service 116 may synchronize or align aspects of the encoded streams during encoding to enable interchangeable presentation of portions of the encoded streams. Further discussion regarding alignment or synchronization of content streams is provided below with reference to fig. 5.

Thereafter, the streaming content delivery system 110 may provide a plurality of personalized output streams to the output computing devices 102, each with a different combination of segments generated from the input video content streams, and may vary the segments within each personalized output stream based on information received from the view selection computing device 104. For example, the streaming content delivery system 110 may provide a first personalized content stream 402 to the first output computing device 102A, the first personalized content stream 402 including a segment representing the content of the first input video content stream (stream A) from timestamps 0 through 10, then a segment representing the content of the second input video content stream (stream B) from timestamps 10 through 20, and then a segment representing the content of the first input video content stream from timestamps 20 through 30. These segments may represent, for example, an initial user selection, on a view selection computing device 104 associated with the output computing device 102A, of the view represented by the first input video content stream, followed by a selection just prior to timestamp 10 of the view represented by the second input video content stream, and followed by a selection at timestamp 20 of the view represented by the first input video content stream. The streaming content delivery system 110 may provide similar personalized output streams 404 and 406 to other output computing devices 102 (shown as output computing devices 102B and 102C) based on input from other view selection computing devices 104 associated with the respective output computing devices 102.
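The composition of the first personalized content stream 402 can be illustrated with a short Python sketch. It assumes, purely for illustration, fixed 10-unit segments on a shared timeline and a list of view selections expressed as the timestamps at which they take effect; the function and variable names are hypothetical.

```python
from bisect import bisect_right


def compose_stream(selections, segment_duration, total_duration):
    """Choose the source stream for each aligned segment of a personalized stream.

    selections: list of (effective_timestamp, stream_id), sorted by timestamp,
    with the first entry at timestamp 0 (assumed for illustration).
    Returns a list of (stream_id, start, end) tuples.
    """
    times = [timestamp for timestamp, _ in selections]
    timeline = []
    start = 0
    while start < total_duration:
        # The selection in effect at the start of a segment decides its source.
        index = bisect_right(times, start) - 1
        stream_id = selections[index][1]
        timeline.append((stream_id, start, start + segment_duration))
        start += segment_duration
    return timeline


# Stream A from 0 to 10, stream B from 10 to 20, stream A from 20 to 30.
timeline_402 = compose_stream([(0, "A"), (10, "B"), (20, "A")], 10, 30)
```

Because every input stream is segmented on the same boundaries, changing the view reduces to choosing a different source for the next segment.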

As shown in fig. 4, the transition points of the personalized content stream between different base content (e.g., representing various input content streams A-C) may correspond to the locations of splice point frames within the segments representing the different base content. Thus, inclusion of these different segments within the personalized content stream is less likely to result in inter-frame dependency errors when the output content stream is played back on the output computing device 102A. As will be described in more detail below, the streaming content delivery system 110 may, in some cases, delay changing the base content within the personalized content stream from a first base content to a second base content until a common location in the two base contents is reached. For example, upon receiving a request from the view selection computing device 104 to switch the personalized content stream 402 from including the input content stream A to including the input content stream B, the streaming content delivery system 110 may delay switching which input content stream is included within the personalized content stream 402 until timestamp 10, which may represent the next location of a common splice point frame between the two input content streams. In other cases, such as where the streaming content delivery system 110 determines that a common splice point frame will not occur within a threshold period of time, the streaming content delivery system 110 can cause a new splice point frame to be included at a common location within the encoded content corresponding to the input content streams to facilitate faster interchange of base content within the personalized content stream 402.
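The decision between waiting for the next common splice point frame and requesting a new one can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes that common splice point frames recur at a fixed interval on a shared timeline, and the `max_wait` and `encoder_lead_time` parameters are hypothetical.

```python
import math


def plan_switch(request_time, splice_interval, max_wait, encoder_lead_time):
    """Decide when a requested change of base content takes effect.

    Assumption: common splice point frames occur at multiples of
    splice_interval. Returns (switch_time, insert_new_splice_point).
    """
    next_common = math.ceil(request_time / splice_interval) * splice_interval
    if next_common - request_time <= max_wait:
        # The next common splice point is close enough: delay the switch.
        return next_common, False
    # Otherwise, ask the encoders to add an aligned splice point frame
    # after a hypothetical lead time.
    return request_time + encoder_lead_time, True


delayed = plan_switch(8.5, 10, max_wait=2, encoder_lead_time=0.5)
forced = plan_switch(3, 10, max_wait=2, encoder_lead_time=0.5)
```

With these inputs, a request at timestamp 8.5 is deferred to timestamp 10, while a request at timestamp 3 triggers a new splice point frame at timestamp 3.5.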

Exemplary mechanisms for selective insertion of splice point frames are discussed in more detail within the '345 application, which is incorporated by reference above. For example, as disclosed in the '345 application, a system may include an output switch controller configured to coordinate operations of a content encoder and a content packager such that the content packager is enabled at a given point in time to switch between packaging a first encoded content stream and a second encoded content stream without decoding the content streams and without introducing errors into the output stream. In particular, the output switch controller may be configured to instruct each of the content encoders to insert a splice point frame into its respective encoded content stream at a given point in time, and to ensure that subsequent frames of the encoded content stream do not reference any frames preceding the splice point frame. The splice point frame thus acts as a "switching point" so that a viewing device can view the encoded content stream starting from a point corresponding to the splice point frame without generating errors due to the loss of inter-frame dependencies. In one embodiment, the splice point frame is an IDR frame conforming to the H.264 video coding format. The output switch controller may be further configured to indicate to each content packager the time at which the encoded content streams are expected to include a splice point frame. Thus, the content packager can switch from packaging the first encoded content stream into the output stream to packaging the second encoded content stream into the output stream at that point in time. Because the content packager switches to the second encoded content stream at a time corresponding to the splice point frame, subsequent decoding of the content stream can occur without experiencing errors due to loss of inter-frame dependencies. In this way, the content packager is enabled to switch the output stream to any number of encoded content streams without introducing errors in the output stream and without the need to decode and re-encode the content stream.

Referring to fig. 5, an illustrative interaction will be described such that a set of content streams received from one or more content provider computing devices 108 are processed into aligned content segments that can be dynamically selected for inclusion within a personalized content stream. The interaction begins at (1), where the content provider computing device 108 transmits the content stream to the content introducer service 116. Illustratively, the content stream may be transmitted via the network 106 in any suitable format, such as an MPEG-TS streaming content format. In one implementation, each content stream is transmitted in a common format. In another embodiment, two or more of the content streams are transmitted in different formats. As described above, the content stream may represent a variety of different types of base content. For example, the content streams may represent different angles of capture in a real-time event, or represent recordings of different events. In one embodiment, the content stream represents screenshots of displays of different computing devices participating in a common activity, such as different screenshots of players participating in a network-based multiplayer video game. Although fig. 5 is described with reference to content streaming from the content provider computing device 108, embodiments of the present disclosure may also be applied to non-streaming content. For example, the content provider computing device 108 may transmit pre-recorded content to the content introducer service 116 for processing according to embodiments described herein.

At (2), the content introducer service 116 processes the input content streams to align splice point frames within the content streams so that different portions of different input content streams can be included in a common output content stream without introducing errors in the output content stream due to broken inter-frame dependencies. Illustratively, the content introducer service 116 may insert splice point frames at common locations within the encoded streams in accordance with the embodiments of the '345 application discussed above, or may synchronize other attributes, such as time stamps or time codes of the encoded streams, in accordance with the embodiments of the '347 application discussed above. The content introducer service 116 can encode the input content streams into any number of known formats, including, but not limited to, H.263, H.264, H.265, MICROSOFT SMPTE 421M (also known as VC-1), APPLE ProRes, APPLE Intermediate Codec, VP3 through VP9, Motion JPEG ("M-JPEG"), MPEG-2 part 2, RealVideo, Dirac, Theora, and MPEG-4 part 2 (for video), and Vorbis, Opus, MP3, advanced audio coding ("AAC"), pulse code modulation ("PCM"), DTS, MPEG-1, audio coding 3 ("AC-3"), free lossless audio codec ("FLAC"), and RealAudio (for audio), or combinations thereof. Various techniques for encoding content are known in the art and, therefore, will not be described in further detail herein. The content introducer service 116 may also package the encoded streams into any container format suitable for inclusion within the personalized content stream. As will be appreciated by those skilled in the art, a container format may generally combine encoded audio and video, possibly along with audio and video synchronization information, subtitles, metadata, or other information, into a file. Examples of containers include, but are not limited to, Matroska, FLV, MPEG-4 part 12, VOB, Ogg, Audio Video Interleave ("AVI"), QuickTime, Advanced Systems Format ("ASF"), RealMedia, ISO Base Media File Format (ISOBMFF), fragmented MP4 (fMP4), and MPEG transport stream ("MPEG-TS"). In one illustrative embodiment, the content introducer service 116 encodes each content stream according to the H.264 video coding standard and packages each stream into MPEG-TS files, which may represent segments of the encoded stream that may be included in the personalized content stream. In one embodiment, the segments of each content stream may be time aligned such that every nth segment of each content stream begins at a common time (e.g., a common timestamp) and ends at a common time. The temporal alignment of the segments may facilitate intermixing of segments of different input content streams into the personalized content stream, as discussed below. Further, in some cases, the segments of each content stream may be packaged to begin at a splice point frame within the content stream. Because the streaming content delivery system 110 may be configured to change base content within the personalized content stream at points corresponding to detected splice point frames within segments of two base contents (e.g., the content before the change and the content after the change), generating segments that start with a splice point frame may enable changing base content within the personalized content stream by combining segments of different base content.
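The relationship between splice point frames and segment boundaries can be sketched in Python. The frame representation below, a list of (timestamp, is_splice_point) pairs, is a simplification assumed for illustration and does not reflect an actual encoder or packager interface.

```python
def split_at_splice_points(frames):
    """Group frames into segments that each begin at a splice point frame.

    frames: list of (timestamp, is_splice_point) in presentation order.
    Leading frames before the first splice point frame, if any, form their
    own segment.
    """
    segments = []
    for frame in frames:
        _, is_splice_point = frame
        if is_splice_point or not segments:
            segments.append([])
        segments[-1].append(frame)
    return segments


def segment_starts(frames):
    """Return the timestamp at which each segment of a stream begins."""
    return [segment[0][0] for segment in split_at_splice_points(frames)]


# Streams A and B have aligned splice points; stream C does not.
stream_a = [(0, True), (1, False), (2, True), (3, False)]
stream_b = [(0, True), (1, False), (2, True), (3, False)]
stream_c = [(0, True), (1, False), (2, False), (3, True)]
a_and_b_aligned = segment_starts(stream_a) == segment_starts(stream_b)
a_and_c_aligned = segment_starts(stream_a) == segment_starts(stream_c)
```

In this toy example, segments of streams A and B could be intermixed at timestamp 2, whereas stream C would first need a splice point frame at that location.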

At (3), the content introducer service 116 stores aligned stream data (e.g., a set of segments representing each of the input content streams and including base content with aligned splice point frames) into a content data store 118. According to embodiments described herein, segments corresponding to different input content streams (e.g., representing different base content) may thereafter be selectively included within the personalized content stream.

With reference to fig. 6A-6C, an illustrative interaction for providing a personalized content stream to the output computing device 102 will be described. In particular, interactions for providing a personalized content stream to the output computing device 102 that includes segments representing first base content (e.g., content of a first input content stream) will be described with reference to fig. 6A and 6B, while interactions for dynamically changing segments of the personalized content stream such that segments of second base content (e.g., content of a second input content stream) are included within the personalized content stream will be described with reference to fig. 6C. Thus, via the interactions of fig. 6A-6C, a user is enabled to interact with the streaming content delivery system 110 to dynamically interchange base content within the personalized content stream.

The interaction of fig. 6A begins at (1), where the output computing device 102 requests a personalized content stream. Such a request may be generated, for example, by a user interacting with the output computing device 102 to select a personalized content stream (e.g., via an interface of the output computing device 102). The request may indicate initial content that is desired to be included within the personalized content stream, such as a first view of a real-time event available on the streaming content delivery system 110.

At (2), the stream personalization service 114 selects a segment to include in the personalized content stream. The segments may, for example, correspond to segments generated by the content introducer service 116 from an input content stream corresponding to the initial desired content and stored by the content introducer service 116 in a content data store.

At (3), the stream personalization service 114 generates manifest data for the personalized content stream. The manifest data may represent an initial portion of a manifest file for the personalized content stream. As mentioned above, the manifest file typically includes references to several content segments that collectively represent the personalized content stream. The manifest file may be repeatedly updated or appended to during the personalized content stream to reflect new segments to be included within the content stream. For example, the initial manifest data generated by the stream personalization service 114 may list the first 10 segments of the content stream, and the stream personalization service 114 may later update the manifest data to include the next 10 segments of the content stream, and so on. In some cases, the manifest file may reflect only a limited number of segments around the current output position of the content stream, and thus correspond to a "moving window" of segments within the content stream. In other cases, the manifest file may maintain references to historical segments of the content stream, and thus resemble a "log" of all segments within the content stream. As described above, the manifest file may represent a manifest file generated according to the Hypertext Transfer Protocol ("HTTP") Live Streaming ("HLS") protocol, or a Media Presentation Description (MPD) file generated according to the MPEG Dynamic Adaptive Streaming over HTTP ("MPEG-DASH") protocol.
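The difference between a "moving window" manifest and a "log" manifest can be sketched with a bounded and an unbounded queue. The class below is a hypothetical illustration; the window size of 10 simply mirrors the example above.

```python
from collections import deque


class Manifest:
    """Holds the segment references currently listed in a manifest file."""

    def __init__(self, window_size=None):
        # window_size=None keeps every reference (a "log"); an integer keeps
        # only the most recent references (a "moving window").
        self._refs = deque(maxlen=window_size)

    def append(self, refs):
        """Add newly selected segment references to the manifest."""
        self._refs.extend(refs)

    def references(self):
        return list(self._refs)


log_manifest = Manifest()
window_manifest = Manifest(window_size=10)
for first in (1, 11, 21):
    batch = [f"segment_{n}.ts" for n in range(first, first + 10)]
    log_manifest.append(batch)
    window_manifest.append(batch)
```

After three updates of 10 segments each, the log lists all 30 references while the window lists only segments 21 through 30.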

In one embodiment, the manifest data (e.g., representing an initial portion of the manifest file) includes one or more references to segments included within the content data store 118 or otherwise available at the streaming content delivery system 110. The reference may take the form of, for example, a uniform resource identifier or "URI". Thus, when the initial desired content represents "input stream A", the manifest data may include a series of references in the form of "http://example.tld/segments/in_stream_A/segment_<n>.ts", where "<n>" represents the integer ordering of the segments generated from input stream A. In some cases, the manifest data may include multiple references for each segment of the personalized content stream. For example, the manifest data may reference multiple versions of a first segment of a given input stream, each version having a different resolution, bit rate, or format.

In another embodiment, the manifest data may include "generic" or "placeholder" references rather than direct references to segments of the input content stream. Such references may not have a direct correspondence to segments within the content data store 118, but rather represent placeholder references (such as placeholder references in the form of uniform resource identifiers or "URIs") that the streaming content delivery system 110 may later associate with segments in the content data store 118. For example, the manifest file may include several consecutive references in the form "http://example.tld/<personal_stream_id>/<n>.ts", where "<personal_stream_id>" denotes an identifier of the personalized content stream and "<n>" denotes an integer ordering of segments within the personalized content stream. To enable later resolution of the placeholder references, the stream personalization service 114 may also determine a mapping of each placeholder reference to a segment of the input content stream (e.g., as stored in the content data store 118). For example, the stream personalization service 114 may determine that the reference "http://example.tld/<personal_stream_id>/1.ts" corresponds to "http://example.tld/segments/in_stream_A/segment_1.ts", that "http://example.tld/<personal_stream_id>/2.ts" corresponds to "http://example.tld/segments/in_stream_A/segment_2.ts", and so on. As will be discussed in more detail below, the use of placeholder references in the manifest data may enable the streaming content delivery system 110 to more quickly modify the segments included within the personalized content stream, as the streaming content delivery system 110 may change the mapping even after the manifest data is transmitted to the output computing device 102.
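A placeholder mapping of this kind can be sketched as a small lookup table that may be updated after the manifest has been sent. The class and method names are hypothetical, the stream identifier "ps42" is made up for illustration, and the URI patterns follow the examples above.

```python
class PlaceholderMap:
    """Maps placeholder URIs of one personalized stream to input-stream segments."""

    def __init__(self, personal_stream_id, base="http://example.tld"):
        self._base = base
        self._stream_id = personal_stream_id
        self._mapping = {}

    def placeholder(self, n):
        """Return the placeholder URI for the n-th segment of the stream."""
        return f"{self._base}/{self._stream_id}/{n}.ts"

    def assign(self, n, input_stream, segment_index):
        """Point the n-th placeholder at a segment of the given input stream."""
        target = f"{self._base}/segments/in_stream_{input_stream}/segment_{segment_index}.ts"
        self._mapping[self.placeholder(n)] = target

    def resolve(self, placeholder_uri):
        return self._mapping[placeholder_uri]


mapping = PlaceholderMap("ps42")
mapping.assign(1, "A", 1)
mapping.assign(2, "A", 2)
before = mapping.resolve(mapping.placeholder(2))
# The user selects input stream B before segment 2 is requested; only the
# mapping changes, not the manifest already held by the output device.
mapping.assign(2, "B", 2)
after = mapping.resolve(mapping.placeholder(2))
```

The placeholder URI for segment 2 is identical before and after the change; only the segment it resolves to differs.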

The manifest data is then transmitted to the output computing device 102 at (4). By retrieving and playing back the segments referenced within the manifest data, the output computing device 102 may output a personalized content stream (e.g., a segment that includes the originally desired content), as discussed below. Although a limited number of interactions are shown in fig. 6A, these interactions may be repeated any number of times during the output of the personalized content stream. For example, the output computing device 102 may continuously or repeatedly interact with the stream personalization service 114 to retrieve additional manifest data so that the output computing device 102 may continue to retrieve and output segments of the personalized content stream.

The interactions of FIG. 6A continue in FIG. 6B, where the output computing device 102 uses the manifest data to retrieve one or more segments of the personalized content stream and plays back those segments, as discussed above, to output the personalized content stream. Specifically, at (5), the output computing device 102 transmits a request to the POP 112 for one or more segments referenced within the manifest data (e.g., by submitting an HTTP GET request to a URI corresponding to a segment in the manifest data). As discussed above, in some cases the manifest data may include generic or placeholder references that do not directly correspond to files stored within the content data store 118. In such cases, at (6), the POP 112 may interact with the stream personalization service 114 to resolve the placeholder identifier (e.g., as identified within the request) to an identifier of a segment of the input content stream. For example, the POP 112 may transmit a request to the stream personalization service 114 to resolve a given placeholder identifier to an identifier of a segment of an input content stream. The stream personalization service 114 may then use the mapping information (e.g., as generated during the interactions of FIG. 6A discussed above) to return the identifier of the segment of the input content stream to the POP 112. In some cases, rather than requiring a request from a POP 112, the stream personalization service 114 may "push" mapping information to one or more POPs 112, so that a POP 112 need not retrieve the mapping information in response to a request. In embodiments that do not use placeholder references, the request may directly identify the segment of the input content stream, and thus interaction (6) may not be required.

At (8), the POP 112 retrieves the determined segment of the input content stream from the content data store 118, if needed. Illustratively, to reduce network communications between the POP 112 and the content data store 118, the POP 112 may maintain a local cache of recently requested segments. Accordingly, interaction (8) may not be needed if the determined segment of the input content stream has previously been cached at the POP 112. Otherwise, at interaction (8), the POP 112 may retrieve the determined segment of the input content stream from the content data store 118.

At interaction (9), the requested segment is transmitted back to the output computing device 102. Thereafter, the output computing device 102 may output the segment as part of the personalized content stream. Thus, according to the interactions of FIGS. 6A and 6B, a user of the output computing device 102 is enabled to output a personalized content stream that includes the desired base content.
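The request handling of FIG. 6B (resolve the placeholder, consult the local cache, and read from the content data store only on a miss) might be sketched as follows. The Pop class, its callables, and the short identifiers are hypothetical stand-ins chosen for illustration; an actual POP would serve HTTP requests rather than in-memory dictionaries.

```python
from typing import Callable, Dict


class Pop:
    """Hypothetical stand-in for a POP with a local segment cache."""

    def __init__(
        self, resolve: Callable[[str], str], fetch: Callable[[str], bytes]
    ) -> None:
        self._resolve = resolve  # placeholder -> segment identifier (interaction 6)
        self._fetch = fetch      # read from the content data store (interaction 8)
        self._cache: Dict[str, bytes] = {}
        self.store_reads = 0     # counts reads that reached the data store

    def handle_request(self, placeholder: str) -> bytes:
        segment_id = self._resolve(placeholder)
        if segment_id not in self._cache:
            # Cache miss: retrieve the segment from the content data store.
            self._cache[segment_id] = self._fetch(segment_id)
            self.store_reads += 1
        # Interaction (9): return the segment to the output device.
        return self._cache[segment_id]


mapping = {"p/1.ts": "in_stream_A/segment_1.ts"}
store = {"in_stream_A/segment_1.ts": b"segment-1-bytes"}
pop = Pop(resolve=mapping.__getitem__, fetch=store.__getitem__)
first = pop.handle_request("p/1.ts")
second = pop.handle_request("p/1.ts")  # served from the local cache
```

In this sketch the second request never reaches the data store, which corresponds to the case in which interaction (8) is not needed.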

With reference to FIG. 6C, illustrative interactions for dynamically changing segments within a personalized content stream will be described. In particular, the interactions of FIG. 6C may enable a user of the view selection computing device 104 to request a change to the segments included in the personalized content stream, and may enable the streaming content delivery system 110 to change the segments within the personalized content stream without introducing errors into the stream, such that different base content may be seamlessly displayed within the personalized content stream. For ease of reference, the interactions of FIG. 6C are renumbered to begin with (1). However, it should be understood that the interactions of FIG. 6C may occur at any time during the output of the personalized content stream.

The interactions of FIG. 6C begin at (1), where the view selection computing device 104 receives a request (e.g., from user input) to change the view within the personalized content stream to different base content (e.g., a different view of the same live event, a view of a different live event, etc.). The request may be received, for example, by a user selecting elements 54-58 of FIG. 1.

At (2), the view selection computing device 104 transmits a notification of the requested view to the stream personalization service 114. The notification may be transmitted, for example, by code executing on the view selection computing device 104 (e.g., a browser application or a dedicated view selection application).

At (3), the stream personalization service 114 selects, based on the view change request, a new segment of base content to include within the personalized content stream. The new segment may correspond to a segment representing the base content selected by means of the view change request. The stream personalization service 114 then causes the new segment to be included within the personalized content stream, thereby changing the content of the personalized content stream to reflect the base content selected in the view change request. Because each segment of the new base content includes splice point frames that are aligned with those of the segments of the previous base content, the segments can be intermixed within the output content stream without introducing errors into the content stream.
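The role of aligned splice point frames can be illustrated with a small Python sketch that finds the next position at which both encoded streams have a splice point frame, i.e., the next position at which a switch would be safe. The frame indices and the function name are hypothetical; in the described system the encoders are expected to align splice points up front, so a check of this kind would serve only as a sanity check.

```python
from bisect import bisect_left
from typing import List, Optional


def next_common_splice_point(
    first: List[int], second: List[int], position: int
) -> Optional[int]:
    """Return the first frame index >= position that is a splice point
    frame in both streams, or None if there is no such frame."""
    common = sorted(set(first) & set(second))
    i = bisect_left(common, position)
    return common[i] if i < len(common) else None


# Hypothetical splice point frame indices for two encoded streams.
first_stream = [0, 60, 120, 180]
second_stream = [0, 30, 60, 90, 120, 150, 180]
switch_at = next_common_splice_point(first_stream, second_stream, 75)
```

Here the second stream has a splice point at frame 90, but the first does not, so the earliest safe switch after frame 75 is at frame 120.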

In embodiments where the manifest data provided to the output computing device 102 directly references the segments present in the content data store 118, the stream personalization service 114 may cause the newly selected segments to be referenced within any additional manifest data transmitted to the output computing device 102. Thus, as the personalized content stream continues and as the output computing device 102 continues to receive manifest data regarding the personalized content stream, the output computing device 102 will (after receiving the view change request at the view selection computing device 104) begin obtaining manifest data that references the new segment dynamically selected by the stream personalization service 114. By processing the manifest data, the output computing device 102 will begin outputting segments of the newly selected base content, thereby changing the base content output to the user without requiring the output computing device 102 to stop outputting the personalized content stream or otherwise change the content stream.

As described above, in some embodiments, the stream personalization service 114 may provide manifest data including placeholder references to the output computing device 102, and the POP 112 may use a placeholder-to-segment mapping to resolve requests for segments received from the output computing device 102. The use of placeholder references may be beneficial, for example, because it may enable the streaming content delivery system 110 to modify the segments provided to a device at the time the segments are requested rather than at the time manifest data is delivered. Because requests for segments occur after the manifest file has been processed, the use of placeholder references may enable the segments provided to the output computing device 102 to be determined later, and may improve the observed responsiveness of the personalized content stream to view change requests. For purposes of further discussion of FIG. 6C, it will be assumed that the stream personalization service 114 provides manifest data with placeholder references to segments. Thus, when a new segment is selected based on the view change request, the stream personalization service 114 may update the placeholder-to-segment mapping so that placeholder references map to segments of the new base content. For example, where the previous mapping mapped placeholder n to segment n of the first base content, the stream personalization service 114 may update the mapping to map placeholder n to segment n of the second base content. The mapping may be updated to replace references to the first base content with references to the second base content starting from a change location determined by the stream personalization service 114. The stream personalization service 114 selects the change location by identifying locations corresponding to splice point frames within both the first base content and the second base content. Illustratively, the stream personalization service 114 may select the change location by determining the next segment of the first base content that has not yet been retrieved by the output computing device 102, and substituting into the mapping references to the corresponding segments of the second base content for that segment and all subsequent segments. Thus, the next time the output computing device 102 requests that segment of the personalized content stream, the streaming content delivery system 110 may cause the segment of the second base content to be delivered to the output computing device 102. Where direct references to segments are included within the manifest data, the change location may be selected as the next segment of the first base content that has not yet been referenced in manifest data transmitted to the output computing device 102.
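The mapping update described above can be sketched as follows. The sketch assumes that placeholders are keyed by their integer ordering n and that segment identifiers follow the "segment_<n>.ts" pattern of the earlier example; the function name and stream names are hypothetical.

```python
from typing import Dict


def switch_base_content(
    mapping: Dict[int, str], next_unretrieved: int, new_stream: str
) -> None:
    """Remap placeholder n to segment n of new_stream for every
    n >= next_unretrieved (the change location); earlier entries,
    which the output device has already retrieved, are left untouched."""
    for n in mapping:
        if n >= next_unretrieved:
            mapping[n] = f"{new_stream}/segment_{n}.ts"


# Placeholders 1..5 initially map to segments of the first base content.
mapping = {n: f"in_stream_A/segment_{n}.ts" for n in range(1, 6)}
# The output device has retrieved segments 1..3, so the change starts at 4.
switch_base_content(mapping, next_unretrieved=4, new_stream="in_stream_B")
```

Because segment n of each stream begins at an aligned splice point frame, replacing the entries from the change location onward would not leave a frame dependent on frames of the other stream.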

Thereafter, at (4), the output computing device 102 may request one or more segments (e.g., based on processing a reference within the manifest data previously provided by the stream personalization service 114). At interactions (4)-(8), the POP 112 may, in turn, retrieve updated segment mapping information from the stream personalization service 114, determine the segment identified by the mapping, retrieve the segment from the content data store 118 as needed, and transmit the segment back to the output computing device 102. These interactions are similar to interactions (5)-(9) of FIG. 6B, and thus their description is not repeated. Thereafter, at (9), the output computing device 102 outputs the segment as part of the personalized content stream, thus changing the base content output to the user without requiring the output computing device 102 to stop outputting or otherwise change the personalized content stream.

While the interactions of FIGS. 6A-6C are described above with respect to segments stored within the content data store 118 (e.g., representing content packaged into files), in some cases the interactions may occur with respect to segments dynamically generated by the POP 112 in response to user requests. For example, where the stream personalization service 114 uses placeholder references within the manifest file, the stream personalization service 114 may generate a mapping of the placeholder references to the portions of one or more base content that are to be included within a segment (e.g., timestamp ranges of one or more input content streams to be included in the content). The POP 112 may then generate a segment to be delivered to the output computing device 102 based on the portions of base content indicated in the mapping. For example, each POP 112 may include a packager computing device configured to dynamically generate a segment by packaging a portion of an input content stream (e.g., received from the content introducer service 116) based on the portion of base content indicated in the mapping provided by the stream personalization service 114. In some cases, the POP 112 may dynamically generate a segment only if an appropriate segment does not already exist at the POP 112, such that if a segment with given base content is to be delivered to multiple output computing devices 102, the POP 112 may reuse the previously generated segment for second and subsequent requests. In one embodiment, the POP 112 may dynamically generate every segment by packaging portions of one or more input content streams. In another embodiment, the POP 112 may dynamically generate only those segments that include different base content within a single segment. For example, while the POP 112 may maintain a distinct set of segments for each input content stream, it may also be configured to dynamically create segments that include portions of different input content streams. Thus, if the stream personalization service 114 determines that a segment should include two different input content streams (e.g., 2 seconds of a first input content stream and 2 seconds of a second input content stream), the POP 112 may dynamically generate the segment based on the two input content streams and return the dynamically generated segment to the output computing device 102. Thus, in various instances, the POP 112 may act as a packager that packages the input content streams into segments to be included within the personalized content stream.
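Dynamic packaging with reuse of previously generated segments might look like the following sketch. It models each input content stream as a list with one entry per second of content, and a mapping entry as a list of (stream name, start second, end second) portions; both models, and the Packager class itself, are simplifying assumptions made for illustration. Real packaging would operate on encoded frames, and the switch inside a mixed segment would need to fall on a splice point frame.

```python
from typing import Dict, List, Tuple

Portion = Tuple[str, int, int]  # (input stream name, start second, end second)


class Packager:
    """Hypothetical packager that builds segments on demand and reuses them."""

    def __init__(self, streams: Dict[str, List[str]]) -> None:
        self._streams = streams  # one list entry per second of content
        self._generated: Dict[Tuple[Portion, ...], List[str]] = {}
        self.packaged = 0        # number of segments actually generated

    def segment_for(self, portions: List[Portion]) -> List[str]:
        key = tuple(portions)
        if key not in self._generated:
            segment: List[str] = []
            for name, start, end in portions:
                # Append the requested time range of this input stream.
                segment.extend(self._streams[name][start:end])
            self._generated[key] = segment
            self.packaged += 1
        return self._generated[key]


streams = {
    "first": [f"A{t}" for t in range(8)],
    "second": [f"B{t}" for t in range(8)],
}
packager = Packager(streams)
# A 4-second segment: 2 seconds of the first stream, then 2 of the second.
mixed = packager.segment_for([("first", 4, 6), ("second", 6, 8)])
again = packager.segment_for([("first", 4, 6), ("second", 6, 8)])
```

The second call returns the segment generated by the first, mirroring the reuse of a previously generated segment for second and subsequent requests.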

Referring to FIG. 7, one illustrative routine 700 for providing a dynamically personalized content stream will be described. The routine 700 may be implemented, for example, by the streaming content delivery system 110 of FIG. 1.

The routine 700 begins at block 702, where the streaming content delivery system 110 obtains segments of a plurality of base content having aligned splice point frames. In one embodiment, the segments may be provided to the streaming content delivery system 110 (e.g., by the content provider computing device 108). In another embodiment, as discussed above, the streaming content delivery system 110 may obtain segments of multiple base content with aligned splice point frames by processing multiple input content streams to generate segments of each input content stream with aligned splice point frames.

Thereafter, at block 704, the streaming content delivery system 110 receives a request for a personalized content stream. The request may be transmitted, for example, by the output computing device 102, and may indicate the first base content to be included within the personalized content stream.

At block 706, the streaming content delivery system 110 identifies a segment for the personalized content stream from the currently requested base content (e.g., as indicated within the initial request for the personalized content stream). Illustratively, the streaming content delivery system 110 may identify one or more segments that are generated based on the input content stream identified within the request for the personalized content stream.

At block 708, the streaming content delivery system 110 transmits the identified segments to the output computing device 102. In one embodiment, the segments may be transmitted based at least in part on referencing the identified segments within the manifest data. In another embodiment, the segments may be transmitted based at least in part on including placeholder references within the manifest data and resolving the placeholder references to the identified segments at the streaming content delivery system 110.

At block 710, the routine 700 may vary based on whether a request to change the base content of the personalized content stream has been received. If so, the routine 700 proceeds to block 712, where the streaming content delivery system 110 updates the requested base content of the personalized content stream. For example, the streaming content delivery system 110 may record in memory an indication that the personalized content stream should be modified to include different base content. The routine 700 then proceeds to block 714. If a request to change the base content has not been received, the routine 700 may proceed directly from block 710 to block 714.

At block 714, the routine 700 may vary based on whether the personalized content stream has ended (e.g., whether the user has requested that the personalized content stream end, whether the current base content of the stream has ended, etc.). If so, the routine 700 proceeds to block 716 and ends. Otherwise, the routine 700 returns to block 706, where the streaming content delivery system 110 again identifies a segment of the currently requested base content, and then causes the segment to be transmitted to the output device at block 708, as described above. The routine 700 may continue in this manner, thereby enabling the user to dynamically change the content of the personalized content stream.

While one illustrative example of the routine 700 is described above, the elements or the ordering of elements in the routine 700 may vary across embodiments of the present disclosure. For example, in some cases, block 712 may be implemented as a separate routine, or as an "interrupt" to the routine 700, such that the requested base content may be changed at any time during execution of the routine 700. Other modifications or variations will be apparent to persons skilled in the art in view of this disclosure. Thus, the elements and ordering of the routine 700 are illustrative in nature.
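The control flow of blocks 704 through 716 can be summarized in a short Python sketch. It is one possible arrangement, not the routine itself: segments are modeled as labels in per-content lists of equal length (block 702 is assumed to have produced them), and view change requests are modeled as a hypothetical dictionary keyed by the index of the segment after which each request arrives.

```python
from typing import Dict, Iterator, List


def routine_700(
    segments: Dict[str, List[str]],
    initial_content: str,
    change_requests: Dict[int, str],
) -> Iterator[str]:
    """Yield the segments of a personalized content stream.

    Assumes every base content has a non-empty, temporally aligned
    segment list, so segment n of one may follow segment n-1 of another.
    """
    current = initial_content                    # block 704: initial request
    index = 0
    while True:
        segment = segments[current][index]       # block 706: identify segment
        yield segment                            # block 708: transmit segment
        requested = change_requests.get(index)   # block 710: change requested?
        if requested is not None:
            current = requested                  # block 712: update base content
        index += 1
        if index >= len(segments[current]):      # block 714: stream ended?
            return                               # block 716: end


segments = {
    "A": ["A0", "A1", "A2", "A3"],
    "B": ["B0", "B1", "B2", "B3"],
}
# A view change to base content "B" arrives after segment 1 is transmitted.
output = list(routine_700(segments, "A", {1: "B"}))
```

With these inputs, the output stream begins with the first two segments of "A" and continues with the third and fourth segments of "B".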

All of the methods and processes described above may be embodied in, and fully automated via, software code modules executed by one or more computers or processors. The code modules may be stored in any type of non-transitory computer-readable medium or other computer storage device. Some or all of the methods may alternatively be embodied in dedicated computer hardware.

Conditional language such as, among others, "can," "could," "might," or "may," unless specifically stated otherwise, is understood in context as generally used to convey that certain embodiments include, while other embodiments do not include, certain features, elements, and/or steps. Thus, such conditional language is not generally intended to imply that features, elements, and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether such features, elements, and/or steps are included or are to be performed in any particular embodiment.

Unless specifically stated otherwise, disjunctive language such as the phrase "at least one of X, Y, or Z" should generally be understood in context as presenting that an item, term, etc. may be X, Y, or Z, or any combination thereof (e.g., X, Y, and/or Z). Thus, such disjunctive language is not generally intended to, and should not, imply that certain embodiments require at least one of X, at least one of Y, or at least one of Z to each be present.

Articles such as "a" or "an" should generally be construed to include one or more of the described items unless explicitly stated otherwise. Thus, phrases such as "a device configured to" are intended to include one or more recited devices. Such one or more recited devices may also be collectively configured to carry out the stated recitations. For example, "a processor configured to carry out recitations A, B, and C" may include a first processor configured to carry out recitation A working in conjunction with a second processor configured to carry out recitations B and C.

Any routine descriptions, elements, or blocks in flow charts described herein and/or shown in the accompanying drawings should be understood as potentially representing modules, segments, or portions of code that include one or more executable instructions for implementing the specified logical function or element in the routine. Alternative implementations are included within the scope of the embodiments described herein in which elements or functions may be deleted or performed in an order other than that shown and discussed, including substantially concurrently or in reverse order, depending on the functionality involved as would be understood by those skilled in the art.

It should be emphasized that many variations and modifications may be made to the above-described embodiments, the elements of which are to be understood as being among other acceptable examples. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.

The foregoing may be better understood in view of the following clauses:

Clause 1. A system for providing a personalized content stream created from a user-selected combination of input content streams, the system comprising:

one or more encoding computing devices configured with computer-executable instructions to:

obtaining at least two input content streams;

encoding at least two input content streams into at least two encoded content streams, the at least two encoded content streams comprising one or more splice point frames at a common location between the at least two encoded content streams, each splice point frame ensuring that there is no inter-frame dependency between one or more frames preceding the splice point frame and one or more frames following the splice point frame;

generating a first set of content segments from a first encoded content stream representing a first base content of at least two encoded content streams, wherein each segment of the first set of content segments starts at a position in the first encoded content stream corresponding to a splice point frame; and

generating a second set of content segments from a second encoded content stream of the at least two encoded content streams representing a second base content, wherein each segment of the second set of content segments begins at a position in the second encoded content stream corresponding to a splice point frame;

a data store configured to store the first set of content segments and the second set of content segments; and

one or more computing devices configured with computer-executable instructions to:

receiving a request for a personalized content stream, the request identifying the first base content;

transmitting one or more segments of the first set of content segments as a first portion of the personalized content stream;

receiving a request to modify the personalized content stream to include the second base content; and

transmitting one or more of the second set of content segments as a second portion of the personalized content stream.

Clause 2. The system of clause 1, wherein the first encoded content stream is live streaming content.

Clause 3. the system of clause 1, wherein the first base content represents a first view of a common event, and wherein the second base content represents a second view of the common event.

Clause 4. The system of clause 1, wherein the first encoded content stream and the second encoded content stream are encoded according to the H.264 standard, and wherein the splice point frame is an instantaneous decoder refresh ("IDR") frame.

Clause 5. a computer-implemented method, the method comprising:

generating a first set of content segments from a first input content stream representing a first base content, wherein each segment of the first set of content segments begins at a location in a first encoded content stream corresponding to a splice point frame, each splice point frame ensuring that there is no inter-frame dependency between one or more frames preceding the splice point frame and one or more frames following the splice point frame;

generating a second set of content segments from a second input content stream representing a second base content, wherein each segment of the second set of content segments begins at a position in a second encoded content stream corresponding to a splice point frame, and wherein each segment of the second set of content segments is temporally aligned with a corresponding segment of the first set of content segments;

Clause 6. the computer-implemented method of clause 5, wherein the first set of content segments represents at least one of audio content or video content.

Clause 7. the computer-implemented method of clause 5, wherein the request to modify the personalized content stream is received from a view selection computing device, and wherein the computer-implemented method further comprises transmitting a graphical representation of the first base content and the second base content to the view selection computing device.

Clause 8. the computer-implemented method of clause 7, wherein the graphical representation is a thumbnail image.

Clause 9. the computer-implemented method of clause 7, wherein the graphical representation is a low resolution video stream.

Clause 10. the computer-implemented method of clause 5, wherein transmitting one or more of the first set of content segments as the first portion of the personalized content stream comprises: transmitting manifest data comprising references to one or more segments of the first set of content segments.

Clause 11. the computer-implemented method of clause 5, wherein transmitting one or more of the first set of content segments as the first portion of the personalized content stream comprises:

transmitting manifest data comprising a set of placeholder references;

generating mapping information that maps individual ones of the set of placeholder references to individual ones of the one or more pieces of the first set of content pieces;

receiving a request for one of the set of placeholder references;

determining a first fragment of the one or more fragments that corresponds to the placeholder reference based at least in part on the mapping information; and

returning the first fragment in response to the request.

Clause 12. The computer-implemented method of clause 5, wherein the first set of content segments are packaged within a plurality of containers generated according to at least one of a Hypertext Transfer Protocol ("HTTP") Live Streaming ("HLS") protocol or an MPEG Dynamic Adaptive Streaming over HTTP ("MPEG-DASH") protocol.

Clause 13. the computer-implemented method of clause 5, wherein the first encoded content stream and the second encoded content stream are moving picture experts group transport streams ("MPEG-TS").

Clause 14. a system for providing a personalized content stream created from a user-selected combination of input content streams, the system comprising:

a data storage area, the data storage area comprising:

a first set of content segments representing a first base content, wherein each segment of the first set of content segments begins at a location in a first encoded content stream corresponding to a splice point frame, each splice point frame ensuring that there is no inter-frame dependency between one or more frames preceding the splice point frame and one or more frames following the splice point frame;

a second set of content segments representing a second base content, wherein each segment of the second set of content segments begins at a position in a second encoded content stream corresponding to a splice point frame, and wherein each segment of the second set of content segments is temporally aligned with a corresponding segment of the first set of content segments;

Clause 15. the system of clause 14, further comprising an encoding computing system configured with computer-executable instructions to generate the first and second sets of content segments from the respective first and second input content streams.

Clause 16. The system of clause 14, wherein the request to modify the personalized content stream is received from a view selection computing device, and wherein the one or more computing devices are further configured with computer-executable instructions to transmit graphical representations of the first base content and the second base content to the view selection computing device.

Clause 17. the system of clause 16, wherein the personalized content stream is output on an output computing device, and wherein the output computing device is a different device than the view selection computing device.

Clause 18. the system of clause 16, wherein the personalized content stream is output on the view selection computing device.

Clause 19. The system of clause 14, wherein the one or more computing devices are configured with computer-executable instructions to transmit one or more segments of the first set of content segments as the first portion of the personalized content stream at least in part by transmitting manifest data comprising a reference to the one or more segments of the first set of content segments.

Clause 20. The system of clause 14, wherein the one or more computing devices are configured with computer-executable instructions to transmit one or more segments of the first set of content segments as the first portion of the personalized content stream at least in part by:

transmitting manifest data comprising a set of placeholder references;

receiving a request for one of the set of placeholder references;

returning the first fragment in response to the request.

Clause 21. The system of clause 14, wherein the request to modify the personalized content stream is received from a view selection computing device, and wherein the view selection computing device transmits the request to modify the personalized content stream in response to input from a user, the input comprising one or more of: a selection of a graphical representation of the first base content and the second base content, a modification of an orientation of the view selection computing device, or a selection of a physical input on the view selection computing device.

Claims

1. A computer-implemented method, the computer-implemented method comprising:

2. The computer-implemented method of claim 1, wherein the request to modify the personalized content stream is received from a view selection computing device, and wherein the computer-implemented method further comprises transmitting a graphical representation of the first base content and the second base content to the view selection computing device.

3. The computer-implemented method of claim 2, wherein the graphical representation is at least one of a thumbnail image or a low resolution video stream.

4. The computer-implemented method of claim 1, wherein transmitting the one or more segments of the first set of content segments as the first portion of the personalized content stream comprises: transmitting manifest data comprising references to the one or more segments of the first set of content segments.

5. The computer-implemented method of claim 1, wherein transmitting the one or more segments of the first set of content segments as the first portion of the personalized content stream comprises:

transmitting manifest data comprising a set of placeholder references;

receiving a request for one of the set of placeholder references;

returning the first fragment in response to the request.

6. The computer-implemented method of claim 1, wherein the first set of content segments are packaged within a plurality of containers generated according to at least one of a Hypertext Transfer Protocol ("HTTP") Live Streaming ("HLS") protocol or an MPEG Dynamic Adaptive Streaming over HTTP ("MPEG-DASH") protocol.

7. The computer-implemented method of claim 1, wherein the first encoded content stream and the second encoded content stream are moving picture experts group transport streams ("MPEG-TS").

8. A system for providing a personalized content stream created from a user selected combination of input content streams, the system comprising:

a data storage area, the data storage area comprising:

a first set of content segments representing a first base content, wherein each segment of the first set of content segments begins at a location in a first encoded content stream corresponding to a splice point frame, each splice point frame ensuring that there is no inter-frame dependency between one or more frames preceding the splice point frame and one or more frames following the splice point frame; and

a second set of content segments representing a second base content, wherein each segment of the second set of content segments begins at a position in a second encoded content stream corresponding to a splice point frame, and wherein each segment of the second set of content segments is temporally aligned with a corresponding segment of the first set of content segments;

one or more computing devices configured with computer-executable instructions to:

9. The system of claim 8, further comprising an encoding computing system configured with computer-executable instructions to generate the first and second sets of content segments from respective first and second input content streams.

10. The system of claim 8, wherein the request to modify the personalized content stream is received from a view selection computing device, and wherein the one or more computing devices are further configured with the computer-executable instructions to transmit graphical representations of the first base content and the second base content to the view selection computing device.

11. The system of claim 10, wherein the personalized content stream is output on an output computing device, and wherein the output computing device is a different device than the view selection computing device.

12. The system of claim 10, wherein the personalized content stream is output on the view selection computing device.

13. The system of claim 8, wherein the one or more computing devices are configured with the computer-executable instructions to transmit the one or more segments of the first set of content segments as the first portion of the personalized content stream at least in part by transmitting manifest data that includes references to the one or more segments of the first set of content segments.
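
One possible form of manifest data that includes references to the segments is an HLS media playlist. The sketch below builds such a playlist from a per-slot list of selected base content. The function name, the segment table, and the assumption that all segments share one duration are hypothetical; only the playlist tags themselves come from the HLS specification.

```python
import math
from typing import Dict, List


def build_media_playlist(selections: List[str],
                         segments: Dict[str, List[str]],
                         duration_s: float,
                         first_sequence: int = 0) -> str:
    """Build an HLS media playlist referencing one segment per time slot.

    selections[k] names the base content chosen for slot first_sequence + k;
    segments[content_id][i] is the URI of that content's i-th aligned segment.
    """
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        # Target duration must be an integer no smaller than any segment duration.
        f"#EXT-X-TARGETDURATION:{math.ceil(duration_s)}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_sequence}",
    ]
    for offset, content_id in enumerate(selections):
        uri = segments[content_id][first_sequence + offset]
        lines.append(f"#EXTINF:{duration_s:.3f},")
        lines.append(uri)
    return "\n".join(lines) + "\n"


# Hypothetical segment URIs for two base contents.
segment_table = {
    "cam1": ["cam1/0.ts", "cam1/1.ts", "cam1/2.ts"],
    "cam2": ["cam2/0.ts", "cam2/1.ts", "cam2/2.ts"],
}
playlist = build_media_playlist(["cam1", "cam1", "cam2"], segment_table, 2.0)
```

Here the third slot references a segment of the second base content, which is the kind of switch the aligned splice point frames are meant to make decodable.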

14. The system of claim 8, wherein the one or more computing devices are configured with the computer-executable instructions to transmit the one or more segments of the first set of content segments as the first portion of the personalized content stream at least in part by:

transmitting manifest data comprising a set of placeholder references;

receiving a request for one of the set of placeholder references;

returning the first segment in response to the request.

15. The system of claim 8, wherein the request to modify the personalized content stream is received from a view selection computing device, and wherein the view selection computing device transmits the request to modify the personalized content stream in response to input from a user, the input comprising one or more of: a selection of a graphical representation of the first base content and the second base content, a modification of an orientation of the view selection computing device, or a selection of a physical input on the view selection computing device.