US20180007422A1 - Apparatus and method for providing and displaying content - Google Patents
- Publication number
- US20180007422A1 (application Ser. No. 15/280,947)
- Authority
- US
- United States
- Prior art keywords
- bit rate
- content
- content item
- rate version
- high bit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/156—Mixing image signals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/44029—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display for generating different versions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/012—Head tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
- G06F3/0482—Interaction with lists of selectable items, e.g. menus
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
- G11B20/10527—Audio or video recording; Data buffering arrangements
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
- G11B27/036—Insert-editing
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/102—Programmed access in sequence to addressed parts of tracks of operating record carriers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/61—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/61—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
- H04L65/611—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for multicast or broadcast
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/61—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
- H04L65/612—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
- H04L65/762—Media network packet handling at the source
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
- H04L65/764—Media network packet handling at the destination
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/80—Responding to QoS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/243—Image signal generators using stereoscopic image cameras using three or more 2D image sensors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/30—Image reproducers
- H04N13/366—Image reproducers using viewer tracking
- H04N13/378—Image reproducers using viewer tracking for tracking rotational head movements around an axis perpendicular to the screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/30—Image reproducers
- H04N13/366—Image reproducers using viewer tracking
- H04N13/383—Image reproducers using viewer tracking for tracking with gaze detection, i.e. detecting the lines of sight of the viewer's eyes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/23439—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements for generating different versions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/238—Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
- H04N21/2387—Stream processing in response to a playback request from an end-user, e.g. for trick-play
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/4223—Cameras
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
- H04N21/4355—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream involving reformatting operations of additional data, e.g. HTML pages on a television screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/441—Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/442—Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
- H04N21/44213—Monitoring of end-user related data
- H04N21/44218—Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/4508—Management of client data or end-user data
- H04N21/4532—Management of client data or end-user data involving end-user characteristics, e.g. viewer profile, preferences
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/462—Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
- H04N21/4621—Controlling the complexity of the content stream or additional data, e.g. lowering the resolution or bit-rate of the video stream for a mobile client with a small screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/4728—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/482—End-user interface for program selection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/65—Transmission of management data between client and server
- H04N21/658—Transmission by the client directed to the server
- H04N21/6587—Control parameters, e.g. trick play commands, viewpoint selection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/816—Monomedia components thereof involving special video data, e.g 3D video
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/90—Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
- G11B20/10527—Audio or video recording; Data buffering arrangements
- G11B2020/1062—Data buffering arrangements, e.g. recording or playback buffers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/698—Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/265—Mixing
Definitions
- The present invention relates generally to video processing and display.
- Video streaming is increasingly becoming one of the main ways that media contents are delivered and accessed. Video streaming traffic also accounts for a large portion of Internet bandwidth consumption.
- One embodiment provides a method for displaying content, comprising: determining a focal area of a viewer of a content item displayed on a display device, retrieving a low bit rate version of the content item, retrieving a portion of a high bit rate version of the content item corresponding to the focal area, combining the portion of the high bit rate version of the content item with the low bit rate version of the content item to generate a combined image, and causing the combined image to be displayed to the viewer via the display device.
- Another embodiment provides a system for displaying content, comprising: a display device, a sensor device, and a processor coupled to the display device and the sensor device, the processor being configured to: determine, with the sensor device, a focal area of a viewer of a content item displayed on the display device, retrieve a low bit rate version of the content item, retrieve a portion of a high bit rate version of the content item corresponding to the focal area, combine the portion of the high bit rate version of the content item with the low bit rate version of the content item to generate a combined image, and cause the combined image to be displayed to the viewer via the display device.
- Another embodiment provides a non-transitory computer readable storage medium storing one or more computer programs configured to cause a processor based system to execute steps comprising: determining a focal area of a viewer of a content item displayed on a display device, retrieving a low bit rate version of the content item, retrieving a portion of a high bit rate version of the content item corresponding to the focal area, combining the portion of the high bit rate version of the content item with the low bit rate version of the content item to generate a combined image, and causing the combined image to be displayed to the viewer via the display device.
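The client-side combining step recited above can be sketched as follows. This is a minimal illustration using plain 2D pixel arrays; the function name, tile layout, and coordinates are hypothetical, not taken from the patent.

```python
def combine(low_frame, high_tile, origin):
    """Overlay a high bit rate tile onto the low bit rate base frame.

    low_frame: 2D list of pixels (full-frame, low bit rate version).
    high_tile: 2D list of pixels (focal-area portion, high bit rate).
    origin:    (x, y) of the tile's top-left corner within the frame.
    """
    x0, y0 = origin
    out = [row[:] for row in low_frame]       # copy the base frame
    for dy, row in enumerate(high_tile):
        for dx, px in enumerate(row):
            out[y0 + dy][x0 + dx] = px        # focal area gets high quality
    return out

# Example: a 4x4 low bit rate frame (pixel value 0) combined with a 2x2
# high bit rate tile (pixel value 1) anchored at (1, 1).
low = [[0] * 4 for _ in range(4)]
combined = combine(low, [[1, 1], [1, 1]], (1, 1))
```

The base frame is copied rather than modified in place, so the low bit rate version remains available if the focal area moves before the next tile arrives.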
- Another embodiment provides a method for providing content, comprising: receiving a content item, generating a low bit rate version of the content item, receiving a content request from a playback device, the content request comprising an indication of a viewer focal area, selecting a portion of a high bit rate version of the content item based on the viewer focal area, and providing the low bit rate version of the content item and the portion of the high bit rate version of the content item to the playback device in response to the content request.
- Another embodiment provides a system for providing content, comprising: a memory device, a communication device, and a processor coupled to the memory device and the communication device, the processor being configured to: receive a content item, generate a low bit rate version of the content item, store a high bit rate version of the content item and the low bit rate version of the content item on the memory device, receive, via the communication device, a content request from a playback device, the content request comprising an indication of a viewer focal area, select a portion of the high bit rate version of the content item based on the viewer focal area, and provide the low bit rate version of the content item and the portion of the high bit rate version of the content item to the playback device in response to the content request.
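A toy sketch of the server-side flow recited above, with storage simulated by a dict and the high bit rate version held as tiles keyed by grid position; the names `ingest` and `handle_request` are illustrative placeholders, not the patent's interfaces.

```python
CONTENT = {}  # simulated server-side storage

def ingest(content_id, high_tiles, low_version):
    # Store the high bit rate version (as tiles) and the low bit rate
    # version generated from the received content item.
    CONTENT[content_id] = {"high": high_tiles, "low": low_version}

def handle_request(content_id, focal_tiles):
    """Answer a content request carrying the viewer focal area: return
    the low bit rate version plus only those high bit rate tiles that
    cover the reported focal area."""
    item = CONTENT[content_id]
    portion = {t: item["high"][t] for t in focal_tiles if t in item["high"]}
    return {"low": item["low"], "high_portion": portion}

ingest("vid1", {(0, 0): "tile00", (1, 0): "tile10"}, "low_stream")
response = handle_request("vid1", [(1, 0), (2, 0)])
```

Tiles requested but not stored are silently skipped here; a real server would instead validate the focal area against the content's tiling layout.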
- FIG. 1 is a process diagram illustrating a process for providing content in accordance with some embodiments of the present invention.
- FIG. 2 is a flow diagram illustrating a method for providing content in accordance with some embodiments of the present invention.
- FIG. 3 is a flow diagram illustrating a method for displaying content in accordance with some embodiments of the present invention.
- FIGS. 4A and 4B are illustrations of a content display area in accordance with some embodiments of the present invention.
- FIG. 5 is an illustration of image blending in accordance with some embodiments of the present invention.
- FIGS. 6A and 6B are illustrations of image cells in accordance with some embodiments.
- FIGS. 7A and 7B are illustrations of focal areas in accordance with some embodiments.
- FIG. 8 is a block diagram illustrating a system in accordance with some embodiments of the present invention.
- Digital video content may be stored and transmitted in a variety of formats. Factors such as the video's resolution, frame rate, coding format, compression scheme, and compression factor can affect the total size and bit rate of the video file.
- Bit rate generally refers to the number of bits used per unit of playback time to represent a continuous medium such as audio or video.
- The encoding bit rate of a multimedia file may refer to the size of the multimedia file divided by the playback time of the recording (e.g., in seconds).
- The bit rate of a video content file affects whether the video can be streamed without interruptions under the network bandwidth constraints between a streaming server and a playback device.
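As a worked example of the definitions above (the helper names and values are illustrative assumptions):

```python
def encoding_bit_rate(file_size_bytes, duration_s):
    """Average encoding bit rate in bits per second: file size in bits
    divided by playback time."""
    return file_size_bytes * 8 / duration_s

def can_stream(bit_rate_bps, bandwidth_bps):
    # Playback without interruption requires the available network
    # bandwidth to meet or exceed the content's bit rate (ignoring
    # buffering and protocol overhead).
    return bandwidth_bps >= bit_rate_bps

# A 90 MB file with 60 s of playback averages 12 Mbps.
rate = encoding_bit_rate(90_000_000, 60)   # 12_000_000.0 bits/s
```

On a 10 Mbps link this 12 Mbps stream would stall, which is exactly the constraint that motivates serving a low bit rate version of the full frame.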
- In step 111, video content is captured by a camera system.
- The camera system may comprise one or more of a conventional camera system, a stereoscopic camera system, a panoramic camera system, a surround view camera system, a 360-degree camera system, an omnidirectional camera system, and the like.
- In step 112, the captured video is encoded and transmitted to a server.
- The encoding performed in step 112 may comprise lossy or lossless video encoding.
- The video may comprise live-streamed or prerecorded video content.
- The camera may communicate with the server via wireless or wired means by way of a network, such as, for example, the Internet.
- The camera performing steps 111 and 112 may comprise a segmented video capture device such as those described in U.S. Provisional Patent Application No. 62/357,259, filed on Jun. 30, 2016, entitled “APPARATUS AND METHOD FOR CAPTURING AND DISPLAYING SEGMENTED CONTENT”, the entire disclosure of which is hereby fully incorporated by reference herein in its entirety.
- With a segmented video capture device, each captured video stream may be provided as a separate video stream to the server, or the streams may be combined into a single video stream prior to step 112.
- The server decodes the video content received from the camera.
- The decoded video may comprise a video in the originally captured resolution, frame rate, and/or bit rate.
- In step 122, the server reduces the bit rate of the decoded video stream.
- The bit rate of the video content may be reduced by one or more of: reducing the resolution of the video, reducing the frame rate of the video, and compressing the video with a compression algorithm.
- In step 123, the reduced bit rate video is encoded and prepared for streaming to a playback device.
- Steps 122 and 123 may comprise a single step. For example, an encoding algorithm may be used to reduce the bit rate of the received content.
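A first-order sketch of how the three reduction levers above compound. Real encoders are not linear in these factors, so this is only illustrative, and the scale parameters are assumptions:

```python
def reduced_bit_rate(base_bps, res_scale=1.0, fps_scale=1.0,
                     compression_scale=1.0):
    """Rough estimate of bit rate after reduction.

    res_scale:         fraction of the original pixel count kept
                       (0.25 = half width and half height).
    fps_scale:         fraction of the original frame rate kept.
    compression_scale: additional compression factor (0.5 = twice as
                       aggressive compression).
    """
    return base_bps * res_scale * fps_scale * compression_scale

# A 20 Mbps source at quarter resolution and half frame rate: ~2.5 Mbps.
low_rate = reduced_bit_rate(20_000_000, res_scale=0.25, fps_scale=0.5)
```

Any one lever alone may suffice; the point is that the server can trade visual fidelity of the full-frame version for a bit rate the network can sustain.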
- In step 125, one or more portions of the received video are extracted.
- Portions of a content item may generally refer to a spatial section of the video content display area.
- A portion of the content may comprise an area of the content display area spanning one or more frames.
- The extraction in step 125 may be performed by partially decoding the received content.
- Step 125 may be performed in response to receiving a viewer focal area from a playback device, and the extracted portion may correspond to the location of the viewer's focal area in the content.
- Alternatively, step 125 may be performed on the content preliminarily, and one or more portions may be extracted and stored for later retrieval by playback devices.
- The extracted portion is encoded and prepared for streaming to the playback device.
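Selecting which spatial portions to extract for a reported focal area can be sketched as a circle/tile overlap test; the tiling scheme, coordinates, and function name here are assumptions for illustration, not the patent's method:

```python
def tiles_in_focal_area(focal, tile_w, tile_h, cols, rows):
    """Return (col, row) indices of tiles overlapping a circular focal
    area given as (cx, cy, radius) in content pixel coordinates."""
    cx, cy, r = focal
    selected = []
    for row in range(rows):
        for col in range(cols):
            x0, y0 = col * tile_w, row * tile_h
            # Closest point of the tile rectangle to the focal center
            nx = min(max(cx, x0), x0 + tile_w)
            ny = min(max(cy, y0), y0 + tile_h)
            if (nx - cx) ** 2 + (ny - cy) ** 2 <= r * r:
                selected.append((col, row))
    return selected

# 4x4 grid of 100x100 tiles, focal area centered at (150, 150) with a
# 60 px radius: the focal tile plus its four edge neighbours qualify.
hits = tiles_in_focal_area((150, 150, 60), 100, 100, 4, 4)
```

The closest-point test avoids extracting diagonal neighbours the focal circle never reaches, keeping the high bit rate portion as small as possible.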
- high and low bit rates are relative terms referring to the relative bit rates of the at least two versions of a video content item provided from the server to a playback device.
- the server may generate at least one low bit rate version of the received video and extract at least a portion of a version of the content item having a higher bit rate as compared to the low bit rate version.
- multiple versions of a video content item having different bit rates may be created by the servers.
- bit rate reduction may also be performed on the received video prior to extracting portions of the content in step 125 and/or performed on the portion extracted in step 125 .
- a high bit rate version of the content item has a higher average bit rate than the low bit rate version of the content item over the duration of the video content.
- the bit rate of the high bit rate version of the content item may be higher than the low bit rate version of the content item for some or all of temporal segments of the video content.
- the video stream containing the extracted portion of the high bit rate version of the content item may have a lower bit rate as compared to the video stream comprising the low bit rate version of the content item.
- the portion of the high bit rate version of the content item may cover a significantly smaller display area of the content as compared to the low bit rate version, resulting in the lower bit rate of the extracted portion.
- the low bit rate version of the content item may comprise a lower resolution, frame rate, and/or compression quality as compared to the high bit rate version of the content item. In some embodiments, the low bit rate version of the content item may comprise a lower video quality and/or definition as compared to the high bit rate version of the content item. In some embodiments, the low and high bit rate versions of the content may comprise constant bit rate (CBR) or variable bit rate (VBR) video streams.
- the server may communicate with the playback device by way of a network, such as for example the Internet.
- the playback device receives and decodes a low bit rate version of the video content and a portion of a high bit rate version of the video content.
- the portion of the high bit rate version of the video content may be selected based on the focal area of a viewer viewing the content via the playback device.
- the focal area of a viewer refers to an area of the viewer's field of vision that is or is likely to be in focus while the viewer views the content.
- the focal area may correspond to one or more of the central, paracentral, macular, near peripheral, and mid peripheral areas of the viewer's field of vision.
- the focal area of the viewer may be detected by a sensor device coupled to the playback device.
- data recorded by a capture device of the content item may be compared to the viewer's eye and/or head direction to determine the portion of the high bit rate video content to extract for the playback device.
- the low bit rate version of the video content and the portion of the high bit rate version of the video content may be transmitted as separate video streams from the server to the playback device.
- in step 132, the low bit rate version of the video content and the portion of the high bit rate version of the video content are combined.
- combining the video streams comprises combining the low bit rate version of the content item with the portion of the high bit rate version at the location in the content display area from which the high bit rate portion was extracted.
- step 132 comprises blending the two video streams by including a transition area between the high and low bit rate areas of the image to reduce the noticeability of the border between the two versions of the video content.
- step 132 further comprises scaling the low bit rate version of the video content to the resolution and/or frame rate of the high bit rate version of the content prior to combining the images.
- the combined image is displayed to the viewer.
- the combined image may be displayed via one or more of a flat screen display, a curved display, a dome display device, a head-mounted display device, an augmented reality display device, a virtual reality display device, and the like.
- the combined image may be viewed by a head mounted display such as the systems and devices described in U.S. patent application Ser. No. 15/085,887, filed on Mar. 30, 2016, entitled “Head-Mounted Display Tracking,” the entire disclosure of which is hereby fully incorporated by reference herein in its entirety.
- the high bit rate portion of the video content may be combined with the low bit rate version of the content at the server and encoded as a single video stream for transmission. While the resolution and the frame rate of such video streams may not be reduced as compared to a full high bit rate version, the overall size of the transmitted video stream may still be reduced by processing the area of the content outside of the focal area with a more lossy video compression algorithm before recombining the images.
- the portion of the content item corresponding to the user's focal area is provided at a relatively high bit rate and the remaining area of the content is provided at a relatively low bit rate.
- the network bandwidth demand for achieving interruption-free video streaming may be reduced by decreasing the overall bit rate of the streaming video content while maintaining the video quality in the focal area of the viewer's field of vision.
- referring to FIG. 2, a method for providing content is shown.
- the steps in FIG. 2 may generally be performed by a processor-based device such as one or more of a computer system, a server, a cloud-based server, a content host, a streaming service host, a media server, and the like.
- the steps in FIG. 2 may be performed by one or more of the content server 810 and the playback device 820 described with reference to FIG. 8 , the server described with reference to FIG. 1 , and/or other similar devices.
- the system receives a content item.
- the content item may comprise one or more of a movie, a TV show, a video clip, prerecorded video content, streaming video content, live-streamed video content, and the like.
- the video content may comprise a single video stream or a plurality of video streams captured by one or more of a stereoscopic camera system, a panoramic camera system, a surround view camera system, a 360-degree camera system, an omnidirectional camera system, and the like.
- the content item may be encoded via any encoding scheme such as MPEG, WMV, VP8, and the like.
- the system may further be configured to decode the received content item according to various encoding schemes in step 210.
- in step 220, the system generates a low bit rate version of the content item.
- the bit rate of the received content may be reduced by one or more of: reducing the resolution of the video, reducing the frame rate of the video, and compressing the video with a lossy compression algorithm.
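As a rough sketch of step 220, the listed bit rate reductions (lower resolution, lower frame rate) can be modeled by subsampling raw frames; a real implementation would instead re-encode with a lossy codec. Frames are modeled as 2D pixel arrays and all names are illustrative:

```python
def reduce_bit_rate(frames, spatial_step=2, temporal_step=2):
    """Crude bit rate reduction: drop frames (lower frame rate) and
    subsample pixels (lower resolution)."""
    return [[row[::spatial_step] for row in frame[::spatial_step]]
            for frame in frames[::temporal_step]]
```

Running this with the default steps on a segment halves the frame rate and halves each spatial dimension, roughly an 8x reduction in raw data.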
- lossy compression generally means that the compressed video lacks some information present in the original video.
- multiple low bit rate versions of the content item may be generated in step 220 and stored for retrieval by playback devices.
- the system receives a content request.
- the content request may be received from a playback device such as a game console, a personal computer, a tablet computer, a television, a head mounted display (“HMD”), an augmented reality device, a virtual reality device, a wearable device, a portable user device, a smartphone, etc.
- the content request may identify one or more of the content item being requested, the requested temporal segment, an indication of the viewer's focal point and/or area, and/or authentication information.
- the content request may be similar to a conventional streaming content request.
- the content request may comprise an indication of the viewer's focal area which may correspond to a point or an area in the content display area.
- the indication of the viewer's focal area may comprise a coordinate or a set of coordinates within the dimension of a frame of the content.
- the indication of the viewer's focal area may be represented by a viewing angle.
- the focal area may be determined based on a sensor device associated with the playback device comprising one or more of an eye tracking sensor and a head tracking sensor.
- the low bit rate version of the content is provided to the playback device in response to the content request received in step 230 .
- multiple low bit rate versions of the content item may be generated in step 220 .
- the system may select from among the multiple low bit rate versions of the content item based on one or more of: the current or estimated network throughput between the playback device and the server, the available bandwidth at the server and/or the playback device, the requested video quality specified in the content request, the playback device's processing capacity, user settings, etc.
- the selection of the low bit rate version of the content item from a plurality of versions may be similar to conventional adaptive bit rate streaming methods.
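The selection among multiple low bit rate versions can be sketched, as in conventional adaptive bit rate streaming, by picking the highest bit rate that fits within the measured throughput. The function and parameter names are assumptions:

```python
def select_version(versions_kbps, throughput_kbps):
    """Pick the highest bit rate version that fits the measured network
    throughput, falling back to the lowest available version."""
    fitting = [v for v in versions_kbps if v <= throughput_kbps]
    return max(fitting) if fitting else min(versions_kbps)
```

The same selection could also weigh server load, requested quality, and playback device capacity as listed above.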
- the system selects a portion of the high bit rate version of the content item based on the content request.
- the high bit rate version of a content item generally refers to a version of the content with a higher bit rate as compared to the low bit rate content provided in step 240.
- the high bit rate version of the content item may comprise a higher average bit rate than the low bit rate version of the content over the duration of the video content.
- the bit rate of the high bit rate version of the content item may be higher than the low bit rate version of the content.
- the high bit rate version of the content may comprise the original content received in step 210 .
- the high bit rate version of the content item may also comprise a reduced bit rate version of the originally received content item.
- the portion of the content selected in step 250 may be selected based on the viewer's focal area comprising one or more of a detected focal point and a predicted future focal point.
- the predicted future focal point may be predicted by the server and/or the playback device.
- the future focal point may be predicted based on one or more of the viewer's gaze path history, a gaze path profile associated with the viewer, gaze path data collected from a plurality of viewers, and a content provider provided standard gaze path. Examples of predicting the viewer's future focal point are described in U.S. patent application Ser. No. ______, filed on the same date as this application, entitled “APPARATUS AND METHOD FOR GAZE TRACKING”, by inventor Dennis D. Castleman, and identified by Attorney Docket No. 138627 [SCEA16004US00], the entire disclosure of which is hereby fully incorporated by reference herein in its entirety.
- a portion of the content may generally refer to a spatial portion of the display content area such as a set of pixels within a frame. In some embodiments, a portion may comprise the same part of the display content area spanning a plurality of frames. In some embodiments, the portion selected in step 250 may generally correspond to the location of a viewer's focal area in the content display area. In some embodiments, the displayed area of the content may be divided into a plurality of sections. For example, the displayed area of the content may be divided into quadrants, 3×3 grids, 5×5 grids, etc. In some embodiments, one or more sections of the content display area that overlap the focal area of the viewer may be selected to comprise the portion of the high bit rate version of the content item provided to the playback device. In some embodiments, the focal area and/or the extracted portion of the content may comprise any shape and size. Examples of focal areas and portions extracted from content items are described in more detail with references to FIGS. 4A-4B and FIGS. 7A-7B herein.
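One way to realize the grid-based selection described above is to pick every grid section that intersects a circular approximation of the focal area. This sketch assumes a square grid and a circular focal area; the function and parameter names are illustrative:

```python
def sections_overlapping(focal_x, focal_y, radius, width, height, grid=3):
    """Return the set of (row, col) grid sections that overlap a circular
    focal area of the given radius centred on (focal_x, focal_y)."""
    cell_w, cell_h = width / grid, height / grid
    selected = set()
    for row in range(grid):
        for col in range(grid):
            # nearest point of this cell to the focal centre
            nx = min(max(focal_x, col * cell_w), (col + 1) * cell_w)
            ny = min(max(focal_y, row * cell_h), (row + 1) * cell_h)
            if (nx - focal_x) ** 2 + (ny - focal_y) ** 2 <= radius ** 2:
                selected.add((row, col))
    return selected
```

A focal area near a grid corner naturally selects several sections, matching the note above that the provided portion may span multiple sections.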
- the system may further select from a plurality of original and/or reduced bit rate versions of the content to extract the selected portion based on one or more of: the current or estimated network throughput between the playback device and the server, the available bandwidth at the server and/or the playback device, a requested video quality specified in the content request, the playback device's processing capacity, and user settings.
- the portion of the high bit rate version may be extracted from one of the reduced bit rate versions generated in step 220 .
- the high bit rate version of the content item may generally be selected from versions of the content item with higher bit rate as compared to the low bit rate version of the content item selected in step 240 .
- the system may be configured to provide two or more portions of the high bit rate version of the content item in step 270 .
- the system and/or the playback device may predict two or more likely future focal areas of the viewer.
- the system may then select two or more portions of the high bit rate version of the content item based on the two or more likely future focal areas of the viewer in step 250 .
- the playback device may be configured to select from among the provided portions shortly before playback based on the detected focal area.
- the system determines whether the selected portion has been previously cached in the system. In some embodiments, when a portion of the high bit rate version of the content is extracted, the system may cache the portion for later use. In some embodiments, the system may preliminarily generate a plurality of extracted portions of the high bit rate version of the content item based on predicting the locations that viewers are likely to focus on in the displayed content. For example, preliminarily extracted portions may correspond to high activity areas and/or foreground areas of the displayed content. In some embodiments, the cached portions may each comprise an encoded video stream. In some embodiments, the system may be configured to automatically purge extracted portions that have not been used for a set period of time (e.g. hours, days, etc.).
- each cached portion of the high bit rate portion may be identified and retrieved with an area identifier and a time stamp identifier (e.g. section 3B, time 00:30:20-00:30:22).
- portions of the high bit rate version of the content may be stored in an encoded form in the cache and be made directly available for streaming to playback devices. If the selected portion has been previously cached, the system may provide the cached portion to the playback device in step 270 .
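The caching behavior of this step can be sketched as a keyed store that uses the area identifier and time stamps as the cache key, extracting and encoding a portion only on the first request. The helper names and key format below are assumptions modeled on the "section 3B, time 00:30:20-00:30:22" example above:

```python
cache = {}

def portion_key(section_id, start_ts, end_ts):
    """e.g. portion_key('3B', '00:30:20', '00:30:22')"""
    return f"{section_id}:{start_ts}-{end_ts}"

def get_portion(section_id, start_ts, end_ts, extract):
    """Serve a previously extracted and encoded portion from the cache,
    invoking the (expensive) extraction only on a cache miss."""
    key = portion_key(section_id, start_ts, end_ts)
    if key not in cache:
        cache[key] = extract(section_id, start_ts, end_ts)
    return cache[key]
```

A purge policy for stale entries, as described above, would evict keys whose last access time exceeds a set threshold.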
- the system extracts a portion of the high bit rate version of the content in step 280 .
- the portion may be extracted from the content received in step 210 .
- the portion may be extracted from one of the reduced bit rate versions of the originally received content.
- the portion may be extracted by first decoding the received content.
- the system may be configured to partially decode and extract a portion of the content from an encoded version of the content item.
- step 280 may further comprise processing the extracted portion to include a plurality of empty/transparent pixels or cells around the edge of the extracted portion.
- step 280 may further comprise separately encoding the extracted portion for streaming.
- the encoded portion of the high bit rate version of the content item may then be provided to the playback device in step 270 .
- the portion of the high bit rate version of the content item may be provided in a plurality of encoded video streams each corresponding to a predefined area (e.g. a cell in a grid) of the content display area.
- steps 270 and 240 may occur at substantially the same time to provide corresponding temporal segments of the same content item to the playback device.
- the low bit rate version of the content may be provided and buffered at the playback device prior to the corresponding high bit rate portion of the content item being provided in step 270 .
- the portion of the high bit rate version of the content item and the low bit rate version of the content item may be provided as two separately encoded and transmitted video streams.
- portions of the high bit rate version of the content item and the low bit rate version of the content item may be provided from different parts of a server system.
- a central server may be configured to stream low bit rate versions of content items to playback devices while a plurality of geographically dispersed server devices may be configured to extract and/or provide portions of the high bit rate versions of the same content item to nearby playback devices.
- steps 210 through 270 may be repeated for multiple content items.
- steps 250 - 270 may be repeated periodically as a viewer views a content item at the playback device.
- the playback device may periodically (e.g. every few milliseconds, seconds, frames, etc.) update the focal area of the viewer at the server, and the system may select a different portion of the high bit rate version of the content item based on the updated focal area of the viewer.
- the playback device may be configured to detect a change in the focal area and only notify the server when the location of the focal area changes. In some embodiments, if no focal area is detected, the system may skip steps 250-270 and only provide the low bit rate version of the content item to the playback device.
- the system may further select the lowest bit rate version of the content item to provide to the playback device in step 240 to reduce network bandwidth usage.
- the system may adjust the bit rate of the low and/or high bit rate versions of the content provided to reduce interruptions.
- referring to FIG. 3, a method for displaying content is shown.
- the steps in FIG. 3 may generally be performed by a processor-based device such as one or more of a game console, a personal computer, a tablet computer, a television, a head mounted display (“HMD”), an augmented reality device, a virtual reality device, a wearable device, a portable user device, a smartphone, a mobile device, and the like.
- the steps in FIG. 3 may be performed by one or more of the content server 810 and the playback device 820 described with reference to FIG. 8 , the playback device described with reference to FIG. 1 , or other similar devices.
- the system determines a focal area of a viewer.
- the focal area may be determined based on a sensor device comprising one or more of an eye tracking sensor and a head tracking sensor.
- the head direction of the user may be determined by a head tracker device comprising one or more of an Inertial Measurement Unit (IMU), an accelerometer, gyroscope, an image sensor, and a range sensor.
- an IMU may comprise an electronic device that measures and reports a body's specific force, angular rate, and/or magnetic field surrounding the body, using a combination of accelerometers and gyroscopes, sometimes also magnetometers.
- the head tracker device may be coupled to a head mounted display (HMD) worn by the user.
- the gaze location of the user may be determined by an eye tracker device comprising one or more of an image sensor, an optical reflector sensor, a range sensor, an electromyography (EMG) sensor, and an optical flow sensor.
- the focal area may be determined based on one or more of a detected focal point and a predicted future focal point. In some embodiments, the future focal point may be predicted based on one or more of the viewer's gaze point history, a gaze path profile associated with the viewer, gaze path data collected from a plurality of viewers, and a content provider provided standard gaze path. In some embodiments, the focal area may be represented by a point of focus in a 2D or 3D space. In some embodiments, the focal area may be represented as a 3D angle such as a direction represented by a spherical azimuthal angle (θ) and polar angle (φ). In some embodiments, the focal area may be represented by a 2D polar angle (θ).
- the focal area may correspond to the pitch, yaw, and roll of the viewer's head, eyes, and/or the display device.
- the system may compare the IMU data of the recorded content and the IMU data of the display device to determine the focal area of the viewer relative to the content.
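For equirectangular content, the IMU comparison described above might reduce to mapping the viewer's head direction, relative to the capture orientation, onto frame coordinates. This sketch assumes yaw/pitch in degrees and an equirectangular projection; it is one possible interpretation, not the specification's method:

```python
def focal_point_equirect(view_yaw, view_pitch, cap_yaw, cap_pitch, width, height):
    """Map the viewer's head direction, taken relative to the capture
    device's recorded IMU orientation, onto equirectangular pixel
    coordinates of the content frame."""
    rel_yaw = (view_yaw - cap_yaw) % 360.0
    rel_pitch = max(-90.0, min(90.0, view_pitch - cap_pitch))
    x = rel_yaw / 360.0 * width
    y = (90.0 - rel_pitch) / 180.0 * height
    return x, y
```

The resulting (x, y) focal point could then drive the grid-section selection on the server side.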
- the size of the focal area may further be determined based on the viewer's distance from the display device. For example, for a television display, a smaller focal area may be associated with a viewer sitting 5 feet away from the screen while a larger focal area may be associated with a viewer sitting 10 feet away.
- the focal area may be approximated to an area of fixed size and shape around the user's focal point.
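A distance-dependent focal area size, as described above, could be computed from the on-screen radius subtended by a fixed visual angle. The 9-degree half-angle below (roughly the central and paracentral regions of vision) is an assumed constant, not a value from the specification:

```python
import math

def focal_radius_px(distance_m, pixels_per_metre, half_angle_deg=9.0):
    """On-screen radius (in pixels) subtended by a fixed visual half-angle
    at the given viewing distance: farther viewers get a larger area."""
    return distance_m * math.tan(math.radians(half_angle_deg)) * pixels_per_metre
```

This matches the example above: doubling the viewing distance doubles the focal area radius on the screen.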
- in step 320, the playback device retrieves a low bit rate version of a content item.
- a playback device sends a content request to a server hosting the content item in step 320 to retrieve the content item.
- the low bit rate version of the content item may comprise a reduced bit rate version of the content item generated by a content provider and/or the hosting service.
- step 320 may occur prior to step 310 and the low bit rate version of the content item may begin to be downloaded, buffered, and/or viewed prior to the focal area of the viewer being determined.
- step 320 may correspond to step 240 described with reference to FIG. 2 herein.
- the playback device retrieves a portion of a high bit rate version of the content item.
- the playback device sends a content request identifying the focal area of the viewer determined in step 310 to a server to retrieve the portion of the high bit rate version of the content item.
- the retrieved portion may comprise a spatial portion of the content selected based on the focal area of the viewer.
- the retrieved portion may comprise a short temporal segment of an area of the content item (e.g. milliseconds, seconds, frames, etc.).
- the portion of the high bit rate version of the content item may be retrieved in a video stream separately encoded from the low bit rate version of the content item retrieved in step 320 .
- the low bit rate version of the content item may buffer ahead of the retrieval of the high bit rate version of the content item.
- step 330 may correspond to step 270 described with reference to FIG. 2 herein.
- in step 340, the system combines the portion of the high bit rate version of the content item with the low bit rate version of the content item to generate a combined image.
- the system first decodes the portion of the high bit rate version of the content item retrieved in step 330 and the low bit rate version of the content item retrieved in step 320 .
- the system may first adjust the resolution and/or frame rate of at least one of the versions prior to combining the images.
- the system may increase the resolution and/or frame rate of the low bit rate version of the content item to match the resolution and/or frame rate of the high bit rate portion by up-sampling and/or interpolating the decoded low bit rate version of the content item.
- the system may combine the two versions of the content item by replacing the pixels in the frames of the low bit rate version of the content item with pixels from the corresponding frames of the portion of the high bit rate version of the content item.
- the frames may be identified and matched by time stamps.
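The scaling and pixel replacement of step 340 can be sketched as nearest-neighbour up-sampling of the low bit rate frame followed by pasting the high bit rate portion at the location it was extracted from. Frames are modeled as 2D arrays and all names are illustrative:

```python
def upscale_nn(frame, factor):
    """Nearest-neighbour up-sampling of a decoded low bit rate frame."""
    return [[px for px in row for _ in range(factor)]
            for row in frame for _ in range(factor)]

def combine(low_frame, high_portion, x0, y0):
    """Paste the decoded high bit rate portion over the (already up-scaled)
    low bit rate frame at its original location (x0, y0)."""
    out = [row[:] for row in low_frame]
    for dy, row in enumerate(high_portion):
        for dx, px in enumerate(row):
            out[y0 + dy][x0 + dx] = px
    return out
```

A production implementation would interpolate rather than replicate pixels, and would blend rather than hard-paste at the border, as described next.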
- the image may further be blended to reduce the appearance of a border between the two versions of the content item.
- the system blends the versions of the content item by generating a transition area between the portion of the high bit rate version of the content and the low bit rate version of the content. In the transition area, the pixels containing information from the high bit rate version may gradually decrease from the high bit rate area towards the low bit rate area of the displayed content.
- blending the portion of the high bit rate version of the content items with the low bit rate version of the content item may comprise grouping pixels into triangular cells for blending. Examples of the transition areas and blending are described with reference to FIGS. 5 and 6A-6B herein.
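The gradual decrease of high bit rate information across the transition area can be sketched as a per-pixel linear alpha ramp. This is one possible interpretation; the specification also describes interleaving whole grouped cells rather than per-pixel mixing:

```python
def blend_transition(high_px, low_px):
    """Linearly fade from full high bit rate contribution at the inner edge
    of the transition band to full low bit rate contribution at the outer
    edge. Assumes at least two pixels across the band."""
    n = len(high_px)
    if n == 1:
        return [high_px[0]]
    return [h * (1 - i / (n - 1)) + l * (i / (n - 1))
            for i, (h, l) in enumerate(zip(high_px, low_px))]
```

Applied along each ray crossing the transition area, this removes the hard edge between the two versions of the content.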
- the high bit rate portion may be provided in a pre-blended form from the server. For example, edges of the high bit rate portion may comprise a plurality of empty/transparent pixels with graduated density. The playback device may then overlay the high bit rate portion with the transparent pixels onto the low bit rate version of the content item without further processing the images and achieve the blended effect.
- the combined image is displayed on a display device.
- the display device may comprise one or more of a monitor, a television set, a projector, a head mounted display (HMD), a virtual reality display device, a wearable device, a display screen, a mobile device, and the like.
- the system may further adjust the combined image based on the display device's specifications. For example, for virtual reality display devices, the system may adjust for the warp and distortions associated with the device.
- steps 310 to 350 may be repeated continuously as a viewer views a content item.
- different portions of the high bit rate version of the content item may be retrieved in step 330 and combined with the low bit rate version in step 340 over time.
- step 320 may occur independently of steps 310 and 330 .
- the system may only retrieve the low bit rate version of the content item to display and skip steps 330 - 350 until a focal point is detected again.
- the system may further be configured to determine a view area of the viewer and retrieve only a portion of the low bit rate content based on a view area of the viewer in step 320 .
- the view area of the viewer may be determined based on one or more of eye tracking and head tracking similar to the determination of the focal area in step 310 .
- the view area of the viewer may generally refer to the area of the content that is visible to the user but may or may not be in focus in the viewer's field of vision.
- the view area may comprise an area surrounding the focal area.
- the portion of the low bit rate version of the content item retrieved may exclude areas of the content area not within the view area.
- the portion of the low bit rate version of the content item retrieved may further exclude the focal area and only include the area that is assumed to be visible to the viewer but not in focus.
- the retrieved portion of the low bit rate version of the content item may correspond to one or more of the near, mid, and far peripheral vision area of the viewer's field of vision.
- the content area 400 represents the entire image area of a content item. While the content area 400 is shown to be a rectangle, in some embodiments, the content area 400 may correspond to a cylinder, a sphere, a semi-sphere, etc. for immersive content and/or omnidirectional video content.
- the content area 400 may generally comprise any shape, aspect ratio, and size without departing from the spirit of the present disclosure.
- the focal point 410 represents the viewer's point of focus within the content. In some embodiments, the focal point 410 may correspond to a detected focal point and/or a predicted focal point.
- the focal area 412 represents an area around the focal point 410 that is likely to be in focus within the viewer's field of vision.
- the focal area may comprise one or more of the central, paracentral, macular, near peripheral, and mid peripheral areas of the viewer's field of vision.
- the size and shape of the focal area 412 are shown as examples only. The relative sizes of the focal area 412 and the content area 400 may also vary.
- the shape and size of the focal area 412 may be calibrated for each individual user and/or be estimated based on the viewer's profile containing one or more of viewer demographic information, viewing habits, user feedback, user settings, etc.
- the size of the focal area 412 may further be determined based on the viewer's distance from the display screen. In some embodiments, for display device types with a fixed distance between the eyes of the viewer and the display screen (e.g. HMDs), the size of the focal area 412 may generally be assumed to remain the same.
- the playback device may be configured to retrieve a portion of the high bit rate version of the content item corresponding to the focal area 412 .
- the content area 400 may be divided into a grid comprising a plurality of sections.
- sections of the content area 400 overlapping the focal area 412 may comprise the portion of the high bit rate version of the content item retrieved by the playback device.
- the high bit rate version of the content item may be displayed in the portion of the content area corresponding to the focal area 412 and the low bit rate version of the content item may be displayed in the remaining portion of the content area 400 .
- the high bit rate area may not be an exact match to the size and shape of the focal area 412 but may generally substantially cover the focal area 412 .
- the portion of the high bit rate version of the content item may be extracted to closely match the shape and size of the focal area 412 .
- referring to FIG. 4B, another illustration of a content display area is shown.
- the content area 400 , the focal point 410 , and the focal area 412 in FIG. 4B may generally be similar to the corresponding elements in FIG. 4A .
- the system may further determine a view area 414 surrounding the focal area 412 as shown in FIG. 4B.
- the view area 414 may generally refer to the area of the content that is visible to the user but may or may not be in focus in the viewer's field of vision.
- the portion of the low bit rate version of the content item retrieved may exclude areas of the content area 400 outside of the view area 414 .
- the portion of the low bit rate version of the content item retrieved may further exclude the focal area 412 and only include the area that is assumed to be visible to the viewer but not in focus.
- the view area may correspond to one or more of the near, mid, and far peripheral vision area of the viewer's field of vision.
- the content area 400 may correspond to an immersive video content and/or an omnidirectional video content captured by a plurality of image sensors.
- the view area 414 may be used to select and stitch a plurality of separately encoded video streams as described in U.S. Provisional Patent Application No. 62/357,259, filed on Jun. 30, 2016, entitled “APPARATUS AND METHOD FOR CAPTURING AND DISPLAYING SEGMENTED CONTENT” the entire disclosure of which is hereby fully incorporated by reference herein in its entirety.
- the view area 414 overlaps two of the four video streams captured by a multi-camera system, the low bit rate version of the content item retrieved may comprise only the two corresponding streams.
- the focal area 412 may also comprise data from a plurality of separately encoded video streams that are stitched at the playback device.
- FIG. 5 may represent a combined image displayed in step 350 of FIG. 3 .
- the displayed image comprises a low bit rate area 510 , a high bit rate area 512 , and a transition area 511 .
- pixels containing information from the high bit rate area 512 may gradually decrease from the high bit rate area 512 toward the low bit rate area 510 .
- blending the portion of the high bit rate version of the content with the low bit rate version of the content item comprises grouping pixels in the transition area 511 into cells for blending.
- each set of grouped pixels may contain data from one of the versions of the content item or the other.
- the size and shape of the transition area 511 are shown as an example only, and the transition area 511 may be of any size, shape, and thickness.
- the transition area 511 surrounds the high bit rate area and includes interleaved data from both the high bit rate area 512 and the low bit rate area 510 to reduce the appearance of a border between the two areas.
- FIG. 6A shows a sphere divided into a plurality of triangular cells.
- the sphere may correspond to the content area of an omnidirectional and/or immersive video content.
- each cell may comprise a unit for blending images.
- triangular cells better adapt to the curvature of a sphere and are less noticeable to human eyes as compared to square or rectangular cells.
- the triangular cells may further be subdivided into smaller triangular cells to provide for adjustable granularity in blending.
- FIG. 6B illustrates blending using triangular cells.
- the cells in FIG. 6B may represent a section of a transition area between two versions of a content item. In FIG. 6B, cells labeled with “1” may contain data from one version of a content item and cells labeled with “2” may contain data from a different version of the content item.
- each cell in FIG. 6B may be subdivided into smaller triangular cells for more granular blending.
- a transition area may have any number of rows or columns of triangular cells.
- each cell shown in FIGS. 6A and 6B may be merged or subdivided to form triangular cells of different sizes for blending images.
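The subdivision of a triangular cell into smaller triangles, as described above, can be sketched by connecting edge midpoints. This is an illustrative example only; the coordinates and names are hypothetical, and cells on an actual sphere would be spherical triangles rather than planar ones.

```python
def midpoint(p, q):
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def subdivide(tri):
    """Split one triangle (three 2D vertices) into four smaller triangles
    by connecting the midpoints of its edges."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

cells = [((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))]
for _ in range(2):                     # two levels of subdivision
    cells = [t for tri in cells for t in subdivide(tri)]
# 1 triangle -> 4 -> 16 progressively smaller blending cells
```

Each subdivision level quadruples the cell count, which is one way the blending granularity could be made adjustable.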
- the focal area of a viewer may be determined based on the area of the content that is likely to be in focus in a viewer's field of vision.
- the focal area is approximated to an oval.
- the focal area may be approximated to a circle, a square, etc. by the system.
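A focal area approximated as an oval can be tested with a standard ellipse membership check. This sketch assumes an axis-aligned oval and illustrative pixel coordinates; none of the names come from the disclosure.

```python
def in_focal_area(x, y, cx, cy, rx, ry):
    """True if pixel (x, y) lies inside an axis-aligned ellipse centered at
    (cx, cy) with horizontal semi-axis rx and vertical semi-axis ry."""
    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0

# Hypothetical focal point at (100, 100) with a wider-than-tall oval,
# reflecting that the field of vision is wider horizontally.
inside = in_focal_area(110, 100, cx=100, cy=100, rx=40, ry=25)   # True
outside = in_focal_area(150, 100, cx=100, cy=100, rx=40, ry=25)  # False
```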
- FIGS. 7A and 7B illustrate other shapes that may represent the shape of the focus area used by the system.
- the shape shown in FIG. 7A approximates the shape of a human's field of vision with two merged ovals having aligned major axes.
- the retrieved portion of the high bit rate content item discussed herein may correspond to one or more of the shapes shown in FIGS. 4A-4B, 7A-7B, a circle, a square, a rectangle, and the like.
- Referring next to FIG. 8, there is shown a system for providing and displaying content that may be used to run, implement and/or execute any of the methods and techniques shown and described herein in accordance with some embodiments of the present invention.
- the system includes a content server 810 and a playback device 820 communicating over a data connection such as a network.
- the content server 810 includes a processor 812 , a memory 813 , and a communication device 814 .
- the content server 810 may generally comprise one or more processor-based devices accessible by the playback device via a network such as the Internet.
- the content server may comprise one or more of a cloud-based server, a content host, a streaming service host, a media server, a streaming video server, a broadcast content server, a social networking server, and the like.
- the processor 812 may comprise one or more of a control circuit, a central processor unit, a graphical processor unit (GPU), a microprocessor, a video decoder, a video encoder and the like.
- the memory 813 may include one or more volatile and/or non-volatile computer readable memory devices. In some embodiments, the memory 813 stores computer executable code that causes the processor 812 to provide content to the playback device 820.
- the communication device 814 may comprise one or more of a network adapter, a data port, a router, a modem, and the like. Generally, the communication device 814 may be configured to allow the processor 812 to communicate with the playback device 820 . In some embodiments, the processor 812 may be configured to provide a low bit rate version of a content item and a portion of a high bit rate version of the content item to the playback device 820 based on a request from the playback device 820 .
- the request may comprise an identification of the requested content item and/or an indication of a focal area of the viewer of the content item.
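One hypothetical shape for such a content request is sketched below. The field names and values are illustrative only and are not taken from the disclosure.

```python
import json

# Illustrative content request from a playback device to the content server:
# it identifies the requested content item and indicates the viewer's focal area.
request = {
    "content_id": "movie-123",       # identification of the requested content item
    "segment": 42,                   # requested temporal segment
    "focal_area": {                  # indication of the viewer's focal area
        "x": 0.62,                   # normalized frame coordinates
        "y": 0.40,
        "shape": "oval",
    },
}
payload = json.dumps(request)        # serialized for transmission over the network
```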
- the processor 812 may be configured to generate and/or store at least one of the low bit rate version of the content item and one or more portions of the high bit rate version of the content item based on a received content item.
- the memory 813 and/or a separate content library may store one or more content items each comprising at least two versions of the content item having different bit rates.
- the content server 810 may be configured to stream the content recorded by a capture device to the playback device 820 in substantially real-time.
- the content server 810 may be configured to host a plurality of prerecorded content items for streaming and/or downloading to the playback devices 820 on-demand. While only one playback device 820 is shown in FIG. 8 , the content server 810 may be configured to simultaneously receive content from a plurality of capture devices and/or provide content to a plurality of playback devices 820 via the communication device 814 .
- the content server 810 may be configured to facilitate peer-to-peer transfer of video streams between capture devices and playback devices 820 .
- the low bit rate version of the content item may be transferred via a peer-to-peer network while portions of the high bit rate content item may be transferred via the content server 810 .
- the content server 810 may be configured to provide the low bit rate version of the content item and the portion of the high bit rate version of the content item in separately encoded video streams.
- the content server 810 may further be configured to pre-process the content item before providing the content item to the playback device 820 .
- the content server 810 may soften the edges of the extracted portion of the high bit rate version of the content item by including empty/transparent pixels at the edges prior to providing the portion of the high bit rate content to the playback device 820.
- the playback device 820 may blend the video streams by simply combining the pixel data from the two versions without performing further image processing.
- the content server 810 may be configured to combine a low bit rate version of a content item with a portion of the high bit rate version of the content prior to providing the combined content to the playback device 820 .
- While one content server 810 is shown, in some embodiments, functionalities of the content server 810 may be implemented on one or more processor-based devices. In some embodiments, the content servers 810 for providing low bit rate versions of contents and for providing high bit rate versions of contents may be separately implemented. For example, a central content server may be configured to provide low bit rate versions of contents while a plurality of geographically distributed content servers may be configured to provide portions of the high bit rate versions of contents to playback devices.
- the playback device 820 comprises a processor 821 , a memory 823 , a display device 825 , and a sensor device 827 .
- the playback device 820 may generally comprise a processor-based device such as one or more of a game console, a media console, a set-top box, a personal computer, a tablet computer, a television, a head mounted display (“HMD”), an augmented reality device, a virtual reality device, a wearable device, a portable user device, a smartphone, etc.
- the processor 821 may comprise one or more of a control circuit, a central processor unit (CPU), a graphical processor unit (GPU), a microprocessor, a video decoder and the like.
- the memory 823 may include one or more volatile and/or non-volatile computer readable memory devices.
- the memory 823 stores computer executable code that causes the processor 821 to determine a focal area of a user and retrieve a content item from the content server 810.
- the playback device 820 may be configured to retrieve a low bit rate version and a portion of a high bit rate version of the content item from the content server 810 and/or from a local storage and combine the two versions to generate a combined image to display to the user via the display device 825 .
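Before the two versions can be combined, the low bit rate version may need to be scaled up to the high bit rate version's resolution. The sketch below uses nearest-neighbor replication purely for illustration; an actual playback device would more likely scale on the GPU, and all names here are hypothetical.

```python
def upscale(frame, factor):
    """Scale a 2D frame up by an integer factor using nearest-neighbor
    replication: each source pixel becomes a factor-by-factor block."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(factor)]  # widen each row
        out.extend([wide[:] for _ in range(factor)])    # repeat it vertically
    return out

small = [[1, 2], [3, 4]]      # stand-in low bit rate frame
big = upscale(small, 2)       # 2x2 -> 4x4
```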
- the memory 823 may comprise a buffer for buffering one or more versions of the content item retrieved from the content server 810 .
- the computer executable code stored in the memory 823 may comprise one or more of a computer program, a software program, a playback device firmware, a mobile application, a game and/or media console application, etc.
- the display device 825 may comprise a device for displaying content to a viewer.
- the display device 825 may comprise one or more of a monitor, a television, a head mounted display (HMD), a virtual reality display device, a wearable device, a display screen, a mobile device, and the like.
- the display device 825 may comprise a stereoscopic display having one or more screens.
- the sensor device 827 may comprise one or more sensors configured to determine a focal point and/or area of a viewer of the display device 825.
- the sensor device 827 may comprise one or more of an image sensor, an optical reflector sensor, a range sensor, an electromyography (EMG) sensor, and an optical flow sensor for detecting eye and/or head movement.
- the sensor device 827 may comprise an inertial measurement unit (IMU) that measures and reports a body's specific force, angular rate, and/or the magnetic field surrounding the body, using a combination of accelerometers and gyroscopes, and sometimes also magnetometers.
- the sensor device 827 may be coupled to an HMD and/or a wearable device that allows the sensor to detect the motion of the user's head or eyes via the motion of the HMD and/or wearable device.
- the sensor device 827 may comprise an optical sensor for detecting one or more of a head motion and eye-motion of the user.
- the sensor may be coupled to an HMD and/or a wearable device and/or be a relatively stationary device that captures data from the viewer from a distance.
- the display device 825 may comprise a separate device with or without a separate processor.
- the display device 825 may be coupled to the playback device 820 via a wired or wireless communication channel.
- the playback device 820 may comprise a PC or a game console and the display device 825 may comprise an HMD configured to display content from the playback device 820 .
- the sensor device 827 may be part of the playback device 820 , the display device 825 , and/or may be a physically separated device communicating with one or more of the playback device 820 and the display device 825 .
- one or more of the display device 825 and the sensor device 827 may be integrated with the playback device 820 .
- the display device 825 may further comprise a processor and/or a memory for at least partially storing the retrieved content and/or the viewer's eye or head movement detected by the sensor device 827 .
- the playback device 820 may further include a communication device such as a network adapter, a Wi-Fi transceiver, a mobile data network transceiver, etc. for requesting and downloading content items from the content server 810 and/or a capture device.
- the playback device 820 may further include one or more user input/output devices such as buttons, a controller, a keyboard, a display screen, a touch screen and the like for the user to control the selection and playback of content items.
- one or more of the embodiments, methods, approaches, and/or techniques described above may be implemented in one or more computer programs or software applications executable by a processor based apparatus or system.
- processor based apparatus or systems may comprise a computer, entertainment system, game console, workstation, graphics workstation, server, client, portable device, pad-like device, etc.
- Such computer program(s) may be used for executing various steps and/or features of the above-described methods and/or techniques. That is, the computer program(s) may be adapted to cause or configure a processor based apparatus or system to execute and achieve the functions described above.
- such computer program(s) may be used for implementing any embodiment of the above-described methods, steps, techniques, or features.
- such computer program(s) may be used for implementing any type of tool or similar utility that uses any one or more of the above described embodiments, methods, approaches, and/or techniques.
- program code macros, modules, loops, subroutines, calls, etc., within or without the computer program(s) may be used for executing various steps and/or features of the above-described methods and/or techniques.
- the computer program(s) may be stored or embodied on a computer readable storage or recording medium or media, such as any of the computer readable storage or recording medium or media described herein.
- the present invention provides a computer program product comprising a medium for embodying a computer program for input to a computer and a computer program embodied in the medium for causing the computer to perform or execute steps comprising any one or more of the steps involved in any one or more of the embodiments, methods, approaches, and/or techniques described herein.
- the present invention provides one or more non-transitory computer readable storage mediums storing one or more computer programs adapted or configured to cause a processor based apparatus or system to execute steps comprising: determining a focal area of a viewer of a content item displayed on a display device, retrieving a low bit rate version of the content item, retrieving a portion of a high bit rate version of the content item corresponding to the focal area, combining the portion of the high bit rate version of the content with the low bit rate version of the content item to generate a combined image, and causing the combined image to be displayed to the viewer via the display device.
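The recited steps can be sketched end-to-end with stand-in retrieval functions. Everything below is illustrative rather than an actual implementation: frames are modeled as 2D lists, the focal area is assumed to be a rectangle, and all names are hypothetical.

```python
def retrieve_low_bit_rate(width, height):
    """Stand-in for retrieving and decoding the low bit rate version."""
    return [[0] * width for _ in range(height)]

def retrieve_high_bit_rate_portion(focal):
    """Stand-in for retrieving the high bit rate portion for the focal area."""
    x, y, w, h = focal
    return [[1] * w for _ in range(h)]

def combine(low, portion, focal):
    """Paste the high bit rate portion into the low bit rate frame at the
    location corresponding to the focal area."""
    x, y, w, h = focal
    out = [row[:] for row in low]
    for r in range(h):
        for c in range(w):
            out[y + r][x + c] = portion[r][c]
    return out

focal_area = (3, 2, 4, 4)                # would come from the sensor device
low = retrieve_low_bit_rate(10, 8)
portion = retrieve_high_bit_rate_portion(focal_area)
combined_image = combine(low, portion, focal_area)   # then display to the viewer
```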
Description
- This application claims the benefit of U.S. Provisional Patent Application No. 62/374,687, filed on Aug. 12, 2016, entitled “APPARATUS AND METHOD FOR PROVIDING AND DISPLAYING CONTENT”, the entire disclosure of which is hereby fully incorporated by reference herein in its entirety.
- This application also claims the benefit of U.S. Provisional Patent Application No. 62/357,259, filed on Jun. 30, 2016, entitled “APPARATUS AND METHOD FOR CAPTURING AND DISPLAYING SEGMENTED CONTENT”, the entire disclosure of which is hereby fully incorporated by reference herein in its entirety.
- This application is related to U.S. patent application Ser. No. ______, filed on the same date as this application, entitled “APPARATUS AND METHOD FOR CAPTURING AND DISPLAYING SEGMENTED CONTENT”, by inventor Dennis D. Castleman, and identified by Attorney Docket No. 139592 [SCEA16001US01], the entire disclosure of which is hereby fully incorporated by reference herein in its entirety.
- This application is also related to U.S. patent application Ser. No. ______, filed on the same date as this application, entitled “APPARATUS AND METHOD FOR GAZE TRACKING”, by inventor Dennis D. Castleman, and identified by Attorney Docket No. 138627 [SCEA16004US00], the entire disclosure of which is hereby fully incorporated by reference herein in its entirety.
- The present invention relates generally to video processing and display.
- Video streaming is increasingly becoming one of the main ways that media contents are delivered and accessed. Video streaming traffic also accounts for a large portion of Internet bandwidth consumption.
- One embodiment provides a method for displaying content, comprising: determining a focal area of a viewer of a content item displayed on a display device, retrieving a low bit rate version of the content item, retrieving a portion of a high bit rate version of the content item corresponding to the focal area, combining the portion of the high bit rate version of the content with the low bit rate version of the content item to generate a combined image, and causing the combined image to be displayed to the viewer via the display device.
- Another embodiment provides a system for displaying content, comprising: a display device, a sensor device, and a processor coupled to the display device and the sensor device. The processor being configured to: determine, with the sensor device, a focal area of a viewer of a content item displayed on the display device, retrieve a low bit rate version of the content item, retrieve a portion of a high bit rate version of the content item corresponding to the focal area, combine the portion of the high bit rate version of the content with the low bit rate version of the content item to generate a combined image, and cause the combined image to be displayed to the viewer via the display device.
- Another embodiment provides a non-transitory computer readable storage medium storing one or more computer programs configured to cause a processor based system to execute steps comprising: determining a focal area of a viewer of a content item displayed on a display device, retrieving a low bit rate version of the content item, retrieving a portion of a high bit rate version of the content item corresponding to the focal area, combining the portion of the high bit rate version of the content with the low bit rate version of the content item to generate a combined image; and causing the combined image to be displayed to the viewer via the display device.
- Another embodiment provides a method for providing content, comprising: receiving a content item, generating a low bit rate version of the content item, receiving a content request from a playback device, the content request comprising an indication of a viewer focal area, selecting a portion of the high bit rate version of the content item based on the viewer focal area, and providing the low bit rate version of the content item and the portion of the high bit rate version of the content item to the playback device in response to the content request.
- Another embodiment provides a system for providing content comprising: a memory device, a communication device, and a processor coupled to the memory device and the communication device. The processor being configured to: receive a content item, generate a low bit rate version of the content item, store the high bit rate version of the content item and the low bit rate version of the content item on the memory device, receive, via the communication device, a content request from a playback device, the content request comprising an indication of a viewer focal area, select a portion of the high bit rate version of the content item based on the viewer focal area, and provide the low bit rate version of the content item and a portion of the high bit rate version of the content item to the playback device in response to the content request.
- A better understanding of the features and advantages of various embodiments of the present invention will be obtained by reference to the following detailed description and accompanying drawings which set forth an illustrative embodiment in which principles of embodiments of the invention are utilized.
- The above and other aspects, features and advantages of embodiments of the present invention will be more apparent from the following more particular description thereof, presented in conjunction with the following drawings wherein:
- FIG. 1 is a process diagram illustrating a process for providing content in accordance with some embodiments of the present invention;
- FIG. 2 is a flow diagram illustrating a method for providing content in accordance with some embodiments of the present invention;
- FIG. 3 is a flow diagram illustrating a method for displaying content in accordance with some embodiments of the present invention;
- FIGS. 4A and 4B are illustrations of a content display area in accordance with some embodiments of the present invention;
- FIG. 5 is an illustration of image blending in accordance with some embodiments of the present invention;
- FIGS. 6A and 6B are illustrations of image cells in accordance with some embodiments;
- FIGS. 7A and 7B are illustrations of focal areas in accordance with some embodiments; and
- FIG. 8 is a block diagram illustrating a system in accordance with some embodiments of the present invention.
- Digital video content may be stored and transmitted in a variety of formats. Factors such as the video's resolution, frame rate, coding format, compression scheme, and compression factor can affect the total size and bit rate of the video file. In digital multimedia, bit rate generally refers to the number of bits used per unit of playback time to represent a continuous medium such as audio or video. The encoding bit rate of a multimedia file may refer to the size of the multimedia file divided by the playback time of the recording (e.g. in seconds). The bit rate of a video content file affects whether the video can be streamed without interruptions under network bandwidth constraints between a streaming server and a playback device.
- Referring first to FIG. 1, a process for recording, hosting, and displaying video content according to some embodiments is shown. In step 111, video content is captured by a camera system. In some embodiments, the camera system may comprise one or more of a conventional camera system, a stereoscopic camera system, a panoramic camera system, a surround view camera system, a 360-degree camera system, an omnidirectional camera system, and the like. In step 112, the captured video is encoded and transmitted to a server. In some embodiments, the encoding performed in step 112 may comprise lossy or lossless video encoding. In some embodiments, the video may comprise live-streamed or prerecorded video content. In some embodiments, the camera may communicate with the server via wireless or wired means by way of a network, such as for example the Internet. - In some embodiments, the
camera performing steps 111 and 112 may comprise one or more processor-based devices. - In
step 121, the server decodes the video content received from the camera. In some embodiments, the decoded video may comprise a video in the originally captured resolution, frame rate, and/or bit rate. In step 122, the server reduces the bit rate of the decoded video stream. In some embodiments, the bit rate of the video content may be reduced by one or more of: reducing the resolution of the video, reducing the frame rate of the video, and compressing the video with a compression algorithm. In step 123, the reduced bit rate video is encoded and prepared for streaming to a playback device. In some embodiments, steps 122 and 123 may be combined. - In
step 125, one or more portions of the received video are extracted from the received video. Portions of a content item may generally refer to a spatial section of the video content display area. In some embodiments, a portion of the content may comprise an area of the content display area spanning one or more frames. In some embodiments, if the encoding scheme of the received content allows for partial decoding (e.g. MPEG-4 transport stream), the extraction in step 125 may be performed by partially decoding the received content. In some embodiments, step 125 may be performed in response to receiving a viewer focal area from a playback device, and the extracted portion may correspond to the location of the viewer's focal area in the content. In some embodiments, step 125 may be performed on the content preliminarily, and one or more portions may be extracted and stored for later retrieval by playback devices. In step 127, the extracted portion is encoded and prepared for streaming to the playback device.
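The extraction of a spatial portion in step 125 can be sketched as a simple crop of the frame region corresponding to the reported focal area. The rectangle coordinates and names below are illustrative only, and a real server would operate on encoded streams rather than raw pixel arrays.

```python
def extract_portion(frame, x, y, w, h):
    """Crop a w-by-h rectangle at (x, y) from a frame modeled as a 2D list."""
    return [row[x:x + w] for row in frame[y:y + h]]

# Dummy 10x10 frame where each pixel encodes its own (row, column) position.
frame = [[r * 10 + c for c in range(10)] for r in range(10)]
portion = extract_portion(frame, x=4, y=3, w=3, h=2)
# portion covers columns 4-6 of rows 3-4
```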
step 125 and/or performed on the portion extracted instep 125. Generally, a high bit rate version of the content item has a higher average bit rate than the low bit rate version of the content item over the duration of the video content. In some embodiments, the bit rate of the high bit rate version of the content item may be higher than the low bit rate version of the content item for some or all of temporal segments of the video content. In some cases, the video stream containing the extracted portion of the high bit rate version of the content item may have a lower bit rate as compared to the video stream comprising the low bit rate version of the content item. For example, the portion of the high bit rate version of the content item may cover a significantly smaller display area of the content as compared to the low bit rate version, resulting in the lower bit rate of the extracted portion. In some embodiments, the low bit rate version of the content item may comprise lower one or more of resolution, frame rate, and compression quality as compared to the high bit rate version of the content item. In some embodiments, the low bit rate version of the content item may comprise a lower video quality and/or definition as compare to the high bit rate version of the content item. In some embodiments, the low and high bit rate versions of the content may comprise constant bit rate (CBR) or variable bit rate (VBR) video streams. - In some embodiments, the server may communicate with the playback device by way of a network, such as for example the Internet. In
step 131, the playback device receives and decodes a low bit rate version of the video content and a portion of a high bit rate portion of the video content. The portion of the high bit rate portion of the video content may be selected based on the focal area of a viewer viewing the content via the playback device. In some embodiments, the focal area of a viewer refers an area of the viewer's field of vision that is or is likely to be in focus while the viewer views the content. In some embodiments, the focal area may correspond to one or more of the central, paracentral, macular, near peripheral, and mid peripheral areas of the viewer's field of vision. The focal area of the viewer may be detected by a sensor device coupled to the playback device. In some embodiments, Inertial Measurement Unit (IMU) data recorded by a capture device of the content item may be compared to the viewer's eye and/or head direction to determine the portion of the high bit rate video content to extract for the playback device. In some embodiments, the low bit rate version of the video content and the portion of the high bit rate portion of the video content may be transmitted as separate video streams from the server to the playback device. - In
step 132, the low bit rate version of the video content and the portion of the high bit rate version of the video content are combined. In some embodiments, combining the video streams comprises combining the low bit rate version of the content item with the portion of the high bit rate version at the location of the content display area from which the high bit rate portion was extracted. In some embodiments, step 132 comprises blending the two video streams by including a transition area between the high and low bit rate areas of the image to reduce the noticeability of the border between the two versions of the video content. In some embodiments, step 132 further comprises scaling the low bit rate version of the video content to the resolution and/or frame rate of the high bit rate version of the content prior to combining the images. - In
step 133, the combined image is displayed to the viewer. The combined image may be displayed via one or more of a flat screen display, a curved display, a dome display device, a head-mounted display device, an augmented reality display device, a virtual reality display device, and the like. In some embodiments, the combined image may be viewed by a head mounted display such as the systems and devices described in U.S. patent application Ser. No. 15/085,887, filed on Mar. 30, 2016, entitled “Head-Mounted Display Tracking,” the entire disclosure of which is hereby fully incorporated by reference herein in its entirety. - In some embodiments, instead of the steps shown in
FIG. 1 , the high bit rate portion of the video content may be combined with the low bit rate version of the content at the server and encoded as a single video stream for transmission. While the resolution and the frame rate of such video streams may not be reduced as compared to a full high bit rate version, the overall size of the transmitted video stream may still be reduced by processing the area of the content outside of the focal area with a more lossy video compression algorithm before recombining the images. - In the process shown in
FIG. 1, the portion of the content item corresponding to the user's focal area is provided in a relatively high bit rate and the remaining area of the content is provided in a relatively low bit rate. With the process shown in FIG. 1, the network bandwidth demand for achieving interruption-free video streaming may be reduced by decreasing the overall bit rate of the streaming video content while maintaining the video quality in the focal area of the viewer's field of vision. - Referring next to
FIG. 2, a method for providing content is shown. The steps in FIG. 2 may generally be performed by a processor-based device such as one or more of a computer system, a server, a cloud-based server, a content host, a streaming service host, a media server, and the like. In some embodiments, the steps in FIG. 2 may be performed by one or more of the content server 810 and the playback device 820 described with reference to FIG. 8, the server described with reference to FIG. 1, and/or other similar devices. - In
step 210, the system receives a content item. The content item may comprise one or more of a movie, a TV show, a video clip, prerecorded video content, streaming video content, live-streamed video content, and the like. In some embodiments, the video content may comprise a single video stream or a plurality of video streams captured by one or more of a stereoscopic camera system, a panoramic camera system, a surround view camera system, a 360-degree camera system, an omnidirectional camera system, and the like. In some embodiments, the content item may be encoded via any encoding scheme such as MPEG, WMV, VP8, and the like. In some embodiments, the system may further be configured to decode the received content item according to various encoding schemes in step 210. - In
step 220, the system generates a low bit rate version of the content item. In some embodiments, the bit rate of the received content may be reduced by one or more of: reducing the resolution of the video, reducing the frame rate of the video, and compressing the video with a lossy compression algorithm. A lossy compression generally means that the compressed video lacks some information present in the original video. In some embodiments, multiple low bit rate versions of the content item may be generated in step 220 and stored for retrieval by playback devices. - In
step 230, the system receives a content request. In some embodiments, the content request may be received from a playback device such as a game console, a personal computer, a tablet computer, a television, a head mounted display (“HMD”), an augmented reality device, a virtual reality device, a wearable device, a portable user device, a smartphone, etc. In some embodiments, the content request may identify one or more of the content item being requested, the requested temporal segment, an indication of the viewer's focal point and/or area, and/or other authentication information. In some embodiments, the content request may be similar to a conventional streaming content request. In some embodiments, the content request may comprise an indication of the viewer's focal area which may correspond to a point or an area in the content display area. In some embodiments, the indication of the viewer's focal area may comprise a coordinate or a set of coordinates within the dimension of a frame of the content. In some embodiments, the indication of the viewer's focal area may be represented by a viewing angle. In some embodiments, the focal area may be determined based on a sensor device associated with the playback device comprising one or more of an eye tracking sensor and a head tracking sensor. - In
step 240, the low bit rate version of the content is provided to the playback device in response to the content request received in step 230. In some embodiments, multiple low bit rate versions of the content item may be generated in step 220. In step 240, the system may select from among the multiple low bit rate versions of the content item based on one or more of: the current or estimated network throughput between the playback device and the server, the available bandwidth at the server and/or the playback device, the requested video quality specified in the content request, the playback device's processing capacity, user settings, etc. In some embodiments, the selection of the low bit rate version of the content item from a plurality of versions may be similar to conventional adaptive bit rate streaming methods.
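A minimal version of that adaptive selection, considering only measured throughput, might look like the following; the 80% headroom factor is an assumed safety margin, and a real selector would also weigh device capacity and user settings as listed above:

```python
# Minimal adaptive-bit-rate pick for step 240 (throughput only; illustrative).

def select_low_bitrate_version(versions_kbps, throughput_kbps, headroom=0.8):
    """Pick the highest-bit-rate version that fits within a fraction
    (headroom) of the measured network throughput; fall back to the
    lowest version when nothing fits."""
    budget = throughput_kbps * headroom
    fitting = [v for v in versions_kbps if v <= budget]
    return max(fitting) if fitting else min(versions_kbps)
```
- In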
step 250, the system selects a portion of the high bit rate version of the content item based on the content request. The high bit rate version of a content item generally refers to a version of the content with a higher bit rate as compared to the low bit rate content provided in step 240. In some embodiments, the high bit rate version of the content item may comprise a higher average bit rate than the low bit rate version of the content over the duration of the video content. In some embodiments, during some or all temporal segments of the video content, the bit rate of the high bit rate version of the content item may be higher than the low bit rate version of the content. In some embodiments, the high bit rate version of the content may comprise the original content received in step 210. In some embodiments, the high bit rate version of the content item may also comprise a reduced bit rate version of the originally received content item. - In some embodiments, the portion of the content selected in
step 250 may be selected based on the viewer's focal area comprising one or more of a detected focal point and a predicted future focal point. In some embodiments, the predicted future focal point may be predicted by the server and/or the playback device. In some embodiments, the future focal point may be predicted based on one or more of the viewer's gaze path history, a gaze path profile associated with the viewer, gaze path data collected from a plurality of viewers, and a content provider provided standard gaze path. Examples of predicting the viewer's future focal point are described in U.S. patent application Ser. No. ______, filed on the same date as this application, entitled “APPARATUS AND METHOD FOR GAZE TRACKING”, by inventor Dennis D. Castleman, and identified by Attorney Docket No. 138627 [SCEA16004US00], the entire disclosure of which is hereby fully incorporated by reference herein in its entirety. - A portion of the content may generally refer to a spatial portion of the display content area such as a set of pixels within a frame. In some embodiments, a portion may comprise the same part of the display content area spanning a plurality of frames. In some embodiments, the portion selected in
step 250 may generally correspond to the location of a viewer's focal area in the content display area. In some embodiments, the displayed area of the content may be divided into a plurality of sections. For example, the displayed area of the content may be divided into quadrants, 3×3 grids, 5×5 grids, etc. In some embodiments, one or more sections of the content display area that overlap the focal area of the viewer may be selected to comprise the portion of the high bit rate version of the content item provided to the playback device. In some embodiments, the focal area and/or the extracted portion of the content may comprise any shape and size. Examples of focal areas and portions extracted from content items are described in more detail with reference to FIGS. 4A-4B and FIGS. 7A-7B herein. - In some embodiments, the system may further select from a plurality of original and/or reduced bit rate versions of the content to extract the selected portion based on one or more of: the current or estimated network throughput between the playback device and the server, the available bandwidth at the server and/or the playback device, a requested video quality specified in the content request, the playback device's processing capacity, and user settings.
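The grid-based section selection described above (e.g. a 3×3 grid with the cells that overlap the focal area chosen) can be sketched as follows; the normalized coordinates and the bounding-box overlap test for a circular focal area are assumptions for illustration:

```python
# Sketch of selecting grid sections that overlap the focal area (step 250).
# Coordinates are normalized to (0..1); the rectangular overlap test is an
# assumption, since the disclosure allows sections of any shape and size.

def overlapping_sections(focal_x, focal_y, focal_radius, rows=3, cols=3):
    """Return (row, col) indices of grid sections that overlap a circular
    focal area approximated by its bounding box."""
    selected = []
    for r in range(rows):
        for c in range(cols):
            left, right = c / cols, (c + 1) / cols
            top, bottom = r / rows, (r + 1) / rows
            # bounding-box overlap between the cell and the focal circle
            if (left < focal_x + focal_radius and right > focal_x - focal_radius
                    and top < focal_y + focal_radius and bottom > focal_y - focal_radius):
                selected.append((r, c))
    return selected
```

A focal area near a cell boundary naturally selects multiple cells, which is why the provided high bit rate area may only approximately match the focal area's shape.
In some embodiments, the portion of the high bit rate version may be extracted from one of the reduced bit rate versions generated in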
step 220. In some embodiments, the high bit rate version of the content item may generally be selected from versions of the content item with higher bit rate as compared to the low bit rate version of the content item selected in step 240. - In some embodiments, the system may be configured to provide two or more portions of the high bit rate version of the content item in
step 270. For example, the system and/or the playback device may predict two or more likely future focal areas of the viewer. The system may then select two or more portions of the high bit rate version of the content item based on the two or more likely future focal areas of the viewer in step 250. The playback device may be configured to select from among the provided portions shortly before playback based on the detected focal area. - In
step 260, the system determines whether the selected portion has been previously cached in the system. In some embodiments, when a portion of the high bit rate version of the content is extracted, the system may cache the portion for later use. In some embodiments, the system may preliminarily generate a plurality of extracted portions of the high bit rate version of the content item based on predicting the locations that viewers are likely to focus on in the displayed content. For example, preliminarily extracted portions may correspond to high activity areas and/or foreground areas of the displayed content. In some embodiments, the cached portions may each comprise an encoded video stream. In some embodiments, the system may be configured to automatically purge extracted portions that have not been used for a set period of time (e.g. hours, days, etc.). In some embodiments, each cached portion of the high bit rate version may be identified and retrieved with an area identifier and a time stamp identifier (e.g. section 3B, time 00:30:20-00:30:22). In some embodiments, portions of the high bit rate version of the content may be stored in an encoded form in the cache and be made directly available for streaming to playback devices. If the selected portion has been previously cached, the system may provide the cached portion to the playback device in step 270.
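The cache just described, keyed by an area identifier and a time stamp identifier and purged by age, might be sketched as follows; the class name, the key structure, and the age-based purge policy are assumptions for illustration:

```python
# Hypothetical cache for extracted high bit rate portions (step 260), keyed
# by area identifier and time stamp identifier (e.g. "3B",
# "00:30:20-00:30:22"); the age-based purge policy is illustrative.
import time

class PortionCache:
    def __init__(self, max_age_s=3600.0):
        self._entries = {}  # (area_id, time_id) -> (encoded_stream, stored_at)
        self._max_age_s = max_age_s

    def put(self, area_id, time_id, encoded_stream, now=None):
        if now is None:
            now = time.time()
        self._entries[(area_id, time_id)] = (encoded_stream, now)

    def get(self, area_id, time_id):
        entry = self._entries.get((area_id, time_id))
        return entry[0] if entry else None

    def purge(self, now=None):
        """Drop portions that have not been stored or refreshed recently."""
        if now is None:
            now = time.time()
        self._entries = {k: v for k, v in self._entries.items()
                         if now - v[1] <= self._max_age_s}
```

Because each cached entry is an already-encoded stream, a cache hit can be served directly to a playback device without re-extraction.
- If the selected portion has not been previously cached, the system extracts a portion of the high bit rate version of the content in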
step 280. In some embodiments, the portion may be extracted from the content received in step 210. In some embodiments, the portion may be extracted from one of the reduced bit rate versions of the originally received content. In some embodiments, the portion may be extracted by first decoding the received content. In some embodiments, the system may be configured to partially decode and extract a portion of the content from an encoded version of the content item. In some embodiments, step 280 may further comprise processing the extracted portion to include a plurality of empty/transparent pixels or cells around the edge of the extracted portion. The density of empty/transparent pixels may gradually increase toward the outer edge of the extracted portion such that when the extracted portion is combined with a lower bit rate version of the content, the edge between the two images is less noticeable to human eyes. The inclusion of the empty/transparent pixels may further decrease the bandwidth usage for transmitting the portion of the high bit rate version of the content. In some embodiments, step 280 may further comprise separately encoding the extracted portion for streaming. The encoded portion of the high bit rate version of the content item may then be provided to the playback device in step 270. In some embodiments, the portion of the high bit rate version of the content item may be provided in a plurality of encoded video streams each corresponding to a predefined area (e.g. a cell in a grid) of the content display area.
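The graduated transparency at the edge of the extracted portion can be illustrated as a per-pixel opacity mask whose value falls off toward the border; the linear ramp and the feather width below are assumptions, since any monotone falloff would produce the described effect:

```python
# Sketch of the graduated empty/transparent pixels of step 280: opacity
# falls off linearly over a feather band at the edge of the extracted
# portion so the seam with the low bit rate layer is less visible.

def edge_alpha(width, height, feather):
    """Return a per-pixel opacity mask (1.0 opaque .. toward 0.0 transparent)."""
    mask = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # distance (in pixels) to the nearest edge of the portion
            d = min(x, y, width - 1 - x, height - 1 - y)
            mask[y][x] = min(1.0, (d + 1) / (feather + 1))
    return mask
```
- In some embodiments,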
steps 240 and 270 may be performed concurrently. In some embodiments, the portion of the high bit rate version of the content item and the low bit rate version of the content item may be provided as two separately encoded and transmitted video streams. In some embodiments, portions of the high bit rate version of the content item and the low bit rate version of the content item may be provided from different parts of a server system. For example, a central server may be configured to stream low bit rate versions of content items to playback devices while a plurality of geographically dispersed server devices may be configured to extract and/or provide portions of the high bit rate versions of the same content item to nearby playback devices. - In some embodiments,
steps 210 through 270 may be repeated for multiple content items. In some embodiments, steps 250-270 may be repeated periodically as a viewer views a content item at the playback device. For example, the playback device may periodically (e.g. every few milliseconds, seconds, frames, etc.) update the focal area of the viewer at the server, and the system may select a different portion of the high bit rate version of the content item based on the updated focal area of the viewer. In some embodiments, the playback device may be configured to detect a change in the focal area and only notify the server when the location of the focal area changes. In some embodiments, if no focal area is detected (e.g. the user is not currently looking at the screen), the system may skip steps 250-270 and only provide the low bit rate version of the content item to the playback device. In some embodiments, if the user is detected to be not looking at the display device, the system may further select the lowest bit rate version of the content item to provide to the playback device in step 240 to reduce network bandwidth usage. In some embodiments, if an interruption in the streaming of the content is detected, the system may adjust the bit rate of the low and/or high bit rate versions of the content provided to reduce interruptions.
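The change-only reporting described above, where the playback device recomputes the focal section every tick but notifies the server only when it actually changes, can be sketched as a small helper; the class and callback names are assumptions for illustration:

```python
# Sketch of change-only focal reporting: the playback device updates its
# focal section frequently but contacts the server only on a change.

class FocalReporter:
    def __init__(self, notify):
        self._notify = notify        # callback that contacts the server
        self._last_section = None

    def update(self, section):
        """Report the current focal section; suppress duplicates.

        `section` may be None when no focal area is detected (e.g. the
        viewer is not currently looking at the screen)."""
        if section != self._last_section:
            self._last_section = section
            self._notify(section)
```
- Referring next to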
FIG. 3, a method for providing content is shown. The steps in FIG. 3 may generally be performed by a processor-based device such as one or more of a game console, a personal computer, a tablet computer, a television, a head mounted display (“HMD”), an augmented reality device, a virtual reality device, a wearable device, a portable user device, a smartphone, a mobile device, and the like. In some embodiments, the steps in FIG. 3 may be performed by one or more of the content server 810 and the playback device 820 described with reference to FIG. 8, the playback device described with reference to FIG. 1, or other similar devices. - In
step 310, the system determines a focal area of a viewer. In some embodiments, the focal area may be determined based on a sensor device comprising one or more of an eye tracking sensor and a head tracking sensor. In some embodiments, the head direction of the user may be determined by a head tracker device comprising one or more of an Inertial Measurement Unit (IMU), an accelerometer, a gyroscope, an image sensor, and a range sensor. In some embodiments, an IMU may comprise an electronic device that measures and reports a body's specific force, angular rate, and/or magnetic field surrounding the body, using a combination of accelerometers and gyroscopes, sometimes also magnetometers. In some embodiments, the head tracker device may be coupled to a head mounted display (HMD) worn by the user. In some embodiments, the gaze location of the user may be determined by an eye tracker device comprising one or more of an image sensor, an optical reflector sensor, a range sensor, an electromyography (EMG) sensor, and an optical flow sensor. - In some embodiments, the focal area may be determined based on one or more of a detected focal point and a predicted future focal point. In some embodiments, the future focal point may be predicted based on one or more of the viewer's gaze point history, a gaze path profile associated with the viewer, gaze path data collected from a plurality of viewers, and a content provider provided standard gaze path. In some embodiments, the focal area may be represented by a point of focus in a 2D or 3D space. In some embodiments, the focal area may be represented as a 3D angle such as a direction represented by a spherical azimuthal angle (θ) and polar angle (φ). In some embodiments, the focal area may be represented by a 2D polar angle (φ). In some embodiments, the focal area may correspond to the pitch, yaw, and roll of the viewer's head, eyes, and/or the display device. 
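One way to express a gaze direction as the azimuthal angle (θ) and polar angle (φ) mentioned above is sketched below; the coordinate convention (z toward the zenith, angles in degrees) and the function name are assumptions for illustration:

```python
# Converting a gaze direction vector to the spherical angles of step 310.
# The z-up, degrees-based convention is an assumption, not the disclosure's.
import math

def gaze_to_angles(x, y, z):
    """Return (theta, phi) in degrees: theta is the azimuth in the x-y
    plane, phi is the polar angle measured from the +z axis."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0:
        raise ValueError("zero-length gaze vector")
    theta = math.degrees(math.atan2(y, x))
    phi = math.degrees(math.acos(z / r))
    return theta, phi
```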
In some embodiments, the system may compare the IMU data of the recorded content and the IMU data of the display device to determine the focal area of the viewer relative to the content. In some embodiments, the size of the focal area may further be determined based on the viewer's distance from the display device. For example, for a television display, a smaller focal area may be associated with a viewer sitting 5 feet away from the screen while a larger focal area may be associated with a viewer sitting 10 feet away. In some embodiments, the focal area may be approximated to an area of fixed size and shape around the user's focal point.
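The distance-dependent sizing in the example above follows from a fixed visual angle covering more of the screen at greater distances; a small helper makes this concrete. The roughly 5-degree half-angle for the in-focus region is an assumed constant for illustration, not a value from the disclosure:

```python
# Sizing the on-screen focal area from viewing distance: a fixed visual
# angle subtends a larger screen region the farther away the viewer sits.
import math

def focal_radius(viewing_distance, half_angle_deg=5.0):
    """Radius (in the same units as the distance) of the screen region
    subtended by the in-focus part of the viewer's field of vision."""
    return viewing_distance * math.tan(math.radians(half_angle_deg))
```

This matches the television example: the viewer at 10 feet gets a focal area twice the radius of the viewer at 5 feet.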
- In
step 320, the playback device retrieves a low bit rate version of a content item. In some embodiments, a playback device sends a content request to a server hosting the content item in step 320 to retrieve the content item. The low bit rate version of the content item may comprise a reduced bit rate version of the content item generated by a content provider and/or the hosting service. In some embodiments, step 320 may occur prior to step 310 and the low bit rate version of the content item may begin to be downloaded, buffered, and/or viewed prior to the focal area of the viewer being determined. In some embodiments, step 320 may correspond to step 240 described with reference to FIG. 2 herein. - In
step 330, the playback device retrieves a portion of a high bit rate version of the content item. In some embodiments, the playback device sends a content request identifying the focal area of the viewer determined in step 310 to a server to retrieve the portion of the high bit rate version of the content item. Generally, the retrieved portion may comprise a spatial portion of the content selected based on the focal area of the viewer. In some embodiments, the retrieved portion may comprise a short temporal segment of an area of the content item (e.g. milliseconds, seconds, frames, etc.). In some embodiments, the portion of the high bit rate version of the content item may be retrieved in a video stream separately encoded from the low bit rate version of the content item retrieved in step 320. In some embodiments, the low bit rate version of the content item may buffer ahead of the retrieval of the high bit rate version of the content item. In some embodiments, step 330 may correspond to step 270 described with reference to FIG. 2 herein. - In
step 340, the system combines the portion of the high bit rate version of the content item with the low bit rate version of the content item to generate a combined image. In some embodiments, in step 340, the system first decodes the portion of the high bit rate version of the content item retrieved in step 330 and the low bit rate version of the content item retrieved in step 320. In some embodiments, if the resolution and/or frame rate of the low and high bit rate versions of the content item are different, the system may first adjust the resolution and/or frame rate of at least one of the versions prior to combining the images. For example, the system may increase the resolution and/or frame rate of the low bit rate version of the content item to match the resolution and/or frame rate of the high bit rate portion by up-sampling and/or interpolating the decoded low bit rate version of the content item. - In some embodiments, the system may combine the two versions of the content item by replacing the pixels in the frames of the low bit rate version of the content item with pixels from the corresponding frames of the portion of the high bit rate version of the content item. In some embodiments, the frames may be identified and matched by time stamps. In some embodiments, the image may further be blended to reduce the appearance of a border between the two versions of the content item. In some embodiments, the system blends the versions of the content item by generating a transition area between the portion of the high bit rate version of the content and the low bit rate version of the content. In the transition area, the pixels containing information from the high bit rate version may gradually decrease from the high bit rate area towards the low bit rate area of the displayed content. 
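A toy version of this replace-and-blend step, reduced to a one-dimensional grayscale signal for brevity, is sketched below; the linear cross-fade and all names are assumptions for illustration, and real frames would be two-dimensional and matched by time stamp:

```python
# Toy composite for step 340 on a 1-D grayscale "frame": high bit rate
# samples replace low bit rate samples inside the focal portion, with a
# linear ramp at both ends so no hard border is visible.

def composite(low, high, start, transition):
    """Overlay `high` onto `low` beginning at index `start`; the first and
    last `transition` samples of `high` are cross-faded with `low`."""
    out = list(low)
    n = len(high)
    for i, h in enumerate(high):
        # weight rises 0->1 over the leading ramp and falls 1->0 over the
        # trailing ramp; the middle is pure high bit rate data
        w = min(1.0, (i + 1) / (transition + 1), (n - i) / (transition + 1))
        out[start + i] = (1 - w) * low[start + i] + w * h
    return out
```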
In some embodiments, blending the portion of the high bit rate version of the content item with the low bit rate version of the content item may comprise grouping pixels into triangular cells for blending. Examples of the transition areas and blending are described with reference to
FIGS. 5 and 6A-6B herein. In some embodiments, the high bit rate portion may be provided in a pre-blended form from the server. For example, edges of the high bit rate portion may comprise a plurality of empty/transparent pixels with graduated density. The playback device may then overlay the high bit rate portion with the transparent pixels onto the low bit rate version of the content item without further processing the images and achieve the blended effect. - In
step 350, the combined image is displayed on a display device. In some embodiments, the display device may comprise one or more of a monitor, a television set, a projector, a head mounted display (HMD), a virtual reality display device, a wearable device, a display screen, a mobile device, and the like. In some embodiments, prior to step 350, the system may further adjust the combined image based on the display device's specifications. For example, for virtual reality display devices, the system may adjust for the warp and distortions associated with the device. - In some embodiments,
steps 310 to 350 may be repeated continuously as a viewer views a content item. In some embodiments, based on the focal area detected in step 310, different portions of the high bit rate version of the content item may be retrieved in step 330 and combined with the low bit rate version in step 340 over time. In some embodiments, step 320 may occur independently of the other steps. - In some embodiments, the system may further be configured to determine a view area of the viewer and retrieve only a portion of the low bit rate content based on a view area of the viewer in
step 320. The view area of the viewer may be determined based on one or more of eye tracking and head tracking similar to the determination of the focal area in step 310. The view area of the viewer may generally refer to the area of the content that is visible to the user but may or may not be in focus in the viewer's field of vision. In some embodiments, the view area may comprise an area surrounding the focal area. In some embodiments, the portion of the low bit rate version of the content item retrieved may exclude areas of the content area not within the view area. In some embodiments, the portion of the low bit rate version of the content item retrieved may further exclude the focal area and only include the area that is assumed to be visible to the viewer but not in focus. In some embodiments, the retrieved portion of the low bit rate version of the content item may correspond to one or more of the near, mid, and far peripheral vision area of the viewer's field of vision. - Referring next to
FIG. 4A, an illustration of a content display area is shown. The content area 400 represents the entire image area of a content item. While the content area 400 is shown to be a rectangle, in some embodiments, the content area 400 may correspond to a cylinder, a sphere, a semi-sphere, etc. for immersive content and/or omnidirectional video content. The content area 400 may generally comprise any shape, aspect ratio, and size without departing from the spirit of the present disclosure. The focal point 410 represents the viewer's point of focus within the content. In some embodiments, the focal point 410 may correspond to a detected focal point and/or a predicted focal point. The focal area 412 represents an area around the focal point 410 that is likely to be in focus within the viewer's field of vision. In some embodiments, the focal area may comprise one or more of the central, paracentral, macular, near peripheral, and mid peripheral areas of the viewer's field of vision. The size and shape of the focal area 412 are shown as examples only. The relative sizes of the focal area 412 and the content area 400 may also vary. In some embodiments, the shape and size of the focal area 412 may be calibrated for each individual user and/or be estimated based on the viewer's profile containing one or more of viewer demographic information, viewing habits, user feedback, user settings, etc. In some embodiments, the size of the focal area 412 may further be determined based on the viewer's distance from the display screen. In some embodiments, for display device types with a fixed distance between the eyes of the viewer and the display screen (e.g. HMDs), the size of the focal area 412 may generally be assumed to remain the same. - In some embodiments, the playback device may be configured to retrieve a portion of the high bit rate version of the content item corresponding to the
focal area 412. In some embodiments, the content area 400 may be divided into a grid comprising a plurality of sections. In some embodiments, sections of the content area 400 overlapping the focal area 412 may comprise the portion of the high bit rate version of the content item retrieved by the playback device. In some embodiments, when the content item is displayed, the high bit rate version of the content item may be displayed in the portion of the content area corresponding to the focal area 412 and the low bit rate version of the content item may be displayed in the remaining portion of the content area 400. In some embodiments, depending on the size and shape of the sections of the content area 400 defined by the server, the high bit rate area may not be an exact match to the size and shape of the focal area 412 but may generally substantially cover the focal area 412. In some embodiments, the portion of the high bit rate version of the content item may be extracted to closely match the shape and size of the focal area 412. - Referring next to
FIG. 4B, another illustration of a content display area is shown. The content area 400, the focal point 410, and the focal area 412 in FIG. 4B may generally be similar to the corresponding elements in FIG. 4A. In some embodiments, the system may further determine a view area 414 surrounding the focal area 412 as shown in FIG. 4B. The view area 414 may generally refer to the area of the content that is visible to the user but may or may not be in focus in the viewer's field of vision. In some embodiments, the portion of the low bit rate version of the content item retrieved may exclude areas of the content area 400 outside of the view area 414. In some embodiments, the portion of the low bit rate version of the content item retrieved may further exclude the focal area 412 and only include the area that is assumed to be visible to the viewer but not in focus. In some embodiments, the view area may correspond to one or more of the near, mid, and far peripheral vision area of the viewer's field of vision. - In some embodiments, the
content area 400 may correspond to an immersive video content and/or an omnidirectional video content captured by a plurality of image sensors. The view area 414 may be used to select and stitch a plurality of separately encoded video streams as described in U.S. Provisional Patent Application No. 62/357,259, filed on Jun. 30, 2016, entitled “APPARATUS AND METHOD FOR CAPTURING AND DISPLAYING SEGMENTED CONTENT”, the entire disclosure of which is hereby fully incorporated by reference herein in its entirety. For example, if the view area 414 overlaps two of the four video streams captured by a multi-camera system, the low bit rate version of the content item retrieved may comprise only the two corresponding streams. In some embodiments, the focal area 412 may also comprise data from a plurality of separately encoded video streams that are stitched at the playback device.
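The four-stream example above can be sketched as a simple overlap test; describing each camera stream by a flat azimuth range in degrees is an assumption for brevity (real coverage is an area on a sphere), and wraparound at 360° is ignored:

```python
# Sketch of picking only the separately encoded camera streams whose
# coverage overlaps the view area, as in the four-stream example.

def streams_for_view(view_start, view_end, stream_spans):
    """Return indices of streams whose [start, end) azimuth span (degrees)
    overlaps the view area [view_start, view_end)."""
    chosen = []
    for i, (s, e) in enumerate(stream_spans):
        if s < view_end and e > view_start:
            chosen.append(i)
    return chosen
```

With four streams covering 90° each, a view area spanning two of them retrieves exactly those two, as described above.
- Referring next to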
FIG. 5, an illustration of a transition area is shown. In some embodiments, FIG. 5 may represent a combined image displayed in step 350 of FIG. 3. The displayed image comprises a low bit rate area 510, a high bit rate area 512, and a transition area 511. In the transition area 511, pixels containing information from the high bit rate area 512 may gradually decrease from the high bit rate area 512 toward the low bit rate area 510. In some embodiments, blending the portion of the high bit rate version of the content with the low bit rate version of the content item comprises grouping pixels in the transition area 511 into cells for blending. In some embodiments, each set of grouped pixels may contain data from one of the versions of the content item or the other. In FIG. 5, the size and shape of the transition area 511 are shown as an example only and the transition area 511 may be of any size, shape, and thickness. Generally, the transition area 511 surrounds the high bit rate area and includes interleaved data from both the high bit rate area 512 and the low bit rate area 510 to reduce the appearance of a border between the two areas. - Referring next to
FIGS. 6A and 6B, illustrations of triangular cells are shown. FIG. 6A shows a sphere divided into a plurality of triangular cells. The sphere may correspond to the content area of an omnidirectional and/or immersive video content. In some embodiments, each cell may comprise a unit for blending images. In some embodiments, triangular cells better adapt to the curvature of a sphere and are less noticeable to human eyes as compared to square or rectangular cells. The triangular cells may further be subdivided into smaller triangular cells to provide for adjustable granularity in blending. FIG. 6B illustrates blending using triangular cells. The cells in FIG. 6B may represent a section of a transition area between two versions of a content item. In FIG. 6B, cells labeled with “1” may contain data from one version of a content item and cells labeled with “2” may contain data from a different version of the content item. In some embodiments, each cell in FIG. 6B may be subdivided into smaller triangular cells for more granular blending. In some embodiments, a transition area may have any number of rows or columns of triangular cells. In some embodiments, each cell shown in FIGS. 6A and 6B may be merged or subdivided to form triangular cells of different sizes for blending images.
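The interleaved "1"/"2" labeling of FIG. 6B can be illustrated with a toy assignment rule; the parity-based pattern below is an assumption for illustration, not the exact pattern of the figure:

```python
# Toy assignment of interleaved cell sources in a transition band: alternate
# triangular cells take data from the high (1) or low (2) bit rate version.

def cell_source(row, col):
    """Return 1 (high bit rate data) or 2 (low bit rate data) for the
    triangular cell at (row, col) of a transition band."""
    return 1 if (row + col) % 2 == 0 else 2
```

Because adjacent cells always draw from different versions, the band mixes the two images instead of presenting a hard border; subdividing cells simply applies the same rule at a finer granularity.
- Referring next to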
FIGS. 7A and 7B, illustrations of focal areas are shown. In some embodiments, the focal area of a viewer may be determined based on the area of the content that is likely to be in focus in a viewer's field of vision. In FIGS. 4A and 4B, the focal area is approximated to an oval. In some embodiments, the focal area may be approximated to a circle, a square, etc. by the system. FIGS. 7A and 7B illustrate other shapes that may represent the shape of the focal area used by the system. The shape shown in FIG. 7A approximates the shape of the human field of vision with two merged ovals having aligned major axes. The shape shown in FIG. 7B comprises two ovals having major axes that are perpendicular to each other. The shape shown in FIG. 7B may be used to create a buffer area around the point of focus. For human eyes, vertical or horizontal movements are generally more common than diagonal movements. Therefore, using the shape shown in FIG. 7B to approximate the focal area may allow a viewer to have some vertical or horizontal eye movements without having their focal area leave the high bit rate content area. In some embodiments, the retrieved portion of the high bit rate content item discussed here may correspond to one or more of the shapes shown in FIGS. 4A-4B and 7A-7B, a circle, a square, a rectangle, and the like. - Referring next to
FIG. 8, there is shown a system for providing and displaying content that may be used to run, implement and/or execute any of the methods and techniques shown and described herein in accordance with some embodiments of the present invention. The system includes a content server 810 and a playback device 820 communicating over a data connection such as a network. - The
content server 810 includes a processor 812, a memory 813, and a communication device 814. The content server 810 may generally comprise one or more processor-based devices accessible by the playback device via a network such as the Internet. In some embodiments, the content server may comprise one or more of a cloud-based server, a content host, a streaming service host, a media server, a streaming video server, a broadcast content server, a social networking server, and the like. The processor 812 may comprise one or more of a control circuit, a central processor unit, a graphical processor unit (GPU), a microprocessor, a video decoder, a video encoder, and the like. The memory 813 may include one or more volatile and/or non-volatile computer readable memory devices. In some embodiments, the memory 813 stores computer executable code that causes the processor 812 to provide content to the playback device 820. In some embodiments, the communication device 814 may comprise one or more of a network adapter, a data port, a router, a modem, and the like. Generally, the communication device 814 may be configured to allow the processor 812 to communicate with the playback device 820. In some embodiments, the processor 812 may be configured to provide a low bit rate version of a content item and a portion of a high bit rate version of the content item to the playback device 820 based on a request from the playback device 820. In some embodiments, the request may comprise an identification of the requested content item and/or an indication of a focal area of the viewer of the content item. In some embodiments, the processor 812 may be configured to generate and/or store at least one of the low bit rate version of the content item and one or more portions of the high bit rate version of the content item based on a received content item. - The
memory 813 and/or a separate content library may store one or more content items each comprising at least two versions of the content item having different bit rates. In some embodiments, the content server 810 may be configured to stream the content recorded by a capture device to the playback device 820 in substantially real-time. In some embodiments, the content server 810 may be configured to host a plurality of prerecorded content items for streaming and/or downloading to the playback devices 820 on-demand. While only one playback device 820 is shown in FIG. 8, the content server 810 may be configured to simultaneously receive content from a plurality of capture devices and/or provide content to a plurality of playback devices 820 via the communication device 814. In some embodiments, the content server 810 may be configured to facilitate peer-to-peer transfer of video streams between capture devices and playback devices 820. For example, the low bit rate version of the content item may be transferred via a peer-to-peer network while portions of the high bit rate content item may be transferred via the content server 810. In some embodiments, the content server 810 may be configured to provide the low bit rate version of the content item and the portion of the high bit rate version of the content item in separately encoded video streams. - In some embodiments, the
In some embodiments, the content server 810 may further be configured to pre-process the content item before providing the content item to the playback device 820. In some embodiments, the content server 810 may soften the edges of the extracted portion of the high bit rate version of the content item by including empty/transparent pixels at the edges prior to providing the portion of the high bit rate content to the playback device 820. When the pre-processed portion of the high bit rate version of the content item is provided to a playback device 820, the playback device 820 may blend the video streams by simply combining the pixel data from the two versions without performing further image processing. In some embodiments, the content server 810 may be configured to combine a low bit rate version of a content item with a portion of the high bit rate version of the content item prior to providing the combined content to the playback device 820.
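A toy illustration of this pre-processing, in pure Python on small pixel grids; the feathering width, pixel layout, and helper names are assumptions made for the sketch:

```python
def soften_edges(tile, feather=2):
    """Server-side step: fade the alpha channel of an extracted high bit
    rate tile to zero over `feather` pixels at every edge.
    tile: 2D list of (r, g, b, a) tuples."""
    h, w = len(tile), len(tile[0])
    out = []
    for y, row in enumerate(tile):
        new_row = []
        for x, (r, g, b, a) in enumerate(row):
            # Distance (in pixels) to the nearest edge controls the fade.
            d = min(x, y, w - 1 - x, h - 1 - y)
            scale = min(1.0, d / feather)
            new_row.append((r, g, b, int(a * scale)))
        out.append(new_row)
    return out


def blend_pixel(high, low):
    """Playback-side step: with alpha pre-baked by the server, blending is
    a plain per-channel mix, no filtering or edge detection needed."""
    r, g, b, a = high
    alpha = a / 255.0
    return tuple(int(hc * alpha + lc * (1.0 - alpha))
                 for hc, lc in zip((r, g, b), low))
```

The design point is the division of labor: because the server bakes the fade into the tile's alpha channel, the playback device's combine step stays a trivial per-pixel mix.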
While one content server 810 is shown, in some embodiments, the functionalities of the content server 810 may be implemented on one or more processor-based devices. In some embodiments, the content servers 810 for providing low bit rate versions of content items and for providing high bit rate versions of content items may be separately implemented. For example, a central content server may be configured to provide low bit rate versions of content items while a plurality of geographically distributed content servers may be configured to provide portions of the high bit rate versions of content items to playback devices.
The playback device 820 comprises a processor 821, a memory 823, a display device 825, and a sensor device 827. In some embodiments, the playback device 820 may generally comprise a processor-based device such as one or more of a game console, a media console, a set-top box, a personal computer, a tablet computer, a television, a head mounted display ("HMD"), an augmented reality device, a virtual reality device, a wearable device, a portable user device, a smartphone, etc. The processor 821 may comprise one or more of a control circuit, a central processor unit (CPU), a graphical processor unit (GPU), a microprocessor, a video decoder, and the like. The memory 823 may include one or more volatile and/or non-volatile computer readable memory devices. In some embodiments, the memory 823 stores computer executable code that causes the processor 821 to determine a focal area of a user and retrieve a content item from the content server 810. In some embodiments, the playback device 820 may be configured to retrieve a low bit rate version and a portion of a high bit rate version of the content item from the content server 810 and/or from a local storage and combine the two versions to generate a combined image to display to the user via the display device 825. In some embodiments, the memory 823 may comprise a buffer for buffering one or more versions of the content item retrieved from the content server 810. In some embodiments, the computer executable code stored in the memory 823 may comprise one or more of a computer program, a software program, a playback device firmware, a mobile application, a game and/or media console application, etc.
The display device 825 may comprise a device for displaying content to a viewer. In some embodiments, the display device 825 may comprise one or more of a monitor, a television, a head mounted display (HMD), a virtual reality display device, a wearable device, a display screen, a mobile device, and the like. In some embodiments, the display device 825 may comprise a stereoscopic display having one or more screens.
The sensor device 827 may comprise one or more sensors configured to determine a focal point and/or focal area of a viewer of the display device 825. In some embodiments, the sensor device 827 may comprise one or more of an image sensor, an optical reflector sensor, a range sensor, an electromyography (EMG) sensor, and an optical flow sensor for detecting eye and/or head movement. In some embodiments, the sensor device 827 may comprise an inertial measurement unit (IMU) that measures and reports a body's specific force, angular rate, and/or the magnetic field surrounding the body using a combination of accelerometers and gyroscopes, and sometimes also magnetometers. In some embodiments, the sensor device 827 may be coupled to an HMD and/or a wearable device that allows the sensor to detect the motion of the user's head or eyes via the motion of the HMD and/or wearable device. In some embodiments, the sensor device 827 may comprise an optical sensor for detecting one or more of a head motion and an eye motion of the user. In some embodiments, the sensor may be coupled to an HMD and/or a wearable device and/or be a relatively stationary device that captures data from the viewer from a distance.
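As one hedged example of turning such sensor output into a focal point: assuming an equirectangular video frame and IMU-reported head yaw/pitch in degrees (conventions chosen here purely for illustration), the normalized focal point could be computed as:

```python
def focal_point(yaw_deg: float, pitch_deg: float) -> tuple[float, float]:
    """Map head orientation to a normalized (x, y) focal point on an
    equirectangular frame: yaw wraps horizontally, pitch is clamped to
    [-90, 90] and mapped so +90 (straight up) is the top of the frame."""
    x = (yaw_deg % 360.0) / 360.0
    y = (90.0 - max(-90.0, min(90.0, pitch_deg))) / 180.0
    return x, y
```

A focal area would then be a region (for example, a fixed-size window) centered on this point, which the playback device can include in its request for the high bit rate portion.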
While the display device 825 is shown as part of the playback device 820, in some embodiments, the display device 825 may comprise a separate device with or without a separate processor. In some embodiments, the display device 825 may be coupled to the playback device 820 via a wired or wireless communication channel. For example, the playback device 820 may comprise a PC or a game console and the display device 825 may comprise an HMD configured to display content from the playback device 820. In some embodiments, the sensor device 827 may be part of the playback device 820, the display device 825, and/or may be a physically separate device communicating with one or more of the playback device 820 and the display device 825. In some embodiments, one or more of the display device 825 and the sensor device 827 may be integrated with the playback device 820. In some embodiments, the display device 825 may further comprise a processor and/or a memory for at least partially storing the retrieved content and/or the viewer's eye or head movement detected by the sensor device 827.
In some embodiments, the playback device 820 may further include a communication device such as a network adapter, a Wi-Fi transceiver, a mobile data network transceiver, etc. for requesting and downloading content items from the content server 810 and/or a capture device. In some embodiments, the playback device 820 may further include one or more user input/output devices such as buttons, a controller, a keyboard, a display screen, a touch screen and the like for the user to control the selection and playback of content items. - In some embodiments, one or more of the embodiments, methods, approaches, and/or techniques described above may be implemented in one or more computer programs or software applications executable by a processor based apparatus or system. By way of example, such processor based apparatus or systems may comprise a computer, entertainment system, game console, workstation, graphics workstation, server, client, portable device, pad-like device, etc. Such computer program(s) may be used for executing various steps and/or features of the above-described methods and/or techniques. That is, the computer program(s) may be adapted to cause or configure a processor based apparatus or system to execute and achieve the functions described above. For example, such computer program(s) may be used for implementing any embodiment of the above-described methods, steps, techniques, or features. As another example, such computer program(s) may be used for implementing any type of tool or similar utility that uses any one or more of the above-described embodiments, methods, approaches, and/or techniques. In some embodiments, program code macros, modules, loops, subroutines, calls, etc., within or without the computer program(s) may be used for executing various steps and/or features of the above-described methods and/or techniques.
In some embodiments, the computer program(s) may be stored or embodied on a computer readable storage or recording medium or media, such as any of the computer readable storage or recording medium or media described herein.
- Therefore, in some embodiments the present invention provides a computer program product comprising a medium for embodying a computer program for input to a computer and a computer program embodied in the medium for causing the computer to perform or execute steps comprising any one or more of the steps involved in any one or more of the embodiments, methods, approaches, and/or techniques described herein. For example, in some embodiments the present invention provides one or more non-transitory computer readable storage mediums storing one or more computer programs adapted or configured to cause a processor based apparatus or system to execute steps comprising: determining a focal area of a viewer of a content item displayed on a display device, retrieving a low bit rate version of the content item, retrieving a portion of a high bit rate version of the content item corresponding to the focal area, combining the portion of the high bit rate version of the content item with the low bit rate version of the content item to generate a combined image, and causing the combined image to be displayed to the viewer via the display device.
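A runnable toy (all helper names hypothetical, the frame modeled as a 3×3 grid of tile labels rather than real pixel data) walking through those five steps in order:

```python
def read_focal_area() -> tuple[int, int]:
    # Step 1: stand-in for the sensor device; returns a (row, col) focal tile.
    return (1, 2)


def fetch_low(content_id: str) -> list[list[str]]:
    # Step 2: the whole frame at low quality, modeled as a 3x3 tile grid.
    return [["low"] * 3 for _ in range(3)]


def fetch_high_portion(content_id: str, tile: tuple[int, int]) -> str:
    # Step 3: only the tile under the focal area is fetched at high quality.
    return "high"


def combine(low, high, tile):
    # Step 4: high bit rate pixels replace the focal tile of the low frame.
    row, col = tile
    frame = [r[:] for r in low]
    frame[row][col] = high
    return frame


def present(content_id: str):
    # Step 5 would hand the combined frame to the display device; here we
    # just return it.
    tile = read_focal_area()
    low = fetch_low(content_id)
    high = fetch_high_portion(content_id, tile)
    return combine(low, high, tile)
```

Only one tile of the frame ever carries high bit rate data, which is the bandwidth saving the claimed steps are after: full quality where the viewer is looking, low bit rate everywhere else.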
- While the invention herein disclosed has been described by means of specific embodiments and applications thereof, numerous modifications and variations could be made thereto by those skilled in the art without departing from the scope of the invention set forth in the claims.
Claims (25)
Priority Applications (14)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/280,947 US20180007422A1 (en) | 2016-06-30 | 2016-09-29 | Apparatus and method for providing and displaying content |
CN201780039518.8A CN109416931B (en) | 2016-06-30 | 2017-05-30 | Apparatus and method for gaze tracking |
CN201780039760.5A CN109417624B (en) | 2016-06-30 | 2017-05-30 | Apparatus and method for providing and displaying content |
KR1020197003058A KR20190022851A (en) | 2016-06-30 | 2017-05-30 | Apparatus and method for providing and displaying content |
EP17820805.4A EP3479257A4 (en) | 2016-06-30 | 2017-05-30 | Apparatus and method for gaze tracking |
EP17820807.0A EP3479574A4 (en) | 2016-06-30 | 2017-05-30 | Apparatus and method for providing and displaying content |
JP2018568225A JP6686186B2 (en) | 2016-06-30 | 2017-05-30 | Device and method for gaze tracking |
PCT/US2017/035060 WO2018004936A1 (en) | 2016-06-30 | 2017-05-30 | Apparatus and method for providing and displaying content |
PCT/US2017/035057 WO2018004933A1 (en) | 2016-06-30 | 2017-05-30 | Apparatus and method for gaze tracking |
KR1020207037655A KR102294098B1 (en) | 2016-06-30 | 2017-05-30 | Apparatus and method for providing and displaying content |
PCT/US2017/035058 WO2018004934A1 (en) | 2016-06-30 | 2017-05-30 | Apparatus and method for capturing and displaying segmented content |
JP2018568224A JP6859372B2 (en) | 2016-06-30 | 2017-05-30 | Methods and systems for displaying content, and methods and systems for providing content |
JP2020065716A JP6944564B2 (en) | 2016-06-30 | 2020-04-01 | Equipment and methods for gaze tracking |
JP2021051166A JP7029562B2 (en) | 2016-06-30 | 2021-03-25 | Equipment and methods for providing and displaying content |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662357259P | 2016-06-30 | 2016-06-30 | |
US201662374687P | 2016-08-12 | 2016-08-12 | |
US15/280,947 US20180007422A1 (en) | 2016-06-30 | 2016-09-29 | Apparatus and method for providing and displaying content |
Publications (1)
Publication Number | Publication Date |
---|---|
US20180007422A1 true US20180007422A1 (en) | 2018-01-04 |
Family
ID=60807030
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/280,933 Active 2037-01-23 US11089280B2 (en) | 2016-06-30 | 2016-09-29 | Apparatus and method for capturing and displaying segmented content |
US15/280,947 Abandoned US20180007422A1 (en) | 2016-06-30 | 2016-09-29 | Apparatus and method for providing and displaying content |
US15/280,962 Active 2037-02-27 US10805592B2 (en) | 2016-06-30 | 2016-09-29 | Apparatus and method for gaze tracking |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/280,933 Active 2037-01-23 US11089280B2 (en) | 2016-06-30 | 2016-09-29 | Apparatus and method for capturing and displaying segmented content |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/280,962 Active 2037-02-27 US10805592B2 (en) | 2016-06-30 | 2016-09-29 | Apparatus and method for gaze tracking |
Country Status (5)
Country | Link |
---|---|
US (3) | US11089280B2 (en) |
EP (2) | EP3479574A4 (en) |
JP (4) | JP6859372B2 (en) |
KR (2) | KR102294098B1 (en) |
CN (2) | CN109416931B (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180176535A1 (en) * | 2016-12-19 | 2018-06-21 | Dolby Laboratories Licensing Corporation | View Direction Based Multilevel Low Bandwidth Techniques to Support Individual User Experiences of Omnidirectional Video |
US20190027120A1 (en) * | 2017-07-24 | 2019-01-24 | Arm Limited | Method of and data processing system for providing an output surface |
US10204658B2 (en) | 2014-07-14 | 2019-02-12 | Sony Interactive Entertainment Inc. | System and method for use in playing back panorama video content |
US20190064513A1 (en) * | 2017-08-31 | 2019-02-28 | Tobii Ab | Systems and methods for tracking a gaze of a user across a multi-display arrangement |
US10291845B2 (en) * | 2015-08-17 | 2019-05-14 | Nokia Technologies Oy | Method, apparatus, and computer program product for personalized depth of field omnidirectional video |
US10805592B2 (en) | 2016-06-30 | 2020-10-13 | Sony Interactive Entertainment Inc. | Apparatus and method for gaze tracking |
US10833945B2 (en) * | 2018-11-13 | 2020-11-10 | International Business Machines Corporation | Managing downloading of content |
US10911823B2 (en) * | 2016-12-12 | 2021-02-02 | Zte Corporation | Media information processing method, apparatus and system |
US20210055787A1 (en) * | 2019-08-22 | 2021-02-25 | Samsung Electronics Co., Ltd. | Immersive device and method for streaming of immersive media |
US10979663B2 (en) * | 2017-03-30 | 2021-04-13 | Yerba Buena Vr, Inc. | Methods and apparatuses for image processing to optimize image resolution and for optimizing video streaming bandwidth for VR videos |
US11048464B2 (en) * | 2018-07-31 | 2021-06-29 | Dell Products, L.P. | Synchronization and streaming of workspace contents with audio for collaborative virtual, augmented, and mixed reality (xR) applications |
CN113170234A (en) * | 2018-11-29 | 2021-07-23 | 苹果公司 | Adaptive encoding and streaming of multi-directional video |
JP2022511838A (en) * | 2018-12-14 | 2022-02-01 | アドバンスト・マイクロ・ディバイシズ・インコーポレイテッド | Forbidden coding slice size map control |
US11284141B2 (en) | 2019-12-18 | 2022-03-22 | Yerba Buena Vr, Inc. | Methods and apparatuses for producing and consuming synchronized, immersive interactive video-centric experiences |
US20220103655A1 (en) * | 2020-09-29 | 2022-03-31 | International Business Machines Corporation | Proactively selecting virtual reality content contexts |
US11656734B2 (en) | 2018-08-10 | 2023-05-23 | Sony Corporation | Method for mapping an object to a location in virtual space |
Families Citing this family (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10905943B2 (en) * | 2013-06-07 | 2021-02-02 | Sony Interactive Entertainment LLC | Systems and methods for reducing hops associated with a head mounted system |
US10055892B2 (en) | 2014-11-16 | 2018-08-21 | Eonite Perception Inc. | Active region determination for head mounted displays |
JP6404196B2 (en) | 2015-09-16 | 2018-10-10 | グリー株式会社 | Virtual image display program, virtual image display device, and virtual image display method |
US11017712B2 (en) | 2016-08-12 | 2021-05-25 | Intel Corporation | Optimized display image rendering |
US9928660B1 (en) * | 2016-09-12 | 2018-03-27 | Intel Corporation | Hybrid rendering for a wearable display attached to a tethered computer |
WO2018129186A1 (en) * | 2017-01-04 | 2018-07-12 | Nvidia Corporation | Stereoscopic rendering using raymarching and a virtual view broadcaster for such rendering |
US11119567B2 (en) * | 2017-03-23 | 2021-09-14 | Interdigital Ce Patent Holdings | Method and apparatus for providing immersive reality content |
CN107396081B (en) * | 2017-06-19 | 2019-04-12 | 深圳市铂岩科技有限公司 | For the Optimized Coding and device of panoramic video |
US11451881B2 (en) * | 2017-12-15 | 2022-09-20 | Interdigital Madison Patent Holdings, Sas | Method for using viewing paths in navigation of 360 degree videos |
US10805653B2 (en) * | 2017-12-26 | 2020-10-13 | Facebook, Inc. | Accounting for locations of a gaze of a user within content to select content for presentation to the user |
KR20200106547A (en) * | 2018-01-18 | 2020-09-14 | 밸브 코포레이션 | Positioning system for head-worn displays including sensor integrated circuits |
TWI678920B (en) * | 2018-05-23 | 2019-12-01 | 宏碁股份有限公司 | Video processing apparatus, video processing method thereof and computer program product |
CN110557652A (en) * | 2018-05-30 | 2019-12-10 | 宏碁股份有限公司 | Video processing device and video processing method thereof |
GB2576211A (en) | 2018-08-10 | 2020-02-12 | Sony Corp | A method for mapping an object to a location in virtual space |
GB2576910B (en) * | 2018-09-06 | 2021-10-20 | Sony Interactive Entertainment Inc | User profile generating system and method |
GB2576905B (en) * | 2018-09-06 | 2021-10-27 | Sony Interactive Entertainment Inc | Gaze input System and method |
GB2576904B (en) * | 2018-09-06 | 2021-10-20 | Sony Interactive Entertainment Inc | Content modification system and method |
US10855978B2 (en) * | 2018-09-14 | 2020-12-01 | The Toronto-Dominion Bank | System and method for receiving user input in virtual/augmented reality |
US11032607B2 (en) * | 2018-12-07 | 2021-06-08 | At&T Intellectual Property I, L.P. | Methods, devices, and systems for embedding visual advertisements in video content |
JP7219620B2 (en) * | 2019-01-23 | 2023-02-08 | 株式会社近江デジタルファブリケーションズ | Delivery image generation method |
SE543121C2 (en) | 2019-02-04 | 2020-10-13 | Tobii Ab | Method and system for determining a current gaze direction |
CN112423108B (en) * | 2019-08-20 | 2023-06-30 | 中兴通讯股份有限公司 | Method and device for processing code stream, first terminal, second terminal and storage medium |
US11307655B2 (en) | 2019-09-19 | 2022-04-19 | Ati Technologies Ulc | Multi-stream foveal display transport |
US11956295B2 (en) | 2019-09-27 | 2024-04-09 | Apple Inc. | Client-end enhanced view prediction for multi-view video streaming exploiting pre-fetched data and side information |
US20230011586A1 (en) * | 2019-12-09 | 2023-01-12 | Telefonaktiebolaget Lm Ericsson (Publ) | Electronic device, server and methods for viewport prediction based on head and eye gaze |
GB2596541B (en) * | 2020-06-30 | 2023-09-13 | Sony Interactive Entertainment Inc | Video processing |
US11410272B2 (en) * | 2020-07-01 | 2022-08-09 | Facebook Technologies, Llc. | Dynamic uniformity correction |
US20220070235A1 (en) * | 2020-08-28 | 2022-03-03 | Tmrw Foundation Ip S.Àr.L. | System and method enabling interactions in virtual environments with virtual presence |
CN114513669A (en) * | 2020-11-16 | 2022-05-17 | 华为云计算技术有限公司 | Video coding and video playing method, device and system |
US11630509B2 (en) * | 2020-12-11 | 2023-04-18 | Microsoft Technology Licensing, Llc | Determining user intent based on attention values |
WO2023095456A1 (en) * | 2021-11-29 | 2023-06-01 | 株式会社Nttドコモ | Recommendation device |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030011619A1 (en) * | 1997-10-08 | 2003-01-16 | Robert S. Jacobs | Synchronization and blending of plural images into a seamless combined image |
US20070188521A1 (en) * | 2006-02-15 | 2007-08-16 | Miller Steven D | Method and apparatus for three dimensional blending |
US20090129693A1 (en) * | 2007-11-15 | 2009-05-21 | Bloebaum L Scott | System and method for generating a photograph with variable image quality |
US20090273710A1 (en) * | 2008-04-30 | 2009-11-05 | Larry Pearlstein | Image processing methods and systems for frame rate conversion |
US20120146891A1 (en) * | 2010-12-08 | 2012-06-14 | Sony Computer Entertainment Inc. | Adaptive displays using gaze tracking |
US20120170642A1 (en) * | 2011-01-05 | 2012-07-05 | Rovi Technologies Corporation | Systems and methods for encoding trick play streams for performing smooth visual search of media encoded for adaptive bitrate streaming via hypertext transfer protocol |
US20120265856A1 (en) * | 2011-04-18 | 2012-10-18 | Cisco Technology, Inc. | System and method for data streaming in a computer network |
US20130293672A1 (en) * | 2011-02-10 | 2013-11-07 | Panasonic Corporation | Display device, computer program, and computer-implemented method |
US20140123162A1 (en) * | 2012-10-26 | 2014-05-01 | Mobitv, Inc. | Eye tracking based defocusing |
US20150061995A1 (en) * | 2013-09-03 | 2015-03-05 | Tobbi Technology Ab | Portable eye tracking device |
US8990682B1 (en) * | 2011-10-05 | 2015-03-24 | Google Inc. | Methods and devices for rendering interactions between virtual and physical objects on a substantially transparent display |
US20160048964A1 (en) * | 2014-08-13 | 2016-02-18 | Empire Technology Development Llc | Scene analysis for improved eye tracking |
US9876780B2 (en) * | 2014-11-21 | 2018-01-23 | Sonos, Inc. | Sharing access to a media service |
Family Cites Families (94)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4208811A (en) | 1978-11-30 | 1980-06-24 | Enrique Junowicz | Display with overlapping picture elements |
US6078349A (en) * | 1995-06-07 | 2000-06-20 | Compaq Computer Corporation | Process and system for increasing the display resolution of a point-to-point video transmission relative to the actual amount of video data sent |
US6331869B1 (en) | 1998-08-07 | 2001-12-18 | Be Here Corporation | Method and apparatus for electronically distributing motion panoramic images |
JPH10271499A (en) | 1997-03-26 | 1998-10-09 | Sanyo Electric Co Ltd | Image processing method using image area, image processing unit using the method and image processing system |
JP3511462B2 (en) | 1998-01-29 | 2004-03-29 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Operation image display device and method thereof |
CA2371349A1 (en) | 1998-05-13 | 1999-11-18 | Scott Gilbert | Panoramic movies which simulate movement through multidimensional space |
JP2001008232A (en) | 1999-06-25 | 2001-01-12 | Matsushita Electric Ind Co Ltd | Omnidirectional video output method and apparatus |
CA2411852A1 (en) | 2000-06-09 | 2001-12-13 | Imove, Inc. | Streaming panoramic video |
US6559846B1 (en) | 2000-07-07 | 2003-05-06 | Microsoft Corporation | System and process for viewing panoramic video |
US6788333B1 (en) * | 2000-07-07 | 2004-09-07 | Microsoft Corporation | Panoramic video |
JP2002183212A (en) | 2000-12-19 | 2002-06-28 | Fuji Xerox Co Ltd | System and method for processing electronic document and computer-readable recording medium |
EP1410621A1 (en) | 2001-06-28 | 2004-04-21 | Omnivee Inc. | Method and apparatus for control and processing of video images |
US7714880B2 (en) | 2001-11-16 | 2010-05-11 | Honeywell International Inc. | Method and apparatus for displaying images on a display |
JP2004056335A (en) | 2002-07-18 | 2004-02-19 | Sony Corp | Information processing apparatus and method, display apparatus and method, and program |
CN100451803C (en) * | 2003-02-07 | 2009-01-14 | 夏普株式会社 | Focused state display device and focused state display method |
EP1602322A1 (en) | 2004-06-02 | 2005-12-07 | SensoMotoric Instruments GmbH | Method and apparatus for eye tracking latency reduction |
US8232962B2 (en) * | 2004-06-21 | 2012-07-31 | Trading Technologies International, Inc. | System and method for display management based on user attention inputs |
US7456377B2 (en) | 2004-08-31 | 2008-11-25 | Carl Zeiss Microimaging Ais, Inc. | System and method for creating magnified images of a microscope slide |
JP2006171822A (en) | 2004-12-13 | 2006-06-29 | Nippon Telegr & Teleph Corp <Ntt> | Content delivery method |
US20060256133A1 (en) | 2005-11-05 | 2006-11-16 | Outland Research | Gaze-responsive video advertisment display |
US20070153023A1 (en) | 2006-01-04 | 2007-07-05 | Computer Associates Think, Inc. | System and method of temporal anti-aliasing |
US9250703B2 (en) * | 2006-03-06 | 2016-02-02 | Sony Computer Entertainment Inc. | Interface with gaze detection and voice input |
IL175835A0 (en) | 2006-05-22 | 2007-07-04 | Rafael Armament Dev Authority | Methods and systems for communicating and displaying points-of-interest |
US8446509B2 (en) | 2006-08-09 | 2013-05-21 | Tenebraex Corporation | Methods of creating a virtual window |
SE0602545L (en) | 2006-11-29 | 2008-05-30 | Tobii Technology Ab | Eye tracking illumination |
JP4863936B2 (en) * | 2007-06-25 | 2012-01-25 | 株式会社ソニー・コンピュータエンタテインメント | Encoding processing apparatus and encoding processing method |
JP4897600B2 (en) | 2007-07-19 | 2012-03-14 | 富士フイルム株式会社 | Image processing apparatus, image processing method, and program |
US20090074084A1 (en) | 2007-09-18 | 2009-03-19 | David Drezner | Method and System for Adaptive Preprocessing for Video Encoder |
US20090278921A1 (en) | 2008-05-12 | 2009-11-12 | Capso Vision, Inc. | Image Stabilization of Video Play Back |
CN102037489B (en) * | 2008-05-21 | 2013-08-21 | Tp视觉控股有限公司 | Image resolution enhancement |
US7850306B2 (en) * | 2008-08-28 | 2010-12-14 | Nokia Corporation | Visual cognition aware display and visual data transmission architecture |
JP4775671B2 (en) | 2008-12-26 | 2011-09-21 | ソニー株式会社 | Information processing apparatus and method, and program |
JP5595027B2 (en) | 2009-12-11 | 2014-09-24 | 三菱電機株式会社 | Information display processing device |
US8914305B2 (en) | 2010-06-30 | 2014-12-16 | Trading Technologies International, Inc. | Method and apparatus for motion based target prediction and interaction |
WO2012015460A1 (en) | 2010-07-26 | 2012-02-02 | Thomson Licensing | Dynamic adaptation of displayed video quality based on viewers' context |
US8487959B1 (en) * | 2010-08-06 | 2013-07-16 | Google Inc. | Generating simulated eye movement traces for visual displays |
US9057587B2 (en) * | 2010-08-19 | 2015-06-16 | Evrio, Inc. | Display indicating aiming point using intermediate point in trajectory path |
US9232257B2 (en) | 2010-09-22 | 2016-01-05 | Thomson Licensing | Method for navigation in a panoramic scene |
JP5544426B2 (en) * | 2010-09-24 | 2014-07-09 | 株式会社Gnzo | Video bitstream transmission system |
US8576276B2 (en) * | 2010-11-18 | 2013-11-05 | Microsoft Corporation | Head-mounted display device which provides surround video |
JP2012124784A (en) | 2010-12-09 | 2012-06-28 | Canon Marketing Japan Inc | Moving image reproduction system |
US9690099B2 (en) * | 2010-12-17 | 2017-06-27 | Microsoft Technology Licensing, Llc | Optimized focal area for augmented reality displays |
EP2472867A1 (en) | 2010-12-30 | 2012-07-04 | Advanced Digital Broadcast S.A. | Coding and decoding of multiview videos |
US20160286119A1 (en) | 2011-04-18 | 2016-09-29 | 360fly, Inc. | Mobile Device-Mountable Panoramic Camera System and Method of Displaying Images Captured Therefrom |
CA2833544A1 (en) | 2011-04-18 | 2012-10-26 | Eyesee360, Inc. | Apparatus and method for panoramic video imaging with mobile computing devices |
CN103493105B (en) | 2011-04-25 | 2017-04-05 | 林光雄 | Omnidirectional images edit routine and omnidirectional images editing device |
JP5918618B2 (en) | 2011-06-03 | 2016-05-18 | 任天堂株式会社 | Information processing program, information processing apparatus, information processing system, and information processing method |
US8184069B1 (en) * | 2011-06-20 | 2012-05-22 | Google Inc. | Systems and methods for adaptive transmission of data |
US8847968B2 (en) * | 2011-07-12 | 2014-09-30 | Qualcomm Incorporated | Displaying static images |
US8636361B2 (en) * | 2011-07-20 | 2014-01-28 | National Taiwan University | Learning-based visual attention prediction system and method thereof |
US9897805B2 (en) * | 2013-06-07 | 2018-02-20 | Sony Interactive Entertainment Inc. | Image rendering responsive to user actions in head mounted display |
WO2013032955A1 (en) * | 2011-08-26 | 2013-03-07 | Reincloud Corporation | Equipment, systems and methods for navigating through multiple reality models |
US8611015B2 (en) | 2011-11-22 | 2013-12-17 | Google Inc. | User interface |
FR2976149A1 (en) * | 2011-12-13 | 2012-12-07 | Thomson Licensing | Device for obtaining e.g. TV program available in different versions corresponding to different transmission bit rates for transmission to content receiver, has sensor, where content version is adapted to instruct content receiver |
EP2615834A1 (en) | 2012-01-16 | 2013-07-17 | Thomson Licensing | Dealiasing method and device for 3D view synthesis |
US8396983B1 (en) * | 2012-03-13 | 2013-03-12 | Google Inc. | Predictive adaptive media streaming |
US9082011B2 (en) | 2012-03-28 | 2015-07-14 | Texas State University—San Marcos | Person identification using ocular biometrics with liveness detection |
US20130271565A1 (en) | 2012-04-16 | 2013-10-17 | Qualcomm Incorporated | View synthesis based on asymmetric texture and depth resolutions |
US20150172544A1 (en) | 2012-07-04 | 2015-06-18 | Zhipin Deng | Panorama based 3d video coding |
US20150193395A1 (en) * | 2012-07-30 | 2015-07-09 | Google Inc. | Predictive link pre-loading |
US9164580B2 (en) * | 2012-08-24 | 2015-10-20 | Microsoft Technology Licensing, Llc | Calibration of eye tracking system |
US20140087877A1 (en) | 2012-09-27 | 2014-03-27 | Sony Computer Entertainment Inc. | Compositing interactive video game graphics with pre-recorded background video content |
US9176581B2 (en) * | 2012-09-28 | 2015-11-03 | Intel Corporation | System and method for inferring user intent based on eye movement during observation of a display screen |
JP2014072608A (en) | 2012-09-28 | 2014-04-21 | Brother Ind Ltd | Information processing system, information processing device, display device, and program |
JP5923021B2 (en) | 2012-10-05 | 2016-05-24 | 日本電信電話株式会社 | Video viewing history analysis device, video viewing history analysis method, and video viewing history analysis program |
US9886177B2 (en) * | 2012-10-11 | 2018-02-06 | Industry-Academic Cooperation Foundation, Yonsei University | Method for increasing GUI response speed of user device through data preloading, and said user device |
WO2014061017A1 (en) * | 2012-10-15 | 2014-04-24 | Umoove Services Ltd. | System and method for content provision using gaze analysis |
GB2509953B (en) | 2013-01-18 | 2015-05-20 | Canon Kk | Method of displaying a region of interest in a video stream |
US9665171B1 (en) | 2013-03-04 | 2017-05-30 | Tobii Ab | Gaze and saccade based graphical manipulation |
US9948970B2 (en) | 2013-03-15 | 2018-04-17 | Cox Communications, Inc. | Systems, methods, and apparatus for accessing recordings of content items on multiple customer devices |
AU2013206560A1 (en) | 2013-06-27 | 2015-01-22 | Canon Kabushiki Kaisha | Method, system and apparatus for rendering |
US20150142884A1 (en) | 2013-11-21 | 2015-05-21 | Microsoft Corporation | Image Sharing for Online Collaborations |
JP6407526B2 (en) | 2013-12-17 | 2018-10-17 | キヤノンメディカルシステムズ株式会社 | Medical information processing system, medical information processing method, and information processing system |
EP2894852A1 (en) | 2014-01-14 | 2015-07-15 | Alcatel Lucent | Process for increasing the quality of experience for users that watch on their terminals a high definition video stream |
US9313481B2 (en) | 2014-02-19 | 2016-04-12 | Microsoft Technology Licensing, Llc | Stereoscopic display responsive to focal-point shift |
US10264211B2 (en) | 2014-03-14 | 2019-04-16 | Comcast Cable Communications, Llc | Adaptive resolution in software applications based on dynamic eye tracking |
US9530450B2 (en) * | 2014-03-18 | 2016-12-27 | Vixs Systems, Inc. | Video system with fovea tracking and methods for use therewith |
US9462230B1 (en) | 2014-03-31 | 2016-10-04 | Amazon Technologies | Catch-up video buffering |
EP3149937A4 (en) | 2014-05-29 | 2018-01-10 | NEXTVR Inc. | Methods and apparatus for delivering content and/or playing back content |
US10204658B2 (en) | 2014-07-14 | 2019-02-12 | Sony Interactive Entertainment Inc. | System and method for use in playing back panorama video content |
US9552062B2 (en) | 2014-09-05 | 2017-01-24 | Echostar Uk Holdings Limited | Gaze-based security |
US10007333B2 (en) * | 2014-11-07 | 2018-06-26 | Eye Labs, LLC | High resolution perception of content in a wide field of view of a head-mounted display |
US9877016B2 (en) | 2015-05-27 | 2018-01-23 | Google Llc | Omnistereo capture and render of panoramic virtual reality content |
DE112016002377T5 (en) | 2015-05-27 | 2018-02-08 | Google Llc | STREAMS OF SPHERICAL VIDEO |
IN2015CH02866A (en) | 2015-06-09 | 2015-07-17 | Wipro Ltd | |
US9704298B2 (en) | 2015-06-23 | 2017-07-11 | Paofit Holdings Pte Ltd. | Systems and methods for generating 360 degree mixed reality environments |
US9681046B2 (en) * | 2015-06-30 | 2017-06-13 | Gopro, Inc. | Image stitching in a multi-camera array |
US9857871B2 (en) | 2015-09-04 | 2018-01-02 | Sony Interactive Entertainment Inc. | Apparatus and method for dynamic graphics rendering based on saccade detection |
US10099122B2 (en) | 2016-03-30 | 2018-10-16 | Sony Interactive Entertainment Inc. | Head-mounted display tracking |
US10462466B2 (en) * | 2016-06-20 | 2019-10-29 | Gopro, Inc. | Systems and methods for spatially selective video coding |
US10095937B2 (en) | 2016-06-21 | 2018-10-09 | GM Global Technology Operations LLC | Apparatus and method for predicting targets of visual attention |
US11089280B2 (en) | 2016-06-30 | 2021-08-10 | Sony Interactive Entertainment Inc. | Apparatus and method for capturing and displaying segmented content |
KR102560029B1 (en) | 2016-09-12 | 2023-07-26 | 삼성전자주식회사 | A method and apparatus for transmitting and receiving virtual reality content |
US10341658B2 (en) | 2017-01-30 | 2019-07-02 | Intel Corporation | Motion, coding, and application aware temporal and spatial filtering for video pre-processing |
- 2016
  - 2016-09-29 US US15/280,933 patent/US11089280B2/en active Active
  - 2016-09-29 US US15/280,947 patent/US20180007422A1/en not_active Abandoned
  - 2016-09-29 US US15/280,962 patent/US10805592B2/en active Active
- 2017
  - 2017-05-30 KR KR1020207037655A patent/KR102294098B1/en active IP Right Grant
  - 2017-05-30 EP EP17820807.0A patent/EP3479574A4/en active Pending
  - 2017-05-30 JP JP2018568224A patent/JP6859372B2/en active Active
  - 2017-05-30 JP JP2018568225A patent/JP6686186B2/en active Active
  - 2017-05-30 CN CN201780039518.8A patent/CN109416931B/en active Active
  - 2017-05-30 CN CN201780039760.5A patent/CN109417624B/en active Active
  - 2017-05-30 EP EP17820805.4A patent/EP3479257A4/en active Pending
  - 2017-05-30 KR KR1020197003058A patent/KR20190022851A/en active Application Filing
- 2020
  - 2020-04-01 JP JP2020065716A patent/JP6944564B2/en active Active
- 2021
  - 2021-03-25 JP JP2021051166A patent/JP7029562B2/en active Active
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030011619A1 (en) * | 1997-10-08 | 2003-01-16 | Robert S. Jacobs | Synchronization and blending of plural images into a seamless combined image |
US20070188521A1 (en) * | 2006-02-15 | 2007-08-16 | Miller Steven D | Method and apparatus for three dimensional blending |
US20090129693A1 (en) * | 2007-11-15 | 2009-05-21 | Bloebaum L Scott | System and method for generating a photograph with variable image quality |
US20090273710A1 (en) * | 2008-04-30 | 2009-11-05 | Larry Pearlstein | Image processing methods and systems for frame rate conversion |
US20120146891A1 (en) * | 2010-12-08 | 2012-06-14 | Sony Computer Entertainment Inc. | Adaptive displays using gaze tracking |
US20120170642A1 (en) * | 2011-01-05 | 2012-07-05 | Rovi Technologies Corporation | Systems and methods for encoding trick play streams for performing smooth visual search of media encoded for adaptive bitrate streaming via hypertext transfer protocol |
US20130293672A1 (en) * | 2011-02-10 | 2013-11-07 | Panasonic Corporation | Display device, computer program, and computer-implemented method |
US20120265856A1 (en) * | 2011-04-18 | 2012-10-18 | Cisco Technology, Inc. | System and method for data streaming in a computer network |
US8990682B1 (en) * | 2011-10-05 | 2015-03-24 | Google Inc. | Methods and devices for rendering interactions between virtual and physical objects on a substantially transparent display |
US20140123162A1 (en) * | 2012-10-26 | 2014-05-01 | Mobitv, Inc. | Eye tracking based defocusing |
US20150061995A1 (en) * | 2013-09-03 | 2015-03-05 | Tobii Technology AB | Portable eye tracking device |
US20160048964A1 (en) * | 2014-08-13 | 2016-02-18 | Empire Technology Development Llc | Scene analysis for improved eye tracking |
US9876780B2 (en) * | 2014-11-21 | 2018-01-23 | Sonos, Inc. | Sharing access to a media service |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11120837B2 (en) | 2014-07-14 | 2021-09-14 | Sony Interactive Entertainment Inc. | System and method for use in playing back panorama video content |
US10204658B2 (en) | 2014-07-14 | 2019-02-12 | Sony Interactive Entertainment Inc. | System and method for use in playing back panorama video content |
US10291845B2 (en) * | 2015-08-17 | 2019-05-14 | Nokia Technologies Oy | Method, apparatus, and computer program product for personalized depth of field omnidirectional video |
US10805592B2 (en) | 2016-06-30 | 2020-10-13 | Sony Interactive Entertainment Inc. | Apparatus and method for gaze tracking |
US11089280B2 (en) | 2016-06-30 | 2021-08-10 | Sony Interactive Entertainment Inc. | Apparatus and method for capturing and displaying segmented content |
US10911823B2 (en) * | 2016-12-12 | 2021-02-02 | Zte Corporation | Media information processing method, apparatus and system |
US11290699B2 (en) * | 2016-12-19 | 2022-03-29 | Dolby Laboratories Licensing Corporation | View direction based multilevel low bandwidth techniques to support individual user experiences of omnidirectional video |
US20180176535A1 (en) * | 2016-12-19 | 2018-06-21 | Dolby Laboratories Licensing Corporation | View Direction Based Multilevel Low Bandwidth Techniques to Support Individual User Experiences of Omnidirectional Video |
US10979663B2 (en) * | 2017-03-30 | 2021-04-13 | Yerba Buena Vr, Inc. | Methods and apparatuses for image processing to optimize image resolution and for optimizing video streaming bandwidth for VR videos |
US20190027120A1 (en) * | 2017-07-24 | 2019-01-24 | Arm Limited | Method of and data processing system for providing an output surface |
US11004427B2 (en) * | 2017-07-24 | 2021-05-11 | Arm Limited | Method of and data processing system for providing an output surface |
US10585277B2 (en) * | 2017-08-31 | 2020-03-10 | Tobii Ab | Systems and methods for tracking a gaze of a user across a multi-display arrangement |
US20190064513A1 (en) * | 2017-08-31 | 2019-02-28 | Tobii Ab | Systems and methods for tracking a gaze of a user across a multi-display arrangement |
US11048464B2 (en) * | 2018-07-31 | 2021-06-29 | Dell Products, L.P. | Synchronization and streaming of workspace contents with audio for collaborative virtual, augmented, and mixed reality (xR) applications |
US11656734B2 (en) | 2018-08-10 | 2023-05-23 | Sony Corporation | Method for mapping an object to a location in virtual space |
US10833945B2 (en) * | 2018-11-13 | 2020-11-10 | International Business Machines Corporation | Managing downloading of content |
CN113170234A (en) * | 2018-11-29 | 2021-07-23 | 苹果公司 | Adaptive encoding and streaming of multi-directional video |
US11627343B2 (en) | 2018-11-29 | 2023-04-11 | Apple Inc. | Adaptive coding and streaming of multi-directional video |
JP2022511838A (en) * | 2018-12-14 | 2022-02-01 | アドバンスト・マイクロ・ディバイシズ・インコーポレイテッド | Forbidden coding slice size map control |
JP7311600B2 (en) | 2018-12-14 | 2023-07-19 | アドバンスト・マイクロ・ディバイシズ・インコーポレイテッド | Slice size map control for foveated coding |
US11481026B2 (en) * | 2019-08-22 | 2022-10-25 | Samsung Electronics Co., Ltd. | Immersive device and method for streaming of immersive media |
US20210055787A1 (en) * | 2019-08-22 | 2021-02-25 | Samsung Electronics Co., Ltd. | Immersive device and method for streaming of immersive media |
US11284141B2 (en) | 2019-12-18 | 2022-03-22 | Yerba Buena Vr, Inc. | Methods and apparatuses for producing and consuming synchronized, immersive interactive video-centric experiences |
US11750864B2 (en) | 2019-12-18 | 2023-09-05 | Yerba Buena Vr, Inc. | Methods and apparatuses for ingesting one or more media assets across a video platform |
US20220103655A1 (en) * | 2020-09-29 | 2022-03-31 | International Business Machines Corporation | Proactively selecting virtual reality content contexts |
Also Published As
Publication number | Publication date |
---|---|
EP3479257A4 (en) | 2020-02-26 |
CN109417624B (en) | 2024-01-23 |
KR20190022851A (en) | 2019-03-06 |
JP6944564B2 (en) | 2021-10-06 |
JP7029562B2 (en) | 2022-03-03 |
JP2021103327A (en) | 2021-07-15 |
US11089280B2 (en) | 2021-08-10 |
KR20210000761A (en) | 2021-01-05 |
JP6686186B2 (en) | 2020-04-22 |
CN109417624A (en) | 2019-03-01 |
EP3479574A4 (en) | 2020-02-26 |
JP2019521388A (en) | 2019-07-25 |
JP2019525305A (en) | 2019-09-05 |
EP3479257A1 (en) | 2019-05-08 |
JP2020123962A (en) | 2020-08-13 |
EP3479574A1 (en) | 2019-05-08 |
CN109416931A (en) | 2019-03-01 |
CN109416931B (en) | 2023-05-16 |
KR102294098B1 (en) | 2021-08-26 |
US10805592B2 (en) | 2020-10-13 |
US20180004285A1 (en) | 2018-01-04 |
JP6859372B2 (en) | 2021-04-14 |
US20180007339A1 (en) | 2018-01-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102294098B1 (en) | Apparatus and method for providing and displaying content | |
US10536693B2 (en) | Analytic reprocessing for data stream system and method | |
US20160277772A1 (en) | Reduced bit rate immersive video | |
Fan et al. | A survey on 360 video streaming: Acquisition, transmission, and display | |
US11653065B2 (en) | Content based stream splitting of video data | |
US11290699B2 (en) | View direction based multilevel low bandwidth techniques to support individual user experiences of omnidirectional video | |
US11284124B2 (en) | Spatially tiled omnidirectional video streaming | |
CN112204993B (en) | Adaptive panoramic video streaming using overlapping partitioned segments | |
US11627343B2 (en) | Adaptive coding and streaming of multi-directional video | |
US10769754B2 (en) | Virtual reality cinema-immersive movie watching for headmounted displays | |
US20190104330A1 (en) | Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices | |
WO2018004936A1 (en) | Apparatus and method for providing and displaying content | |
US11120615B2 (en) | Dynamic rendering of low frequency objects in a virtual reality system | |
KR102183895B1 (en) | Indexing of tiles for region of interest in virtual reality video streaming | |
CN116848840A (en) | Multi-view video streaming |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: SONY INTERACTIVE ENTERTAINMENT INC., JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CASTLEMAN, DENNIS D.;REEL/FRAME:039969/0239 Effective date: 20161004 |
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |