CA2824708C - Video content generation

Info

Publication number: CA2824708C
Application number: CA2824708A
Authority: CA
Inventors: Austin J. Vrbas; Danial E. Holden
Original assignee: Comcast Cable Communications LLC
Current assignee: Comcast Cable Communications LLC
Priority date: 2011-01-14
Filing date: 2011-01-14
Publication date: 2018-08-21
Anticipated expiration: 2031-01-14
Also published as: WO2012096674A1; EP2664156A1; EP2664156A4; CA2824708A1

Abstract

The features relate generally to acquiring, formatting, and distributing video content to a variety of devices in multiple geographies. Some features described herein relate to preserving the stereoscopic effect for the formatting and distribution of 3D video content. Additionally, features described herein relate to customized and/or dynamic generation of a 3D stream based on a user device and/or user preferences.

Description

VIDEO CONTENT GENERATION
[01] <this paragraph intentionally left blank>

[02] <this paragraph intentionally left blank>
BACKGROUND

[03] As more video content becomes available in both two dimensional (2D) and three dimensional (3D) appearances, more demands are placed on service providers to provide users with access to the video content in a number of different manners and to a number of different users in different formats across different service providers.

[04] With the emergence of 3D video content offerings by service providers to viewers, live events, such as sporting events, may be captured with stereoscopic cameras and production equipment. There will always be a demand for more 3D video content offerings to more viewers across more service providers. As some service providers gain exclusive access rights for video content distribution of certain content, such as 3D content, a demand is placed for creating architectures to allow other service providers to access and offer such video content to viewers.

SUMMARY
[05] The features described herein relate generally to acquiring, formatting, and distributing live video content and other content signals to a variety of devices in multiple geographies. Some features described herein relate to preserving stereoscopic effect for formatting and distribution of live 3D video content. Additionally, some features described herein relate to customized and/or dynamic generation of a 3D stream based on a user device and/or user preferences.
[06] Aspects of the present disclosure describe a stereoscopic production solution, e.g., for live events, that provides 3D video asset distribution to multiple devices and networks. The production solution centralizes stereoscopic signal multiplexing, audio synchronization, compression, and file capture for distribution over networks such as broadband networks. In some embodiments, live or recorded 3D video content may be accessible by different service providers with different subscribers/users and protocols across a network of the content provider.
[07] The preceding presents a simplified summary in order to provide a basic understanding of some aspects of the disclosure. The summary is not an extensive overview of the disclosure. It is neither intended to identify key or critical elements of the disclosure nor to delineate the scope of the disclosure. The summary merely presents some concepts of the disclosure in a simplified form as a prelude to the description below.
BRIEF DESCRIPTION OF THE DRAWINGS
[08] The present disclosure is illustrated by way of example, and not by way of limitation, in the accompanying figures, in which like reference numerals indicate similar elements and in which:
[09] Figure 1 illustrates an example communication or content distribution network in accordance with one or more aspects of the present disclosure.
[10] Figure 2 illustrates an example hardware architecture that can be used to implement various features of the disclosure.
[11] Figure 3 illustrates an example system for distribution of content over a plurality of networks in accordance with one or more aspects of the present disclosure.
[12] Figure 4 illustrates an example video capture and distribution process in accordance with one or more aspects of the present disclosure.
[13] Figure 5 illustrates an example video encoding process in accordance with one or more aspects of the present disclosure.
[14] Figure 6 illustrates an example video distribution process for a plurality of providers in accordance with one or more aspects of the present disclosure.
[15] Figure 7 illustrates an example communication video-on-demand (VOD) distribution network in accordance with one or more aspects of the present disclosure.

[16] Figure 8 illustrates an example synchronization and encoding process in accordance with one or more aspects of the present disclosure.
DETAILED DESCRIPTION
[17] In the following description of various illustrative embodiments, reference is made to the accompanying drawings, which form a part hereof, and in which is shown, by way of illustration, various embodiments in which aspects of the disclosure may be practiced. It is to be understood that other embodiments may be utilized and structural and functional modifications may be made, without departing from the scope of the present disclosure.
[18] Aspects of the present disclosure describe a distribution architecture that allows client adaptation (multiple CODECs, multiple bitrates, multiple transports), higher reliability by, e.g., placing critical components in monitored facilities with uninterrupted power sources, and redundancy.
[19] Aspects of the disclosure may be made operational with numerous general purpose or special purpose computing system environments or configurations. Examples of computing systems, environments, and/or configurations that may be suitable for use with features described herein include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, digital video recorders, programmable consumer electronics, Internet connectable display devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
[20] The features may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
Generally, program modules may include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Features herein may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices. Although described in relation to IP video, concepts of the present disclosure may be implemented for any format capable of carrying 3D video content.
[21] Figure 1 illustrates an example communication distribution network in accordance with one or more aspects of the present disclosure. Aspects of the network allow for streaming of 3D video content over a packet switched network, such as the Internet. One or more aspects of the network are adapted to deliver 3D stereoscopic content to Internet (or another public or local network) connected display devices. Still other aspects of the network adapt stereoscopic content to a variety of network interface device technologies, including devices capable of rendering two dimensional (2D) and 3D content.

[22] 3D video content, including live 3D video content, may be created and/or offered by one or more 3D content sources 100. The sources 100 may capture 3D video content using one or more cameras 101A and 101B. Cameras 101A and/or 101B may be any of a number of cameras that are configured to capture video content. In accordance with one or more aspects of the present disclosure, cameras 101A and 101B may be configured to capture video content for a left eye and a right eye, respectively, of an end viewer. The captured video content from cameras 101A and 101B may be used for generation of 3D video content for transmission, e.g., to an end user output device. In yet other configurations, a single camera 101A or 101B may be utilized for capturing of 2D video content. In such configurations, the captured video content from camera 101A or 101B may be used for generation of 2D video content for transmission to a user output device.
[23] The data output from the cameras 101A and/or 101B may be sent to an encoding (e.g., stereographer/production/video processing) system 102 for initial processing of the data. Such initial processing may include any of a number of processing operations on such video data, for example, cropping of the captured data, color enhancements to the captured data, and association of audio to the captured video content.
[24] An optional audio recording system 103 may capture audio associated with the video signal from the cameras 101A and 101B and generate a corresponding audio signal. Alternatively, cameras 101A/B may be adapted to capture audio. The audio captured may, for example, include spoken words in an audio track that accompanies the video stream and/or other audio associated with noises and/or other sounds. Audio recording system 103 may generate an audio signal that may be inserted, for example, at corresponding time sequences to the captured video signals in the encoding system 102.
[25] The audio track may be directly associated with the images captured in the video signal. For example, cameras 101A and/or 101B may capture and generate data of a video signal with an individual talking and the audio directly associated with the captured video may be spoken words by the individual talking in the video signal. Alternatively and/or concurrently, the audio track also may be indirectly associated with the video stream. In such an example, the cameras 101A and/or 101B may capture and generate data of a video signal for a news event and the audio indirectly associated with the captured video may be spoken words by a reporter not actually shown in the captured video.
[26] For example, data from the encoding system 102 may be 3D video content corresponding to two signals of live video content of a sporting event. Audio recording system 103 may be configured to capture and provide audio commentary of a sports analyst made during the live sporting event, for example, and encoding system 102 may encode the audio signal to one or more video signals generated from cameras 101A, 101B. Alternatively, the audio signal may be provided as a separate signal from the two video signals. The audio signal from the audio recording system 103 and/or the encoding system 102 may be sent to a stream generation system 104, to generate multiple digital datastreams (e.g., Internet Protocol streams) for the event captured by the cameras 101A, 101B.

[27] The stream generation system 104 may be configured to transmit at least two independent signals of captured and/or processed video data from cameras 101A and 101B. The data may be compressed in accordance with a particular protocol and/or for transmission of the signals across a particular type of infrastructure for data transmission. For example, compression of the video may allow for transmission of the captured video signals across a greater distance while using less bandwidth in comparison to no compression of the captured video signals. The audio signal added by the audio recording system 103 also may be multiplexed with one or both of the two video signals. As noted above, the generated signals may be in a digital format, such as an Internet Protocol (IP) encapsulated format. By transmitting each video signal for respective eyes of a viewer as two separate/independent signals, data integrity is better maintained and fewer resources are used. Independent or separate video signals, for example, have not been frame synced to each other between a video image and/or audio capturing location, e.g., an on-site location, and a central processing point in the infrastructure of a service provider, e.g., an off-site location. A single centralized processing point for video frame synchronization enables a service provider, for example, to capture and process video for 3D implementation with minimal equipment needed at on-site locations. In addition, a remote downstream processing point (e.g., at a video-on-demand (VOD) server) for video frame synchronization enables a service provider to capture video for 3D implementation with less equipment necessary at an on-site location. The processing for video frame synchronization may occur at multiple VOD servers downstream from a central office 106.
[28] The single or multiple encapsulated IP streams may be sent via a network 105 to any desired location. In the example of captured 3D video with camera 101A capturing a video signal for one eye of a viewer and camera 101B capturing a video signal for the other eye of the viewer, two separate or independent encapsulated IP streams may be sent via the network 105 to a desired location, such as central office 106. The network 105 can be any type of communication network, such as satellite, fiber optic, coaxial cable, cellular telephone, wireless (e.g., WiMAX), twisted pair telephone, etc., or any combination thereof. In some embodiments, a service provider's central office 106 may make the content available to users.
[29] The central office 106 may include, for example, a decoding system 107 including one or more decoders for decoding received video and/or audio signals.
In the example configuration for processing 3D video signals, decoding system 107 may be configured to include at least two decoders for decoding two video signals independently transmitted from the stream generation system 104.
Decoding system 107 may be configured to receive the two encoded video signals independently. Decoding system 107 may further be configured to decompress the received video signals, if needed, and may be configured to decode the two signals as the first and second video signals corresponding to the video signal captured for the left eye of a viewer and the video signal captured for the right eye of the viewer.

[30] In the case of an audio signal being transmitted with the associated video signals, decoding system 107 further may be configured to decode the audio signal. In one example, an audio signal corresponding to audio captured with associated video content may be received as part of one of the two video signals received from a stream generation system 104. In such an example, one of the video signals may be for the right eye of a viewer and the other video signal may be for the left eye. The audio signal may be received as an encoded combined signal with the left and/or right eye signal.
[31] Upon receipt of the two, or more, video signals, e.g., one for the right eye of a viewer and the other as a combined signal, one signal being for the left eye of the viewer and one being the audio signal, decoding system 107 may be configured to decode the first video signal, e.g., the video signal for the right eye of the viewer, and to decode the second video signal, e.g., the combined signal. The combined signal may be decoded to the video signal for the left eye of the viewer and the audio signal.
[32] The two, or more, video signals received by decoding system 107 may be compressed video signals. The two video signals may be compressed, for example, for transmission purposes in order to reduce the use of bandwidth and/or to operate with a service provider's infrastructure for transmission of video signals. In such examples where the video signals are compressed, decoding system 107 may be configured to decompress the video signals prior to, concurrent with, and/or after decoding the video signals.
[33] Operatively connected to decoding system 107 may be a frame syncing system 108, which may be combined as a computing device as depicted in Figure 2 (discussed below). Frame syncing system 108 may be configured to compare time codes for each frame of video content in the first video signal with those for each frame of video content in the second signal. The syncing system 108 may match frames by time codes to produce a frame synced video signal in which each frame contains the left and right eye data, e.g., images, which occur at the same time in the video program. In the example of 3D video content for viewers, a frame synced video signal may be utilized by an output device of a viewer. The output device may output the frame synced video signal in a manner appropriate for a corresponding viewing device to render the video as a 3D video appearance. The resulting output from the syncing system 108 may be a single stream of the frame synced signal. The left and right eye video may drift during transport. As long as the drift is consistent, it may be corrected on the receive side of the transport.
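The time-code matching described in paragraph [33] can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the `Frame` record, its integer `timecode` field, and the `frame_sync` function are hypothetical names introduced only for illustration, and frames without a partner in the other stream are simply dropped here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Frame:
    timecode: int  # hypothetical: an integer time code carried with each frame
    image: str     # stand-in for decoded picture data


def frame_sync(left: List[Frame], right: List[Frame]) -> List[Tuple[Frame, Frame]]:
    """Pair left-eye and right-eye frames that carry the same time code."""
    right_by_timecode: Dict[int, Frame] = {f.timecode: f for f in right}
    synced: List[Tuple[Frame, Frame]] = []
    for left_frame in left:
        right_frame = right_by_timecode.get(left_frame.timecode)
        if right_frame is not None:  # frames with no partner are dropped
            synced.append((left_frame, right_frame))
    return synced


# Example: the right-eye stream has drifted by one frame relative to the left.
left = [Frame(tc, f"L{tc}") for tc in range(0, 4)]
right = [Frame(tc, f"R{tc}") for tc in range(1, 5)]
pairs = frame_sync(left, right)
```

Because pairing is keyed on the time code rather than on arrival order, a consistent drift between the two streams does not affect which frames end up together.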
[34] For example, a viewer may utilize an active shutter headgear/eye gear that reads a video signal from an output device as an over/under format. In such an example, the active shutter headgear may be configured to close the shutters for one eye and open the shutters of the other eye of the headgear per respective frame of video content. As such, an appearance of 3D images may be created for a viewer.
[35] Options for methods of frame syncing a first video signal with a second video signal include, but are not limited to: over/under syncing (e.g., top/bottom); side-by-side full syncing; alternative syncing (e.g., interlaced); frame packing syncing (e.g., a full resolution top/bottom format); checkerboard syncing; line alternative full syncing; side-by-side half syncing; and 2D+depth syncing. These example methods are illustrative and additional methods may be utilized in accordance with aspects of the disclosure herein.
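Three of the layouts named in paragraph [35] can be sketched as follows. This is an illustrative assumption about how such packing might be written, not the disclosed method: a frame is modeled as a list of pixel rows, the function names are hypothetical, and the half-resolution variant decimates columns naively (a production system would typically filter before discarding samples).

```python
from typing import List

Image = List[List[int]]  # hypothetical model: a frame as rows of pixel values


def _check_same_size(left: Image, right: Image) -> None:
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right frames must have the same dimensions")


def side_by_side_full(left: Image, right: Image) -> Image:
    """Place both full-width views next to each other (output is twice as wide)."""
    _check_same_size(left, right)
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def side_by_side_half(left: Image, right: Image) -> Image:
    """Keep every other column of each view so the output keeps the original width."""
    _check_same_size(left, right)
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


def top_bottom_full(left: Image, right: Image) -> Image:
    """Stack the full-height left view above the right view."""
    _check_same_size(left, right)
    return [row[:] for row in left] + [row[:] for row in right]


left_eye = [[1, 2, 3, 4], [5, 6, 7, 8]]
right_eye = [[9, 10, 11, 12], [13, 14, 15, 16]]
packed = side_by_side_half(left_eye, right_eye)
```

The trade-off between the variants is resolution against bandwidth: the full layouts preserve every sample of both views, while the half layout fits both views into the footprint of a single 2D frame.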
[36] In the example of an audio signal included with one or both of the video signals as a combined signal, upon decoding of the audio signal by decoding system 107, frame syncing system 108 may be configured to sync the audio signal with the frame synced video signal. The process of syncing the audio signal by frame syncing system 108 may include identifying a time sequence of the frame synced video signal at which to insert the corresponding audio signal. In an example where only video for the right eye of the viewer is compressed by the right eye encoder and where video for the left eye of the viewer and audio are both compressed by the left eye encoder, the left eye encoder runs slower than the right eye encoder. The extra processing of audio on the left eye path results in this stream taking longer to compress. On the receive site, all of the video and audio may be decompressed and then reassembled as the independent left eye video, right eye video, and audio. Because the left eye encoder takes additional processing time, the resulting delta in delivery times between the left eye and right eye delivery paths may be compensated for.
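One way to picture the compensation mentioned in paragraph [36] is to measure the constant delivery delta between the two paths and hold the faster path back by that amount. The sketch below rests on assumptions not found in the disclosure: arrival times are modeled as integer ticks keyed by time code, the left path is taken to be the slower one, and `path_delta` and `DelayLine` are hypothetical names.

```python
from collections import deque
from typing import Dict, Optional


def path_delta(left_arrival: Dict[int, int], right_arrival: Dict[int, int]) -> int:
    """Return how many ticks later the left path delivers each shared time code."""
    shared = set(left_arrival) & set(right_arrival)
    if not shared:
        raise ValueError("no shared time codes to compare")
    deltas = {left_arrival[tc] - right_arrival[tc] for tc in shared}
    if len(deltas) != 1:
        raise ValueError("delta is not consistent across frames")
    return deltas.pop()


class DelayLine:
    """Releases each pushed item only after `delay` further pushes."""

    def __init__(self, delay: int) -> None:
        if delay < 0:
            raise ValueError("delay must be non-negative")
        self._buffer = deque([None] * delay)

    def push(self, item: str) -> Optional[str]:
        self._buffer.append(item)
        return self._buffer.popleft()


# Left path (video plus audio) arrives two ticks after the right path.
delta = path_delta({0: 5, 1: 6, 2: 7}, {0: 3, 1: 4, 2: 5})
right_delay = DelayLine(delta)
released = [right_delay.push(name) for name in ["R0", "R1", "R2", "R3"]]
```

With the right-eye frames delayed by the measured delta, each one is released on the same tick that its left-eye counterpart arrives.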
[37] Operatively connected to frame syncing system 108 may be a video formatting system 109. Video formatting system 109 may be configured to receive a frame synced video signal from the frame syncing system 108. The frame synced video signal includes a first video signal corresponding to a first video feed for one eye of a viewer and a second video signal corresponding to a second video feed for the other eye of the viewer. Video formatting system 109 may compress the frame synced video signal into a plurality of compressed frame synced video signals. Each compressed frame synced video signal may be compressed according to a different format, e.g., for different transmission systems and/or different user devices.
[38] In the example of Figure 1, video formatting system 109 may include three different compression format devices for compressing a frame synced video signal before transmission across a network 110. The three different compression format devices may be, for example, H.264 component 111A, MPEG2 component 111B, and Windows Media 9 component 111C. By taking the output of the event back to baseband, the video may be multicast to multiple encoding and distribution platforms using IP technology. Video is multicast using IP to multiple video compression platforms. Previous technologies would have used an SDI router to deliver the video to multiple compression platforms. By using IP multicast, it is possible to feed the video to multiple compression and transport platforms. The present disclosure is not limited to three video CODECs. Using aspects of the present disclosure, it is possible to add any number of CODECs. Others include Flash, On2, MPEG-1, Smooth, and Zen. Additional and/or alternative format devices or systems may be included as desired or required.
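The fan-out in paragraph [38], where one frame synced signal feeds any number of compression components, might be modeled as a registry of encoders. This is only a sketch under stated assumptions: the `VideoFormattingSystem` class and its methods are hypothetical, and the registered "encoders" are placeholder functions that tag the input bytes rather than real H.264, MPEG2, or Windows Media 9 codecs.

```python
from typing import Callable, Dict

Encoder = Callable[[bytes], bytes]


class VideoFormattingSystem:
    """Feeds one frame synced signal to every registered compression component."""

    def __init__(self) -> None:
        self._encoders: Dict[str, Encoder] = {}

    def register(self, name: str, encoder: Encoder) -> None:
        if name in self._encoders:
            raise ValueError(f"encoder {name!r} is already registered")
        self._encoders[name] = encoder

    def encode_all(self, frame_synced: bytes) -> Dict[str, bytes]:
        # Every component receives the same input, as in a multicast feed.
        return {name: encode(frame_synced) for name, encode in self._encoders.items()}


def placeholder_encoder(tag: bytes) -> Encoder:
    """Build a stand-in encoder that only prefixes a tag (no real compression)."""
    return lambda data: tag + b":" + data


system = VideoFormattingSystem()
system.register("H.264", placeholder_encoder(b"h264"))
system.register("MPEG2", placeholder_encoder(b"mpeg2"))
system.register("Windows Media 9", placeholder_encoder(b"wm9"))
outputs = system.encode_all(b"frame")
```

Adding another CODEC in this model is a single further `register` call, which mirrors the statement that the arrangement is not limited to three components.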
[39] Component 111A may be configured to compress the frame synced video signal to H.264 format, for example, which may be a format often utilized for set-top boxes of users of service providers. Component 111B may be configured to compress the frame synced video signal to MPEG2 format, for example, which may be a format often utilized for home computers of users, e.g., subscribers, of service providers. Component 111C may be configured to compress the frame synced video signal to Windows Media 9 format, for example, which may be a format often utilized for streaming video to users. Although Figure 1 shows H.264 component, MPEG2 component and Windows Media 9 component as three illustrative compression format devices, the disclosure is not so limited and may include any format of compression of video and number of components.
The present disclosure should not be interpreted as being limited to the examples provided herein.
[40] Video formatting system 109 may be configured to encode the 3D video content for a plurality of different formats for different end devices that may receive and output or render the 3D video content. Video formatting system 109 may be configured to generate a plurality of Internet protocol (IP) streams of encoded 3D video content specifically encoded for the different formats for rendering.
[41] The different formats may correspond to different types of rendering/display devices that a user would use to view the 3D content. For example, one set of two of the IP streams may be for rendering the 3D video content on a display being utilized by a polarized headgear system, while another set of two of the IP streams may be for rendering the 3D video content on a display being utilized by an anaglyph headgear system. Any of a number of technologies for rendering and/or viewing rendered 3D video content may be utilized in accordance with the concepts disclosed herein. Although anaglyph and polarized headgear are used as examples herein, other 3D headgear types or display types may be used as well, such as active shutter and dichromic gear.
[42] Video formatting system 109 may be connected to network 110, which can be any type of communication network, such as satellite, fiber optic, coaxial cable, cellular telephone, wireless (e.g., WiMAX), twisted pair telephone, etc., or any combination thereof. Video formatting system 109 may be configured to transmit the compressed frame synced video signals from components 111A, 111B, and 111C over the network 110. As such, a frame synced video signal according to any particularly desired compression format may be transmitted to any desired location. In some embodiments, network 110 may be operatively connected to, may incorporate one or more components of, or may be network 105.
[43] In some examples, a home of a user may be configured to receive data from network 110. The home of the user may include a home network configured to receive encapsulated 3D video content and distribute such to one or more viewing devices, such as televisions, computers, mobile video devices, 3D headsets, etc.
For example, 3D video content may be configured for operation with a polarized lens headgear system. As such, a viewing device or centralized server may be configured to recognize and/or interface with the polarized lens headgear system to render an appropriate 3D video image for display.
[44] In other examples, a computer may be configured to receive data from network 110 as streaming video. The computer may include a network connection to a closed system configured to receive encapsulated 3D video content and distribute such to one or more viewing devices that operate within a closed network. Such an example may be an intranet network.
[45] Figure 2 illustrates general hardware elements that can be used to implement any of the various computing devices discussed herein. The computing device 200 may include one or more processors 201, which may execute instructions of a computer program to perform any of the features described herein. The instructions may be stored in any type of non-transitory computer-readable medium or memory (e.g., disk drive or flash memory), to configure the operation of the processor 201.
For example, instructions may be stored in a read-only memory (ROM) 202, random access memory (RAM) 203, removable media 204, such as a Universal Serial Bus (USB) drive, compact disk (CD) or digital versatile disk (DVD), floppy disk drive, or any other desired electronic storage medium. Instructions may also be stored in an attached (or internal) storage 205 (e.g., hard drive, flash, etc.). The computing device 200 may include one or more output devices, such as a display 206 (or an external television), and may include one or more output device controllers 207, such as a video processor. There may also be one or more user input devices 208, such as a remote control, keyboard, mouse, touch screen, microphone, etc. The computing device 200 may also include one or more network interfaces, such as input/output circuits 209 (such as a network card) to communicate with an external network 210. The network interface may be a wired interface, wireless interface, or a combination of the two. In some embodiments, the interface 209 may include a device such as a modem (e.g., a cable modem), and network 210 may include the external network 110, an in-home network, a provider's wireless, coaxial, fiber, or hybrid fiber/coaxial distribution system (e.g., a DOCSIS network), or any other desired network.
[46] The Figure 2 example is an example hardware configuration. Modifications may be made to add, remove, combine, divide, etc. components as desired.
Additionally, the components illustrated may be implemented using basic computing devices and components, and the same components (e.g., processor 201, storage 202, user interface 205, etc.) may be used to implement any of the other computing devices and components described herein. For example, the various components herein may be implemented using computing devices having components such as a processor executing computer-executable instructions stored on a computer-readable medium, as illustrated in Figure 2.
[47] One or more aspects of the disclosure may be embodied in a computer-usable data and/or computer-executable instructions, such as in one or more program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types when executed by a processor in a computer or other data processing device. The computer executable instructions may be stored on one or more computer readable media such as a hard disk, optical disk, removable storage media, solid state memory, RAM, etc. As will be appreciated by one of skill in the art, the functionality of the program modules may be combined or distributed as desired in various embodiments. In addition, the functionality may be embodied in whole or in part in firmware or hardware equivalents such as integrated circuits, field programmable gate arrays (FPGA), and the like. Particular data structures may be used to more effectively implement one or more aspects of the invention, and such data structures are contemplated within the scope of computer executable instructions and computer-usable data described herein.
[48] Figure 3 illustrates an example system for access or distribution of video content over, for example, a plurality of different service provider networks in accordance with one or more aspects of the present disclosure. The different service provider networks can serve the same or different geographic regions (e.g., one network 311 may serve users in Ohio while another network 331 may serve users in Pennsylvania), different access types (e.g., one network 311 may serve fiber optic/coaxial cable users in Pennsylvania, and another network 331 may serve wireless cellular customers in Pennsylvania), different IP services (e.g., two different web services offering streaming 3D video to their respective users), any other form of offering services to different users, and any combination or subcombination thereof. In the example of Figure 3, a network 301 of a first service provider is shown operatively connected to a plurality of networks of other service providers, including network 311 of service provider 310, network 321 of service provider 320, network 331 of service provider 330, and network 341 of service provider 340. Alternatively, networks 301, 311, 321, 331, and 341 may be operated by one service provider. Network 301 as described herein may be operatively connected to, may incorporate one or more components of, or may be network 110 described above with respect to Figure 1. Network 301 may operate as a video distribution system to the networks 311, 321, 331, and 341.

[49] Network 311 of service provider 310 may be a network for users of a competitor to the service provider transmitting data through network 301. In the example of a live 3D sporting event where the rights for capturing and distribution are owned by one service provider, competitor service providers may desire access to the same live 3D video and audio feeds. An arrangement may be in place between the service providers to allow access to the particular live 3D video content. However, in such situations, the transmission format for the network 311 may be different from the transmission format of the service provider providing the live 3D video signal through its network 301. A television 319 connected to the service provider 310 may be configured to receive and render 3D video content through, for example, a set-top box 317, a QAM 315, and a processing platform 313. Alternatively, this and other networks may communicate directly with a user's display device. Distribution through network 311 may require unicast transmissions while the backbone operation of network 301 may be a multicast transmission format. In accordance with the present disclosure, an IP network may be utilized for acquisition and distribution amongst various service providers.
[50] Service provider 320 may stream live video content to users through a URL. A user of service provider 320 may access live video content originating from network 301 of a different service provider at a computer 327, through, for example, a content delivery network 325, an encoder 323, and network 321. Again, in such situations, the transmission format for the network 321 through, e.g., the competitor service provider 320 may be different from the transmission format of the service provider providing the live video signal through its network 301.

[51] Like network 311 for service provider 310, network 331 of service provider 330 may be a network for users of a competitor to the service provider transmitting data through network 301. In such situations, the transmission format for the network 331 through the competitor service provider 330 may be different. A television 339 (or other display device) connected to the service provider 330 may be configured to receive and render video content through, for example, a set-top box 337, a QAM 335, and a processing platform 333.
[52] Service provider 340 may be for users in an international market, such as Europe. A user of service provider 340 may access video content originating from network 301 of a different service provider at a display device 347 connected to the service provider 340. The display device 347 may be configured to receive and render video content through a set-top box 345, a QAM 343, or directly from network 341. In this example, network 341 may comprise, in whole or in part, a wireless network (e.g., satellite, cellular, WiMax, etc.). The transmission format for the network 341 through the competitor service provider 340 may be different from the transmission format of the service provider providing the live video signal through its network 301.
[53] In one or more of the examples of Figure 3, the video signal (e.g., a live video signal) from network 310 may be a 3D video signal, such as for a concert, a political event, and/or a sporting event. In some examples of Figure 3, the live video signal from network 310 may include an audio signal associated with the live video signal. The audio signal may be directly and/or indirectly associated with the live video signal. Still further, one or more features of the distribution of the original live 3D video content from a source to network 110 in Figure 1 may be utilized as the live video content distributed from network 310 in Figure 3.
[54] Figure 4 illustrates an example video distribution process in accordance with one or more aspects of the present disclosure. The various steps may be performed by different entities in the system (e.g., the cameras 101A, 101B, encoding system 102, audio recording device 103, stream generator 104, decoding system 107, frame syncing system 108, and video formatting system 109) as discussed herein.
In step 401, a video feed for the right eye of a viewer may be captured by, for example, the camera 101A. In what may be a concurrent step, in 403, a video feed for the left eye of the viewer may be captured by, for example, camera 101B.
In yet another step that may be concurrent, in 405, an audio feed associated with the video feeds may be captured by, for example, audio recording device 103.
[55] In step 407, the captured live video feed for the right eye of the viewer may be encoded by, for example, encoding system 102, and in step 409, the captured live video feed for the left eye of the viewer may be encoded with the audio feed into an encoded combined signal by, for example, encoding system 102. Alternatively, the audio feed may be encoded separately or with the right eye video feed.
Encoding of the video feeds may be desired in order to transport the video feeds from a first point to a second point over a known network configuration of a service provider to meet a specific frame size and format for a receiver. In 411, the encoded live video feed for the right eye of the viewer may be compressed, and in step 413, the encoded combined feed may also be compressed by, for example, the stream generator 104. Compression of the encoded video feeds may be desired in order to utilize less bandwidth in transmitting the encoded video feeds through a network.
[56] Proceeding to step 415, the compressed encoded live video feed for the right eye and the compressed encoded combined feed may be transmitted over a network, for example over network 105, as two independent signals. One benefit of transmission of the video feeds as two independent feeds, e.g., IP streams, is that it ensures a full resolution signal (e.g., originally captured resolution video feed for each eye, as opposed to a frame synced half resolution video for each eye) may be transmitted for later processing before output to a viewer. Such a delay of frame syncing the left eye and the right eye video off-site from a point of origination allows for centralized processing of the signals. Thus, resources for later processing to frame sync the signals are reduced to the central location.
[57] In step 417, which may be concurrently performed with step 419, the compressed encoded live video feed for the right eye of the viewer may be decompressed, and the compressed encoded combined feed may be decompressed, respectively, by, for example, decoding system 107. In step 421, the encoded live video feed for the right eye of the viewer may be decoded by, for example, decoding system 107, back to an original captured video format and in step 423, the encoded combined feed may be decoded into the live video feed for the left eye of the viewer and the audio feed by, for example, decoding system 107 back to an original captured video and audio format.

[58] Proceeding to step 425, the live feed for the right eye that was captured in step 401 and the live feed for the left eye that was captured in step 403 are ready for frame syncing. In 425, each frame of video content may be synced between the feed for the right eye and the feed for the left eye by, for example, frame syncing system 108. The frame syncing of the two signals may be performed in any of a number of different manners. For example, for each frame of video content, the live feed for the right eye may be reduced by 1/2 resolution and placed in the upper half of a frame synced video signal. The live video feed for the left eye also may be reduced by 1/2 resolution and placed in the lower half of the frame synced video signal.
Thus, the resulting frame synced video feed may be an over/under frame synced video feed. The live video feeds may be synced based upon a time sequence of the original recording of the live video feeds by the cameras. The live video feeds may have been marked at predefined intervals with a time code that corresponds to a particular time in the captured video feeds for the right eye and the left eye of the viewer.
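The over/under packing described above can be sketched as follows. This is a minimal illustration only: the frame representation (a list of pixel rows) and the function names are hypothetical, and dropping every other row is just one assumed way of reducing each eye's feed to 1/2 resolution; the disclosure does not specify a particular reduction method.

```python
from typing import List

# Hypothetical stand-in for a video frame: a list of rows of pixel values.
Frame = List[List[int]]


def halve_vertical(frame: Frame) -> Frame:
    """Reduce vertical resolution by 1/2 by keeping every other row (assumed method)."""
    return frame[::2]


def pack_over_under(right: Frame, left: Frame) -> Frame:
    """Place the half-height right-eye frame in the upper half and the left-eye frame below."""
    if len(right) != len(left) or len(right[0]) != len(left[0]):
        raise ValueError("left and right frames must have the same dimensions")
    return halve_vertical(right) + halve_vertical(left)


right_eye = [[1, 1], [2, 2], [3, 3], [4, 4]]
left_eye = [[5, 5], [6, 6], [7, 7], [8, 8]]
packed = pack_over_under(right_eye, left_eye)
```

The packed frame has the same dimensions as either input frame, which is why a frame synced signal of this kind carries only half the vertical resolution for each eye.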
[59] An end user's 3D viewing device may be configured to output the upper half of the frame synced video signal to a right eye of the viewer and to output the lower half of the frame synced video signal to a left eye of the viewer. Utilizing a frame synced video signal, the 3D viewing device (e.g., headgear) may create the appearance of live 3D video content for the viewer. As described herein, any of a number of different frame syncing formats may be utilized to frame sync the video feed for the right eye of a viewer with the live video feed for the left eye of the viewer and the present disclosure is not limited to any specific example herein.

[60] Moving to step 427, if an audio signal was included with one of the video feeds for the left eye and/or video feed for the right eye in a combined feed, the audio signal may be synced to the frame synced video feed by, for example, frame syncing system 108. The audio signal may be synced with the frame synced video signal based upon a time sequence of the original recording of the audio stream by the audio recording system with the original capturing of the live video feeds by the cameras. The audio signal may have been marked at predefined intervals with a time code that corresponds to a particular time in the captured video feeds for the right eye and the left eye of the viewer.
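Time-code-based syncing of the kind described in steps 425 and 427 can be sketched as a join on shared time codes. The data layout below (a mapping from integer time code to a sample label) and the function name are assumptions made for illustration; the disclosure does not prescribe a time code format.

```python
from typing import Dict, List, Tuple


def sync_by_time_code(
    right: Dict[int, str], left: Dict[int, str], audio: Dict[int, str]
) -> List[Tuple[int, str, str, str]]:
    """Join the three feeds on the time codes they all share, in time order."""
    shared = sorted(set(right) & set(left) & set(audio))
    return [(tc, right[tc], left[tc], audio[tc]) for tc in shared]


# Hypothetical feeds keyed by time code; the audio feed starts one interval late.
right_feed = {0: "R0", 1: "R1", 2: "R2"}
left_feed = {0: "L0", 1: "L1", 2: "L2"}
audio_feed = {1: "A1", 2: "A2", 3: "A3"}
synced = sync_by_time_code(right_feed, left_feed, audio_feed)
```

Only time codes present in every feed produce a synced entry in this sketch; a real system would also need a policy for samples that have no counterpart.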
[61] In alternative embodiments of the example process of Figure 4, two independent video feeds, one for the right eye of a viewer and one for the left eye of a viewer that were captured originally by two different cameras may be distributed and frame synced as a frame synced video feed without the inclusion of an audio signal. In such examples, in step 409, the video feed for the left eye of the viewer may be encoded without the inclusion of an audio feed. In step 413, the encoded live video feed for the left eye may be compressed, again without the inclusion of an audio feed. Transmission in step 415, decompression in step 419, and decoding in step 423 follow without the inclusion of any audio feed.
[62] In addition, compressing and decompressing feeds may be an optional step. One or more steps may be implemented and/or not included. Still further, although the example of Figure 4 illustrates an embodiment of encoding and transmitting an associated audio feed with the video feed for the left eye of a viewer, the associated audio feed may alternatively or concurrently be encoded and transmitted with the video feed for the right eye of the viewer.
[63] Figure 5 illustrates an example video encoding process in accordance with one or more aspects of the present disclosure. The various steps may be performed by different entities in the system (e.g., the cameras 101A, 101B, encoding system 102, audio recording device 103, stream generator 104, decoding system 107, frame syncing system 108, and video formatting system 109) as discussed herein.
In step 501, a frame synced live video feed with associated audio may be received by, for example, video formatting system 109. A service provider may want to distribute the frame synced live video feed in a number of different video formats in order to distribute the video content over a wider range of electronic devices configured to render the video content on an output device. For transmission through different networks and mediums, different compression formats may be needed.
[64] Proceeding to step 503, the received frame synced video feed (e.g., live video feed) may be compressed in accordance with a first video compression format by, for example, H.264 component 111A, MPEG2 component 111B, and/or Windows Media 9 component 111C. Any of a number of different video compression formats may be utilized herein and the present disclosure should not be interpreted as limited to those described. Example video compression formats that may be utilized include MPEG2, H.264, and Windows Media 9.

[65] The process then moves to step 505 where a determination may be made as to whether there are more compression formats associated with the distribution network of the service provider that are needed in order to transmit the frame synced live video feed to a larger pool of viewers utilizing different output devices for rendering the live video feed. If there is no other compression format needed, the process moves to step 511. If another compression format is needed in step 505, the process moves to step 507 where the next video compression format may be identified for implementation. For example, in step 503, the frame synced live video feed may be compressed in accordance with an H.264 compression format. In step 507, an MPEG2 video compression format may be identified as needed.
[66] In step 509, the frame synced live video feed received in 501 may be compressed in accordance with the identified next video compression format. Thus, there now may be two compressed frame synced live video feeds, one compressed in accordance with the first format in step 503 and one compressed in accordance with the next identified compression format in step 509. The process may return to step 505 to determine, once again, if another compression format associated with the distribution network of the service provider is needed.
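The loop through steps 503 to 509 can be sketched as below. The compressors here are hypothetical placeholders that only prefix the format name to the data; they stand in for real H.264, MPEG2, or Windows Media 9 components and perform no actual compression.

```python
from typing import Callable, Dict, List

Compressor = Callable[[bytes], bytes]


def tag_compressor(name: str) -> Compressor:
    """Hypothetical stand-in for a real codec: it only prefixes the format name."""
    return lambda feed: name.encode() + b":" + feed


COMPRESSORS: Dict[str, Compressor] = {
    name: tag_compressor(name) for name in ("H.264", "MPEG2", "Windows Media 9")
}


def compress_for_formats(feed: bytes, needed: List[str]) -> Dict[str, bytes]:
    """Compress the same received feed once per needed compression format."""
    outputs: Dict[str, bytes] = {}
    remaining = list(needed)
    while remaining:                           # step 505: is another format needed?
        fmt = remaining.pop(0)                 # step 507: identify the next format
        outputs[fmt] = COMPRESSORS[fmt](feed)  # steps 503/509: compress the feed
    return outputs                             # ready for transmission in step 511


feeds = compress_for_formats(b"frame-synced-feed", ["H.264", "MPEG2"])
```

Each output is produced from the feed received in step 501 rather than from a previously compressed copy, matching the description of step 509.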
[67] In step 511, the compressed frame synced live video feeds for each of the different compression formats are transmitted over a network, such as over network 110, as different compression format signals. As such, a service provider with some users/subscribers having end devices configured to receive and decompress signals of a first format and other users having end devices to receive and decompress signals of a second and different format may be able to receive and view the same live video content.
[68] In step 513, a device associated with a subscriber/end user may determine an available frame synced live video feed of compression format that matches the compression format utilized by the end device, for example, in response to a user viewing an electronic program guide and opting to view a particular video program. In the examples described above, the end device of the user may utilize a frame synced live video feed according to an H.264 compression format.
[69] Having determined the frame synced live video feed according to the matching compression format, in step 515, the matching compressed frame synced live video feed may be received by a subscriber/end user system. For example, a set-top box at a home of a user may be configured to receive such a compressed frame synced live video signal. Then, in step 517, the received compressed frame synced live video feed may be decompressed in order to render the live video content to an end user.
[70] In other embodiments of the example process of Figure 5, the frame synced live video feed may or may not include an audio signal associated with the frame synced live video feed. In such examples, the frame synced live video feed may include a similarly compressed audio feed. Then, in step 517, the audio signal may be decompressed from the frame synced live video feed as well.
[71] In addition, the example in Figure 5 may be implemented in accordance with one or more examples of embodiments of Figure 4 described above. For example, the frame synced live video feed received in step 501 may be the same frame synced live video feed generated in step 425 in Figure 4 or may be the frame synced live video feed generated in step 427 that includes an audio feed associated with the originally captured video.
[72] Figure 6 illustrates an example video distribution process for a plurality of service providers in accordance with one or more aspects of the present disclosure.
The various steps may be performed by different entities in the system (e.g., the cameras 101A, 101B, encoding system 102, audio recording device 103, stream generator 104, decoding system 107, frame syncing system 108, video formatting system 109, network 301, one or more components within service provider 310, one or more components within service provider 320, one or more components within service provider 330, and one or more components within service provider 340) as discussed herein. In step 601, a frame synced live video feed (or a non-live feed) with associated audio may be received by, for example, video formatting system 109. A service provider may want to distribute the frame synced live video feed through a number of different networks of other service providers in order to distribute the video content over a wider range of electronic devices configured to render the video content on an output device.
[73] Proceeding to step 603, the received frame synced live video feed may be processed into a digital datastream by, for example, video formatting system 109.
Any of a number of different datastream generated formats may be utilized herein and the present disclosure should not be interpreted as limited to those described.
Example datastream generated formats that may be utilized include Internet Protocol streams, internetwork packet exchange streams, and Internet Group Management Protocol streams.
[74] Some service providers might only use unicast transmissions for transport of video data across their networks. In step 605, the datastream from step 603 may be further processed into a first datastream configured to be transmitted across a network according to a unicast transmission for such service providers by, for example, video formatting system 109. For example, a competitor service provider may want to receive and distribute the original frame synced live video feed from the originating service provider. In such an example, the competitor service provider may distribute in a unicast transmission. Accordingly, the first processed datastream configured for transmission as a unicast transmission in 605 may be utilized as described below.
[75] In step 607, a determination may be made as to whether there are more service provider transmission types that are needed in order to transmit the frame synced live video feed to a larger pool of viewers across different service providers. If there are no other service provider types needed, the process moves to step 613. If another service provider type is needed in step 607, the process moves to step 609 where the next service provider transmission format may be identified for implementation.
[76] In step 611, the datastream received in step 603 may be processed into a next datastream configured to be transmitted across a network according to a different transmission format, such as a multicast transmission by, for example, video formatting system 109. Thus, there now may be two datastreams for frame synced live video feeds, one datastream in accordance with the first format in step 605 and one datastream in accordance with the next identified format in step 611. The process may return to step 607 to determine, once again, if another service provider format type associated with the distribution network of another service provider is needed.
[77] In step 613, the datastreams for each of the different service provider formats are transmitted over a network, such as over network 110, as different service provider formatted signals. As such, different service providers with networks configured to receive unicast transmission and/or networks configured to receive multicast transmission may be able to receive and view the same live video content.
Thus, the different transmissions across a network, such as across network 110, may be signals with different transmission formats as well as signals with different frame syncing formats.
[78] In step 615, for a first service provider network, the processed datastream for unicast transmission may be received. As such, the first service provider may transmit within its system accordingly. Similarly in step 617, for a second service provider network, the processed datastream for multicast transmission may be received. As such, the second service provider may transmit within its system accordingly.
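The flow from step 605 through step 617 can be sketched as producing one datastream per needed transmission type and handing each service provider the stream that matches its network. The provider names, the tuple representation of a datastream, and both function names are hypothetical and used only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical datastream record: (transmission type, payload).
Datastream = Tuple[str, bytes]


def process_datastreams(base: bytes, transmission_types: List[str]) -> Dict[str, Datastream]:
    """Steps 605-611: derive one datastream per needed transmission type."""
    streams: Dict[str, Datastream] = {}
    for transmission in transmission_types:   # step 607 decides whether to continue
        streams[transmission] = (transmission, base)
    return streams


def deliver(
    streams: Dict[str, Datastream], provider_types: Dict[str, str]
) -> Dict[str, Datastream]:
    """Steps 613-617: each provider receives the stream matching its transmission type."""
    return {provider: streams[kind] for provider, kind in provider_types.items()}


streams = process_datastreams(b"frame-synced-feed", ["unicast", "multicast"])
received = deliver(
    streams, {"first_provider": "unicast", "second_provider": "multicast"}
)
```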
[79] In embodiments of the example process of Figure 6, the processed datastreams may or may not include an audio signal associated with the frame synced live video feed. In still other embodiments, although described in Figure 6 as a frame synced live video signal, the live video signal may be for 2D video rendering and a frame synced live video signal may not be needed. In such examples, the frame synced live video signal may be a live 2D video signal with, or without, an associated audio signal.
[80] In addition, the example in Figure 6 may be implemented in accordance with one or more examples of embodiments of Figures 4 and 5 described above. For example, the frame synced live video feed received in step 601 may be the same frame synced live video feed generated in step 425 in Figure 4 or may be the frame synced live video feed generated in step 427 that includes an audio feed associated with the originally captured video.
[81] Figure 7 illustrates an example access or communication distribution network, such as the networks discussed with respect to Figure 1, in accordance with one or more aspects of the present disclosure. The example of Figure 7 shows a video-on-demand (VOD) type distribution network. The disclosure's principles and concepts are applicable for other types of distribution or access networks. For example, the network may be applied to distribute 3D content to a movie theater (e.g., 3D-equipped movie theater, 3D-equipped home theater, etc.).
[82] A content storage device 708 may store video signals in association with network 105. The video signals may be, for example, signals generated and transmitted by the stream generator 104 at a 3D content source 100 shown in Figure 1. The content storage device 708 at a central office 106 may receive and store these signals.
[83] The content storage device 708 may include components and operate similar to computing device 200. The content storage device 708 may include one or more processors 201, which may execute instructions of a computer program to perform any of the features described herein. The instructions may be stored in any type of non-transitory computer-readable medium or memory, to configure the operation of the processor 201. For example, instructions may be stored in a read-only memory (ROM) 202, random access memory (RAM) 203, removable media 204, such as a Universal Serial Bus (USB) drive, compact disk (CD) or digital versatile disk (DVD), floppy disk drive, or any other desired electronic storage medium. Instructions may also be stored in an attached (or internal) storage 205 (e.g., hard drive, flash, etc.). Content storage device 708 may be capable of streaming an encoded video signal or other type of signal to VOD system 702.
[84] Content storage device 708 may, upon request, transmit the stored signals (e.g., video signals, encoded combined signals, etc.) to a stream generator 710 that, among other things, may transmit the requested content over network 704. The stream generator 710 may include electronic components, software components, and functional components similar to the stream generator 104 of the 3D content source 100. In some embodiments, the functionality of the stream generator and content storage device may be combined into a single system/device. In some examples, stream generator 710 may encode, compress, and/or provide other extended services (e.g., insertion of closed captioning, AFD, ITV, SEI, user data, etc.).
[85] Central office 106 may be operatively connected to a video-on-demand (VOD) system 702 through a network 704. Network 704 may include one or more of the aspects described herein.
[86] VOD system 702 may include a decoding system 107, synchronization system 108, video formatting system 109, and/or input processing (i.e., network processing) component 706. Through use of these systems/components, VOD system 702 may provide a viewer with 3D-compatible video content for display on the viewer's device. In particular, the 3D-compatible video content may be generated in the desired format on-the-fly. As such, the upstream system may avoid needing to store multiple copies of the same type of video content in different formats. As explained herein, the efficiency and resource savings to a network may be substantial.
[87] Decoding system 107 in VOD system 702 may receive the encoded video signals or other signals from upstream components (e.g., central office 106). In some embodiments, the decoding system may be remotely located from the user premise equipment. For example, a VOD system 702 may be located in a local vicinity of a viewer, while a central office 106 may be further away. The VOD system 702 may, through synchronization system 108, receive the decoded signals and synchronize as requested per the viewer's specific requirements/preferences.

[88] VOD system 702 may customize the video signal per preferences of a particular user or a device, e.g., display device. For example, a user (e.g., viewer) may set her 3D-compatible display to show 3D with greater depth (e.g., perspective). As such, VOD system 702 provides the viewer's device with 3D content in the format desired by the user device, and eliminates the need for expensive and/or complex circuitry/firmware at the user's device to transform the incoming signal to accommodate the particular device. Rather, VOD system 702 may accept the viewer's preferences and generate content accordingly for display. A network processing component 706 in VOD system 702 may receive a request over network 341 and provide viewer preferences (e.g., desired depth of 3D rendering, closed captioning, etc.), device manufacturer settings (e.g., progressive display, interleaved display), and/or provider settings (e.g., satellite TV provider, etc.) to VOD system 702.
[89] In addition, synchronized video signals generated by the synchronization system 108 may be further formatted/encoded by a video formatting system 109 in VOD system 702. The video formatting system 109 may transmit the final video signal from the VOD system 702 over a network 341 to, for example, a 3D-compatible display 347.
[90] Figure 8 is a flowchart illustrating an example synchronization and encoding process in accordance with one or more aspects of the present disclosure. At step 802, a network processing component 706 may receive a request over a network 341 operatively connected to user premise equipment (e.g., gateway 345 with VOD services enabled, 3D-compatible display 347, etc.). The request may include one or more of at least viewer preferences (e.g., desired depth of 3D rendering, closed captioning, etc.), device manufacturer settings (e.g., progressive display, interleaved display), and/or provider settings.
[91] At step 804, the request may be used by input processing component 706 to determine a requested 3D-compatible format, for example. Examples of rules or principles applied to identify the requested 3D-compatible format based on information indicated in the request are disclosed herein, but other rules may be apparent to those of skill in the art after review of the entirety of the disclosure herein.
[92] In addition, some information indicative of the request may be transmitted upstream to a central office 106 for further processing. For example, the title of the requested content (e.g., an identifier corresponding to the movie being requested by a user) may be sent to the central office 106 for identification, retrieval, and transmission. Central office 106 may receive and process the request. For example, content storage device 708 may identify and retrieve the encoded video signal and/or encoded combined signal for the requested movie title and use the stream generator 710 to transmit the signals over network 704 to the VOD system 702. In an alternate embodiment, content storage device 708 may stream the signals directly to the VOD system 702 without using a stream generator 710. In addition, the central office 106 may transmit two video signals: a first video signal corresponding to a first video content for a first eye of a viewer, and a second video signal corresponding to a second video content for a second eye of the viewer. In some embodiments, a combined signal may be transmitted comprising the second video signal and corresponding audio signal multiplexed, or otherwise combined, into a combined signal.
[93] A decoding system 107 in a VOD system 702 may receive the encoded first video signal, encoded second video signal, and/or encoded combined signal in step 806.
The decoding system 107 may decode the encoded first video signal (see step 808) and/or the encoded second video signal. The decoding may include decompressing the encoded first video signal (see step 808) and the encoded second video signal. The decoding may further include other types of processing as desired. In an alternate embodiment, in addition the decoding system 107 in the VOD system 702 may, in step 810, decode the combined signal described above, which comprises a video signal for an eye and corresponding audio signal.
The video signal for the eye in the combined signal may be decoded (in step 812) similar to the other eye in step 808.
[94] The decoded signals (e.g., right eye video signal, left eye video signal, and optionally audio signal) may be transmitted to synchronization system 108 of the VOD system 702 for processing. The synchronization system 108 may synchronize the first video signal with the second video signal to create a single, synchronized video signal (in step 814). In some embodiments, in accordance with the disclosure, an audio signal may be synchronized with the video signals to result in a video signal with both video content and audio content (in optional step 816).

[95] The single, synchronized video signal may be formatted in the 3D-compatible format previously requested and provided to input processing component 706. For example, the requested 3D-compatible format may indicate that a frame-compatible 3D video signal is desired or that a service-compatible 3D video signal is desired. Some examples of frame-compatible 3D formats include, but are not limited to, a top/bottom format, left/right format, alternative format, interlaced format, line interleaved format, page flip format, checkerboard format, and 2D+ depth format. Some examples of service-compatible 3D formats include, but are not limited to, a full resolution top/down format, full resolution left/right format, MVC plus delta format, and full resolution line alternative format. In the case of service-compatible 3D formats, the video signals may be full resolution signals.
[96] The synchronization system 108 of the VOD system 702 may include rules to assist in determining a requested 3D-compatible format for the video content to be provided to the viewer based on the received request. For example, when the request indicates that the display 347 is a progressive scan display, then the synchronization system 108 may identify a top-bottom format as the requested 3D-compatible format. When the request indicates that the display 347 is interleaved, then the synchronization system 108 may identify a side-by-side (left/right) format as the requested 3D-compatible format.
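The two example rules above can be sketched as a simple lookup on the display type carried in the request. The request layout (a dictionary with a `display_type` key) and the function name are assumptions for illustration; only the two mappings themselves come from the description, so any other display type is rejected in this sketch rather than guessed.

```python
def requested_3d_format(request: dict) -> str:
    """Map the display type indicated in a request to a 3D-compatible format."""
    display_type = request.get("display_type")  # hypothetical request field
    if display_type == "progressive":
        return "top-bottom"
    if display_type == "interleaved":
        return "side-by-side"
    # No rule is given in the description for other display types.
    raise ValueError(f"no format rule for display type: {display_type!r}")


progressive_format = requested_3d_format({"display_type": "progressive"})
interleaved_format = requested_3d_format({"display_type": "interleaved"})
```

A fuller rule set could also weigh viewer preferences and provider settings from the request, which the description lists as additional inputs.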
[97] Different display devices may require different encoded/formatted signals. Some examples of different 3D displays include, but are not limited to, active displays requiring shutter glasses, micro-polarization displays requiring polarized glasses, and auto-stereoscopic displays not requiring glasses.
[98] In step 818, video formatting system 109 of VOD system 702 may perform other processing on the synchronized video signal and transmit the formatted 3D-compatible signal to the user premise equipment over the network 341. Some examples of other processing performed at system 109 include compressing the synchronized video signal according to a requested 3D-compatible compression format. After compression, in some examples, the synchronized signal may be in accordance with one of: MPEG-2, H.264, and Windows™ Media 9. In other embodiments, the synchronized video signal outputted by the VOD system 702 is an Internet Protocol (IP) datastream. The IP datastream may include video and/or audio content that is streamed over a packet switched network (e.g., network 341) to a display with 3D-display capabilities.
[99] As a result, the viewer's equipment is provided with a 3D-compatible signal (e.g., mpeg file, etc.) that is capable of being displayed on the viewer's equipment. Such a system eliminates the need for expensive and/or complex circuitry/firmware at the user's device to transform the incoming signal to accommodate the particular device because the upstream components (e.g., VOD system 702) provide the viewer's device with 3D content in the format desired by the user device.
[100] Aspects of the disclosure have been described in terms of illustrative embodiments thereof. While illustrative systems and methods as described herein embodying various aspects of the present disclosure are shown, the disclosure is not limited to these embodiments. Modifications may be made by those skilled in the art, particularly in light of the foregoing teachings.
[101] For example, the steps illustrated in the illustrative figures may be performed in other than the recited order, and one or more steps illustrated may be optional in accordance with aspects of the disclosure. Modifications may be made without departing from the true spirit and scope of the present disclosure. The description is thus to be regarded as illustrative instead of restrictive on the present disclosure.

Claims

CLAIMS:

1. A method comprising:
receiving a request for 3D content;
determining a 3D-compatible format based on the request, wherein the 3D-compatible format comprises:
a top-bottom format if the request indicates a progressive display, and a side-by-side format if the request indicates an interleave display;
receiving, by a decoding system, a first encoded video signal and a second encoded video signal, wherein the first encoded video signal corresponds to a first video content for a first eye of a viewer, and wherein the second encoded video signal corresponds to a second video content for a second eye of the viewer;
decoding, by the decoding system, the first encoded video signal to produce a first decoded video signal;
decoding, by the decoding system, the second encoded video signal to produce a second decoded video signal;
synchronizing the first decoded video signal with the second decoded video signal into a synchronized video signal; and
formatting the synchronized video signal to the 3D-compatible format.

2. The method of claim 1, where the decoding system is remotely located from user premise equipment.

3. The method of claim 1 or 2, wherein the synchronized video signal is a frame-compatible 3D video signal.

4. The method of any one of claims 1-3, where the 3D-compatible format comprises one or more of: top/bottom format, left/right format, interlaced format, line interleaved format, page flip format, checkerboard format, and 2D+ depth format.

5. The method of any one of claims 1-3, where the 3D-compatible format comprises one or more of: full resolution top/down format, full resolution left/right format, and MVC plus delta format.

6. The method of claim 5, wherein the first encoded video signal and the second encoded video signal are full resolution signals.

7. The method of any one of claims 1-6, where the first encoded video signal and the second encoded video signal are independently received via a network operatively connected to a remote central office.

8. The method of any one of claims 1-7, further comprising: transmitting the synchronized and formatted video signal over a packet switched network in an Internet Protocol (IP) data stream.

9. The method of any one of claims 1-8, wherein the formatting further comprises:
compressing the synchronized video signal according to a 3D-compatible compression format and causing transmission of the synchronized and formatted video signal via a network to user premise equipment.

10. The method of claim 9, where the synchronized video signal after the compressing is in accordance with one of: MPEG-2 and H.264.

11. The method of any one of claims 1-10, where the decoding the first encoded video signal comprises decompressing the first encoded video signal, and wherein the decoding the second encoded video signal comprises decompressing the second encoded video signal.

12. The method of any one of claims 1-11, wherein the second encoded video signal comprises an audio signal associated with at least one of the first encoded video signal and the second encoded video signal, wherein the decoding the second encoded video signal comprises decoding the second encoded video signal into the second decoded video signal and the audio signal, and wherein the synchronizing comprises synchronizing the audio signal with the first decoded video signal and second decoded video signal.

13. An apparatus comprising:
one or more processors; and
memory storing executable instructions that, when executed by the one or more processors, cause the apparatus to perform the method of any one of claims 1-12.

14. A system comprising:
one or more computing devices configured to perform the method of any one of claims 1-12; and
one or more storage mediums configured to store the synchronized video signal.

15. A method comprising:
receiving, at a computing device, a request from user premise equipment;
determining, by the computing device, a 3D-compatible format based on the request from the user premise equipment, wherein the 3D-compatible format comprises:
a top-bottom format when the request indicates a progressive display; and
a side-by-side format when the request indicates an interleave display; and
formatting a video signal to the 3D-compatible format.

16. The method of claim 15, wherein the formatted video signal comprises a frame-compatible 3D video signal formatted in one or more of: top/bottom format, left/right format, interlaced format, line interleaved format, page flip format, checkerboard format, and 2D+ depth format.

17. The method of claim 15, wherein the formatted video signal comprises a service-compatible 3D video signal formatted in one or more of: full resolution top/down format, full resolution left/right format, and MVC plus delta format.

18. An apparatus comprising:
one or more processors; and
memory storing executable instructions that, when executed by the one or more processors, cause the apparatus to perform the method of any one of claims 15-17.

19. A system comprising:
one or more computing devices configured to perform the method of any one of claims 15-17; and
one or more storage mediums configured to store the video signal.