RU2500022C2 - Streaming interactive video client apparatus
- Publication number: RU2500022C2
- Application number: RU2010127328/08A
- A63F13/358—Adapting the game course according to the network or server load, e.g. for reducing latency due to different connection speeds between clients
- A63F13/12—Video games involving interaction between a plurality of game devices, e.g. transmission or distribution systems
- A63F13/335—Interconnection arrangements between game servers and game devices using wide area network [WAN] connections using the Internet
- A63F13/77—Game security or game management aspects involving data related to game devices or game servers, e.g. configuration data, software version or amount of memory
- H04L12/10—Data switching networks; current supply arrangements
- H04N21/21—Server components or server architectures
- H04N21/23—Processing of content or additional data; elementary server operations; server middleware
- H04N21/426—Characteristics of or internal components of the client
- H04N21/4347—Demultiplexing of several video streams
- H04N21/4363—Adapting the video or multiplex stream to a specific local network, e.g. an IEEE 1394 or Bluetooth® network
- A63F2300/534—Details of game servers: basic data processing for network load management, e.g. bandwidth optimization, latency reduction
- A63F2300/5546—Details of game data or player data management using player registration data, e.g. identification, account, preferences, game history
- A63F2300/556—Player lists, e.g. online players, buddy list, black list
- H04L65/607—Stream encoding details
This application is a continuation-in-part (CIP) of application Ser. No. 10/315460, filed December 10, 2002, entitled "APPARATUS AND METHOD FOR WIRELESS VIDEO GAMING", which is assigned to the assignee of the present CIP application.
FIELD OF THE INVENTION
The present disclosure relates generally to the field of data processing systems that improve the ability of users to access and manipulate audio and video media.
STATE OF THE ART
Recorded audio and motion picture media have been an aspect of public life since the days of Thomas Edison. At the beginning of the 20th century, recorded audio media (cylinders and records) and motion picture media (movie theaters and films) were widely distributed, but both technologies were nevertheless in their early stages of development. In the late 1920s motion pictures were combined with sound on a mass-market basis, followed by color motion pictures with sound. Radio broadcasting gradually evolved into an advertising-supported form of broadcast mass-market audio media. When a television broadcast standard was established in the mid-1940s, television joined radio as a form of broadcast mass-market media, bringing previously recorded or live motion pictures into the home.
By the mid-20th century, most American homes had a phonograph for playing recorded audio media, a radio for receiving live audio broadcasts, and a television for playing live audio/video (A/V) media. Very often these three "media players" (record player, radio and TV) were combined into one cabinet sharing common speakers, which became the "media center" for the home. Although the choice of media was limited for the consumer, the media "ecosystem" was quite stable. Most consumers knew how to use the "media players" and could take full advantage of their capabilities. At the same time, media publishers (largely the motion picture and television studios and the music companies) were able to distribute their media both to theaters and to the home without suffering from widespread piracy or "secondary sales", i.e. the resale of used media. Typically, publishers derive no income from secondary sales, and, as such, secondary sales reduce the income that publishers would otherwise receive from new sales to the buyer of the used media. Although used records certainly were sold during the middle of the 20th century, such sales did not have a large impact on record publishers because, unlike a movie or video program, which an adult typically watches only once or only a few times, a music track may be listened to hundreds or even thousands of times. Music media is therefore far less "perishable" (i.e., it has lasting value for an adult consumer) than film/video media. Once a record was purchased, if the consumer liked the music, he was likely to keep it for a long time.
From the mid-20th century to the present day, the media ecosystem has undergone a series of radical changes, to both the benefit and the detriment of consumers and publishers. With the widespread introduction of audio tape recorders, especially cassette tapes with high-quality stereo sound, there was certainly a much higher level of consumer convenience. But it also marked the beginning of what is now a widespread practice with consumer media: piracy. Certainly, many consumers used cassette tapes purely for convenience, to tape their own records, but an increasing number of consumers (for example, students in a dormitory with ready access to one another's record collections) would make pirated copies. Also, consumers would tape music broadcast over the radio rather than buying a record or a tape from the publisher.
The advent of the consumer VCR (video cassette recorder) led to even more consumer convenience, since now a VCR could be set to record a television program that could be watched at a later time, and it also led to the creation of the video rental business, where access to movies as well as television programming could be provided "on demand". The rapid development of mass-market home media devices since the mid-1980s has led to an unprecedented level of choice and convenience for consumers, and has also led to a rapid expansion of the media publishing market.
Today, consumers are faced with a plethora of media choices as well as a multitude of media devices, many of which are tied to particular types of media or particular publishers. An avid media consumer may have a stack of devices connected to TVs and computers in various rooms of the house, resulting in a tangle of cables running to one or more TVs and/or personal computers (PCs), as well as a group of remote controls. (In the context of this application, the term "personal computer" or "PC" refers to any sort of computer suitable for the home or office, including a desktop computer, a Macintosh® or other non-Windows computer, Windows-compatible devices, UNIX variations, laptops, etc.) These devices may include a video game console, VCR, DVD player, sound processor/surround-sound amplifier, satellite TV set-top box, cable TV (CATV) set-top box, etc. And, for the avid consumer, there may be multiple devices with similar functions because of compatibility issues. For example, a consumer may own both an HD-DVD player and a Blu-ray DVD player, or both a Microsoft Xbox® and a Sony Playstation® video game system. Indeed, because of the incompatibility of some games across versions of consoles, the consumer may own both an XBox and a later version, such as an Xbox 360®. Frequently, consumers are confused as to which video input and which remote control to use. Even after a disc is placed into the correct player (e.g., DVD, HD-DVD, Blu-ray, Xbox or Playstation), the video and audio input is selected for that device, and the correct remote control is found, the consumer is still faced with technical challenges. For example, in the case of a widescreen DVD, the user may need to first determine and then set the correct aspect ratio on his monitor or TV screen (e.g., 4:3, Full, Zoom, Wide Zoom, Cinema Wide, etc.). Similarly, the user may need to first determine and then set the correct audio format for the surround-sound system (e.g., AC-3, Dolby Digital, DTS, etc.). Often, the consumer is unaware that he may not be enjoying the media content to the full capability of his television or audio system (for example, watching a movie squeezed into the wrong aspect ratio, or listening to audio in stereo rather than in surround sound).
Increasingly, Internet-based media devices are being added to the stack of devices. Audio devices like the Sonos® Digital Music system stream audio directly from the Internet. Likewise, devices like the Slingbox™ record video and stream it through a home network or over the Internet, where it can be watched remotely on a PC. And Internet Protocol Television (IPTV) services offer cable-TV-like services through digital subscriber line (DSL) or other home Internet connections. There have also been recent efforts to integrate multiple media functions into a single device, such as the Moxi® Media Center and PCs running Windows XP Media Center Edition. While each of these devices offers an element of convenience for the functions it performs, each lacks ubiquitous and simple access to most media. Further, such devices frequently cost hundreds of dollars to manufacture, often because of the need for expensive processing and/or local storage. Additionally, these modern consumer electronic devices typically consume a great deal of power, even while idle, which means they are expensive over time and wasteful of energy resources. For example, a device may continue to operate if the consumer has forgotten to turn it off or has switched to a different video input. And, because none of the devices is a complete solution, each must be integrated with the other stack of devices in the house, which still leaves the user with a tangle of wires and a multitude of remote controls.
Moreover, even when many of the newer Internet-based devices do work properly, they tend to offer media in a more generic form than it might otherwise be available in. For example, devices that stream video over the Internet often stream just the video material, not the interactive "extras" that often accompany DVDs, such as "making of" videos, games, or filmmaker commentary. This is because the interactive material is often produced in a particular format intended for a particular device that handles the interactivity locally. For example, DVD, HD-DVD and Blu-ray discs each have their own particular interactive format. Any home media device or local computer that might be developed to support all of the popular formats would require a level of sophistication and flexibility that would likely make it prohibitively expensive and complex for the consumer to operate.
Adding to the problem, if a new format is introduced later, the local device may lack the hardware capability to support the new format, which would mean that the consumer would have to purchase an upgraded local media device. For example, if higher-resolution video or stereoscopic video (e.g., one video stream for each eye) were introduced at a later date, the local device might not have the computing power to decode the video, or it might not have the hardware to output the video in the new format (for example, assuming stereoscopy is achieved through video at 120 frames per second, synchronized with shuttered glasses, with frames delivered to each eye at 60 frames per second, then if the consumer's video hardware can only support video at 60 frames per second, this option would be unavailable without the purchase of upgraded hardware).
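To make the frame-rate arithmetic above concrete, the following Python sketch (illustrative only, not part of the patent disclosure) expresses the capability check the paragraph implies; the function name and figures are assumptions:

def supports_frame_sequential_stereo(display_max_fps: int,
                                     per_eye_fps: int = 60) -> bool:
    # Frame-sequential stereo interleaves left-eye and right-eye frames,
    # so the display must refresh at twice the per-eye rate.
    required_fps = 2 * per_eye_fps        # e.g. 2 * 60 = 120 fps total
    return display_max_fps >= required_fps

# A 60 fps display cannot carry 60-frames-per-eye stereoscopic video,
# while a 120 fps display can:
assert not supports_frame_sequential_stereo(60)
assert supports_frame_sequential_stereo(120)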
These issues of media device complexity and obsolescence become a serious problem when it comes to sophisticated interactive media, especially video games.
Modern video game applications are largely divided among four major machine-dependent hardware platforms: the Sony PlayStation® 1, 2 and 3 (PS1, PS2 and PS3); the Microsoft Xbox® and Xbox 360®; the Nintendo Gamecube® and Wii™; and PC-based games. Each of these platforms differs from the others such that games written to run on one platform usually do not run on another platform. There may also be device compatibility issues from one generation to the next. Even though the majority of game developers create platform-independent game cores, a proprietary software layer (frequently called a "game development engine") is required to adapt a particular game for use on a specific platform. Each platform is sold to the consumer as a "console" (i.e., a standalone box attached to a TV or monitor/speakers), or it is a PC itself. Typically, video games are sold on optical media such as a Blu-ray DVD, DVD-ROM or CD-ROM, which contains the video game embodied as a sophisticated real-time software application. As home broadband speeds have increased, video games are increasingly becoming available for download.
The specific requirements for achieving platform compatibility with video game software are extremely demanding because of the real-time nature and high computational requirements of advanced video games. For example, one might expect full game compatibility from one generation of video games to the next (e.g., from XBox to XBox 360, or from Playstation 2 ("PS2") to Playstation 3 ("PS3")), just as there is general compatibility of productivity applications (e.g., Microsoft Word) from one PC to another with a faster processor or core. However, this is not the case with video games. Because video game manufacturers typically seek the highest possible performance for a given price point when a generation of video games is released, they frequently make dramatic changes to the system architecture, such that many games written for the prior-generation system do not run on the later-generation system. For example, the XBox is based on the x86 family of processors, whereas the XBox 360 is based on the PowerPC family.
Techniques for emulating the prior architecture can be used, but given that video games are real-time applications, it is often impossible to achieve behavior identical to that of the emulated system. This is a detriment to the consumer, the video game console manufacturer, and the video game software publisher. For the consumer, it means the need to keep both an old-generation and a new-generation video game console hooked up to the TV in order to be able to play all games. For the console manufacturer, it means the costs associated with emulation and the slower adoption of new consoles. And for the publisher, it means the need to release multiple versions of new games in order to reach all possible consumers - not only releasing a version for each brand of video game console (e.g., XBox, Playstation), but often a version for each version of a given brand (e.g., PS2 and PS3). For example, a separate version of Electronic Arts' Madden NFL 08 was developed for the XBox, XBox 360, PS2, PS3, Gamecube, Wii and PC, among other platforms.
Portable devices, such as cell phones and portable media players, also present challenges to game developers. An increasing number of such devices are connected to wireless data networks and are able to download video games. But there is a wide variety of cell phones and media devices on the market, with a wide range of different display resolutions and computing capabilities. Also, because such devices typically have constraints on power consumption, cost and weight, they generally lack advanced graphics acceleration hardware such as a graphics processing unit ("GPU"), like the devices manufactured by NVIDIA Corporation of Santa Clara, CA. Consequently, game software developers typically develop a given game title simultaneously for many different types of portable devices. A user may find that a given game title is not available for his particular cell phone or portable media player.
In the case of home game consoles, the hardware platform manufacturers typically charge game software developers a royalty for the ability to publish a game on their platform. Cell phone wireless carriers also typically charge the game publisher a royalty to download a game to a cell phone. In the case of PC games, no royalty is paid to publish games, but game developers typically face high costs because of the higher customer service overhead required to support the wide range of PC configurations and the installation issues that may arise. Also, PCs typically present fewer barriers to the piracy of game software, since they can easily be reprogrammed by a technically knowledgeable user, and games can more easily be pirated, republished, and distributed (for example, via the Internet). Accordingly, for a game software developer, there are costs and disadvantages in publishing on game consoles, cell phones and PCs.
For publishers of console and PC software, the costs do not end there. To distribute games through retail channels, publishers charge retailers a wholesale price below the retail price so that the retailer can earn a profit margin. The publisher also typically has to bear the cost of manufacturing and distributing the physical media holding the game. The retailer also frequently charges the publisher a "price protection fee" to cover possible contingencies, such as when the game does not sell through, or if the game's price is reduced, or if the retailer must refund part or all of the wholesale price and/or take back the game from a buyer. Additionally, retailers typically charge publishers fees to help market games in advertising flyers. Furthermore, retailers are increasingly buying back games from users who have finished playing them and then selling them as used games, typically sharing none of the revenue from such sales with the game publisher. Adding to the cost burden placed upon game publishers is the fact that games are frequently pirated and redistributed through the Internet for users to download and copy for free.
As broadband Internet speeds have increased and broadband connections have become more widespread in the US and throughout the world - in particular to the home and to Internet "cafes" where Internet-connected PCs are rented - games are increasingly being distributed via download to PCs or consoles. Also, broadband connections are increasingly used for playing multiplayer and massively multiplayer online games (both of which are referred to in the present disclosure by the abbreviation "MMOG"). These changes mitigate some of the costs and issues associated with retail distribution. Downloading online games addresses some of the disadvantages to game publishers in that distribution costs are typically lower and the costs of unsold media are little or none. But downloaded games are still subject to piracy, and because of their size (often many gigabytes), they can take a very long time to download. In addition, multiple games can fill up small disk drives, such as those sold with laptop computers or with video game consoles. However, to the extent that large games or MMOGs require an online connection in order to be playable, the piracy problem is mitigated, since a valid user account is usually required. Unlike linear media (e.g., video and music), which can be copied with a camera shooting video of the display screen or with a microphone recording audio from the speakers, each video game experience is unique and cannot be copied using simple video/audio recording. Accordingly, even in regions where copyright laws are not strongly enforced and piracy is rampant, MMOGs can be shielded from piracy, and therefore a business can be supported. For example, the "World of Warcraft" MMOG of the media conglomerate Vivendi SA has been successfully deployed worldwide without suffering from piracy. And many online and MMOG games, such as Linden Lab's "Second Life" MMOG, generate revenue for the game operators through economic models built into the games in which assets can be bought, sold and even created using online tools. Thus, mechanisms in addition to conventional game software purchases or subscriptions can be used to pay for the use of online games.
While piracy can often be mitigated due to the online nature of online games and MMOGs, the online game operator nevertheless faces other challenges. Many games require substantial local (i.e., in-home) processing resources in order to work properly. If a user's local computer has low performance (for example, one without a GPU, such as a low-end laptop), it may not be able to play the game. Additionally, as game consoles age, they fall further behind the state of the art and may not be able to handle more advanced games. Even assuming the user's local PC is able to meet the computational requirements of a game, there are often installation complexities. There may be driver incompatibilities (for example, if a new game is downloaded, it may install a new version of a graphics driver that renders a previously installed game, which depends on the old version of the graphics driver, inoperable). A console may run out of local disk space as more games are downloaded. Complex games typically receive downloadable patches over time from the game developer as bugs are found and fixed, or if modifications are made to the game (e.g., if the game developer finds that a level of the game is too hard or too easy to play). These patches require new downloads. But sometimes not all users complete the downloading of all of the patches. Other times, the downloaded patches introduce other compatibility or disk space consumption issues.
Also, during gameplay, large data downloads may be required to provide graphics or behavior information to the local PC or console. For example, if a user enters a room in an MMOG and encounters a scene or a character made up of graphics data or with behaviors that are not available on the user's local machine, that scene or character data must be downloaded. This can result in a substantial delay during gameplay if the Internet connection is not fast enough. And if the encountered scene or character requires storage space or computing capability beyond that of the local PC or console, a situation can arise in which the user cannot continue in the game, or must continue with reduced-quality graphics. Because of this, online games and MMOGs often limit their computational complexity and/or storage requirements. Additionally, they often limit the amount of data transfers during the game. Online games and MMOGs may also narrow the market of users who are able to play the games.
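As an illustration of the on-demand download problem just described, the following Python sketch models a hypothetical client-side asset cache that must either stall or fall back to lower-quality data when a character's database is missing locally. All class, method and parameter names are invented for illustration; the patent does not prescribe this design:

class AssetCache:
    # Hypothetical client-side store of 3D geometry and behavior data.

    def __init__(self, downstream_bps: float):
        self.local = {}                  # asset_id -> asset data on disk/RAM
        self.downstream_bps = downstream_bps

    def stall_seconds(self, size_bytes: int) -> float:
        # How long gameplay pauses while a missing asset downloads.
        return size_bytes * 8 / self.downstream_bps

    def get(self, asset_id: str, size_bytes: int, low_res_fallback=None):
        if asset_id in self.local:
            return self.local[asset_id]
        if low_res_fallback is not None:
            # Keep playing with worse-quality graphics while downloading.
            return low_res_fallback
        # Otherwise the game must pause for the full transfer.
        print(f"stalling {self.stall_seconds(size_bytes):.0f}s for {asset_id}")
        self.local[asset_id] = object()  # stand-in for the downloaded data
        return self.local[asset_id]

# A 200 MB custom character over a 5 Mbit/s link stalls ~320 seconds.
cache = AssetCache(downstream_bps=5e6)
cache.get("custom_character_42", size_bytes=200_000_000)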
Furthermore, technically knowledgeable users are increasingly decompiling local copies of games and modifying them so that they can cheat. Cheats may be as simple as making a repeated button press faster than is humanly possible (e.g., so as to fire a gun very rapidly). In games that support in-game asset transactions, cheating can reach a level of falsification that results in fraudulent transactions involving assets of actual economic value. When an online game or MMOG economic model is based on such asset transactions, this can result in substantial harmful consequences for the game operators.
The cost of developing a new game has grown as PCs and consoles have become able to produce increasingly sophisticated games (e.g., with more realistic graphics, such as real-time ray tracing, and more realistic behaviors, such as real-time physics simulation). In the early days of the video game industry, video game development was very similar to application software development; that is, most of the development cost was in the development of the software, as opposed to the development of the graphical, audio and behavioral elements or "assets", such as those that may be developed for a motion picture rich in special effects. Today, many sophisticated video game development efforts more closely resemble special-effects-rich motion picture development than software development. For instance, many video games provide simulations of 3D (three-dimensional) worlds, and generate increasingly photorealistic (i.e., computer graphics that seem as realistic as live-action imagery captured photographically) characters, props and environments. One of the most challenging aspects of photorealistic game development is creating a computer-generated human face that is indistinguishable from a live-action human face. Facial capture technologies such as Contour™ Reality Capture, developed by Mova of San Francisco, CA, capture and track the precise geometry of a performer's face at high resolution while it is in motion. This technology makes it possible to render a 3D face on a PC or game console that is virtually indistinguishable from a captured live-action face. Capturing and rendering a "photoreal" human face precisely is useful in several respects. First, video games often use highly recognizable celebrities or athletes (who are often highly paid), and imperfections may be apparent to the user, making the viewing experience distracting or unpleasant. Frequently, a high degree of detail is required to achieve a high degree of photorealism, requiring the rendering of a large number of polygons and high-resolution textures, potentially with the polygons and/or textures changing on a frame-by-frame basis as the face moves.
When scenes with large numbers of polygons and detailed textures change rapidly, the PC or game console supporting the game may not have enough RAM to store enough polygons and textures for the requisite number of animation frames generated in the game segment. Furthermore, the single optical drive or single magnetic drive typically available on a PC or game console is usually much slower than RAM, and typically cannot keep up with the maximum data rate at which the GPU can accept polygons and textures while rendering. Current games typically load most of the polygons and textures into RAM, which means that a given scene is largely limited in complexity and duration by the capacity of the RAM. In the case of facial animation, for example, this may limit a PC or game console to either a low-resolution face that is not photoreal, or to a photoreal face that can only be animated for a limited number of frames before the game pauses and loads polygons and textures (and other data) for additional frames.
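The RAM constraint described above is easy to quantify. A back-of-the-envelope Python sketch with assumed (not disclosed) numbers shows how quickly per-frame, high-resolution textures exhaust a console-sized RAM budget:

def texture_bytes(width: int, height: int, bytes_per_texel: int = 4) -> int:
    return width * height * bytes_per_texel

# Suppose a photoreal face needs one 2048x2048 RGBA texture per animation
# frame (textures changing frame-by-frame, as described above).
per_frame = texture_bytes(2048, 2048)     # 16 MiB per frame
ram_budget = 512 * 2**20                  # assume 512 MiB of usable RAM

frames_that_fit = ram_budget // per_frame
print(frames_that_fit)                    # 32 frames
print(frames_that_fit / 60)               # ~0.5 s of animation at 60 fps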
Watching a progress bar move slowly across the screen while a message such as "Loading..." is displayed on a PC or console is accepted by today's users as an inherent drawback of complex video games. The delay while the next scene loads from the disk (the "disk" herein, unless otherwise qualified, refers to non-volatile optical or magnetic media, as well as to non-disk media such as semiconductor Flash memory) can take several seconds or even several minutes. This is a waste of time and can be quite frustrating to a gamer. As discussed previously, much or all of the delay may be due to the load time of polygon, texture or other data from the disk, but it may also be the case that part of the load time is spent while the CPU and/or GPU in the PC or console prepares the data for the scene. For example, a football video game may allow the players to choose among a large number of players, teams, stadiums and weather conditions. So, depending on which particular combination is chosen, the scene may require different polygons, textures and other data (collectively, "objects") (e.g., different teams have different colors and patterns on their uniforms). It may be possible to enumerate many or all of the various permutations, pre-compute many or all of the objects in advance, and store the objects on the disk used to store the game. But if the number of permutations is large, the amount of storage required for all of the objects may be too large to fit on the disk (or too impractical to download). Thus, existing PC and console systems are typically constrained in both the complexity and the play duration of such scenes, and suffer long load times for complex scenes.
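The football example lends itself to a quick combinatorial estimate. The following Python sketch, using invented option counts and object sizes, illustrates why pre-computing every selectable permutation can exceed any practical disk:

# Hypothetical option counts for a football game's pre-match setup.
teams, stadiums, weather, times_of_day = 32, 30, 4, 3

# Two distinct teams per match, times the remaining options.
permutations = teams * (teams - 1) * stadiums * weather * times_of_day
per_permutation_mb = 50              # assumed precomputed objects per setup

total_gb = permutations * per_permutation_mb / 1024
print(permutations)                  # 357,120 distinct scenes
print(round(total_gb), "GB")         # ~17,438 GB: far beyond a game disk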
Another significant limitation of prior art application software and video game systems is that they increasingly use large databases of, for example, 3D objects such as polygons and textures, which need to be loaded onto the PC or game console for processing. As discussed above, such databases can take a long time to load when stored locally on a disk. Load time, however, is usually far longer if the database is stored remotely and accessed over the Internet. In such a situation, it may take minutes, hours or even days to download a large database. Further, such databases are often created at great expense (e.g., a 3D model of a detailed sailing ship with a tall mast for use in a game, movie or historical documentary), and they are intended for sale to the local end user. However, once downloaded by the local user, the database is at risk of unauthorized use. In many cases, a user wants to download a database simply to evaluate it, to see whether it suits his needs (for example, whether a 3D costume for a game character has a satisfactory look when the user performs a particular move). A long load time can be a deterrent for a user evaluating a 3D database before deciding to make a purchase.
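The claimed gap between local and remote load times follows from simple arithmetic. A Python sketch with assumed link speeds (the figures are illustrative, not from the patent):

def download_hours(size_gb: float, link_mbps: float) -> float:
    # 1 GB = 8 * 1024 megabits; divide by link rate, convert to hours.
    return size_gb * 8 * 1024 / link_mbps / 3600

print(download_hours(10, 400))   # local disk at ~400 Mbit/s: ~0.06 h (minutes)
print(download_hours(10, 5))     # 5 Mbit/s Internet link:    ~4.6 h
print(download_hours(100, 5))    # 100 GB database:           ~45.5 h (days)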
Similar issues occur in MMOGs, particularly in games that allow users to make use of increasingly customized characters. For a PC or game console to display a character on the screen, it must have access to a database of 3D geometry (polygons, textures, etc.) as well as behaviors (e.g., whether the character has a shield, and whether the shield is strong enough to deflect a spear) for that character. Typically, when an MMOG is played by a user for the first time, a large number of databases for characters come with the initial copy of the game, which is available locally on the game's optical disc or downloaded to a disk. But as the game progresses, if the user encounters a character or object whose database is not available locally (e.g., if another user has created a customized character), then before that character or object can be displayed, its database must be downloaded. This can result in a substantial delay in the game.
Given the sophistication and complexity of video games, another challenge for video game developers and publishers with prior art video game consoles is that developing a video game frequently takes two to three years at a cost of tens of millions of dollars. Given that new video game console platforms are introduced at a rate of roughly once every five years, game developers need to start development of their games years in advance of the release of a new game console so that the video games go on sale concurrently with the release of the new platform. Several consoles from competing manufacturers are sometimes released at about the same time (e.g., within a year or two of one another), but the popularity of each console remains to be seen, e.g., which console will drive the largest video game software sales. For example, in a recent console cycle, the Microsoft XBox 360, the Sony Playstation 3 and the Nintendo Wii were all scheduled to be introduced in roughly the same general time frame. But years before those introductions, the game developers essentially had to "place their bets" on which console platforms would be more successful than others, and devote their development resources accordingly. Motion picture production companies similarly have to apportion their limited production resources based on what they estimate to be a movie's likely success, well in advance of its release. Given the growing level of investment required for video games, game production is becoming more and more like motion picture production, and game production companies routinely devote their production resources based on their estimate of the future success of a particular video game. But, unlike motion picture companies, this bet is not based simply on the success of the production itself; rather, it is predicated on the success of the game console the game is intended to run on. Releasing the game on multiple consoles at once may mitigate the risk, but this additional effort increases cost, and frequently delays the actual release of the game.
PC user environments and PC application software are requiring ever more computation and becoming more dynamic and interactive, not only to make them more visually appealing to users, but also to make them more useful and intuitive. For example, both the new Windows Vista™ operating system and successive versions of the Macintosh® operating system incorporate visual animation effects. Advanced graphics tools, such as Maya™ from Autodesk, Inc., provide very sophisticated 3D rendering and dynamic animation capabilities that push the limits of state-of-the-art CPUs and GPUs. However, the computational requirements of these new tools create a number of practical issues for users and software developers of such products.
Because the visual display of an operating system (OS) must work on a wide range of classes of computers - including prior-generation computers that are no longer sold but on which the OS can nevertheless be upgraded to the new one - the graphics requirements of the OS are largely limited by a "least common denominator" of the computers the OS is intended for, which typically includes computers that do not include a GPU. This severely limits the graphics capabilities of the OS. Furthermore, battery-powered portable computers (such as laptops) limit visual display capability, because a high level of computational activity in a CPU or GPU typically results in higher power consumption and shorter battery life. Portable computers typically include software that automatically lowers processor activity to reduce power consumption when the processor is not being used. On some computer models, the user may lower processor activity manually. For example, Sony's VGN-SZ280P laptop has a switch labeled "Stamina" on one side (for low performance and longer battery life) and "Speed" on the other (for high performance and shorter battery life). An OS running on a portable computer must be able to function usably even if the computer is running at a fraction of its peak performance capability. Thus, OS graphics performance often remains far below the state-of-the-art computational capability that is available.
Sophisticated, computationally intensive applications like Maya are frequently sold with the expectation that they will be used on high-performance PCs. This typically establishes a much higher-performance, more expensive and less portable "least common denominator" requirement. As a consequence, such applications have a much more limited target audience than a general-purpose OS (or a general-purpose productivity application like Microsoft Office), and they typically sell in far lower volumes than general-purpose OS software or general-purpose application software. The potential audience is further limited because it is often difficult for a prospective user to try out such computationally intensive applications in advance. For example, suppose a student wants to learn how to use Maya, or a potential buyer who already knows about such applications wants to try out Maya before making the investment in the purchase (which may well include buying a high-end computer capable of running Maya). While the student or the potential buyer can download a demo version of Maya or obtain a physical copy of demo media, if they do not have a computer capable of running Maya to its full potential (e.g., handling a complex 3D scene), they will be unable to make a fully informed assessment of the product. This substantially limits the audience for such high-end applications. It also contributes to a high selling price, since the development cost is usually amortized across a much smaller number of purchases than in the case of a general-purpose application.
High-priced applications also create more incentive for individuals and businesses to use pirated copies of the application software. As a result, high-end application software suffers from rampant piracy, despite significant efforts by publishers of such software to mitigate piracy in various ways. Still, even when using pirated high-end applications, users cannot obviate the need to invest in expensive state-of-the-art PCs to run the pirated copies. So, while users of pirated software may obtain use of an application for a fraction of its actual retail price, they are nevertheless required to buy or obtain an expensive PC in order to fully utilize the application.
The same holds true for users of high-performance pirated video games. Although pirates may get the games for a fraction of their actual price, they are still required to buy the expensive computing hardware (e.g., a PC with an advanced GPU, or a high-end video game console like the XBox 360) needed to play the games properly. Given that video games are typically a pastime for consumers, the additional cost of a high-end video game system can be prohibitive. This situation is worse in countries (e.g., China) where the average annual income of workers is currently quite low relative to that of workers in the United States. As a result, a much smaller percentage of people own a high-end video game system or a high-end PC. In such countries, "Internet cafes", in which users pay a fee to use a computer connected to the Internet, are quite common. Frequently, such Internet cafes have older-model or low-end PCs without high-performance features, such as a GPU, which might otherwise enable players to play computationally intensive video games. This is a key factor in the success of games that run on low-end PCs, such as Vivendi's "World of Warcraft", which is highly successful in China and is commonly played in Internet cafes there. In contrast, a computationally intensive game like "Second Life" is much less likely to be playable on a PC installed in a Chinese Internet cafe. Such games are virtually inaccessible to users who only have access to low-performance PCs in Internet cafes.
Barriers also exist for users who are considering buying a video game and would like to try out a demo version of the game first by downloading it over the Internet to their home computer. A video game demo is often a full-fledged version of the game with some features disabled or with limits placed on the amount of gameplay. This may involve a long process (perhaps hours) of downloading gigabytes of data before the game can be installed and executed on either a PC or a console. In the case of a PC, it may also involve figuring out what special drivers (e.g., OpenGL or DirectX drivers) are needed for the game, downloading the correct version, installing them, and then determining whether the PC is able to play the game. This latter step may involve determining whether the PC has enough processing capability (CPU and GPU), sufficient RAM, and a compatible OS (e.g., some games run on Windows XP, but not on Vista). Thus, after a lengthy process of attempting to run a video game demo, the user may well discover that the demo cannot be played, given the user's PC configuration. Worse, new drivers downloaded by the user in order to try the demo may be incompatible with other games or applications the user regularly uses on the PC, so the installation of a demo may render previously operable games or applications inoperable. Not only do these barriers frustrate the user, they also create barriers for video game software publishers and video game developers in marketing their games.
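The pre-install checks described above amount to comparing a demo's stated requirements against the local machine. A minimal Python sketch; the requirement values and probe parameters are hypothetical placeholders, not taken from the patent:

# Hypothetical minimum requirements shipped with a game demo.
REQUIRED = {"ram_gb": 2, "needs_gpu": True,
            "os": {"Windows XP", "Windows Vista"}}

def failed_checks(ram_gb: float, has_gpu: bool, os_name: str) -> list:
    # Return the list of failed requirements (an empty list means playable).
    failures = []
    if ram_gb < REQUIRED["ram_gb"]:
        failures.append("insufficient RAM")
    if REQUIRED["needs_gpu"] and not has_gpu:
        failures.append("no graphics acceleration (GPU)")
    if os_name not in REQUIRED["os"]:
        failures.append("unsupported OS: " + os_name)
    return failures

# The user learns this only after the multi-gigabyte download:
print(failed_checks(ram_gb=1.0, has_gpu=False, os_name="Windows 2000"))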
Another problem that leads to economic inefficiency has to do with the fact that a given PC or game console is typically designed to accommodate a certain level of performance requirement for applications and/or games. For example, some PCs have more or less RAM, slower or faster CPUs, and slower or faster GPUs, if they have GPUs at all. Some games or applications take advantage of the full computing power of a given PC or console, while many do not. If a user's choice of game or application requires less than the peak performance of the local PC or console, then the user may have wasted money on unused capability in the PC or console. In the case of a console, the console manufacturer may have paid more than was necessary to subsidize the console cost.
Another problem that exists in the marketing and enjoyment of video games involves allowing a user to watch others playing games before committing to the purchase of a game. Several prior art approaches exist for recording video games for replay at a later time. For example, US Patent No. 5558339 teaches recording game state information, including game controller actions, during "gameplay" in the video game client computer (owned by the same or a different user). This state information can then be used at a later time to replay some or all of the game action on a video game client computer (e.g., a PC or console). A significant drawback to this approach is that, for a user to view the recorded game, the user must possess a video game client computer capable of playing the game, and must have the video game application running on that computer such that the gameplay is identical when the recorded game state is replayed. Beyond that, the video game application has to be written in such a way that there is no possible execution difference between the recorded game and the played-back game.
For example, game graphics are typically computed on a frame-by-frame basis. For many games, the game logic may sometimes take less or more than one frame time to compute the graphics displayed for the next frame, depending on whether the scene is particularly complex, or whether there are other delays slowing down execution (e.g., on a PC, another process may be running that takes away CPU cycles from the game application). In such a game, a "borderline" frame can eventually occur that is computed in slightly less than one frame time (say, a few CPU cycles less). When that same scene is computed again using the identical game state information, it could easily take a few CPU cycles more than one frame time (e.g., if an internal CPU bus is slightly out of phase with an external DRAM bus and it introduces a few CPU cycles of latency, even if there is no large delay from another process taking away milliseconds of CPU time from the game processing). Therefore, when the game is played back, the frame gets computed in two frame times rather than in a single frame time. Some behaviors are based on how often the game computes a new frame (e.g., when the game samples the input from the game controllers). While the game is being played, this discrepancy in the time reference for different behaviors does not impact gameplay, but it can result in the played-back game producing a different result. For example, if the ballistics of a basketball are calculated at a steady 60 frames-per-second rate, but the game controller input is sampled based on the rate of computed frames, the rate of computed frames could have been 53 frames per second when the game was recorded, but 52 frames per second when the game is replayed, which can make the difference between whether the basketball goes into the basket or not, producing a different outcome. Thus, using game state to record video games requires very careful game software design to ensure that replay, using the identical game state information, produces the identical outcome.
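The divergence described here stems from sampling controller input at the variable computed-frame rate rather than at a fixed simulation rate. A toy Python sketch (not the recorded-game-state method of the cited patent) contrasting the two designs:

def input_samples_per_second(frame_times, tie_to_frames=True, fixed_hz=60):
    # Toy model: how many controller-input samples occur in one second.
    if tie_to_frames:
        # One sample per rendered frame: the count depends on the actual
        # frame durations, which differ between recording and replay.
        return len(frame_times)
    return fixed_hz        # fixed-timestep sampling ignores render speed

recording = [1 / 53.0] * 53   # the recorded run rendered 53 frames that second
replay    = [1 / 52.0] * 52   # the replayed run rendered only 52 frames

print(input_samples_per_second(recording), input_samples_per_second(replay))
# 53 vs 52: the replay sees different inputs and can produce a different outcome.
print(input_samples_per_second(recording, tie_to_frames=False),
      input_samples_per_second(replay, tie_to_frames=False))    # 60 == 60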
Another prior art approach for recording video games is to simply record the video output of a PC or video game system (e.g., to a VCR, a DVD recorder, or to a video capture card in a PC). The video can then be rewound and replayed, or, alternatively, the recorded video can be uploaded to the Internet, typically after being compressed. A disadvantage of this approach is that when a 3D game sequence is played back, the user is limited to viewing the sequence from only the viewpoint from which the sequence was recorded. In other words, the user cannot change the viewpoint of the scene.
Further, when compressed video of a recorded game sequence played on a home PC or game console is made available to other users over the Internet, even if the video is compressed in real time, it may be impossible to upload the compressed video to the Internet in real time. The reason is that many homes in the world connected to the Internet have highly asymmetric broadband connections (e.g., DSL and cable modem connections typically have far higher downstream bandwidth than upstream bandwidth). Compressed high-resolution video sequences often have higher bit rates than the available upstream bandwidth of the network, making it impossible to upload them in real time. Thus, there would be a significant delay after the game sequence is played (perhaps minutes or even hours) before another user on the Internet would be able to view the game. Although this delay is tolerable in certain situations (e.g., watching a player's accomplishments that occurred at an earlier time), it eliminates the ability to watch a game live (e.g., a basketball tournament played by top players) or the "instant replay" capability as the game is played live.
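The asymmetric-connection problem reduces to comparing the compressed video bit rate with the upstream link capacity. A Python sketch with assumed figures (the rates are illustrative, not from the patent):

def realtime_upload_possible(video_kbps: float, upstream_kbps: float) -> bool:
    return video_kbps <= upstream_kbps

compressed_video_kbps = 5000     # assumed high-resolution game capture
upstream_kbps = 384              # assumed asymmetric DSL upstream rate

print(realtime_upload_possible(compressed_video_kbps, upstream_kbps))  # False
# Uploading a 10-minute recording after the fact:
delay_min = 10 * 60 * compressed_video_kbps / upstream_kbps / 60
print(round(delay_min), "minutes")   # ~130 minutes before others can watch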
Another prior art approach allows a viewer with a television to watch video games live, but only under the control of a television production crew. Some television channels, in both the United States and other countries, provide video game viewing channels on which the television audience is able to watch certain video game users (e.g., top-rated players playing in tournaments). This is accomplished by feeding the video output of the video game systems (PCs and/or consoles) into the video distribution and processing equipment of the television channel. This is not unlike the television channel broadcasting a live basketball game, in which several cameras provide live feeds from different angles around the basketball court. The television channel is then able to make use of its video/audio processing and special effects equipment to manipulate the output from the various video game systems. For example, the television channel can overlay text on top of the video from a video game that indicates the status of different players (just as it might overlay text during a live basketball game), and the television channel can overdub audio from a commentator who can discuss the action occurring during the games. Additionally, the video game output can be combined with video from cameras recording the actual players (e.g., showing their emotional reactions to the game).
One problem with this approach is that such live video feeds must be available to the television channel's video distribution and processing equipment in real time in order for the broadcast to have the excitement of live transmission. As previously discussed, however, this is frequently impossible when the video game system is being run from the home, especially if part of the broadcast includes live video from a camera capturing the real-world video of a player. Further, in a tournament situation, there is the concern that a gamer at home might modify the game and cheat, as previously described. For these reasons, such video game broadcasts on television are frequently arranged with players and video game systems aggregated at a common location (e.g., at a television studio or in an arena), where the television production equipment can accept the video feeds from multiple video game systems and potentially from live cameras.
Although such prior art video game television channels can provide a very exciting presentation to the television audience - an experience akin to a live sporting event, with the video game players presented as "athletes", both in terms of their actions in the video game world and in terms of their actions in the real world - these video game systems are often limited to situations in which the players are in close physical proximity to one another. And, since television channels are broadcast, each broadcast channel can show only one video stream, selected by the television channel's production crew. Because of these limitations, and the high cost of airtime, production equipment and production crews, such television channels typically show only top-rated players playing in top tournaments.
Additionally, a given television channel broadcasting a full-screen image of a video game to the entire television audience shows only one video game at a time. This severely limits the television viewer's choices. For example, a viewer may not be interested in the game(s) shown at a given time. Another viewer may only be interested in watching the gameplay of a particular player who is not being shown on the television channel at a given time. In other cases, a viewer may only be interested in watching how an expert player handles a particular level in a game. Still other viewers may wish to control the viewpoint from which the video game is watched, one that differs from the viewpoint chosen by the production team, etc. In short, a television viewer may have a myriad of preferences in watching video games that are not accommodated by a particular broadcast on a television network, even if several different television channels are available. For all of the foregoing reasons, prior art video game television channels have significant limitations in presenting video games to television viewers.
Another drawback of prior art video game systems and application software systems is that they are complex and commonly suffer from errors, crashes, and/or unintended and undesired behavior (collectively, "bugs"). Although games and applications typically go through a debugging and tuning process (frequently called "software quality assurance," or SQA) before release, almost invariably, once the game or application is released to a wide audience, bugs crop up unexpectedly. Unfortunately, it is difficult for the software developer to identify and track down many of the bugs after release. It can be difficult for developers even to become aware of a bug, and when they do learn of one, only a limited amount of information may be available to them for identifying its cause. For example, a user may call the game developer's customer support line and leave a message stating that while playing the game, the screen started to flash, then turned solid blue, and the PC froze. That provides the SQA team with very little useful information for tracking down the bug. Some games or applications that are connected online can sometimes provide more information in certain cases. For example, a "watchdog" process can sometimes be used to monitor the game or application for crashes. The watchdog process can gather statistics about the state of the game or application process when it crashes (for example, the state of the stack, memory usage, how far the game or application had progressed, etc.) and then upload that information to the SQA team via the Internet. But in a complex game or application, deciphering such information to determine exactly what the user was doing at the time of the crash can take a very long time. Even then, it may be impossible to determine what sequence of events led to the crash.
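By way of illustration only, the following is a minimal sketch of such a watchdog process; the names (GAME_CMD, SQA_URL) and the reporting scheme are illustrative assumptions, not a description of any particular prior art product:

```python
# Hypothetical sketch of a prior-art watchdog process: it launches the game,
# waits for it to exit, and reports coarse crash statistics to an SQA
# endpoint. GAME_CMD and SQA_URL are illustrative assumptions.
import json
import resource            # POSIX-only resource accounting
import subprocess
import urllib.request

GAME_CMD = ["./game"]                              # command that starts the game
SQA_URL = "https://sqa.example.com/crash-report"   # illustrative endpoint

def run_with_watchdog() -> None:
    proc = subprocess.Popen(GAME_CMD)
    returncode = proc.wait()
    if returncode == 0:
        return  # clean exit, nothing to report
    # On a crash, gather whatever coarse state is visible to the watchdog.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    report = {
        "returncode": returncode,        # e.g., -11 for SIGSEGV on POSIX
        "max_rss_kb": usage.ru_maxrss,   # peak memory usage of the game
        "user_cpu_s": usage.ru_utime,    # roughly how far the game progressed
    }
    req = urllib.request.Request(
        SQA_URL, data=json.dumps(report).encode(),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)          # pass the report to SQA via the Internet

if __name__ == "__main__":
    run_with_watchdog()
```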
Yet another problem associated with PCs and game consoles is that they are subject to service issues that greatly inconvenience the consumer. Service issues also affect the manufacturer of the PC or game console, which is typically required to send a special box to safely ship the broken PC or console, and then to incur the cost of repair if the PC or console is under warranty. The game or application software publisher can also be affected by lost sales (or lost online service use) while the PC and/or console is in a state of repair.
FIG. 1 shows a prior art video game system such as a Sony Playstation® 3, Microsoft Xbox 360®, Nintendo Wii™, Windows-based personal computer, or Apple Macintosh. Each of these systems includes a central processing unit (CPU) for executing program code, typically a graphics processing unit (GPU) for performing advanced graphical operations, and multiple forms of input/output (I/O) for communicating with external devices and users. For simplicity, these components are shown combined together as a single unit 100. The prior art video game system of FIG. 1 is also shown to include an optical storage device 104 (e.g., a high-capacity optical disc drive); a hard drive 103 for storing video game program code and data; a network connection 105 for playing multiplayer games and for downloading games, patches, demos, or other media; random access memory (RAM) 101 for storing program code currently being executed by the CPU/GPU 100; a game controller 106 for receiving input commands from the user during gameplay; and a display 102 (e.g., an SDTV/HDTV or a computer monitor).
The prior art system shown in FIG. 1 suffers from several limitations. First, the optical drive 104 and hard drive 103 tend to have much slower access speeds than the access speed of the RAM 101. When working directly through the RAM 101, the CPU/GPU 100 can, in practice, process far more polygons per second than is possible when the program code and data are read directly from the hard drive 103 or optical drive 104, because the RAM 101 generally has much higher bandwidth and does not suffer from the relatively long seek delays of disc mechanisms. But only a limited amount of RAM is provided in these prior art systems (e.g., 256-512 MB). Therefore, a "Loading..." sequence of frames is frequently required, during which the RAM 101 is periodically filled with the data for the next scene of the video game.
Some systems attempt to overlap the loading of program code concurrently with the gameplay, but this can only be done when there is a known sequence of events (e.g., if a car is driving down a road, the geometry of the approaching buildings along the roadside can be loaded while the car is driving). For complex and/or rapid scene changes, this type of overlapping usually does not work. For example, in the case where the user is in the midst of a battle and the RAM 101 is completely filled with data representing the objects within view at that moment, if the user rapidly pans the view to the left to see objects that are not currently loaded in the RAM 101, a discontinuity in the action will result, since there is not enough time to load the new objects from the hard drive 103 or optical media 104 into the RAM 101.
Another problem with the system of FIG. 1 arises from the limited storage capacity of the hard drive 103 and optical media 104. Although disk storage devices can be manufactured with relatively large storage capacities (e.g., 50 gigabytes or more), they still do not provide enough storage capacity for certain scenarios encountered in current video games. For example, as previously mentioned, a soccer video game may allow the user to choose among dozens of teams, players, and stadiums throughout the world. For each team, each player, and each stadium, a large number of texture maps and environment maps are needed to characterize the 3D surfaces in the world (e.g., each team has a unique jersey, each requiring a unique texture map).
One technique used to address this latter problem is to pre-compute the texture and environment maps once they are selected by the user in the game. This may involve a number of computationally intensive processes, including decompressing images, 3D mapping, shading, organizing data structures, etc. As a result, the user may experience a delay while these computations are performed. One way to reduce this delay, in principle, is to perform all of these computations — including for every permutation of team, player roster, and stadium — when the game is originally developed. The released version of the game would then include all of this pre-processed data, stored on the optical media 104 or on one or more servers on the Internet, with just the selected pre-processed data for a given team, player roster, and stadium selection downloaded through the Internet to the hard drive 103 when the user makes a choice. As a practical matter, however, such pre-loaded data of every permutation possible in gameplay can easily amount to terabytes of data, far exceeding the capacity of today's optical media devices. Moreover, the data for a given team, player roster, and stadium selection can easily run to hundreds of megabytes or more. With a home network connection of, say, 10 Mbps, downloading this data through the network connection 105 may take longer than computing the data locally.
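The trade-off can be made concrete with the figures quoted above. A rough calculation, taking 500 MB as an illustrative point inside the "hundreds of megabytes" range:

```python
# Rough arithmetic for the download-versus-compute trade-off described above.
# The 10 Mbps link is the figure quoted in the text; the 500 MB payload is an
# illustrative assumption within the quoted "hundreds of megabytes" range.
payload_mb = 500                     # selection-specific pre-processed data
link_mbps = 10                       # nominal home connection, as quoted
payload_megabits = payload_mb * 8    # 500 MB -> 4,000 megabits
seconds = payload_megabits / link_mbps
print(f"{seconds:.0f} s  (~{seconds / 60:.1f} minutes)")   # 400 s, ~6.7 minutes
```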
Thus, with the prior art game architecture shown in FIG. 1, the user is subjected to significant delays between major scene transitions in complex games.
Another problem with prior art approaches such as that shown in FIG. 1 is that, over the years, video games tend to become more advanced and to require more CPU/GPU processing power. Thus, even assuming an unlimited amount of RAM, video game hardware requirements eventually exceed the peak level of processing power available in these systems. As a result, users must upgrade their gaming hardware every few years to keep pace (or play newer games at lower quality levels). One consequence of this trend toward ever more advanced video games is that home video game machines are typically economically inefficient, because their cost is usually driven by the requirements of the highest-performance game they can support. For example, an XBox 360 might be used to play a game like "Gears of War," which demands a high-performance CPU and GPU and hundreds of megabytes of RAM, or the XBox 360 might be used to play Pac Man, a game from the 1970s that requires only kilobytes of RAM and a very low-performance CPU. Indeed, an XBox 360 has enough computing power to host many simultaneous Pac Man games at once.
Video game machines are also typically idle for most of the hours in a week. According to a July 2006 Nielsen Entertainment study of active gamers 13 years of age and older, on average, active gamers spend fourteen hours a week playing console video games, or just 12% of the total time in a week. This means that the average video game console sits unused 88% of the time, which is an inefficient use of an expensive resource. This is particularly significant given that video game consoles are often subsidized by the manufacturer to bring down the purchase price (with the expectation that the subsidy will be earned back through royalties on future video game software purchases).
Video game consoles also incur costs associated with almost any consumer electronic device. For example, the electronics and mechanisms of the system must be housed in an enclosure. The manufacturer must offer a service warranty. The retailer who sells the system must collect a margin on the sale of the system and/or on the sale of video game software. All of these factors add to the cost of a video game console, which must either be financed by the manufacturer, passed along to the consumer, or both.
In addition, piracy is a major problem for the video game industry. The security mechanisms utilized on virtually every major video game system are "cracked" over time, resulting in unauthorized copying of video games. For example, the Xbox 360 security system was cracked in July 2006, and users are now able to download illegal copies online. Games that are downloadable (e.g., games for the PC or Mac) are particularly vulnerable to piracy. In certain regions of the world where piracy is weakly policed, there is essentially no viable market for standalone video game software, because users can buy pirated copies as readily as legal copies, for a tiny fraction of the price. Also, in many parts of the world, the cost of a game console is such a high percentage of income that, even if piracy were controlled, few people could afford a current-generation gaming system.
Further, the used game market reduces revenue for the video game industry. When a user has grown tired of a game, he can sell the game to a retailer, who resells it to other users. This unauthorized but common practice significantly reduces the revenues of game publishers. Similarly, sales typically drop by about 50% when a platform transition occurs every few years. This is because users stop buying games for the older platform when they know that a newer version is about to be released (e.g., when the Playstation 3 was about to be released, users stopped buying Playstation 2 games). Taken together, the loss of sales and the increased development costs associated with new platforms can have a very significant adverse impact on the profitability of game developers.
New game consoles are also very expensive. The Xbox 360, the Nintendo Wii, and the Sony Playstation 3 all retail for hundreds of dollars. High-powered personal computer gaming systems can cost up to $8,000. This represents a significant investment for users, especially considering that the hardware becomes obsolete after a few years and that many systems are purchased for children.
One approach to addressing the above problems is online gaming, in which the game program code and data are hosted on a server and delivered to client machines on demand as compressed video and audio streamed over a digital broadband network. Some companies, such as G-Cluster in Finland (now a subsidiary of Japan's SOFTBANK Broadmedia), currently provide such services online. Similar gaming services have become available in local networks, such as those within hotels, and are offered by DSL and cable television providers. A major drawback of these systems is the problem of latency, i.e., the time it takes a signal to travel to and from the game server, which is typically located in an operator's "head-end." Fast-action video games (also known as "twitch" video games) require very low latency between the time the user performs an action with the game controller and the time the display screen is updated to show the result of the user's action. Low latency is needed so that the user has the perception that the game is responding "instantly." Users may be satisfied with different latency intervals depending on the type of game and the skill level of the user. For example, 100 ms of latency may be tolerable for a slow casual game (like backgammon) or a slow-paced role-playing game, but for a fast-action game, latency in excess of 70 or 80 ms can degrade the user's performance in the game and is therefore unacceptable. For instance, in a game requiring fast reaction time, accuracy drops sharply as latency increases from 50 to 100 ms.
When a game or application server is installed in a nearby, controlled network environment, or one where the network path to the user is predictable and/or can tolerate bandwidth peaks, it is far easier to manage latency, both in terms of maximum latency and in terms of the consistency of the latency (e.g., so that the user observes steady motion from the digital video streamed through the network). Such a level of control can be achieved between a cable TV network head-end and a cable subscriber's home, or from a DSL central office to a DSL subscriber's home, or in a commercial office local area network (LAN) environment from a server to a user. Also, it is possible to obtain dedicated private point-to-point connections between businesses that have guaranteed bandwidth and latency. But in a game or application system that hosts games in a server center connected to the general Internet and then streams compressed video to the user through a broadband connection, latency is affected by many factors, resulting in severe limitations in the deployment of prior art systems.
In a typical broadband-connected home, a user may have a DSL or cable modem for broadband service. Such broadband services commonly incur a round-trip latency of 25 ms (and at times more) between the user's home and the general Internet. In addition, there are round-trip latencies incurred from routing data through the Internet to a server center. The latency through the Internet varies based on the route that the data is given and the delays incurred as it is routed. In addition to routing delays, round-trip latency is also incurred due to the speed of light traveling through the optical fiber that interconnects most of the Internet. For example, for each 1,000 miles (1,600 km), approximately 22 ms of round-trip latency is incurred due to the speed of light through the optical fiber and other overhead.
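The 22 ms figure can be checked against the physics. Light in optical fiber travels at roughly the speed of light in vacuum divided by the fiber's refractive index (about 1.47 is a typical assumed value), so propagation accounts for most, but not all, of the quoted round-trip latency:

```python
# Propagation-delay arithmetic behind the ~22 ms-per-1,000-miles figure.
# The refractive index of 1.47 is a typical assumed value for optical fiber;
# the remainder of the quoted figure is routing and equipment overhead.
C_KM_S = 299_792          # speed of light in vacuum, km/s
FIBER_INDEX = 1.47        # typical refractive index of optical fiber
one_way_km = 1_600        # 1,000 miles
fiber_speed = C_KM_S / FIBER_INDEX             # ~204,000 km/s in the glass
rtt_ms = 2 * one_way_km / fiber_speed * 1_000  # ~15.7 ms of pure propagation
print(f"{rtt_ms:.1f} ms propagation; the text quotes ~22 ms including overhead")
```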
Additional latency can occur due to the data rate of the stream sent through the Internet. For example, if a user has DSL service that is sold as "6 Mbps DSL service," in practice the user will probably get less than 5 Mbps of downstream throughput at best, and will likely see the connection degrade periodically due to various factors, such as congestion during peak load periods at the Digital Subscriber Line Access Multiplexer (DSLAM). A similar issue can reduce the data rate of a cable modem connection sold as "6 Mbps cable modem service" to far less than that rate if there is congestion in the local shared coaxial cable looped through the neighborhood, or elsewhere in the cable modem system network. If data packets at a stable rate of 4 Mbps are streamed one way in user datagram protocol (UDP) format from a server center through such connections, and everything is working well, the packets will pass through without incurring additional latency; but if there is congestion (or other impediments) and only 3.5 Mbps is available to stream data to the user, then in a typical situation either packets will be dropped, resulting in lost data, or packets will queue up at the point of congestion until they can be sent, thereby introducing additional latency. Different points of congestion have different queuing capacities to hold delayed packets, so in some cases packets that cannot make it through the congestion point are dropped immediately. In other cases, several megabits of data are queued up and eventually sent. But in almost all cases, queues at points of congestion have capacity limits, and once those limits are exceeded, the queues overflow and packets are dropped. Thus, to avoid incurring additional latency (or worse, packet loss), it is necessary to avoid exceeding the data rate capacity from the game or application server to the user.
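The queuing behavior described above can be sketched numerically. In the following illustration (the queue capacity is an assumed value; actual congestion points vary), a 4 Mbps stream enters a link that can carry only 3.5 Mbps, and the backlog — and thus the added latency — grows until the queue overflows and packets are dropped:

```python
# Sketch of the congestion-queue effect described above. Each second, the
# offered rate exceeds the available rate by 0.5 Mbit, which accumulates as
# backlog (added latency) until the queue's capacity limit is exceeded.
offered_mbps = 4.0       # stable rate streamed from the server center
available_mbps = 3.5     # what the congested link can actually carry
queue_limit_mbit = 2.0   # illustrative queue capacity at the congestion point

backlog_mbit = 0.0
for second in range(1, 11):
    backlog_mbit += offered_mbps - available_mbps   # 0.5 Mbit accumulates/s
    if backlog_mbit > queue_limit_mbit:
        print(f"t={second}s: queue overflows, packets dropped")
        backlog_mbit = queue_limit_mbit
    else:
        added_latency_ms = backlog_mbit / available_mbps * 1_000
        print(f"t={second}s: backlog {backlog_mbit:.1f} Mbit "
              f"-> +{added_latency_ms:.0f} ms latency")
```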
Latency is also incurred by the time required to compress video in the server and decompress video in the client device. Latency is further incurred while the video game running on the server is computing the next frame to be displayed. Currently available video compression algorithms suffer from either high data rates or high latency. For example, Motion JPEG is an intra-frame-only, lossy compression algorithm characterized by low latency: each frame of video is compressed independently of every other frame. When a client device receives a frame of Motion JPEG compressed video, it can immediately decompress and display that frame, resulting in very low latency. But because each frame is compressed separately, the algorithm is unable to exploit similarities between successive frames, and as a result, intra-frame-only video compression algorithms suffer from very high data rates. For example, 640x480 video at 60 fps (frames per second) compressed with Motion JPEG may require 40 Mbps (megabits per second) or more of data. Such high data rates for such low-resolution video windows would be prohibitively expensive in many broadband applications (and certainly for most consumer Internet-based applications). Further, because each frame is compressed independently, artifacts in the frames that result from the lossy compression are likely to appear in different places in successive frames. This can result in what appears to the viewer as moving visual artifacts when the video is decompressed.
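The 40 Mbps figure is consistent with simple arithmetic on the quoted parameters (the 24-bit pixel depth is an assumption for illustration):

```python
# Arithmetic behind the 40 Mbps figure for intra-frame-only compression of
# 640x480 video at 60 fps: derive the raw rate and the implied compression
# ratio. The 24-bit RGB pixel depth is an illustrative assumption.
width, height, fps = 640, 480, 60
bits_per_pixel = 24                  # 8 bits per RGB channel, assumed
raw_mbps = width * height * bits_per_pixel * fps / 1e6   # ~442 Mbps raw
quoted_mbps = 40                     # the rate quoted in the text for MJPEG
print(f"raw: {raw_mbps:.0f} Mbps, implied ratio: ~{raw_mbps / quoted_mbps:.0f}:1")
```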
Other compression algorithms, such as MPEG2, H.264, or VC9 from Microsoft, as they are used in prior art configurations, can achieve high compression ratios, but at the cost of high latency. Such algorithms utilize inter-frame as well as intra-frame compression. Periodically, such algorithms perform an intra-frame-only compression of a frame. Such a frame is known as a key frame (typically referred to as an "I-frame"). Then, these algorithms typically compare the I-frame with both prior frames and successive frames. Rather than compressing the prior and successive frames independently, the algorithm determines what has changed in the image relative to the I-frame, and then stores those changes as so-called "B-frames," for the changes preceding the I-frame, and "P-frames," for the changes following the I-frame. This results in much lower data rates than intra-frame-only compression. But it typically comes at the cost of higher latency. An I-frame is typically much larger than a B-frame or P-frame (often 10 times larger), and as a result, it takes proportionately longer to transmit at a given data rate.
Consider, for example, a situation where the I-frames are 10 times the size of the B-frames and P-frames, and there are 29 B-frames + 30 P-frames = 59 inter-frames for every single I-frame, or a total of 60 frames for each "group of pictures" (GOP). So, at 60 fps, there is one 60-frame GOP each second. Suppose the transmission channel has a maximum data rate of 2 Mbps. To achieve the highest-quality video in the channel, the compression algorithm would produce a 2 Mbps data stream, and given the above ratios, this would result in 2 megabits (Mbit)/(59+10) = 30,394 bits per inter-frame and 303,935 bits per I-frame. When the compressed video stream is received by the decompression algorithm, in order to play the video steadily, each frame needs to be decompressed and displayed at a regular interval (e.g., 60 fps). To achieve this result, if any frame is subject to transmission latency, then all of the frames must be delayed by at least that latency, so the worst-case frame latency defines the latency for every video frame. The I-frames introduce the longest transmission latencies, since they are the largest, and an entire I-frame must be received before it (or any inter-frame dependent on it) can be decompressed and displayed. Given that the channel data rate is 2 Mbps, transmitting an I-frame will take 303,935/2 Mbps = 145 ms.
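The arithmetic of this example can be reproduced directly (using binary megabits, 2 Mbit = 2,097,152 bits, which matches the figures quoted above):

```python
# Frame-size and I-frame latency arithmetic from the example above, under the
# same assumptions: 60-frame GOP, I-frame 10x the size of an inter-frame,
# 2 Mbps channel (binary megabits, matching the text's figures).
channel_bps = 2 * 2**20              # 2 Mbit/s = 2,097,152 bits/s
gop_frames = 60                      # 1 I-frame + 59 inter-frames per second
i_frame_weight = 10                  # I-frame is 10x an inter-frame
weights = (gop_frames - 1) + i_frame_weight            # 59 + 10 = 69
inter_frame_bits = channel_bps / weights               # ~30,394 bits
i_frame_bits = i_frame_weight * inter_frame_bits       # ~303,935 bits
i_frame_latency_ms = i_frame_bits / channel_bps * 1_000
print(f"inter-frame: {inter_frame_bits:,.0f} bits, I-frame: {i_frame_bits:,.0f} bits")
print(f"I-frame transmission latency: {i_frame_latency_ms:.0f} ms")   # ~145 ms
```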
In an inter-frame video compression system as described above, using a large percentage of the transmission channel's bandwidth, high latencies result because of the large size of the I-frames relative to the average size of a frame. Put another way, although prior art inter-frame compression algorithms achieve a lower average per-frame data rate than intra-frame-only compression algorithms (e.g., 2 Mbps vs. 40 Mbps), they nonetheless suffer from a high peak per-frame data rate (e.g., 303,935 × 60 = 18.2 Mbps) because of the large I-frames. Bear in mind, though, that the above analysis assumes that the P-frames and B-frames are all much smaller than the I-frames. While this is generally true, it does not hold for frames with high image complexity uncorrelated with the previous frame, high motion, or scene changes. In such situations, the P-frames or B-frames can become as large as I-frames (and if a P- or B-frame ever becomes larger than an I-frame, a sophisticated compression algorithm will typically "force" an I-frame and substitute an I-frame for the P- or B-frame). So, peaks in data rate the size of an I-frame can occur in a digital video stream at any moment. Thus, with compressed video, when the average video data rate approaches the data rate capacity of the transmission channel (as is frequently the case, given the high data rate demands of video), the high peak data rates from the I-frames or from large P- or B-frames result in high frame latency.
Of course, the above discussion only characterizes the compression algorithm latency created by large B-, P-, or I-frames within a GOP. If B-frames are used, the latency is even higher. The reason is that, before a B-frame can be displayed, all of the B-frames after it and the I-frame must be received. Thus, in a group of pictures (GOP) sequence such as BBBBBIPPPPPBBBBBIPPPPP, where there are 5 B-frames before each I-frame, the first B-frame cannot be displayed by the video decompressor until the subsequent B-frames and the I-frame have been received. So, if video is streamed at 60 fps (i.e., 16.67 ms/frame), then 16.67 × 6 = 100 ms is required to receive the five B-frames and the I-frame before the first B-frame can be decompressed, regardless of the channel bandwidth — and that is with only 5 B-frames. Compressed video sequences with 30 B-frames are quite common. And at a low channel bandwidth such as 2 Mbps, the latency impact caused by the I-frame size adds substantially to the latency impact of waiting for the B-frames to arrive. Thus, on a 2 Mbps channel, with a large number of B-frames, it is quite easy to exceed 500 ms of latency or more using prior art video compression technology. If B-frames are not used (at the cost of a lower compression ratio for a given quality level), the B-frame latency is not incurred, but the latency caused by the peak frame sizes described above remains.
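The B-frame contribution in this example is likewise straightforward to reproduce:

```python
# Decode-order latency caused by B-frames, as in the example above: with
# 5 B-frames preceding each I-frame at 60 fps, the I-frame and intervening
# B-frames must arrive before the first B-frame can be displayed.
fps = 60
frame_ms = 1_000 / fps               # 16.67 ms per frame
b_frames = 5                         # B-frames before the I-frame
wait_ms = frame_ms * (b_frames + 1)  # five B-frames plus the I-frame
print(f"{wait_ms:.0f} ms before the first B-frame can be displayed")  # 100 ms
```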
The problem is exacerbated by the very nature of many video games. Video compression algorithms utilizing the GOP structure described above have been largely optimized for use with live video or motion picture material intended for passive viewing. Typically, the camera (whether a real camera or a virtual camera in the case of computer-generated animation) and the scene are relatively steady, simply because if the camera or scene moves around too jerkily, the film or video material is (a) typically unpleasant to watch, and (b) if it is being watched, the viewer is usually not closely following the action when the camera lurches suddenly (e.g., if the camera is bumped while shooting a child blowing out the candles on a birthday cake and suddenly jerks away from the cake and back, the viewers typically stay focused on the child and the cake, and disregard the brief interruption when the camera suddenly moves). In the case of a video interview or a video teleconference, the camera may be held in a fixed position and not move at all, resulting in very few data peaks. But highly active 3D video games are characterized by constant motion (e.g., consider a 3D racing game, where the entire frame is in rapid motion for the duration of the race, or consider first-person shooters, where the virtual camera is constantly moving around jerkily). Such video games can produce frame sequences with large and frequent peaks, during which the user may need to see clearly what is happening in those sudden motions. As such, compression artifacts are far less tolerable in highly active 3D video games. Thus, the video output of many video games, by their nature, produces a compressed video stream with very high and frequent peaks.
Given that users of fast-action, twitch video games have little tolerance for high latency, and given all of the above causes of latency, to date there have been limits on server-hosted video games that stream video over the Internet. Further, users of applications requiring a high degree of interactivity suffer similar limitations if the applications are hosted on the general Internet and stream video. Such services require a network configuration in which the hosting servers are set up directly in a head-end (in the case of cable broadband) or central office (in the case of digital subscriber lines (DSL)), or within a LAN (or on a dedicated private connection) in a commercial setting, so that the route and distance from the client device to the server is controlled to minimize latency, and peaks can be accommodated without incurring latency. LANs (typically rated at 100 Mbps-1 Gbps) and leased lines with adequate bandwidth can generally support peak bandwidth requirements (e.g., an 18 Mbps peak bandwidth is a small fraction of a 100 Mbps LAN capacity).
Peak bandwidth requirements can also be accommodated by residential broadband infrastructure if special accommodations are made. For example, on a cable TV system, digital video traffic can be given dedicated bandwidth that can handle peaks such as large I-frames. And on a DSL system, a higher-speed DSL modem can be provisioned, allowing for high peaks, or a dedicated connection that can handle higher data rates can be provisioned. But conventional cable modem and DSL infrastructure attached to the general Internet has far less tolerance for the peak bandwidth requirements of compressed video. So, online services that host video games or applications in server centers a long distance from the client devices, and then stream the compressed video output over the Internet through conventional residential broadband connections, suffer significant latency and peak bandwidth limitations — particularly for games and applications that require very low latency (e.g., first-person shooters and other multiplayer interactive action games, or applications requiring fast response time).
Brief Description of the Drawings
The present disclosure will be understood more fully from the following detailed description and from the accompanying drawings, which, however, should not be taken to limit the disclosed subject matter to the specific embodiments shown, but are for explanation and understanding only.
FIG. 1 shows the architecture of a prior art video game system.
FIGS. 2a-2b show a high-level system architecture according to one embodiment.
FIG. 3 shows actual, rated, and required data rates for communication between a client and a server.
FIG. 4a shows a hosting service and a client used according to one embodiment.
FIG. 4b shows illustrative latencies associated with communication between a client and a hosting service.
FIG. 4c shows a client device according to one embodiment.
FIG. 4d shows a client device according to another embodiment.
FIG. 4e shows an illustrative block diagram of the client device of FIG. 4c.
FIG. 4f shows an illustrative block diagram of the client device of FIG. 4d.
FIG. 5 shows an illustrative form of video compression that can be employed according to one embodiment.
FIG. 6a shows an illustrative form of video compression that can be employed in another embodiment.
FIG. 6b shows peaks in data rate associated with transmitting a low-activity, low-complexity video sequence.
FIG. 6c shows peaks in data rate associated with transmitting a high-activity, high-complexity video sequence.
FIGS. 7a-7b show illustrative video compression techniques employed in one embodiment.
FIG. 8 shows additional illustrative video compression techniques employed in one embodiment.
FIGS. 9a-9c show illustrative techniques employed in one embodiment for mitigating data rate peaks.
FIGS. 10a-10b show one embodiment that efficiently packs image fragments within packets.
FIGS. 11a-11d show embodiments that employ forward error correction techniques.
FIG. 12 shows one embodiment that uses multi-core processors for compression.
FIGS. 13a-13b show geographic positioning of, and communication between, hosting services according to various embodiments.
FIG. 14 shows illustrative latencies associated with communication between a client and a hosting service.
FIG. 15 shows an illustrative hosting service server center architecture.
FIG. 16 shows an illustrative screen shot of one embodiment of a user interface that includes a plurality of live video windows.
FIG. 17 shows the user interface of FIG. 16 following the selection of a particular video window.
FIG. 18 shows the user interface of FIG. 17 after zooming the particular video window to full-screen size.
FIG. 19 shows illustrative collaborative user video data combined on the screen of a multiplayer game.
FIG. 20 shows an illustrative user page for a game player on a hosting service.
FIG. 21 shows an illustrative 3D interactive advertisement.
FIG. 22 shows an illustrative sequence of steps for producing a photoreal image having a textured surface, from surface capture of a live performance.
FIG. 23 shows an illustrative user interface page that allows selection of linear media content.
FIG. 24 is a graph that illustrates the amount of time that elapses before a web page goes live, versus connection speed.
Description of Illustrative Embodiments
In the following description, specific details are set forth, such as device types, system configurations, communication methods, etc., in order to provide a thorough understanding of the present disclosure. However, persons having ordinary skill in the relevant arts to which this invention pertains will appreciate that these specific details may not be needed to practice the embodiments described.
FIGS. 2a-2b provide a high-level architecture of two embodiments in which video games and applications are hosted by a hosting service 210 and accessed by client devices 205 at user premises 211 (note that "user premises" means wherever the user happens to be, including outdoors when using a mobile device), over the Internet 206 (or other public or private network), under a subscription service. The client devices 205 may be general-purpose computers, such as Microsoft Windows- or Linux-based PCs or Apple, Inc. Macintosh computers, with a wired or wireless connection to the Internet and with an internal or external display device 222; or they may be dedicated client devices, such as a television set-top box (with a wired or wireless connection to the Internet) that outputs video and audio to a monitor or TV set 222; or they may be mobile devices, presumably with a wireless connection to the Internet.
Any of these devices may have their own user input devices (e.g., keyboards, buttons, touch screens, trackpads or light pens, video capture cameras and/or motion-tracking cameras, etc.), or they may use external input devices 221 (e.g., keyboards, mice, game controllers, light pens, video capture cameras and/or motion-tracking cameras, etc.), connected by wire or wirelessly. As described in more detail below, the hosting service 210 includes servers of various levels of performance, including those with high-powered CPU/GPU processing capability. While playing a game or using an application on the hosting service 210, the home or office client device 205 receives keyboard and/or controller input from the user, and then it transmits the controller input through the Internet 206 to the hosting service 210, which in response executes the gaming program code and generates successive frames of video output (a sequence of video images) for the game or application software (e.g., if the user presses a button which would direct a character on the screen to move to the right, the game program would then create a sequence of video images showing the character moving to the right). This sequence of video images is then compressed using a low-latency video compressor, and the hosting service 210 then transmits the low-latency video stream through the Internet 206. The home or office client device then decodes the compressed video stream and renders the decompressed video images on a monitor or TV. Consequently, the computing and graphics hardware requirements of the client device 205 are significantly reduced. The client 205 need only have the processing power to forward the keyboard/controller input to the Internet 206 and to decode and decompress the compressed video stream received from the Internet 206, which virtually any personal computer is capable of doing today in software on its CPU (e.g., an Intel Corporation Core Duo CPU running at approximately 2 GHz is capable of decompressing 1280x720 HDTV encoded using compressors such as H.264 and Windows Media VC9). And, in the case of any client device, dedicated chips can also perform video decompression for such standards in real time at far lower cost and with far less power consumption than a general-purpose CPU, such as would be required for a modern PC. Notably, to perform the function of forwarding controller input and decompressing video, home client devices 205 do not require any specialized graphics processing units (GPUs), optical drives, or hard drives, such as the prior art video game system shown in FIG. 1.
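Purely by way of illustration, the essential client loop described above might be sketched as follows; the address, port numbers, and the decode/display/input functions are placeholders and assumptions, not part of the disclosed implementation:

```python
# Minimal, hypothetical sketch of the thin-client loop: control inputs go up
# to the hosting service as small UDP packets; compressed video frames stream
# back down and are decoded and displayed. HOSTING_SERVICE and the three
# placeholder functions are illustrative assumptions.
import socket

HOSTING_SERVICE = ("hosting.example.com", 9000)   # illustrative address

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("", 9001))

def poll_input() -> bytes: ...        # placeholder: keyboard/controller state
def decode_frame(data: bytes): ...    # placeholder: compressed-video decoder
def display(frame) -> None: ...       # placeholder: render to monitor/TV

while True:
    # 1. Forward any keyboard/controller input to the hosting service.
    keys = poll_input()
    if keys:
        sock.sendto(keys, HOSTING_SERVICE)
    # 2. Receive the next compressed video packet and display it. All game
    #    logic and rendering happened on the server; the client only decodes.
    packet, _ = sock.recvfrom(65_536)
    display(decode_frame(packet))
```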
As games and application software become more complex and more photorealistic, they will require higher-performance CPUs and GPUs, more RAM, and larger, faster disk drives, and the computing power of the hosting service 210 can be continually upgraded; but the end user will not be required to update the home or office client platform 205, since its processing requirements remain constant for a given display resolution and frame rate with a given video decompression algorithm. Thus, the hardware limitations and compatibility issues that exist today do not exist in the system illustrated in FIGS. 2a-2b.
Further, because the game and application software executes only on servers in the hosting service 210, there is never a copy of the game or application software (either in the form of optical media or as downloaded software) in the user's home or office ("office," as used herein, unless otherwise qualified, includes any non-residential setting, including schoolrooms, for example). This significantly mitigates the likelihood of the game or application software being illegally copied (pirated), as well as the likelihood of a valuable database usable by the game or application software being pirated. Indeed, if specialized servers (e.g., requiring very expensive, large, or noisy equipment) are needed to run the game or application software, making it impractical for home or office use, then even if a pirated copy of the game or application software were obtained, it would not be operable in the home or office.
In one embodiment, the hosting service 210 provides software development tools to game or application software developers 220 (which refers generally to software development companies, game or television studios, or game or application software publishers) that design video games, so that they may design games capable of being executed on the hosting service 210. Such tools allow developers to exploit features of the hosting service that would not normally be available in a standalone PC or game console (e.g., fast access to very large databases of complex geometry ("geometry," unless otherwise qualified, is used herein to refer to polygons, textures, rigging, lighting, behaviors, and other components and parameters that define 3D datasets)).
Different business models are possible under this architecture. Under one model, the hosting service 210 collects a subscription fee from the end user and pays a royalty to the developers 220, as shown in FIG. 2a. In an alternate implementation, shown in FIG. 2b, the developers 220 collect a subscription fee directly from the user and pay the hosting service 210 for hosting the game or application content. These underlying principles are not limited to any particular business model for providing online gaming or application hosting.
Compressed Video Features
As previously discussed, one of the significant problems in providing video game or application software services online is latency. A latency of 70-80 ms (from the point a user actuates an input device to the point a response is displayed on the display screen) is at the upper limit for games and applications requiring fast response time. However, this is very difficult to achieve in the context of the architecture shown in FIGS. 2a and 2b, due to a number of practical and physical constraints.
As indicated in FIG. 3, when a user subscribes to an Internet service, the connection is typically rated at a nominal maximum data rate 301 to the user's home or office. Depending on the provider's policies and routing equipment capabilities, that maximum data rate may be more or less strictly enforced, but the actual available data rate is typically lower for any of several different reasons. For example, there may be too much network traffic at the DSL central office or on the local cable modem loop, or there may be noise on the cabling causing dropped packets, or the provider may establish a maximum number of bits per month per user. Currently, the maximum downstream data rate for cable and DSL services typically ranges from several hundred kilobits per second (Kbps) to 30 Mbps. Cellular services are typically limited to hundreds of Kbps of downstream data. However, the speed of broadband services and the number of users who subscribe to broadband services will increase dramatically over time. Currently, some analysts estimate that 33% of US broadband subscribers have a downstream data rate of 2 Mbps or more, and some analysts predict that by 2010, over 85% of US broadband subscribers will have a data rate of 2 Mbps or more.
As indicated in FIG. 3, the actual available maximum data rate 302 may fluctuate over time. Thus, in a low-latency online game or application software context, it is sometimes difficult to predict the actual available data rate for a particular video stream. If the data rate 303 required to sustain a given quality level at a given number of frames per second (fps) at a given resolution (e.g., 640x480 at 60 fps) for a certain level of scene complexity and motion rises above the actual available maximum data rate 302 (as indicated by the peak in FIG. 3), several problems can occur. For example, some Internet services will simply drop packets, resulting in lost data and distorted/lost images on the user's video screen. Other services will temporarily buffer (i.e., queue up) the additional packets and provide them to the client at whatever data rate is available, resulting in increased latency — an unacceptable result for many video games and applications. Finally, some Internet service providers will see the increased data rate as a malicious attack, such as a denial-of-service attack (a well-known technique used by hackers to disable network connections), and will cut off the user's Internet connection for a specified period of time. Thus, the embodiments described herein take steps to ensure that the required data rate for a video game does not exceed the maximum available data rate.
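One illustrative way to keep the required data rate 303 below the available data rate 302 is a feedback controller on the encoder's target bit rate; the estimator and safety margin below are assumptions for the purpose of illustration, not the specific steps taken by the embodiments described herein:

```python
# Hypothetical sketch: cap the encoder's target bit rate below a running
# estimate of the actually available downstream rate, so required rate (303)
# never exceeds available rate (302). The EWMA estimator and 0.8 margin are
# illustrative assumptions.
class RateController:
    def __init__(self, nominal_mbps: float, margin: float = 0.8):
        self.available_mbps = nominal_mbps   # start from the rated speed (301)
        self.margin = margin                 # stay safely below the estimate

    def on_throughput_sample(self, measured_mbps: float) -> None:
        # Exponentially weighted estimate of the fluctuating available rate.
        self.available_mbps = 0.9 * self.available_mbps + 0.1 * measured_mbps

    def encoder_target_mbps(self) -> float:
        # Cap the required rate so it never exceeds the current estimate.
        return self.margin * self.available_mbps

ctrl = RateController(nominal_mbps=6.0)
ctrl.on_throughput_sample(4.2)               # congestion observed
print(f"encode at <= {ctrl.encoder_target_mbps():.2f} Mbps")
```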
Hosting Service Architecture
FIG. 4a shows a hosting service 210 architecture according to one embodiment. The hosting service 210 can either be located in a single server center, or it can be distributed across a plurality of server centers (to provide lower-latency connections to users who have lower-latency paths to certain server centers than to others, to provide load balancing among users, and to provide redundancy in the event one or more server centers fail). The hosting service 210 may eventually include hundreds of thousands or even millions of servers 402, serving a very large user base. A hosting service control system 401 provides overall control for the hosting service 210 and directs routers, servers, video compression systems, billing and accounting systems, etc. In one embodiment, the hosting service control system 401 is implemented on a distributed Linux-based processing system tied to RAID arrays used to store the databases for user information, server information, and system statistics. In the foregoing descriptions, the various actions implemented by the hosting service 210, unless attributed to other specific systems, are initiated and controlled by the hosting service control system 401.
The hosting service 210 includes a number of servers 402, such as those currently available from Intel, IBM, Hewlett Packard, and others. Alternatively, the servers 402 can be assembled in a custom configuration of components, or can eventually be integrated so that an entire server is implemented as a single chip. Although this figure shows a small number of servers 402 for the sake of illustration, in an actual deployment there may be as few as one server 402, or as many as millions of servers 402, or more. The servers 402 may all be configured in the same way (as an example of some of the configuration parameters: with the same CPU type and performance; with or without a GPU, and if with a GPU, with the same GPU type and performance; with the same number of CPUs and GPUs; with the same amount and type/speed of RAM; and with the same RAM configuration), or various subsets of the servers 402 may have the same configuration (e.g., 25% of the servers configured one way, 50% a different way, and 25% yet another way), or every server 402 may be different.
In one embodiment, the servers 402 are diskless, i.e., rather than having its own local mass storage (be it optical or magnetic storage, semiconductor-based storage such as flash memory, or other mass storage means serving a similar function), each server has access to shared mass storage through fast backplane or network connections. In one embodiment, this fast connection is a storage area network (SAN) 403 connected to a series of redundant arrays of independent disks (RAID) 405, with connections between devices implemented using Gigabit Ethernet. As is known to those skilled in the art, a SAN 403 may be used to combine many RAID arrays 405 together, resulting in extremely high bandwidth — approaching or potentially exceeding the bandwidth available from the RAM used in current game consoles and PCs. And, while RAID arrays based on rotating media, such as magnetic media, frequently incur significant seek-time access latency, RAID arrays based on semiconductor storage can be implemented with much lower access latency. In another configuration, some or all of the servers 402 provide some or all of their own mass storage locally. For example, a server 402 may store frequently accessed information, such as its operating system and a copy of a video game or application, on low-latency local flash-based storage, but utilize the SAN to access RAID arrays 405 based on rotating media with higher seek latency, for less frequent access to large databases of geometry or game state information.
In addition, in one embodiment, the hosting service 210 employs low-latency video compression logic 404, described in detail below. The video compression logic 404 may be implemented in software, hardware, or any combination thereof (certain embodiments of which are described below). The video compression logic 404 includes logic for compressing audio as well as visual material.
In operation, while playing a video game or using an application at the user premises 211 via a keyboard, mouse, game controller, or other input device 421, control signal logic 413 on the client 415 transmits control signals 406a-406b (typically in the form of UDP packets), representing the button presses (and other types of user inputs) actuated by the user, to the hosting service 210. The control signals from a given user are routed to the appropriate server 402 (or servers, if multiple servers are responsive to the user's input devices). As illustrated in FIG. 4a, the control signals 406a may be routed to the servers 402 via the SAN. Alternatively or additionally, the control signals 406b may be routed directly to the servers 402 over the hosting service network (e.g., an Ethernet LAN). Regardless of how they are transmitted, the server or servers execute the game or application software in response to the control signals 406a-406b. Although not illustrated in FIG. 4a, various networking components, such as a firewall(s) and/or gateway(s), may process incoming and outgoing traffic at the edge of the hosting service 210 (e.g., between the hosting service 210 and the Internet 410) and/or at the edge of the user premises 211, between the Internet 410 and the home or office client 415. The graphics and audio output of the executed game or application software — i.e., new sequences of video images — are provided to the low-latency video compression logic 404, which compresses the sequences of video images according to low-latency video compression techniques, such as those described herein, and transmits the compressed video stream, typically with compressed or uncompressed audio, back to the client 415 over the Internet 410 (or, as described below, over an optimized high-speed network service that bypasses the general Internet). Low-latency video decompression logic 412 on the client 415 then decompresses the video and audio streams, renders the decompressed video stream, and typically plays the decompressed audio stream, on a display device 422. Alternatively, the audio can be played on speakers separate from the display device 422, or not at all. Note that, although the input device 421 and display device 422 are shown as freestanding devices in FIGS. 2a and 2b, they may be integrated within client devices such as portable computers or mobile devices.
The home or office client 415 (described previously in FIGS. 2a and 2b as the home or office client 205) may be a very inexpensive and low-power device with very limited computing or graphics capability, and may well have very limited local mass storage, or none at all. In contrast, each server 402, coupled to the SAN 403 and multiple RAID arrays 405, can be an exceptionally high-performance computing system, and indeed, if multiple servers are used cooperatively in a parallel-processing configuration, there is almost no limit to the amount of computing and graphics processing power that can be brought to bear. And because of the low-latency video compression 404 and low-latency video decompression 412, the user perceives that the computing power of the servers 402 is being provided to him. When the user presses a button on the input device 421, the image on the display 422 is updated in response to the button press with no perceptually significant delay, as if the game or application software were running locally. Thus, with a home or office client 415 that is a very low-performance computer or just an inexpensive chip implementing the low-latency video decompression and control signal logic 413, the user is effectively provided with arbitrary computing power from a remote location that appears to be available locally. This gives users the power to play the most advanced, processor-intensive (typically new) video games and the highest-performance applications.
FIG. 4c shows a very basic and inexpensive home or office client device 465. This device is an embodiment of the home or office client 415 of FIGS. 4a and 4b. It is approximately 5 centimeters long. It has an Ethernet jack 462 that interfaces with an Ethernet cable with Power over Ethernet (PoE), from which it derives its power and its connectivity to the Internet. It is able to run network address translation (NAT) within a network that supports NAT. In an office environment, many new Ethernet switches have PoE and bring PoE directly to the Ethernet jack in an office. In such a situation, all that is required is an Ethernet cable from the wall jack to the client 465. If the available Ethernet connection does not carry power (e.g., in a home with a DSL or cable modem, but no PoE), then inexpensive wall "bricks" (i.e., power supplies) are available that accept an unpowered Ethernet cable and output Ethernet with PoE.
The client 465 contains control signal logic 413 (of FIG. 4a) coupled to a Bluetooth wireless interface, which interfaces with Bluetooth input devices 479, such as a keyboard, mouse, game controller, and/or microphone and/or headset. Also, one embodiment of the client 465 is capable of outputting video at 120 fps, coupled with a display 468 able to support 120 fps video and to signal (typically through infrared) a pair of shuttered glasses 466 to alternately shutter one eye, then the other, with each successive frame. The effect perceived by the user is that of a stereoscopic 3D image that "jumps out" of the display screen. One such display 468 that supports this operation is the Samsung HL-T5076S. Because the video stream for each eye is separate, in one embodiment two independent video streams are compressed by the hosting service 210, the frames are interleaved in time, and the frames are decompressed as two independent decompression processes within the client 465.
The client 465 also contains low-latency video decompression logic 412, which decompresses the incoming video and audio and outputs them through the High-Definition Multimedia Interface (HDMI) connector 463, which plugs into an SDTV (Standard-Definition Television) or HDTV (High-Definition Television) 468, providing the TV with video and audio, or into a monitor 468 that supports HDMI. If the user's monitor 468 does not support HDMI, then an HDMI-to-DVI (Digital Visual Interface) adapter can be used, but the audio will be lost. Under the HDMI standard, the display capabilities 464 (e.g., supported resolutions, frame rates) are communicated from the display device 468, and this information is then passed back through the Internet connection 462 to the hosting service 210, so it can stream compressed video in a format suitable for the display device.
FIG. 4d shows a home or office client device 475 that is the same as the home or office client device 465 shown in FIG. 4c, except that it has more external interfaces. Also, the client 475 can accept PoE for power, or it can run off of an external power supply adapter (not shown) that plugs into the wall. Using the client 475's USB input, a video camera 477 provides compressed video to the client 475, which the client 475 uploads to the hosting service 210 for use as described below. A low-latency compressor utilizing the compression techniques described below is built into the camera 477.
In addition to having an Ethernet connector for its Internet connection, the client 475 also has an 802.11g wireless interface to the Internet. Both interfaces are able to use NAT within a network that supports NAT.
Also, in addition to having an HDMI connector to output video and audio, the client 475 has a Dual-Link DVI-I connector, which includes analog output (and, with a standard adapter cable, provides VGA output). It also has analog outputs for composite video and S-video.
For audio, the client 475 has left/right analog stereo RCA jacks, and for digital audio output, it has a TOSLINK (optical) output.
In addition to the Bluetooth wireless interface to input devices 479, it also has USB jacks to interface with input devices.
FIG. 4e shows one embodiment of the internal architecture of the client 465. All or some of the devices shown in the diagram can be implemented in a field-programmable gate array (FPGA), a custom ASIC, or in several discrete devices, either custom-designed or off-the-shelf.
Ethernet with PoE 497 attaches to Ethernet interface 481. Power 499 is derived from the Ethernet with PoE 497 and is connected to the rest of the devices in client 465. Bus 480 is a common bus for communication among the devices.
Control CPU 483 (almost any small CPU will suffice; for example, a MIPS R4000-series CPU at 100 MHz with embedded RAM meets the requirements), executing a small client control application from flash memory 476, implements the protocol stack for the network (i.e., the Ethernet interface), communicates with the hosting service 210, and configures all of the devices in client 465. It also handles the interfaces with the input devices 469 and sends packets back to the hosting service 210 with user controller data, protected by forward error correction if necessary. Also, control CPU 483 monitors the packet traffic (e.g., whether packets are lost or delayed, with timestamps of their arrival). This information is sent back to the hosting service 210 so that it can constantly monitor the network connection and adjust what it sends accordingly. Flash memory 476 is initially loaded at the time of manufacture with the control program for control CPU 483 and with a serial number that is unique to the particular client 465 unit. This serial number allows the hosting service 210 to uniquely identify the client 465 unit.
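As a rough illustration only (the patent describes the behavior, not a data layout), the kind of per-connection statistics the control CPU might accumulate and report back to the hosting service could be sketched as follows; all names here are hypothetical:

```python
import time

class PacketStats:
    """Hypothetical per-connection statistics a client could report
    back to the hosting service (illustrative sketch only)."""

    def __init__(self):
        self.expected_seq = 0      # next sequence number we expect
        self.lost = 0              # packets presumed lost
        self.arrival_times = []    # (sequence, arrival timestamp) pairs

    def on_packet(self, seq: int) -> None:
        # Any gap in sequence numbers counts as presumed loss.
        if seq > self.expected_seq:
            self.lost += seq - self.expected_seq
        self.expected_seq = seq + 1
        self.arrival_times.append((seq, time.monotonic()))

    def report(self) -> dict:
        # Summary sent back so the service can adapt what it transmits.
        return {"lost": self.lost, "received": len(self.arrival_times)}
```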
The Bluetooth interface 484 communicates wirelessly with input devices 469 via its antenna located within client 465.
Video decompressor 486 is a low-latency video decompressor configured to implement the video decompression described herein. A great many video decompression devices exist, either off-the-shelf or as intellectual property (IP) designs that can be integrated into an FPGA or a custom ASIC. One company offering IP for an H.264 decoder is Ocean Logic of Manly, NSW, Australia. The advantage of using IP is that the compression techniques used herein do not have to conform to compression standards. Some standard decompressors are flexible enough to be configured to accommodate the compression techniques described herein, but some are not. With IP, however, there is complete flexibility to redesign the decompressor as required.
The output of the video decompressor is coupled to the video output subsystem 487, which couples the video to the video output of the HDMI interface 490.
The audio decompression subsystem 488 is implemented either using a standard commercially available audio decompressor, or it can be implemented as IP, or the audio decompression can be implemented within the control processor 483, which could, for example, implement the Vorbis audio decompressor.
The device that performs the audio decompression is coupled to the audio output subsystem 489, which couples the audio to the audio output of the HDMI interface 490.
FIG. 4f shows one embodiment of the internal architecture of client 475. As can be seen, the architecture is identical to that of client 465 except for the additional interfaces and optional external DC power from a power supply adapter that plugs into a wall outlet and, if used, replaces the power that would otherwise come from the Ethernet PoE 497. The functionality in common with client 465 will not be repeated below, but the additional functionality is described as follows.
The CPU 483 communicates with and configures additional devices.
WiFi subsystem 482 provides wireless Internet access, as an alternative to Ethernet 497, through its antenna. WiFi subsystems are available from a wide range of manufacturers, including Atheros Communications of Santa Clara, California.
USB subsystem 485 provides an alternative to Bluetooth for wired USB input devices 479. USB subsystems are quite standard and readily available for FPGAs and ASICs, and they are frequently built into off-the-shelf devices that perform other functions, such as video decompression.
Video output subsystem 487 produces a wider range of video outputs than within client 465. In addition to providing HDMI 490 video output, it provides DVI-I 491, S-video 492, and composite video 493. Also, when the DVI-I 491 interface is used for digital video, the display capabilities 464 are passed back from the display to the control CPU 483 so that it can notify the hosting service 210 of the capabilities of the display 478. All of the interfaces provided by the video output subsystem 487 are quite standard interfaces, readily available in many forms.
The audio output subsystem 489 outputs audio digitally through digital interface 494 (S/PDIF and/or TOSLINK) and outputs audio in analog form through stereo analog interface 495.
Round-trip time analysis
Clearly, in order to realize the benefits described in the preceding section, the round-trip latency between a user's action on input device 421 and the appearance of the result of that action on display 422 should be no more than 70-80 ms. This latency must take into account all of the factors in the path from input device 421 in user premises 211 to hosting service 210 and back again to user premises 211 to display 422. FIG. 4b illustrates the various components and networks through which the signals must travel, and above these components and networks is a timeline listing illustrative latencies that can be expected in a practical implementation. Note that FIG. 4b is simplified, and only the critical-path routing is shown. Other routing of data used for other features of the system is described below. Double-headed arrows (e.g., arrow 453) indicate round-trip latency, single-headed arrows (e.g., arrow 457) indicate one-way latency, and "~" indicates an approximate measure. It should be pointed out that there will be real-world situations where the listed latencies cannot be achieved, but, in a large number of cases in the United States using DSL and cable modem connections to user premises 211, these latencies can be achieved under the circumstances described in the next section. Also note that, while cellular wireless connectivity to the Internet will certainly work in the system shown, most current U.S. cellular data systems (e.g., EVDO) incur very high latency and would not be able to achieve the latencies shown in FIG. 4b. However, these underlying principles may be implemented on future cellular technologies capable of this level of latency.
Starting with input device 421 at user premises 211: once the user actuates input device 421, a user control signal is sent to client 415 (which may be a standalone device, such as a set-top box, or it may be software and/or hardware running in another device, such as a PC or a mobile device), and it is packetized (in UDP format in one embodiment), and the packet is given a destination address to reach hosting service 210. The packet also contains information to indicate which user the control signals are coming from. The control signal packet(s) are then forwarded through firewall/router/NAT (network address translation) device 443 to WAN interface 442. WAN interface 442 is the interface device provided to user premises 211 by the user's ISP (Internet service provider). WAN interface 442 may be a cable or DSL modem, a WiMax transceiver, a fiber-optic transceiver, a cellular data interface, an Internet-Protocol-over-powerline interface, or any other of a multitude of interfaces to the Internet. Further, firewall/router/NAT device 443 (and potentially WAN interface 442) may be integrated into client 415. An example of this would be a mobile phone that includes software to implement the functionality of home or office client 415, as well as the means to route and connect to the Internet wirelessly through some standard (e.g., 802.11g).
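As a purely illustrative sketch (the patent specifies UDP packetization but not a packet layout), a client's control-signal packet might be assembled and sent along these lines; the field layout, names, and addresses are assumptions:

```python
import socket
import struct

# Hypothetical layout: 4-byte user id, 4-byte sequence number, then the
# raw input-device payload (e.g., controller button/stick state).
def send_control_packet(sock: socket.socket, host: str, port: int,
                        user_id: int, seq: int, payload: bytes) -> None:
    header = struct.pack("!II", user_id, seq)  # network byte order
    sock.sendto(header + payload, (host, port))

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# "127.0.0.1" stands in for the hosting service's address.
send_control_packet(sock, "127.0.0.1", 9999,
                    user_id=42, seq=1, payload=b"\x01\x00")  # e.g., button press
```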
WAN interface 442 then routes the control signals to what is referred to herein as the user's ISP "point of presence" 441, which is the facility that provides the interface between the WAN transport connected to user premises 211 and the general Internet or private network. The point of presence's characteristics vary depending on the nature of the Internet service provided. For DSL, it is typically a telephone company central office where a DSLAM is located. For cable modems, it is typically a cable multi-system operator (MSO) head end. For cellular systems, it is typically a control room associated with a cellular tower. Whatever the nature of the point of presence, it then routes the control signal packet(s) to the general Internet 410. The control signal packet(s) are then routed to the WAN interface 441 to the hosting service 210, through what is most likely a fiber-optic transceiver interface. The WAN interface 441 then routes the control signal packets to routing logic 409 (which may be implemented in many different ways, including Ethernet switches and routing servers), which evaluates the user's address and routes the control signal(s) to the correct server 402 for the given user.
Server 402 then takes the control signals as input to the game or application software running on server 402 and uses them in computing the next frame of the game or application. Once the next frame is generated, the video and audio are output from server 402 to video compressor 404. The video and audio may be output from server 402 to compressor 404 by various means. To begin with, compressor 404 may be built into server 402, so the compression may be implemented locally within server 402. Or the video and/or audio may be output in packetized form through a network connection, such as an Ethernet connection, to a network that is either a private network between server 402 and video compressor 404, or a shared network, such as SAN 403. Or the video may be output through a video output connector from server 402, such as a VGA or DVI connector, and then captured by video compressor 404. Also, the audio may be output from server 402 either as digital audio (e.g., through a S/PDIF or TOSLINK connector) or as analog audio, which is digitized and encoded by audio compression logic within video compressor 404.
Once video compressor 404 has captured the video frame and the audio generated during that frame time from server 402, it compresses the video and audio using the techniques described below. Once compressed, the video and audio are packetized with an address to send them back to the user's client 415, and they are routed to WAN interface 441, which then routes the video and audio packets through the general Internet 410, which then routes them to the user's ISP point of presence 441, which routes them to the WAN interface 442 at the user's premises, which routes them to firewall/router/NAT device 443, which then routes them to client 415.
Client 415 decompresses the video and audio and then displays the video on display screen 422 (or on the client's built-in display) and sends the audio to display 422, to separate amplifiers/speakers, or to amplifiers/speakers built into the client.
In order for the user to perceive the entire process just described as being free of lag, the round-trip delay needs to be less than 70 or 80 ms. Some of the latencies in the described round-trip path are under the control of hosting service 210 and/or the user, while others are not. Nonetheless, based on analysis and testing of a large number of real-world scenarios, the following are approximate measurements.
The one-way latency to send the control signals 451 is typically less than 1 ms, and round-trip routing through the user premises 452 is typically accomplished in approximately 1 ms using readily available consumer-grade firewall/router/NAT switches over an Ethernet LAN. User ISPs vary widely in their round-trip delays 453, but with DSL and cable modem providers a delay of between 10 and 25 ms is typical. The round-trip latency over the general Internet 410 can vary a great deal depending on how traffic is routed and whether there are any failures on the route (these issues are discussed below), but generally the general Internet provides fairly optimal routes, and the latency is largely determined by the speed of light through optical fiber, given the distance to the destination. As also discussed below, the authors have established 1000 miles (1600 km) as a rough upper bound on how far away from user premises 211 the hosting service 210 is expected to be located. The actual Internet transit time for 1000 miles (1600 km), or 2000 miles (3200 km) round trip, is approximately 22 ms. The WAN interface 441 to hosting service 210 is typically a commercial-grade, high-speed fiber interface with negligible latency. Thus, the general Internet latency 454 is typically between 1 and 10 ms. One-way routing latency 455 through hosting service 210 can be achieved in less than 1 ms. Server 402 will typically compute a new frame for a game or application in less than one frame time (which at 60 frames/s is 16.7 ms), so 16 ms is a reasonable maximum one-way latency 456 to use. In an optimized hardware implementation of the audio and video compression algorithms described herein, compression 457 can be completed in 1 ms. In less optimized versions, compression may take as long as 6 ms (of course, even less optimized versions could take longer, but such implementations would impact the overall round-trip latency and would require other latencies to be shorter (e.g., the allowable distance through the general Internet could be reduced) to maintain the 70-80 ms latency target). The round-trip latencies of the Internet 454, the user ISP 453, and the routing within the user premises 452 have already been considered, so what remains is the video decompression 458 latency which, depending on whether decompression 458 is implemented in dedicated hardware or in software on client device 415 (such as a PC or a mobile device), can vary with the size of the display and the performance of the decompression CPU. Typically, decompression 458 takes between 1 and 8 ms.
Thus, by adding together all of the worst-case latencies seen in practice, we can determine the worst-case round-trip latency a user of the system shown in FIG. 4a can expect to experience. It is: 1 + 1 + 25 + 22 + 1 + 16 + 6 + 8 = 80 ms. And, indeed, in practice (with caveats discussed below), this is roughly the round-trip latency seen using prototype versions of the system shown in FIG. 4a, using off-the-shelf Windows PCs as client devices and home DSL and cable modem connections within the United States. Of course, scenarios better than the worst case can result in much shorter latencies, but they cannot be relied upon in developing a commercial service that is widely deployed.
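To make the arithmetic explicit, a minimal sketch of the worst-case latency budget above, using the figures quoted in the text:

```python
# Worst-case latencies from the text, in milliseconds.
budget_ms = {
    "control signal send (451)":      1,
    "user-premises routing (452)":    1,
    "user ISP round trip (453)":     25,
    "general Internet (454)":        22,
    "hosting-service routing (455)":  1,
    "frame computation (456)":       16,
    "video/audio compression (457)":  6,
    "client decompression (458)":     8,
}

total = sum(budget_ms.values())
assert total == 80  # matches the upper bound of the 70-80 ms target
print(f"worst-case round trip: {total} ms")
```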
Achieving the latencies listed along the timeline of FIG. 4b over the general Internet requires the video compressor 404 and the video decompressor 412 of FIG. 4a in client 415 to generate a packet stream with very particular characteristics, such that the packet sequence generated over the entire path from hosting service 210 to display 422 is not subject to delays or excessive packet loss and, in particular, consistently fits within the bandwidth constraints available to the user over the user's Internet connection through WAN interface 442 and firewall/router/NAT 443. Further, the video compressor must create a packet stream that is sufficiently robust to tolerate the inevitable packet loss and packet reordering that occur in normal Internet and network transmissions.
Low latency video compression
To accomplish the aforementioned goals, in one embodiment a new approach to video compression is taken that reduces both the latency and the peak bandwidth requirements for transmitting video. Before describing these embodiments, an analysis of current video compression techniques is provided with respect to FIG. 5 and FIGS. 6a-6b. Of course, these techniques may be employed in accordance with the underlying principles if the user is provided with sufficient bandwidth to handle the data rate required by these techniques. Note that audio compression is not addressed herein, other than to state that it is implemented simultaneously and in synchrony with the video compression. Prior-art audio compression techniques exist that satisfy the requirements of this system.
FIG. 5 illustrates one particular prior-art technique for compressing video in which each individual video frame 501-503 is compressed by compression logic 520 using a particular compression algorithm to generate a series of compressed frames 511-513. One embodiment of this technique is "motion JPEG" (MJPEG), in which each frame is compressed according to the still-image compression algorithm defined by the Joint Photographic Experts Group (JPEG), based upon the Discrete Cosine Transform (DCT). Various other types of compression algorithms may be employed, however, while still complying with these underlying principles (e.g., wavelet-based compression algorithms such as JPEG-2000).
One problem with this type of compression is that it reduces the data rate of each frame, but it does not exploit similarities between successive frames to reduce the data rate of the overall video stream. For example, as illustrated in FIG. 5, assuming a frame size of 640 × 480 pixels at 24 bits/pixel, i.e., 640 × 480 × 24/8/1024 = 900 KB/frame, for a given image quality MJPEG may only be able to compress the stream by a factor of 10, resulting in a data stream of 90 KB/frame. At 60 frames/s, this would require a channel bandwidth of 90 KB × 8 bits × 60 frames/s = 42.2 Mbps, which would be far too high a bandwidth for nearly all home Internet connections in the United States today, and too high a bandwidth for many office Internet connections. Indeed, given that it would require a constant data stream at such a high bandwidth while serving just one user, even in an office LAN environment it would consume a large fraction of a 100 Mbps Ethernet LAN's bandwidth and heavily burden the Ethernet switches supporting the LAN. Thus, compression of motion video is inefficient when compared to other compression techniques (such as those described below). Furthermore, single-frame compression algorithms like JPEG and JPEG-2000, which use lossy compression, produce compression artifacts that may not be noticeable in still images (e.g., an artifact within dense foliage in a scene may not appear to be an artifact at all, because the eye does not know exactly how the dense foliage should look). But once the scene is in motion, an artifact may stand out, because the eye detects the artifact changing from frame to frame, despite the fact that the artifact lies in an area of the scene where it would be unnoticeable in a still image. This results in the perception of "background noise" in the sequence of frames, similar in appearance to the "snow" noise visible during marginal analog TV reception. Of course, this type of compression may still be used in certain embodiments described herein, but, generally speaking, to avoid background noise in the scene at a given perceptual quality, a high data rate (i.e., a low compression ratio) is required.
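The bandwidth arithmetic above, reproduced as a small sketch. Binary prefixes (KB = 1024 bytes, Mbit = 1024 Kbit) are assumed throughout, which is how the quoted 42.2 Mbps figure comes out:

```python
width, height, bpp = 640, 480, 24
raw_kb = width * height * bpp / 8 / 1024   # 900 KB per uncompressed frame
mjpeg_kb = raw_kb / 10                     # 10:1 compression -> 90 KB/frame
mbps = mjpeg_kb * 8 * 60 / 1024            # 60 frames/s, binary Mbit
print(f"{raw_kb:.0f} KB raw, {mjpeg_kb:.0f} KB compressed, {mbps:.1f} Mbps")
# -> 900 KB raw, 90 KB compressed, 42.2 Mbps
```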
Other types of compression, such as H.264, or Windows Media VC9, MPEG2, and MPEG4, are all more efficient at compressing a video stream because they exploit the similarities between successive frames. These techniques all rely upon the same general approach to video compression. Thus, although the H.264 standard will be described here, the same general principles apply to various other compression algorithms. A large number of H.264 compressors and decompressors are available, including the x264 open-source software library for compressing H.264 and the FFmpeg open-source software library for decompressing H.264.
FIGS. 6a and 6b illustrate an exemplary prior-art compression technique in which a series of uncompressed video frames 501-503, 559-561 is compressed by compression logic 620 into a series of "I-frames" 611, 671, "P-frames" 612-613, and "B-frames" 670. The vertical axis in FIG. 6a generally signifies the resulting size of each of the encoded frames (although the frames are not drawn to scale). As described above, video coding using I-frames, B-frames, and P-frames is well understood by those skilled in the art. Briefly, I-frame 611 is a DCT-based compression of the complete uncompressed frame 501 (similar to a compressed JPEG image as described above). P-frames 612-613 are generally significantly smaller in size than I-frames 611 because they take advantage of the data in the previous I-frame or P-frame; that is, they contain data indicating the changes relative to the previous I-frame or P-frame. B-frames 670 are similar to P-frames except that B-frames use the frame in the following reference frame as well as, potentially, the frame in the preceding reference frame.
For the following discussion, it will be assumed that the desired frame rate is 60 frames/s, that each I-frame is approximately 160 Kbits, that the average P-frame and B-frame is 16 Kbits, and that a new I-frame is generated every second. With this set of parameters, the average data rate would be: 160 Kbits + 16 Kbits × 59 = 1.1 Mbps. This data rate is well within the maximum data rate of many current home and office high-speed Internet connections. This technique also tends to mitigate the background-noise problem of intra-frame-only coding, because the P- and B-frames track differences between frames, so compression artifacts do not tend to appear and disappear from frame to frame, thereby reducing the background-noise problem described above.
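The same assumptions expressed as arithmetic (one I-frame per second of 60 frames, binary Kbit/Mbit as in the earlier calculation):

```python
i_frame_kbit, p_frame_kbit, fps = 160, 16, 60
gop_kbit = i_frame_kbit + p_frame_kbit * (fps - 1)   # one second of video
print(f"average data rate: {gop_kbit / 1024:.1f} Mbps")  # -> 1.1 Mbps
```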
One problem with the above types of compression is that, although the average data rate is relatively low (e.g., 1.1 Mbps), a single I-frame may take several frame times to transmit. For example, using prior-art techniques, a 2.2 Mbps network connection (e.g., DSL or cable modem with a 2.2 Mbps peak available maximum data rate 302 of FIG. 3a) would typically be adequate to stream video at 1.1 Mbps with one 160 Kbit I-frame every 60 frames. This would be accomplished by having the decompressor queue up 1 second of video before decompressing it. In 1 second, 1.1 Mbits of data are transmitted, which is easily accommodated by a 2.2 Mbps available maximum data rate, even assuming the available data rate might periodically drop by as much as 50%. Unfortunately, this prior-art approach results in a 1-second latency for the video because of the 1-second video buffer at the receiver. Such a delay is adequate for many prior-art applications (e.g., the playback of linear video), but it is far too long a latency for fast-action video games, which cannot tolerate latencies of more than 70-80 ms.
If an attempt were made to eliminate the 1-second video buffer, it would still not result in a latency reduction adequate for fast-action video games. For example, the use of B-frames, as described previously, necessitates the reception of all of the B-frames preceding an I-frame, as well as the I-frame itself. Assuming the 59 non-I frames are roughly divided between P- and B-frames, at least 29 B-frames and an I-frame would have to be received before any B-frame could be displayed. Thus, regardless of the available channel bandwidth, this necessitates a delay of 29 + 1 = 30 frames of 1/60 second each, or 500 ms of latency. Clearly, that is far too long.
Thus, another approach is to eliminate B-frames and use only I- and P-frames. (One consequence of this is that the data rate will increase for a given quality level, but for consistency in this example we shall continue to assume that each I-frame is 160 Kbits in size and the average P-frame is 16 Kbits, so the data rate is still 1.1 Mbps.) This approach eliminates the unavoidable latency introduced by B-frames, since the decoding of each P-frame depends only on the previously received frame. The problem that remains with this approach is that an I-frame is so much larger than an average P-frame that, on a low-bandwidth channel, as is typical in most homes and in many offices, the transmission of an I-frame adds substantial latency. This is illustrated in FIG. 6b. The video stream data rate 624 is below the available maximum data rate 621 except for the I-frames, where the peak data rate required for the I-frames 623 far exceeds the available maximum data rate 622 (and even the rated maximum data rate 621). The data rate required by the P-frames is less than the available maximum data rate. Even if the available maximum data rate peaks at 2.2 Mbps and steadily holds at that 2.2 Mbps peak rate, it will take 160 Kbits / 2.2 Mbps = 71 ms to transmit the I-frame, and, if the available maximum data rate 622 drops by 50% (to 1.1 Mbps), it will take 142 ms. So the latency for transmitting an I-frame falls somewhere between 71 and 142 ms. This latency is additive to the latencies identified in FIG. 4b, which in the worst case total 70 ms, so this would result in a total round-trip latency, from the moment the user actuates input device 421 until the image appears on display 422, of 141-222 ms, which is far too high. And if the available maximum data rate drops below 2.2 Mbps, the latency increases further.
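The I-frame transmission-time arithmetic as a sketch (again treating 2.2 Mbps as 2.2 × 1024 Kbps, consistent with the earlier examples):

```python
i_frame_kbit = 160

def xmit_ms(kbit: float, rate_mbps: float) -> float:
    """Time to push `kbit` through a channel of `rate_mbps` (binary Mbit)."""
    return kbit / (rate_mbps * 1024) * 1000

print(f"{xmit_ms(i_frame_kbit, 2.2):.0f} ms at full rate")  # -> 71 ms
print(f"{xmit_ms(i_frame_kbit, 1.1):.0f} ms at half rate")  # -> 142 ms
```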
Note also that there are generally severe consequences to "jamming" an ISP with a peak data rate 623 that far exceeds the available data rate 622. The equipment at different ISPs behaves differently, but the following behaviors are quite common among DSL and cable modem ISPs when receiving packets at a much higher data rate than the available data rate 622: (a) delaying the packets by queuing them up (introducing latency), (b) dropping some or all of the packets, (c) disabling the connection for a period of time (most likely because the ISP is concerned it is a malicious attack, such as a "denial of service" attack). Thus, transmitting a packet stream at full data rate with characteristics such as those shown in FIG. 6b is not a viable option. The peaks 623 could be queued up at hosting service 210 and sent at a data rate below the available maximum data rate, but that introduces the unacceptable latency described in the preceding paragraph.
Further, the video stream data rate sequence 624 shown in FIG. 6b is a very "tame" video stream data rate sequence, and is the sort of data rate sequence one would expect from compressing a video sequence that does not change very much and has very little motion (e.g., as would be common in video teleconferencing, where the cameras are in fixed positions with little motion, and the objects in the scene, e.g., seated people talking, show little movement).
The video stream data rate sequence 634 shown in FIG. 6c is the sort of sequence one would expect to see from video with far more action, such as might be generated in a motion picture, a video game, or some application software. Note that in addition to the I-frame peaks 633, there are also P-frame peaks, such as 635 and 636, that are quite large and exceed the available maximum data rate in many cases. Although these P-frame peaks are not quite as large as the I-frame peaks, they are still far too large to be carried by the channel at full data rate, and, as with the I-frame peaks, these P-frame peaks must be transmitted slowly (thereby with growing latency).
On a high-bandwidth channel (e.g., a 100 Mbps LAN, or a high-bandwidth 100 Mbps private connection), the network would be able to tolerate large peaks, such as the I-frame peaks 633 or the P-frame peaks 636, and, in principle, low latency could be maintained. But such networks are frequently shared among many users (e.g., in an office environment), and such "peaky" data would impact the performance of the LAN, particularly if the network traffic were routed to a private, shared connection (e.g., from a remote data center to an office). To begin with, bear in mind that this example is of a relatively low-resolution video stream of 640 × 480 pixels at 60 frames/s. HDTV streams of 1920 × 1080 at 60 frames/s are readily handled by modern computers and displays, and 2560 × 1440 resolution displays at 60 frames/s are increasingly available commercially (e.g., Apple, Inc.'s 30" display). A high-action video sequence at 1920 × 1080 at 60 frames/s may require 4.5 Mbps using H.264 compression for a reasonable quality level. If we assume the I-frames peak at 10X the nominal data rate, that would result in 45 Mbps peaks, as well as smaller, but still considerable, P-frame peaks. If several users are receiving video streams on the same 100 Mbps network (e.g., a private network connection between an office and a data center), it is easy to see how the peaks from several users' video streams could happen to align, overwhelming the network's bandwidth and potentially overwhelming the bandwidth of the backplanes of the switches supporting the users on the network. Even in the case of a Gigabit Ethernet network, if enough peaks from enough users align at once, they can overwhelm the network or the network switches. And once 2560 × 1440 resolution video becomes more commonplace, the average video stream data rate may be 9.5 Mbps, perhaps resulting in a 95 Mbps peak data rate. Needless to say, a 100 Mbps connection between a data center and an office (which today is an exceptionally fast connection) would be completely swamped by the peak traffic from a single user. Thus, even though LANs and private network connections can be more tolerant of streaming video with peaks, streaming video with high peaks is not desirable and may require special planning and accommodation by an office's IT department.
Of course, for standard linear video applications these issues are not a problem, because the data rate is "smoothed out" at the point of transmission, the data for each frame is kept below the maximum available data rate 622, and the sequence of I-, P-, and B-frames is buffered at the client before it is decompressed. Thus, the data rate over the network remains close to the average data rate of the video stream. Unfortunately, this introduces latency, even if B-frames are not used, which is unacceptable for low-latency applications such as video games and applications requiring fast response time.
One prior-art solution for mitigating video streams that have high peaks is to use a technique often referred to as "Constant Bit Rate" (CBR) encoding. Although the term CBR would seem to imply that all frames are compressed to have the same bit rate (i.e., size), what it usually refers to is a compression paradigm in which a maximum bit rate is allowed across a certain number of frames (in our case, 1 frame). For example, in the case of FIG. 6c, if a CBR constraint were applied to the encoding that limits the bit rate to, say, 70% of the rated maximum data rate 621, then the compression algorithm would limit the compression of each of the frames such that any frame that would normally be compressed using more than 70% of the rated maximum data rate 621 would be compressed with fewer bits. The result is that frames that would normally require more bits to maintain a given quality level are "starved" of bits, and the image quality of those frames will be worse than that of frames that do not require more bits than 70% of the rated maximum data rate 621. This approach can produce acceptable results for certain types of compressed video where (a) little motion or scene change is expected and (b) the users can accept periodic quality degradation. A good example of an application well suited to CBR is video teleconferencing, since there are few peaks, and if the quality degrades briefly (for example, if the camera is panned, resulting in significant scene motion and large peaks, there may not be enough bits during the pan for high-quality image compression, resulting in degraded image quality), it is acceptable to most users. Unfortunately, CBR is not well suited to many other applications, which have scenes of high complexity or a great deal of motion, and/or where a reasonably constant level of quality is required.
The low-latency compression logic 404 employed in one embodiment uses several different techniques to address the range of problems associated with streaming low-latency compressed video while maintaining a high level of quality. First, the low-latency compression logic 404 generates only I-frames and P-frames, thereby alleviating the need to wait several frame times to decode each B-frame. In addition, as illustrated in FIG. 7a, in one embodiment, the low-latency compression logic 404 subdivides each uncompressed frame 701-760 into a series of "tiles" and individually encodes each tile as either an I-frame or a P-frame. The group of compressed I-frames and P-frames is referred to herein as "R-frames" 711-770. In the specific example shown in FIG. 7a, each uncompressed frame is subdivided into a 4 × 4 matrix of 16 tiles. However, these underlying principles are not limited to any particular subdivision scheme.
In one embodiment, the low-latency compression logic 404 divides the video frame into a number of tiles and encodes (i.e., compresses) one tile from each frame as an I-frame (i.e., the tile is compressed as if it were a separate video frame of 1/16th the size of the full image, and the compression used for this "mini" frame is I-frame compression), and the remaining tiles as P-frames (i.e., the compression used for each "mini" 1/16th frame is P-frame compression). Tiles compressed as I-frames and as P-frames shall be referred to as "I-tiles" and "P-tiles", respectively. With each successive video frame, the tile to be encoded as an I-tile is changed. Thus, in a given frame time, only one tile among the tiles in the video frame is an I-tile, and the remainder of the tiles are P-tiles. For example, in FIG. 7a, tile 0 of uncompressed frame 701 is encoded as I-tile I0, and the remaining tiles 1-15 are encoded as P-tiles P1 through P15 to produce R-frame 711. In the next uncompressed video frame 702, tile 1 is encoded as I-tile I1, and the remaining tiles 0 and 2 through 15 are encoded as P-tiles P0 and P2 through P15, to produce R-frame 712. Thus, I-tiles and P-tiles are progressively interleaved in time over successive frames. The process continues until an R-frame 770 is generated with the last tile in the matrix encoded as an I-tile (i.e., I15). The process then starts over, generating another R-frame such as frame 711 (i.e., encoding an I-tile for tile 0), etc. Although not illustrated in FIG. 7a, in one embodiment the first R-frame of the video sequence of R-frames contains only I-tiles (i.e., so that subsequent P-tiles have reference image data from which to compute motion). Alternatively, in one embodiment, the startup sequence uses the same I-tile pattern as normal, but does not include P-tiles for those tiles that have not yet been encoded with an I-tile. In other words, certain tiles are not encoded with any data until the first I-tile arrives, thereby avoiding startup peaks in the video stream data rate 934 of FIG. 9a, which is explained further below. Moreover, as described below, various different sizes and shapes may be used for the tiles while still complying with these underlying principles.
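A minimal sketch of this rotating I-tile schedule (a simplification of the described behavior; the function name and representation are invented for illustration):

```python
def tile_types(frame_index: int, num_tiles: int = 16) -> list[str]:
    """Return the encoding type for each tile of a given frame:
    exactly one I-tile per frame, rotating through all the tiles."""
    i_tile = frame_index % num_tiles
    return ["I" if t == i_tile else "P" for t in range(num_tiles)]

# Frame 0 encodes tile 0 as an I-tile, frame 1 encodes tile 1, etc.
assert tile_types(0)[0] == "I" and tile_types(0)[1] == "P"
assert tile_types(17)[1] == "I"   # the cycle repeats every 16 frames
```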
The video decompression logic 412 running in client 415 decompresses each tile as if it were a separate video sequence of small I- and P-frames, and then renders each tile to the frame buffer driving display 422. For example, I0 and P0 from R-frames 711-770 are used to decompress and render tile 0 of the video image. Similarly, I1 and P1 from R-frames 711-770 are used to reconstruct tile 1, and so on. As stated above, decompression of I-frames and P-frames is well known in the art, and decompression of I-tiles and P-tiles can be accomplished by having multiple instances of a video decompressor running in client 415. Although the multiplied number of processes would seem to increase the computational burden on client 415, it actually does not, because the tiles themselves are proportionately smaller relative to the number of additional processes, so the number of pixels displayed on the screen is the same as if there were one process using conventional full-size I- and P-frames.
This R-frame technique significantly mitigates the bandwidth peaks typically associated with I-frames, illustrated in FIG. 6b and FIG. 6c, because any given frame is mostly made up of P-tiles, which are typically smaller than I-frames. For example, assuming again that a typical I-frame is 160 Kbits, then the I-tile of each of the frames illustrated in FIG. 7a would be roughly 1/16th of that, or 10 Kbits. Similarly, assuming a typical P-frame is 16 Kbits, then the P-tiles for each of the tiles illustrated in FIG. 7a would be roughly 1 Kbit. The end result is an R-frame of roughly 10 Kbits + 15 × 1 Kbit = 25 Kbits. So each 60-frame sequence would be 25 Kbits × 60 = 1.5 Mbps. Thus, at 60 frames/s, this would require a channel capable of sustaining a bandwidth of 1.5 Mbps, but with much lower peaks, because the I-tiles are distributed throughout the 60-frame interval.
Note that in the previous examples, with the same assumed data rates for I-frames and P-frames, the average data rate was 1.1 Mbps. This is because in the previous examples a new I-frame was introduced only once every 60 frame times, while in this example the 16 tiles that make up an I-frame cycle through in 16 frame times, so the equivalent of an I-frame is introduced every 16 frame times, resulting in a slightly higher average data rate. In practice, though, introducing I-frames more frequently does not increase the data rate linearly. This is because a P-frame (or a P-tile) primarily encodes the difference between itself and the preceding frame. So, if the preceding frame is quite similar to the next frame, the P-frame will be very small; if the preceding frame is very different from the next frame, the P-frame will be very large. But because a P-frame is largely derived from the preceding frame, rather than from the actual frame, the resulting encoded frame may contain more errors (e.g., visual artifacts) than an I-frame with an equivalent number of bits. And when one P-frame follows another P-frame, an accumulation of errors can occur that grows worse over a long sequence of P-frames. Now, a sophisticated video compressor will detect that the image quality is degrading after a sequence of P-frames and, if necessary, will allocate more bits to subsequent P-frames to bring up the quality or, if that is the most efficient course of action, replace a P-frame with an I-frame. So, when long sequences of P-frames are used (e.g., 59 P-frames, as in the previous examples above), particularly when the scene is highly complex and/or has a great deal of motion, typically more bits are needed for the P-frames the farther they are removed from the I-frame.
Or, looking at P-frames from the opposite point of view, P-frames that closely follow an I-frame tend to require fewer bits than P-frames that are farther from an I-frame. So, in the example shown in FIG. 7a, no P-tile is ever more than 15 frames removed from the preceding I-tile, whereas, in the previous example, a P-frame could be as many as 59 frames removed from an I-frame. Thus, with more frequent I-frames, the P-frames are smaller. Of course, the exact relative sizes will vary depending on the nature of the video stream, but in the example of FIG. 7a, if an I-tile is 10 Kbits, the P-tiles may average just 0.75 Kbits in size, resulting in 10 Kbits + 15 × 0.75 Kbits = 21.25 Kbits, or, at 60 frames per second, a data rate of 21.25 Kbits × 60 = 1.3 Mbps, roughly 16% higher than the 1.1 Mbps data rate of a stream with one I-frame followed by 59 P-frames. Once again, the relative results of these two approaches to video compression will vary depending on the video sequence, but, typically, experience shows that using R-frames requires roughly 20% more bits for a given quality level than using I/P-frames. But, of course, R-frames dramatically reduce the peaks, which makes the video sequences usable with far less latency than the latency of I/P-frame sequences.
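The two bit-rate estimates above, side by side (units as before; the rounding of the percentage follows the text):

```python
FPS = 60

ip_kbit_s = 160 + 16 * 59               # I + 59 P-frames -> 1104 Kbit/s (~1.1 Mbps)
r_naive_kbit_s = (10 + 15 * 1.0) * FPS  # 25 Kbit R-frames -> 1500 Kbit/s (~1.5 Mbps)
r_kbit_s = (10 + 15 * 0.75) * FPS       # 21.25 Kbit R-frames -> 1275 Kbit/s (~1.3 Mbps)

extra_pct = (r_kbit_s / ip_kbit_s - 1) * 100
print(f"R-frames cost ~{extra_pct:.1f}% more bits than I/P here")
# -> ~15.5%, which the text rounds to "approximately 16%"
```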
The R-frames can be configured in a variety of different ways, depending upon the nature of the video sequence, the reliability of the channel, and the available data rate. In an alternative embodiment, a different number of tiles than 16 in a 4 × 4 configuration is used. For example, 2 tiles may be used in a 2 × 1 or 1 × 2 configuration, 4 tiles may be used in a 2 × 2, 4 × 1, or 1 × 4 configuration, 6 tiles may be used in a 3 × 2, 2 × 3, 6 × 1, or 1 × 6 configuration, or 8 tiles may be used in a 4 × 2 (as shown in FIG. 7b), 2 × 4, 8 × 1, or 1 × 8 configuration. Note that the tiles do not have to be square, and the video frame does not have to be square, or even rectangular. The tiles can take whatever shape best suits the video stream and the application being used.
In another embodiment, the cycling of the I- and P-tiles is not locked to the number of tiles. For example, in an 8-tile 4 × 2 configuration, a 16-cycle sequence may still be used, as illustrated in FIG. 7b, and as sketched in code after this paragraph. Sequential uncompressed frames 721, 722, 723 are each divided into 8 tiles, 0-7, and each tile is compressed individually. From R-frame 731, only tile 0 is compressed as an I-tile, and the remaining tiles are compressed as P-tiles. For the following R-frame 732, all of the 8 tiles are compressed as P-tiles, and then for the following R-frame 733, tile 1 is compressed as an I-tile and the other tiles are all compressed as P-tiles. And so the sequencing continues for 16 frames, with an I-tile generated only every other frame, so the last I-tile is generated for tile 7 during the 15th frame time (not shown in FIG. 7b), and during the 16th frame time R-frame 780 is compressed using all P-tiles. Then the sequence begins again with tile 0 compressed as an I-tile while the other tiles are compressed as P-tiles. As in the prior embodiment, the very first frame of the entire video sequence would typically be all I-tiles, to provide a reference for the P-tiles from that point forward. The cycling of I-tiles and P-tiles need not even be an even multiple of the number of tiles. For example, with 8 tiles, each frame with an I-tile can be followed by 2 frames with all P-tiles before another I-tile is used. In yet another embodiment, certain tiles may be sequenced with I-tiles more often than other tiles if, for example, certain areas of the screen are known to have more motion, requiring more frequent I-tiles, while others are more static (e.g., showing a score for a game), requiring less frequent I-tiles. Further, although each frame is illustrated in FIGS. 7a-7b with a single I-tile, multiple I-tiles may be encoded in a single frame (depending on the bandwidth of the transmission channel). Conversely, certain frames or frame sequences may be transmitted with no I-tiles (i.e., only P-tiles).
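Extending the earlier sketch, the looser cadence just described (one I-tile every N frames, independent of the tile count) could look like this; the helper name and parameters are invented:

```python
def tile_types_cadence(frame_index: int, num_tiles: int = 8,
                       frames_per_i_tile: int = 2) -> list[str]:
    """One I-tile every `frames_per_i_tile` frames, cycling through the
    tiles; the intervening frames are all P-tiles (hypothetical sketch)."""
    if frame_index % frames_per_i_tile != 0:
        return ["P"] * num_tiles
    i_tile = (frame_index // frames_per_i_tile) % num_tiles
    return ["I" if t == i_tile else "P" for t in range(num_tiles)]

# 4x2 layout, 16-frame cycle: frames 0, 2, 4, ... carry the I-tile.
assert tile_types_cadence(0)[0] == "I"
assert tile_types_cadence(1) == ["P"] * 8   # odd frames are all P-tiles
assert tile_types_cadence(14)[7] == "I"     # tile 7 gets its I-tile on the 15th frame
```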
The reason the approaches described in the preceding paragraph work well is that, while it would seem that not distributing I-tiles across every single frame would result in large peaks, the behavior of the system is not that simple. Because each tile is compressed separately from the other tiles, as the tiles get smaller, the encoding of each tile can become less efficient, since the compressor for a given tile cannot exploit similar image features and similar motion from the other tiles. Thus, dividing the screen into 16 tiles will generally result in less efficient encoding than dividing the screen into 8 tiles. But if the screen is divided into 8 tiles and that causes a full I-frame's worth of data to be introduced every 8 frames instead of every 16 frames, the overall data rate is much higher. So, by introducing a full I-frame's worth of data every 16 frames instead of every 8 frames, the overall data rate is reduced. Also, by using 8 larger tiles instead of 16 smaller tiles, the overall data rate is reduced, which also mitigates to some degree the data peaks caused by the larger tiles.
In another embodiment, the low-latency video compression logic 404 of FIGS. 7a and 7b controls the allocation of bits to the various tiles in the R-frames, either pre-configured through settings based on known characteristics of the video sequence to be compressed, or automatically, based on an ongoing analysis of the image quality in each tile. For example, in some racing video games, the front of the player's car (which is relatively motionless in the scene) takes up a large part of the lower half of the screen, whereas the upper half of the screen is entirely filled with the oncoming roadway, buildings, and scenery, which is almost always in motion. If the compression logic 404 allocates an equal number of bits to each tile, then the tiles in the lower half of the screen (tiles 4-7) in uncompressed frame 721 of FIG. 7b will generally be compressed with higher quality than the tiles in the upper half of the screen (tiles 0-3) in uncompressed frame 721 of FIG. 7b. If it is known that this particular game, or this particular scene of the game, has such characteristics, the operators of hosting service 210 can configure the compression logic 404 to allocate more bits to the tiles at the top of the screen than to the tiles at the bottom. Or, the compression logic 404 can evaluate the quality of the compression of the tiles after frames are compressed (using one or more of many compression quality metrics, such as peak signal-to-noise ratio (PSNR)), and, if it determines that over a certain window of time certain tiles consistently produce better quality results, it gradually allocates more bits to the tiles whose quality is worse, until the various tiles reach a similar level of quality. In an alternative embodiment, the compressor logic 404 allocates bits to achieve higher quality in a particular tile or group of tiles. For example, providing higher quality in the center of the screen than at the edges may yield a better overall perceptual experience.
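One way such a quality-balancing feedback loop might be sketched (purely illustrative; the patent describes the behavior but not an algorithm, and `psnr` here stands in for whatever quality metric is used):

```python
def rebalance_bits(budgets: list[float], psnr: list[float],
                   step: float = 0.02) -> list[float]:
    """Shift a small share of the per-tile bit budget from tiles whose
    measured quality (PSNR) is above average to tiles below average,
    keeping the total budget constant. Hypothetical sketch only."""
    avg = sum(psnr) / len(psnr)
    adjusted = [b * (1 + step) if q < avg else b * (1 - step)
                for b, q in zip(budgets, psnr)]
    scale = sum(budgets) / sum(adjusted)   # renormalize to the same total
    return [b * scale for b in adjusted]

# Tiles 0-3 (the busy top of the screen) measure worse than tiles 4-7:
budgets = [1.0] * 8
psnr = [30, 31, 30, 29, 40, 41, 40, 42]
budgets = rebalance_bits(budgets, psnr)    # the top tiles now get more bits
```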
In one embodiment, to improve the resolution of certain areas of the video stream, the video compression logic 404 uses smaller tiles to encode areas of the video stream with relatively more scene complexity and/or more motion than areas with relatively less scene complexity and/or less motion. For example, as illustrated in FIG. 8, smaller tiles are employed around a moving character 805 in one area of one R-frame 811 (potentially followed by a series of R-frames with the same tile sizes (not shown)). Then, when the character 805 moves to a new area of the image, smaller tiles are used around this new area within another R-frame 812, as illustrated. As stated above, various different sizes and shapes may be employed as "tiles" while still complying with these underlying principles.
While the cyclic I/P tiles described above substantially reduce the peaks in the data rate of a video stream, they do not eliminate the peaks entirely, particularly in the case of rapidly changing or highly complex video imagery, such as occurs with motion pictures, video games, and some application software. For example, during a sudden scene transition, a complex frame may be followed by another complex frame that is completely different. Even though several I-tiles may have preceded the scene transition by only a few frame times, they do not help in this situation, because the new frame's material bears no relation to the prior I-tiles. In such a situation (and in other situations where, even if not everything changes, much of the image changes), the video compressor 404 will determine that many, if not all, of the P-tiles are more efficiently coded as I-tiles, and what results is a very large peak in the data rate for that frame.
As discussed previously, it is simply the case that, with most consumer-grade Internet connections (and many office connections), it is not feasible to "jam" data that exceeds the available maximum data rate, shown as 622 in FIG. 6c, along with the rated maximum data rate 621. Note that the rated maximum data rate 621 (e.g., "6 Mbps DSL") is essentially a sales number for users considering the purchase of an Internet connection, but it generally does not guarantee a level of performance. For the purposes of this application, it is irrelevant, since our only concern is the available maximum data rate 622 at the time video is streamed over the connection. Consequently, in FIGS. 9a and 9c, as we describe a solution to the peaking problem, the rated maximum data rate is omitted from the graph, and only the available maximum data rate 922 is shown. The video stream data rate must not exceed the available maximum data rate 922.
To address this, the first thing the video compressor 404 does is determine a peak data rate 941, which is the data rate the channel is able to handle steadily. This rate can be determined by a number of techniques. One such technique is to gradually send an increasingly higher data rate test stream from the hosting service 210 to the client 415 of FIGS. 4a and 4b, and have the client provide feedback to the hosting service as to the level of packet loss and latency. As the packet loss and/or latency begins to rise sharply, that is an indication that the available maximum data rate 922 is being approached. Afterwards, the hosting service 210 can gradually reduce the data rate of the test stream until the client 415 reports that, for a reasonable period of time, the test stream has been received with an acceptable level of packet loss and nearly minimal latency. This establishes a peak maximum data rate 941, which is then used as the peak data rate for streaming video. Over time, the peak data rate 941 will fluctuate (e.g., if another user in the house starts using the Internet connection heavily), and the client 415 will need to constantly monitor for increased latency or packet loss, indicating that the available maximum data rate 922 is dropping below the previously established peak data rate 941, and, if so, lower the peak data rate 941. Similarly, if over time the client 415 finds that the packet loss and latency remain at optimal levels, it can request that the video compressor slowly increase the data rate to see whether the available maximum data rate has increased (e.g., if another user in the house has stopped using the Internet connection heavily), and again wait until packet loss and/or higher latency indicates that the available maximum data rate 922 has been exceeded, and again a lower level for the peak data rate 941 can be found, but one that is potentially higher than the level before the increased data rate was tested. So, by using this technique (and other techniques like it), a peak data rate 941 can be found and adjusted periodically as needed. The peak data rate 941 establishes the maximum data rate that the video compressor 404 can use to stream video to the user. The logic for determining the peak data rate may be implemented at the user premises 211 and/or on the hosting service 210: at the user premises 211, the client device 415 performs the calculations to determine the peak data rate and transmits this information back to the hosting service 210; at the hosting service 210, a server 402 performs the calculations to determine the peak data rate based on statistics received from client 415 (e.g., packet loss, latency, maximum data rate, etc.).
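A highly simplified sketch of this probe-and-back-off idea (the thresholds and step sizes are invented assumptions; the patent describes the behavior, not an algorithm):

```python
def adjust_peak_rate(peak_kbps: float, loss_pct: float, latency_ms: float,
                     loss_limit: float = 1.0, latency_limit_ms: float = 30.0,
                     step: float = 0.05) -> float:
    """Lower the peak rate when loss/latency indicates the channel is
    saturated; otherwise cautiously probe upward. All limits here are
    illustrative, not values from the patent."""
    if loss_pct > loss_limit or latency_ms > latency_limit_ms:
        return peak_kbps * (1 - step)     # back off below the congestion point
    return peak_kbps * (1 + step / 2)     # probe upward more slowly than we back off

peak = 2200.0  # starting estimate, in Kbit/s
peak = adjust_peak_rate(peak, loss_pct=0.1, latency_ms=12)  # clean -> probe up
peak = adjust_peak_rate(peak, loss_pct=4.0, latency_ms=80)  # congested -> back off
```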
FIG. 9a shows an example video stream data rate 934 that has substantial scene complexity and/or motion and that has been generated using the cyclic I/P tile compression techniques described previously and illustrated in FIGS. 7a, 7b, and 8. The video compressor 404 has been configured to output compressed video at an average data rate below the peak data rate 941, and note that, most of the time, the video stream data rate remains below the peak data rate 941. A comparison of data rate 934 with the video stream data rate 634 shown in FIG. 6c, created using I/P/B or I/P frames, shows that the cyclic I/P tile compression produces a much smoother data rate. Still, at frame 2× peak 952 (which approaches 2× peak data rate 942) and frame 4× peak 954 (which approaches 4× peak data rate 944), the data rate exceeds the peak data rate 941, which is unacceptable. In practice, even with high-action video from fast-changing video games, peaks exceeding the peak data rate 941 occur in less than 2% of the frames, peaks exceeding 2× peak data rate 942 occur rarely, and peaks exceeding 3× peak data rate 943 almost never occur. But when they do occur (e.g., during a scene transition), the data rate they require is necessary to produce a good quality video image.
One way to address this problem is simply to configure the video compression device 404 so that its maximum output data rate is the peak data rate 941. Unfortunately, the resulting video quality during those peak frames is poor because the compression algorithm is starved of bits. As a result, compression artifacts appear when sudden transitions or rapid motion occur, and in time the user comes to realize that artifacts always appear suddenly whenever there are sudden changes or rapid motion, and this can become very annoying.
Although the human visual system is quite sensitive to visual artifacts that appear during sudden changes or rapid motion, it is not very sensitive to a reduction in frame rate in such situations.
In fact, when such sudden changes occur, it turns out that the human visual system is preoccupied with tracking the changes, and it does not notice if the frame rate briefly drops from 60 frames/s to 30 frames/s and then immediately returns to 60 frames/s. And, in the case of a very dramatic transition, such as a sudden scene change, the human visual system does not notice if the frame rate drops to 20 frames/s or even 15 frames/s and then immediately returns to 60 frames/s. As long as the frame rate reductions occur only rarely, to a human observer it appears that the video is running continuously at 60 frames/s.
This property of the human visual system is exploited by the methods illustrated in FIG. 9b. Server 402 (of FIG. 4a and FIG. 4b) produces an uncompressed video output stream at a steady frame rate (at 60 frames/s in one embodiment). The timeline shows each frame 961-970 output each 1/60th of a second. Each uncompressed video frame, starting with frame 961, is output to the low-latency video compression device 404, which compresses the frame in less than a frame period, producing, for the first frame, compressed frame 1 (981). The data produced for compressed frame 1 (981) may be larger or smaller, depending on many factors, as previously described. If the data is small enough that it can be transmitted to the client 415 in one frame period (1/60th of a second) or less at the peak data rate 941, then it is transmitted during transmission period 991 (xmit time) (the length of the arrow indicating the duration of the transmission period). In the next frame period, server 402 produces uncompressed frame 2 (962), it is compressed into compressed frame 2 (982), and it is transmitted to the client 415 during transmission period 992, which is less than a frame period, at the peak data rate 941.
Then, in the next frame period, server 402 produces uncompressed frame 3 (963). When it is compressed by the video compression device 404, the resulting compressed frame 3 (983) contains more data than can be transmitted at the peak data rate 941 in one frame period. So, it is transmitted during transmission period 993 (2× peak), which occupies all of that frame period and part of the next frame period. Then, during the next frame period, server 402 produces another uncompressed frame 4 (964) and outputs it to the video compression device 404, but its data is ignored, as illustrated at 974. This is because the video compression device 404 is configured to ignore further uncompressed video frames that arrive while it is still transmitting the previous compressed frame. Of course, the compressed video recovery device of the client 415 will never receive frame 4, but it simply continues to display frame 3 on the display 422 for two frame periods (i.e., the frame rate briefly drops from 60 frames/s to 30 frames/s).
For the next frame period, server 402 produces uncompressed frame 5 (965), which is compressed into compressed frame 5 (985) and transmitted within one frame period during transmission period 995. The compressed video recovery device of the client 415 restores compressed frame 5 and displays it on the display 422. Next, server 402 produces uncompressed frame 6 (966), and the video compression device 404 compresses it into compressed frame 6 (986), but this time the resulting data is very large. The compressed frame is transmitted during transmission period 996 (4× peak) at the peak data rate 941, but the transmission of the frame takes almost 4 frame periods. During the next 3 frame periods, the video compression device 404 ignores 3 frames from server 402, and the compressed data recovery device of the client 415 holds frame 6 steadily on the display 422 for 4 frame periods (i.e., briefly reducing the frame rate from 60 frames/s to 15 frames/s). Finally, server 402 produces frame 10 (970), the video compression device 404 compresses it into compressed frame 10 (987), it is transmitted during transmission period 997, and the compressed data recovery device of the client 415 restores compressed frame 10 and displays it on the display 422, and once again the video resumes at 60 frames/s.
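The frame-dropping behavior just described can be summarized with a short sketch. The following Python fragment is illustrative only: the peak rate and the frame sizes (expressed as multiples of the per-frame-period budget) are invented, and the loop is a simplified stand-in for the behavior of the video compression device 404 in FIG. 9b.

FRAME_PERIOD = 1 / 60.0                    # 16.67 ms at 60 frames/s
PEAK_RATE = 5_000_000                      # peak data rate 941, bits/s
BUDGET = PEAK_RATE * FRAME_PERIOD          # bits sendable per frame period

frame_sizes = [0.8, 0.9, 2.0, 0.7, 0.6, 3.8, 0.9, 0.8, 0.7, 0.9]  # x BUDGET

busy_until = 0.0                           # time the transmitter frees up
for i, size in enumerate(frame_sizes):
    t = i * FRAME_PERIOD                   # time this frame is produced
    if t < busy_until:
        print(f"frame {i + 1}: dropped (transmitter busy)")
        continue
    xmit_time = size * BUDGET / PEAK_RATE  # seconds to send this frame
    busy_until = t + xmit_time
    print(f"frame {i + 1}: sent, xmit {xmit_time * 1000:.1f} ms")

Running this reproduces the pattern of FIG. 9b: frame 4 is dropped after the 2× frame 3, and frames 7-9 are dropped after the 4× frame 6, after which 60 frames/s output resumes.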
Note that, although the video compression device 404 drops video frames from the video stream generated by server 402, it does not drop the audio data, regardless of the form in which the audio arrives; it continues to compress the audio data while video frames are dropped and transmits it to the client 415, which continues to decompress the audio data and provide the audio to whatever device the user is using for audio playback. Accordingly, the audio continues unabated during the periods when frames are dropped. Compressed audio consumes a relatively small percentage of the bandwidth compared to compressed video, and as a result it does not have a major impact on the overall data rate. Although it is not illustrated in any of the data rate diagrams, there is always data rate capacity reserved for the compressed audio stream within the peak data rate 941.
The example just described in FIG. 9b was chosen to illustrate how the frame rate drops during data rate peaks, but what it does not illustrate is that, when the cyclic I/P fragment methods described earlier are used, such data rate peaks, and the consequent dropped frames, are rare, even during sequences with high scene activity/complexity, such as those that occur in video games, motion pictures, and some application software. Consequently, the reduced frame rates are infrequent and brief, and the human visual system does not notice them.
If the frame rate reduction mechanism just described is applied to the video stream data rate shown in FIG. 9a, the resulting video stream data rate is shown in FIG. 9c. In this example, the 2× peak 952 has been reduced to the flattened 2× peak 953, and the 4× peak 954 has been reduced to the flattened 4× peak 955, and the entire video stream data rate 934 remains at or below the peak data rate 941.
Accordingly, using the methods described above, a high-action video stream can be transmitted with low latency through the general Internet and through a consumer-grade Internet connection. Further, in an office environment on a LAN (e.g., 100 Mbps Ethernet or 802.11g wireless) or on a private network (e.g., a 100 Mbps connection between a data center and offices), a high-action video stream can be transmitted without peaks, so that many users (for example, streaming 1920×1080 at 60 frames/s at 4.5 Mbps) can use the LAN or a shared private data connection without overlapping peaks overwhelming the network or the network switch backplanes.
Data Rate Adjustment
In one embodiment, the hosting service 210 initially assesses the available maximum data rate 622 and the latency of the channel to determine an appropriate data rate for the video stream and then dynamically adjusts the data rate in response. To adjust the data rate, the hosting service 210 may, for example, modify the image resolution and/or the number of frames/second of the video stream to be sent to the client 415. The hosting service can also adjust the quality level of the compressed video. When changing the resolution of the video stream, for example from 1280×720 to 640×360, the compressed video recovery logic 412 in the client 415 can scale up the image to maintain the same image size on the display screen.
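As a rough illustration of such an adjustment, the following sketch maps a measured available data rate to a resolution and frame rate. The format ladder and rate thresholds are assumptions for illustration only and are not taken from this description.

FORMATS = [            # (min bits/s, resolution, frames/s) - illustrative
    (8_000_000, (1920, 1080), 60),
    (5_000_000, (1280, 720), 60),
    (3_000_000, (1280, 720), 30),
    (1_500_000, (640, 360), 60),
    (0,         (640, 360), 30),
]

def pick_format(available_bps):
    # Choose the richest format the measured channel rate can sustain.
    for min_bps, resolution, fps in FORMATS:
        if available_bps >= min_bps:
            return resolution, fps

print(pick_format(4_200_000))   # -> ((1280, 720), 30)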
In one embodiment, in a situation where the channel signal is lost entirely, the hosting service 210 pauses the game. In the case of a multiplayer game, the hosting service reports to the other users that the user has dropped out of the game and/or pauses the game for the other users.
Dropped or delayed packets
In one embodiment, if data is lost due to packet loss between the video compression device 404 and the client 415 of FIG. 4a or FIG. 4b, or due to a packet received out of order that arrives too late to decompress a frame and meet the latency requirements of the decompressed frame, the compressed video recovery logic 412 is able to mitigate the visual artifacts. With a streaming I/P frame implementation, if there is a lost/delayed packet, the entire screen is affected, possibly causing the screen to freeze completely for a period of time or to show other screen-wide visual artifacts. For example, if a lost/delayed packet causes the loss of an I-frame, then the compressed data recovery device will have no reference for all of the P-frames that follow until a new I-frame is received. If a P-frame is lost, it will affect the P-frames for the entire screen that follow it. Depending on how long it will be before an I-frame appears, this will have a longer or shorter visual impact. Using interleaved I/P fragments as shown in FIG. 7a and FIG. 7b, it is far less likely that a lost/delayed packet will affect the entire screen, since it will only affect the fragments contained in the affected packet. If each fragment's data is sent within an individual packet, then if a packet is lost, it will affect only one fragment. Of course, the duration of the visual artifact will depend on whether an I-fragment packet is lost and, if a P-fragment is lost, how many frames it will take until an I-fragment appears. But, given that different fragments on the screen are being updated with I-fragments very frequently (potentially every frame), even if one fragment on the screen is affected, other fragments may not be. Further, if some event causes a loss of several packets at once (e.g., a power spike next to a DSL line that briefly disrupts the data flow), then some of the fragments will be affected more than others, but because some fragments will quickly be renewed with a new I-fragment, they will be affected only briefly. Also, with a streaming I/P frame implementation, not only are the I-frames the most critical frames, they are also extremely large, so if there is an event that causes a dropped/delayed packet, there is a higher probability that an I-frame will be affected (i.e., if any part of an I-frame is lost, it is unlikely that the I-frame can be decompressed at all) than a much smaller I-fragment. For all of these reasons, using I/P fragments results in far fewer visual artifacts when packets are dropped/delayed than with I/P frames.
One embodiment seeks to reduce the effect of lost packets by intelligently packing the compressed fragments within TCP (Transmission Control Protocol) packets or UDP (User Datagram Protocol) packets. For example, in one embodiment, fragments are aligned with packet boundaries whenever possible. FIG. 10a shows how fragments might be packed within a sequence of packets 1001-1005 without implementing this feature. Specifically, in FIG. 10a, fragments cross packet boundaries and are packed inefficiently, so that the loss of a single packet results in the loss of multiple fragments. For example, if packet 1003 or 1004 is lost, then three fragments are lost, resulting in visual artifacts.
In contrast, FIG. 10b depicts fragment packing logic 1010 for intelligently packing fragments within packets to reduce the effect of packet loss. First, the fragment packing logic 1010 aligns fragments with packet boundaries. Accordingly, fragments T1, T3, T4, T7 and T2 are aligned with the boundaries of packets 1001-1005, respectively. The fragment packing logic also seeks to fit fragments within packets in the most efficient manner possible without crossing packet boundaries. Based on the size of each of the fragments, fragments T1 and T6 are combined in one packet 1001; T3 and T5 are combined in one packet 1002; fragments T4 and T8 are combined in one packet 1003; fragment T7 is added to packet 1004; and fragment T2 is added to packet 1005. Accordingly, under this scheme, the loss of a single packet will result in the loss of no more than 2 fragments (rather than 3 fragments, as illustrated in FIG. 10a).
One additional benefit of the embodiment depicted in FIG. 10b is that the fragments are transmitted in an order different from the order in which they are displayed within the image. This way, if adjacent packets are lost due to the same event interfering with the transmission, the affected areas will not be adjacent to one another on the screen, creating less noticeable artifacts on the display.
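The packing idea can be sketched as a simple greedy bin-packer. The following Python fragment is a rough illustration only, not the fragment packing logic 1010 itself; the fragment sizes and packet payload size are invented, and a real packer would additionally reorder fragments so that adjacent screen areas travel in different packets, per the paragraph above.

MTU = 1400                                    # payload bytes per packet (assumed)

def pack(fragments):                          # fragments: {name: size in bytes}
    packets, current, used = [], [], 0
    # Largest-first greedy fill; fragments never cross packet boundaries.
    for name, size in sorted(fragments.items(), key=lambda kv: -kv[1]):
        if used + size > MTU and current:
            packets.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        packets.append(current)
    return packets

tiles = {"T1": 900, "T2": 1300, "T3": 800, "T4": 1000,
         "T5": 600, "T6": 500, "T7": 1200, "T8": 400}
for i, p in enumerate(pack(tiles), 1):
    print(f"packet {i}: {p}")                # each packet holds whole fragments

With these illustrative sizes, no packet ever carries more than two fragments, so losing any single packet costs at most two fragments, mirroring the improvement of FIG. 10b over FIG. 10a.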
One embodiment uses Forward Error Correction (FEC) methods to protect certain portions of the video stream from channel errors. As is known in the art, FEC methods such as Reed-Solomon and Viterbi generate error correction data information and attach it to data transmitted over a communication channel. If an error occurs in the underlying data (e.g., in an I-frame), then FEC can be used to correct the error.
FEC codes increase the data rate of the transmission, so ideally they are used only where they are most needed. If data is being sent whose loss would not result in a very noticeable visual artifact, it may be preferable not to use FEC codes to protect the data. For example, a lost P-fragment that immediately precedes an I-fragment will create a visual artifact on the screen (i.e., the fragment on the screen will not be updated) for only 1/60th of a second. Such a visual artifact is barely detectable by the human eye. The farther back P-fragments are from the next I-fragment, the more noticeable losing one becomes. For example, if the fragment cycle pattern is an I-fragment followed by 15 P-fragments before an I-fragment is available again, then if the P-fragment immediately following an I-fragment is lost, the result is that that fragment will show an incorrect image for 15 frame periods (at 60 frames/s, that is 250 ms). The human eye will readily detect a disruption in a stream lasting 250 ms. So, the farther back a P-fragment is from a new I-fragment (i.e., the closer a P-fragment follows the previous I-fragment), the more noticeable the artifact. That said, as previously discussed, in general, the closer a P-fragment follows the previous I-fragment, the smaller the data for that P-fragment. Thus, P-fragments that follow I-fragments are not only more critical to protect from loss, but they are also smaller in size. And, in general, the less data that needs to be protected, the smaller the FEC code needed to protect it.
Accordingly, as shown in FIG. 11a, in one embodiment, because of the importance of I-fragments in the video stream, only the I-fragments are provided with FEC codes. Accordingly, FEC 1101 contains the error correction code for I-fragment 1100, and FEC 1104 contains the error correction code for I-fragment 1103. In this embodiment, no FEC is generated for the P-fragments.
In one embodiment, shown in FIG. 11b, FEC codes are also generated for the P-fragments that are most likely to cause visual artifacts if lost. In this embodiment, FEC codes 1105 provide error correction codes for the first 3 P-fragments, but not for the P-fragments that follow. In another embodiment, FEC codes are generated for the P-fragments whose data size is smallest (which will tend to self-select the P-fragments occurring soonest after an I-fragment, which are the most important to protect).
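A minimal sketch of such a protection plan follows. The cycle length and the number of protected P-fragments are illustrative assumptions, in the spirit of FIGS. 11a-11b.

def fec_plan(cycle_len=16, protected_p=3):
    # Build one fragment cycle: an I-fragment, then P-fragments.
    plan = []
    for n in range(cycle_len):
        if n == 0:
            plan.append(("I", True))              # I-fragment: always FEC
        else:
            plan.append(("P", n <= protected_p))  # only the earliest P-fragments
    return plan

for n, (kind, fec) in enumerate(fec_plan()):
    print(f"frame {n:2d}: {kind}-fragment, FEC={'yes' if fec else 'no'}")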
In another embodiment, rather than sending an FEC code with a fragment, the fragment is transmitted twice, each time in a different packet. If one packet is lost/delayed, the other packet is used.
In one embodiment, shown in FIG. 11c, FEC codes 1111 and 1113 are generated for audio packets 1110 and 1112, respectively, transmitted from the hosting service simultaneously with the video. It is particularly important to maintain the integrity of the audio in a video stream, because distorted audio (e.g., clicking or hissing) results in a particularly unpleasant user experience. The FEC codes help ensure that the audio content is reproduced at the client computer 415 without distortion.
In another embodiment, rather than sending an FEC code with the audio data, the audio data is transmitted twice, each time in a different packet. If one packet is lost/delayed, the other packet is used.
In addition, in one embodiment shown in FIG. 11d, FEC codes 1121 and 1123 are used for user input commands 1120 and 1122, respectively (e.g., button presses), sent upstream from the client 415 to the hosting service 210. This is important because missing a button press or mouse movement in a video game or an application could result in an unpleasant user experience.
In another embodiment, rather than sending an FEC code with the user input command data, the user input command data is transmitted twice, each time in a different packet. If one packet is lost/delayed, the other packet is used.
In one embodiment, the hosting service 210 assesses the quality of the communication channel with the client 415 to determine whether to use FEC and, if so, to which portions of the video, audio, and user commands FEC should be applied. Assessing the "quality" of the channel may include evaluating packet loss, latency, etc., as described above. If the channel is particularly unreliable, then the hosting service 210 may apply FEC to all of the I-fragments, P-fragments, audio, and user commands. In contrast, if the channel is reliable, then the hosting service 210 may apply FEC only to the audio and user commands, or it may not apply FEC to either audio or video, or it may not use FEC at all. Various other permutations of FEC application can be employed while still complying with these underlying principles. In one embodiment, the hosting service 210 constantly monitors the conditions of the channel and changes the FEC policy accordingly.
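For illustration, one possible (entirely hypothetical) policy table driven by measured packet loss and latency might look like the following sketch; the thresholds are assumptions, not values from this description.

def fec_policy(loss, latency_ms):
    # Map channel quality to which data classes receive FEC protection.
    if loss > 0.05 or latency_ms > 100:          # very unreliable channel
        return {"I": True, "P": True, "audio": True, "input": True}
    if loss > 0.01:                              # moderately lossy channel
        return {"I": True, "P": False, "audio": True, "input": True}
    return {"I": False, "P": False, "audio": True, "input": True}  # reliable

print(fec_policy(loss=0.02, latency_ms=35))
# -> I-fragments, audio and input protected; P-fragments unprotected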
In another embodiment, referring to FIGS. 4a and 4b, when a packet is lost/delayed, resulting in the loss of fragment data, or if, perhaps because of a particularly severe packet loss, the FEC is unable to correct the lost fragment data, the client 415 assesses how many frames remain before a new I-fragment will be received and compares that to the round-trip time from the client 415 to the hosting service 210. If the round-trip time is less than the number of frames before a new I-fragment is due to arrive, then the client 415 sends a message to the hosting service 210 requesting a new I-fragment. This message is routed to the video compression device 404, and rather than generating a P-fragment for the fragment whose data had been lost, it generates an I-fragment. Given that the system illustrated in FIGS. 4a and 4b is designed to provide a round-trip time that is typically under 80 ms, this results in the fragment being corrected within 80 ms (at 60 frames/s, frames are 16.67 ms in duration, so in full-frame periods an 80 ms latency results in the fragment being corrected within 83.33 ms, which is 5 frame periods; a noticeable disruption, but far less noticeable than, for example, a 250 ms disruption of 15 frames). When the compression device 404 generates such an I-fragment out of its usual cyclic order, if the I-fragment would cause the bandwidth of that frame to exceed the available bandwidth, then the compression device 404 delays the cycles of the other fragments so that the other fragments receive P-fragments during that frame period (even if one fragment would normally be due an I-fragment during that frame), and then, starting with the next frame, the usual cycling continues, and the fragment that would normally have received an I-fragment in the preceding frame receives an I-fragment. Although this action briefly delays the phase of the I-fragment cycling, it is normally not noticeable visually.
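The client-side decision rule can be sketched as follows; the function name is illustrative, and the rounding to whole frame periods follows the arithmetic in the paragraph above.

import math

FRAME_MS = 1000.0 / 60.0                     # 16.67 ms per frame at 60 frames/s

def should_request_i_fragment(frames_until_next_i, rtt_ms):
    # A fresh I-fragment takes roughly one round trip to obtain,
    # rounded up to whole frame periods.
    rtt_frames = math.ceil(rtt_ms / FRAME_MS)
    return rtt_frames < frames_until_next_i

print(should_request_i_fragment(frames_until_next_i=15, rtt_ms=80))  # True: 5 < 15
print(should_request_i_fragment(frames_until_next_i=3,  rtt_ms=80))  # False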
Implementing a compression device / compressed video and audio recovery device
FIG. 12 depicts one particular embodiment in which a multi-core and/or multi-processor 1200 is used to compress 8 fragments in parallel. In one embodiment, a dual-processor, quad-core Xeon CPU computer system running at 2.66 GHz or higher is used, with each core implementing the open-source x264 H.264 compression device as an independent process. However, a variety of other hardware/software configurations may be used while still complying with these underlying principles. For example, each of the CPU cores can be replaced with an H.264 compression device implemented in an FPGA. In the example shown in FIG. 12, cores 1201-1208 are used to process the I-fragments and P-fragments concurrently as eight independent threads. As is well known in the art, current multi-core and multi-processor computer systems are inherently capable of multi-threading when integrated with multi-threading operating systems such as Microsoft Windows XP Professional Edition (either 64-bit or 32-bit) and Linux.
In the embodiment illustrated in FIG. 12, since each of the 8 cores is responsible for just one fragment, it operates largely independently of the other cores, each running a separate instance of x264. A PCI Express x1-based DVI capture card, such as the Sendero Video Imaging IP Development Board from Microtronix of Oosterhout, The Netherlands, is used to capture uncompressed video at 640×480, 800×600, or 1280×720 resolution, and the FPGA on the card uses direct memory access (DMA) to transfer the captured video through the DVI bus into system RAM. The fragments are arranged in a 4×2 layout 1205 (although they are illustrated as square fragments, in this embodiment they are of 160×240 resolution). Each instance of x264 is configured to compress one of the 8 fragments of 160×240, and they are synchronized such that, after an initial I-fragment compression, each core enters into a cycle, each one frame out of phase with the other, to compress one I-fragment followed by seven P-fragments, as illustrated in FIG. 12.
Each frame period, the resulting compressed fragments are combined into a packet stream using the methods described previously, and then the compressed fragments are transmitted to the target client 415.
Although not illustrated in FIG. 12, if the data rate of the combined 8 fragments exceeds a specified peak data rate 941, then all 8 x264 processes are suspended for as many frame periods as are necessary until the data for the combined 8 fragments can be transmitted.
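The staggered cycle can be sketched in a few lines; this is an illustration of the schedule only, not of x264 itself, with core and frame counts mirroring the example above.

CORES = 8

def fragment_type(core, frame):
    # Core k emits its I-fragment on frames where frame % CORES == k,
    # so exactly one of the 8 fragments per frame is an I-fragment.
    return "I" if frame % CORES == core else "P"

for frame in range(4):
    row = " ".join(fragment_type(core, frame) for core in range(CORES))
    print(f"frame {frame}: {row}")
# frame 0: I P P P P P P P
# frame 1: P I P P P P P P  ... and so on, one frame out of phase per core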
In one embodiment, client 415 is implemented as software on a PC running 8 instances of FFmpeg. A receiving process receives the 8 fragments, and each fragment is routed to an FFmpeg instance, which decompresses the fragment and renders it at the appropriate fragment location on the display 422.
Client 415 receives keyboard, mouse, or game controller input from the PC's input device drivers and transmits it to server 402. Server 402 then applies the received input device data to the game or application running on server 402, which is a PC running Windows using a 2.16 GHz Intel Core Duo CPU. Server 402 then produces a new frame and outputs it through its DVI output, either from the motherboard-based graphics system or through the DVI output of an NVIDIA 8800GTX PCI Express board.
Simultaneously, server 402 outputs the audio produced by the game or application through its digital audio output (e.g., S/PDIF), which is coupled to the digital audio input on the dual quad-core Xeon-based PC that implements the video compression. A Vorbis open-source audio compression device is used to compress the audio simultaneously with the video, using whichever core is available to the process thread. In one embodiment, the core that completes compressing its fragment first executes the audio compression. The compressed audio is then transmitted along with the compressed video and is decompressed on the client 415 using a Vorbis compressed audio recovery device.
Hosting Server Center Distribution
Light travels through glass, such as an optical fiber, at some fraction of the speed of light in a vacuum, and so the exact propagation speed of light in optical fiber can be determined. But, in practice, allowing time for routing delays, transmission inefficiencies, and other overhead, the authors have observed that optimal latencies on the Internet reflect transmission speeds closer to 50% of the speed of light. Accordingly, an optimal 1,000-mile (1,600 km) round-trip latency is approximately 22 ms, and an optimal 3,000-mile (4,800 km) round-trip latency is approximately 64 ms. Accordingly, a single server center on one coast of the United States is too far away to serve clients on the other coast (which may be as far as 3,000 miles (4,800 km) away) with the required latency. However, as illustrated in FIG. 13a, if the server center 1300 of the hosting service 210 is located in the center of the United States (e.g., Kansas, Nebraska, etc.), such that the distance to any point in the continental United States is approximately 1,500 miles (2,400 km) or less, then the round-trip Internet latency can be as low as 32 ms. Referring to FIG. 4b, note that, although the worst-case latency allowed for the user ISP connection 453 is 25 ms, typically the authors have observed latencies closer to 10-15 ms with cable modem and DSL systems. In addition, FIG. 4b assumes a maximum distance from the user premises 211 to the hosting center 210 of 1,000 miles (1,600 km). Accordingly, with a typical 15 ms user ISP round-trip latency for the connection and a maximum Internet distance of 1,500 miles (2,400 km) with a round-trip latency of 32 ms, the total round-trip latency from the moment a user actuates input device 421 to the moment a response is seen on display 422 is 1+1+15+32+1+16+6+8 = 80 ms. Accordingly, an 80 ms response time can typically be achieved over an Internet distance of 1,500 miles (2,400 km). This would allow any user premises in the continental United States with a sufficiently low-latency user ISP connection 453 to access a single, centrally located server center.
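The latency arithmetic above can be checked with a short calculation. The 50%-of-light-speed assumption and the fixed per-stage delays in the 80 ms total are those stated in the text; the small difference from the ~22 ms figure is rounding.

C_KM_PER_MS = 299_792.458 / 1000            # speed of light in vacuum, km per ms

def fiber_rtt_ms(one_way_km, fraction_of_c=0.5):
    # Round trip at an effective speed of ~50% of c, per the text.
    return 2 * one_way_km / (C_KM_PER_MS * fraction_of_c)

print(round(fiber_rtt_ms(1600)))             # -> 21, i.e. roughly the ~22 ms cited
print(round(fiber_rtt_ms(2400)))             # -> 32 ms for 1,500 miles
print(1 + 1 + 15 + 32 + 1 + 16 + 6 + 8)      # -> 80 ms total response budget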
In another embodiment, illustrated in FIG. 13b, the server centers HS1-HS6 of the hosting service 210 are strategically positioned around the United States (or another geographic region), with certain larger hosting server centers positioned close to high-population centers (e.g., HS2 and HS5). In one embodiment, the server centers HS1-HS6 exchange information through network 1301, which may be the Internet, or a private network, or a combination of both. With multiple server centers, services can be provided at lower latency to users who have high user ISP latency 453.
Although distance on the Internet is certainly a factor that contributes to round-trip latency through the Internet, other factors sometimes come into play that are largely unrelated to distance. Sometimes a packet stream is routed through the Internet to a distant location and back again, resulting in latency from the long loop. Sometimes routing equipment on the path is not operating properly, resulting in transmission delay. Sometimes there is traffic overloading a path, which introduces delay. And sometimes there is an outage that prevents the user's ISP from routing to a given destination at all. Accordingly, while the general Internet usually provides connections from one point to another with a fairly reliable and optimal route, and with a latency largely determined by distance (especially with long-distance connections that result in routing outside of the user's local area), such reliability and latency are by no means guaranteed and often cannot be achieved from a user's premises to a given destination on the general Internet.
In one embodiment, when a user client 415 initially connects to the hosting service 210 to play a video game or use an application, the client communicates with each of the hosting service server centers HS1-HS6 available upon startup (e.g., using the methods described above). If the latency is low enough for a particular connection, then that connection is used. In one embodiment, the client communicates with all, or a subset, of the hosting service server centers, and the one with the lowest-latency connection is selected. The client may select the service center with the lowest-latency connection, or the service centers may identify the one with the lowest-latency connection and provide that information (e.g., in the form of an Internet address) to the client.
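A minimal sketch of this startup selection follows; the server center latencies are invented stand-ins for real round-trip measurements made by the client.

def pick_server_center(latencies_ms):
    # Choose the server center with the lowest measured round-trip latency.
    center, latency = min(latencies_ms.items(), key=lambda kv: kv[1])
    return center, latency

probes = {"HS1": 48, "HS2": 22, "HS3": 35, "HS4": 61, "HS5": 27, "HS6": 90}
print(pick_server_center(probes))            # -> ('HS2', 22)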
If a particular hosting service server center is overloaded and/or the user's game or application can tolerate the latency to another, less loaded hosting service server center, then the client 415 may be redirected to the other hosting service server center. In such a situation, the game or application the user is running is paused on the server 402 at the user's overloaded server center, and the application or game state data is transferred to a server 402 at the other hosting service server center. The game or application then resumes. In one embodiment, the hosting service 210 waits until the game or application has reached a natural pausing point (e.g., between levels in a game, or after the user initiates a "save" operation in the application) to make said transfer. In yet another embodiment, the hosting service 210 waits until user activity ceases for a specified period of time (e.g., 1 minute) and then makes said transfer at that time.
As described above, in one embodiment, the hosting service 210 subscribes to an Internet bypass service 440 of FIG. 14 to attempt to provide guaranteed latency to its clients. Internet bypass services, as used herein, are services that provide private network routes from one point on the Internet to another with guaranteed characteristics (e.g., latency, data rate, etc.). For example, if the hosting service 210 were receiving a large amount of traffic from users using AT&T's DSL service offered in San Francisco, rather than routing through AT&T's San Francisco-based central exchanges, the hosting service 210 could lease a high-capacity private data connection from a service provider (perhaps from AT&T itself or from another provider) between the San Francisco-based central exchanges and one or more of the server centers of the hosting service 210. Then, if routes from all of the hosting service server centers HS1-HS6 through the general Internet to a user in San Francisco using AT&T DSL were to result in too high a latency, the private data connection could be used instead. Although private data connections are generally more expensive than routes through the general Internet, as long as they remain a small percentage of the hosting service 210 connections to users, the overall cost impact will be low, and users will experience a more consistent service experience.
Server centers often have two tiers of backup power in the event of a power failure. The first tier is typically battery backup (or an alternative immediately available source of energy, such as a flywheel that is kept running and is attached to a generator), which provides power immediately when the power line fails and keeps the server center running. If the power failure is brief and the power line returns quickly (e.g., within a minute), then the batteries are all that is needed to keep the server center running. But if the power failure lasts for a longer period of time, then typically generators (e.g., diesel-powered) are started up, taking over from the batteries, and they can run for as long as they have fuel. Such generators are extremely expensive, since they must be capable of producing as much power as the server center typically receives from the power line.
In one embodiment, each of the hosting services HS1-HS5 shares user data with the others, so that if one server center has a power failure, it can pause the games and applications that are in process, transfer the application or game state data from each server 402 to servers 402 at other server centers, and then notify the client 415 of each user, directing it to communicate with the new server 402. Given that such situations occur infrequently, it may be acceptable to transfer a user to a hosting service server center that is unable to provide optimal latency (i.e., the user simply has to tolerate higher latency for the duration of the power failure), which allows for a much wider range of options for transferring users. For example, given the time zone differences in the United States, users on the East Coast may be going to sleep at 11:30 PM while users on the West Coast at 8:30 PM are at a peak of video game usage. If there is a power failure at a hosting service server center on the West Coast at that time, there may not be enough spare servers 402 at the other hosting service server centers to handle all of the users. In such a situation, some of the users can be transferred to hosting service server centers on the East Coast that have available servers 402, and the only consequence to the users will be higher latency. Once the users have been transferred from the server center that has lost power, the server center can then commence an orderly shutdown of its servers and equipment, so that all of the equipment is shut down before the batteries (or other immediate backup power) are exhausted. In this way, the cost of a generator for the server center can be avoided.
In one embodiment, during times of heavy loading of the hosting service 210 (either due to peak user loading or because one or more server centers have failed), users are transferred to other server centers based on the latency requirements of the game or application they are using. Accordingly, users of games or applications that require low latency are given preference for the available low-latency server connections when the supply is limited.
Hosting service features
FIG. 15 illustrates an embodiment of the server center components for the hosting service 210 used in the feature descriptions that follow. As with the hosting service 210 illustrated in FIG. 2a, the components of this server center are controlled and coordinated by the hosting service 210 control system 401 unless otherwise qualified.
Inbound Internet traffic 1501 from user clients 415 is directed to inbound routing 1502. Typically, inbound Internet traffic 1501 enters the server center via a high-speed fiber-optic connection to the Internet, but any network connection means of adequate bandwidth, reliability, and low latency will suffice. Inbound routing 1502 is a network system (the network can be implemented as an Ethernet network, a fiber channel network, or through any other means of transport) of switches and routing servers supporting the switches, which takes the arriving packets and routes each packet to the appropriate application/game ("app/game") server 1521-1525. In one embodiment, a packet delivered to a particular application/game server represents a subset of the data received from the client and/or may be translated/changed by other components (e.g., networking components such as gateways and routers) within the data center. In some cases, packets are routed to more than one server 1521-1525 at a time, for example, if a game or application is running on multiple servers in parallel. RAID arrays 1511-1512 are connected to the inbound routing network 1502 so that the application/game servers 1521-1525 can read from and write to the RAID arrays 1511-1512. Further, a RAID array 1515 (which may be implemented as multiple RAID arrays) is also connected to the inbound routing 1502, and data from RAID array 1515 can be read by the application/game servers 1521-1525. The inbound routing 1502 may be implemented in a wide range of prior-art network architectures, including a tree structure of switches with the inbound Internet traffic 1501 at its root, a mesh structure interconnecting all of the various devices, or as a series of interconnected subnets with concentrated traffic among intercommunicating devices segregated from concentrated traffic among other devices. One type of network configuration is a SAN which, although typically used for mass storage devices, can also be used for general high-speed data transfer among devices. Also, the application/game servers 1521-1525 may each have multiple network connections to the inbound routing 1502. For example, a server 1521-1525 may have a network connection to a subnet attached to RAID arrays 1511-1512 and another network connection to a subnet attached to other devices.
The application/game servers 1521-1525 may all be configured the same, some differently, or all differently, as previously described in relation to servers 402 in the embodiment illustrated in FIG. 4a. In one embodiment, each user, when using the hosting service, typically uses at least one application/game server 1521-1525. For the sake of simplicity of explanation, we shall assume a given user is using application/game server 1521, but multiple servers could be used by one user, and multiple users could share a single application/game server 1521-1525. The user's control input, sent from client 415 as previously described, is received as inbound Internet traffic 1501 and is routed through inbound routing 1502 to application/game server 1521. Application/game server 1521 uses the user's control input as control input to the game or application running on the server and computes the next frame of video and the audio associated with it. Application/game server 1521 then outputs the uncompressed video/audio 1529 to shared video compression 1530. The application/game server may output the uncompressed video via any means, including one or more Gigabit Ethernet connections, but in one embodiment the video is output through a DVI connection, and the audio and other compression and channel state information is output through a Universal Serial Bus (USB) connection.
Shared video compression 1530 compresses the uncompressed video and audio from the application/game servers 1521-1525. The compression may be implemented entirely in hardware, or in hardware running software. There may be a dedicated compression device for each application/game server 1521-1525, or, if the compression devices are fast enough, a given compression device can be used to compress the video/audio from more than one application/game server 1521-1525. For example, at 60 frames/s, a video frame time is 16.67 ms. If a compression device is able to compress a frame in 1 ms, then that compression device could be used to compress the video/audio from as many as 16 application/game servers 1521-1525 by taking input from one server after another, with the compression device saving the state of each video/audio compression process and switching context as it cycles among the video/audio streams from the servers. This results in substantial cost savings in compression hardware. Since different servers will be completing frames at different times, in one embodiment the compression device resources are in a shared pool 1530 with shared storage means (e.g., RAM, flash memory) for storing the state of each compression process, and when a server 1521-1525 frame is complete and ready to be compressed, a control means determines which compression resource is available at that time and provides the compression resource with the state of the server's compression process and the frame of uncompressed video/audio to compress.
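A rough sketch of such a pooled arrangement follows; the class and method names are hypothetical, and the "compression" step is a placeholder for the real work, illustrating only the save/restore of per-server state as a free compression resource is assigned.

from collections import deque

class CompressorPool:
    def __init__(self, n):
        self.free = deque(range(n))           # ids of idle compression resources
        self.state = {}                       # saved per-server compression state

    def compress(self, server_id, frame):
        unit = self.free.popleft()            # pick any idle compression resource
        ctx = self.state.get(server_id, {})   # restore that server's saved context
        compressed = f"frame<{frame}> by compressor {unit}"  # placeholder work
        self.state[server_id] = ctx           # save the (updated) context back
        self.free.append(unit)                # return the resource to the pool
        return compressed

pool = CompressorPool(2)
print(pool.compress("app_server_1521", "f0"))
print(pool.compress("app_server_1522", "f0"))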
Note that part of the state for each server's compression process includes information about the compression itself, such as the previous frame's decompressed frame buffer data, which may be used as a reference for P-fragments; the resolution of the video output; the quality of the compression; the fragment structure; the allocation of bits per fragment; and the audio format (e.g., stereo, surround sound, Dolby® AC-3). But the compression process state also includes communication channel state information regarding the peak data rate 941 and whether a previous frame is currently being transmitted (as illustrated in FIG. 9b) (and, as a result, whether the current frame should be ignored), and potentially whether there are channel characteristics, such as excessive packet loss, that should be considered in the compression and that affect decisions about the compression (e.g., in terms of the frequency of I-fragments, etc.). As the peak data rate 941 or other channel characteristics change over time, as determined by an application/game server 1521-1525 supporting each user monitoring data sent from the client 415, the application/game server 1521-1525 sends the relevant information to the shared hardware compression 1530.
The shared hardware compression 1530 also packetizes the compressed video/audio using means such as those previously described and, if appropriate, applying FEC codes, duplicating certain data, or taking other steps to adequately ensure the ability of the client 415 to receive the video/audio data stream and decompress it with as high a quality and reliability as feasible.
Some applications, such as those described below, require the video/audio output of a given application/game server 1521-1525 to be available at multiple resolutions (or in other multiple formats) simultaneously. If the application/game server 1521-1525 so notifies the shared hardware compression 1530 resource, then the uncompressed video/audio 1529 of that application/game server 1521-1525 will be simultaneously compressed in different formats, at different resolutions, and/or with different packet/error correction structures. In some cases, some compression resources can be shared among multiple compression processes compressing the same video/audio (e.g., in many compression algorithms there is a step whereby the image is scaled to multiple sizes before applying compression; if images of different sizes are required to be output, then this step can be used to serve several compression processes at once). In other cases, separate compression resources will be required for each format. In any case, the compressed video/audio 1539 of all of the various resolutions and formats required for a given application/game server 1521-1525 (be it one or many) is output at once to outbound routing 1540. In one embodiment, the output of the compressed video/audio 1539 is in UDP format, so it is a unidirectional stream of packets.
The outbound routing network 1540 comprises a series of switches and routing servers that direct each compressed video/audio stream to the intended user(s) or other destinations through the outbound Internet traffic 1599 interface (which typically connects to a fiber interface to the Internet), and/or back to the delay buffer 1515, and/or back to the inbound routing 1502, and/or out through a private network (not shown) for video distribution. Note that (as described below) the outbound routing 1540 may output a given video/audio stream to multiple destinations at once. In one embodiment, this is implemented using Internet Protocol (IP) multicast, in which a given UDP stream intended to be streamed to multiple destinations at once is broadcast, and the broadcast is repeated by the routing servers and switches in the outbound routing 1540. The multiple destinations of the broadcast may be clients 415 of multiple users via the Internet, multiple application/game servers 1521-1525 via the inbound routing 1502, and/or one or more delay buffers 1515. Accordingly, the output of a given server 1521-1525 is compressed into one or multiple formats, and each compressed stream is directed to one or multiple destinations.
Further, in another embodiment, if multiple application/game servers 1521-1525 are used simultaneously by one user (e.g., in a parallel-processing configuration to create the 3D output of a complex scene) and each server produces part of the resulting image, the video output of the multiple servers 1521-1525 can be combined by the shared hardware compression 1530 into a combined frame, and from that point forward it is handled as described above as if it came from a single application/game server 1521-1525.
Note that, in one embodiment, a copy (at least at the resolution of the video viewed by the user, or higher) of all of the video generated by the application/game servers 1521-1525 is recorded in delay buffer 1515 for at least some number of minutes (15 minutes in one embodiment). This allows each user to "rewind" the video from each session in order to review previous actions or exploits (in the case of a game). Accordingly, in one embodiment, each compressed video/audio output 1539 stream being routed to a user client 415 is also multicast to the delay buffer 1515. When the video/audio is stored in the delay buffer 1515, a directory in the delay buffer 1515 provides a cross-reference between the network address of the application/game server 1521-1525 that is the source of the delayed video/audio and the location in the delay buffer 1515 where the delayed video/audio can be found.
Real-time games that can be viewed immediately and played immediately
Application/game servers 1521-1525 may be used not only for running a given application or video game for a user, but they may also be used for creating user interface applications for the hosting service 210 that support navigation through the hosting service 210 and other features. A screen shot of one such user interface application is shown in FIG. 16, a "Game Finder" screen. This particular user interface screen allows a user to watch 15 games that are being played live (or delayed) by other users. Each of the "thumbnail" video windows, such as 1600, is a live video window in motion showing video from one user's game. The view shown in the thumbnail may be the same view that the user is seeing, or it may be a delayed view (e.g., if a user is playing a combat game, the user may not want other users to see where he is hiding, and he may choose to delay any view of his gameplay by a period of time, say 10 minutes). The view may also be a camera view of the game that is different from any user's view. Through menu selections (not shown in this illustration), the user may choose a selection of games to view at once, based on a variety of criteria. As a small sampling of exemplary choices, the user may select a random selection of games (such as those shown in FIG. 16), all of one kind of game (all being played by different players), only the top-ranked players of a game, players at a given level in a game, or lower-ranked players (e.g., if a player is learning the basics), players who are "buddies" (or rivals), the games that have the most viewers, etc.
Note that, in general, each user will decide whether the video from his game or application can be viewed by others and, if so, by which others, and when it may be viewed, and whether it may be viewed only with a delay.
The application/game server 1521-1525 that is generating the user interface screen shown in FIG. 16 acquires the 15 video/audio feeds by sending a message to the application/game server 1521-1525 for each user whose game it is requesting. The message is sent through the inbound routing 1502 or another network. The message includes the size and format of the video/audio requested and identifies the user viewing the user interface screen. A given user may choose to select "privacy" mode and not permit any other users to view the video/audio of his game (either from his viewing point or from another viewing point), or, as described in the preceding paragraph, a user may choose to allow the video/audio from his game to be viewed, but with the viewed video/audio delayed. A user's application/game server 1521-1525 that receives and accepts a request to allow its video/audio to be viewed acknowledges as such to the requesting server, and it also notifies the shared hardware compression 1530 of the need to generate an additional compressed video stream in the requested format or at the requested screen size (assuming the format and screen size are different from those already being generated), and it also indicates the destination for the compressed video (i.e., the requesting server). If the requested video/audio is only delayed, then the requesting application/game server 1521-1525 is so notified, and it acquires the delayed video/audio from the delay buffer 1515 by looking up the location of the video/audio in the directory on the delay buffer 1515 and the network address of the application/game server 1521-1525 that is the source of the delayed video/audio. Once all of these requests have been generated and handled, up to 15 live thumbnail-sized video streams are routed from the outbound routing 1540 to the inbound routing 1502 to the application/game server 1521-1525 generating the user interface screen, and they are decompressed and displayed by the server. Delayed video/audio streams may be at too large a screen size, and if so, the application/game server 1521-1525 decompresses the streams and scales down the video streams to thumbnail size. In one embodiment, requests for audio/video are sent to (and managed by) a central "management" service similar to the hosting service control system of FIG. 4a (not shown in FIG. 15), which then redirects the requests to the appropriate application/game server 1521-1525. Moreover, in one embodiment, no request may be required at all, because the thumbnails are "pushed" to the clients of those users who allow it.
The audio from 15 games all mixed simultaneously might create a cacophony of sound. The user may choose to mix all of the sounds together in this way (perhaps just to get a sense of the "din" created by all of the action being viewed), or the user may choose to listen to the audio from just one game at a time. The selection of a single game is accomplished by moving the yellow selection box 1601 to a given game (the yellow box can be moved by using the arrow keys on a keyboard, by moving a mouse, by moving a joystick, or by pushing directional buttons on another device such as a mobile phone). Once a single game is selected, only the audio from that game plays. Also, game information 1602 is shown. In the case of this game, for example, the publisher logo ("EA") and the game logo, "Need for Speed Carbon", are shown, and an orange horizontal bar indicates, in relative terms, the number of people playing or viewing the game at that particular moment (many, in this case, so the game is "Hot"). Further, "Stats" are provided, indicating that there are 145 players actively playing 80 different instances of Need for Speed (i.e., it can be played either as an individual-player game or as a multiplayer game) and that there are 680 viewers (one of whom is this user). Note that these statistics (and other statistics) are collected by the hosting service control system 401 and are stored on RAID arrays 1511-1512 for keeping logs of the hosting service 210 operation, for appropriately billing users, and for paying publishers who provide the content. Some of the statistics are recorded as a result of actions by the service control system 401, and some are reported to the service control system 401 by the individual application/game servers 1521-1525. For example, the application/game server 1521-1525 running this Game Finder application sends messages to the hosting service control system 401 when games are being viewed (and when viewing ceases) so that it may update the statistics of how many games are being viewed. Some of the statistics are available to user interface applications such as this Game Finder application.
If the user clicks an activation button on his input device, he will see the thumbnail video in the yellow box zoom up, while continuing to play live, to full screen size. This effect is shown in progress in FIG. 17. Note that video window 1700 has grown in size. To implement this effect, the application/game server 1521-1525 requests from the application/game server 1521-1525 running the selected game that a copy of the video stream of the game at full screen size (at the resolution of the user's display 422) be routed to it. The application/game server 1521-1525 running the game notifies the shared hardware compression device 1530 that a thumbnail-sized copy of the game is no longer needed (unless another application/game server 1521-1525 requires such a thumbnail), and then it directs it to send a full-screen-size copy of the video to the application/game server 1521-1525 zooming the video. The user playing the game may or may not have a display 422 of the same resolution as that of the user zooming up the game. Further, other viewers of the game may or may not have displays 422 of the same resolution as the display of the user zooming up the game (and may have different audio playback means, e.g., stereo or surround sound). Accordingly, the shared hardware compression device 1530 determines whether a suitable compressed video/audio stream that meets the requirements of the user requesting the video/audio stream is already being generated and, if one does exist, it notifies the outbound routing 1540 to route a copy of the stream to the application/game server 1521-1525 zooming the video; and, if one does not exist, it compresses another copy of the video that is suitable for that user and instructs the outbound routing to send the stream back to the inbound routing 1502 and to the application/game server 1521-1525 zooming the video. This server, now receiving a full-screen version of the selected video, decompresses it and gradually scales it up to full size.
FIG. 18 illustrates how the screen looks after the game has fully zoomed to full screen, and the game is shown at the full resolution of the user's display 422, as indicated by the image pointed to by arrow 1800. The application/game server 1521-1525 running the Game Finder application sends messages to the other application/game servers 1521-1525 that had been providing thumbnails that they are no longer needed, and a message to the hosting service control server 401 that the other games are no longer being viewed. At this point, the only display it is generating is an overlay 1801 at the top of the screen, which provides information and menu controls to the user. Note that, as this game has progressed, the audience has grown to 2,503 viewers. With so many viewers, there are bound to be many viewers with displays 422 of the same or similar resolution (each application/game server 1521-1525 has the ability to scale the video to adjust the fit).
Because the game shown is a multiplayer game, the user may decide to join the game at some point. The hosting service 210 may or may not allow the user to join the game, for a variety of reasons. For example, the user may have to pay to play the game and has chosen not to, the user may not have sufficient ranking to join that particular game (e.g., he would not be competitive with the other players), or the user's Internet connection may not have low enough latency to allow the user to play (e.g., there is no latency constraint for watching games, so a game being played far away, indeed on another continent, can be watched without concern for latency, but to play the game, the latency must be low enough for the user to (a) enjoy the game and (b) be on equal footing with the other players, who may have lower-latency connections). If the user is permitted to play, then the application/game server 1521-1525 that has been providing the Game Finder user interface for the user requests that the hosting service control server 401 initiate (i.e., locate and launch) an application/game server 1521-1525 that is suitably configured for playing the particular game, to load the game from a RAID array 1511-1512, and then the hosting service control server 401 instructs the inbound routing 1502 to transfer the control signals from the user to the application/game server now hosting the game, and it instructs the shared hardware compression 1530 to switch from compressing the video/audio from the application/game server that had been hosting the Game Finder application to compressing the video/audio from the application/game server now hosting the game. The frame timing of the application/game server hosting the Game Finder and that of the new application/game server hosting the game are not synchronized, and as a result there is likely to be a time difference between the two. Because the shared video compression hardware 1530 begins compressing video when an application/game server 1521-1525 completes a video frame, the first frame from the new server may be completed sooner than a full frame time of the old server has elapsed, which may be before the transmission of the prior compressed frame is completed (e.g., consider transmit time 992 of FIG. 9b: if uncompressed frame 3 963 had been completed half a frame time earlier, it would have impinged upon transmit time 992). In such a situation, the shared video compression hardware 1530 ignores the first frame from the new server (e.g., just as frame 4 964 is ignored 974), the client 415 holds the last frame from the old server an extra frame time, and the shared video compression hardware 1530 begins compressing the next frame time of video from the new application/game server hosting the game. Visually, to the user, the transition from one application/game server to the other is seamless. The hosting service control server 401 then notifies the application/game server 1521-1525 that had been hosting the Game Finder to switch to an idle state until it is needed again.
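The rule governing the first frame after a server switch, as described above, reduces to a simple comparison of two times. Below is a minimal Python sketch under the assumption of a 60 fps frame time; the function name and the example times are illustrative only.

    FRAME_PERIOD_S = 1 / 60   # 16.67 ms at 60 fps

    def first_new_frame_action(old_transmit_end_s: float, new_frame_done_s: float) -> str:
        """Decide what to do with the new server's first completed frame.

        old_transmit_end_s -- when the old server's last compressed frame finishes transmitting
        new_frame_done_s   -- when the new server completes its first frame
        """
        if new_frame_done_s < old_transmit_end_s:
            # The new frame would impinge on the prior frame's transmit time,
            # so it is ignored; the client 415 holds the old frame one extra
            # FRAME_PERIOD_S, and compression resumes on the next frame.
            return "drop frame; client repeats previous frame for one frame period"
        return "compress and transmit frame"

    print(first_new_frame_action(old_transmit_end_s=0.010, new_frame_done_s=0.004))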
At this point, the user is able to play the game. And, exceptionally, the game appears to be playable instantly (since it was loaded onto the application/game server 1521-1525 from the RAID array 1511-1512 at gigabit/second speed), and the game is loaded onto a server exactly suited to the game, together with an operating system exactly configured for the game, with ideal drivers, registry configuration (in the case of Windows), and with no other applications running on the server that might compete with the game's operation.
Also, as the user progresses through the game, each segment of the game loads into the server at gigabit/second speed (i.e., 1 gigabyte loads in 8 seconds) from the RAID array 1511-1512, and because of the vast storage capacity of the RAID array 1511-1512 (since it is a shared resource among many users, it can be very large, yet still be cost-effective), geometry setup or other game segment setup can be pre-computed, stored in the RAID array 1511-1512, and loaded extremely rapidly. Moreover, because the hardware configuration and computational capabilities of each application/game server 1521-1525 are known, vertex and pixel shader computations can be pre-computed.
Accordingly, the game starts almost immediately, it runs in an ideal environment, and subsequent segments load almost immediately.
But beyond these advantages, the user is able to watch others playing the game (via the Game Finder described earlier, and other means) and decide whether the game is interesting and, if so, learn techniques from watching others. And the user is able to try a demo of the game instantly, without having to wait for a large download and/or installation, and the user is able to play the game instantly, perhaps on a trial basis for a small fee, or on a longer-term basis. And the user is able to play the game on a Windows PC, a Macintosh, on a television set, at home, while traveling, and even on a mobile phone with a low-latency wireless connection. And all of this can be accomplished without ever physically owning a copy of the game.
As stated previously, the user can decide not to allow his gameplay to be viewable by others, to allow his game to be viewable after a delay, to allow his game to be viewable by selected users, or to allow his game to be viewable by all users. Regardless, the video/audio is stored, in one embodiment, for 15 minutes in the delay buffer 1515, and the user is able to "rewind" and view his prior gameplay, and pause it, play it back slowly, fast-forward through it, etc., just as he can when watching TV with a digital video recorder (DVR). Although in this example the user is playing a game, the same "DVR" capability is available if the user is using an application. This can be helpful in reviewing prior work and in other applications, as detailed below. Further, if the game was designed with a rewind capability based on utilizing game state information, such that the camera view can be changed, etc., then this "3D DVR" capability is also supported, but it requires the game to be designed to support it. The "DVR" capability using the delay buffer 1515 works with any game or application, limited of course to the video that was generated when the game or application was used; but in the case of games with 3D DVR capability, the user can control a "fly-through" in 3D of a previously played segment, have the resulting video recorded in the delay buffer 1515, and have the game state for the game segment recorded. Thus, a particular "fly-through" is recorded as compressed video, but since the game state is also recorded, a different fly-through of the same segment of the game is possible at a later date.
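As a rough illustration of the rolling 15-minute delay buffer 1515 described above, the following Python sketch uses a fixed-capacity ring buffer: the oldest frames fall off automatically, and the last N seconds can be read back out (for example, for the 30-second saved clip described below). The class name and sizes are illustrative assumptions, not the actual implementation.

    from collections import deque

    FPS = 60
    RETENTION_SECONDS = 15 * 60           # 15 minutes, per the embodiment above
    MAX_FRAMES = FPS * RETENTION_SECONDS  # 54,000 frames

    class DelayBuffer:
        def __init__(self):
            # Oldest frames are discarded automatically at capacity.
            self.frames = deque(maxlen=MAX_FRAMES)

        def record(self, compressed_frame: bytes) -> None:
            self.frames.append(compressed_frame)

        def rewind(self, seconds: float) -> list:
            """Return the last `seconds` of video, e.g. to save a short clip."""
            n = min(int(seconds * FPS), len(self.frames))
            return list(self.frames)[-n:] if n else []

    buf = DelayBuffer()
    for i in range(FPS * 60):             # simulate one minute of recording
        buf.record(b"frame%d" % i)
    clip = buf.rewind(30)                 # a 30-second clip
    print(len(clip))                      # -> 1800 frames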
As described below, each user on the hosting service 210 has a User Page, where he can post information about himself and other data. Among other things, users are able to post video segments from gameplay that they have saved. For example, if the user has overcome a particularly difficult challenge in a game, the user can "rewind" to just before the spot of his great accomplishment in the game, and then instruct the hosting service 210 to save a video segment of some duration (e.g., 30 seconds) on the user's User Page for other users to watch. To implement this, the application/game server 1521-1525 that the user is using simply plays back the video stored in the delay buffer 1515 to a RAID array 1511-1512 and then indexes that video segment on the user's User Page.
If the game has 3D DVR capability, as described above, then the game state information required for the 3D DVR can also be recorded by the user and made available on the user's User Page.
In the event that a game is designed to have "spectators" (i.e., users who are able to travel through the 3D world and observe the action without participating in it) in addition to active players, the Game Finder application enables users to join games as spectators as well as players. From an implementation point of view, there is no difference to the hosting service 210 whether a user is a spectator or an active player. The game is loaded onto an application/game server 1521-1525, and the user controls the game (e.g., controlling a virtual camera that views the world). The only difference is the user's experience of the game.
Collaboration of multiple users
Another feature of the hosting service 210 is the ability of multiple users to collaborate while viewing live video, even when using widely disparate viewing devices. This is useful both when playing games and when using applications.
Many PCs and mobile phones are equipped with video cameras and have the ability to perform real-time video compression, particularly when the image is small. Also, small cameras are available that can be attached to a television, and it is not difficult to implement real-time compression, either in software or using one of the many hardware compression devices for compressing video. Also, many PCs and all mobile phones have microphones, and headsets with microphones are available.
Such cameras and/or microphones, combined with local video/audio compression capability (particularly employing the low-latency video compression techniques described herein), enable a user to transmit video and/or audio from the user premises 211 to the hosting service 210, together with the input device control data. When such techniques are employed, the capability illustrated in FIG. 19 is achievable: a user can have his video and audio 1900 appear on the screen within another user's game or application. This example is a multiplayer game where teammates collaborate in a car race. A user's video/audio can be viewed/heard selectively, only by his teammates. And, since there is effectively no latency when the techniques described above are used, the players are able to talk or gesture to each other in real time without perceptible delay.
This video/audio integration is accomplished by having the compressed video and/or audio from a user's camera/microphone arrive as inbound Internet traffic 1501. Then, the inbound routing 1502 routes the video and/or audio to the application/game servers 1521-1525 that are permitted to view/hear the video and/or audio. Then, the users of the respective application/game servers 1521-1525 who choose to use the video and/or audio decompress it and integrate it as desired to appear within the game or application, such as illustrated by 1900.
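The routing step just described is essentially a subscription fan-out: an inbound compressed stream is forwarded only to the application/game servers whose users elected to receive it. Below is a minimal Python sketch of that idea; the class and method names are hypothetical.

    from collections import defaultdict

    class InboundRouting:
        """Stands in for the inbound routing 1502."""
        def __init__(self):
            # source user -> set of subscribed application/game servers
            self.subscribers = defaultdict(set)

        def subscribe(self, source_user: str, server_id: int) -> None:
            self.subscribers[source_user].add(server_id)

        def on_packet(self, source_user: str, packet: bytes) -> list:
            # Forward the compressed A/V packet to every subscribed server;
            # each server then decompresses it and composites it into its
            # game or application output (e.g., window 1900).
            return [(sid, packet) for sid in self.subscribers[source_user]]

    router = InboundRouting()
    router.subscribe("teammate_1", 1521)   # these servers' users elected to
    router.subscribe("teammate_1", 1523)   # see/hear teammate_1's stream
    print(router.on_packet("teammate_1", b"compressed-av"))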
The example of FIG. 19 shows how such collaboration is used in a game, but such collaboration can be an immensely powerful tool for applications. Consider a situation where a large building is being designed by architects in Chicago for a real estate developer based in New York, but the decision involves a financial investor who is traveling and happens to be at an airport in Miami, and a decision needs to be made about certain design elements of the building, based on how it fits in with the buildings near it, to satisfy both the investor and the developer. Suppose the architectural firm has a high-resolution monitor with a camera attached to a PC in Chicago, the developer has a laptop with a camera in New York, and the investor has a mobile phone with a camera in Miami. The architectural firm can use the hosting service 210 to host a powerful architectural design application capable of highly realistic 3D rendering, with access to a large database of the buildings in New York City, as well as a database of the building under design. The architectural design application executes on one, or if it requires a great deal of computational power, on several, of the application/game servers 1521-1525. Each of the 3 users at disparate locations connects to the hosting service 210, and each has a simultaneous view of the video output of the architectural design application, but it is appropriately sized by the shared hardware compression 1530 for the given device and the network connection characteristics that each user has (e.g., the architectural firm may see a 2560×1440, 60 fps image through a 20 Mbps commercial Internet connection, the developer in New York may see a 1280×720, 60 fps image over a 6 Mbps DSL connection on his laptop, and the investor may see a 320×180, 60 fps image over a 250 Kbps cellular data connection on her mobile phone). Each participant hears the voice of the other participants (conference calling is handled by any of the many widely available conference calling software packages in the application/game server(s) 1521-1525) and, by actuating a button on a user input device, a user can bring video of himself onto the screen using his local camera. As the meeting proceeds, the architects are able to show what the building looks like as they rotate it and fly by it next to another building in the area, with extremely photorealistic 3D rendering, and all parties see the same video, at the resolution of each party's display. It does not matter that none of the local devices used by any party is remotely capable of handling 3D animation with such realism, let alone downloading, or even storing, the vast database required to render the surrounding buildings in New York City. From the point of view of each of the users, despite the distance apart, and despite the disparate local devices, they simply have a seamless experience with an incredible degree of realism. And, when one party wants his face to be seen to better convey his emotional state, he can do so. Further, if either the developer or the investor wants to take control of the architectural program and use his own input device (be it a keyboard, mouse, keypad or touch screen), he can, and it will respond with no perceptible latency (assuming his network connection has an acceptable latency).
(In the case of the mobile phone, if the phone is connected to a WiFi network at the airport, it will have very low latency; but if it is using the cellular data networks available today in the US, it will probably suffer a noticeable lag. Still, for most of the purposes of the meeting, where the investor is watching the architects control the building fly-through, or where video teleconferencing is concerned, even cellular latency should be acceptable.)
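The per-participant sizing described above can be illustrated as a simple profile selection. The following Python sketch mirrors the three example figures in the text; the selection rule (choose the largest profile the connection can sustain) and the function name are assumptions made for illustration.

    PROFILES = [
        # (width, height, fps, required downstream bandwidth in bits/s)
        (2560, 1440, 60, 20_000_000),   # commercial Internet connection
        (1280,  720, 60,  6_000_000),   # DSL connection
        ( 320,  180, 60,    250_000),   # cellular data connection
    ]

    def choose_profile(downstream_bps: int):
        """Pick the highest-resolution profile the user's connection supports."""
        for width, height, fps, required in PROFILES:
            if downstream_bps >= required:
                return (width, height, fps)
        raise ValueError("connection too slow for the lowest profile")

    print(choose_profile(20_000_000))   # architects' PC   -> (2560, 1440, 60)
    print(choose_profile(6_000_000))    # developer laptop -> (1280, 720, 60)
    print(choose_profile(250_000))      # investor phone   -> (320, 180, 60)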
Finally, at the end of the collaborative conference call, the developer and the investor will have made their comments and signed off from the hosting service, and the architectural firm will be able to "rewind" the video of the conference, which was recorded in the delay buffer 1515, and review the comments, facial expressions and/or actions applied to the 3D model of the building during the meeting. If there are particular segments they wish to save, those segments of video/audio can be moved from the delay buffer 1515 to a RAID array 1511-1512 for archival storage and later playback.
Also, from a cost point of view, if the architects need to use the computing power and the large database of New York City for only a 15-minute conference call, they need only pay for the time the resources are in use, rather than having to own powerful workstations and purchase an expensive copy of a large database.
Extensive public video services
The hosting service 210 enables an unprecedented opportunity for establishing extensive, video-rich public services on the Internet. FIG. 20 shows an illustrative User Page for a gamer on the hosting service 210. As with the Game Finder application, the User Page is an application that runs on one of the application/game servers 1521-1525. All of the thumbnails and video windows on this page show constantly moving video (if the segments are short, they loop).
Using a video camera, or by uploading video, the user (whose username is "KILLHAZARD") is able to post a video 2000 of himself that other users can view. The video is stored in a RAID array 1511-1512. Also, when other users come to KILLHAZARD's User Page, if KILLHAZARD is using the hosting service 210 at the time, live video 2001 of whatever he is doing is shown (assuming he permits users viewing his User Page to watch him). This is accomplished by the application/game server 1521-1525 hosting the User Page application requesting from the service control system 401 whether KILLHAZARD is active and, if so, which application/game server 1521-1525 he is using. Then, using the same methods used by the Game Finder application, a compressed video stream in a suitable resolution and format is sent to the application/game server 1521-1525 running the User Page application, where it is displayed. If a user selects KILLHAZARD's live gameplay window and then appropriately clicks on his input device, the window zooms up (again using the same methods as the Game Finder application), and the live video fills the screen, at the resolution of the watching user's display 422, as appropriate for the characteristics of the watching user's Internet connection.
A key advantage of this approach over prior art approaches is that the user viewing the User Page is able to see a game played live that the user does not own, and may well not have a local computer or game console capable of playing. It offers a great opportunity for the user to see the user shown on the User Page "in action" playing a game, and it is an opportunity to learn about a game that the viewing user might want to try or get better at.
Video clips 2002 recorded or uploaded by KILLHAZARD's buddies are also shown on the User Page, and underneath each video clip is text indicating whether the buddy is online playing a game (e.g., six_shot is playing the game "Eragon", and MrSnuggles99 is offline, etc.). By clicking on a menu item (not shown), the buddy video clips switch from showing recorded or uploaded videos to live video of whatever the buddies who are currently playing games on the hosting service 210 are doing at that moment in their games. So, it becomes a Game Finder grouped by buddies. If a buddy's game is selected and the user clicks on it, it zooms up to full screen, and the user is able to watch the game played live, at full screen size.
Again, the user viewing the buddy's game does not own a copy of the game, nor does he own the local computing/game console resources to play the game. Viewing the game is effectively instantaneous.
As previously described above, when a user plays a game on the hosting service 210, the user is able to "rewind" the game and find a video segment he wants to save, and then save that video segment to his User Page. These are called "Brag Clips". The video segments 2003 are all Brag Clips 2003 saved by KILLHAZARD from previous games that he has played. Number 2004 shows how many times a Brag Clip has been viewed, and when the Brag Clip is viewed, users have an opportunity to rate it, and the number of orange keyhole-shaped icons 2005 indicates how high the rating is. The Brag Clips 2003 loop constantly while a user views the User Page, along with the rest of the video on the page. If the user selects and clicks on one of the Brag Clips 2003, it zooms up to present the Brag Clip 2003, along with DVR controls to allow the clip to be played, paused, rewound, fast-forwarded, stepped through, etc.
The Brag Clip 2003 playback is implemented by the application/game server 1521-1525 loading the compressed video segment, stored in a RAID array 1511-1512 when the user recorded the Brag Clip, and decompressing it and playing it back.
Brag Clips 2003 can also be "3D DVR" video segments (i.e., a game state sequence from a game that can be replayed and allows the user to change the camera viewpoint) from games that support such capability. In this case, the game state information is stored, in addition to a compressed video recording of the particular "fly-through" the user made when the game segment was recorded. When the User Page is being viewed, and all of the thumbnails and video windows are constantly looping, a 3D DVR Brag Clip 2003 constantly loops the Brag Clip 2003 that was recorded as compressed video when the user recorded the "fly-through" of the game segment. But, when a user selects a 3D DVR Brag Clip 2003 and clicks on it, in addition to the DVR controls that allow the compressed video Brag Clip to be played, the user is able to click on a button that gives him 3D DVR capability for the game segment. He is able to control a camera "fly-through" during the game segment on his own and, if he wishes (and the user who owns the User Page so allows), he is able to record an alternative Brag Clip "fly-through" in compressed video form, which will then be available to other viewers of the User Page (either immediately, or after the owner of the User Page has had a chance to review the Brag Clip).
This 3D DVR Brag Clip 2003 capability is enabled by activating, on another application/game server 1521-1525, the game that is about to replay the recorded game state information. Since the game can be activated almost instantaneously (as previously described), it is not difficult to activate it, with its play limited to the game state recorded by the Brag Clip segment, and then to allow the user to do a "fly-through" with a camera while recording the compressed video to the delay buffer 1515. Once the user has completed the "fly-through", the game is deactivated.
From the user's point of view, activating a "fly-through" with a 3D DVR Brag Clip 2003 is no more effort than controlling the DVR controls of a linear Brag Clip 2003. He may know nothing about the game, or even how to play it. He is just a virtual camera operator peering into a 3D world during a game segment recorded by another.
Users are also able to overdub their own audio onto Brag Clips, either recorded from microphones or uploaded. In this way, Brag Clips can be used to create custom animations using characters and actions from games. This animation technique is commonly known as "machinima".
As users progress through games, they achieve differing skill levels. The games played report these accomplishments to the service control system 401, and these skill levels are shown on User Pages.
Interactive animated advertisements
Online advertisements have transitioned from text, to still images, to video, and now to interactive segments, typically implemented using animation thin clients such as Adobe Flash. The reason animation thin clients are used is that users typically have little patience for being delayed for the privilege of having a product or service pitched to them. Also, thin clients run on very low-performance PCs and, as such, the advertiser can have a high degree of confidence that the interactive ad will work properly. Unfortunately, animation thin clients such as Adobe Flash are limited in the degree of interactivity and the duration of the experience (to mitigate download time).
FIG. 21 illustrates an interactive advertisement in which the user is to select the exterior and interior colors of a car while it rotates in a showroom, as real-time ray tracing shows how the car looks. Then the user chooses an "avatar" to drive the car, and then the user can take the car for a drive, either on a race track or in an exotic locale such as Monaco. The user can select a larger engine or better tires, and then can see how the changed configuration affects the ability of the car to accelerate or hold the road.
Of course, the advertisement is effectively a sophisticated 3D video game. But for such an advertisement to be playable on a PC or a video game console, it might require a 100 MB download and, in the case of a PC, it might require the installation of special drivers, and it might not run at all if the PC lacks adequate GPU or CPU computing capability. Thus, such advertisements are practically impossible in prior art configurations.
On the hosting service 210, such advertisements launch almost instantly and run perfectly, no matter what the capabilities of the user's client 415 are. So, they launch more quickly than thin-client interactive ads, are vastly richer in experience, and are highly reliable.
Streaming geometry during real-time animation
The RAID arrays 1511-1512 and the inbound routing 1502 can provide data rates that are so fast, and with latencies so low, that it is possible to design video games and applications that rely upon the RAID arrays 1511-1512 and the inbound routing 1502 to reliably deliver geometry on-the-fly in the midst of gameplay, or within an application, during real-time animation (e.g., a fly-through with a complex database).
With prior art systems, such as the video game system shown in FIG. 1, the mass storage devices available, particularly in practical home devices, are far too slow to stream geometry in during gameplay, except in situations where the required geometry is somewhat predictable. For example, in a driving game with a specified roadway, the geometry for the buildings coming into view can be reasonably predicted, and the mass storage devices can seek in advance to the location where the upcoming geometry is situated.
But in a complex scene with unpredictable changes (e.g., a battle scene with complex characters all around), if the RAM on the PC or video game system is completely filled with the geometry for the objects currently in view, and then the user suddenly turns his character around to view what is behind him, if the geometry has not been preloaded into RAM, there may be a delay before it can be displayed.
On the hosting service 210, the RAID arrays 1511-1512 can stream data faster than Gigabit Ethernet, and with a SAN it is possible to achieve 10 gigabits/second over 10 Gigabit Ethernet or other network technologies. At 10 gigabits/second, a gigabyte of data loads in less than a second. In a 60 fps frame time (16.67 ms), approximately 170 megabits (roughly 21 megabytes) of data can be loaded. Rotating media, of course, even in a RAID configuration, will still incur latencies greater than a frame time, but flash-based RAID storage will eventually be as large as rotating-media RAID arrays and will not incur such high latency. In one embodiment, massive RAM write-through caching is used to provide very low-latency access.
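A quick check of the figures above, worked in Python (decimal units assumed for the link rate, as is conventional for network speeds):

    link_bps = 10 * 10**9        # 10 Gigabit Ethernet
    frame_period_s = 1 / 60      # 16.67 ms at 60 fps

    bits_per_frame = link_bps * frame_period_s
    print(f"{bits_per_frame / 1e6:.0f} megabits "
          f"(~{bits_per_frame / 8 / 1e6:.0f} MB) per frame time")   # ~167 Mb, ~21 MB
    print(f"1 gigabyte loads in {8 * 10**9 / link_bps:.1f} s")      # 0.8 s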
Thus, with sufficiently high network speed, and sufficiently low-latency mass storage, geometry can be streamed into the application/game servers 1521-1525 as fast as the CPUs and/or GPUs can process the 3D data. So, in the example given previously, where the user turns his character around suddenly and looks back, the geometry for all of the characters behind him can be loaded before the character completes the rotation, and thus, to the user, it will seem as if he or she is in a photorealistic world that is as real as live action.
As discussed previously, one of the last frontiers in photorealistic computer animation is the human face, and because of the sensitivity of the human eye to imperfections, the slightest error in a photoreal face can result in a negative reaction from the viewer. FIG. 22 shows how a live performance captured using Contour™ Reality Capture technology (the subject of co-pending applications: "Apparatus and method for capturing the motion of a performer," Ser. No. 10/942609, filed September 15, 2004; "Apparatus and method for capturing the expression of a performer," Ser. No. 10/942413, filed September 15, 2004; "Apparatus and method for improving marker identification within a motion capture system," Ser. No. 11/066954, filed February 25, 2005; "Apparatus and method for performing motion capture using shutter synchronization," Ser. No. 11/077628, filed March 10, 2005; "Apparatus and method for performing motion capture using a random pattern on capture surfaces," Ser. No. 11/255854, filed October 20, 2005; "System and method for performing motion capture using phosphor application techniques," Ser. No. 11/449131, filed June 7, 2006; "System and method for performing motion capture by strobing a fluorescent lamp," Ser. No. 11/449043, filed June 7, 2006; "System and method for three dimensional capture of stop-motion animated characters," Ser. No. 11/449127, filed June 7, 2006; the rights to each of which belong to the applicant of the present CIP application) results in a very smooth captured surface, and then in a high-polygon-count tracked surface (i.e., the polygon motion precisely follows the motion of the face). Finally, when the video of the live performance is mapped onto the tracked surface to produce a textured surface, a photoreal result is produced.
Although current GPU technology is able to render the number of polygons in the tracked surface and texture, and to light the surface, in real time, if the polygons and textures are changing every frame time (which will produce the most photoreal results), it will quickly consume all of the available RAM of a modern PC or video game console.
Using the streaming geometry techniques described above, it becomes practical to continuously feed geometry into the application/game servers 1521-1525 so that they can animate photoreal faces continuously, making it possible to create video games with faces that are almost indistinguishable from live-action faces.
Integration of linear content with interactive features
Motion pictures, television programming and audio material (collectively, "linear content") are widely available to home and office users in many forms. Linear content can be acquired on physical media, such as CD, DVD, HD-DVD and Blu-ray media. It also can be recorded by DVRs from cable TV and satellite broadcasts. And it is available as pay-per-view (PPV) content via satellite and cable TV, and as video-on-demand (VOD) on cable TV.
Increasingly, linear content is available through the Internet, both as downloaded and as streaming content. Today, there really is not one place to experience all of the features associated with linear media. For example, DVDs and other video optical media typically have interactive features not available elsewhere, such as director's commentaries, "making of" featurettes, etc. Online music sites have cover art and song information generally not available on CDs, but not all CDs are available online. And Web sites associated with television programming often have extra features, blogs and, sometimes, comments from the actors or creative staff.
Further, with many motion pictures or sports events, there are often video games that are released (in the case of motion pictures) along with the linear media, or (in the case of sports) that may be closely tied to real-world events (e.g., the trading of players).
The hosting service 210 is well suited for the delivery of linear content along with its disparate forms of related content. Certainly, delivering motion pictures is no more challenging than delivering highly interactive video games, and the hosting service 210 is able to deliver linear content to a wide range of devices in homes or offices, or to mobile devices. FIG. 23 shows an illustrative user interface page for the hosting service 210 that shows a selection of linear content.
But, unlike most linear content delivery systems, the hosting service 210 is also able to deliver the related interactive components (e.g., the menus and features on DVDs, the interactive overlays on HD-DVDs, and the Adobe Flash animation (as explained below) on Web sites). Thus, the limitations of the client device 415 no longer impose limitations as to which features are available.
Further, the hosting system 210 is able to link linear content with video game content dynamically and in real time. For example, if a user is watching a Quidditch match in a Harry Potter movie and decides she would like to try playing Quidditch, she can just click a button and the movie will pause, and immediately she will be transported to the Quidditch segment of a Harry Potter video game. After playing the Quidditch match, another click of a button, and the movie will resume instantly.
With photoreal graphics and production technology in which photographically captured video is indistinguishable from live-action characters, when a user makes a transition from a Quidditch game in a live-action movie to a Quidditch game in a video game on a hosting service, the two scenes are virtually indistinguishable. This provides entirely new creative options for directors of both linear content and interactive content (e.g., video games), as the lines between the two worlds become indistinguishable.
Using the hosting service architecture shown in FIG. 14, the viewer can be given control of a virtual camera in a 3D movie. For example, in a scene that takes place within a train car, it would be possible to allow the viewer to control the virtual camera and look around the car while the story progresses. This assumes that all of the 3D objects ("assets") in the car are available, as well as an adequate level of computing power capable of rendering the scenes in real time, as well as the original movie.
And, even for non-computer-generated entertainment, there are very exciting interactive features that can be offered. For example, the 2005 motion picture "Pride and Prejudice" had many scenes in ornate old English mansions. For certain mansion scenes, the user may pause the video and then control the camera to take a tour of the mansion, or perhaps of the surrounding area. To implement this, a camera with a fisheye lens can be carried through the mansion while recording its position, much like the prior-art QuickTime VR technology of Apple, Inc. The various frames would then be transformed, so the images are not distorted, then stored in a RAID array 1511-1512 along with the movie, and played back when the user chooses to go on a virtual tour.
With sporting events, a live sporting event, such as a basketball game, may be streamed through the hosting service 210 for users to watch, as they would for regular TV. After users have watched a particular game, a video game of that game (eventually with the basketball players looking just as photoreal as the real players) can come up with the players starting in the same position, and the users (perhaps each taking control of one player) can redo the game to see if they can do better than the players.
The hosting service 210 described herein is extremely well suited to support this futuristic world, because it can bring to bear computing power and mass storage resources that are impractical to install in a home or in most office settings, and also because its computing resources are always up to date, with the latest computing hardware available, whereas in a home setting there will always be homes with older-generation PCs and video games. And, on the hosting service 210, all of this computing complexity is hidden from the user, so even though they may be using very sophisticated systems, from the user's point of view it is as simple as changing channels on a television. Further, users can access all of the computing power, and the experiences that computing power brings, from any client 415.
In cases where a game is a multiplayer game, it is able to communicate both with application/game servers 1521-1525 through the inbound routing 1502 network and, with a network bridge to the Internet (not shown), with servers or game machines that are not running on the hosting service 210. When playing multiplayer games with computers on the general Internet, the application/game servers 1521-1525 will have the benefit of extremely fast access to the Internet (compared to a game running on a server at home), but they will be limited by the capabilities of the other computers playing the game on slower connections, and also potentially limited by the fact that the game servers on the Internet were designed to accommodate the least common denominator, which may be home computers on relatively slow consumer Internet connections.
But when a multiplayer game is played entirely within the server center of the hosting service 210, then a world of difference is achievable. Each application/game server 1521-1525 hosting the game for a user will be interconnected with the other application/game servers 1521-1525, as well as with any servers hosting the central control for the multiplayer game, with extremely high speed, extremely low-latency connectivity and vast, very fast storage arrays. For example, if Gigabit Ethernet is used for the inbound routing 1502 network, then the application/game servers 1521-1525 communicate among each other, and with any servers hosting the central control for the multiplayer game, at gigabit/second speed, with potentially only 1 ms of latency or less. Further, the RAID arrays 1511-1512 will be able to respond very rapidly and then transfer data at gigabit/second speeds. As an example, if a user customizes a character in terms of look and accoutrements, so that the character has a large amount of geometry and behaviors that are unique to it, with prior art systems, limited to a game client running in the home on a PC or game console, if that character were to come into view of another user, that user would have to wait for a long, slow download to complete, so that all of the geometry and behavior data is loaded into his computer. Within the hosting service 210, that same download could be served over Gigabit Ethernet from a RAID array 1511-1512 at gigabit/second speed. Even if the home user had an 8 Mbps Internet connection (which is extremely fast by today's standards), Gigabit Ethernet is 100 times faster. So, what would take a minute over a fast Internet connection would take less than a second over Gigabit Ethernet.
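The download comparison above can be worked through with a hypothetical payload size (the 60 MB figure below is an illustrative assumption, not from the disclosure):

    payload_bytes = 60 * 10**6      # hypothetical custom-character geometry/behavior data

    home_bps = 8 * 10**6            # a fast consumer connection by today's standards
    lan_bps = 10**9                 # Gigabit Ethernet inside the server center

    print(f"home download: {payload_bytes * 8 / home_bps:.0f} s")   # 60 s, about a minute
    print(f"server center: {payload_bytes * 8 / lan_bps:.2f} s")    # 0.48 s, under a second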
Top-player groups and competitions
The hosting service 210 is extremely well suited for competitions. Because no game is running on a local client, there is no opportunity for users to cheat. Also, because of the ability of the outbound routing 1540 to multicast UDP streams, the hosting service 210 is able to broadcast major competitions to thousands of people in the audience at once.
In fact, when there are certain video streams that are so popular that thousands of users are receiving the same stream (e.g., showing the views of a major competition), it may be more efficient to send the video stream to a content delivery network (CDN), such as Akamai or Limelight, for mass distribution to many client devices 415.
A similar level of efficiency can be gained when CDNs are used to show Game Finder pages of top-player groups.
For major competitions, a real celebrity sports commentator can be used to provide commentary during certain matches. Although a large number of users will be watching a major competition, a relatively small number will be playing in it. The audio from the celebrity commentator can be routed to the application/game servers 1521-1525 hosting the users playing in the competition, and hosting any spectator-mode copies of the game in the competition, and that audio can be overdubbed on top of the game audio. Video of the celebrity commentator can be overlaid on the games as well, perhaps just on the spectator views.
Speeding up web page loading
The World Wide Web's primary transport protocol, the Hypertext Transfer Protocol (HTTP), was conceived and defined in an era when only businesses had high-speed Internet connections, and the consumers who were online used dial-up modems or ISDN. At the time, the "gold standard" for a fast connection was the T1 line, which provided a 1.5 Mbps data rate symmetrically (i.e., with equal data rates in both directions).
Today, the situation is completely different. The average home connection speed through cable modem or DSL connections in much of the developed world has a far higher downstream data rate than a T1 line. In fact, in some parts of the world, fiber-to-the-curb technology is bringing data rates of 50 to 100 Mbps into the home.
Unfortunately, HTTP was not architected (nor was it implemented) to effectively take advantage of these dramatic speed improvements. A web site is a collection of files on a remote server. In very simple terms, HTTP requests the first file, waits for the file to be downloaded, then requests the second file, waits for it to be downloaded, etc. In fact, HTTP allows for more than one "open connection", i.e., more than one file to be requested at a time, but because of agreed-upon standards (and a desire to prevent web servers from being overloaded), only very few open connections are permitted. On top of that, because of the way web pages are constructed, browsers often are not aware of multiple files that could be available for immediate download (i.e., only after parsing a page does it become apparent that a new file, such as an image, needs to be downloaded). Thus, the files on a web site are essentially loaded one after another. And, because of the request-and-response protocol used by HTTP, there is roughly (accessing typical web servers in the US) a 100 ms latency associated with each file loaded.
With relatively low-speed connections, this does not introduce much of a problem, because the download time for the files themselves dominates the wait time for web pages. But, as connection speeds grow, especially with complex web pages, problems begin to arise.
FIG. 24 shows the amount of time that elapses before a web page becomes live as the connection speed grows. At a 1.5 Mbps connection speed 2401, using a conventional web server with a conventional web browser, it takes 13.5 seconds for the web page to become live. At a 12 Mbps connection speed 2402, the load time is reduced to 6.5 seconds, or about twice as fast. But at a 96 Mbps connection speed 2403, the load time is only reduced to approximately 5.5 seconds. The reason for this is that at such a high download speed, the download time of the files themselves is minimal, but the latency per file, roughly 100 ms each, still remains, resulting in 54 files × 100 ms = 5.4 seconds of latency. Thus, no matter how fast the connection to the home is, this web site will always take at least 5.4 seconds to become live. Another factor is server-side queuing: every HTTP request is added to the back of a queue, so on a busy server this has a significant impact, because for every small item to be fetched from the web server, the HTTP requests must wait their turn.
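A simplified model reproduces the shape of the curve in FIG. 24: load time is the transfer time plus a fixed per-file latency. The Python sketch below assumes the 54 files and ~100 ms per-file latency from the text, and picks a total page size so that the 1.5 Mbps case comes out at 13.5 seconds; it is an illustration of the argument, not measured data.

    PER_FILE_LATENCY_S = 0.100   # ~100 ms per file, per the text
    NUM_FILES = 54               # files on the example web page

    def page_load_time(total_bytes: int, connection_bps: float) -> float:
        transfer = total_bytes * 8 / connection_bps
        return transfer + NUM_FILES * PER_FILE_LATENCY_S

    # Page size chosen so 1.5 Mbps gives 13.5 s: 13.5 - 5.4 s latency = 8.1 s transfer.
    PAGE_BYTES = int(8.1 * 1_500_000 / 8)

    for bps in (1_500_000, 12_000_000, 96_000_000):
        print(f"{bps / 1e6:g} Mbps -> {page_load_time(PAGE_BYTES, bps):.1f} s")
    # -> 13.5 s, 6.4 s, 5.5 s: transfer time shrinks with speed,
    #    but the 5.4 s latency floor remains, matching the figures above.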
One way to solve these issues is to discard HTTP or to redefine it. Or, perhaps the owner of the web site could better consolidate its files into a single file (e.g., in Adobe Flash format). But, as a practical matter, this company, as well as many others, has a large investment in its web site architecture. Further, while some homes have 12-100 Mbps connections, the majority of homes still have slower speeds, and HTTP does work well at slow speeds.
One alternative is to host web browsers on application/game servers 1521-1525, and to host the files for the web servers on the RAID arrays 1511-1512 (or, potentially, in RAM or on local storage on the application/game servers 1521-1525 hosting the web browsers). Because of the very fast interconnect through the inbound routing 1502 (or to local storage), rather than incurring 100 ms of latency per file with HTTP, there will be de minimis latency per file with HTTP. Then, instead of having the user in his home access the web page through HTTP, the user can access the web page through the client 415. Then, even with a 1.5 Mbps connection (because the video of this web page does not require much bandwidth), the web page will be live in less than 1 second, per line 2400. Essentially, there will be no wait for the live web page displayed by the web browser running on the application/game server 1521-1525 to appear, and there will be no perceptible wait for the client 415 to display the video output from the web browser. As the user mouses around and/or types on the web page, the user's input information is sent to the web browser running on the application/game server 1521-1525, and the web browser responds accordingly.
One disadvantage of this approach is that if the compression device is constantly transmitting video, then bandwidth is used even if the web page becomes static. This can be remedied by configuring the compression device to transmit data only when (and if) the web page changes, and then, only to the parts of the page that change. While there are some web pages with flashing banners, etc. that are constantly changing, such web pages tend to be annoying, and usually web pages are static unless there is a reason for something to be moving (e.g., a video clip). For such web pages, it is likely that less data would be transmitted using the hosting service 210 than with a conventional web server, because only the images actually displayed would be transmitted, with no thin-client executable code, and no large objects that may never be viewed, such as rollover images.
So, by using the hosting service 210 to host legacy web pages, web page load times can be reduced to the point where opening a web page is like changing channels on a television: the web page becomes live effectively instantly.
Debugging games and applications
As mentioned previously, video games and applications with real-time graphics are very complex applications and, typically, when they are released into the field, they contain bugs. Although software developers do get bug reports (feedback) from users, and they may have some means of passing back machine state after a crash, it is very difficult to identify exactly what caused a game or a real-time application to crash or to perform improperly.
When a game or application runs on the hosting service 210, the video/audio output of the game or application is constantly recorded in the delay buffer 1515. Further, a watchdog process runs on each application/game server 1521-1525, which reports regularly to the hosting service control system 401 that the application/game server 1521-1525 is running smoothly. If the watchdog process fails to report in, the server control system 401 will attempt to communicate with the application/game server 1521-1525 and, if successful, will collect whatever machine state is available. Whatever information is available, along with the video/audio recorded by the delay buffer 1515, will be sent to the software developer.
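The watchdog arrangement just described might be sketched as follows. This is a minimal Python illustration; the heartbeat interval, the missed-report threshold and the class/method names are all assumptions made for the sketch, not details from the disclosure.

    import time

    HEARTBEAT_INTERVAL_S = 5.0    # assumed reporting period
    MISSED_LIMIT = 3              # heartbeats missed before a server is suspect

    class ControlSystem:
        """Stands in for the hosting service control system 401."""
        def __init__(self):
            self.last_seen = {}

        def heartbeat(self, server_id: int) -> None:
            # Called regularly by the watchdog process on each server.
            self.last_seen[server_id] = time.monotonic()

        def check(self, server_id: int) -> None:
            silent_s = time.monotonic() - self.last_seen.get(server_id, 0.0)
            if silent_s > MISSED_LIMIT * HEARTBEAT_INTERVAL_S:
                # Server stopped reporting: collect whatever machine state is
                # reachable, pair it with the delay buffer 1515 recording, and
                # send everything to the software developer.
                report = self.collect_machine_state(server_id)
                self.send_to_developer(server_id, report)

        def collect_machine_state(self, server_id: int) -> dict:
            return {"server": server_id, "state": "whatever is reachable"}

        def send_to_developer(self, server_id: int, report: dict) -> None:
            print(f"crash report for server {server_id}: {report}")

    cs = ControlSystem()
    cs.heartbeat(1521)
    cs.check(1521)   # healthy: nothing is reported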
Thus, when the developer of the game or application software gets notified of a crash by the hosting service 210, he gets a frame-by-frame record of what led up to the crash. This information can be extremely valuable in tracking down bugs and fixing them.
Also note that when an application/game server 1521-1525 crashes, the server is restarted at the most recent restartable point, and a message is provided to the user apologizing for the technical difficulty.
Resource sharing and cost reduction
The system shown in FIGS. 4a and 4b provides many benefits for both end users and developers of games and applications. For example, typically, home and office client systems (e.g., PCs or game consoles) are in use for only a small fraction of the hours of the week. According to an October 5, 2006 press release on the Nielsen Entertainment "Active Gamer Benchmark Study" (http://www.prnewswire.com/cgi-bin/stories.pl?ACCT=104&STORY=/www/story/10-05-2006/0004446115&EDATE=), active gamers spend an average of 14 hours a week playing on video game consoles and about 17 hours a week on handhelds. The report also states that, for all game playing activity (including console, handheld and PC game playing), active gamers average 13 hours a week. Considering the higher figure for console game playing time, and that there are 24 × 7 = 168 hours in a week, this implies that in an active gamer's home, a video game console is in use for only 17/168 = 10% of the hours of the week. Or, 90% of the time, the video game console is idle. Given the high cost of video game consoles, and the fact that manufacturers subsidize such devices, this is a very inefficient use of an expensive resource. PCs within businesses are also typically used for only a fraction of the hours of the week, especially desktop PCs often required for high-end applications such as Autodesk Maya. Although some businesses operate at all hours and on holidays, and some PCs (e.g., portables brought home for doing work in the evening) are used at all hours and on holidays, most business activity tends to be centered around 9 AM to 5 PM, in a given business's time zone, Monday through Friday, minus weekends and break times (such as lunch), and since most PC usage occurs while the user is actively engaged with the PC, it follows that desktop PC utilization tends to follow these hours of operation. If one assumes that PCs are utilized constantly from 9 AM to 5 PM, 5 days a week, that would imply PCs are utilized 40/168 = 24% of the hours of the week. High-performance desktop PCs are very expensive investments for businesses, and this reflects a very low level of utilization. Schools teaching on desktop computers may use the computers for an even smaller fraction of the week and, although it varies depending upon the hours of teaching, most teaching occurs during the daytime hours from Monday through Friday. So, in general, PCs and video game consoles are utilized for only a small fraction of the hours of the week.
Notably, since many people work at businesses or at school during the daytime hours from Monday through Friday, these people generally do not play video games during these hours, and so, when they do play video games, it is generally at other times, such as evenings, weekends and holidays.
Given the configuration of the hosting service shown in FIG. 4a, the usage patterns described in the preceding two paragraphs result in a very efficient use of resources. Obviously, there is a limit to the number of users that can be served by the hosting service 210 at a given time, particularly if the users require real-time responsiveness for complex applications like sophisticated 3D video games. But, unlike a video game console in the home or a PC used by a business, which typically sit idle most of the time, servers 402 can be reused by different users at different times. For example, a high-performance server 402 with high-performance dual CPUs and dual GPUs and a large quantity of RAM can be utilized by businesses and schools from 9 AM to 5 PM on weekdays, but be utilized by gamers playing sophisticated video games in the evenings, on weekends and on holidays. Similarly, low-performance applications can be utilized by businesses and schools on a low-performance server 402 with a Celeron CPU, no GPU (or a very low-end GPU) and limited RAM during business hours, and a low-performance game can utilize a low-performance server 402 during non-business hours.
Further, with the hosting service arrangement described herein, resources are shared among thousands, if not millions, of users. In general, only a small percentage of the total user base of an online service uses the service at a given time. Considering the Nielsen video game usage statistics listed previously, it is easy to see why. If active gamers play console games only 17 hours a week, and if we assume that the peak usage time for games is during the typical non-work, non-school evening hours (5 PM to midnight, 7 hours × 5 days = 35 hours/week) and weekend hours (8 AM to midnight, 16 hours × 2 days = 32 hours/week), then there are 35 + 32 = 67 peak hours a week for 17 hours of gameplay. The exact peak user load on the system is difficult to estimate for many reasons: some users will play during off-peak times, there may be certain daytime clustering peaks of users, the peak times can be affected by the type of game played (e.g., children's games will probably be played earlier in the evening), etc. But, given that the average number of hours a gamer plays is far less than the number of hours of the day when a gamer is likely to be playing a game, only a fraction of the users of the hosting service 210 will be using it at a given moment. For the sake of this analysis, we shall assume the peak load is 12.5%. Thus, only 12.5% of the compute, compression and bandwidth resources are in use at a given time, resulting in only 12.5% of the hardware cost to support a given user playing a game at a given level of performance, due to the reuse of resources.
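The utilization arithmetic from the preceding paragraphs, worked end to end in Python. The 12.5% peak-load figure is the assumption stated in the text, not a measured value.

    hours_per_week = 24 * 7                       # 168
    console_use = 17 / hours_per_week             # ~10% of the week in use
    pc_use = (8 * 5) / hours_per_week             # 9-to-5 weekdays: ~24%

    evening_peak = 7 * 5                          # 5 PM to midnight, weekdays: 35 h
    weekend_peak = 16 * 2                         # 8 AM to midnight, weekends: 32 h
    peak_window = evening_peak + weekend_peak     # 67 peak hours per week

    peak_load = 0.125                             # assumed fraction of users online at peak
    print(f"console in use {console_use:.0%}, PC in use {pc_use:.0%} of the week")
    print(f"{peak_window} peak hours for ~17 hours of play; "
          f"hardware cost per user scales with the {peak_load:.1%} peak load")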
Further, given that some games and applications require more computing power than others, resources may be allocated dynamically based on the game played or the applications run by users. So, a user selecting a low-performance game or application will be allocated a low-performance (less expensive) server 402, and a user selecting a high-performance game or application will be allocated a high-performance (more expensive) server 402. Indeed, a given game or application may have lower-performance and higher-performance sections, and the user can be switched from one server 402 to another server 402 between sections of the game or application, so as to keep the user running on the lowest-cost server 402 that meets the needs of the game or application. Note that the RAID arrays 405, which are far faster than a single disk, are available even to the low-performance servers 402, which get the benefit of the faster disk transfer rates. So, the average cost per server 402 across all of the games being played or applications being used is much less than the cost of the most expensive server 402 that runs the highest-performance game or application, yet even the low-performance servers 402 derive disk-performance benefits from the RAID arrays 405.
Further, a server 402 in the hosting service 210 may be nothing more than a PC motherboard without a disk or peripheral interfaces other than a network interface and, in time, may be integrated down to a single-chip integrated circuit with just a fast network interface to the SAN 403. Also, the RAID arrays 405 are likely to be shared among far more users than there are disks, so the disk storage cost per active user will be far less than the cost of one disk drive. All of this equipment is likely to reside in racks in an environmentally controlled server-room environment. If a server 402 fails, it can be readily repaired or replaced at the hosting service 210. By contrast, a PC or game console in the home or office must be a sturdy, standalone appliance that has to survive reasonable wear and tear from being banged or dropped, requires a housing, has at least one disk drive, has to survive adverse environmental conditions (e.g., being crammed into an overheated AV cabinet along with other gear), requires a service warranty, has to be packaged and shipped, and is sold by a retailer who is likely to collect a retail margin. Further, a PC or game console must be configured to meet the peak performance of the most computationally demanding anticipated game or application to be used at some point in the future, even though lower-performance games or applications (or sections of games or applications) may be played most of the time. And, if a PC or console fails, getting it repaired is an expensive and time-consuming process (adversely affecting the manufacturer, the user and the software developer).
Accordingly, given that the system depicted in FIG. 4a provides a user experience comparable to that of a local computing resource, it is much cheaper to provide a user in a home, office or school with that level of computing capability through the architecture depicted in FIG. 4a.
Eliminating the need to upgrade
In addition, users no longer have to worry about upgrading PCs and/or consoles to play new games or run new, higher-performance applications. Any game or application in the hosting service 210, regardless of what type of server 402 that game or application requires, is accessible to the user, and all games and applications run almost immediately (i.e., with fast loading from the RAID arrays 405 or local storage on the servers 402) and with the latest updates and bug fixes properly applied (i.e., software developers can choose the ideal server configuration for the server(s) 402 running a given game or application, then configure the server(s) 402 with the optimal drivers, and over time deliver updates, bug fixes, etc. to all copies of the game or application in the hosting service 210 simultaneously). Indeed, after the user starts using the hosting service 210, the user is likely to find that games and applications keep providing a better experience (for example, through updates and/or bug fixes), and a user may discover a year later that a new game or application has become available in the service 210 which uses computing technology (for example, a higher-performance GPU) that did not even exist a year before, so it would have been impossible a year earlier for the user to buy that technology in order to run the game or application a year later. Since the computing resource that runs the game or application is invisible to the user (i.e., from the user's point of view, the user simply selects a game or application that begins running almost immediately, much as if the user had changed television channels), the user's hardware will have been "upgraded" without the user even being aware of the upgrade.
Another significant issue for users in companies, schools and homes is backups. Information stored on a local PC or video game console (for example, in the case of a console, a user's game ratings and achievements) may be lost if a disk fails or through accidental erasure. Many applications provide manual or automatic backups for PCs, and game console state can be uploaded to an online server for backup, but local backups are usually copied to another local disk (or other storage device) that must be kept somewhere secure and kept organized, while backups to online services are often limited by the low upstream data rate available through typical low-cost Internet connections. With the hosting service 210 of FIG. 4a, the data stored in the RAID disk arrays 405 can be configured, using prior art RAID configuration techniques well known to those skilled in the art, so that if a disk fails no data is lost: a technician at the server center housing the failed disk is notified, replaces the disk, and the RAID array is then automatically rebuilt so that it is fault-tolerant once again. Moreover, since all the disk drives sit near one another, with fast local networking between them over the SAN 403, it is straightforward to arrange at the server center for regular backups of all the disk systems onto secondary storage, which can be kept at the server center or moved off-site. From the point of view of the users of the hosting service 210, their data is simply secure all the time, and they never have to think about backups.
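The fault-tolerance claim rests on standard parity-based RAID techniques. A toy XOR-parity sketch, illustrating the general principle behind RAID-4/5-style redundancy rather than this patent's specific configuration, shows why the loss of a single disk loses no data:

```python
# Minimal XOR-parity illustration of why a RAID-style array survives a
# single disk failure; a toy model, not the hosting service's actual scheme.
import secrets

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data_disks = [secrets.token_bytes(8) for _ in range(3)]  # three data disks
parity = xor_blocks(data_disks)                          # one parity disk

failed = 1                                               # disk 1 dies
survivors = [d for i, d in enumerate(data_disks) if i != failed]
recovered = xor_blocks(survivors + [parity])             # XOR of the rest

assert recovered == data_disks[failed]
print("failed disk reconstructed from parity")
```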
Users often want to try out games or applications before buying them. As described previously, there are prior art means for demoing games and applications (the verb "demo" means to try out a demonstration version, which as a noun is also called a "demo"), but each has limitations and/or drawbacks. Using the hosting service 210, users can try out demos easily and conveniently. Indeed, all the user does is select the demo through a user interface (for example, the one described below) and try it. The demo loads almost immediately onto a server 402 suited to it, and it runs like any other game or application. Whether the demo requires a very high-performance server 402 or a low-performance server 402, and no matter what type of home or office client 415 the user uses, from the user's point of view the demo just works. The software publisher of the game or application demo can control exactly which demo the user is allowed to try and for how long, and, of course, the demo may include user interface elements that offer the user access to the full version of the demonstrated game or application.
Since demos are likely to be offered below cost or free of charge, some users may try to use demos repeatedly (particularly game demos that may be fun to play over and over). The hosting service 210 can employ various techniques to limit demo use for a given user. The most direct approach is to establish a user ID for each user and limit the number of demo trials allowed for that user ID. A user, however, may set up multiple user IDs, especially if they are free. One way to address this is to limit the number of demo trials allowed for a given client 415. If the client is a standalone device, the device has a serial number, and the hosting service 210 can limit the number of times the demo can be accessed by a client with that serial number. If the client 415 runs as software on a PC or other device, then a serial number can be assigned by the hosting service 210, stored on the PC and used to limit demo use; but given that PCs can be reprogrammed by users, and the serial number erased or changed, another option is for the hosting service 210 to record the media access control (MAC) address of the PC's network adapter (and/or other machine-specific identifiers, for example hard disk drive serial numbers, etc.) and limit demo use to it. Given that the MAC addresses of network adapters can be changed, however, this is not a foolproof method. Another approach is to limit the number of demo trials for a given IP address. Although IP addresses may be periodically reassigned by DSL and cable modem providers, in practice this does not happen very often, and if it can be determined (for example, by contacting the ISP) that an IP address is within a block of addresses allocated to residential DSL or cable modem connections, then a small number of demo uses can typically be established for a given household. Also, there may be multiple devices in a home behind a NAT router sharing the same IP address, but typically in a residential setting there will only be a limited number of such devices. If the IP address is within a block serving businesses, then a larger number of demos can be established for a business. Ultimately, though, a combination of all the previously mentioned approaches is the best way to limit the number of demos per PC. Although there may be no foolproof way to limit the number of repeated demo uses by a determined and technically skilled user, creating a large number of obstacles can build a sufficient deterrent that abusing the demo system becomes not worth the trouble for most PC users, who will instead use demos as intended: to try out new games and applications.
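A sketch of the combined approach recommended above, counting demo sessions against several identifiers at once so that resetting any single one is not enough; the identifier names, helper function, and trial limit are all illustrative assumptions:

```python
# Count demo sessions against user ID, client serial, MAC and IP together;
# illustrative only -- not the hosting service's actual enforcement logic.
from collections import defaultdict

MAX_TRIALS = 3
trial_counts = defaultdict(int)   # identifier -> number of demo sessions

def may_run_demo(user_id, client_serial, mac_addr, ip_addr):
    ids = [f"user:{user_id}", f"serial:{client_serial}",
           f"mac:{mac_addr}", f"ip:{ip_addr}"]
    # Deny if ANY identifier has already exhausted its allowance.
    if any(trial_counts[i] >= MAX_TRIALS for i in ids):
        return False
    for i in ids:
        trial_counts[i] += 1
    return True

print(may_run_demo("alice", "SN123", "aa:bb:cc", "203.0.113.5"))  # True
for _ in range(3):
    may_run_demo("bob", "SN123", "dd:ee:ff", "198.51.100.7")
print(may_run_demo("carol", "SN123", "11:22:33", "192.0.2.9"))    # False: serial reused
```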
Benefits for schools, companies and other institutions
Companies, schools and other institutions that use the system depicted in FIG. 4a derive particular advantages. Companies and schools have substantial costs associated with installing, maintaining and upgrading PCs, especially PCs that run high-performance applications such as Maya. As stated earlier, PCs are generally used only a fraction of the hours of the week, and, as in the home, the cost of a PC with a given level of performance is far higher in a school or office environment than in a server center environment.
In the case of larger companies or schools (for example, large universities), it may be practical for the IT departments of such organizations to set up server centers and support computers accessed remotely over LAN-grade connections. A number of solutions exist for accessing computers remotely over a LAN or through a private high-bandwidth connection between offices. For example, with Microsoft Windows Terminal Server, with virtual network computing applications such as VNC from RealVNC, Ltd., or with thin client offerings from Sun Microsystems, users can gain remote access to PCs or servers, with a range of quality in graphical response time and user experience. Further, such self-managed server centers are typically dedicated to a single company or school and, as such, cannot exploit the overlap of usage that is possible when disparate applications (for example, entertainment applications and business applications) use the same computing resources at different times of the week. Accordingly, many companies and schools lack the scale, resources or expertise to set up on their own a server center with a LAN-speed network connection to each user. Indeed, a large percentage of schools and businesses have the same Internet connections (for example, DSL, cable modems) as homes.
Yet such organizations may still have a need, either constant or periodic, for very high-performance computing. A small architectural firm, for example, may have only a small number of architects with relatively modest computing needs for design work, but it may periodically need very high-performance 3D computation (for example, when creating a 3D fly-through of a new architectural design for a client). The system depicted in FIG. 4a is extremely well suited to such organizations. These organizations need nothing more than the same kind of network connection offered to homes (for example, DSL, cable modems), which is typically very inexpensive. They can either use inexpensive PCs as clients 415 or dispense with PCs altogether and use inexpensive dedicated devices that simply implement the control signal logic 413 and the low-latency decompression of video 412, as sketched below. These features are particularly attractive to schools that may have problems with PC theft or with damage to the delicate components inside PCs.
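A structural sketch of such a dedicated device's two duties, the control signal logic 413 and the video decompression 412. The decode step is stubbed and the streams are simulated with in-memory queues, since a real device would use a hardware low-latency video decoder and a network transport:

```python
# Conceptual thin-client loop: one path decompresses the incoming video
# stream (412), the other forwards user input as control signals (413).
import queue

incoming_video = queue.Queue()   # stands in for the video stream from the service
outgoing_ctrl = queue.Queue()    # stands in for the upstream control channel

def decompress(packet: bytes) -> bytes:
    return packet[::-1]          # stub: a real client recovers video frames here

def on_user_input(event: str):
    outgoing_ctrl.put(event.encode())   # control signal logic: just forward upstream

# Simulated session: the service sends two frames, the user presses a button.
incoming_video.put(b"frame-1-compressed")
incoming_video.put(b"frame-2-compressed")
on_user_input("BUTTON_A_DOWN")

while not incoming_video.empty():
    frame = decompress(incoming_video.get())
    print(f"display {len(frame)}-byte frame")
print(f"sent control signal: {outgoing_ctrl.get()!r}")
```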
This arrangement solves a number of problems for such organizations (and many of these advantages are also shared by home users doing general-purpose computing). For one, the operating cost (which ultimately must be passed back in some form to the users in order to have a viable business) can be much lower because (a) the computing resources are shared with other applications that have different peak usage times during the week, (b) the organizations can gain access to (and incur the cost of) high-performance computing resources only when needed, and (c) the organizations do not have to provide resources for backing up or otherwise maintaining the high-performance computing resources.
Elimination of piracy
In addition, games, applications, interactive films, etc. can no longer be pirated as they are today. Because each game is executed at the service center, users are not provided with access to the underlying program code, so there is nothing to pirate. Even if a user were to copy the source code, the user could not execute it on a standard game console or home computer. This opens up markets in places in the world, such as China, where standard video games are not available. Resale of used games is likewise not possible.
For game developers, there are fewer market discontinuities than there are today. The hosting service 210 can be gradually updated over time as gaming requirements change, in contrast to the current situation, in which a completely new generation of console technology forces users and developers to upgrade and the game developer depends on the timely delivery of the hardware platform.
Streaming interactive video
The above descriptions cover a wide range of applications made possible by the new fundamental concept of low-latency streaming interactive video (which implicitly includes audio along with the video, as used herein) built on general Internet technologies. Prior art systems that stream video over the Internet have only enabled applications that can tolerate high-latency interaction. For example, basic playback controls for linear video (for example, pause, rewind, fast forward) work adequately with high latency, and it is possible to select among linear video feeds. And, as stated earlier, the nature of some video games allows them to be played with high latency. But the high latency (or low compression ratio) of prior art streaming-video approaches severely limits the potential applications of streaming video, or narrows their deployment to specialized network environments, and even in such environments prior art techniques impose substantial loads on the network. The technology described herein opens the door to a wide range of applications made possible by low-latency streaming interactive video over the Internet, in particular those enabled through consumer-grade Internet connections.
Indeed, with client devices as small as the client 465 of FIG. 4c being sufficient to provide an enhanced user experience backed by an effectively arbitrary amount of computing power, arbitrary amounts of fast storage, and extremely fast networking among powerful servers, a new era of computing becomes possible. Moreover, because the bandwidth requirements do not grow as the computing power of the system grows (i.e., the bandwidth requirements are tied only to display resolution, quality and frame rate), once broadband Internet connectivity is ubiquitous (for example, through widespread low-latency wireless coverage), reliable, and of high enough bandwidth to meet the needs of all users' displays 422, the question will be whether thick clients (for example, PCs or mobile phones running Windows, Linux, OSX, etc.), or even thin clients (for example, Adobe Flash or Java), are necessary at all for typical consumer and business applications.
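The claim that the stream's bandwidth tracks the display rather than the server's compute can be illustrated with simple arithmetic; the bits-per-pixel figure below is an illustrative assumption for a compressed low-latency stream, not a value taken from this document:

```python
# Bandwidth of the compressed stream depends only on resolution, frame rate
# and quality (bits per pixel) -- never on how much compute rendered it.
def stream_mbps(width, height, fps, bits_per_pixel):
    return width * height * fps * bits_per_pixel / 1e6

for name, w, h in [("720p", 1280, 720), ("1080p", 1920, 1080)]:
    print(f"{name}: {stream_mbps(w, h, 60, 0.08):.1f} Mbps at 60 fps")
# The same stream cost applies whether the server renders a simple 2D menu
# or a GPU-heavy 3D scene.
```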
The advent of streaming interactive video prompts a rethinking of assumptions about computing system architectures. An example is the hosting service 210 server center embodiment depicted in FIG. 15. The video path through the delay buffer and/or group video 1550 is a feedback loop in which the streaming interactive video output of the app/game servers 1521-1525, multicast to users, is fed back into the app/game servers 1521-1525 either in real time via path 1552 or after a selectable delay via path 1551. This enables a wide range of practical applications (for example, those depicted in FIG. 16, FIG. 17 and FIG. 20) that would be either impossible or infeasible with prior art local computing architectures or server architectures. More generally, as an architectural feature, what the feedback loop 1550 provides is recursion at the streaming interactive video level, since video can be looped back indefinitely as an application requires it. This enables a wide range of application possibilities never available before.
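A toy model of the feedback loop 1550, with the live path 1552 and the selectable-delay path 1551; this is purely conceptual, not the server center's implementation:

```python
# Frames leaving an app/game server can be fed back live (path 1552) or
# after a selectable delay (path 1551), enabling recursive composition.
from collections import deque

class DelayBuffer:
    def __init__(self, delay_frames: int):
        self.buf = deque([None] * delay_frames)

    def push(self, frame):
        self.buf.append(frame)
        return self.buf.popleft()    # frame from `delay_frames` steps ago

live_path, delayed_path = [], DelayBuffer(delay_frames=3)

for n in range(6):
    frame = f"frame-{n}"
    live_path.append(frame)                 # path 1552: real-time feedback
    old = delayed_path.push(frame)          # path 1551: delayed feedback
    if old is not None:
        print(f"compositing {frame} with delayed {old}")
```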
Another key architectural feature is that the video streams are unidirectional UDP streams. This enables an effectively arbitrary degree of multicasting of streaming interactive video (by comparison, two-way streams, such as TCP/IP streams, would create ever more traffic congestion on the network from the back-and-forth communication as the number of users grows). Multicasting is an important capability within the server center because it allows the system to respond to the growing need of Internet users (and, indeed, of the world's population) for one-to-many, and even many-to-many, communication. Again, the examples discussed herein, such as FIG. 16, which illustrates the use of both streaming interactive video recursion and multicasting, are just the tip of a very large iceberg of possibilities.
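A minimal sketch of one-way UDP multicast delivery, the transport property the paragraph above attributes to the video streams: one send reaches every subscriber in the group, and no acknowledgements flow back. The group address (from the administratively scoped multicast range), port and payload are illustrative:

```python
# One datagram per video packet; the network fans it out to every receiver
# that has joined GROUP -- the sender receives nothing in return.
import socket

GROUP, PORT = "239.1.2.3", 5004

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)  # stay on-link
sender.sendto(b"compressed-video-packet-0001", (GROUP, PORT))
sender.close()
```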
In one embodiment, the various functional modules illustrated herein and the associated steps may be performed by specific hardware components that contain hardwired logic for performing the steps, such as an application-specific integrated circuit ("ASIC"), or by any combination of programmed computer components and custom hardware components.
In one embodiment, said modules may be implemented on a programmable digital signal processor ("DSP") such as a Texas Instruments TMS320x architecture (for example, a TMS320C6000, TMS320C5000, etc.). Various other DSPs may be used while still complying with these underlying principles.
Embodiments may include various steps as set forth above. The steps may be embodied in machine-executable instructions that cause a general-purpose or special-purpose processor to perform certain steps. Various elements that are not relevant to these underlying principles, such as computer memory, hard drives and input devices, have been left out of the drawings to avoid obscuring the pertinent aspects.
Elements of the disclosed subject matter may also be provided as a computer-readable storage medium for storing computer-executable instructions. The computer-readable storage medium may include, for example, flash memory, optical disks, CD-ROMs, DVD-ROMs, RAM, EPROM, EEPROM, magnetic or optical cards, or other types of computer-readable media suitable for storing electronic instructions. For example, the present invention may be downloaded as a computer program which may be transferred from a remote computer (for example, a server) to a requesting computer (for example, a client) by way of data signals embodied in a carrier wave or other propagation medium via a communication link (for example, a modem or network connection).
It should also be understood that elements of the disclosed subject matter may be provided as a computer program product which may include a computer-readable storage medium having stored thereon instructions which may be used to program a computer (for example, a processor or other electronic device) to perform a sequence of operations. Alternatively, the operations may be performed by a combination of hardware and software. The computer-readable storage medium may include, for example, floppy disks, optical disks, CD-ROMs and magneto-optical disks, ROM, RAM, EPROM, EEPROM, magnetic or optical cards, or other types of media suitable for storing electronic instructions. For example, elements of the disclosed subject matter may be downloaded as a computer program product, wherein the program may be transferred from a remote computer or electronic device to a requesting process by way of data signals embodied in a carrier wave or other propagation medium via a communication link (for example, a modem or network connection).
Additionally, although the disclosed subject matter has been described in conjunction with specific embodiments, numerous modifications and alterations are possible within the scope of the present disclosure. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
a hosting service comprising one or more servers that execute one or more twitch video games or applications to produce one or more streams of compressed, low-latency streaming interactive video transmitted over a network, including public network components, to at least one client device remote from at least one of the servers;
the client device comprising:
a network interface communicatively coupled to a bus and communicatively coupling the client device to the hosting service over the network;
audio and video output means configured to receive and decompress the compressed, low-latency streaming interactive video received from the network interface;
a controller configured to receive control signals from user input devices and to forward the control signals over the network to one or more of the servers, the servers executing the one or more twitch video games or applications in response to the control signals to produce the one or more streams of compressed, low-latency streaming interactive video;
wherein the operations of receiving the control signals and forwarding them to the servers, executing the one or more twitch video games or applications on the servers, producing the streams of compressed, low-latency streaming interactive video, and decompressing the compressed, low-latency streaming interactive video are performed with such latency that a user interacting with at least one of the one or more twitch video games or applications perceives the twitch video game or application as responding instantly to the control signals;
a central processor running a client management application.
Priority Applications (3)
|Application Number||Priority Date||Filing Date||Title|
|PCT/US2008/085605 WO2009073830A1 (en)||2007-12-05||2008-12-04||Streaming interactive video client apparatus|
|Publication Number||Publication Date|
|RU2010127328A RU2010127328A (en)||2012-01-10|
|RU2500022C2 true RU2500022C2 (en)||2013-11-27|
Family Applications (1)
|Application Number||Title||Priority Date||Filing Date|
|RU2010127328/08A RU2500022C2 (en)||2007-12-05||2008-12-04||Streaming interactive video client apparatus|
Country Status (10)
|EP (1)||EP2232379A4 (en)|
|JP (1)||JP2011507350A (en)|
|KR (1)||KR20100114881A (en)|
|CN (1)||CN101918936A (en)|
|AU (1)||AU2008333832A1 (en)|
|CA (1)||CA2707707A1 (en)|
|NZ (1)||NZ585910A (en)|
|RU (1)||RU2500022C2 (en)|
|TW (4)||TW200943075A (en)|
|WO (1)||WO2009073830A1 (en)|
Families Citing this family (44)
|Publication number||Priority date||Publication date||Assignee||Title|
|US8661496B2 (en)||2002-12-10||2014-02-25||Ol2, Inc.||System for combining a plurality of views of real-time streaming interactive video|
|US9108107B2 (en)||2002-12-10||2015-08-18||Sony Computer Entertainment America Llc||Hosting and broadcasting virtual events using streaming interactive video|
|US8832772B2 (en)||2002-12-10||2014-09-09||Ol2, Inc.||System for combining recorded application state with application streaming interactive video output|
|US8549574B2 (en)||2002-12-10||2013-10-01||Ol2, Inc.||Method of combining linear content and interactive content compressed together as streaming interactive video|
|US20090118019A1 (en)||2002-12-10||2009-05-07||Onlive, Inc.||System for streaming databases serving real-time applications used through streaming interactive video|
|US8893207B2 (en)||2002-12-10||2014-11-18||Ol2, Inc.||System and method for compressing streaming interactive video|
|US8840475B2 (en)||2002-12-10||2014-09-23||Ol2, Inc.||Method for user session transitioning among streaming interactive video servers|
|US9003461B2 (en)||2002-12-10||2015-04-07||Ol2, Inc.||Streaming interactive video integrated with recorded video segments|
|US8949922B2 (en)||2002-12-10||2015-02-03||Ol2, Inc.||System for collaborative conferencing using streaming interactive video|
|US8495678B2 (en)||2002-12-10||2013-07-23||Ol2, Inc.||System for reporting recorded video preceding system failures|
|US8468575B2 (en)||2002-12-10||2013-06-18||Ol2, Inc.||System for recursive recombination of streaming interactive video|
|US9032465B2 (en)||2002-12-10||2015-05-12||Ol2, Inc.||Method for multicasting views of real-time streaming interactive video|
|US8387099B2 (en)||2002-12-10||2013-02-26||Ol2, Inc.||System for acceleration of web page delivery|
|US8613673B2 (en)||2008-12-15||2013-12-24||Sony Computer Entertainment America Llc||Intelligent game loading|
|US8926435B2 (en)||2008-12-15||2015-01-06||Sony Computer Entertainment America Llc||Dual-mode program execution|
|US8147339B1 (en)||2007-12-15||2012-04-03||Gaikai Inc.||Systems and methods of serving game video|
|KR20170129967A (en)||2010-09-13||2017-11-27||소니 인터랙티브 엔터테인먼트 아메리카 엘엘씨||A method of transferring a game session, over a communication network, between clients on a computer game system including a game server|
|US8171148B2 (en)||2009-04-17||2012-05-01||Sling Media, Inc.||Systems and methods for establishing connections between devices communicating over a network|
|US9723319B1 (en)||2009-06-01||2017-08-01||Sony Interactive Entertainment America Llc||Differentiation for achieving buffered decoding and bufferless decoding|
|US8968087B1 (en)||2009-06-01||2015-03-03||Sony Computer Entertainment America Llc||Video game overlay|
|US8888592B1 (en)||2009-06-01||2014-11-18||Sony Computer Entertainment America Llc||Voice overlay|
|JP5481693B2 (en) *||2009-07-21||2014-04-23||コクヨ株式会社||Video call system, calling terminal, receiving terminal, program|
|US9015225B2 (en)||2009-11-16||2015-04-21||Echostar Technologies L.L.C.||Systems and methods for delivering messages over a network|
|US9178923B2 (en)||2009-12-23||2015-11-03||Echostar Technologies L.L.C.||Systems and methods for remotely controlling a media server via a network|
|US9275054B2 (en)||2009-12-28||2016-03-01||Sling Media, Inc.||Systems and methods for searching media content|
|TWI468014B (en)||2010-03-30||2015-01-01||Ibm||Interactively communicating a media resource|
|US8771064B2 (en)||2010-05-26||2014-07-08||Aristocrat Technologies Australia Pty Limited||Gaming system and a method of gaming|
|KR20110132680A (en) *||2010-06-03||2011-12-09||주식회사 토비스||Game machine|
|US9113185B2 (en)||2010-06-23||2015-08-18||Sling Media Inc.||Systems and methods for authorizing access to network services using information obtained from subscriber equipment|
|US8676591B1 (en)||2010-08-02||2014-03-18||Sony Computer Entertainment America Llc||Audio deceleration|
|EP2609520B1 (en)||2010-09-13||2018-05-30||Sony Computer Entertainment America LLC||Add-on management|
|CN102457694A (en) *||2010-10-26||2012-05-16||芯讯通无线科技（上海）有限公司||Mobile terminal with HDMI (High-Definition Multimedia Interface), control device, system and control method|
|TWI450208B (en) *||2011-02-24||2014-08-21||Acer Inc||3d charging method, 3d glass and 3d display apparatus with charging function|
|TWI550408B (en) *||2011-04-22||2016-09-21||晨星半導體股份有限公司||Multi-core electronic system and speed adjusting device thereof|
|CN102917275A (en) *||2011-08-02||2013-02-06||英华达（上海）科技有限公司||Streaming media playing system and playing method thereof|
|CN103517144A (en) *||2012-06-29||2014-01-15||深圳市快播科技有限公司||Multi-screen interaction adapter and display device|
|KR101502806B1 (en) *||2013-08-28||2015-03-18||건국대학교 산학협력단||System and method for streaming based lod for navigation|
|US9900362B2 (en)||2014-02-11||2018-02-20||Kiswe Mobile Inc.||Methods and apparatus for reducing latency shift in switching between distinct content streams|
|DE102014011339A1 (en) *||2014-07-30||2016-02-04||Exaring Ag||Smart HDMI streaming adapter|
|US9432734B2 (en)||2014-09-10||2016-08-30||Telefonaktiebolaget L M Ericsson (Publ)||Multi-person and multi-device content personalization|
|EP3325116A1 (en)||2015-07-24||2018-05-30||Gorillabox GmbH I. G.||Method and telecommunications network for streaming and for reproducing applications|
|CN105898548A (en) *||2015-12-10||2016-08-24||乐视致新电子科技（天津）有限公司||HDMI (High Definition Multimedia Interface) based video output wireless adaption method, device and system|
|EP3507958A1 (en)||2016-09-03||2019-07-10||Gorillabox GmbH||Method for streaming and reproducing applications via a particular telecommunications system, telecommunications network for carrying out the method, and use of a telecommunications network of this type|
|US10416954B2 (en) *||2017-04-28||2019-09-17||Microsoft Technology Licensing, Llc||Streaming of augmented/virtual reality spatial audio/video|
|Publication number||Priority date||Publication date||Assignee||Title|
|RU46397U1 (en) *||2005-03-05||2005-06-27||Еремеев Владимир Сергеевич||Information transmission system|
|US7135985B2 (en) *||2002-04-11||2006-11-14||Koninklijke Philips Electronics N. V.||Controlling a home electronics system|
|EP1777966A1 (en) *||2005-10-20||2007-04-25||Siemens Aktiengesellschaft, A German Corporation||Decomposition of a H.264-decoder on a playstation|
|US7292588B2 (en) *||2001-05-01||2007-11-06||Milley Milton E||Wireless network computing|
Family Cites Families (9)
|Publication number||Priority date||Publication date||Assignee||Title|
|CN1110964C (en) *||1997-08-08||2003-06-04||联华电子股份有限公司||Adaptive access priority selecting method for memory in MPEG circuit|
|GB0219509D0 (en) *||2001-12-05||2002-10-02||Delamont Dean||Improvements to interactive TV games system|
|JP2003289553A (en) *||2002-03-28||2003-10-10||Sanyo Electric Co Ltd||Image data processor and stereoscopic image display system|
|US7356588B2 (en) *||2003-12-16||2008-04-08||Linear Technology Corporation||Circuits and methods for detecting the presence of a powered device in a powered network|
|EP1869599A2 (en) *||2005-03-21||2007-12-26||Yosef Mizrahi||Method, system and computer-readable code for providing a computer gaming service|
|US20060282855A1 (en) *||2005-05-05||2006-12-14||Digital Display Innovations, Llc||Multiple remote display system|
|JP2006350919A (en) *||2005-06-20||2006-12-28||Matsushita Electric Ind Co Ltd||Video distribution terminal and video distributing method|
|US20070011712A1 (en) *||2005-07-05||2007-01-11||White Technologies Group||System for multimedia on demand over internet based network|
|JP4463237B2 (en) *||2006-04-28||2010-05-19||株式会社ソニー・コンピュータエンタテインメント||Communication device, game device, wireless game controller, and game system|
- 2008-12-04 KR KR1020107014739A patent/KR20100114881A/en not_active Application Discontinuation
- 2008-12-04 NZ NZ585910A patent/NZ585910A/en unknown
- 2008-12-04 JP JP2010537090A patent/JP2011507350A/en active Pending
- 2008-12-04 AU AU2008333832A patent/AU2008333832A1/en not_active Abandoned
- 2008-12-04 CA CA2707707A patent/CA2707707A1/en not_active Abandoned
- 2008-12-04 TW TW097147259A patent/TW200943075A/en unknown
- 2008-12-04 RU RU2010127328/08A patent/RU2500022C2/en active
- 2008-12-04 CN CN2008801193004A patent/CN101918936A/en not_active Application Discontinuation
- 2008-12-04 WO PCT/US2008/085605 patent/WO2009073830A1/en active Application Filing
- 2008-12-04 TW TW098125719A patent/TW200949567A/en unknown
- 2008-12-04 EP EP08857774A patent/EP2232379A4/en not_active Ceased
- 2008-12-04 TW TW098123940A patent/TW201001177A/en unknown
- 2008-12-04 TW TW098122810A patent/TW200942305A/en unknown
Patent Citations (4)
|Publication number||Priority date||Publication date||Assignee||Title|
|US7292588B2 (en) *||2001-05-01||2007-11-06||Milley Milton E||Wireless network computing|
|US7135985B2 (en) *||2002-04-11||2006-11-14||Koninklijke Philips Electronics N. V.||Controlling a home electronics system|
|RU46397U1 (en) *||2005-03-05||2005-06-27||Еремеев Владимир Сергеевич||Information transmission system|
|EP1777966A1 (en) *||2005-10-20||2007-04-25||Siemens Aktiengesellschaft, A German Corporation||Decomposition of a H.264-decoder on a playstation|
Also Published As
|Publication number||Publication date|
|AU2008333803B2 (en)||System for reporting recorded video preceding system failures|
|AU2010202242B2 (en)||System for recursive recombination of streaming interactive video|
|AU2008333797B2 (en)||System for combining a plurality of views of real-time streaming interactive video|
|AU2010229095B2 (en)||System and method for multi-stream video compression using multiple encoding formats|
|AU2008333804B2 (en)||System for acceleration of web page delivery|
|CN101918937B (en)||System for collaborative conferencing using streaming interactive video|
|CA2756692C (en)||System and method for utilizing forward error correction with video compression|
|CA2707608C (en)||Method for multicasting views of real-time streaming interactive video|
|EP2451184A1 (en)||System and method for remote-hosted video effects|
|CN101897183B (en)||Method of combining linear content and interactive content compressed together as streaming interactive video|
|EP2656888A2 (en)||Method for user session transitioning among streaming interactive video servers|
|CN105227952B (en)||The method implemented by computer and system for executing video compress|
|AU2008333821B2 (en)||System for combining recorded application state with application streaming interactive video output|
|US9756349B2 (en)||User interface, system and method for controlling a video stream|
|US9707481B2 (en)||System for streaming databases serving real-time applications used through streaming interactive video|
|US9108107B2 (en)||Hosting and broadcasting virtual events using streaming interactive video|
|US10369465B2 (en)||System and method for streaming game video|
|US9192859B2 (en)||System and method for compressing video based on latency measurements and other feedback|
|US9003461B2 (en)||Streaming interactive video integrated with recorded video segments|
|US10071308B2 (en)||System and method for capturing text for an online application|
|US8366552B2 (en)||System and method for multi-stream video compression|
|US9643084B2 (en)||System and method for compressing video frames or portions thereof based on feedback information from a client device|
|CA2756686C (en)||Methods for streaming video|
|US8839336B2 (en)||System for recursive recombination of streaming interactive video|
|US9061207B2 (en)||Temporary decoder apparatus and method|
|PC41||Official registration of the transfer of exclusive right||
Effective date: 20150522