EP1763767A4 - Discrete graphics system and method - Google Patents
Discrete graphics system and method
- Publication number
- EP1763767A4 (EP05773703A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- gpu
- computer system
- serial bus
- coupled
- graphics
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4027—Coupling between buses using bus bridges
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
Definitions
- the present writing is generally related to computer implemented graphics. More particularly, the present invention is directed towards a highly scalable graphics processor for graphics applications. This writing discloses a method and system for stand alone graphics independent of computer system form factor.
- the rendering of three-dimensional (3D) graphical images is of interest in a variety of electronic games and other applications.
- Rendering is the general term that describes the overall multi-step process of transitioning from a database representation of a 3D object to a pseudo realistic two-dimensional projection of the object onto a viewing surface.
- the rendering process involves a number of steps, such as, for example, setting up a polygon model that contains the information which is subsequently required by shading/texturing processes, applying linear transformations to the polygon mesh model, culling back facing polygons, clipping the polygons against a view volume, scan converting/rasterizing the polygons to a pixel coordinate set, and shading/lighting the individual pixels using interpolated or incremental shading techniques.
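- The back-face culling step named in the rendering process can be illustrated with a minimal Python sketch. The winding convention (counter-clockwise vertices mark a front face), the eye position at the origin, and all function names are assumptions made for illustration; the source does not specify them.

```python
# Back-face culling sketch. Assumptions (not from the source): triangles are
# tuples of three (x, y, z) vertices, front faces are wound counter-clockwise
# as seen from the eye, and the eye sits at the origin by default.

def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def is_back_facing(triangle, eye=(0.0, 0.0, 0.0)):
    """True when the triangle's normal points away from the eye."""
    v0, v1, v2 = triangle
    normal = cross(sub(v1, v0), sub(v2, v0))
    to_eye = sub(eye, v0)
    return dot(normal, to_eye) <= 0.0


def cull_back_faces(triangles, eye=(0.0, 0.0, 0.0)):
    """Keep only the triangles that face the eye."""
    return [t for t in triangles if not is_back_facing(t, eye)]
```

- Culling before clipping and rasterization reduces the number of polygons the later steps have to process.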
- GPUs are specialized integrated circuit devices that are commonly used in graphics systems to accelerate the performance of a 3D rendering application. GPUs are commonly used in conjunction with a central processing unit (CPU) to generate 3D images for one or more applications executing on a computer system. Modern GPUs typically utilize a graphics pipeline for processing data.
- a modern GPU can comprise an integrated circuit device having over 200 million transistors and running at several hundred megahertz. Such a modern GPU can consume hundreds of watts of power and require carefully designed thermal protection components (e.g., heat sink fans, access to adequate airflow, etc.).
- GPU subsystems (e.g., GPU graphics cards)
- the ATX form factor refers to the widely used industry standard motherboard form factor supported by the leading industry manufacturers. Such manufacturers include, for example, CPU manufacturers, chipset manufacturers, motherboard manufacturers, and the like.
- the ATX form factor allows a limited amount of space for a card-based GPU.
- a typical card-based GPU connects to the motherboard via an AGP slot.
- the AGP slot has a limited amount of space for the components of the card-based GPU.
- the limited amount of space directly impacts the efficiency of the thermal protection components of the card-based GPU.
- the available power (e.g., the specified voltages and currents) of the AGP connection has become increasingly insufficient.
- the BTX form factor refers to a more recent industry standard motherboard form factor.
- the BTX form factor is generally considered the next generation ATX follow on specification for a "desktop" PC chassis and, as with the earlier ATX form factor, is widely supported by the leading industry manufacturers.
- Unfortunately, the BTX form factor presents even more problems with respect to high-performance GPU subsystems.
- the BTX form factor is problematic in that the BTX design rules place a number of constraints on the form and performance of the GPU subsystem. For example, BTX design rules locate the desktop computer system's CPU at the front entry point for cooling airflow, while positioning the GPU subsystem (e.g., graphics card) in its downstream airflow and adding restrictions on the GPU subsystem's physical dimensions (e.g., x-y-z size), available air flow, available thermal dissipation, and power delivery.
- the future evolution of GPU subsystems for laptop computers is constrained by the fact that the laptop chassis (e.g., motherboard platform, case, airflow, etc.) is optimized for the requirements of CPUs and their associated chipsets. This optimization limits the available thermal dissipation budget, power delivery, and physical dimensions (e.g., x-y-z size) for any graphics subsystem implementation.
- PCI Express is one such standard.
- Some versions of the PCI Express standard specify a maximum power available for a coupled device (e.g., 150W prescribed by the PCI SIG specification for PCI Express Graphics).
- the requirements of high-end GPU implementations may greatly exceed the specified maximum power available.
- some versions of the PCI Express standard specify an insufficient amount of bandwidth between the GPU subsystem and the rest of the computer system platform (e.g., system memory, CPU, etc.). The insufficient bandwidth limits the upward scalability of the GPU subsystem performance by bottlenecking data pathways between the GPU subsystem and the computer system platform resources.
- Embodiments of the present invention provide a method and system for stand alone graphics independent of computer system form factor. Embodiments of the present invention should eliminate data transfer bandwidth constraints and form factor constraints that limit the upward scalability of a GPU subsystem.
- the present invention is implemented as a discrete graphics system (DGS) for executing 3D graphics instructions for a computer system.
- the discrete graphics system includes one or more GPUs for executing 3D graphics instructions and a DGS system chassis configured to house the GPU(s).
- a serial bus connector is built into the DGS system chassis and is configured to couple to the GPU(s).
- the serial bus connector is configured to removably connect the DGS and the GPU(s) to the computer system.
- the GPU(s) of the DGS access the computer system via the serial bus connector to execute the 3D graphics instructions for the computer system.
- the rendered 3D data is then transmitted back to the computer system for presentation on a display coupled to the computer system.
- the rendered 3D data is sent to a display directly coupled to the DGS for presentation to the user.
- the DGS uses multiple card-based GPUs.
- the GPUs can be implemented as single GPU add-in graphics cards (e.g., one GPU per card) or multi-GPU add-in graphics cards (e.g., two or more GPUs per card).
- multiple add-in graphics cards are used wherein each card has two or more GPUs.
- the present invention is implemented as a DGS (discrete graphics system) unit.
- the DGS unit includes a system chassis configured to house a GPU, and a GPU mounting unit coupled to the system chassis and configured to receive the GPU.
- a serial bus connector is coupled to the chassis and is coupled to the GPU mounting unit, wherein the serial bus connector is configured to removably connect the GPU to a computer system to enable the GPU to access the computer system via the serial bus connector and execute the 3-D graphics instructions for the computer system.
- a power supply is coupled to the system chassis for supplying power to the GPU independent of the computer system.
- the DGS unit includes a thermal management system for cooling the GPU and the power supply.
- the DGS unit includes an acoustic management system for controlling the operation of the thermal management system and the power supply to limit the noise produced by the DGS unit.
- the DGS uses multiple card-based GPUs.
- the GPUs can be implemented as single GPU add-in graphics cards (e.g., one GPU per card) or multi-GPU add-in graphics cards (e.g., two or more GPUs per card). In one embodiment, multiple add-in graphics cards are used wherein each card has two or more GPUs.
- the present invention is implemented as a scalable discrete graphics system (DGS).
- the DGS includes a serial bus bridge (e.g., PCI Express) configured to couple a plurality of GPUs to a serial bus.
- a serial bus connector is coupled to the serial bus bridge.
- a system chassis is coupled to the serial bus bridge and the serial bus connector and is configured to house the GPUs.
- the serial bus connector is configured to removably connect to a computer system.
- the GPUs access the computer system via the serial bus bridge and the serial bus connector to cooperatively execute 3-D graphics instructions from the computer system.
- the serial bus bridge is configured to enable an upward scaling of a 3-D rendering performance of the computer system by coupling at least one new GPU to function with an existing GPU.
- the new GPU and the existing GPU cooperatively execute graphics instructions from the computer system.
- the multiple GPU graphics system uses multiple card-based GPUs.
- the GPUs can be implemented as single GPU add-in graphics cards (e.g., one GPU per card) or multi-GPU add-in graphics cards (e.g., two or more GPUs per card).
- multiple add-in graphics cards are used wherein each card has two or more GPUs.
- Figure 1 shows a computer system in accordance with one embodiment of the present invention.
- FIG. 2 shows a DGS in accordance with one embodiment of the present invention wherein the DGS is coupled to drive a display.
- FIG. 3 shows a DGS in accordance with one embodiment of the present invention wherein the DGS is configured to utilize the display coupled directly to a computer system.
- Figure 4 shows certain components of a computer system and a bus in accordance with one embodiment of the present invention.
- Figure 5 shows certain components of a computer system in accordance with one embodiment of the present invention.
- Figure 6 shows a diagram depicting the manner in which a DGS in accordance with one embodiment of the present invention connects to a computer system via PCI Express connectors.
- Figure 7 shows internal components of a DGS in accordance with one embodiment of the present invention.
- Figure 8 shows an exemplary configuration of the internal components of the DGS in accordance with one embodiment of the present invention.
- Figure 9 shows a scalable DGS in accordance with one embodiment of the present invention.
- Figure 10 shows a graph illustrating the increase in rendering performance as additional GPUs are added to a DGS in accordance with one embodiment of the present invention.
- Figure 11 shows an AGP based card mounted GPU in accordance with one embodiment of the present invention.
- Figure 12 shows a PCI Express based card mounted GPU in accordance with one embodiment of the present invention.
- Figure 13 shows a block diagram depicting internal components of a multiple GPU (graphics processor unit) graphics system in accordance with one embodiment of the present invention.
- Figure 14 shows a graph depicting the range of operation available to a multiple GPU graphics system in accordance with one embodiment of the present invention.
- Figure 15 shows a diagram depicting the manner in which the respective graphics instruction workload is executed by each of the GPUs.
- Figure 16 shows a side view of a DGS in accordance with one embodiment of the present invention.
- Figure 17 shows a front view of the DGS in accordance with one embodiment of the present invention.
- Figure 18 shows a view of the DGS with the chassis cover removed in accordance with one embodiment of the present invention.
- FIG 19 shows a view of the chassis cover of the DGS as it is being closed in accordance with one embodiment of the present invention.
- Figure 20 shows a view of the DGS connected to a laptop computer system via a PCI Express cable in accordance with one embodiment of the present invention.
- Figure 21 shows a view of the DGS driving the display of the laptop computer system in accordance with one embodiment of the present invention.
- Computer system 100 in accordance with one embodiment of the present invention provides the execution platform for implementing certain software-based functionality of the present invention.
- the computer system 100 includes a CPU 101 and a system memory 102.
- a discrete graphics system (e.g., hereafter DGS) 110 is coupled to the CPU 101 and the system memory 102 via a bus 115 and a bridge 120.
- the system memory 102 stores instructions and data for both the CPU 101 and the DGS 110.
- the DGS 110 accesses the system memory 102 via the bridge 120.
- the bridge 120 communicates with the DGS 110 via the bus 115 and functions by bridging the respective data formats of the bus 115 and the computer system 100.
- the computer system 100 includes any type of computing device, including, without limitation, a desktop computer, server, workstation, laptop computer, computer-based simulator, palm-sized computer and other portable/handheld devices such as a personal digital assistant, tablet computer, game console, cellular telephone, smart phone, handheld gaming systems and the like.
- the computer system 100 embodiment of Figure 1 shows the basic components of a computer system coupled to utilize a DGS 110 to execute 3D graphics instructions.
- the DGS 110 includes at least one GPU for executing 3D graphics instructions.
- the GPU(s) is enclosed within a DGS system chassis configured to house the GPU(s) and provide the necessary resources for its optimal operation.
- the DGS 110 includes a serial bus connector to couple to the bus 115 and thereby couple the DGS 110 to the bridge component 120.
- the bus 115 is a PCI Express serial bus.
- the GPU(s) of the DGS 110 access the computer system via the serial bus 115 to execute 3D graphics instructions for the computer system. In this manner, the DGS 110 provides a discrete graphics system that is separate and independent from the resources/constraints of the computer system 100. Internal components of the DGS 110 are described in greater detail below (e.g., Figure 7, etc.).
- FIG. 2 shows the DGS 110 in accordance with one embodiment of the present invention wherein the DGS 110 is coupled to directly drive a display 201 (e.g., an LCD display, CRT display, etc.).
- the DGS 110 includes the components (e.g., frame buffers, DACs, etc.) necessary to drive the display 201.
- the display 201 is coupled to the DGS 110 via, for example, a display adapter cable 202 (e.g., analog video cable, digital video cable, or the like).
- the DGS 110 embodiment of Figure 2 provides an advantage in that rendered video data (e.g., frames of rendered 3D video) can be sent directly to the display 201 as opposed to being sent over the bus 115 to the computer system 100. This has the effect of reducing bandwidth demands placed on the bus 115.
- Figure 3 shows a DGS 310 in accordance with one embodiment of the present invention wherein the DGS 310 is configured to utilize the display 201 as coupled directly to a computer system 300 (e.g., as opposed to being connected to the DGS as in the Figure 2 embodiment).
- the DGS 310 embodiment is configured to transmit rendered video data back to the computer system 300 using the available bandwidth of the bus 115.
- the DGS 310 functions with the components of the computer system 300 (e.g., CPU 301, system memory 302, bridge 320, and the computer system GPU 330) to present the rendered video data on the display 201.
- the resources of the computer system GPU 330 (e.g., frame buffers, DACs, etc.) are used to drive the display 201.
- the DGS 310 embodiment of Figure 3 provides an advantage in that the resources available in a typical desktop or laptop computer system can be used to drive the display 201. This allows the DGS 310 to be more easily connected and used by a typical computer system. For example, when the performance benefits of a powerful 3D rendering system are desired, the DGS 310 can hot plug to the computer system 300 and immediately begin driving its display 201, as opposed to forcing a user to disconnect the display 201 from the computer system 300 and reconnect the display to the DGS 310.
- FIG 4 shows certain components of a computer system 400 and a bus 415 in accordance with one embodiment of the present invention.
- the bus 415 is a PCI Express bus.
- the PCI Express bus 415 couples a DGS 410 to a PCI Express bridge 420 of computer system 400.
- the PCI Express bridge 420 provides the internal data transfer bandwidth between the CPU 401, system memory 402, and the peripheral devices (e.g., disk drive 421, DVD drive 422, and the like).
- PCI Express bus 415 provides a number of advantages.
- PCI Express comprises a serial bus standard that serializes data for much more efficient transfer in comparison to older parallel bus standards (e.g., AGP, etc.).
- the PCI Express standard defines increased bandwidth transfer modes whereby multiple "lanes" can be combined to scale data transfer bandwidth.
- the typical PCI Express bus connecting a graphics subsystem to system memory is specified as a "16 lane" bus, whereby 16 serial PCI Express data pathways are linked to provide 16 times the data transfer bandwidth of a single lane PCI Express bus. If more bandwidth is needed, an additional number of PCI Express lanes can be used to implement the bus 415.
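- The lane scaling described above can be worked through with a short Python sketch. The per-lane figures are those of first-generation PCI Express (2.5 GT/s with 8b/10b encoding); the source names no generation, so that choice, along with the function names, is an assumption for illustration.

```python
# Lane-scaling sketch. Assumption (not from the source): first-generation
# PCI Express signalling, i.e. 2.5 GT/s per lane per direction with 8b/10b
# encoding, so every 10 bits on the wire carry one byte of data.
GEN1_TRANSFERS_PER_SECOND = 2_500_000_000
LINE_BITS_PER_DATA_BYTE = 10


def lane_bandwidth_bytes_per_s() -> int:
    """Usable bytes per second, per direction, of a single lane."""
    return GEN1_TRANSFERS_PER_SECOND // LINE_BITS_PER_DATA_BYTE


def link_bandwidth_bytes_per_s(lanes: int) -> int:
    """Usable bytes per second, per direction, of a link with `lanes` lanes."""
    if lanes < 1:
        raise ValueError("a link needs at least one lane")
    return lanes * lane_bandwidth_bytes_per_s()


if __name__ == "__main__":
    for width in (1, 4, 16):
        print(width, "lanes:", link_bandwidth_bytes_per_s(width), "bytes/s")
```

- Under these assumptions a 16 lane link carries 16 times the single-lane figure, matching the scaling described for the bus 415.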
- the PCI Express bus 415 can be much longer than the older parallel buses.
- prior art AGP buses could not be more than several millimeters long without risking data skew and data corruption. This effectively forced the GPU to be located, or plugged, directly onto a computer system's motherboard.
- a PCI Express bus cable can be more than one meter long, allowing the DGS 410 to be completely removed (e.g., located some distance away) from the chassis of the computer system 400.
- FIG. 5 shows certain components of a computer system 500 in accordance with one embodiment of the present invention.
- a PCI Express North bridge 424 and a PCI Express South bridge 425 are used in place of a single bridge 420 as in computer system 400 of Figure 4.
- Computer system 500 shows a typical North bridge/South bridge configuration whereby the North bridge 424 provides memory master/memory controller functionality for the system memory 402 and the South bridge 425 provides data transfer bandwidth for the peripheral devices (e.g., disk drive 421, DVD drive 422, and the like).
- FIG. 6 shows a diagram depicting the manner in which a DGS in accordance with one embodiment of the present invention connects to a computer system via PCI Express connectors 601 and 602.
- the PCI Express standard provides a hot plug capability whereby devices can be connected to and disconnected from a PCI Express bus while the computer system remains powered on. This allows the DGS 410 to be plugged into the computer system 400 virtually on demand. For example, when high-performance 3D rendering is desired (e.g., for a high fidelity real-time 3D rendering application), the DGS 410 can be simply plugged in to provide the necessary performance.
- a PCI Express bus cable 415 can be more than one meter long, allowing the DGS 410 to be completely removed from the chassis of the computer system 500.
- FIG. 7 shows internal components of a DGS 710 in accordance with one embodiment of the present invention.
- the DGS 710 comprises a chassis separate from the computer system chassis. This chassis includes a DGS bridge 720 for coupling to the PCI Express bus 415, one or more GPUs 730, a power supply 721, a thermal management system 722, and an acoustic management system 723.
- the DGS 710 embodiment includes one or more GPUs for executing graphics instructions from a coupled computer system (e.g., computer system 500, etc.). As described above, the graphics instructions are received from the computer system via the PCI Express bus 415.
- the independent power supply 721 is for providing power to DGS components independent of a computer system's power supply.
- power supply requirements for future GPU performance increases can evolve independent of any external constraints of any industry-standard computer system configurations (e.g., ATX form factor standards, BTX form factor standards, etc.).
- the thermal management system 722 is for providing a source of cooling independent of a computer system's cooling configuration.
- cooling requirements for future GPU performance increases can evolve independent of any external constraints (e.g., BTX cooling standards, etc.).
- the thermal management system 722 can comprise heat sink fans, heat pipe mechanisms, liquid cooling mechanisms, or the like.
- the acoustic management system 723 is for providing acoustic management mechanisms/algorithms which function independent of a computer system's cooling, power, or operating constraints. For example, specialized sound absorbing materials can be used in the chassis of the DGS 710. Similarly, special operating modes can be used to control the speed/operation of the power supply 721 and thermal management system 722 of the DGS 710 to reduce noise.
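- One way to picture the special operating modes mentioned above is a fan controller whose speed is capped in a quiet mode. The following Python sketch is hypothetical: the temperature thresholds, the duty-cycle cap, the safety override, and the function name are all assumptions, not values or mechanisms taken from the source.

```python
# Acoustic-management sketch. Every number and name here is an assumption
# for illustration; the source only says that operating modes can control
# fan and power supply operation to reduce noise.

def fan_duty(temp_c, quiet_mode,
             idle_temp_c=40.0, max_temp_c=85.0,
             quiet_cap=0.6, critical_temp_c=95.0):
    """Return a fan duty cycle between 0.2 and 1.0 for a GPU temperature.

    The duty rises linearly from idle_temp_c to max_temp_c. In quiet mode
    it is capped at quiet_cap to limit noise, unless the temperature has
    reached critical_temp_c, in which case cooling takes priority.
    """
    demand = (temp_c - idle_temp_c) / (max_temp_c - idle_temp_c)
    duty = min(1.0, max(0.2, demand))
    if quiet_mode and temp_c < critical_temp_c:
        duty = min(duty, quiet_cap)
    return duty
```

- Because the DGS has its own chassis, a policy of this kind can be tuned without regard to the fan behavior of the coupled computer system.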
- Figure 8 shows an exemplary configuration of the internal components of the DGS 710 in accordance with one embodiment of the present invention.
- the DGS 710 includes a heat sink fan (HSF) 801 and a power supply fan (PSF) 802 for providing thermal dissipation for the GPU(s) 730 and the power supply 721.
- these components are controlled by an acoustic management system 723.
- a separate power connection 803 (e.g., AC power)
- a dedicated connection is shown for the display 201.
- FIG. 9 shows a scalable DGS 910 in accordance with one embodiment of the present invention.
- the DGS 910 includes the DGS bridge 720 which functions by coupling a plurality of GPUs to the PCI Express bus 415.
- a number of GPUs are shown coupled to the bridge 720. These are shown as GPU 1 901, GPU 2 902, and GPU X 904.
- Each of the GPUs (GPU 1 through GPU X) has a respective bus link to the DGS bridge 720 (e.g., shown as links 911-914).
- the DGS 910 embodiment shows the scalability features of a DGS in accordance with one embodiment of the present invention.
- the DGS bridge 720 functions by cooperatively sharing the data transfer bandwidth of the PCI Express bus 415 among the links 911-914. The sharing is configured to allow the GPUs to cooperatively execute 3D graphics instructions from a coupled computer system (e.g., computer system 500).
- the data transfer bandwidth available with a multi-lane PCI Express bus connection removes a critical performance bottleneck present in prior art type parallel bus connections.
- the available data transfer bandwidth allows the performance of a graphics subsystem to rapidly scale. Embodiments of the present invention take advantage of this increased data transfer bandwidth by utilizing GPUs in a cooperative execution array.
- Graphics processing workload can be allocated among available GPUs such that the workload is executed in parallel. Such cooperative execution enables a rapid scaling of graphics subsystem rendering performance. Additionally, because of the features of a DGS system in accordance with embodiments of the present invention, the scaling is not limited by the constraints (e.g., power constraints, thermal constraints, etc.) of any coupled computer system.
- Because the DGS system 910 can include its own dedicated power supply (e.g., power supply 721 of Figure 8), and because the DGS system 910 can include its own thermal management system (e.g., HSF 801 and PSF 802 of Figure 8), the performance of the overall graphics subsystem is free to rapidly evolve as technology changes. Furthermore, removal of such computer system related constraints allows the inclusion of multiple GPUs as shown in Figure 9, which provides a rapid upward scaling of graphics subsystem performance.
- the DGS bridge 720 functions by sequentially allocating the bandwidth of the PCI Express bus 415 to each of the GPUs in a round robin fashion. For example, the entire bandwidth of a 16 lane PCI Express bus 415 can be round robin allocated to the GPUs as they work on and complete portions of the overall graphics execution workload.
- the bridge 720 can implement an arbitration mechanism, whereby the bus 415 is allocated to the GPUs on an as-needed basis.
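- The two allocation policies described for the bridge 720 (strict round robin, and allocation on an as-needed basis) can be sketched in Python as follows. The class name, method name, and the use of the reference numerals as GPU identifiers are assumptions for illustration; the source does not describe the bridge at this level of detail.

```python
from collections import deque

# Bus-allocation sketch. RoundRobinBridge and next_grant are hypothetical
# names; only the two policies (round robin, as-needed) come from the text.


class RoundRobinBridge:
    """Grants the whole bus to one GPU at a time, in rotating order."""

    def __init__(self, gpu_ids):
        self._order = deque(gpu_ids)

    def next_grant(self, requesting=None):
        """Return the next GPU to own the bus, or None if nobody wants it.

        With requesting=None every GPU takes its turn (plain round robin).
        With a set of GPU ids, GPUs that have no pending transfer are
        skipped, which approximates as-needed arbitration.
        """
        for _ in range(len(self._order)):
            gpu = self._order[0]
            self._order.rotate(-1)  # move this GPU to the back of the queue
            if requesting is None or gpu in requesting:
                return gpu
        return None


if __name__ == "__main__":
    bridge = RoundRobinBridge([901, 902, 904])
    print([bridge.next_grant() for _ in range(4)])
```

- Keeping the rotation even when a GPU is skipped means a GPU that starts requesting later still gets the bus within one full pass.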
- FIG. 10 shows a graph illustrating the increase in rendering performance as additional GPUs are added to a DGS 910 in accordance with one embodiment of the present invention.
- adding additional GPUs causes a rapid increase in the rendering power of the DGS 910.
- transitioning from a single GPU DGS to a dual GPU DGS yields a nearly 100% increase in rendering power.
- the increased rendering power is not quite 100% since some additional overhead is required to ensure the proper cooperative execution of the graphics processing workload.
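- The near-linear scaling described above can be made concrete with a toy model in Python. The model (each added GPU contributes one unit of rendering power minus a fixed coordination overhead) and the 5% overhead figure are assumptions for illustration only; the source gives no numbers beyond "nearly 100%".

```python
# Scaling sketch. The linear-minus-overhead model and the default overhead
# value are illustrative assumptions, not measurements from the source.

def relative_rendering_power(num_gpus, overhead_per_added_gpu=0.05):
    """Rendering power relative to a single-GPU DGS.

    Each GPU beyond the first adds (1 - overhead_per_added_gpu) units,
    reflecting the cost of coordinating cooperative execution.
    """
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    return 1.0 + (num_gpus - 1) * (1.0 - overhead_per_added_gpu)
```

- With the assumed 5% overhead, moving from one GPU to two gives a relative rendering power of about 1.95, that is, an increase just short of 100%.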
- FIG 11 shows an AGP based card mounted GPU 1101 in accordance with one embodiment of the present invention.
- the GPU 1101 comprises a graphics processor 1105, a graphics memory 1106, and an AGP edge connect 1107.
- the GPU 1101 comprises a typical GPU available in a typical retail outlet.
- Such a GPU can be utilized off-the-shelf by a DGS system in accordance with embodiments of the present invention.
- the chassis of the DGS would include an AGP edge connect socket configured to accept the edge connect GPU 1101.
- the GPU 1101 can be purchased by a user to replace an older GPU. The upgrade can be accomplished by simply removing the older GPU from the DGS and inserting the new GPU 1101. The removal and replacement can be accomplished without requiring the user to open or otherwise access the chassis of the computer system.
- the GPU 1101 can be purchased by the user to complement an existing GPU installed in the DGS. This allows the user to immediately scale the performance of the user's graphics subsystem by using the cooperative graphics instruction execution features of the DGS as described above.
- FIG. 12 shows a PCI Express based card mounted GPU 1201 in accordance with one embodiment of the present invention.
- the GPU 1201 is substantially similar to the GPU 1101.
- the GPU 1201 comprises a graphics processor 1205, a graphics memory 1206, and a PCI Express connect 1207, as opposed to the AGP edge connect 1107 of Figure 11.
- the GPU 1201 has one or more separate power connector(s) 1208 for coupling power directly to the GPU 1201.
- Such power connectors 1208 are increasingly common with modern high-performance GPUs.
- the chassis of the DGS would include a PCI Express connection socket configured to accept the PCI Express connect GPU 1201 and would also include appropriate sockets for the power connector(s) 1208.
- a DGS can accept different types of card mounted GPUs.
- the chassis of the DGS can include provisions for accepting AGP based GPUs and/or PCI Express based GPUs.
- FIG. 13 shows a block diagram depicting internal components of a multiple GPU (graphics processor unit) graphics system 1300 in accordance with one embodiment of the present invention.
- the multiple GPU graphics system includes a plurality of GPUs 901-904 configured to execute graphics instructions from a computer system.
- a GPU output multiplexer 1302 and a controller unit, comprising a frame synchronization master 1301 and respective clock control units 1311-1313, are coupled to the GPUs 901-904.
- the multiple GPU graphics system 1300 can be used to implement the cooperative GPU execution processes for a DGS.
- the frame synchronization master 1301 and respective clock control units 1311-1313 are configured to control the GPUs 901-904 and the output multiplexer 1302 such that the GPUs 901-904 cooperatively execute the graphics instructions from the computer system.
- the clock control units 1311-1313 function by enabling or disabling respective GPUs 901-904.
- the frame synchronization master 1301 functions by synchronizing the rendered 3D graphics frames produced by the respective GPUs 901-904.
- the outputs of the respective GPUs 901-904 are combined by the output multiplexer 1302 to produce a resulting GPU output stream 1330.
- the memory master 1320 (e.g., bridge 420 of Figure 4) controls access to the memory 1321 (e.g., system memory 402 of Figure 4).
- the multiple GPU graphics system 1300 illustrates an exemplary configuration in which a cooperative execution among a plurality of GPUs (e.g., GPUs 901-904) can be implemented and controlled in accordance with one embodiment of the present invention. It should be noted that although system 1300 shows one exemplary configuration, other configurations for implementing cooperative execution among a plurality of GPUs are possible.
- FIG 14 shows a graph depicting the range of operation available to a multiple GPU graphics system 1300 in accordance with one embodiment of the present invention.
- the graphics system 1300 is capable of low-power modes and high-power modes.
- the controller unit turns off one or more of the GPUs 901-904. This saves power while also reducing the peak performance of the graphics system 1300.
- the controller unit turns on additional GPUs to deliver additional rendering performance. This increases peak rendering performance while also increasing the power consumption.
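- The low-power and high-power modes can be sketched as a controller that enables only as many GPUs as the current workload needs. In the Python sketch below, the frame-rate based sizing rule and the function names are assumptions for illustration; the source states only that the controller unit turns GPUs on and off.

```python
import math

# Power-mode sketch. Sizing the active set from a required frame rate is an
# assumed policy; the source does not say how the controller unit decides.


def gpus_needed(required_fps, fps_per_gpu, total_gpus):
    """Number of GPUs to keep enabled, always between 1 and total_gpus."""
    if fps_per_gpu <= 0 or total_gpus < 1:
        raise ValueError("fps_per_gpu must be positive and total_gpus >= 1")
    needed = math.ceil(required_fps / fps_per_gpu)
    return max(1, min(total_gpus, needed))


def clock_enables(required_fps, fps_per_gpu, total_gpus):
    """Per-GPU enable flags, as a clock control unit might apply them."""
    active = gpus_needed(required_fps, fps_per_gpu, total_gpus)
    return [index < active for index in range(total_gpus)]
```

- A light workload leaves most flags False (low-power mode), while a demanding workload sets them all True (high-power mode).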
- FIG. 15 shows a diagram depicting the manner in which the respective graphics instruction workload is executed by each of the GPUs 901-904.
- sequential frames of rendering workload are assigned to the GPUs 901-902 (e.g., frame 1, frame 2, and so on to frame N+N).
- the sequential frames can be allocated to the GPUs 901-904 in a staggered fashion with respect to time such that the frames are essentially executed in parallel and can be combined by the output multiplexer into a smooth, uninterrupted GPU output stream, as shown by line 1501.
- the respective graphics instruction workload for each of the GPUs 901-904 is executed by the GPUs in parallel.
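- The staggered frame allocation and the recombination by the output multiplexer can be sketched in Python. The modulo assignment rule, the data shapes, and the function names are assumptions for illustration; the source describes the behavior only at the level of Figure 15.

```python
# Frame-distribution sketch. Assumptions (not from the source): frames are
# numbered from 1, frame k goes to GPU (k - 1) mod N, and each GPU reports
# its finished frames as (frame_number, frame_data) pairs.


def assign_frames(num_frames, gpu_ids):
    """Map each frame number to the GPU that renders it, in rotation."""
    return {frame: gpu_ids[(frame - 1) % len(gpu_ids)]
            for frame in range(1, num_frames + 1)}


def multiplex(rendered_by_gpu):
    """Merge per-GPU results into one output stream in frame order."""
    merged = [item for frames in rendered_by_gpu.values() for item in frames]
    return [data for _, data in sorted(merged, key=lambda item: item[0])]


if __name__ == "__main__":
    print(assign_frames(5, [901, 902, 903, 904]))
```

- Because consecutive frames land on different GPUs, each GPU can take roughly N frame periods to finish a frame while the combined stream still advances one frame per period.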
- Although the multiple GPU graphics system 1300 can be used to implement functionality for a DGS coupled to the computer system, the multiple GPU graphics system 1300 can also be directly built into a chassis of a computer system (e.g., incorporated directly into a desktop computer system).
- each of the GPUs 901-903 has its own clock so that clock distribution and GPU-to-GPU skew around the chip or system is not as critical as in other designs. This can significantly reduce the cost and complexity of chip or board layout.
- Each GPU is responsible for generating a portion (e.g., frame, series of frames, etc.) of the output stream 1330 with its neighboring GPUs.
- the GPUs 901-903 in total are run at a slightly faster frame rate than needed by an application (e.g., 3D rendering application) to eliminate frame stuttering at the composite image sequence. As shown in Figure 13, these frames are combined by the output multiplexer 1302 to deliver the final N-Frames-per-second.
- the GPU-to-GPU skew, Frame distribution, and output multiplexer 1302 are managed by the Frame Sync Master 1301.
- the system 1300 architecture provides a number of benefits. For example, for AC-tethered ultra-high-performance graphics implementations, such as workstation and desktop applications, very high performance can be achieved by a super-scaled on-chip design that reuses GPU cores or with chip-on-PCB solutions. Similarly, graphics performance can be provided for ultra-low power graphics solutions from the same basic re-targetable GPU building blocks (e.g., for portable applications such as cell phones, PDAs, and mobile PCs). This feature yields a time-to-market and NRE (non-recurrent engineering) cost advantage in delivering products for each GPU generation for extreme performance and extreme mobile graphics solutions.
- Comparable fill rates and frame rates can be provided with significantly lowered clock frequencies, therefore delivering performance but with far less power.
- the clock-per-GPU feature allows unused GPUs to be dynamically turned on and off as dictated by an application. Simple 2D interfaces and DVD or MPEG playback will only require a fraction of the total system 1300 to be active, thereby significantly reducing the power used.
- although the graphics system 1300 has been described in the context of a DGS chassis based system, the graphics system 1300 architecture can be implemented in a wide variety of computer system platforms, including, for example, desktop, workstation, mobile PCs, cell phones, PDAs, chipsets, and the like.
- Figure 16 shows a side view of a DGS in accordance with one embodiment of the present invention.
- Figure 17 shows a front view of the DGS.
- Figure 18 shows a view of the DGS with the chassis cover removed. This view shows two internal GPU cards coupled to the chassis of the DGS.
- Figure 19 shows a view of the chassis cover of the DGS as it is being closed.
- Figure 20 shows a view of the DGS connected to a laptop computer system via a PCI Express cable.
- Figure 21 shows a view of the DGS driving the display of the laptop computer system.
- a discrete graphics system for executing 3D graphics instructions for a computer system, a DGS unit, and a scalable DGS are disclosed.
- the discrete graphics system includes a GPU for executing 3D graphics instructions and a DGS system chassis configured to house the GPU.
- a serial bus connector is coupled to the GPU and the DGS chassis.
- the serial bus connector is configured to removably connect the DGS and the GPU to the computer system.
- the GPU of the DGS accesses the computer system via the serial bus connector to execute the 3D graphics instructions for the computer system.
- a scalable discrete graphics system includes a serial bus bridge configured to couple a plurality of GPUs to a serial bus.
- a serial bus connector is coupled to the serial bus bridge.
- a system chassis is coupled to the serial bus bridge and the serial bus connector and is configured to house the GPUs.
- the serial bus connector is configured to removably connect to a computer system.
- the GPUs access the computer system via the serial bus bridge and the serial bus connector to cooperatively execute 3-D graphics instructions from the computer system.
- the DGS unit includes a system chassis configured to house a GPU, the GPU for executing 3-D graphics instructions, and a GPU mounting unit coupled to the system chassis and configured to receive the GPU.
- a serial bus connector is coupled to the chassis and is coupled to the GPU mounting unit, wherein the serial bus connector is configured to removably connect the GPU to a computer system to enable the GPU to access the computer system via the serial bus connector and execute the 3-D graphics instructions for the computer system.
- a power supply is coupled to the system chassis for supplying power to the GPU independent of the computer system.
- this writing discloses a discrete graphics system (DGS) for executing 3D graphics instructions for a computer system.
- the discrete graphics system includes a GPU for executing 3D graphics instructions and a DGS system chassis configured to house the GPU.
- a serial bus connector is coupled to the GPU and the DGS chassis.
- the serial bus connector is configured to removably connect the DGS and the GPU to the computer system.
- the GPU of the DGS accesses the computer system via the serial bus connector to execute the 3D graphics instructions for the computer system.
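The staggered frame allocation described for FIG. 15, together with the low-power and high-power modes of graphics system 1300, can be illustrated with a small simulation. The Python sketch below is not taken from the patent: the class name `FrameSyncMaster`, its methods `set_power_mode` and `assign`, the function `output_multiplexer`, and the simple round-robin policy are all assumptions chosen for illustration. It shows sequential frames being handed to whichever GPUs are currently powered, and per-GPU results being merged back into one stream in frame order.

```python
from dataclasses import dataclass, field


@dataclass
class FrameSyncMaster:
    """Toy model (hypothetical): tracks which GPUs are powered and hands
    out sequential frames to them in round-robin order."""
    gpu_ids: list                      # e.g. [901, 902, 903, 904]
    active: set = field(default_factory=set)

    def __post_init__(self):
        # Start in the high-power mode with every GPU switched on.
        self.active = set(self.gpu_ids)

    def set_power_mode(self, active_count):
        """Keep the first `active_count` GPUs on and switch the rest off."""
        if not 1 <= active_count <= len(self.gpu_ids):
            raise ValueError("active_count out of range")
        self.active = set(self.gpu_ids[:active_count])

    def assign(self, frame_numbers):
        """Map each frame number to a powered GPU, round-robin."""
        powered = [g for g in self.gpu_ids if g in self.active]
        return {f: powered[i % len(powered)]
                for i, f in enumerate(frame_numbers)}


def output_multiplexer(rendered):
    """Merge per-GPU results into one stream ordered by frame number.

    `rendered` is an iterable of (frame_number, gpu_id) pairs in whatever
    order the GPUs happened to finish.
    """
    return [frame for frame, _gpu in sorted(rendered)]


if __name__ == "__main__":
    master = FrameSyncMaster([901, 902, 903, 904])
    print(master.assign(range(1, 9)))   # high-power mode: four GPUs share the frames
    master.set_power_mode(2)            # low-power mode: only 901 and 902 stay on
    print(master.assign(range(1, 5)))
```

In this sketch, lowering the active GPU count trades peak throughput for power, since the same frames are spread over fewer GPUs. A real frame sync master would also have to manage GPU-to-GPU skew and timing, which the model ignores.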
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/877,723 US8941668B2 (en) | 2004-06-25 | 2004-06-25 | Method and system for a scalable discrete graphics system |
US10/877,642 US8411093B2 (en) | 2004-06-25 | 2004-06-25 | Method and system for stand alone graphics independent of computer system form factor |
US10/877,724 US8446417B2 (en) | 2004-06-25 | 2004-06-25 | Discrete graphics system unit for housing a GPU |
PCT/US2005/022790 WO2006004682A2 (en) | 2004-06-25 | 2005-06-24 | Discrete graphics system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1763767A2 (en) | 2007-03-21 |
EP1763767A4 (en) | 2008-07-02 |
Family
ID=35783292
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP05773703A Ceased EP1763767A4 (en) | 2004-06-25 | 2005-06-24 | Discrete graphics system and method |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1763767A4 (en) |
JP (1) | JP4912299B2 (en) |
TW (1) | TWI402764B (en) |
WO (1) | WO2006004682A2 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7535433B2 (en) * | 2006-05-18 | 2009-05-19 | Nvidia Corporation | Dynamic multiple display configuration |
US8560755B2 (en) * | 2006-09-07 | 2013-10-15 | Toshiba Global Commerce Solutions Holding Corporation | PCI-E based POS terminal |
ES2818348T3 (en) * | 2008-08-07 | 2021-04-12 | Mitsubishi Electric Corp | Semiconductor integrated circuit device, installation apparatus control device and apparatus status display device |
US8508538B2 (en) * | 2008-12-31 | 2013-08-13 | Apple Inc. | Timing controller capable of switching between graphics processing units |
CN103988190A (en) * | 2011-12-16 | 2014-08-13 | 英特尔公司 | Method, apparatus, and system for expanding graphical processing via external display-data i/o port |
CN109840876B (en) * | 2017-11-24 | 2023-04-18 | 成都海存艾匹科技有限公司 | Graphic memory with rendering function |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0279226A2 (en) * | 1987-02-12 | 1988-08-24 | International Business Machines Corporation | High resolution display adapter |
US6167476A (en) * | 1998-09-24 | 2000-12-26 | Compaq Computer Corporation | Apparatus, method and system for accelerated graphics port bus bridges |
WO2003083680A1 (en) * | 2002-03-22 | 2003-10-09 | Deering Michael F | Scalable high performance 3d graphics |
WO2003100588A1 (en) * | 2002-05-20 | 2003-12-04 | Sun Microsystems, Inc. | Modular computer system and method |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH03160495A (en) * | 1989-11-17 | 1991-07-10 | Fuji Xerox Co Ltd | Image display device |
US6075929A (en) * | 1996-06-05 | 2000-06-13 | Compaq Computer Corporation | Prefetching data in response to a read transaction for which the requesting device relinquishes control of the data bus while awaiting data requested in the transaction |
JP2000124646A (en) * | 1998-10-15 | 2000-04-28 | Pfu Ltd | Cooling structure for printed circuit board |
JP2000222590A (en) * | 1999-01-27 | 2000-08-11 | Nec Corp | Method and device for processing image |
JP2001005574A (en) * | 1999-06-22 | 2001-01-12 | Toshiba Corp | Computer system |
JP2001290754A (en) * | 2000-04-05 | 2001-10-19 | Nec Corp | Computer system |
JP2002032324A (en) * | 2000-07-17 | 2002-01-31 | Hitachi Ltd | System for controlling pci bus device connection |
US6778390B2 (en) * | 2001-05-15 | 2004-08-17 | Nvidia Corporation | High-performance heat sink for printed circuit boards |
US6832269B2 (en) * | 2002-01-04 | 2004-12-14 | Silicon Integrated Systems Corp. | Apparatus and method for supporting multiple graphics adapters in a computer system |
US20020097220A1 (en) * | 2002-03-28 | 2002-07-25 | Compaq Information Technologies Group, L.P. | Method of supporting audio for KVM extension in a server platform |
EP1416356A1 (en) * | 2002-10-31 | 2004-05-06 | Elitegroup Computer Systems Co.,Ltd. | Portable computer comprising an upgrading apparatus |
US20050190536A1 (en) * | 2004-02-26 | 2005-09-01 | Microsoft Corporation | Method for expanding PC functionality while maintaining reliability and stability |
2005
- 2005-06-24: TW application TW94121285A, patent TWI402764B (active)
- 2005-06-24: WO application PCT/US2005/022790, publication WO2006004682A2 (not active, Application Discontinuation)
- 2005-06-24: EP application EP05773703A, publication EP1763767A4 (not active, Ceased)
- 2005-06-24: JP application JP2007518356A, patent JP4912299B2 (not active, Expired - Fee Related)
Non-Patent Citations (1)
Title |
---|
See also references of WO2006004682A2 * |
Also Published As
Publication number | Publication date |
---|---|
EP1763767A2 (en) | 2007-03-21 |
TW200606751A (en) | 2006-02-16 |
TWI402764B (en) | 2013-07-21 |
WO2006004682A2 (en) | 2006-01-12 |
WO2006004682A3 (en) | 2006-08-03 |
JP2008504611A (en) | 2008-02-14 |
JP4912299B2 (en) | 2012-04-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8446417B2 (en) | Discrete graphics system unit for housing a GPU | |
US7663633B1 (en) | Multiple GPU graphics system for implementing cooperative graphics instruction execution | |
US8941668B2 (en) | Method and system for a scalable discrete graphics system | |
US9087161B1 (en) | Asymmetrical scaling multiple GPU graphics system for implementing cooperative graphics instruction execution | |
US7861013B2 (en) | Display system with frame reuse using divided multi-connector element differential bus connector | |
US6985152B2 (en) | Point-to-point bus bridging without a bridge controller | |
US8319782B2 (en) | Systems and methods for providing scalable parallel graphics rendering capability for information handling systems | |
US7372465B1 (en) | Scalable graphics processing for remote display | |
EP1763767A2 (en) | Discrete graphics system and method | |
US8411093B2 (en) | Method and system for stand alone graphics independent of computer system form factor | |
US7576745B1 (en) | Connecting graphics adapters | |
CN201667070U (en) | Combined structure of Mini PCI-E display card and output module | |
US7886094B1 (en) | Method and system for handshaking configuration between core logic components and graphics processors | |
US20120194529A1 (en) | Interface card | |
CN110515887A (en) | A kind of independent display card that multiple GPU are worked at the same time | |
EP4343562A1 (en) | Enabling universal core motherboard with flexible input-output ports | |
Casey | Computer Hardware: Hardware Components and Internal PC Connections | |
JP2006171736A (en) | Image display device and method therefor | |
CN1303055A (en) | System control chip with multiplex pattern bus structure and computer system | |
Platforms et al. | Second-Generation Intel® CentrinoTM Mobile Technology |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20060515 |
AK | Designated contracting states | Kind code of ref document: A2; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU MC NL PL PT RO SE SI SK TR |
RIN1 | Information on inventor provided before grant (corrected) | Inventor name: DIAMOND, MICHAEL, B.; Inventor name: CARRERA, CESAR |
DAX | Request for extension of the european patent (deleted) | |
A4 | Supplementary search report drawn up and despatched | Effective date: 20080530 |
RIC1 | Information provided on ipc code assigned before grant | Ipc: G06F 13/40 20060101AFI20080526BHEP |
17Q | First examination report despatched | Effective date: 20111010 |
REG | Reference to a national code | Ref country code: DE; Ref legal event code: R003 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
18R | Application refused | Effective date: 20151007 |