EP2297723A1 - Scalable and unified compute system - Google Patents
Scalable and unified compute systemInfo
- Publication number
- EP2297723A1 (application EP09755281A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- scalable
- texture
- texel data
- data
- unified
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/04—Texture mapping
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2360/00—Aspects of the architecture of display systems
- G09G2360/12—Frame memory handling
- G09G2360/121—Frame memory handling using a cache memory
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2360/00—Aspects of the architecture of display systems
- G09G2360/12—Frame memory handling
- G09G2360/125—Frame memory handling using unified memory architecture [UMA]
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/363—Graphics controllers
Definitions
- the present invention generally relates to computing operations performed by computing systems, and more particularly to graphics processing tasks performed by computing systems.
- a graphics processing unit is a complex integrated circuit that is specially configured to carry out graphics processing tasks.
- a GPU can, for example, execute graphics processing tasks required by an end-user application, such as a video game application.
- the end-user application communicates with an application programming interface (API).
- API allows the end-user application to output graphics data and commands in a standardized format, rather than in a format that is dependent on the GPU.
- Several types of APIs are commercially available, including DirectX® developed by Microsoft Corp. and OpenGL® developed by Silicon Graphics, Inc.
- the API communicates with a driver.
- the driver translates standard code received from the API into a native format of instructions understood by the GPU.
- the driver is typically written by the manufacturer of the GPU.
- the GPU then executes instructions from the driver.
- by carrying out a process known as "rendering," a GPU creates individual pixels that together form an image based on a higher level description of image components.
- a GPU typically carries out continuous rendering using pipelines to process pixel, texture, and geometric data. These pipelines are often referred to as a collection of fixed function special purpose pipelines, such as rasterizers, setup engines, color blenders, hierarchical depth, and texture mapping, together with programmable stages that can be accomplished in shader pipes or shader pipelines. "Shader" is a term in computer graphics referring to a set of software instructions used by a graphic resource primarily to perform rendering effects.
- GPUs can also employ multiple programmable pipelines in a parallel processing design to obtain higher throughput.
- multiple shader pipelines can also be referred to as a shader pipe array.
- GPUs also support texture mapping.
- Texture mapping is a process used to determine the texture color for a texture mapped pixel through the use of the colors of nearby pixels of the texture, or texels. The process is also referred to as texture smoothing or texture interpolation.
- high image quality texture mapping requires a high degree of computational complexity.
- GPUs equipped with a Unified Shader also simultaneously support many types of shader processing, including pixel, vertex, primitive, surface, and generalized compute, which raises the demand for higher performance generalized memory access capabilities.
- the present invention includes a method and apparatus related to a row based Scalable and Unified Compute Unit Module.
- the Scalable and Unified Compute Unit Module includes a shader pipe array and texture mapping unit with a level one cache system to perform texture mapping and general load/store accesses with the ability to process shader pipe data destined to a defective shader pipe.
- the Scalable and Unified Compute System comprises a sequencer and a Scalable and Unified Compute Unit Module with access to a level two texture cache system and thus an external memory system.
- the Scalable and Unified Compute System is configured to accept an executing shader program instruction including input, output, ALU and texture or general memory load/store requests with address data from the shader pipes and program constants to generate the return texel or memory data based on state data controlling the pipelined address and filtering operations for a specific pixel or thread.
- the texture filter system is configured based on the shader program instruction and constant to generate a formatted interpolation based on texel data stored in the cache system for the addresses stored in the shader pipeline.
- the row based shader pipe Scalable and Unified Compute System further comprises a redundant shader pipe system.
- the redundant shader pipe system is configured to process shader pipe data destined to defective shader pipes in the shader pipe array.
- the System further comprises a level two texture cache system.
- the level two texture cache system can be read and written to by any level one row based texture cache system.
- the texture filter in the texture mapping unit in the Scalable and Unified Compute Unit Module further comprises a pre-formatter module, an interpolator module, an accumulator module, and a format module.
- the pre-formatter module is configured to receive texel data and convert it to a normalized fixed point format.
- the interpolator module is configured to perform an interpolation on the normalized fixed point texel data from the pre-formatter module and generate re-normalized floating point texel data.
- the accumulator module is configured to accumulate floating point texel data from the interpolator module to achieve the desired level of bilinear, trilinear, and anisotropic filtering.
- the format module is configured to convert texel data from the accumulator module into a standard floating point representation.
- FIG. 1 is a system diagram depicting an implementation of a Scalable and Unified Compute System.
- FIG. 2 is a system diagram depicting an implementation of a Scalable and Unified Compute System illustrating the details of the shader pipe array.
- FIG. 3 is a system diagram depicting an implementation of a Scalable and Unified Compute System illustrating the details of the texture mapping unit.
- FIG. 4 is a flowchart depicting an implementation of a method for a
- the invention relates to a Scalable and Unified Compute System whereby a shader pipe array processes shader program instructions on input pixel, vertex, primitive, surface, or compute work items to create output data for each item using generated texel data or memory load/store operations. In embodiments of this invention, bilinear texture mapping, trilinear texture mapping, and anisotropic texture mapping are applied to the texel data that is stored in a multi-level cache system.
- a redundant shader system can be added and configured to process shader pipe data directed to defective shader pipes within the shader pipe array to recover devices with a defective sub-circuit in one or more shader pipes.
- In embodiments of this invention that have configurations containing two or more Scalable and Unified Compute Systems, a subset of the Unified Compute Unit System itself can be configured to be a repairable unit. In such an embodiment, workloads destined to a defective Unified Compute Unit System will instead be sent to a redundant Unified Compute Unit System to process all ALU, texture, and memory operations. This significantly increases the portion of the device that is covered by repair, due to the inclusion of the texture mapping unit and L1 cache system, and thus significantly improves the yield of such a device.
- an embodiment indicates that the embodiment described can include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of one skilled in the art to incorporate such a feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
- FIG. 1 is an illustration of a Scalable and Unified Compute System 100 according to an embodiment of the present invention.
- System 100 comprises a sequencer 110, a Scalable and Unified Compute Unit Module 120, and a level two cache system 130.
- Scalable and Unified Compute Unit Module 120 comprises a shader pipe array 122, optional redundant shader pipe array 124, texture mapping unit 126, and level one texture cache system 128.
- Shader pipe array 122 performs ALU operations on input data.
- Sequencer 110 controls the shader program instruction issue for contained workloads and the flow of data through shader pipe array 122.
- sequencer 110 reacts to defective shader pipes that occur within shader pipe array 122 by scheduling instructions to the appropriate redundant units.
- Sequencer 110 can issue a texture fetch or load/store operation that will initiate shader pipe array 122 to send addresses with the instruction issued to texture mapping unit 126.
- texture mapping unit 126 generates appropriate addresses to the level one texture cache system 128 that contains texel data or memory data associated with the addresses.
- Level one cache system 128, after receiving the addresses, will return the associated texel or memory data to texture mapping unit 126.
- the request is forwarded to a level two cache system 130 to obtain and return the requested texel data.
- FIG. 2 is an illustration of a Scalable and Unified Compute Unit
- Shader pipe array 122 comprises one or more shader pipe blocks, here represented as SP_0 through SP_M, where "M" represents a positive integer greater than one.
- In an embodiment where redundant shader pipe array 124 is present, if sequencer 110 identifies, as an example, that the shader pipe located in shader pipe block SP_1 is defective, then the shader pipe data destined to the defective pipe would be sent to redundant shader pipe array 124 via the input stream by the input module and processed by redundant shader pipe array 124. All texture mapping requests would be intercepted by redundant shader pipe array 124 when instructed via horizontal control path 211 from sequencer 110. Once redundant shader pipe array 124 processes the shader pipe data initially destined to the defective shader pipe, the processed data would be transferred from redundant shader pipe array 124 back to the output stream of shader pipe array 122 and realigned in an output unit (not shown).
- In one embodiment, redundant shader pipe array 124 consists of a single block, and therefore can only process shader pipe data destined to a single defective shader pipe at a time. In another embodiment, wherein redundant shader pipe array 124 comprises multiple redundant shader blocks, redundant shader pipe array 124 can process shader pipe data destined to more than one defective shader pipe simultaneously.
- FIG. 3 illustrates a more detailed view of texture mapping unit 126 according to an embodiment of the present invention.
- shader pipe array 122 generates a texture or memory load/store request to texture mapping unit 126 that comprises an address generator system 318, a pre-formatter module 310, an interpolator module 312, an accumulator module 314, and a format module 316.
- the texture mapping unit 126 receives a request from shader arrays 122 and 124 and sequencer 110 respectively, and processes the instruction in address generator system 318 to determine the actual addresses to service.
- the resultant texel data is sent back to the requesting resource in shader pipe array 122 and/or redundant shader pipe array 124.
- Pre-formatter module 310 is configured to receive texel data and perform a block normalization thereby generating normalized fixed point texel data.
- Interpolator module 312 receives the normalized fixed point texel data from pre-formatter module 310 and performs one or more interpolations, each of which is accumulated in accumulator module 314, to achieve the desired level of bilinear, trilinear, and anisotropic texture mapping.
- Format module 316 converts the accumulated texel data in accumulator module 314 to a standard floating point representation for the requesting resource, shader pipe array 122. For general load/store data pre-formatter module 310, interpolator module 312, accumulator module 314, and format module 316 pass the requested return data unmodified.
- FIG. 3 also illustrates the use of a level two cache system 130.
- the level two cache system is additional memory that is available to Scalable and Unified Compute Unit Module 120 when it is necessary or desirable to read and/or write data from and to the level one cache system 128.
- FIG. 4 is a flowchart depicting a method 400 for texture mapping using a Scalable and Unified Compute System.
- Method 400 begins at step 402.
- a shader pipe receives set texture requests from a sequencer for a specific set of pixels, vertices, primitives, surfaces, or compute work items.
- the shader pipe generates data set addresses based on shader program instructions for the specified set of pixels, vertices, primitives, surfaces, or compute work items.
- a texture mapping unit retrieves stored texel data from a level one and/or level two texture cache system.
- the texture mapping unit calculates a formatted accumulated interpolation based on the retrieved texel data and the originating shader instruction.
- Method 400 ends at step 412.
- 3, and 4 can be implemented in software, firmware, or hardware, or using any combination thereof. If programmable logic is used, such logic can execute on a commercially available processing platform or a special purpose device.
- HDL hardware description language
- Verilog Verilog
- VHDL hardware description language
- the HDL-design can model the behavior of an electronic system, where the design can be synthesized and ultimately fabricated into a hardware device. In addition, the HDL-design can be stored in a computer product and loaded into a computer system prior to hardware manufacture.
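The four texture filter stages described for FIG. 3 (pre-formatter, interpolator, accumulator, format module) can be illustrated with a small software model. This is only a sketch under stated assumptions: the 8-bit fixed point precision, the 2x2 quad layout, the per-level weights, and all function names are hypothetical and do not come from the patent, which describes hardware modules rather than code.

```python
from typing import List, Sequence, Tuple

FRAC_BITS = 8          # assumed fixed point precision (not stated in the source)
ONE = 1 << FRAC_BITS   # fixed point representation of 1.0


def pre_format(raw_texels: Sequence[int], max_value: int = 255) -> List[int]:
    """Pre-formatter: map raw integer texels onto normalized fixed point [0, ONE]."""
    return [(t * ONE) // max_value for t in raw_texels]


def interpolate(quad: Sequence[int], fx: float, fy: float) -> float:
    """Interpolator: bilinear blend of a 2x2 quad ordered (t00, t10, t01, t11).

    Returns a floating point value re-normalized to the range [0.0, 1.0].
    """
    t00, t10, t01, t11 = quad
    top = t00 * (1.0 - fx) + t10 * fx
    bottom = t01 * (1.0 - fx) + t11 * fx
    return (top * (1.0 - fy) + bottom * fy) / ONE


def accumulate(samples: Sequence[Tuple[float, float]]) -> float:
    """Accumulator: weighted sum of (value, weight) interpolation results."""
    return sum(value * weight for value, weight in samples)


def format_result(value: float) -> float:
    """Format module: stand-in for conversion to a standard float representation."""
    return float(value)


def filter_texel(levels, fx: float, fy: float) -> float:
    """Run all four stages; `levels` is a list of (raw 2x2 quad, weight) pairs."""
    samples = [(interpolate(pre_format(quad), fx, fy), w) for quad, w in levels]
    return format_result(accumulate(samples))


# One quad with weight 1.0 models bilinear filtering; two quads taken from
# adjacent mip levels with weights summing to 1.0 model trilinear filtering.
bilinear = filter_texel([((0, 255, 0, 255), 1.0)], fx=0.5, fy=0.5)
trilinear = filter_texel(
    [((0, 255, 0, 255), 0.75), ((255, 255, 255, 255), 0.25)], fx=0.5, fy=0.5
)
```

In this model, anisotropic filtering would simply pass more weighted samples to the accumulator; how the hardware chooses those samples is not specified here.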
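The redundant shader pipe behavior described for FIG. 2 (data destined to a defective pipe is sent to a redundant block, then realigned in the output stream) can be sketched as a routing table. The `RSP_` unit names, the function names, and the policy of assigning spare blocks in lane order are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Set


def route_work(num_pipes: int, defective: Set[int], num_redundant: int) -> Dict[int, str]:
    """Map each logical lane to the unit that will process its data."""
    if len(defective) > num_redundant:
        raise ValueError("more defective pipes than redundant blocks")
    routing: Dict[int, str] = {}
    spare = 0
    for lane in range(num_pipes):
        if lane in defective:
            # Data destined to a defective pipe goes to a redundant block.
            routing[lane] = f"RSP_{spare}"
            spare += 1
        else:
            routing[lane] = f"SP_{lane}"
    return routing


def process_row(data: List[int], routing: Dict[int, str],
                op: Callable[[int], int]) -> List[int]:
    """Apply `op` per lane, then realign the results into original lane order."""
    results = {lane: op(data[lane]) for lane in routing}
    return [results[lane] for lane in sorted(results)]


# Four shader pipes, SP_1 defective, a single redundant block available.
routing = route_work(num_pipes=4, defective={1}, num_redundant=1)
output = process_row([10, 20, 30, 40], routing, op=lambda x: x * 2)
```

With a single redundant block, a second defective pipe raises an error in this sketch, which mirrors the statement that a single block can only cover one defective shader pipe at a time.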
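The cache flow described for FIG. 1 and FIG. 3 (a level one cache that forwards misses to a shared level two cache, which in turn has access to external memory) can be modeled as follows. This is a minimal sketch: the class names, the dictionary-based storage, and the absence of any capacity limit or eviction policy are simplifying assumptions, not details from the patent.

```python
from typing import Dict


class LevelTwoCache:
    """Shared level two cache backed by an external memory (modeled as a dict)."""

    def __init__(self, memory: Dict[int, int]) -> None:
        self.memory = memory
        self.lines: Dict[int, int] = {}
        self.memory_reads = 0

    def read(self, address: int) -> int:
        if address not in self.lines:
            # L2 miss: fetch from external memory and keep a copy.
            self.memory_reads += 1
            self.lines[address] = self.memory[address]
        return self.lines[address]


class LevelOneCache:
    """Per-row level one cache that forwards misses to the level two cache."""

    def __init__(self, l2: LevelTwoCache) -> None:
        self.l2 = l2
        self.lines: Dict[int, int] = {}
        self.misses = 0

    def read(self, address: int) -> int:
        if address not in self.lines:
            # L1 miss: the request is forwarded to the level two cache.
            self.misses += 1
            self.lines[address] = self.l2.read(address)
        return self.lines[address]


# Two row based L1 caches share one L2 cache.
l2 = LevelTwoCache(memory={0x10: 42})
row0 = LevelOneCache(l2)
row1 = LevelOneCache(l2)
first = row0.read(0x10)    # misses L1 and L2, reads external memory
second = row1.read(0x10)   # misses its own L1, hits the shared L2
third = row0.read(0x10)    # hits row0's L1
```

The second read shows why a shared level two cache helps: a texel fetched on behalf of one row is served to another row without a further external memory access.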
Landscapes
- Engineering & Computer Science (AREA)
- Computer Graphics (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Generation (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US5748308P | 2008-05-30 | 2008-05-30 | |
PCT/US2009/003316 WO2009145918A1 (en) | 2008-05-30 | 2009-06-01 | Scalable and unified compute system |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2297723A1 (en) | 2011-03-23 |
EP2297723A4 (en) | 2015-08-19 |
Family
ID=41377445
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP09755281.4A (Withdrawn, EP2297723A4) | Scalable and unified compute system | 2008-05-30 | 2009-06-01 |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP2297723A4 (en) |
JP (1) | JP5491498B2 (en) |
KR (1) | KR101427408B1 (en) |
CN (1) | CN102047315B (en) |
WO (1) | WO2009145918A1 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101926570B1 (en) | 2011-09-14 | 2018-12-10 | Samsung Electronics Co., Ltd. | Method and apparatus for graphic processing using post shader |
KR101862785B1 (en) | 2011-10-17 | 2018-07-06 | Samsung Electronics Co., Ltd. | Cache memory system for tile based rendering and caching method thereof |
US10089708B2 (en) * | 2016-04-28 | 2018-10-02 | Qualcomm Incorporated | Constant multiplication with texture unit of graphics processing unit |
GB2566733B (en) * | 2017-09-25 | 2020-02-26 | Advanced Risc Mach Ltd | Performing convolution operations in graphics texture mapping units |
CN109614086B (en) * | 2018-11-14 | 2022-04-05 | 西安翔腾微电子科技有限公司 | GPU texture buffer area data storage hardware and storage device based on SystemC and TLM models |
CN110930493A (en) * | 2019-11-21 | 2020-03-27 | 中国航空工业集团公司西安航空计算技术研究所 | GPU texel parallel acquisition method |
CN112581575B (en) * | 2020-12-05 | 2024-05-03 | 西安翔腾微电子科技有限公司 | Texture system is done to outer video |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3645024B2 (en) * | 1996-02-06 | 2005-05-11 | 株式会社ソニー・コンピュータエンタテインメント | Drawing apparatus and drawing method |
US6104415A (en) * | 1998-03-26 | 2000-08-15 | Silicon Graphics, Inc. | Method for accelerating minified textured cache access |
US7136068B1 (en) * | 1998-04-07 | 2006-11-14 | Nvidia Corporation | Texture cache for a computer graphics accelerator |
AU5688199A (en) * | 1998-08-20 | 2000-03-14 | Raycer, Inc. | System, apparatus and method for spatially sorting image data in a three-dimensional graphics pipeline |
US6771264B1 (en) * | 1998-08-20 | 2004-08-03 | Apple Computer, Inc. | Method and apparatus for performing tangent space lighting and bump mapping in a deferred shading graphics processor |
US6919895B1 (en) * | 1999-03-22 | 2005-07-19 | Nvidia Corporation | Texture caching arrangement for a computer graphics accelerator |
US6731303B1 (en) * | 2000-06-15 | 2004-05-04 | International Business Machines Corporation | Hardware perspective correction of pixel coordinates and texture coordinates |
US7124318B2 (en) * | 2003-09-18 | 2006-10-17 | International Business Machines Corporation | Multiple parallel pipeline processor having self-repairing capability |
CN1239023C (en) * | 2003-10-16 | 2006-01-25 | 上海交通大学 | Three-dimensional video format conversion method based on motion adaption and marginal protection |
KR100519779B1 (en) * | 2004-02-10 | 2005-10-07 | 삼성전자주식회사 | Method and apparatus for high speed visualization of depth image-based 3D graphic data |
US7385607B2 (en) * | 2004-04-12 | 2008-06-10 | Nvidia Corporation | Scalable shader architecture |
US7577869B2 (en) * | 2004-08-11 | 2009-08-18 | Ati Technologies Ulc | Apparatus with redundant circuitry and method therefor |
US7218291B2 (en) | 2004-09-13 | 2007-05-15 | Nvidia Corporation | Increased scalability in the fragment shading pipeline |
JP2006244426A (en) * | 2005-03-07 | 2006-09-14 | Sony Computer Entertainment Inc | Texture processing device, picture drawing processing device, and texture processing method |
JP4660254B2 (en) * | 2005-04-08 | 2011-03-30 | 株式会社東芝 | Drawing method and drawing apparatus |
CN101156176A (en) * | 2005-10-25 | 2008-04-02 | 三菱电机株式会社 | Image processor |
US20070211070A1 (en) * | 2006-03-13 | 2007-09-13 | Sony Computer Entertainment Inc. | Texture unit for multi processor environment |
US7965296B2 (en) * | 2006-06-20 | 2011-06-21 | Via Technologies, Inc. | Systems and methods for storing texture map data |
-
2009
- 2009-06-01 EP EP09755281.4A patent/EP2297723A4/en not_active Withdrawn
- 2009-06-01 JP JP2011511650A patent/JP5491498B2/en active Active
- 2009-06-01 KR KR1020107029824A patent/KR101427408B1/en active IP Right Grant
- 2009-06-01 CN CN200980119829.0A patent/CN102047315B/en active Active
- 2009-06-01 WO PCT/US2009/003316 patent/WO2009145918A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
See references of WO2009145918A1 * |
Also Published As
Publication number | Publication date |
---|---|
JP2011524562A (en) | 2011-09-01 |
EP2297723A4 (en) | 2015-08-19 |
CN102047315B (en) | 2015-09-09 |
KR20110019764A (en) | 2011-02-28 |
WO2009145918A1 (en) | 2009-12-03 |
JP5491498B2 (en) | 2014-05-14 |
KR101427408B1 (en) | 2014-08-07 |
CN102047315A (en) | 2011-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8558836B2 (en) | Scalable and unified compute system | |
US7948500B2 (en) | Extrapolation of nonresident mipmap data using resident mipmap data | |
US8421794B2 (en) | Processor with adaptive multi-shader | |
US9177351B2 (en) | Multi-primitive graphics rendering pipeline | |
US7999819B2 (en) | Systems and methods for managing texture descriptors in a shared texture engine | |
US8009172B2 (en) | Graphics processing unit with shared arithmetic logic unit | |
EP2297723A1 (en) | Scalable and unified compute system | |
CN109584140B (en) | graphics processing | |
US20070291044A1 (en) | Systems and Methods for Border Color Handling in a Graphics Processing Unit | |
US10825125B2 (en) | Performing convolution operations in graphics texture mapping units | |
US20160071232A1 (en) | Texture state cache | |
US20180165790A1 (en) | Out-of-order cache returns | |
US7944453B1 (en) | Extrapolation texture filtering for nonresident mipmaps | |
US20120013629A1 (en) | Reading Compressed Anti-Aliased Images | |
US20230097097A1 (en) | Graphics primitives and positions through memory buffers | |
EP2294571A1 (en) | Shader complex with distributed level one cache system and centralized level two cache | |
US12106418B2 (en) | Sampling for partially resident textures | |
US20210304488A1 (en) | Sampling for partially resident textures | |
GB2625797A (en) | Retrieving a block of data items in a processor | |
GB2625798A (en) | Retrieving a block of data items in a processor | |
GB2625800A (en) | Applying texture processing to a block of fragments in a graphics processing unit |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20101230 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK TR |
AX | Request for extension of the european patent | Extension state: AL BA RS |
DAX | Request for extension of the european patent (deleted) | |
RA4 | Supplementary search report drawn up and despatched (corrected) | Effective date: 20150716 |
RIC1 | Information provided on ipc code assigned before grant | Ipc: G09G 5/00 20060101AFI20150710BHEP; Ipc: G09G 5/36 20060101ALN20150710BHEP; Ipc: G06T 15/04 20110101ALI20150710BHEP; Ipc: G06T 15/00 20110101ALI20150710BHEP |
17Q | First examination report despatched | Effective date: 20171219 |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: ADVANCED MICRO DEVICES, INC. |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: ADVANCED MICRO DEVICES, INC. |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN |
18W | Application withdrawn | Effective date: 20191014 |