EP2294571A1 - Shader complex with distributed level one cache system and centralized level two cache
Info
- Publication number
- EP2294571A1 (application EP09755282A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- level
- cache
- cache system
- shader
- filter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0811—Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/084—Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0875—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/60—Memory management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/30—Providing cache or TLB in specific location of a processing system
- G06F2212/302—In image processor or graphics adapter
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/45—Caching of specific data in cache memory
- G06F2212/455—Image or video data
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2360/00—Aspects of the architecture of display systems
- G09G2360/12—Frame memory handling
- G09G2360/121—Frame memory handling using a cache memory
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/363—Graphics controllers
Definitions
- the present invention is generally directed to computing operations performed in computing systems, and more particularly directed to graphics processing tasks performed in computing systems.
- a graphics processing unit is a complex integrated circuit that is specially designed to perform graphics processing tasks.
- a GPU can, for example, execute graphics processing tasks required by an end-user application, such as a video game application. In such an example, there are several layers of software between the end-user application and the GPU.
- the end-user application communicates with an application programming interface (API).
- API allows the end-user application to output graphics data and commands in a standardized format, rather than in a format that is dependent on the GPU.
- Several types of APIs are commercially available, including DirectX® developed by Microsoft Corp. and OpenGL® developed by Silicon Graphics, Inc.
- the API communicates with a driver.
- the driver translates standard code received from the API into a native format of instructions understood by the GPU.
- the driver is typically written by the manufacturer of the GPU.
- the GPU then executes the instructions from the driver.
- a GPU produces the pixels that make up an image from a higher level description of its components in a process known as rendering.
- GPUs typically utilize a concept of continuous rendering by the use of pipelines to process pixel, texture, and geometric data. These pipelines are often referred to as a collection of fixed function special purpose pipelines such as rasterizers, setup engines, color blenders, hierarchical depth, texture mapping and programmable stages that can be accomplished in shader pipes or shader pipelines, "shader" being a term in computer graphics referring to a set of software instructions used by a graphic resource primarily to perform rendering effects.
- GPUs can also employ multiple programmable pipelines in a parallel processing design to obtain higher throughput. Multiple shader pipelines can also be referred to as a shader pipe array.
- GPUs also support a concept known as texture mapping.
- Texture mapping is a process used to determine the texture color for a texture mapped pixel through the use of the colors of nearby pixels of the texture, or texels.
- the process is also referred to as texture smoothing or texture interpolation.
- high image quality texture mapping requires a high degree of computational complexity.
- GPUs equipped with a Unified Shader also simultaneously support many types of shader processing, from pixel, vertex, primitive, and surface to generalized compute, which raises the demand for higher performance generalized memory access capabilities.
- Texture filters rely on high speed access to local cache memory for pixel data.
- the use of dedicated local cache memory for texture filters typically precludes the use of more general purpose shared memory. While general purpose shared memory is more flexible, it typically has slower response time and hence is less performant.
- the present invention includes a method and apparatus whereby a shader pipe texture filter utilizes a level one cache system as a primary method of storage but with the ability to have the level one cache system read and write to a level two cache system when necessary. While each level one cache system is associated with a particular shader pipe texture filter, level two cache memory has no such association and is therefore available to all level one cache systems. In addition, level one cache systems can allocate a defined area of memory to be sharable amongst other resources.
- a level one cache system is configured with dual access so that two shader pipe texture filters have access to a single level one cache system.
- more than one level two cache system is configured to be accessible by each level one cache system.
- the communication between a level one cache system and a level two cache system utilizes more than one memory channel, thereby resulting in a greater data throughput.
- one or more level one cache systems can allocate defined areas of memory to be shared amongst other resources, including other level one cache systems. In certain instances this approach will allow for quicker fetch times of texel data where the required data has already been moved from a level two cache system to a level one cache system.
- FIG. 1 is a system diagram depicting an implementation of a single level one cache system with a single level two cache system.
- FIG. 2 is a system diagram depicting an implementation of a plurality of level one and level two cache systems.
- FIG. 3 is a system diagram depicting an implementation of a dual ported level one and a plurality of level two cache systems.
- FIG. 4 is a flowchart depicting an implementation of a method for a shader filter cache system.
- the present invention relates to a distributed level one cache system with a centralized level two cache system.
- Each shader pipe texture filter has a dedicated level one cache system configured to provide read and write access to texel data contained within the level one cache system.
- an embodiment indicates that the embodiment described can include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of one skilled in the art to incorporate such a feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
- FIG. 1 is an illustration of a single level one cache system and a single level two cache system 100 according to an embodiment of the present invention.
- System 100 comprises a single shader pipe texture filter 110 with an associated level one cache system 120 configured to communicate with a level two cache system 130 utilizing a wide channel memory bus 125.
- the shader pipe texture filter 110 employs the concept of bilinear filtering to determine the color of a particular pixel.
- the shader pipe texture filter 110 analyzes texel data for the four nearest pixels to the pixel in question.
- the texel data for the four texels is then combined by a weighted average according to distance to calculate the desired result.
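The distance-weighted average described above can be sketched as follows. This is a minimal illustration, not the patented hardware: the function name `bilinear_sample`, the single-channel list-of-rows texture layout, and the clamp-to-edge addressing are all assumptions made for the example.

```python
import math


def bilinear_sample(texture, u, v):
    """Bilinearly filter a single-channel texture at texel coordinates (u, v).

    Assumptions (not from the source): `texture` is a list of rows of floats,
    texel centers sit at integer coordinates, and reads outside the texture
    are clamped to the nearest edge texel.
    """
    height = len(texture)
    width = len(texture[0])

    # Integer corner of the 2x2 texel neighborhood and fractional offsets.
    x0 = math.floor(u)
    y0 = math.floor(v)
    fx = u - x0
    fy = v - y0

    def texel(x, y):
        # Clamp-to-edge addressing (an assumption for this sketch).
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return texture[y][x]

    # Blend horizontally along the top and bottom rows, then vertically.
    # Each texel's weight shrinks as its distance from (u, v) grows.
    top = texel(x0, y0) * (1.0 - fx) + texel(x0 + 1, y0) * fx
    bottom = texel(x0, y0 + 1) * (1.0 - fx) + texel(x0 + 1, y0 + 1) * fx
    return top * (1.0 - fy) + bottom * fy
```

Sampling exactly on a texel center returns that texel unchanged, while sampling midway between four texels returns their plain average.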
- the texel data in question is retrieved from the level one cache system 120 that is associated with the current shader pipe texture filter 110.
- level one cache system 120 issues a read request to level two cache system 130 for the desired texel data.
- the required data is then copied from level two cache system 130 to level one cache system 120 in order to be analyzed and processed by shader pipe texture filter 110.
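The read path of system 100 can be modeled in a few lines. This is a hedged sketch only: the class names `Level1Cache` and `Level2Cache`, the dictionary-based storage, the unbounded capacity, and the choice to contact the level two cache only when the level one cache does not already hold the data are assumptions for illustration.

```python
class Level2Cache:
    """Stand-in for level two cache system 130 (dictionary-backed, an assumption)."""

    def __init__(self, contents):
        self.lines = dict(contents)
        self.reads = 0  # number of read requests received from level one

    def read(self, address):
        self.reads += 1
        return self.lines[address]


class Level1Cache:
    """Stand-in for level one cache system 120, filled on demand from level two."""

    def __init__(self, level_two):
        self.level_two = level_two
        self.lines = {}  # unbounded for simplicity; real caches evict lines
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            self.hits += 1
        else:
            # Assumed miss handling: issue a read request to level two and
            # copy the returned texel data into level one.
            self.misses += 1
            self.lines[address] = self.level_two.read(address)
        return self.lines[address]
```

In this model a repeated read of the same address is served from level one, so the level two read counter only advances on the first access.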
- FIG. 2 is an illustration of multiple shader pipe texture filters with associated level one caches and multiple level two cache systems according to an embodiment of the present invention.
- System 200 comprises one or more shader pipe texture filters, here represented as shader pipe texture filter 1 through shader pipe texture filter N, labeled 110-1 through 110-N, where "N" represents a positive integer greater than one.
- System 200 also comprises level one cache systems associated with each shader pipe texture filter, here represented as L1-1 cache system through L1-N cache system, labeled 120-1 through 120-N, where "N" represents a positive integer greater than one.
- wide channel memory bus 125 that links level one cache systems 120-1 through 120-N to a set of level two cache systems.
- the level two cache system includes one or more level two cache systems, here represented as L2-1 cache system through L2-M cache system, where "M”
- each shader pipe texture filter, 110-1 through 110-N, needs to analyze the texel data for the four nearest pixels to the pixel in question. Therefore, the texel data in question for each shader pipe texture filter is retrieved from its associated level one cache system. As such, shader pipe texture filter 1, 110-1, issues a request for texel data to L1-1 cache system, 120-1. The remaining shader pipe texture filters will issue texel data requests in a similar manner.
- level one cache system can issue a read request to the level two cache system 130 for the desired texel data.
- one or more of the level one cache systems can allocate defined areas of memory to be shared amongst other resources, including other level one cache systems. In certain instances, this approach allows for quicker fetch times of texel data where the required data has already been moved from a level two cache system to a level one cache system.
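A rough model of system 200 shows why a shared area in a level one cache can shorten fetches. Everything named here is hypothetical: the classes `L1Cache` and `L2Bank`, the `share` flag, the lookup order (own lines, then peers' shared areas, then level two), and the address-modulo rule for picking a level two cache are assumptions chosen for the sketch and are not stated in the source.

```python
class L2Bank:
    """One of the L2-1 through L2-M cache systems (dictionary-backed stand-in)."""

    def __init__(self, memory):
        self.memory = memory
        self.reads = 0

    def read(self, address):
        self.reads += 1
        return self.memory[address]


class L1Cache:
    """One of the L1-1 through L1-N cache systems with a sharable area."""

    def __init__(self, name, l2_banks):
        self.name = name
        self.l2_banks = l2_banks
        self.private = {}  # lines visible only to this cache's filter
        self.shared = {}   # defined area that other resources may read
        self.peers = []    # other level one cache systems

    def read(self, address, share=False):
        """Return (value, source), where source is 'own', 'peer', or 'l2'."""
        if address in self.private:
            return self.private[address], "own"
        if address in self.shared:
            return self.shared[address], "own"
        # Data already moved from level two into a peer's shared area can be
        # fetched without another level two request.
        for peer in self.peers:
            if address in peer.shared:
                self.private[address] = peer.shared[address]
                return self.private[address], "peer"
        # Assumed bank selection: interleave addresses across the L2 banks.
        bank = self.l2_banks[address % len(self.l2_banks)]
        value = bank.read(address)
        (self.shared if share else self.private)[address] = value
        return value, "l2"
```

For example, if one cache reads an address with `share=True` and a peer later reads the same address, the peer's result is tagged `"peer"` and no second level two read is counted.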
- FIG. 3 is an illustration of multiple shader pipe texture filters with associated dual-ported level one caches and multiple level two cache systems according to an embodiment of the present invention.
- System 300 comprises one or more dual-ported level one cache systems, of which each supports up to two shader pipe texture filters, and a level two cache system.
- a level one cache system supports up to two shader pipe texture filters for their requests of texel data.
- each level one cache supports two shader pipe texture filters, as illustrated, as shader pipe texture filter A, 310, and shader pipe texture filter B, 312.
- the level one caches, 320-1 through 320-N have access, via wide channel memory bus 125, to the level two cache system illustrated as L2-1 through L2-N, where N is a positive integer.
- one or more of the level one cache systems can allocate defined areas of memory to be shared amongst other resources, including other level one cache systems. In certain instances, this approach allows for quicker fetch times of texel data where the required data has already been moved from a level two cache system to a level one cache system.
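The dual-ported arrangement of system 300 can be illustrated with a single store that accepts requests on two named ports. This is a sketch under assumptions: the class name `DualPortedL1`, the port labels `"A"` and `"B"` (borrowed from filters A and B), and the `fetch_from_l2` callback are hypothetical, and real hardware would service both ports concurrently rather than one call at a time.

```python
class DualPortedL1:
    """Level one cache with two ports, one per shader pipe texture filter."""

    PORTS = ("A", "B")

    def __init__(self, fetch_from_l2):
        # fetch_from_l2 is a hypothetical callable standing in for the
        # wide channel memory bus to the level two cache system.
        self.fetch_from_l2 = fetch_from_l2
        self.lines = {}
        self.requests = {port: 0 for port in self.PORTS}
        self.l2_fetches = 0

    def read(self, port, address):
        if port not in self.PORTS:
            raise ValueError(f"unknown port {port!r}; expected one of {self.PORTS}")
        self.requests[port] += 1
        if address not in self.lines:
            # Both ports share one store, so a line fetched for one filter
            # is immediately available to the other.
            self.l2_fetches += 1
            self.lines[address] = self.fetch_from_l2(address)
        return self.lines[address]
```

Because both filters read through the same store, a texel fetched on port A does not trigger a second level two fetch when filter B asks for it.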
- FIG. 4 is a flowchart depicting a method 400 whereby a shader pipe texture filter utilizes a level one cache system as a primary method of storage with the ability to access a level two cache when necessary.
- Method 400 begins at step 402.
- each level one cache system can allocate defined areas of memory as sharable amongst other resources.
- a shader pipe texture filter issues a read or write command to its associated level one cache system.
- the associated level one cache system retrieves or writes texel data, as appropriate.
- each level one cache system can issue read and write requests to a level two cache system.
- Method 400 concludes at step 412.
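The flow of method 400 can be walked through in software as a rough sketch. The function name `run_method_400`, the command tuple format, the write-through policy for writes, and the fill-on-miss policy for reads are assumptions made for this example; the source only states that a level one cache can issue read and write requests to a level two cache.

```python
def run_method_400(commands, l2_store, shared_addresses=()):
    """Process filter commands against one level one cache backed by level two.

    commands: iterable of (op, address, value) with op in {"read", "write"}.
    l2_store: dict standing in for the level two cache system (mutated in place).
    shared_addresses: addresses in the area allocated as sharable.
    Returns (results, l1_store, visible_to_others).
    """
    # Allocate the defined area of level one memory that is sharable.
    shared = set(shared_addresses)
    l1_store = {}
    results = []

    for op, address, value in commands:
        # The filter issues a read or write command to its level one cache.
        if op == "read":
            if address not in l1_store:
                # Level one issues a read request to level two (assumed on miss).
                l1_store[address] = l2_store[address]
            results.append(l1_store[address])
        elif op == "write":
            l1_store[address] = value
            # Assumed write-through: level one forwards the write to level two.
            l2_store[address] = value
            results.append(None)
        else:
            raise ValueError(f"unsupported command {op!r}")

    # Lines in the sharable area are the ones other resources could read.
    visible_to_others = {a: v for a, v in l1_store.items() if a in shared}
    return results, l1_store, visible_to_others
```

A short command list such as a read, a write, and a read-back exercises each branch, and the returned `visible_to_others` mapping shows which level one lines fall inside the sharable area.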
- 3, and 4 can be implemented in software, firmware, or hardware, or using any combination thereof. If programmable logic is used, such logic can execute on a commercially available processing platform or a special purpose device.
- HDL hardware description language
- Verilog Verilog
- VHDL hardware description language
- the HDL-design can model the behavior of an electronic system, where the design can be synthesized and ultimately fabricated into a hardware device.
- the HDL-design can be stored in a computer product and loaded into a computer system prior to hardware manufacture.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US5749208P | 2008-05-30 | 2008-05-30 | |
PCT/US2009/003317 WO2009145919A1 (en) | 2008-05-30 | 2009-06-01 | Shader complex with distributed level one cache system and centralized level two cache |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2294571A1 (en) | 2011-03-16 |
EP2294571A4 (en) | 2014-04-23 |
Family
ID=41377446
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP09755282.2A Ceased EP2294571A4 (en) | 2008-05-30 | 2009-06-01 | Shader complex with distributed level one cache system and centralized level two cache |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP2294571A4 (en) |
JP (1) | JP5832284B2 (en) |
KR (1) | KR101427409B1 (en) |
CN (1) | CN102047316B (en) |
WO (1) | WO2009145919A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110471943A (en) * | 2018-05-09 | 2019-11-19 | 北京京东尚科信息技术有限公司 | Real time data statistic device and method and computer readable storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1498824A2 (en) * | 2003-06-30 | 2005-01-19 | Microsoft Corporation | System and method for parallel execution of data generation tasks |
US20050225558A1 (en) * | 2004-04-08 | 2005-10-13 | Ati Technologies, Inc. | Two level cache memory architecture |
US7103720B1 (en) * | 2003-10-29 | 2006-09-05 | Nvidia Corporation | Shader cache using a coherency protocol |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10232825A (en) * | 1997-02-20 | 1998-09-02 | Nec Ibaraki Ltd | Cache memory control system |
US6629188B1 (en) * | 2000-11-13 | 2003-09-30 | Nvidia Corporation | Circuit and method for prefetching data for a texture cache |
JP3620473B2 (en) * | 2001-06-14 | 2005-02-16 | 日本電気株式会社 | Method and apparatus for controlling replacement of shared cache memory |
US7248585B2 (en) * | 2001-10-22 | 2007-07-24 | Sun Microsystems, Inc. | Method and apparatus for a packet classifier |
JP3840966B2 (en) * | 2001-12-12 | 2006-11-01 | ソニー株式会社 | Image processing apparatus and method |
US6871264B2 (en) * | 2002-03-06 | 2005-03-22 | Hewlett-Packard Development Company, L.P. | System and method for dynamic processor core and cache partitioning on large-scale multithreaded, multiprocessor integrated circuits |
US7069387B2 (en) | 2003-03-31 | 2006-06-27 | Sun Microsystems, Inc. | Optimized cache structure for multi-texturing |
JP4451717B2 (en) * | 2004-05-31 | 2010-04-14 | 株式会社ソニー・コンピュータエンタテインメント | Information processing apparatus and information processing method |
US7280107B2 (en) * | 2005-06-29 | 2007-10-09 | Microsoft Corporation | Procedural graphics architectures and techniques |
CN100451952C (en) * | 2005-12-19 | 2009-01-14 | 威盛电子股份有限公司 | Processor system of multi-tier accelerator architecture and method for operating the same |
JP4295814B2 (en) * | 2006-03-03 | 2009-07-15 | 富士通株式会社 | Multiprocessor system and method of operating multiprocessor system |
US20070211070A1 (en) * | 2006-03-13 | 2007-09-13 | Sony Computer Entertainment Inc. | Texture unit for multi processor environment |
US7965296B2 (en) | 2006-06-20 | 2011-06-21 | Via Technologies, Inc. | Systems and methods for storing texture map data |
US20080094408A1 (en) | 2006-10-24 | 2008-04-24 | Xiaoqin Yin | System and Method for Geometry Graphics Processing |
-
2009
- 2009-06-01 KR KR1020107029825A patent/KR101427409B1/en active IP Right Grant
- 2009-06-01 CN CN200980119830.3A patent/CN102047316B/en active Active
- 2009-06-01 EP EP09755282.2A patent/EP2294571A4/en not_active Ceased
- 2009-06-01 WO PCT/US2009/003317 patent/WO2009145919A1/en active Application Filing
- 2009-06-01 JP JP2011511651A patent/JP5832284B2/en active Active
Non-Patent Citations (2)
Title |
---|
PARK S-J ET AL: "A RECONFIGURABLE MULTILEVEL PARALLEL GRAPHICS CACHE MEMORY WITH 75 GB/S PARALLEL CACHE REPLACEMENT BANDWIDTH", 2001 SYMPOSIUM ON VLSI CIRCUITS. DIGEST OF TECHNICAL PAPERS. KYOTO, JAPAN, JUNE 14 - 16, 2001; [SYMPOSIUM ON VLSI CIRCUITS], TOKYO : JSAP, JP, 14 June 2001 (2001-06-14), pages 233-236, XP001071986, DOI: 10.1109/VLSIC.2001.934250 ISBN: 978-4-89114-014-4 * |
See also references of WO2009145919A1 * |
Also Published As
Publication number | Publication date |
---|---|
KR20110015034A (en) | 2011-02-14 |
KR101427409B1 (en) | 2014-08-07 |
EP2294571A4 (en) | 2014-04-23 |
CN102047316B (en) | 2016-08-24 |
CN102047316A (en) | 2011-05-04 |
JP5832284B2 (en) | 2015-12-16 |
WO2009145919A1 (en) | 2009-12-03 |
JP2011523745A (en) | 2011-08-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8195882B2 (en) | Shader complex with distributed level one cache system and centralized level two cache | |
US11024007B2 (en) | Apparatus and method for non-uniform frame buffer rasterization | |
US8421794B2 (en) | Processor with adaptive multi-shader | |
KR20190100194A (en) | Forbidden Rendering in Tiled Architectures | |
US9811940B2 (en) | Bandwidth reduction using vertex shader | |
KR102006584B1 (en) | Dynamic switching between rate depth testing and convex depth testing | |
US9852536B2 (en) | High order filtering in a graphics processing unit | |
US7605825B1 (en) | Fast zoom-adaptable anti-aliasing of lines using a graphics processing unit | |
KR20190030174A (en) | Graphics processing | |
CN106575428B (en) | High order filtering in a graphics processing unit | |
US9530237B2 (en) | Interpolation circuitry and techniques for graphics processing | |
EP2297723A1 (en) | Scalable and unified compute system | |
US11748010B2 (en) | Methods and systems for storing variable length data blocks in memory | |
US10192349B2 (en) | Texture sampling techniques | |
US20120013629A1 (en) | Reading Compressed Anti-Aliased Images | |
US10089708B2 (en) | Constant multiplication with texture unit of graphics processing unit | |
EP2294571A1 (en) | Shader complex with distributed level one cache system and centralized level two cache | |
CN101339647B (en) | Method and system for processing texture samples with programmable offset positions | |
US20210304488A1 (en) | Sampling for partially resident textures | |
US20190371043A1 (en) | Method and system for smooth level of detail interpolation for partially resident textures |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| 17P | Request for examination filed | Effective date: 20101224 |
| AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK TR |
| AX | Request for extension of the european patent | Extension state: AL BA RS |
| DAX | Request for extension of the european patent (deleted) | |
| A4 | Supplementary search report drawn up and despatched | Effective date: 20140325 |
| RIC1 | Information provided on ipc code assigned before grant | Ipc: G06T 15/00 20110101ALI20140319BHEP; Ipc: G06F 12/08 20060101ALI20140319BHEP; Ipc: G09G 5/36 20060101AFI20140319BHEP; Ipc: G06T 1/60 20060101ALI20140319BHEP |
| 17Q | First examination report despatched | Effective date: 20170530 |
| RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: ADVANCED MICRO DEVICES, INC. |
| RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: ADVANCED MICRO DEVICES, INC. |
| REG | Reference to a national code | Ref country code: DE; Ref legal event code: R003 |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
| 18R | Application refused | Effective date: 20180925 |