GB2336086A - Optimised pixel/texel memory configuration for tile arrays - Google Patents


Publication number
GB2336086A
Authority
GB
United Kingdom
Prior art keywords
memory
pixel
data
pixels
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB9907701A
Other versions
GB2336086B (en)
GB9907701D0 (en)
Inventor
Scott Hartog
Michael Mantor
John Austin Carey
Thomas A Piazza
Ralph Clayton Taylor
Matthew Radecki
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Real 3D Inc
Original Assignee
Real 3D Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Real 3D Inc
Priority to GB0216637A (patent GB2374781B)
Priority to GB0216636A (patent GB2374780B)
Priority to GB0216638A (patent GB2374782B)
Publication of GB9907701D0
Publication of GB2336086A
Application granted
Publication of GB2336086B
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/60 Memory management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0207 Addressing or allocation; Relocation with multidimensional access, e.g. row/column, matrix
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
    • G11C7/1006 Data managing, e.g. manipulating data before writing or reading out, data bus switches or control circuits therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0215 Addressing or allocation; Relocation with look ahead addressing means
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C29/00 Checking stores for correct operation; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/04 Detection or location of defective memory elements, e.g. cell construction details, timing of test signals
    • G11C29/08 Functional testing, e.g. testing during refresh, power-on self testing [POST] or distributed testing
    • G11C29/12 Built-in arrangements for testing, e.g. built-in self testing [BIST] or interconnection details
    • G11C29/18 Address generation devices; Devices for accessing memories, e.g. details of addressing circuits
    • G11C2029/1806 Address conversion or mapping, i.e. logical to physical address

Abstract

A computer graphics system and a method of configuring data in a memory unit of a computer graphics system. Generally, the data is configured such that the number of memory page breaks is reduced when data is accessed from the memory for image computation. For example, when the memory is used to store pixel values, each page of the memory is comprised of pixel values for a rectangular or tile array of pixels. This increases the spatial coherence between the pixel values and the pixels of the polygons that are rasterized when the system renders an image. Preferably, a translation algorithm is provided to allow standard operating systems and software applications to work with the tiled configuration of the pixel values in the memory. This algorithm takes the scalar memory address initially provided by the operating system or the software application and translates that first scalar memory address to a second scalar memory address that will properly access the value for the pixel conventionally associated with the first scalar memory address.

Description

A LINEAR SURFACE MEMORY TO SPATIAL TILING ALGORITHM/MECHANISM
BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates generally to computer graphics systems. More specifically, this invention relates to methods and apparatus for storing data in, and accessing data from, memory devices in computer graphics systems.
2. Description of the Prior Art
Many modern computer systems are able to display complex three-dimensional objects on display devices that are controlled by the computer systems, and often these complex objects are displayed interactively to allow the computer user to manipulate the object images. Well known graphics techniques, such as hidden surface algorithms, clipping, texture mapping, polygon filling and coordinate transformations, may be used to generate the images on a suitable device, such as a CRT video display that is controlled by the computer system.
Typically, these display devices include a display surface comprised of a matrix of pixels that are illuminated at controlled intensities to produce the images. Conventionally, in the operation of these computer systems, values for these pixels are calculated or determined and then stored in a frame buffer. More particularly, a polygon representation of the object to be displayed is converted to a raster scanned image that is stored in the frame buffer. When an image is actually displayed, or rendered, on the display device, values for the pixels in the image are obtained, one value at a time and in a specified order, from the frame buffer. The pixels are then illuminated at these values to produce the image on the display device.
In the operation of these computer graphics display systems, various types of data are stored and accessed from memory units or devices, and many of these memory units are random access memory devices such as DRAMs. Traditional configurations for holding data in DRAMs and traditional procedures for accessing those memories have some inherent inefficiencies that may slow the rate at which the computer systems process the images.
More specifically, a standard DRAM has a matrix or array of data storage locations, arranged in a multitude of rows and columns, and each of these locations is capable of holding a respective data value. At the same time, DRAMs are physically organized as a series of contiguous subregions referred to as pages. For example, each row of data locations of a DRAM may form a respective page. In use, in order to obtain a data value from a particular data storage location, the subregion or page in which that data value is located must first be accessed or opened. Once a page is opened, data can then be obtained from the data storage location. If data is then needed from a second page, however -- a situation referred to as a page break or a page switch -- that second page must be opened, and then that data can be obtained.
A certain amount of time, often expressed as a number of clock cycles, is needed just to open a page. For instance, four or five clock cycles might be needed to open a page, while only one or two clock cycles might be required to obtain data from an open page. As a result of this, when data are needed from a large number of different pages, a significant amount of time can be used simply to open the pages of the memory.
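The cost described above can be illustrated with a small model. This is only a sketch: the function name `access_cycles` is hypothetical, and the figures of five cycles to open a page and one cycle per read are picked from the illustrative ranges given in the text, not from any particular DRAM.

```python
def access_cycles(page_sequence, open_cycles=5, read_cycles=1):
    """Total cycles to read one value from each page listed in page_sequence.

    A page-open penalty is charged whenever the requested page differs from
    the page that is currently open (including the very first access).
    """
    total = 0
    open_page = None
    for page in page_sequence:
        if page != open_page:
            total += open_cycles  # page break: open the new page
            open_page = page
        total += read_cycles      # read from the now-open page
    return total


# Eight reads from one page: one open plus eight reads.
same_page = access_cycles([0] * 8)
# Eight reads alternating between two pages: every read pays the open penalty.
alternating = access_cycles([0, 1] * 4)
```

Under these assumed costs, the alternating pattern spends most of its time opening pages rather than reading data.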
A large number of memory page breaks -- with the associated, above-described loss of time -- may happen during several of the procedures or routines that are used in the operation of a computer graphics system. For example, this often happens when pixel data are obtained from the frame buffer to render an image on the display. This is because of a combination of several specific reasons. A first of these reasons is the standard spatial relationship between the pixels on the display device and the pixel values in the frame buffer. More particularly, with conventional computer graphics display systems, the organization of the pixels on the display device is very similar to the organization of the pixel data in the frame buffer -- both involve a matrix pattern comprised of a multitude of rows and columns. In the frame buffer, the pixel values are stored in this matrix pattern; while on the display device, the pixels themselves are arranged in this matrix pattern.
In addition, generally, the data for successive pixels of the video display are stored in successive data storage locations in the frame buffer. This type of memory organization is termed linear and is well known in the field. Commonly, for instance, the data for each complete row of pixels on the image (or display) is stored in one or more successive rows of data storage locations in the frame buffer, and the data for successive lines of pixels are stored in successive lines of data locations in the frame buffer.
Moreover, when pixel values are obtained for image computation from the frame buffer, a rasterization procedure is used to do this. With this procedure, the object to be displayed is represented as a series of polygons, each of which covers a group of pixels. Each of these polygons is scanned line by line; and values are obtained for the pixels in the polygon in the order in which the pixels are encountered as the polygon is scanned -- that is, line by line, and within each line, in a linear order such as left to right. Each of these polygons may extend over a number of lines of pixels, and in addition, the polygon representation of many objects is comprised of a large number of such polygons. Because of this, and because of the above-discussed conventional design and operation of DRAM frame buffers, a substantial number of page breaks may occur when accessing pixel data from a DRAM frame buffer to render an image of an object on the display device.
A large number of memory page breaks, with the associated inefficiencies, may also occur when a texture mapping procedure is used to determine, or to help determine, pixel values. To elaborate, texture mapping is a commonly employed technique for adding detail in computer graphics rendering to achieve a high degree of realism in the rendered image. In this procedure, an array of data values that represent a texture or surface appearance of an object are stored in a memory device, which often is a DRAM. The data values, often referred to as texels, are such that if a corresponding array of display pixels are illuminated at these values, then that pixel array shows that texture or surface appearance.
This array of data values, or texels, that is stored in memory is referred to as a texture map or a texture, and these data values are organized in this memory in a linear manner. When texture mapping is used, a rasterization pattern is also employed to locate and fetch data values from the texture map. In particular, texel locations are related to pixel locations through a perspective mapping function. For example, assume pixel coordinates are x and y, and texture coordinates are u and v. u and v may be determined from x and y by means of known translation functions, represented as F_u() and F_v(). Mathematically this is expressed as: u = F_u(x, y) and v = F_v(x, y). The rasterization pattern in the texture map is not linear but is related to a linear rasterization pattern through these functions. These texture values are then used to calculate, or to help calculate, the pixel values that are then stored in the frame buffer.
As a result of the above-mentioned perspective mapping function, a horizontal scan line across a polygon of pixels is, typically, mapped into an angled scan line across the texture map -- that is, a scan line that is not horizontal. Such an angled scan line may cross a multitude of rows -- and thus pages -- of the texture memory; and as individual texels are accessed along that scan line, numerous page breaks may occur.
Several prior art patents describe memory architectures designed to improve memory bandwidth efficiency. These include U.S. Patents 5,675,387; 5,517,609; 5,251,293; 5,131,080; 5,056,044; 4,965,751; 4,935,880; and 4,755,810. For example, U.S. Patent 5,675,387 describes a tiled memory organization for efficient DRAM access in a video decompression processor; and U.S. Patent 5,517,609 describes a tiled memory organization in a VRAM architecture. Another memory architecture is disclosed in the article "The Design and Analysis of a Cache Architecture for Texture Mapping," Computer Systems Laboratory, Stanford University, Stanford California, 1997. While each of these architectures has its own benefits and advantages, further improvements and efficiencies are highly desirable.
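The effect of an angled scan line on a linearly organized texture can be sketched as follows. The mapping used here is a simple linear stand-in for the perspective functions, chosen only for illustration; the function name `texel_rows_touched` and the slope parameters `du_dx` and `dv_dx` are hypothetical and do not come from the source.

```python
def texel_rows_touched(n_pixels, du_dx, dv_dx, u0=0.0, v0=0.0):
    """Count how many times the texel row changes (plus one) along a scan line.

    Each display pixel x maps to texture coordinates
    (u, v) = (u0 + du_dx * x, v0 + dv_dx * x). With one texture row per
    memory page, every change of row is a page break.
    """
    rows = []
    for x in range(n_pixels):
        v = v0 + dv_dx * x
        row = int(v)  # texel row holding this sample
        if not rows or rows[-1] != row:
            rows.append(row)
    return len(rows)


# A horizontal line in texture space stays in one row (one page).
horizontal = texel_rows_touched(64, du_dx=1.0, dv_dx=0.0)
# An angled line advancing half a texel row per pixel crosses many rows.
angled = texel_rows_touched(64, du_dx=1.0, dv_dx=0.5)
```

With these assumed slopes, the angled line visits far more rows, and hence pages, than the horizontal one over the same number of pixels.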
SUMMARY OF THE INVENTION
An object of this invention is to store data in a memory unit of a computer graphics system in a way that improves the efficiency of the system.
Another object of the present invention is to write data into and to read data from a memory unit of a computer graphics system in a manner that improves the efficiency of the system.
A further object of this invention is to provide a computer graphics system with an improved tile rasterization procedure.
Another object of this invention is to organize data values for a spatial group of pixels from a surface (rectangular region) into linear memory to reduce page misses and to increase cache coherency and efficiency when rendering three-dimensional polygons in a span or area based manner.
Another object of the present invention is to reduce the number of memory page breaks that occur, in the operation of a computer graphics system, when that system is used to render an image.
An object of this invention is to reduce the number of memory page breaks that occur in the operation of a computer graphics system during texture mapping.
These and other objectives are attained with a computer graphics system, and with a method of configuring data in a memory unit of a computer graphics system. Generally, the data is configured such that the number of memory page breaks is reduced when data is accessed from the memory for image computation. For example, when the memory is used to store pixel values, each page of the memory is comprised of pixel values for a rectangular or tile array of pixels -- that is, an array of pixels on the display surface at least several pixels wide and at least several pixels high. This increases the spatial coherence between the pixel values and the pixels of the polygons that are rasterized when the system renders an image.
Preferably, a translation algorithm is provided to allow standard operating systems and software applications to work with the tiled configuration of the pixel values in the memory. This algorithm takes the scalar memory address that is initially provided by the operating system or the software application, and that is intended for use with a memory in which the pixel values are linearly arranged, and translates that first scalar memory address to a second scalar memory address that will properly access the value for the pixel conventionally associated with the first scalar memory address. In this way, this algorithm, in effect, makes the tiled organization of the pixel values in the memory look linear to the operating systems and software applications.
Further benefits and advantages of the invention will become apparent from a consideration of the following detailed description, given with reference to the accompanying drawings, which specify and show preferred embodiments of the invention.
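The translation can be sketched in software under stated assumptions. The surface width of 1024 locations and the tile size of 64 by 16 locations are borrowed from the specific embodiment described later in this document; the function name `linear_to_tiled` is hypothetical, and this arithmetic is only one plausible way to perform the mapping, not necessarily the circuit of the figures.

```python
def linear_to_tiled(linear, width=1024, tile_w=64, tile_h=16):
    """Translate a linear (row-major) address into a tiled address.

    Assumes the surface width is a whole number of tiles and that tiles are
    numbered left to right, then top to bottom, with row-major order inside
    each tile.
    """
    y, x = divmod(linear, width)         # recover pixel row and column
    tiles_per_row = width // tile_w
    tile_row, in_y = divmod(y, tile_h)   # which tile row, and row inside it
    tile_col, in_x = divmod(x, tile_w)   # which tile column, and column inside it
    tile = tile_row * tiles_per_row + tile_col
    return tile * tile_w * tile_h + in_y * tile_w + in_x
```

For example, linear address 64 (column 64 of the first row) falls in the second tile and maps to tiled address 1024, while linear address 1024 (column 0 of the second row) stays in the first tile and maps to tiled address 64. Because the mapping is one-to-one, software that issues linear addresses reaches every pixel value exactly once.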
Brief Description Of The Drawings
Figure 1 schematically illustrates a computer graphics system that may be used to embody the present invention.
Figure 2 illustrates another computer graphics system that may also be used to embody this invention.
Figure 3 shows a conventional memory organization.
Figure 4 illustrates a pixel array of a video display of the computer graphics system of Figure 2.
Figure 5 shows a procedure for rendering two polygons using the memory organization of Figure 3.
Figure 6 illustrates a memory separated into tiles.
Figure 7 depicts a procedure for rendering two polygons using the tiled memory of Figure 6.
Figure 8 shows another procedure for rendering the two polygons using the tiled memory of Figure 6.
Figure 9 depicts a third procedure for rendering the two polygons using the tiled memory of Figure 6.
Figure 10 shows the memory configuration of Figure 6 and conceptually illustrates various values used to calculate an address referred to as a tiled address.
Figure 11 shows a representation of a linear memory configuration separated into regions by boundaries referred to as fences.
Figure 12 illustrates a hardware circuit for translating a memory address, intended for use with the memory configuration of Figure 3, into an address for use with a memory in which the pixel values are stored in a tiled configuration.
Figure 13 is a more detailed drawing of a comparator unit of the circuit of Figure 12.
Figure 14 is a more detailed drawing of an address computation unit of the circuit of Figure 13.
Figure 15 shows an alternate address computation unit.
Figure 16 illustrates a polygon mapped onto a texture map by means of a perspective mapping function.
Detailed Description Of The Preferred Embodiments
Computer system 10 illustrated in Figure 1 includes a bus 12 for communicating information, a processor 14 coupled with the bus for processing information, and a memory 16 such as a RAM that is coupled with the bus for storing information and instructions for the processor. System 10 further includes video display device 20, such as a CRT raster scan device, and a data storage device 22, such as a magnetic disc, coupled with the bus 12 that is also used to store information and instructions.
Alternative computer systems having specifically designed graphics engines are well known in the art. Commonly, these alternative computer systems modify the system of Figure 1 by incorporating a specialized graphics subsystem that includes a graphics processor, a dedicated frame buffer, often in the form of video DRAM, and a video display.
Figure 2 shows an example of a computer system 30 having a graphics subsystem 32. In this computer system 30, input image data from the main processor 14 are communicated over bus 12 and bus 34 to the graphics processor 36. This image data are typically in the form of graphics primitives such as lines, points, polygons or character strings. The graphics processor 36 receives that input image data from the main processor 14 and uses that data to create a complete image data set utilizing well known graphics techniques such as scan conversion, clipping, Gouraud shading and hidden surface algorithms.
The image data developed by the graphics processor 36 is stored in graphics RAM 40, which typically includes the frame buffer 42. Graphics processor 36 addresses the graphics RAM 40 and supplies the video information over bus 44. Periodically, the output of the graphics RAM is read and sent to a digital to analog converter 46 and then to a video display device 52 or to other raster scan display devices. More specifically, display device 52 includes a display surface 54 comprised of a matrix or grid of pixels that are illuminated in a controlled manner to form an image. These pixels are illuminated at colors and intensities determined by the data that was assembled in, and then read from, frame buffer 42.
As will be understood by those of ordinary skill in the art, computer systems 10 and 30 may include more elements than are expressly shown in Figures 1 and 2 and described herein in detail. In addition, the individual elements shown in Figures 1 and 2 may be conventional items used in the manner described herein. Also, computer systems of the type shown in Figure 2 are more appropriate for graphics intensive processing, and thus may be preferred for implementing or embodying the present invention. The present invention may be embodied in other computer systems as well, however.
Figure 3 illustrates a conventional memory device 60 that may be used as frame buffer 42, and a conventional arrangement for holding or storing the image data in the frame buffer. More specifically, Figure 3 shows a grid pattern that represents memory device 60, and each small rectangle 62 of the grid represents a data storage location. Thus, as shown in this Figure, the memory 60 comprises a matrix of columns and rows of data storage locations 62. These memory locations 62 are numbered, and are considered as being arranged, in a successive or consecutive order starting at the upper left corner of the memory and proceeding left to right across each row, and downward from row to row. The first location is at the base address of frame buffer 42; however, it is commonly referenced as a relative address at position 0.
In addition, with memory 60, each row of data storage locations forms one page. The pages of memory 60 are contiguous and consecutively numbered, and each page represents a fixed number of data storage locations or positions. In addition, in memories of the type represented in Figure 3, each data location 62 may hold one or more bytes of data, and thus may be referred to as being multiple bytes long or wide. As a typical example, each data location 62 may hold two bytes of data.
When the memory 60 shown in Figure 3 is used as the frame buffer 42 of system 30 of Figure 2, there is a one-to-one correspondence between data storage locations 62 in the graphics memory and the pixels on the display device 52 -- that is, a value for each pixel is held in a respective one storage location 62, and each storage location 62 contains a value for a respective one pixel. Also, values for successive pixels of the video display are stored in successive data storage locations in the memory 60. This type of organization of the pixel values in memory 60 is referred to as linear.
As a result of this arrangement for storing the pixel data in memory 60, there is also a direct spatial correspondence between the specific location of each pixel on display surface 54 and the specific location in memory 60 at which the value for that pixel is held. In particular, the position of each pixel relative to the grid of pixels on the display surface 54 is identical to the position of the memory location, at which the value for that pixel is held, relative to the grid of locations 62 in the memory 60. This direct correspondence may be understood by referring to Figure 3 and Figure 4, which illustrates the grid of pixels 64 on the display surface 54. Thus, for example, the value for pixel 64a is held in memory location 62a, and likewise the values for pixels 64b and 64c are held in memory locations 62b and 62c respectively.
With, for example, a specific embodiment described herein in detail, each row of data storage locations 62 holds the data for a respective one row of pixels 64 of the display surface 54. Thus, with this embodiment, for example, the number of pixel values in each page of memory 60 is equal to the number of pixels 64 in one row, or horizontal line, on the display surface 54. Other arrangements may be used, however. For instance, if the memory pitch (or width) is wider than the storage of a DRAM page, then two or more DRAM pages can be used per line. Fractional amounts are also possible. For example, 2.5 memory pages may be used per line of the image where the pages are split across image rows. The reverse argument is also true. If the memory pitch (or width) is smaller than the storage of a DRAM page, one DRAM page may span across two or more image pixel rows. Fractional amounts are also possible here.
Although the memory locations 62 in memory 60 and the pixels 64 on display surface 54 are both arranged in grid formats, different addressing systems are conventionally used to identify, on the one hand, the positions of individual data locations in a memory device, and on the other hand, the positions of individual pixels on a pixel display. Generally, a one-coordinate, or scalar, addressing system is used to identify specific memory areas, and a two-coordinate, rectangular addressing system is used to identify specific pixels.
More particularly, as mentioned above, the memory locations are numbered consecutively, starting at the upper left corner of the memory, and proceeding left to right across each row, and downward from row to row. Thus, for example, with reference to Figure 3, if the memory location at the upper left corner of memory 60 is "0," then the memory location at the upper right corner of the memory is "1023." The memory location at the left end of the second row from the top of the memory is "1024," and the memory location at the right end of this second row is "2047." In contrast, the individual pixels on the display device are identified by the column and row in which the pixel is located. In this addressing system, the columns are consecutively numbered from left to right, and the rows are consecutively numbered from top to bottom. For instance, with reference to Figure 4, the pixel at the top left corner of the display device has the address 0,0 -- that is, column number 0, and row number 0 -- and the pixel at the top right corner of the display has the address 1023,0 -- that is, column number 1023, and row number 0. The pixel at the left end of the second row of the display has the address 0,1, and the pixel at the right end of this second row has the address 1023,1.
Figures 3-5 illustrate a conventional procedure for linearly rasterizing two polygons using data from memory 60. In particular, Figures 3 and 5 show two polygons, A and B, laid over a portion of memory 60, and Figure 4 shows these same two polygons laid over the corresponding portion of the grid of pixels 64 of display surface 54. Because of the above-discussed spatial relationship between the positions of the pixels 64 on display surface 54 and the positions of data locations 62 in memory 60, the pixel value for each of the pixels covered by the polygons in Figure 4 is in the spatially corresponding data location of memory 60 of Figures 3 and 5. When a polygon is rasterized, data values for the pixels in the polygon are obtained one row at a time; and within each row, data values are obtained for the pixels, one pixel at a time, left to right across the row, as represented by the arrows in Figure 5.
As previously mentioned, with a specific embodiment described herein in detail, for example, each row of memory 60 is one page. With this example, as can be seen from Figure 5, whenever a new row of pixels in either of the polygons A and B is rasterized, a new memory page is accessed. A page switch, or break, occurs for each numbered directional arrow shown in Figure 5.
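The relationship between memory pitch and DRAM page size can be expressed as a simple ratio. The sketch below is illustrative only: the function name `pages_per_line` and the byte sizes used are assumptions, since the text does not state a page size.

```python
from fractions import Fraction


def pages_per_line(pitch_bytes, page_bytes):
    """Number of DRAM pages consumed by one image line, as an exact fraction."""
    return Fraction(pitch_bytes, page_bytes)


# Assumed sizes: a 2048-byte DRAM page (hypothetical).
one_page = pages_per_line(2048, 2048)      # line exactly fills one page
two_and_half = pages_per_line(5120, 2048)  # pitch wider than a page
half_page = pages_per_line(1024, 2048)     # one page spans two image lines
```

A result greater than one means several pages are opened per image line; a result below one means one page covers more than one image line.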
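The two addressing systems are related by simple arithmetic, which can be sketched as follows. The width of 1024 locations per row follows from the example addresses above; the function names are hypothetical.

```python
WIDTH = 1024  # locations per row, inferred from the example addresses in the text


def linear_address(x, y, width=WIDTH):
    """Scalar address of the pixel at column x, row y in a linear memory."""
    return y * width + x


def pixel_coords(address, width=WIDTH):
    """Inverse mapping: (column, row) of the pixel stored at a scalar address."""
    return address % width, address // width
```

With this sketch, the pixel at 1023,0 has scalar address 1023 and the pixel at 0,1 has scalar address 1024, matching the numbering described above.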
Specifically, twenty-seven page breaks occur when polygon A is rasterized, and twenty page breaks occur when polygon B is rasterized.
The present invention provides, among other features, a mechanism and a procedure for reducing the number of page breaks that occur as a polygon is rasterized.
Figures 6 and 7 depict a memory configuration and a rasterization procedure illustrating principles of this invention. More specifically, Figure 6 shows a grid pattern that represents a memory 70, and each small rectangle 72 of the grid represents a data storage location. This memory is shown as being separated into an array of smaller rectangular areas, referred to as tiles, comprised of a plurality of rows and a plurality of columns of the data storage locations 72, and the data in each of these tiles is stored in a respective one page of memory 70.
With the specific embodiment of memory 70 shown in Figure 6, each tile is 64 data locations wide and 16 data locations high. Also, this particular memory 70 is divided into 768 tiles, arranged in a matrix pattern 16 tiles wide by 48 tiles high. As will be understood by those of ordinary skill in the art, other specific dimensions may be used. If these dimensions of the tiles are varied, though, preferably they are varied in a dependent manner such that the data in each tile can be stored in one page of DRAM.
The memory locations 72 are numbered, and are considered as being arranged, in an order that is different from the order in which memory locations 62 of memory 60 are arranged. More specifically, with memory 70, the memory locations 72 are numbered, and are considered as being arranged, in a successive or consecutive order within each tile, starting at the upper left corner of the tile, and proceeding left to right across each row in the tile and downward from row to row. Thus, for instance, if the data location at the upper left corner of tile 0 is assigned address number 0, then the data location at the upper right corner of this tile 0 is assigned address number 63. Address number 64 is assigned to the leftmost data location in the second row of tile 0, and the data location at the right end of this second row of tile 0 is assigned address number 127.
From tile to tile, the address values increase in the order in which the tiles are numbered in Figure 6 -- that is, from left to right across each row of tiles, and downward from tile row to tile row. With this arrangement, for example, if the data location at the bottom right corner of tile 0 is address number 1023, then address number 1024 is assigned to the data location at the top left corner of tile 1.
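The numbering just described can be sketched as a function from pixel coordinates to a tiled address. The tile dimensions and the 16-tile row width are those of the embodiment of Figure 6; the function name `tiled_address` is hypothetical, and the arithmetic is a plain software rendering of the numbering rule, not the hardware of the later figures.

```python
TILE_W, TILE_H = 64, 16        # tile size in data locations (Figure 6 embodiment)
TILES_PER_ROW = 16             # tiles across the memory (Figure 6 embodiment)
TILE_SIZE = TILE_W * TILE_H    # 1024 locations, one DRAM page per tile


def tiled_address(x, y):
    """Address of the location holding pixel (x, y) in the tiled numbering."""
    tile_col, in_x = divmod(x, TILE_W)   # tile column and column inside the tile
    tile_row, in_y = divmod(y, TILE_H)   # tile row and row inside the tile
    tile_index = tile_row * TILES_PER_ROW + tile_col
    return tile_index * TILE_SIZE + in_y * TILE_W + in_x
```

This reproduces the example values in the text: the bottom right corner of tile 0 is address 1023, and the top left corner of tile 1 is address 1024.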
Memory 70 does have the same direct spatial correspondence as memory 60 has between, on the one hand, the specific location of each pixel on display surface 54 and, on the other hand, the specific memory location in the memory at which the value for that pixel is held. Thus, the position of each pixel relative to the grid of pixels on the display surface 54 is identical to the position of the memory location, at which the value for that pixel is held, relative to the grid of locations 72 in the memory 70. For example, with reference to Figures 4 and 6, the values for pixels 64a, 64b and 64c are held in memory locations 72a, 72b and 72c respectively.
Generally, the size and dimensions of the tiles of memory 70 are selected by grouping together data locations 72 in a way that increases the spatial coherence between the data locations and the pixels of a polygon to be rasterized. In particular, the data locations 72 are grouped together so that the dimensions of the tiles formed by the data storage locations more closely match the dimensions of the polygon to be rasterized using data from the tiles. The effect of grouping the data storage locations 72 in this way is to minimize, or at least to reduce, the number of page breaks that occur when accessing the memory storage locations while rasterizing the polygon.
Figure 7 illustrates how this reduction is achieved. In particular, Figure 7 illustrates a procedure for linearly rasterizing polygons A and B using data from memory 70. This Figure shows polygons A and B laid over an area of memory 70, and this area of the memory includes portions of six tiles or pages of data. Here also, each data location shown inside a polygon in Figure 7 can be considered to represent illustratively, and to hold the pixel value for, a pixel that would be inside the polygon on the display surface 54. The numbered arrow segments in Figure 7 represent scan lines or scan line segments across the polygons and indicate the order in which data values for the pixels inside each polygon are obtained when the polygons A and B are rasterized.
It is apparent from Figure 7 that, as a consequence of the tiled configuration of the pages of memory 70, only some of the scan lines cause page breaks. This is in clear contrast to the conventional, prior art procedure illustrated in Figure 5. More specifically, for polygon A, no page breaks occur for scan lines 1-6, and the first page break happens between scan line segments 6 and 7. This break occurs because different DRAM pages are accessed to obtain data from tile 0 and tile 1. A second page switch occurs between scan line segments 7 and 8 because tile 0 must be re-accessed to retrieve data for the pixels on scan line segment 8.
Page breaks occur after scan line segment 8 and after each of the next twenty-four scan line segments. Then, no breaks occur for the next seven scan lines, and a final page break happens after scan line 38. Thus, this combination of tiled memory organization and linear rasterization of polygon A results in a total of twenty-seven page breaks. For polygon B, during the rasterization of the entire polygon, only one page break occurs, between scan lines 9 and 10.
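The effect described for Figure 7 can be illustrated with a small counting experiment. The following is only a sketch under stated assumptions: the function names are hypothetical, one page is assumed to hold 1024 data locations, the linear memory is assumed to have a pitch of 1024 (so that each row occupies its own page), and the tiles are assumed to be 64 locations wide and 16 lines high with 16 tiles per tile row, as in the worked example given later in this description. A plain rectangle stands in for a polygon.

```python
PAGE_SIZE = 1024  # assumed number of data locations per DRAM page


def linear_address(x, y, pitch=1024):
    # Linear organization: successive pixels of a row are adjacent in memory.
    return y * pitch + x


def tiled_address(x, y, tile_w=64, tile_h=16, tiles_per_row=16):
    # Tiled organization: every 64 x 16 tile fills exactly one page.
    tile_index = (y // tile_h) * tiles_per_row + (x // tile_w)
    return tile_index * tile_w * tile_h + (y % tile_h) * tile_w + (x % tile_w)


def count_page_breaks(pixels, address_fn):
    # A page break is counted whenever two consecutive accesses hit different pages.
    breaks = 0
    prev_page = None
    for x, y in pixels:
        page = address_fn(x, y) // PAGE_SIZE
        if prev_page is not None and page != prev_page:
            breaks += 1
        prev_page = page
    return breaks


def scanline_order(x0, y0, width, height):
    # Linear rasterization: complete rows, left to right, top to bottom.
    return [(x, y) for y in range(y0, y0 + height) for x in range(x0, x0 + width)]


# A 20 x 12 rectangle that lies entirely inside tile 0.
rect = scanline_order(10, 2, 20, 12)
linear_breaks = count_page_breaks(rect, linear_address)  # one break per new row
tiled_breaks = count_page_breaks(rect, tiled_address)    # the rectangle never leaves its page
```

Under these assumptions the linear organization incurs a page break at every new row, while the tiled organization incurs none as long as the shape stays inside a single tile.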
Figure 8 illustrates an alternate rasterization procedure, referred to as span rasterization. With this procedure, instead of continuously scanning across complete rows of the polygons, each polygon is scanned in relatively short horizontal segments. More particularly, with the embodiment of memory 70 shown in Figure 8, each tile, or page, of the memory is comprised of a matrix of equal length spans 74 that are arranged in a plurality of rows and columns. Each of these spans, in turn, is comprised of a rectangle, such as a four by four rectangle, of data storage locations 72. With these dimensions, as shown in Figure 8, each of the tiles is comprised of 64 of these spans, positioned in a 16 by 4 matrix.
The numbered arrow segments in Figure 8 represent scan lines or scan line segments and indicate the order in which values for the pixels are obtained when the polygons A and B are rasterized. The rasterization procedure starts at the span that contains the topmost vertex of the polygon to be rasterized. This span is referred to as the start span, and each pixel that is covered by the polygon in this start span is rasterized. Upon completion of the processing of this first span, the method will proceed to the adjacent span to the right on the condition that at least one pixel in that adjacent span is covered by the polygon and, therefore, requires rasterization. If none of the pixels in that adjacent span are covered by the polygon, then the procedure moves down one row of spans and to the leftmost span in that next, lower row that contains at least one pixel covered by the polygon. Processing continues in a similar fashion, span by span, from left to right, as illustrated in Figure 8, for each row of spans that are covered, completely or partially, by the polygon. Span processing may also be right to left on a span line or a combination of left to right and right to left.
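The span-by-span order just described can be sketched as follows. This is a simplified illustration, not the procedure of Figure 8 itself: the function name and the set-of-covered-pixels input are hypothetical, the spans are assumed to be four by four, and the sketch simply visits every span that contains a covered pixel, row of spans by row of spans, left to right, rather than walking outward from a start span.

```python
def span_order(covered, span_w=4, span_h=4):
    """Order covered pixels span by span: rows of spans top to bottom, spans
    left to right within a row, and pixels row by row inside each span."""
    spans = {}
    for x, y in covered:
        # Key each pixel by (span row, span column); sorting the keys then
        # gives the top-to-bottom, left-to-right span order.
        spans.setdefault((y // span_h, x // span_w), []).append((x, y))
    order = []
    for key in sorted(spans):
        # Inside one span, scan its short row segments from top to bottom.
        order.extend(sorted(spans[key], key=lambda p: (p[1], p[0])))
    return order


# Hypothetical right triangle: row y covers pixels x = 0..y, for y = 0..7.
triangle = {(x, y) for y in range(8) for x in range(y + 1)}
order = span_order(triangle)
```

Spans that contain no covered pixel never appear in the dictionary, which mirrors the rule that a span is only processed when at least one of its pixels requires rasterization.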
With this procedure, as Figure 8 shows, for polygon A, the first page switch does not occur until the rasterization procedure reaches the end of scan line segment 19. At this point, a tile boundary is crossed, and the rasterization procedure proceeds to scan line segment 20. The next page break does not occur until the rasterization procedure reaches the end of scan line segment 23, and only nine page breaks occur during the entire rasterization of polygon A. A similar analysis of the rasterization of polygon B shows that the entire polygon is completely rasterized with only one page break.
Figure 9 illustrates a further rasterization procedure, referred to as tile aware rasterization. Tile aware rasterization can be considered to be a special case of span rasterization where the size of a span equals that of a tile. Alternately, tile aware rasterization could mean a span rasterization pattern that processes all of the spans within a tile before moving to the spans in another tile. As Figure 9 shows, when rasterizing polygon A using this procedure, the first page break occurs after rasterizing scan line segment 9, the next page break happens after rasterizing scan line segment 13, and a total of only four page breaks occur during the rasterization of the entire polygon.
Also, the entire polygon B is rasterized with only one page break. Thus, in comparison to the conventional procedure illustrated in Figure 5, the tile aware rasterization procedure reduces the number of page breaks needed to rasterize polygons A and B by 22 and 10 respectively.
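The benefit of finishing one tile before moving to the next can be shown with a small comparison. The sketch below is illustrative only: the function names are hypothetical, the tiles are assumed to be 64 by 16 with 16 tiles per tile row, each tile is assumed to fill one page, and a rectangular block straddling a tile boundary stands in for a polygon.

```python
def page_of(x, y, tile_w=64, tile_h=16, tiles_per_row=16):
    # Each tile fills exactly one page, so the tile number is the page number.
    return (y // tile_h) * tiles_per_row + (x // tile_w)


def page_breaks(order):
    pages = [page_of(x, y) for x, y in order]
    return sum(1 for a, b in zip(pages, pages[1:]) if a != b)


def scanline_order(covered):
    # Linear rasterization: complete rows, regardless of tile boundaries.
    return sorted(covered, key=lambda p: (p[1], p[0]))


def tile_aware_order(covered, tile_w=64, tile_h=16):
    # Finish every covered pixel of one tile before moving to the next tile.
    return sorted(covered, key=lambda p: (p[1] // tile_h, p[0] // tile_w, p[1], p[0]))


# An 8 x 8 block straddling the vertical boundary between tile 0 and tile 1.
block = {(x, y) for y in range(4, 12) for x in range(60, 68)}
scan_breaks = page_breaks(scanline_order(block))    # crosses the boundary on every row
tile_breaks = page_breaks(tile_aware_order(block))  # crosses the boundary once
```

With the tile aware ordering the number of page breaks is one less than the number of tiles touched, which is the lower bound for any visiting order.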
In memory 70, unlike in memory 60, values for successive pixels on display surface 54 are not necessarily arranged in successively numbered memory locations 72. There are two basic reasons for this. The first reason is the direct spatial correspondence (which is the same with memory 60) between the locations of the pixels on display surface 54 and the locations at which the pixel values are held in memory 70. The second of these reasons is the manner (which is different than in memory 60) in which the data locations in memory 70 are numbered.
Thus, for example, the value for pixel (63,0) is located at memory address 63 in memory 70; however, the value for the next pixel, pixel (64,0), is not located at memory address 64, but instead is located at memory address 1024, which is the first memory location in tile 1 of memory 70. Memory address 64 in memory 70, which is the first memory location in the second row of tile 0, holds the value for pixel (0,1). In contrast to the linear arrangement of the pixel values in memory 60, the type of organization of the pixel values in memory 70, relative to the memory addresses, is referred to as a tiled organization or configuration.
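The numbering in this example can be reproduced with a few lines of code. This is a minimal sketch: the function name is hypothetical, the tile is taken to be 64 locations wide and 16 lines high (1024 locations, consistent with tile 1 starting at address 1024), and the number of tiles per tile row is an assumption that does not affect the three pixels shown.

```python
TILE_W, TILE_H = 64, 16  # tile width (pitch) and tile height
TILES_PER_ROW = 16       # assumed; only matters below the first row of tiles


def tiled_offset(x, y):
    """Address of pixel (x, y) in the tiled numbering, relative to the start of memory."""
    tile_col, col_in_tile = divmod(x, TILE_W)
    tile_row, row_in_tile = divmod(y, TILE_H)
    tile_number = tile_row * TILES_PER_ROW + tile_col
    return tile_number * TILE_W * TILE_H + row_in_tile * TILE_W + col_in_tile


examples = {p: tiled_offset(*p) for p in [(63, 0), (64, 0), (0, 1)]}
```

Here `examples` maps pixel (63,0) to address 63, pixel (64,0) to address 1024, and pixel (0,1) to address 64, matching the addresses given above.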
Standard operating systems and software applications for computer graphics systems are designed and written, however, for the conventional linear arrangement of the pixel values of memory 60, not for the tiled arrangement of memory 70. In accordance with another aspect of this invention, an algorithm is provided to allow those standard operating systems and software applications to work with the tiled arrangement of memory 70. This algorithm, in effect, makes the tiled organization look linear to the operating systems and software applications, allowing those systems and applications to address and to use properly a memory having a tiled organization of the pixel values.
This algorithm may be used both to write data into the proper memory area and to read data properly from the memory. Generally, in either case, read or write, an address is converted from a linear address to a tiled address. More specifically, in accordance with this algorithm, a memory address provided by an operating system or a software application and intended for use with a memory, such as memory 60, in which the pixel values are linearly organized, is converted to an address, referred to as the tiled address, for use with a memory, such as memory 70, in which the pixel values have a tiled organization. Then, the appropriate data is written into or read from that tiled address.
To elaborate, in order to store or to obtain a value for a particular pixel, the operating system or the software application provides a scalar value identifying a specific memory address that conventionally holds the data for that pixel -- that is, the data for this particular pixel is stored at this memory address when the pixel values are linearly organized in the memory. Memory 60 of Figure 3 has this linear organization of the pixel values. Memory 70 of Figure 6 does not have this linear organization, however. When the pixel values have a tiled organization, as they do in memory 70, the memory address identified by the scalar value provided by the operating system or the software generally does not hold the value for the pixel conventionally associated with that memory address.
The algorithm of this invention translates the scalar memory address provided by the operating system or the software application to another scalar memory address. This latter scalar address identifies the location in a memory, in which the pixel values have a tiled organization, that holds the value for the pixel conventionally associated with the former scalar memory address. As will be understood by those of ordinary skill in the art, many different specific algorithms may be used to translate the initially provided scalar address value to the final desired scalar value. With the preferred embodiment of this algorithm, that initially provided scalar address is first converted to an x,y rectangular address, and then that rectangular address is converted back to a scalar address that identifies the appropriate data storage location in the tiled memory configuration.
This two-step translation process may be understood with reference to Figures 3, 6 and 10. Figure 10, like Figure 6, shows memory 70 comprised of a matrix of data locations 72. Figure 10 also identifies several parameters, discussed in more detail below, used to compute the tiled address. Memory locations 62d and 72d are identified in Figure 3 and in Figures 6 and 10, respectively. These locations are used to store the pixel value for the same pixel on display surface 54, even though the locations 62d and 72d have different addresses in the respective memories 60 and 70.
With the preferred embodiment of the address translation algorithm, the x,y rectangular address calculated in the translation process is the x,y address of the pixel on the display surface that is conventionally associated with the initially provided scalar address value -- that is, when the pixel values are stored as they are in memory 60. These x and y values, hence, also identify the column and row, respectively, of the location in that conventional memory organization that holds the value for that pixel. These x and y values are related to the scalar linear address by the following equation:
$$\text{Linear Address} = BA + (P_M \times y) + x \qquad (1)$$
where BA, referred to as the surface base address, is the scalar address of the first data location in memory 70, and $P_M$, referred to as the memory pitch, is the number of data locations in each row of memory 70. The hardware will perform equation (1) backwards to determine the x and y coordinates for use in the tiling algorithm. The hardware subtracts the surface base address from the linear address, and this result is divided by $P_M$. The integer portion of the quotient is the y coordinate, and the remainder is the x coordinate. In general terms:
$$y = \operatorname{int}[(\text{linear\_address} - BA)/P_M] \qquad (2)$$
$$x = \operatorname{rem}[(\text{linear\_address} - BA)/P_M] \qquad (3)$$
Thus, for example, with reference to Figure 10, if the scalar linear address of location 72d is 106,588, the surface base address is 65,536, and the memory pitch is 1024, then:
$$y = \operatorname{int}[(106{,}588 - 65{,}536)/1024] = \operatorname{int}[41{,}052/1024] = 40$$
$$x = \operatorname{rem}[(106{,}588 - 65{,}536)/1024] = \operatorname{rem}[41{,}052/1024] = 92$$
Once this x,y address is calculated, the tiled address is calculated. With reference to Figure 10, this is done, first, by determining the total number of data locations in the tiles that come before the tile 76 having the data location 72d, and second, by determining the number of data locations in that tile that come before location 72d. The sum of these two numbers is then added to the base surface address of memory 70 to obtain the tiled address for the data location 72d.
Mathematically, this is expressed in the following equation:
$$\text{tiled address} = BA + (O_{TR} + O_T) + (O_{PR} + O_P) \qquad (4)$$
where: BA is the address, referred to as the base address, of the very beginning of memory 70, $O_{TR}$ is the offset to the tile row -- that is, the total number of data locations in the tiles of the memory 70 that are above the row of tiles in which the pixel value is located, $O_T$ is the offset to the tile -- that is, the number of data locations in the tiles in the same row as, but before, the tile in which the pixel value is located, $O_{PR}$ is the offset to the pixel value row -- that is, the number of data locations in the rows of the tile that are above the row in which the pixel value is located, and $O_P$ is the horizontal offset to the pixel value -- that is, the number of data locations in the same row as, but before, the pixel value.
$O_{TR}$, $O_T$, $O_{PR}$ and $O_P$ are illustrated in Figure 10 and they are given by the following equations:
$$O_{TR} = Y_T \times P_T \times T_H \times P_{MT} \qquad (5)$$
$$O_T = T_H \times P_T \times X_T \qquad (6)$$
$$O_{PR} = Y_R \times P_T \qquad (7)$$
$$O_P = X_R \qquad (8)$$
where $T_H$ is the tile height in lines, $Y_T$ is the integer portion of $y/T_H$, $P_T$ is the width, or pitch, of the tile, and is the number of data locations in each row of the tile, and $P_{MT}$ is the pitch of the tiled memory in units of tiles. In other words, $P_{MT}$ is the number of tiles that make up the total horizontal dimension of the tiled memory. $X_T$ is the integer portion of $x/P_T$, $X_R$ is the remainder of $x/P_T$, and $Y_R$ is the remainder of $y/T_H$.
Because $P_{MT} \times P_T$ is the width of the tiled memory in units of storage, equation (5) could also be written as $O_{TR} = Y_T \times T_H \times P_M$, where $P_M$ is the pitch of the tiled memory in units of storage or simply the pitch of the memory.
Thus, for example, if x and y are 92 and 40, respectively, the tile height and pitch are 16 lines and 64 data locations, respectively, $P_{MT}$ is sixteen, and the base address is 65,536, then
$$\begin{aligned}
\text{tiled address} &= BA + (O_{TR} + O_T) + (O_{PR} + O_P) \\
&= BA + [(Y_T \times P_T \times T_H \times P_{MT}) + (T_H \times P_T \times X_T)] + [(Y_R \times P_T) + X_R] \\
&= 65{,}536 + [(\operatorname{int}(40/16) \times 64 \times 16 \times 16) + (16 \times 64 \times \operatorname{int}(92/64))] + [(\operatorname{rem}(40/16) \times 64) + \operatorname{rem}(92/64)] \\
&= 65{,}536 + [(2 \times 64 \times 16 \times 16) + (16 \times 64 \times 1)] + [(8 \times 64) + 28] \\
&= 65{,}536 + [32{,}768 + 1{,}024] + [512 + 28] \\
&= 65{,}536 + 33{,}792 + 540 \\
&= 99{,}868
\end{aligned}$$
Often, multiple pixel values are stored in one memory word; and, for instance, two, four or eight pixel values may be stored in one memory word. When pixel values are grouped in these words, it may be preferred, when writing the pixel values into the memory area, to write those values word-by-word. The pixel words are arranged in the memory area so that each tile of pixel values is a single page of the memory, but it is not necessary to re-arrange the pixel values within the memory word. Also, when pixel values are grouped in these words, memory addresses may be used to identify individual memory words, rather than individual pixel values.
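The two-step translation of equations (2) through (8), and the worked example for location 72d, can be sketched in code as follows. This is an illustrative sketch rather than the hardware implementation: the function and parameter names are hypothetical, one address is assumed to identify one pixel value, and the memory pitch is assumed to be a whole multiple of the tile pitch.

```python
def linear_to_tiled(linear_address, base, mem_pitch, tile_pitch, tile_height):
    """Translate a linear scalar address to the corresponding tiled scalar address."""
    # Equations (2) and (3): recover the row (y) and column (x) of the pixel.
    y, x = divmod(linear_address - base, mem_pitch)

    tiles_per_row = mem_pitch // tile_pitch  # P_MT, pitch of the memory in tiles
    y_t, y_r = divmod(y, tile_height)        # tile row, row within the tile
    x_t, x_r = divmod(x, tile_pitch)         # tile column, column within the tile

    o_tr = y_t * tile_pitch * tile_height * tiles_per_row  # equation (5)
    o_t = tile_height * tile_pitch * x_t                   # equation (6)
    o_pr = y_r * tile_pitch                                # equation (7)
    o_p = x_r                                              # equation (8)

    return base + (o_tr + o_t) + (o_pr + o_p)              # equation (4)


# Location 72d: linear address 106,588, base 65,536, memory pitch 1024,
# tile pitch 64, tile height 16.
tiled = linear_to_tiled(106_588, 65_536, 1024, 64, 16)
```

With these inputs the intermediate values are x = 92 and y = 40, and `tiled` evaluates to 99,868, in agreement with the calculation above.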
In many applications of computer graphics systems, the DRAM is separated into a multitude of separate regions or areas, each of which may be used differently than the others. These memory regions are sometimes referred to as surfaces. Commonly, most of these memory regions are associated with a respective area of the display surface 52, although some regions, referred to as off-screen surfaces, might not be directly associated with any area of the display surface.
An important purpose for providing the DRAM with these separate regions is to match memory pitch closely to surface pitch to conserve memory. Different surfaces in the system have different pitches. Therefore, areas of memory are desired to have different pitches. Preferably, in each of these memory regions, the data values stored in the region may be tiled or linearly organized, independent of how data values are organized in any of the other memory regions.
With the preferred embodiment of this invention, for any given application, the user is able, first, to determine the boundaries of these memory areas, and second, to indicate, for each of these memory areas, whether the organization of the data in the memory area is linear or tiled. Further, for each memory region that is, or is to be, tiled, values are provided for the parameters needed by the algorithm used to translate the linear address to the tiled address.
The boundaries of these regions are referred to as fences since, in effect, those boundaries form a fence around, or fence off, a region of the memory. In addition, the boundaries or fence for a particular memory region may be, simply, the addresses of the lowermost and uppermost data locations in the region. Figure 11 illustrates this use of fences to separate a standard linear memory configuration 120 into a group of regions 110a, 110b and 110c. As illustrated in this Figure, address A is less than address B, which is less than address C, which in turn is less than address D. A device driver will associate one fence with one or more of the surfaces.
In order to accommodate this feature, preferably each memory region that may be tiled is associated with a respective register that contains values identifying the boundaries of the memory region, a flag to indicate whether or not the memory region is tiled, and values for the parameters needed by the address translation algorithm. These latter values may include the surface pitch, the tile height, the tile pitch, and the base address of the memory region, which is also the lower bound of the memory region.
With this feature, when an operating system or software application provides a linear address, the graphics subsystem first determines which fence bounds the surface containing the data value for the memory address. Then, the values needed for the address translation algorithm are retrieved from the register associated with the memory region having that provided linear address. Using these values, the graphics subsystem calculates the tiled address, and the data can be read from or written to that tiled address.
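The fence lookup and the per-region choice between linear and tiled addressing can be sketched as follows. This is a software illustration under stated assumptions, not the register layout of any actual embodiment: the `Fence` class, its field names and the `translate` function are hypothetical, the upper bound is treated as inclusive, and an address that falls outside every fence is passed through unchanged.

```python
from dataclasses import dataclass


@dataclass
class Fence:
    lower: int            # lowest address of the region, also its base address
    upper: int            # highest address of the region (inclusive)
    tiled: bool           # flag: is the region organized in tiles?
    mem_pitch: int = 0    # surface pitch in data locations
    tile_pitch: int = 0   # tile width in data locations
    tile_height: int = 0  # tile height in lines


def translate(address, fences):
    """Return the address to use in memory for a linear address from software."""
    for fence in fences:
        if fence.lower <= address <= fence.upper:
            if not fence.tiled:
                return address  # linear region: nothing to translate
            y, x = divmod(address - fence.lower, fence.mem_pitch)
            y_t, y_r = divmod(y, fence.tile_height)
            x_t, x_r = divmod(x, fence.tile_pitch)
            tile_size = fence.tile_pitch * fence.tile_height
            tiles_per_row = fence.mem_pitch // fence.tile_pitch
            return (fence.lower
                    + (y_t * tiles_per_row + x_t) * tile_size
                    + y_r * fence.tile_pitch + x_r)
    return address  # outside every fence: pass through unchanged


fences = [
    Fence(lower=0, upper=65_535, tiled=False),
    Fence(lower=65_536, upper=65_536 + 1024 * 64 - 1, tiled=True,
          mem_pitch=1024, tile_pitch=64, tile_height=16),
]
```

For instance, `translate(106_588, fences)` falls inside the second, tiled region and returns 99,868, while `translate(100, fences)` falls inside the linear region and is returned unchanged.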
As those of ordinary skill in the art will understand, practical hardware limitations may limit the number of memory regions that may be tiled. In an embodiment of this invention that has been actually reduced to practice, for example, the system is restricted to storing translation parameters for up to seven regions of memory. In addition, in a typical use, the translation parameters for a particular fence exist in the system for the entire time that the surface is active in the memory within that fence. This is so because the system does not know when an application may access the surface. Thus, preferably, fences are created and assigned at the time the renderer surface is created, and released only when that surface is destroyed.
In the practice of this invention, software or hardware, or a combination of both, may be used to write data into and to read data from a memory having values arranged in a tiled configuration. Figure 12, for example, shows a hardware circuit 200 for performing the above-discussed translation algorithms. In the operation of this circuit, the initial, scalar linear address value is provided via input line 202 and applied to fence range comparator 204. If the scalar value does not fall into any of the active fence regions, the scalar address value is conducted to output line 206, via multiplexer 210.
In contrast, if the scalar value does fall into one of the active fence regions, then comparator 204 determines the particular memory region having the location identified by the input scalar linear address value, and whether tiling is active for that region. If tiling is not active for that memory region, then the initial scalar value is transmitted to output line 206, via multiplexer 210. If, though, tiling is active for the memory region, then the parameters needed for the translation algorithm are supplied to subtractor 212, divider 214 and computation device 216.
Subtractor 212 calculates the numerator of equation (2) -- that is, this subtractor subtracts the surface base address from the initial scalar address. This calculated value is then transmitted to divider 214. This divider 214 divides that value by the pitch of the memory and determines the x and y values discussed above in connection with equations (2) and (3). These x and y values are transmitted to address computation unit 216, which calculates the tiled address using equation (4), and this tiled address value is transmitted to output line 206 via multiplexer 210. It may be noted that the subtract and add of the base number may be omitted if the fence boundaries are restricted to multiples of the amount of memory in a row of tiles for a memory pitch in that fence.
Figure 13 illustrates more specifically the operation of the fence compare unit 204. This Figure shows two groups of registers 220 and two groups of comparators 222. Each group of the registers is associated with a respective memory region and holds, among other values, the upper and lower bounds of the memory region. Comparators 222 are used to compare the input, scalar address to these bounds to determine if that address is within one of these memory regions and, if so, which one. If these comparators determine that the input scalar address value is within one of the memory regions, a fence match signal is given over line 224, and the parameter values from the associated registers are supplied over line 226.
Figure 14 shows in more detail the operation of the tiled address computation unit 216. The unit receives the surface base address, the tile row size, and the x and y values, referred to as the surface x and the surface y values, determined by the divider 214. These values are used to determine the $O_{TR}$, $O_T$, $O_{PR}$ and $O_P$ values used in equation (4). Generally, the x coordinate is decomposed into sub-tile $X_R$ (pixel_column) and tile $X_T$ (tile_column). The y coordinate is decomposed into sub-tile $Y_R$ (pixel_row) and tile $Y_T$ (tile_row). The various pieces of the coordinates are then re-arranged (in terms of their positions within the surface relative address). The tile y piece is multiplied, at multiplier 230, by the surface pitch in tiles and added, at adder 232, into the other re-arranged pieces. The fence base address is then added into the tiled surface relative address, also at adder 232, to form the final absolute tiled address.
It may be noted that the specific diagram shown in Figure 14 is designed for use with tile heights and the tile widths that are both powers of 2. As will be understood, this is not necessary to the practice of the invention in its broadest sense. Those of ordinary skill in the art will be readily able to modify the pipeline expressly illustrated in Figure 2 to accommodate tile heights and tile widths that are not powers of 2. Also, when the memory pitch is a power of two, such as four, eight, or sixteen, then the multiplication performed by multiplier 230 can be done by a simple shift, as illustrated at 234 in Figure 15.
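When the tile width, the tile height and the memory pitch in tiles are all powers of two, the decomposition and re-arrangement of the coordinates reduce to masks and shifts. The following sketch illustrates the idea in software; it is not a description of the circuit of Figure 14 or Figure 15, and the function name and the log2 parameters are hypothetical.

```python
def tiled_offset_pow2(x, y, log2_tile_pitch, log2_tile_height, log2_tiles_per_row):
    """Surface-relative tiled address using only shifts, masks and one add."""
    x_r = x & ((1 << log2_tile_pitch) - 1)   # pixel column within the tile
    x_t = x >> log2_tile_pitch               # tile column
    y_r = y & ((1 << log2_tile_height) - 1)  # pixel row within the tile
    y_t = y >> log2_tile_height              # tile row

    # The multiply by the surface pitch in tiles becomes a shift.
    tile_number = (y_t << log2_tiles_per_row) + x_t

    # The three fields occupy disjoint bit ranges, so they can be OR-ed together.
    log2_tile_size = log2_tile_pitch + log2_tile_height
    return (tile_number << log2_tile_size) | (y_r << log2_tile_pitch) | x_r


# Tile pitch 64 (2**6), tile height 16 (2**4), 16 tiles per tile row (2**4).
offset = tiled_offset_pow2(92, 40, 6, 4, 4)
absolute = 65_536 + offset  # add the fence base address last
```

For x = 92 and y = 40 the offset is 34,332, so the absolute address is 99,868, the same value obtained with the multiply-and-add form of equation (4).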
As previously mentioned, various types of data may be stored in memory in a tiled organization or configuration. In particular, in addition to the pixel values discussed above, it may be advantageous to store texture data in a tiled organization. Commonly, texture data are stored, in a linearly organized way, in a memory area of the graphics subsystem. When that data are needed, that memory area is accessed, and texture values are obtained and used, in any of a number of ways, to add texture to a displayed object.
During this procedure, as explained above, a perspective mapping function is used to map pixels onto a texture map. An example of this procedure is generally illustrated in Figure 16, which shows a triangle 250 mapped, via a perspective mapping function, onto a texture map 252. The arrow segments 254 in Figure 16 represent horizontal scan lines across a pixel array mapped, via the perspective mapping function, onto the texture map. As can be seen, the line segments shown in Figure 16 cross numerous rows of texels. Hence, numerous page breaks occur as the texels along these lines are accessed. The number of these page breaks can be significantly reduced, or even eliminated, by writing and storing the texture values in a memory area in a tiled organization. It may be noted that utilizing a span rasterization pattern will further reduce the page misses for access to texture in tiled memory.
Any suitable procedure may be used to do this. For example, generally, the procedure may be similar to the procedure discussed above. Texture 252 may be separated into tiles, such as memory 70 of Figure 6, with the data in each tile being stored in a respective one page of the memory 252; and the texture value locations may be numbered or addressed in this memory, in a manner similar to the way the pixel value locations are numbered or addressed in memory 70. With this arrangement, page breaks do not occur each time texture values are accessed from different rows of texture map 252.
In this procedure also, preferably the tile dimensions are selected by the user, and the user has considerable flexibility in choosing these dimensions. The important consideration here too is that the tile size, as measured by the number of data locations in the tile, is equal to the page size, as measured by the number of data locations in the page.
As will be understood by those of ordinary skill in the art, the teaching of the present invention may be employed in many different specific systems and methods. For instance, it may be noted that multiple pixel values and multiple texture values may be stored in one memory word. When this is done, individual memory addresses may be used to identify individual memory words, rather than individual pixel or texture values. In addition, multiple tile geometries may be used in the practice of this invention, and individual hardware systems may be used that support more than one tile geometry.
Moreover, there are many suitable addressing protocols that may be used to number the data locations within the tiles. With the example shown in Figure 6, addressing starts at the upper left corner of each tile, and proceeds left to right across each row in the tile and downward from row to row. Alternatively, addressing may start at the upper left corner, and proceed downward in each column and then rightward from column to column; or addressing may start at the lower left and proceed upward in each column and then rightward from column to column. As a still further example, addressing may proceed in a checkerboard manner such that consecutive addresses are assigned to address locations that are physically non-consecutive. Likewise, the tiles may be numbered in any suitable manner. For example, the tiles may be numbered left to right, right to left, upwards, or downwards, or in any other suitable way.
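Three of the addressing protocols mentioned above can be written as small functions that map a position inside a tile to an address offset inside that tile. This is only an illustrative sketch: the function names are hypothetical, the tile is assumed to be 64 locations wide and 16 lines high, and row 0 is taken to be the top row of the tile.

```python
TILE_W, TILE_H = 64, 16  # assumed tile width and height


def row_major(col, row):
    # Start at the upper left, go left to right, then down row by row.
    return row * TILE_W + col


def column_major(col, row):
    # Start at the upper left, go down each column, then rightward.
    return col * TILE_H + row


def column_major_from_bottom(col, row):
    # Start at the lower left, go up each column, then rightward.
    return col * TILE_H + (TILE_H - 1 - row)


positions = [(c, r) for r in range(TILE_H) for c in range(TILE_W)]
protocols = (row_major, column_major, column_major_from_bottom)
# Each protocol assigns every location in the tile exactly one offset.
all_one_to_one = all(
    sorted(fn(c, r) for c, r in positions) == list(range(TILE_W * TILE_H))
    for fn in protocols
)
```

Whatever numbering is chosen, each location in the tile receives a distinct offset within the same page, so the choice of protocol does not change which accesses cause page breaks.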
While it is apparent that the invention herein disclosed is well calculated to fulfill the objects previously stated, it will be appreciated that numerous modifications and embodiments may be devised by those skilled in the art, and it is intended that the appended claims cover all such modifications and embodiments as fall within the true spirit and scope of the present invention.

Claims (23)

CLAIMS
1. A method for storing and retrieving data from a random access memory (RAM) to compute rendered values for an array of pixels, wherein the RAM comprises a multitude of pages, the method comprising the steps of: storing pixel data for a multitude of rectangular tiles of the pixels in the RAM, wherein each tile has a height of M pixels and a width of N pixels, the pixel data for each tile substantially fills a respective one page of the RAM, and each pixel of the array is associated with one of the pixel values stored in the RAM; rasterizing a group of polygons, including the steps of for each of the polygons i) identifying a set of pixels on the array that are in the polygon, and ii) scanning the pixels in the polygon in a defined order, and iii) for each of the pixels, in said defined order, fetching the value associated with the pixel from the RAM, and using the fetched value to determine a rendered value for the pixel, wherein the size and dimensions of the tiles are selected by grouping together the pixels to increase the spatial coherence between the tiles and the polygons, thereby to reduce the number of page breaks that occur during the rasterization of the polygons.
2. A method according to Claim 1, wherein the scanning step includes the step of, for each of the tiles that have a pixel in the polygon, one tile at a time, scanning across all of the pixels in the tile that are also in the polygon.
3. A method according to Claim 1, wherein: the storing step includes the step of separating each of the tiles into a regular array of spans, each span having a height of h pixels and a width of w pixels; and the scanning step includes the step of, for each of the spans that have a pixel in the polygon, one span at a time, scanning across all of the pixels in the span that are also in the polygon.
4. A method according to Claim 1, wherein the storing step includes the steps of: for each pixel, providing a scalar address value identifying a location in the memory for storing the data values for the pixel when the pixel values are linearly organized in the memory, and translating the provided scalar address value to another scalar address value identifying a location in the RAM for actually storing the data value for the pixel.
5. A system for storing and retrieving data from a random access memory to compute rendered values for an array of pixels, the system comprising:
a random access memory (RAM) including a multitude of pages; means for storing pixel data for a multitude of rectangular tiles of the pixels in the RAM, wherein each tile has a height of M pixels and width of N pixels, the pixel data for each tile substantially fills a respective one page of the RAM, and each pixel of the array is associated with one of the pixel values stored in the RAM; means for rasterizing a group of polygons, including i) means for identifying pixels on the array that are in the polygons, ii) means for scanning the pixels in each of the polygons in a defined order, and iii) means for fetching, for each of the scanned pixels and in said defined order, the value associated with the pixel from the RAM, and for using the fetched value to determine a rendered value for the pixel, wherein the size and dimensions of the tiles are selected by grouping together the pixels to increase the spatial coherence between the tiles and the polygons, thereby to reduce the number of page breaks that occur during the rasterization of the polygons.
6. A system according to Claim 5, wherein: the scanning means includes, means to determine, for each of the polygons, tiles that have a pixel in the polygon, means to determine, for each of the polygons, the pixels in the determined tiles that are also in the polygon, and means to scan, for each of the polygons, across all of the determined pixels in the determined tiles one tile at a time.
7. A system according to Claim 5, wherein:
the storing means includes means for separating each of the tiles into a regular array of spans, each span having a height of h pixels and a width of w pixels; and the scanning means includes, means to determine, for each of the polygons, the spans that have a pixel in the polygon, means to determine, for each of the polygons, the pixels in the determined spans that are also in the polygon, and means to scan, for each of the polygons, across all of the determined pixels in the determined spans, one span at a time.
8. A system according to Claim 5, wherein the storing means includes:
1 mean s for providing, for each pixel, a scalar address value identifying a location in the memory for storing the data value for the pixel when the pixel values are linearly organized in the memory, and means for translating the provided scalar address value to another scalar address value identifying a location in the RAM for actually storing the data value for the pixel.
9. A method for storing and retrieving data from a random access memory (RAM) to compute textured values for an array of pixels, wherein the RAM comprises a multitude of pages, the method comprising the steps of:
storing texture data for a multitude of rectangular tiles of texels in the RAM, wherein each tile has a height of M texels and a width of N texels, the texture data for each tile substantially fills a respective one page of the RAM, and each texel is associated with one of the texture values stored in the RAM; rasterizing a group of polygons, including the steps of, for each of the polygons, i) identifying a set of pixels on the array that are in the polygon, ii) scanning the pixels in the polygon in a defined order, and iii) for each of the pixels, in said defined order, mapping the pixel onto at least one of the texels, fetching from the RAM the value associated with said at least one of the texels, and using the fetched value to determine a textured value for the pixel, wherein the size and dimensions of the tiles are selected by grouping together the texels to increase the spatial coherence between the tiles and the texels mapped from the pixels in the polygons, thereby to reduce the number of page breaks that occur during the rasterization of the pixels.
10. A method according to Claim 9, wherein the array of pixels is separated into an array of pixel tiles, and the fetching step includes the step of, for each of the polygons, for each of the pixel tiles that has a pixel in the polygon, one pixel tile at a time, fetching values for all of the texels that are mapped from pixels in the pixel tile.
11. A method according to Claim 9, wherein the array of pixels is separated into an array of pixel tiles and each of the pixel tiles is separated into a regular array of pixel spans, and the fetching step includes the step of, for each of the polygons, for each of the pixel spans that has a pixel in the polygon, one pixel span at a time, fetching values for all of the texels that are mapped from pixels in the pixel span.
12. A system for storing and retrieving data from a random access memory to compute textured values for an array of pixels, the system comprising: a random access memory (RAM) including a multitude of pages; means for storing texture data for a multitude of rectangular tiles of texels in the RAM, wherein each tile has a height of M texels and a width of N texels, the texture data for each tile substantially fills a respective one page of the RAM, and each texel is associated with one of the texture values stored in the RAM; means for rasterizing a group of polygons, including i) means for identifying pixels on the array that are in the polygons, ii) means for scanning the pixels in each of the polygons in a defined order, and iii) means for mapping each of the scanned pixels, in said defined order, onto at least one of the texels, for fetching from the RAM the value associated with said at least one of the texels, and for using the fetched value to determine a textured value for the pixel, wherein the size and dimensions of the tiles are selected by grouping together the pixels to increase the spatial coherence between the tiles and the polygons, thereby to reduce the number of page breaks that occur during the rasterization of the pixels.
13. A system according to Claim 12, wherein the array of pixels is separated into an array of pixel tiles, and the fetching means includes:
means to determine, for each of the polygons, the pixel tiles that have a pixel in the polygon, and means to fetch, for each of the determined tiles, values for all of the texels mapped from pixels in the determined tiles, one determined tile at a time.
14. A system according to Claim 12, wherein the array of pixels is separated into an array of pixel tiles, and each of the pixel tiles is separated into a regular array of pixel spans, and the fetching means includes:
means to determine, for each of the polygons, the pixel spans that have a pixel in the polygon; and means to fetch, for each of the determined spans, values for all of the texels mapped from the pixels in the determined spans, one pixel span at a time.
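The page-break reduction recited in Claims 9 to 14 can be illustrated with a small sketch. It is not taken from the specification: the function names, the 64-texel page size, the 64-texel-wide texture and the 8 x 8 tile shape are all assumptions chosen for illustration, and each tile is assumed to fill exactly one page. The sketch walks an 8 x 8 texel footprint row by row and counts how often consecutive fetches land on different pages under a linear layout and under a tiled layout.

```python
def linear_page(u, v, tex_w, page_texels):
    """Page index of texel (u, v) when texels are stored row after row."""
    return (v * tex_w + u) // page_texels


def tiled_page(u, v, tex_w, tile_w, tile_h):
    """Page index of texel (u, v) when each tile_w x tile_h tile fills one page
    (an assumption of this sketch)."""
    tiles_per_row = tex_w // tile_w
    return (v // tile_h) * tiles_per_row + (u // tile_w)


def count_page_breaks(pages):
    """Number of times consecutive accesses fall on different pages."""
    return sum(1 for prev, cur in zip(pages, pages[1:]) if prev != cur)


# Hypothetical footprint: an 8 x 8 block of texels aligned to one tile,
# visited row by row.
footprint = [(u, v) for v in range(8) for u in range(8)]

linear_breaks = count_page_breaks(
    [linear_page(u, v, tex_w=64, page_texels=64) for u, v in footprint]
)
tiled_breaks = count_page_breaks(
    [tiled_page(u, v, tex_w=64, tile_w=8, tile_h=8) for u, v in footprint]
)
```

With these assumed sizes, every texel row of the linear layout sits on its own page, so the walk changes page at each new row, whereas the tile-aligned footprint stays on a single page. This is a best case for the tiled layout; a footprint that straddles several tiles benefits only if the fetches are also grouped tile by tile or span by span, as Claims 10 and 11 recite.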
15. A method for computing a tiled address for a location of a data value in a memory area, from a linear address for said location, wherein the memory area is comprised of a regular array of rectangular tiles, and each tile is multiple lines high and multiple data locations wide, the method comprising: using the linear address to determine two rectangular coordinates for the location of the data value in the memory, said rectangular coordinates specifying a data row and a data column of the location of the data value in the memory; and using said two rectangular coordinates to determine a tiled address for the location of the data value in the memory.
16. A method according to Claim 15, further comprising the step of: providing a base address for the memory area; and wherein the step of using the two rectangular coordinates includes the steps of:
(a) determining an offset to the particular tile row; (b) determining an offset to the particular tile; (c) determining an offset to the particular data row; (d) determining an offset to the location in the particular data row; and (e) summing the base address and the offsets determined in steps (a)-(d).
17. A method according to Claim 16, wherein the step of determining the offset to the particular tile row includes the steps of:
identifying the tile height; identifying the tile width; identifying the memory width in units of tiles; identifying the number of rows of tiles between a beginning of the memory area and the particular tile row; and multiplying said tile height by said tile width by said memory width by said number of rows of tiles.
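The translation recited in Claims 15 to 17 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function and parameter names are hypothetical, all quantities are measured in data locations, and the linear pitch is assumed to equal the memory width in tiles times the tile width. Only the tile-row offset of step (a) is spelled out in Claim 17; the formulas used here for steps (b) to (d) are assumptions that follow the same pattern.

```python
def linear_to_tiled(linear_addr, base, tile_w, tile_h, tiles_per_row):
    """Translate a linear address into a tiled address.

    All arguments are in units of data locations. The names and the
    assumption that the pitch equals tiles_per_row * tile_w are illustrative.
    """
    pitch = tiles_per_row * tile_w

    # Claim 15: recover the rectangular coordinates (data row, data column).
    row, col = divmod(linear_addr - base, pitch)

    # Split each coordinate into a tile index and a position inside the tile.
    tile_row, row_in_tile = divmod(row, tile_h)
    tile_col, col_in_tile = divmod(col, tile_w)

    # Claim 16, steps (a)-(d); step (a) follows the product given in Claim 17.
    off_tile_row = tile_h * tile_w * tiles_per_row * tile_row
    off_tile = tile_h * tile_w * tile_col        # assumed form of step (b)
    off_data_row = tile_w * row_in_tile          # assumed form of step (c)
    off_in_row = col_in_tile                     # assumed form of step (d)

    # Claim 16, step (e): sum the base address and the four offsets.
    return base + off_tile_row + off_tile + off_data_row + off_in_row
```

For example, with 4-wide, 2-high tiles and two tiles per row, the location one data row below the base (linear offset 8) moves to tiled offset 4, because the second row of the first tile is stored directly after the first row of that tile.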
18. A method of using a memory unit having an array of data storage locations, comprising the steps:
identifying a multitude of separate memory regions in the memory unit; for each of the memory regions, specifying parameters needed to translate, for all of the data locations in the memory regions, a linear address for each data location to a tiled address for the data location; wherein each of the memory regions is able to store data, at different times, in a linear organization and in a tiled organization.
19. A method according to Claim 18, further comprising:
providing a linear address to access a data value in the memory unit; identifying one of the memory regions having said linear address; if data are stored in said identified one of the memory regions in a linear organization, then fetching the data value stored at said linear address; and if data are stored in said identified one of the memory regions in a tiled organization, then i) using said specified parameters to translate the linear address to a tiled address, and ii) fetching the data value stored at said tiled address.
20. A method according to Claim 18, wherein the step of specifying parameters needed to translate a linear address to a tiled address includes the steps of:
associating a respective one register area with each of the memory regions; and storing the parameters for each memory region in the register area associated with the memory region.
21. A method for using a memory unit having a multitude of data storage locations to hold data values, comprising the steps:
separating the memory unit into a multitude of separate memory regions; activating tiling in a subset of said multitude of memory regions, including the step of, for each of said subset of memory regions, specifying parameters needed to translate, for the data locations in the region, a linear address for each data location to a tiled address; providing a linear address to access a data location in the memory unit; determining the memory region having the data location; determining whether tiling is active in said determined memory region; if tiling is not active in said determined memory region, then accessing the data location identified by said linear address; and if tiling is active in said determined memory region, then i) using the parameters specified for the memory region to translate the linear address to a tiled address, and ii) accessing the data location identified by said tiled address.
22. A memory system to hold data values in both linear and tiled arrangements, comprising:
a random access memory (RAM) including a multitude of data storage locations, and separated into a multitude of separate memory regions; a register holding for each of the memory regions, i) means to indicate whether tiling is active in the memory region, and ii) if tiling is active in the memory region, specified parameters needed to translate, for the data locations in the region, a linear address for each data location to a tiled address; means for providing a linear address to access a data location in the memory unit; means to determine the memory region having the data location; means to determine whether tiling is active in said determined memory region; means (i) to access the data location identified by said linear address, if tiling is not active in said determined memory region; and, (ii) if tiling is active in said determined memory region, to use the parameters specified for the determined memory region to translate the linear address to a tiled address, and to access the data location identified by said tiled address.
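The per-region behaviour recited in Claims 18 to 22 can be sketched as a small lookup. This is an illustration under assumptions, not the claimed hardware: the `Region` record stands in for the register area of Claim 20, its field names are hypothetical, the regions are assumed not to overlap, and the tiled translation uses the same assumed offset formulas as a conventional row-major tile layout.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical stand-in for one region's register area."""
    start: int               # first linear address of the region
    size: int                # number of data locations in the region
    tiled: bool = False      # whether tiling is active in this region
    tile_w: int = 0          # tile width in data locations
    tile_h: int = 0          # tile height in data rows
    tiles_per_row: int = 0   # memory width in units of tiles


def resolve(addr, regions):
    """Return the address actually accessed for a given linear address."""
    # Determine the memory region having the data location.
    region = next(
        (r for r in regions if r.start <= addr < r.start + r.size), None
    )
    if region is None:
        raise ValueError(f"address {addr} is outside every region")

    # Tiling not active: the linear address is used unchanged.
    if not region.tiled:
        return addr

    # Tiling active: translate using the parameters stored for the region.
    pitch = region.tile_w * region.tiles_per_row
    row, col = divmod(addr - region.start, pitch)
    tile_row, y = divmod(row, region.tile_h)
    tile_col, x = divmod(col, region.tile_w)
    tile_index = tile_row * region.tiles_per_row + tile_col
    return (region.start
            + tile_index * region.tile_w * region.tile_h
            + y * region.tile_w
            + x)


# Example layout: one linear region followed by one tiled region.
REGIONS = [
    Region(start=0, size=64),
    Region(start=64, size=32, tiled=True, tile_w=4, tile_h=2, tiles_per_row=2),
]
```

Because the tiling flag and parameters live with each region, the same region can hold linear data at one time and tiled data at another simply by rewriting its entry, which is the flexibility Claim 18 describes.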
23. A method or a system for storing and retrieving data from a random access memory to compute rendered or textured values for an array of pixels or a method for computing a tiled address for a location of a data value in a memory area or a method of using a memory unit or a memory system to hold data values substantially as hereinbefore described with reference to any one or any combination of figures 1, 2, 6 to 11, 15 and 16.
GB9907701A 1998-04-01 1999-04-01 A linear surface memory to spatial tiling algorithm/mechanism Expired - Fee Related GB2336086B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
GB0216637A GB2374781B (en) 1998-04-01 1999-04-01 A linear surface memory to spatial tiling algorithm/mechanism
GB0216636A GB2374780B (en) 1998-04-01 1999-04-01 A linear surface memory to spatial tiling algorithm/mechanism
GB0216638A GB2374782B (en) 1998-04-01 1999-04-01 A linear surface memory to spatial tiling algorithm/mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US8027098P 1998-04-01 1998-04-01

Publications (3)

Publication Number Publication Date
GB9907701D0 GB9907701D0 (en) 1999-05-26
GB2336086A true GB2336086A (en) 1999-10-06
GB2336086B GB2336086B (en) 2002-12-11

Family

ID=22156301

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9907701A Expired - Fee Related GB2336086B (en) 1998-04-01 1999-04-01 A linear surface memory to spatial tiling algorithm/mechanism

Country Status (4)

Country Link
JP (1) JP2000090280A (en)
CA (1) CA2267870A1 (en)
GB (1) GB2336086B (en)
TW (1) TW513676B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU731863B2 (en) * 1999-04-29 2001-04-05 Canon Kabushiki Kaisha Image processing operation in hierarchical memory systems
US6417848B1 (en) * 1997-08-25 2002-07-09 Ati International Srl Pixel clustering for improved graphics throughput
US7719539B2 (en) 2000-06-08 2010-05-18 Imagination Technologies Limited Memory management for systems for generating 3-dimensional computer images
US9232156B1 (en) 2014-09-22 2016-01-05 Freescale Semiconductor, Inc. Video processing device and method

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5362915B2 (en) 2010-06-24 2013-12-11 富士通株式会社 Drawing apparatus and drawing method
CN113375568B (en) * 2021-05-12 2023-03-31 苏州阿普奇物联网科技有限公司 Metal wiredrawing polishing defect detection method based on laser scanning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5056044A (en) * 1989-12-21 1991-10-08 Hewlett-Packard Company Graphics frame buffer with programmable tile size
US5131080A (en) * 1987-08-18 1992-07-14 Hewlett-Packard Company Graphics frame buffer with RGB pixel cache
US5251293A (en) * 1987-09-02 1993-10-05 Ascii Corporation Character display apparatus
US5517609A (en) * 1990-08-06 1996-05-14 Texas Instruments Incorporated Graphics display system using tiles of data
US5675387A (en) * 1994-08-15 1997-10-07 General Instrument Corporation Of Delaware Method and apparatus for efficient addressing of DRAM in a video decompression processor

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5131080A (en) * 1987-08-18 1992-07-14 Hewlett-Packard Company Graphics frame buffer with RGB pixel cache
US5251293A (en) * 1987-09-02 1993-10-05 Ascii Corporation Character display apparatus
US5056044A (en) * 1989-12-21 1991-10-08 Hewlett-Packard Company Graphics frame buffer with programmable tile size
US5517609A (en) * 1990-08-06 1996-05-14 Texas Instruments Incorporated Graphics display system using tiles of data
US5675387A (en) * 1994-08-15 1997-10-07 General Instrument Corporation Of Delaware Method and apparatus for efficient addressing of DRAM in a video decompression processor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
'Design and Analysis of a Cache Architecture for Texture Mapping', Hakura and Gupta, Computer Systems Laboratory, Stanford University, Stanford, Cal. *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6417848B1 (en) * 1997-08-25 2002-07-09 Ati International Srl Pixel clustering for improved graphics throughput
AU731863B2 (en) * 1999-04-29 2001-04-05 Canon Kabushiki Kaisha Image processing operation in hierarchical memory systems
US7719539B2 (en) 2000-06-08 2010-05-18 Imagination Technologies Limited Memory management for systems for generating 3-dimensional computer images
US9613598B2 (en) 2000-06-08 2017-04-04 Imagination Technologies Limited Memory management for systems for generating 3-dimensional computer images
US10102608B2 (en) 2000-06-08 2018-10-16 Imagination Technologies Limited Memory management for systems for generating 3-dimensional computer images
US10552938B2 (en) 2000-06-08 2020-02-04 Imagination Technologies Limited Memory management for systems for generating 3-dimensional computer images
US11004172B2 (en) 2000-06-08 2021-05-11 Imagination Technologies Limited Memory management for systems for generating 3-dimensional computer images
US9232156B1 (en) 2014-09-22 2016-01-05 Freescale Semiconductor, Inc. Video processing device and method

Also Published As

Publication number Publication date
TW513676B (en) 2002-12-11
GB2336086B (en) 2002-12-11
JP2000090280A (en) 2000-03-31
GB9907701D0 (en) 1999-05-26
CA2267870A1 (en) 1999-10-01

Similar Documents

Publication Publication Date Title
US6778177B1 (en) Method for rasterizing a graphics basic component
US5287438A (en) System and method for drawing antialiased polygons
US5230039A (en) Texture range controls for improved texture mapping
US4692880A (en) Memory efficient cell texturing for advanced video object generator
US5544292A (en) Display apparatus having a display processor for storing and filtering two dimensional arrays forming a pyramidal array, and method of operating such an apparatus
US5495563A (en) Apparatus for converting pyramidal texture coordinates into corresponding physical texture memory addresses
KR100421623B1 (en) Hardware architecture for image generation and manipulation
US5598517A (en) Computer graphics pixel rendering system with multi-level scanning
US5856829A (en) Inverse Z-buffer and video display system having list-based control mechanism for time-deferred instructing of 3D rendering engine that also responds to supervisory immediate commands
US8189007B2 (en) Graphics engine and method of distributing pixel data
JPH04222071A (en) Method and apparatus for texture mapping
US6919895B1 (en) Texture caching arrangement for a computer graphics accelerator
JPH10105723A (en) Memory constitution for texture mapping
GB2302002A (en) Computer graphics triangle rasterization with frame buffers interleaved in two dimensions
US6885384B2 (en) Method of creating a larger 2-D sample location pattern from a smaller one by means of X, Y address permutation
KR950014979B1 (en) Image computing system
US6724396B1 (en) Graphics data storage in a linearly allocated multi-banked memory
JP4198087B2 (en) Image generating apparatus and image generating method
GB2336086A (en) Optimised pixel/texel memory configuration for tile arrays
US6400370B1 (en) Stochastic sampling with constant density in object space for anisotropic texture mapping
US6661423B2 (en) Splitting grouped writes to different memory blocks
US6236408B1 (en) Computer graphics pixel rendering system with multi-level scanning
JP2003504697A (en) Anti-aliasing of subsampled texture edges
GB2374780A (en) A linear surface memory to spatial tiling algorithm/mechanism
US20040012586A1 (en) Image processing apparatus and method of same

Legal Events

Date Code Title Description
732E Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977)
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20070401