WO1993004462A1 - Computer graphics system - Google Patents

Computer graphics system

Info

Publication number
WO1993004462A1
WO1993004462A1 PCT/US1992/007055 US9207055W WO9304462A1 WO 1993004462 A1 WO1993004462 A1 WO 1993004462A1 US 9207055 W US9207055 W US 9207055W WO 9304462 A1 WO9304462 A1 WO 9304462A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
graphics
bus
address
cursor
Prior art date
Application number
PCT/US1992/007055
Other languages
French (fr)
Inventor
Kim Meinerth
Joseph Bouchard
Colyn Case
Robert Crouse
Blaise Fanning
Kevin Fielding
Chris Franklin
Rodney Gamache
John Irwin
John Kirk
Srinivasan Krishnaswami
George Lord
Agnes M. Masucci
Ali Moezzi
Christopher J. Payson
George Scott
Original Assignee
Digital Equipment Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US07/748,357 external-priority patent/US5313577A/en
Priority claimed from US07/748,360 external-priority patent/US5321806A/en
Application filed by Digital Equipment Corporation
Publication of WO1993004462A1

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/39 Control of the bit-mapped memory
    • G09G5/393 Arrangements for updating the contents of the bit-mapped memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3877 Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor
    • G06F9/3879 Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor for non-native instruction execution, e.g. executing a command; for Java instruction set
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/02 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the way in which colour is displayed
    • G09G5/06 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the way in which colour is displayed using colour palettes, e.g. look-up tables
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/08 Cursor circuits
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/363 Graphics controllers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815 Cache consistency protocols
    • G06F12/0831 Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2310/00 Command of the display device
    • G09G2310/04 Partial updating of the display screen
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2360/00 Aspects of the architecture of display systems
    • G09G2360/12 Frame memory handling
    • G09G2360/121 Frame memory handling using a cache memory
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2360/00 Aspects of the architecture of display systems
    • G09G2360/12 Frame memory handling
    • G09G2360/123 Frame memory handling using interleaving
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2360/00 Aspects of the architecture of display systems
    • G09G2360/12 Frame memory handling
    • G09G2360/126 The frame memory having additional data ports, not inclusive of standard details of the output serial port of a VRAM

Definitions

  • This invention relates to computer graphics systems, and more particularly to the computer architecture of graphics systems.
  • Computer graphics systems are special purpose computers that are used to create complex images on a display and to allow the computer user to modify and store the images.
  • A graphics accelerator, such as the graphics accelerator of Fig. 4 of Costello, is a special purpose processor unit which receives graphics commands from the CPU and executes the commands, typically by changing information stored in a frame buffer memory (referred to as "frame buffer" in Costello).
  • A frame buffer memory is a special purpose type of memory in which the memory locations correspond to locations, or pixels, on a monitor or other type of display. Devices not shown in Fig.
  • Prior art graphics systems, such as the system described by Costello, do not provide for efficient use of main memory by the graphics accelerator. Transfers of information between the main memory and the frame buffer memory must be made via the system bus, under control of the CPU. Also, if an application program contains drawing instructions and/or commands that specify that an image is to be "drawn" and stored in main memory, rather than the frame buffer memory, the CPU cannot take advantage of the graphics processing power of the graphics system processor, because the graphics system processor does not have access to main memory. If a graphics command involves retrieving a graphics image stored in main memory, the image must be retrieved by the CPU. This not only involves action by the CPU, but also creates undesirable traffic on the system bus.
  • The graphics system processor is set up to interact with the frame buffer memory rather than the main memory, and therefore the CPU must convert the drawing instructions to draw to main memory locations.
  • Other graphics systems, such as those described in U.S. Patent 4,947,342, entitled "Graphic Processing System for Displaying Characters and Pictures at High Speed", issued August 7, 1990 to Katsura et al., and U.S. Patent 5,020,003, issued May 28, 1991 to Moshenberg, overcome some of the difficulties of Costello, but require that the graphics processor access a memory access controller over a memory bus, or require additional frame buffer memory, which is a special purpose type of memory and is therefore relatively expensive.
  • A cache memory is a memory placed in close proximity to the CPU, which contains the information in the main memory locations most frequently accessed by the CPU.
  • The close proximity of cache memory enables the CPU to access the data in the cache memory more quickly than it can access the data in the main memory.
  • A typical cache memory system consists of cache RAM (random access memory), a cache controller, and a tag store.
  • The tag store is a table of the main memory addresses of the information that is stored in the cache RAM.
  • The cache RAM stores the information that is operated on by the CPU.
  • The cache controller controls the information that passes in and out of the cache RAM, and updates the cache tag store.
  • The specific structure of the cache tag store and of its entries depends on whether the cache is a "direct mapped" cache, a "set associative" cache, or a "fully associative" cache.
  • One characteristic of all cache tag stores, however, is that they have some method for indicating the main memory addresses corresponding to the entries in the cache memory.
  • The tag store is searched for the main memory address. If the main memory address is in the tag store (a cache "hit"), the information is retrieved from the cache RAM and sent to the CPU. If the main memory address is not in the tag store (a cache "miss"), the cache controller retrieves the information from main memory, stores it in cache RAM, and records the main memory address in the tag store. When the CPU stores the information, it sends the information back to the cache controller, which stores the information in the cache RAM. If the cache is a "writethrough" cache, the information is also written to the corresponding address in main memory. If the cache is a "writeback" cache, the information is not written to the corresponding address in main memory until a later time.
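  • The hit, miss, writethrough, and writeback behavior described above can be illustrated with a minimal sketch. It is not the circuitry of the invention: it assumes a fully associative cache with no eviction, and the names Cache, read, write, and flush are hypothetical.

```python
class Cache:
    """Toy cache: the tag store maps main memory addresses to cache RAM slots.

    Assumptions for illustration only: fully associative, unlimited capacity
    (no eviction), and a write miss allocates a new cache RAM slot.
    """

    def __init__(self, main_memory, write_through=True):
        self.main_memory = main_memory   # dict: address -> value
        self.write_through = write_through
        self.tag_store = {}              # address -> index into cache_ram
        self.cache_ram = []              # cached values
        self.dirty = set()               # addresses not yet written back

    def read(self, address):
        if address in self.tag_store:                    # cache "hit"
            return self.cache_ram[self.tag_store[address]], "hit"
        value = self.main_memory[address]                # cache "miss"
        self.tag_store[address] = len(self.cache_ram)    # record the address
        self.cache_ram.append(value)
        return value, "miss"

    def write(self, address, value):
        if address in self.tag_store:
            self.cache_ram[self.tag_store[address]] = value
        else:
            self.tag_store[address] = len(self.cache_ram)
            self.cache_ram.append(value)
        if self.write_through:
            self.main_memory[address] = value            # written immediately
        else:
            self.dirty.add(address)                      # written "at a later time"

    def flush(self):
        """Write every dirty entry back to main memory (writeback only)."""
        for address in self.dirty:
            self.main_memory[address] = self.cache_ram[self.tag_store[address]]
        self.dirty.clear()
```

  • With write_through=False, main memory holds a stale value between write and flush, which is the situation in which another system component could read out-of-date data.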
  • In a computer system, it is important that the contents of any location in main memory appear identical to all system components that access main memory. If a main memory location is in cache memory, particularly a "writeback" cache memory, the information in main memory may not be the most current value, and the information in main memory must be updated before any system component other than the CPU attempts to read that main memory location. Verifying whether or not a main memory location is resident in cache, and updating the value, can be done in a number of ways, but many of them result in some traffic on the system bus, even if the main memory location is not currently resident in cache.
  • If any system component other than the CPU writes to a main memory location, there must be some method of verifying whether that location is in cache and, if it is, of updating the information in cache. This can be done in a number of ways, but all result in some traffic on the system bus, even if the memory location is not currently resident in cache.
  • One method for providing a more efficient, less expensive computer graphics system is to provide more efficient means for transmitting and processing the graphics commands.
  • The information being transferred may be a stream of graphics commands. If the graphics commands are all the same length, and the length is the same as the uniform transmission size, then the graphics commands may be transmitted to and from main memory efficiently. However, restricting the length of the graphics commands to the uniform transmission size may be inconvenient for the designer of the graphics command set. If the designer of the graphics command set wishes to create graphics commands of varying length, it may be very difficult for the system designer to select an optimal uniform transmission size.
  • A feature present in most computer graphics systems is a cursor.
  • A cursor is a movable marker that appears on the display screen and provides a visible indicator of the position of interest. Cursors are typically controlled by devices such as directional keys on a keyboard, or by an input device such as a "mouse". The cursor appears as an overlay over other images that are displayed on the screen.
  • Information regarding the shape and size of the cursor and the cursor's location is typically stored in a dedicated memory or in a portion of "off-screen" frame buffer memory, that is, a portion of the frame buffer memory that does not correspond to a pixel on the screen of the display.
  • A problem with storing cursor information in frame buffer memory is that frame buffer memory is expensive relative to main memory.
  • Video timing and control circuitry is often placed on the same circuit board as the frame buffer memory. However, including the timing and control circuitry on that circuit board increases the expense and complexity of manufacturing it, and therefore makes such circuit boards very expensive. If a computer user adds more frame buffer memory (which is necessary if the user changes to a display with higher resolution, or changes from a monochrome to a color display), the user must replace the circuit board on which the frame buffer memory is placed.
  • Buses such as the VME bus and the local memory bus are important components of a computer system. Generally, it is desirable to minimize the number of communications lines in each bus, because adding more lines generally means more expense. In addition, there must be a terminal pin at each point that a communication line connects to a system component. It is very difficult and expensive to add pins to system components.
  • This invention provides a low cost graphics system which enables graphics generation by a process which requires a minimum of traffic on the system bus, and which allows the graphics processor to interact efficiently with main memory, and requires a minimum of frame buffer memory.
  • The graphics processor is formed as a component of a memory and graphics processor unit which communicates with other components via a memory control unit connected to a plurality of buses, specifically the system bus, a memory bus, and an I/O bus.
  • The central processor unit communicates with the system bus and ultimately with the main memory, under control of the memory control unit.
  • The flow of graphics information is effected via the memory bus; that is, any communications with the main memory are effected over the memory bus, and any communications between the main memory and the system bus must take place through the memory control unit.
  • The graphics processor unit can access main memory by issuing memory requests directly to the memory control unit. Additionally, the memory and graphics control unit can interact directly with virtual memory in a manner that will be described below.
  • The invention further provides a method for translating virtual memory addresses to physical memory addresses.
  • A graphics processing unit includes, among other things, an address generator which retrieves data from memory locations and writes data to memory locations.
  • The address generator retrieves data from memory locations by issuing a memory access request directly to a memory control unit, which retrieves the contents of the memory location. Prior to issuing the request, the address generator sends the address to a virtual translation unit, which translates the virtual address to a physical address.
  • The virtual translation/FIFO control unit also contains three translation buffers, which store the most recently accessed virtual addresses and which, in many situations, enable the virtual translation/FIFO control unit to translate a virtual address using fewer memory accesses.
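  • As a rough illustration of how translation buffers can reduce memory accesses, the sketch below caches recent translations. Several details are assumptions not taken from the text: a 4096-byte page size, a single-level page table held in a dictionary, least-recently-used replacement, and the treatment of the three translation buffers as three cached entries. All names are hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # hypothetical page size, not specified in the text


class VirtualTranslationUnit:
    """Translates virtual addresses, caching recent translations."""

    def __init__(self, page_table, buffer_count=3):
        self.page_table = page_table      # virtual page -> physical page (in main memory)
        self.buffers = OrderedDict()      # recently used translations, oldest first
        self.buffer_count = buffer_count
        self.memory_accesses = 0          # page table reads that went to main memory

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        physical_page = self.buffers.get(page)
        if physical_page is None:
            # Not buffered: read the page table entry from main memory.
            self.memory_accesses += 1
            physical_page = self.page_table[page]
            self.buffers[page] = physical_page
            if len(self.buffers) > self.buffer_count:
                self.buffers.popitem(last=False)   # drop the least recently used entry
        else:
            self.buffers.move_to_end(page)         # mark as most recently used
        return physical_page * PAGE_SIZE + offset
```

  • Repeated accesses to the same page are then translated without a further page table read, which is the saving the translation buffers are said to provide.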
  • The invention further provides a method for determining if a main memory location written to by a system component other than the CPU is in cache. The method performs the verification without causing traffic on the system bus.
  • The invention includes a CPU cache tag store, and a second cache tag store. Any entry into, or displacement from, the CPU cache tag store is also entered into, or displaced from, the second cache tag store.
  • The second cache tag store is situated such that system components other than the CPU can access the second cache tag store without creating traffic on the system bus.
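  • The mirrored tag stores can be pictured with the following sketch, which models each tag store as a set of addresses. The class and method names are hypothetical, and the sketch only shows the bookkeeping: every entry or displacement is applied to both stores, and lookups by other components consult only the second store.

```python
class MirroredTagStores:
    """Keeps a second tag store identical to the CPU cache tag store."""

    def __init__(self):
        self.cpu_tags = set()         # CPU cache tag store
        self.second_tags = set()      # copy reachable without the system bus
        self.second_store_lookups = 0

    def enter(self, address):
        # Any entry into the CPU tag store is also entered into the second store.
        self.cpu_tags.add(address)
        self.second_tags.add(address)

    def displace(self, address):
        # Any displacement from the CPU tag store is mirrored as well.
        self.cpu_tags.discard(address)
        self.second_tags.discard(address)

    def is_cached(self, address):
        """Check made on behalf of a component other than the CPU.

        Only the second tag store is consulted, so in this model the check
        itself needs no transaction on the system bus.
        """
        self.second_store_lookups += 1
        return address in self.second_tags
```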
  • The invention further provides a command buffer, between the CPU and the graphics processor, for storing graphics commands.
  • The command buffer is a FIFO buffer in a reserved, contiguous section of main memory. Commands from the CPU are written to the FIFO command buffer for temporary storage at a location pointed to by a tail pointer. Commands are transmitted from the FIFO command buffer for processing from a location pointed to by a head pointer. Commands are thus stored and processed in the same order they were received by the FIFO command buffer.
  • The FIFO command buffer is dynamic; that is, commands can be added to the tail of the FIFO command buffer at any time, provided the space allocated to the FIFO command buffer is not full. Since the FIFO command buffer is stored in main memory, it adds little expense to the system.
  • The CPU can transmit a graphics command over the system bus whenever the bus is available, even if the graphics processor is occupied. Graphics commands are available to the graphics processor, from the FIFO, even if the system bus is currently occupied.
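  • A FIFO command buffer with head and tail pointers can be sketched as follows. Main memory is modeled as a Python list, and the pointers wrapping around the reserved region (a circular buffer) is an assumption; the text states only that the region is reserved and contiguous. The names are hypothetical.

```python
class FifoCommandBuffer:
    """FIFO held in a reserved, contiguous region of a main memory model."""

    def __init__(self, memory, base, size):
        self.memory = memory   # list standing in for main memory
        self.base = base       # first address of the reserved region
        self.size = size       # number of command slots in the region
        self.head = 0          # offset of the next command to process
        self.tail = 0          # offset of the next free slot
        self.count = 0         # commands currently stored

    def put(self, command):
        """Called on behalf of the CPU; returns False if the buffer is full."""
        if self.count == self.size:
            return False
        self.memory[self.base + self.tail] = command
        self.tail = (self.tail + 1) % self.size   # assumed wrap-around
        self.count += 1
        return True

    def get(self):
        """Called on behalf of the graphics processor; None if empty."""
        if self.count == 0:
            return None
        command = self.memory[self.base + self.head]
        self.head = (self.head + 1) % self.size
        self.count -= 1
        return command
```

  • Because put and get touch different pointers, the producer and the consumer do not need to be active at the same time, which mirrors the decoupling of the CPU and the graphics processor described above.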
  • The invention described herein further provides a method by which graphics commands can be of varying length, and yet can be transferred in transmissions of uniform size.
  • Graphics commands are transmitted, in transmission units of uniform size, from a processor unit to an address generator, which processes the commands.
  • A residue buffer is provided so that portions of data transmission units not immediately usable by the address generator can be stored in the residue buffer.
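  • One way to picture the residue buffer is the sketch below, in which commands of varying length arrive in fixed-size transmission units and any trailing partial command is held over until the next unit arrives. The command encoding is an assumption made for illustration: the first word of each command gives its total length in words. The function name and unit size are hypothetical.

```python
def consume_units(units, unit_size=4):
    """Reassemble variable-length commands from uniform transmission units.

    Hypothetical encoding: the first word of a command is its total length
    in words, including the length word itself.
    Returns (complete_commands, words_left_in_residue_buffer).
    """
    residue = []     # residue buffer: words not yet usable
    commands = []
    for unit in units:
        if len(unit) != unit_size:
            raise ValueError("transmission units must be of uniform size")
        words = residue + list(unit)   # leftover words come first
        position = 0
        while position < len(words):
            length = words[position]
            if length < 1:
                raise ValueError("command length must be at least 1")
            if position + length > len(words):
                break                  # command continues in a later unit
            commands.append(words[position:position + length])
            position += length
        residue = words[position:]     # keep the unusable tail for later
    return commands, residue
```

  • For example, commands of lengths 3, 2, and 3 packed into two 4-word units are recovered intact, with the first word of the second command passing through the residue buffer.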
  • The invention further provides a method for improving the efficiency of transmissions to the graphics accelerator (sometimes called a graphics processor) by monitoring the status of all the elements in the path between the CPU and the graphics accelerator, and routing the graphics command to the element closest to the graphics processor, or to the graphics processor directly.
  • The invention described herein provides a processor unit which issues graphics commands to an address generator, which processes the commands.
  • Two storage buffers (a FIFO command buffer and a residue buffer) are provided to make efficient use of the buses over which the commands are transmitted.
  • Logic is provided which routes the graphics command directly to the address generator, or to the buffer closest (in the transmission path) to the address generator, thereby minimizing the number of computer cycles necessary to get the command to the address generator.
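  • The routing decision might be sketched as below. The ordering of elements in the path (FIFO command buffer, then residue buffer, then address generator) and the rule that a command may bypass a buffer only when that buffer is empty, so that command order is preserved, are assumptions made for this illustration. The function and parameter names are hypothetical.

```python
def route_command(address_generator_idle, residue_empty,
                  residue_has_room, fifo_empty):
    """Pick the element closest to the address generator that can accept
    a new command without overtaking commands already in the path."""
    if address_generator_idle and residue_empty and fifo_empty:
        return "address_generator"      # nothing queued ahead: send directly
    if residue_has_room and fifo_empty:
        return "residue_buffer"         # closest buffer in the assumed path
    return "fifo_command_buffer"        # otherwise queue in main memory
```

  • Under these assumptions, a command reaches the address generator in the fewest steps whenever the path ahead of it is clear, and falls back to the FIFO command buffer otherwise.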
  • The invention described herein further provides a graphics system in which cursor pattern information is stored in main memory.
  • A graphics unit capable of directly accessing main memory has in it logic which controls the position of the cursor on a display screen.
  • A cursor control unit requests the cursor pattern information from main memory, and transmits the pattern data to a digital to analog converter, which causes the cursor to be displayed on the screen.
  • The invention further provides a frame buffer module on which the frame buffer memory is mounted, which can be replaced with a minimum of expense.
  • The invention places the timing and control circuits in a module, separate from the frame buffer module, thereby making the frame buffer module inexpensive to replace.
  • The invention still further provides a cursor bus having two configurations. In a first configuration, the cursor bus transmits cursor control information from a cursor control unit, in a graphics processor, to a video unit.
  • The invention further provides a register in the cursor control unit which allows the user to select a second configuration, in which the cursor bus is used to transfer information between a video option and a different unit within the graphics processor. In the second configuration the bus carries signals such as stall, inhibit, and interrupt signals, rather than cursor information, and some of the signal lines carry signals from the video unit to the graphics processor unit.
  • Fig. 1 is a block diagram of the computer graphics system.
  • Fig. 2 is a block diagram of the computer graphics system of Fig. 1, in greater detail.
  • Fig. 3 is a block diagram of the graphics/memory control unit.
  • Fig. 4 is a block diagram of the virtual translation/FIFO control unit.
  • Fig. 5 is a diagram showing the translation of a virtual address to a physical address.
  • Fig. 6 is a block diagram of the elements of the computer graphics system that are involved in the transmission of graphics commands from the processor unit to the graphics processor unit.
  • Fig. 7 is a block diagram of the video/cursor control unit.
  • Fig. 8a is a block diagram of the video unit and the memory bus structure, the cursor bus, and the video control bus.
  • Fig. 8b is a block diagram of the video unit and the memory bus structure, with the video unit configured to provide output to two displays.
  • Fig. 9 is a block diagram of Fig. 8a, with the video unit replaced by an optional device.
  • Fig. 10 is a block diagram of the memory bus structure.
  • Fig. 1 shows a block diagram of a computer graphics system 100, according to the invention.
  • Processor unit 102 is interconnected, via a system bus 110, for communication with a memory control unit 220.
  • A main memory 140 is connected, via memory bus 150, to the memory control unit 220 as well as to the frame buffer memory 164.
  • Frame buffer memory 164 is in turn connected to a video DAC 166, which reads data, in digital form, from frame buffer memory 164, and converts the data to video signals, which are transmitted to display 162 over video signal line 168.
  • A graphics processor unit 210 is connected to memory control unit 220 via a multiplicity of signal lines (including PADRS line 272, FPUT line 388, MA_REQ line 282, and PXDAT line 222, shown in more detail in Fig. 4 and described below).
  • Graphics processor unit 210 and memory control unit 220 are included in a functional unit, the graphics/memory control unit 130, which may be implemented on a single computer chip. In a preferred embodiment, graphics/memory control unit 130 is implemented on a single computer chip.
  • The memory control unit 220 is interconnected to the I/O bus 170 and the network bus 180, each of which is bidirectional for data/address transfers through, to and from the memory control unit 220.
  • Graphics/memory control unit 130 serves to effect graphics generation by arranging the components of the system such that graphics processor 210 can directly access (that is, read from and write to) main memory 140. Access requests by the graphics processor 210 are transferred directly to memory control unit 220, by signal lines 169, without having to be transmitted over system bus 110, memory bus 150, or I/O bus 170.
  • Graphics commands are generated in processor unit 102, for processing in graphics processor unit 210.
  • Processor unit 102 transmits the graphics commands over system bus 110 to memory control unit 220.
  • the graphics commands are in turn transmitted, by memory control unit 220, to graphics processor unit 210.
  • Processing of graphics commands by graphics processor unit 210 involves reading data from locations in main memory 140 and writing data to locations in main memory 140 or in frame buffer memory 164.
  • Graphics processor 210 issues a memory read request to memory control unit 220 over one of signal lines 169.
  • Memory control unit 220 prioritizes the memory request, along with memory requests received over system bus 110, network bus 180, and I/O bus 170.
  • Memory control unit 220 retrieves the data from the location in main memory 140, and transmits the data to graphics processor unit 210 over one of signal lines 169.
  • Writing data to a location in main memory 140 is effected by the same process as writing data to a location in frame buffer memory 164, the difference being the address specified in the memory write request.
  • Graphics processor unit 210 issues a memory write request to memory control unit 220 over one of signal lines 169.
  • Memory control unit 220 prioritizes the memory request, along with memory requests received over system bus 110, network bus 180, and I/O bus 170.
  • Memory control unit 220 then writes the data to the location in main memory 140 or in frame buffer memory 164, depending on the address specified in the memory write request.
  • The invention provides for first translating virtual addresses to physical addresses; this process is described in more detail below.
  • Graphics processor 210 can issue requests to memory control unit 220 directly, over signal lines 169; processing of the memory request does not create any traffic on system bus 110; and graphics processor 210 can read data from main memory 140.
  • Writing data to main memory 140 and to frame buffer memory 164 is accomplished by the same process, the only difference being the memory address; stated differently, frame buffer 164 and main memory 140 appear to graphics processor unit 210 and memory control unit 220 as two parts of a single address space.
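  • The single address space can be illustrated with a small model in which one read/write interface selects main memory or the frame buffer purely from the address. The address layout (frame buffer mapped immediately after main memory), the word-sized locations, and the names used are assumptions for illustration only.

```python
class MemoryControlUnit:
    """Routes reads and writes to main memory or the frame buffer by address."""

    def __init__(self, main_words, frame_buffer_words):
        self.main_memory = [0] * main_words
        self.frame_buffer = [0] * frame_buffer_words
        # Assumed layout: the frame buffer starts where main memory ends.
        self.frame_buffer_base = main_words

    def _select(self, address):
        """Return the backing store and the offset within it."""
        if 0 <= address < self.frame_buffer_base:
            return self.main_memory, address
        offset = address - self.frame_buffer_base
        if 0 <= offset < len(self.frame_buffer):
            return self.frame_buffer, offset
        raise ValueError(f"address {address:#x} is outside the address space")

    def write(self, address, value):
        store, offset = self._select(address)
        store[offset] = value

    def read(self, address):
        store, offset = self._select(address)
        return store[offset]
```

  • A requester such as the graphics processor issues the same write call in both cases; only the address determines which memory is updated.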
  • Fig. 2 shows a block diagram of a computer graphics system 100 according to the invention.
  • A computer graphics system may be connected to a computer network by network bus 180 as shown, or may be incorporated within a single user workstation or a multi-user computer.
  • The computer graphics system 100 includes functional units, such as a processor unit 102 (depicted within a broken line block) interconnected, via a CPU bus interface 104, to a system bus, generally designated 110, for communication with a graphics/memory control unit 130.
  • The system bus 110 is shown as a plurality of buses, designated 112, 113, and 114, which transfer data information (bus 112), request information (bus 113), and address information (bus 114), respectively.
  • A main memory 140 is connected, via memory data and address buses 152 and 154, respectively, to the graphics/memory control unit 130 as well as to the video unit 160 (shown in broken lines), which in turn controls the display 162.
  • The video unit 160 includes therein a frame buffer memory 164 and a DAC (digital to analog converter) 166.
  • The graphics/memory control unit 130 is interconnected to a plurality of bus structures, each of which is bidirectional for data/address transfers through, to and from the graphics/memory control unit 130.
  • These bus structures include the video unit 160 control bus structure, that is, video control bus 126 and the cursor bus 128; the I/O bus structure 170, that is, I/O data bus 172, I/O request bus 173, and I/O address bus 174; and a network bus structure 180, such as data bus 182, request bus 183, and address bus 184.
  • Connected to the I/O bus structure 170 may be any type of peripheral device, such as disk storage device 191 and any other suitable I/O device 192.
  • Graphics/memory control unit 130 consists of two major units, graphics processor unit 210, and memory control unit 220.
  • Graphics processor unit 210 consists of address generator 212, pixel shift logical unit (pixel SLU) 214, virtual translation/FIFO control unit 230, mask generator
  • Address generator 212 transmits control signals via address generator control line 226 (AGCTL) to graphics data buffer 218, mask generator 216, and pixel SLU
  • Address generator mask lines AGBMSK 276 and AGWMSK 278 transmit signals from mask generator 216 and pixel SLU and memory address and control unit 236, respectively.
  • Memory address and control unit 236 sends readback data on MAD_RB line 284, VIR_RB line 286, and FIFO_RB line 288, and address generator 212 sends readback data on AG_RB line 292.
  • The various graphics processor unit elements are also connected by a number of signal lines, which will be explained as they become useful to the description of the operation of the graphics processor unit.
  • Graphics processor unit 210 is connected to video control bus 126 and cursor bus 128.
  • Memory control unit 220 consists of flow control unit 232, memory state unit 234, memory address and control unit 236, memory data buffer 238, and address/data output multiplexer (mux) 242, all connected by memory control line
  • Pixel SLU 214 is a part of both memory control unit 220 and graphics processor unit 210.
  • Memory address and control unit 236 sends signals to flow control unit 232 and memory state unit 234 over ACCESS_ADRS line 251 and MEMTYP line 252, respectively.
  • The interconnections between the component units of memory control unit 220 will be described below in an illustrative example of a memory request by disk storage device 191.
  • Flow control unit 232 receives memory requests from incoming portions 113a, 173a, and 183a of the system request bus 113 (of Fig. 2), the I/O request bus 173 (of Fig. 2), and the network request bus 183 (of Fig. 2), respectively.
  • Flow control unit 232 acknowledges memory requests through external acknowledgment line 294, which connects to outgoing portions 113b, 173b, and 183b of the system request bus 113
  • The memory address and control unit 236 receives the address portion of a memory request over incoming portions 114a, 174a, and 184a of the system address bus 114 (of Fig. 2), the I/O address bus 174 (of Fig. 2), and the network address bus 184 (of Fig. 2), respectively.
  • Address/data output multiplexer 242 sends data on outgoing data portions 172b and 182b of the I/O data bus 172 (of Fig. 2) and the network data bus 182 (of Fig.
  • Memory address and control unit 236 sends address and control information over memory address bus 154 to main memory 140 (of Fig. 2) and frame buffer memory 164 (of Fig. 2).
  • Memory data buffer 238 sends data to and receives data from main memory 140 (of Fig.
  • How graphics/memory control unit 130 accomplishes the desired result of processing a memory transaction without causing a transmission on system bus 110, and without any action by processor unit 102, is most easily understood by an example of a memory read request from a system component, such as disk storage device 191.
  • A memory request consists of at least two parts, namely a request information part, which contains information about the requester, and an address part, which contains the memory address of the requested data. The request part and the address part are processed separately.
  • the request information part is transmitted over request portion 173a, of I/O request bus 173, to flow control unit 232.
  • Flow control unit 232 prioritizes the request and transmits request information over prioritized request identification (PREQSEL) line 254 and next memory request (NEXTMREQ) line 256 to memory state unit 234.
  • Information transmitted includes information about the requester (in this example disk storage device 191) , access type, and operand size.
  • Memory state unit 234 transmits the request information to memory address and control unit 236 over request identification (REQSEL) line 258.
  • The address part of the memory location that is requested is transmitted over address portion 174a of I/O address bus 174 directly to memory address and control unit 236.
  • Memory address and control unit 236 sends request information on memory address bus 154.
  • Pixel SLU 214 transmits the data over pixel data bus (PXDAT) 222 to address/data output multiplexer 242, which sends it to disk storage device 191 over outgoing I/O data bus 172b.
  • A request for a memory read by processor unit 102 proceeds in the same manner, except that memory data buffer 238 transmits the data to processor unit 102 over outgoing system data bus 112b.
  • The interconnections of the elements of the graphics/memory control unit 130 also allow the processing of memory requests by graphics processor unit 210 without any action by processor unit 102 and without causing any traffic on system bus 110.
  • Memory requests by graphics processor unit 210 are issued by the address generator 212.
  • Address generator 212 issues request information over address generator request (AGMREQ) line 266, to virtual translation/FIFO control unit 230.
  • Virtual translation/FIFO control unit 230 in turn transmits the request to flow control unit 232 over MA_REQ line 282, which prioritizes the request, and transmits the request to memory address and control unit 236 over REQSEL line 258.
  • the address part of the memory request is transmitted to virtual translation/FIFO control unit 230 over address generator address (AGADRS) line 268.
  • The address is translated, if necessary, to a physical address by virtual translation/FIFO control unit 230 in a manner that will be described below.
  • the address is then sent to memory address and control unit 236 over physical address (PADRS) line 272.
  • the memory address and the request information are then sent over memory address bus 154.
  • The data is returned over memory data bus 152 to memory data buffer 238, which in turn transmits the data to pixel SLU 214. If the memory access is a write to memory, the data is transmitted from pixel SLU 214 to memory data buffer 238 and then to main memory (140 of Fig. 2) over memory data bus 152.
  • the method of processing memory requests issued by the graphics processor unit 210 and the method of processing memory requests issued by other system components both include the steps of transmitting a request information part to flow control unit 232; transmitting an address part to memory address and control unit 236; and the receiving or sending of the requested data by pixel SLU 214. It can also be noted that the memory access was executed in both cases without any action by processor unit 102, and without causing any traffic on system bus 110.
  • the address contained in the address part of the memory request can be any memory address that is accessible by memory address bus 154.
  • the graphics processor unit 210 can access both main memory 140 and frame buffer memory 164 and thus can transfer information between main memory 140 and frame buffer memory 164.
  • a feature of the invention is the method by which virtual addresses are translated to physical addresses.
  • the virtual address is translated to a physical address by virtual translation/FIFO control unit 230, which is shown in greater detail in Fig. 4.
  • AGMREQ line 266 transmits signal packets containing eight bits from address generator 212 to virtual translation/FIFO control unit controller 274. Bit position seven indicates whether the address that is requested on AGADRS line 268 is a physical address or a virtual address. If the signal on AGMREQ line 266 indicates that the address that is requested on AGADRS line 268 is a virtual address, virtual translation/FIFO control unit controller 274 causes the address on AGADRS line 268 to enter virtual translation unit 280 (shown enclosed in broken lines in Fig. 4).
  • Virtual translation unit 280 has separate translation units for source, destination, and stencil operands. The components of the translation units are multiplexed through multiplexers 492, 493, 494. The virtual translation of a source operand will be described, however it should be understood that virtual addresses for destination and stencil operands can be translated in a similar manner.
  • the translation of a virtual address is more easily understood by first briefly discussing virtual address translation generally.
  • the amount of main memory available to a program is more than the amount of main memory (140 of Fig. 2) that is actually present in the computer system.
  • the program operates on memory locations specified as "virtual addresses".
  • the data identified by a virtual address may actually reside in main memory (140 of Fig. 2) or in some other system component, such as a disk storage device 191.
  • system page table 310 is a data base, stored in main memory, accessible by the computer operating system.
  • the main memory address 311 of the first entry in system page table 310 is fixed by the operating system, and known to the address generator 212 of Fig. 3.
  • Each of the entries 312 of system page table 310, referred to as "page frame numbers," consists of two portions.
  • First portion 314 contains a "valid" bit, an access code, and a modify bit which will be discussed below.
  • Second portion 316 contains the base address 318 of a secondary page table 320.
  • Secondary page table 320 may be present in main memory 140 (of Fig. 2), or may be stored in some other system location, such as a disk storage device 191. If secondary page table 320 is present in main memory, the "valid" bit in first portion 314 of system page table entry 312 identifying secondary page table 320 is set to a "valid" state. If secondary page table 320 is not present in main memory 140, the "valid" bit in the identifying entry 312 is set to "invalid".
  • Each of the entries 322 of secondary page table 320 consists of two portions.
  • First portion 324 contains a "valid" bit, an access code, and a modify bit, which will be discussed below.
  • Second portion 326 contains the base address 328 of the section, or "page", of main memory (140 of Fig. 2) corresponding to the virtual address. If the "valid" bit in first portion 324 of secondary page table identifying entry 322 is set to "valid", the current physical location of the data identified by the virtual address is in main memory. If the "valid" bit in the identifying entry 322 is set to "invalid", the data identified by the virtual address may not be present in main memory 140.
  • a virtual address 330 consists of three sections.
  • First section 332 is an index, identifying the address of the entry relative to the base address 311 of system page table 310.
  • Second section 334 is an index, identifying the address of the entry relative to the base address 318 of the secondary page table 320.
  • Third section 336 is an index which identifies the main memory location relative to the base address 328.
  • the translation of a virtual address consists of examining the contents of the entry obtained by indexing, from base address of system page table 310, by the index specified in virtual address first section 332, to find the base address 318 of the secondary page table 320; indexing into secondary page table 320 by the index specified in virtual address second section 334 to find the base address 328 of the page of main memory which contains the desired data; and indexing into the page of main memory by the index specified in virtual address third section 336 to find the actual physical address of the desired data.
  • the process is made more efficient by arranging the addressing method such that virtual address third section 336 is concatenated onto the address stored in second section 326 of entry 322 of secondary page table 320 to obtain the physical address of the desired data.
  • A virtual address translation thus involves accessing two memory locations (one entry in system page table 310 and one entry in secondary page table 320) to obtain the physical address.
  • One way for reducing the number of memory locations that must be accessed to translate a virtual address is to record, in a memory buffer, the virtual address first section 332 and second section 334 of each virtual address that is translated, and the corresponding secondary page table base address 318, and main memory base address 328 that the virtual address first section 332 and second section 334, respectively, signify.
  • A memory buffer used for this purpose is known as a "translation buffer." If the next virtual address to be translated has the same first section 332 and second section 334 as the value stored in the translation buffer, the physical address is available without any need to access any memory locations.
  • next virtual address to be translated has the same first section 332, but a different second section 334, than the value stored in the translation buffer, then the base address 318 of secondary page table 320 is immediately available, without having to access system page table 310.
  • Bits <29:16> (that is, bits 29 through 16) of a virtual address correspond to virtual address first section 332; bits <15:9> correspond to virtual address second section 334; and bits <8:2> correspond to virtual address third section 336.
  • Source page table page latch 342 contains the base address (corresponding to 318 of Fig. 5) of the secondary page table specified by the most recently translated virtual address of a source operand.
  • Source secondary page table latch 344 contains the base address 328 of the page of main memory of the most recently translated virtual address of a source operand.
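The three-field layout and the two-level walk described above can be sketched in Python. This is only an illustration under stated assumptions, not the hardware of the invention: the `Entry` class, the dictionary-based tables, and the handling of bits <1:0> (carried through unchanged as a byte offset) are hypothetical. Page base addresses are assumed to have their low nine bits clear, so that concatenation can be written as a bitwise OR.

```python
from dataclasses import dataclass


def bits(value: int, hi: int, lo: int) -> int:
    """Return bits <hi:lo> of value, inclusive."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


@dataclass
class Entry:
    valid: bool  # the "valid" bit from the first portion of an entry
    base: int    # second portion: base address of the next table or page


class NotResident(Exception):
    """Raised when an entry needed for the walk is marked invalid."""


def translate(vaddr, system_page_table, secondary_tables):
    first = bits(vaddr, 29, 16)   # index into the system page table
    second = bits(vaddr, 15, 9)   # index into the secondary page table
    third = bits(vaddr, 8, 2)     # index into the page of main memory

    spte = system_page_table[first]
    if not spte.valid:
        raise NotResident("secondary page table not in main memory")
    pte = secondary_tables[spte.base][second]
    if not pte.valid:
        raise NotResident("page not in main memory")
    # Concatenate the third section onto the page base address.
    # Assumption: bits <1:0> pass through as a byte offset.
    return pte.base | (third << 2) | bits(vaddr, 1, 0)
```

Here `system_page_table` maps a first-section index to an entry, and `secondary_tables` maps a secondary page table base address to its own index-to-entry mapping. Both are stand-ins for reads of main memory.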
  • Previous source address latch 346 contains the most recently translated source virtual address. Latches 346, 342, and 344 collectively are the source translation buffer.
  • the virtual address on AGADRS 268 enters virtual translation unit 280.
  • The address generator calculates a main memory address obtained by indexing by the index specified in address bits <29:16>, and transmits the result to virtual translation unit 280 on PASPTE line 246, for reasons that will be apparent later.
  • Bits <29:16> of the virtual address are compared, in page table page comparator 348, with bits <29:16> of the address stored in previous source address latch 346.
  • Bits <15:09> are compared, in page frame number comparator 349, with bits <15:09> of the address stored in previous source address latch 346.
  • If page table page comparator 348 indicates a match, and page frame number comparator 349 also indicates a match, the virtual address is translated to a physical address by concatenating bits <8:2> of the virtual address onto the value stored in source secondary page latch 344. If page table page comparator 348 indicates a "hit", but page frame number comparator 349 indicates a miss, the base address (corresponding to 318 of Fig. 5) of the secondary page table (corresponding to 320 of Fig. 5) identified by the value in source page table latch 342 is indexed by bits <15:09> of the virtual address to be translated to yield the base address (corresponding to 328 of Fig. 5) of the section, or "page", of main memory in which the desired data is located.
  • Bits <8:2> are then concatenated to the base address to yield the main memory address of the desired data.
  • The contents of base address (corresponding to 328 of Fig. 5) of the page of main memory containing the desired information are then stored in source secondary page latch 344.
  • The access code is checked. If the access code indicates that the base address (corresponding to 328 of Fig. 5) of the page of main memory containing the desired information is in a section of memory to which the graphics unit 130 is not allowed access, a signal is generated by virtual translation/FIFO control unit controller 274 which prevents the memory access from occurring.
  • the modify bit is also checked. If the modify bit indicates that the page of main memory containing the desired information has not previously been written to, a signal is generated which causes the operating system to change the modify bit in the entry in the secondary page table (corresponding to 320 of Fig. 5) .
  • If page table page comparator 348 indicates a miss, virtual translation/FIFO control unit controller 274 retrieves the address transmitted on PASPTE line 246, which contains the base address of the secondary page table 320. The process then proceeds as described above, as if page table page comparator 348 had indicated a "hit" but page frame number comparator 349 had indicated a miss.
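The three cases handled by the source translation buffer (both comparators match, only the page table page comparator matches, or neither matches) can be sketched as follows. This is a hedged software model, not the circuit: the `read_entry` callback standing in for a main memory read, the `memory_reads` counter, and the class name are hypothetical, and the access code and modify bit checks are left out.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Return bits <hi:lo> of value, inclusive."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


class SourceTranslationBuffer:
    def __init__(self, read_entry, system_page_table_base):
        # read_entry(table_base, index) -> base address held in that entry.
        # Hypothetical stand-in for one main memory access.
        self.read_entry = read_entry
        self.spt_base = system_page_table_base
        self.prev_vaddr = None   # previous source address latch 346
        self.table_base = None   # source page table page latch 342
        self.page_base = None    # source secondary page table latch 344
        self.memory_reads = 0

    def _read(self, table_base, index):
        self.memory_reads += 1
        return self.read_entry(table_base, index)

    def translate(self, vaddr):
        first = bits(vaddr, 29, 16)
        second = bits(vaddr, 15, 9)
        third = bits(vaddr, 8, 2)
        # Page table page comparator 348.
        table_hit = (self.prev_vaddr is not None
                     and bits(self.prev_vaddr, 29, 16) == first)
        # Page frame number comparator 349 only helps if 348 also matched.
        page_hit = table_hit and bits(self.prev_vaddr, 15, 9) == second
        if not table_hit:
            self.table_base = self._read(self.spt_base, first)
        if not page_hit:
            self.page_base = self._read(self.table_base, second)
        self.prev_vaddr = vaddr
        # Concatenate bits <8:2> onto the latched page base address.
        return self.page_base | (third << 2)
```

In this model a full hit costs no memory reads, a match in comparator 348 alone costs one, and a miss costs two, which mirrors the savings the text attributes to the translation buffer.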
  • destination translation buffer consisting of destination page table page latch 352, destination secondary page table page latch 354, and previous destination address latch 356
  • stencil translation buffer consisting of stencil page table page latch 362, stencil secondary page table page latch 364, and previous stencil address latch 366
  • Processor unit 102 contains a CPU translation unit 108 which has a translation buffer similar to the source translation buffer described above. If a virtual address is displaced from main memory 140 (typically to a disk storage device 191), then the corresponding entry in the system page table is set to "invalid", as is the corresponding entry, if present, in the translation buffer in CPU translation unit 108. The next time that virtual address is translated, the computer's operating system reads that information from the disk storage device 191 into main memory 140, and places an entry, indicating the location of the data, and that the data is valid, in the translation buffer in CPU translation unit 108.
  • The invention provides two methods for maintaining consistency between the system page table and the entries in the translation buffers in virtual translation unit 280.
  • the address is transmitted to virtual translation/FIFO control unit controller 274 which compares the appropriate portion of the address with the address portions in page table latches 342, 344, 352, 354, 362, and 364; if a matching address is found, the corresponding entry is marked "invalid".
  • In a second method, if any entry in the system page table is marked invalid, all entries in virtual translation unit 280 are marked invalid.
  • any entry that is marked as “invalid” in the system page table will also be marked “invalid” in the virtual translation unit 280.
  • virtual translation unit 280 will never have an entry marked as "valid" when the corresponding entry in the system page table is marked as "invalid".
  • a translation of a virtual address by virtual translation unit 280 results in the same physical address as a translation of that same virtual address by the processor unit 102.
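The two consistency methods can be contrasted in a short sketch. The latch names and the dictionary representation are hypothetical, and the comparison of "the appropriate portion of the address" is reduced here to a plain equality test on the latched base address.

```python
class TranslationLatches:
    # Hypothetical names for page table latches 342, 344, 352, 354, 362, 364.
    NAMES = ("source_342", "source_344", "dest_352",
             "dest_354", "stencil_362", "stencil_364")

    def __init__(self):
        self.base = {name: None for name in self.NAMES}
        self.valid = {name: False for name in self.NAMES}

    def load(self, name, base_address):
        self.base[name] = base_address
        self.valid[name] = True

    def invalidate_matching(self, base_address):
        """First method: invalidate only latches holding the given address."""
        hit = [n for n in self.NAMES
               if self.valid[n] and self.base[n] == base_address]
        for name in hit:
            self.valid[name] = False
        return hit

    def invalidate_all(self):
        """Second method: invalidate every latch, whatever it holds."""
        for name in self.NAMES:
            self.valid[name] = False
```

The first method keeps unrelated translations usable at the cost of a comparison per latch. The second needs no comparison but forces the next translation of every operand type to walk the page tables again.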
  • duplicate tag store 194 situated relative to other system elements in a manner that enables graphics processor unit 210, as well as other system components, such as disk storage device 191, to effect a search of duplicate tag store 194 without creating traffic on system bus 110.
  • Duplicate tag store 194 is located on the same side of the system bus as graphics/memory control unit 130, network data bus structure 180, and I/O data bus structure 170. Specifically, duplicate tag store 194 is connected to I/O data bus structure 170. The purpose of duplicate tag store 194 is to ensure coherency between the information in CPU cache RAM 117 and the corresponding address in main memory 140.
  • processor unit 102 may have a six-way associative or a direct-mapped write-through cache tag store 118. Whenever a main memory location is read into processor cache tag store 118 or displaced from processor cache tag store 118, the same main memory location is read into, or displaced from, duplicate tag store 194. Thus the contents of duplicate tag store 194 are identical to those of processor cache tag store 118.
  • Memory transactions are acknowledged by memory state unit (234 of Fig. 3) over external address acknowledgement line (294 of Fig. 3), which connects to outgoing portions 113b, 173b, and 183b of the system request bus 113, the I/O request bus 173, and the network bus 183, respectively.
  • the information indicates the type of memory transaction (such as a read or a write) and the address in main memory 140 that is involved in the transaction. If a memory transaction is a write to an address in main memory 140 by some device other than CPU 106, duplicate tag store 194 searches its contents to see if there is a match with an entry in duplicate tag store 194.
  • duplicate tag store 194 issues an invalidate request to processor unit 102.
  • Cache controller 116 marks the corresponding entry in the cache tag store "invalid", so the next time that CPU 106 attempts to access that entry, cache controller 116 reads the new value from the corresponding address in main memory 140.
  • an invalidate request and therefore a transaction on system bus 110 occurs only if graphics processor unit 210, or some other system component writes to a location in main memory 140 that is resident in processor cache tag store 118, thereby further reducing traffic on the system bus 110.
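The filtering role of the duplicate tag store can be sketched as below. The line size, the use of a Python set for the tags, and the method names are assumptions made for the example. Whether the duplicate entry is also cleared when an invalidate is issued is not stated in the text, so the sketch leaves it in place.

```python
LINE_BYTES = 16  # hypothetical cache line size; the text does not give one


class DuplicateTagStore:
    def __init__(self):
        self.tags = set()             # mirrors processor cache tag store 118
        self.invalidates_issued = 0   # each one is a system bus transaction

    def _tag(self, address):
        return address // LINE_BYTES

    def fill(self, address):
        """Location read into the processor cache: mirror it here."""
        self.tags.add(self._tag(address))

    def displace(self, address):
        """Location displaced from the processor cache: drop it here too."""
        self.tags.discard(self._tag(address))

    def observe_write(self, address, writer_is_cpu):
        """Return True if an invalidate request goes to the processor unit."""
        if writer_is_cpu:
            return False          # only writes by other devices are searched
        if self._tag(address) in self.tags:
            self.invalidates_issued += 1
            return True
        return False              # no match: no system bus traffic at all
```

Only the matching case reaches the system bus, which is the traffic reduction the passage describes.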
  • Another feature of the invention is the use of system components to more efficiently transmit commands from processor unit 102 to address generator 212, which processes the commands. This can be understood by referring to Fig. 6, which shows the important elements of the path by which commands are transmitted from processor unit 102 to address generator 212.
  • the invention provides for efficient use of memory write buffers for the transmission of graphics commands; provides a command buffer in main memory 140, which allows processor unit 102 to transmit commands whenever the system bus is available; provides a second buffer, which allows transmissions from the command buffer to the address generator 212 to be of uniform length, even if the commands themselves are of variable length; and provides logic that sends the commands directly from the processor unit 102 to address generator 212 or to the second buffer, if the intervening elements on the transmission path are empty.
  • elements in the path by which commands are transmitted from processor unit 102 to address generator 212 include memory write buffers 371, 372, and 373, which are present in CPU bus interface 104; FIFO command buffer 134, which is present in main memory 140; residue buffer 378, which is present in pixel SLU 214; and short circuit logic 382, which is in virtual translation/FIFO control unit 230. Also in virtual translation/FIFO control unit 230 are a number of system components, shown in Fig. 4, involved in the management of FIFO command buffer 134.
  • FIFO command buffer base register latch 384, FIFO command buffer tail index latch 386, and FIFO put (FPUT) line 388
  • FIFO command buffer head index latch 392, FIFO/clip list next address index multiplexer 394, FIFO/clip list next address index latch 396, next address multiplexer 398, next base address multiplexer 402, FIFO length lines 404, FIFO empty/full comparator 406, FIFO length threshold mask 484, and FIFO save head latch 474.
  • These components include clip list base address latch 472 and clip list starting index latch 474.
  • Write range comparator 482 is used to ensure that any addresses accessed by virtual translation/FIFO control unit 230 are within allowable bounds.
  • Other components of virtual translation/FIFO control unit 230 include address generator comparator latch 476 and address generator comparator 478, and multiplexers 486 - 490.
  • graphics commands are transmitted from processor unit 102 as writes to a predefined range of addresses in memory.
  • CPU 106 writes graphics commands and other writes to memory, as well as other CPU transactions, on CPU bus 132 to CPU bus interface 104, which identifies the type of CPU transaction. If a transaction is a write to memory, CPU bus interface 104 places the contents of the transaction in one of memory write buffers 371, 372 until the buffer is filled, whereupon it writes further graphics commands and writes to memory into the other of buffers 371, 372.
  • Memory write buffers 371 and 372 alternately send their contents, in the order in which they were filled, to memory write buffer 373, when memory write buffer 373 is empty.
  • Memory write buffer 373 then transmits its contents on system bus structure 110 to graphics/memory control unit 130.
  • Prior to placing the contents of the CPU transaction into the memory write buffers 371, 372, CPU bus interface 104 examines the addresses of the writes to memory to see whether they are in the range of addresses designated for graphics commands; if a write to memory is within the range, CPU bus interface 104 changes a bit in the memory request portion of the write to memory to indicate that the write to memory is a graphics command.
  • the request portion of the write to memory is sent along system request bus 113, where it is received by flow control unit 232.
  • Flow control unit 232 reads the bit that indicates that the write to memory is a graphics command and signals memory state unit 234 that the write to memory should be sent to the FIFO command buffer 134.
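The tagging and routing steps can be illustrated with a small sketch. The address range, the bit position, and the function names below are all hypothetical, since the text specifies only that a predefined range exists and that one bit of the request portion marks a graphics command.

```python
# Hypothetical values; the text gives neither the range nor the bit position.
GRAPHICS_RANGE_START = 0x2000_0000
GRAPHICS_RANGE_END = 0x2000_1000      # exclusive
GRAPHICS_COMMAND_BIT = 1 << 0


def tag_write(address: int, request_bits: int) -> int:
    """CPU bus interface 104: mark writes that fall in the graphics range."""
    if GRAPHICS_RANGE_START <= address < GRAPHICS_RANGE_END:
        return request_bits | GRAPHICS_COMMAND_BIT
    return request_bits


def route_write(request_bits: int) -> str:
    """Flow control unit 232: pick a destination from the tagged request."""
    if request_bits & GRAPHICS_COMMAND_BIT:
        return "fifo_command_buffer"
    return "addressed_memory_location"
```

Because the decision is carried in the request portion, the flow control unit never has to examine the address itself.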
  • FIFO command buffer base register latch 384 stores the memory address of the head of the FIFO command buffer (134 of Fig. 6) .
  • FIFO command buffer tail index latch 386 stores the number of FIFO positions between the base address of the FIFO command buffer 134 and the tail of the FIFO command buffer 134 (that is, the current length of FIFO command buffer 134) .
  • FIFO command buffer tail index latch 386 and the FIFO command buffer base register latch 384 are combined to yield the memory address to which the graphics command should be sent; this address is transmitted on FIFO put (FPUT) line 388. Since memory address and control unit 236 has been previously signaled that the next write to memory is a write to FIFO command buffer 134, memory address and control unit 236 reads the address stored on FPUT line 388, and transmits the address on memory address bus 154. When the command is transmitted, index incrementer 496 increments the value in FIFO command buffer tail index latch 386.
  • the data portion of the write to memory is sent along system data bus 112 to pixel SLU 214.
  • Pixel SLU 214 sends the graphics command over memory data out (MDATO) line 408 to memory buffer 238, which transmits the data on memory data bus 152.
  • When a command is to be fetched from FIFO command buffer 134 for processing, it is processed as a memory request.
  • a memory request is issued to the flow control unit 232 over MA_REQ line 282 by the virtual translation/FIFO control unit controller 274.
  • the address for the memory request is generated by the virtual translation/FIFO control unit 230.
  • FIFO command buffer head index latch 392 stores the number of FIFO positions between the base address of the FIFO command buffer (134 of Fig. 6) and the head of the FIFO command buffer (134 of Fig. 6) .
  • the contents of the FIFO command buffer head index latch 392 are multiplexed through the FIFO/clip list next address index multiplexer 394, and are stored in the FIFO/clip list next address index latch 396, and transmitted to the next address multiplexer 398, where it is indexed by the contents of the FIFO/clip list next address latch 396 to yield the memory address of the head of the FIFO command buffer which is transmitted on the physical address bus (PADRS) 272.
  • the address is then sent to memory address and control unit 236, for transmission on the memory address bus 154.
  • The command is returned from memory on memory data bus 152 to memory data buffer 238, and then to pixel SLU 214 over MEMDAT line 264.
  • Pixel SLU 214 in turn transmits the graphics command to address generator 212 over pixel data bus (PXDAT) 222.
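The put and fetch addressing of the FIFO command buffer can be modelled as a ring of 32-bit words in main memory. This is a sketch under assumptions: the `capacity` parameter and the wraparound of the indexes are hypothetical (the text does not say how the indexes wrap), and main memory is reduced to a dictionary.

```python
WORD_BYTES = 4  # graphics command words are 32 bits


class FifoCommandBuffer:
    def __init__(self, memory, base, capacity):
        self.memory = memory      # dict modelling main memory: address -> word
        self.base = base          # base register latch 384
        self.capacity = capacity  # hypothetical: FIFO positions before wrapping
        self.tail = 0             # tail index latch 386
        self.head = 0             # head index latch 392

    def put(self, word):
        """Write one command word at the tail; return the address used."""
        address = self.base + WORD_BYTES * self.tail   # driven on FPUT line 388
        self.memory[address] = word
        self.tail = (self.tail + 1) % self.capacity    # index incrementer 496
        return address

    def get(self):
        """Fetch the command word at the head as an ordinary memory read."""
        if self.head == self.tail:
            raise IndexError("FIFO command buffer is empty")
        address = self.base + WORD_BYTES * self.head   # driven on PADRS 272
        self.head = (self.head + 1) % self.capacity
        return self.memory[address]

    def length(self):
        return (self.tail - self.head) % self.capacity
```

The sketch does not guard against overfilling. In the described system that protection comes from the "almost full" mechanism rather than from the put path.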
  • The use of the FIFO command buffer increases the efficiency of transmitting graphics commands from the processor unit 102 to the address generator 212 in many ways. Processing the transmission of commands from processor unit 102 to address generator 212 as writes to memory allows for increases in efficiency in system operation.
  • The use of the memory write buffers 371, 372, and 373 ensures that, when commands are transmitted, the full bandwidth of the system bus is used. Providing a FIFO command buffer 134 permits commands to be transmitted from processor unit 102 whenever the system bus is available, whether or not address generator 212 is available to process a command.
Residue Buffer
  • The performance of the FIFO command buffer 134 and, generally, the process by which graphics commands are sent from the processor unit 102 to the graphics processor unit 210, is made even more efficient by another feature of the invention, the residue buffer 138, in pixel SLU 214.
  • graphics commands are structured in the form of command packets of one to four 32 bit words.
  • The first word (referred to as the "header") has two bits that designate how many 32 bit words there are in the command packet in addition to the header. For example, the length bits in a three word command packet would be set to the value two.
  • the transmissions from FIFO command buffer 134 to the graphics processor unit 210 are a uniform packet of four 32 bit words.
  • a transmission may contain parts of more than one command.
  • Residue buffer 138 is a set of three 32 bit registers. Residue buffer 138 is controlled by signals transmitted from the virtual translation/FIFO control unit controller 274 over FIFO_CTL line 248. When a transmission from the FIFO command buffer 134 is received in the pixel SLU 214, the virtual translation/FIFO control unit controller 274 causes the first command packet to be forwarded to the address generator 212 over the pixel data bus 222. The virtual translation/FIFO control unit controller 274 causes the remainder of the transmission, which may contain additional command packets, or a portion of an additional command packet, or both, to be loaded into the residue buffer 138. When the address generator 212 has completed executing a command, the virtual translation/FIFO control unit controller 274 causes the contents of the residue buffer 138 to be immediately forwarded from residue buffer 138 to the address generator 212 over the pixel data bus 222.
  • residue buffer 138 ensures that the full bandwidth of memory data bus 152 is used, while still allowing for variable lengths of graphics command packets.
  • storing the next command at a location close to address generator 212 decreases the idle time of address generator 212.
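How a uniform four-word transmission divides into a first command packet and a residue can be sketched as follows. The position of the two length bits within the header (taken here as the two low-order bits) is an assumption, as is the premise that the transmission begins on a packet boundary.

```python
TRANSMISSION_WORDS = 4   # every FIFO transmission is four 32-bit words
RESIDUE_REGISTERS = 3    # residue buffer 138 is three 32-bit registers


def packet_length(header: int) -> int:
    """Total words in a packet: the header plus the count in its length bits.

    Assumption: the two length bits are the low-order bits of the header.
    """
    return 1 + (header & 0b11)


def split_transmission(words):
    """Return (first_packet, residue) for one four-word transmission."""
    if len(words) != TRANSMISSION_WORDS:
        raise ValueError("a transmission is exactly four words")
    n = packet_length(words[0])
    first_packet, residue = words[:n], words[n:]
    # A packet is at least one word, so the residue always fits.
    assert len(residue) <= RESIDUE_REGISTERS
    return first_packet, residue
```

Since the shortest packet is a single header word, at most three words are ever left over, which is consistent with the residue buffer being three registers wide.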
  • The short circuit mechanism 382 is logic in the virtual translation/FIFO control unit controller 274 that monitors the status of the FIFO command buffer 134, the residue buffer 138, and the address generator 212, and causes commands that are transmitted from processor unit 102 to address generator 212 to be transmitted in a minimum number of steps.
  • When the address generator 212 is processing a command, it asserts, over AG_BUSY line 287, a signal to short circuit mechanism 382. Additionally, the short circuit mechanism monitors the status of the residue buffer by monitoring commands issued by virtual translation/FIFO control unit controller 274. And finally, the short circuit mechanism 382 monitors the length of the FIFO command buffer 134 over FIFO length lines 404 (of Fig. 4). Each time a transaction involving the FIFO command buffer 134, the address generator 212, or the residue buffer 138 occurs, the short circuit mechanism calculates a logic equation.
  • This equation determines the most effective destination for the next command transmitted from processor unit 102 that is intended for the FIFO command buffer 134, that is, where the graphics command should be sent to minimize the number of transfers from one graphics command storage or processing element to another.
  • The logic equation is summarized in the following table:
  • FIFO command buffer 134 is not allowed to get full. Use of the "almost full" function causes the virtual translation/FIFO control unit controller 274 to inhibit further transmissions to the FIFO command buffer 134 when the "almost full" function is activated.
  • The FIFO command full/empty comparator 406 compares the value stored in FIFO command buffer head index latch 392 (of Fig. 4) with the value stored in the FIFO command buffer tail index latch 386 (of Fig. 4) to calculate the length of (that is, the number of commands in) FIFO command buffer 134 and compares the length with a programmable maximum, or "almost full" value, stored in FIFO length threshold mask 484. If the "almost full" value is reached, the FIFO command full/empty comparator 406 (of Fig. 4) signals the virtual translation/FIFO control unit controller 274. Virtual translation/FIFO control unit controller 274 issues an interrupt to CPU 106, to suspend the transmitting of graphics commands.
  • the length of the FIFO command buffer 134 changes, thereby changing the value stored in FIFO command buffer head index latch 392 (of Fig. 4) .
  • the FIFO command full/empty comparator 406 (of Fig. 4) signals the virtual translation/FIFO control unit controller 274.
  • Virtual translation/FIFO control unit controller 274 issues an interrupt to CPU 106, to resume the transmission of graphics commands.
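The suspend and resume behaviour can be sketched as a small state machine. The exact condition under which transmission resumes is not spelled out in the text, so resuming as soon as the length drops below the "almost full" value is an assumption, as are the class name, the `capacity` parameter, and the string return values standing in for interrupts.

```python
class FifoLengthMonitor:
    def __init__(self, almost_full, capacity):
        self.almost_full = almost_full  # value in FIFO length threshold mask 484
        self.capacity = capacity        # hypothetical: positions before wrapping
        self.suspended = False

    def update(self, head, tail):
        """Compare head and tail indexes; return the interrupt to raise, if any."""
        length = (tail - head) % self.capacity   # comparator 406
        if not self.suspended and length >= self.almost_full:
            self.suspended = True
            return "suspend"
        # Assumption: resume once the length falls below the threshold.
        if self.suspended and length < self.almost_full:
            self.suspended = False
            return "resume"
        return None
```

Tracking the suspended state means each interrupt is raised once per crossing rather than on every FIFO transaction.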
  • Short circuit logic 382 ensures that graphics commands are sent directly to the point in the command processing path as close to the address generator as possible, thereby eliminating memory transactions and also eliminating traffic on memory buses 152 and 154.
  • Yet another feature of the invention is the method of storing cursor information in main memory 140, and the method of controlling cursor movement by graphics processor unit 210.
  • the invention simplifies the design of video unit 160, as well as minimizing the amount of frame buffer memory 164 that is required by the system. Additionally, some of the cursor control components can be adapted to serve as data paths for video units 160 that include devices other than frame buffer memory 164 and video DAC 166.
  • FIG. 7 is a block diagram of video/cursor control unit 240.
  • Internal components of video/cursor control unit 240 are video state unit 412, cursor scanline data buffer and shifter 422, cursor position controller 424, and video/cursor control unit controller 428.
  • Video/cursor control unit 240 transmits memory address data and memory request data through video and cursor memory interface 414 over cursor memory address line 416 and cursor memory request line 418, respectively. Additionally, video/cursor control unit 240 transmits video control information and cursor information through video and cursor external interface 426 over video control bus 126 and cursor bus 128, respectively.
  • Video/cursor control unit controller 428 receives system clock and video clock signals, and sends clock pulses to video state unit 412, cursor position controller 424, and video and cursor memory interface 414 over clock pulse lines.
  • Cursor information is of two forms, pattern data and screen location. Referring to Fig. 2, information regarding the cursor pattern, which extends over 64 consecutive scanlines on the display 162, is stored in a 1024 byte contiguous section of main memory 140. The address of the first byte of the 1024 byte contiguous section of main memory is stored in the video and cursor memory interface (414 of Fig. 7).
  • The screen location of the cursor is defined in terms of the X and Y coordinates on the display 162 of the topmost, leftmost pixel of the cursor.
  • The X position is stated in terms of the number of pixels from the leftmost side of the display 162, and the Y position is stated as the number of a scanline, beginning at the top of the display 162.
  • The screen location of the cursor is controlled by the computer user, and is input to the computer graphics system by a cursor position controller 109, typically a "mouse", attached to bus interface 104.
  • Cursor screen location input from the mouse is transmitted to CPU 106.
  • CPU 106 transmits the cursor screen location input to graphics/memory control unit 130.
  • The cursor screen location input enters graphics/memory control unit 130 over incoming data portion 112a of system data bus 112, and is transmitted to pixel SLU 214.
  • Pixel SLU 214 in turn routes the cursor screen location input to the video/cursor control unit 240 over pixel data bus 222.
  • The cursor screen location input enters video/cursor control unit 240 over pixel data bus 222 and is routed to video state unit 412, where it is stored.
  • Video state unit 412 monitors the X position (that is, the number of pixels from the leftmost side of the display (162 of Fig. 2)) and Y position (that is, the number of scanlines from the top of the display (162 of Fig. 2)) of the next pixel to be shown on the display (162 of Fig. 2).
  • Video state unit 412 signals video and cursor memory interface 414.
  • Video and cursor memory interface 414 generates a memory read request, consisting of an address portion and a request portion.
  • The address portion of the memory read request contains the address of the first byte of the 1024 byte contiguous section of main memory in which the cursor pattern information is stored.
  • The address portion of the memory read request is transmitted on cursor memory address line 416.
  • The request portion of the memory read request is transmitted on cursor memory request line 418. Referring now to Fig. 3, the address portion of the memory read request, which was transmitted on cursor memory address line 416, is routed to memory address and control unit 236; the request portion of the memory read request, which was transmitted on cursor memory request line 418, is routed to flow control unit 232.
  • The memory read request is then processed in the same manner as a memory request issued by address generator 212, which was described above, resulting in the cursor pattern data (for the next scanline to be displayed) being returned to pixel SLU 214.
  • Pixel SLU 214 in turn transmits the cursor pattern data on pixel data bus 222.
  • Video/cursor control unit 240 reads the cursor pattern data off pixel data bus 222.
  • Cursor pattern data is routed on pixel data bus 222 to cursor scanline data buffer and shifter 422, which aligns the cursor according to signals generated by cursor position controller 424, for reasons that will be explained below.
  • The cursor pattern data is then transmitted by video and cursor external interface 426 over cursor bus 128 to video unit (160 of Fig. 2), where the cursor pattern data is processed in a manner that will be described in the discussion of the video unit.
  • Video state unit 412, subsequent to generating the signal to cursor memory interface 414, increments a scanline counter.
  • Video state unit 412 examines the value in the scanline counter. If the value in the counter is less than 64 (meaning that there are more scanlines of cursor to be displayed), video state unit 412 issues a signal to cursor memory interface 414, which requests the next scanline of cursor information from main memory (140 of Fig. 2).
  • The video state unit 412 takes no further action relative to cursor display until the next time the X and Y position of the cursor matches the X and Y position of the next pixel to be shown.
  • The alignment signals generated by cursor position controller 424 are necessary because transmissions on the cursor bus include groups of pixels. If the first pixel of the cursor pattern is in the middle of a group of pixels, it is necessary to properly align the first pixel of cursor pattern data with the group of pixels in the transmission.
  • Video/cursor control unit controller 428 receives control information from flow control state unit 232 over FCTL line
  • An additional function of video/cursor control unit controller 428 is to control the configuration of the other elements of video/cursor control unit 240. If it is desired to replace video unit 160 with a more complex video unit, such as a three dimensional graphics processor unit, or a more sophisticated video DAC, the invention provides for using some of the communications links and memory request capabilities of video/cursor control unit 240 for purposes other than cursor control.
  • A signal (not shown), generated by CPU 106 and sent to video/cursor control unit controller 428, causes a change in a bit in a register in video/cursor control unit controller 428.
  • The invention allows cursor pattern information to be stored in main memory 140, and further allows video unit 160 to require less logic and less frame buffer memory than would otherwise be required.
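  • As a rough illustration of the cursor fetch sequence described above, the following Python sketch computes the main memory address requested for each cursor scanline. It assumes, beyond what the text states, that the 64 scanlines are stored back to back in equal-sized rows (1024 / 64 = 16 bytes each). The names `scanline_address` and `cursor_fetch_addresses` are hypothetical and do not correspond to any element of the figures.

```python
CURSOR_BYTES = 1024      # size of the cursor pattern region (from the text)
CURSOR_SCANLINES = 64    # scanlines covered by the cursor pattern (from the text)

# Assumption: every scanline occupies the same number of bytes and the
# scanlines are stored consecutively, starting at the base address.
BYTES_PER_SCANLINE = CURSOR_BYTES // CURSOR_SCANLINES


def scanline_address(base_address, scanline):
    """Return the address of the first byte of one cursor scanline."""
    if not 0 <= scanline < CURSOR_SCANLINES:
        raise ValueError("cursor scanline out of range")
    return base_address + scanline * BYTES_PER_SCANLINE


def cursor_fetch_addresses(base_address):
    """Imitate the scanline counter: one read request per scanline,
    stopping once the counter reaches 64."""
    addresses = []
    counter = 0
    while counter < CURSOR_SCANLINES:
        addresses.append(scanline_address(base_address, counter))
        counter += 1
    return addresses
```

Under these assumptions, a cursor pattern whose first byte is at address 0x1000 would produce 64 read requests, 16 bytes apart.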
  • Frame Buffer Module
  • Another feature of the invention is the arrangement of system elements so that the circuit board, or module, on which the frame buffer memory 164 is placed contains a minimum of circuitry and components, and is therefore less expensive. If the computer user wishes to upgrade the monitor from, for example, a low resolution monitor to a high resolution monitor, or from a monochrome monitor to a color monitor, the computer user must typically add more frame buffer memory. This generally requires replacing the frame buffer module. Since, according to the invention, the frame buffer module contains a minimum of circuitry and components, the frame buffer module is relatively inexpensive, thus minimizing the cost to the user.
  • FIG. 8a shows video unit 160 in greater detail, in a configuration designed to support a low resolution monitor.
  • Frame buffer memory 164 consists of interleaved frame buffer memory banks 432a and 432b, each bank being four 128K eight bit units of standard dual ported video RAM.
  • Frame buffer memory banks 432a and 432b are connected to video DAC 166 (for example, a model BT458 RAMDAC, available from the Brooktree Corporation of San Diego, CA, is suitable) through video multiplexer 434.
  • Nibble clock 436, LUT load path multiplexer 438 (connected to video multiplexer 434 and video DAC 166), and frame buffer ROM 442 are also present on video unit 160.
  • Video unit 160 is implemented as a module, that is the components of video unit 160 are mounted on an easily replaceable unit, such as a circuit board.
  • Video synchronization (VSYNC) line 446 and video blanking (VBLNK) line 448 connect to video DAC 166, and video DAC enable (BTEN) line attaches to video DAC timing unit 452.
  • Video shift line 453 connects to frame buffer memory banks 432a and 432b, and nibble clock (NIBCLK) line 454 connects to nibble clock 436.
  • Data is communicated between frame buffer memory banks 432a and 432b over memory data bus 152 according to signals transmitted on memory address bus 154.
  • Frame buffer latches 444a and 444b act as temporary storage that allows frame buffer memory banks 432a and 432b, which transmit data in 64 bit units, to interface with memory data bus 152, which transmits data in 32 bit units.
  • Video DAC 166 converts data to video signals for display 162. Data from frame buffer memory banks 432a and 432b may be overwritten by input from cursor bus 128, which superimposes the cursor over the graphics image, or by VBLNK line 448, which causes the screen of display 162 to be blanked.
  • A color look up table (LUT) is stored in either or both of frame buffer memory banks 432a and 432b.
  • Each entry in the LUT contains a combination of the colors (typically red, blue, and green) that the display 162 can illuminate, with varying degrees of intensity, at each pixel.
  • Each entry in frame buffer memory banks 432a and 432b contains a reference to an entry in the LUT.
  • The LUT is loaded into video DAC 166 through video multiplexer 434 and the LUT load path multiplexer 438.
  • The LUT load path multiplexer separates the output from video multiplexer into two portions, a data portion and a control portion.
  • The control portion is transmitted to video DAC over LUT control input line 456, and the data portion is transmitted to video DAC over LUT control data line 458.
  • The LUT load path multiplexer also selects between LUT input from video multiplexer 434 and from diagnostic signal line 462.
  • Transceiver 464, video analog comparator 466, and diagnostic signal bus 468 are a part of the diagnostic system.
  • Fig. 8b shows video unit 160 in a configuration designed to support a multiple headed system, that is, a system that has two displays.
  • Additional elements required to support the second display include additional frame buffer 164', which includes memory banks 432c and 432d, video multiplexer 434', LUT load path multiplexer 438', video analog comparator 466', and video DAC 166'. Signal lines from timing bus 126 and cursor bus 128 are split and connected to the corresponding additional elements.
  • Implementing video unit 160 as a module, connected to buses 126, 128, 152, 154, and 468 by ports 501 - 505, enables the upgrade to be accomplished by removing module 160 from ports 501 - 505 and replacing it with another module 160' (not shown in this Figure).
  • FIG. 9 shows the structure of Fig. 8, with video unit 160 replaced by a video unit 160', which has on it video device 161.
  • Video device 161 may be a video option, such as a three dimensional video device or a graphic accelerator.
  • Video device 161 may also be any other type of computer device which can advantageously be attached to the memory bus.
  • Video unit 160' is connected to memory data bus 152 and memory address bus 154, thereby allowing for memory transfers between video unit 160' and main memory (140 of Fig. 2) in the same manner as described previously for transfers between frame buffer memory (164 of Fig. 2) and main memory (140 of Fig. 2) .
  • Video unit 160' is connected to cursor bus 128.
  • Cursor bus 128 can be used to transmit signals, such as inhibit, reset, and interrupt signals.
  • The memory bus structure 150 is especially adapted to efficiently transfer data between frame buffer memory 164 and other system components connected to memory bus structure 150.
  • Memory bus structure 150, consisting of memory data bus 152, memory address bus 154, video control bus 126, and cursor bus 128, is implemented as a set of communication lines from memory control unit 220 to main memory 140, frame buffer memory 164, and video DAC 166.
  • Memory bus structure 150 is shown in Fig. 10.
  • Memory data bus 152 consists of three portions. First portion 152a of memory data bus 152 connects to both main memory 140 and to frame buffer memory 164. First portion 152a transmits data, and latch enable signals that allow memory bus to be of a different width, in number of bits, from main memory 140 or frame buffer 164. Thus, data can be transmitted over memory data bus 152 first portion 152a to either main memory 140 or frame buffer memory 164. Second portion 152b of memory data bus 152 consists of three communications lines that connect memory control unit 220 and frame buffer memory 164. The three communications lines of second portion of memory data bus 152b are output enable lines for frame buffer memory 164. Third portion 152c of memory data bus 152 consists of communication lines that connect memory control unit 220 and main memory 140.
  • Memory address bus 154 consists of three portions.
  • First portion 154a of memory address bus 154 consists of communications lines that connect graphics/memory unit 130 with both main memory 140 and frame buffer memory 164, thereby enabling address data to be transmitted from memory control unit 220 to both main memory 140 and frame buffer memory 164.
  • Second portion 154b of memory address bus 154 consists of three communication lines that terminate at memory control unit 220 and frame buffer memory 164. The three communication lines transmit timing signals, output enable signals, and special function information, respectively.
  • Third portion 154c of memory address bus 154 connects memory control unit 220 and main memory 140.
  • The cursor bus 128 consists of communications lines that transmit cursor information to video DAC 166. Eight communications lines can also be used for other purposes if the video unit 160 is replaced (as shown in Fig. 9) with a video unit 160', which has on it a video device 161 such as a more complex video unit, a three dimensional graphics unit, or a more sophisticated video DAC. In this case, the eight communication lines do not carry cursor signals.
  • Two of the lines carry system clock signals to the video unit 160'; two of the lines carry signals to video unit 160' indicating the validity and length, in 32 bit words, of transmissions intended for video device 161; one of the lines transmits reset signals to video device 161; and the remaining three lines carry inhibit, interrupt, and stall signals from the video unit 160'.
  • A signal, generated by CPU (106 of Fig. 2) and sent to video/cursor control unit controller 428, changes a bit in a register in video/cursor control unit controller 428, which causes cursor scanline data buffer and shifter 422 not to perform its normal function. Instead, cursor scanline data buffer and shifter 422 passes data between cursor bus 128 and pixel data bus 222.
  • This configuration provides a direct communication path between cursor bus 128 and cursor memory interface 414, thus enabling the video unit (160' of Fig. 9) to communicate control signals through cursor memory interface 414.
  • This configuration further provides a method for accomplishing memory transfers directly between main memory 140 and video unit 160' without moving the data through graphics/memory unit 130.
  • Graphics processor unit 210 issues a memory read request in the manner described above. The read request results in the data from the requested memory address being transmitted on memory data bus 152. Signals are transmitted on the two lines of cursor bus 128 that indicate the validity and length of transmissions intended for video device 161, thereby causing one of latches 444a and 444b to read the data that is on cursor bus 128.
  • Video control bus 126 transmits video control signals to video unit 160'.
  • Video control bus 126 consists of a plurality of communications lines. Eight of the communications lines transmit, respectively, a video blanking signal, a video synchronization signal, a video shift register enable signal, a video multiplexer select signal, an enable signal for loading the color look-up table (LUT), an LUT input multiplexer select signal, and a video nibble clock signal.
  • The invention provides a method by which the cursor bus 128 can be used for purposes other than communicating cursor information.
  • This enables the system designer to replace the video module (160 of Fig. 2) with a video unit 160', without requiring the expensive and complex task of redesigning the memory bus structure 150.
  • The graphics system can therefore be easily and inexpensively upgraded from a low resolution monitor, to a higher resolution monitor, to a more complex video option, or to some other optional device.
  • The invention allows for the transfer of data directly from main memory 140 to video unit 160', without the data passing through graphics/memory unit 130.
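  • The alternate use of the eight cursor bus lines described above can be summarized as a small table. The Python sketch below is only an illustration: the line order, the signal names, and the split of the two validity/length lines into one "valid" line and one "length" line are assumptions, since the text gives only the number of lines per function and their direction.

```python
# Direction labels (hypothetical names, used only in this sketch).
TO_VIDEO_UNIT = "to video unit 160'"
FROM_VIDEO_UNIT = "from video unit 160'"

# Assumption: the ordering of the eight lines is arbitrary; the text states
# only how many lines serve each purpose and which way the signals travel.
ALTERNATE_CURSOR_BUS_LINES = [
    ("system clock A", TO_VIDEO_UNIT),
    ("system clock B", TO_VIDEO_UNIT),
    ("transmission valid", TO_VIDEO_UNIT),
    ("transmission length", TO_VIDEO_UNIT),
    ("reset", TO_VIDEO_UNIT),
    ("inhibit", FROM_VIDEO_UNIT),
    ("interrupt", FROM_VIDEO_UNIT),
    ("stall", FROM_VIDEO_UNIT),
]


def lines_by_direction(direction):
    """Return the names of the lines that carry signals in one direction."""
    return [name for name, line_direction in ALTERNATE_CURSOR_BUS_LINES
            if line_direction == direction]
```

Calling `lines_by_direction(FROM_VIDEO_UNIT)` lists the three signals that the replacement video unit drives back toward the graphics/memory unit.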

Abstract

A low cost, high performance computer graphics system. A graphics processor, capable of making memory requests, is connected to a memory control unit, which controls a memory bus. Both main memory and a frame buffer memory are attached to the memory bus, thereby giving the graphics processor the capability of writing to either main memory or to the frame buffer memory. The disclosure further describes a computer graphics system having a duplicate cache tag store accessible by the graphics system without generating traffic on the system bus; having a FIFO command buffer in main memory for temporary storage of graphics commands; having a residue buffer for the temporary storage of memory transmissions not immediately usable by the graphics processor; having a 'short circuit' feature for routing graphics commands to the command processor in the minimum number of steps; having a cursor control system capable of storing cursor pattern information in, and retrieving cursor pattern information from, main memory; having a cursor bus that is reconfigurable to carry information other than cursor information; and having a frame buffer module that contains no timing or cursor control circuitry.

Description

COMPUTER GRAPHICS SYSTEM
Field of the Invention
This invention relates to computer graphics systems, and more particularly to the computer architecture of graphics systems.
Background
Computer graphics systems are special purpose computers that are used to create complex images on a display and to allow the computer user to modify and store the images.
Many of the elements of a typical computer graphics system are shown in Fig. 4 of U.S. Patent 4,745,407, entitled "Memory Organization Apparatus and Method", issued May 17, 1988 to Costello. A graphics accelerator, such as the graphics accelerator of Fig. 4 of Costello, is a special purpose processor unit which receives graphics commands from the CPU and executes the commands, typically by changing information stored in a frame buffer memory (referred to as "frame buffer" in Costello). A frame buffer memory is a special purpose type of memory in which the memory locations correspond to locations, or pixels, on a monitor, or other type of display. Devices not shown in Fig. 4 of Costello sequentially read the memory locations in the frame buffer memory, and cause each pixel to be lit with the appropriate intensity or color, thereby causing the image to be shown on the display. Pixels are illuminated starting at the upper left corner of the display screen; the pixels are illuminated from left to right in a horizontal row (called a scanline). When one scanline has been illuminated, the display proceeds to the scanline below, beginning at the left side of the screen, until it has illuminated the pixel in the lower right hand corner.
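The scanline order described above can be sketched in Python as a mapping between pixel coordinates and frame buffer locations. This is an illustration only: it assumes one memory location per pixel and a simple linear layout, neither of which is stated in the text, and the function names are hypothetical.

```python
def pixel_to_index(x, y, width):
    """Frame buffer index of pixel (x, y), counting from the upper left.

    Assumption: one location per pixel, stored scanline after scanline.
    """
    return y * width + x


def index_to_pixel(index, width):
    """Inverse mapping: recover (x, y) from a frame buffer index."""
    y, x = divmod(index, width)
    return (x, y)


def raster_order(width, height):
    """Yield pixels in display order: left to right within a scanline,
    then the next scanline down, ending at the lower right corner."""
    for y in range(height):
        for x in range(width):
            yield (x, y)
```

Reading the frame buffer sequentially from index 0 upward visits the pixels in exactly the order that `raster_order` yields them.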
Prior art graphics systems, such as the system described by Costello, do not provide for efficient use of main memory by the graphics accelerator. Transfers of information between the main memory and the frame buffer memory must be made via the system bus, under control of the CPU. Also, if an application program contains drawing instructions and/or commands that specify that an image is to be "drawn" and stored in main memory, rather than the frame buffer memory, the CPU cannot take advantage of the graphics processing power of the graphics system processor, because the graphics system processor does not have access to main memory. If a graphics command involves retrieving a graphics image stored in main memory, the image must be retrieved by the CPU. This not only involves action by the CPU, but also creates undesirable traffic on the system bus. In addition, the graphics system processor is set up to interact with the frame buffer memory rather than the main memory, and therefore, the CPU must convert the drawing instructions to draw to main memory locations. Other graphics systems, such as those described in U.S. Patent 4,947,342, entitled "Graphic Processing System for Displaying Characters and Pictures at High Speed", issued August 7, 1990 to Katsura, et al., and U.S. Patent 5,020,003, issued May 28, 1991, to Moshenberg, overcome some of the difficulties of Costello, but require that the graphics processor access a memory access controller over a memory bus, or require additional frame buffer memory, which is a special purpose type of memory, and is therefore relatively expensive.
If an application program running on a system such as that described by Costello contains drawing instructions and/or commands that specify that an image is to be "drawn" and stored in main memory, rather than the frame buffer memory, the CPU cannot take advantage of the graphics processing power of the graphics system processor, because the graphics system processor does not have access to main memory. If the computer graphics system has virtual memory capability, the graphics system is further disadvantaged in that it cannot access virtual memory, and therefore cannot take advantage of the increased memory made available by a virtual memory system. Prior art computer graphics systems do not provide a method to efficiently interact directly with main memory, and do not provide a method for the graphics processor to access virtual memory.
In addition to the elements shown in Costello, some computer graphics systems may have a CPU cache memory. A cache memory is a memory placed in close proximity to the CPU, and which contains the information in main memory locations most frequently accessed by the CPU. The close proximity of cache memory enables the CPU to access the data in the cache memory more quickly than it can access the data in the main memory.
A typical cache memory system consists of cache RAM (random access memory), a cache controller, and a tag store. The tag store is a table of the main memory addresses of the information that is stored in the cache RAM. The cache RAM stores the information that is operated on by the CPU. The cache controller controls the information that passes in and out of the cache RAM, and updates the cache tag store. The specific structure of the cache tag store and of the entries in a cache tag store depends on whether a cache is a "direct mapped" cache, a "set associative" cache, or a "fully associative" cache. One characteristic of all cache tag stores, however, is that they have some method for indicating the main memory addresses corresponding to the entries in the cache memory.
When the CPU needs the information in a main memory address, the tag store is searched for the main memory address. If the main memory address is in the tag store (a cache "hit"), the information is retrieved from the cache RAM and sent to the CPU. If the main memory address is not in the tag store (a cache "miss"), the cache controller retrieves the information from main memory, stores it in cache RAM, and records the main memory address in the tag store. When the CPU stores the information, it sends the information back to the cache controller, which stores the information in the cache RAM. If the cache is a "writethrough" cache, the information is also written to the corresponding address in main memory. If the cache is a "writeback" cache, the information is not written to the corresponding address in main memory until a later time.
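The hit, miss, writethrough, and writeback behavior described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the cache of any particular system: the cache is unbounded (nothing is ever displaced), the tag store is modeled as a plain set of addresses, a write also records the address in the tag store, and the `flush` method is a hypothetical stand-in for the "later time" at which a writeback cache updates main memory.

```python
class SimpleCache:
    """Toy cache: cache RAM plus a tag store of main memory addresses."""

    def __init__(self, main_memory, write_through=True):
        self.main_memory = main_memory    # dict: address -> value
        self.cache_ram = {}               # address -> cached value
        self.tag_store = set()            # addresses held in cache RAM
        self.dirty = set()                # writeback only: not yet in main memory
        self.write_through = write_through
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.tag_store:     # cache "hit"
            self.hits += 1
        else:                             # cache "miss": fetch and record the tag
            self.misses += 1
            self.cache_ram[address] = self.main_memory[address]
            self.tag_store.add(address)
        return self.cache_ram[address]

    def write(self, address, value):
        self.cache_ram[address] = value
        self.tag_store.add(address)       # assumption: writes allocate an entry
        if self.write_through:
            self.main_memory[address] = value   # main memory updated immediately
        else:
            self.dirty.add(address)             # main memory updated later

    def flush(self):
        """Writeback: copy all modified entries to main memory."""
        for address in sorted(self.dirty):
            self.main_memory[address] = self.cache_ram[address]
        self.dirty.clear()
```

With `write_through=False`, main memory keeps the old value after a write until `flush` is called, which is the situation that makes the coherency checks discussed next necessary.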
In a computer system, it is important that the contents of any location in main memory appear identical to all system components that access main memory. If a main memory location is in cache memory, particularly a "writeback" cache memory, the information in main memory may not be the most current value, and the information in main memory must be updated before any system component other than the CPU attempts to read that main memory location. Verifying whether or not a main memory location is resident in cache, and updating the value, can be done in a number of ways, but many of them result in some traffic on the system bus, even if the main memory location is not currently resident in cache.
Similarly, if any system component other than the CPU writes to a main memory location, there must be some method of verifying whether that location is in cache and, if it is, of updating the information in cache. This can be done in a number of ways, but all result in some traffic on the system bus, even if the memory location is not currently resident in cache.
One method for providing a more efficient, less expensive computer graphics system is to provide more efficient means for transmitting and processing the graphics commands.
In a system such as the one taught by Costello, commands are transmitted from the CPU to the graphics accelerator over the VME bus. Unfortunately, general purpose buses such as the VME bus frequently carry a lot of traffic. This is particularly true in a system such as Costello, in which traffic between main memory and the frame buffer must travel over the same bus as traffic between the CPU and the graphics processor.
Thus it is important to maximize the efficiency of transmissions over the bus. One way of doing this is to force transmissions to be of a uniform size, typically a number of words or bytes, and to optimize the bus for transmissions of that size. However, for certain types of transmissions, it may be desirable for the information to be in sizes other than the uniform size. For example, the information being transferred may be a stream of graphics commands. If the graphics commands are all the same length, and the length is the same as the uniform transmission size, then the graphics commands may be transmitted to and from main memory efficiently. However, restricting the length of the graphics commands to the uniform transmission size may be inconvenient for the designer of the graphics command set. If the designer of the graphics command set wishes to create graphics commands of varying length, it may be very difficult for the system designer to select an optimal uniform transmission size.
A feature present in most computer graphics systems is a cursor. A cursor is a movable marker that appears on the display screen and provides a visible indicator of the position of interest. Cursors are typically controlled by devices such as directional keys on a keyboard, or by an input device, such as a "mouse". The cursor appears as an overlay over other images that are displayed on the screen.
Information regarding the shape and size of the cursor and the cursor's location are typically stored in a dedicated memory or in a portion of "off-screen" frame buffer memory, that is, a portion of the frame buffer memory that does not correspond to a pixel on the screen of the display.
U.S. Patent 4,706,074, entitled "Cursor Circuit for a Dual Port Memory", issued November 10, 1987 to Muhich, et al. is exemplary of a cursor in which data relating to the cursor is stored in off screen frame buffer memory, most clearly seen in Fig. 3 of Muhich.
A problem with storing cursor information in frame buffer memory is that frame buffer memory is expensive relative to main memory.
Video timing and control circuitry is often placed on the same circuit board as the frame buffer memory. However, including the timing and control circuitry on a single circuit board increases the expense and complexity of manufacturing the circuit board, and therefore causes the circuit boards to be very expensive. If a computer user adds more frame buffer memory (which is necessary if the user changes the type of display to a display with higher resolution, or changes from a monochrome to color display), the user must replace the circuit board on which the frame buffer memory is placed.
Buses, such as the VME bus and the local memory bus, are important components of a computer system. Generally, it is desirable to minimize the number of communications lines in each bus, because adding more lines generally means more expense. In addition, there must be a terminal pin at each point that a communication line connects to a system component. It is very difficult and expensive to add pins to system components.
Summary of the Invention
This invention provides a low cost graphics system which enables graphics generation by a process which requires a minimum of traffic on the system bus, and which allows the graphics processor to interact efficiently with main memory, and requires a minimum of frame buffer memory.
In accordance with the invention, there is an increase in speed and efficiency by the utilization of a graphics processor which interacts directly with main memory for transfer of graphics information to main memory, for later retrieval, or to a frame buffer memory, for ultimate display. There is a reduction in system cost by enabling the graphics processor to draw to main memory directly, thereby minimizing the amount of graphics dedicated memory required.
The graphics processor is formed as a component of a memory and graphics processor unit which communicates with other components via a memory control unit to a plurality of buses, specifically, the system bus, a memory bus, and an I/O bus. The central processor unit communicates with the system bus and ultimately with the main memory, under control of the memory control unit. The flow of graphics information is effected via the memory bus, that is, any communications with the main memory are effected over the memory bus, and any communications between the main memory and the system bus must take place through the memory control unit. The graphics processor unit can access main memory by issuing memory requests directly to the memory control unit. Additionally, the memory and graphics control unit can interact directly with virtual memory in a manner that will be described below.
The invention further provides a method for translating virtual memory addresses to physical memory addresses. A graphics processing unit includes, among other things, an address generator which retrieves data from memory locations, and writes data to memory locations. The address generator retrieves data from memory locations by issuing a memory access request directly to a memory control unit, which retrieves the contents of the memory location. Prior to issuing the request, the address generator sends the address to a virtual translation unit, which translates the virtual address to a physical address. The virtual translation/FIFO control unit also contains three translation buffers, in which are stored the most recently accessed virtual addresses, which, in many situations, enables the virtual translation/FIFO control unit to translate the virtual address using fewer memory accesses. The invention further provides a method for determining if a main memory location written to by a system component other than the CPU is in cache. The method performs the verification without causing traffic on the system bus.
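The benefit of keeping recently used translations, as described for the virtual address translation above, can be sketched in Python. The sketch is illustrative only and rests on several assumptions not found in the text: a 4096-byte page size, a page table modeled as a dictionary, and a single small buffer of three recent translations with least-recently-used replacement (the text says only that three translation buffers hold the most recently accessed virtual addresses). The class name `VirtualTranslator` is hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumption: the text does not give a page size


class VirtualTranslator:
    """Translate virtual addresses, remembering recent translations."""

    def __init__(self, page_table, buffer_entries=3):
        self.page_table = page_table      # virtual page -> physical page
        self.recent = OrderedDict()       # recently used translations
        self.buffer_entries = buffer_entries
        self.page_table_reads = 0         # stands in for memory accesses

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.recent:
            # Translation already buffered: no page table access needed.
            self.recent.move_to_end(page)
        else:
            # Not buffered: read the page table, then remember the result.
            self.page_table_reads += 1
            self.recent[page] = self.page_table[page]
            if len(self.recent) > self.buffer_entries:
                self.recent.popitem(last=False)   # drop the oldest entry
        return self.recent[page] * PAGE_SIZE + offset
```

Two consecutive accesses to the same page cost only one page table read in this model, which is the saving in memory accesses that the translation buffers are meant to provide.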
The invention includes a CPU cache tag store, and a second cache tag store. Any entry into, or displacement from, the CPU cache tag store is also entered into, or displaced from, the second cache tag store. The second cache tag store is situated such that system components other than the CPU can access the second cache tag store without creating traffic on the system bus.
The invention further provides a command buffer, between the CPU and the graphics processor, for storing graphics commands. The command buffer is a FIFO buffer in a reserved, contiguous section of main memory. Commands from the CPU are written to the FIFO command buffer for temporary storage at a location pointed to by a tail pointer. Commands are transmitted from the FIFO command buffer for processing from a location pointed to by a head pointer. Commands are thus stored and processed in the same order in which they were received by the FIFO command buffer. The FIFO command buffer is dynamic; that is, commands can be added to the tail of the FIFO command buffer at any time, provided the space allocated to the FIFO command buffer is not full. Since the FIFO command buffer is stored in main memory, it adds little expense to the system. The CPU can transmit a graphics command over the system bus whenever the bus is available, even if the graphics processor is occupied. Graphics commands are available to the graphics processor, from the FIFO, even if the system bus is currently occupied.
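The head-pointer and tail-pointer behaviour described above may be illustrated by the following minimal Python sketch. The class name `CommandFifo`, its method names, the explicit `count` field, and the wrap-around of the pointers at the end of the reserved region are assumptions made for illustration; the specification states only that commands are written at the tail, removed at the head, and refused when the allocated space is full.

```python
class CommandFifo:
    """Illustrative model of a FIFO command buffer in a reserved memory region."""

    def __init__(self, capacity):
        self.slots = [None] * capacity  # stands in for the reserved, contiguous region
        self.head = 0   # location of the next command to be processed
        self.tail = 0   # location at which the next command will be written
        self.count = 0  # number of commands currently stored (assumed bookkeeping)

    def is_full(self):
        return self.count == len(self.slots)

    def is_empty(self):
        return self.count == 0

    def put(self, command):
        """CPU side: write a command at the tail; return False if the region is full."""
        if self.is_full():
            return False
        self.slots[self.tail] = command
        # Wrap-around is an assumption; the text does not say how the end is handled.
        self.tail = (self.tail + 1) % len(self.slots)
        self.count += 1
        return True

    def get(self):
        """Graphics side: remove the command at the head, or return None if empty."""
        if self.is_empty():
            return None
        command = self.slots[self.head]
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return command
```

Because the writer touches only the tail and the reader touches only the head, the CPU can keep adding commands while the graphics processor is occupied, which is the decoupling the paragraph above describes.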
The invention described herein further provides a method by which graphics commands can be of varying length, and yet can be transferred in transmissions of uniform size.
According to the invention, graphics commands are transmitted, in transmission units of uniform size, from a processor unit to an address generator, which processes the commands. A residue buffer is provided so that portions of data transmission units not immediately usable by the address generator can be stored in the residue buffer. The invention further provides a method for improving the efficiency of transmissions to the graphics accelerator (sometimes called a graphics processor) by monitoring the status of all the elements in the path between the CPU and the graphics accelerator, and routing the graphics command to the element closest to the graphics processor, or to the graphics processor directly.
The invention described herein provides a processor unit which issues graphics commands to an address generator, which processes the commands. Two storage buffers (a FIFO command buffer and a residue buffer) are provided to make efficient use of the buses over which the commands are transmitted. Additionally, logic is provided which routes the graphics command directly to the address generator, or to the buffer closest (in the transmission path) to the address generator, thereby minimizing the number of computer cycles necessary to get the command to the address generator.
The invention described herein further provides a graphics system in which cursor pattern information is stored in main memory. A graphics unit, capable of directly accessing main memory, has in it logic which controls the position of the cursor on a display screen. When the scanlines which contain portions of the cursor are illuminated on the screen, a cursor control unit requests the cursor pattern information from main memory, and transmits the pattern data to a digital to analog convertor, which causes the cursor to be displayed on the screen.
The invention further provides a frame buffer module on which the frame buffer memory is mounted, which can be replaced with a minimum of expense. The invention places the timing and control circuits in a module, separate from the frame buffer module, thereby making the frame buffer module inexpensive to replace. The invention still further provides a cursor bus having two configurations. In a first configuration, the cursor bus transmits cursor control information from a cursor control unit, in a graphics processor, to a video unit. The invention further provides a register in the cursor control unit which allows the user to select a second configuration, in which the cursor bus is used to transfer information between a video option and a different unit within the graphics processor. In the second configuration the bus carries signals such as stall, inhibit, and interrupt signals, rather than cursor information, and some of the signal lines carry signals from the video unit to the graphics processor unit.
Other objects, features and advantages of the invention will become apparent from a reading of the specification, when taken in conjunction with the drawings, in which like reference characters refer to like elements in the several views. It is to be understood that the drawings represent the interrelationships of the elements and do not necessarily represent the physical location of the elements.
Brief Description of the Drawings
Fig. 1 is a block diagram of the computer graphics system.
Fig. 2 is a block diagram of the computer graphics system of Fig. 1, in greater detail;
Fig. 3 is a block diagram of the graphics/memory control unit;
Fig. 4 is a block diagram of the virtual translation/FIFO control unit;
Fig. 5 is a diagram showing the translation of a virtual address to a physical address;
Fig. 6 is a block diagram of the elements of the computer graphics system that are involved in the transmission of graphics commands from the processor unit to the graphics processor unit;
Fig. 7 is a block diagram of the video/cursor control unit;
Fig. 8a is a block diagram of the video unit and the memory bus structure, the cursor bus, and the video control bus;
Fig. 8b is a block diagram of the video unit and the memory bus structure, with the video unit configured to provide output to two displays;
Fig. 9 is a block diagram of Fig. 8a, with the video unit replaced by an optional device.
Fig. 10 is a block diagram of the memory bus structure.
Detailed Description of the Invention:
Fig. 1 shows a block diagram of a computer graphics system 100, according to the invention. Processor unit 102 is interconnected, via a system bus 110, for communication with a memory control unit 220. A main memory 140 is connected, via memory bus 150, to the memory control unit 220 as well as to the frame buffer memory 164. Frame buffer memory 164 is in turn connected to a video DAC 166, which reads data, in digital form, from frame buffer memory 164, and converts the data to video signals, which are transmitted to display 162 over video signal line 168. A graphics processor unit 210 is connected to memory control unit 220 via a multiplicity of signal lines (including PADRS line 272, FPUT line 388, MA_REQ line 282, and PXDAT line 222, shown in more detail in Fig. 4, and described below). Graphics processor unit 210 and memory control unit 220 are included in a functional unit, the graphics/memory control unit 130, which may be implemented on a single computer chip. In a preferred embodiment, graphics/memory control unit 130 is implemented on a single computer chip.
The memory control unit 220 is interconnected to the I/O bus 170 and the network bus 180, each of which is bidirectional for data/address transfers through, to and from the memory control unit 220.
As will become apparent, in main part, graphics/memory control unit 130 serves to effect graphics generation by arranging the components of the system such that graphics processor 210 can directly access (that is read from and write to) main memory 140. Access requests by the graphics processor 210 are transferred directly to memory control unit 220, by signal lines 169, without having to be transmitted over system bus 110, memory bus 150, or I/O bus 170.
Graphics commands are generated in processor unit 102, for processing in graphics processor unit 210. Processor unit 102 transmits the graphics commands over system bus 110 to memory control unit 220. The graphics commands are in turn transmitted, by memory control unit 220, to graphics processor unit 210.
Processing of graphics commands by graphics processor unit 210 involves reading data from locations in main memory 140 and writing data to locations in main memory 140 or in frame buffer memory 164. To read data from main memory 140, graphics processor 210 issues a memory read request to memory control unit 220 over one of signal lines 169. Memory control unit 220 prioritizes the memory request, along with memory requests received over system bus 110, network bus 180, and I/O bus 170. Memory control unit 220 retrieves the data from the location in main memory 140, and transmits the data to graphics processor unit 210 over one of signal lines 169.
Writing data to a location in main memory 140 is effected by the same process as writing data to a location in frame buffer memory 164, the difference being the address specified in the memory write request. To write data to main memory 140 or to frame buffer memory 164, graphics processor unit 210 issues a memory write request to memory control unit 220 over one of signal lines 169. Memory control unit 220 prioritizes the memory request, along with memory requests received over system bus 110, network bus 180, and I/O bus 170. Memory control unit 220 then writes the data to the location in main memory 140 or in frame buffer memory 164, depending on the address specified in the memory write request.
If the computer system is a virtual memory system, the invention provides for first translating virtual addresses to physical addresses; this process is described in more detail below.
Thus it can be seen that the invention provides for a more efficient computer graphics system. Graphics processor 210 can issue requests to memory control unit 220 directly, over signal lines 169; processing of the memory request does not create any traffic on system bus 110; and graphics processor 210 can read data from main memory 140. In addition, writing data to main memory 140 and writing data to frame buffer memory 164 are accomplished by the same process, with the only difference being the memory address; stated differently, frame buffer 164 and main memory 140 appear to graphics processor unit 210 and memory control unit 220 as two parts of a single address space.
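The single address space noted above may be illustrated by a short Python sketch in which one write routine serves both memories and only the address selects the destination. The class name `MemoryControl` and the address ranges `MAIN_MEMORY_SIZE`, `FRAME_BUFFER_BASE`, and `FRAME_BUFFER_SIZE` are hypothetical values chosen for the example; the specification does not give an address map.

```python
# Hypothetical address map, for illustration only.
MAIN_MEMORY_SIZE = 0x1000
FRAME_BUFFER_BASE = 0x1000
FRAME_BUFFER_SIZE = 0x1000


class MemoryControl:
    """Toy model: main memory and frame buffer share one address space."""

    def __init__(self):
        self.main_memory = {}
        self.frame_buffer = {}

    def _select(self, address):
        # The address alone decides which memory services the request.
        if 0 <= address < MAIN_MEMORY_SIZE:
            return self.main_memory
        if FRAME_BUFFER_BASE <= address < FRAME_BUFFER_BASE + FRAME_BUFFER_SIZE:
            return self.frame_buffer
        raise ValueError("address outside the modelled address space")

    def write(self, address, value):
        self._select(address)[address] = value

    def read(self, address):
        return self._select(address).get(address, 0)


mcu = MemoryControl()
mcu.write(0x0010, 0xAA)   # lands in main memory
mcu.write(0x1010, 0xBB)   # same call, lands in the frame buffer
```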
Other features and advantages of the invention can be seen in a more detailed description of the computer graphics system 100 and its components, which are described below.
Computer Graphics System
Fig. 2 shows a block diagram of a computer graphics system 100 according to the invention. A computer graphics system may be connected to a computer network by network bus 180 as shown, or may be incorporated within a single user workstation or a multi-user computer. The computer graphics system 100 includes functional units, such as a processor unit 102 (depicted within a broken line block) interconnected, via a CPU bus interface 104, to a system bus, generally designated 110, for communication with a graphics/memory control unit 130. The system bus 110 is shown as a plurality of buses, designated 112, 113, and 114, which respectively transfer data information (bus 112), request information (bus 113), and address information (bus 114). A main memory 140 is connected, via memory data and address buses 152 and 154, respectively, to the graphics/memory control unit 130 as well as to the video unit 160 (shown in broken lines), which, in turn, controls the display 162. The video unit 160 includes therein a frame buffer memory 164 and a DAC (digital to analog convertor) 166.
The graphics/memory control unit 130 is interconnected to a plurality of bus structures, each of which is bidirectional for data/address transfers through, to and from the graphics/memory control unit 130. Such bus structures include the video unit 160 control bus structure, that is, video control bus 126 and the cursor bus 128; the I/O bus structure 170, that is, I/O data bus 172, I/O request bus 173, and I/O address bus 174; and a network bus structure 180, such as data bus 182, request bus 183, and address bus 184. Connected to the I/O bus structure 170 may be any type of peripheral device, such as disk storage device 191 and any other suitable I/O device 192. Also connected to I/O bus 170, for reasons which will be hereinafter discussed, is a duplicate tag store 194. Referring now to Fig. 3, graphics/memory control unit 130 consists of two major units, graphics processor unit 210, and memory control unit 220.
As shown in Fig. 3, graphics processor unit 210 consists of address generator 212, pixel shift logical unit (pixel SLU) 214, virtual translation/FIFO control unit 230, mask generator 216, graphics data buffer 218, and video/cursor control unit 240, all interconnected by the pixel data bus (PXDAT) 222 and flow control bus (FCTL) 224. Address generator 212 transmits control signals via address generator control line 226 (AGCTL) to graphics data buffer 218, mask generator 216, and pixel SLU 214, and receives acknowledgement signals from virtual translation/FIFO control unit 230 over AG_ACK line 267.
Address generator mask lines AGBMSK 276 and AGWMSK 278 transmit signals from mask generator 216 and pixel SLU 214 and memory address and control unit 236, respectively. Memory address and control unit 236 sends readback data on MAD_RB line 284, VIR_RB line 286, and FIFO_RB line 288, and address generator 212 sends readback data on AG_RB line 292. The various graphics processor unit elements are also connected by a number of signal lines, which will be explained as they become useful to the description of the operation of the graphics processor unit. Graphics processor unit 210 is connected to video control bus 126 and cursor bus 128.
Memory control unit 220 consists of flow control unit 232, memory state unit 234, memory address and control unit 236, memory data buffer 238, and address/data output multiplexer (mux) 242, all connected by memory control line (MEMCTL) 244. Pixel SLU 214 is a part of both memory control unit 220 and graphics processor unit 210. Memory address and control unit 236 sends signals to flow control unit 232 and memory state unit 234 over ACCESS_ADRS line 251 and MEMTYP line 252, respectively. The interconnections between the component units of memory control unit 220 will be described below in an illustrative example of a memory request by disk storage device 191.
Certain of the component units of memory control unit 220 communicate various types of information over the various bus structures. Flow control unit 232 receives memory requests from incoming portions 113a, 173a, and 183a of the system request bus 113 (of Fig. 2), the I/O request bus 173 (of Fig. 2), and the network request bus 183 (of Fig. 2), respectively. Flow control unit 232 acknowledges memory requests through external acknowledgment line 294, which connects to outgoing portions 113b, 173b, and 183b of the system request bus 113 (of Fig. 2), the I/O request bus 173 (of Fig. 2), and the network request bus 183 (of Fig. 2), respectively. The memory address and control unit 236 receives the address portion of a memory request over incoming portions 114a, 174a, and 184a of the system address bus 114 (of Fig. 2), the I/O address bus 174 (of Fig. 2), and the network address bus 184 (of Fig. 2), respectively. Address/data output multiplexer 242 sends data on outgoing data portions 172b and 182b of the I/O data bus 172 (of Fig. 2) and the network data bus 182 (of Fig. 2), respectively, and sends address information over outgoing address portions 174b and 184b of the I/O address bus 174 (of Fig. 2) and the network address bus 184 (of Fig. 2), respectively. Data is received by pixel SLU 214 over incoming data portions 112a, 172a, and 182a of system data bus 112 (of Fig. 2), I/O data bus 172 (of Fig. 2), and network data bus 182 (of Fig. 2), respectively. Memory address and control unit 236 sends address and control information over memory address bus 154 to main memory 140 (of Fig. 2) and frame buffer memory 164 (of Fig. 2). Memory data buffer 238 sends data to and receives data from main memory 140 (of Fig. 2) and frame buffer memory 164 (of Fig. 2) over memory data bus 152, and also sends data to CPU bus interface 104 (of Fig. 2) over outgoing data portion 112b of system data bus 112 (of Fig. 2).
The method by which graphics/memory control unit 130 accomplishes a desired result of processing a memory transaction without causing a transmission on system bus 110, and without any action by processor unit 102, is most easily understood by an example of a memory read request from a system component, such as disk storage device 191. A memory request consists of at least two parts, namely a request information part, which contains information about the requester, and an address part, which contains the memory address of the requested data. The request information part and the address part are processed separately.
The request information part is transmitted over request portion 173a, of I/O request bus 173, to flow control unit 232. Flow control unit 232 prioritizes the request and transmits request information over prioritized request identification (PREQSEL) line 254 and next memory request (NEXTMREQ) line 256 to memory state unit 234. Information transmitted includes information about the requester (in this example disk storage device 191), access type, and operand size. Memory state unit 234 transmits the request information to memory address and control unit 236 over request identification (REQSEL) line 258.
The address part, which identifies the memory location that is requested, is transmitted over address portion 174a of I/O address bus 174 directly to memory address and control unit 236. Memory address and control unit 236 sends request information on memory address bus 154.
The contents of the addressed location are returned over memory data bus 152 to memory data buffer 238 and are then sent to pixel SLU 214 over memory data (MEMDAT) line 264. Pixel SLU 214, in turn, transmits the data over pixel data bus (PXDAT) 222 to address/data output multiplexer 242, which sends the data to disk storage device 191 over outgoing I/O data bus 172b. A request for a memory read by processor unit 102 proceeds in the same manner, except that memory data buffer 238 transmits the data to processor unit 102 over outgoing system data bus 112b. Those skilled in the art will appreciate from this example that reads from memory or writes to memory can be accomplished in a like manner by other devices attached to one of the bus structures. Those familiar with the art will also note that the memory request by disk storage device 191 proceeds without any action by processor unit 102, and with no traffic on the system bus 110.
The interconnections of the elements of the graphics/memory control unit 130 also allow the processing of memory requests by graphics processor unit 210 without any action by processor unit 102 and without causing any traffic on system bus 110.
Memory requests by graphics processor unit 210 are issued by the address generator 212. Address generator 212 issues request information over address generator request (AGMREQ) line 266, to virtual translation/FIFO control unit 230. Virtual translation/FIFO control unit 230 in turn transmits the request to flow control unit 232 over MA_REQ line 282, which prioritizes the request, and transmits the request to memory address and control unit 236 over REQSEL line 258. The address part of the memory request is transmitted to virtual translation/FIFO control unit 230 over address generator address (AGADRS) line 268. The address is translated, if necessary, to a physical address by virtual translation/FIFO control unit 230 in a manner that will be described below. The address is then sent to memory address and control unit 236 over physical address (PADRS) line 272. The memory address and the request information are then sent over memory address bus 154.
If the memory access is a read from memory, the data is returned over memory data bus 152 to memory data buffer 238, which in turn transmits the data to pixel SLU 214. If the memory access is a write to memory, the data is transmitted from pixel SLU 214 to memory data buffer 238 and then to main memory (140 of Fig. 2) over memory data bus 152.
Thus, the method of processing memory requests issued by the graphics processor unit 210 and the method of processing memory requests issued by other system components both include the steps of transmitting a request information part to flow control unit 232; transmitting an address part to memory address and control unit 236; and the receiving or sending of the requested data by pixel SLU 214. It can also be noted that the memory access was executed in both cases without any action by processor unit 102, and without causing any traffic on system bus 110.
The address contained in the address part of the memory request can be any memory address that is accessible by memory address bus 154. Thus, by referring to Fig. 2, it can be seen that the graphics processor unit 210 can access both main memory 140 and frame buffer memory 164 and thus can transfer information between main memory 140 and frame buffer memory 164.
Virtual Translation
A feature of the invention is the method by which virtual addresses are translated to physical addresses.
The virtual address is translated to a physical address by virtual translation/FIFO control unit 230, which is shown in greater detail in Fig. 4. AGMREQ line 266 transmits signal packets containing eight bits from address generator 212 to virtual translation/FIFO control unit controller 274. Bit position seven indicates whether the address that is requested on AGADRS line 268 is a physical address or a virtual address. If the signal on AGMREQ line 266 indicates that the address that is requested on AGADRS line 268 is a virtual address, virtual translation/FIFO control unit controller 274 causes the address on AGADRS line 268 to enter virtual translation unit 280 (shown enclosed in broken lines in Fig. 4).
Virtual translation unit 280 has separate translation units for source, destination, and stencil operands. The components of the translation units are multiplexed through multiplexers 492, 493, 494. The virtual translation of a source operand will be described; however, it should be understood that virtual addresses for destination and stencil operands can be translated in a similar manner.
The translation of a virtual address is more easily understood by first briefly discussing virtual address translation generally. In a system with virtual memory, the amount of main memory available to a program is more than the amount of main memory (140 of Fig. 2) that is actually present in the computer system. The program operates on memory locations specified as "virtual addresses". The data identified by a virtual address may actually reside in main memory (140 of Fig. 2) or in some other system component, such as a disk storage device 191.
Virtual addresses consist of references to various tables, called "page tables", which record the physical locations of virtual addresses. The translation of virtual addresses is done by examining the various tables. Referring to Fig. 5, system page table 310 is a data base, stored in main memory, accessible by the computer operating system. The main memory address 311 of the first entry in system page table 310 is fixed by the operating system, and known to the address generator 212 of Fig. 3. Each of the entries 312 of system page table 310, referred to as "page frame numbers," consists of two portions. First portion 314 contains a "valid" bit, an access code, and a modify bit which will be discussed below. Second portion 316 contains the base address 318 of a secondary page table 320. Secondary page table 320 may be present in main memory 140 (of Fig. 2), or may be stored in some other system location, such as a disk storage device 191. If secondary page table 320 is present in main memory, the "valid" bit in first portion 314 of system page table entry 312 identifying secondary page table 320 is set to a "valid" state. If secondary page table 320 is not present in main memory 140, the "valid" bit in the identifying entry 312 is set to "invalid".
Each of the entries 322 of secondary page table 320 consists of two portions. First portion 324 contains a "valid" bit, an access code, and a modify bit, which will be discussed below. Second portion 326 contains the base address 328 of the section, or "page" of main memory (140 of Fig. 2) corresponding to the virtual address. If the "valid" bit in first portion 324 of secondary page table identifying entry 322 is set to "valid", the current physical location of the data identified by the virtual address is in main memory. If the "valid" bit in the identifying entry 322 is set to "invalid", the data identified by the virtual address may not be present in main memory 140.
A virtual address 330 consists of three sections. First section 332 is an index, identifying the address of the entry relative to the base address 311 of system page table 310. Second section 334 is an index, identifying the address of the entry relative to the entry in the base address 318 of the secondary page table 320. Third section 336 is an index which identifies the main memory location relative to the base address 328.
The translation of a virtual address consists of examining the contents of the entry obtained by indexing, from base address of system page table 310, by the index specified in virtual address first section 332, to find the base address 318 of the secondary page table 320; indexing into secondary page table 320 by the index specified in virtual address second section 334 to find the base address 328 of the page of main memory which contains the desired data; and indexing into the page of main memory by the index specified in virtual address third section 336 to find the actual physical address of the desired data. The process is made more efficient by arranging the addressing method such that virtual address third section 336 is concatenated onto the address stored in second section 326 of entry 322 of secondary page table 320 to obtain the physical address of the desired data.
An analysis of the process described above shows that a virtual address translation involves accessing two memory locations to obtain the physical address. One way of reducing the number of memory locations that must be accessed to translate a virtual address is to record, in a memory buffer, the virtual address first section 332 and second section 334 of each virtual address that is translated, and the corresponding secondary page table base address 318 and main memory base address 328 that the virtual address first section 332 and second section 334, respectively, signify. A memory buffer used for this purpose is known as a "translation buffer." If the next virtual address to be translated has the same first section 332 and second section 334 as the value stored in the translation buffer, the physical address is available without any need to access any memory locations. It can be further noted that if the next virtual address to be translated has the same first section 332, but a different second section 334, than the value stored in the translation buffer, then the base address 318 of secondary page table 320 is immediately available, without having to access system page table 310.
In the current implementation of the invention, bits <29:16> (that is, bits 29 through 16) of a virtual address correspond to virtual address first section 332; bits <15:9> correspond to virtual address second section 334; and bits <8:2> correspond to virtual address third section 336.
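The bit assignments just given, together with the two-table lookup of Fig. 5, may be sketched in Python as follows. The function names, the dictionary representation of the page tables, the `PageFault` exception, and the treatment of each stored base address as having zeros in its low nine bits (so that concatenation reduces to a bitwise OR) are assumptions made for illustration; access codes and modify bits are omitted.

```python
class PageFault(Exception):
    """Raised when an entry's "valid" bit is clear (illustrative only)."""


def split_virtual_address(va):
    """Return the three sections of a virtual address as (first, second, third)."""
    first = (va >> 16) & 0x3FFF   # bits <29:16>: index into the system page table
    second = (va >> 9) & 0x7F     # bits <15:9>: index into the secondary page table
    third = (va >> 2) & 0x7F      # bits <8:2>: index into the page of main memory
    return first, second, third


def translate(va, system_page_table, secondary_page_tables):
    """Walk both tables without a translation buffer (two table reads)."""
    first, second, third = split_virtual_address(va)

    spt_entry = system_page_table[first]                 # first memory access
    if not spt_entry["valid"]:
        raise PageFault("secondary page table not in main memory")

    secondary = secondary_page_tables[spt_entry["base"]]
    entry = secondary[second]                            # second memory access
    if not entry["valid"]:
        raise PageFault("page not in main memory")

    # Concatenate bits <8:2> onto the page base address (low bits assumed zero).
    return entry["base"] | (third << 2)


# Hypothetical table contents for the example.
system_page_table = {3: {"valid": True, "base": 0x1000}}
secondary_page_tables = {0x1000: {5: {"valid": True, "base": 0x4000}}}
va = (3 << 16) | (5 << 9) | (7 << 2)
physical = translate(va, system_page_table, secondary_page_tables)
```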
Referring again to Fig. 4, source page table page latch 342 contains the base address (corresponding to 318 of Fig. 5) of the secondary page table specified by the most recently translated virtual address of a source operand. Source secondary page table latch 344 contains the base address 328 of the page of main memory of the most recently translated virtual address of a source operand. Previous source address latch 346 contains the most recently translated source virtual address. Latches 346, 342, and 344 collectively are the source translation buffer.
The virtual address on AGADRS 268 enters virtual translation unit 280. In addition, the address generator calculates a main memory address obtained by indexing by the index specified in address bits <29:16>, and transmits the result to virtual translation unit 280 on PASPTE line 246, for reasons that will be apparent later. Bits <29:16> of the virtual address are compared, in page table page comparator 348, with bits <29:16> of the address stored in previous source address latch 346. Bits <15:09> are compared, in page frame number comparator 349, with bits <15:09> of the address stored in previous source address latch 346. If page table page comparator 348 indicates a match, and page frame number comparator 349 also indicates a match, the virtual address is translated to a physical address by concatenating bits <8:2> of the virtual address onto the value stored in source secondary page latch 344. If page table page comparator 348 indicates a "hit", but page frame number comparator 349 indicates a miss, the base address (corresponding to 318 of Fig. 5) of the secondary page table (corresponding to 320 of Fig. 5) identified by the value in source page table latch 342 is indexed by bits <15:09> of the virtual address to be translated to yield the base address (corresponding to 328 of Fig. 5) of the section, or "page", of main memory in which the desired data is located. The contents of bits <8:2> are then concatenated to the base address to yield the main memory address of the desired data. The base address (corresponding to 328 of Fig. 5) of the page of main memory containing the desired information is then stored in source secondary page latch 344. In addition, the access code is checked. If the access code indicates that the base address (corresponding to 328 of Fig. 5) of the page of main memory containing the desired information is in a section of memory to which the graphics unit 130 is not allowed access, a signal is generated by virtual translation/FIFO control unit controller 274 which prevents the memory access from occurring. If the type of memory access is a write to memory, the modify bit is also checked. If the modify bit indicates that the page of main memory containing the desired information has not previously been written to, a signal is generated which causes the operating system to change the modify bit in the entry in the secondary page table (corresponding to 320 of Fig. 5).
If page table page comparator 348 indicates a miss, virtual translation/FIFO control unit controller 274 retrieves the address transmitted on PASPTE line 246, which contains the base address of the secondary page table 320. The process then proceeds as described above for the case in which page table comparator 348 had indicated a "hit" but page frame number comparator 349 had indicated a miss.
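The three cases described above (both comparators match, only the page table page comparator matches, and neither matches) may be illustrated by the following Python sketch, which counts the page table reads each case requires. The class `TranslationBuffer`, its field names, and the modelling of memory as a dictionary keyed by `(table_base, index)` are assumptions for illustration; the fields only loosely play the roles of latches 342, 344, and 346, and the valid-bit, access-code, and modify-bit checks are omitted.

```python
class TranslationBuffer:
    """Illustrative stand-in for one translation buffer (e.g. the source buffer)."""

    def __init__(self):
        self.prev_first = None      # bits <29:16> of the previously translated address
        self.prev_second = None     # bits <15:9> of the previously translated address
        self.secondary_base = None  # cached base address of the secondary page table
        self.page_base = None       # cached base address of the page of main memory


def translate(va, tb, memory, system_table_base):
    """Return (physical_address, page_table_reads) for one virtual address."""
    first = (va >> 16) & 0x3FFF
    second = (va >> 9) & 0x7F
    third = (va >> 2) & 0x7F
    reads = 0

    if tb.prev_first != first:
        # Miss in the first comparison: read the system page table entry.
        tb.secondary_base = memory[(system_table_base, first)]
        reads += 1
        tb.prev_first = first
        tb.prev_second = None  # the cached page belongs to another secondary table

    if tb.prev_second != second:
        # Miss in the second comparison: read the secondary page table entry.
        tb.page_base = memory[(tb.secondary_base, second)]
        reads += 1
        tb.prev_second = second

    # Hit in both comparisons falls straight through with no table reads.
    return tb.page_base | (third << 2), reads


# Hypothetical table contents: (table base, index) -> base address stored in the entry.
memory = {(0, 3): 0x1000, (0x1000, 5): 0x4000, (0x1000, 6): 0x8000}
tb = TranslationBuffer()
full_miss = translate((3 << 16) | (5 << 9) | (7 << 2), tb, memory, 0)
full_hit = translate((3 << 16) | (5 << 9) | (1 << 2), tb, memory, 0)
partial_hit = translate((3 << 16) | (6 << 9), tb, memory, 0)
```

In this sketch a full miss costs two table reads, a match on the first section alone costs one, and a match on both sections costs none, mirroring the savings described for the translation buffer.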
If the "valid" bit in the address stored in source page table latch 342 or source secondary page table latch 344 is set to "invalid" then an interrupt is sent to CPU 106 and the graphics system halts. CPU 106 fetches the data into main memory, if necessary (typically from disk storage device 191), sets the "valid" bit in the appropriate entry in the system page table 310 or secondary page table 320 to "valid", and restarts the graphics system. Virtual translation unit 280 then proceeds as if page table comparator 348 had indicated a "miss".
Those familiar with the art will understand that the destination translation buffer (consisting of destination page table page latch 352, destination secondary page table page latch 354, and previous destination address latch 356) and the stencil translation buffer (consisting of stencil page table page latch 362, stencil secondary page table page latch 364, and previous stencil address latch 366) operate in a manner similar to that described above with regard to source operands.
Referring to Fig. 2, processor unit 102 contains a CPU translation unit 108 which has a translation buffer similar to the source translation buffer described above. If a virtual address is displaced from main memory 140 (typically to a disk storage device 191), then the corresponding entry in the system page table is set to "invalid", as is the corresponding entry, if present, in the translation buffer in CPU translation unit 108. The next time that virtual address is translated, the computer's operating system reads that information from the disk storage device 191 into main memory 140, and places an entry, indicating the location of the data, and that the data is valid, in the translation buffer in CPU translation unit 108. When an entry in the translation buffer in CPU translation unit 108 is marked "invalid", the invention provides two methods for maintaining consistency between the system page table page and the entries in the translation buffers in virtual translation unit 230. Referring to Fig. 4, in one method, if any entry in the system page table is marked "invalid", the address is transmitted to virtual translation/FIFO control unit controller 274, which compares the appropriate portion of the address with the address portions in page table latches 342, 344, 352, 354, 362, and 364; if a matching address is found, the corresponding entry is marked "invalid". In a second method, if any entry in the system page table is marked invalid, all entries in virtual translation unit 280 are marked invalid.
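The two consistency methods can be illustrated with a minimal Python sketch. It is not taken from the specification: the `LatchEntry` class, the function names, and the tag values are hypothetical, and each page table latch is assumed to hold a single address portion and a valid bit.

```python
from dataclasses import dataclass


@dataclass
class LatchEntry:
    """Hypothetical model of one page table latch: an address portion plus a valid bit."""
    page_tag: int
    valid: bool = True


def invalidate_matching(latches, page_tag):
    """First method: mark invalid only the latches whose address portion matches."""
    for entry in latches:
        if entry.page_tag == page_tag:
            entry.valid = False


def invalidate_all(latches):
    """Second method: mark every latch invalid, whatever its address portion."""
    for entry in latches:
        entry.valid = False


# Six entries standing in for latches 342, 344, 352, 354, 362, and 364 (tags are made up).
latches = [LatchEntry(tag) for tag in (0x10, 0x11, 0x10, 0x12, 0x13, 0x14)]
invalidate_matching(latches, 0x10)
selective = [entry.valid for entry in latches]
invalidate_all(latches)
flushed = [entry.valid for entry in latches]
```

The first method keeps unrelated translations usable at the cost of a comparison per latch; the second needs no comparison but forces every later translation to reload.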
Thus, in either method, any entry that is marked as "invalid" in the system page table will also be marked "invalid" in the virtual translation unit 280. Stated differently, virtual translation unit 280 will never have an entry marked as "valid" when the corresponding entry in the system page table is marked as "invalid". In addition, a translation of a virtual address by virtual translation unit 280 results in the same physical address as a translation of that same virtual address by the processor unit 102.
Duplicate Tag Store
Referring again to Fig. 2, another feature of the invention is the provision of duplicate tag store 194, situated relative to other system elements in a manner that enables graphics processor unit 210, as well as other system components, such as disk storage device 191, to effect a search of duplicate tag store 194 without creating traffic on system bus 110. Duplicate tag store 194 is located on the same side of the system bus as graphics/memory control unit 130, network data bus structure 180, and I/O data bus structure 170. Specifically, duplicate tag store 194 is connected to I/O data bus structure 170. The purpose of duplicate tag store 194 is to ensure coherency between the information in CPU cache RAM 117 and the corresponding address in main memory 140.
In the current implementation, processor unit 102 may have a six-way associative or a direct-mapped write-through cache tag store 118. Whenever a main memory location is read into processor cache tag store 118 or displaced from processor cache tag store 118, the same main memory location is read into, or displaced from, duplicate tag store 194. Thus the contents of duplicate tag store 194 are identical to those of processor cache tag store 118.
Memory transactions are acknowledged by memory state unit (234 of Fig. 3) over external address acknowledgement line (294 of Fig. 3), which connects to outgoing portions 113b, 173b, and 183b of the system request bus 113, the I/O request bus 173, and the network bus 183, respectively. The information indicates the type of memory transaction (such as a read or a write) and the address in main memory 140 that is involved in the transaction. If a memory transaction is a write to an address in main memory 140 by some device other than CPU 106, duplicate tag store 194 searches its contents to see if there is a match with an entry in duplicate tag store 194.
If there is no match, duplicate tag store 194 takes no further action. If there is a match, duplicate tag store 194 issues an invalidate request to processor unit 102. Cache controller 116 marks the corresponding entry in the cache tag store "invalid", so the next time that CPU 106 attempts to access that entry, cache controller 116 reads the new value from the corresponding address in main memory 140.
Thus, an invalidate request, and therefore a transaction on system bus 110, occurs only if graphics processor unit 210, or some other system component, writes to a location in main memory 140 that is resident in processor cache tag store 118, thereby further reducing traffic on the system bus 110.
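The filtering behavior of the duplicate tag store can be sketched in Python as follows. This is a simplified illustration only: the class and method names are hypothetical, tags are modeled as whole addresses, and the associativity of the real tag store is ignored.

```python
class DuplicateTagStore:
    """Hypothetical model of a tag copy that is searched without using the system bus."""

    def __init__(self):
        self.tags = set()              # mirrors the processor cache tag store
        self.invalidate_requests = []  # requests that would cross the system bus

    def cache_fill(self, address):
        """A location is read into the processor cache, so mirror it here."""
        self.tags.add(address)

    def cache_displace(self, address):
        """A location is displaced from the processor cache, so drop it here."""
        self.tags.discard(address)

    def observe_write(self, address, writer):
        """Return True if this write causes an invalidate request to the processor."""
        if writer == "cpu" or address not in self.tags:
            return False  # no match, or the CPU's own write: no bus traffic
        self.invalidate_requests.append(address)
        return True


store = DuplicateTagStore()
store.cache_fill(0x1000)
results = [
    store.observe_write(0x2000, "graphics"),  # location not cached
    store.observe_write(0x1000, "cpu"),       # written by the CPU itself
    store.observe_write(0x1000, "graphics"),  # cached location written by another device
]
```

Only the third write produces an invalidate request, which is the case in which a transaction on the system bus is unavoidable.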
FIFO Command Buffer
Another feature of the invention is the use of system components to more efficiently transmit commands from processor unit 102 to address generator 212, which processes the commands. This can be understood by referring to Fig. 6, which shows the important elements of the path by which commands are transmitted from processor unit 102 to address generator 212. As will be seen, the invention provides for efficient use of memory write buffers for the transmission of graphics commands; provides a command buffer in main memory 140, which allows processor unit 102 to transmit commands whenever the system bus is available; provides a second buffer, which allows transmissions from the command buffer to the address generator 212 to be of uniform length, even if the commands themselves are of variable length; and provides logic that sends the commands directly from the processor unit 102 to address generator 212 or to the second buffer, if the intervening elements on the transmission path are empty.
In addition to elements already identified, elements in the path by which commands are transmitted from processor unit 102 to address generator 212 include memory write buffers 371, 372, and 373, which are present in CPU bus interface 104; FIFO command buffer 134, which is present in main memory 140; residue buffer 378, which is present in pixel SLU 214; and short circuit logic 382, which is in virtual translation/FIFO control unit 230. Also in virtual translation/FIFO control unit 230 are a number of system components, shown in Fig. 4, involved in the management of FIFO command buffer 134. These components include FIFO command buffer base register latch 384, FIFO command buffer tail index latch 386, FIFO put (FPUT) line 388, FIFO command buffer head index latch 392, FIFO/clip list next address index multiplexer 394, FIFO/clip list next address index latch 396, next address multiplexer 398, next base address multiplexer 402, FIFO length lines 404, FIFO empty/full comparator 406, FIFO length threshold mask 484, and FIFO save head latch 474. These components also include clip list base address latch 472 and clip list starting index latch 474. Write range comparator 482 is used to ensure that any addresses accessed by virtual translation/FIFO control unit 230 are within allowable bounds. Other components of virtual translation/FIFO control unit 230 include address generator comparator latch 476 and address generator comparator 478, and multiplexers 486 - 490.
Referring again to Fig. 6, graphics commands are transmitted from processor unit 102 as writes to a predefined range of addresses in memory. CPU 106 writes graphics commands and other writes to memory, as well as other CPU transactions, on CPU bus 132 to CPU bus interface 104, which identifies the type of CPU transaction. If a transaction is a write to memory, CPU bus interface 104 places the contents of the transaction in one of memory write buffers 371, 372 until the buffer is filled, whereupon it writes further graphics commands and writes to memory into the other of buffers 371, 372. Memory write buffers 371 and 372 alternately send their contents, in the order in which they were filled, to memory write buffer 373, when memory write buffer 373 is empty. Memory write buffer 373 then transmits its contents on system bus structure 110 to graphics/memory control unit 130. Prior to placing the contents of the CPU transaction into the memory write buffers 371, 372, CPU bus interface 104 examines the addresses of the writes to memory to see if there are addresses in the range of addresses designated for graphics commands; if the write to memory is within the range, CPU bus interface 104 changes a bit in the memory request portion of the write to memory to indicate that the write to memory is a graphics command. The request portion of the write to memory is sent along system request bus 113, where it is received by flow control unit 232. Flow control unit 232 reads the bit that indicates that the write to memory is a graphics command and signals memory state unit 234 that the write to memory should be sent to the FIFO command buffer 134. If the write to memory is a graphics command, the CPU specified address portion of the write to memory is ignored. Instead, the address to which the command is sent is calculated in the virtual translation/FIFO control unit 230.
Referring now to Fig. 4, FIFO command buffer base register latch 384 stores the memory address of the head of the FIFO command buffer (134 of Fig. 6). FIFO command buffer tail index latch 386 stores the number of FIFO positions between the base address of the FIFO command buffer 134 and the tail of the FIFO command buffer 134 (that is, the current length of FIFO command buffer 134). The contents of the FIFO command buffer tail index latch 386 and the FIFO command buffer base register latch 384 are combined to yield the memory address to which the graphics command should be sent; this address is transmitted on FIFO put (FPUT) line 388. Since memory address and control unit 236 has been previously signaled that the next write to memory is a write to FIFO command buffer 134, memory address and control unit 236 reads the address stored on FPUT line 388, and transmits the address on memory address bus 154. When the command is transmitted, index incrementer 496 increments the value in FIFO command buffer tail index latch 386.
Referring again to Fig. 6, the data portion of the write to memory, that is, the graphics command itself, is sent along system data bus 112 to pixel SLU 214. Pixel SLU 214 sends the graphics command over memory data out (MDATO) line 408 to memory buffer 238, which transmits the data on memory data bus 152. When a command is to be fetched from FIFO command buffer 134 for processing, it is processed as a memory request. A memory request is issued to the flow control unit 232 over MA_REQ line 282 by the virtual translation/FIFO control unit controller 274. The address for the memory request is generated by the virtual translation/FIFO control unit 230. Referring now to Fig. 4, FIFO command buffer head index latch 392 stores the number of FIFO positions between the base address of the FIFO command buffer (134 of Fig. 6) and the head of the FIFO command buffer (134 of Fig. 6). The contents of the FIFO command buffer head index latch 392 are multiplexed through the FIFO/clip list next address index multiplexer 394, stored in the FIFO/clip list next address index latch 396, and transmitted to the next address multiplexer 398, where the base address is indexed by the contents of the FIFO/clip list next address latch 396 to yield the memory address of the head of the FIFO command buffer, which is transmitted on the physical address bus (PADRS) 272. Referring again to Fig. 6, the address is then sent to memory address and control unit 236, for transmission on the memory address bus 154. The command is returned from memory on memory data bus 152 to memory buffer 238, and then to the pixel SLU 214 over MEMDAT line 264. Pixel SLU 214 in turn transmits the graphics command to address generator 212 over pixel data bus (PXDAT) 222.
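The address arithmetic for putting and fetching commands can be sketched in Python. The sketch below is illustrative only: the class and its names are hypothetical, one FIFO position is assumed to be one 32 bit word, main memory is modeled as a dictionary, and any wrap-around of the indices is left out because the text does not describe it.

```python
WORD_BYTES = 4  # assumption: one FIFO position holds one 32 bit word


class FifoCommandBuffer:
    """Hypothetical model of the base register, tail index, and head index latches."""

    def __init__(self, base_address):
        self.base = base_address  # stands in for base register latch 384
        self.tail_index = 0       # stands in for tail index latch 386
        self.head_index = 0       # stands in for head index latch 392
        self.memory = {}          # stands in for main memory

    def put(self, command_word):
        """Write a command at base + tail index, then increment the tail index."""
        address = self.base + self.tail_index * WORD_BYTES
        self.memory[address] = command_word
        self.tail_index += 1
        return address

    def get(self):
        """Read the command at base + head index, then advance the head index."""
        address = self.base + self.head_index * WORD_BYTES
        command_word = self.memory.pop(address)
        self.head_index += 1
        return address, command_word

    def length(self):
        """Number of positions written but not yet fetched."""
        return self.tail_index - self.head_index


fifo = FifoCommandBuffer(base_address=0x8000)
first_put = fifo.put(0xAA)
second_put = fifo.put(0xBB)
fetched = fifo.get()
```

Note that the address supplied by the CPU plays no part in `put`; as in the text, the destination is computed entirely from the base and the tail index.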
The use of the FIFO command buffer increases the efficiency of transmitting graphics commands from the processor unit 102 to the address generator 212 in many ways. Processing the transmission of commands from processor unit 102 to address generator 212 as writes to memory allows for increases in efficiency in system operation. The use of the memory write buffers 371, 372, and 373 ensures that, when commands are transmitted, the full bandwidth of the system bus is used. Providing a FIFO command buffer 134 permits commands to be transmitted from processor unit 102 whenever the system bus is available, whether or not address generator 212 is available to process a command.
Residue Buffer
The performance of the FIFO command buffer 134 and, generally, the process by which graphics commands are sent from the processor unit 102 to the graphics processor unit 210, is made even more efficient by another feature of the invention, the residue buffer 138, in pixel SLU 214.
According to the invention, graphics commands are structured in the form of command packets of one to four 32 bit words. The first word (referred to as the "header") has two bits that designate how many 32 bit words there are in the command packet in addition to the header. For example, the length bits in a three word command packet would be set to the value two.
For the most efficient use of the memory bus, the transmissions from FIFO command buffer 134 to the graphics processor unit 210 are a uniform packet of four 32 bit words.
Thus a transmission may contain parts of more than one command.
Residue buffer 138 is a set of three 32 bit registers. Residue buffer 138 is controlled by signals transmitted from the virtual translation/FIFO control unit controller 274 over FIFO_CTL line 248. When a transmission from the FIFO command buffer 134 is received in the pixel SLU 214, the virtual translation/FIFO control unit controller 274 causes the first command packet to be forwarded to the address generator 212 over the pixel data bus 222. The virtual translation/FIFO control unit controller 274 causes the remainder of the transmission, which may contain additional command packets, or a portion of an additional command packet, or both, to be loaded into the residue buffer 138. When the address generator 212 has completed executing a command, the virtual translation/FIFO control unit controller 274 causes the contents of the residue buffer 138 to be immediately forwarded from residue buffer 138 to the address generator 212 over the pixel data bus 222.
The provision of residue buffer 138 ensures that the full bandwidth of memory data bus 152 is used, while still allowing for variable length of graphics command packets. In addition, storing the next command at a location close to address generator 212 decreases the idle time of address generator 212.
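A minimal Python sketch of how uniform four word transmissions could be split back into variable length command packets is given below. It is an illustration under stated assumptions, not the circuit of the specification: the function names are hypothetical, the two length bits are assumed to be the low two bits of the header, and all complete packets are forwarded at once rather than one per completed command.

```python
def packet_length(header):
    """Total words in a packet: the header plus the count held in its length bits.

    Assumption: the two length bits are the low two bits of the header word.
    """
    return 1 + (header & 0b11)


def split_transmissions(transmissions):
    """Turn four word transmissions into packets, carrying leftovers as residue."""
    residue = []  # never needs more than three words, matching three registers
    packets = []
    for transmission in transmissions:
        if len(transmission) != 4:
            raise ValueError("each transmission must be four 32 bit words")
        words = residue + list(transmission)
        # Forward every complete packet; whatever is left is a partial packet.
        while words and len(words) >= packet_length(words[0]):
            count = packet_length(words[0])
            packets.append(words[:count])
            words = words[count:]
        residue = words
    return packets, residue


# Made-up words: 0x12 heads a three word packet, 0x23 a four word packet,
# and 0x30 a one word packet. The second packet spans two transmissions.
packets, residue = split_transmissions([
    [0x12, 0x0A, 0x0B, 0x23],
    [0x0C, 0x0D, 0x0E, 0x30],
])
```

Because a partial packet is always shorter than four words, the leftover after each transmission fits in three words, which is consistent with a residue buffer of three registers.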
Short Circuit
Yet another feature of the invention that improves the performance of the FIFO command buffer 134 and, generally, the process by which graphics commands are sent from the processor unit 102 to the graphics processor unit 210, is the short circuit mechanism 382. The short circuit mechanism is logic in the virtual translation/FIFO control unit controller 274 that monitors the status of the FIFO command buffer 134, the residue buffer 138, and the address generator 212, and causes commands that are transmitted from processor unit 102 to address generator 212 to be transmitted in a minimum number of steps.
When the address generator 212 is processing a command, it asserts, over AG_BUSY line 287, a signal to short circuit mechanism 382. Additionally, the short circuit mechanism monitors the status of the residue buffer by monitoring commands issued by virtual translation/FIFO control unit controller 274. And finally, the short circuit mechanism 382 monitors the length of the FIFO command buffer 134 over FIFO length lines 404 (of Fig. 4). Each time a transaction involving the FIFO command buffer 134, the address generator 212, or the residue buffer 138 occurs, the short circuit mechanism calculates a logic equation. This equation determines the most effective destination for the next command transmitted from processor unit 102 that is intended for the FIFO command buffer 134, that is, where the graphics command should be sent to minimize the number of transfers from one graphics command storage or processing element to another. The logic equation is summarized in the following table:
[Table of the short circuit logic equation: images imgf000033_0001 and imgf000034_0001, not reproduced in this text.]
In a preferred embodiment, FIFO command buffer 134 is not allowed to get full. Use of the "almost full" function causes the virtual translation/FIFO control unit controller 274 to inhibit further transmissions to the FIFO command buffer 134 when the "almost full" function is activated.
The FIFO command full/empty comparator 406 (of Fig. 4) compares the value stored in FIFO command buffer head index latch 392 (of Fig. 4) with the value stored in the FIFO command buffer tail index latch 386 (of Fig. 4) to calculate the length of (that is the number of commands in) FIFO command buffer 134 and compare the length with a programmable maximum, or "almost full" value, stored in FIFO length threshold mask 484. If the "almost full" value is reached, the FIFO command full/empty comparator 406 (of Fig. 4) signals the virtual translation/FIFO control unit controller 274. Virtual translation/FIFO control unit controller 274 issues an interrupt to CPU 106, to suspend the transmitting of graphics commands. As commands are transferred from FIFO command buffer 134 to address generator 212 for processing, the length of the FIFO command buffer 134 changes, thereby changing the value stored in FIFO command buffer head index latch 392 (of Fig. 4) . When the length of the FIFO command buffer 134 has reached a programmable "almost empty" value, the FIFO command full/empty comparator 406 (of Fig. 4) signals the virtual translation/FIFO control unit controller 274. Virtual translation/FIFO control unit controller 274 issues an interrupt to CPU 106, to resume the transmission of graphics commands.
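The "almost full" and "almost empty" behavior can be sketched as a pair of thresholds on the FIFO length. The Python below is a hedged illustration: the class, its names, and the threshold values are hypothetical, and the length is assumed to be the difference between the tail and head indices.

```python
class FifoFlowControl:
    """Hypothetical model of suspend and resume interrupts driven by FIFO length."""

    def __init__(self, almost_full, almost_empty):
        self.almost_full = almost_full    # programmable maximum length
        self.almost_empty = almost_empty  # programmable resume length
        self.head_index = 0
        self.tail_index = 0
        self.cpu_sending = True
        self.interrupts = []              # interrupts issued to the CPU, in order

    def length(self):
        return self.tail_index - self.head_index

    def command_written(self):
        """A command was added at the tail; suspend the CPU if almost full."""
        self.tail_index += 1
        if self.cpu_sending and self.length() >= self.almost_full:
            self.cpu_sending = False
            self.interrupts.append("suspend")

    def command_fetched(self):
        """A command was taken from the head; resume the CPU if almost empty."""
        self.head_index += 1
        if not self.cpu_sending and self.length() <= self.almost_empty:
            self.cpu_sending = True
            self.interrupts.append("resume")


control = FifoFlowControl(almost_full=4, almost_empty=1)
for _ in range(4):
    control.command_written()
for _ in range(3):
    control.command_fetched()
```

Using two separate thresholds means the CPU is interrupted once per fill and drain cycle rather than on every command near a single limit.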
Short circuit logic 382 ensures that graphics commands are sent directly to the point in the command processing path as close to the address generator as possible, thereby eliminating memory transactions and also eliminating traffic on memory buses 152 and 154.
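Since the table of the logic equation survives only as image references, the following Python sketch should be read as one plausible reading of the surrounding prose, not as the equation itself. The function name, its arguments, and the returned labels are hypothetical; the rule assumed here is that a command may bypass an element only when every element between it and the address generator is empty, so that command order is preserved.

```python
def choose_destination(fifo_length, residue_empty, ag_busy):
    """Pick where the next command from the processor should go (assumed rule)."""
    if fifo_length > 0 or not residue_empty:
        # Earlier commands are still queued, so the new one must queue behind them.
        return "fifo_command_buffer"
    if ag_busy:
        # Nothing is queued, but the address generator is busy: hold nearby.
        return "residue_buffer"
    # Every intervening element is empty and the address generator is idle.
    return "address_generator"


decisions = [
    choose_destination(fifo_length=0, residue_empty=True, ag_busy=False),
    choose_destination(fifo_length=0, residue_empty=True, ag_busy=True),
    choose_destination(fifo_length=0, residue_empty=False, ag_busy=True),
    choose_destination(fifo_length=3, residue_empty=True, ag_busy=False),
]
```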
Cursor Control
Yet another feature of the invention is the method of storing cursor information in main memory 140, and the method of controlling cursor movement by graphics processor unit 210. Referring to Fig. 2, by storing cursor information in main memory 140, and controlling cursor movement in graphics processor unit 210, the invention simplifies the design of video unit 160, as well as minimizing the amount of frame buffer memory 164 that is required by the system. Additionally, some of the cursor control components can be adapted to serve as data paths for video units 160 that include devices other than frame buffer memory 164 and video DAC 166.
An element in the storing and display of the cursor is video/cursor control unit 240. Fig. 7 is a block diagram of video/cursor control unit 240. Internal components of video/cursor control unit 240 are video state unit 412, cursor scanline data buffer and shifter 422, cursor position controller 424, and video/cursor control unit controller 428. Video/cursor control unit 240 transmits memory address data and memory request data through video and cursor memory interface 414 over cursor memory address line 416 and cursor memory request line 418, respectively. Additionally, video/cursor control unit 240 transmits video control information and cursor information through video and cursor external interface 426 over video control bus 126 and cursor bus 128, respectively. Video/cursor control unit controller 428 receives system clock and video clock signals, and sends clock pulses to video state unit 412, cursor position controller 424, and video and cursor memory interface 414 over clock pulse lines.
Cursor information is of two forms, pattern data and screen location. Referring to Fig. 2, information regarding the cursor pattern, which extends over 64 consecutive scanlines on the display 162, is stored in a 1024 byte contiguous section of main memory 140. The address of the first byte of the 1024 byte contiguous section of main memory is stored in the video and cursor memory interface (414 of Fig. 7) .
The screen location of the cursor is defined in terms of the X and Y coordinates on the display 162 of the topmost, leftmost pixel of the cursor. The X position is stated in terms of number of pixels from the leftmost side of the display 162, and the Y position is stated as the number of a scanline, beginning at the top of the display 162. The screen location of the cursor is controlled by the computer user, and is input to the computer graphics system by a cursor position controller 109, typically a "mouse", attached to bus interface 104.
Cursor screen location input from the mouse is transmitted to CPU 106. CPU 106 transmits the cursor screen location input to graphics/memory control unit 130.
Referring now to Fig. 3, the cursor screen location input enters graphics/memory control unit 130 over incoming data portion 112a of system data bus 112, and is transmitted to pixel SLU 214. Pixel SLU 214 in turn routes the cursor screen location input to the video/cursor control unit 240 over pixel data bus 222. Referring now to Fig. 7, the cursor screen location input enters video/cursor control unit 240 over pixel data bus 222 and is routed to video state unit 412, where it is stored.
In addition to storing cursor screen location input, video state unit 412 monitors the X position (that is, the number of pixels from the leftmost side of the display (162 of Fig. 2)) and Y position (that is, the number of scanlines from the top of the display (162 of Fig. 2)) of the next pixel to be shown on display (162 of Fig. 2). When the X and Y position of the next pixel to be shown match the screen location of the cursor, video state unit 412 signals video and cursor memory interface 414. Video and cursor memory interface 414 generates a memory read request, consisting of an address portion and a request portion. The address portion of the memory read request contains the address of the first byte of the 1024 byte contiguous section of main memory in which the cursor pattern information is stored. The address portion of the memory read request is transmitted on cursor memory address line 416. The request portion of the memory read request is transmitted on cursor memory request line 418. Referring now to Fig. 3, the address portion of the memory read request, which was transmitted on cursor memory address line 416, is routed to memory address and control unit 236; the request portion of the memory read request, which was transmitted on cursor memory request line 418, is routed to flow control unit 232. The memory read request is then processed in the same manner as a memory request issued by address generator 212, which was described above, resulting in the cursor pattern data (for the next scanline to be displayed) being returned to pixel SLU 214. Pixel SLU 214 in turn transmits the cursor pattern data on pixel data bus 222. Video/cursor control unit 240 reads the cursor pattern data off pixel data bus 222.
Referring now to Fig. 7, cursor pattern data is routed on pixel data bus 222 to cursor scanline data buffer and shifter 422, which aligns the cursor according to signals generated by cursor position controller 424, for reasons that will be explained below. The cursor pattern data is then transmitted, by video and cursor external interface 426 over cursor bus 128 to video unit (160 of Fig. 2), where the cursor pattern data is processed in a manner that will be described in the discussion of the video unit. Video state unit 412, subsequent to generating the signal to cursor memory interface 414, increments a scanline counter (not shown), and continues to monitor the X position of the next pixel to be shown in the display (162 of Fig. 2). When the X position of the next pixel to be shown in the display matches the X position of the cursor, video state unit 412 examines the value in the scanline counter. If the value in the counter is less than 64 (meaning that there are more scanlines of cursor to be displayed), video state unit 412 issues a signal to cursor memory interface 414, which requests the next scanline of cursor information from main memory (140 of Fig. 2). If the value in the scanline counter is 64, then the cursor has been displayed, and the video state unit 412 takes no further action relative to cursor display, until the next time the X and Y position of the cursor matches the X and Y position of the next pixel to be shown.
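The per-scanline fetches imply a simple address calculation, sketched below in Python. The names and the base address are hypothetical, and the sketch assumes that the 64 scanlines are stored consecutively and are of equal size, which gives 1024 / 64 = 16 bytes of pattern data per scanline.

```python
CURSOR_SCANLINES = 64        # scanlines covered by the cursor pattern
CURSOR_PATTERN_BYTES = 1024  # size of the contiguous pattern section
# Assumption: equal-sized scanlines stored back to back.
BYTES_PER_SCANLINE = CURSOR_PATTERN_BYTES // CURSOR_SCANLINES


def cursor_fetch_addresses(pattern_base):
    """Address read for each cursor scanline, indexed by the scanline counter."""
    return [
        pattern_base + scanline * BYTES_PER_SCANLINE
        for scanline in range(CURSOR_SCANLINES)
    ]


addresses = cursor_fetch_addresses(pattern_base=0x4000)  # made-up base address
```

The first request uses the base address itself; the counter reaching 64 corresponds to the end of the list, after which no further request is issued for that frame.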
The alignment signals generated by cursor position controller 424 are necessary because transmissions on the cursor bus include groups of pixels. If the first pixel of the cursor pattern is in the middle of the group of pixels, it is necessary to properly align the first pixel of cursor pattern data with the group of pixels in the transmission.
Video/cursor control unit controller 428 receives control information from flow control state unit 232 over FCTL line 224, and receives timing signals from system clock line 431 and video clocks (which will be described later in the description of video unit 160), over incoming portion of video control bus 126a.
An additional function of video/cursor control unit controller 428 is to control the configuration of the other elements of video/cursor control unit 240. If it is desired to replace video unit 160 with a more complex video unit, such as a three dimensional graphics processor unit, or a more sophisticated video DAC, the invention provides for using some of the communications links and memory request capabilities of video/cursor control unit 240 for purposes other than cursor control. A signal (not shown), generated by CPU 106 and sent to video/cursor control unit controller 428, causes a change in a bit in a register in video/cursor control unit controller 428, which causes video/cursor control unit controller 428 to disable the logic in other elements of the video/cursor control unit. Instead, signals pass directly between cursor bus 126 and flow control unit (232 of Fig. 3) over V_R_REQ line 416. This will be described in more detail in the discussion below of the video unit 160 and the memory bus structure 150.
Thus it can be seen that by placing the cursor and video control functions in graphics processor unit 210, the invention allows cursor pattern information to be stored in main memory 140, and further allows video unit 160 to require less logic and less frame buffer memory than would otherwise be required.
Frame Buffer Module
Another feature of the invention is the arrangement of system elements so that the circuit board, or module, on which the frame buffer memory 164 is placed contains a minimum of circuitry and components, and is therefore less expensive. If the computer user wishes to upgrade the monitor from, for example, a low resolution monitor to a high resolution monitor, or from a monochrome monitor to a color monitor, the computer user must typically add more frame buffer memory. This generally requires replacing the frame buffer module. Since, according to the invention, the frame buffer module contains a minimum of circuitry and components, the frame buffer module is relatively inexpensive, thus minimizing the cost to the user.
Several of the elements of the frame buffer module are contained in video unit 160. Fig. 8a shows video unit 160 in greater detail, in a configuration designed to support a low resolution monitor. Frame buffer memory 164 consists of interleaved frame buffer memory banks 432a and 432b, each bank being four 128K eight bit units of standard dual ported video RAM. Frame buffer memory banks 432a and 432b are connected to video DAC 166 (for example, a model BT458 RAMDAC, available from the Brooktree Corporation of San Diego, CA, is suitable) through video multiplexer 434. Also present on video unit 160 are nibble clock 436, LUT load path multiplexer 438 (connected to video multiplexer 434 and video DAC 166), and frame buffer ROM 442. It can be noted that these elements are either available commercially (frame buffer memory banks 432a and 432b, and video DAC 166) or are relatively simple components (multiplexers 434 and 438, ROM 442, nibble clock 436, and latches 444a and 444b referenced below). There is no timing or video control logic (typically custom-designed for the computer system, and therefore relatively expensive) on video unit 160. Video unit 160 is implemented as a module, that is, the components of video unit 160 are mounted on an easily replaceable unit, such as a circuit board.
Components of video unit 160 are connected to a plurality of buses and communication lines through ports 501 - 505. Frame buffer memory banks 432a and 432b are connected to memory data bus 152 and memory address bus 154 through frame buffer latches 444a and 444b. Cursor bus 128 is connected to video DAC 166. Timing bus 126 carries a variety of timing signals for various components of video unit 160. Video synchronization (VSYNC) line 446 and video blanking (VBLNK) line 448 connect to video DAC 166, and video DAC enable (BTEN) line attaches to video DAC timing unit 452. There are two multiplexer select lines, a first line 444 which connects to and controls video multiplexer 434, and a second line which connects to and controls LUT load path multiplexer 438. Video shift line 453 connects to frame buffer memory banks 432a and 432b, and nibble clock (NIBCLK) line 454 connects to nibble clock 436.
Data is communicated between frame buffer memory banks 432a and 432b over memory data bus 152 according to signals transmitted on memory address bus 154. Frame buffer latches 444a and 444b act as temporary storage that allow memory data banks 432a and 432b, which transmit data in 64 bit units, to interface with memory data bus 152, which transmits data in 32 bit units.
Data is communicated from frame buffer memory banks 432a and 432b to video DAC 166 through video multiplexer 434. Memory banks 432a and 432b are "interleaved"; according to interleaved memory operation, video multiplexer 434 selects data alternately from memory banks 432a and 432b, and reads the data into video DAC 166. Video DAC 166 converts data to video signals for display 162. Data from memory banks 432a and 432b may be overwritten by input from cursor bus 128, which superimposes the cursor over the graphics image, or by VBLNK line 448, which causes the screen of display 162 to be blanked. If video display 162 is a color or gray scale monitor, a color look up table (LUT) is stored in either or both of frame buffer memory banks 432a and 432b. Each entry in the LUT contains a combination of the colors (typically red, blue, and green) that the display 162 can illuminate, with varying degrees of intensity, at each pixel. Each entry in frame buffer memory banks 432a and 432b contains a reference to an entry in the LUT. The LUT is loaded into video DAC 166 through video multiplexer 434 and the LUT load path multiplexer 438. The LUT load path multiplexer separates the output from the video multiplexer into two portions, a data portion and a control portion. The control portion is transmitted to the video DAC over LUT control input line 456, and the data portion is transmitted to the video DAC over LUT control data line 458. The LUT load path multiplexer also selects between LUT input from video multiplexer 434 and from diagnostic signal line 462. Transceiver 464, video analog comparator 466, and diagnostic signal bus 468 are a part of the diagnostic system.
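The combination of interleaved bank selection and look up table indexing can be sketched in Python. This is an illustration only: the function name and the sample table are hypothetical, the first pixel of each pair is assumed to come from bank 432a, and cursor overlay and blanking are left out.

```python
def scan_out(bank_a, bank_b, lut):
    """Alternate between two banks and convert each stored index through the LUT.

    Assumption: even numbered pixels are held in bank_a and odd ones in bank_b.
    """
    pixels = []
    for index_a, index_b in zip(bank_a, bank_b):
        pixels.append(lut[index_a])
        pixels.append(lut[index_b])
    return pixels


# Made-up table of (red, green, blue) intensities keyed by the stored index.
lut = {0: (0, 0, 0), 1: (255, 0, 0), 2: (0, 0, 255)}
scanline = scan_out(bank_a=[1, 0], bank_b=[2, 1], lut=lut)
```

Storing indices rather than full color values is what lets each frame buffer entry stay small while the display can still show any color held in the table.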
If display 162 is a high resolution monitor, the video unit 160 required is the same as that of Fig. 8a, except frame buffer memory banks 432a and 432b are each 8 256K four bit units of standard dual ported video RAM. Elements common to the implementation of video unit 160 necessary to support a high resolution monitor and the implementation of video unit 160 necessary to support a low resolution monitor are relatively simple, standard items. Fig. 8a shows video unit 160 in a configuration designed to support a multiple headed system, that is, a system that has two displays. Additional elements required to support the second display include additional frame buffer 164', which includes memory banks 432c and 432d, video multiplexer 434', LUT load path multiplexer 438', video analog comparator 466', and video DAC 166'. Signal lines from timing bus 126 and cursor bus 128 are split and connected to the corresponding additional elements.
Thus, upgrading to a different monitor is accomplished with little additional cost other than the cost of the monitor. In addition, implementing video unit 160 as a module, connected to buses 126, 128, 152, 154, and 468 by ports 501 - 505, enables the upgrade to be accomplished by removing module 160 from ports 501 - 505 and replacing it with another module 160' (not shown in this Figure).
Cursor Bus
Another feature of the invention is the method by which the cursor bus 128 can be adapted for uses other than transmitting cursor information. Fig. 9 shows the structure of Fig. 8, with video unit 160 replaced by a video unit 160', which has on it video device 161. Video device 161 may be a video option, such as a three dimensional video device or a graphic accelerator. Video device 161 may also be any other type of computer device which can advantageously be attached to the memory bus. Video unit 160' is connected to memory data bus 152 and memory address bus 154, thereby allowing for memory transfers between video unit 160' and main memory (140 of Fig. 2) in the same manner as described previously for transfers between frame buffer memory (164 of Fig. 2) and main memory (140 of Fig. 2). In addition, video unit 160' is connected to cursor bus 128. As will be more fully described later in the discussion of the video cursor bus, cursor bus 128 can be used to transmit signals, such as inhibit, reset, and interrupt signals.
The memory bus structure 150 is especially adapted to efficiently transfer data between frame buffer memory 164 and other system components connected to memory bus structure 150. Memory bus structure 150, consisting of memory data bus 152, memory address bus 154, video control bus 126, and cursor bus 128, is implemented as a set of communication lines from memory control unit 220 to main memory 140, frame buffer memory 164, and video DAC 166. Memory bus structure 150 is shown in Fig. 10.
Memory data bus 152 consists of three portions. First portion 152a of memory data bus 152 connects to both main memory 140 and frame buffer memory 164. First portion 152a transmits data and latch enable signals that allow the memory bus to be of a different width, in number of bits, from main memory 140 or frame buffer 164. Thus, data can be transmitted over first portion 152a of memory data bus 152 to either main memory 140 or frame buffer memory 164. Second portion 152b of memory data bus 152 consists of three communication lines that connect memory control unit 220 and frame buffer memory 164. The three communication lines of second portion 152b are output enable lines for frame buffer memory 164. Third portion 152c of memory data bus 152 consists of communication lines that connect memory control unit 220 and main memory 140.
Memory address bus 154 consists of three portions. First portion 154a of memory address bus 154 consists of communication lines that connect graphics/memory unit 130 with both main memory 140 and frame buffer memory 164, thereby enabling address data to be transmitted from memory control unit 220 to both main memory 140 and frame buffer memory 164. Second portion 154b of memory address bus 154 consists of three communication lines that terminate at memory control unit 220 and frame buffer memory 164. The three communication lines transmit timing signals, output enable signals, and special function information, respectively. Third portion 154c of memory address bus 154 connects memory control unit 220 and main memory 140.
The cursor bus 128 consists of communication lines that transmit cursor information to video DAC 166. Eight communication lines can also be used for other purposes if the video unit 160 is replaced (as shown in Fig. 8) with a video unit 160', which has on it a video device 161 such as a more complex video unit, a three dimensional graphics unit, or a more sophisticated video DAC. In this case, the eight communication lines do not carry cursor signals. Instead, two of the lines carry system clock signals to the video unit 160'; two of the lines carry signals to video unit 160' indicating the validity and length, in 32 bit words, of transmissions intended for video device 161; one of the lines transmits reset signals to video device 161; and the remaining three lines carry inhibit, interrupt, and stall signals from the video unit 160'.
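The alternate assignment of the eight lines can be summarized as a bit mask, as in the sketch below. The description gives only the count and purpose of the lines; the bit positions, the member names, and the choice of which of the two signalling lines is called `VALID` and which `LENGTH` are assumptions made for illustration.

```python
from enum import IntFlag
from typing import List


class OptionBusLine(IntFlag):
    """Eight cursor bus lines when a video unit 160' is installed.

    Bit positions are hypothetical; only the roles come from the text.
    """
    CLOCK_0 = 1 << 0    # system clock to video unit 160'
    CLOCK_1 = 1 << 1    # system clock to video unit 160'
    VALID = 1 << 2      # transmission for video device 161 is valid
    LENGTH = 1 << 3     # length of the transmission, in 32 bit words
    RESET = 1 << 4      # reset to video device 161
    INHIBIT = 1 << 5    # from video unit 160'
    INTERRUPT = 1 << 6  # from video unit 160'
    STALL = 1 << 7      # from video unit 160'


# Lines grouped by the direction in which the signal travels.
TO_VIDEO_UNIT = (OptionBusLine.CLOCK_0 | OptionBusLine.CLOCK_1 |
                 OptionBusLine.VALID | OptionBusLine.LENGTH |
                 OptionBusLine.RESET)
FROM_VIDEO_UNIT = (OptionBusLine.INHIBIT | OptionBusLine.INTERRUPT |
                   OptionBusLine.STALL)


def decode(lines: int) -> List[str]:
    """Return the names of the lines asserted in an 8-bit sample."""
    return [line.name for line in OptionBusLine if lines & line]
```

Five lines run toward the video unit and three run back from it, so the two groups together account for all eight lines without overlap.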
Referring now to Fig. 7, it was described above that if video unit 160 is replaced by an optional device (160' of Fig. 9), a signal generated by CPU (106 of Fig. 2) and sent to video/cursor control unit controller 428 changes a bit in a register in video/cursor control unit controller 428, which causes cursor scanline data buffer and shifter 422 not to perform its normal function. Instead, cursor scanline data buffer and shifter 422 passes data between cursor bus 128 and pixel data bus 222. This configuration provides a direct communication path between cursor bus 128 and cursor memory interface 414, thus enabling the video unit (160' of Fig. 9) to communicate control signals through cursor memory interface 414.
This configuration further provides a method for accomplishing memory transfers directly between main memory 140 and video unit 160' without moving the data through graphics/memory unit 130. Graphics processor unit 210 issues a memory read request in the manner described above. The read request results in the data at the requested memory address being transmitted on memory data bus 152. Signals are transmitted on the two lines of cursor bus 128 that indicate the validity and length of transmissions intended for video device 161, thereby causing one of latches 444a and 444b to read the data that is on cursor bus 128.
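The register bit that switches cursor scanline data buffer and shifter 422 between its normal function and the pass-through function can be modelled as follows. This is a simplified sketch under stated assumptions: the class and method names, the bit position `OPTION_MODE_BIT`, the eight-bit width of the value driven onto the bus, and the one-directional data flow are hypothetical, and the shift stands in for whatever the shifter's normal function is in the embodiment.

```python
class CursorControlModel:
    """Toy model of the mode bit in video/cursor control unit controller 428."""

    OPTION_MODE_BIT = 0x01  # hypothetical bit position in the register

    def __init__(self) -> None:
        self.control_register = 0  # normal cursor operation after reset

    def set_option_mode(self, enabled: bool) -> None:
        """Model the CPU write that sets or clears the mode bit."""
        if enabled:
            self.control_register |= self.OPTION_MODE_BIT
        else:
            self.control_register &= ~self.OPTION_MODE_BIT

    def drive_cursor_bus(self, cursor_scanline: int,
                         pixel_bus_data: int, shift: int) -> int:
        """Return the 8-bit value placed on the cursor bus."""
        if self.control_register & self.OPTION_MODE_BIT:
            # Pass-through: pixel data bus contents go straight to the bus.
            return pixel_bus_data & 0xFF
        # Normal function: shifted cursor scanline data goes to the bus.
        return (cursor_scanline >> shift) & 0xFF
```

With the bit clear, `drive_cursor_bus` returns shifted cursor data; after `set_option_mode(True)` the same call returns the pixel data bus value unchanged, which corresponds to the direct path that lets video unit 160' exchange control signals with the rest of the system.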
Video control bus 126 transmits video control signals to video unit 160'. Video control bus 126 consists of a plurality of communication lines. Eight of the communication lines transmit, respectively, a video blanking signal, a video synchronization signal, a video shift register enable signal, a video multiplexer select signal, an enable signal for loading the color look-up table (LUT), an LUT input multiplexer select signal, and a video nibble clock signal.
Thus, the invention provides a method by which the cursor bus 128 can be used for purposes other than communicating cursor information. This enables the system designer to replace the video module (160 of Fig. 2) with a video unit 160', without requiring the expensive and complex task of redesigning the memory bus structure 150. The graphics system can therefore be easily and inexpensively upgraded from a low resolution monitor, to a higher resolution monitor, to a more complex video option, or to some other optional device. In addition, the invention allows for the transfer of data directly from main memory 140 to video unit 160', without the data passing through graphics/memory unit 130.
Although the foregoing has described preferred embodiments of the invention, it will be appreciated by those skilled in the art that changes in the embodiment may be made without departing from the principles of the invention, the scope of which is defined in the appended claims.

Claims

We claim:
1. A computer graphics system, comprising: a memory bus; a main memory, connected to said memory bus; a frame buffer memory; a CPU, to generate graphics commands; a system bus, coupled to said CPU; an I/O bus; and a graphics processor, to process said graphics commands, said graphics processor capable of issuing memory requests to said main memory; and a memory control unit, connected to said graphics processor, said system bus, and said memory bus.
2. A computer graphics system as claimed in claim 1, further comprising a contiguous section of said main memory, for storing said graphics commands.
3. A computer graphics system as claimed in claim 2, wherein said CPU is capable of issuing memory access requests, said memory access requests containing virtual addresses, said computer graphics system further comprising means, coupled to said address generator, for translating said virtual addresses to physical addresses in main memory, and wherein said graphics processor comprises an address generator, capable of generating memory access requests, said memory access requests including virtual addresses; and a second means, coupled to said CPU and said memory bus, for translating said virtual addresses to physical addresses.
4. The computer graphics system as claimed in claim 3, further comprising: means for storing cursor pattern information in said main memory; a cursor control unit, coupled to said memory control unit, for retrieving said cursor pattern information from said main memory.
5. The computer graphics system as claimed in claim 3, further comprising: a video unit; a cursor control unit; a bus, coupled to said video unit; switch means, in said cursor control unit, having at least two positions; means, responsive to said switch means being in a first position, for transmitting cursor control information from said cursor control unit to said video unit; means, responsive to said switch means being in a second position, for coupling said cursor control unit and said bus; means, responsive to said switch means being in said second position, for transmitting a control signal between said video unit and said memory control unit.
6. The computer graphics system as claimed in claim 2, further comprising: means, coupled to said CPU, for transmitting data units from said CPU, said units containing a plurality of words, said plurality of words containing at least one said graphic command; and means, coupled to said transmitting means and to said graphics processor, for receiving said data units, for storing at least one of said words, and for transmitting remaining said words to said graphics processor, said remaining said words being a graphics command.
7. The computer graphics system as claimed in claim 2, further comprising: means, for transmitting data units from said CPU to said graphics processor; means for determining if said graphics processor is currently processing a graphics command; means, responsive to a determination by said determining means that said graphics processor is not currently processing a graphics command, for transmitting said graphics command to said graphics processor, bypassing said section of main memory; means, responsive to a determination by said determining means that said graphics processor is currently processing a graphics command, for storing said graphics command to said section of main memory.
8. A computer graphics system as claimed in claim 1, wherein said CPU is capable of issuing memory access requests, said memory access requests containing virtual addresses, said computer graphics system further comprising means, coupled to said address generator, for translating said virtual addresses to physical addresses in main memory, and wherein said graphics processor comprises an address generator, capable of generating memory access requests, said memory access requests including virtual addresses; and a second means, coupled to said CPU and said memory bus, for translating said virtual addresses to physical addresses.
9. A computer graphics system as claimed in claim 8, further comprising a contiguous section of said main memory, for storing said graphics commands.
10. The computer graphics system as claimed in claim 9, further comprising: means for storing cursor pattern information in said main memory;
PCT/US1992/007055 1991-08-21 1992-08-21 Computer graphics system WO1993004462A1 (en)

Applications Claiming Priority (18)

Application Number Priority Date Filing Date Title
US748,357 1985-06-24
US748,361 1985-06-24
US74835891A 1991-08-21 1991-08-21
US74836291A 1991-08-21 1991-08-21
US74835991A 1991-08-21 1991-08-21
US74836391A 1991-08-21 1991-08-21
US74836191A 1991-08-21 1991-08-21
US74835691A 1991-08-21 1991-08-21
US74835591A 1991-08-21 1991-08-21
US748,363 1991-08-21
US748,358 1991-08-21
US748,362 1991-08-21
US07/748,357 US5313577A (en) 1991-08-21 1991-08-21 Translation of virtual addresses in a computer graphics system
US07/748,360 US5321806A (en) 1991-08-21 1991-08-21 Method and apparatus for transmitting graphics command in a computer graphics system
US748,356 1991-08-21
US748,359 1991-08-21
US748,355 1991-08-21
US748,360 1991-08-21

Publications (1)

Publication Number Publication Date
WO1993004462A1 true WO1993004462A1 (en) 1993-03-04

Family

ID=27578910

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1992/007055 WO1993004462A1 (en) 1991-08-21 1992-08-21 Computer graphics system

Country Status (2)

Country Link
AU (1) AU2553592A (en)
WO (1) WO1993004462A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4811281A (en) * 1986-02-20 1989-03-07 Mitsubishi Denki Kabushiki Kaisha Work station dealing with image data
EP0425185A2 (en) * 1989-10-23 1991-05-02 International Business Machines Corporation Memory management for hierarchical graphic structures

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0712073A2 (en) 1994-11-10 1996-05-15 Brooktree Corporation System and method for processing commands without polling of the hardware by the software
EP0712073A3 (en) * 1994-11-10 1998-09-23 Brooktree Corporation System and method for processing commands without polling of the hardware by the software
WO1996041324A1 (en) * 1995-06-07 1996-12-19 International Business Machines Corporation Computer system with optimized display control
US6266753B1 (en) 1997-07-10 2001-07-24 Cirrus Logic, Inc. Memory manager for multi-media apparatus and method therefor
EP0902355A2 (en) * 1997-09-09 1999-03-17 Compaq Computer Corporation System and method for invalidating and updating individual gart (graphic address remapping table) entries for accelerated graphics port transaction requests
EP0902355A3 (en) * 1997-09-09 2000-01-12 Compaq Computer Corporation System and method for invalidating and updating individual gart (graphic address remapping table) entries for accelerated graphics port transaction requests
WO2000000887A1 (en) * 1998-06-30 2000-01-06 Intergraph Corporation Method and apparatus for transporting information to a graphic accelerator card
WO2012109619A1 (en) * 2011-02-10 2012-08-16 Qualcomm Incorporated Data storage address assignment for graphics processing
CN103370728A (en) * 2011-02-10 2013-10-23 高通股份有限公司 Data storage address assignment for graphics processing
US9047686B2 (en) 2011-02-10 2015-06-02 Qualcomm Incorporated Data storage address assignment for graphics processing
KR101563070B1 (en) * 2011-02-10 2015-10-23 퀄컴 인코포레이티드 Data storage address assignment for graphics processing

Also Published As

Publication number Publication date
AU2553592A (en) 1993-03-16

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase