WO1996039667A1 - Write cache for write performance improvement - Google Patents

Info

Publication number
WO1996039667A1
Authority
WO
WIPO (PCT)
Prior art keywords
write
cache
address
write cache
data
Application number
PCT/US1996/008595
Other languages
French (fr)
Inventor
Uwe Kranich
Original Assignee
Advanced Micro Devices, Inc.
Application filed by Advanced Micro Devices, Inc.
Priority to DE69618783T (patent DE69618783T2)
Priority to EP96917987A (patent EP0835490B1)
Publication of WO1996039667A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0844 Multiple simultaneous or quasi-simultaneous cache accessing
    • G06F12/0846 Cache with multiple tag or data arrays being simultaneously accessible
    • G06F12/0848 Partitioned cache, e.g. separate instruction and operand caches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0804 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0877 Cache access modes
    • G06F12/0879 Burst mode

Definitions

  • the invention relates to improving the efficiency of data transfer in a data processing system.
  • a system according to the invention provides a write cache to support burstwrite capability.
  • Conventional processing systems include a central processing unit (processor), a main memory, and in some systems, a cache memory between the processor and the main memory.
  • High-speed, small-capacity cache memories hold portions of the information from the main memory that are used frequently by the processor in order to expedite memory fetch, thereby leaving more time for the processor to perform other functions.
  • the time required to supply a processor with required information may be decreased by lowering the time lapse between the memory's receipt of address information from the processor and the transmission of the required information back to the processor. This time lapse is determined by the "speed" of the memory. Since the cost of memory is directly related to its speed, it is often not practical to use the fastest memory available, especially in processing systems which require large quantities of memory.
  • Using a relatively small bank of relatively high-speed memory, called cache memory, as a buffer for a larger bank of slower system memory improves the average information-request to information-supply speed.
  • the processor initially requests information it needs from the cache memory. If the information is stored in the cache memory, the request is said to be a "cache hit" and the information is provided to the processor from the cache memory at the faster rate. If the required information is not stored in the cache memory, the information request is said to be a "cache miss" and the information is retrieved from the system memory at the slower transfer rate.
  • a copy of the information can be stored in the cache memory in anticipation of subsequent requests for the same information by the processor.
  • a plurality of n registers can be written as a burst in a burstwrite process.
  • the information is stored internally without writing to an external bus.
  • access to the cache is necessary because the information is not in main memory.
  • Cache access can be accomplished using a writeback or inquiry cycle in which multiple writes are sent in a single burstwrite.
  • Certain processing architectures, like the X86 architecture, do not support instructions to generate burstwrite accesses to an external bus that connects the processor to another memory, such as main memory. Instead, all writes initiated by software are simple writes. Burstwrites occur only during a cache replacement or a writeback of a modified line within the cache during snooping. These burstwrite cases are controlled by the cache itself and are not under software control. Thus, the application software does not initiate a burstwrite for a specific sequence of memory locations. For certain applications, this inability to perform burstwrites is inefficient because it restricts the possible transfer rate between the processor and memory.
  • One example is a graphics transfer, where very often sequential memory locations will be written into memory, with each memory location conforming to a pixel and adjacent memory locations corresponding to data concerning adjacent pixels for a graphics image. Repeatedly executing a simple write for each pixel is a highly inefficient use of system resources.
  • system which includes a processor, a first level (level 1) cache operatively connected to the processor and a main memory, and a write cache operatively connected to the processor, the level 1 cache, and the main memory.
  • the write cache detects a write request from the processor and receives a command from the level 1 cache indicating whether the level 1 cache will service the write request. If the command received from the level 1 cache indicates that the level 1 cache will not service the write request, the write cache compares a memory address on an internal address bus corresponding to the write request to a prespecified range of addresses within the write cache. If the memory address is within the prespecified range, the write cache stores data from an internal data bus and the memory address on the internal address bus into the write cache.
  • the write cache detects a burstwrite request from the processor.
  • the write cache outputs information corresponding to a storage location within the write cache that matches a tag address corresponding to the burstwrite request.
  • the tag address is read by the write cache from the internal data bus, and the write cache outputs the information onto an external bus for output to the main memory.
  • a memory system includes a processor, a level 1 cache operatively connected to the processor and a main memory, and a write cache operatively connected to the processor, the level 1 cache and the main memory.
  • the write cache includes a write cache controller operatively connected to the processor and the level 1 cache.
  • the write cache controller detects write requests from the processor. Upon receipt of such a write request, the write cache controller receives a first command from the level 1 cache signifying whether the level 1 cache will process the write request. The write cache controller sends a second command to the level 1 cache signifying whether the write cache will process the write request.
  • the write cache also includes a write cache storage area operatively connected to the write cache controller and connected to the main memory over an external address and data bus.
  • the write cache storage area includes a plurality of storage locations for storing information.
  • the write cache also has an address comparator operatively connected to the processor and the write cache controller.
  • the address comparator compares an address on an internal address bus to a prespecified address range within the address comparator.
  • the address comparator notifies the write cache controller when the address read from the address bus is within the prespecified address range.
  • the write cache also includes a copyback logic circuit operatively connected to the processor, the write cache controller, and the write cache storage area.
  • the copyback logic circuit detects a burstwrite request from the processor. Upon receipt of such a burstwrite request, the copyback logic circuit notifies the write cache controller of the burstwrite request.
  • the copyback logic circuit stores an address corresponding to the burstwrite request from an internal data bus, and the copyback logic circuit sends the address corresponding to the burstwrite request to the write cache storage area.
  • the write cache storage area has a status field for each of the plurality of storage locations within the write cache storage area.
  • the status field signifies whether or not information held in a storage location has been sent to the main memory in response to a burstwrite request from the processor.
  • the write cache controller checks the status fields to determine if any of the storage locations can store information corresponding to the write request from the processor.
  • a memory system as described above further includes a write cache available notification means in the write cache controller for determining if the write cache can service the write request from the processor.
  • Figure 1 is a block diagram of the elements of a write cache architecture according to the invention
  • Figure 2 is a functional flow diagram of the write sequence from a processor in a cache system with a level 1 cache and a write cache, according to the invention.
  • Figure 3 is a flow diagram of the sequence involved in a burstwrite of a line to an external bus according to the invention in order to get data out from the cache.
  • a write cache 5 includes write cache storage area 10, copyback logic circuit 20, address comparator 30, and write cache controller 40.
  • a level 1 (L1) cache 50 is connected to the write cache 5 over internal buses 60, which include an internal address bus 61 and an internal data bus 62. Both the level 1 cache 50 and the write cache 5 are connected to a processor (not shown) over the internal buses 60.
  • Cache storage can be organized as lines, each line having a plurality of words.
  • An address of a first word in a line can represent the address of the line.
  • One way that burstwrites can be facilitated is by indicating the address of a line of words to be the subject of a burstwrite.
  • burstwrites are controlled by the cache itself and are not under software control. Thus, the application software does not initiate a burstwrite for a specific sequence of memory locations.
  • the write cache 5 according to the invention is preferably enabled only for a specific memory region.
  • processor software can be written to take advantage of the write cache 5 in desired applications, such as a graphics implementation or disk accelerator.
  • specific software routines which can be changed to take advantage of the write cache features, are implemented without having an impact on the other processor software.
  • restriction to a predetermined region of memory is not a limitation of the invention.
  • the write cache controller 40 controls all of the functions of the write cache 5 and interfaces with the level 1 cache 50.
  • the write cache storage area 10 stores data, a tag address for the data, and a status indication for the data.
  • the write cache storage area 10 is under control of the write cache controller 40.
  • the address comparator 30 compares an address received on the internal address bus 61 with a predetermined address range stored within the address comparator 30. If the address received on the internal address bus 61 falls within the predetermined address range, the address comparator 30 sends a signal to the write cache controller 40 indicating this "address range hit".
  • the copyback logic circuit 20 generates a request to the write cache controller 40 to copy a line corresponding to a specific address from the write cache storage area 10 to the external buses (not shown) upon detecting a burstwrite command from the processor. As discussed further herein, the physical address of the line to be copied back is written into the copyback logic circuit 20. The logic itself is selected by a specific address. During a write on the external buses, the copyback logic circuit 20 detects a write to a location and latches the physical address of the line to be copied back from the internal data bus 62. The copyback logic circuit 20 then indicates to the write cache controller 40 that a copyback needs to be initiated.
  • the write cache 5 only works on writes by the processor. When a read occurs, no operations are performed by the write cache 5.
  • the write cache controller 40 sees this write access and starts a cache look up to determine if the address corresponding to the write access is currently in the write cache storage area 10.
  • the address comparator 30 reads the address corresponding to the write access from the internal address bus 61 and determines if that address falls within a prespecified range of addresses for which the write cache 5 is programmed.
  • the level 1 cache 50 is also looking to see if it can service that write access.
  • a processor initiates a write request.
  • the write cache controller 40 begins a cache look up and an address range comparison.
  • level 1 cache 50 also begins a cache look up.
  • the level 1 cache 50 has a cache hit, as in step 202. If this is the case, the write request access address is resident in the level 1 cache 50 and the level 1 cache 50 services the request (step 204). No further operation is required of the write cache controller 40, and the sequence ends, as in step 206.
  • When the level 1 cache 50 has a cache hit, the level 1 cache 50 notifies the write cache 5 of this occurrence by sending an L1_Hit signal to the write cache controller 40, as can be seen from Figure 1. Upon receipt of the L1_Hit signal, the write cache 5 knows that it is not to perform any further operations with respect to the write request. Another possibility is that the level 1 cache 50 has a cache miss (i.e., no L1_Hit signal received by the write cache controller 40) and the requested write address, as determined by the address comparator 30, is not within the selected address region of the write cache 5, as shown in step 208 of Figure 2. In this case, no further operation is required of the write cache controller 40, and the sequence ends, as in step 210. If the requested write address is within the range of addresses in the write cache 5, step 212 is performed. In step 212, the write request is tested to determine if any of the tag addresses in the write cache 5 matches the request. This leads to third and fourth possible outcomes.
  • the third possible outcome is that the level 1 cache 50 has a cache miss (step 202), the requested write address, as determined by the address comparator 30, is within the selected address region of the write cache 5 (step 208), and the result of step 212 is that the write access hits the write cache 5.
  • the write access is stored in the appropriate storage area corresponding to the selected line in the write cache storage area 10, as in step 213. No external simple write is generated in this case.
  • the fourth possibility is that the level 1 cache 50 has a cache miss, the write access, as determined by the address comparator 30, is within the selected address region of the write cache 5, but that the outcome of step 212 is such that the write access does not hit the write cache 5.
  • the write cache is examined to determine if at least one storage area in the write cache storage area 10 is available to store the write access. Two outcomes are now possible.
  • In step 216, the write cache controller 40 notifies the level 1 cache 50 that it stored the data corresponding to the write access. In this case, no external simple write, such as to the main memory, is generated.
  • If the level 1 cache 50 has a cache miss (step 202), and the write access, as determined by the address comparator 30, is within the selected address region of the write cache 5 (step 208), and the write access does not hit the write cache 5 (step 212), and the outcome of step 214 indicates that there are no available storage locations in the write cache storage area 10, the write cache 5 does not store the write access in its memory (step 220).
  • the write cache 5 notifies the level 1 cache 50 that it did not service the write access, as in step 222, for example, by sending a no hit signal to the L1 cache.
  • a plurality of bits, e.g., four bits per word, can be used to define which byte was written or modified. This is useful to avoid sending invalid data, as discussed further herein.
  • the processor software initiates a single write to a copyback address register (not shown) in copyback logic circuit 20, as given in step 301 and places the physical address of the line to be copied on the internal bus.
  • the copyback logic circuit 20 itself is selected by a specific address. From the internal data bus 62, as given in step 302, the copyback address register 20 reads and latches the physical address of the write location (line) to be copied back.
  • the copyback logic circuit 20 notifies the write cache controller 40 of the burstwrite request from the processor, as given in step 303.
  • the write cache controller 40 responds to that burstwrite request by placing the selected line of the write cache 5 corresponding to the requested copyback access onto the external buses for a fast burstwrite, as given in step 304.
  • the storage locations in the write cache storage area 10 that were dumped onto the external buses are freed up for the storage of new write requests from the processor, as given in step 306.
  • the write cache controller 40 keeps track of the available storage locations within the write cache 5, as given in step 307.
  • the L1 cache 50 performs the functions of a conventional cache memory and is not affected at all by the operation of the write cache 5.
  • the results of the address comparisons activate the functionality of the write cache 5 for a particular memory region. Address comparisons are performed in a processor or logic circuits within address comparator 30 implementing this function.
  • the write cache controller 40, which can be implemented in logic circuitry or in a processor, controls buffering and informs the L1 cache 50 that it need not respond to a particular write request by sending a signal over the WRC_Hit line connecting the L1 cache 50 to the write cache controller 40.
  • the copyback logic circuit 20, which can also be implemented in logic circuitry or in a processor, is used to execute a burstwrite.
  • the write cache 5 according to the invention is particularly useful in applications where streams of sequential data are written to memory. For example, a display driver can be written to draw four adjacent horizontal pixels. After four writes are performed into the write cache 5 without going to the external buses, the driver determines that it has completed its writes to the write cache 5 (assuming the write cache 5 is not full).
  • a copyback is initiated through the copyback logic circuit 20.
  • a complete address corresponding to a tag address is written into the copyback logic circuit 20.
  • the tag address is the address of the data in the write cache 5 which is to be put on the external buses to accomplish the drawing.
  • the copyback logic circuit 20 retrieves from the internal bus 60 the address of the data to be sent to the bus interface unit 80 connected to the external buses, to thereby be sent to a particular address in the main memory (not shown) at a later time. Since the data is sequential, a single burstwrite cycle can transfer the information. For example, if line 1000 contains 16 bytes, addresses 1000 through 1015 are accessed as a burst.
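
The write sequence of Figure 2, as summarized in the excerpts above, can be sketched as a small decision function. This is only an illustrative Python model and not the patented circuit: the function name `handle_write`, its boolean parameters, and the returned outcome strings are assumptions introduced here, while the step numbers in the comments follow the text.

```python
def handle_write(l1_hit, in_region, tag_match, slot_available):
    """Return a label for what happens to one processor write.

    All four arguments are booleans standing in for hardware signals:
    l1_hit         -- the level 1 cache reports a hit (L1_Hit)
    in_region      -- the address comparator reports an address range hit
    tag_match      -- a tag in the write cache matches the write address
    slot_available -- at least one write cache storage location is free
    """
    if l1_hit:
        # Step 202 -> 204: the level 1 cache services the request.
        return "l1_services"
    if not in_region:
        # Step 208 -> 210: address outside the write cache region.
        return "write_cache_idle"
    if tag_match:
        # Step 212 -> 213: store into the matching line, no external write.
        return "stored_in_matching_line"
    if slot_available:
        # Step 214 -> 216: store into a free location and notify the L1 cache.
        return "stored_in_free_location"
    # Steps 220 and 222: not stored, L1 cache told the write was not serviced.
    return "not_serviced"


outcomes = [
    handle_write(True, True, True, True),
    handle_write(False, False, False, True),
    handle_write(False, True, True, False),
    handle_write(False, True, False, True),
    handle_write(False, True, False, False),
]
```

Ordering the checks this way mirrors the text: the L1 hit is tested first, so the write cache only acts on writes the level 1 cache does not service.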

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A memory system has a level 1 cache and a write cache connected to a processor, wherein the write cache has a memory address range and wherein the processor initiates a write to the write cache which is detected by the write cache. The write cache responds to the write request by storing information into the write cache if the write cache is not already full. If there is no storage location available in the write cache, a message is sent to the level 1 cache notifying that cache of this condition. The write cache responds to requests from the processor to write information stored in particular areas of the write cache into a main memory by placing that information on an external bus to be read by the main memory. The write cache then frees up those storage locations within the write cache to be used for storing subsequent writes requested by the processor.

Description

WRITE CACHE FOR WRITE PERFORMANCE IMPROVEMENT
Background of the Invention
1. Field of The Invention
The invention relates to improving the efficiency of data transfer in a data processing system. In particular, a system according to the invention provides a write cache to support burstwrite capability.
2. Description of the Related Art
Conventional processing systems include a central processing unit (processor), a main memory, and in some systems, a cache memory between the processor and the main memory. High-speed, small-capacity cache memories hold portions of the information from the main memory that are used frequently by the processor in order to expedite memory fetch, thereby leaving more time for the processor to perform other functions.
The time required to supply a processor with required information may be decreased by lowering the time lapse between the memory's receipt of address information from the processor and the transmission of the required information back to the processor. This time lapse is determined by the "speed" of the memory. Since the cost of memory is directly related to its speed, it is often not practical to use the fastest memory available, especially in processing systems which require large quantities of memory.
Using a relatively small bank of relatively high-speed memory, called cache memory, as a buffer for a larger bank of slower system memory improves the average information-request to information-supply speed. Specifically, in a system having a cache memory, the processor initially requests information it needs from the cache memory. If the information is stored in the cache memory, the request is said to be a "cache hit" and the information is provided to the processor from the cache memory at the faster rate. If the required information is not stored in the cache memory, the information request is said to be a "cache miss" and the information is retrieved from the system memory at the slower transfer rate. When the information is supplied to the processor from the system memory, a copy of the information can be stored in the cache memory in anticipation of subsequent requests for the same information by the processor.
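
The hit and miss behavior described above can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not part of the patent: the class name `SimpleCache`, the dictionary standing in for system memory, and the oldest-first eviction policy are all introduced here for illustration.

```python
from collections import OrderedDict


class SimpleCache:
    """Toy read cache in front of a slower backing store (illustrative only)."""

    def __init__(self, backing, capacity):
        self.backing = backing          # dict standing in for system memory
        self.capacity = capacity        # number of entries the cache can hold
        self.entries = OrderedDict()    # address -> value, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        if addr in self.entries:
            # Cache hit: the information comes from the faster cache memory.
            self.hits += 1
            return self.entries[addr]
        # Cache miss: the information comes from the slower system memory.
        self.misses += 1
        value = self.backing[addr]
        if len(self.entries) >= self.capacity:
            # Assumed policy: evict the oldest entry to make room.
            self.entries.popitem(last=False)
        # Keep a copy in anticipation of later requests for the same address.
        self.entries[addr] = value
        return value


memory = {0x10: "a", 0x14: "b", 0x18: "c"}
cache = SimpleCache(memory, capacity=2)
first = cache.read(0x10)    # miss, filled from memory
second = cache.read(0x10)   # hit, served from the cache
```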
In a processor architecture such as the 29K architecture, a plurality of n registers can be written as a burst in a burstwrite process. The information is stored internally without writing to an external bus. When it is necessary to transmit the information over a bus such as a graphics bus, access to the cache is necessary because the information is not in main memory. Cache access can be accomplished using a writeback or inquiry cycle in which multiple writes are sent in a single burstwrite.
Certain processing architectures, like the X86 architecture, do not support instructions to generate burstwrite accesses to an external bus that connects the processor to another memory, such as main memory. Instead, all writes initiated by software are simple writes. Burstwrites occur only during a cache replacement or a writeback of a modified line within the cache during snooping. These burstwrite cases are controlled by the cache itself and are not under software control. Thus, the application software does not initiate a burstwrite for a specific sequence of memory locations. For certain applications, this inability to perform burstwrites is inefficient because it restricts the possible transfer rate between the processor and memory. One example is a graphics transfer, where very often sequential memory locations will be written into memory, with each memory location conforming to a pixel and adjacent memory locations corresponding to data concerning adjacent pixels for a graphics image. Repeatedly executing a simple write for each pixel is a highly inefficient use of system resources.
In a system according to the invention, as discussed further herein, rather than performing a simple write for each pixel, sequential pixels could be written into the cache using a burstwrite feature. The implementation of such a burstwrite feature in an X86 architecture would most likely improve bandwidth by at least 50%. With this bandwidth improvement, the required bus bandwidth for these types of operations would be reduced, resulting in a corresponding increase in performance by the processing system.
Summary of the Invention
In view of the limitations of the related art, as discussed above, it is an object of the invention to provide a mechanism to support a burstwrite, even if the software architecture, such as the X86 architecture, does not support a burstwrite feature.
The above and other objects of the invention are achieved by a system according to the invention which includes a processor, a first level (level 1) cache operatively connected to the processor and a main memory, and a write cache operatively connected to the processor, the level 1 cache, and the main memory. The write cache detects a write request from the processor and receives a command from the level 1 cache indicating whether the level 1 cache will service the write request. If the command received from the level 1 cache indicates that the level 1 cache will not service the write request, the write cache compares a memory address on an internal address bus corresponding to the write request to a prespecified range of addresses within the write cache. If the memory address is within the prespecified range, the write cache stores data from an internal data bus and the memory address on the internal address bus into the write cache.
In another aspect according to the invention, in a processing system as above, the write cache detects a burstwrite request from the processor. The write cache outputs information corresponding to a storage location within the write cache that matches a tag address corresponding to the burstwrite request. The tag address is read by the write cache from the internal data bus, and the write cache outputs the information onto an external bus for output to the main memory.
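
A minimal sketch of this burstwrite path is given below, assuming a storage area keyed by tag address. The class `WriteCacheStorage`, its methods, and the list used as an external bus are hypothetical names chosen for illustration; emptying the matching location after the burst follows the abstract, which states that the locations are freed for subsequent writes.

```python
class WriteCacheStorage:
    """Toy model of a write cache storage area keyed by tag address."""

    def __init__(self):
        self.lines = {}                       # tag address -> list of data words

    def store(self, tag, words):
        self.lines[tag] = list(words)

    def burstwrite(self, tag_from_data_bus, external_bus):
        """Emit the matching line as one burst; return True if a line matched."""
        # Removing the entry models freeing the location after the burst.
        words = self.lines.pop(tag_from_data_bus, None)
        if words is None:
            return False                      # no storage location matches the tag
        # One appended item stands for one burst transfer on the external bus.
        external_bus.append((tag_from_data_bus, words))
        return True


storage = WriteCacheStorage()
storage.store(0x1000, [0x11, 0x22, 0x33, 0x44])
bus = []                                      # stands in for the external bus
matched = storage.burstwrite(0x1000, bus)     # tag present: line is emitted
missed = storage.burstwrite(0x2000, bus)      # tag absent: nothing is emitted
```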
Further according to the invention, a memory system includes a processor, a level 1 cache operatively connected to the processor and a main memory, and a write cache operatively connected to the processor, the level 1 cache and the main memory. The write cache includes a write cache controller operatively connected to the processor and the level 1 cache. The write cache controller detects write requests from the processor. Upon receipt of such a write request, the write cache controller receives a first command from the level 1 cache signifying whether the level 1 cache will process the write request. The write cache controller sends a second command to the level 1 cache signifying whether the write cache will process the write request. The write cache also includes a write cache storage area operatively connected to the write cache controller and connected to the main memory over an external address and data bus. The write cache storage area includes a plurality of storage locations for storing information. The write cache also has an address comparator operatively connected to the processor and the write cache controller. The address comparator compares an address on an internal address bus to a prespecified address range within the address comparator. The address comparator notifies the write cache controller when the address read from the address bus is within the prespecified address range. The write cache also includes a copyback logic circuit operatively connected to the processor, the write cache controller, and the write cache storage area. The copyback logic circuit detects a burstwrite request from the processor. Upon receipt of such a burstwrite request, the copyback logic circuit notifies the write cache controller of the burstwrite request. The copyback logic circuit stores an address corresponding to the burstwrite request from an internal data bus, and the copyback logic circuit sends the address corresponding to the burstwrite request to the write cache storage area.
In another aspect according to the invention, the write cache storage area has a status field for each of the plurality of storage locations within the write cache storage area. The status field signifies whether or not information held in a storage location has been sent to the main memory in response to a burstwrite request from the processor. The write cache controller checks the status fields to determine if any of the storage locations can store information corresponding to the write request from the processor.
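
The status check can be pictured with the following Python sketch. The two-state `Status` enumeration and the helper `find_available` are assumptions made for illustration; the text only says that the status field signifies whether the held information has been sent to main memory.

```python
from enum import Enum


class Status(Enum):
    FREE = 0      # contents already sent to main memory, location reusable
    PENDING = 1   # holds information not yet sent to main memory


def find_available(status_fields):
    """Return the index of the first reusable storage location, or None."""
    for index, status in enumerate(status_fields):
        if status is Status.FREE:
            return index
    return None                    # every location still holds unsent data


fields = [Status.PENDING, Status.FREE, Status.PENDING]
free_index = find_available(fields)
full_index = find_available([Status.PENDING, Status.PENDING])
```

A controller modeled this way would store a new write at `free_index` when it is not `None`, and otherwise report that it cannot service the write.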
In another aspect of the invention a memory system as described above further includes a write cache available notification means in the write cache controller for determining if the write cache can service the write request from the processor.
The other purposes, characteristics and efficiencies of a system according to the invention will be clear from the following detailed description.
Brief Description of the Drawings
The invention, together with further objects and advantages thereof, may best be understood by making reference to the following description taken in conjunction with the accompanying drawings, wherein:
Figure 1 is a block diagram of the elements of a write cache architecture according to the invention;
Figure 2 is a functional flow diagram of the write sequence from a processor in a cache system with a level 1 cache and a write cache, according to the invention; and
Figure 3 is a flow diagram of the sequence involved in a burstwrite of a line to an external bus according to the invention in order to get data out from the cache.
Detailed Description of the Preferred Embodiments
Referring to Figure 1, a write cache 5 includes write cache storage area 10, copyback logic circuit 20, address comparator 30, and write cache controller 40. A level 1 (L1) cache 50 is connected to the write cache 5 over internal buses 60, which include an internal address bus 61 and an internal data bus 62. Both the level 1 cache 50 and the write cache 5 are connected to a processor (not shown) over the internal buses 60. Cache storage can be organized as lines, each line having a plurality of words. An address of a first word in a line can represent the address of the line. One way that burstwrites can be facilitated is by indicating the address of a line of words to be the subject of a burstwrite. As previously discussed, in X86 architectures, burstwrites are controlled by the cache itself and are not under software control. Thus, the application software does not initiate a burstwrite for a specific sequence of memory locations. In order to provide compatibility with software for existing X86 architectures, the write cache 5 according to the invention is preferably enabled only for a specific memory region. In this way, processor software can be written to take advantage of the write cache 5 in desired applications, such as a graphics implementation or disk accelerator. In those cases, specific software routines, which can be changed to take advantage of the write cache features, are implemented without having an impact on the other processor software. However, it will be understood by those of ordinary skill that restriction to a predetermined region of memory is a limitation of the invention.
Referring again to Figure 1, the write cache controller 40 controls all of the functions of the write cache 5 and interfaces with the level 1 cache 50. The write cache storage area 10 stores data, a tag address for the data, and a status indication for the data. The write cache storage area 10 is under control of the write cache controller 40.
The address comparator 30 compares an address received on the internal address bus 61 with a predetermined address range stored within the address comparator 30. If the address received on the internal address bus 61 falls within the predetermined address range, the address comparator 30 sends a signal to the write cache controller 40 indicating this "address range hit".
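The range check performed by the address comparator 30 can be illustrated with a minimal software sketch. The class name, field names, and the example range below are hypothetical and chosen only for illustration, and inclusive bounds are an assumption; the actual comparator is a hardware circuit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressComparator:
    """Holds the predetermined address range for which the write cache is enabled."""
    range_start: int  # hypothetical lower bound (inclusive, by assumption)
    range_end: int    # hypothetical upper bound (inclusive, by assumption)

    def address_range_hit(self, address: int) -> bool:
        # Report an "address range hit" when the address falls within the range.
        return self.range_start <= address <= self.range_end


# Example range chosen arbitrarily for illustration.
comparator = AddressComparator(range_start=0xA0000, range_end=0xBFFFF)
print(comparator.address_range_hit(0xA1000))  # address inside the assumed range
print(comparator.address_range_hit(0x00400))  # address outside the assumed range
```

In this sketch the boolean return value stands in for the signal sent to the write cache controller 40.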
The copyback logic circuit 20 generates a request to the write cache controller 40 to copy a line corresponding to a specific address from the write cache storage area 10 to the external buses (not shown) upon detecting a burstwrite command from the processor. As discussed further herein, the physical address of the line to be copied back is written into the copyback logic circuit 20. The logic itself is selected by a specific address. During a write on the external buses, the copyback logic circuit 20 detects a write to a location and latches the physical address of the line to be copied back from the internal data bus 62. The copyback logic circuit 20 then indicates to the write cache controller 40 that a copyback needs to be initiated.
The write cache 5 only works on writes by the processor. When a read occurs, no operations are performed by the write cache 5. According to the invention, when a processor, such as a core CPU, initiates a write, the write cache controller 40 sees this write access and starts a cache look up to determine if the address corresponding to the write access is currently in the write cache storage area 10. Concurrently and in parallel, the address comparator 30 reads the address corresponding to the write access from the internal address bus 61 and determines if that address falls within a prespecified range of addresses for which the write cache 5 is programmed. At the same time that write cache 5 is acting on the write access request from the processor, the level 1 cache 50 is also looking to see if it can service that write access.
Referring now to the functional flow diagram in Figure 2, in step 200 a processor initiates a write request. As noted above, in response to this write request, the write cache controller 40 begins a cache look up and an address range comparison. In parallel, level 1 cache 50 also begins a cache look up. Several outcomes are possible. One possibility is that the level 1 cache 50 has a cache hit, as in step 202. If this is the case, the write request access address is resident in the level 1 cache 50 and the level 1 cache 50 services the request (step 204). No further operation is required of the write cache controller 40, and the sequence ends, as in step 206. When the level 1 cache 50 has a cache hit, the level 1 cache 50 notifies the write cache 5 of this occurrence by sending an L1_Hit signal to the write cache controller 40, as can be seen from Figure 1. Upon receipt of the L1_Hit signal, the write cache 5 knows that it is not to perform any further operations with respect to the write request. Another possibility is that the level 1 cache 50 has a cache miss (i.e., no L1_Hit signal received by the write cache controller 40) and the requested write address, as determined by the address comparator 30, is not within the selected address region of the write cache 5, as shown in step 208 of Figure 2. In this case, no further operation is required of the write cache controller 40, and the sequence ends, as in step 210. If the requested write address is within the range of addresses in the write cache 5, step 212 is performed. In step 212, the write request is tested to determine if any of the tag addresses in the write cache 5 matches the request. This leads to third and fourth possible outcomes.
The third possible outcome is that the level 1 cache 50 has a cache miss (step 202), the requested write address, as determined by the address comparator 30, is within the selected address region of the write cache 5 (step 208), and the result of step 212 is that the write access hits the write cache 5. In this case, the write access is stored in the appropriate storage area corresponding to the selected line in the write cache storage area 10, as in step 213. No external simple write is generated in this case.
The fourth possibility is that the level 1 cache 50 has a cache miss, the write access, as determined by the address comparator 30, is within the selected address region of the write cache 5, but that the outcome of step 212 is such that the write access does not hit the write cache 5. In this case, in step 214 the write cache is examined to determine if at least one storage area in the write cache storage area 10 is available to store the write access. Two outcomes are now possible.
If a write cache storage location is available, the write data and write address are stored in the write cache 5 in an available storage area of the write cache storage area 10, as in step 216. In step 218, the write cache controller 40 notifies the level 1 cache 50 that it stored the data corresponding to the write access. In this case, no external simple write, such as to the main memory, is generated.
Alternatively, if the level 1 cache 50 has a cache miss (step 202), and the write access, as determined by the address comparator 30, is within the selected address region of the write cache 5 (step 208), and the write access does not hit the write cache 5 (step 212), and the outcome of step 214 indicates that there are no available storage locations in the write cache storage area 10, the write cache 5 does not store the write access in its memory (step 220).
The write cache 5 notifies the level 1 cache 50 that it did not service the write access, as in step 222, for example, by sending a no hit signal to the L1 cache.
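The outcomes of the write sequence of Figure 2 described above can be summarized in a short sketch. This is only an illustrative software model under stated assumptions: the function name, the `Outcome` enumeration, and the `capacity` parameter are hypothetical, and the checks that the hardware performs in parallel are serialized here for readability.

```python
from enum import Enum


class Outcome(Enum):
    L1_SERVICED = "level 1 cache hit; L1 services the write (step 204)"
    OUT_OF_RANGE = "address outside write cache region (step 210)"
    WRC_HIT = "stored in matching write cache line (step 213)"
    WRC_ALLOCATED = "stored in a free write cache location (steps 216-218)"
    WRC_FULL = "no free location; write cache does not store (steps 220-222)"


def handle_write(l1_hit, in_range, tags, address_tag, capacity):
    """Model the decision order of Figure 2 for one write request.

    tags is the set of tag addresses currently held by the write cache;
    capacity is the (hypothetical) number of storage locations.
    """
    if l1_hit:                      # step 202: level 1 cache hit
        return Outcome.L1_SERVICED
    if not in_range:                # step 208: address range comparison
        return Outcome.OUT_OF_RANGE
    if address_tag in tags:         # step 212: write cache tag look up
        return Outcome.WRC_HIT
    if len(tags) < capacity:        # step 214: is a storage location free?
        tags.add(address_tag)       # step 216: store address and data
        return Outcome.WRC_ALLOCATED
    return Outcome.WRC_FULL         # step 220: write cache does not store


tags = set()
print(handle_write(False, True, tags, 0x100, capacity=1).name)
print(handle_write(False, True, tags, 0x100, capacity=1).name)
print(handle_write(False, True, tags, 0x200, capacity=1).name)
```

With a single storage location, the three calls exercise the allocate, hit, and full outcomes in turn.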
In those of the above steps in which information is stored in the write cache, it should be noted that in addition to the data conventionally written in a cache, a plurality of bits, e.g., four bits per word, can be used to define which byte was written or modified. This is useful to avoid sending invalid data, as discussed further herein.
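As a hedged illustration of the per-word byte indication, the sketch below assumes a 32-bit word with one mask bit per byte and little-endian byte numbering; neither the word width nor the bit ordering is specified in the text, and the function name is hypothetical.

```python
def merge_write(stored_word, stored_mask, new_word, byte_enables):
    """Merge a partial write into a 32-bit word and update its 4-bit byte mask.

    Bit i of a mask marks byte i of the word as written (assumed ordering).
    """
    for byte in range(4):
        if byte_enables & (1 << byte):
            shift = 8 * byte
            # Clear the target byte, then copy it in from the new word.
            stored_word &= ~(0xFF << shift) & 0xFFFFFFFF
            stored_word |= new_word & (0xFF << shift)
    return stored_word, stored_mask | byte_enables


word, mask = merge_write(0x00000000, 0b0000, 0x000000AB, 0b0001)
word, mask = merge_write(word, mask, 0xCD000000, 0b1000)
print(hex(word), bin(mask))  # prints: 0xcd0000ab 0b1001
```

Here bytes 1 and 2 were never written, so their mask bits remain clear and a later copyback could skip them rather than send invalid data.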
After the writes have been placed into the write cache 5, software executing on the processor may want to initiate a burstwrite of data beginning at a specific address to the external bus. Referring now to Figure 3, the processor software initiates a single write to a copyback address register (not shown) in copyback logic circuit 20, as given in step 301, and places the physical address of the line to be copied on the internal bus. As previously discussed, the copyback logic circuit 20 itself is selected by a specific address. From the internal data bus 62, as given in step 302, the copyback address register in the copyback logic circuit 20 reads and latches the physical address of the write location (line) to be copied back.
Next, the copyback logic circuit 20 notifies the write cache controller 40 of the burstwrite request from the processor, as given in step 303. The write cache controller 40 responds to that burstwrite request by placing the selected line of the write cache 5 corresponding to the requested copyback access onto the external buses for a fast burstwrite, as given in step 304. After the information has been dumped onto the external buses for transfer to main memory, as given in step 305, the storage locations in the write cache storage area 10 that were dumped onto the external buses are freed up for the storage of new write requests from the processor, as given in step 306. The write cache controller 40 keeps track of the available storage locations within the write cache 5, as given in step 307.
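The copyback sequence of Figure 3 can be modeled in a few lines. This is a toy sketch, not the circuit itself: the class and method names are hypothetical, a dictionary stands in for the write cache storage area 10, and the returned list stands in for the data placed on the external buses.

```python
class WriteCache:
    """Toy model of the copyback sequence in Figure 3 (names are illustrative)."""

    def __init__(self, num_locations):
        self.lines = {}                  # tag address -> list of data bytes
        self.num_locations = num_locations
        self.copyback_address = None     # models the copyback address register

    def store(self, tag, data):
        if tag not in self.lines and len(self.lines) >= self.num_locations:
            return False                 # no free storage location
        self.lines[tag] = list(data)
        return True

    def write_copyback_register(self, line_address):
        # Steps 301-302: software writes the line address; the register latches it.
        self.copyback_address = line_address
        # Step 303: the copyback logic notifies the controller of the request.
        return self._burstwrite()

    def _burstwrite(self):
        # Steps 304-305: the selected line is placed on the external buses.
        # Step 306: removing the line frees its location for new writes.
        return self.lines.pop(self.copyback_address, None)

    def free_locations(self):
        # Step 307: the controller keeps track of available locations.
        return self.num_locations - len(self.lines)


cache = WriteCache(num_locations=2)
cache.store(1000, [1, 2, 3, 4])
burst = cache.write_copyback_register(1000)
print(burst, cache.free_locations())
```

After the copyback, both storage locations are free again, which mirrors steps 306 and 307.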
According to the invention, the L1 cache 50 performs the functions of a conventional cache memory and is not affected at all by the operation of the write cache 5. During write requests, the results of the address comparisons activate the functionality of the write cache 5 for a particular memory region. Address comparisons are performed in a processor or logic circuits within address comparator 30 implementing this function. Within the write cache module 5, the write cache controller 40, which can be implemented in logic circuitry or in a processor, controls buffering and informs the L1 cache 50 that it need not respond to a particular write request by sending a signal over the WRC_Hit line connecting the L1 cache 50 to the write cache controller 40. For example, the write cache controller 40 can store information in a frame buffer memory for a video graphics accelerator, where one address of the frame buffer is followed by a plurality of data words or bytes to be sequentially transmitted in a burst in order to produce a display. The copyback logic circuit 20, which can also be implemented in logic circuitry or in a processor, is used to execute a burstwrite. The write cache 5 according to the invention is particularly useful in applications where streams of sequential data are written to memory. For example, a display driver can be written to draw four adjacent horizontal pixels. After four writes are performed into the write cache 5 without going to the external buses, the driver determines that it has completed its writes to the write cache 5 (assuming the write cache 5 is not full). In order to provide a burstwrite of the data to the external buses in order to produce the display, a copyback is initiated through the copyback logic circuit 20. A complete address corresponding to a tag address is written into the copyback logic circuit 20. The tag address is the address of the data in the write cache 5 which is to be put on the external buses to accomplish the drawing. Thus, the copyback logic circuit 20 retrieves from the internal bus 60 the address of the data to be sent to the bus interface unit 80 connected to the external buses, to thereby be sent to a particular address in the main memory (not shown) at a later time. Since the data is sequential, a single burstwrite cycle can transfer the information. For example, if line 1000 contains 16 bytes, addresses 1000 through 1015 are accessed as a burst. Thus, where all the pixels in one line are stored in sequential locations 1000 through 1015, it is only necessary to write address 1000 to the copyback logic circuit 20 to place all the data in locations 1000 through 1015 on the external bus to produce the display.
Such a burstwrite, which can be achieved according to the invention, is far more efficient than a series of conventional individual writes. Moreover, according to the invention, all writes can be stored in cache by using sequential writes in software, thus reducing the load on the cache.
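The address arithmetic of the 16-byte line example can be checked with a small sketch. The function name and the default line size parameter are hypothetical; the line address 1000 and the 16-byte line size come from the example above.

```python
def burst_addresses(line_address, line_size_bytes=16):
    """Return the byte addresses covered by one burstwrite of a cache line."""
    return list(range(line_address, line_address + line_size_bytes))


addrs = burst_addresses(1000)
print(addrs[0], addrs[-1], len(addrs))  # prints: 1000 1015 16
```

Writing the single line address 1000 to the copyback logic therefore covers all sixteen byte locations in one burst.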
It is to be understood that the detailed drawings and specific examples given describe preferred embodiments of the invention and are for the purpose of illustration, that the apparatus and method of the invention is not limited to the precise details and conditions disclosed, and that various changes may be made therein without departing from the spirit of the invention which is defined by the following claims.

Claims

What Is Claimed Is:
1. A memory system, comprising: a processor for outputting a write request signal to write data to a particular memory address; a first cache operatively connected to said processor, said first cache configured to detect said write request signal and to output a first cache hit signal indicative of whether said particular address is within said first cache, said first cache comprising a first plurality of storage locations; a write cache operatively connected to said processor and said first cache, said write cache configured to detect said write request signal and to receive said first cache hit signal, said write cache comprising a second plurality of storage locations, wherein when said first cache hit signal indicates that said first cache will not write said data corresponding to said write request signal into said first cache, said write cache compares said particular memory address to a predetermined address range stored within said write cache, and when said particular memory address is within said predetermined address range, said write cache writes said data corresponding to said write request signal into one of said second plurality of storage locations in said write cache.
2. A memory system as recited in Claim 1, wherein when said write cache writes said data corresponding to said write request signal into said one of said second plurality of storage locations in said write cache, said write cache sends a write cache hit signal to said first cache.
3. A memory system as recited in Claim 1, wherein said write cache comprises: a write cache storage area including said second plurality of storage locations; a copyback logic circuit operatively connected to said processor and configured to detect a burstwrite command sent from said processor and configured to store an address corresponding to said burstwrite command, said copyback logic circuit outputs a burstwrite signal and said address corresponding to said burstwrite command upon detection of said burstwrite command; and a write cache controller operatively connected to said processor, said copyback logic circuit and said write cache storage area, said write cache controller configured to receive said burstwrite signal and said address corresponding to said burstwrite command from said copyback logic circuit and configured to send a write control signal to said write cache storage area to write data stored in a particular one of said second plurality of storage locations in said write cache storage area that corresponds to said address received from said copyback logic circuit, wherein said write cache storage area outputs data stored in said particular one of said second plurality of storage locations onto an external data bus.
4. A memory system as recited in Claim 3, wherein said copyback logic circuit receives said address corresponding to said burstwrite command on an internal data bus connecting said write cache to said processor, and said copyback logic circuit receives said burstwrite command on a control bus connecting said copyback logic circuit to said processor.
5. A memory system as recited in Claim 4, wherein said copyback logic circuit receives said address corresponding to said burstwrite command concurrently with receiving said burstwrite command.
6. A memory system as recited in claim 3, further comprising: a bus interface unit connected to said first cache and said write cache; and a main memory connected to said bus interface unit through said external data bus and said external address bus, wherein said data resident in said particular one of said second plurality of storage locations in said write cache storage area is sent to said bus interface unit to be subsequently sent to said main memory from said bus interface unit at a predetermined later time.
7. A memory system as recited in claim 3, wherein each of said second plurality of storage locations includes a data available bit, and when said data stored in said particular one of said second plurality of storage locations is written to said external data bus, said write cache controller sets a data available bit corresponding to said particular one of said second plurality of storage locations in said write cache storage area to a data available state, and wherein data stored in said write cache storage area that have not been written to said external data bus have corresponding data available bits set to a data unavailable state, and wherein said data available state indicates that said corresponding storage location can accept new data, and said data unavailable state indicates that said corresponding storage location cannot accept new data.
8. A memory system, comprising: a processor for outputting a write command and information corresponding to said write command, said information including data and a memory address representing a storage location for writing said data into, said processor also outputting a burstwrite command and information corresponding to said burstwrite command; a first cache operatively connected to said processor, said first cache receiving said write command and outputting a first cache hit signal indicating whether said memory address corresponding to said write command resides in said first cache; a write cache operatively connected to said processor and said first cache, said write cache receiving said write command and outputting a write cache hit signal indicating whether said memory address corresponding to said write command resides in said second cache, said write cache comprising: a write cache controller operatively connected to said processor and said first cache, said write cache controller configured to receive said first cache hit signal from said first cache and said write signal from said processor, said write cache controller controlling reading and writing operations for said write cache; a write cache storage area operatively connected to said write cache controller, said write cache storage area comprising a plurality of information storage areas; an address comparator operatively connected to said processor and said write cache controller, said address comparator configured to compare an address received on an internal address bus to a prespecified address range, wherein said address comparator sends an address range hit signal to said write cache controller when said address received on said internal address bus is within said prespecified address range; and a copyback logic circuit operatively connected to said processor, said write cache controller and said write cache storage area, said copyback logic circuit configured to detect said burstwrite command and configured to send a burstwrite signal and a memory address corresponding to said burstwrite signal to said write cache controller upon detection of said burstwrite command, wherein said write cache controller sends a cache write command to said write cache storage area to write information stored in one of said storage locations corresponding to said memory address of said burstwrite signal when said first cache hit signal indicates that said first cache does not contain said memory address corresponding to said burstwrite signal.
9. A memory system as recited in claim 8, wherein said write cache storage area includes a status field for each of said plurality of storage locations of said write cache storage area, said status field indicating whether or not information held in a corresponding one of said storage locations has been written out of said write cache storage area, and wherein said write cache controller monitors each of said status fields in said write cache storage area to determine which, if any, of said storage locations is available to store said information corresponding to said write command sent from said processor.
10. A memory system as recited in Claim 8, wherein said copyback logic circuit receives said address corresponding to said burstwrite command on an internal data bus connecting said write cache to said processor.
11. A memory system as recited in Claim 10, wherein said copyback logic circuit receives said burstwrite command on a control bus connecting said copyback logic circuit to said processor concurrently with receiving said burstwrite command.
12. A memory system as recited in claim 10, wherein said write cache storage area further includes a tag address field and a data field for each of said storage locations.
13. A memory system as recited in claim 10, further comprising: a bus interface unit connected to said first cache and said write cache; and a main memory connected to said bus interface unit through said external data bus and said external address bus, wherein said data in said storage location in said write cache storage area is written to said bus interface unit to be subsequently sent to said main memory at a predetermined time.
14. A method for improving memory write performance in a processing system having a write cache which includes a plurality of storage locations and a first cache, the method comprising the steps of:
a) receiving a data write command which includes data and a memory address to write said data to; and
b) determining if said memory address is within a prespecified address range, and if said memory address is within said prespecified address range:
i) determining if any of said storage locations are available for storing said memory address and said data corresponding to said memory address, and if there is at least one available storage location:
A) storing said data corresponding to said memory address in said at least one available storage location;
B) updating an availability status for each of said storage locations, and
C) writing said memory address and said data corresponding to said memory address on a respective external address bus and an external data bus upon receiving a burstwrite command which indicates that said memory address is to be written out.
15. A method as recited in Claim 14, wherein said memory address and said data corresponding to said memory address are written out to a main memory connected to said external address bus and said external data bus.
16. A method as recited in Claim 14, wherein the step b) is only performed when said memory address corresponding to said data write request does not reside in said first cache.
17. A method as recited in Claim 14, wherein in the step b), said memory address is compared with a maximum address value and a minimum address value, and when said memory address is both greater than said minimum address value and less than said maximum address value, said memory address is determined to be within said predetermined range, otherwise said memory address is determined to be outside said predetermined range.
18. A method as recited in Claim 16, further comprising the step of:
D) sending a write cache hit signal to said first cache when said write cache writes said data corresponding to said memory address onto said external data bus.
PCT/US1996/008595 1995-06-05 1996-06-04 Write cache for write performance improvement WO1996039667A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
DE69618783T DE69618783T2 (en) 1995-06-05 1996-06-04 WRITE Cache to improve writing performance
EP96917987A EP0835490B1 (en) 1995-06-05 1996-06-04 Write cache for write performance improvement

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/464,352 US5761709A (en) 1995-06-05 1995-06-05 Write cache for servicing write requests within a predetermined address range
US08/464,352 1995-06-05

Publications (1)

Publication Number Publication Date
WO1996039667A1 true WO1996039667A1 (en) 1996-12-12

Family

ID=23843598

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1996/008595 WO1996039667A1 (en) 1995-06-05 1996-06-04 Write cache for write performance improvement

Country Status (4)

Country Link
US (1) US5761709A (en)
EP (1) EP0835490B1 (en)
DE (1) DE69618783T2 (en)
WO (1) WO1996039667A1 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6137763A (en) * 1998-09-24 2000-10-24 Zen Research N.V. Method and apparatus for buffering data in a multi-beam optical disk reader
US6437789B1 (en) * 1999-02-19 2002-08-20 Evans & Sutherland Computer Corporation Multi-level cache controller
US20020029354A1 (en) * 2000-08-23 2002-03-07 Seagate Technology Llc Non-volatile write cache, in a disc drive, using an alternate power source
JP2003281084A (en) * 2002-03-19 2003-10-03 Fujitsu Ltd Microprocessor for efficiently accessing external bus
US7536529B1 (en) 2005-06-10 2009-05-19 American Megatrends, Inc. Method, system, apparatus, and computer-readable medium for provisioning space in a data storage system
US7689766B1 (en) * 2005-06-10 2010-03-30 American Megatrends, Inc. Method, system, apparatus, and computer-readable medium for integrating a caching module into a storage system architecture
CN101617354A (en) 2006-12-12 2009-12-30 埃文斯和萨瑟兰计算机公司 Be used for calibrating the system and method for the rgb light of single modulator projector
US8352716B1 (en) 2008-01-16 2013-01-08 American Megatrends, Inc. Boot caching for boot acceleration within data storage systems
US8799429B1 (en) 2008-05-06 2014-08-05 American Megatrends, Inc. Boot acceleration by consolidating client-specific boot data in a data storage system
US8358317B2 (en) 2008-05-23 2013-01-22 Evans & Sutherland Computer Corporation System and method for displaying a planar image on a curved surface
US8702248B1 (en) 2008-06-11 2014-04-22 Evans & Sutherland Computer Corporation Projection method for reducing interpixel gaps on a viewing surface
US8077378B1 (en) 2008-11-12 2011-12-13 Evans & Sutherland Computer Corporation Calibration system and method for light modulation device
US9641826B1 (en) 2011-10-06 2017-05-02 Evans & Sutherland Computer Corporation System and method for displaying distant 3-D stereo on a dome surface

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0470735A1 (en) * 1990-08-06 1992-02-12 NCR International, Inc. Computer memory system
EP0602807A2 (en) * 1992-12-18 1994-06-22 Advanced Micro Devices, Inc. Cache memory systems
EP0602808A2 (en) * 1992-12-18 1994-06-22 Advanced Micro Devices, Inc. Cache systems

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5025366A (en) * 1988-01-20 1991-06-18 Advanced Micro Devices, Inc. Organization of an integrated cache unit for flexible usage in cache system design
US5197144A (en) * 1990-02-26 1993-03-23 Motorola, Inc. Data processor for reloading deferred pushes in a copy-back data cache
US5325499A (en) * 1990-09-28 1994-06-28 Tandon Corporation Computer system including a write protection circuit for preventing illegal write operations and a write poster with improved memory
US5359723A (en) * 1991-12-16 1994-10-25 Intel Corporation Cache memory hierarchy having a large write through first level that allocates for CPU read misses only and a small write back second level that allocates for CPU write misses only

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NASS R: "DYNAMIC MEMORY ARCHITECTURE INCREASES PERFORMANCE OF 486-BASED MOTHERBOARDS", ELECTRONIC DESIGN, vol. 41, no. 15, 22 July 1993 (1993-07-22), pages 36, 38, XP000387988 *

Also Published As

Publication number Publication date
DE69618783D1 (en) 2002-03-14
US5761709A (en) 1998-06-02
EP0835490A1 (en) 1998-04-15
DE69618783T2 (en) 2002-08-29
EP0835490B1 (en) 2002-01-23


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1996917987

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1996917987

Country of ref document: EP

WWG Wipo information: grant in national office

Ref document number: 1996917987

Country of ref document: EP