GB2288256A - Bus configuration for memory systems - Google Patents

Bus configuration for memory systems

Info

Publication number
GB2288256A
GB2288256A GB9504794A
Authority
GB
United Kingdom
Prior art keywords
memory
transaction
slave
master
controller
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB9504794A
Other versions
GB9504794D0 (en)
Inventor
Thomas R Hotchkiss
James B Williams
John L Wood
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HP Inc
Original Assignee
Hewlett Packard Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hewlett Packard Co
Publication of GB9504794D0
Publication of GB2288256A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00: Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14: Handling requests for interconnection or transfer
    • G06F13/16: Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605: Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/1642: Handling requests for interconnection or transfer for access to memory bus based on arbitration with request queuing
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System (AREA)
  • Bus Control (AREA)
  • Multi Processors (AREA)

Description

BUS CONFIGURATION FOR MEMORY SYSTEMS

This invention relates to a bus configuration for computer memory systems.
The characteristics of a computer's memory system determine to a large extent the computer's achievable speed and power. A high-speed processor is, of course, also essential, but even an extremely fast processor (or even groups of processors) will appear slow if it must often waste time waiting for a slow memory to carry out read and write instructions on data the processor needs.
One reason for this difference in speed is that it takes time for memory devices to latch an address and for the individual memory elements to transfer to an output line signals corresponding to their respective logical states. Consequently, much research effort is put into developing faster semiconductor technologies.
Another reason many modern processors are slowed down by existing memory configurations is that the main bus, over which one or more processors communicate with the memory system, is typically much faster than the memory system's internal bus. In this case, even if the separate devices are very fast, the system will still be slowed down because the devices must wait for data to make it from one to the other. What is needed is a memory bus structure that eliminates much of this waiting time.
The ability to expand the memory is another goal of modern computer design, especially as applications such as graphics become more and more memory intensive. A fast memory is useless, for example, if it isn't large enough to hold needed data and it can't be expanded to accommodate the greater needs of a particular application. Often, the reason a memory can't be expanded is that the bus structure allows only a limited address space. As a simple example, assume that only sixteen bits are available for addressing the memory. This means that there are only 2^16 = 65,536 possible unambiguous addresses. One common way to deal with this problem is to time-multiplex memory addressing, so that, for example, a 32-bit address is transmitted sequentially as two 16-bit partial addresses. This solution, however, doubles addressing time. What is needed is a memory system that can be expanded easily without slowing it down.
The computer memory system according to the invention has a plurality of memory units grouped in memory segments. A master memory controller receives from a requesting system an ordered sequence of memory transaction requests. Each memory segment has its own slave memory controller, which is connected to the master memory controller. Each slave controller individually stores the ordered sequence of transaction requests in a respective slave transaction queue and decodes the transaction requests with respect to a transaction type and a transaction address. The slave controller completes a memory transaction only when the transaction request is the earliest received transaction request in the ordered sequence, and it advances its slave transaction queue whenever any slave memory control means carries out a memory transaction.
In the preferred embodiment of the invention, each memory unit has a plurality of data addresses and the memory system further includes a main bus. The master memory controller includes a master transaction queue and a slave data memory that stores read and write data to be applied to and retrieved from the memory units. Each slave memory controller includes a slave transaction queue and an address drive circuit that generates the addressing signals to the memory units in the respective memory segment.
A single transaction bus connects the master memory controller with all of the slave memory controllers and has a transaction acknowledgement (TACK) line that has a busy ("asserted") state and a free ("non-asserted") state.
In the preferred embodiment, the master memory controller further includes a master control and transaction acknowledgement circuit that senses the state of the transaction acknowledgement (TACK) line; receives via the main bus a sequence of memory transaction requests; and stores the memory transaction requests in the master transaction queue as a sequence of transaction codes and memory transaction addresses in the same order as they are received from the main bus. It also outputs the transaction requests to the slave transaction queues by applying to the slave memory controllers, via the transaction bus, transaction signals and slave address signals corresponding to the transaction codes and memory transaction addresses, and it transfers read data from the memory units to the main bus. Whenever the TACK line indicates that a transaction has been completed, the master memory controller advances all transaction requests in the master transaction queue.
Each slave memory controller preferably includes a slave control and transaction acknowledgement circuit that senses the state of the transaction acknowledgement (TACK) line. It also stores the memory transaction requests in the slave transaction queue in the same order as they are stored in the master transaction queue. Furthermore, it decodes the transaction memory addresses to determine whether the transaction request stored at the head of the slave transaction queue maps into a memory unit in its own memory segment. If so, when the TACK signal is non-asserted (the TACK line indicates "free"), the slave memory controller asserts TACK (drives the TACK signal to indicate "busy"), completes the transaction request at the head of the slave transaction queue, and, when completed, drives the TACK line back to its non-asserted state. Each slave controller advances all transaction requests in its slave transaction queue whenever a transaction is completed, as indicated by changes in the TACK signal.
In a simplified embodiment of the invention, only a single memory bank is connected to the transaction bus, and each slave memory controller in the single memory bank indicates a direction of data transfer between the master memory controller and the memory units via one or more read/write control lines.
In the preferred embodiment, however, the memory segments are grouped into memory banks, which are grouped into memory regions. For each memory region, a multiplexer is provided for multiplexing memory data between the memory units in the memory region and the master memory controller. The read/write control lines then connect each slave memory controller in a memory region with the multiplexer in the same memory region to indicate the direction of data transfer between the master memory controller and the memory units.
Each slave controller stores parameters for the memory units in its associated memory segment so that the memory units are electrically transparent to the master memory controller.
In order that the invention shall be clearly understood, several exemplary embodiments thereof will now be described with reference to the accompanying drawings, in which:
Figure 1 is a block diagram showing the main components of the memory system according to a preferred embodiment of the invention, as well as the bus structure connecting the various components.
Figure 2 is an enlarged block diagram showing in greater detail the main components of the memory system of Figure 1.
Figure 3 is a block diagram of a simplified embodiment of the memory system according to the invention.
Throughout the description of the invention, the parameters and characteristics of one implementation of the invention will be included. The general features of the implementation are also those illustrated in the figures. The implementation was designed to be compatible with the needs of a complete computer system, and other considerations often played a large part, such as ease of manufacturing, reduction of costs, and use of standard components. Many features of the implemented design are advantageous in general and reflect aspects of the invention; others are simply choices based on practicality and do not represent limitations of the invention. Moreover, it is to be understood that the invention includes conventional devices and lines to all components for such purposes as power supply, ground, and system clock signals. These devices and lines are not illustrated or described further because they are well known in the field of digital design and including them would merely clutter the text and figures with known information without making the presentation of the invention any clearer.
In Figure 1, a main bus 100 is the bus over which one or more processors, I/O devices, or other data-requesting systems communicate with a memory system to store and retrieve data from the memory. The main bus will normally be faster than other buses in the system. According to the invention, data, memory addresses and transaction codes (described below) are transferred to and from the main bus 100 by a master memory controller 102. The master controller 102 is able to receive and latch data from and apply data to the main bus 100 at the speed of the main bus itself.
In the embodiment illustrated in Figure 1, the actual memory units are organized in four logically identical memory banks, which are indicated within the dashed-line boxes 104a, 104b, 104c, and 104d. It is not necessary for the banks to be identical, but in general it is both easier and more efficient to implement identical banks since they then allow for uniform addressing and maximum utilization of address space. Since the illustrated banks are, however, logically identical, similar reference numerals are used for similar components in each of the banks.
Each memory bank 104a, 104b, 104c, and 104d includes at least one slave memory controller, one for each memory segment in the bank. The illustrated embodiment corresponds to one actual implementation of the invention and, in this implementation, the slave controllers are dual units 106U and 106L (where the suffixes "U" and "L" indicate "upper" and "lower" as viewed in the figures), each of which is a logically separate slave controller controlling different address segments of the memory bank, and each of which addresses two banks of dual memory units 108L, 108R, 110L, 110R, 112L, 112R, 114L, 114R ("L" and "R" indicating "left" and "right", respectively).
The memory is thus divided into banks, with each bank divided into memory segments, and with a slave controller for each segment; each memory segment includes a number of memory units. The slave controller 106U of memory bank 104a thus controls a memory segment made up of the memory units 108L, 108R, 110L, and 110R, whereas the slave controller 106L of the memory bank 104a controls a memory segment made up of the memory units 112L, 112R, 114L, and 114R. It is not necessary for the slave controllers to be implemented as dual units; rather, slave controllers may be implemented as separate devices, or even more than two could be combined in a single integrated circuit, as long as there is a logically independent slave controller for each memory segment.
The number of slave controllers in each memory bank will be chosen for any given application. Two-way data buses 116L, 116R are combined physically as a single combined memory bus (118a for bank 104a, 118b for bank 104b, and so on) to connect the memory units with one input of a corresponding multiplexer unit 120. Each multiplexer multiplexes read and write data from a predetermined memory region, which preferably includes at least two memory banks. In the illustrated embodiment, each multiplexer unit 120 multiplexes the data to and from a pair of memory banks (104a and 104b, or 104c and 104d), so that each memory region comprises two banks. Multiplexers that each multiplex data for more than two memory banks may be used as long as more selection control lines and associated selection circuitry are included; the necessary changes can be made using known techniques and components.
It is not necessary for each memory unit 108L, 108R, ..., 114L, 114R to be a dual unit with left and right halves; rather, single "unsplit" units, or combinations of even more but smaller units, may be used as long as they are enough bits wide. In the actual implementation of the invention illustrated in Figure 1, data words were 144 bits wide, including 16 bits used for error correction. Having paired 72-bit units allowed the use of more readily implemented memory units.
All the multiplexers 120 are connected with the master memory controller 102 by way of a two-way multiplexer data bus 122. Output from the multiplexers 120 need not be time-multiplexed, so that the multiplexer data bus 122 preferably has the same width as the combined RAM data buses 118. Memory data is read and written via the multiplexer data bus 122.
The master memory controller 102 is connected with all the slave memory controllers 106U, 106L by a single transaction bus 124, which carries addresses and control signals (described below) between the master and slave controllers. The control signals preferably include signals to indicate such information as the state of the multiplexer bus 122, parity bits, transaction acknowledgement (TACK) signals, and reset commands; the control signals used in a preferred embodiment of the invention are described in greater detail below. The multiplexer data bus 122 and the transaction bus 124 together make up a master/slave interface (MSI) bus, which is indicated by the reference number 126.
In the illustrated, implemented embodiment, there is a multiplexer unit 120 for each two slave controllers 106U, 106L in a memory bank. The slave controllers in each bank can apply to the multiplexer a memory read (MREAD) and a memory write (MWRITE) signal via two pairs of corresponding control lines (128a for memory bank 104a and lines 128b for the memory bank 104b shown as the "left half" of the system in Figure 1). These signals are described in greater detail below.
In the implemented embodiment of the invention, data words were 144 bits wide, so the memory buses (for example, 118a and 118b, respectively, for memory banks 104a and 104b), as well as the multiplexer data bus 122, were also 144 bits wide. If the multiplexer unit were implemented as a single-chip device, it would therefore have needed at least 3 × 144 = 432 pins for the bus lines, two pins for each of the MREAD/MWRITE control line pairs 128a and 128b, plus other pins for such conventional (and therefore not illustrated) lines as those needed for power, ground, clock, reset, and possibly diagnostic signals. Although such a large chip may be used according to the invention, it would be very complicated and expensive to manufacture.
In order to reduce complexity and cost without reducing performance, each multiplexer unit 120 is preferably bit-sliced, that is, made up of two or more substantially identical sub-units. In the implemented embodiment of the invention, each multiplexer unit 120 comprised four sub-units 121a, 121b, 121c and 121d. Each sub-unit included identical data latching and control circuitry for 36 bits of each bus. In other words, the buses 118a, 118b and 122 were each "split" or "grouped" physically into fourths, with the different fourths of each bus being multiplexed by different multiplexer sub-units. Each of the sub-units was connected to both of the MREAD/MWRITE control line pairs for the corresponding memory banks (for example, the sub-units in the multiplexer unit for memory banks 104a and 104b were connected to the control line pairs 128a, 128b).
The number of pins needed for each sub-unit chip was therefore no more than 3 × 36 (for the bus lines) + 4 (for the MREAD/MWRITE control lines) = 112 pins, plus a few other pins for power, ground, and so on. Each sub-unit could thus be included in a common, low-cost, 160-pin package. This greatly reduced the complexity and manufacturing cost of the multiplexer units 120 with no loss of performance: each of the multiplexer units 120 still performed as a single logical device.
Other multiplexer arrangements are possible. For example, wider multiplexers (themselves possibly bit-sliced into "sub-units") may be included that multiplex the signals to and from more than two memory banks. Moreover, in a simplified embodiment of the invention described below, no multiplexers are needed at all. One great advantage of the invention is the expandability of the memory system: any number of slave controllers 106U, 106L (as single units or, as in the illustrated embodiment, as multiple units) and their respective memory banks may be included by tying each to the transaction bus 124 and providing additional multiplexers as needed.
Figure 2 shows in greater detail the preferred general structure of the master memory controller 102, of one memory bank 104a, of a multiplexer unit 120, and of the signal and data buses connecting these components.
The master memory controller 102 includes timing and control circuitry 202, a slave data memory unit 204, a first-in-first-out (FIFO) master transaction queue 206 and a transaction acknowledgement (TACK) sensing circuit 208. Each of these sub-components may be implemented using known design techniques and circuits. The timing and control circuitry 202 receives and acknowledges memory requests from the main bus 100, including addresses, transaction codes (described below), and WRITE data; determines the priority of the requests according to predetermined, application-dependent rules (for example, that READ requests have higher priority than WRITE requests, to minimize processor waiting time, or that I/O requests must wait until processor requests are carried out); controls the transfer of data to and from the slave data memory unit 204; generates and sequences the appropriate slave control signals (described below) via the FIFO master transaction queue 206; and advances the queue 206 when the TACK sensing circuit 208 indicates that a queued transaction has been carried out. The slave data memory unit 204 stores data received from the main bus to be written to the memory units via the multiplexer data bus 122 and also data transferred by the multiplexers over the multiplexer data bus 122 that is to be output onto the main bus 100 in response to a memory read request.
Each slave controller 106U, 106L includes a control and decoding circuit 212, a FIFO slave transaction queue 216, a transaction acknowledgement (TACK) circuit 218, and an address output drive circuit 220. These logical units within each slave controller may be connected with each other, and with the memory units and transaction bus 124, using known technology and are therefore not described in greater detail. In the preferred embodiment, the actual memory units 108L, 108R, ..., 114L, 114R, are dynamic random-access memory (DRAM) circuits.
The slave controller's control and decoding circuit 212 preferably includes conventional logic circuitry that generates control signals (described below) for the multiplexer 120, checks for MSI bus parity errors, monitors and sets, via the TACK circuit 218, the state of a TACK signal (described below), advances the queue 216, and controls the timing and latching of the address output drive circuit 220. The control and decoding circuit 212 also includes circuitry for the timing and refreshing of the DRAM circuits; this circuitry may be conventional, but a preferred embodiment according to the invention is described below that is advantageous when large numbers of DRAM devices are included in the memory system.
It is not necessary for the control and decoding circuit 212, the slave transaction queue 216, the TACK circuit 218, and the address output drive circuit 220 to be physically separate units; rather, any or all of these circuits may be implemented as different portions of a single integrated circuit.
One should note that the parameters for timing, addressing, and refreshing the various DRAM memory units are stored in and followed by the slave controller's control and decoding circuit 212. It is not necessary to store or act on these parameters in the master controller 102. This is a great advantage of the invention, since the master controller will typically be much more specialized, complicated, and expensive than the slave controllers. As a result, the memory units may be upgraded (for example, to synchronous DRAM) or changed without having to reprogram or reconfigure the master memory controller. Indeed, according to the invention, the memory units themselves are essentially transparent to the master controller, so that it is even possible to have different types of memory units within the same system.
Similarly, the memory map is held in the slave controllers. This means that even the layout and segmentation of the memory is transparent to the master controller. All slave controllers communicate with each other and with the master controller via the single transaction bus 124, and all multiplexers communicate with the master controller via the single multiplexer data bus 122. The memory can therefore be expanded to the limits of the theoretical address space. One can expand the memory system according to the invention with a minimal decrease in speed, depending on how much of the address space is controlled by any given slave controller and the structure of the multiplexers used to gate the data to and from the memory units.
The memory units 108L, 108R, ..., 114L, 114R are preferably conventional DRAM single in-line memory modules (SIMMs), which are able to accept and output data without the need for separate input and output pins. This simplifies the layout of the memory boards. In the implemented embodiment, each SIMM memory unit was 72 bits wide, of which 64 bits were used for data and 8 bits were used for an error-correcting code (ECC). Since 16-byte data was used in the embodiment, two SIMM memory units in "tandem" each held one half of each data word, which thus consisted of 144 bits (16 data bytes plus 16 ECC bits).
It is not necessary according to the invention for the memory units to be SIMM units, or even DRAM.
Instead, static RAM components may be used, in which case there will be no need to provide any circuitry or procedures for refreshing the memory units. The static RAM components will, however, typically be more expensive than DRAM.
Of course, it is also possible to forego two sets of 72-bit memory units logically side-by-side in favor of single 144-bit units. Paired 72-bit units were preferred because they allowed the use of existing connectors. It is also possible to have some other data word width than 144; the invention allows for data words of arbitrary width.
In the implemented embodiment, each data position in the SIMM memory units was accessed by row and column, and each unit could be accessed independently. The slave controller therefore had separate lines for rows 238, 240, 242, 244 and columns 239, 241, 243, 245. These lines preferably also include lines for refreshing and driving each memory unit into its high-impedance state (that is, for "tri-stating" the memory units); in this way, of all the memory units connected to data bus 116L or data bus 116R, only one at a time will be able to drive data onto the respective bus, all others being held tri-stated. In the implemented embodiment, the minimum memory increment for full 16-byte data was two.
It is of course not necessary in every application of the invention to have independent addressing of the paired memory units; indeed, as is pointed out above, the invention does not require paired units at all.
It is also preferable, although not necessary, for the address output drive circuit 220 to include a separate address latch and address driver for each memory unit that the slave controller addresses. In such a case, the slave controller's control and decoding circuit 212 will preferably include separate control circuitry for each of the separate latch and address drivers. This allows each memory unit to be addressed independently. As is described further below, this also allows addresses to be driven to more than one, and even to all, of the memory units in a memory segment simultaneously. The control lines 238, 239, ..., 244, 245, will in this case include both row and column address signals. These lines will also include lines for tri-stating signals for each memory unit, so that only the memory units that are to drive their data onto the buses 116L, 116R will not be tri-stated, even though the others have also properly been addressed. Tri-stating of digital circuits such as the memory units is a well-understood technique, and is therefore not described in greater detail.
It is possible to have more or fewer than four memory units connected to each slave controller; for any given application, the number of memory units per slave controller will depend on factors such as space (each slave controller takes up space without increasing memory size), cost (slave controllers will typically be more complicated than memory units, since the structure of the slave controllers is unique to the invention but standard components may be used for the memory units), and speed (the more slave controllers there are, the more can be done simultaneously by different slave controllers).
Each multiplexer 120 (or, if bit-sliced into sub-units, each multiplexer sub-unit 121a, 121b, 121c, 121d) has an internal control circuit 250, which controls the direction of data flow, either from the master controller 102 to the memory units for a write operation or in the reverse direction for a read operation, and also selects which connected memory bank's memory units are to be accessed for the read or write operation. Each multiplexer 120 (or sub-unit 121a, 121b, 121c, 121d) also has at least one I/O latch 252 to hold data (or the respective sub-set of the data) coming from or going to the memory units. The multiplexers 120 used in the invention may be implemented with known components and design techniques.
To avoid tedious repetition, each reference to "the control circuit 250" or "the I/O latch 252" is to be understood to refer to the single circuit of the respective type in the multiplexer unit if the multiplexer unit is not bit-sliced, but to each such circuit in each multiplexer sub-unit if it is.
As is described above, the slave controllers in each memory bank are connected to their corresponding multiplexer (or to all of the sub-units in the multiplexer) via the MREAD/MWRITE signal lines; for example, line pair 128a connects the slave controllers of memory bank 104a to the multiplexer 120, that is, to all the sub-units 121a, 121b, 121c and 121d shown in Figure 2. When a slave controller sets MREAD "true" (defined in any conventional manner), the control circuit 250 configures the multiplexer for a memory read operation. Similarly, when the slave controller sets MWRITE "true," the control circuit 250 sets the multiplexer for a memory write operation. Although single MREAD/MWRITE control lines could be used instead of the line pairs 128a, 128b (with, for example, HIGH meaning MREAD and LOW meaning MWRITE), separate lines are preferable for each in order to eliminate any ambiguity about why a single line might be LOW, and also to provide for proper coordination and timing of the multiplexer operations.
In order to simplify the circuitry, the timing of passage of data between the RAM data buses 118 and the multiplexer data bus 122 is preferably fixed relative to when the MREAD or MWRITE signals are set "true"; this timing will depend on the particular implementation and can be determined and optimized using well-known design techniques.
The following signals are examples of the types of signals that will typically be passed over the transaction bus 124. Certain of these, such as the TACK signal, are necessary to the invention, whereas others, such as a bus parity signal MSI_PAR, are advantageous or optional and will depend on the needs of any particular application. Signals that pass in both directions between the slave controllers 106U, 106L and the master controller 102 are labelled (IO). Signals sent only from the master controller 102 to the slave controllers are labelled (O), and signals sent only from the slave controllers to the master controller 102 are labelled (I). In other words, the direction of signals is labelled relative to the master controller 102. The bit size of the signals used in the implemented embodiment is not necessary to the invention but can vary depending on the application. An n-bit signal is indicated as [0:n-1], as is common.

SA[0:31] (IO): Slave Address. This signal gives the memory address to be accessed.
TC[0:2] (O): Transaction Code. Several different transactions are possible using the invention and the bit size of this signal is selected depending on how many different transactions one wishes the slave controllers to be able to perform. In one embodiment of the invention there were five different types of transactions, with the following bit patterns (an "x" represents an arbitrary bit state):
Transaction   TC[0:2]   Description
WRITE         000       Write a full 32-byte cache line to memory
WRITE16       001       Write 16-byte data to memory
DIAG_RD       010       Perform a diagnostic read operation
DIAG_WRT      011       Perform a diagnostic write operation
READ          1xx       Read a full 32-byte cache line

Of these operations, only two will be common to all applications of the invention, namely WRITE and READ, so that TC could theoretically be reduced to a single bit. Even in the implemented embodiment of the invention, WRITE and READ are by far the most common operations, and the bit pattern chosen for TC minimizes memory latency as follows: the READ operation is the most time-critical, since the processor that has requested data from the memory typically will have to wait to get the data before completing some operation.
If only a single READ code "1xx" is included, slave controllers can decode the READ command by simply observing the single least significant bit (LSB), which indicates READ every time this bit is set to a one (the choice of "1" or "HIGH" as "positive" is of course arbitrary for this as well as all other binary signals in the invention, as is the choice of which bit to use; conversion to "positive LOW" or "positive 0" is a well-understood design choice). If more than one type of READ operation is possible, for example, a READ16 code corresponding to WRITE16, decoding time may still be reduced, albeit not as much, if the LSB is always the same for every READ code. Similarly, the less time-critical but common WRITE operation can be decoded based on the two LSBs alone: if they are "00", a WRITE is indicated. Such decoding schemes are well known in the field of digital design and the invention does not require any particular encoding or decoding scheme for the transaction code signal TC.
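Purely by way of illustration, the following software sketch (in Python; the patent itself describes hardware decoding logic, and the function names, tuple representation, and bit ordering, with the bit written leftmost in the table examined first, are assumptions rather than features of the patent) shows the kind of shortcut decode described above:

    # Hypothetical sketch of the fast transaction-code decode. The three code
    # bits are taken as a tuple (tc0, tc1, tc2), tc0 being the bit written
    # leftmost in the table above and the one examined first.

    def is_read(tc):
        # READ is any code "1xx": a single-bit test suffices.
        return tc[0] == 1

    def is_write(tc):
        # WRITE (000) and WRITE16 (001) both begin "00": two bits suffice.
        return tc[0] == 0 and tc[1] == 0

    def decode(tc):
        if is_read(tc):
            return "READ"
        if is_write(tc):
            return "WRITE16" if tc[2] == 1 else "WRITE"
        return "DIAG_WRT" if tc[2] == 1 else "DIAG_RD"

    # Examples: (1, 0, 1) decodes as READ, (0, 0, 1) as WRITE16.
    assert decode((1, 0, 1)) == "READ"
    assert decode((0, 0, 1)) == "WRITE16"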
The diagnostic signals DIAG_RD and DIAG_WRT are not required according to the invention but may be included to indicate such transactions as storing data about the power-up configuration or for memory test support.
In the implemented embodiment of the invention, the "standard" data word was 32 bytes wide. In order to increase the performance of certain 16-byte direct memory access (DMA) operations, however, a WRITE16 signal was included to indicate that the slave controller 106U, 106L was to carry out only a 16-byte write operation to the memory. This signal and its corresponding operation are not necessary in other applications of the invention that have only one possible data word width.
TV (IO): Transaction Valid. This single-bit signal indicates that both the signals SA and TC are valid and that the slave controllers must process the indicated transaction on the indicated address. The slave controllers will not latch and process SA and TC unless TV is in its positive or "true" state. This is described in greater detail below.
TACK: Transaction Acknowledge. This is a single-bit signal whose function greatly increases the flexibility and expandability of the memory system according to the invention. It is used to advance a slave transaction queue, which is described in detail below. TACK can be either (I) or (IO) depending on the application, as described below.
MSI_PAR (O): MSI Bus Parity. This optional signal may have an arbitrary number of error detection or correction bits, for example, a single bit to indicate the parity of the other bus signals.
SRESET (O): Slave Reset. This signal indicates that all the slave controllers 106U, 106L are to reset their internal states to a predetermined initial condition, which will typically include erasing the contents of their transaction queues (described below).
The master memory controller will typically apply this signal when the memory system is powered up.
The operation of the memory bus according to the invention will now be described.
Assume by way of example that the master controller 102 receives a series of memory read and write requests over the main bus 100. Each request will define a transaction type and, except for the slave reset signal SRESET and possibly for diagnostic signals, will also contain a real memory address corresponding to the transaction. Each requested transaction and address is stored in the master transaction queue 206 and is also output onto the transaction bus 124 as the transaction code TC and the slave address SA. Once the transaction code TC and address SA are stably on the transaction bus 124, the master controller 102 sets the transaction valid signal TV to "true."
The transaction bus 124 is common to all slave controllers throughout the system, and when the control circuits 212 in each slave controller sense that TV is true, they load the transaction code and the corresponding address at the end of their respective slave transaction queues 216. In other words, a complete copy of all transactions to be processed is stored not only in the master transaction queue 206, but also identically in every slave transaction queue 216.
According to the invention, all transactions are processed in order, but they do not necessarily have to be processed one at a time. As is mentioned above, memory mapping is carried out in the slave controllers 106U, 106L. As each new transaction code and address is received by a slave controller, it decodes whether the address lies within the memory space it controls. If it doesn't, the transaction remains stored in the queue, but the slave takes no other action than to continue to monitor the transaction bus 124 and advance the slave's queue (described below).
Each correct address will, however, lie within the memory space of one of the slave controllers 106U, 106L in one of the memory banks 104a, 104b, 104c, 104d. (A method for dealing with incorrect addresses is described below.) The control and decoding circuit 212 of this slave controller will recognize this upon decoding the address. The indicated slave controller, which controls the memory space in which the request lies, also loads the request into its transaction queue. Assuming no earlier transaction in the queue involves that particular slave controller, the slave controller also immediately begins to load its address output drive circuit 220 with the address given by the SA signal so as to access the corresponding word in the DRAM memory units.
Note that, in the preferred embodiment, each memory unit in each memory segment is independently addressable, with corresponding separate address latches and drivers included in the address output drive circuit 220, and corresponding separate control circuits included in the control and decoding circuit 212 of the respective slave controller. This means that the slave controller, as it decodes addresses for incoming transaction requests (or for those already in the transaction queue), can drive addresses for any or all of its memory units as long as they are not already involved in as yet uncompleted transactions. In other words, each slave controller can begin to set up several transactions at once (by driving out the respective addresses and tri-stating the corresponding memory units) even before any one of them reaches the head of the queue. The number of memory units that can be addressed at any one time in the system is limited only by the chosen size of the transaction queues; bus conflicts are avoided because only the memory unit or units whose addresses are indicated by the transaction at the head of the queue will be able to access the common buses, and all other "addressed" memory units will remain tri-stated until it is their "turn" to drive data onto the buses.
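As an illustrative sketch only (the address-to-unit mapping, the constant, and every name below are assumptions made for the example, not details taken from the patent), the internal set-up behaviour just described might be modelled in software as follows: any queued transaction that maps into this controller's segment and whose memory unit is not tied up by an earlier pending transaction can have its address driven out immediately, while the data transfer itself waits for the head of the queue.

    # Hypothetical sketch: a slave controller pre-driving addresses for queued
    # transactions that map into its own memory segment, ahead of their turn.

    UNIT_SIZE = 4096  # assumed words per memory unit, for illustration only

    def drive_address(unit, address):
        # Stand-in for the address output drive circuit: latch the address and
        # leave the unit tri-stated so it cannot drive the shared data bus yet.
        print(f"unit {unit}: address {address:#x} driven, outputs tri-stated")

    class SlaveSetup:
        def __init__(self, segment_base, segment_size):
            self.segment_base = segment_base
            self.segment_size = segment_size
            self.busy_units = set()  # units tied up by earlier pending transactions

        def maps_here(self, address):
            return self.segment_base <= address < self.segment_base + self.segment_size

        def try_predrive(self, address):
            """Internal step only: set up the access without touching shared buses."""
            if not self.maps_here(address):
                return False
            unit = (address - self.segment_base) // UNIT_SIZE
            if unit in self.busy_units:
                return False  # an earlier uncompleted transaction owns this unit
            drive_address(unit, address)
            self.busy_units.add(unit)
            return True

    # Example: a controller owning addresses 0x0000-0x1FFF sets up two accesses
    # in parallel while neither transaction is yet at the head of the queue.
    s = SlaveSetup(segment_base=0x0000, segment_size=0x2000)
    s.try_predrive(0x0040)   # unit 0
    s.try_predrive(0x1040)   # unit 1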
Each transaction queue 206 (master), 216 (slaves) operates on a first-in, first-out (FIFO) basis, with transactions moving from the end of the queues toward the head of the queues. The contents of all transaction queues are identical.
Assume that the memory transaction at the head of the queue operates on a memory unit within the address space controlled by the slave controller 106U in the memory bank 104a, that is, within the address space shown in Figure 2. The slave controller 106U then "owns" the multiplexer data bus 122, the RAM data bus 118 and the MREAD/MWRITE signal lines 128a in its memory bank, as well as the TACK signal, which means that no other slave controller may output or request output of signals onto any of these lines or buses.
The slave controller 106U then carries out the memory transaction indicated by the code TC, which will typically include latching the address in its address output drive circuit 220 (if it hasn't already done so), and indicating to the multiplexer, over the MREAD/MWRITE signal lines 128a, the direction of data flow between it and the memory units. Since each memory bank has its own RAM data bus 118 and its own MREAD/MWRITE signal line pair 128a, 128b to the corresponding multiplexer unit, the multiplexer will detect which of its data latches contains the valid data.
Completion of a requested transaction thus involves both "internal" and "external" steps. Internal steps are those that a slave memory controller can carry out without accessing the MSI bus 126, whereas external steps require access to the MSI bus 126.
Internal steps include decoding each slave address signal in the transaction queue (preferably, but not necessarily, as it arrives from the transaction bus) to determine whether the corresponding address lies within the slave controller's associated memory segment. If so, and if no other earlier but uncompleted transaction request involves that slave controller's address segment, then the slave controller drives the address to the corresponding memory unit via the address output drive circuit 220. Any or even all of the slave memory controllers can be performing internal steps simultaneously, in parallel.
External steps are those involving actual data transfer to or from the master controller 102. These steps include commanding the corresponding multiplexer 120 to accept (MWRITE) or transfer (MREAD) data over the multiplexer data bus 122, depending on the transaction code TC, and altering the state of the TACK line. Only one slave controller at a time may perform external steps, so that requested transactions are processed sequentially.
A transaction is not completed until a slave controller has carried out both the internal and external steps of the transaction and relinquished control of the TACK line.
As long as the slave controller is processing the transaction, it asserts the TACK signal via its transaction acknowledgement (TACK) circuit 218, thereby indicating that the TACK and other control lines are in a "busy" or "not available" state. The manner in which the TACK signal is asserted may be chosen to suit the needs of a particular implementation, but a particularly advantageous method, and one alternative method, are described below. Recall that DRAM timing requirements, which will typically differ from application to application, are preferably pre-stored or pre-configured in the control circuit 212 of each slave controller using known design methods. The slave controller will therefore "know" how long a transaction will take.
The preferred protocol for the TACK signal eliminates or greatly reduces the likelihood of any bus conflict by providing stable signal states. According to the preferred protocol, the TACK line is either logically "HIGH" or logically "LOW," and can be tri-stated either HIGH or LOW. The state of the TACK line can be changed by the TACK circuit 218 in any slave controller. Providing three-state control of a line is a well-known technique in digital design and the TACK circuits 218 may be implemented using such techniques and known devices.
According to the invention, if the TACK signal is currently LOW, a slave controller asserts the TACK signal by changing it to HIGH for one clock cycle and then tri-stating the TACK line HIGH; if it is currently HIGH, the slave controller asserts the TACK signal by changing it to LOW for one clock cycle and then tri-stating the TACK line LOW for at least one cycle. In other words, a slave controller asserts the TACK signal by changing its logical state from whatever it just was and then tri-stating the TACK signal at the new logical state. This means that there is always at least one full cycle during which the TACK line is tri-stated before another slave controller can assert the TACK signal, and this in turn eliminates or greatly reduces the likelihood of bus conflicts due to ambiguity about the logical state of the TACK signal.
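A minimal behavioural sketch of this preferred protocol, written in Python purely for illustration (the patent describes a hardware line; the class, method names, and the cycle-free simplification are assumptions), might look like this:

    # Hypothetical model of the preferred TACK protocol: asserting TACK means
    # toggling its logical level, driving the new level for one clock cycle,
    # then tri-stating the line so it holds that level until the next assertion.

    class TackLine:
        def __init__(self):
            self.level = 0          # logical level currently seen on the line
            self.tristated = True   # True when no controller is driving the line

        def assert_tack(self):
            """Called by the controller that has just completed a transaction."""
            self.level ^= 1         # change state from whatever it just was
            self.tristated = False  # drive the new level for one clock cycle
            # ... one clock cycle later the driver lets go ...
            self.tristated = True   # line is tri-stated at the new level

        def changed_since(self, last_seen_level):
            """Controllers watch for a level change to advance their queues."""
            return self.level != last_seen_level

    # Example: a completed transaction toggles the line; any controller that
    # remembers the old level sees the change on its next cycle.
    line = TackLine()
    old = line.level
    line.assert_tack()
    assert line.changed_since(old)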
The master memory controller and all slave memory controllers continuously monitor the state of the TACK line using their respective TACK circuits 208, 218.
Whenever the master and slave controllers sense that the logical state of the TACK signal has changed, they advance their transaction queues, and only the slave controller that controls the transaction at the head of the queue accesses or "owns" the common buses and lines for data transfer, in particular, the corresponding multiplexer data bus 122, RAM data bus 118, the MREAD/MWRITE signal lines, and the TACK line itself.
Any other slave controllers that control memory segments addressed in other transaction requests in the queue may, however, proceed with internal steps for these transactions. When a particular slave controller's transaction comes to the head of the queue, all that would then be required would be to send the appropriate MREAD/MWRITE signal to the associated multiplexer unit, since the corresponding address signals would already have been driven to the proper memory units. This ability to drive addresses in parallel to different memory segments while processing transactions sequentially, all without bus conflict, and with no need for special bus arbitration by the master memory controller, greatly increases the bandwidth of the memory system according to the invention.
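To tie these pieces together, here is a deliberately simplified, hypothetical per-cycle model of a slave controller's queue handling (all names, the tuple representation of a transaction, and the return value are assumptions for illustration; they are not taken from the patent): every controller keeps an identical FIFO copy of the transaction stream, every controller advances that copy on a TACK transition, and only the controller whose segment is addressed by the head entry carries out the external, bus-accessing steps.

    # Hypothetical sketch of lock-step queue advancement and head-of-queue
    # bus ownership in a slave controller.

    from collections import deque

    class SlaveQueue:
        def __init__(self, segment_base, segment_size):
            self.queue = deque()            # identical copy of the master queue
            self.segment_base = segment_base
            self.segment_size = segment_size
            self.last_tack = 0              # TACK level seen on the previous cycle

        def maps_here(self, address):
            return self.segment_base <= address < self.segment_base + self.segment_size

        def on_transaction_valid(self, tc, sa):
            # TV true: every slave latches TC and SA at the end of its queue.
            self.queue.append((tc, sa))

        def each_cycle(self, tack_level, tack_free):
            # A change in the TACK level means some slave completed a transaction,
            # so every controller (including this one) advances its queue.
            if tack_level != self.last_tack and self.queue:
                self.queue.popleft()
            self.last_tack = tack_level
            # Only the controller owning the head transaction touches shared buses.
            if self.queue and self.maps_here(self.queue[0][1]) and tack_free:
                tc, sa = self.queue[0]
                return ("external steps", tc, sa)   # MREAD/MWRITE, data, then TACK
            return None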
One example of an alternative TACK signal protocol would be that the slave controller responsible for the transaction request at the head of the queue holds the TACK line LOW (or "false") until it has completed the transaction and then signals completion of ("acknowledges") the transaction by driving the TACK line HIGH (or "true"). No other slave controller would attempt to access common buses as long as the TACK signal is LOW. As soon as the master and slave controllers sensed a transition of the TACK signal from LOW to HIGH, however, they would all advance their transaction queues and the slave controller in charge of the new transaction request at the head of the queue would drive the TACK signal LOW and begin processing the transaction. Transaction completion would in this case be signalled by a transition from LOW to HIGH.
Yet another possible protocol would be for the "owning" slave controller to hold the TACK signal LOW while it is carrying out a transaction request, then to drive it HIGH, and then LOW again for a certain number of cycles. In this case, the master controller and all slave controllers know that a transaction is completed, and that they are to advance their transaction queues, when they sense that the TACK signal is "asserted," that is, in this case, when they sense the first TACK LOW a predetermined number of cycles after a "falling edge" from HIGH to LOW.
All such protocols for the TACK signal are logically equivalent, and conventional electrical considerations will typically dictate which protocol is least likely to suffer from errors or conflicts with other system requirements. In all cases, however, the TACK line or signal will have some "busy" or "asserted" state and some "free" or "non-asserted" state. In the preferred embodiment, the "free" state of the TACK signal is that it has just changed from HIGH to LOW or vice versa in the last clock cycle and one cycle has elapsed during which the TACK line has been tri-stated. The "busy" state is that the TACK signal is tri-stated and has been so for more than one clock cycle. In the first-described alternative protocol, the "busy" state is that the TACK signal is LOW and the "free" state is that the signal is HIGH. In the second-described alternative protocol, the "busy" state is that the TACK line has been LOW for more than some predetermined number of cycles and the "free" state is when the controllers sense the first TACK LOW a predetermined number of cycles after a "falling edge" from HIGH to LOW.
The logically identical transaction queues in the various controllers and the use of the common TACK signal provide several advantages:
1) All transactions issued by the master controller to the slaves will be completed in order, which eliminates the need for the master to decode which transaction has just been completed.
2) The simultaneous advancement of all transaction queues and the common interpretation of a single TACK signal eliminates the need for further bus arbitration, since all slave controllers wait their turn to access any buses or lines shared with other slave controllers.
3) Operations do not need to be synchronized: a slave controller can own the various common data buses and control signal lines for as long as it takes to complete a transaction and then simply assert the TACK signal when it is completed. This feature also permits time-multiplexing of data transfers. If, for example, in order to reduce bus width or to use conventional memory units, it is found advantageous to transfer data words from memory units to a multiplexer as a sequence of sub-words (for example, as two 72-bit words instead of as a single 144-bit word), this is possible according to the invention, since the slave controller will simply hold TACK false until the transfer is completed.
4) The functioning and structure of the slave controllers and memory units are substantially transparent to the master controller and need not be set: as long as the slave controller is compatible with the rest of the system with respect to timing and electrical considerations, can latch and decode transaction codes and addresses, and can interpret and process the TACK signal according to the same rules of protocol as the master controller and the other slave controllers, it can be used in the system according to the invention. This means that improvements in the slave controllers can be incorporated into the system without having to change the more complicated and expensive master controller.
5) The number of memory units in the system can be increased to fill the theoretical limit of the address space without requiring any changes to the master controller. Only electrical and timing considerations will typically limit expansion.
Of the various signals on the transaction bus 124, only the TACK signal line is unavailable to the master controller 102 while a slave controller is processing a transaction. This means that the master controller can continue to load other transaction requests into the transaction queues 206, 216 (all) even when a slave controller is processing a transaction, and regardless of the state of the TACK signal.
Since all the slave memory controllers store all the transaction codes and addresses in their transaction queues 216, some mechanism is needed to ensure that the master controller 102 does not issue more transaction requests than the slave controllers can accept, store in the queues, and process. In other words, there needs to be a mechanism to prevent the slave transaction queues from overflowing. According to the invention, this is arranged by storing in the master controller the size of the slave transaction queues. A simple counter or a pointer to the master transaction queue can then indicate when the number of unprocessed transactions has reached the limit of the slave transaction queues to store them.
It is not necessary for the master transaction queue to have the same size as the slave transaction queues, but it must be at least as big. Assume the slave transaction queues can hold at most N transactions. When the master controller counts or detects that it has issued N unprocessed transaction requests, it stops issuing transaction requests until some slave controller asserts the TACK signal, which indicates that one transaction request has been processed and that there is now room at the end of the slave transaction queues for one more transaction. The master controller then issues another transaction request to the slave controllers. If the master transaction queue 206 reaches the limits of its own storage capacity, the master control circuit 202 stops taking in and acknowledging additional memory access requests from the main bus 100 and requesting units must wait. The size of the transaction queues 206, 216 can be chosen by experiment to minimize the likelihood that the main bus will have to wait while still keeping the transaction queues small enough to be easily implemented.
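As an illustrative sketch only (the queue depth, class, and method names are assumptions introduced for the example), the master's flow control can be modelled as a simple counter of outstanding, unacknowledged transaction requests:

    # Hypothetical sketch of the master controller's flow control: it never lets
    # the number of unprocessed transaction requests exceed the known depth of
    # the slave transaction queues.

    SLAVE_QUEUE_DEPTH = 8   # assumed depth, stored in the master at configuration

    class MasterFlowControl:
        def __init__(self, depth=SLAVE_QUEUE_DEPTH):
            self.depth = depth
            self.outstanding = 0    # requests issued but not yet acknowledged

        def can_issue(self):
            return self.outstanding < self.depth

        def issue(self, transaction):
            # Place TC/SA on the transaction bus and raise TV (not modelled here).
            assert self.can_issue(), "slave queues would overflow"
            self.outstanding += 1

        def on_tack(self):
            # A TACK assertion means one transaction left the head of every queue,
            # so there is room again at the end of the slave queues.
            self.outstanding -= 1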
If a WRITE operation immediately precedes or follows a READ operation in the same memory bank, then the timing will typically be different than if two operations of the same type follow one another. When a slave controller owns a READ transaction, it is preferably responsible for avoiding conflicts on the corresponding RAM data bus 118 with an immediately preceding WRITE operation. In the implemented embodiment, if a slave controller performs a WRITE operation, then it avoids such conflicts by holding TACK false at least three system clock cycles before once again driving the TACK signal true and proceeding with a READ transaction. Other timing schemes may of course be employed; indeed, depending on the memory units used in a given application, such conflicts may not ever arise and no measures to avoid RAM data bus conflicts may even be needed.
It is possible that software errors or electrical problems can lead to a requesting unit sending a request for a transaction for a memory address that doesn't exist. Since the master memory controller does not need to (and therefore preferably does not) include a memory map, it will cause the corrupted or incorrect address to be loaded into the transaction queues just like any other. When that transaction request reaches the head of the queues, no slave controller will recognize the address, so that no slave controller will ever assert the TACK signal to move the faulty request out of the queues -- the memory system will be "paralyzed."
If this is a concern in a particular application, the TACK line is preferably made two-way, that is, both for input to and output from the master controller. The control circuitry 202 or TACK circuitry 208 of the master memory controller then includes a conventional timer, such as a count-down circuit. The timer begins counting when the TACK line enters the "free" state. If the TACK signal is asserted within a predetermined number of clock cycles (more than the greatest number of cycles any valid transaction request would ever take to be processed), the timer resets. If the TACK signal is not asserted within this time, however, the timer times out and the master controller assumes that the address was not recognized by any slave. The master controller then itself, via the TACK circuit 208, asserts the TACK signal, which causes all the transaction queues to advance, and it also issues some conventional error signal onto the main bus 100 to indicate to the requesting unit that its request could not be processed. The requesting unit can then reissue the request with the correct address or take some other corrective action. If this feature is included in an implementation of the invention, then the TACK signal becomes two-way, and is not just an input signal from the slave memory controllers to the master controller.

Because the invention provides for a very high capacity yet still very fast memory system, there will often be thousands of individual memory units in the system. Assuming these memory units are DRAM chips (the preferred, but not necessary, choice), all of these will need to be refreshed regularly. If all of the slave controllers refresh their DRAM circuits at the same time, a power-consuming spike may result. In order to avoid such a "step load," each slave memory controller is preferably assigned (in hardware) a unique identification number i, preferably one in a consecutive series of integers (for example, for M slave controllers, i = 0, 1, 2, ..., M-1).
When the memory system is first powered up, the master controller will apply the slave reset signal SRESET. The control circuits 212 in the slave controllers then each preferably include identical conventional counters that count clock cycles and start over again when they reach some number N, which is a predetermined refresh period (expressed in number of clock cycles). The refresh period N will be chosen based on the specifications for the chosen DRAM chips.
The refresh period N must be no greater than the maximum allowable refresh period for the chips, but it should preferably also be large enough that the DRAM chips will not be refreshed unnecessarily often; otherwise, the slave controllers will waste time refreshing the DRAMs when they could instead be processing transactions.
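For a rough feel of the numbers involved, the following fragment computes an upper bound on N under purely assumed figures; the DRAM refresh interval and the clock frequency below are not taken from the text.

```c
/* Worked example with assumed figures: a hypothetical DRAM needing one
 * refresh command roughly every 15.6 microseconds, driven by an assumed
 * 66 MHz system clock, allows a refresh period of about 1029 cycles, so a
 * designer might choose N = 1000. */
#define CLOCK_HZ    66000000UL   /* assumed system clock frequency              */
#define REFRESH_NS  15600UL      /* assumed maximum gap between refreshes, in ns */
#define MAX_N       (REFRESH_NS * (CLOCK_HZ / 1000000UL) / 1000UL)  /* = 1029   */
#define N           1000UL       /* chosen refresh period, N <= MAX_N           */
```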
The refresh period N will typically, but not necessarily, be on the order of thousands of clock cycles. According to the invention, slave controller i preferably refreshes its DRAM memory units on the following clock cycles:
k·i + j·N, where k is a predetermined integer spacing constant, and j = 0, 1, 2, ... (until system power down or some general reset).
To illustrate this preferred "staggered" refreshing procedure, assume that N = 1000, i = 2, and k = 1. The slave controller with the identification number 2 will therefore issue refresh commands on clock cycles 2, 1002, 2002, 3002, and so on. Slave controller number 3 (i = 3) will refresh its DRAM chips on cycles 3, 1003, 2003, 3003, and so on. It is preferable that N be chosen greater than or equal to M, where M is the total number of slave controllers; in this way, no two slave controllers will refresh their DRAMs at the same time, that is, there will be no overlapping and bunching up of the refresh operations.
In this example, if there are only ten slave controllers (M = 10), N = 1000, and k = 1, then refresh operations will only be carried out during the first ten cycles of every 1000-cycle refresh period. For reasons of timing or other electrical considerations, it may be preferable, however, to space the refresh operations more evenly over the refresh period. This can be done by changing the spacing constant k. To illustrate, if k = 100 (other values remaining as before), then slave controller number 2 will refresh its DRAM chips on cycles 200, 1200, 2200, 3200, and so on, and slave controller number 3 will refresh on cycles 300, 1300, 2300, 3300, and so on.
In other words, by changing the spacing constant k from 1 to 100, the spacing in cycles between refresh operations also increases from 1 to 100. Notice that if N is chosen to be a multiple of M and k = N/M, then the refresh operations will be evenly spaced over the entire refresh period, without overlapping.
Actual implementations of the counters in the slave controllers will often count clock cycles modulo N, that is, they will reset to zero on the clock cycle after they have counted to N-1. Continuing with the example, this merely means that slave controllers number 2 and 3 will refresh every time their counter reaches 200 and 300, respectively.
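A minimal simulation of this staggered schedule, using the example values from the text (N = 1000, k = 100, slave controllers 2 and 3), might look like the following; the function and variable names are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Slave controller i issues a refresh on every cycle where
 * (cycle mod N) == k * i, i.e. on cycles k*i + j*N for j = 0, 1, 2, ... */
static bool refresh_due(unsigned long cycle, unsigned n, unsigned k, unsigned i)
{
    return (cycle % n) == (unsigned long)k * i;
}

int main(void)
{
    const unsigned N = 1000, k = 100;          /* values from the example in the text */
    for (unsigned long cycle = 0; cycle < 3500; cycle++) {
        for (unsigned i = 2; i <= 3; i++) {    /* slave controllers number 2 and 3    */
            if (refresh_due(cycle, N, k, i))
                printf("cycle %5lu: slave %u refreshes its DRAMs\n", cycle, i);
        }
    }
    return 0;   /* prints refreshes at cycles 200, 300, 1200, 1300, 2200, 2300, ...   */
}
```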
As an alternative to the refreshing scheme just described, the individual DRAMs or predetermined groups of DRAMs (instead of the slave controllers) could be assigned unique identification numbers, which would be stored in the control hardware of their respective slave controllers. Whenever the counters reached some number of clock cycles corresponding to some predetermined integer function of the identification numbers, then the corresponding DRAMs could be refreshed.
Other refreshing schemes than the one just described may also be used as long as they provide for staggered refreshing and avoid overlapping refreshing operations. It is furthermore not necessary according to the invention to eliminate all simultaneous refreshing of different memory units by even spacing, but the more units are refreshed at the same time, the larger will be the corresponding step load. The invention makes it possible to eliminate overlapping, however, by staggering the refreshing of different DRAMs or groups of DRAMs, and as a result provides the advantage that the power system experiences only a series of small step loads instead of a single large step load.
Figure 3 illustrates a simplified, low-cost embodiment of the invention. In the illustrated embodiment, only a single memory bank 304 and a single dual slave controller 106U, 106L are included. There is no need for multiplexers; rather, a RAM data bus 322 (which is the logical equivalent of the data bus 118 in Figures 1 and 2) is connected directly to data input/output pins of a master memory controller 302, which may be the same pins to which the multiplexer data bus 122 is connected in the multi-bank embodiment shown in Figure 2. A memory read (MREAD) signal and a memory write (MWRITE) signal are also applied directly to the master memory controller via two corresponding control lines 328 (corresponding to the lines 128a in Figure 2).
The master memory controller incorporates the multiplexer control circuitry to determine the direction (READ or WRITE) of data flow.
In all other respects this simplified embodiment has the same structure and function as the single memory bank shown in Figure 2, except that there is now only one pair of slave controllers (the two halves of the dual unit, although separate units may of course be used instead) that need to share the data buses, queue transactions, and drive the TACK signal. By eliminating the multiplexers for applications with lower memory requirements, the embodiment shown in Figure 3 can be made at much lower cost.
It is preferable that a single master memory controller be provided that can be used both in the multi-bank embodiment of Figures 1 and 2 and in the simplified embodiment shown in Figure 3. This can be done simply by providing pins to which the MREAD and MWRITE signals can be connected, as well as some physical switch or sensing circuit to indicate whether the multi-bank or the simplified embodiment is in operation. This sensing operation could be carried out in a known manner by firmware at power-up of the system, or by a simple testing circuit that detects, for example, that the MREAD and MWRITE pins are not grounded, which it then interprets as indicating the simplified embodiment.
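One possible firmware-level sketch of this power-up check is given below. The pin-reading helper and all names here are hypothetical; the text requires only some switch or sensing circuit whose state can be examined at power-up.

```c
#include <stdbool.h>

enum mem_config { CONFIG_MULTI_BANK, CONFIG_SIMPLIFIED };

/* Stand-in for the real sensing circuit or firmware pin read; this sketch
 * simply simulates both strap pins as not grounded. */
static bool pin_is_grounded(int pin)
{
    (void)pin;
    return false;
}

/* Ungrounded MREAD/MWRITE pins are interpreted as the simplified, single-bank
 * embodiment of Figure 3; otherwise the multiplexed multi-bank configuration
 * of Figures 1 and 2 is assumed. */
enum mem_config detect_config(int mread_pin, int mwrite_pin)
{
    if (!pin_is_grounded(mread_pin) && !pin_is_grounded(mwrite_pin))
        return CONFIG_SIMPLIFIED;
    return CONFIG_MULTI_BANK;
}
```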

Claims (9)

  1. A computer memory system comprising:
    A. a plurality of memory units (108L, 108R, 114L, 114R) grouped in memory segments;
    B. a master memory controller (102; 302) that receives from a requesting system an ordered sequence of memory transaction requests and transfers data to and from the memory units;
    C. for each memory segment, a slave memory controller (106U, 106L) that is connected to the master memory controller and
      1) that individually stores the ordered sequence of transaction requests in a respective slave transaction queue (216);
      2) that decodes the transaction requests with respect to a transaction type and a transaction address;
      3) that completes a memory transaction, including accessing the memory unit having the transaction address and transferring data between the master memory controller and the memory unit, only when the transaction request is an earliest received transaction request in the ordered sequence; and
      4) that advances the slave transaction queue (216) whenever any slave memory controller completes the memory transaction.
  2. A computer memory system as in claim 1, in which:
    A. each memory unit has a plurality of data addresses;
    B. the memory system further includes a main bus (100);
    C. the master memory controller (102; 302) includes:
      1) a master transaction queue (206);
      2) a slave data memory (204) holding read and write data to be applied to and retrieved from the memory units;
    D. each slave memory controller (106U, 106L) includes:
      1) the slave transaction queue (216);
      2) an address drive circuit (220) that connects the slave memory controller to the memory units in the respective memory segment and has, as output signals, address signals corresponding to the data addresses of the memory units in the memory segment;
    E. a single transaction bus (124) that connects the master memory controller (102; 302) with all of the slave memory controllers and has a transaction acknowledgement (TACK) line that has an asserted state and a non-asserted state;
    F. the master memory controller further includes a master control and transaction acknowledgement circuit (208)
      1) that senses the state of the transaction acknowledgement (TACK) line;
      2) that receives via the main bus (100) the ordered sequence of memory transaction requests;
      3) that stores the memory transaction requests in the master transaction queue (206) as a sequence of transaction codes and memory transaction addresses in order of receipt of the transaction requests from the main bus (100);
      4) that outputs the transaction requests to the slave transaction queues (216) by applying to the slave memory controllers, via the transaction bus (124), transaction signals and slave address signals corresponding to the transaction codes and memory transaction addresses;
      5) that applies write data to and latches read data from the memory units;
      6) that transfers read data from the memory units to the main bus; and
      7) that advances all transaction requests in the master transaction queue (206) when the transaction acknowledgement (TACK) line enters the asserted state;
    G. in each slave memory controller (106U, 106L), a slave control and transaction acknowledgement circuit (212, 218)
      1) that senses the state of the transaction acknowledgement (TACK) line;
      2) that stores the memory transaction requests in the slave transaction queue (216) in the same order as they are stored in the master transaction queue (206), with an earliest received unprocessed transaction request at a head of the slave transaction queue and a latest received unprocessed transaction request at an end of each slave transaction queue;
      3) that decodes the transaction memory addresses to determine whether the transaction request stored at the head of the slave transaction queue maps into a memory unit within the memory segment of the respective slave memory controller and, when so, and when the TACK line is in the non-asserted state,
        a) that completes the transaction request at the head of the slave transaction queue (216); and
        b) that drives the TACK line to its asserted state after completing the transaction request at the head of the slave transaction queue; and
      4) that advances all transaction requests in the slave transaction queue when the transaction acknowledgement (TACK) line enters the asserted state.
  3. A computer memory system as in claim 1, in which the memory segments are grouped into memory banks (104a, 104b, 104c, 104d), which are grouped into memory regions, further comprising:
    for each memory region, a multiplexer (120) that multiplexes memory data between the memory units in the memory region and the master memory controller (102);
    read/write control circuitry (128a, 128b, 128c, 128d) that connects each slave memory controller in a memory region with the multiplexer in the memory region for indicating a direction of data transfer between the master memory controller (102) and the memory units.
  4. A computer memory system as in claim 1, in which each slave control and transaction acknowledgement circuit (212, 216) in each slave memory controller is also included for storing memory unit timing and drive parameters for the memory units in the memory segment corresponding to the respective slave memory controller, the memory units thereby being electrically transparent to the master memory controller (102).
  5. A computer memory system as in claim 2, in which the memory units are dynamic random-access-memory (DRAM) units, and in which the slave control and transaction acknowledgement circuit (212, 218) refreshes the DRAM units in a time-staggered manner.
  6. A computer memory system as in claim 1, in which:
    the memory segments form only a single memory bank (304) which is connected to the transaction bus (124); and
    each slave memory controller (106U, 106L) in the single memory bank includes a read/write control circuit (328) that connects each slave memory controller to the master memory controller (302) and indicates a direction of data transfer between the master memory controller and the memory units.
  7. A method for accessing a plurality of memory units in a computer memory comprising the following steps:
    A. grouping the plurality of memory units into memory segments;
    B. sequentially storing an ordered sequence of memory transaction requests from a requesting system in a master transaction queue (206);
    C. for each memory segment, individually storing the ordered sequence of transaction requests in a corresponding slave transaction queue (216);
    D. decoding the transaction requests in each slave transaction queue with respect to a transaction type and a transaction address;
    E. completing a memory transaction by accessing the memory unit having the transaction address only when the transaction request is an earliest received transaction request in the ordered sequence; and
    F. advancing all transaction queues whenever any memory transaction is carried out.
  8. A computer memory system substantially as herein described with reference to the accompanying drawings.
  9. A method of accessing a computer memory substantially as herein described with reference to the accompanying drawings.
GB9504794A 1994-04-08 1995-03-09 Bus configuration for memory systems Withdrawn GB2288256A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US22494194A 1994-04-08 1994-04-08

Publications (2)

Publication Number Publication Date
GB9504794D0 GB9504794D0 (en) 1995-04-26
GB2288256A true GB2288256A (en) 1995-10-11

Family

ID=22842868

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9504794A Withdrawn GB2288256A (en) 1994-04-08 1995-03-09 Bus configuration for memory systems

Country Status (3)

Country Link
JP (1) JPH07281988A (en)
DE (1) DE19503022A1 (en)
GB (1) GB2288256A (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5168558A (en) * 1986-01-29 1992-12-01 Digital Equipment Corporation Apparatus and method for providing distributed control in a main memory unit of a data processing system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1312410A (en) * 1970-12-30 1973-04-04 Ibm Data processing systems
EP0051426A1 (en) * 1980-10-31 1982-05-12 Honeywell Information Systems Inc. Request queueing in memory controller

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2759178A1 (en) * 1997-02-05 1998-08-07 Sgs Thomson Microelectronics Memory management circuit for multi-user system with request and access priority
US6189075B1 (en) 1997-02-05 2001-02-13 Sgs-Thomson Microelectronics S.A. Circuit for the management of memories in a multiple-user environment with access request and priority
US7043593B1 (en) * 2003-04-29 2006-05-09 Advanced Micro Devices, Inc. Apparatus and method for sending in order data and out of order data on a data bus

Also Published As

Publication number Publication date
DE19503022A1 (en) 1995-10-12
JPH07281988A (en) 1995-10-27
GB9504794D0 (en) 1995-04-26


Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)