WO2004046931A1

WO2004046931A1 - Memory control device and store bypass control method

Info

Publication number: WO2004046931A1
Application number: PCT/JP2002/012136
Authority: WO
Inventors: Yasutomo Sakurai
Original assignee: Fujitsu Limited
Priority date: 2002-11-20
Filing date: 2002-11-20
Publication date: 2004-06-03

Abstract

A memory control device comprises a cache access control unit (12) which reads data D1 corresponding to an address from a cache memory (13) in response to an access request (a store access) from a CPU (11), and a merge unit (101) which merges data D2, supplied from the CPU (11) as the data to be stored, with the data D1 and stores the merged result as data D3 in a store buffer (19). When an access request (a load access) corresponding to the same address is made from the CPU (11) before the data D3 is stored in the cache memory (13), the data D3 is loaded to the CPU (11) from the store buffer (19).

Description

Memory control device and store bypass control method

The present invention relates to a memory control device having a store buffer and a store bypass control method, and more particularly to a memory control device and a store bypass control method capable of shortening a load access time at the time of store bypass.

Background art

FIG. 4 is a block diagram showing a configuration of a conventional cache memory device 10. The cache memory device shown in this figure is provided with a cache memory 13 in order to bridge the speed difference between a CPU (Central Processing Unit) 11 and a main memory unit 16.

The CPU 11 reads and writes data by accessing the cache memory 13 or the main memory unit 16. The main memory unit 16 has a large capacity, but its access time is longer than that of the cache memory 13. The main memory unit 16 stores all data used by the CPU 11.

The cache memory 13 stores a part of the data stored in the main memory unit 16 and includes a tag RAM (Random Access Memory) 14 and a data RAM 15. The cache memory 13 is, for example, a static random access memory (SRAM), and has a characteristic that the access time is shorter than that of the main memory unit 16.

Further, the storage capacity of the cache memory 13 is smaller than that of the main memory unit 16. FIG. 5 is a diagram for explaining the correspondence between the main memory unit 16 and the cache memory 13 shown in FIG. 4. In the cache memory 13 shown in FIG. 5, the “index” holds the lower n bits of an address of the main memory unit 16.

The “tag” holds the upper three bits of the address of the main memory unit 16. These “index” and “tag” are stored in the tag RAM 14 (see FIG. 4).

On the other hand, “data” holds data stored in the main memory unit 16. This “data” is stored in the data RAM 15.
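The tag/index/data correspondence described above can be illustrated with a minimal sketch. It is not taken from the source: the bit widths `INDEX_BITS` and `TAG_BITS`, the helper `split_address`, and the class `DirectMappedCache` are hypothetical names chosen for illustration, and the sketch assumes a direct-mapped organization in which the index selects the row of the tag RAM 14 and the data RAM 15.

```python
from typing import Any, List, Optional, Tuple

# Assumed widths for illustration; the source only speaks of "lower n bits"
# for the index and of upper bits for the tag.
INDEX_BITS = 4
TAG_BITS = 3


def split_address(addr: int) -> Tuple[int, int]:
    """Split an address into (tag, index): upper bits and lower n bits."""
    index = addr & ((1 << INDEX_BITS) - 1)
    tag = (addr >> INDEX_BITS) & ((1 << TAG_BITS) - 1)
    return tag, index


class DirectMappedCache:
    """Toy model of cache memory 13 (tag RAM 14 plus data RAM 15)."""

    def __init__(self) -> None:
        rows = 1 << INDEX_BITS
        self.tag_ram: List[Optional[int]] = [None] * rows
        self.data_ram: List[Any] = [None] * rows

    def fill(self, addr: int, data: Any) -> None:
        """Place data for addr into the row selected by its index."""
        tag, index = split_address(addr)
        self.tag_ram[index] = tag
        self.data_ram[index] = data

    def lookup(self, addr: int) -> Tuple[bool, Any]:
        """Return (hit, data); a hit means the stored tag equals the address tag."""
        tag, index = split_address(addr)
        if self.tag_ram[index] == tag:
            return True, self.data_ram[index]
        return False, None
```

In this model, two addresses that share the same index but differ in the tag compete for the same row, which is the situation in which the comparison of tags reports a mismatch.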

Referring back to FIG. 4, the cache access control unit 12 has a function of controlling access to the cache memory 13 and the store buffer 19 in response to an access request from the CPU 11. Here, there are two types of access requests: load access for loading (reading) data from the cache memory 13 or the like, and store access for storing (writing) data to the cache memory 13 or the like.

The comparison circuit 17 compares the address from the CPU 11 with the address of the tag RAM 14 under the control of the cache access control unit 12. The refill request unit 18 issues a refill request to the main memory unit 16, requesting the data corresponding to the address from the CPU 11, when the comparison by the comparison circuit 17 results in a mismatch. The store buffer 19 is a buffer for storing an address from the CPU 11, control information (data size, etc.) and data D2 at the time of store access. The bypass enable / disable determining unit 20 determines whether or not the data D2 is stored in the store buffer 19 (whether or not the store bypass described later is possible).

The comparison circuit 21 compares the address from the CPU 11 with the address stored in the store buffer 19 under the control of the cache access control unit 12 at the time of load access. The AND circuit 22 ANDs the output signal of the comparison circuit 21 and the output signal of the bypass enable / disable determining unit 20. The valid signal generation unit 23 generates, on the basis of the output signal of the AND circuit 22 or the output signal of the comparison circuit 17, a valid signal indicating that the data loaded to the CPU 11 is valid.

The merging unit 24 merges data D2 stored in the store buffer 19 and data D1 stored in the data RAM 15. The selection circuit 25 is a circuit for selecting the data RAM 15 when the output signal of the comparison circuit 17 is a “1” signal. The multiplexer 26 outputs either the output of the merging unit 24 or the output of the data RAM 15. The multiplexer 27 selects the output of the main memory unit 16 or the output of the store buffer 19.

Next, the operation of the conventional cache memory device 10 will be described. When the access request from the CPU 11 is a load access for loading (reading) data from the cache memory 13 (or the main memory unit 16), the cache access control unit 12 arbitrates between it and other access requests. Then, when the priority of the access request (load access) becomes the highest, the cache access control unit 12 controls access to the cache memory 13 based on the address from the CPU 11.

As a result, the comparison circuit 17 compares the address from the CPU 11 with the address of the tag RAM 14. If they match (cache hit), the comparison circuit 17 outputs a “1” signal, and the selection circuit 25 selects the data RAM 15 side.

Then, from the data RAM 15, the data D1 corresponding to the above address is loaded into the CPU 11 via the multiplexer 26.

In the case of a cache hit, a valid signal indicating that the data D1 is valid is output from the valid signal generation unit 23 to the CPU 11.

On the other hand, if the address from the CPU 11 does not match the address of the tag RAM 14 (cache miss), that is, if the data does not exist in the data RAM 15, the refill request unit 18 issues a refill request to the main memory unit 16. As a result, the data corresponding to the above address is output from the main memory unit 16 and written to the data RAM 15 via the multiplexer 27. Then, the cache access control unit 12 executes the above access control again, so that the data D1 corresponding to the above address is loaded from the data RAM 15 to the CPU 11 via the merging unit 24 and the multiplexer 26. At the same time, the valid signal is output from the valid signal generation unit 23 to the CPU 11.

When performing a store access to store (write) data in the cache memory 13 (or the main memory unit 16), the CPU 11 outputs an access request (store access) and an address to the cache access control unit 12.

Next, the CPU 11 stores the data D2 to be stored and the control information in the store buffer 19. At this point, the CPU 11 recognizes that the store is completed irrespective of whether the data D2 is actually stored in the data RAM 15. The cache access control unit 12 controls access to the cache memory 13 based on an address from the CPU 11.

As a result, the comparison circuit 17 compares the address from the CPU 11 with the address of the tag RAM 14. If the two match (cache hit), the data D2 stored in the store buffer 19 is stored (written) to the data RAM 15 via the multiplexer 27.

Here, if there is an access request (load access) to the same address as the address stored in the store buffer 19 before the data D2 is stored (written) to the data RAM 15, store bypass processing of loading data from the store buffer 19 to the CPU 11 is executed.

That is, when there is an access request (load access) to the same address from the CPU 11, the cache access control unit 12 performs access control. As a result, the comparison circuit 21 compares the address from the CPU 11 with the address of the store buffer 19, and in this case, outputs a “1” signal because they match. The bypass enable / disable determining unit 20 determines, based on the control information, whether or not the data D2 is stored in the store buffer 19, that is, whether or not store bypass is possible. In this case, store bypass is possible, and a “1” signal is output.

As a result, the “1” signal is output from the AND circuit 22 to the valid signal generating unit 23, and the valid signal is output from the valid signal generating unit 23 to the CPU 11.
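The store bypass decision described above reduces to a small piece of combinational logic, sketched below. The function names are hypothetical, and modeling the valid signal generation unit 23 as a logical OR of its two inputs is an assumption made for illustration; the source only says the valid signal is generated on the basis of one signal or the other.

```python
def store_bypass_hit(load_addr: int, sb_addr: int, sb_holds_data: bool) -> int:
    """Output of AND circuit 22: 1 when the load can be served from the store buffer."""
    addr_match = 1 if load_addr == sb_addr else 0  # comparison circuit 21
    bypass_ok = 1 if sb_holds_data else 0          # bypass enable / disable determining unit 20
    return addr_match & bypass_ok                  # AND circuit 22


def valid_signal(and22_out: int, cache_hit: int) -> int:
    """Valid signal generation unit 23, modeled (assumption) as a logical OR."""
    return and22_out | cache_hit
```

With this model, a load whose address matches the store buffer entry yields a valid signal even when the tag comparison for the cache reports a miss, provided the buffer actually holds the store data.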

Here, if all of the data requested by the access request (load access) is stored in the store buffer 19, the data D2 is loaded to the CPU 11 as load data via the merging unit 24 and the multiplexer 26. On the other hand, if not all of the data requested by the access request (load access) is stored in the store buffer 19, then, as shown in FIG. 6, the data D1 from the data RAM 15 and the data D2 from the store buffer 19 are merged in the merging unit 24, and the result is loaded to the CPU 11 via the multiplexer 26 as data D3 (load data).

As described above, in the conventional cache memory device 10, the result (data D3) of merging the data D1 from the data RAM 15 and the data D2 from the store buffer 19 is loaded to the CPU 11 at the time of store bypass. A processing step for the merge is therefore required on the load path, and there is a problem that the load access time becomes long.
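The conventional load path described above can be sketched as follows. This is an illustrative model only: it assumes that the data are byte strings and that the control information supplies the byte offset at which the store data D2 overlays the cached data D1. The names `merge` and `conventional_bypass_load` are hypothetical.

```python
def merge(d1: bytes, d2: bytes, offset: int) -> bytes:
    """Overlay store data d2 onto cached data d1, starting at byte offset."""
    if offset < 0 or offset + len(d2) > len(d1):
        raise ValueError("store data does not fit inside the cached data")
    return d1[:offset] + d2 + d1[offset + len(d2):]


def conventional_bypass_load(d1: bytes, d2: bytes, offset: int) -> bytes:
    """Conventional store bypass: the merge (merging unit 24) runs during the load."""
    if offset == 0 and len(d2) == len(d1):
        return d2                   # all requested data is in the store buffer
    return merge(d1, d2, offset)    # partial store: D1 and D2 merged into D3
```

In this model the call to `merge` sits on the load path, which corresponds to the extra processing step identified as the problem.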

The present invention has been made in view of the above, and an object of the present invention is to provide a memory control device and a store bypass control method that can reduce the load access time at the time of store bypass.

Disclosure of the invention

In order to achieve the above object, the present invention comprises: store access control means for reading first data corresponding to a specified address from a memory in response to a store access request from a higher-level device; merging means for storing, in a store buffer as third data, a result of merging second data, which is supplied from the higher-level device and is to be stored, with the first data; and load access control means for loading the third data from the store buffer to the higher-level device when a load access request corresponding to the specified address is issued from the higher-level device before the third data is stored in the memory.

Also, the present invention includes: a store access control step of reading first data corresponding to a specified address from a memory in response to a store access request from a higher-level device; a merging step of storing, in a store buffer as third data, a result of merging second data, which is supplied from the higher-level device and is to be stored, with the first data; and a load access control step of loading the third data from the store buffer to the higher-level device when there is a load access request corresponding to the specified address from the higher-level device before the third data is stored in the memory.

According to this invention, the first data corresponding to the specified address is read from the memory in response to the store access request from the higher-level device, and the result of merging the second data, which is supplied from the higher-level device and is to be stored, with the first data is stored in the store buffer as the third data. When a load access request corresponding to the specified address is issued from the higher-level device before the third data is stored in the memory, the third data is loaded from the store buffer to the higher-level device, so that the load access time at the time of store bypass can be shortened.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a diagram showing the configuration of the first embodiment according to the present invention, FIG. 2 is a diagram for explaining the operation of the first embodiment, FIG. 3 is a diagram showing the configuration of the second embodiment according to the present invention, FIG. 4 is a diagram showing the configuration of a conventional cache memory device 10, FIG. 5 is a diagram for explaining the correspondence between the main memory unit 16 and the cache memory 13 shown in FIG. 4, and FIG. 6 is a diagram for explaining the operation of the cache memory device 10 shown in FIG. 4.

BEST MODE FOR CARRYING OUT THE INVENTION

Hereinafter, Embodiments 1 and 2 according to the present invention will be described in detail with reference to the drawings.

(Embodiment 1)

FIG. 1 is a block diagram showing a configuration of a first embodiment according to the present invention. In this figure, parts corresponding to the respective parts in FIG. 4 are denoted by the same reference numerals, and description thereof will be omitted. In the cache memory device 100 shown in FIG. 1, a merge unit 101 is provided instead of the merge unit 24 shown in FIG. 4.

The merging unit 101 is interposed between the CPU 11 and the store buffer 19. At the time of store access, it merges the data D2 from the CPU 11 with the data D1 from the data RAM 15 and outputs the result as data D3. The data D3 is stored in the store buffer 19.

That is, in the first embodiment, merging is not performed at the time of load access as in the conventional case (see FIG. 6). Instead, merging is performed at the time of store access, so that the load access time at the time of store bypass can be reduced.

Next, the operation of the first embodiment will be described. In FIG. 1, when performing a store access to store (write) data to the cache memory 13 (or the main memory unit 16), the CPU 11 outputs an access request (store access) and an address to the cache access control unit 12.

Next, the CPU 11 outputs the data D2 to be stored to the merging unit 101 and stores the control information in the store buffer 19.

The cache access control unit 12 controls access to the cache memory 13 based on an address from the CPU 11.

As a result, the comparison circuit 17 compares the address from the CPU 11 with the address of the tag RAM 14 as shown in FIG. 2 (store access). If the two match (cache hit), the data D1 corresponding to the above address is output from the data RAM 15 to the merge unit 101. As a result, the merging unit 101 stores the result (data D3) obtained by merging the data D2 from the CPU 11 with the data D1 from the data RAM 15 in the store buffer 19. This data D3 corresponds to the data D3 merged by the merging unit 24 shown in FIG. 4.

Here, if there is an access request (load access) to the same address as the address stored in the store buffer 19 before the data D3 is stored (written) to the data RAM 15, store bypass processing of loading data from the store buffer 19 to the CPU 11 is executed.

That is, when there is an access request (load access) to the same address from the CPU 11, the cache access control unit 12 performs access control. As a result, the comparison circuit 21 compares the address from the CPU 11 with the address of the store buffer 19, and in this case, outputs a “1” signal because they match. Further, the bypass enable / disable determining unit 20 determines, based on the control information, whether the data D3 is stored in the store buffer 19, that is, whether or not the store bypass is possible, and outputs a “1” signal.

As a result, the “1” signal is output from the AND circuit 22 to the valid signal generating unit 23, and the valid signal generating unit 23 outputs a valid signal to the CPU 11.

Then, as shown in FIG. 2 (load access), the data D3 corresponding to the address is loaded from the store buffer 19 to the CPU 11 via the multiplexer 26 as load data.
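The store and load sequence of the first embodiment can be modeled with a short sketch. It is a behavioral illustration under stated assumptions, not the circuit itself: data are byte strings, the store buffer 19 holds a single entry, every store hits the cache, and the byte offset stands in for the control information. The names `CacheMemoryDevice100`, `StoreBufferEntry`, `merge` and `drain` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


def merge(d1: bytes, d2: bytes, offset: int) -> bytes:
    """Overlay store data d2 onto cached data d1, starting at byte offset."""
    return d1[:offset] + d2 + d1[offset + len(d2):]


@dataclass
class StoreBufferEntry:
    addr: int
    data: bytes  # already-merged data D3


class CacheMemoryDevice100:
    """Toy model of the first embodiment (single-entry store buffer assumed)."""

    def __init__(self, data_ram: Dict[int, bytes]) -> None:
        self.data_ram = data_ram  # address -> D1; every store is assumed to hit
        self.store_buffer: Optional[StoreBufferEntry] = None

    def store(self, addr: int, d2: bytes, offset: int) -> None:
        d1 = self.data_ram[addr]                          # D1 read at store access
        d3 = merge(d1, d2, offset)                        # merging unit 101
        self.store_buffer = StoreBufferEntry(addr, d3)    # D3 held in store buffer 19

    def load(self, addr: int) -> bytes:
        sb = self.store_buffer
        if sb is not None and sb.addr == addr:
            return sb.data                                # store bypass: no merge here
        return self.data_ram[addr]

    def drain(self) -> None:
        """Write the buffered D3 back to the data RAM."""
        sb = self.store_buffer
        if sb is not None:
            self.data_ram[sb.addr] = sb.data
            self.store_buffer = None
```

A load issued between `store` and `drain` returns the merged data directly from the buffer, so in this model the merge cost is paid once on the store path rather than on each bypassed load.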

As described above, according to the first embodiment, the data D1 corresponding to the address is read from the cache memory 13 in response to the access request (store access) from the CPU 11, and the result of merging the data D2, which is supplied from the CPU 11 and is to be stored, with the data D1 is stored in the store buffer 19 as the data D3. When there is an access request (load access) corresponding to the above address from the CPU 11 before the data D3 is stored in the cache memory 13, the data D3 is loaded from the store buffer 19 to the CPU 11, so that the load access time at the time of store bypass can be reduced.

(Embodiment 2)

In the first embodiment, an example has been described in which the data D1 from the data RAM 15 and the data D2 from the CPU 11 shown in FIG. 1 are merged by the merging unit 101 and the resulting data D3 is stored in the store buffer 19. However, the configuration may be such that the data from the main memory unit 16 and the data D2 are merged in the event of a cache miss. Hereinafter, this configuration example will be described as a second embodiment.

FIG. 3 is a block diagram showing a configuration of a second embodiment according to the present invention. In this figure, parts corresponding to the respective parts in FIG. 1 are denoted by the same reference numerals. In the cache memory device 200 shown in FIG. 3, a multiplexer 201 is provided.

The multiplexer 201 selects the output (data D1) of the data RAM 15 or the output (data D4) of the main memory unit 16 and outputs it to the merge unit 101.

Next, the operation of the second embodiment will be described. In FIG. 3, when performing a store access to store (write) data to the cache memory 13 (or the main memory unit 16), the CPU 11 outputs an access request (store access) and an address to the cache access control unit 12.

As a result, the comparison circuit 17 compares the address from the CPU 11 with the address of the tag RAM 14 as shown in FIG. 2 (store access). If they do not match (a cache miss), the refill request unit 18 issues a refill request to the main memory unit 16, requesting the data corresponding to the address from the CPU 11. As a result, the data D4 corresponding to the above address is output from the main memory unit 16. This data D4 is output to the merge unit 101 via the multiplexer 201.

As a result, the merging unit 101 stores the result (data D3') obtained by merging the data D2 from the CPU 11 and the data D4 from the main memory unit 16 in the store buffer 19.

If there is an access request (load access) to the same address as the address stored in the store buffer 19 before the data D3' is stored (written) to the data RAM 15, store bypass processing of loading the data D3' from the store buffer 19 to the CPU 11 is executed.

That is, when there is an access request (load access) to the same address from the CPU 11, the data D3' is loaded from the store buffer 19 to the CPU 11 via the multiplexer 26 as load data through the above-described operation.

As described above, according to the second embodiment, when the data D1 does not exist in the cache memory 13, the data D4 is read from the main memory unit 16, and the result of merging the data D4 and the data D2 is stored in the store buffer 19 as the data D3', so that even in the event of a cache miss, the load access time during store bypass can be reduced.
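The source selection performed in the second embodiment can be sketched as below. The sketch assumes byte-string data and dictionaries standing in for the data RAM 15 and the main memory unit 16; whether the refill data is also written into the data RAM at this point is not modeled. The names `select_merge_source` and `store_access` are hypothetical.

```python
from typing import Dict, Tuple


def merge(base: bytes, d2: bytes, offset: int) -> bytes:
    """Overlay store data d2 onto base data, starting at byte offset."""
    return base[:offset] + d2 + base[offset + len(d2):]


def select_merge_source(
    data_ram: Dict[int, bytes], main_memory: Dict[int, bytes], addr: int
) -> Tuple[bytes, bool]:
    """Model of multiplexer 201: D1 from the data RAM on a hit, D4 from main memory on a miss."""
    if addr in data_ram:
        return data_ram[addr], True
    return main_memory[addr], False   # data returned for the refill request


def store_access(
    data_ram: Dict[int, bytes],
    main_memory: Dict[int, bytes],
    addr: int,
    d2: bytes,
    offset: int,
) -> Tuple[int, bytes]:
    """Return the (address, merged data) pair that would be placed in store buffer 19."""
    base, _hit = select_merge_source(data_ram, main_memory, addr)
    return addr, merge(base, d2, offset)   # merging unit 101
```

Either way, the store buffer entry already contains fully merged data, so a later load to the same address needs no merge regardless of whether the original store hit or missed.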

The first and second embodiments according to the present invention have been described in detail with reference to the drawings. However, specific configuration examples are not limited to the first and second embodiments. Even a design change or the like within a range not departing from the present invention is included in the present invention.

As described above, according to the present invention, the first data corresponding to the specified address is read from the memory in response to the store access request from the higher-level device, and the result of merging the second data, which is supplied from the higher-level device and is to be stored, with the first data is stored in the store buffer as third data. When a load access request corresponding to the specified address is issued from the higher-level device before the third data is stored in the memory, the third data is loaded from the store buffer to the higher-level device, so that the load access time at the time of store bypass can be reduced.

Further, according to the present invention, if the first data does not exist in the memory, the first data is read from the main memory unit, and the third data obtained by merging the first data and the second data is stored in the store buffer, so that even in the event of a cache miss, the load access time during store bypass can be reduced.

Industrial applicability

As described above, the memory control device and the store bypass control method according to the present invention are useful for accessing a cache memory having a store buffer.

Claims

1. A memory control device comprising: store access control means for reading first data corresponding to a specified address from a memory in response to a store access request from a higher-level device;

merging means for storing, in a store buffer as third data, the result of merging second data, which is supplied from the higher-level device and is to be stored, with the first data; and load access control means for loading the third data from the store buffer to the higher-level device when a load access request corresponding to the specified address is issued from the higher-level device before the third data is stored in the memory.

2. The memory control device according to claim 1, wherein the first data is read from a main memory unit when the first data does not exist in the memory.

3. A store bypass control method comprising: a store access control step of reading first data corresponding to a specified address from a memory in response to a store access request from a higher-level device;

a merging step of storing, in a store buffer as third data, the result of merging second data, which is supplied from the higher-level device and is to be stored, with the first data; and a load access control step of loading the third data from the store buffer to the higher-level device when a load access request corresponding to the specified address is issued from the higher-level device before the third data is stored in the memory.

4. The store bypass control method according to claim 3, wherein in the store access control step, when the first data does not exist in the memory, the first data is read from a main memory unit.