CN110688155A - Merging method for storage instruction accessing non-cacheable area - Google Patents

Publication number
CN110688155A
Authority
CN
China
Prior art keywords
buffer
write request
uncacheable
merge
write
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910859164.2A
Other languages
Chinese (zh)
Inventor
胡向东
王飙
杨剑新
路冬冬
张晓东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Integrated Circuits with Highperformance Center
Original Assignee
Shanghai Integrated Circuits with Highperformance Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Integrated Circuits with Highperformance Center
Priority to CN201910859164.2A
Publication of CN110688155A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30047 Prefetch instructions; cache control instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0875 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G06F12/0897 Caches characterised by their organisation or structure with two or more cache hierarchy levels

Abstract

The invention relates to a method for merging store instructions that access an uncacheable area. A merge buffer is placed behind the store instruction queue; multiple store instructions that access the uncacheable area and whose access addresses fall within the same Cache block are merged, and their write data are merged and stored in a single "uncacheable area write data buffer" entry. The invention reduces the occupation of the relevant request channels and data channels by store instructions accessing the uncacheable area.

Description

Merging method for storage instruction accessing non-cacheable area
Technical Field
The invention relates to the technical field of central processing unit microarchitecture design, and in particular to a method for merging store instructions that access an uncacheable area.
Background
In a modern microprocessor, to mitigate the memory wall caused by the widening gap between memory access speed and processor execution speed, the storage system is generally divided, from top to bottom, into a first-level Cache (L1 Cache), a second-level Cache (L2 Cache), a third-level Cache (L3 Cache), main memory, disk and so on. To exploit program locality and make the most of every Cache level, a general-purpose microprocessor writes the entire Cache block containing the access address into each Cache level when a memory access instruction accesses the storage system. Correspondingly, a memory access instruction probes the Cache levels from top to bottom: if the data block of the current access address is in the first-level Cache, the data is read from it; otherwise the second-level Cache, the third-level Cache and main memory are accessed in turn, and the data block of the access address is loaded into each Cache level.
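The top-to-bottom lookup described above can be sketched as a toy model. This is only an illustration under stated assumptions: the 64-byte block size, the `Hierarchy` class, and the `load_block` method are hypothetical names chosen here, and real Caches have finite capacity and replacement policies that this sketch ignores.

```python
BLOCK_SIZE = 64  # bytes per Cache block (assumed; the source gives no size)


def block_base(addr):
    """Return the address of the first byte of the Cache block containing addr."""
    return addr - addr % BLOCK_SIZE


class Hierarchy:
    """Three unbounded Cache levels in front of a main memory (hypothetical model)."""

    def __init__(self, memory):
        # Index 0 is L1, 1 is L2, 2 is L3; each maps block base -> block data.
        self.levels = [{}, {}, {}]
        self.memory = memory  # dict: block base -> block data

    def load_block(self, addr):
        """Probe L1, L2, L3, then main memory; return (block data, where it hit)."""
        base = block_base(addr)
        for index, level in enumerate(self.levels):
            if base in level:
                data = level[base]
                # Copy the block into every level above the one that hit.
                for upper in self.levels[:index]:
                    upper[base] = data
                return data, "L%d" % (index + 1)
        # Miss in every Cache level: read main memory and fill all levels.
        data = self.memory.get(base, bytes(BLOCK_SIZE))
        for level in self.levels:
            level[base] = data
        return data, "memory"
```

In this model the first access to a block is served by main memory and fills all three levels, so a second access to any address in the same block hits in L1.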
In a general-purpose microprocessor, before a store instruction executes, the entire Cache block containing the access address must be loaded into the first-level data Cache and write permission for that block must be obtained; the data may also be present in lower-level Caches. When it executes, the store instruction writes its data directly into the first-level data Cache; if the first-level data Cache uses a write-through policy, the store instruction writes the data into both the first-level data Cache and the second-level Cache.
A hierarchical storage system pays off only when access behavior has good locality. For a storage area without access locality, writing its data into each Cache level displaces data that may be useful in the future, which "pollutes" the Cache. In such cases, software may mark areas without access locality as uncacheable. When a store instruction accesses such an area, the write request and write data are sent directly to main memory for execution. Alternatively, the microprocessor may provide special memory access instructions that access memory in a non-cached manner, that is, they access main memory directly and the accessed data blocks are not written into any Cache level. In addition, when a store instruction that accesses an IO device executes, the write request and write data must be sent to the corresponding IO device for execution. Hereinafter, uncacheable storage areas, storage areas accessed in a non-cached manner, and the address space of IO devices are collectively referred to as the uncacheable area.
The execution of a store instruction that accesses the uncacheable area can be divided into the following steps:
1) once all older instructions have completed normally, the store instruction accessing the uncacheable area retires from the store instruction queue and its write data is stored in an "uncacheable area write data buffer", which has multiple entries;
2) the store instruction sends a write request for the uncacheable area to the Cache consistency processing unit, carrying the entry number of its write data in the "uncacheable area write data buffer";
3) the Cache consistency processing unit sends a data fetch request to the "uncacheable area write data buffer";
4) the "uncacheable area write data buffer" sends the write data to the Cache consistency processing unit and releases the corresponding entry;
5) the Cache consistency processing unit sends the write request and the write data to main memory or the IO device, where the write operation is performed.
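The five steps above can be traced with a minimal sketch of the unmerged baseline. All class and function names here (`WriteDataBuffer`, `ConsistencyUnit`, `retire_store`) are hypothetical, the four-entry buffer size is an assumption, and a `bytearray` stands in for main memory or an IO device.

```python
class WriteDataBuffer:
    """Toy model of the "uncacheable area write data buffer" (entry count assumed)."""

    def __init__(self, num_entries=4):
        self.entries = [None] * num_entries

    def allocate(self, data):
        # Step 1: a retiring store places its write data in a free entry.
        for number, slot in enumerate(self.entries):
            if slot is None:
                self.entries[number] = bytes(data)
                return number
        raise RuntimeError("no free write data buffer entry")

    def fetch_and_release(self, number):
        # Steps 3 and 4: hand the write data over and free the entry.
        data, self.entries[number] = self.entries[number], None
        return data


class ConsistencyUnit:
    """Stand-in for the Cache consistency processing unit."""

    def __init__(self, write_data_buffer, target):
        self.write_data_buffer = write_data_buffer
        self.target = target          # bytearray standing in for main memory or IO
        self.requests_received = 0

    def write_request(self, addr, entry_number):
        self.requests_received += 1   # step 2: one request per store
        data = self.write_data_buffer.fetch_and_release(entry_number)
        self.target[addr:addr + len(data)] = data   # step 5: perform the write


def retire_store(write_data_buffer, unit, addr, data):
    """Run one uncacheable store through steps 1 to 5 without any merging."""
    entry_number = write_data_buffer.allocate(data)
    unit.write_request(addr, entry_number)
```

Retiring four 8-byte stores to consecutive addresses through `retire_store` sends four separate write requests to the unit, one per store, even though all four fall in one block.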
In step 2), after the Cache consistency processing unit receives a write request to the uncacheable area, if it finds that some Cache level holds the latest data for that address, it must notify that Cache to write the latest data back to main memory.
Processing store instructions that access the uncacheable area in this way has the advantage that the Cache consistency request channel and the data channel used to write Cache data back to main memory can be reused. However, when multiple store instructions write a contiguous address range in the uncacheable area, they execute one after another, occupying the Cache consistency request channel and the write-back data channel many times, and the execution efficiency of the store instructions is low.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method for merging store instructions that access an uncacheable area, which reduces the occupation of the relevant request channel and data channel by such store instructions.
The technical solution adopted by the present invention to solve this problem is as follows: a method for merging store instructions that access an uncacheable area is provided, in which a merge buffer is placed behind the store instruction queue, multiple store instructions that access the uncacheable area and whose access addresses fall within the same Cache block are merged, and the write data of these store instructions are merged and stored in one "uncacheable area write data buffer" entry.
Before the multiple store instructions whose access addresses fall within the same Cache block are merged, a merge determination is made between each store instruction accessing the uncacheable area and the write request in the merge buffer, and the store instruction is processed according to the result.
In the merge determination, if the write request in the merge buffer is invalid, the store instruction is converted into a write request and registered in the merge buffer, a new "uncacheable area write data buffer" entry is allocated to record the write data of the store instruction, and the number of the allocated entry is registered in the merge buffer.
In the merge determination, if the write request in the merge buffer is valid and the store instruction can be merged with it, the two are merged into a new write request, which is registered in the merge buffer, and the write data of the store instruction is merged with the write data in the "uncacheable area write data buffer" entry corresponding to the merge buffer.
In the merge determination, if the write request in the merge buffer is valid but the store instruction cannot be merged with it, the write request in the merge buffer is issued immediately, the store instruction is converted into a write request and registered in the merge buffer, and a new "uncacheable area write data buffer" entry is allocated to record the write data of the store instruction.
While a write request in the merge buffer is waiting to merge with subsequent store instructions accessing the uncacheable area, if a load instruction is executed whose access address falls within the same Cache block as that of the write request, the write request in the merge buffer is issued immediately. This ensures that the write request in the merge buffer is issued first, followed by the read request corresponding to the load instruction accessing that Cache block of the uncacheable area.
Advantageous effects
By adopting the above technical solution, the invention has the following advantages and positive effects compared with the prior art: the method merges multiple store instructions accessing the uncacheable area into one write request before sending it to the Cache consistency processing unit, and the write data of the merged store instructions share a single "uncacheable area write data buffer" entry. This reduces the occupation of the relevant request channel and data channel by store instructions accessing the uncacheable area, and improves both the entry utilization of the "uncacheable area write data buffer" and the execution efficiency of those store instructions.
Drawings
FIG. 1 is a schematic diagram of the location of the merge buffer in the present invention;
FIG. 2 is a flow diagram of converting a store instruction accessing the uncacheable area into a write request and registering it in the merge buffer when the merge buffer is invalid;
FIG. 3 is a flow diagram of merging a store instruction accessing the uncacheable area with the write request in the merge buffer when the merge buffer is valid;
FIG. 4 is a flow diagram of the case in which the merge buffer is valid but the store instruction accessing the uncacheable area cannot be merged with the write request in the merge buffer;
FIG. 5 is a flow diagram of a load instruction accessing the uncacheable area closing the merge buffer;
FIG. 6 is a flow diagram of the execution of a write request accessing the uncacheable area after it is issued from the merge buffer.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The embodiment of the invention relates to a method for merging store instructions that access an uncacheable area: a merge buffer is placed behind the store instruction queue, multiple store instructions that access the uncacheable area and whose access addresses fall within the same Cache block are merged, and the write data of these store instructions are merged and stored in one "uncacheable area write data buffer" entry.
As shown in FIG. 1, in this embodiment a merge buffer is placed behind the store instruction queue. The merge buffer records information about the request currently waiting to be merged, such as its access address and access granularity, together with the entry number of that request's write data in the "uncacheable area write data buffer". After a store instruction accessing the uncacheable area retires from the store instruction queue, a merge determination is made between it and the write request in the merge buffer, and it is processed according to the result.
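One way to picture the information the merge buffer records is the sketch below. The field layout is an assumption: the source only says the buffer holds the access address, the access granularity, and the write data buffer entry number, so the byte mask, the 64-byte block size, and the names `MergeBufferRecord` and `encode_store` are hypothetical.

```python
from dataclasses import dataclass

BLOCK_SIZE = 64  # assumed Cache block size in bytes


@dataclass
class MergeBufferRecord:
    valid: bool = False   # whether a write request is waiting to be merged
    block_addr: int = 0   # base address of the Cache block being written
    byte_mask: int = 0    # bit i set means byte i of the block will be written
    wdb_entry: int = -1   # entry number in the "uncacheable area write data buffer"


def encode_store(addr, size):
    """Turn an access address and granularity into (block base, byte mask)."""
    offset = addr % BLOCK_SIZE
    if offset + size > BLOCK_SIZE:
        raise ValueError("store crosses a Cache block boundary")
    mask = ((1 << size) - 1) << offset
    return addr - offset, mask
```

A 4-byte store to address 0x1008, for example, encodes to block base 0x1000 with mask bits 8 through 11 set; a mask like this lets one record describe several merged stores of different sizes.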
In the merge determination, if the write request in the merge buffer is invalid, the store instruction is converted into a write request and registered in the merge buffer, a new "uncacheable area write data buffer" entry is allocated to record the write data of the store instruction, and the number of the allocated entry is registered in the merge buffer. As shown in FIG. 2, after the store instruction accessing the uncacheable area retires from the store instruction queue, an address merge determination is made between it and the write request in the merge buffer. Because the request in the merge buffer is invalid at this point, the store instruction registers its write address and write granularity in the merge buffer, allocates a new "uncacheable area write data buffer" entry, and writes its write data into that entry.
In the merge determination, if the write request in the merge buffer is valid and the store instruction can be merged with it, the two are merged into a new write request, which is registered in the merge buffer, and the write data of the store instruction is merged with the write data in the "uncacheable area write data buffer" entry corresponding to the merge buffer. As shown in FIG. 3, after the store instruction accessing the uncacheable area retires from the store instruction queue, an address merge determination is made between it and the write request in the merge buffer. The request in the merge buffer is valid at this point and the store instruction can be merged with it, so the store instruction merges its write address and write granularity into the merge buffer and merges its write data with the data in the corresponding "uncacheable area write data buffer" entry.
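The data side of a merge can be illustrated as follows. This is a sketch under assumptions, not the patented circuit: it treats a write data buffer entry as a block-sized `bytearray`, tracks written bytes with a mask, and lets a younger store overwrite an older one where they overlap. The function name `merge_store` and the 64-byte block size are hypothetical.

```python
BLOCK_SIZE = 64  # assumed Cache block size in bytes


def merge_store(entry_data, byte_mask, addr, data):
    """Merge one store's bytes into a block-sized entry; return the updated mask.

    entry_data is a bytearray of BLOCK_SIZE bytes holding the merged write data.
    """
    offset = addr % BLOCK_SIZE
    if offset + len(data) > BLOCK_SIZE:
        raise ValueError("store crosses a Cache block boundary")
    # The younger store overwrites any bytes an older store already placed here.
    entry_data[offset:offset + len(data)] = data
    return byte_mask | (((1 << len(data)) - 1) << offset)
```

Merging an 8-byte store at offset 0 and then an 8-byte store at offset 4 leaves bytes 0 to 3 from the first store, bytes 4 to 11 from the second, and a mask covering bytes 0 to 11.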
In the merge determination, if the write request in the merge buffer is valid but the store instruction cannot be merged with it, the write request in the merge buffer is issued immediately, the store instruction is converted into a write request and registered in the merge buffer, and a new "uncacheable area write data buffer" entry is allocated to record the write data of the store instruction. As shown in FIG. 4, after the store instruction accessing the uncacheable area retires from the store instruction queue, an address correlation determination is made between it and the write request in the merge buffer. The request in the merge buffer is valid at this point, but the store instruction cannot be merged with it; therefore the request in the merge buffer is sent immediately to the Cache consistency processing unit, and the store instruction registers its write address and write granularity in the merge buffer, allocates a new "uncacheable area write data buffer" entry, and writes its write data into that entry.
Thus, in this embodiment, whether the access addresses fall within the same Cache block is the basic criterion for deciding whether store instructions accessing the uncacheable area can be merged. If the merge buffer is valid and the access address of the store instruction accessing the uncacheable area is in the same Cache block as the access address of the write request in the merge buffer, the store instruction is merged with that write request. Otherwise, the write request in the merge buffer is sent to the Cache consistency processing unit, the store instruction accessing the uncacheable area is registered in the merge buffer, a new "uncacheable area write data buffer" entry is allocated, and the write data is written into the allocated entry.
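The three outcomes of the merge determination (invalid buffer, mergeable store, non-mergeable store) can be put together in one small model. The sketch below tracks addresses only and assumes a 64-byte Cache block; the `MergeBuffer` class, its method names, and the explicit `flush` call used to drain the last request are hypothetical, since the source does not say when an idle merge buffer is otherwise emptied.

```python
BLOCK_SIZE = 64  # assumed Cache block size in bytes


class MergeBuffer:
    """Single-entry merge buffer placed behind the store instruction queue."""

    def __init__(self):
        self.valid = False
        self.block_addr = 0
        self.stores_merged = 0
        # Write requests sent to the Cache consistency processing unit,
        # recorded as (block address, number of stores merged into the request).
        self.issued = []

    def retire_store(self, addr):
        """Apply the merge determination to one retiring uncacheable store."""
        block = addr - addr % BLOCK_SIZE
        if not self.valid:
            # Case of FIG. 2: nothing pending, register the store as a write request.
            self._register(block)
            return "registered"
        if block == self.block_addr:
            # Case of FIG. 3: same Cache block, merge into the pending request.
            self.stores_merged += 1
            return "merged"
        # Case of FIG. 4: different block, issue the pending request first.
        self.flush()
        self._register(block)
        return "flushed_then_registered"

    def _register(self, block):
        self.valid = True
        self.block_addr = block
        self.stores_merged = 1

    def flush(self):
        """Issue the pending write request, if any (hypothetical explicit drain)."""
        if self.valid:
            self.issued.append((self.block_addr, self.stores_merged))
            self.valid = False
```

Feeding this model eight 8-byte stores at 0x3000 through 0x3038 followed by one store at 0x3040 produces a single issued request for block 0x3000 covering all eight stores, where the unmerged flow would have sent eight.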
In an out-of-order microprocessor, load instructions are issued and executed out of order; store instructions may be issued out of order, but must be executed in order. Therefore, if the access address of a load instruction being executed overlaps that of the write request in the merge buffer, the one or more store instructions corresponding to that write request must, according to program order, be executed first. The merge buffer is therefore closed immediately: it stops waiting for subsequent store instructions and sends the write request to the Cache consistency processing unit at once, which ensures that the Cache consistency processing unit first receives the write request corresponding to the store instructions that come earlier in program order. As shown in FIG. 5, while the write request in the merge buffer is waiting for the merge determination with subsequent store instructions, a load instruction accessing the uncacheable area arrives whose access address overlaps that of the write request in the merge buffer. The write request in the merge buffer is therefore sent immediately to the Cache consistency processing unit, so that the write request is sent first and the load instruction accessing the uncacheable area is sent afterwards.
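The ordering rule for loads can be sketched as below. This is a simplified illustration: the merge buffer is a plain dictionary, the 64-byte block size is assumed, overlap is tested at Cache block granularity as in claim 6, and the names `same_block` and `issue_uncacheable_load` are hypothetical.

```python
BLOCK_SIZE = 64  # assumed Cache block size in bytes


def same_block(addr_a, addr_b):
    """True when both addresses fall within the same Cache block."""
    return addr_a // BLOCK_SIZE == addr_b // BLOCK_SIZE


def issue_uncacheable_load(merge_buffer, load_addr, sent):
    """Append requests to `sent` in the order the consistency unit receives them.

    merge_buffer is a dict with keys "valid" and "addr" (hypothetical layout).
    """
    if merge_buffer["valid"] and same_block(merge_buffer["addr"], load_addr):
        # Close the merge buffer: the pending write must go out before the read.
        sent.append(("write", merge_buffer["addr"]))
        merge_buffer["valid"] = False
    sent.append(("read", load_addr))
```

A load that hits the pending block forces the write out ahead of the read; a load to any other block leaves the merge buffer open so later stores can still merge.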
The processing flow after the write request in the merge buffer and the corresponding "uncacheable area write data buffer" entry number are sent to the Cache consistency processing unit is shown in FIG. 6. The Cache consistency processing unit issues a data fetch request for the corresponding entry of the "uncacheable area write data buffer"; on receiving the data fetch request, the "uncacheable area write data buffer" sends the write data to the Cache consistency processing unit and releases the entry; on receiving the write data, the Cache consistency processing unit sends it to the uncacheable area together with information such as the write address and write granularity, and the actual write operation is completed in the uncacheable area. Because multiple store instructions are merged into one write request and executed together, the total time required to execute them is effectively shortened, and the performance of the processor on the relevant program segments is improved.
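The final write of a merged request can be sketched as follows, assuming the write address and write granularity are carried as a block base plus a byte mask. That encoding, the 64-byte block size, and the function name `complete_write` are assumptions made for illustration; the source does not specify how the merged granularity is represented.

```python
BLOCK_SIZE = 64  # assumed Cache block size in bytes


def complete_write(target, block_addr, byte_mask, entry_data):
    """Write only the masked bytes of a merged entry; return how many were written.

    target is a bytearray standing in for main memory or an IO device.
    """
    written = 0
    for i in range(BLOCK_SIZE):
        if (byte_mask >> i) & 1:
            target[block_addr + i] = entry_data[i]
            written += 1
    return written
```

Bytes of the block that no merged store touched keep their previous contents in the destination, which is why the request has to carry more than just the block address.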
It is apparent that the invention merges multiple store instructions accessing the uncacheable area into one write request before sending it to the Cache consistency processing unit, and that the write data of the merged store instructions share a single "uncacheable area write data buffer" entry. The method therefore not only reduces the occupation of the relevant request channel and data channel by store instructions accessing the uncacheable area, but also improves the entry utilization of the "uncacheable area write data buffer" and the execution efficiency of store instructions accessing the uncacheable area.

Claims (6)

1. A method for merging store instructions that access an uncacheable area, characterized in that a merge buffer is placed behind a store instruction queue, multiple store instructions that access the uncacheable area and whose access addresses fall within the same Cache block are merged, and the write data of the multiple store instructions are merged and stored in one "uncacheable area write data buffer" entry.
2. The method for merging store instructions that access an uncacheable area according to claim 1, characterized in that, before the multiple store instructions that access the uncacheable area and whose access addresses fall within the same Cache block are merged, the method further comprises making a merge determination between the store instruction accessing the uncacheable area and the write request in the merge buffer, and performing corresponding processing according to the result.
3. The method for merging store instructions that access an uncacheable area according to claim 2, characterized in that, in the merge determination between the store instruction accessing the uncacheable area and the write request in the merge buffer, if the write request in the merge buffer is invalid, the store instruction is converted into a write request and registered in the merge buffer, a new "uncacheable area write data buffer" entry is allocated to record the write data of the store instruction, and the number of the allocated "uncacheable area write data buffer" entry is registered in the merge buffer.
4. The method for merging store instructions that access an uncacheable area according to claim 2, characterized in that, in the merge determination between the store instruction accessing the uncacheable area and the write request in the merge buffer, if the write request in the merge buffer is valid and the store instruction can be merged with it, the store instruction and the write request in the merge buffer are merged into a new write request, which is registered in the merge buffer, and the write data of the store instruction is merged with the write data in the "uncacheable area write data buffer" entry corresponding to the merge buffer.
5. The method for merging store instructions that access an uncacheable area according to claim 2, characterized in that, in the merge determination between the store instruction accessing the uncacheable area and the write request in the merge buffer, if the write request in the merge buffer is valid but the store instruction cannot be merged with it, the write request in the merge buffer is issued immediately, the store instruction is converted into a write request and registered in the merge buffer, and a new "uncacheable area write data buffer" entry is allocated to record the write data of the store instruction.
6. The method for merging store instructions that access an uncacheable area according to claim 1, characterized in that, while the write request in the merge buffer is waiting to merge with subsequent store instructions accessing the uncacheable area, if a load instruction is executed whose access address falls within the same Cache block as the access address of the write request in the merge buffer, the write request in the merge buffer is issued immediately, so that the write request in the merge buffer is issued first and the read request corresponding to the load instruction accessing the uncacheable area is issued afterwards.
CN201910859164.2A 2019-09-11 2019-09-11 Merging method for storage instruction accessing non-cacheable area Pending CN110688155A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910859164.2A CN110688155A (en) 2019-09-11 2019-09-11 Merging method for storage instruction accessing non-cacheable area

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910859164.2A CN110688155A (en) 2019-09-11 2019-09-11 Merging method for storage instruction accessing non-cacheable area

Publications (1)

Publication Number Publication Date
CN110688155A true CN110688155A (en) 2020-01-14

Family

ID=69109035

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910859164.2A Pending CN110688155A (en) 2019-09-11 2019-09-11 Merging method for storage instruction accessing non-cacheable area

Country Status (1)

Country Link
CN (1) CN110688155A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140089599A1 (en) * 2012-09-21 2014-03-27 Fujitsu Limited Processor and control method of processor
CN107066393A (en) * 2017-01-12 2017-08-18 安徽大学 The method for improving map information density in address mapping table

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
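Mei Kuizhi (梅魁志) et al.: "Design and verification of write merging for a write-through Cache", Journal of Xi'an Jiaotong University (《西安交通大学学报》) *
Chen Yushou (陈宇收): "Research on distributed data storage based on Mycat", China New Telecommunications (《中国新通信》) *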

Similar Documents

Publication Publication Date Title
US10019369B2 (en) Apparatuses and methods for pre-fetching and write-back for a segmented cache memory
JP3323212B2 (en) Data prefetching method and apparatus
JP7340326B2 (en) Perform maintenance operations
US6782454B1 (en) System and method for pre-fetching for pointer linked data structures
US9418011B2 (en) Region based technique for accurately predicting memory accesses
US20110173400A1 (en) Buffer memory device, memory system, and data transfer method
US6748496B1 (en) Method and apparatus for providing cacheable data to a peripheral device
CN110297787B (en) Method, device and equipment for accessing memory by I/O equipment
US20110167223A1 (en) Buffer memory device, memory system, and data reading method
US7702875B1 (en) System and method for memory compression
US10831673B2 (en) Memory address translation
US20060179173A1 (en) Method and system for cache utilization by prefetching for multiple DMA reads
US10853262B2 (en) Memory address translation using stored key entries
CN117609110A (en) Caching method, cache, electronic device and readable storage medium
US6801982B2 (en) Read prediction algorithm to provide low latency reads with SDRAM cache
JP5699854B2 (en) Storage control system and method, replacement method and method
US10713165B2 (en) Adaptive computer cache architecture
CN111124954A (en) Management device and method for two-stage conversion bypass buffering
CN109478163B (en) System and method for identifying a pending memory access request at a cache entry
CN110688155A (en) Merging method for storage instruction accessing non-cacheable area
US8214597B2 (en) Cache tentative read buffer
JP7311959B2 (en) Data storage for multiple data types
US11836085B2 (en) Cache line coherence state upgrade
CN111857601B (en) Solid-state disk cache management method based on garbage collection and channel parallelism
US20240111425A1 (en) Tag and data configuration for fine-grained cache memory

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200114