CN117573573B - Processing method, device, equipment and storage medium for cache request


Info

Publication number: CN117573573B
Authority: CN (China)
Prior art keywords: data, request, target, cache, target request
Legal status: Active (granted)
Application number: CN202410057628.9A
Other languages: Chinese (zh)
Other versions: CN117573573A
Inventors: 陈熙, 张林隽, 王凯帆, 蔡洛姗, 陈键, 唐丹, 包云岗
Current Assignee: Beijing Open Source Chip Research Institute
Original Assignee: Beijing Open Source Chip Research Institute
Application filed by Beijing Open Source Chip Research Institute; priority to CN202410057628.9A
Publication of CN117573573A; application granted; publication of CN117573573B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0844 Multiple simultaneous or quasi-simultaneous cache accessing
    • G06F12/0853 Cache with multiport tag or data arrays
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application provides a processing method, an apparatus, an electronic device and a computer-readable storage medium for cache requests, including the following steps: selecting a target request from all requests according to the types of the requests obtained by the second-level cache, and entering the target request into the pipeline queue of the second-level cache from a specific data bit of the pipeline queue for execution; when the pipeline queue executes the target request successfully, returning the response generated by executing the target request; and when the pipeline queue does not execute the target request successfully, allocating a corresponding miss status register for the target request and executing the target request through the miss status register. Because the application allocates a miss status register only when execution of the target request fails, it reduces the allocation consumption of miss-status-register resources and the delay caused by their allocation, while still satisfying the requirement that unsuccessfully executed target requests be re-executed.

Description

Processing method, device, equipment and storage medium for cache request
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and apparatus for processing a cache request, an electronic device, and a computer readable storage medium.
Background
The cache is an important component of modern high-performance processors. By placing a cache between the processor and the memory to store data that the processor accesses frequently, the processor's access speed is improved.
Currently, modern processors are generally provided with three levels of cache: the processor's memory access instructions access cached data by accessing the first-level cache L1, the second-level cache L2 and the third-level cache L3 in turn. The second-level cache L2 typically serves a single core and must provide both low access latency and high throughput; a good L2 design can significantly improve processor performance. Specifically, for each incoming request, the second-level cache L2 allocates a Miss Status Handling Register (MSHR), a register that records each outstanding request; the MSHR helps an outstanding request to be executed again successfully.
However, in the above procedure, a corresponding MSHR is allocated for every request, which results in excessive consumption of MSHR resources and increased delay.
Disclosure of Invention
The embodiment of the application provides a processing method and device of a cache request, electronic equipment and a computer readable storage medium, which are used for solving the problems in the related art.
In a first aspect, an embodiment of the present application provides a method for processing a cache request, where the method includes:
Selecting a target request from all requests according to the types of the requests obtained by the secondary cache, and entering the target request into the pipeline queue from a specific data bit of the pipeline queue of the secondary cache for execution; the pipeline queue comprises a plurality of sequentially arranged data bits;
When the pipeline queue successfully executes the target request, returning a response generated by executing the target request;
And when the pipeline queue does not successfully execute the target request, allocating a corresponding missing state register for the target request, and executing the target request through the missing state register.
In a second aspect, an embodiment of the present application provides a processing apparatus for a cache request, where the apparatus includes:
The enqueuing module is used for selecting a target request from all requests according to the types of the requests obtained by the secondary cache, and entering the target request into the pipeline queue from a specific data bit of the pipeline queue of the secondary cache for execution; the pipeline queue comprises a plurality of sequentially arranged data bits;
The return module is used for returning a response generated by executing the target request when the pipeline queue successfully executes the target request;
And the execution module is used for distributing a corresponding missing state register for the target request when the pipeline queue does not successfully execute the target request, and executing the target request through the missing state register.
In a third aspect, an embodiment of the present application further provides an electronic device, including a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of the first aspect.
In a fourth aspect, embodiments of the present application further provide a computer-readable storage medium storing instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the method of the first aspect.
In the embodiment of the present application, a target request can be selected from all requests according to the types of the requests obtained by the second-level cache and entered into the pipeline queue for execution, and when the pipeline queue does not execute the target request successfully, a corresponding miss status register is allocated to the target request for processing. In the present application, no corresponding miss status register is allocated when the target request enters the pipeline queue of the second-level cache; one is allocated only when the target request fails to execute. On the basis of satisfying the requirement that unsuccessfully executed target requests be re-executed, this reduces the allocation consumption of miss-status-register resources and the delay caused by their allocation.
The foregoing is only an overview of the technical solutions of the present application. To make the technical means of the present application clearer so that they can be implemented in accordance with the content of the specification, and to make the above and other objects, features and advantages of the present application more readily apparent, specific embodiments are set forth below.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for describing the embodiments are briefly introduced below. It is evident that the drawings described below show only some embodiments of the present application, and that other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram of an implementation scenario provided in an embodiment of the present application;
FIG. 2 is a flowchart of a method for processing a cache request according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating steps of a method for processing a cache request according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a second level cache architecture according to an embodiment of the present application;
FIG. 5 is a block diagram of a processing device for a cache request according to an embodiment of the present application;
FIG. 6 is a block diagram of an electronic device provided by an embodiment of the present application;
FIG. 7 is a block diagram of another electronic device provided by another embodiment of the present application.
Detailed Description
The following describes the embodiments of the present application clearly and completely with reference to the accompanying drawings. It is evident that the described embodiments are some, but not all, of the embodiments of the application. All other embodiments obtained by a person skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.
The terms "first", "second" and the like in the specification and claims are used to distinguish between similar objects and do not necessarily describe a particular order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described herein. Objects identified by "first", "second", etc. are generally of one type, and the number of objects is not limited; for example, the first object may be one or more. Furthermore, "and/or" in the specification and claims describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "Plurality" in the embodiments of the present application means two or more, and other quantifiers are similar.
Referring to fig. 1, fig. 1 is a schematic diagram of an implementation scenario provided by an embodiment of the present application. To improve execution efficiency and reduce interaction between the processor and the memory, a multi-level cache architecture may be integrated on the processor. A common architecture is the three-level cache structure of fig. 1, which includes a first-level cache L1, a second-level cache L2 and a third-level cache L3. The first-level cache L1 is closest to the processor, with the smallest capacity and the fastest speed. The second-level cache L2 has a larger capacity but is slower than L1; it serves as a cache for the first-level cache L1 and stores data that the processor needs during processing but that L1 cannot hold. The third-level cache L3 has the largest capacity and the slowest speed (before the memory), and can be regarded as a cache for the second-level cache L2.
When the processor operates, it first searches the first-level cache L1 for the required data according to the memory access instruction; if not found, it then searches the second-level cache L2, then the third-level cache L3, and if the third-level cache does not contain the required data either, the processor fetches it from memory. The longer the search path, the longer the time taken; therefore, if some data is accessed very frequently, keeping it in the first-level cache L1 ensures very fast access. A memory access instruction is an instruction that reads data from, or stores data to, a specified memory address.
Preferably, the second-level cache L2 may employ a 5-stage pipeline architecture, in which the pipeline queue includes five sequentially arranged data bits S1-S5. Each data bit corresponds to one pipeline time, and different data bits correspond to different pipeline times. A request enters the pipeline from the start data bit of the pipeline queue and advances by one data bit as time passes. According to the type of the request, the state of the data block accessed by the request, and the data bit in which the request is located (one data bit represents one pipeline time), the operations executed at each data bit are designed so that, when the request is at a specific target data bit in the pipeline queue, the operation corresponding to the request's type and that target data bit is executed.
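As a rough behavioral illustration of this 5-stage organization, the sketch below models one request entering at the start data bit and advancing one data bit per cycle through S1 to S5. The class and field names are assumptions for illustration, not the patent's implementation:

```python
from dataclasses import dataclass

STAGES = ["S1", "S2", "S3", "S4", "S5"]  # the five sequentially arranged data bits

@dataclass
class Request:
    req_type: str    # e.g. "acquire", "release", "probe", "mshr"
    address: int
    stage: int = -1  # index into STAGES; -1 means not yet in the pipeline

class PipelineQueue:
    """Minimal model: at most one request enters per cycle, at start bit S1."""
    def __init__(self):
        self.slots = [None] * len(STAGES)

    def tick(self, incoming=None):
        retired = self.slots[-1]              # request leaving S5 this cycle
        self.slots = [incoming] + self.slots[:-1]
        if incoming is not None:
            incoming.stage = 0
        for req in self.slots[1:]:            # everyone else advances one data bit
            if req is not None:
                req.stage += 1
        return retired

pipe = PipelineQueue()
pipe.tick(Request("acquire", 0x80001000))
done = None
while done is None:                           # five cycles later the request retires
    done = pipe.tick()
print(done.req_type, STAGES[done.stage])      # -> acquire S5
```

The single `tick(incoming=...)` entry point mirrors the constraint, used throughout the embodiment, that at most one request enters the pipeline queue at any one time.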
Fig. 2 is a flowchart of a method for processing a cache request according to an embodiment of the present application, where, as shown in fig. 2, the method may include:
Step 101, selecting a target request from all requests according to the type of the request obtained by the secondary cache, and entering the target request from a specific data bit of a pipeline queue of the secondary cache into the pipeline queue for execution; the pipeline queue includes a plurality of data bits arranged in sequence.
In the embodiment of the present application, the second-level cache may adopt a multi-stage pipeline architecture, i.e., a pipeline queue including a plurality of sequentially arranged data bits, where each data bit corresponds to one pipeline time and different data bits correspond to different pipeline times; the pipeline queue receives and sends instructions and maintains their state in time order. Preferably, the specific data bit may be the start data bit, in which case an instruction enters the pipeline for execution from the start data bit of the pipeline queue. Of course, the specific data bit may also be a data bit other than the start data bit, which is not limited in the present application.
The secondary cache may handle many requests in operation to implement functions related to data access. The requests processed by the secondary cache have multiple types, and when the requests of different types are executed, corresponding operations can be performed at different moments. Based on this feature, embodiments of the present application may design operations performed on data bits in a pipeline queue according to the type of request, the directory state of the data block being accessed by the request, the data bit in which the request is located (one data bit is used to characterize one pipeline time), so that operations corresponding to the type of request and the target data bit may be performed when the request is in a particular target data bit in the pipeline queue.
For example, the types of requests obtained by the second-level cache include: the channel task types issued by the processor, the miss status register type generated by a miss status register (a register used to process requests that failed to execute), and so on. The channel task types further include the data acquisition type (for reading data in the second-level cache), the data release type (for releasing data to the second-level cache), the data probe type (for invalidating data in the second-level cache), and so on.
Specifically, the second-level cache may obtain multiple requests at the same time, but its pipeline design requires that only one request enter the pipeline queue at a time. Therefore, when the second-level cache obtains multiple requests simultaneously, a target request is selected from all the requests to enter the pipeline queue from a specific data bit (preferably data bit S1 in fig. 1), and the second-level cache begins executing the target request. In one implementation, the target request is selected from all requests based on the types of the requests: the embodiment of the present application assigns a priority to each request type based on its characteristics, and the request whose type has the highest priority is selected as the target request.
And 102, returning a response generated by executing the target request when the pipeline queue successfully executes the target request.
In the embodiment of the present application, different types of target requests generate different responses when executed: after a data-acquisition-type target request executes successfully, the generated response includes the read data; after a data-release-type target request executes successfully, the generated response includes a notification message that the data has been released. When the pipeline queue executes the target request successfully, the embodiment of the present application returns the response generated by executing the target request, i.e., feeds the response back to the initiator of the request.
Step 103, when the pipeline queue does not successfully execute the target request, allocating a corresponding missing state register for the target request, and executing the target request through the missing state register.
In the embodiment of the present application, for a target request that is not executed successfully in the second-level cache, a corresponding miss status register (MSHR) can be allocated; the target request is placed into the miss status register to wait and is then executed through the miss status register. Specifically, the miss status register is a register that records each outstanding transaction; the recorded information includes the miss address, information about the instruction that has not finished executing, the directory state, and so on, and the register issues to the upper- and lower-level caches the sub-requests needed to complete execution of the request. Once the second-level cache subsequently satisfies the conditions for executing the target request, the target request leaves the miss status register and re-enters the pipeline queue of the second-level cache for re-execution.
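A minimal sketch of the bookkeeping such a register might hold, under the assumption of a small fixed pool; the patent names the miss address, the pending request, directory information, and sub-requests to neighboring caches, while the field and pool names here are illustrative:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class MissStatusRegister:
    """Records one outstanding (not-yet-completed) request."""
    miss_address: Optional[int] = None        # address that failed in the pipeline
    pending_request: Optional[object] = None  # the request awaiting re-execution
    directory_state: Optional[str] = None     # directory result captured at miss time
    sub_requests: List[str] = field(default_factory=list)  # to upper/lower caches
    valid: bool = False

    def allocate(self, request, address, directory_state):
        self.miss_address = address
        self.pending_request = request
        self.directory_state = directory_state
        self.valid = True

    def release(self):
        """Called when the request re-enters the pipeline and completes."""
        self.valid = False
        self.sub_requests.clear()

# The L2 holds a small fixed pool of these; allocation succeeds only if one is free.
mshr_pool = [MissStatusRegister() for _ in range(8)]  # pool size assumed
free = next((m for m in mshr_pool if not m.valid), None)
```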
In the cache architecture, the number of miss status registers is limited. If a corresponding miss status register were allocated for every request entering the second-level cache pipeline queue, miss-status-register resources would run out when the number of requests is large. Moreover, when requests are numerous and miss-status-register resources are exhausted, new requests are blocked from entering, limiting the maximum number of requests the second-level cache can process in parallel. In addition, under this allocation scheme, target requests that the pipeline queue can execute successfully never actually enter their allocated miss status registers for processing, so those registers are wasted.
In the embodiment of the present application, when the target request enters the pipeline queue of the second-level cache, no corresponding miss status register is allocated for it; a corresponding miss status register is allocated only when the target request fails to execute successfully.
In summary, in the embodiment of the present application, a target request can be selected from all requests according to the types of the requests obtained by the second-level cache and entered into the pipeline queue for execution; when the pipeline queue does not execute the target request successfully, a corresponding miss status register is allocated to the target request for processing. In the present application, no miss status register is allocated when the target request enters the pipeline queue of the second-level cache; one is allocated only when the target request fails to execute. On the basis of satisfying the requirement that unsuccessfully executed target requests be re-executed, this reduces the allocation consumption of miss-status-register resources and the delay caused by their allocation.
Fig. 3 is a flowchart of specific steps of a method for processing a cache request according to an embodiment of the present application, where, as shown in fig. 3, the method may include:
step 201, selecting a target request from all the requests according to the types of the requests obtained by the secondary cache.
The step may refer to step 101, and will not be described herein.
Optionally, step 201 may specifically include sub-steps 2011-2014:
Sub-step 2011: when a request of the miss status register type exists among all the requests, taking the request of the miss status register type as the target request.
Sub-step 2012: when no request of the miss status register type exists among all the requests but a request of the data release type exists, taking the request of the data release type as the target request.
Sub-step 2013: when no request of the miss status register type or the data release type exists among all the requests but a request of the data probe type exists, taking the request of the data probe type as the target request.
Sub-step 2014: when no request of the miss status register type, the data release type or the data probe type exists among all the requests but a request of the data acquisition type exists, taking the request of the data acquisition type as the target request.
In the embodiment of the present application, for sub-steps 2011-2014: the requests processed by the second-level cache are of multiple types, and requests of different types perform their corresponding operations at different times. The second-level cache may obtain multiple requests at the same time, but its pipeline design allows only one request to enter the pipeline queue at a time. Therefore, when the second-level cache obtains multiple requests simultaneously, a target request is selected from all the requests to enter the pipeline queue from the specific data bit, and the second-level cache begins executing it.
Specifically, the embodiment of the present application selects the target request from all requests based on the priority of the request types: each type is assigned a priority according to its characteristics, and the request whose type has the highest priority is selected as the target request. The types of requests obtained by the second-level cache include the channel task types issued by the processor and the miss status register type generated by miss status registers, where the channel task types further include the data acquisition type, the data release type, the data probe type, and so on. The embodiment of the present application sets the priority policy as: miss status register type > data release type > data probe type > data acquisition type. This prioritization satisfies the constraint that only one target request enters the pipeline queue of the second-level cache at a time. Under this ordering, the second-level cache processes miss-status-register-type target requests first, ensuring that unsuccessfully executed requests waiting in miss status registers are re-executed promptly and do not wait too long. Data release requests take priority over data probe requests, which take priority over data acquisition requests; this division improves the stability and efficiency with which the pipeline queue executes requests while ensuring that requests enter the pipeline queue smoothly.
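The priority policy above amounts to a one-line arbiter; a sketch, with type tags assumed for illustration:

```python
from collections import namedtuple

Request = namedtuple("Request", ["req_type", "address"])

# Priority policy from the embodiment: MSHR > release > probe > acquire.
PRIORITY = {"mshr": 0, "release": 1, "probe": 2, "acquire": 3}  # lower = higher

def select_target_request(requests):
    """Pick the single request allowed into the pipeline queue this cycle."""
    if not requests:
        return None
    return min(requests, key=lambda r: PRIORITY[r.req_type])

reqs = [Request("acquire", 0x100), Request("probe", 0x200), Request("mshr", 0x300)]
assert select_target_request(reqs).req_type == "mshr"  # MSHR retry wins
```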
Step 202, entering the target request from the initial data bit into the pipeline queue.
Wherein the specific data bit is a start data bit.
In the embodiment of the present application, referring to fig. 1, a target request newly entering the pipeline queue enters from the start data bit (data bit S1). After entering the pipeline queue, the target request does not stay at any data bit; it advances by one data bit at each moment.
Step 203, according to the type of the target request, executing an operation corresponding to the type and the target data bit when the target request is in the target data bit of the pipeline queue.
Based on this characteristic, the embodiment of the present application designs the operations executed at each data bit according to the type of the request, the directory state of the data block accessed by the request, and the data bit in which the request is located (one data bit represents one pipeline time), so that when the request is at a specific target data bit in the pipeline queue, the operation corresponding to the request's type and that target data bit is executed.
For example, for a data-acquisition-type target request: at target data bit S1, the operation of reading the cache directory of the second-level cache is performed; at target data bit S3, the directory result is obtained, whether the target request hits in the cache directory is judged, and on a hit the data read begins; on a hit, at target data bit S5, the data is obtained and fed back to the upper level.
For a data-release-type target request: at target data bit S1, the cache directory of the second-level cache is read; at target data bit S3, the directory result is obtained, the data released by the upper level is stored into the second-level cache, and a release-completion message is fed back to the upper level.
For a data-probe-type target request: at target data bit S1, the cache directory of the second-level cache is read; at target data bit S3, the directory result is obtained and whether the target request hits in the cache directory is judged; on a miss, a response is sent to the third-level cache at target data bit S3; on a hit (the target data exists only in the second-level cache), reading of the target data begins at target data bit S3, and at target data bit S5 the target data is obtained, invalidated, and a response is sent to the third-level cache.
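Taken together, these examples form a dispatch table keyed by (request type, data bit). The sketch below encodes the three examples above; the operation names are assumptions for illustration:

```python
# (request type, data bit) -> operations performed at that pipeline time.
STAGE_OPS = {
    ("acquire", "S1"): ["read_cache_directory"],
    ("acquire", "S3"): ["check_hit", "begin_data_read_if_hit"],
    ("acquire", "S5"): ["obtain_data", "respond_to_upper_level"],
    ("release", "S1"): ["read_cache_directory"],
    ("release", "S3"): ["store_released_data", "ack_release_to_upper_level"],
    ("probe",   "S1"): ["read_cache_directory"],
    ("probe",   "S3"): ["check_hit", "respond_to_l3_if_miss", "begin_data_read_if_hit"],
    ("probe",   "S5"): ["invalidate_data", "respond_to_l3"],
}

def ops_for(req_type: str, stage: str):
    return STAGE_OPS.get((req_type, stage), [])  # other stages just wait/advance

print(ops_for("probe", "S3"))
```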
Optionally, the type of the target request is in a channel task type set, where the channel task type set includes a type of a request sent by the processor for the secondary cache, and step 203 may specifically include sub-steps 2031-2033:
Sub-step 2031: when the target data bit is the start data bit, reading the cache directory of the second-level cache while blocking requests other than the target request.
Sub-step 2032: when the target data bit is the first data bit (S3), determining the hit result of the target request in the second-level cache according to the cache directory; when the hit result is a miss, determining that the pipeline queue did not execute the target request successfully, and when the hit result is a hit, beginning to read the target data corresponding to the target request; the first data bit is spaced a first number of data bits from the start data bit.
Sub-step 2033: when the hit result is a hit and the target data bit is a second data bit, obtaining the target data by reading and generating a response for the target data, the second data bit being spaced a second number of data bits from the first data bit.
In an embodiment of the present application, for sub-steps 2031-2033: target requests fall into two broad classes, one being the channel task type set (the types of requests sent by the processor to the second-level cache), the other being requests generated by miss status registers. The second-level cache of the embodiment of the present application uses different strategies to process these two classes of target requests.
When the type of the target request is in the channel task type set, referring to fig. 1, the target request reads the cache directory of the second-level cache at the start data bit S1 while blocking other requests except the target request; the purpose of reading the cache directory is to look up the data block in the second-level cache and determine its state (whether the data to be accessed is present). At data bit S2, the pipeline waits for the cache directory to return.
The target request obtains the read cache directory at the first data bit S3 and determines its hit result in the second-level cache according to the directory. When the result is a miss (the data to be accessed is not stored in the second-level cache, and the second-level cache must interact with other caches), it is determined that the pipeline queue did not execute the target request successfully; when the result is a hit (the data to be accessed is stored in the second-level cache), reading of the target data corresponding to the target request begins. The first data bit S3 is spaced a first number (2) of data bits from the start data bit S1. At data bit S4, the pipeline waits for the target data to return.
In the embodiment of the present application, if the hit result is a miss, processing switches to the handling flow for miss-status-register-type target requests; when the hit result is a hit and the target data bit is the second data bit S5, the target data to be accessed by the target request is obtained by reading, and a response (read, release, invalidation, etc.) is generated for the target data according to the specific type of the target request. The second data bit S5 is spaced a second number (2) of data bits from the first data bit S3.
It can be seen that, for target requests in the channel task type set, the embodiment of the present application explicitly designs the operation executed at each pipeline time (data bit). Each such target request takes the shortest processing flow: apart from the operations required to generate its response, no additional operations are inserted, and every operation is executed strictly at the designed pipeline time. This greatly improves the efficiency with which the second-level cache processes requests and reduces processing delay.
And 204, returning a response generated by executing the target request when the pipeline queue successfully executes the target request.
The step may refer to step 102, and will not be described herein.
Optionally, when the type of the target request is the data acquisition type in the channel task type set, the target data is data stored in the second-level cache, and the response generated for the target data is: refilling the target data into the first-level cache for the target request to read.
In the embodiment of the present application, a data-acquisition-type target request aims to read target data from the second-level cache and feed it back to the upper-level first-level cache. Therefore, when the type of the target request is the data acquisition type in the channel task type set, if the read cache directory obtained at the first data bit S3 in fig. 1 shows that the target request hits in the second-level cache (a miss would instead require allocating a miss status register for processing), reading of the target data corresponding to the target request begins at the first data bit S3, and the target data is obtained at the second data bit S5 in fig. 1. The response to the target data is then: feeding back a Grant (containing the target data, used to refill the target data into the first-level cache) to the upper-level cache for the processor's final read. After the upper level receives the Grant, it replies GrantAck to the lower-level second-level cache, indicating that it has received the target data. In addition, whether the result is a hit or a miss, the request is recorded at the moment the Grant is sent to the first-level cache; after the first-level cache returns GrantAck, the record is cleared and the request is marked as processed.
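A sketch of that Grant/GrantAck bookkeeping, assuming a simple in-flight table; the Grant and GrantAck message names follow the text, everything else is illustrative:

```python
class GrantTracker:
    """Tracks Grants sent to the L1 until the matching GrantAck returns."""
    def __init__(self):
        self.inflight = {}  # request id -> target data sent with the Grant

    def send_grant(self, req_id: int, target_data: bytes):
        # Record the request at the same time the Grant leaves for the L1.
        self.inflight[req_id] = target_data
        return {"msg": "Grant", "id": req_id, "data": target_data}

    def on_grant_ack(self, req_id: int):
        # L1 confirmed receipt; drop the record, marking the request processed.
        self.inflight.pop(req_id, None)

tracker = GrantTracker()
tracker.send_grant(7, b"\x2a" * 64)  # refill one 64-byte block into the L1
tracker.on_grant_ack(7)
assert not tracker.inflight
```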
It should be noted that a hit result means the second-level cache holds the target data requested for reading, so the second-level cache refills the target data into the first-level cache. To do so, the second-level cache also sends a wake-up request and a refill request to the first-level cache: the wake-up request wakes up the target request's operation of reading the target data from the first-level cache, and the refill request writes the target data into the first-level cache for the target request to read. The wake-up request sent by the second-level cache wakes the target request that previously missed in the first-level cache; once woken, the target request can quickly read the target data from the first-level cache, so the processor's target request can normally complete its data read.
However, if the target request were woken to access the first-level cache only after the target data had been refilled, the refilled data would have to wait for the target request to wake, producing a long delay overall (after the data is refilled, waking the target request takes a certain time, so several more cycles pass before the target request actually reads the data). To reduce this delay, the embodiment of the present application controls the second-level cache so that it sends the wake-up request before the refill request, with a fixed and precise advance.
To achieve this, referring to fig. 1, when the hit result is a hit, the embodiment of the present application generates the wake-up request in the second-level cache at the moment the hit result is obtained (the time corresponding to data bit S3) and sends it from the wake-up queue to the first-level cache; two data bits later, at the time corresponding to data bit S5, the target data requested by the target request is obtained, and the target request is sent from the refill queue to the first-level cache as a refill request.
Because the first-level cache receives the wake-up request at the time corresponding to data bit S3, it can start waking the target request in advance and can read the target data immediately upon receiving the subsequent refill request. No time is spent waiting for the target request to wake, which significantly reduces the delay of reading the target data.
For example, referring to fig. 1, when the hit result is a hit, the wake-up request is actually sent at the time corresponding to data bit S3. Since the refill request waits in the refill queue for the duration of one data bit (to ensure the refill request at the queue exit is sent promptly and to reduce the probability of the refill queue blocking), the refill request is actually sent at the time corresponding to data bit S6 (not depicted). This guarantees that every refill request is preceded by a wake-up request sent three data bits earlier.
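A back-of-the-envelope check of the fixed three-data-bit advance, covering both the hit case here (wake-up at S3, data at S5, one-slot refill-queue wait) and the miss/retry case described later (wake-up at S1, data at S3); the helper is purely illustrative:

```python
REFILL_QUEUE_WAIT = 1  # each refill request waits one data bit in the refill queue

def wakeup_advance(wake_slot: int, data_ready_slot: int) -> int:
    """Data bits by which the wake-up precedes the refill request's actual send."""
    refill_send_slot = data_ready_slot + REFILL_QUEUE_WAIT
    return refill_send_slot - wake_slot

# Hit case: wake-up sent at S3, data ready at S5, refill actually sent at S6.
assert wakeup_advance(wake_slot=3, data_ready_slot=5) == 3
# Miss/MSHR retry case: wake-up at S1, data ready at S3, refill sent at S4.
assert wakeup_advance(wake_slot=1, data_ready_slot=3) == 3
# Either way the advance is a fixed three data bits, with no need to inspect
# pipeline state or compute send times dynamically.
```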
The embodiment of the present application manages memory access requests through the second-level cache's concise, clear multi-stage pipeline-queue architecture. Based on the pipeline-queue architecture and the design of each processing time of a request in the pipeline, precise and stable control of the wake-up request's fixed advance can be achieved, ensuring the accuracy and coverage of the memory-access read process. The whole process needs neither to read the state of requests at every pipeline stage nor to compute send times in advance from the state of refill-queue requests, so its complexity is extremely low, reducing circuit cost and power consumption.
When the type of the target request is the data release type in the channel task type set, the target data is data stored in the first-level cache, and the response generated for the target data is: receiving and storing, by the second-level cache, the target data sent by the first-level cache, so that the first-level cache releases the target data.
In the embodiment of the present application, a data-release-type target request aims to have its data received and stored by the second-level cache. Therefore, when the type of the target request is the data release type in the channel task type set, the read cache directory result is obtained at the first data bit S3 in fig. 1, and the target data released by the upper-level first-level cache is stored into a data block; that is, the response to the target data is: receiving and storing, by the second-level cache, the target data sent by the first-level cache. After the second-level cache finishes storing the target data, it sends a ReleaseAck response to the upper-level cache to indicate that the release is complete.
When the type of the target request is the data probe type in the channel task type set, the target data is data stored in the second-level cache, and the response generated for the target data is: setting the target data to an invalid state.
In a multi-core processor system, shared data may be accessed. To guarantee coherence of accesses (every read returns the most recently written value), it must be ensured that, at any moment, either multiple processor cores hold read permission, or exactly one processor core holds write permission; there cannot be multiple cores with write permission, nor one core with write permission while other readers exist. To satisfy this property, the cache design must also support a class of invalidation operations that invalidate the data of a specified data block in the local cache.
For example: processor core 1 currently holds write permission, and processor core 2 also wants to write the data, so core 2 must obtain write permission from the third-level cache. The third-level cache then invalidates processor core 1 to revoke its write permission and has core 1 return the data to the third-level cache. A data-probe-type target request is therefore aimed at having the second-level cache invalidate the requested target data, based on a request from the lower-level third-level cache.
Specifically, when the type of the target request is the data probe type in the channel task type set, if the read cache directory obtained at the first data bit S3 in fig. 1 shows a hit (a miss would require allocating a miss status register for processing), reading of the target data begins at the first data bit S3, and the target data is obtained at the second data bit S5 in fig. 1. The response to the target data is: setting the target data to an invalid state. The second-level cache must respond ProbeAck to the lower-level third-level cache (if the data block containing the target data was modified by the upper level, the target data is called dirty data and must be carried in the ProbeAck; otherwise, the second-level cache may send a ProbeAck without data).
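A sketch of that probe-response decision, assuming a simple block-state record; the state fields are illustrative, and the ProbeAckData name for "ProbeAck carrying dirty data" is TileLink-style shorthand rather than the patent's wording:

```python
from dataclasses import dataclass

@dataclass
class CacheBlock:
    data: bytes
    valid: bool = True
    dirty: bool = False  # modified by the upper level since it was filled

def handle_probe(block: CacheBlock):
    """Invalidate the probed block and build the response to the L3."""
    was_dirty, data = block.dirty, block.data
    block.valid = False          # set the target data to an invalid state
    block.dirty = False
    if was_dirty:
        # Dirty data must travel back with the acknowledgement.
        return {"msg": "ProbeAckData", "data": data}
    return {"msg": "ProbeAck"}   # clean block: acknowledgement without data

print(handle_probe(CacheBlock(b"\x00" * 64, dirty=True))["msg"])  # ProbeAckData
```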
Referring to FIG. 4, a second-level cache architecture diagram of an embodiment of the present application is shown, including an arbitration module, miss status registers, a cache directory, a refill buffer, a bus, a release buffer, a data storage array, a request buffer, and channel controllers. The bus contains five channels: A, B, C, D and E. The channel controllers SinkA, SinkC, SourceB, SourceD and SinkE are connected to the bus of the upper-level cache, while the channel controllers SourceA, SourceC, SinkB, SinkD and SourceE are connected to the bus of the lower-level cache. The request buffer stores requests that temporarily cannot enter the pipeline queue, so that other requests that can enter are executed first; a buffered request is allowed into the pipeline queue once its conditions are met. The release buffer temporarily stores data released by the upper-level cache. The refill buffer temporarily stores refill data.
A request sent by the processor reaches the second-level cache over the bus and is converted at the channel controller into an internal task of the second-level cache; before and after conversion the request differs in form but is identical in content. The channel controller can also convert internal second-level-cache tasks into bus responses.
The arbitration module arbitrates among the requests of the channels, selecting a target request to enter the pipeline of the second-level cache for processing. In one implementation, the embodiment of the present application selects the target request according to the priority of the request types, with the priority policy set as: miss status register type > data release type > data probe type > data acquisition type.
The above architecture includes multiple miss status registers, which store requests that cannot be executed directly in the pipeline queue (such as requests whose accesses miss). In addition, the second-level cache architecture of the embodiment of the present application may adopt a fully inclusive policy: all data in the first-level cache is also present in the second-level cache, i.e., the first-level cache is a subset of the second-level cache.
Specifically, during execution of the various types of target requests, a target request in the pipeline queue may enter the channel controllers SourceC and SourceD at data bits S3, S4 and S5. Because a channel controller can accept only one request at a time, if multiple requests attempt to enter, arbitration selects one and blocks the rest, with priority data bit S5 > data bit S4 > data bit S3.
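That per-cycle arbitration for a channel controller can be sketched as follows; the data-bit tags match the text, the rest is illustrative:

```python
# Requests competing for a channel controller, keyed by the data bit they occupy.
CHANNEL_PRIORITY = ["S5", "S4", "S3"]  # S5 beats S4 beats S3

def arbitrate_channel(candidates: dict):
    """candidates maps a data bit ('S3'/'S4'/'S5') to the request waiting there.

    Returns (winner, blocked): one request enters the channel controller,
    the rest are blocked and retry in a later cycle.
    """
    for stage in CHANNEL_PRIORITY:
        if stage in candidates:
            winner = candidates.pop(stage)
            return winner, list(candidates.values())
    return None, []

winner, blocked = arbitrate_channel({"S3": "probe@0x100", "S5": "acquire@0x200"})
assert winner == "acquire@0x200" and blocked == ["probe@0x100"]
```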
Further, SinkA receives data-acquisition-type target requests and converts them into internal tasks; a data-acquisition-type target request feeds back its response through SourceD. If a data-acquisition-type target request misses, the request can be sent to the third-level cache through SourceA, and the response from the third-level cache is received through SinkD.
SinkB receives data-probe-type target requests and converts them into internal tasks; a data-probe-type target request responds to the third-level cache through SourceC and can send a request to the first-level cache through SourceB.
SinkC receives data-release-type target requests and converts them into internal tasks; a data-release-type target request feeds back its response through SourceD.
Step 205, when the pipeline queue does not successfully execute the target request, allocating a corresponding miss state register for the target request, and executing the target request through the miss state register.
This step may refer to step 103, and will not be described herein.
Optionally, the type of the target request is in the channel task type set and is the data acquisition type; the pipeline queue failing to execute the target request successfully indicates that the second-level cache does not store the data the target request asks to acquire. Step 205 may include sub-steps 2051-2052:
Sub-step 2051: after the target request detaches from the pipeline queue, controlling the target request to enter the allocated miss status register to wait, and generating, through the miss status register, a new target request of the miss status register type.
Sub-step 2052: when refill data is obtained from the third-level cache and refilled into the second-level cache via the new target request, waking up, through the new target request, the data-acquisition-type target request's access to the refill data in the second-level cache.
In the embodiment of the present application, when the type of the target request is in the channel task type set and is the data acquisition type, the foregoing embodiment describes the execution flow when the pipeline queue executes the target request successfully (a hit). When the pipeline queue does not execute the target request successfully (a miss), the second-level cache allocates a miss status register to the target request, so that the unsuccessful target request detaches from the pipeline queue and enters the allocated miss status register to wait, and a new target request of the miss status register type is subsequently generated through the miss status register.
A miss result indicates that the second-level cache does not store the data the target request asks to read. At this point, the lower-level third-level cache is checked for the data: if the third-level cache stores it, the third-level cache refills the data into the second-level cache, which then refills it into the first-level cache for the target request to read; if the third-level cache does not store the data either, the data is read from memory and refilled into the third-level cache, then into the second-level cache, and finally into the first-level cache for the target request to read.
Specifically, when the hit result is a miss, the third-level cache must be waited on to refill the data into the second-level cache, so the data-acquisition-type target request is controlled to detach from the pipeline queue and wait for the third-level cache's refill to complete. After the third-level cache refills the data into the second-level cache, the new target request generated by the miss status register wakes up the data-acquisition-type target request's access to the refilled data in the second-level cache.
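A control-flow sketch of this miss path (allocate a miss status register, ask the L3 for refill data, and re-enter the pipeline when it arrives), with all names assumed for illustration:

```python
from collections import namedtuple

Request = namedtuple("Request", ["req_type", "address"])

class SimpleMSHR:
    def __init__(self):
        self.valid = False
        self.miss_address = None

def handle_miss(target_request, mshr_pool, send_to_l3):
    """Detach a missed acquire from the pipeline and park it in an MSHR."""
    mshr = next((m for m in mshr_pool if not m.valid), None)
    if mshr is None:
        return None          # no free MSHR: the request is blocked for now
    mshr.valid = True
    mshr.miss_address = target_request.address
    send_to_l3(("acquire", target_request.address))  # ask the L3 for refill data
    return mshr

def on_refill(mshr, enter_pipeline):
    """Refill arrived from the L3: the MSHR generates a new MSHR-type target
    request, which re-enters the pipeline (top arbitration priority) and
    wakes the original acquire's access to the refilled data."""
    enter_pipeline(Request("mshr", mshr.miss_address))
    mshr.valid = False

sent = []
pool = [SimpleMSHR() for _ in range(4)]
m = handle_miss(Request("acquire", 0x40), pool, sent.append)
on_refill(m, lambda r: sent.append(r))
```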
Optionally, substep 2052 may specifically include substeps 20521-20522:
Sub-step 20521: controlling, through the new target request, the data-acquisition-type target request to enter the pipeline queue from the specific data bit, and generating a wake-up request through the second-level cache and sending it to the first-level cache through the wake-up queue.
Sub-step 20522: after an interval of a third number of data bits, reading the refill data from the second-level cache and sending the data-acquisition-type target request from the refill queue to the first-level cache as a refill request.
The wake-up request is used to wake up the data-acquisition-type target request's operation of reading data from the first-level cache; the refill request is used to write the refill data into the first-level cache for the data-acquisition-type target request to read.
In the embodiment of the present application, for sub-steps 20521-20522: waking the data-acquisition-type target request's access to the refill data in the second-level cache through the new target request means re-executing the previously unsuccessful target request so that it can successfully acquire the target data; this is achieved by the second-level cache sending a wake-up request and a refill request to the first-level cache.
Referring to fig. 1, after the third-level cache has refilled the second-level cache, the new target request generated by the miss status register controls the target request to re-enter the pipeline queue from the specific data bit (preferably data bit S1), while a wake-up request is generated by the second-level cache and sent from the wake-up queue (it enters and leaves the wake-up queue at the time corresponding to data bit S1). After a third number of data bits (an interval of 2 data bits), the refill data is obtained through the second-level cache at data bit S3, and the target request is sent from the refill queue as a refill request.
For example, referring to fig. 1, when the hit result is a miss, the wake-up request is actually sent at the time corresponding to data bit S1. Since the refill request waits in the refill queue for the duration of one data bit (to ensure the refill request at the queue exit is sent promptly and to reduce the probability of the refill queue blocking), the refill request is actually sent at the time corresponding to data bit S4. Thus the embodiment of fig. 1 guarantees that every refill request is preceded by a wake-up request sent three data bits in advance. The embodiment of the present application controls the second-level cache to send the wake-up request before the refill request, with a fixed and precise advance. This solves the problem in the related art that refilled data must wait for the target request to wake, which produces a long delay (after the data is refilled, waking the target request takes a certain time, so several more cycles pass before the refilled target data is actually read).
Based on the pipeline queue, the embodiment of the present application designs the target request's hit result and the wake-up request to be issued at one fixed data bit, and the refill data to be obtained, and the refill request issued through the refill queue, at another fixed data bit. Based on the pipeline-queue architecture and the design of each processing time of an instruction in the pipeline, precise and stable control of the wake-up request's fixed advance is achieved, ensuring the accuracy and coverage of the target-request read process.
Optionally, to implement the process of obtaining the refill data from the third-level cache and refilling it, step 205 may further include sub-steps 2053-2054:
Sub-step 2053: requesting, according to the new target request, the refill data from the third-level cache.
Sub-step 2054: when the refill data is received through the new target request, entering the new target request into the pipeline queue from the specific data bit, and performing the operations of determining a target data block from the second-level cache, releasing the old data stored in the target data block, and writing the refill data into the target data block.
In the embodiment of the present application, for sub-steps 2053-2054: when the processor operates, it first searches the first-level cache L1 for the required data according to the memory access instruction, then the second-level cache L2, then the third-level cache L3. In this process, if the access instruction misses in the first-level cache L1 (L1 does not hold the data the instruction asks to read), the search continues in the second-level cache L2; on a hit there, L2 refills the requested data into L1. If the instruction also misses in L2, the search continues in the third-level cache L3; on a hit there, L3 refills the requested data into L2, and L2 then refills it into L1.
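That lookup-and-refill chain can be sketched as a recursive walk down the hierarchy; this is a behavioral model only, with dictionaries standing in for cache levels:

```python
def lookup(levels, address, level=0):
    """levels: list of dicts modelling L1, L2, L3; memory is the fallback.

    On a hit at level i, the data is refilled into every level above i,
    mirroring the L3 -> L2 -> L1 refill chain described in the text.
    """
    if level == len(levels):
        data = f"mem[{address:#x}]"             # fetched from memory
    elif address in levels[level]:
        data = levels[level][address]           # hit at this level
    else:
        data = lookup(levels, address, level + 1)  # miss: go one level down
        levels[level][address] = data           # refill on the way back up
    return data

l1, l2, l3 = {}, {}, {0x80: "blockA"}
print(lookup([l1, l2, l3], 0x80))  # blockA, now refilled into l2 and l1
assert 0x80 in l1 and 0x80 in l2   # consistent with the fully inclusive policy
```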
However, because cache space is limited, when a capacity conflict occurs (i.e., the cache space is full and insufficient to store the refilled data), a data block must be selected from the secondary cache awaiting refill and the old data in it released, to make room for storing the refilled data.
In the related art, when the secondary cache asks the lower third-level cache to refill data, a data block is immediately selected from the secondary cache and its old data released; the refill data is written into that block only once the third-level cache sends it. As a result, the emptied data block stays occupied while the refill data is awaited, and if the old data needs to be accessed during this period, it has already been released and cannot be accessed.
To solve this problem, the embodiment of the application does not select and release a data block as soon as the access instruction misses in the secondary cache. Instead, it first asks the lower third-level cache to obtain the refill data required by the access request; only after the third-level cache has obtained the refill data and sent it to the secondary cache does it determine a target data block from the secondary cache, release the old data stored in that block, and write the refill data into it. In this way, during the period before the refill data arrives, the block holding the old data operates normally rather than sitting empty and occupied, and the old data can still be accessed normally. The release of the old data in the selected target data block and the write of the refill data happen only after the secondary cache has received the refill data, which guarantees that access instructions can read the old data normally throughout this period.
Specifically, searching for the refill data in response to an acquisition request and sending it to the secondary cache generally takes a long time. In the embodiment of the application, during the period in which the secondary cache waits for the refill data, the block holding the old data is neither released nor held empty, which improves the utilization of cache resources while guaranteeing that the old data remains accessible.
Furthermore, in the embodiment of the application, the selection of the target data block starts only when the secondary cache receives the refill data sent by the third-level cache; if old data is stored in the target data block, that old data is released and the refill data is written into the block. Thus, during the period before the refill data arrives, the block storing the old data in the secondary cache operates normally and is neither emptied nor held occupied; after the secondary cache receives the refill data, the old data in the selected target block is released and the refill data is written in, which also guarantees the normal data reading process of access instructions.
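A minimal sketch of this deferred-release flow, under assumed data structures (the Block class and the blocking request_from_l3 helper are illustrative; in hardware the L3 request is asynchronous rather than a blocking call):

```python
from dataclasses import dataclass

@dataclass
class Block:
    data: bytes = b""
    occupied: bool = False   # claimed by another in-flight request?

def handle_l2_miss(addr, l2_set: list[Block], request_from_l3):
    # no victim is selected here: every block in the set stays live and
    # readable while the L3 request is outstanding
    refill_data = request_from_l3(addr)
    # only now, with the refill data in hand, pick the target block
    victim = next(b for b in l2_set if not b.occupied)
    old_data = victim.data        # released old data (written back to L3 if dirty)
    victim.data = refill_data     # refill written into the target block
    return old_data
```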
Further, the new target request preferably enters the pipeline queue from the initial data bit S1 in fig. 1, where reading of the cache directory begins. When the refill data sent by the third-level cache has just been received but not yet written into the target data block, the second-level cache needs to store it temporarily; for this purpose an independent refill buffer zone can be set in the second-level cache, and the refill data is temporarily written into the refill buffer zone. At data bit S2, the secondary cache reads the refill buffer zone. At data bit S3, it determines the target data block from the secondary cache according to the cache directory obtained earlier, writes the refill data just read into the write buffer zone of the secondary cache, and starts reading the old data in the target data block. At data bit S5, the old data is read out of the target data block and can be sent to the third-level cache to complete the release of the target data block.
When all read operations on the target data block are detected to have finished, the refill data in the write buffer zone is written into the target data block, completing the data refill.
In the embodiment of the present application, the function of the write buffer zone is to decide, based on the read operations performed on the target data block, when to write the refill data into that block. Specifically, the embodiment of the application gives read operations on the target data block the higher priority, so the write buffer zone starts writing its refill data into the target data block only after all reads of that block have completely finished. That is, once the write buffer zone determines that every read operation on the target data block has completed, it performs the operation of writing the refill data into the target data block, completing the operation of refilling the data into the secondary cache.
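The gating just described might be sketched as follows (the pending_reads counter and the tick-based commit are illustrative assumptions standing in for the hardware's read-completion check):

```python
class WriteBufferZone:
    """Holds refill data until all reads of the target block have drained."""

    def __init__(self):
        self.pending = []    # (target_block, refill_data) pairs awaiting commit

    def stage(self, block, refill_data):
        self.pending.append((block, refill_data))

    def tick(self):
        # reads have the higher priority: commit a write only when its
        # target block has no outstanding read operations left
        still_waiting = []
        for block, data in self.pending:
            if block.pending_reads == 0:
                block.data = data          # refill completes here
            else:
                still_waiting.append((block, data))
        self.pending = still_waiting
```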
In one implementation, the cache directory records the last access time of each data block in the second-level cache. The embodiment of the application can therefore obtain these times from the cache directory and take the block with the earliest last access time as the target data block: the earliest last access indicates that the data stored in the block is the least active, so choosing that block minimizes the impact on more active data in other blocks. In another implementation, a data block may be selected at random from the second-level cache as the target data block; the embodiment of the present application does not specifically limit the selection policy of the target data block.
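Both selection policies admit a short sketch (the Block dataclass with a last_access field mirrors the directory record described above; the exact field layout is an assumption):

```python
import random
from dataclasses import dataclass

@dataclass
class Block:
    last_access: int   # last access time, as recorded in the cache directory
    data: bytes = b""

def select_victim_lru(blocks: list[Block]) -> Block:
    # earliest last access = least active data, minimizing impact on hot blocks
    return min(blocks, key=lambda b: b.last_access)

def select_victim_random(blocks: list[Block]) -> Block:
    # the embodiment does not mandate a policy; random selection is also valid
    return random.choice(blocks)
```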
Optionally, step 205 may specifically further include sub-steps 2055-2058:
Sub-step 2055: obtaining a first number of requests contained in the first n data bits of the pipeline queue.
Sub-step 2056: obtaining, among all the miss status registers, a second number of miss status registers that have been allocated.
Sub-step 2057: preventing a new request from entering the pipeline queue when the sum of the first number and the second number is greater than or equal to the total number of miss status registers.
Sub-step 2058: allocating a miss status register in the idle state to the target request when the sum of the first number and the second number is less than the total number of miss status registers.
In the embodiment of the present application, for sub-steps 2055-2058: since the pipeline queue of the secondary cache is non-blocking (i.e., a request in the pipeline queue advances by one data bit at each step and never resides in a data bit), the conditions for entering the pipeline queue must be scheduled carefully, so that every request entering the queue can either be processed directly on the pipeline or be assigned a miss status register; the situation where a miss status register is needed but none is currently available must never occur.
Therefore, the embodiment of the application designs back-pressure control logic: first obtain the first number of requests contained in the first n data bits of the pipeline queue and the second number of miss status registers already allocated, then compute their sum. If the sum is greater than or equal to the total number of miss status registers, the number of idle miss status registers is considered insufficient, and new requests are prevented from entering the pipeline queue. If the sum is smaller than the total number, enough idle miss status registers remain, and one in the idle state is allocated to the target request. In addition, the present application must also ensure that the buffers SourceC and SourceD in fig. 4 do not overflow; that is, new requests are likewise blocked when the number of data acquisition type and data release type requests contained in the pipeline queue plus the number of occupied buffer entries is greater than or equal to the total number of buffer entries.
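A condensed sketch of this back-pressure condition and the SourceC/SourceD overflow guard (the counters are assumed to be maintained elsewhere; only the admission predicates are shown):

```python
def may_enter_pipeline(first_n_bit_requests: int,
                       allocated_mshrs: int,
                       total_mshrs: int) -> bool:
    # worst case, every request in the first n data bits misses and needs a
    # miss status register; admit a new request only if one would remain free
    return first_n_bit_requests + allocated_mshrs < total_mshrs

def source_buffer_blocks(inflight_requests: int,
                         occupied_entries: int,
                         total_entries: int) -> bool:
    # the SourceC/SourceD overflow guard: block when the in-flight release
    # and acquisition requests plus occupied entries would exhaust the buffer
    return inflight_requests + occupied_entries >= total_entries
```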
Optionally, when the type of the target request is the data probe type in the channel task type set, the target request is issued by the tertiary cache, and the method may further include steps 206-208:
Step 206: when the hit result is a miss, feeding back to the tertiary cache that the target data which the target request asks to invalidate is not stored in the secondary cache.
Step 207: when the hit result is a hit and the target data is stored only in the secondary cache, starting to read the target data corresponding to the target request.
Step 208: when the hit result is a hit and the target data is stored in both the secondary cache and the primary cache, starting to read the target data in the secondary cache while requesting the primary cache to invalidate its stored target data.
Specifically, for steps 206-208: the result of reading the cache directory is obtained at the first data bit S3 in fig. 1. If the hit result is a hit and the target data is only in the second-level cache and not in the first-level cache, the target data is read at the first data bit S3, obtained at the second data bit S5, and a response is sent to the third-level cache (if the target data is dirty, the response includes the target data). If the hit result is a hit and the target data is in both the second-level and first-level caches, the second-level cache must also make the first-level cache invalidate its stored copy of the target data before invalidating its own. In that case a miss status register is allocated to the target request and the information of the target request is recorded; meanwhile the target data is read at the first data bit S3, obtained at the second data bit S5, and written into the release buffer zone.
The miss status register then sends a request of the data probe type to the primary cache and waits for the primary cache to invalidate its stored target data. If the response result returned by the first-level cache contains data, that data is written into the release buffer zone, covering the target data just read, and the miss status register is woken up so that it sends a new target request into the pipeline queue of the second-level cache. The new target request reads the release buffer zone at data bit S2, obtains the data at data bit S3, and may respond with the data to the third-level cache at data bits S3-S5. The reason the data is read and written into the release buffer zone ahead of time is that, if the response of the primary cache carries no data but the secondary cache's own copy of the target data is dirty, the dirty data must be added to the response to the tertiary cache; in that case the data in the release buffer zone can be read out in advance, speeding up the response flow.
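The three probe outcomes can be summarized in a hedged sketch (the directory entry fields in_l1/dirty and the helper callbacks are assumed names for illustration, not the actual interfaces):

```python
def handle_probe(addr, directory, read_l2, probe_l1, respond_l3):
    entry = directory.lookup(addr)
    if entry is None:                  # miss: the target data is not in L2
        respond_l3(addr, present=False)
        return
    if not entry.in_l1:                # hit, L2 only: read and respond
        data = read_l2(addr)
        respond_l3(addr, present=True,
                   data=data if entry.dirty else None)
        return
    # hit in both L2 and L1: L1 must be invalidated first; its response may
    # carry newer (dirty) data, which then overrides the copy read from L2
    l2_data = read_l2(addr)            # read early into the release buffer zone
    l1_data = probe_l1(addr)           # waits for the L1 invalidation response
    respond_l3(addr, present=True,
               data=l1_data if l1_data is not None else l2_data)
```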
Optionally, the second-level cache comprises a plurality of data block groups, and each data block group comprises a plurality of data blocks; a data block group is used to process in parallel the target requests accessing the data blocks it contains; the target data block is a data block in the secondary cache that is not occupied by other requests.
In the embodiment of the present application, the secondary cache includes a plurality of data block groups (sets), and each data block group includes a plurality of data blocks. Cache designs in the related art block by data block group, that is, only one request to the same data block group can be processed at a time, which reduces the throughput of the cache and causes serious performance loss under certain access patterns.
To solve this problem, the policy for selecting the target data block that stores the refill data in the embodiment of the present application may be designed as follows: select a data block in the secondary cache that is not occupied by other requests as the target data block. Specifically, when the target data block is selected, the data blocks occupied by other requests that belong to the same data block group as the target request, as recorded in the miss status registers, are avoided.
In this way, each data block group can process a plurality of requests in parallel, and the data blocks associated with those requests are independent of one another, so the requests do not interfere with each other, which guarantees the correctness of parallel processing. That is, the policy ensures that no request has its data block taken by another request before its processing completes; equivalently, no two in-flight requests ever select the same data block. Based on this policy, the present application achieves a very significant performance improvement, especially when the program's memory access sequence jumps at intervals equal to the data block group size (e.g., group size 100 and access sequence 0/100/200/300 ...).
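A sketch of this occupancy-aware selection (the miss status register fields set_index/victim and the last-access tiebreak are illustrative assumptions):

```python
def select_unoccupied_victim(set_blocks, mshrs, set_index):
    # blocks already claimed by in-flight requests to this set are excluded,
    # so several requests to the same set can proceed in parallel
    occupied = {m.victim for m in mshrs
                if m.set_index == set_index and m.victim is not None}
    candidates = [b for b in set_blocks if b not in occupied]
    if not candidates:
        return None   # back-pressure: no independent block is available
    return min(candidates, key=lambda b: b.last_access)
```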
Optionally, the second-level cache includes a plurality of data block groups, each of which is used to process in parallel the target requests accessing the data blocks it contains; before step 202, the method may further include steps 209-210:
Step 209: when it is determined that the target request and a write-buffer-directory request in the pipeline queue access the same data block group, preventing the target request from entering the pipeline queue; the write-buffer-directory request is within the first three data bits of the pipeline queue, and the third data bit is used to perform the operation of writing the buffer directory.
Step 210: when it is determined that the target request and a request in a miss status register access the same data, preventing the target request from entering the pipeline queue.
In the embodiment of the present application, for steps 209-210: preventing the target request from entering the pipeline queue when it and a write-buffer-directory request access the same data block group aims to avoid a conflict between the write operation and the read operation, i.e., to ensure that the read operation performed at data bit S1 reads the data that data bits S2 and S3 are about to write.
In addition, preventing the target request from entering the pipeline queue when it and a request in a miss status register access the same data serves to guarantee the consistency of operations on that data.
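Both admission checks can be captured in a short sketch (the request fields writes_directory, set_index, and addr are assumed shapes for illustration):

```python
def entry_blocked(req, first_three_bits, mshrs) -> bool:
    # the directory read at S1 must not race the directory writes at S2/S3
    same_set_conflict = any(
        other.writes_directory and other.set_index == req.set_index
        for other in first_three_bits)
    # requests to the same data must be serialized for consistency
    same_data_conflict = any(m.addr == req.addr for m in mshrs)
    return same_set_conflict or same_data_conflict
```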
In summary, in the embodiment of the present application, a target request can be selected from all the requests according to the type of request obtained by the second-level cache and entered into the pipeline queue for execution; when the pipeline queue does not successfully execute the target request, a corresponding miss status register is allocated to it for processing. In the present application, no miss status register is allocated when the target request enters the pipeline queue of the second-level cache; one is allocated only when execution of the target request fails. On the basis of satisfying the need to re-execute target requests that were not executed successfully, this reduces the consumption of miss status register resources and the delay caused by their allocation.
Fig. 5 is a block diagram of a processing device for a cache request according to an embodiment of the present application, where the device includes:
An enqueuing module 301, configured to select a target request from all the requests according to a type of a request obtained by a second level cache, and enter the target request from a specific data bit of a pipeline queue of the second level cache into the pipeline queue for execution; the pipeline queue comprises a plurality of data bits which are sequentially arranged;
a return module 302, configured to return a response generated by executing the target request when the pipeline queue successfully executes the target request;
And the execution module 303 is configured to allocate a corresponding miss status register to the target request when the pipeline queue does not successfully execute the target request, and execute the target request through the miss status register.
Optionally, the specific data bit is a start data bit, and the enqueuing module 301 includes:
an enqueuing sub-module for entering the target request from the start data bit into the pipeline queue;
And the execution sub-module is used for executing the operation corresponding to the type and the target data bit when the target request is in the target data bit of the pipeline queue according to the type of the target request.
Optionally, the type of the target request is in a channel task type set, and the channel task type set includes the type of the request sent by the processor for the second-level cache;
the execution sub-module includes:
A blocking unit, configured to read a cache directory of the second level cache when the target data bit is a start data bit, and simultaneously block other requests except the target request;
The first execution unit is used for determining a hit result of the target request in the secondary cache according to the cache directory when the target data bit is the first data bit, determining that the pipeline queue does not successfully execute the target request when the hit result is a miss, and starting to read target data corresponding to the target request when the hit result is a hit; a first number of data bits are spaced between the first data bit and the start data bit;
And the second execution unit is used for reading, obtaining and responding to the target data when the hit result is hit and the target data bit is a second data bit, wherein a second number of data bits are spaced between the second data bit and the first data bit.
Optionally, the type of the target request is a data acquisition type included in a channel task type set, and the channel task type set includes a type of a request sent by the processor for the secondary cache; the pipeline queue does not successfully execute the target request and is used for representing that the second-level cache does not store the data requested to be acquired by the target request;
the execution module 303 includes:
The generation submodule is used for controlling the target request to enter the allocated missing state register to wait after the target request is separated from the pipeline queue, and generating a new target request of a missing state register type through the missing state register;
And the awakening sub-module is used for awakening the access operation of the target request of the data acquisition type to the refilled data in the secondary cache through the new target request when the refilled data is obtained from the tertiary cache through the new target request and is refilled into the secondary cache.
Optionally, when the type of the target request is a data probe type in the channel task type set, the target request is issued by a tertiary cache, and the apparatus further includes:
The first judging module is used for feeding back to the third-level cache that the target data requested to be invalid by the target request is not stored in the second-level cache when the hit result is a miss;
the second judging module is used for starting to read the target data corresponding to the target request when the hit result is hit and the target data is only stored in the secondary cache;
and the third judging module is used for starting to read the target data in the second-level cache and requesting the first-level cache to invalidate the stored target data when the hit result is hit and the target data is stored in the second-level cache and the first-level cache simultaneously.
Optionally, the apparatus further includes:
The refill data module is used for acquiring the refill data from the tertiary cache request according to the new target request;
And the refilling module is used for entering the new target request from the specific data bit into the pipeline queue when the refill data is received through the new target request, and executing the operations of determining a target data block from the secondary cache, releasing old data stored in the target data block and writing the refill data into the target data block.
Optionally, the second-level cache comprises a plurality of data block groups, and each data block group comprises a plurality of data blocks;
The data block group is used for parallel processing of target requests for accessing the data blocks contained in the data block group;
The target data block is: and the data blocks in the secondary cache which are not occupied by other requests.
Optionally, the wake-up sub-module includes:
the first sending submodule is used for controlling the target request of the data acquisition type to enter the pipeline queue from the specific data bit through the new target request, and generating a wake-up request through the secondary cache and sending the wake-up request to the primary cache through the wake-up queue;
The second sending submodule is used for reading the refilled data from the second-level cache after the third number of data bits are spaced, and sending the target request of the data acquisition type as a refill request from a refill queue to the first-level cache;
The wake-up request is used for waking up the operation of reading data from the primary cache through the target request of the data acquisition type; the refill request is used for writing the refill data into the primary cache for reading by the target request of the data acquisition type.
Optionally, when the type of the target request belongs to a data acquisition type in the channel task type set, the target data is data stored in the secondary cache, and the generating response for the target data is: refilling the target data into a first-level cache for target request reading;
When the type of the target request belongs to the data release type in the channel task type set, the target data is the data stored in the primary cache, and the response generated for the target data is as follows: receiving and storing target data sent by the first-level cache through the second-level cache so as to release the target data by the first-level cache;
When the type of the target request belongs to the data exploration type in the channel task type set, the target data is the data stored in the secondary cache, and the response generated for the target data is as follows: and setting the target data to an invalid state.
Optionally, the enqueuing module 301 includes:
A first selecting submodule, configured to, when a request of a missing state register type exists in all the requests, use the request of the missing state register type as the target request;
A second selecting sub-module, configured to take the request of the data release type as the target request when, among all the requests, no request of the missing state register type exists but a request of the data release type does;
A third selecting sub-module, configured to take the request of the data probe type as the target request when, among all the requests, no request of the missing state register type or the data release type exists but a request of the data probe type does;
And a fourth selecting sub-module, configured to take the request of the data acquisition type as the target request when, among all the requests, no request of the missing state register type, the data release type, or the data probe type exists but a request of the data acquisition type does.
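Taken together, the four selecting sub-modules imply a fixed arbitration order, which can be sketched as follows (the kind labels are illustrative names; only the priority chain itself is the point):

```python
# highest priority first: MSHR retry > data release > data probe > data acquisition
PRIORITY = ("mshr_retry", "data_release", "data_probe", "data_acquire")

def select_target_request(requests):
    for kind in PRIORITY:
        for req in requests:
            if req.kind == kind:
                return req    # the highest-priority kind present wins
    return None
```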
Optionally, the executing module 303 includes:
A first statistics sub-module for obtaining a first number of requests contained in first n data bits of the pipeline queue;
A second statistics sub-module, configured to obtain a second number of allocated missing state registers from all the missing state registers;
A request blocking sub-module for blocking new requests from entering the pipeline queue if the sum of the first number and the second number is greater than or equal to the total number of missing state registers;
an allocation submodule, configured to allocate a missing state register of an idle state for the target request if the sum of the first number and the second number is smaller than the total number of missing state registers.
Optionally, the second-level cache includes a plurality of data block groups, and the data block groups are used for parallel processing target requests to access the data blocks included in the data block groups;
The apparatus further comprises:
A first blocking module, configured to prevent the target request from entering the pipeline queue when it is determined that the target request and a write buffer directory request in the pipeline queue both access the same data block group; the write buffer directory request is in at least one third data bit in the pipeline queue, and the third data bit is used for executing the operation of writing the buffer directory;
And the second blocking module is used for preventing the target request from entering the pipeline queue when the target request and the request in the missing state register are determined to access the same data.
In summary, in the embodiment of the present application, a target request can be selected from all the requests according to the type of request obtained by the second-level cache and entered into the pipeline queue for execution; when the pipeline queue does not successfully execute the target request, a corresponding miss status register is allocated to it for processing. In the present application, no miss status register is allocated when the target request enters the pipeline queue of the second-level cache; one is allocated only when execution of the target request fails. On the basis of satisfying the need to re-execute target requests that were not executed successfully, this reduces the consumption of miss status register resources and the delay caused by their allocation.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other.
The specific manner in which the various modules perform operations in the apparatus of the above embodiments has been described in detail in the embodiments of the method and will not be detailed here.
Embodiments of the present application provide a processing device for a cache request, including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the method described in one or more of the above embodiments.
Fig. 6 is a block diagram of an electronic device 600, according to an example embodiment. For example, the electronic device 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 6, an electronic device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls overall operation of the electronic device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 602 can include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is used to store various types of data to support operations at the electronic device 600. Examples of such data include instructions for any application or method operating on the electronic device 600, contact data, phonebook data, messages, pictures, multimedia, and so forth. The memory 604 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 606 provides power to the various components of the electronic device 600. The power supply components 606 can include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 600.
The multimedia component 608 includes a screen that provides an output interface between the electronic device 600 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, it may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the electronic device 600 is in an operating mode, such as a shooting mode or a multimedia mode, the front camera and/or the rear camera can receive external multimedia data. Each front or rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 610 is for outputting and/or inputting audio signals. For example, the audio component 610 includes a Microphone (MIC) for receiving external audio signals when the electronic device 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 604 or transmitted via the communication component 616. In some embodiments, audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 614 includes one or more sensors for providing status assessments of various aspects of the electronic device 600. For example, the sensor assembly 614 may detect the on/off state of the electronic device 600 and the relative positioning of components, such as the display and keypad of the electronic device 600; it may also detect a change in position of the electronic device 600 or of one of its components, the presence or absence of user contact with the electronic device 600, the orientation or acceleration/deceleration of the electronic device 600, and a change in its temperature. The sensor assembly 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is utilized to facilitate communication between the electronic device 600 and other devices, either in a wired or wireless manner. The electronic device 600 may access a wireless network based on a communication standard, such as WiFi, an operator network (e.g., 2G, 3G, 4G, or 5G), or a combination thereof. In one exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 616 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, ultra Wideband (UWB) technology, bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for implementing the methods provided by the embodiments of the application.
In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided, such as memory 604, including instructions executable by processor 620 of electronic device 600 to perform the above-described method. For example, the non-transitory storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
Fig. 7 is a block diagram of an electronic device 700, according to an example embodiment. For example, the electronic device 700 may be provided as a server. Referring to fig. 7, electronic device 700 includes a processing component 722 that further includes one or more processors and memory resources represented by memory 732 for storing instructions, such as application programs, executable by processing component 722. The application programs stored in memory 732 may include one or more modules that each correspond to a set of instructions. Further, the processing component 722 is configured to execute instructions to perform the methods provided by embodiments of the present application.
The electronic device 700 may also include a power supply component 726 configured to perform power management of the electronic device 700, a wired or wireless network interface 750 configured to connect the electronic device 700 to a network, and an input/output (I/O) interface 758. The electronic device 700 may operate based on an operating system stored in the memory 732, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
The embodiment of the application also provides a computer program product, comprising a computer program which, when being executed by a processor, realizes the method described in the above embodiment.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (13)

1. A method for processing a cache request, the method comprising:
Selecting a target request from all the requests according to the type of the requests obtained by the secondary cache, and entering the target request from a specific data bit of a pipeline queue of the secondary cache into the pipeline queue for execution; the pipeline queue comprises a plurality of data bits which are sequentially arranged; the specific data bit is a starting data bit;
When the pipeline queue successfully executes the target request, returning a response generated by executing the target request;
When the pipeline queue does not successfully execute the target request, a corresponding missing state register is allocated for the target request, and the target request is executed through the missing state register;
the step of entering the target request from the specific data bit of the pipeline queue of the secondary cache into the pipeline queue for execution comprises the following steps:
Entering the target request from the start data bit into the pipeline queue;
According to the type of the target request, executing an operation corresponding to the type and the target data bit when the target request is in the target data bit of the pipeline queue; the type of the target request is in a channel task type set, and the channel task type set comprises the type of the request sent by the processor aiming at the secondary cache;
the performing operations corresponding to the type and the target data bit when the target request is at the target data bit of the pipeline queue, comprising:
when the target data bit is the initial data bit, reading a cache directory of the secondary cache, and simultaneously blocking other requests except the target request;
When the target data bit is the first data bit, determining a hit result of the target request in the second-level cache according to the cache directory, determining that the pipeline queue does not successfully execute the target request when the hit result is a miss, and starting to read target data corresponding to the target request when the hit result is a hit; a first number of data bits are spaced between the first data bit and the start data bit;
And when the hit result is hit and the target data bit is a second data bit, reading to obtain the target data and generating a response to the target data, wherein the second data bit is separated from the first data bit by a second number of data bits.
2. The method according to claim 1, wherein the type of the target request is a data acquisition type included in a channel task type set, the channel task type set including a type of the request sent by the processor for the secondary cache; the pipeline queue does not successfully execute the target request and is used for representing that the second-level cache does not store the data requested to be acquired by the target request;
the allocating a corresponding missing state register for the target request, executing the target request through the missing state register, includes:
After the target request is separated from the pipeline queue, controlling the target request to enter the allocated missing state register for waiting, and generating a new target request of a missing state register type through the missing state register;
And when the new target request is used for obtaining the refilled data from the tertiary cache and refilling the refilled data into the secondary cache, waking up the access operation of the target request of the data acquisition type to the refilled data in the secondary cache through the new target request.
3. The method of claim 1, wherein when the type of the target request is a data probe type in a set of channel task types, the target request is issued by a tertiary cache, the method further comprising:
When the hit result is miss, feeding back to the third-level cache that the target data requested to be invalid by the target request is not stored in the second-level cache;
When the hit result is hit and the target data is only stored in the secondary cache, starting to read the target data corresponding to the target request;
When the hit result is hit and the target data is stored in the secondary cache and the primary cache at the same time, starting to read the target data in the secondary cache while requesting the primary cache to invalidate the stored target data.
4. The method for processing a cache request according to claim 2, further comprising:
acquiring the refill data from the tertiary cache request according to the new target request;
And when the refill data is received through the new target request, the new target request enters the pipeline queue from the specific data bit, the operations of determining a target data block from the secondary cache, releasing old data stored in the target data block and writing the refill data into the target data block are performed.
5. The method according to claim 4, wherein the secondary cache includes a plurality of data block groups, each data block group including a plurality of data blocks;
The data block group is used for parallel processing of target requests for accessing the data blocks contained in the data block group;
The target data block is: and the data blocks in the secondary cache which are not occupied by other requests.
6. The method according to claim 2, wherein the waking up the access operation of the data acquisition type target request to the refill data in the secondary cache by the new target request includes:
controlling the target request of the data acquisition type to enter the pipeline queue from the specific data bit through the new target request, and generating a wake-up request through the secondary cache and sending the wake-up request to the primary cache through the wake-up queue;
Reading refill data from the secondary cache after a third number of data bits are spaced, and sending a target request of the data acquisition type as a refill request from a refill queue to the primary cache;
The wake-up request is used for waking up the operation of reading data from the primary cache through the target request of the data acquisition type; the refill request is used for writing the refill data into the primary cache for reading by the target request of the data acquisition type.
7. The method according to claim 1, wherein when the type of the target request belongs to a data acquisition type in the channel task type set, the target data is data stored in the secondary cache, and the generating response for the target data is: refilling the target data into a first-level cache for target request reading;
When the type of the target request belongs to the data release type in the channel task type set, the target data is the data stored in the primary cache, and the response generated for the target data is as follows: receiving and storing target data sent by the first-level cache through the second-level cache so as to release the target data by the first-level cache;
When the type of the target request belongs to the data exploration type in the channel task type set, the target data is the data stored in the secondary cache, and the response generated for the target data is as follows: and setting the target data to an invalid state.
8. The method according to claim 1, wherein selecting a target request from all the requests according to the request type, comprises:
when a request of a missing state register type exists in all the requests, taking the request of the missing state register type as the target request;
in all the requests, a request of a missing state register type does not exist, but when a request of a data release type exists, the request of the data release type is taken as the target request;
In all the requests, there is no request of a missing state register type and a data release type, but when there is a request of a data exploration type, the request of the data exploration type is taken as the target request;
Among all the requests, there is no request of a missing state register type, a data release type, a data probe type, but when there is a request of a data acquisition type, the request of the data acquisition type is taken as the target request.
9. The method according to claim 1, wherein the allocating a corresponding miss status register for the target request comprises:
Obtaining a first number of requests contained in first n data bits of the pipeline queue;
acquiring a second number of the allocated missing state registers in all the missing state registers;
Preventing new requests from entering the pipeline queue if the sum of the first number and the second number is greater than or equal to the total number of missing state registers;
And allocating a missing state register of an idle state for the target request if the sum of the first number and the second number is less than the total number of missing state registers.
10. The method according to claim 1, wherein the second level cache includes a plurality of data block groups for parallel processing of access to data blocks included in the data block groups by a target request;
before the target request is entered into the pipeline queue from a particular data bit of the pipeline queue of the secondary cache for execution, the method further comprises:
When the target request and the write buffer directory request in the pipeline queue access the same data block group, the target request is prevented from entering the pipeline queue; the write buffer directory request is in at least one third data bit in the pipeline queue, and the third data bit is used for executing the operation of writing the buffer directory;
Upon determining that the target request and the request in the miss status register both access the same data, the target request is prevented from entering the pipeline queue.
11. A processing apparatus for buffering requests, the apparatus comprising:
The enqueuing module is used for selecting a target request from all the requests according to the types of the requests obtained by the secondary cache, and entering the target request into the pipeline queue from specific data bits of the pipeline queue of the secondary cache for execution; the pipeline queue comprises a plurality of data bits which are sequentially arranged; the specific data bit is a starting data bit;
The return module is used for returning a response generated by executing the target request when the pipeline queue successfully executes the target request;
The execution module is used for distributing a corresponding missing state register for the target request when the pipeline queue does not successfully execute the target request, and executing the target request through the missing state register;
The enqueuing module comprises:
an enqueuing sub-module for entering the target request from the start data bit into the pipeline queue;
An execution sub-module, configured to execute, according to a type of the target request, an operation corresponding to the type and the target data bit when the target request is in the target data bit of the pipeline queue; the type of the target request is in a channel task type set, and the channel task type set comprises the type of the request sent by the processor aiming at the secondary cache;
the execution sub-module includes:
A blocking unit, configured to read a cache directory of the second level cache when the target data bit is a start data bit, and simultaneously block other requests except the target request;
The first execution unit is used for determining a hit result of the target request in the secondary cache according to the cache directory when the target data bit is the first data bit, determining that the pipeline queue does not successfully execute the target request when the hit result is a miss, and starting to read target data corresponding to the target request when the hit result is a hit; a first number of data bits are spaced between the first data bit and the start data bit;
And the second execution unit is used for reading, obtaining and responding to the target data when the hit result is hit and the target data bit is a second data bit, wherein a second number of data bits are spaced between the second data bit and the first data bit.
12. An electronic device, comprising: a processor;
a memory for storing the processor-executable instructions;
Wherein the processor is configured to execute the instructions to implement the method of any one of claims 1 to 10.
13. A computer readable storage medium, characterized in that instructions in the computer readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any one of claims 1 to 10.
CN202410057628.9A 2024-01-15 2024-01-15 Processing method, device, equipment and storage medium for cache request Active CN117573573B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410057628.9A CN117573573B (en) 2024-01-15 2024-01-15 Processing method, device, equipment and storage medium for cache request


Publications (2)

Publication Number Publication Date
CN117573573A CN117573573A (en) 2024-02-20
CN117573573B true CN117573573B (en) 2024-04-23


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101013361A (en) * 2006-02-02 2007-08-08 国际商业机器公司 Apparatus and method for handling data cache misses out-of-order for asynchronous pipelines
CN103365794A (en) * 2012-03-28 2013-10-23 国际商业机器公司 Data processing method and system
CN110737475A (en) * 2019-09-29 2020-01-31 上海高性能集成电路设计中心 instruction buffer filling filter
CN114327641A (en) * 2021-12-31 2022-04-12 海光信息技术股份有限公司 Instruction prefetching method, instruction prefetching device, processor and electronic equipment
CN115048142A (en) * 2022-03-22 2022-09-13 深圳云豹智能有限公司 Cache access command processing system, method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11061822B2 (en) * 2018-08-27 2021-07-13 Qualcomm Incorporated Method, apparatus, and system for reducing pipeline stalls due to address translation misses




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant