CN115756604A - Execution instruction extraction method and device and electronic equipment - Google Patents


Info

Publication number
CN115756604A
Authority
CN
China
Prior art keywords
instruction
prefetch
cache
module
execution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211348834.2A
Other languages
Chinese (zh)
Inventor
宋杰
张楠赓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Canaan Creative Information Technology Ltd
Original Assignee
Hangzhou Canaan Creative Information Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Canaan Creative Information Technology Ltd filed Critical Hangzhou Canaan Creative Information Technology Ltd
Priority to CN202211348834.2A priority Critical patent/CN115756604A/en
Publication of CN115756604A publication Critical patent/CN115756604A/en
Pending legal-status Critical Current

Landscapes

  • Advance Control (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The disclosure provides an execution instruction extraction method and device and electronic equipment, relating to the field of computer technology. The specific implementation scheme is as follows: a first fetch request is acquired, the first fetch request containing the address of an execution instruction; the hit status of the address is detected in the instruction cache and the prefetch module; and the execution instruction is fetched according to the hit status. With this technical scheme, the storage space of the prefetch module holds both the execution instruction and its related prefetch instructions, the hit status of the execution instruction's address can be detected in the instruction cache and the prefetch module simultaneously, and the execution instruction is fetched according to that status. Unlike conventional schemes, in which hit lookups are performed only in the instruction fetch cache, this improves cache access efficiency, reduces power consumption, and allows instructions in the prefetch module to be sent to the instruction fetch pipeline early, improving pipeline performance.

Description

Execution instruction extraction method and device and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for extracting an execution instruction, and an electronic device.
Background
In a computer system, a cache memory (Cache) is part of the hierarchical storage and is responsible for caching instructions and data. The first-level cache comprises an instruction cache (I-Cache) and a data cache (D-Cache). In the instruction fetch pipeline, the hit rate of the I-Cache has a decisive influence on pipeline performance. How to improve the hit rate of the instruction fetch pipeline has therefore become an urgent problem in the prior art.
Disclosure of Invention
The disclosure provides an execution instruction extraction method and device and electronic equipment.
According to an aspect of the present disclosure, there is provided an execution instruction fetch method including:
acquiring a first extraction request, wherein the first extraction request comprises an address of an execution instruction;
detecting the hit condition of the address in the instruction cache and the prefetch module;
according to the hit condition, the execution instruction is extracted.
In one embodiment, fetching the execution instruction according to a hit condition includes:
in response to detecting a trigger miss event in the instruction cache and a trigger hit event in the prefetch module, an execution instruction is fetched from the prefetch module.
In one embodiment, fetching the execution instruction according to a hit condition further comprises:
controlling the prefetch module to backfill the execution instruction into the instruction cache, and deleting the execution instruction from the prefetch module after backfilling is completed.
In one embodiment, fetching the execution instruction according to a hit condition further comprises:
in response to detecting a trigger hit in the instruction cache and a trigger hit in the prefetch module, the execution instruction is fetched from the instruction cache.
In one embodiment, fetching the execution instruction according to a hit condition further comprises:
the execution instruction in the prefetch module is deleted.
In one embodiment, fetching the execution instruction according to a hit condition further comprises:
in response to detecting a trigger hit in the instruction cache and a trigger miss in the prefetch module, an execution instruction is fetched from the instruction cache.
In one embodiment, fetching the execution instruction according to a hit condition further comprises:
in response to detecting a trigger miss event in the instruction cache and a trigger miss event in the prefetch module, controlling the prefetch module to send a second fetch request to the secondary cache, wherein the second fetch request comprises the addresses of the execution instruction and its related prefetch instructions;
backfilling the execution instruction and its related prefetch instructions to the prefetch module according to the hit condition of the addresses in the secondary cache;
fetching the execution instruction from the prefetch module.
In one embodiment, backfilling the execution instruction and its associated prefetch instruction to the prefetch module based on a hit of the address in the level two cache comprises:
and controlling the secondary cache to backfill the execution instruction and the related prefetch instruction to the prefetch module in response to detecting that the addresses of the execution instruction and the related prefetch instruction all trigger hit events in the secondary cache.
In one embodiment, backfilling the execution instruction and its associated prefetch instruction to the prefetch module based on a hit of the address in the level two cache further comprises:
in response to detecting that the address of the execution instruction triggers a hit event in the secondary cache while some addresses of its related prefetch instructions do not, controlling the secondary cache to backfill the execution instruction and those related prefetch instructions that triggered hit events into the prefetch module;
obtaining the related prefetch instructions that did not trigger hit events from the main instruction cache and backfilling them to the prefetch module.
In one embodiment, backfilling the execution instruction and its associated prefetch instruction to the prefetch module based on a hit of the address in the level two cache further comprises:
controlling the main instruction cache to backfill those related prefetch instructions that did not trigger hit events into the secondary cache.
In one embodiment, backfilling the execution instruction and its associated prefetch instruction to the prefetch module based on a hit of the address in the level two cache further comprises:
in response to detecting that the address of the execution instruction triggers a miss event in the secondary cache, obtaining the execution instruction and its related prefetch instructions from the main instruction cache and backfilling them to the prefetch module.
In one embodiment, backfilling the execution instruction and its associated prefetch instruction to the prefetch module based on a hit of the address in the level two cache further comprises:
controlling the main instruction cache to backfill the execution instruction and its related prefetch instructions into the secondary cache.
In one embodiment, after the execution instruction is fetched from the prefetch module, the method further includes:
backfilling the execution instruction into the instruction cache, and deleting the execution instruction from the prefetch module after backfilling is completed.
According to an aspect of the present disclosure, there is provided an instruction fetch apparatus that executes an instruction, including:
the prefetch module is used for obtaining instructions backfilled from the secondary cache and backfilling instructions into the instruction cache, and is connected with the instruction fetch module through a first bypass;
the prefetch module is further configured to obtain an execution instruction fetch request of the instruction fetch module, so that the instruction fetch module fetches the execution instruction in the prefetch module through the first bypass.
In one embodiment, the prefetch module is connected to the main instruction cache via a second bypass, and the prefetch module is further configured to obtain the execution instruction and its related prefetch instructions from the main instruction cache via the second bypass.
According to another aspect of the present disclosure, there is provided an electronic device for performing instruction fetching, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method according to any one of the embodiments of the present disclosure.
According to the disclosed technique, the storage space of the prefetch module holds both the execution instruction and its related prefetch instructions, the hit status of the execution instruction's address can be detected in the instruction cache and the prefetch module simultaneously, and the execution instruction is fetched according to that status. Unlike conventional schemes, in which hit lookups are performed only in the instruction fetch cache, this improves cache access efficiency, reduces power consumption, and allows instructions in the prefetch module to be sent to the instruction fetch pipeline early, improving pipeline performance.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a first flowchart illustrating a method for fetching an execution instruction according to an embodiment of the present disclosure;
FIG. 2 is a second flowchart of the method for fetching an execution instruction according to an embodiment of the present disclosure;
FIG. 3 is a block diagram of an apparatus for fetching an execution instruction according to an embodiment of the present disclosure;
FIG. 4 is a block diagram of an electronic device for implementing the fetch method of execution instructions of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Instruction prefetching can effectively reduce the wait-for-backfill latency on an I-Cache miss and increase the speed of front-end instruction fetching.
In one instruction-prefetching arrangement, the instruction fetch module, i.e. the instruction fetch pipeline (Instruction Fetch Pipeline), is a work unit in the Central Processing Unit (CPU). The I-Cache stores the data most recently in demand by the fetch pipeline; the secondary cache (L2 Cache) has a larger storage capacity than the I-Cache and stores data of second-highest demand; the prefetch module (PreFetch) has less storage space than the I-Cache and is used to generate the prefetch address sequence; and the Bus Interface Unit (BIU) is connected to the CPU's main memory and hard disk, which hold all data the CPU requires. The execution steps are as follows:
1) The instruction fetch pipeline sends an access request carrying the address of the execution instruction to the I-Cache. If the execution instruction hits in the I-Cache, it is obtained directly and an access request for a new execution instruction is sent; if it misses, the address of the missed execution instruction is sent to the L2 Cache and the prefetch module.
2) The prefetch module generates a prefetch address queue from the address of the missed execution instruction sent by the I-Cache; the queue contains the addresses of the prefetch instructions related to the missed execution instruction. To keep the I-Cache from storing duplicate copies of related prefetch instructions, the prefetch module must access the I-Cache in the reverse direction after generating the queue and remove from the queue any related prefetch addresses already stored there. It then sends a prefetch request to the L2 Cache according to the queue.
3) After receiving the address of the missed execution instruction and the prefetch request, the L2 Cache directly returns the corresponding instructions to the I-Cache or prefetch module if the addresses hit in its own storage; for addresses that miss, it sends an access request to the BIU and obtains the corresponding instructions from the CPU's main memory. The L2 Cache then backfills the execution instruction into the I-Cache and the instructions of the prefetch address queue into the prefetch module.
4) After the I-Cache receives the backfilled execution instruction, the prefetch module returns the prefetched data to the I-Cache, and the instruction fetch pipeline obtains the execution instruction from the I-Cache and sends an access request for a new execution instruction.
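The conventional flow in steps 1)-4) can be sketched as below. This is an illustrative simulation, not the patent's implementation: the names (`icache`, `prefetch_buf`, `l2`, `main_mem`) and the sequential-address prefetch window are assumptions made for clarity.

```python
# Hypothetical sketch of the CONVENTIONAL prefetch flow. Caches are modeled as
# dicts mapping address -> instruction; main_mem stands in for memory behind the BIU.

def conventional_fetch(addr, icache, prefetch_buf, l2, main_mem, prefetch_window=2):
    """Return the execution instruction at `addr`; hits are served only from the I-Cache."""
    if addr in icache:                      # step 1: hit in the I-Cache
        return icache[addr]
    # step 2: build the prefetch address queue, excluding lines already in the I-Cache
    queue = [addr + i for i in range(1, prefetch_window + 1) if addr + i not in icache]
    # step 3: the L2 Cache serves the miss, falling back to main memory via the BIU
    icache[addr] = l2[addr] if addr in l2 else main_mem[addr]
    for a in queue:
        prefetch_buf[a] = l2[a] if a in l2 else main_mem[a]
    # step 4: prefetched data is moved into the I-Cache before the pipeline can
    # use it -- the extra indirection this disclosure aims to remove
    icache.update(prefetch_buf)
    prefetch_buf.clear()
    return icache[addr]
```

Note that every prefetched line passes through the I-Cache before the pipeline sees it, which is the inefficiency discussed next.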
In the steps above, because every prefetch address related to the execution instruction must be hit-checked in the I-Cache, the check must use an idle cycle or idle port of the I-Cache. If the I-Cache is single-ported, the fetch pipeline has to be stalled to perform the hit check, which may affect the correctness of data after the stall is released; if the I-Cache is multi-ported, the area and power consumption of the underlying implementation increase. Moreover, in the existing prefetch scheme a hit instruction is first sent to the I-Cache and only then passed to the fetch pipeline, which reduces instruction fetch efficiency.
Based on this, the present disclosure provides a method for fetching an execution instruction that differs from the conventional prefetch scheme. Fig. 1 is a first flowchart of the method according to an embodiment of the present disclosure, which includes:
s110, acquiring a first extraction request, wherein the first extraction request comprises an address of an execution instruction;
s120, detecting the hit condition of the address in the instruction cache and the prefetching module;
and S130, extracting the execution instruction according to the hit condition.
Illustratively, the instruction cache may be the I-Cache serving as the fetch cache of the instruction fetch pipeline, and the first fetch request is an instruction fetch request issued by the pipeline that contains the address of the execution instruction. In step S120, hit detection for the execution instruction is performed in the I-Cache and the prefetch module (PreFetch) in parallel, based on that address. Then, in step S130, the execution instruction is fetched according to the hit status: as long as the execution instruction is stored in either the I-Cache or the prefetch module, it can be obtained directly for use by the fetch pipeline.
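Steps S110-S130 can be sketched as a parallel lookup over both structures. The function and dict names are illustrative assumptions, not taken from the patent.

```python
# Minimal sketch of S110-S130: the execution instruction's address is checked in
# the I-Cache and the prefetch module at the same time, and the instruction is
# fetched from whichever structure holds it.

def fetch_execution_instruction(addr, icache, prefetch_buf):
    """Return (instruction, source) on a hit, or (None, 'miss') if both miss."""
    hit_icache = addr in icache          # S120: hit detection performed ...
    hit_prefetch = addr in prefetch_buf  # ... in both structures in parallel
    if hit_icache:                       # S130: prefer the fetch cache on a hit
        return icache[addr], "icache"
    if hit_prefetch:                     # otherwise bypass the I-Cache entirely
        return prefetch_buf[addr], "prefetch"
    return None, "miss"                  # both miss: escalate to the L2 Cache
```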
The PreFetch buffer is implemented as a fully-associative entry table, and the creation, use and replacement of entries are governed by an aging algorithm. Each entry can store data missed by the I-Cache, prefetched instruction data, or cache-instruction-related operations, such as the Prefetch.i instruction and its data in RISC-V.
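One possible shape for such a table is sketched below. The patent does not specify the aging rule; the decay-and-evict-smallest scheme here (counters halved on each tick, least-aged entry replaced) is an assumption chosen for illustration, as are the class and method names.

```python
# A sketch of a fully-associative entry table with aging-based replacement.
# entries maps address -> [instruction, age_counter]; higher counter = younger.

class PrefetchBuffer:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = {}

    def insert(self, addr, insn):
        if addr not in self.entries and len(self.entries) >= self.capacity:
            victim = min(self.entries, key=lambda a: self.entries[a][1])
            del self.entries[victim]          # replace the most-aged entry
        self.entries[addr] = [insn, 255]      # fresh entries start "young"

    def lookup(self, addr):
        if addr in self.entries:
            self.entries[addr][1] = 255       # a use rejuvenates the entry
            return self.entries[addr][0]
        return None

    def tick(self):
        for entry in self.entries.values():   # periodic aging: counters decay
            entry[1] >>= 1
```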
With the method of this embodiment, the storage space of the prefetch module (PreFetch) holds both the execution instruction and its related prefetch instructions, so prefetched instructions need not be refilled into the I-Cache. The hit status of the execution instruction's address can be detected in the instruction cache and the prefetch module simultaneously, and the instruction is fetched according to that status. Unlike conventional schemes, in which hit lookups are performed only in the fetch cache, this improves cache access efficiency, reduces power consumption, and allows instructions in the prefetch module to be sent to the fetch pipeline early, improving pipeline performance.
In one embodiment, step S130 includes:
in response to detecting a trigger miss event in the instruction cache and a trigger hit event in the prefetch module, an execution instruction is fetched from the prefetch module.
It can be understood that if the instruction cache, i.e. the I-Cache, does not hold the execution instruction but the prefetch module does, the fetch pipeline can bypass the I-Cache and fetch the execution instruction directly from the prefetch module, reducing pipeline latency and improving access efficiency.
Preferably, when a miss event is triggered in the instruction cache and a hit event is triggered in the prefetch module, step S130 further includes:
controlling the prefetch module to backfill the execution instruction into the instruction cache, and deleting the execution instruction from the prefetch module after backfilling is completed.
It can be understood that the storage space of the prefetch module is small: by the time the fetch pipeline reuses the execution instruction, its entry may already have been replaced by instructions of a new prefetch address queue. Therefore, besides letting the fetch pipeline fetch the execution instruction directly, the instruction can be backfilled into the I-Cache, and the copy in the prefetch module deleted once backfilling completes, saving storage space.
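The miss-in-I-Cache, hit-in-prefetch case above can be sketched as follows, again with caches modeled as dicts and all names assumed for illustration.

```python
# Sketch: on an I-Cache miss and prefetch-module hit, the pipeline takes the
# instruction via the bypass, the prefetch module backfills the I-Cache, and
# the now-duplicate prefetch entry is deleted to free its small storage.

def fetch_on_prefetch_hit(addr, icache, prefetch_buf):
    assert addr not in icache and addr in prefetch_buf
    insn = prefetch_buf[addr]   # pipeline fetches directly, bypassing the I-Cache
    icache[addr] = insn         # then the line is backfilled into the I-Cache
    del prefetch_buf[addr]      # and removed from the prefetch module
    return insn
```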
In one embodiment, step S130 further comprises:
in response to detecting a trigger hit in the instruction cache and a trigger hit in the prefetch module, the execution instruction is fetched from the instruction cache.
It can be understood that if both the I-Cache and the prefetch module hold the execution instruction, the I-Cache, as the dedicated fetch cache, clearly transfers data more efficiently, so the execution instruction is fetched from the I-Cache.
Preferably, when a hit event is triggered in the instruction cache and a hit event is triggered in the prefetch module, step S130 further includes:
the execution instructions in the prefetch module are deleted.
Because the I-Cache already stores the execution instruction, the duplicate in the prefetch module can be deleted to save storage space; the next time the execution instruction is needed, the copy in the I-Cache is used directly.
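The double-hit case reduces to a fetch plus a deduplication step; a minimal sketch under the same illustrative dict model:

```python
# Sketch: both structures hold the line. Serve from the I-Cache and drop the
# redundant prefetch-module copy so its small storage is not wasted.

def fetch_on_double_hit(addr, icache, prefetch_buf):
    insn = icache[addr]           # the fetch cache wins on a double hit
    prefetch_buf.pop(addr, None)  # delete the duplicate, if present
    return insn
```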
In one embodiment, step S130 further includes:
in response to detecting a trigger hit in the instruction cache and a trigger miss in the prefetch module, an execution instruction is fetched from the instruction cache.
It can be understood that if the execution instruction is found in the I-Cache but not in the prefetch module, it can be fetched directly from the I-Cache as in a conventional prefetch scheme, with no further operation on the prefetch module.
In one embodiment, as shown in fig. 2, step S130 further includes:
s201, in response to detecting that a miss event is triggered in the instruction buffer and a miss event is triggered in the pre-fetching module, controlling the pre-fetching module to send a second fetching request to the second-level buffer, wherein the second fetching request comprises an execution instruction and an address of the relevant pre-fetching instruction;
s202, according to the hit condition of the address in the secondary cache, backfilling the execution instruction and the related prefetch instruction to a prefetch module;
s203, an execution instruction is extracted from the pre-fetching module.
Illustratively, the related prefetch instructions of an execution instruction are instructions the fetch pipeline is likely to use in the near future, and the addresses of the execution instruction and its related prefetch instructions form the prefetch address queue computed by the prefetch module. The secondary cache may be an L2 Cache whose storage capacity is larger than the I-Cache and whose stored instructions are in lower demand by the fetch pipeline. When the execution instruction is found in neither the I-Cache nor the prefetch module, the prefetch module sends a second fetch request containing the addresses of the execution instruction and its related prefetch instructions to the L2 Cache, and the instructions are backfilled to the prefetch module according to the hit status of those addresses in the L2 Cache, so that the fetch pipeline can fetch the execution instruction, and later the prefetched instructions, from the prefetch module.
With the method of this embodiment, when the execution instruction is found in neither the I-Cache nor the prefetch module, the prefetch module sends the prefetch request to the L2 Cache, and once the prefetch module obtains the execution instruction, the fetch pipeline fetches it directly, without backfilling it into the I-Cache and performing another hit check, which improves pipeline performance. Moreover, when generating the prefetch address queue, the prefetch module need not check whether any of its addresses already exist in the I-Cache and cannot block a pipeline already accessing the I-Cache, so normal fetch efficiency is unaffected. Likewise, the prefetch module need not backfill prefetched instructions into the I-Cache, reducing operations on the fetch pipeline: prefetched instructions stay in the prefetch module, and because the fetch pipeline accesses the instruction cache and the prefetch module simultaneously, subsequent fetches that may target prefetched instructions are unaffected.
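The double-miss path of S201-S203 can be sketched as below. The dict model, function name, and sequential-address prefetch window are illustrative assumptions.

```python
# Sketch of S201-S203: both structures miss, so the prefetch module sends a
# second fetch request (execution address plus related prefetch addresses) to
# the L2 Cache, backfills the returned lines into itself, and the pipeline
# then fetches the execution instruction directly from the prefetch module.

def fetch_on_double_miss(addr, prefetch_buf, l2, main_mem, window=2):
    # S201: build the request; addresses already held need no new prefetch
    request = [addr] + [addr + i for i in range(1, window + 1)
                        if addr + i not in prefetch_buf]
    for a in request:               # S202: backfill according to L2 hit status,
        prefetch_buf[a] = l2[a] if a in l2 else main_mem[a]  # BIU on a miss
    return prefetch_buf[addr]       # S203: fetch directly from the prefetch module
```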
If a related prefetch instruction already exists in the prefetch module, its prefetch address need not be generated again.
In one embodiment, step S202 includes:
and controlling the secondary cache to backfill the execution instruction and the related prefetch instruction to the prefetch module in response to detecting that the addresses of the execution instruction and the related prefetch instruction all trigger hit events in the secondary cache.
It can be understood that if both the execution instruction and its related prefetch instructions hit in the L2 Cache, the L2 Cache is simply directed to backfill all of the hit instructions into the prefetch module for the fetch pipeline.
In one embodiment, step S202 further comprises:
in response to detecting that the address of the execution instruction triggers a hit event in the secondary cache while some addresses of its related prefetch instructions do not, controlling the secondary cache to backfill the execution instruction and those related prefetch instructions that triggered hit events into the prefetch module;
obtaining the related prefetch instructions that did not trigger hit events from the main instruction cache and backfilling them to the prefetch module.
Illustratively, the main instruction cache may be the CPU main memory connected to the BIU. If the execution instruction hits in the L2 Cache but some related prefetch instructions do not, the L2 Cache is first directed to backfill the execution instruction and the hit related prefetch instructions into the prefetch module, so the fetch pipeline can obtain the execution instruction as soon as possible; meanwhile, the BIU obtains the missed related prefetch instructions from the main instruction cache and backfills them into the prefetch module, for use by the pipeline after it has obtained the execution instruction.
It can be understood that the addresses of all the related prefetch instructions may also miss in the secondary cache; in that case, the L2 Cache is directed to backfill the execution instruction into the prefetch module first, and the related prefetch instructions are obtained from the main instruction cache through the BIU and backfilled into the prefetch module, which is not described again here.
Preferably, when none or only some of the related prefetch instructions of the execution instruction are found in the L2 Cache, step S202 further includes:
controlling the main instruction cache to backfill those related prefetch instructions that did not trigger hit events into the secondary cache.
It can be appreciated that because the storage space of the prefetch module is small, related prefetch instructions backfilled into it may be overwritten by later backfills before they are ever fetched. Directing the BIU to backfill the L2-missed related prefetch instructions into the L2 Cache therefore avoids the situation where the fetch pipeline later needs a related prefetch instruction whose copy in the prefetch module has already been replaced, which would force another fetch from CPU main memory through the BIU; this reduces pipeline waiting time and improves fetch efficiency.
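The partial-hit case, including the preferred backfill of L2 misses into the L2 Cache itself, can be sketched as follows; the dict model and names are illustrative assumptions.

```python
# Sketch: the execution instruction hits in L2 but some related prefetch
# addresses miss. L2 hits are backfilled to the prefetch module immediately;
# misses are fetched from main memory via the BIU and also backfilled into L2,
# so a later replacement in the small prefetch buffer does not force another
# trip to main memory.

def backfill_partial_l2_hit(exec_addr, prefetch_addrs, prefetch_buf, l2, main_mem):
    assert exec_addr in l2
    prefetch_buf[exec_addr] = l2[exec_addr]  # serve the urgent line first
    for a in prefetch_addrs:
        if a in l2:
            prefetch_buf[a] = l2[a]
        else:
            insn = main_mem[a]               # BIU fetch from main memory
            prefetch_buf[a] = insn
            l2[a] = insn                     # preferred embodiment: backfill L2 too
    return prefetch_buf[exec_addr]
```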
In one embodiment, step S202 further comprises:
in response to detecting that the address of the execution instruction triggers a miss event in the secondary cache, obtaining the execution instruction and its related prefetch instructions from the main instruction cache and backfilling them to the prefetch module.
It can be understood that if the execution instruction is not found in the L2 Cache, it must be fetched through the BIU and backfilled into the prefetch module.
Preferably, when the execution instruction is not found in the L2 Cache, step S202 further includes:
controlling the main instruction cache to backfill the execution instruction and its related prefetch instructions into the secondary cache.
It can be understood that, just as with backfilling related prefetch instructions into the L2 Cache above, when the execution instruction is not stored in the L2 Cache it can be backfilled into the L2 Cache at the same time as into the prefetch module. This prevents the case where the fetch pipeline later reuses the execution instruction after its copy in the prefetch module has been replaced, which would require obtaining it again through the BIU; it thus reduces pipeline waiting time and improves fetch efficiency.
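This final case, where the execution instruction itself misses in the L2 Cache, can be sketched under the same illustrative dict model:

```python
# Sketch: the execution instruction misses in L2, so it and its related
# prefetch instructions are fetched from main memory via the BIU and
# backfilled both to the prefetch module (for immediate use by the pipeline)
# and to the L2 Cache (to guard against early replacement in PreFetch).

def backfill_l2_miss(exec_addr, prefetch_addrs, prefetch_buf, l2, main_mem):
    assert exec_addr not in l2
    for a in [exec_addr] + list(prefetch_addrs):
        insn = main_mem[a]      # obtained through the BIU
        prefetch_buf[a] = insn  # available to the fetch pipeline immediately
        l2[a] = insn            # dual backfill per the preferred embodiment
    return prefetch_buf[exec_addr]
```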
In one embodiment, after step S203, the method further includes:
backfilling the execution instruction into the instruction cache, and deleting the execution instruction from the prefetch module after backfilling is completed.
It can be understood that after the execution instruction is backfilled into the prefetch module, the situation is similar to an I-Cache miss with a prefetch-module hit: to avoid the fetch pipeline later reusing the execution instruction after its entry in the prefetch module has been replaced by instructions of a new prefetch address queue, the execution instruction can, besides being fetched directly by the pipeline, be backfilled into the I-Cache, and the copy in the prefetch module deleted once backfilling completes, saving storage space.
In another embodiment, after step S203, the method further includes:
backfilling the execution instruction and its related prefetch instructions into the instruction cache, and deleting them from the prefetch module after backfilling is completed.
It can be understood that a designer can, based on the rule by which the designed prefetch module computes the prefetch address queue, evaluate how likely it is that the related prefetch instructions of an execution instruction will be needed by the instruction fetch pipeline in the short term. If the likelihood is high, the related prefetch instructions may be backfilled into the I-Cache together with the execution instruction; otherwise, the related prefetch instructions need not be backfilled into the I-Cache and are stored only in the L2 Cache. The specific range of backfilled instructions can be configured according to the actual situation and is not limited here.
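By way of example, one possible prefetch rule, assumed here purely for illustration, is sequential next-line prefetching. The line size and prefetch depth below are hypothetical parameters that a designer would tune when evaluating how useful the related prefetch instructions are likely to be:

```python
LINE_SIZE = 64       # assumed cache-line size in bytes (hypothetical)
PREFETCH_DEPTH = 4   # assumed number of related prefetch lines (hypothetical)

def make_prefetch_queue(miss_addr, line_size=LINE_SIZE, depth=PREFETCH_DEPTH):
    """Build a prefetch address queue from the address of the missed
    execution instruction: the missed line plus the next `depth` lines."""
    base = miss_addr - (miss_addr % line_size)        # align to the cache line
    return [base + i * line_size for i in range(depth + 1)]
```

With `depth=2` and 64-byte lines, a miss at address `0x1234` yields the queue `[0x1200, 0x1240, 0x1280]`: the missed line plus the next two lines.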
Fig. 3 is a schematic structural diagram of an execution instruction extraction apparatus according to an embodiment of the present disclosure. The apparatus includes:
the prefetch module, which is used for acquiring instructions backfilled from a secondary cache (L2 Cache) and for backfilling instructions into an instruction cache (I-Cache), and which is connected with the instruction fetch module through a first bypass;
the prefetch module is further configured to receive an execution instruction fetch request from the instruction fetch module, so that the instruction fetch module extracts the execution instruction from the prefetch module through the first bypass.
The prefetch module is also connected with the main instruction cache through a second bypass, and may further be used to acquire the execution instruction and its related prefetch instructions from the main instruction cache through the second bypass.
More specifically, as shown in Fig. 3, the prefetch module communicates with the BIU through the second bypass. When the prefetch address queue computed by the prefetch module misses in the L2 Cache, the prefetch module may directly obtain, through the BIU, the instructions corresponding to the prefetch address queue (including the execution instruction and its related prefetch instructions) from the main instruction cache.
In one implementation, as shown in Fig. 3, the apparatus may further include an I-Cache, an L2 Cache, a BIU, and an instruction fetch module; the functions of these modules are similar to those in the foregoing embodiments and are not repeated here.
The execution instruction extraction apparatus of the present disclosure operates as follows:
1) The instruction fetch pipeline sends access requests carrying the execution instruction address to the I-Cache and the prefetch module respectively. If the execution instruction hits in either of them, the instruction fetch mode is determined according to the hit condition:
a) the execution instruction hits in the I-Cache: the execution instruction is obtained directly from the I-Cache; in this case, if the prefetch module misses, no further operation is needed, and if the prefetch module also hits, the execution instruction in the prefetch module is deleted;
b) the execution instruction misses in the I-Cache but hits in the prefetch module: the instruction fetch pipeline extracts the execution instruction from the prefetch module; at the same time, the prefetch module is controlled to backfill the execution instruction into the I-Cache, and the execution instruction in the prefetch module is deleted after backfilling is completed.
2.1) When the execution instruction misses in both the I-Cache and the prefetch module, the prefetch module is controlled to generate a prefetch address queue from the address of the missed execution instruction; the queue contains the address of the missed execution instruction and the addresses of its related prefetch instructions. A prefetch request is then sent to the L2 Cache according to the prefetch address queue, and the instruction acquisition mode is determined by the hit condition of the queue in the L2 Cache:
a) the prefetch address queue hits completely in the L2 Cache: the L2 Cache is controlled to backfill the execution instruction and the related prefetch instructions into the prefetch module;
b) the prefetch address queue does not completely hit in the L2 Cache: the L2 Cache may be controlled to backfill the hit instructions into the prefetch module while the BIU looks up the missed instructions in the main instruction cache; the BIU is controlled to backfill the missed instructions directly into the prefetch module and also to backfill them into the L2 Cache.
2.2) After the execution instruction and the related prefetch instructions have been backfilled into the prefetch module, the instruction fetch pipeline extracts the execution instruction from the prefetch module; at the same time, the prefetch module may be controlled to backfill the execution instruction into the I-Cache and to delete it from the prefetch module after backfilling is completed.
3) After the instruction fetch pipeline successfully extracts the execution instruction, it sends an access request for a new execution instruction to the I-Cache and the prefetch module.
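Steps 1) to 3) above can be condensed into a single control-flow sketch. This is a behavioral model under simplifying assumptions, not the hardware design: `icache`, `prefetch_buf`, `l2`, and `memory` are hypothetical dictionary stand-ins, `memory` plays the role of the main instruction cache reached through the BIU, and `prefetch_queue` is any function that generates the prefetch address queue from a miss address:

```python
def fetch(addr, icache, prefetch_buf, l2, memory, prefetch_queue):
    """Return the execution instruction at `addr`, probing the I-Cache and
    the prefetch buffer in parallel as in step 1)."""
    if addr in icache:                    # step 1a: I-Cache hit
        prefetch_buf.pop(addr, None)      # delete any duplicate prefetch copy
        return icache[addr]
    if addr in prefetch_buf:              # step 1b: prefetch-buffer hit
        instr = prefetch_buf.pop(addr)    # extract, backfill I-Cache, delete copy
        icache[addr] = instr
        return instr
    # step 2.1: miss in both; probe the L2 with the whole prefetch address queue
    for a in prefetch_queue(addr):
        if a in l2:                       # 2.1a: L2 hit, backfill prefetch buffer
            prefetch_buf[a] = l2[a]
        else:                             # 2.1b: L2 miss, BIU fetches from the main
            instr = memory[a]             # instruction cache and backfills both
            prefetch_buf[a] = instr       # the prefetch buffer directly ...
            l2[a] = instr                 # ... and the L2 Cache
    # step 2.2: extract from the prefetch buffer, backfill I-Cache, delete copy
    instr = prefetch_buf.pop(addr)
    icache[addr] = instr
    return instr
```

Note how step 2.1b backfills the BIU result into the prefetch buffer directly, rather than going through the L2 Cache first, which corresponds to beneficial effect (2) below.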
The specific arrangements and implementations of the embodiments of the present application have been described above from various perspectives. The method provided by the embodiments yields the following beneficial effects:
(1) The hit condition of the execution instruction address can be detected in the instruction fetch cache and in the prefetch module at the same time, and the execution instruction is extracted according to the hit condition. This differs from the traditional approach, in which the execution instruction can only be obtained through the instruction fetch cache or must first be backfilled into it. The storage space of the prefetch module is thus fully utilized, cache access efficiency is improved, power consumption is reduced, and instructions in the prefetch module can be sent to the instruction fetch pipeline in advance, improving pipeline performance.
(2) The instructions required by the prefetch module can be obtained directly from the main instruction cache through the BIU, without the BIU first backfilling the L2 Cache and the L2 Cache then backfilling the prefetch module. This further increases the speed at which the instruction fetch pipeline obtains the execution instruction and improves the throughput of the whole pipeline.
(3) Because the instruction fetch cache and the prefetch module can be accessed at the same time, and duplicated entries between them are removed, the prefetch module does not need to access the I-Cache for an address check when generating a prefetch address queue. The instruction fetch pipeline therefore does not need to be stalled, and the larger area and higher power consumption of a multi-port I-Cache are avoided.
Fig. 4 shows a block diagram of an electronic device according to an embodiment of the present application. As shown in Fig. 4, the electronic device includes: a memory 410 and a processor 420, the memory 410 having stored therein instructions executable on the processor 420. The processor 420, when executing the instructions, implements the execution instruction extraction method in the above-described embodiments. The number of memories 410 and processors 420 may each be one or more. The electronic device is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant as examples only and are not meant to limit the implementations of the present application described and/or claimed herein.
The electronic device may further include a communication interface 430 for data exchange with external devices. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor 420 may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to an interface). In other embodiments, multiple processors and/or multiple buses may be used, as desired, along with multiple memories. Also, multiple electronic devices may be connected, with each device providing part of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in Fig. 4, but that does not indicate only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 410, the processor 420, and the communication interface 430 are integrated on a chip, the memory 410, the processor 420, and the communication interface 430 may complete communication with each other through an internal interface.
It should be understood that the processor may be a central processing unit (CPU), another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or any conventional processor. It is noted that the processor may be a processor supporting the Advanced RISC Machine (ARM) architecture.
Embodiments of the present application provide a computer-readable storage medium (such as the memory 410 described above) storing computer instructions, which when executed by a processor implement the methods provided in embodiments of the present application.
Alternatively, the memory 410 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created through use of the electronic device. Further, the memory 410 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 410 optionally includes memory located remotely with respect to the processor 420, which may be connected to the electronic device over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage media, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer readable media do not include transitory media, such as modulated data signals and carrier waves.
In the description of the present specification, the terms "first", "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various changes or substitutions within the technical scope of the present application, and these should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A method for executing instruction fetch, comprising:
acquiring a first extraction request, wherein the first extraction request comprises an address of an execution instruction;
detecting the hit condition of the address in an instruction cache and a prefetching module;
and extracting the execution instruction according to the hit condition.
2. The method of claim 1, wherein said fetching the execution instruction according to the hit condition comprises:
in response to detecting a trigger miss event in the instruction cache and a trigger hit event in the prefetch module, the execution instructions are fetched from the prefetch module.
3. The method of claim 2, wherein said fetching the execution instruction according to the hit condition further comprises:
and controlling the prefetch module to backfill the execution instruction into the instruction cache, and deleting the execution instruction from the prefetch module after backfilling is completed.
4. The method of claim 1, wherein said fetching the execution instruction according to the hit condition further comprises:
in response to detecting a trigger hit event in the instruction cache and a trigger hit event in the prefetch module, the executing instruction is fetched from the instruction cache.
5. The method of claim 4, wherein said fetching the execution instruction according to the hit condition further comprises:
deleting the execution instruction in the prefetch module.
6. The method of claim 1, wherein said fetching the execution instruction according to the hit condition further comprises:
in response to detecting that a hit event is triggered in the instruction cache and a miss event is triggered in the prefetch module, fetching the execution instruction from the instruction cache.
7. The method of claim 1, wherein said fetching the execution instruction according to the hit condition further comprises:
in response to detecting a miss event triggered in the instruction cache and a miss event triggered in the prefetch module, controlling the prefetch module to send a second fetch request to a second level cache, wherein the second fetch request comprises the address of the execution instruction and the address of the related prefetch instruction;
according to the hit condition of the address in the secondary cache, backfilling the execution instruction and the related prefetching instruction to the prefetching module;
the execution instruction is fetched from the prefetch module.
8. The method of claim 7, wherein backfilling the execution instructions and their associated prefetch instructions to the prefetch module based on a hit by the address in the level-two cache comprises:
and controlling the secondary cache to backfill the execution instruction and the related prefetch instruction to the prefetch module in response to detecting that the addresses of the execution instruction and the related prefetch instruction all trigger hit events in the secondary cache.
9. The method of claim 7, wherein backfilling the execution instructions and their associated prefetch instructions to the prefetch module based on a hit of the address in the level two cache further comprises:
in response to detecting that the address of the execution instruction triggers a hit event in the second-level cache and the addresses of the related prefetch instructions of the execution instruction do not all trigger hit events in the second-level cache, controlling the second-level cache to backfill the execution instruction and those related prefetch instructions that trigger hit events into the prefetch module;
and acquiring, from a main instruction cache, the related prefetch instructions that do not trigger hit events, and backfilling them into the prefetch module.
10. The method of claim 9, wherein backfilling the execution instructions and their associated prefetch instructions to the prefetch module based on a hit of the address in the level two cache further comprises:
and controlling the main instruction cache to backfill, into the second-level cache, the related prefetch instructions that do not trigger hit events.
11. The method of claim 7, wherein backfilling the execution instructions and their associated prefetch instructions to the prefetch module based on a hit of the address in the level two cache further comprises:
and in response to detecting that the address of the execution instruction triggers a miss event in the second-level cache, acquiring the execution instruction and its related prefetch instructions from the main instruction cache and backfilling them into the prefetch module.
12. The method of claim 11, wherein backfilling the execution instructions and their associated prefetch instructions to the prefetch module based on a hit of the address in the level two cache further comprises:
and controlling the main instruction cache to backfill the execution instruction and its related prefetch instructions into the second-level cache.
13. The method of any of claims 7-12, wherein after said fetching said execution instruction from said prefetch module, further comprising:
and backfilling the execution instruction into the instruction cache, and deleting the execution instruction from the prefetch module after backfilling is completed.
14. An apparatus for fetching an execution instruction, comprising:
the prefetch module is used for acquiring a backfilling instruction of the second-level buffer, backfilling the instruction to the instruction buffer, and is connected with the instruction fetching module through a first bypass;
the prefetch module is further used for acquiring an execution instruction fetch request of the instruction fetch module so that the instruction fetch module fetches the execution instruction in the prefetch module through the first bypass.
15. The apparatus of claim 14, wherein the prefetch module is coupled to the main instruction cache via a second bypass, the prefetch module further configured to:
the executed instructions and their associated prefetch instructions in the main instruction cache are fetched via the second bypass.
16. An electronic device for performing instruction fetching, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-13.
CN202211348834.2A 2022-10-31 2022-10-31 Execution instruction extraction method and device and electronic equipment Pending CN115756604A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211348834.2A CN115756604A (en) 2022-10-31 2022-10-31 Execution instruction extraction method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211348834.2A CN115756604A (en) 2022-10-31 2022-10-31 Execution instruction extraction method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN115756604A true CN115756604A (en) 2023-03-07

Family

ID=85355962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211348834.2A Pending CN115756604A (en) 2022-10-31 2022-10-31 Execution instruction extraction method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115756604A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117785294A (en) * 2023-12-29 2024-03-29 飞腾信息技术有限公司 Branch prediction method and system, instruction fetch control module, processor and storage medium

Similar Documents

Publication Publication Date Title
CN1991793B (en) Method and system for proximity caching in a multiple-core system
US11144468B2 (en) Hardware based technique to prevent critical fine-grained cache side-channel attacks
US8683129B2 (en) Using speculative cache requests to reduce cache miss delays
US8683136B2 (en) Apparatus and method for improving data prefetching efficiency using history based prefetching
US20170161194A1 (en) Page-based prefetching triggered by tlb activity
US9286221B1 (en) Heterogeneous memory system
US9311239B2 (en) Power efficient level one data cache access with pre-validated tags
CN104252425B (en) The management method and processor of a kind of instruction buffer
US10558569B2 (en) Cache controller for non-volatile memory
US20070180158A1 (en) Method for command list ordering after multiple cache misses
US8352646B2 (en) Direct access to cache memory
US11783032B2 (en) Systems and methods for protecting cache and main-memory from flush-based attacks
CN117609110B (en) Caching method, cache, electronic device and readable storage medium
US10229066B2 (en) Queuing memory access requests
KR20150079408A (en) Processor for data forwarding, operation method thereof and system including the same
US10853262B2 (en) Memory address translation using stored key entries
US9836396B2 (en) Method for managing a last level cache and apparatus utilizing the same
CN108874691B (en) Data prefetching method and memory controller
CN105095104A (en) Method and device for data caching processing
US20230205872A1 (en) Method and apparatus to address row hammer attacks at a host processor
CN115756604A (en) Execution instruction extraction method and device and electronic equipment
US20060143400A1 (en) Replacement in non-uniform access cache structure
WO2015171626A1 (en) Controlled cache injection of incoming data
CN108874690A (en) The implementation method and processor of data pre-fetching
US6976125B2 (en) Method and apparatus for predicting hot spots in cache memories

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination