WO2021189203A1 - Bandwidth equalization method and apparatus - Google Patents

Bandwidth equalization method and apparatus

Info

Publication number
WO2021189203A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
cache memory
index information
entry
cache
Prior art date
Application number
PCT/CN2020/080729
Other languages
French (fr)
Chinese (zh)
Inventor
王锦
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN202080092629.7A (patent CN114930306A)
Priority to PCT/CN2020/080729 (patent WO2021189203A1)
Publication of WO2021189203A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G06F12/0873 Mapping of cache memory to specific storage devices or parts thereof

Definitions

  • This application relates to the field of data communication, and in particular to a bandwidth equalization method and device.
  • a cache high-speed buffer memory
  • instructions that are frequently fetched from the main memory are cached in the cache, and the processor core then fetches the instructions it needs from the cache
  • the mapping relationship between the storage block in the main memory and the multiple caches is determined by the physical address of the storage block in the main memory.
  • the storage in the main memory is stored in the multiple caches.
  • the way instructions are deployed in multiple caches is only determined by the physical address of the instruction.
  • This application provides a bandwidth equalization method and device to address the uneven distribution of requests that arises because the deployment of main-memory instructions across multiple caches is determined solely by the physical addresses of those instructions in the main memory.
  • a bandwidth equalization method including: monitoring the access frequency of a plurality of cache memories; determining a cold spot cache memory and a hot spot cache memory among the plurality of cache memories; determining hot spot information in the hot spot cache memory; and recording the first index information of the hot spot information and the identification information of the cold spot cache memory in a target entry, where the target entry is an entry in the redirection data table.
  • the hotspot information in the cache with a high access frequency is redirected to the cache with a low access frequency, which changes how information is deployed across the multiple caches so that the deployment is related to the access frequency of each cache. As a result, when information is obtained according to a request, the problem of uneven request distribution is greatly alleviated, which mitigates the serious imbalance of access bandwidth and improves information acquisition performance; in addition, because the distribution of information in the multiple caches is related to the access frequency of each cache, the information acquisition performance can be estimated; furthermore, the hotspot information can be redirected according to the access frequency of each cache alone, so the steps are simple and easy to implement.
  • the monitoring the access frequency of the plurality of cache memories includes: in response to a frequency monitoring instruction, monitoring, based on a monitoring period, the access frequency of the plurality of cache memories in each monitoring period.
  • the determining the cold spot cache memory and the hot spot cache memory among the plurality of cache memories includes: according to the access frequency of each of the cache memories, determining the cache memory with the highest access frequency as the hot spot cache memory and the cache memory with the lowest access frequency as the cold spot cache memory; or, according to the access frequency of each of the cache memories, determining a cache memory with an access frequency greater than a first preset frequency as the hot spot cache memory and a cache memory with an access frequency less than a second preset frequency as the cold spot cache memory, wherein the first preset frequency is greater than the second preset frequency.
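The two selection options above can be sketched as follows. This is only an illustrative sketch: the function names, the dictionary of per-cache access counts, and the choice to return every cache that crosses a threshold are assumptions and do not come from the application.

```python
from typing import Dict, List, Tuple


def pick_by_extremes(freq: Dict[int, int]) -> Tuple[int, int]:
    """Option 1: the most accessed cache is hot, the least accessed is cold."""
    if not freq:
        raise ValueError("no caches to classify")
    hot = max(freq, key=freq.get)
    cold = min(freq, key=freq.get)
    return hot, cold


def pick_by_thresholds(
    freq: Dict[int, int], first_preset: int, second_preset: int
) -> Tuple[List[int], List[int]]:
    """Option 2: compare each cache against two preset frequencies."""
    # The text requires the first preset frequency to exceed the second.
    if first_preset <= second_preset:
        raise ValueError("first preset frequency must exceed the second")
    hot = [cid for cid, f in freq.items() if f > first_preset]
    cold = [cid for cid, f in freq.items() if f < second_preset]
    return hot, cold
```

With hypothetical counts such as `{0: 120, 1: 15, 2: 60}`, the first option picks cache 0 as hot and cache 1 as cold; the second option gives the same result for presets of 100 and 20.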
  • the determining hotspot information in the hotspot cache memory includes: determining whether the access frequency of the hotspot cache memory reaches the frequency configured by the register and whether the difference between the access frequency of the hotspot cache memory and the access frequency of the cold spot cache memory is greater than the configured value; if so, determining hot spot information in the hot spot cache memory.
  • it is judged whether the access frequency of the hot spot cache memory reaches the frequency configured by the register and whether the difference between the access frequency of the hot spot cache memory and the access frequency of the cold spot cache memory is greater than the configured value; when both conditions are met, the hot spot instruction is determined in the hot spot cache memory, and the first index information of the hot spot instruction and the identification information of the cold spot cache memory are recorded in the target entry, thereby realizing the redirection of the hot spot instruction.
  • a restriction condition is provided for starting the redirection process, and the redirection process can be started only when the restriction condition is met, which improves the accuracy of starting the redirection process.
  • the determining hot spot information in the hot spot cache memory includes: determining whether the access frequency of the hot spot cache memory is greater than n times the access frequency of the cold spot cache memory; if so, determining the hot spot information in the hot spot cache memory.
  • the hot spot instruction is determined in the hot spot cache memory so that the first index information of the hot spot instruction and the identification information of the cold spot cache memory are recorded in the target entry, thereby realizing the redirection of the hot spot instruction.
  • a restriction condition is provided for starting the redirection process, and the redirection process can be started only when the restriction condition is met, which improves the accuracy of starting the redirection process.
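The two alternative start conditions for redirection described above can be expressed as simple predicates. This is a hedged sketch: the function and parameter names are hypothetical, and "reaches the frequency configured by the register" is assumed here to mean greater than or equal to.

```python
def should_redirect_by_difference(
    hot_freq: int, cold_freq: int, register_freq: int, configured_diff: int
) -> bool:
    """Condition A: the hot cache reaches the register-configured frequency
    and its lead over the cold cache exceeds the configured value."""
    # Assumption: "reaches" is interpreted as >=.
    return hot_freq >= register_freq and (hot_freq - cold_freq) > configured_diff


def should_redirect_by_ratio(hot_freq: int, cold_freq: int, n: float) -> bool:
    """Condition B: the hot cache is accessed more than n times as often
    as the cold cache."""
    return hot_freq > n * cold_freq
```

Either predicate can gate the step that determines hot spot information, so that redirection starts only when the imbalance is large enough to justify it.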
  • the determining hotspot information in the hotspot cache memory includes: monitoring the access frequency of each buffer line in the hotspot cache memory; determining, according to the access frequency of each buffer line, a hotspot buffer line in the hotspot cache memory; and determining the information stored in the hotspot buffer line as hotspot information.
  • the redirection data table includes a plurality of table entries, each of the table entries includes a first identifier and a second identifier, the first identifier is a valid mark or an invalid mark, and the second identifier is a hot spot mark or a non-hot spot mark.
  • the recording the first index information of the hot spot information and the identification information of the cold spot cache memory in a target entry includes: determining candidate entries among the multiple entries according to the first identifier and the second identifier of each of the multiple entries, wherein the candidate entries include the entries whose first identifier is an invalid mark and the entries whose second identifier is a non-hot spot mark; determining the target entry among the candidate entries; and recording the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry.
  • the method further includes: after the first index information is recorded in the target entry, setting the second identifier in the target entry as a hot spot mark, and setting the first identifier in the target entry as a valid mark.
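The entry selection and recording steps above can be sketched as follows. The `Entry` class, its field names, and the tie-breaking policy (prefer an invalid entry, otherwise take the first non-hot entry) are assumptions made for illustration; the text only states that the target entry is chosen among the candidates.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    valid: bool = False                # first identifier: valid / invalid mark
    hot: bool = False                  # second identifier: hot / non-hot mark
    first_index: Optional[int] = None  # first index information
    cache_id: Optional[int] = None     # identification of the cold spot cache


def record_redirect(
    table: List[Entry], first_index: int, cold_cache_id: int
) -> Optional[int]:
    """Record a redirection; return the chosen entry position, or None."""
    # Candidates: entries marked invalid, plus entries marked non-hot.
    candidates = [i for i, e in enumerate(table) if not e.valid or not e.hot]
    if not candidates:
        return None  # every entry is valid and hot; nothing can be replaced
    # Assumed policy: reuse an invalid entry before evicting a non-hot one.
    target = next((i for i in candidates if not table[i].valid), candidates[0])
    # Record the mapping, then set the hot spot mark and the valid mark.
    table[target] = Entry(
        valid=True, hot=True, first_index=first_index, cache_id=cold_cache_id
    )
    return target
```

Returning `None` when no candidate exists is also a design assumption; the source does not say what happens when the table is full of valid hot entries.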
  • the method further includes: monitoring the access frequency of each entry in the redirection data table; judging whether the access frequency of the first entry is less than a third preset frequency, wherein the first entry is an entry whose second identifier is the hot spot mark; modifying the second identifier of the first entry whose access frequency is less than the third preset frequency to a non-hot spot mark; judging whether the access frequency of the second entry is greater than a fourth preset frequency, wherein the second entry is an entry whose second identifier is the non-hot spot mark; and modifying the second identifier of the second entry whose access frequency is greater than the fourth preset frequency to a hot spot mark; wherein the fourth preset frequency is greater than the third preset frequency.
  • the state of the second identifier in the entry, that is, determining whether to change the hot or cold state of the instruction corresponding to the first index information in the entry, so that the hot and cold state of the instruction corresponding to the first index information of each entry in the redirection data table is monitored and updated in real time, which ensures the timeliness and accuracy of the information in the redirection data table.
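Because the fourth preset frequency is greater than the third, the update of the hot spot mark behaves like a two-threshold hysteresis: an entry between the two thresholds keeps its current mark. A minimal sketch, assuming a hypothetical `Entry` record that carries its own access count:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    first_index: int   # first index information of the redirected item
    hot: bool          # second identifier: hot spot / non-hot spot mark
    access_freq: int   # accesses observed for this entry (assumed field)


def refresh_hot_marks(
    entries: List[Entry], third_preset: int, fourth_preset: int
) -> None:
    """Demote rarely used hot entries and promote busy non-hot entries."""
    if fourth_preset <= third_preset:
        raise ValueError("fourth preset frequency must exceed the third")
    for e in entries:
        if e.hot and e.access_freq < third_preset:
            e.hot = False   # hot entry cooled down
        elif not e.hot and e.access_freq > fourth_preset:
            e.hot = True    # non-hot entry heated up
```

The gap between the two thresholds keeps an entry whose access frequency hovers around a single cut-off from flipping its mark on every monitoring pass.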
  • the method further includes: filling the hot spot information into a cold spot buffer line in the cold spot cache memory.
  • the method further includes: determining any one of the buffer lines in the cold spot cache memory as the cold spot buffer line; or determining a buffer line in the cold spot cache memory whose access frequency is less than the fifth preset frequency as the cold spot buffer line; or determining the buffer line with the lowest access frequency in the cold spot cache memory as the cold spot buffer line.
  • the method further includes: reading a first request, where the first request is a request for obtaining information to be obtained, the first request carries second index information, and the second index information is the index information used to obtain the information to be obtained; determining the first index information of the information to be obtained according to the second index information; matching the first index information of the information to be obtained against the first index information in each entry of the redirection data table; if the first index information in one of the entries matches the first index information of the information to be obtained, determining the cache memory corresponding to the identification information of the cache memory in the matching entry as the first target cache memory; and sending the second index information to the first target cache memory, so that the first target cache memory obtains the to-be-obtained information according to the second index information.
  • the cache memory corresponding to the identification information of the cache memory in the matching entry is determined as the first target cache memory, and the second index information is sent to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information. The first request is thus routed based on the redirection data table, which greatly alleviates the problem of uneven request distribution, thereby mitigating the serious imbalance in acquisition bandwidth and improving information acquisition performance.
  • the method further includes: if the first index information in none of the entries matches the first index information of the information to be obtained, determining the first target cache memory according to the second index information and the mapping rule; and sending the second index information to the first target cache memory, so that the first target cache memory obtains the to-be-obtained information according to the second index information.
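The lookup path above (match against the redirection data table first, fall back to the mapping rule on a miss) can be sketched as below. Several details are assumptions made only for illustration: the redirection table is modeled as a dictionary from first index to cache identifier, the first index is derived by dropping the offset within a 64-byte line, and the default mapping rule interleaves lines across four caches. The application leaves the actual derivation and mapping rule to the specific scenario.

```python
from typing import Dict

LINE_BYTES = 64   # hypothetical buffer line size
NUM_CACHES = 4    # hypothetical number of caches


def first_index_of(second_index: int) -> int:
    """Assumed derivation: the line number that contains the address."""
    return second_index // LINE_BYTES


def default_mapping(second_index: int) -> int:
    """Assumed mapping rule: interleave consecutive lines across caches."""
    return (second_index // LINE_BYTES) % NUM_CACHES


def route(second_index: int, redirect_table: Dict[int, int]) -> int:
    """Return the identifier of the target cache for a request."""
    idx = first_index_of(second_index)
    if idx in redirect_table:
        # Hit in the redirection data table: use the recorded cache.
        return redirect_table[idx]
    # Miss: fall back to the address-based mapping rule.
    return default_mapping(second_index)
```

In this sketch the request itself still carries the second index information; only the choice of target cache changes when an entry matches.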
  • the method further includes: receiving third index information sent by the first target cache memory, where the third index information is calculated from the second index information and the storage interval, and is generated by the first target cache memory when it determines, according to the end identifier in the information to be obtained, that the acquisition of the information to be obtained has not been completed; determining the first index information of the information to be obtained according to the third index information; matching the first index information of the information to be obtained against the first index information in each entry of the redirection data table; if the first index information in one of the entries matches the first index information of the information to be obtained, determining the cache memory corresponding to the identification information of the cache memory in the matching entry as the second target cache memory; and sending the third index information to the second target cache memory, so that the second target cache memory obtains the to-be-obtained information according to the third index information.
  • the method further includes: if the first index information in none of the entries matches the first index information of the information to be obtained, determining the second target cache memory according to the third index information and the mapping rule; and sending the third index information to the second target cache memory, so that the second target cache memory obtains the to-be-obtained information according to the third index information.
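The continuation described above, where an unfinished read produces a further index that is routed again, can be sketched as a loop. This is a simplified illustration under stated assumptions: the next index is taken to be the current index plus the storage interval (the text only says it is calculated from the two), and the hypothetical callbacks `route` and `read_segment` stand in for the redirection lookup and the cache access. In the application the target cache itself generates the next index and sends it back; the loop collapses that exchange.

```python
from typing import Callable, List, Tuple


def fetch_all(
    start_index: int,
    storage_interval: int,
    route: Callable[[int], int],
    read_segment: Callable[[int, int], Tuple[bytes, int]],
) -> List[bytes]:
    """Collect segments until one carries an end identifier of 1."""
    segments: List[bytes] = []
    index = start_index
    while True:
        cache_id = route(index)                     # table lookup or mapping rule
        payload, end_flag = read_segment(cache_id, index)
        segments.append(payload)
        if end_flag == 1:                           # last segment reached
            return segments
        index += storage_interval                   # assumed: next index = index + interval
```

Each iteration routes independently, so consecutive segments of one item may be served by different caches.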
  • a bandwidth equalization device including: a first monitoring module, configured to monitor the access frequency of a plurality of cache memories; a first determining module, configured to determine a cold spot cache memory and a hot spot cache memory among the plurality of cache memories; a second determining module, configured to determine hot spot information in the hot spot cache memory; and a recording module, configured to record the first index information of the hot spot information and the identification information of the cold spot cache memory in a target entry, where the target entry is an entry in the redirection data table.
  • the first monitoring module is specifically configured to respond to a frequency monitoring instruction and monitor the access frequency of the plurality of cache memories in each monitoring period based on a monitoring period.
  • the first determining module is specifically configured to determine, according to the access frequency of each cache memory, the cache memory with the highest access frequency as the hot spot cache memory and the cache memory with the lowest access frequency as the cold spot cache memory; or to determine, according to the access frequency of each cache memory, a cache memory with an access frequency greater than a first preset frequency as the hot spot cache memory and a cache memory with an access frequency less than a second preset frequency as the cold spot cache memory, wherein the first preset frequency is greater than the second preset frequency.
  • the second determining module is specifically configured to determine whether the access frequency of the hot spot cache memory reaches the frequency configured by the register and whether the difference between the access frequency of the hot spot cache memory and the access frequency of the cold spot cache memory is greater than the configured value; if so, hot spot information is determined in the hot spot cache memory.
  • the second determining module is specifically configured to determine whether the access frequency of the hot spot cache memory is greater than n times the access frequency of the cold spot cache memory; if so, the hot spot information is determined in the hot spot cache memory.
  • the second determining module is specifically configured to monitor the access frequency of each buffer line in the hotspot cache memory; determine, according to the access frequency of each buffer line, a hot spot buffer line in the hot spot cache memory; and determine the information stored in the hot spot buffer line as hot spot information.
  • the redirection data table includes a plurality of table entries, each of the table entries includes a first identifier and a second identifier, the first identifier is a valid mark or an invalid mark, and the second identifier is a hot spot mark or a non-hot spot mark.
  • the recording module is specifically configured to determine candidate entries among the multiple entries according to the first identifier and the second identifier of each of the multiple entries, wherein the candidate entries include the entries whose first identifier is an invalid mark and the entries whose second identifier is a non-hot spot mark; determine the target entry among the candidate entries; and record the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry.
  • the device further includes: a setting module, configured to set the second identifier in the target entry as a hot spot mark, and set the first identifier in the target entry as a valid mark.
  • the device further includes: a second monitoring module, configured to monitor the access frequency of each entry in the redirection data table; a first judging module, configured to judge whether the access frequency of the first entry is less than the third preset frequency, wherein the first entry is an entry whose second identifier is the hot spot mark; a first modification module, configured to modify the second identifier of the first entry whose access frequency is less than the third preset frequency to a non-hot spot mark; a second judging module, configured to judge whether the access frequency of the second entry is greater than the fourth preset frequency, wherein the second entry is an entry whose second identifier is the non-hot spot mark; and a second modification module, configured to modify the second identifier of the second entry whose access frequency is greater than the fourth preset frequency to a hot spot mark; wherein the fourth preset frequency is greater than the third preset frequency.
  • a filling module, configured to fill the hot spot information into the cold spot buffer line in the cold spot cache memory.
  • the device further includes: a third determining module, configured to determine any buffer line in the cold spot cache memory as the cold spot buffer line; or to determine a buffer line in the cold spot cache memory whose access frequency is less than the fifth preset frequency as the cold spot buffer line; or to determine the buffer line with the lowest access frequency in the cold spot cache memory as the cold spot buffer line.
  • a reading module, configured to read a first request, where the first request is a request for obtaining information to be obtained, the first request carries second index information, and the second index information is the index information used to obtain the information to be obtained; a fourth determining module, configured to determine the first index information of the information to be obtained according to the second index information; a first matching module, configured to match the first index information of the information to be obtained against the first index information in each entry of the redirection data table; a fifth determining module, configured to, if the first index information in one of the entries matches the first index information of the information to be obtained, determine the cache memory corresponding to the identification information of the cache memory in the matching entry as the first target cache memory; and a first sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the to-be-obtained information according to the second index information.
  • it further includes: a sixth determining module, configured to, if the first index information in none of the entries matches the first index information of the information to be obtained, determine the first target cache memory according to the second index information and the mapping rule; and a second sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the to-be-obtained information according to the second index information.
  • a receiving module, configured to receive third index information sent by the first target cache memory, where the third index information is calculated from the second index information and the storage interval, and is generated by the first target cache memory when it determines, according to the end identifier in the information to be obtained, that the acquisition of the information to be obtained has not been completed; a seventh determining module, configured to determine the first index information of the information to be obtained according to the third index information; a second matching module, configured to match the first index information of the information to be obtained against the first index information in each entry of the redirection data table; an eighth determining module, configured to, if the first index information in one of the entries matches the first index information of the information to be obtained, determine the cache memory corresponding to the identification information of the cache memory in the matching entry as the second target cache memory; and a third sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the to-be-obtained information according to the third index information.
  • it further includes: a ninth determining module, configured to, if the first index information in none of the entries matches the first index information of the to-be-obtained information, determine the second target cache memory according to the third index information and the mapping rule; and a fourth sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the to-be-obtained information according to the third index information.
  • a computer-readable storage medium including a computer program which, when executed on a computer, causes the computer to execute the method described in any one of the first aspect.
  • a computer program is provided; when the computer program is executed by a computer, it is used to execute the method described in any one of the first aspect.
  • a chip including a processor and a memory, where the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method described in any one of the first aspect.
  • FIG. 1 is a schematic diagram of an application scenario of a bandwidth equalization method provided by an embodiment of the application
  • FIG. 2 is a schematic flowchart of a bandwidth equalization method provided by an embodiment of the application
  • FIG. 3 is a schematic diagram of the process of recording the first index information of the hot spot instruction and the identification information of the cold spot cache in the target entry provided by an embodiment of the application;
  • FIG. 4 is the first part of a schematic flowchart of an information acquisition method provided by an embodiment of this application.
  • FIG. 5 is the second part of a schematic flowchart of an information acquisition method provided by an embodiment of this application.
  • FIG. 6 is the third part of a schematic flowchart of an information acquisition method provided by an embodiment of this application.
  • FIG. 7 is the fourth part of a schematic flowchart of an information acquisition method provided by an embodiment of this application.
  • FIG. 8 is a schematic diagram of an application scenario including multiple caches provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a slice cache provided by an embodiment of the application.
  • FIG. 10 is the first part of a schematic flowchart of another information acquisition method provided by an embodiment of this application.
  • FIG. 11 is the second part of a schematic flowchart of another information acquisition method provided by an embodiment of the application.
  • FIG. 12 is the third part of a schematic flowchart of another information acquisition method provided by an embodiment of this application.
  • FIG. 13 is a schematic structural diagram of a bandwidth equalization device provided by an embodiment of the application.
  • “At least one (item)” refers to one or more, and “multiple” refers to two or more.
  • “And/or” is used to describe the association relationship of associated objects, indicating that there can be three types of relationships; for example, “A and/or B” can mean: only A, only B, or both A and B, where A and B can be singular or plural.
  • the character “/” generally indicates that the associated objects before and after are in an “or” relationship.
  • “At least one of the following items” or similar expressions refers to any combination of these items, including a single item or any combination of a plurality of items.
  • “At least one of a, b, or c” can mean: a, b, c, “a and b”, “a and c”, “b and c”, or “a and b and c”, where a, b, and c can be single or multiple.
  • Fig. 1 is a schematic diagram of an application scenario of a bandwidth equalization method provided by an embodiment of the application.
  • the application scenario may include: multiple caches, multiple processor cores, a main memory (not shown in the figure), a redirection module, a redirection data table, and a crossbar, where:
  • the main memory includes multiple storage blocks, each storage block is composed of a number of storage units, and each storage block is used to store information.
  • the main memory can be a data main memory, that is, the information stored in the main memory is data, or it can be an instruction main memory, that is, the information stored in the main memory is an instruction, which is not specifically limited in this application.
  • the instructions to be stored can be stored in the main memory as follows: judge whether the size of the instruction to be stored is greater than the storage capacity of a storage block in the main memory; if the size of the instruction to be stored is equal to or less than the storage capacity of the storage block, the instruction to be stored is stored in one storage block as a whole; if the size of the instruction to be stored is greater than the storage capacity of the storage block, the instruction to be stored is divided into multiple instruction segments according to the size of the instruction to be stored and the storage capacity of the storage block, and the divided instruction segments are then stored in multiple storage blocks.
  • the end indicator (EI) can also be set for the instruction stored in the storage block.
  • When the EI flag is 0, the instruction stored in the storage block is one of the instruction segments of the corresponding instruction to be stored and is not its last instruction segment; when the EI flag is 1, the instruction stored in the storage block is either the last instruction segment of the corresponding instruction to be stored or the whole instruction to be stored. Based on this, if the size of the instruction to be stored is equal to or less than the storage capacity of the storage block, the instruction to be stored is stored in one storage block as a whole, and the EI flag of the instruction in the storage block (that is, the instruction to be stored) is set to 1.
  • If the size of the instruction to be stored is greater than the storage capacity of the storage block, the instruction to be stored is divided into multiple instruction segments according to the size of the instruction to be stored and the capacity of the storage block, and the multiple instruction segments are then stored in multiple storage blocks, where the number of instruction segments is the same as the number of storage blocks, and one instruction segment corresponds to one storage block. Finally, the EI flag is set for the instruction stored in each storage block: if the instruction stored in the storage block is not the last instruction segment of the instruction to be stored, its EI flag is set to 0; if the instruction stored in the storage block is the last instruction segment of the instruction to be stored, its EI flag is set to 1.
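The storage procedure and the EI flag rules above can be sketched in a few lines. The function name and the use of fixed-size chunks equal to the block capacity are assumptions; the text only says the division depends on the instruction size and the block capacity.

```python
from typing import List, Tuple


def split_into_blocks(instruction: bytes, block_capacity: int) -> List[Tuple[bytes, int]]:
    """Return (segment, EI flag) pairs, one pair per storage block."""
    if block_capacity <= 0:
        raise ValueError("block capacity must be positive")
    # Fits in one block: store it whole and mark it as the end (EI = 1).
    if len(instruction) <= block_capacity:
        return [(instruction, 1)]
    # Otherwise cut it into capacity-sized segments (assumed policy).
    segments = [
        instruction[i:i + block_capacity]
        for i in range(0, len(instruction), block_capacity)
    ]
    last = len(segments) - 1
    # Only the final segment carries EI = 1; all earlier ones carry EI = 0.
    return [(seg, 1 if k == last else 0) for k, seg in enumerate(segments)]
```

A reader of these blocks can keep fetching consecutive segments until it sees a segment whose EI flag is 1.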
  • Each cache includes multiple cache lines (cachelines), each cacheline is composed of multiple storage units, each cacheline is used to store instructions from a storage block in the main memory, and the storage capacity of each cacheline is the same as the storage capacity of each storage block in the main memory.
  • the cache here may be, for example, a slice cache (slice-based cache memory), etc., which is not specifically limited in this application.
  • the mapping rules between storage blocks in the main memory and multiple caches include the mapping relationship between the storage blocks in the main memory and multiple caches, and the mapping relationship between the storage blocks in the main memory and the cache lines in the cache.
  • the mapping relationship between the storage block in the main memory and multiple caches can be set according to specific application scenarios, which is not specifically limited in this application.
  • the mapping relationship between the storage blocks in the main memory and the cache lines in a cache can be, for example, any one of set-associative, fully associative, or direct-mapped, which is not specifically limited in this application.
  • the instructions stored in the storage block in the main memory can be cached in the cache.
  • the cache corresponding to the storage block and the corresponding cache line in the corresponding cache may be determined according to the mapping rule, and then the instructions stored in the storage block are buffered in the corresponding cache line in the corresponding cache.
  • the redirection module is used to monitor the access frequency of each of the multiple caches and to adjust the storage location of hotspot instructions among the caches according to those access frequencies, thereby balancing the access frequency of the caches, distributing requests evenly, balancing the acquisition bandwidth, and improving information acquisition performance.
  • the redirection data table records the adjustments that the redirection module makes to hotspot instructions. It should be noted that the redirection module and the redirection data table are described in detail below, so they are not described further here.
  • FIG. 2 is a schematic flowchart of a bandwidth equalization method provided by an embodiment of the application.
  • the execution subject of the method may be, for example, a device or chip that can execute the method shown in FIG. 2 such as the redirection module in the above application scenario.
  • This application is explained by taking the case in which the main memory is an instruction main memory, that is, the information stored in the main memory is instructions, as an example.
  • the method includes the following steps:
  • Step 201 Monitor the access frequency of multiple caches.
  • the monitoring method may include the following two types, among which:
  • the first is to regularly monitor the access frequency of each of the multiple caches during the monitoring period.
  • multiple monitoring moments can be set, and at the beginning of each monitoring moment, the access frequency of each cache in the monitoring period is monitored.
  • Multiple monitoring moments and monitoring periods can be set according to specific application scenarios, which are not specifically limited here.
  • the monitoring period can be 0.01 ms.
  • the second is to monitor the access frequency of the multiple caches in each monitoring period in response to a frequency monitoring instruction.
  • a frequency monitoring instruction can be sent to the execution subject of the method (for example, the redirection module) to enable it. The execution subject receives and responds to the frequency monitoring instruction and immediately starts timing and counting accesses. When the timer reaches the monitoring period, the access count is recorded and the access frequency is calculated.
  • the access frequency of each cache in the monitoring period is obtained as follows: first, the number of accesses to each cache in the monitoring period is obtained through a counter; then, the ratio of that number of accesses to the duration of the monitoring period is determined as the access frequency of the corresponding cache. It should be noted that the monitoring period is the same for every cache, so the number of accesses to each cache in the monitoring period is positively correlated with its access frequency. Therefore, the number of accesses to each cache in the monitoring period can also be used directly as the access frequency of that cache, which reduces the amount of calculation while preserving the accuracy of the data, improves calculation efficiency, and saves calculation cost.
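  • As a minimal sketch of this calculation (the function name `access_frequencies` and the dictionary of per-cache counters are illustrative assumptions, not part of the application), the frequency is the counter value divided by the period, and the ordering of the caches is unchanged if the raw counts are used instead:

```python
def access_frequencies(access_counts: dict, period_s: float) -> dict:
    """Convert per-cache access counts for one monitoring period into frequencies."""
    if period_s <= 0:
        raise ValueError("monitoring period must be positive")
    return {cache_id: count / period_s for cache_id, count in access_counts.items()}


def ranking(values: dict) -> list:
    """Cache ids sorted from most to least accessed."""
    return sorted(values, key=values.get, reverse=True)
```

  • Because every cache shares the same period, `ranking(access_counts)` and `ranking(access_frequencies(access_counts, period_s))` give the same order, which is why the division can be skipped when only relative comparisons are needed.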
  • Step 202 Determine a cold spot cache and a hot spot cache among multiple caches.
  • the cold spot cache and the hot spot cache can be determined in the following two ways, among which:
  • Method 1 According to the access frequency of each cache, the cache with the largest access frequency is determined as a hot spot cache, and the cache with the smallest access frequency is determined as a cold spot cache. Specifically, each cache can be sorted in descending order of access frequency, the first cache is determined to be a hot spot cache, and the last cache is determined to be a cold spot cache.
  • Method 2 According to the access frequency of each cache, a cache with an access frequency greater than the first preset frequency is determined as a hot spot cache, and a cache with an access frequency less than the second preset frequency is determined as a cold spot cache, where the first preset frequency is greater than the second preset frequency. It should be noted that the first preset frequency and the second preset frequency can be set according to the statistical result of historical data.
  • Alternatively, the multiple caches can be sorted in descending order of access frequency, the top X caches determined as hot spot caches, and the bottom M caches determined as cold spot caches.
  • the values of X and M can be the same or different.
  • Various other methods can also be used to select a hot spot cache and a cold spot cache from multiple caches.
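  • The selection methods of step 202 can be sketched as follows. All function and parameter names are hypothetical; `freq` maps a cache id to its access frequency (or access count) for the monitoring period.

```python
def hot_cold_extremes(freq: dict) -> tuple:
    """Method 1: the most accessed cache is hot, the least accessed is cold."""
    ranked = sorted(freq, key=freq.get, reverse=True)
    return ranked[0], ranked[-1]


def hot_cold_thresholds(freq: dict, first_preset: float, second_preset: float) -> tuple:
    """Method 2: compare each cache against two preset frequencies."""
    if first_preset <= second_preset:
        raise ValueError("first preset frequency must exceed the second")
    hot = [c for c, f in freq.items() if f > first_preset]
    cold = [c for c, f in freq.items() if f < second_preset]
    return hot, cold


def hot_cold_ranked(freq: dict, x: int, m: int) -> tuple:
    """Ranking variant: top X caches are hot, bottom M caches are cold."""
    ranked = sorted(freq, key=freq.get, reverse=True)
    # Slice from an explicit start index so that m == 0 yields an empty list.
    return ranked[:x], ranked[len(ranked) - m:]
```

  • Method 1 always yields exactly one hot spot cache and one cold spot cache, whereas the threshold and ranking variants may yield several of each or, for the threshold variant, none.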
  • Step 203 Determine the hotspot instruction in the hotspot cache.
  • If the main memory is an instruction main memory, the hotspot information is a hotspot instruction;
  • if the main memory is a data main memory, the hotspot information is hotspot data. Since the description here takes the main memory as an instruction main memory as an example, the hotspot information here is a hotspot instruction.
  • In step 203, first the access frequency of each cacheline in the hotspot cache is monitored, then the hotspot cacheline in the hotspot cache is determined according to the access frequency of each cacheline, and finally the instruction stored in the hotspot cacheline is determined as the hotspot instruction.
  • A counter is used to obtain the number of accesses to each cacheline in the hotspot cache during the same time interval, and the ratio of that number of accesses to the length of the time interval is determined as the access frequency of the corresponding cacheline. It should be noted that, since every cacheline is measured over the same time interval, the number of accesses to each cacheline in that interval is positively correlated with its access frequency. Therefore, the number of accesses to each cacheline in the time interval can also be used directly as the access frequency of that cacheline, which reduces the amount of calculation while preserving the accuracy of the data, improves calculation efficiency, and saves calculation cost.
  • the way to determine the hotspot cacheline in the hotspot cache can, for example, be: according to the access frequency of each cacheline, the cachelines in the hotspot cache are sorted in descending order of access frequency. The first cacheline is determined as a hot cacheline; or the access frequency of each cacheline is compared with a set value, and the cacheline with an access frequency greater than the set value is determined as a hot cacheline. It should be noted that the above manner is only exemplary and is not used to limit the present invention.
  • the hotspot cacheline in each hotspot cache can be determined according to the above principle, and then the hotspot instruction corresponding to each cacheline is determined according to each hotspot cacheline.
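  • A minimal sketch of the set-value variant of step 203 follows. The name `hot_instructions`, the per-cacheline counter dictionary, and the representation of cacheline contents as strings are assumptions made only for illustration.

```python
def hot_instructions(line_counts: dict, line_contents: dict, set_value: int) -> list:
    """Return the instructions held in cachelines accessed more than set_value times."""
    hot_lines = [line for line, count in line_counts.items() if count > set_value]
    # The instruction stored in each hot cacheline is a hotspot instruction.
    return [line_contents[line] for line in hot_lines]
```

  • The sorting variant described above would instead pick the single cacheline with the largest count.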
  • Step 204 Record the first index information of the hot spot instruction and the identification information (id) of the cold spot cache in a target entry (entry), where the target entry is an entry in the redirection data table.
  • the first index information of the hotspot instruction may be determined according to a mapping rule between a storage block in the main memory and multiple caches.
  • the redirection data table includes multiple entries, and each entry may include a first identifier and a second identifier. The first identifier takes one of two values, a valid flag or an invalid flag: if the first identifier in an entry is a valid flag, the entry holds valid information, and if it is an invalid flag, the entry holds no valid information. The second identifier also takes one of two values, a hotspot flag or a non-hotspot flag.
  • each entry includes four areas: one area records the first identifier, one area records the second identifier, one area records the first index information of an instruction, and one area records the id of a cache.
  • the first index information of the instruction in each entry can be determined according to the mapping rule between the storage block in the main memory and multiple caches.
  • the steps of recording the first index information of the hot spot instruction and the id of the cold spot cache in the target entry are as follows:
  • Step 301 According to the first identifier and the second identifier of each of the multiple entries, candidate entries are determined among the multiple entries, where the candidate entries include the entries whose first identifier is an invalid flag and the entries whose second identifier is a non-hotspot flag.
  • That is, the candidate entries are the entries that hold no valid information and the entries in which the instruction corresponding to the recorded first index information is a cold spot instruction.
  • Step 302 Determine the target entry among the candidate entries. Specifically, any one of the candidate entries can be determined as the target entry; or, if the candidate entries include both entries whose first identifier is an invalid flag and entries whose second identifier is a non-hotspot flag, an entry whose first identifier is an invalid flag is preferentially determined as the target entry; or, if the candidate entries include only entries whose second identifier is a non-hotspot flag, any entry whose second identifier is a non-hotspot flag is determined as the target entry.
  • the target entry of each hotspot instruction needs to be determined by the above principles.
  • Step 303 Record the first index information of the hot spot instruction and the id of the cold spot cache in the target entry.
  • the second identifier in the target entry can also be set to a hotspot flag to indicate that the instruction corresponding to the first index information recorded in the target entry is a hotspot instruction, and the first identifier in the target entry can be set to a valid flag to indicate that the target entry holds valid information.
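  • Steps 301 to 303 can be sketched as below. The `Entry` class and both function names are hypothetical; the sketch implements the variant of step 302 that prefers invalid entries over valid non-hotspot entries, and it returns `False` when no candidate entry exists, a case the text above does not address.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    valid: bool = False                 # first identifier
    hot: bool = False                   # second identifier
    first_index: Optional[int] = None   # first index information of an instruction
    cache_id: Optional[int] = None      # id of the cache holding that instruction


def pick_target_entry(table: list) -> Optional[Entry]:
    """Steps 301-302: prefer an invalid entry, else a valid non-hotspot entry."""
    for entry in table:
        if not entry.valid:
            return entry
    for entry in table:
        if not entry.hot:
            return entry
    return None


def record_redirect(table: list, first_index: int, cold_cache_id: int) -> bool:
    """Step 303: write the hotspot instruction's index and the cold cache id."""
    target = pick_target_entry(table)
    if target is None:
        return False
    target.first_index = first_index
    target.cache_id = cold_cache_id
    target.valid = True
    target.hot = True
    return True
```

  • With this preference order, an existing redirection is overwritten only when the table has no free entry and the overwritten entry has already cooled down.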
  • the cold spot cache corresponding to each hot spot instruction can be determined as follows: a corresponding cold spot cache is determined for each hot spot cache, and the cold spot cache corresponding to each hot spot cache is then used as the cold spot cache for the hot spot instructions in that hot spot cache. Specifically:
  • the method of determining the corresponding cold spot cache for each hot spot cache can include:
  • If there is only one cold spot cache, that cold spot cache is determined as the cold spot cache corresponding to each hot spot cache. That is, every hot spot cache corresponds to the same cold spot cache.
  • If there are multiple cold spot caches, a corresponding cold spot cache can be determined for each hot spot cache from among the multiple cold spot caches.
  • The cold spot caches corresponding to the hot spot caches can be all the same, all different, or partly the same and partly different, which is not specifically limited here.
  • the hot-spot instruction can also be filled into the cold-spot cacheline in the cold-spot cache. That is, the cold spot cache line is determined in the cold spot cache, and the hot spot instructions are filled into the cold spot cache line in the cold spot cache.
  • the method of determining the cold spot cacheline in the cold spot cache may include: determining any cacheline in the cold spot cache as the cold spot cacheline; or monitoring the access frequency of each cacheline in the cold spot cache and determining a cacheline whose access frequency is less than the fifth preset frequency as the cold spot cacheline; or monitoring the access frequency of each cacheline in the cold spot cache and determining the cacheline with the smallest access frequency as the cold spot cacheline. It should be noted that the above process is only exemplary and is not used to limit this application.
  • each hot-spot instruction is respectively filled into the cold-spot cacheline in its corresponding cold-spot cache.
  • this application also includes: monitoring the access frequency of each entry in the redirection data table; determining whether the access frequency of a first entry is less than the third preset frequency, and modifying the second identifier of a first entry whose access frequency is less than the third preset frequency to a non-hotspot flag; and determining whether the access frequency of a second entry is greater than the fourth preset frequency, and modifying the second identifier of a second entry whose access frequency is greater than the fourth preset frequency to a hotspot flag. Here, a first entry is an entry in the redirection data table whose second identifier is a hotspot flag, a second entry is an entry in the redirection data table whose second identifier is a non-hotspot flag, and the fourth preset frequency is greater than the third preset frequency.
  • the process of monitoring the access frequency of each entry in the redirection data table includes: counting the number of times that the first index information stored in each entry is successfully matched by the index information of an instruction to be acquired within a preset time interval, and determining the ratio of that number of matches to the preset time interval as the access frequency of the corresponding entry.
  • Since the preset time interval is the same for every entry, the number of times that the first index information stored in each entry is successfully matched within the preset time interval is positively correlated with the access frequency of the entry. Therefore, that number of matches can also be used directly as the access frequency of the corresponding entry, which improves calculation efficiency and reduces the amount and cost of calculation while preserving the accuracy of the data.
  • In this way, the hot or cold status of each entry in the redirection data table can be monitored and updated in real time, which ensures the timeliness and accuracy of the information in the redirection data table.
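  • The two-threshold update of the second identifier can be sketched as follows. The function name and the use of per-entry match counts in place of frequencies are illustrative assumptions; `third_preset` and `fourth_preset` stand for the third and fourth preset frequencies.

```python
def update_hot_flags(hot_flags: dict, match_counts: dict,
                     third_preset: int, fourth_preset: int) -> dict:
    """Return new hotspot flags for the entries after one preset time interval."""
    if fourth_preset <= third_preset:
        raise ValueError("fourth preset frequency must exceed the third")
    updated = dict(hot_flags)
    for entry_id, is_hot in hot_flags.items():
        matches = match_counts.get(entry_id, 0)
        if is_hot and matches < third_preset:
            updated[entry_id] = False   # hotspot entry has cooled down
        elif not is_hot and matches > fourth_preset:
            updated[entry_id] = True    # non-hotspot entry has heated up
    return updated
```

  • Because the fourth preset frequency is greater than the third, an entry whose match count lies between the two thresholds keeps its current flag, so the flag does not flip back and forth on small fluctuations.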
  • the hotspot instruction can be determined in the hotspot cache in either of the following two ways:
  • Method 1 Determine whether the access frequency of the hot spot cache reaches the frequency configured by the register and whether the difference between the access frequency of the hot spot cache and the access frequency of the cold spot cache is greater than the configured value. If so, determine the hot spot instruction in the hot spot cache.
  • the frequency configured by the register may be 1000 MOPS, for example.
  • the cold spot cache corresponding to each hot spot cache can be determined among the cold spot caches. It is then determined, for each hot spot cache, whether its access frequency reaches the frequency configured by the register and whether the difference between its access frequency and the access frequency of its corresponding cold spot cache is greater than the configured value, and the hotspot instruction is determined in each hot spot cache that meets these conditions. It should be noted that the method of determining the cold spot cache corresponding to each hot spot cache has been described above, so it is not repeated here.
  • Method 2 Determine whether the access frequency of the hot spot cache is greater than n times the access frequency of the cold spot cache, and if so, determine the hot spot instruction in the hot spot cache.
  • the cold spot cache corresponding to each hot spot cache can be determined among the cold spot caches. It is then determined, for each hot spot cache, whether its access frequency is greater than n times the access frequency of its corresponding cold spot cache, and the hotspot instruction is determined in each hot spot cache that meets this condition.
  • Once the condition is met, the hotspot instruction can be determined in the hot spot cache, and the first index information of the hotspot instruction and the id of the cold spot cache are then recorded in the target entry to realize the redirection of the hotspot instruction.
  • the above method 1 and method 2 provide restriction conditions for starting the redirection process, and the redirection process can be started only when the restriction conditions are met, which improves the accuracy of starting the redirection process.
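  • The two start conditions can be written as simple predicates. The function and parameter names are hypothetical, and reading "reaches the frequency configured by the register" as a greater-than-or-equal comparison is an assumption.

```python
def should_redirect_method1(hot_freq: float, cold_freq: float,
                            register_freq: float, configured_gap: float) -> bool:
    """Method 1: hot cache reaches the register frequency and the gap is large enough."""
    return hot_freq >= register_freq and (hot_freq - cold_freq) > configured_gap


def should_redirect_method2(hot_freq: float, cold_freq: float, n: float) -> bool:
    """Method 2: hot cache is accessed more than n times as often as the cold cache."""
    return hot_freq > n * cold_freq
```

  • Method 1 guards against redirecting when the system is lightly loaded overall, while Method 2 depends only on the relative imbalance between the two caches.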
  • In this way, the hotspot information in a cache with a high access frequency is redirected to a cache with a low access frequency. This changes how information is deployed across the multiple caches, so that the deployment depends on the access frequency of each cache.
  • FIG. 4, FIG. 6, and FIG. 7 are, respectively, the first, third, and fourth parts of a schematic flowchart of an information acquisition method provided by an embodiment of this application.
  • The execution subject of the information acquisition method may be the same as or different from the execution subject of the above bandwidth equalization method, which is not specifically limited in this application.
  • the information acquisition process will be described by taking the execution body as the redirection module in FIG. 1, the main memory as the instruction main memory, and the information to be acquired as the instruction to be acquired as an example.
  • the information acquisition process may include:
  • Step 401 The redirection module reads the first request.
  • the first request is a request for obtaining an instruction to be obtained
  • the first request carries second index information
  • the second index information is index information for obtaining an instruction to be obtained.
  • the number of second index information carried in the first request is at least one. If there are multiple second index information carried in the first request, the first request is a request for obtaining the instruction to be obtained corresponding to each second index information.
  • If the size of the instruction to be acquired is greater than the capacity of a storage block in the main memory, the second index information is the index information of the storage block that stores the first instruction segment of the instruction to be acquired; if the size is less than or equal to the capacity of a storage block in the main memory, the second index information is the index information of the storage block that stores the instruction to be acquired.
  • Step 402 The redirection module binds an execution thread to the first request. It should be noted that execution threads are mapped one-to-one to processor cores, and only an execution thread whose processor core has completed its previous instruction acquisition request can be bound to a new first request. When multiple execution threads are available for binding, the first request is bound to the execution thread with the shallowest instruction queue (Instruction Queue, Inst Q).
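  • The binding rule of step 402 can be sketched as a selection over queue depths. The function name `bind_thread` and its inputs are hypothetical: `queue_depths` maps an execution thread id to its current Inst Q depth, and `available` is the set of threads whose processor core has completed its previous instruction acquisition request.

```python
from typing import Optional


def bind_thread(queue_depths: dict, available: set) -> Optional[int]:
    """Pick the available execution thread with the shallowest instruction queue."""
    candidates = [t for t in queue_depths if t in available]
    if not candidates:
        return None  # no thread can accept the request yet
    return min(candidates, key=lambda t: queue_depths[t])
```

  • A thread with an empty queue is ignored if it is not in `available`, since its processor core is still busy with an earlier request.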
  • Step 403 The redirection module determines the first index information of the instruction to be acquired according to the second index information.
  • the first index information of the instruction to be acquired is determined from the second index information according to the mapping rule between the storage blocks in the main memory and the multiple caches.
  • the first index information of the instruction to be acquired may be, for example, the second index information itself or a part of the second index information.
  • For example, if the second index information is the physical address of the instruction to be acquired,
  • the first index information of the instruction to be acquired may be that physical address or some of the fields in that physical address.
  • If the first request carries multiple pieces of second index information, one of them is selected in step 403, and step 403 and the following steps are executed according to the selected second index information.
  • Step 404 The redirection module matches the first index information of the instruction to be acquired with the first index information in each entry in the redirection data table.
  • Step 405 If the first index information in an entry matches the first index information of the instruction to be acquired, the redirection module determines the cache corresponding to the id of the cache in the entry that matches the first index information of the instruction to be acquired as The first target cache.
  • Step 406 The redirection module sends the second index information to the first target cache, so that the first target cache obtains the instruction to be obtained according to the second index information.
  • In steps 403 to 406, the first index information of the instruction to be acquired is determined according to the second index information and is matched against the first index information in each entry in the redirection data table, and the matching result indicates whether the instruction to be acquired has been redirected. If so, the cache that corresponds to the instruction after redirection, that is, the first target cache, is determined, so that the instruction to be acquired is obtained from the first target cache.
  • If not, the cache corresponding to the instruction to be acquired, that is, the first target cache, is determined according to the second index information and the mapping rule between the storage blocks in the main memory and the multiple caches, and the instruction to be acquired is obtained from the first target cache. See step 407 for the specific process.
  • Step 407 If the first index information in each entry does not match the first index information of the instruction to be obtained, the redirection module determines the first target cache according to the second index information and the mapping rule.
  • the mapping rule is the mapping relationship between the storage block in the main memory and multiple caches, and the mapping relationship between the storage block and cachelines in multiple caches.
  • the principle of determining the first target cache according to the instruction to be acquired and the mapping rule is: based on the mapping relationship between the storage block and multiple caches, and in combination with the second index information, the cache corresponding to the instruction to be acquired is determined, that is, the first target cache.
  • For example, the mapping relationship between the storage blocks and the multiple caches is set according to the physical address of each storage block and the id of each cache, by establishing a mapping between a preset field in the physical address of the storage block and the id of the cache. The physical address and the cache id are both expressed in binary, and the number of bits in the preset field of the physical address is the same as the number of bits in the cache id. That is, if the preset field of the physical address of a storage block is the same as the id of a cache, that cache is determined to be the cache corresponding to the storage block. In this case, the second index information is the physical address of the instruction to be acquired.
  • The preset field in the physical address of the instruction to be acquired is compared with the id of each cache, and the cache whose id is the same as the preset field is determined as the cache corresponding to the instruction to be acquired.
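  • Steps 403 to 407 can be sketched as a table lookup with a fallback to the address-based mapping. Everything below is an illustrative assumption: the redirection data table is reduced to a dictionary from first index information to cache id, the first index information is taken to be the physical address without its low offset bits, and the preset field is taken to be the two bits directly above the offset.

```python
OFFSET_BITS = 6   # assumed cacheline offset width (64-byte lines)
FIELD_BITS = 2    # assumed width of the preset field, i.e. 4 caches


def first_index_of(physical_address: int) -> int:
    """Assumed derivation: first index = address with the offset bits dropped."""
    return physical_address >> OFFSET_BITS


def default_cache_id(physical_address: int) -> int:
    """Mapping rule: the preset field of the address equals the cache id."""
    return (physical_address >> OFFSET_BITS) & ((1 << FIELD_BITS) - 1)


def resolve_target_cache(physical_address: int, redirect_table: dict) -> int:
    """Use the redirected cache if an entry matches, else the default mapping."""
    index = first_index_of(physical_address)
    if index in redirect_table:
        return redirect_table[index]
    return default_cache_id(physical_address)
```

  • For the address `0b1011000000`, the default mapping selects cache 3, whereas a table entry for first index `0b1011` that names cache 0 diverts the request to cache 0.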
  • Step 408 Send the second index information to the first target cache, so that the first target cache obtains the instruction to be obtained according to the second index information.
  • Step 409 The first target cache receives the second index information and determines, according to the second index information and the mapping relationship between the storage blocks in the main memory and the cachelines in the cache, whether a cacheline corresponding to the second index information exists in the first target cache. If it exists, the instruction to be acquired is obtained from the corresponding cacheline.
  • Step 410 The first target cache sends the instruction to be acquired to the crossbar.
  • Step 411 The crossbar sends the instruction to be obtained to the Processor Core corresponding to the execution thread bound to the first request.
  • Step 412 The first target cache determines whether the acquisition of the instruction to be acquired has been completed according to the EI flag in the acquired instruction (that is, the instruction obtained from the corresponding cacheline in step 409).
  • Step 413 If the EI flag is 1, it is determined that the acquisition of the instruction to be acquired has been completed, and then jump to step 421.
  • Step 414 If the EI flag is 0, it is determined that the acquisition of the instruction to be acquired has not been completed, and the first target cache calculates the third index information according to the second index information and the storage interval. It should be noted that the storage interval is the address interval when different instruction segments of an instruction are stored in the main memory.
  • Step 415 The first target cache sends the third index information to the redirection module.
  • Step 416 The redirection module receives the third index information sent by the first target cache, and determines the first index information of the instruction to be acquired according to the third index information, and compares the first index information of the instruction to be acquired with the data in the redirection data table. The first index information in each entry is matched.
  • Step 417 If the first index information in an entry matches the first index information of the instruction to be acquired, the redirection module determines the cache corresponding to the id of the cache in the entry that matches the first index information of the instruction to be acquired as The second target cache.
  • Step 418 The redirection module sends the third index information to the second target cache, so that the second target cache obtains the instruction to be obtained according to the third index information.
  • Step 419 If the first index information in each entry does not match the first index information of the instruction to be obtained, the redirection module determines the second target cache according to the third index information and the mapping rule.
  • Step 420 The redirection module sends the third index information to the second target cache, so that the second target cache obtains the instruction to be obtained according to the third index information.
  • the principle of determining the second target cache according to the third index information is the same as the principle of determining the first target cache according to the second index information, so it will not be repeated here.
  • the principle that the second target cache obtains the instruction to be acquired according to the third index information and the principle of the subsequent processing flow are the same as the principle of step 409 and the principle of the subsequent processing flow, so it will not be repeated here.
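  • The segment-following loop of steps 412 to 420 can be sketched as below. This is a simplified illustration: caches and the redirection lookup are omitted, storage is modelled as a dictionary from index to a `(payload, ei)` pair, and computing the third index information as the current index plus the storage interval is an assumption about how the two are combined.

```python
def fetch_instruction(storage: dict, second_index: int, storage_interval: int) -> bytes:
    """Collect instruction segments until one carries EI = 1."""
    parts = []
    index = second_index
    while True:
        payload, ei = storage[index]
        parts.append(payload)
        if ei == 1:
            # Last (or only) segment: acquisition is complete.
            return b"".join(parts)
        # EI = 0: derive the index of the next segment (the "third index").
        index += storage_interval
```

  • In the method described above, each new index is sent back through the redirection module, so successive segments of one instruction may be served by different caches.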
  • Step 421 If the EI flag is 1, the redirection module determines whether there is other available second index information in the first request. If so, it selects one piece of second index information from the other available second index information in the first request and repeats step 403 and the subsequent steps based on the second index information selected in this step.
  • If the EI flag is 1, the acquisition of the instruction to be acquired corresponding to the second index information selected in step 403 has been completed. Since the first request carries at least one piece of second index information, when it carries more than one, another available piece of second index information should be selected from the first request, and the above process is repeated based on the reselected second index information to obtain the corresponding instruction to be acquired.
  • If there is no other available second index information in the first request, the redirection module can read a new request and process it.
  • Step 422 In step 409, if there is no cacheline corresponding to the second index information in the first target cache, the first target cache generates a Refill (backfill) request according to the second index information.
  • Step 423 The first target cache sends the Refill request to the main memory.
  • Step 424 The main memory acquires the instruction to be acquired according to the second index information in the Refill request, generates response information according to the instruction to be acquired and the second index information, and sends the instruction to be acquired through the crossbar to the Processor Core corresponding to the execution thread bound to the first request.
  • Step 425 The main memory sends the response information to the redirection module.
  • Step 426 The redirection module determines the first index information of the instruction to be acquired according to the second index information in the response information, and compares the first index information of the instruction to be acquired with the first index stored in each entry in the redirection data table. Information to match.
  • Step 427 If the first index information of the instruction to be acquired matches the first index information in an entry in the redirection data table, the redirection module determines the cache corresponding to the cache id in the matching entry as the third target cache.
  • Step 428 The redirection module sends the response information to the third target cache.
  • Step 429 If the first index information of the instruction to be acquired does not match the first index information in any entry in the redirection data table, the redirection module determines the third target cache according to the second index information in the response information in combination with the mapping rule.
  • Step 430 The redirection module sends the response information to the third target cache.
  • Step 431 The third target cache receives the response information, and stores the to-be-obtained instruction in the response information in a cacheline in the third target cache.
  • The information acquisition process in which the main memory is a data main memory and the information to be obtained is data to be obtained is the same as the foregoing steps 401 to 431, so details are not described here.
  • The first index information of the information to be obtained is determined according to the second index information and matched against the first index information in each entry in the redirection data table. When a matching entry exists, the cache corresponding to the cache id in the matching entry is determined as the first target cache, and the second index information is sent to the first target cache so that the first target cache obtains the information to be obtained according to the second index information. The first request is thus split based on the redirection data table, which alleviates uneven request distribution, reduces the imbalance in request acquisition bandwidth, and improves information acquisition performance.
  • FIG. 8 provides a schematic diagram of an application scenario including multiple caches according to an embodiment of the application.
  • The application scenario includes Slice caches (that is, the caches in this application scenario are Slice caches), processor cores, an IBUF (Input Buffer), REDIR1 (redirection module 1), a Redirect Table (redirection data table), and a main memory (not shown in Figure 8).
  • The IBUF includes an IFIFO (Input FIFO, first in first out) and REDIR0 (redirection module 0). There are 16 slice caches, numbered 0 to 15.
  • Each slice cache includes 256 cachelines. There are 16 processor cores, namely processor cores 0 to 15.
  • The main memory is an instruction main memory, that is, the information stored in the main memory is instructions.
  • 8-way set associativity is used between the cachelines in each slice cache and the main memory, and the instructions stored in the storage blocks in the main memory are deployed across the slice caches in a zigzag (Z-shaped) pattern.
  • When the size of an instruction to be stored exceeds the capacity of a storage block, the instruction is divided into multiple instruction segments according to the capacity of the storage block, and the segments are stored in multiple storage blocks, one segment per storage block. The physical addresses of the storage blocks holding adjacent instruction segments differ by 8 (decimal). If the size of the instruction to be stored is less than or equal to the capacity of a storage block, the instruction is stored as a whole in one storage block.
  • The mapping rule between the storage blocks in the main memory and the 16 slice caches is: each storage block in the main memory is mapped to the 16 slice caches in a Z-shape, and 8-way set associativity is used between each slice cache and the storage blocks in the main memory. The mapping rule is set according to the physical address of the storage block and the id of each slice cache.
  • the number of bits (binary) of the physical address of the storage block can be determined by the total number of storage blocks in the main memory and the storage bytes of each storage block.
  • In this application scenario, the physical address of a storage block has 18 bits, that is, it can be represented by PC[17:0]. The physical address of a storage block is also the physical address of the instruction stored in it, so the physical address of that instruction is likewise represented by PC[17:0].
  • The number of bits required to represent the slice cache id in binary is determined by the total number of slice caches. Since there are 16 slice caches, the slice cache id has 4 bits, that is, it can be represented by SLID[3:0].
  • Each storage block in the main memory is mapped to the 16 slice caches in a zigzag pattern as follows: according to their physical addresses, the storage blocks in the main memory are divided in units of 8 into multiple storage block groups, where each storage block group includes 8 storage blocks with adjacent physical addresses; then the storage blocks in storage block group a+16b are mapped to the a-th Slice cache (that is, Slice cache a), where a is an integer in [0,15] and b is an integer greater than or equal to 0.
  • For example, the storage blocks in storage block group 0 (physical addresses 0 to 7), storage block group 16 (physical addresses 128 to 135), and the other storage block groups starting from 0 at intervals of 16 are mapped to Slice cache0; the storage blocks in storage block group 1 (physical addresses 8 to 15), storage block group 17 (physical addresses 136 to 143), and the other storage block groups starting from 1 at intervals of 16 are mapped to Slice cache1; and so on, until the storage blocks in storage block group 15 (physical addresses 120 to 127), storage block group 31 (physical addresses 248 to 255), and the other storage block groups starting from 15 at intervals of 16 are mapped to Slice cache15. The physical addresses in this example are all decimal.
  • Accordingly, PC[6:3] in the PC[17:0] of a storage block is used as a preset field to determine the slice cache corresponding to that storage block: if the PC[6:3] of a storage block equals the SLID[3:0] of a slice cache, that slice cache is the slice cache corresponding to the storage block.
  • Since the physical address of a storage block is the physical address of the instruction stored in it, the PC[6:3] of the instruction stored in the storage block is compared with the SLID[3:0] of each slice cache, the slice cache whose SLID[3:0] equals the instruction's PC[6:3] is determined as the slice cache corresponding to the instruction, and the instruction is cached in that slice cache.
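  • The two descriptions of the mapping rule above (storage block group a+16b maps to Slice cache a, and the slice id equals PC[6:3]) can be checked against each other with a minimal sketch. The function and constant names below are illustrative assumptions, not names used in the application.

```python
SLICE_COUNT = 16  # number of slice caches in the described scenario
GROUP_SIZE = 8    # storage blocks per storage block group


def slice_from_group(block_address: int) -> int:
    """Zigzag rule: storage block group a + 16*b maps to Slice cache a."""
    group = block_address // GROUP_SIZE
    return group % SLICE_COUNT


def slice_from_field(pc: int) -> int:
    """Preset-field rule: the slice id is the 4-bit field PC[6:3]."""
    return (pc >> 3) & 0xF
```

  • Under these assumptions, both functions give the same slice id for every 18-bit address, which is why comparing PC[6:3] with SLID[3:0] is enough to apply the zigzag rule.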
  • Each slice cache includes Cache data and a TAG Table. Since all slice caches have the same structure, only the Cache data and TAG Table of one slice cache are described below.
  • The Cache data includes 256 cachelines (that is, cachelines 0 to 255), and each cacheline is used to cache the instruction in a storage block in the main memory. Because 8-way set associativity is used between the slice cache and the main memory, the 256 cachelines can be divided in units of 8 into 32 sets of cachelines, where each set includes 8 cachelines.
  • The TAG Table consists of multiple rows and columns. One row corresponds to one set of cachelines. Each row includes 8 large columns, and each large column is called a way, so each row includes 8 ways, way0 to way7. The 8 ways in a row correspond one-to-one to the 8 cachelines in the corresponding set, and each way stores information about the instruction stored in its cacheline.
  • each way includes four parameters, namely VLD, TAG, lock, and dirty.
  • The VLD in a way indicates whether an instruction is stored in the cacheline corresponding to the way: if VLD is 0, no instruction is stored in that cacheline; if VLD is 1, an instruction is stored in that cacheline;
  • the TAG in a way is the tag bits in the physical address of the instruction stored in the cacheline corresponding to the way;
  • the lock in a way indicates whether the instruction stored in the cacheline corresponding to the way can be replaced: if lock is 1, the instruction cannot be replaced; if lock is 0, the instruction can be replaced;
  • the dirty in a way indicates whether the instruction stored in the cacheline corresponding to the way is consistent with the information that should be stored: if dirty is 0, the stored instruction is not consistent with the information that should be stored; if dirty is 1, it is consistent.
  • The row number corresponding to each way is the index bits in the physical address of the instruction stored in the cacheline corresponding to the way.
  • In this application scenario, PC[17:13] of the instruction in the storage block is used as the index bits, and PC[12:3] of the instruction in the storage block is used as the tag bits.
  • When an instruction is cached, the slice cache corresponding to the instruction is determined according to the PC[6:3] of the instruction; the row number corresponding to the instruction is then determined in that slice cache according to the PC[17:13] of the instruction, the set of cachelines corresponding to the instruction is determined according to the row number, and the instruction is stored in one cacheline of that set.
  • According to the cacheline storing the instruction, the way corresponding to that cacheline is determined in the TAG Table of the slice cache corresponding to the instruction, and the TAG in that way is updated according to the PC[12:3] of the instruction.
  • the first index information of the instruction is the PC[17:3] of the instruction
  • the second index information is the physical address of the instruction to be obtained, that is, PC[17:0].
  • If the instruction to be fetched is stored as a whole in one storage block, the PC[17:0] of the instruction to be fetched is the PC[17:0] of the storage block in the main memory that stores it, and the PC[17:13] of the instruction to be fetched is the PC[17:13] of that storage block.
  • If the instruction to be fetched is stored as multiple instruction segments, the PC[17:0] of the instruction to be fetched is the PC[17:0] of the storage block in the main memory that stores its first instruction segment, and the PC[17:13] of the instruction to be fetched is the PC[17:13] of that storage block.
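  • The fields that the preceding paragraphs carve out of the 18-bit physical address can be summarized in a short sketch. The type and function names are illustrative assumptions; only the bit ranges come from the description.

```python
from typing import NamedTuple

PC_BITS = 18  # width of the physical address PC[17:0]


class PcFields(NamedTuple):
    first_index: int  # PC[17:3], the first index information
    index: int        # PC[17:13], the row number in the TAG Table
    tag: int          # PC[12:3], compared with the TAG stored in each way
    slice_id: int     # PC[6:3], the slice cache given by the mapping rule


def split_pc(pc: int) -> PcFields:
    """Split a physical address into the fields named in the description."""
    if not 0 <= pc < (1 << PC_BITS):
        raise ValueError("PC must fit in 18 bits")
    return PcFields(
        first_index=pc >> 3,
        index=(pc >> 13) & 0x1F,
        tag=(pc >> 3) & 0x3FF,
        slice_id=(pc >> 3) & 0xF,
    )
```

  • Note that the slice id is the low 4 bits of the tag, and that the low 3 bits PC[2:0] do not take part in any of the lookups described here.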
  • the Redirect Table includes multiple entries. As shown in Figure 8, an entry includes four parameters, which are the first identification VLD, the second identification HOT, the hot spot information PC[17:3], and the cold spot slice cache id (ie SLID[3:0]).
  • If the first identification VLD is 0, the first identification is an invalid mark; if the first identification VLD is 1, the first identification is a valid mark.
  • If the second identifier HOT is 0, the second identifier is a non-hot spot mark; if the second identifier HOT is 1, the second identifier is a hot spot mark.
  • The instructions stored in the storage blocks in the main memory can be cached into the 16 slice caches according to the above mapping rule. Then, in the process of obtaining instructions in this application scenario, the bandwidth equalization method described above is used to adjust where the hot instructions of a hot spot slice cache are deployed among the 16 slice caches, redirecting those hot instructions and thereby balancing bandwidth.
  • REDIR1 detects the access frequency of each of Slice caches 0 to 15; then, according to these access frequencies, determines the cold spot slice cache and the hot spot slice cache among Slice caches 0 to 15; then determines the hot instruction in the hot spot slice cache; and finally records the PC[17:3] of the hot instruction and the SLID[3:0] of the cold spot slice cache in the target entry of the Redirect Table. Since the implementation principle of bandwidth equalization has been described above, it is not repeated here.
  • the process of obtaining the instruction to be obtained may include the following steps:
  • Step 1001 IBUF receives the first request and buffers the first request in the IFIFO.
  • The first request is a request to obtain the instruction to be acquired, and it carries second index information, that is, the PC[17:0] of the instruction to be acquired. Here, a first request that includes one piece of second index information is used as an example.
  • Step 1002 If the IFIFO is not empty and at least one processor core has completed its previous instruction acquisition request, REDIR0 reads the first request from the IFIFO and binds an execution thread to the first request. Execution threads correspond one-to-one to processor cores; since there are 16 processor cores, there are also 16 execution threads. The bound execution thread is the one corresponding to a processor core that has completed its previous instruction acquisition request. If multiple execution threads can be bound, that is, multiple processor cores have completed their previous instruction acquisition requests, the first request is allocated to the execution thread with the shallowest Inst Q, according to the Inst Q depth of each execution thread.
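  • The thread binding choice in Step 1002 can be sketched as follows. This is only an illustration: the function name, the dictionary of Inst Q depths, and the tie-break by lowest thread id are assumptions, since the description does not say how ties are resolved.

```python
from typing import Dict, Optional, Set


def bind_thread(idle_threads: Set[int], inst_q_depth: Dict[int, int]) -> Optional[int]:
    """Pick the bindable execution thread with the shallowest Inst Q.

    idle_threads holds the ids of threads whose processor core has completed
    its previous instruction acquisition request.
    """
    if not idle_threads:
        return None  # no thread can be bound yet; the request stays in the IFIFO
    # Assumption: ties on depth are broken by the lowest thread id.
    return min(idle_threads, key=lambda t: (inst_q_depth[t], t))
```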
  • Step 1003 REDIR0 determines the first index information of the instruction to be acquired according to the PC[17:0] of the instruction to be acquired. That is, PC[17:3] in the PC[17:0] of the instruction to be acquired is used as the first index information of the instruction to be acquired.
  • Step 1004 REDIR0 matches the PC[17:3] of the instruction to be obtained with the PC[17:3] in each entry in the Redirect Table.
  • Step 1005 If the PC[17:3] in an entry in the Redirect Table matches the PC[17:3] of the instruction to be obtained, REDIR0 determines the slice cache corresponding to the SLID[3:0] in the matching entry as the first target slice cache.
  • Step 1006 REDIR0 sends the PC[17:0] of the instruction to be obtained to the first target slice cache.
  • Step 1007 If no PC[17:3] in any entry in the Redirect Table matches the PC[17:3] of the instruction to be acquired, REDIR0 determines the slice cache corresponding to the instruction to be acquired according to its PC[6:3], and determines that slice cache as the first target slice cache.
  • Step 1008 REDIR0 sends the PC[17:0] of the instruction to be obtained to the first target slice cache.
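  • Steps 1003 to 1008 amount to a table lookup with a fallback to the mapping rule. A minimal sketch follows; the class and function names are illustrative, and restricting the match to entries that carry the valid mark is an assumption that the description does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RedirectEntry:
    vld: int      # first identification: 1 = valid mark, 0 = invalid mark
    hot: int      # second identification: 1 = hot spot mark, 0 = non-hot spot mark
    pc_17_3: int  # hot spot information PC[17:3]
    slid: int     # cold spot slice cache id SLID[3:0]


def first_target_slice(pc: int, redirect_table: List[RedirectEntry]) -> int:
    """Return the id of the slice cache that should serve the request for pc."""
    first_index = pc >> 3  # Step 1003: PC[17:3] is the first index information
    for entry in redirect_table:  # Step 1004: match against every entry
        # Assumption: only entries carrying the valid mark take part in matching.
        if entry.vld == 1 and entry.pc_17_3 == first_index:
            return entry.slid  # Step 1005: redirected to the recorded slice cache
    return first_index & 0xF  # Step 1007: fall back to PC[6:3]
```

  • A request for a redirected hot instruction is therefore served by the cold spot slice cache recorded in the table, while every other request still goes to the slice cache given by PC[6:3].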
  • Step 1009 The first target slice cache receives the PC[17:0] of the instruction to be acquired, determines the target row in the TAG Table according to the PC[17:13] (the index bits) of the instruction to be acquired, and then matches the PC[12:3] (the tag bits) of the instruction to be acquired against the TAG in each way in the target row.
  • If a TAG matches, the way whose TAG matches the PC[12:3] of the instruction to be acquired is determined as the target way; then, according to the PC[17:13] of the instruction to be acquired and the id of the target way, the cacheline corresponding to the instruction to be acquired is determined in the Cache data, and the instruction to be acquired is read from that cacheline.
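  • The lookup in Step 1009 can be sketched with a small model of one slice cache. The class names, the cacheline numbering (row * 8 + way id), and the check of VLD before comparing the TAG are assumptions made for illustration; the description only says that the cacheline is determined from PC[17:13] and the id of the target way.

```python
from dataclasses import dataclass, field
from typing import List, Optional

WAYS = 8   # ways per row (8-way set associativity)
ROWS = 32  # rows in the TAG Table (256 cachelines / 8)


@dataclass
class Way:
    vld: int = 0
    tag: int = 0
    lock: int = 0
    dirty: int = 0


@dataclass
class SliceCache:
    tag_table: List[List[Way]] = field(
        default_factory=lambda: [[Way() for _ in range(WAYS)] for _ in range(ROWS)]
    )
    cache_data: List[Optional[bytes]] = field(
        default_factory=lambda: [None] * (ROWS * WAYS)
    )

    def lookup(self, pc: int) -> Optional[bytes]:
        """Return the cached instruction for pc, or None on a miss."""
        row = (pc >> 13) & 0x1F  # index bits PC[17:13] select the target row
        tag = (pc >> 3) & 0x3FF  # tag bits PC[12:3]
        for way_id, way in enumerate(self.tag_table[row]):
            # Assumption: a way can only hit if its VLD is 1.
            if way.vld == 1 and way.tag == tag:
                # Assumption: cachelines are numbered row * WAYS + way_id.
                return self.cache_data[row * WAYS + way_id]
        return None  # miss: corresponds to the Refill path in Step 1022
```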
  • Step 1010 The slice cache of the first target sends the acquired instruction to be acquired to the crossbar.
  • Step 1011 The crossbar sends a to-be-obtained instruction to the Processor Core corresponding to the execution thread bound by the first request.
  • Step 1012 The slice cache of the first target determines whether the acquisition of the instruction to be acquired has been completed according to the EI flag in the instruction to be acquired.
  • Step 1013 If the EI flag is 1, the slice cache of the first target determines that the acquisition of the instruction to be acquired has been completed, and jumps to step 1021.
  • Step 1014 If the EI flag is 0, the slice cache of the first target determines that the acquisition of the instruction to be acquired has not been completed, and obtains the third index information according to the PC[17:0] and the storage interval of the instruction to be acquired.
  • The value of the storage interval is 8, that is, the third index information is the binary sum of the PC[17:0] of the instruction to be acquired and 8.
  • Step 1015 The first target slice cache sends the third index information to REDIR1.
  • Step 1016 REDIR1 receives the third index information sent by the slice cache of the first target, and matches the PC[17:3] of the third index information with the PC[17:3] in each entry in the redirection data table.
  • Step 1017 If the PC[17:3] in an entry matches the PC[17:3] of the third index information, REDIR1 determines the slice cache corresponding to the SLID[3:0] in the matching entry as the second target slice cache.
  • Step 1018 REDIR1 sends the third index information to the second target slice cache.
  • Step 1019 If no PC[17:3] in any entry matches the PC[17:3] of the third index information, REDIR1 determines the slice cache corresponding to the third index information according to the PC[6:3] in the third index information, and determines that slice cache as the second target slice cache.
  • Step 1020 REDIR1 sends the third index information to the second target slice cache.
  • The second target Slice cache receives the third index information and obtains the instruction to be obtained according to the third index information. The principle of this step and of the subsequent steps is the same as that of step 1009 and its subsequent steps, so details are not repeated here.
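  • The way Steps 1012 to 1020 follow a segmented instruction (add the storage interval of 8 until a segment carries EI = 1) can be sketched as below. The dictionary `segment_store` is a hypothetical stand-in for whichever slice cache or main memory returns a segment together with its EI flag; it is not a structure from the application.

```python
from typing import Dict, List, Tuple

STORAGE_INTERVAL = 8  # address distance between adjacent instruction segments


def fetch_all_segments(
    first_pc: int, segment_store: Dict[int, Tuple[bytes, int]]
) -> List[Tuple[int, bytes]]:
    """Collect (pc, segment) pairs until a segment with EI flag 1 is reached."""
    segments: List[Tuple[int, bytes]] = []
    pc = first_pc
    while True:
        payload, ei_flag = segment_store[pc]
        segments.append((pc, payload))
        if ei_flag == 1:  # Step 1013: acquisition is complete
            return segments
        pc += STORAGE_INTERVAL  # Step 1014: third index information = PC + 8
```

  • Because adjacent segments are 8 addresses apart, each step changes PC[6:3], so under the default mapping rule consecutive segments are served by different slice caches.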
  • Step 1021 If the EI flag is 1, REDIR1 can read the new request and process the new request. Since the first request here only includes one second index information, if the EI flag is 1, read the new request and perform corresponding processing.
  • Step 1022 In step 1009, if no way in the row of the TAG Table corresponding to the PC[17:13] of the instruction to be acquired matches the PC[12:3] of the instruction to be acquired, the first target Slice cache generates a Refill request according to the PC[17:0] of the instruction to be acquired.
  • Step 1023 The first target slice cache sends the Refill request to the main memory.
  • Step 1024 The main memory acquires the instruction to be acquired according to the PC[17:0] of the instruction to be acquired in the Refill request, generates response information according to the instruction to be acquired and its PC[17:0], and sends the instruction to be acquired through the crossbar to the Processor Core corresponding to the execution thread bound to the first request.
  • Step 1025 The main memory sends the response information to REDIR1.
  • Step 1026 REDIR1 matches the PC[17:3] of the instruction to be obtained in the response information against the PC[17:3] stored in each entry in the redirection data table.
  • Step 1027 If the PC[17:3] of the instruction to be obtained matches the PC[17:3] in an entry in the redirection data table, REDIR1 determines the slice cache corresponding to the SLID[3:0] in the matching entry as the third target slice cache.
  • Step 1028 REDIR1 sends the response information to the third target slice cache.
  • Step 1029 If the PC[17:3] of the instruction to be acquired does not match the PC[17:3] in any entry in the redirection data table, REDIR1 determines the slice cache corresponding to the PC[6:3] of the instruction to be acquired as the third target slice cache.
  • Step 1030 REDIR1 sends the response information to the third target slice cache.
  • Step 1031 The third target slice cache receives the response information, stores the instruction to be obtained carried in the response information in a cacheline in its Cache data, and, according to the PC[17:0] of the instruction to be obtained in the response information, updates the way in the TAG Table that corresponds to that cacheline: VLD is set to 1, TAG is set to the PC[12:3] of the instruction to be obtained, lock is set to 1, and dirty is set to 1.
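  • The TAG Table update in Step 1031 can be sketched for a single row as follows. The description does not say which cacheline in the set receives the refilled instruction, so the victim choice below (first empty way, otherwise first unlocked way) is purely an assumption, as are the function names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Way:
    vld: int = 0
    tag: int = 0
    lock: int = 0
    dirty: int = 0


def choose_way(row: List[Way]) -> Optional[int]:
    """Hypothetical victim choice: first empty way, else first unlocked way."""
    for way_id, way in enumerate(row):
        if way.vld == 0:
            return way_id
    for way_id, way in enumerate(row):
        if way.lock == 0:
            return way_id
    return None  # every way holds a locked instruction


def refill(row: List[Way], cachelines: List[Optional[bytes]],
           pc: int, instruction: bytes) -> Optional[int]:
    """Store a refilled instruction and update its way as in Step 1031."""
    way_id = choose_way(row)
    if way_id is None:
        return None
    cachelines[way_id] = instruction
    # Values from the description: VLD = 1, TAG = PC[12:3], lock = 1, dirty = 1.
    row[way_id] = Way(vld=1, tag=(pc >> 3) & 0x3FF, lock=1, dirty=1)
    return way_id
```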
  • FIG. 13 is a schematic structural diagram of a bandwidth equalization device provided by an embodiment of the application.
  • the device 1300 may include: a first monitoring module 1301, a first determining module 1302, a second determining module 1303, and a recording module 1304.
  • The first monitoring module 1301 is used to monitor multiple cache memories.
  • The first determining module 1302 is used to determine the cold spot cache memory and the hot spot cache memory among the multiple cache memories; the second determining module 1303 is used to determine hot spot information in the hot spot cache memory.
  • the first monitoring module 1301 is specifically configured to respond to a frequency monitoring command and monitor the access frequency of the plurality of cache memories in each monitoring period based on a monitoring period.
  • The first determining module 1302 is specifically configured to determine, according to the access frequency of each cache memory, the cache memory with the largest access frequency as the hot spot cache memory and the cache memory with the smallest access frequency as the cold spot cache memory; or to determine, according to the access frequency of each cache memory, a cache memory with an access frequency greater than a first preset frequency as the hot spot cache memory and a cache memory with an access frequency less than a second preset frequency as the cold spot cache memory, where the first preset frequency is greater than the second preset frequency.
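  • The two ways of classifying cache memories described for the first determining module can be sketched as follows. The function names and the use of a dictionary from cache id to access frequency are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def hot_cold_by_extremes(freq: Dict[int, int]) -> Tuple[int, int]:
    """Variant 1: largest access frequency is hot, smallest is cold."""
    hot = max(freq, key=freq.get)
    cold = min(freq, key=freq.get)
    return hot, cold


def hot_cold_by_thresholds(
    freq: Dict[int, int], first_preset: int, second_preset: int
) -> Tuple[List[int], List[int]]:
    """Variant 2: above the first preset is hot, below the second preset is cold."""
    if first_preset <= second_preset:
        raise ValueError("the first preset frequency must exceed the second")
    hot = sorted(cache for cache, f in freq.items() if f > first_preset)
    cold = sorted(cache for cache, f in freq.items() if f < second_preset)
    return hot, cold
```

  • The threshold variant can return several hot or cold caches, or none, whereas the extremes variant always returns exactly one of each.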
  • The second determining module 1303 is specifically configured to determine whether the access frequency of the hot spot cache memory reaches the frequency configured by the register, and whether the difference between the access frequency of the hot spot cache memory and the access frequency of the cold spot cache memory is greater than the configured value; if so, the hot spot information is determined in the hot spot cache memory.
  • The second determining module 1303 is specifically configured to determine whether the access frequency of the hot spot cache memory is greater than n times the access frequency of the cold spot cache memory; if so, the hot spot information is determined in the hot spot cache memory.
  • The second determining module 1303 is specifically configured to monitor the access frequency of each cache line in the hot spot cache memory; determine a hot spot cache line in the hot spot cache memory according to the access frequency of each cache line; and determine the information stored in the hot spot cache line as the hot spot information.
  • The redirection data table includes a plurality of entries, each entry includes a first identifier and a second identifier, the first identifier is a valid mark or an invalid mark, and the second identifier is a hot spot mark or a non-hot spot mark.
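  • The two trigger conditions described for the second determining module can be written as simple predicates. The parameter names are illustrative, and reading "reaches the frequency configured by the register" as a greater-than-or-equal comparison is an assumption.

```python
def trigger_by_difference(
    hot_freq: int, cold_freq: int, min_hot_freq: int, min_difference: int
) -> bool:
    """Hot frequency reaches the configured level and leads cold by enough."""
    # Assumption: "reaches" is interpreted as >=.
    return hot_freq >= min_hot_freq and (hot_freq - cold_freq) > min_difference


def trigger_by_ratio(hot_freq: int, cold_freq: int, n: int) -> bool:
    """Hot frequency is greater than n times the cold frequency."""
    return hot_freq > n * cold_freq
```

  • Only when the chosen predicate holds is hot spot information determined in the hot spot cache memory; otherwise no redirection is set up for that monitoring result.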
  • The recording module 1304 is specifically configured to determine candidate entries among the multiple entries according to the first identifier and the second identifier of each entry, where the candidate entries include the entries whose first identifier is the invalid mark and the entries whose second identifier is the non-hot spot mark; determine a target entry among the candidate entries; and record the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry.
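  • Candidate selection and recording can be sketched as below. The description does not say how the target entry is chosen among the candidates, so preferring an invalid entry over a valid non-hot entry is an assumption, as are the names. Marking the written entry valid and hot follows the setting module described for the device.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RedirectEntry:
    vld: int = 0      # first identifier: 1 = valid mark, 0 = invalid mark
    hot: int = 0      # second identifier: 1 = hot spot mark, 0 = non-hot spot mark
    pc_17_3: int = 0  # first index information of the hot spot information
    slid: int = 0     # identification information of the cold spot cache


def pick_target_entry(table: List[RedirectEntry]) -> Optional[int]:
    """Return the position of a candidate entry, or None if there is none."""
    # Assumption: invalid entries are preferred over valid non-hot entries.
    for i, entry in enumerate(table):
        if entry.vld == 0:
            return i
    for i, entry in enumerate(table):
        if entry.hot == 0:
            return i
    return None


def record_hot_spot(table: List[RedirectEntry], hot_pc: int, cold_slid: int) -> Optional[int]:
    """Record PC[17:3] of the hot spot information and the cold spot cache id."""
    i = pick_target_entry(table)
    if i is None:
        return None
    table[i] = RedirectEntry(vld=1, hot=1, pc_17_3=hot_pc >> 3, slid=cold_slid)
    return i
```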
  • The device further includes: a setting module, configured to set the second identifier in the target entry to the hot spot mark and the first identifier in the target entry to the valid mark.
  • The device further includes: a second monitoring module, configured to monitor the access frequency of each entry in the redirection data table; a first judging module, configured to judge whether the access frequency of a first entry is less than a third preset frequency, where the first entry is an entry whose second identifier is the hot spot mark; a first modification module, configured to modify the second identifier of a first entry whose access frequency is less than the third preset frequency to the non-hot spot mark; a second judging module, configured to judge whether the access frequency of a second entry is greater than a fourth preset frequency, where the second entry is an entry whose second identifier is the non-hot spot mark; and a second modification module, configured to modify the second identifier of a second entry whose access frequency is greater than the fourth preset frequency to the hot spot mark; where the fourth preset frequency is greater than the third preset frequency.
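  • The hot spot mark maintenance described above behaves like a two-threshold (hysteresis) rule. A minimal sketch follows, with illustrative names and with the second identifiers and access frequencies passed as parallel lists for simplicity.

```python
from typing import List


def update_hot_flags(
    hot_flags: List[int], freq: List[int], third_preset: int, fourth_preset: int
) -> List[int]:
    """Return the second identifiers after one monitoring pass."""
    if fourth_preset <= third_preset:
        raise ValueError("the fourth preset frequency must exceed the third")
    updated = []
    for flag, f in zip(hot_flags, freq):
        if flag == 1 and f < third_preset:
            updated.append(0)  # hot entry that cooled down becomes non-hot
        elif flag == 0 and f > fourth_preset:
            updated.append(1)  # non-hot entry that heated up becomes hot
        else:
            updated.append(flag)  # between the thresholds: keep the current mark
    return updated
```

  • Because the fourth preset frequency is greater than the third, an entry whose access frequency lies between the two keeps its current mark, which avoids flipping the mark on small fluctuations.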
  • The device further includes: a filling module, configured to fill the hot spot information into a cold spot cache line in the cold spot cache memory.
  • The device further includes: a third determining module, configured to determine any cache line in the cold spot cache memory as the cold spot cache line; or to determine a cache line in the cold spot cache memory whose access frequency is less than a fifth preset frequency as the cold spot cache line; or to determine the cache line with the smallest access frequency in the cold spot cache memory as the cold spot cache line.
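  • The three options for choosing the cold spot cache line can be sketched as follows. The function names are illustrative; choosing "any" line at random and returning the first line below the fifth preset frequency are assumptions about details the description leaves open.

```python
import random
from typing import List, Optional


def cold_line_any(line_freq: List[int], rng: Optional[random.Random] = None) -> int:
    """Option 1: any cache line (here, a random one)."""
    return (rng or random).randrange(len(line_freq))


def cold_line_below(line_freq: List[int], fifth_preset: int) -> Optional[int]:
    """Option 2: a cache line whose access frequency is below the fifth preset."""
    for line_id, f in enumerate(line_freq):
        if f < fifth_preset:
            return line_id  # assumption: the first qualifying line is taken
    return None


def cold_line_min(line_freq: List[int]) -> int:
    """Option 3: the cache line with the smallest access frequency."""
    return min(range(len(line_freq)), key=line_freq.__getitem__)
```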
  • The device further includes: a reading module, configured to read a first request, where the first request is a request for obtaining information to be obtained and carries second index information, and the second index information is the index information used to obtain the information to be obtained; a fourth determining module, configured to determine the first index information of the information to be obtained according to the second index information; a first matching module, configured to match the first index information of the information to be obtained against the first index information in each entry in the redirection data table; a fifth determining module, configured to, if the first index information in one of the entries matches the first index information of the information to be obtained, determine the cache memory corresponding to the cache memory identification information in the matching entry as the first target cache memory; and a first sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
  • The device further includes: a sixth determining module, configured to, if the first index information in none of the entries matches the first index information of the information to be obtained, determine the first target cache memory according to the second index information and the mapping rule; and a second sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
  • The device further includes: a receiving module, configured to receive third index information sent by the first target cache memory, where the third index information is calculated from the second index information and the storage interval, and is generated by the first target cache memory when it determines, according to the end identifier in the information to be obtained, that the acquisition of the information to be obtained has not been completed; a seventh determining module, configured to determine the first index information of the information to be obtained according to the third index information; a second matching module, configured to match the first index information of the information to be obtained against the first index information in each entry in the redirection data table; an eighth determining module, configured to, if the first index information in one of the entries matches the first index information of the information to be obtained, determine the cache memory corresponding to the cache memory identification information in the matching entry as the second target cache memory; and a third sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
  • The device further includes: a ninth determining module, configured to, if the first index information in none of the entries matches the first index information of the information to be obtained, determine the second target cache memory according to the third index information and the mapping rule; and a fourth sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
  • the present application also provides a computer-readable storage medium, including a computer program, which when executed on a computer, causes the computer to execute any one of the methods in FIGS. 2-7.
  • This application also provides a computer program which, when executed by a computer, is used to execute any one of the methods in FIGS. 2-7.
  • The present application also provides a chip including a processor and a memory, where the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute any one of the methods in FIGS. 2-7.
  • the chip may also include a memory and a communication interface.
  • the communication interface may be an input/output interface, a pin, an input/output circuit, or the like.
  • the steps of the foregoing method embodiments may be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
  • the processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware encoding processor, or executed and completed by a combination of hardware and software modules in the encoding processor.
  • the software module can be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory mentioned in the above embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory.
  • the volatile memory may be random access memory (RAM), which is used as an external cache.
  • random access memory (RAM)
  • static random access memory (static RAM, SRAM)
  • dynamic random access memory (dynamic RAM, DRAM)
  • synchronous dynamic random access memory (synchronous DRAM, SDRAM)
  • double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM)
  • enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM)
  • synchlink dynamic random access memory (synchlink DRAM, SLDRAM)
  • direct rambus random access memory (direct rambus RAM, DR RAM)
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (a personal computer, a server, a network device, or the like) to execute all or part of the steps of the method described in each embodiment of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Abstract

Provided are a bandwidth equalization method and apparatus. The method comprises: monitoring the frequency of access to a plurality of caches; determining a cold spot cache and a hot spot cache from among the plurality of caches; determining hot spot information in the hot spot cache; and recording first index information of the hot spot information and identification information of the cold spot cache in a target table entry, with the target table entry being a table entry in a redirection data table. In the present application, the problem of uneven distribution of requests is greatly alleviated, thereby improving the problem of serious imbalances in bandwidth acquisition, and improving the performance of information acquisition.

Description

Bandwidth equalization method and device
Technical field
This application relates to the field of data communication, and in particular to a bandwidth equalization method and device.
Background
To improve the efficiency of instruction fetching, a cache (high-speed buffer memory) is usually placed between the processor core and the main memory, and instructions that are frequently fetched from the main memory are cached in it. When the processor core fetches an instruction, it first determines by matching whether the instruction to be fetched is present in the cache. If so, the instruction is fetched from the cache; if not, it is fetched from the main memory.
With the rapid development of 5G technology, the requirements on instruction-fetch bandwidth and latency keep rising, and processor cores are moving toward multi-core parallel computing, so a single cache increasingly struggles to meet instruction-fetch demand. To address this, multiple caches are usually used to solve the bandwidth and latency problems of instruction fetching.
However, in a multi-cache solution, the mapping between storage blocks in the main memory and the multiple caches is determined by the physical addresses of the storage blocks. When an instruction from a storage block in the main memory is stored in the caches, the cache that stores it must be determined from the physical address of the storage block holding that instruction. Because the physical address of an instruction is the physical address of the storage block that holds it, the placement of main-memory instructions across the multiple caches is determined solely by the physical addresses of the instructions. Consequently, when requests to fetch instructions are executed and most of the physical addresses of the requested instructions point to the same cache, requests are distributed unevenly, which leads to a severe imbalance in instruction-fetch bandwidth and degraded instruction-fetch performance.
Summary of the invention
This application provides a bandwidth equalization method and device to solve the problems of uneven request distribution, severe imbalance in instruction-fetch bandwidth, and degraded instruction-fetch performance, which arise because the placement of main-memory instructions across multiple caches is determined solely by the physical addresses of those instructions.
In a first aspect, a bandwidth equalization method is provided, including: monitoring the access frequencies of a plurality of cache memories; determining a cold spot cache memory and a hot spot cache memory among the plurality of cache memories; determining hot spot information in the hot spot cache memory; and recording first index information of the hot spot information and identification information of the cold spot cache memory in a target entry, where the target entry is an entry in a redirection data table.
The access frequencies of multiple caches are monitored, a cold spot cache and a hot spot cache are determined among them, and the first index information of the hot spot information in the hot spot cache is recorded together with the id of the cold spot cache in a target entry of the redirection data table. In other words, hot spot information in a frequently accessed cache is redirected to an infrequently accessed cache. This changes how information is placed across the caches and ties that placement to the access frequency of each cache. As a result, when information is obtained in response to requests, uneven request distribution is greatly alleviated, which mitigates the severe imbalance in acquisition bandwidth and improves information acquisition performance. In addition, because the distribution of information across the caches is related to the access frequency of each cache, information acquisition performance can be estimated. Furthermore, hot spot information can be redirected based solely on the access frequency of each cache, so the steps are simple and easy to implement.
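The four steps of the first aspect can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the `Cache` class, the `rebalance` function, and the use of a plain dictionary as the redirection data table are hypothetical names, and picking the most-accessed line as the hot spot information is one possible choice.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    """Hypothetical model of one cache and its per-period access counters."""
    cache_id: int
    # Access count per first-index value (for example, a line address).
    line_hits: dict[int, int] = field(default_factory=dict)

    @property
    def access_frequency(self) -> int:
        # Access frequency of the whole cache in the current monitoring period.
        return sum(self.line_hits.values())


def rebalance(caches: list[Cache], redirect_table: dict[int, int]) -> None:
    """Record one redirection: hot spot information -> cold spot cache id."""
    # Steps 1-2: monitor frequencies and pick the hot and cold spot caches.
    hot = max(caches, key=lambda c: c.access_frequency)
    cold = min(caches, key=lambda c: c.access_frequency)
    if hot is cold or not hot.line_hits:
        return  # nothing to balance
    # Step 3: assume the most-accessed line holds the hot spot information.
    hot_index = max(hot.line_hits, key=hot.line_hits.get)
    # Step 4: record (first index, cold cache id) in the redirection table.
    redirect_table[hot_index] = cold.cache_id
```

For example, with one cache serving 100 accesses and another serving 5, the index of the busiest line in the first cache ends up mapped to the id of the second.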
In a possible implementation, monitoring the access frequencies of the plurality of cache memories includes: in response to a frequency monitoring instruction, monitoring, based on a monitoring period, the access frequency of each of the plurality of cache memories in each monitoring period.
In a possible implementation, determining the cold spot cache memory and the hot spot cache memory among the plurality of cache memories includes: according to the access frequency of each cache memory, determining the cache memory with the highest access frequency as the hot spot cache memory and the cache memory with the lowest access frequency as the cold spot cache memory; or, according to the access frequency of each cache memory, determining a cache memory whose access frequency is greater than a first preset frequency as the hot spot cache memory and a cache memory whose access frequency is less than a second preset frequency as the cold spot cache memory, where the first preset frequency is greater than the second preset frequency.
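The two selection alternatives can be written side by side as a minimal sketch. The function names and the dictionary that maps a cache id to its access frequency are assumptions made for illustration.

```python
def pick_by_extremes(freq: dict[int, int]) -> tuple[int, int]:
    """Alternative 1: highest frequency is hot, lowest frequency is cold."""
    hot_id = max(freq, key=freq.get)
    cold_id = min(freq, key=freq.get)
    return hot_id, cold_id


def pick_by_thresholds(
    freq: dict[int, int], first_preset: int, second_preset: int
) -> tuple[list[int], list[int]]:
    """Alternative 2: compare each cache against two preset frequencies."""
    if first_preset <= second_preset:
        raise ValueError("first preset frequency must exceed the second")
    hot_ids = [cid for cid, f in freq.items() if f > first_preset]
    cold_ids = [cid for cid, f in freq.items() if f < second_preset]
    return hot_ids, cold_ids
```

The threshold variant may return several hot or cold caches, or none, whereas the extremes variant always returns exactly one of each.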
In a possible implementation, determining hot spot information in the hot spot cache memory includes: determining whether the access frequency of the hot spot cache memory reaches a frequency configured in a register and whether the difference between the access frequency of the hot spot cache memory and the access frequency of the cold spot cache memory is greater than a configured value; and if so, determining the hot spot information in the hot spot cache memory.
It is determined whether the access frequency of the hot spot cache memory reaches the frequency configured in the register and whether the difference between the access frequencies of the hot spot cache memory and the cold spot cache memory is greater than the configured value. When this condition is met, a hot spot instruction is determined in the hot spot cache memory, and the first index information of the hot spot instruction and the identification information of the cold spot cache memory are recorded in the target entry, thereby redirecting the hot spot instruction. In other words, a restriction condition is imposed on starting the redirection process, and the process is started only when that condition is met, which makes the decision to start redirection more accurate.
In a possible implementation, determining hot spot information in the hot spot cache memory includes: determining whether the access frequency of the hot spot cache memory is greater than n times the access frequency of the cold spot cache memory; and if so, determining the hot spot information in the hot spot cache memory.
It is determined whether the access frequency of the hot spot cache memory is greater than n times the access frequency of the cold spot cache memory. When this condition is met, a hot spot instruction is determined in the hot spot cache memory, and the first index information of the hot spot instruction and the identification information of the cold spot cache memory are recorded in the target entry, thereby redirecting the hot spot instruction. In other words, a restriction condition is imposed on starting the redirection process, and the process is started only when that condition is met, which makes the decision to start redirection more accurate.
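The two restriction conditions described above reduce to simple predicates. The sketch below uses hypothetical parameter names, and it reads "reaches" as greater than or equal to, which is an interpretation rather than something the text states.

```python
def trigger_absolute(
    hot_freq: int, cold_freq: int, register_freq: int, configured_gap: int
) -> bool:
    """Start redirection only if the hot cache reaches the register-configured
    frequency AND its lead over the cold cache exceeds the configured value."""
    return hot_freq >= register_freq and (hot_freq - cold_freq) > configured_gap


def trigger_ratio(hot_freq: int, cold_freq: int, n: float) -> bool:
    """Start redirection only if the hot cache is accessed more than n times
    as often as the cold cache."""
    return hot_freq > n * cold_freq
```

Either predicate can gate the redirection step, so that small or transient imbalances do not cause entries to be rewritten.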
In a possible implementation, determining hot spot information in the hot spot cache memory includes: monitoring the access frequency of each cache line in the hot spot cache memory; determining a hot spot cache line in the hot spot cache memory according to the access frequency of each cache line; and determining the information stored in the hot spot cache line as the hot spot information.
In a possible implementation, the redirection data table includes a plurality of entries, each entry includes a first identifier and a second identifier, the first identifier is a valid flag or an invalid flag, and the second identifier is a hot spot flag or a non-hot-spot flag.
In a possible implementation, recording the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry includes: determining candidate entries among the plurality of entries according to the first identifier and the second identifier of each entry, where the candidate entries include the entries whose first identifier is the invalid flag and the entries whose second identifier is the non-hot-spot flag; determining the target entry among the candidate entries; and recording the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry.
In a possible implementation, after the first index information of the hot spot information and the identification information of the cold spot cache memory are recorded in the target entry, the method further includes: setting the second identifier in the target entry to the hot spot flag, and setting the first identifier in the target entry to the valid flag.
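Candidate selection and flag setting can be sketched as follows. The `Entry` class and `record` function are hypothetical, and the rule for choosing the target among the candidates (prefer an invalid entry, otherwise take the first valid non-hot-spot entry) is an assumption, since the text leaves that choice open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical redirection data table entry."""
    valid: bool = False               # first identifier: valid / invalid
    hot: bool = False                 # second identifier: hot spot / non-hot-spot
    first_index: Optional[int] = None
    cache_id: Optional[int] = None


def record(table: list[Entry], first_index: int, cold_cache_id: int) -> Optional[Entry]:
    """Write (first_index, cold_cache_id) into a candidate entry, if any."""
    # Candidates: entries that are invalid, or valid but not marked hot.
    candidates = [e for e in table if not e.valid or not e.hot]
    if not candidates:
        return None  # every entry holds live hot spot data
    # Assumed policy: False sorts before True, so invalid entries win.
    target = min(candidates, key=lambda e: e.valid)
    target.first_index = first_index
    target.cache_id = cold_cache_id
    # After recording, mark the entry as hot and valid.
    target.hot = True
    target.valid = True
    return target
```

Entries that are both valid and hot are never overwritten, which protects redirections that are still in active use.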
In a possible implementation, the method further includes: monitoring the access frequency of each entry in the redirection data table; determining whether the access frequency of a first entry is less than a third preset frequency, where the first entry is an entry whose second identifier is the hot spot flag; changing the second identifier of a first entry whose access frequency is less than the third preset frequency to the non-hot-spot flag; determining whether the access frequency of a second entry is greater than a fourth preset frequency, where the second entry is an entry whose second identifier is the non-hot-spot flag; and changing the second identifier of a second entry whose access frequency is greater than the fourth preset frequency to the hot spot flag, where the fourth preset frequency is greater than the third preset frequency.
The access frequency of each entry in the redirection data table is monitored, and the result of comparing it with the third or fourth preset frequency determines whether the state of the second identifier in the entry is changed, that is, whether the hot or cold status of the instruction corresponding to the first index information in the entry is changed. In this way, the hot or cold status of the instruction corresponding to the first index information of each entry is monitored and updated in real time, which keeps the information in the redirection data table timely and accurate.
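Because the fourth preset frequency is greater than the third, the flag update behaves like a two-threshold hysteresis. A minimal sketch, with a hypothetical function name and parameters:

```python
def update_hot_flag(
    hot: bool, access_freq: int, third_preset: int, fourth_preset: int
) -> bool:
    """Return the new value of an entry's second identifier (hot flag)."""
    if fourth_preset <= third_preset:
        raise ValueError("fourth preset frequency must exceed the third")
    if hot and access_freq < third_preset:
        return False  # hot entry cooled down: mark as non-hot-spot
    if not hot and access_freq > fourth_preset:
        return True   # non-hot entry warmed up: mark as hot spot
    return hot        # frequencies between the thresholds leave the flag as is
```

Keeping a gap between the two thresholds prevents an entry whose frequency hovers near a single cutoff from flipping state every monitoring period.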
In a possible implementation, the method further includes: filling the hot spot information into a cold spot cache line in the cold spot cache memory.
In a possible implementation, the method further includes: determining any cache line in the cold spot cache memory as the cold spot cache line; or determining a cache line in the cold spot cache memory whose access frequency is less than a fifth preset frequency as the cold spot cache line; or determining the cache line with the lowest access frequency in the cold spot cache memory as the cold spot cache line.
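The three ways of choosing a cold spot cache line can be compared in a short sketch. The function names and the dictionary mapping a line number to its access frequency are illustrative assumptions; "any line" is modelled here as simply taking the first line.

```python
def cold_line_any(line_freq: dict[int, int]) -> int:
    """Option 1: any line will do; here, the first one encountered."""
    return next(iter(line_freq))


def cold_lines_below(line_freq: dict[int, int], fifth_preset: int) -> list[int]:
    """Option 2: lines whose access frequency is below the fifth preset."""
    return [line for line, f in line_freq.items() if f < fifth_preset]


def cold_line_min(line_freq: dict[int, int]) -> int:
    """Option 3: the line with the lowest access frequency."""
    return min(line_freq, key=line_freq.get)
```

The first option needs no per-line counters, while the other two trade extra bookkeeping for a lower chance of evicting a useful line.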
In a possible implementation, the method further includes: reading a first request, where the first request is a request to obtain information to be obtained and carries second index information, and the second index information is index information used to obtain the information to be obtained; determining first index information of the information to be obtained according to the second index information; matching the first index information of the information to be obtained against the first index information in each entry in the redirection data table; if the first index information in one entry matches the first index information of the information to be obtained, determining the cache memory corresponding to the cache memory identification information in the matching entry as a first target cache memory; and sending the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
The first index information of the information to be obtained is determined from the second index information and matched against the first index information in each entry in the redirection data table. When a matching entry exists, the cache memory corresponding to the cache memory identification information in that entry is determined as the first target cache memory, and the second index information is sent to it so that it obtains the information to be obtained according to the second index information. The first request is thus diverted on the basis of the redirection data table, which greatly alleviates uneven request distribution, mitigates the severe imbalance in request acquisition bandwidth, and improves information acquisition performance.
In a possible implementation, the method further includes: if the first index information in every entry fails to match the first index information of the information to be obtained, determining the first target cache memory according to the second index information and a mapping rule; and sending the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
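The lookup path, including the fallback to the mapping rule, can be sketched as below. Several details are assumptions that the text does not specify: the second index is treated as an address, the first index as that address shifted by a hypothetical `LINE_BITS`, the mapping rule as a modulo over the number of caches, and the redirection data table as a dictionary.

```python
LINE_BITS = 6  # assumption: 64-byte lines, so the low 6 bits are an offset


def first_index_of(second_index: int) -> int:
    """Assumed derivation of the first index from the request's index."""
    return second_index >> LINE_BITS


def route(second_index: int, redirect_table: dict[int, int], num_caches: int) -> int:
    """Return the id of the target cache for a request."""
    first_index = first_index_of(second_index)
    # Match against the redirection data table first.
    if first_index in redirect_table:
        return redirect_table[first_index]
    # No entry matched: fall back to the (assumed) address-based mapping rule.
    return first_index % num_caches
```

With `redirect_table = {0x40: 3}` and four caches, a request for address `0x1000` is diverted to cache 3, while a request for `0x1040` follows the default mapping.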
In a possible implementation, the method further includes: receiving third index information sent by the first target cache memory, where the third index information is calculated from the second index information and a storage interval, and is generated by the first target cache memory when it determines, according to an end identifier in the information to be obtained, that acquisition of the information to be obtained has not been completed; determining the first index information of the information to be obtained according to the third index information; matching the first index information of the information to be obtained against the first index information in each entry in the redirection data table; if the first index information in one entry matches the first index information of the information to be obtained, determining the cache memory corresponding to the cache memory identification information in the matching entry as a second target cache memory; and sending the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
In a possible implementation, the method further includes: if the first index information in every entry fails to match the first index information of the information to be obtained, determining the second target cache memory according to the third index information and the mapping rule; and sending the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
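The continuation step can be illustrated with a small loop. The text says only that the third index is calculated from the second index and the storage interval; the sketch assumes it is their sum. The `memory` dictionary, the `(payload, is_end)` tuples standing in for the end identifier, and the `route` callback are all hypothetical.

```python
from typing import Callable


def fetch_all(
    start_index: int,
    storage_interval: int,
    memory: dict[int, tuple[str, bool]],
    route: Callable[[int], int],
) -> tuple[list[str], list[int]]:
    """Fetch chunks until one carries the end identifier.

    Returns the payloads and the id of the cache that served each chunk.
    """
    chunks: list[str] = []
    served_by: list[int] = []
    index = start_index
    while True:
        # Each index is routed again, so a follow-up chunk may be
        # redirected to a different cache than the first one.
        served_by.append(route(index))
        payload, is_end = memory[index]
        chunks.append(payload)
        if is_end:
            return chunks, served_by
        # Assumed calculation: next index = current index + storage interval.
        index += storage_interval
```

Because routing is repeated for every generated index, a multi-part fetch is spread across caches in the same way as independent requests.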
In a second aspect, a bandwidth equalization device is provided, including: a first monitoring module, configured to monitor the access frequencies of a plurality of cache memories; a first determining module, configured to determine a cold spot cache memory and a hot spot cache memory among the plurality of cache memories; a second determining module, configured to determine hot spot information in the hot spot cache memory; and a recording module, configured to record first index information of the hot spot information and identification information of the cold spot cache memory in a target entry, where the target entry is an entry in a redirection data table.
In a possible implementation, the first monitoring module is specifically configured to, in response to a frequency monitoring instruction, monitor, based on a monitoring period, the access frequency of each of the plurality of cache memories in each monitoring period.
In a possible implementation, the first determining module is specifically configured to: according to the access frequency of each cache memory, determine the cache memory with the highest access frequency as the hot spot cache memory and the cache memory with the lowest access frequency as the cold spot cache memory; or, according to the access frequency of each cache memory, determine a cache memory whose access frequency is greater than a first preset frequency as the hot spot cache memory and a cache memory whose access frequency is less than a second preset frequency as the cold spot cache memory, where the first preset frequency is greater than the second preset frequency.
In a possible implementation, the second determining module is specifically configured to: determine whether the access frequency of the hot spot cache memory reaches a frequency configured in a register and whether the difference between the access frequency of the hot spot cache memory and the access frequency of the cold spot cache memory is greater than a configured value; and if so, determine the hot spot information in the hot spot cache memory.
In a possible implementation, the second determining module is specifically configured to: determine whether the access frequency of the hot spot cache memory is greater than n times the access frequency of the cold spot cache memory; and if so, determine the hot spot information in the hot spot cache memory.
In a possible implementation, the second determining module is specifically configured to: monitor the access frequency of each cache line in the hot spot cache memory; determine a hot spot cache line in the hot spot cache memory according to the access frequency of each cache line; and determine the information stored in the hot spot cache line as the hot spot information.
In a possible implementation, the redirection data table includes a plurality of entries, each entry includes a first identifier and a second identifier, the first identifier is a valid flag or an invalid flag, and the second identifier is a hot spot flag or a non-hot-spot flag.
In a possible implementation, the recording module is specifically configured to: determine candidate entries among the plurality of entries according to the first identifier and the second identifier of each entry, where the candidate entries include the entries whose first identifier is the invalid flag and the entries whose second identifier is the non-hot-spot flag; determine the target entry among the candidate entries; and record the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry.
In a possible implementation, the device further includes: a setting module, configured to set the second identifier in the target entry to the hot spot flag and to set the first identifier in the target entry to the valid flag.
在一种可能的实现方式中,还包括:第二监测模块,用于监测所述重定向数据表中的每个表项的访问频率;第一判断模块,用于判断第一表项的访问频率是否小于第三预设频率,其中,所述第一表项为所述第二标识为所述热点标记的表项;第一修改模块,用于将访问频率小于所述第三预设频率的所述第一表项的第二标识修改为非热点标记;第二判断模块,用于判断第二表项的访问频率是否大于第四预设频率,其中,所述第二表项为所述第二标识为所述非热点标记的表项;第二修改模块,用于将访问频率大于所述第四预设频率的所述第二表项的第二标识修改为热点标记;其中,所述第四预设频率大于所述第三预设频率。In a possible implementation manner, it further includes: a second monitoring module for monitoring the access frequency of each entry in the redirection data table; a first judging module for judging the access of the first entry Whether the frequency is less than the third preset frequency, wherein the first entry is the entry with the second identification as the hotspot mark; the first modification module is configured to set the access frequency to be less than the third preset frequency The second identifier of the first entry is modified to a non-hot spot flag; the second determination module is used to determine whether the access frequency of the second entry is greater than the fourth preset frequency, wherein the second entry is all The second identifier is an entry of the non-hot-spot label; a second modification module is configured to modify the second identifier of the second entry whose access frequency is greater than the fourth preset frequency to a hot-spot label; wherein, The fourth preset frequency is greater than the third preset frequency.
In a possible implementation, the apparatus further includes a filling module, configured to fill the hot spot information into a cold spot cache line in the cold spot cache memory.
In a possible implementation, the apparatus further includes a third determining module, configured to: determine any cache line in the cold spot cache memory as the cold spot cache line; or determine a cache line in the cold spot cache memory whose access frequency is less than a fifth preset frequency as the cold spot cache line; or determine the cache line with the lowest access frequency in the cold spot cache memory as the cold spot cache line.
In a possible implementation, the apparatus further includes: a reading module, configured to read a first request, where the first request is a request to obtain to-be-obtained information and carries second index information, and the second index information is index information used to obtain the to-be-obtained information; a fourth determining module, configured to determine first index information of the to-be-obtained information according to the second index information; a first matching module, configured to match the first index information of the to-be-obtained information against the first index information in each entry of the redirection data table; a fifth determining module, configured to, if the first index information in one of the entries matches the first index information of the to-be-obtained information, determine the cache memory corresponding to the cache memory identification information in the matching entry as a first target cache memory; and a first sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the to-be-obtained information according to the second index information.
In a possible implementation, the apparatus further includes: a sixth determining module, configured to, if the first index information in every entry fails to match the first index information of the to-be-obtained information, determine the first target cache memory according to the second index information and the mapping rule; and a second sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the to-be-obtained information according to the second index information.
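The lookup path of the two implementations above (use the redirection data table on a match, otherwise fall back to the mapping rule) might look like the sketch below. The text does not specify how the first index information is derived from the second index information, nor the concrete mapping rule, so `to_first_index` (integer division by a block size) and the modulo mapping are purely hypothetical stand-ins. Skipping entries marked invalid is likewise an assumption.

```python
def to_first_index(second_index, block_size=64):
    """Hypothetical derivation of the first index information (assumption)."""
    return second_index // block_size


def select_target_cache(second_index, table, num_caches, block_size=64):
    """Return the id of the cache that should serve the request."""
    first_index = to_first_index(second_index, block_size)
    for entry in table:
        # Entries without valid information are skipped (assumption).
        if entry["valid"] and entry["first_index"] == first_index:
            return entry["cache_id"]    # redirected: hot spot information was moved
    # No entry matched: fall back to the mapping rule (hypothetical modulo mapping).
    return first_index % num_caches
```

In either case the request's second index information is then forwarded unchanged to the selected cache.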
In a possible implementation, the apparatus further includes: a receiving module, configured to receive third index information sent by the first target cache memory, where the third index information is calculated from the second index information and a storage interval, and is generated by the first target cache memory when it determines, according to the end indicator in the to-be-obtained information, that obtaining the to-be-obtained information has not been completed; a seventh determining module, configured to determine the first index information of the to-be-obtained information according to the third index information; a second matching module, configured to match the first index information of the to-be-obtained information against the first index information in each entry of the redirection data table; an eighth determining module, configured to, if the first index information in one of the entries matches the first index information of the to-be-obtained information, determine the cache memory corresponding to the cache memory identification information in the matching entry as a second target cache memory; and a third sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the to-be-obtained information according to the third index information.
In a possible implementation, the apparatus further includes: a ninth determining module, configured to, if the first index information in every entry fails to match the first index information of the to-be-obtained information, determine the second target cache memory according to the third index information and the mapping rule; and a fourth sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the to-be-obtained information according to the third index information.
In a third aspect, a computer-readable storage medium is provided, including a computer program that, when executed on a computer, causes the computer to perform the method according to any implementation of the first aspect.
In a fourth aspect, a computer program is provided that, when executed by a computer, performs the method according to any implementation of the first aspect.
In a fifth aspect, a chip is provided, including a processor and a memory, where the memory is configured to store a computer program, and the processor is configured to call and run the computer program stored in the memory to perform the method according to any implementation of the first aspect.
Description of the Drawings
FIG. 1 is a schematic diagram of an application scenario of the bandwidth equalization method provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of the bandwidth equalization method provided by an embodiment of this application;
FIG. 3 is a schematic flowchart, provided by an embodiment of this application, of recording the first index information of a hot spot instruction and the id of a cold spot cache in a target entry;
FIG. 4 is the first part of a schematic flowchart of an information acquisition method provided by an embodiment of this application;
FIG. 5 is the second part of a schematic flowchart of an information acquisition method provided by an embodiment of this application;
FIG. 6 is the third part of a schematic flowchart of an information acquisition method provided by an embodiment of this application;
FIG. 7 is the fourth part of a schematic flowchart of an information acquisition method provided by an embodiment of this application;
FIG. 8 is a schematic diagram of an application scenario including multiple caches provided by an embodiment of this application;
FIG. 9 is a schematic structural diagram of a Slice cache provided by an embodiment of this application;
FIG. 10 is the first part of a schematic flowchart of another information acquisition method provided by an embodiment of this application;
FIG. 11 is the second part of a schematic flowchart of another information acquisition method provided by an embodiment of this application;
FIG. 12 is the third part of a schematic flowchart of another information acquisition method provided by an embodiment of this application;
FIG. 13 is a schematic structural diagram of a bandwidth equalization apparatus provided by an embodiment of this application.
Detailed Description
The technical solutions in this application are described below with reference to the accompanying drawings.
To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions in this application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
The terms "first", "second", and so on in the description, embodiments, claims, and drawings of this application are used only to distinguish between descriptions, and shall not be understood as indicating or implying relative importance or order. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion. For example, a method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be understood that, in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate three cases: only A exists, only B exists, and both A and B exist, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of these items, including any combination of a single item or multiple items. For example, at least one of a, b, or c may indicate: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or multiple.
FIG. 1 is a schematic diagram of an application scenario of the bandwidth equalization method provided by an embodiment of this application. As shown in FIG. 1, the application scenario may include: multiple caches (cache memories), multiple processor cores, a main memory (not shown in the figure), a redirection module, a redirection data table, and a crossbar (crossbar switch). Specifically:
The main memory includes multiple storage blocks. Each storage block consists of a number of storage units, and each storage block is used to store information. The main memory may be a data main memory, in which case the information stored in it is data, or an instruction main memory, in which case the information stored in it is instructions. This application places no particular limitation on this.
The principle of each module in this application scenario is described below, taking an instruction main memory as an example.
A to-be-stored instruction may be stored in the main memory as follows: determine whether the size of the to-be-stored instruction is greater than the storage capacity of a storage block in the main memory. If the size of the to-be-stored instruction is less than or equal to the storage capacity of a storage block, the instruction is stored as a whole in one storage block. If the size of the to-be-stored instruction is greater than the storage capacity of a storage block, the instruction is divided into multiple instruction segments according to its size and the storage capacity of a storage block, and the resulting instruction segments are stored in multiple storage blocks.
In addition, an end indicator (End Indicator, EI) may be set for the instruction stored in a storage block. An EI of 0 indicates that the instruction stored in the storage block is one instruction segment of its corresponding to-be-stored instruction and is not the last segment. An EI of 1 indicates either that the instruction stored in the storage block is the last instruction segment of its corresponding to-be-stored instruction, or that it is the complete to-be-stored instruction. On this basis, if the size of the to-be-stored instruction is less than or equal to the storage capacity of a storage block, the instruction is stored as a whole in one storage block, and the EI of the instruction in that storage block (that is, the to-be-stored instruction) is set to 1. If the size of the to-be-stored instruction is greater than the storage capacity of a storage block, the instruction is divided into multiple instruction segments according to its size and the capacity of a storage block, and the segments are stored in multiple storage blocks, where the number of instruction segments equals the number of storage blocks and each instruction segment corresponds to one storage block. Finally, an EI is set for the instruction stored in each storage block: if the instruction stored in a storage block is not the last instruction segment of the to-be-stored instruction, its EI is set to 0; if it is the last instruction segment, its EI is set to 1. In this way, because an EI is set for the instruction in each storage block at storage time, the EI can later be used, when the instruction is obtained, to determine whether the content of a storage block is a non-final segment or marks the end of the to-be-stored instruction. The detailed process is described in the information acquisition part below and is not repeated here.
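The storage scheme above can be illustrated with a short sketch. It assumes an instruction is a byte string and models each storage block as a `(segment, ei)` pair; the function names and this representation are assumptions made for illustration, not part of the application.

```python
def split_into_blocks(instruction: bytes, block_size: int):
    """Split an instruction into (segment, ei) pairs, one pair per storage block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not instruction:
        raise ValueError("instruction must not be empty")
    segments = [instruction[i:i + block_size]
                for i in range(0, len(instruction), block_size)]
    last = len(segments) - 1
    # EI is 1 only for the last segment (or for the single, unsplit instruction).
    return [(seg, 1 if i == last else 0) for i, seg in enumerate(segments)]


def reassemble(blocks):
    """Read consecutive blocks until one with EI == 1 is reached."""
    out = bytearray()
    for segment, ei in blocks:
        out += segment
        if ei == 1:
            break
    return bytes(out)
```

With a block size of 3, an 8-byte instruction occupies three blocks whose EI values are 0, 0, and 1, while a 2-byte instruction occupies one block with EI 1.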
Each cache includes multiple cache lines (cachelines). Each cacheline consists of multiple storage units and is used to store the instruction held in a storage block of the main memory. The storage capacity of each cacheline is the same as that of each storage block in the main memory. It should be noted that the cache here may be, for example, a Slice cache (a slice-based cache memory). This application places no particular limitation on this.
The mapping rule between the storage blocks in the main memory and the multiple caches includes the mapping between the storage blocks and the caches, and the mapping between the storage blocks and the cache lines within a cache. The mapping between the storage blocks and the caches may be set according to the specific application scenario, and this application places no particular limitation on it. The mapping between the storage blocks and the cache lines within a cache may be, for example, set-associative, fully associative, or direct-mapped, and this application places no particular limitation on it either.
With the above mapping rule, the instructions stored in the storage blocks of the main memory can be cached in the caches. Specifically, the cache corresponding to a storage block, and the corresponding cache line within that cache, are determined according to the mapping rule, and the instruction stored in the storage block is then cached in that cache line.
The redirection module is configured to monitor the access frequency of each of the multiple caches and, according to these access frequencies, adjust where hot spot instructions are stored among the caches. This evens out the access frequency across the caches, distributes requests evenly, balances the acquisition bandwidth, and improves information acquisition performance.
The redirection data table is used to record the adjustments that the redirection module makes to hot spot instructions. The redirection module and the redirection data table are described in detail below and are therefore not described further here.
It should be noted that when the main memory is a data main memory, each module in this application scenario works on the same principle as when the main memory is an instruction main memory, so the details are not repeated here.
It should be noted that the above application scenario is only an example and is not intended to limit this application.
FIG. 2 is a schematic flowchart of the bandwidth equalization method provided by an embodiment of this application. The method may be executed by, for example, the redirection module in the above application scenario, or by any other apparatus or chip capable of executing the method shown in FIG. 2. The description here takes an instruction main memory as an example, that is, the information stored in the main memory is instructions. As shown in FIG. 2, the method includes the following steps:
Step 201: Monitor the access frequency of multiple caches.
In this embodiment of the application, monitoring may be performed in either of the following two ways:
In the first way, the access frequency of each of the multiple caches within a monitoring period is monitored on a schedule. Specifically, multiple monitoring moments may be set, and at each monitoring moment, monitoring of the access frequency of each cache within the monitoring period begins. The monitoring moments and the monitoring period may be set according to the specific application scenario, and no particular limitation is placed on them here. For example, the monitoring period may be 0.01 ms.
In the second way, in response to a frequency monitoring instruction, the access frequency of the multiple caches in each monitoring period is monitored based on a monitoring period. Specifically, when the access frequency of each of the multiple caches needs to be monitored, a frequency monitoring instruction may be sent to the executing entity of the method (for example, the redirection module), that is, the executing entity is enabled. The executing entity receives and responds to the frequency monitoring instruction by immediately starting a timer and counting accesses. When the timer reaches the monitoring period, the counted number of accesses is recorded and the access frequency is calculated.
In both ways, the access frequency of each cache within the monitoring period is obtained as follows: first, the number of accesses to each cache within the monitoring period is obtained by a counter; then, the ratio of that number of accesses to the duration of the monitoring period is determined as the access frequency of the corresponding cache. It should be noted that the monitoring period is the same for every cache, and the number of accesses to each cache within the monitoring period is positively correlated with its access frequency. Therefore, the number of accesses to each cache within the monitoring period may also be used directly as that cache's access frequency for the period, which reduces the amount of computation, improves computing efficiency, and saves computing cost while keeping the data accurate.
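A minimal software sketch of this counting scheme is shown below. The `AccessMonitor` class and its method names are hypothetical, and a hardware implementation would use per-cache counters rather than a dictionary. The sketch returns both the raw counts and the count-to-period ratios, since ranking caches by either gives the same order when the period is shared.

```python
class AccessMonitor:
    """Counts accesses per cache over one monitoring period (illustrative only)."""

    def __init__(self, cache_ids, period_ms):
        if period_ms <= 0:
            raise ValueError("period_ms must be positive")
        self.period_ms = period_ms
        self.counts = {cid: 0 for cid in cache_ids}

    def record_access(self, cache_id):
        self.counts[cache_id] += 1      # one counter per cache

    def end_period(self):
        """Return (counts, frequencies) for the period and reset the counters."""
        counts = dict(self.counts)
        freqs = {cid: n / self.period_ms for cid, n in counts.items()}
        for cid in self.counts:
            self.counts[cid] = 0
        return counts, freqs
```

Because every cache shares the same period, a caller that only needs to rank the caches can skip the division and compare the counts directly.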
Periodic or scheduled monitoring thus provides dynamic monitoring of the access frequency of each cache.
It should be noted that the above ways of monitoring the access frequency of each of the multiple caches are only examples and are not intended to limit the present invention.
Step 202: Determine a cold spot cache and a hot spot cache among the multiple caches.
In this embodiment of the application, the cold spot cache and the hot spot cache may be determined in either of the following two ways:
Way 1: According to the access frequency of each cache, the cache with the highest access frequency is determined as the hot spot cache, and the cache with the lowest access frequency is determined as the cold spot cache. Specifically, the caches may be sorted in descending order of access frequency; the cache ranked first is determined as the hot spot cache, and the cache ranked last is determined as the cold spot cache.
Way 2: According to the access frequency of each cache, a cache whose access frequency is greater than a first preset frequency is determined as a hot spot cache, and a cache whose access frequency is less than a second preset frequency is determined as a cold spot cache, where the first preset frequency is greater than the second preset frequency. It should be noted that the first preset frequency and the second preset frequency may be set according to statistics on historical data.
It should be noted that the above two ways are only examples and are not intended to limit the present invention. For example, the multiple caches may instead be sorted in descending order of access frequency, with the top X caches determined as hot spot caches and the bottom M caches determined as cold spot caches, where X and M may be the same or different. Various other methods may also be used to select hot spot caches and cold spot caches from the multiple caches.
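The two ways of step 202 can be sketched as follows, assuming the monitored frequencies are available as a dictionary from cache id to access frequency. The function names and this representation are assumptions made for illustration.

```python
def pick_by_extremes(freqs):
    """Way 1: the most accessed cache is hot, the least accessed cache is cold."""
    hot = max(freqs, key=freqs.get)
    cold = min(freqs, key=freqs.get)
    return hot, cold


def pick_by_thresholds(freqs, first_preset, second_preset):
    """Way 2: compare each cache's frequency with two preset frequencies."""
    if first_preset <= second_preset:
        raise ValueError("the first preset frequency must exceed the second")
    hot = [cid for cid, f in freqs.items() if f > first_preset]
    cold = [cid for cid, f in freqs.items() if f < second_preset]
    return hot, cold
```

Way 1 always yields exactly one hot spot cache and one cold spot cache, whereas way 2 may yield several of each, or none, depending on the preset frequencies.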
Step 203: Determine a hot spot instruction in the hot spot cache.
In this embodiment of the application, if the main memory is an instruction main memory, the hot spot information is a hot spot instruction; if the main memory is a data main memory, the hot spot information is hot spot data. Because the description here takes an instruction main memory as an example, the hot spot information here is a hot spot instruction.
For step 203, the access frequency of each cacheline in the hot spot cache is monitored first; then a hot spot cacheline is determined in the hot spot cache according to the access frequency of each cacheline; finally, the instruction stored in the hot spot cacheline is determined as the hot spot instruction.
Specifically, the access frequency of each cacheline in the hot spot cache is monitored as follows:
The number of accesses to each cacheline in the hot spot cache within the same time interval is obtained by a counter, and the ratio of that number to the length of the time interval is determined as the access frequency of the corresponding cacheline. It should be noted that, because all cachelines share the same time interval, and the number of accesses to each cacheline within that interval is positively correlated with its access frequency, the number of accesses to each cacheline within the interval may also be used directly as that cacheline's access frequency, which reduces the amount of computation, improves computing efficiency, and saves computing cost while keeping the data accurate.
The hot spot cacheline may be determined in the hot spot cache according to the access frequency of each cacheline in, for example, either of the following ways: sort the cachelines in the hot spot cache in descending order of access frequency and determine the cacheline ranked first as the hot spot cacheline; or compare the access frequency of each cacheline with a set value and determine each cacheline whose access frequency is greater than the set value as a hot spot cacheline. It should be noted that these ways are only examples and are not intended to limit the present invention.
It should be noted that when there are multiple hot spot caches, the hot spot cacheline in each hot spot cache can be determined according to the above principle, and the hot spot instruction corresponding to each hot spot cacheline is then determined from it.
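Step 203, applied to one or more hot spot caches, might be sketched like this. The nested dictionary layout (`cache id -> cacheline index -> (access count, instruction)`) and the function name are assumptions made for illustration. Raw access counts stand in for access frequencies, which the text allows because all cachelines share the same time interval.

```python
def hot_instructions(hot_caches, set_value=None):
    """Return, per hot spot cache, the instructions held in its hot spot cachelines."""
    result = {}
    for cache_id, lines in hot_caches.items():
        if set_value is None:
            # Ranking variant: the single most accessed cacheline is the hot one.
            chosen = [max(lines, key=lambda idx: lines[idx][0])]
        else:
            # Set-value variant: every cacheline above the set value is hot.
            chosen = [idx for idx, (count, _) in lines.items() if count > set_value]
        result[cache_id] = [lines[idx][1] for idx in chosen]
    return result
```

The ranking variant yields one hot spot instruction per hot spot cache, while the set-value variant can yield several or none.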
Step 204: Record the first index information of the hot spot instruction and the identification information (id) of the cold spot cache in a target entry, where the target entry is an entry in the redirection data table.
In this embodiment of the application, the first index information of the hot spot instruction may be determined according to the mapping rule between the storage blocks in the main memory and the multiple caches.
The structure of the redirection data table is described below.
重定向数据表包括多个表项(entry),每个entry中可以包括一第一标识和一第二标识,其中,第一标识为有效标记或无效标记,即第一标识的取值有两种选择,这两种选择为有效标记和无效标记,若entry中的第一标识为有效标记,则该entry中存在有效信息,若entry中的第一标识为无效标记,则该entry中不存在有效信息;第二标识为热点标记或非热点标记,即第二标识的取值有两种选择,这两种选择为热点标记和非热点标记,若entry的第二标识为热点标记,则该entry中记录的第一索引信息对应的指令为热点指令,若entry的第二标识为非热点标记,则该entry中记录的第一索引信息对应的指令为冷点指令。基于此,每个entry包括四个区域,其中一个区域用于记录第一标识,一个区域用于记录第二标识,一个区域用于记录指令的第一索引信息,一个区域用于记录cache的id。The redirection data table includes multiple entries, and each entry may include a first identifier and a second identifier, where the first identifier is a valid tag or an invalid tag, that is, the value of the first identifier has two values. There are two options. The two options are valid and invalid tags. If the first identifier in the entry is a valid tag, then there is valid information in the entry, and if the first identifier in the entry is an invalid tag, then there is no entry in the entry. Valid information; the second identifier is a hotspot tag or a non-hotspot tag, that is, there are two options for the value of the second tag. The two options are hotspot tags and non-hotspot tags. If the second tag of the entry is a hotspot tag, this The instruction corresponding to the first index information recorded in the entry is a hot spot instruction. If the second identifier of the entry is a non-hot spot mark, the instruction corresponding to the first index information recorded in the entry is a cold spot instruction. Based on this, each entry includes four areas. One area is used to record the first identifier, one area is used to record the second identifier, one area is used to record the first index information of the command, and one area is used to record the id of the cache. .
需要说明的是,每个entry中的指令的第一索引信息可以根据主存中的存储块与多个cache的映射规则进行确定。It should be noted that the first index information of the instruction in each entry can be determined according to the mapping rule between the storage block in the main memory and multiple caches.
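The four-field entry described above can be pictured with a small data structure. This is only an illustrative sketch: the names `Entry`, `valid`, `hot`, `first_index`, `cache_id`, and `make_table` are hypothetical, and representing the two identifiers as booleans and the index as an integer is an assumption rather than something the application specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """One entry of the redirection data table (hypothetical field names)."""
    valid: bool = False                 # first identifier: valid flag / invalid flag
    hot: bool = False                   # second identifier: hotspot flag / non-hotspot flag
    first_index: Optional[int] = None   # first index information of the instruction
    cache_id: Optional[int] = None      # id of the cache the instruction is redirected to


def make_table(num_entries: int) -> List[Entry]:
    """Create a table in which every entry starts out invalid and non-hot."""
    return [Entry() for _ in range(num_entries)]
```

Each entry is an independent object, so setting the flags of one entry does not affect any other entry in the table.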
On this basis, as shown in FIG. 3, the first index information of the hotspot instruction and the id of the cold-spot cache are recorded in a target entry through the following steps:
Step 301: Determine candidate entries among the multiple entries according to the first identifier and the second identifier of each entry. The candidate entries include the entries whose first identifier is the invalid flag and the entries whose second identifier is the non-hotspot flag. In other words, the candidate entries are the entries that hold no valid information and the entries whose recorded first index information corresponds to a cold-spot instruction.
Step 302: Determine the target entry among the candidate entries. Specifically, any one of the candidate entries may be determined as the target entry. Alternatively, if the candidate entries include both entries whose first identifier is the invalid flag and entries whose second identifier is the non-hotspot flag, any entry whose first identifier is the invalid flag is preferably determined as the target entry. Alternatively, if the candidate entries include only entries whose second identifier is the non-hotspot flag, any entry whose second identifier is the non-hotspot flag is determined as the target entry.
It should be noted that when there are multiple hotspot instructions, that is, when there are multiple hotspot caches and at least one of them includes at least one hotspot cacheline, or when there is one hotspot cache and it includes multiple hotspot cachelines, a target entry needs to be determined for each hotspot instruction according to the above principle.
Step 303: Record the first index information of the hotspot instruction and the id of the cold-spot cache in the target entry.
In the embodiments of this application, if there are multiple hotspot instructions, the corresponding cold-spot cache is first determined for each hotspot instruction, and then the first index information of each hotspot instruction and the id of its corresponding cold-spot cache are recorded in the corresponding target entry.
If there is one hotspot instruction, a cold-spot cache is first determined for that hotspot instruction, and then the first index information of the hotspot instruction and the id of the determined cold-spot cache are recorded in the corresponding target entry.
After step 303 is performed, the second identifier in the target entry may be set to the hotspot flag to indicate that the instruction corresponding to the first index information recorded in the target entry is a hotspot instruction, and the first identifier in the target entry may be set to the valid flag to indicate that the target entry holds valid information.
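Steps 301 to 303 and the subsequent flag update can be sketched as follows. This is a minimal sketch under stated assumptions: the `Entry` class and the function names are hypothetical, the sketch follows the preferred option of step 302 (invalid entries first), taking the first matching entry as the tie-break is an assumption, and returning `False` when no candidate exists is also an assumption, since the text does not describe that case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    valid: bool = False                 # first identifier
    hot: bool = False                   # second identifier
    first_index: Optional[int] = None
    cache_id: Optional[int] = None


def select_target(table: List[Entry]) -> Optional[Entry]:
    """Steps 301-302: prefer an invalid entry, otherwise a non-hot entry."""
    for entry in table:
        if not entry.valid:
            return entry
    for entry in table:
        if not entry.hot:
            return entry
    return None  # assumption: no candidate entry is available


def record_redirect(table: List[Entry], first_index: int, cold_cache_id: int) -> bool:
    """Step 303 plus the flag update performed afterwards."""
    target = select_target(table)
    if target is None:
        return False
    target.first_index = first_index
    target.cache_id = cold_cache_id
    target.hot = True     # the recorded instruction is a hotspot instruction
    target.valid = True   # the entry now holds valid information
    return True
```

When there are several hotspot instructions, `record_redirect` would simply be called once per instruction with that instruction's own cold-spot cache id.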
The cold-spot cache corresponding to each hotspot instruction may be determined as follows: a corresponding cold-spot cache is determined for each hotspot cache, and the cold-spot cache corresponding to a hotspot cache is then taken as the cold-spot cache corresponding to the hotspot instructions in that hotspot cache.
The cold-spot cache corresponding to each hotspot cache may be determined as follows:
If there is one cold-spot cache, that cold-spot cache is determined as the cold-spot cache corresponding to every hotspot cache. That is, all hotspot caches correspond to the same cold-spot cache.
If there are multiple cold-spot caches, a corresponding cold-spot cache may be determined for each hotspot cache from among them. The cold-spot caches corresponding to the hotspot caches may be all the same, all different, or partly the same and partly different, which is not specifically limited here.
While or after the first index information of the hotspot instruction and the id of the cold-spot cache are recorded in the target entry, the hotspot instruction may also be filled into a cold-spot cacheline of the cold-spot cache. That is, a cold-spot cacheline is determined in the cold-spot cache, and the hotspot instruction is filled into it. Specifically, the cold-spot cacheline may be determined in any of the following ways: any cacheline in the cold-spot cache is determined as the cold-spot cacheline; or the access frequency of each cacheline in the cold-spot cache is monitored, and a cacheline whose access frequency is less than a fifth preset frequency is determined as the cold-spot cacheline; or the access frequency of each cacheline in the cold-spot cache is monitored, and the cacheline with the lowest access frequency is determined as the cold-spot cacheline. It should be noted that the above process is only an example and does not limit this application.
It should be noted that when there are multiple hotspot instructions, each hotspot instruction is filled into a cold-spot cacheline of its corresponding cold-spot cache.
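Two of the cold-spot cacheline selection options above depend on monitored access frequencies and can be sketched as below. The function names and the dictionary that maps a cacheline number to its access frequency are hypothetical; how the frequencies are actually collected is not covered by this sketch.

```python
from typing import Dict, List


def cold_lines_below(freqs: Dict[int, float], fifth_preset: float) -> List[int]:
    """Cachelines whose access frequency is less than the fifth preset frequency."""
    return sorted(line for line, freq in freqs.items() if freq < fifth_preset)


def coldest_line(freqs: Dict[int, float]) -> int:
    """The cacheline with the lowest access frequency."""
    return min(freqs, key=freqs.get)
```

The threshold variant may yield several candidate cachelines or none, whereas the minimum variant always yields exactly one cacheline for a non-empty cache.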
Further, in order to update the hot or cold state of the instruction corresponding to the first index information in each entry of the redirection data table, this application also includes: monitoring the access frequency of each entry in the redirection data table; determining whether the access frequency of a first entry is less than a third preset frequency, and changing the second identifier of a first entry whose access frequency is less than the third preset frequency to the non-hotspot flag; and determining whether the access frequency of a second entry is greater than a fourth preset frequency, and changing the second identifier of a second entry whose access frequency is greater than the fourth preset frequency to the hotspot flag. Here, a first entry is an entry in the redirection data table whose second identifier is the hotspot flag, a second entry is an entry in the redirection data table whose second identifier is the non-hotspot flag, and the fourth preset frequency is greater than the third preset frequency.
Monitoring the access frequency of each entry in the redirection data table includes: counting the number of times that the first index information stored in each entry is successfully matched by the index information of an instruction to be acquired within a preset time interval, and determining the ratio of that number to the preset time interval as the access frequency of the corresponding entry. It should be noted that because the preset time interval is the same for every entry, and the number of successful matches within the interval is positively correlated with the access frequency, the number of successful matches within the preset time interval may itself be used as the access frequency of the corresponding entry. This reduces the amount and cost of computation without affecting the accuracy of the comparison.
By monitoring the access frequency of each entry in the redirection data table and comparing it with the third preset frequency or the fourth preset frequency, it is determined whether to change the hot or cold state of the instruction corresponding to the first index information in the entry. The hot or cold state recorded in each entry is thereby monitored and updated in real time, which keeps the information in the redirection data table timely and accurate.
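The two-threshold update above behaves like a hysteresis band: an entry only changes state when its frequency leaves the range between the third and fourth preset frequencies. The sketch below illustrates this under assumptions: the `Entry` class, the parameter names, and the per-entry match counts passed in as a list are hypothetical, and skipping invalid entries is an assumption not stated in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    valid: bool = False   # first identifier
    hot: bool = False     # second identifier


def update_hot_flags(table: List[Entry], match_counts: List[int], interval: float,
                     third_preset: float, fourth_preset: float) -> None:
    """Demote hot entries below the third preset; promote cold entries above the fourth."""
    if fourth_preset <= third_preset:
        raise ValueError("the fourth preset frequency must exceed the third")
    for entry, count in zip(table, match_counts):
        if not entry.valid:
            continue  # assumption: entries without valid information are ignored
        freq = count / interval  # successful matches per unit of time
        if entry.hot and freq < third_preset:
            entry.hot = False
        elif not entry.hot and freq > fourth_preset:
            entry.hot = True
```

Because the interval is identical for all entries, the division could be dropped and the raw counts compared against thresholds scaled by the interval, which corresponds to the simplification noted in the text.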
To further confirm whether the hotspot instruction should be redirected, and thereby start the redirection process more accurately, the hotspot instruction may be determined in the hotspot cache in either of the following two manners.
Manner 1: Determine whether the access frequency of the hotspot cache reaches the frequency configured in a register and whether the difference between the access frequency of the hotspot cache and the access frequency of the cold-spot cache is greater than a configured value. If so, determine the hotspot instruction in the hotspot cache. The frequency configured in the register may be, for example, 1000 MOPS.
In Manner 1, if there are multiple hotspot caches, the cold-spot cache corresponding to each hotspot cache may be determined among the cold-spot caches, and for each hotspot cache it is determined whether its access frequency reaches the frequency configured in the register and whether the difference between its access frequency and that of its corresponding cold-spot cache is greater than the configured value. Hotspot instructions are determined in the hotspot caches that meet these conditions. The manner of determining the cold-spot cache corresponding to each hotspot cache has been described above and is not repeated here.
Manner 2: Determine whether the access frequency of the hotspot cache is greater than n times the access frequency of the cold-spot cache. If so, determine the hotspot instruction in the hotspot cache.
In Manner 2, if there are multiple hotspot caches, the cold-spot cache corresponding to each hotspot cache may be determined among the cold-spot caches, and for each hotspot cache it is determined whether its access frequency is greater than n times the access frequency of its corresponding cold-spot cache. Hotspot instructions are determined in the hotspot caches that meet this condition.
Manner 1 and Manner 2 thus each provide a condition for determining the hotspot instruction. Only when the condition is met is the hotspot instruction determined in the hotspot cache, after which its first index information and the id of the cold-spot cache are recorded in the target entry to redirect the hotspot instruction. In other words, Manner 1 and Manner 2 provide preconditions for starting the redirection process, and the process is started only when the precondition is met, which makes starting the redirection process more accurate.
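The two preconditions can be written as simple predicates. This is a sketch only: the function and parameter names are hypothetical, and reading "reaches" as a greater-than-or-equal comparison is an interpretation of the text.

```python
def manner1_triggers(hot_freq: float, cold_freq: float,
                     register_freq: float, configured_gap: float) -> bool:
    """Manner 1: the hot cache reaches the register frequency and the gap exceeds the configured value."""
    return hot_freq >= register_freq and (hot_freq - cold_freq) > configured_gap


def manner2_triggers(hot_freq: float, cold_freq: float, n: float) -> bool:
    """Manner 2: the hot cache is accessed more than n times as often as the cold cache."""
    return hot_freq > n * cold_freq
```

With several hotspot caches, either predicate would be evaluated once per hotspot cache against that cache's own corresponding cold-spot cache.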
It should be noted that the bandwidth equalization method works on the same principle when the main memory is a data main memory as when it is an instruction main memory, so the details are not repeated here.
In summary, the access frequencies of the multiple caches are monitored, a cold-spot cache and a hotspot cache are determined among them, and the first index information of the hotspot information in the hotspot cache and the id of the cold-spot cache are recorded in a target entry of the redirection data table. Hotspot information in a frequently accessed cache is thereby redirected to an infrequently accessed cache. This changes how information is distributed across the multiple caches and ties that distribution to the access frequency of each cache. As a result, when information is acquired in response to requests, the uneven distribution of requests is greatly alleviated, the severe imbalance of acquisition bandwidth is mitigated, and information acquisition performance improves. In addition, because the distribution of information across the caches is related to the access frequency of each cache, information acquisition performance can be estimated. Furthermore, hotspot information can be redirected based solely on the access frequency of each cache, so the steps are simple and easy to implement.
The information acquisition process is described below on the basis of the above bandwidth equalization method. FIG. 4, FIG. 5, FIG. 6, and FIG. 7 are respectively the first, second, third, and fourth parts of a schematic flowchart of an information acquisition method provided by an embodiment of this application. The information acquisition method may be executed by the same entity as the bandwidth equalization method or by a different one, which is not specifically limited in this application. In the following description, the executing entity is the redirection module in FIG. 1, the main memory is an instruction main memory, and the information to be acquired is an instruction to be acquired. As shown in FIG. 4 to FIG. 7, the information acquisition process may include:
Step 401: The redirection module reads a first request. The first request is a request for acquiring an instruction to be acquired and carries second index information, which is the index information used to acquire the instruction to be acquired.
The first request carries at least one piece of second index information. If the first request carries multiple pieces of second index information, the first request is a request for acquiring the instruction to be acquired corresponding to each piece of second index information.
It should be noted that if the size of the instruction to be acquired is greater than the capacity of a storage block in the main memory, the second index information is the index information of the storage block that stores the first instruction segment of the instruction to be acquired; if the size of the instruction to be acquired is less than or equal to the capacity of a storage block in the main memory, the second index information is the index information of the storage block that stores the instruction to be acquired.
Step 402: The redirection module binds an execution thread to the first request. It should be noted that execution threads are mapped one-to-one to Processor Cores, and only an execution thread whose Processor Core has completed its previous instruction acquisition request can be bound to the first request. When multiple execution threads are available for binding, the first request is bound to the execution thread with the shallowest instruction queue (Instruction Queue, Inst Q).
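The binding rule of step 402 amounts to choosing, among the threads whose previous request has completed, the one with the shallowest instruction queue. The sketch below is illustrative: `ExecThread`, its fields, and `bind_thread` are hypothetical names, and returning `None` when no thread is available is an assumption.

```python
from typing import List, NamedTuple, Optional


class ExecThread(NamedTuple):
    thread_id: int
    idle: bool        # True if its Processor Core finished the previous fetch request
    inst_q_depth: int  # current depth of the thread's instruction queue (Inst Q)


def bind_thread(threads: List[ExecThread]) -> Optional[int]:
    """Return the id of the bindable thread with the shallowest Inst Q, if any."""
    bindable = [t for t in threads if t.idle]
    if not bindable:
        return None  # assumption: the request waits until a thread becomes available
    return min(bindable, key=lambda t: t.inst_q_depth).thread_id
```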
Step 403: The redirection module determines the first index information of the instruction to be acquired according to the second index information. The first index information is derived from the second index information by applying the mapping rule between the storage blocks in the main memory and the multiple caches. The first index information of the instruction to be acquired may be, for example, the second index information itself or a part of it. For example, if the second index information is the physical address of the instruction to be acquired, the first index information may be that physical address or some of the fields in that physical address.
It should be noted that if the first request carries multiple pieces of second index information, one of them is selected in step 403, and step 403 and the subsequent steps are performed according to the selected piece of second index information.
Step 404: The redirection module matches the first index information of the instruction to be acquired against the first index information in each entry of the redirection data table.
Step 405: If the first index information in an entry matches the first index information of the instruction to be acquired, the redirection module determines the cache corresponding to the cache id in the matching entry as the first target cache.
Step 406: The redirection module sends the second index information to the first target cache, so that the first target cache acquires the instruction to be acquired according to the second index information.
As can be seen from steps 403 to 406, the first index information of the instruction to be acquired is determined according to the second index information and matched against the first index information in each entry of the redirection data table, and the matching result indicates whether the instruction to be acquired has been redirected. If it has, the cache to which the instruction was redirected is determined as the first target cache, and the instruction to be acquired is acquired from the first target cache. If it has not, the cache corresponding to the instruction to be acquired is determined according to the mapping rule between the storage blocks in the main memory and the multiple caches together with the second index information; that cache is the first target cache, and the instruction to be acquired is acquired from it. See step 407 for the specific process.
Step 407: If the first index information in every entry fails to match the first index information of the instruction to be acquired, the redirection module determines the first target cache according to the second index information and the mapping rule.
The mapping rule comprises the mapping relationship between the storage blocks in the main memory and the multiple caches and the mapping relationship between the storage blocks and the cachelines in the multiple caches. The first target cache is determined according to the second index information and the mapping rule as follows: based on the mapping relationship between the storage blocks and the multiple caches, and in combination with the second index information, the cache corresponding to the instruction to be acquired is determined; that cache is the first target cache.
The following example illustrates how the cache corresponding to the instruction to be acquired is determined based on the mapping relationship between the storage blocks and the multiple caches in combination with the second index information.
Suppose the mapping relationship between the storage blocks and the multiple caches is set according to the physical address of a storage block and the id of a cache, with a preset field in the physical address of the storage block mapped to the cache id. Both the physical address and the cache id are expressed in binary, and the preset field of the physical address has the same number of bits as the cache id. If the preset field of the physical address of a storage block is equal to the id of a cache, that cache is the cache corresponding to the storage block. In this case, the second index information is the physical address of the instruction to be acquired. The preset field in that physical address is compared with the id of each cache, and the cache whose id is equal to the preset field is determined as the cache corresponding to the instruction to be acquired.
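Steps 403 to 407, combined with the preset-field example, can be sketched as a table lookup with a fallback to the default mapping. Several things here are assumptions made only for illustration: the first index information is taken to be the whole physical address, the preset field is located by a hypothetical `shift` and `width`, entries are plain dictionaries with hypothetical keys, and only valid entries are considered when matching.

```python
from typing import Dict, List


def preset_field(phys_addr: int, shift: int, width: int) -> int:
    """Extract the preset field of a physical address; its value is the default cache id."""
    return (phys_addr >> shift) & ((1 << width) - 1)


def resolve_target_cache(phys_addr: int, table: List[Dict], shift: int, width: int) -> int:
    """Return the redirected cache id on a table match, else the default mapping."""
    first_index = phys_addr  # assumption: first index information is the full address
    for entry in table:
        if entry["valid"] and entry["first_index"] == first_index:
            return entry["cache_id"]  # step 405: the instruction was redirected
    return preset_field(phys_addr, shift, width)  # step 407: apply the mapping rule
```

For example, with `shift=2` and `width=2`, the address `0b10110100` maps by default to the cache whose id is `0b01`, unless a valid entry for that address redirects it elsewhere.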
Step 408: Send the second index information to the first target cache, so that the first target cache acquires the instruction to be acquired according to the second index information.
Step 409: The first target cache receives the second index information and, according to the second index information and the mapping relationship between the storage blocks in the main memory and the cachelines in the cache, determines whether a cacheline corresponding to the second index information exists in the first target cache. If it exists, the instruction to be acquired is acquired from that cacheline.
Step 410: The first target cache sends the instruction to be acquired to the crossbar.
Step 411: The crossbar sends the instruction to be acquired to the Processor Core corresponding to the execution thread bound to the first request.
Step 412: The first target cache determines whether acquisition of the instruction to be acquired is complete according to the EI flag in the acquired instruction (that is, the instruction acquired from the corresponding cacheline in step 409).
Step 413: If the EI flag is 1, it is determined that acquisition of the instruction to be acquired is complete, and the process jumps to step 421.
Step 414: If the EI flag is 0, it is determined that acquisition of the instruction to be acquired is not complete, and the first target cache calculates third index information according to the second index information and a storage interval. It should be noted that the storage interval is the address interval between the different instruction segments of one instruction as stored in the main memory.
Step 415: The first target cache sends the third index information to the redirection module.
Step 416: The redirection module receives the third index information sent by the first target cache, determines the first index information of the instruction to be acquired according to the third index information, and matches that first index information against the first index information in each entry of the redirection data table.
Step 417: If the first index information in an entry matches the first index information of the instruction to be acquired, the redirection module determines the cache corresponding to the cache id in the matching entry as the second target cache.
Step 418: The redirection module sends the third index information to the second target cache, so that the second target cache acquires the instruction to be acquired according to the third index information.
Step 419: If the first index information in every entry fails to match the first index information of the instruction to be acquired, the redirection module determines the second target cache according to the third index information and the mapping rule.
Step 420: The redirection module sends the third index information to the second target cache, so that the second target cache acquires the instruction to be acquired according to the third index information.
It should be noted that the second target cache is determined according to the third index information on the same principle as the first target cache is determined according to the second index information, so the details are not repeated here. Likewise, the way the second target cache acquires the instruction to be acquired according to the third index information, and the processing that follows, are the same as in step 409 and the steps after it, so the details are not repeated here.
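The segment-by-segment acquisition of steps 409 to 420 can be sketched as a loop that advances the index by the storage interval until a segment with the EI flag set to 1 is read. This is a heavily simplified sketch: cache hits and misses are not modeled, `memory` is a hypothetical dictionary mapping an address to a `(segment, ei)` pair, `resolve` is a hypothetical callable standing in for the redirection module's target-cache lookup, and the sketch assumes the stored data is well formed (a final segment with EI equal to 1 exists).

```python
from typing import Callable, Dict, List, Tuple


def fetch_instruction(memory: Dict[int, Tuple[str, int]], start_index: int,
                      storage_interval: int,
                      resolve: Callable[[int], int]) -> Tuple[List[str], List[int]]:
    """Collect all segments of one instruction and the cache id chosen for each."""
    segments: List[str] = []
    route: List[int] = []
    index = start_index              # second index information
    while True:
        route.append(resolve(index))  # target cache for this segment
        segment, ei = memory[index]
        segments.append(segment)
        if ei == 1:                   # step 413: acquisition is complete
            return segments, route
        index += storage_interval     # step 414: next (third) index information
```

Because the target cache is resolved again for every segment, different segments of the same instruction may be served by different caches.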
Step 421: If the EI flag is 1, the redirection module determines whether the first request contains other available second index information, selects one piece from the other available second index information in the first request, and repeats step 403 and the subsequent steps based on the second index information selected in this step.
Specifically, in this step, an EI flag of 1 indicates that acquisition of the instruction to be acquired corresponding to the second index information selected in step 403 is complete. Because the first request carries at least one piece of second index information, when it carries multiple pieces, another available piece of second index information is selected from the first request, and the above process is repeated based on the newly selected second index information to acquire the corresponding instruction to be acquired.
It should be noted that if the EI flag is 1 and the first request contains no other available second index information, processing of the first request is determined to be complete, and the redirection module may read a new request and process it.
Step 422: In step 409, if the first target cache contains no cacheline corresponding to the second index information, the first target cache generates a Refill request according to the second index information.
Step 423: The first target cache sends the Refill request to the main memory.
Step 424: The main memory obtains the to-be-obtained instruction according to the second index information in the Refill request, generates response information according to the to-be-obtained instruction and the second index information, and sends the to-be-obtained instruction through the crossbar to the Processor Core corresponding to the execution thread bound to the first request.
Step 425: The main memory sends the response information to the redirection module.
Step 426: The redirection module determines the first index information of the to-be-obtained instruction according to the second index information in the response information, and matches it against the first index information stored in each entry of the redirection data table.
Step 427: If the first index information of the to-be-obtained instruction matches the first index information in an entry of the redirection data table, the redirection module determines the cache identified by the cache id in the matching entry as the third target cache.
Step 428: The redirection module sends the response information to the third target cache.
Step 429: If the first index information of the to-be-obtained instruction matches the first index information in none of the entries of the redirection data table, the redirection module determines the third target cache according to the second index information in the response information in combination with the mapping rule.
Step 430: The redirection module sends the response information to the third target cache.
Step 431: The third target cache receives the response information and stores the to-be-obtained instruction carried in the response information in one of its cachelines.
It should be noted that when the main memory is a data main memory and the to-be-obtained information is data, the information acquisition flow is the same as steps 401 to 431 above, so details are not repeated here.
As can be seen from the above, the first index information of the to-be-obtained information is determined according to the second index information and matched against the first index information in each entry of the redirection data table. When a matching entry exists, the cache identified by the cache id in the matching entry is determined as the first target cache, and the second index information is sent to the first target cache so that the first target cache obtains the to-be-obtained information according to the second index information. The first request is thus diverted based on the redirection data table, which greatly alleviates uneven request distribution, mitigates the severe imbalance in request acquisition bandwidth, and improves information acquisition performance.
The foregoing process is described below by way of example. FIG. 8 is a schematic diagram of an application scenario including multiple caches according to an embodiment of this application. As shown in FIG. 8, the application scenario includes Slice caches (that is, the caches in this scenario are Slice caches), processor cores, an IBUF (Input Buffer), REDIR1 (redirection module 1), a Redirect Table (redirection data table), and a main memory (not shown in FIG. 8). The IBUF includes an IFIFO (Input FIFO, where FIFO stands for First In First Out) and REDIR0 (redirection module 0). There are 16 Slice caches, Slice cache0 to Slice cache15, each including 256 cachelines, and 16 processor cores, processor core0 to processor core15. The main memory is an instruction main memory, that is, the information stored in it is instructions. The cachelines in each Slice cache are 8-way set-associative with the main memory, and the instructions stored in the storage blocks of the main memory are distributed across the Slice caches in a zigzag pattern.
It should be noted that when a to-be-stored instruction is stored in the main memory, if its size is greater than the capacity of a storage block, the instruction is divided into multiple instruction segments according to the storage block capacity, and the segments are stored in multiple storage blocks. The number of segments equals the number of storage blocks used, and the physical addresses of the storage blocks holding adjacent segments are 8 apart (the interval 8 is a decimal value). If the size of the to-be-stored instruction is less than or equal to the capacity of a storage block, the instruction is stored as a whole in one storage block.
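The segment placement described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `place_instruction`, the representation of an instruction as a byte string, and the `block_capacity` parameter are hypothetical, since the text does not give the storage block capacity.

```python
STORAGE_INTERVAL = 8  # from the text: adjacent segments sit 8 block addresses apart (decimal)


def place_instruction(instruction: bytes, first_block_addr: int, block_capacity: int):
    """Return (block_address, segment) pairs for one to-be-stored instruction.

    block_capacity is an assumed parameter; the text does not state its value.
    """
    if block_capacity <= 0:
        raise ValueError("block_capacity must be positive")
    # An instruction that fits in one storage block is stored as a whole.
    if len(instruction) <= block_capacity:
        return [(first_block_addr, instruction)]
    # Otherwise split it by block capacity and space the segments 8 addresses apart.
    segments = [
        instruction[i:i + block_capacity]
        for i in range(0, len(instruction), block_capacity)
    ]
    return [
        (first_block_addr + k * STORAGE_INTERVAL, segment)
        for k, segment in enumerate(segments)
    ]
```

With an assumed capacity of 4 bytes, a 10-byte instruction starting at block address 16 would occupy block addresses 16, 24, and 32.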
The mapping rule between the storage blocks in the main memory and the 16 Slice caches is as follows: the storage blocks in the main memory are mapped to the 16 Slice caches in a zigzag pattern, and each Slice cache is 8-way set-associative with the storage blocks in the main memory. It should be noted that the mapping rule is set according to the physical addresses of the storage blocks and the ids of the Slice caches.
The number of binary bits in the physical address of a storage block may be determined by the total number of storage blocks in the main memory and the number of bytes stored in each storage block. Here, the physical address of a storage block has 18 bits and may be written PC[17:0]. It should be noted that the physical address of a storage block is also the physical address of the instruction stored in it, that is, the physical address of the instruction in a storage block is written PC[17:0].
The number of bits needed to represent a Slice cache id in binary may be determined by the total number of Slice caches. Because there are 16 Slice caches, the id has 4 bits and may be written SLID[3:0].
The mapping rule is described below.
The storage blocks in the main memory are mapped to the 16 Slice caches in a zigzag pattern as follows. The storage blocks are divided by physical address into storage block groups of 8, each group containing 8 storage blocks with adjacent physical addresses. The storage blocks in storage block group a+16b are then mapped to the a-th Slice cache (Slice cache a), where a is an integer in the range [0, 15] and b is an integer greater than or equal to 0. In other words, the storage blocks in storage block group 0 (physical addresses 0 to 7), storage block group 16 (physical addresses 128 to 135), and every further group at a stride of 16 starting from group 0 are mapped to Slice cache0; the storage blocks in storage block group 1 (physical addresses 8 to 15), storage block group 17 (physical addresses 136 to 143), and every further group at a stride of 16 starting from group 1 are mapped to Slice cache1; and so on, up to storage block group 15 (physical addresses 120 to 127), storage block group 31 (physical addresses 248 to 255), and every further group at a stride of 16 starting from group 15, whose storage blocks are mapped to Slice cache15. It should be noted that the physical addresses in this paragraph are decimal.
Based on this zigzag mapping, PC[6:3] of the storage block address PC[17:0] is used as a preset field, and the Slice cache corresponding to a storage block is determined from its PC[6:3]: if the PC[6:3] of a storage block equals the SLID[3:0] of a Slice cache, that Slice cache is the one corresponding to the storage block. Because the physical address of a storage block is the physical address of the instruction stored in it, when the instruction stored in a storage block is cached in one of the 16 Slice caches, the PC[6:3] of the instruction is compared with the SLID[3:0] of each Slice cache, the Slice cache whose SLID[3:0] equals the PC[6:3] of the instruction is determined as the Slice cache corresponding to the instruction, and the instruction is cached in that Slice cache.
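As a purely illustrative aid (a sketch, not part of the described embodiment), the group-based zigzag rule and the PC[6:3] comparison can be written side by side in Python to show that they select the same Slice cache. The function and constant names are hypothetical.

```python
NUM_SLICES = 16        # Slice cache0 to Slice cache15
BLOCKS_PER_GROUP = 8   # each storage block group holds 8 adjacent block addresses


def slice_by_group(block_addr: int) -> int:
    """Zigzag rule: storage block group a+16b maps to Slice cache a."""
    group = block_addr // BLOCKS_PER_GROUP
    return group % NUM_SLICES


def slice_by_bits(pc: int) -> int:
    """Same rule expressed on the address bits: SLID[3:0] equals PC[6:3]."""
    return (pc >> 3) & 0xF
```

Dividing by 8 discards PC[2:0], and taking the result modulo 16 keeps the next four bits, which is why the preset field is PC[6:3].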
The structure of each Slice cache is described below with reference to FIG. 9. Each Slice cache includes Cache data and a TAG Table. Because all Slice caches have the same structure, only the Cache data and TAG Table of one Slice cache are described.
Specifically, the Cache data includes 256 cachelines (cacheline0 to cacheline255), each used to cache the instruction in a storage block of the main memory. Because the Slice cache is 8-way set-associative with the main memory, the 256 cachelines can be divided into 32 groups of 8 cachelines each. The TAG Table consists of rows and columns. One row corresponds to one group of cachelines and contains 8 major columns, each called a way, so that each row contains 8 ways, way0 to way7. The 8 ways in a row correspond one-to-one to the 8 cachelines in the corresponding group, and each way holds information about the instruction stored in its cacheline.
Specifically, each way includes four parameters: VLD, TAG, lock, and dirty. The VLD of a way indicates whether an instruction is stored in the cacheline corresponding to that way: a VLD of 0 indicates that no instruction is stored in the cacheline, and a VLD of 1 indicates that an instruction is stored in the cacheline;
the TAG of a way is the tag bits of the physical address of the instruction stored in the cacheline corresponding to that way;
the lock of a way indicates whether the instruction stored in the corresponding cacheline may be replaced: a lock of 1 indicates that the instruction may not be replaced, and a lock of 0 indicates that it may be replaced;
the dirty of a way indicates whether the instruction stored in the corresponding cacheline is consistent with the information it is supposed to store: a dirty of 0 indicates that the stored instruction is not consistent with that information, and a dirty of 1 indicates that it is consistent.
The row number of each way is given by the index bits of the physical address of the instruction stored in the cacheline corresponding to that way.
Based on the foregoing mapping rule, PC[17:13] of the instruction in a storage block is used as the index bits, and PC[12:3] of the instruction is used as the tag bits. When the instruction in a storage block is cached in a Slice cache, the Slice cache corresponding to the instruction is first determined according to PC[6:3] of the instruction. The row number corresponding to the instruction is then determined in that Slice cache according to PC[17:13], and the group of cachelines corresponding to the instruction is determined from the row number. Next, the instruction is stored in one cacheline of that group. Finally, the way corresponding to the cacheline that stores the instruction is located in the TAG Table of that Slice cache, and the TAG in that way is updated according to PC[12:3] of the instruction.
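The store path just described can be sketched in Python as follows. This is a minimal sketch under assumptions rather than the embodiment itself: the class and function names are hypothetical, the cacheline of a given row and way is assumed to sit at position row * 8 + way in the Cache data, the way is chosen as the first empty way or else the first unlocked way because the text does not specify a replacement policy, and VLD is set to 1 on a store.

```python
from dataclasses import dataclass

ROWS, WAYS = 32, 8  # 256 cachelines arranged as 32 groups of 8


@dataclass
class Way:
    vld: int = 0    # 1: the cacheline holds an instruction
    tag: int = 0    # PC[12:3] of the cached instruction
    lock: int = 0   # 1: the cached instruction may not be replaced
    dirty: int = 0  # semantics as stated in the text above


def index_bits(pc: int) -> int:
    return (pc >> 13) & 0x1F   # PC[17:13], selects one of 32 rows


def tag_bits(pc: int) -> int:
    return (pc >> 3) & 0x3FF   # PC[12:3], 10 bits


class SliceCache:
    def __init__(self) -> None:
        self.tag_table = [[Way() for _ in range(WAYS)] for _ in range(ROWS)]
        self.cache_data = [None] * (ROWS * WAYS)

    def store(self, pc: int, instruction):
        """Cache one instruction; return (row, way) or None if every way is locked."""
        row = index_bits(pc)
        ways = self.tag_table[row]
        # Assumed policy: prefer an empty way, otherwise the first unlocked way.
        way_id = next((w for w in range(WAYS) if ways[w].vld == 0), None)
        if way_id is None:
            way_id = next((w for w in range(WAYS) if ways[w].lock == 0), None)
        if way_id is None:
            return None
        self.cache_data[row * WAYS + way_id] = instruction  # assumed layout
        ways[way_id].vld = 1
        ways[way_id].tag = tag_bits(pc)
        return row, way_id
```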
On this basis, the following describes how the to-be-obtained instruction is obtained from the corresponding Slice cache once that Slice cache has been determined.
First, the row corresponding to the to-be-obtained instruction is determined in the TAG Table of the corresponding Slice cache according to PC[17:13] of the instruction. Then, PC[12:3] of the instruction is matched against the TAG in each way of that row; if a match is found, the matching way is determined as the way corresponding to the instruction. Finally, the corresponding cacheline is located in the Cache data of the Slice cache according to PC[17:13] of the instruction and the id of the matching way, and the to-be-obtained instruction is read from that cacheline.
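The lookup above may be sketched as follows in Python. The sketch makes several assumptions that are not stated in the text: each way is represented as a dictionary with hypothetical keys `vld` and `tag`, a way is only allowed to match when its VLD is 1, and the cacheline of a given row and way is assumed to sit at position row * 8 + way in the Cache data.

```python
ROWS, WAYS = 32, 8  # 32 TAG Table rows of 8 ways each


def lookup(tag_table, cache_data, pc):
    """Return the cached instruction for pc, or None on a miss."""
    row = (pc >> 13) & 0x1F   # PC[17:13] selects the TAG Table row
    tag = (pc >> 3) & 0x3FF   # PC[12:3] is compared with each way's TAG
    for way_id, way in enumerate(tag_table[row]):
        # Checking VLD before comparing the TAG is an assumption of this sketch.
        if way["vld"] == 1 and way["tag"] == tag:
            return cache_data[row * WAYS + way_id]  # assumed cacheline layout
    return None
```

A `None` result corresponds to the miss case, in which the Slice cache would issue a Refill request.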
Based on the foregoing mapping rule, the first index information of an instruction is PC[17:3] of the instruction, and the second index information is the physical address of the to-be-obtained instruction, that is, PC[17:0].
It should be noted that if the size of the to-be-obtained instruction is less than or equal to the capacity of a storage block, PC[17:0] of the instruction is the PC[17:0] of the storage block in the main memory that stores the instruction, and PC[17:13] of the instruction is the PC[17:13] of that storage block. If the size of the to-be-obtained instruction is greater than the capacity of a storage block, PC[17:0] of the instruction is the PC[17:0] of the storage block in the main memory that stores the first instruction segment of the instruction, and PC[17:13] of the instruction is the PC[17:13] of that storage block.
The Redirect Table includes multiple entries. As shown in FIG. 8, an entry includes four parameters: a first identifier VLD, a second identifier HOT, the PC[17:3] of the hotspot information, and the id of the cold Slice cache (that is, SLID[3:0]). In FIG. 8, a first identifier VLD of 0 is an invalid flag, and a first identifier VLD of 1 is a valid flag. A second identifier HOT of 0 is a non-hotspot flag, and a second identifier HOT of 1 is a hotspot flag.
In the initial state of this application scenario, the instructions stored in the storage blocks of the main memory may be cached in the 16 Slice caches according to the foregoing mapping rule. Then, while to-be-obtained instructions are being obtained in this scenario, the foregoing bandwidth equalization method is used to adjust where the hotspot instructions of a hotspot Slice cache are placed among the 16 Slice caches. The hotspot instructions of the hotspot Slice cache are thereby redirected, which balances the load.
Specifically, bandwidth equalization proceeds as follows: REDIR1 detects the access frequency of each of Slice cache0 to Slice cache15; it then determines a cold Slice cache and a hotspot Slice cache among them according to these access frequencies; next, it determines a hotspot instruction in the hotspot Slice cache; finally, it records the PC[17:3] of the hotspot instruction and the SLID[3:0] of the cold Slice cache in a target entry of the Redirect Table. It should be noted that the principle of bandwidth equalization has been described above, so details are not repeated here.
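The four steps of bandwidth equalization might be sketched as below. This is a loose illustration only, and every selection criterion in it is an assumption: the hotspot Slice cache is taken to be the most accessed one, the cold Slice cache the least accessed one, the hotspot instruction the most accessed PC[17:3] in the hotspot Slice cache, and the target entry the first entry whose VLD is 0, with VLD and HOT both set to 1 when the entry is written. The function name, the counter inputs, and the entry field names are hypothetical.

```python
from collections import Counter


def rebalance(slice_hits, pc_hits_by_slice, redirect_table):
    """Record one hotspot instruction in the Redirect Table; return the entry or None.

    slice_hits: access count per Slice cache (list indexed by SLID).
    pc_hits_by_slice: dict mapping SLID to a Counter keyed by PC[17:3].
    redirect_table: list of dicts with keys vld, hot, pc_17_3, slid.
    """
    slids = range(len(slice_hits))
    hot = max(slids, key=lambda s: slice_hits[s])    # assumed: most accessed
    cold = min(slids, key=lambda s: slice_hits[s])   # assumed: least accessed
    if hot == cold or not pc_hits_by_slice.get(hot):
        return None  # nothing to balance, or no per-instruction statistics
    hot_pc_17_3, _ = pc_hits_by_slice[hot].most_common(1)[0]
    for entry in redirect_table:
        if entry["vld"] == 0:  # assumed: the first invalid entry is the target entry
            entry.update(vld=1, hot=1, pc_17_3=hot_pc_17_3, slid=cold)
            return entry
    return None  # no free entry
```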
As shown in FIG. 10 to FIG. 12, the process of obtaining the to-be-obtained instruction may include the following steps:
Step 1001: The IBUF receives a first request and buffers it in the IFIFO. The first request is a request to obtain the to-be-obtained instruction and carries second index information, that is, the PC[17:0] of the to-be-obtained instruction. It should be noted that the description here takes a first request that includes one piece of second index information as an example.
Step 1002: If the IFIFO is not empty and at least one processor core has completed its previous instruction fetch request, REDIR0 reads the first request from the IFIFO and binds a corresponding execution thread to it. It should be noted that execution threads correspond one-to-one to processor cores; because there are 16 processor cores, there are also 16 execution threads. The bound execution thread is the one corresponding to a processor core that has completed its previous instruction fetch request. In addition, if several execution threads can be bound, that is, several processor cores have completed their previous instruction fetch requests, the first request is assigned according to the Inst Q depth of the execution threads to the execution thread with the shallowest Inst Q.
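The binding rule of step 1002 can be illustrated with a small Python sketch. It is a sketch under assumptions: the function name and its inputs are hypothetical, the IFIFO is modeled as a `collections.deque`, and ties in Inst Q depth are broken by the lowest core number because the text does not say how ties are resolved.

```python
from collections import deque


def bind_thread(ififo: deque, core_done, inst_q_depth):
    """Return (request, thread_id), or None if nothing can be bound yet.

    core_done[i] is True when processor core i has completed its previous fetch;
    inst_q_depth[i] is the Inst Q depth of execution thread i.
    """
    if not ififo:
        return None  # the IFIFO is empty
    candidates = [i for i, done in enumerate(core_done) if done]
    if not candidates:
        return None  # no processor core has finished its previous fetch request
    # Several bindable threads: pick the one with the shallowest Inst Q.
    thread_id = min(candidates, key=lambda i: inst_q_depth[i])
    return ififo.popleft(), thread_id
```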
Step 1003: REDIR0 determines the first index information of the to-be-obtained instruction according to its PC[17:0], that is, it takes PC[17:3] of the PC[17:0] as the first index information of the to-be-obtained instruction.
Step 1004: REDIR0 matches the PC[17:3] of the to-be-obtained instruction against the PC[17:3] in each entry of the Redirect Table.
Step 1005: If the PC[17:3] in an entry of the Redirect Table matches the PC[17:3] of the to-be-obtained instruction, REDIR0 determines the Slice cache identified by the SLID[3:0] in the matching entry as the first target Slice cache.
Step 1006: REDIR0 sends the PC[17:0] of the to-be-obtained instruction to the first target Slice cache.
Step 1007: If the PC[17:3] in every entry of the Redirect Table fails to match the PC[17:3] of the to-be-obtained instruction, REDIR0 determines the Slice cache corresponding to the to-be-obtained instruction according to its PC[6:3], and determines that Slice cache as the first target Slice cache.
Step 1008: REDIR0 sends the PC[17:0] of the to-be-obtained instruction to the first target Slice cache.
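Steps 1003 to 1008 amount to a table lookup with a fallback to the default mapping, which might be sketched in Python as follows. The names are hypothetical, entries are modeled as dictionaries with assumed keys `vld`, `pc_17_3`, and `slid`, and skipping entries whose VLD is 0 is an assumption based on VLD being described as a validity flag.

```python
def first_index(pc: int) -> int:
    """First index information: PC[17:3] of the 18-bit address."""
    return (pc >> 3) & 0x7FFF


def select_target_slice(pc: int, redirect_table) -> int:
    """Return the SLID of the first target Slice cache for this request."""
    key = first_index(pc)
    for entry in redirect_table:
        # Only valid entries are matched (assumption of this sketch).
        if entry["vld"] == 1 and entry["pc_17_3"] == key:
            return entry["slid"]   # redirected to the recorded cold Slice cache
    return (pc >> 3) & 0xF         # no match: default mapping by PC[6:3]
```

In this sketch a redirected address is served by the Slice cache recorded in the entry, while all other addresses keep their zigzag placement.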
Step 1009: The first target Slice cache receives the PC[17:0] of the to-be-obtained instruction and determines a target row in the TAG Table according to PC[17:13] (the index bits) of the instruction. It matches PC[12:3] (the tag bits) of the instruction against the TAG in each way of the target row; if a match is found, the way whose TAG matches PC[12:3] is determined as the target way. It then determines the cacheline corresponding to the to-be-obtained instruction in the Cache data according to PC[17:13] and the id of the target way, and obtains the to-be-obtained instruction from that cacheline.
Step 1010: The first target Slice cache sends the obtained instruction to the crossbar.
Step 1011: The crossbar sends the to-be-obtained instruction to the Processor Core corresponding to the execution thread bound to the first request.
Step 1012: The first target Slice cache determines, according to the EI flag in the to-be-obtained instruction, whether the instruction has been completely obtained.
Step 1013: If the EI flag is 1, the first target Slice cache determines that the to-be-obtained instruction has been completely obtained, and the flow jumps to step 1021.
Step 1014: If the EI flag is 0, the first target Slice cache determines that the to-be-obtained instruction has not been completely obtained, and derives third index information from the PC[17:0] of the to-be-obtained instruction and the storage interval. Because the physical addresses of adjacent instruction segments are 8 apart, the storage interval is 8, that is, the third index information is the binary sum of the PC[17:0] of the to-be-obtained instruction and 8.
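The computation in steps 1012 to 1014 is a single addition, shown here as a hedged Python sketch. The function name is hypothetical, and the sketch does not model what happens if the sum exceeds 18 bits, since the text does not address that case.

```python
STORAGE_INTERVAL = 8  # adjacent instruction segments are 8 addresses apart


def third_index(pc: int, ei_flag: int):
    """Return the address of the next instruction segment, or None when EI is 1."""
    if ei_flag == 1:
        return None  # the instruction has been completely obtained
    return pc + STORAGE_INTERVAL
```

Adding 8 increments PC[6:3] by one modulo 16, so under the default mapping the next segment belongs to the next Slice cache; for example, a segment at address 120 (Slice cache15) is followed by one at address 128 (Slice cache0).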
Step 1015: The first target Slice cache sends the third index information to REDIR1.
Step 1016: REDIR1 receives the third index information sent by the first target Slice cache and matches the PC[17:3] of the third index information against the PC[17:3] in each entry of the Redirect Table.
Step 1017: If the PC[17:3] in an entry matches the PC[17:3] of the third index information, REDIR1 determines the Slice cache identified by the SLID[3:0] in the matching entry as the second target Slice cache.
Step 1018: REDIR1 sends the third index information to the second target Slice cache.
Step 1019: If the PC[17:3] in every entry fails to match the PC[17:3] of the third index information, REDIR1 determines the Slice cache corresponding to the third index information according to the PC[6:3] of the third index information, and determines that Slice cache as the second target Slice cache.
Step 1020: REDIR1 sends the third index information to the second target Slice cache.
It should be noted that the way in which the second target Slice cache receives the third index information and obtains the to-be-obtained instruction according to it, and the subsequent processing, follow the same principle as step 1009 and its subsequent steps, so details are not repeated here.
Step 1021: If the EI flag is 1, REDIR1 may read a new request and process it. Because the first request here includes only one piece of second index information, an EI flag of 1 means that a new request is read and processed accordingly.
Step 1022: In step 1009, if the row of the TAG Table corresponding to PC[17:13] of the instruction to be fetched contains no way corresponding to PC[12:3] of the instruction to be fetched, the first target Slice cache generates a Refill request according to PC[17:0] of the instruction to be fetched.
Step 1023: The first target Slice cache sends the Refill request to the main memory.
Step 1024: The main memory fetches the instruction to be fetched according to PC[17:0] carried in the Refill request, generates response information according to the instruction and its PC[17:0], and sends the instruction through the crossbar to the Processor Core corresponding to the execution thread bound to the first request.
Step 1025: The main memory sends the response information to REDIR1.
Step 1026: REDIR1 matches PC[17:3] of the instruction to be fetched, as carried in the response information, against the PC[17:3] stored in each entry of the redirection data table.
Step 1027: If PC[17:3] of the instruction to be fetched matches the PC[17:3] in an entry of the redirection data table, REDIR1 determines the Slice cache corresponding to SLID[3:0] in the matching entry as the third target Slice cache.
Step 1028: REDIR1 sends the response information to the third target Slice cache.
Step 1029: If PC[17:3] of the instruction to be fetched matches the PC[17:3] of none of the entries in the redirection data table, REDIR1 determines the Slice cache corresponding to PC[6:3] of the instruction to be fetched as the third target Slice cache.
Step 1030: REDIR1 sends the response information to the third target Slice cache.
Step 1031: The third target Slice cache receives the response information and stores the instruction carried in it in a cacheline of the Cache data in the third target Slice cache. It then uses PC[17:0] of the instruction, as carried in the response information, to update the information in the way of the TAG Table that corresponds to that cacheline: VLD is set to 1, TAG is set to PC[12:3] of the instruction, lock is set to 1, and dirty is set to 1.
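The TAG Table update of step 1031 can be sketched as follows. This is a minimal illustration, not the implementation of this application: the helper names `bits` and `apply_refill`, the dictionary layout of the TAG Table, the number of ways, and the rule for choosing a way (first way with VLD equal to 0, otherwise way 0) are all assumptions made for the example. Only the bit fields PC[17:13] and PC[12:3] and the four flag values come from the steps above.

```python
def bits(value, hi, lo):
    """Return the inclusive bit field value[hi:lo], as in the PC[17:3] notation."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def apply_refill(tag_table, cache_data, pc, instruction, num_ways=4):
    """Store a refilled instruction and update its way in the TAG Table.

    tag_table:  dict mapping row (PC[17:13]) to a list of way dicts (hypothetical layout)
    cache_data: dict mapping (row, way) to the cached instruction (hypothetical layout)
    """
    row = bits(pc, 17, 13)
    ways = tag_table.setdefault(
        row, [dict(VLD=0, TAG=0, lock=0, dirty=0) for _ in range(num_ways)]
    )
    # Assumption: pick the first invalid way; fall back to way 0 if all are valid.
    way = next((i for i, w in enumerate(ways) if w["VLD"] == 0), 0)
    cache_data[(row, way)] = instruction
    # Flag values as listed in step 1031.
    ways[way].update(VLD=1, TAG=bits(pc, 12, 3), lock=1, dirty=1)
    return row, way


tag_table, cache_data = {}, {}
first = apply_refill(tag_table, cache_data, 0x1A2B8, "insn-A")
second = apply_refill(tag_table, cache_data, 0x1A2C0, "insn-B")
```

With the sample addresses above, both instructions fall in the same row and occupy consecutive ways.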
FIG. 13 is a schematic structural diagram of a bandwidth equalization apparatus provided by an embodiment of this application. As shown in FIG. 13, the apparatus 1300 may include a first monitoring module 1301, a first determining module 1302, a second determining module 1303, and a recording module 1304. The first monitoring module 1301 is configured to monitor the access frequency of a plurality of cache memories. The first determining module 1302 is configured to determine a cold spot cache memory and a hot spot cache memory among the plurality of cache memories. The second determining module 1303 is configured to determine hot spot information in the hot spot cache memory. The recording module 1304 is configured to record the first index information of the hot spot information and the identification information of the cold spot cache memory in a target entry, where the target entry is an entry in the redirection data table.
In a possible implementation, the first monitoring module 1301 is specifically configured to, in response to a frequency monitoring instruction, monitor the access frequency of the plurality of cache memories in each monitoring period, based on a monitoring period.
In a possible implementation, the first determining module 1302 is specifically configured to, according to the access frequency of each cache memory, determine the cache memory with the highest access frequency as the hot spot cache memory and the cache memory with the lowest access frequency as the cold spot cache memory; or, according to the access frequency of each cache memory, determine a cache memory whose access frequency is greater than a first preset frequency as the hot spot cache memory and a cache memory whose access frequency is less than a second preset frequency as the cold spot cache memory, where the first preset frequency is greater than the second preset frequency.
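For illustration only, the two alternatives handled by the first determining module 1302 can be sketched in Python. The function names, the dictionary of per-cache access counts, and the sample values are hypothetical and do not appear in this application.

```python
def pick_by_extremes(freq):
    """Alternative 1: highest access frequency is hot, lowest is cold.

    freq: dict mapping a cache identifier to its access count in one period.
    """
    hot = max(freq, key=freq.get)
    cold = min(freq, key=freq.get)
    return hot, cold


def pick_by_thresholds(freq, first_preset, second_preset):
    """Alternative 2: compare each cache against two preset frequencies."""
    if first_preset <= second_preset:
        raise ValueError("first preset frequency must exceed the second")
    hot = [c for c, f in freq.items() if f > first_preset]
    cold = [c for c, f in freq.items() if f < second_preset]
    return hot, cold


freq = {0: 120, 1: 15, 2: 60}               # made-up counts for three caches
extremes = pick_by_extremes(freq)            # (0, 1)
thresholds = pick_by_thresholds(freq, 100, 20)  # ([0], [1])
```

The threshold variant may return several hot or cold caches, or none, whereas the extremes variant always returns exactly one of each.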
In a possible implementation, the second determining module 1303 is specifically configured to judge whether the access frequency of the hot spot cache memory reaches the frequency configured in a register and whether the difference between the access frequency of the hot spot cache memory and the access frequency of the cold spot cache memory is greater than a configured value; and if so, determine the hot spot information in the hot spot cache memory.
In a possible implementation, the second determining module 1303 is specifically configured to judge whether the access frequency of the hot spot cache memory is greater than n times the access frequency of the cold spot cache memory; and if so, determine the hot spot information in the hot spot cache memory.
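The two trigger conditions above can be written as simple predicates. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and "reaches" is read here as greater than or equal to.

```python
def trigger_by_gap(hot_freq, cold_freq, register_freq, configured_gap):
    """Hot cache reaches the register-configured frequency AND the
    hot/cold difference exceeds the configured value."""
    return hot_freq >= register_freq and (hot_freq - cold_freq) > configured_gap


def trigger_by_ratio(hot_freq, cold_freq, n):
    """Hot cache is accessed more than n times as often as the cold cache."""
    return hot_freq > n * cold_freq


gap_hit = trigger_by_gap(hot_freq=120, cold_freq=15, register_freq=100, configured_gap=50)
ratio_hit = trigger_by_ratio(hot_freq=120, cold_freq=15, n=4)
ratio_miss = trigger_by_ratio(hot_freq=40, cold_freq=15, n=4)
```

Only when a predicate holds would hot spot information be determined in the hot spot cache memory; otherwise the redirection data table is left unchanged for that period.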
In a possible implementation, the second determining module 1303 is specifically configured to monitor the access frequency of each cache line in the hot spot cache memory; determine a hot spot cache line in the hot spot cache memory according to the access frequency of each cache line; and determine the information stored in the hot spot cache line as the hot spot information.
In a possible implementation, the redirection data table includes a plurality of entries, and each entry includes a first identifier and a second identifier. The first identifier is a valid flag or an invalid flag, and the second identifier is a hot spot flag or a non-hot-spot flag.
In a possible implementation, the recording module 1304 is specifically configured to determine candidate entries among the plurality of entries according to the first identifier and the second identifier of each entry, where the candidate entries include the entries whose first identifier is the invalid flag and the entries whose second identifier is the non-hot-spot flag; determine the target entry among the candidate entries; and record the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry.
In a possible implementation, the apparatus further includes a setting module, configured to set the second identifier in the target entry to the hot spot flag and to set the first identifier in the target entry to the valid flag.
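The behaviour of the recording module and the setting module can be sketched together as follows. The `Entry` class and its field names are hypothetical. The text above does not say how the target entry is chosen among the candidates, so the preference for invalid entries over valid non-hot-spot entries is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    valid: bool = False   # first identifier: valid / invalid flag
    hot: bool = False     # second identifier: hot spot / non-hot-spot flag
    index: int = 0        # first index information of the hot spot information
    slid: int = 0         # identification information of the cold spot cache


def record_hot_info(table, first_index, cold_slid):
    """Record (first_index, cold_slid) in a target entry, or return None if full."""
    # Candidates: invalid entries plus entries not marked as hot spots.
    candidates = [e for e in table if not e.valid or not e.hot]
    if not candidates:
        return None
    # Assumption: prefer an invalid entry (False sorts before True).
    target = min(candidates, key=lambda e: e.valid)
    target.index = first_index
    target.slid = cold_slid
    # Setting module: mark the target entry as hot and valid.
    target.hot = True
    target.valid = True
    return target


table = [Entry(True, True, 1, 2), Entry(True, False, 3, 4), Entry()]
t1 = record_hot_info(table, 0x3457, 7)   # takes the invalid entry
t2 = record_hot_info(table, 0x0123, 8)   # takes the valid, non-hot entry
t3 = record_hot_info(table, 0x0456, 9)   # no candidate left
```

Once every entry is both valid and marked hot, no candidate remains until the second identifier of some entry is changed back to the non-hot-spot flag.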
In a possible implementation, the apparatus further includes: a second monitoring module, configured to monitor the access frequency of each entry in the redirection data table; a first judging module, configured to judge whether the access frequency of a first entry is less than a third preset frequency, where the first entry is an entry whose second identifier is the hot spot flag; a first modifying module, configured to change the second identifier of a first entry whose access frequency is less than the third preset frequency to the non-hot-spot flag; a second judging module, configured to judge whether the access frequency of a second entry is greater than a fourth preset frequency, where the second entry is an entry whose second identifier is the non-hot-spot flag; and a second modifying module, configured to change the second identifier of a second entry whose access frequency is greater than the fourth preset frequency to the hot spot flag; where the fourth preset frequency is greater than the third preset frequency.
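Because the fourth preset frequency is greater than the third, the two thresholds form a hysteresis band: an entry whose access frequency lies between them keeps its current flag. A minimal sketch follows, assuming each entry is a dictionary with hypothetical keys `hot` and `freq`.

```python
def update_hot_flags(entries, third_preset, fourth_preset):
    """Demote rarely used hot entries and promote frequently used non-hot ones."""
    if fourth_preset <= third_preset:
        raise ValueError("fourth preset frequency must exceed the third")
    for e in entries:
        if e["hot"] and e["freq"] < third_preset:
            e["hot"] = False          # first modifying module
        elif not e["hot"] and e["freq"] > fourth_preset:
            e["hot"] = True           # second modifying module
    return entries


entries = [
    {"hot": True, "freq": 2},    # below the third preset: demoted
    {"hot": False, "freq": 50},  # above the fourth preset: promoted
    {"hot": True, "freq": 20},   # inside the band: stays hot
    {"hot": False, "freq": 20},  # inside the band: stays non-hot
]
update_hot_flags(entries, third_preset=10, fourth_preset=30)
flags = [e["hot"] for e in entries]
```

The gap between the two thresholds keeps an entry whose frequency hovers near a single cut-off from flipping its flag in every monitoring period.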
In a possible implementation, the apparatus further includes a filling module, configured to fill the hot spot information into a cold spot cache line in the cold spot cache memory.
In a possible implementation, the apparatus further includes a third determining module, configured to determine any cache line in the cold spot cache memory as the cold spot cache line; or determine a cache line in the cold spot cache memory whose access frequency is less than a fifth preset frequency as the cold spot cache line; or determine the cache line with the lowest access frequency in the cold spot cache memory as the cold spot cache line.
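The three alternatives for choosing the cold spot cache line can be sketched as one function with a policy switch. The function name, the policy strings, and the per-line frequency dictionary are hypothetical; for the threshold policy the example simply returns the first qualifying line, which is an assumption.

```python
import random


def pick_cold_line(line_freq, policy="min", fifth_preset=None, rng=random):
    """Choose a cold spot cache line inside the cold spot cache memory.

    line_freq: dict mapping a cache line number to its access count.
    """
    if policy == "any":
        return rng.choice(list(line_freq))
    if policy == "threshold":
        below = [ln for ln, f in line_freq.items() if f < fifth_preset]
        return below[0] if below else None
    if policy == "min":
        return min(line_freq, key=line_freq.get)
    raise ValueError(f"unknown policy: {policy}")


line_freq = {0: 9, 1: 2, 2: 5}
by_min = pick_cold_line(line_freq, "min")
by_threshold = pick_cold_line(line_freq, "threshold", fifth_preset=6)
by_any = pick_cold_line(line_freq, "any")
```

The "any" policy needs no per-line counters at all, while the other two require the access frequency of each cache line to be tracked.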
In a possible implementation, the apparatus further includes: a reading module, configured to read a first request, where the first request is a request for obtaining information to be obtained and carries second index information, and the second index information is index information used to obtain the information to be obtained; a fourth determining module, configured to determine the first index information of the information to be obtained according to the second index information; a first matching module, configured to match the first index information of the information to be obtained against the first index information in each entry of the redirection data table; a fifth determining module, configured to, if the first index information in one of the entries matches the first index information of the information to be obtained, determine the cache memory corresponding to the cache memory identification information in the matching entry as a first target cache memory; and a first sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
In a possible implementation, the apparatus further includes: a sixth determining module, configured to, if the first index information in none of the entries matches the first index information of the information to be obtained, determine the first target cache memory according to the second index information and a mapping rule; and a second sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
In a possible implementation, the apparatus further includes: a receiving module, configured to receive third index information sent by the first target cache memory, where the third index information is calculated from the second index information and a storage interval, and is generated by the first target cache memory when it determines, according to the end identifier in the information to be obtained, that obtaining the information to be obtained has not been completed; a seventh determining module, configured to determine the first index information of the information to be obtained according to the third index information; a second matching module, configured to match the first index information of the information to be obtained against the first index information in each entry of the redirection data table; an eighth determining module, configured to, if the first index information in one of the entries matches the first index information of the information to be obtained, determine the cache memory corresponding to the cache memory identification information in the matching entry as a second target cache memory; and a third sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
In a possible implementation, the apparatus further includes: a ninth determining module, configured to, if the first index information in none of the entries matches the first index information of the information to be obtained, determine the second target cache memory according to the third index information and the mapping rule; and a fourth sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
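The request flow handled by the reading, matching, determining, sending and receiving modules can be sketched as a loop. This is an illustrative model only. It borrows the bit fields of the embodiment described earlier (first index information taken as PC[17:3], default mapping rule taken as PC[6:3]) and assumes that the third index information is the second index information plus the storage interval. The redirection data table is modelled as a dictionary, each slice as a dictionary from index to `(data, end_flag)`, and all names are hypothetical.

```python
def resolve_slice(index, redirect_table, mapping_rule):
    """Return the target cache for an index: redirect entry if matched, else mapping rule."""
    first_index = (index >> 3) & 0x7FFF      # assumption: PC[17:3]
    slid = redirect_table.get(first_index)
    return slid if slid is not None else mapping_rule(index)


def fetch(second_index, redirect_table, mapping_rule, slices, storage_interval, max_hops=16):
    """Follow a request until a slice returns data carrying the end identifier."""
    visited, data = [], []
    index = second_index
    for _ in range(max_hops):                # guard against a missing end identifier
        slid = resolve_slice(index, redirect_table, mapping_rule)
        visited.append(slid)
        chunk, end_flag = slices[slid][index]
        data.append(chunk)
        if end_flag:
            return visited, data
        index += storage_interval            # assumption: third index = index + interval
    raise RuntimeError("end identifier not seen")


rule = lambda idx: (idx >> 3) & 0xF          # assumption: PC[6:3] selects the slice
table = {(0x108 >> 3) & 0x7FFF: 9}           # one redirected hot spot
slices = {0: {0x100: ("a", False)}, 9: {0x108: ("b", True)}}
result = fetch(0x100, table, rule, slices, storage_interval=8)
```

In this example the first access falls through to the mapping rule, while the follow-up access generated from the storage interval hits the redirection data table and is served by the cold spot cache.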
The above apparatus of this application can be used to execute the technical solutions of the method embodiments shown in FIGS. 2 to 7. Its implementation principles and technical effects are similar and are not repeated here.
This application further provides a computer-readable storage medium including a computer program which, when executed on a computer, causes the computer to perform any one of the methods in FIGS. 2 to 7.
This application further provides a computer program which, when executed by a computer, is used to perform any one of the methods in FIGS. 2 to 7.
This application further provides a chip including a processor and a memory, where the memory is configured to store a computer program, and the processor is configured to call and run the computer program stored in the memory to perform any one of the methods in FIGS. 2 to 7.
Further, the chip may also include a memory and a communication interface. The communication interface may be an input/output interface, a pin, an input/output circuit, or the like.
During implementation, the steps of the foregoing method embodiments may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in the embodiments of this application may be executed directly by a hardware encoding processor, or by a combination of hardware and software modules in an encoding processor. The software module may reside in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the foregoing methods in combination with its hardware.
The memory mentioned in the foregoing embodiments may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (programmable ROM, PROM), an erasable programmable read-only memory (erasable PROM, EPROM), an electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM). It should be noted that the memories of the systems and methods described herein are intended to include, without being limited to, these and any other suitable types of memory.
A person of ordinary skill in the art may appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered to go beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (35)

  1. A bandwidth equalization method, characterized by comprising:
    monitoring the access frequency of a plurality of cache memories;
    determining a cold spot cache memory and a hot spot cache memory among the plurality of cache memories;
    determining hot spot information in the hot spot cache memory; and
    recording first index information of the hot spot information and identification information of the cold spot cache memory in a target entry, wherein the target entry is an entry in a redirection data table.
  2. The method according to claim 1, wherein monitoring the access frequency of the plurality of cache memories comprises:
    in response to a frequency monitoring instruction, monitoring, based on a monitoring period, the access frequency of the plurality of cache memories in each monitoring period.
  3. The method according to claim 1, wherein determining the cold spot cache memory and the hot spot cache memory among the plurality of cache memories comprises:
    according to the access frequency of each cache memory, determining the cache memory with the highest access frequency as the hot spot cache memory, and determining the cache memory with the lowest access frequency as the cold spot cache memory; or
    according to the access frequency of each cache memory, determining a cache memory whose access frequency is greater than a first preset frequency as the hot spot cache memory, and determining a cache memory whose access frequency is less than a second preset frequency as the cold spot cache memory, wherein the first preset frequency is greater than the second preset frequency.
  4. The method according to claim 1, wherein determining the hot spot information in the hot spot cache memory comprises:
    judging whether the access frequency of the hot spot cache memory reaches a frequency configured in a register and whether the difference between the access frequency of the hot spot cache memory and the access frequency of the cold spot cache memory is greater than a configured value; and
    if so, determining the hot spot information in the hot spot cache memory.
  5. The method according to claim 1, wherein determining the hot spot information in the hot spot cache memory comprises:
    judging whether the access frequency of the hot spot cache memory is greater than n times the access frequency of the cold spot cache memory; and
    if so, determining the hot spot information in the hot spot cache memory.
  6. The method according to any one of claims 1 to 5, wherein determining the hot spot information in the hot spot cache memory comprises:
    monitoring the access frequency of each cache line in the hot spot cache memory;
    determining a hot spot cache line in the hot spot cache memory according to the access frequency of each cache line; and
    determining the information stored in the hot spot cache line as the hot spot information.
  7. The method according to any one of claims 1 to 5, wherein the redirection data table comprises a plurality of entries, each entry comprises a first identifier and a second identifier, the first identifier is a valid flag or an invalid flag, and the second identifier is a hot spot flag or a non-hot-spot flag.
  8. The method according to claim 7, wherein recording the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry comprises:
    determining candidate entries among the plurality of entries according to the first identifier and the second identifier of each of the plurality of entries, wherein the candidate entries comprise the entries whose first identifier is the invalid flag and the entries whose second identifier is the non-hot-spot flag;
    determining the target entry among the candidate entries; and
    recording the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry.
  9. The method according to claim 8, wherein after recording the first index information of the hot spot information and the identification information of the cold spot cache memory in the target entry, the method further comprises:
    setting the second identifier in the target entry to the hot spot flag, and setting the first identifier in the target entry to the valid flag.
  10. The method according to claim 7, wherein the method further comprises:
    monitoring the access frequency of each entry in the redirection data table;
    judging whether the access frequency of a first entry is less than a third preset frequency, wherein the first entry is an entry whose second identifier is the hot spot flag;
    changing the second identifier of the first entry whose access frequency is less than the third preset frequency to the non-hot-spot flag;
    judging whether the access frequency of a second entry is greater than a fourth preset frequency, wherein the second entry is an entry whose second identifier is the non-hot-spot flag; and
    changing the second identifier of the second entry whose access frequency is greater than the fourth preset frequency to the hot spot flag;
    wherein the fourth preset frequency is greater than the third preset frequency.
  11. The method according to any one of claims 1 to 10, wherein the method further comprises:
    filling the hot spot information into a cold spot cache line in the cold spot cache memory.
  12. The method according to claim 11, wherein the method further comprises:
    determining any cache line in the cold spot cache memory as the cold spot cache line; or
    determining a cache line in the cold spot cache memory whose access frequency is less than a fifth preset frequency as the cold spot cache line; or
    determining the cache line with the lowest access frequency in the cold spot cache memory as the cold spot cache line.
  13. The method according to any one of claims 1 to 12, wherein the method further comprises:
    reading a first request, wherein the first request is a request to obtain information to be obtained, the first request carries second index information, and the second index information is index information used to obtain the information to be obtained;
    determining first index information of the information to be obtained according to the second index information;
    matching the first index information of the information to be obtained against the first index information in each entry of the redirection data table;
    if the first index information in one of the entries matches the first index information of the information to be obtained, determining, as a first target cache memory, the cache memory that corresponds to the cache memory identification information in the matching entry;
    sending the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
  14. The method according to claim 13, wherein the method further comprises:
    if the first index information in every entry fails to match the first index information of the information to be obtained, determining the first target cache memory according to the second index information and a mapping rule;
    sending the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
  15. The method according to claim 13 or 14, wherein the method further comprises:
    receiving third index information sent by the first target cache memory, wherein the third index information is calculated from the second index information and a storage interval, and is generated by the first target cache memory when it determines, according to an end identifier in the information to be obtained, that obtaining the information to be obtained has not been completed;
    determining the first index information of the information to be obtained according to the third index information;
    matching the first index information of the information to be obtained against the first index information in each entry of the redirection data table;
    if the first index information in one of the entries matches the first index information of the information to be obtained, determining, as a second target cache memory, the cache memory that corresponds to the cache memory identification information in the matching entry;
    sending the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
  16. The method according to claim 15, wherein the method further comprises:
    if the first index information in every entry fails to match the first index information of the information to be obtained, determining the second target cache memory according to the third index information and the mapping rule;
    sending the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
  17. A bandwidth equalization apparatus, comprising:
    a first monitoring module, configured to monitor the access frequency of a plurality of cache memories;
    a first determining module, configured to determine a cold-spot cache memory and a hotspot cache memory among the plurality of cache memories;
    a second determining module, configured to determine hotspot information in the hotspot cache memory;
    a recording module, configured to record first index information of the hotspot information and identification information of the cold-spot cache memory in a target entry, wherein the target entry is an entry in a redirection data table.
  18. The apparatus according to claim 17, wherein the first monitoring module is specifically configured to, in response to a frequency monitoring instruction, monitor, based on a monitoring period, the access frequency of the plurality of cache memories in each monitoring period.
  19. The apparatus according to claim 17, wherein the first determining module is specifically configured to: according to the access frequency of each cache memory, determine the cache memory with the highest access frequency as the hotspot cache memory and the cache memory with the lowest access frequency as the cold-spot cache memory; or, according to the access frequency of each cache memory, determine a cache memory whose access frequency is greater than a first preset frequency as the hotspot cache memory and a cache memory whose access frequency is less than a second preset frequency as the cold-spot cache memory, wherein the first preset frequency is greater than the second preset frequency.
  20. The apparatus according to claim 17, wherein the second determining module is specifically configured to: determine whether the access frequency of the hotspot cache memory reaches a frequency configured in a register and whether the difference between the access frequency of the hotspot cache memory and the access frequency of the cold-spot cache memory is greater than a configured value; and if so, determine the hotspot information in the hotspot cache memory.
  21. The apparatus according to claim 17, wherein the second determining module is specifically configured to: determine whether the access frequency of the hotspot cache memory is greater than n times the access frequency of the cold-spot cache memory; and if so, determine the hotspot information in the hotspot cache memory.
  22. The apparatus according to any one of claims 17 to 21, wherein the second determining module is specifically configured to: monitor the access frequency of each cache line in the hotspot cache memory; determine a hotspot cache line in the hotspot cache memory according to the access frequency of each cache line; and determine the information stored in the hotspot cache line as the hotspot information.
  23. The apparatus according to any one of claims 17 to 21, wherein the redirection data table comprises a plurality of entries, each entry comprises a first identifier and a second identifier, the first identifier is a valid flag or an invalid flag, and the second identifier is a hotspot flag or a non-hotspot flag.
  24. The apparatus according to claim 23, wherein the recording module is specifically configured to: determine candidate entries among the plurality of entries according to the first identifier and the second identifier of each of the plurality of entries, wherein the candidate entries comprise, among the plurality of entries, the entries whose first identifier is the invalid flag and the entries whose second identifier is the non-hotspot flag; determine the target entry among the candidate entries; and record the first index information of the hotspot information and the identification information of the cold-spot cache memory in the target entry.
  25. The apparatus according to claim 24, further comprising:
    a setting module, configured to set the second identifier in the target entry to the hotspot flag and set the first identifier in the target entry to the valid flag.
  26. The apparatus according to claim 23, further comprising:
    a second monitoring module, configured to monitor the access frequency of each entry in the redirection data table;
    a first judging module, configured to determine whether the access frequency of a first entry is less than a third preset frequency, wherein the first entry is an entry whose second identifier is the hotspot flag;
    a first modification module, configured to change the second identifier of each first entry whose access frequency is less than the third preset frequency to the non-hotspot flag;
    a second judging module, configured to determine whether the access frequency of a second entry is greater than a fourth preset frequency, wherein the second entry is an entry whose second identifier is the non-hotspot flag;
    a second modification module, configured to change the second identifier of each second entry whose access frequency is greater than the fourth preset frequency to the hotspot flag;
    wherein the fourth preset frequency is greater than the third preset frequency.
  27. The apparatus according to any one of claims 17 to 26, further comprising:
    a filling module, configured to fill the hotspot information into a cold-spot cache line in the cold-spot cache memory.
  28. The apparatus according to claim 27, further comprising:
    a third determining module, configured to determine any one cache line in the cold-spot cache memory as the cold-spot cache line; or determine a cache line in the cold-spot cache memory whose access frequency is less than a fifth preset frequency as the cold-spot cache line; or determine the cache line in the cold-spot cache memory with the lowest access frequency as the cold-spot cache line.
  29. The apparatus according to any one of claims 17 to 28, further comprising:
    a reading module, configured to read a first request, wherein the first request is a request to obtain information to be obtained, the first request carries second index information, and the second index information is index information used to obtain the information to be obtained;
    a fourth determining module, configured to determine first index information of the information to be obtained according to the second index information;
    a first matching module, configured to match the first index information of the information to be obtained against the first index information in each entry of the redirection data table;
    a fifth determining module, configured to, if the first index information in one of the entries matches the first index information of the information to be obtained, determine, as a first target cache memory, the cache memory that corresponds to the cache memory identification information in the matching entry;
    a first sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
  30. The apparatus according to claim 29, further comprising:
    a sixth determining module, configured to, if the first index information in every entry fails to match the first index information of the information to be obtained, determine the first target cache memory according to the second index information and a mapping rule;
    a second sending module, configured to send the second index information to the first target cache memory, so that the first target cache memory obtains the information to be obtained according to the second index information.
  31. The apparatus according to claim 29 or 30, further comprising:
    a receiving module, configured to receive third index information sent by the first target cache memory, wherein the third index information is calculated from the second index information and a storage interval, and is generated by the first target cache memory when it determines, according to an end identifier in the information to be obtained, that obtaining the information to be obtained has not been completed;
    a seventh determining module, configured to determine the first index information of the information to be obtained according to the third index information;
    a second matching module, configured to match the first index information of the information to be obtained against the first index information in each entry of the redirection data table;
    an eighth determining module, configured to, if the first index information in one of the entries matches the first index information of the information to be obtained, determine, as a second target cache memory, the cache memory that corresponds to the cache memory identification information in the matching entry;
    a third sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
  32. The apparatus according to claim 31, further comprising:
    a ninth determining module, configured to, if the first index information in every entry fails to match the first index information of the information to be obtained, determine the second target cache memory according to the third index information and the mapping rule;
    a fourth sending module, configured to send the third index information to the second target cache memory, so that the second target cache memory obtains the information to be obtained according to the third index information.
  33. A computer-readable storage medium comprising a computer program which, when executed on a computer, causes the computer to perform the method according to any one of claims 1 to 16.
  34. A computer program which, when executed by a computer, performs the method according to any one of claims 1 to 16.
  35. A chip comprising a processor and a memory, wherein the memory is configured to store a computer program, and the processor is configured to call and run the computer program stored in the memory to perform the method according to any one of claims 1 to 16.
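
For orientation only, the request routing recited in claims 13 to 16 and the flag update recited in claims 10 and 26 can be sketched in Python as follows. This is an illustrative sketch, not the claimed implementation. The claims do not specify how the first index information is derived, what the mapping rule is, or how the third index information is calculated from the storage interval, so the shift-based derivation, the modulo mapping rule, and the additive storage interval below are all assumptions. The names `Entry`, `route`, `next_index` and `refresh_hot_flags` are hypothetical, as are matching only valid entries and resetting the counters at the end of each monitoring period.

```python
from dataclasses import dataclass
from typing import List

NUM_CACHES = 4   # assumed number of cache memories
LINE_SHIFT = 6   # assumed: first index = request index >> 6


@dataclass
class Entry:
    """One entry of the redirection data table (field names are hypothetical)."""
    first_index: int = 0
    cache_id: int = 0
    valid: bool = False   # first identifier: valid / invalid flag
    hot: bool = False     # second identifier: hotspot / non-hotspot flag
    accesses: int = 0     # hits counted in the current monitoring period


def first_index_of(index: int) -> int:
    # Assumed derivation of the first index information from a request index.
    return index >> LINE_SHIFT


def default_mapping(index: int) -> int:
    # Assumed mapping rule used when no table entry matches.
    return first_index_of(index) % NUM_CACHES


def route(table: List[Entry], index: int) -> int:
    """Return the target cache id for a second (or third) index."""
    key = first_index_of(index)
    for entry in table:
        # Assumption: only entries carrying the valid flag can match.
        if entry.valid and entry.first_index == key:
            entry.accesses += 1
            return entry.cache_id      # redirected to the recorded cache
    return default_mapping(index)      # no match: fall back to the mapping rule


def next_index(second_index: int, storage_interval: int) -> int:
    # Assumption: the third index is the second index plus the storage interval.
    return second_index + storage_interval


def refresh_hot_flags(table: List[Entry], third_preset: int, fourth_preset: int) -> None:
    """Demote rarely used hot entries and promote busy non-hot entries."""
    if fourth_preset <= third_preset:
        raise ValueError("fourth preset frequency must exceed the third")
    for entry in table:
        if entry.hot and entry.accesses < third_preset:
            entry.hot = False
        elif not entry.hot and entry.accesses > fourth_preset:
            entry.hot = True
        entry.accesses = 0             # assumed: start a new monitoring period
```

Because the fourth preset frequency is greater than the third, an entry whose access frequency lies between the two thresholds keeps its current flag, which avoids flipping the flag back and forth on small fluctuations.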
PCT/CN2020/080729 2020-03-23 2020-03-23 Bandwidth equalization method and apparatus WO2021189203A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080092629.7A CN114930306A (en) 2020-03-23 2020-03-23 Bandwidth balancing method and device
PCT/CN2020/080729 WO2021189203A1 (en) 2020-03-23 2020-03-23 Bandwidth equalization method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/080729 WO2021189203A1 (en) 2020-03-23 2020-03-23 Bandwidth equalization method and apparatus

Publications (1)

Publication Number Publication Date
WO2021189203A1 true WO2021189203A1 (en) 2021-09-30

Family

ID=77890824

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/080729 WO2021189203A1 (en) 2020-03-23 2020-03-23 Bandwidth equalization method and apparatus

Country Status (2)

Country Link
CN (1) CN114930306A (en)
WO (1) WO2021189203A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023125380A1 (en) * 2021-12-31 2023-07-06 华为技术有限公司 Data management method and corresponding apparatus

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9286219B1 (en) * 2012-09-28 2016-03-15 Emc Corporation System and method for cache management
CN108459972A (en) * 2016-12-12 2018-08-28 中国航空工业集团公司西安航空计算技术研究所 A kind of efficient cache management design method of multichannel solid state disk
CN110140173A (en) * 2017-01-20 2019-08-16 阿姆有限公司 Extend the device and method in the service life of memory
CN110531938A (en) * 2019-09-02 2019-12-03 广东紫晶信息存储技术股份有限公司 A kind of cold and hot data migration method and system based on various dimensions
CN110765035A (en) * 2018-07-25 2020-02-07 爱思开海力士有限公司 Memory system and operating method thereof

Also Published As

Publication number Publication date
CN114930306A (en) 2022-08-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20927069

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20927069

Country of ref document: EP

Kind code of ref document: A1