CN118227446B - Cache performance evaluation method and device, electronic equipment and readable storage medium - Google Patents

Cache performance evaluation method and device, electronic equipment and readable storage medium

Publication number: CN118227446B (application number CN202410634793.6A)
Authority: CN (China)
Prior art keywords: access, cache, request, hit, access request
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh); other versions: CN118227446A
Inventors: 刘宇航, 满洋, 陈泓佚
Current and original assignee: Beijing Open Source Chip Research Institute (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Beijing Open Source Chip Research Institute; priority to CN202410634793.6A; published as CN118227446A, granted and published as CN118227446B
Abstract

An embodiment of the invention provides a cache performance evaluation method and device, electronic equipment, and a readable storage medium, relating to the field of computer technology. The method comprises the following steps: in response to multiple access operations of each access request in a test program, acquiring access statistics of the test program, the access statistics comprising at least the number of times each access request hits in the cache to be evaluated; and evaluating the performance of the cache to be evaluated based on the access statistics and the access mode of each access request. In this way, performance evaluation of the cache is achieved by obtaining the number of times memory-access requests hit in the cache. At the same time, by combining the access statistics with the access mode of each access request, the performance of the cache can be evaluated along the dimensions of the different access modes, realizing a multi-dimensional evaluation and improving the accuracy of the cache performance evaluation.

Description

Cache performance evaluation method and device, electronic equipment and readable storage medium
Technical Field
The present invention relates to the field of computer technology, and in particular to a cache performance evaluation method and device, electronic equipment, and a readable storage medium.
Background
With the development of computer technology, the design requirements on central processing units (Central Processing Unit, CPU) keep increasing, and the performance of a newly designed CPU therefore needs to be evaluated.
The cache is an important component of the CPU and is used for storing data or instructions which are required to be accessed frequently by the CPU, so that the running speed and the running efficiency of the CPU can be improved. The performance of the cache has a large impact on the performance of the CPU, and therefore, how to evaluate the performance of the cache becomes a problem to be solved.
Disclosure of Invention
The embodiment of the invention provides a cache performance evaluation method and device, electronic equipment and a readable storage medium, which address the problem in the prior art of how to evaluate the performance of a cache.
In order to solve the above problems, an embodiment of the present invention discloses a cache performance evaluation method, which includes:
responding to multiple access operations of each access request in a test program, and acquiring access statistical information of the test program; the access statistical information at least comprises hit times of each access request in a cache to be evaluated;
And evaluating the performance of the cache to be evaluated based on the access statistical information and the access mode of each access request.
In another aspect, an embodiment of the present invention discloses a cache performance evaluation apparatus, the apparatus including:
the acquisition module is used for responding to multiple access operations of each access request in the test program and acquiring access statistical information of the test program; the access statistical information at least comprises hit times of each access request in a cache to be evaluated;
And the first evaluation module is used for evaluating the performance of the cache to be evaluated based on the access statistical information and the access modes of the access requests.
In still another aspect, the embodiment of the invention also discloses an electronic device, which comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus; the memory is configured to store executable instructions that cause the processor to perform the cache performance evaluation method described above.
The embodiment of the invention also discloses a readable storage medium; when the instructions in the readable storage medium are executed by a processor of electronic equipment, the electronic equipment is enabled to execute the cache performance evaluation method described above.
The embodiment of the invention also discloses a computer program product containing instructions, which when run on a computer, cause the computer to execute the cache performance evaluation method.
The embodiment of the invention has the following advantages:
The embodiment of the invention provides a cache performance evaluation method, in which access statistics of a test program are acquired in response to multiple access operations of each access request in the test program; the access statistics comprise at least the number of times each access request hits in the cache to be evaluated; and the performance of the cache to be evaluated is evaluated based on the access statistics and the access mode of each access request. In this way, performance evaluation of the cache is achieved by obtaining the number of times memory-access requests hit in the cache. At the same time, by combining the access statistics with the access mode of each access request, the performance of the cache can be evaluated along the dimensions of the different access modes, realizing a multi-dimensional evaluation and improving the accuracy and interpretability of the cache performance evaluation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of steps of an embodiment of a cache performance evaluation method of the present invention;
FIG. 2 is a schematic diagram of the acquisition of hit counts according to the present invention;
FIG. 3 is a schematic diagram of the access mode of the present invention;
FIG. 4 is a graph showing a statistical result of the present invention;
FIG. 5 is a diagram showing still another statistical result of the present invention;
FIG. 6 is a block diagram illustrating an embodiment of a cache performance evaluation apparatus of the present invention;
FIG. 7 is a block diagram of an electronic device for cache performance evaluation, as provided by an example of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "first", "second" and the like in the description and in the claims are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present invention may be implemented in sequences other than those illustrated or described herein; objects identified by "first", "second", etc. are generally of one type, and the number of objects is not limited, e.g., the first object may be one or more. Furthermore, the term "and/or" as used in the specification and claims to describe an association between associated objects means that three relationships may exist; e.g., A and/or B may mean: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the associated objects are in an "or" relationship. The term "plurality" in embodiments of the present invention means two or more, and similar terms are construed accordingly.
Method embodiment
Referring to FIG. 1, a flowchart illustrating steps of an embodiment of a cache performance evaluation method of the present invention may include, in particular, the steps of:
Step 101, responding to multiple access operations of each access request in a test program, and acquiring access statistical information of the test program; the access statistical information at least comprises the hit times of each access request in the cache to be evaluated.
Step 102, evaluating the performance of the cache to be evaluated based on the access statistics information and the access mode of each access request.
It should be noted that, for the above steps 101 to 102, the cache in the present invention may be any cache system design with a caching function, which may belong to a processor to be evaluated. It can be understood that a processor generally includes modules or systems with different functions; to evaluate the processor, the different modules or systems are generally evaluated separately, and the embodiment of the present invention evaluates the cache system. Optionally, the embodiment of the present invention may be applied to a processor to be evaluated, and may also be applied to a processor simulator. For example, gem5 may be used, and the cache to be evaluated may be built on gem5. gem5 is a cycle-accurate processor simulator: the simulation of the processor core is cycle-accurate, and the simulation of the cache system may be performed at a certain frequency, for example once every update period (a tick, i.e., a variable in the simulator indicating how many times per second the processor state is updated). That is, memory-access statistics are obtained every time a clock period elapses.
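A minimal sketch of this per-tick sampling, using a hypothetical stand-in for the simulator (none of the names below are the real gem5 API):

```python
class FakeSimulator:
    """Hypothetical stand-in for a cycle-accurate simulator (e.g. gem5)."""
    def __init__(self):
        self.tick = 0                     # update-period counter
        self.stats = {"cache_hits": 0}    # memory-access statistics

    def step(self):
        """Advance one update period (one tick) and update the counters."""
        self.tick += 1
        self.stats["cache_hits"] += 1     # pretend each cycle produces a hit

sim = FakeSimulator()
samples = []
for _ in range(3):                        # run three update periods
    sim.step()
    samples.append(dict(sim.stats))       # snapshot the statistics each tick
```

Each snapshot captures the statistics once per update period, matching the sampling frequency described above.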
The test program may be pre-built, may be randomly built, or may be built according to a certain test requirement, which is not limited in the embodiment of the present invention. Specifically, the test program may include a plurality of different access requests, where access modes of the access requests may be the same or different. In addition, the memory access request in the embodiment of the present invention refers to a load instruction (load).
Specifically, the embodiment of the invention can pre-construct multiple access requests with different access modes to obtain the test program. The access mode refers to the access pattern of an access request; access patterns may include stride (step) access, indirect access, and the like.
According to the embodiment of the invention, while each access request performs its access operations, the access statistics of the test program can first be acquired. It should be noted that the access statistics comprise at least the number of times each access request hits in the cache to be evaluated. Specifically, a cache is a store between the CPU and the memory that is typically small in capacity but fast in speed. When the processor executes an access request, it often fetches data from the cache first: when the data to be fetched exists in the cache, the access request hits the cache, and the memory does not need to be accessed. Correspondingly, when the data to be fetched does not exist in the cache, the access request misses the cache, and the data must be fetched from the memory. When the CPU reads data from the memory, in addition to the data to be loaded this time, it also prefetches part of the surrounding data into the cache, so that data the CPU reads later is already in the cache, which can effectively improve performance. Further, since fetching data from the cache is more efficient than fetching it from the memory, the more times access requests hit the cache, the more efficiently the processor executes them; that is, the better the performance of the cache, the better the performance of the processor. The embodiment of the invention can therefore evaluate the performance of the cache by acquiring the access statistics of the test program.
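The hit/miss and prefetch behaviour just described can be sketched with a toy cache model; the line size, prefetch policy, and class names are illustrative assumptions, not the patent's design:

```python
LINE_SIZE = 64  # bytes per cache line (assumed)

class SimpleCache:
    """Toy cache: holds line addresses, prefetches the next line on a miss."""
    def __init__(self):
        self.lines = set()   # set of resident line addresses
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        line = addr // LINE_SIZE
        if line in self.lines:
            self.hits += 1
            return True          # served from cache (hit)
        self.misses += 1
        self.lines.add(line)     # load the missing line from memory
        self.lines.add(line + 1) # prefetch the adjacent line
        return False             # served from memory (miss)

cache = SimpleCache()
cache.access(0)    # miss: line 0 loaded, line 1 prefetched
cache.access(64)   # hit: address 64 falls in the prefetched line 1
```

The second access hits precisely because of the prefetch on the first miss, which is the performance effect the paragraph above describes.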
In the embodiment of the invention, one access request corresponds to one static loading instruction, and in the process of executing the test program, one static loading instruction can be executed for a plurality of times as a dynamic instruction. Further, in the embodiment of the invention, one access request can be executed for a plurality of times, thereby obtaining the hit times of each access request in the cache in a plurality of executions.
Specifically, the access statistics may be obtained through a performance counter of the processor. The performance counter can count the memory-access behavior while the processor executes the test program, including, for each access request executed multiple times, the number of hits in the cache and the number of times the request is served by the memory. Further, the access modes may be known in advance: when the test program is built, it can be constructed to contain access requests of known, different access modes, so that the access mode of each access request is guaranteed to be known. Alternatively, the test program in the embodiment of the invention can be constructed randomly; after the access statistics are obtained, each access request can be output and displayed, the relevant staff can assess the access mode of each access request, and the access mode of each access request is then obtained by receiving the staff's input. The memory-access operation in the embodiment of the invention can be a data access or an instruction access.
It will be appreciated that the more hits a memory-access request has in the cache, the better the performance of the cache. For the same cache system, performance may differ between access requests of different access modes, so the embodiment of the invention can evaluate the performance of the cache in combination with the access mode of each access request. Specifically, different performance levels can be divided in advance, and the performance level of the cache for each access mode can be determined according to the number of hits of each access request in the cache and the access mode of each access request. For example, for each access mode, the different performance levels may each be associated with a corresponding interval of hit counts in the cache, where the performance level characterizes how good the performance is: the higher the performance level, the larger the values in its associated hit-count interval. Further, different weight coefficients can be set for the different access modes according to the actual access requirements, and the obtained performance levels of the different access modes can be weighted and combined to obtain the performance level of the cache to be evaluated.
Further, the method provided by the embodiment of the invention can be applied to a processor, so that the different performance levels can be uploaded to the processor in advance.
For example, suppose the test program contains access request A and access request B, where the access mode of A is indirect access and the access mode of B is stride access. If the access statistics indicate that A hits the cache 1850707 times and is served by the memory 3059 times, while B hits the cache 2584088 times and is served by the memory 645 times, it can be concluded that the cache performs better for stride access and worse for indirect access. Further, according to the preset performance levels, the performance of the cache to be evaluated can be further graded for the different access modes.
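Under the grading scheme described above, the evaluation might be sketched as follows; the thresholds and weights are invented for illustration, and only the hit counts come from the example:

```python
def level_for(hits, thresholds):
    """Return the performance level (0 = worst) whose interval contains hits."""
    level = 0
    for i, t in enumerate(thresholds):
        if hits >= t:
            level = i + 1
    return level

# Assumed per-mode hit-count thresholds: a higher level requires more hits.
THRESHOLDS = {"stride": [1_000_000, 2_000_000], "indirect": [1_000_000, 2_000_000]}
WEIGHTS = {"stride": 0.4, "indirect": 0.6}   # assumed per-mode weight coefficients

hits = {"stride": 2_584_088, "indirect": 1_850_707}  # counts from the example
levels = {mode: level_for(h, THRESHOLDS[mode]) for mode, h in hits.items()}
overall = sum(WEIGHTS[m] * levels[m] for m in levels)  # weighted overall level
```

With these assumed intervals, stride access lands in the top level and indirect access one level lower, matching the qualitative conclusion in the example.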
According to the cache performance evaluation method provided by the embodiment of the invention, the access statistics of the test program are acquired in response to multiple access operations of each access request in the test program; the access statistics comprise at least the number of times each access request hits in the cache to be evaluated; and the performance of the cache to be evaluated is evaluated based on the access statistics and the access mode of each access request. In this way, performance evaluation of the cache is achieved by obtaining the number of times memory-access requests hit in the cache. At the same time, by combining the access statistics with the access mode of each access request, the performance of the cache can be evaluated along the dimensions of the different access modes, realizing a multi-dimensional evaluation and improving the accuracy and interpretability of the cache performance evaluation.
Further, the embodiment of the invention evaluates the performance of the cache based on the number of hits and the access modes of the access requests. Compared with evaluating the cache miss rate or the average number of instructions executed per clock cycle (Instructions Per Cycle, IPC), this has the following advantages. The cache miss rate only reflects the proportion of misses among the access requests received by the cache and cannot locate when the misses occur; program fragments with the same miss rate may affect IPC differently because their misses occur at different times, so the program fragments with the greater impact on cache performance cannot be identified. IPC, in turn, is sensitive to branch prediction and other factors and cannot directly reflect how well the cache system covers the memory accesses. By using the number of hits of each access request together with its access mode, the embodiment of the invention can evaluate the performance of the cache along the dimensions of the different access modes, realize a multi-dimensional evaluation, determine the access modes that have a greater impact on cache performance, and improve the accuracy of the cache performance evaluation.
In an alternative embodiment of the invention, the storage hierarchy of the cache to be evaluated comprises at least two levels; in step 101, the operation of obtaining access statistics information of a test program in response to multiple access operations of each access request in the test program may specifically include the following steps:
And responding to multiple access operations of each access request in the test program, and acquiring the hit times of each access request in caches of all levels as the access statistical information.
Here, the levels refer to the different levels of the memory hierarchy. When the storage hierarchy of the cache includes at least two levels, the cache is a multi-level cache system, and the caches of the different levels differ in speed. The processor always accesses the caches level by level in order: it first accesses the highest level, i.e. the first-level cache (L1 cache), to obtain the data to be fetched; the data is returned if the L1 cache hits; if the data is not in the L1 cache, the second-level cache (L2 cache) is accessed, …, and the memory is accessed only when the lowest-level cache, i.e. the last-level cache, also misses. Meanwhile, a higher-level cache usually has a smaller capacity and a higher read speed. Accordingly, if an access request hits more often in the higher-level caches, the cache system performs well, whereas if it hits more often in the lower-level caches, the performance of the cache system is merely average.
Specifically, when the storage hierarchy of the cache to be evaluated includes at least two hierarchies, the above access statistical information may also be obtained by using a performance counter, where the performance counter may respectively count the hit numbers of the access request in caches of different hierarchies, so as to obtain hit number distribution in the storage hierarchy.
Further, when the cache to be evaluated includes at least two levels of caches, the embodiment of the invention can obtain the hit times of each access request in each level of caches as access statistical information, so that more accurate and finer evaluation on the performance of the caches can be realized according to the hit times of caches in different levels.
Optionally, the above operation of obtaining the number of hits of each access request in the cache of each level may specifically include the following steps:
S11, setting a plurality of hit counters for each access request; different hit counters correspond to caches of different levels.
S12, for any access operation of each access request, encapsulating the access request into a request packet, and setting a level parameter in the request packet.
S13, in the process of the request packet returning from the target level, adding 1 to the level parameter in the request packet for every level passed; the target level is the level of the cache hit by the request packet.
S14, determining the target level of the access request based on the value of the level parameter in the returned request packet, and, among the hit counters corresponding to the access request, adding 1 to the hit counter corresponding to the target level.
S15, in the case that the evaluation condition is satisfied, acquiring the number of hits of each access request in the cache of each level based on the current values of the hit counters corresponding to each access request.
Specifically, for the steps S11 to S15, in the embodiment of the present invention, a plurality of hit counters may be set for each memory request, and different hit counters may correspond to caches of different levels, so that each hit counter of each memory request may count the hit times of caches of different levels.
Here, the request packet is a message packet, and the memory-access request is encapsulated into a request packet so that it can be passed through the levels of the cache conveniently. Further, the embodiment of the invention sets a level parameter in the request packet, the level parameter being used to indicate the level of the cache hit by the request packet. Specifically, a variable can be created in the request packet as the level parameter, and after the level parameter is created, its initial value can be set to 0. The level parameter may be an integer variable or a floating-point variable; the embodiment of the invention does not limit this.
Further, after the request packet is sent out, it is first passed to the L1 cache. If the target data to be accessed exists in the L1 cache, the target data is added to the request packet and the request packet is returned; if the target data does not exist in the L1 cache, the request packet continues to be passed down to the L2 cache, and so on until the target data is found, after which the request packet is passed back up level by level from the hit level. In the embodiment of the invention, while the request packet is returning from the level of the hit cache, the level parameter in the request packet is increased by 1 for every level passed, so that the level of the cache hit by the request packet can be determined from the level parameter, which facilitates counting by the hit counters.
Accordingly, after the returned request packet is obtained, the target level where the cache hit by the access request is located can be determined according to the value of the level parameter, so that the hit counter corresponding to the target level can be increased by 1, and statistics on hit times of different levels can be realized.
The evaluation condition may be that the number of executions of the test program reaches a preset count threshold, or that the execution time of the test program reaches an execution-time threshold; it may be set according to actual requirements, and the embodiment of the invention does not limit this. Further, after the evaluation condition is satisfied, the current value of each hit counter corresponding to each access request can be determined as the number of hits of that access request in the cache of each level.
Further, in the embodiment of the invention, the program counter (PC value) of each memory-access request can be used as the index value of that request, so that the hit counters corresponding to different memory-access requests can be distinguished by PC value; that is, the PC value of a memory-access request serves as the identifier of the hit counters corresponding to that request. Accordingly, after a returned request packet is received, the hit counters of all memory-access requests can be indexed based on the PC value carried in the request packet, and the hit counters whose identifier equals that PC value are determined as the hit counters of the memory-access request corresponding to the request packet.
In the embodiment of the invention, a plurality of hit counters are set for each access request; different hit counters correspond to different levels of cache; for any access operation of each access request, packaging the access request into a request packet, and setting a hierarchy parameter in the request packet; in the process of returning the request packet from the target hierarchy, adding 1 to the hierarchy parameter in the request packet every time one hierarchy is passed; the target level is the level of the cache in which the request packet hits; determining a target level of the access request based on the value of the level parameter in the returned request packet, and adding 1 to a hit counter corresponding to the target level from hit counters corresponding to the access request; and under the condition that the evaluation condition is met, acquiring the hit times of each access request in caches of all levels based on the current value of each hit counter corresponding to each access request. In this way, the level of the cache hit by the access request can be determined by setting the level parameter, and at the same time, the hit numbers of the access request in caches of different levels can be counted respectively by setting a plurality of hit counters for one access request.
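The counting scheme of steps S11-S15 can be sketched as follows, assuming a three-level cache plus memory; the data structures and cache contents are illustrative, and for brevity the return-path increment of S13 is collapsed into a single assignment of the hit level's index:

```python
from collections import defaultdict

LEVELS = ["L1", "L2", "L3", "memory"]  # memory treated as the lowest level

# S11: one counter per level, per static memory-access request (keyed by PC)
hit_counters = defaultdict(lambda: {lv: 0 for lv in LEVELS})

def access(pc, addr, caches):
    # S12: wrap the request in a packet whose level parameter starts at 0
    packet = {"pc": pc, "addr": addr, "level_param": 0}
    # find the first level that holds the address (memory always hits)
    hit_index = next(i for i, c in enumerate(caches + [None])
                     if c is None or addr in c)
    # S13: adding 1 per level passed on the return path leaves the
    # parameter equal to the index of the level that hit
    packet["level_param"] = hit_index
    # S14: the parameter names the target level; bump that level's counter
    hit_counters[packet["pc"]][LEVELS[packet["level_param"]]] += 1

caches = [{0x10}, {0x20}, {0x30}]  # assumed contents of L1, L2, L3
access(0xABC, 0x10, caches)  # hits L1: parameter stays 0
access(0xABC, 0x30, caches)  # hits L3: incremented twice on return
access(0xABC, 0x99, caches)  # misses all caches, served by memory
# S15: once the evaluation condition holds, read out hit_counters
```

Keying the counters by PC value mirrors the indexing scheme described above: one static load instruction accumulates its per-level hit counts across all of its dynamic executions.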
Illustratively, referring to FIG. 2, a schematic diagram of hit count acquisition of the present invention is shown, where the cache in FIG. 2 includes three levels of cache, namely a private primary data cache, a private secondary cache, and a shared tertiary cache. The processor core 1 may execute a test program to obtain access statistics via a performance counter in the processor. The cache queue refers to a Load Store queue. The other cores refer to processor cores other than processor core 1.
Specifically, at A, a certain access request (a Load instruction) completes address calculation and issues a memory-access request to the cache. The memory-access request is encapsulated into a message packet and passed through the levels of the cache. The message packet stores metadata, which may include the level parameter d, the source of the cache line, the priority of the cache line for the replacement algorithm, and the like. According to FIG. 2, the message packet hits the first-level cache and is returned directly; the level parameter is still 0, indicating that the target level of the message packet is the first-level cache. The metadata of the message packet is recorded in the Load Store queue of the processor performance model.
At B, the message packet corresponding to the Load instruction returns from the storage level that hit. According to FIG. 2, the message packet hits the memory; on its way back to the cache queue, the level parameter in the packet is increased by 1 for every level passed, so that when the message packet reaches the cache queue, its level parameter is 3, indicating that the target level of the message packet is the memory. At C, the Load instruction is processed, and the response of the Load instruction in the cache system can be counted according to the metadata stored in the Load Store queue.
As shown in FIG. 2, the performance counter may include 5 entries, namely a program counter, a first-level cache hit counter, a second-level cache hit counter, a third-level cache hit counter, and a memory hit counter, where the program counter is used to record the PC value of the memory-access request. As shown in FIG. 2, the memory-access request with PC value 0xABC hits K times in the first-level cache, L times in the second-level cache, M times in the third-level cache, and N times in the memory.
Further, for each Load instruction, the proportion of its returns served by each cache level can reflect the efficiency of the cache system. If a certain memory-access request is mostly returned by the L3 cache or by the memory, the IPC of the program fragment containing that request is typically also low.
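This per-instruction return distribution can be computed as below; the counts are invented stand-ins for the K/L/M/N values in FIG. 2:

```python
# Per-level return counts for one Load instruction (assumed K, L, M, N).
counts = {"L1": 9000, "L2": 600, "L3": 300, "memory": 100}

total = sum(counts.values())
# Fraction of this Load's returns served by each level of the hierarchy.
distribution = {lv: c / total for lv, c in counts.items()}
# A large L3/memory share suggests the fragment will show low IPC;
# the 20% cutoff is an arbitrary illustrative threshold.
low_locality = distribution["L3"] + distribution["memory"] > 0.2
```

With these counts the L1 cache serves 90% of returns, so the fragment would be classified as cache-friendly under the assumed threshold.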
Optionally, the access statistical information further includes an index value of each access request, and after the operation of obtaining the access statistical information of the test program, the embodiment of the present invention specifically further includes the following steps:
S21, outputting source codes corresponding to the index values to an information display interface based on the index values of the access requests.
S22, receiving, via the information display interface, the mode information input by the user for the source code corresponding to each index value, and determining the mode information as the access mode of the access request corresponding to that index value.
The index value refers to a PC value of the memory access request, and the PC value is unique and fixed for a static instruction, so that the embodiment of the invention can output the source code of the corresponding memory access request to the information display interface through the PC value, and the source code can be analyzed by related testers. Specifically, source codes corresponding to the PC values can be obtained through addr2 lines. The addr2line is a debug information reading tool, and a Program Counter (PC) may be corresponding to a line of the source code.
Further, the embodiment of the invention can sequentially output the source codes corresponding to the index values to the information display interface for display, related testers analyze the displayed source codes to obtain the access mode of the source codes, and a user can input the access mode as input information, so that the embodiment of the invention can determine the access mode corresponding to the source codes by receiving the mode information input by the user.
Illustratively, referring to FIG. 3, a schematic diagram of the access mode of the present invention is shown, wherein a high-level language source file refers to the source program of a test program, and a compiler may generate a binary executable file containing code and data segments during the compilation stage, as well as debug information (e.g., DWARF formatted debug information, debugging With Arbitrary Record Formats. DWARF is a debug information file format used by many compilers and debuggers to support source-level debugging), which may contain mappings of source code to PC values. After the test program runs on the processor, a statistics result (access statistics information) of a performance counter can be obtained. The debug information reading tool can read the source code corresponding to each access request, namely the high-level language code, based on the mapping relation between the source code and the PC value in the debug information and the statistical result of the performance counter. And then the access mode of each access request can be determined through the high-level language code.
Further, the embodiment of the invention can evaluate the performance of the cache or the effect of an optimization algorithm based on the access mode, the cache hit count of each level in the access statistical information and the duty ratio of the cache hit of each level.
In the embodiment of the invention, the access statistical information also comprises index values of the access requests; outputting source codes corresponding to the index values to an information display interface through the index values based on the access requests; and receiving mode information input by a user for source codes corresponding to the index values based on the information display interface, and determining the mode information as a memory access mode of a memory access request corresponding to the index values. Therefore, through setting the information display interface, the access mode corresponding to each access request can be determined by receiving the input of the user.
Optionally, the operation of evaluating the performance of the to-be-evaluated cache based on the access statistics information and the access mode of each access request may specifically include the following steps:
S31, aiming at each access request, acquiring the reference hit times corresponding to the access mode of the access request.
S32, based on the hit times of the access requests and the reference hit times, evaluating the performance of the cache to be evaluated.
The reference hit times may be preset, and may be hit times of the memory requests of each memory mode in the cache with performance meeting requirements, so that the embodiment of the present invention may evaluate the performance of the cache to be evaluated through the reference hit times of different memory modes.
Specifically, for any access request, the reference hit number may be used as a performance threshold, and under the condition that the hit number of the access request is not less than the reference hit number corresponding to the access mode of the access request, it is determined that the cache performance of the cache to be evaluated for the access mode meets the requirement. Accordingly, under the condition that the hit number of the access request is smaller than the reference hit number corresponding to the access mode of the access request, it is determined that the cache performance of the cache to be evaluated for the access mode does not meet the requirement.
Optionally, the embodiment of the invention can preset the reference hit times of caches in different levels, which can be the hit times of the access requests in each access mode in each level of the caches with performance meeting the requirement. Accordingly, the evaluation method may be performed in combination with the number of hits and the number of reference hits in caches of different levels.
According to the embodiment of the invention, the reference hit times corresponding to the access mode of the access request are obtained by aiming at each access request; and evaluating the performance of the cache to be evaluated based on the hit times of the access requests and the reference hit times. By setting the number of reference hits, the performance of the cache can be effectively evaluated.
Optionally, the embodiment of the invention specifically may further include:
S41, optimizing the cache to be evaluated by adopting a first optimization algorithm, re-executing the operation of acquiring the access statistical information of the test program based on the optimized cache to obtain second access statistical information, and taking the access statistical information corresponding to the cache before optimization as first access statistical information; the second access statistic information comprises the hit times of each access request in the optimized cache and the optimization state of the first optimization algorithm when each access request hits the optimized cache.
S42, evaluating the optimization effect of the first optimization algorithm based on the first access statistical information, the second access statistical information and the access modes of all access requests.
For the steps S41 to S42, the first optimization algorithm refers to an optimization technique for the cache, and may be any prefetcher, a prefetching technique, a prefetching algorithm, or a replacement policy, etc., and the first optimization algorithm may be selected according to actual needs, which is not limited in the embodiment of the present invention. It may be appreciated that the above-mentioned first optimization algorithm may optimize the performance of the cache, and the optimization effects of different optimization algorithms may be different.
Specifically, the optimization effect of the first optimization algorithm according to the embodiment of the invention can be evaluated from the angles of different access modes. Specifically, for the access mode of any access request, the hit number of the access request can be obtained from the first access statistical information as a first number, and the hit number of the access request is obtained from the second access statistical information as a second number, if the second number is greater than the first number, the first optimization algorithm can improve the processing efficiency of the cache on the access mode, and further, if the second number is greater than the first number, and the difference value between the first and second times is greater than a preset threshold, the first optimization algorithm can greatly improve the processing efficiency of the cache on the access mode, and the optimization effect is better.
For example, referring to fig. 4, a statistical result diagram of the present invention is shown, and as shown in fig. 4, it shows 6 access requests, where PC values are respectively 0x119fa,0x119fe,0x119ea,0x119f0,0x119f8, and 0x119f4, and each line where the PC value is located corresponds to the hit number of the access request corresponding to the PC value in the first-level cache, the second-level cache, the third-level cache, and the memory. Taking the access modes of 0x119fa and 0x119fe as indirect access and taking other requests as stepping access as an example, it can be seen that compared with other access requests, the hit times of 0x119fa and 0x119fe in the three-level cache and the memory are more, and the performance of the cache against the indirect access mode is poor, and the performance against the stepping access mode is better.
Still another exemplary embodiment of the present invention is shown in fig. 5, where fig. 5 is a statistical result optimized by a hardware prefetching technique for a cache, and it can be seen that the hardware prefetching technique can increase the hit times of indirect accesses 0x119fa and 0x119fe in the first-level cache, reduce the hit times in the level below the second-level cache, and effectively improve the processing efficiency of the cache system for the indirect access mode.
In addition, the statistics reflect that the prefetch technique still has room for improvement. The prefetch technique uses a step access predictor and indirect access identification technique to handle prefetching for a first level of indirect access. Ideally, the number of indirect memory instructions returned from the primary cache should be similar to the stepping memory on which they depend. While the number of indirect accesses returned from the primary cache in the actual result is still less than the stepping accesses it depends on. It can therefore be concluded that there is still room for improvement in this prefetch technique.
Meanwhile, after the prefetch technique is applied, the IPC of the test program is not raised, and the reasons may be the reasons of other modules in the processor core, such as branch prediction, etc. If the prior art way of evaluating the prefetch technique using only IPC is used, it will be concluded that the prefetch technique is not useful. The embodiment of the invention evaluates according to the hit times and the access mode, and can see that when the hit times of the first-level cache increases, the number of requests sent to the second-level cache is correspondingly reduced, so that the hit rate of the second-level cache is reduced, if the technology is evaluated by only using the hit rate of the cache, visual results cannot be obtained, or error conclusion that the hit rate of the second-level cache is reduced by the technology is obtained, and the evaluation effect is poor.
The invention can solve the problems of difficult evaluation of the cache optimization mechanism such as prefetcher, replacement algorithm and the like and poor evaluation effect of the traditional method by combining the access mode and the effect of each access instruction in the cache system.
Further, the reason for each cache hit may be recorded in the metadata of the message packet according to the embodiment of the present invention, and may be that the prefetch technology 1 is adopted to prefetch the post-hit, the prefetch technology 2 is adopted to prefetch the post-hit, or the address has been accessed before. Further, the cause of each level of cache miss may also be recorded in metadata, such as: the first access that the prefetcher does not cover, the prefetcher covers but does not fetch in time, is swapped out of cache for capacity reasons, is swapped out of cache for conflict reasons, etc. Further refinement of the cache and optimization algorithm may be based on the metadata.
Further, the second access statistic information may include an optimization state of the first optimization algorithm when each access request hits the optimized cache. The optimization state refers to optimization parameters of an optimization algorithm, and the optimization parameters of different optimization algorithms are different, for example, in the case that the first optimization algorithm is a replacement algorithm, the optimization state may be a least recently Used distance (LEAST RECENTLY Used) of the least recently Used replacement algorithm or a Re-reference interval predicted by a Re-REFERENCE INTERVAL Prediction replacement algorithm (RRIP). Specifically, the above-mentioned optimization state can be obtained by reading the current value of the optimization parameter of the optimization algorithm.
According to the embodiment of the invention, the cache to be evaluated is optimized by adopting a first optimization algorithm, the operation of obtaining the access statistical information of the test program is re-executed based on the optimized cache, the second access statistical information is obtained, and the access statistical information corresponding to the cache before optimization is used as the first access statistical information; and evaluating the optimization effect of the first optimization algorithm based on the first access statistical information, the second access statistical information and the access modes of all access requests. In this way, an efficient evaluation of the first optimization algorithm can be achieved.
Optionally, the embodiment of the invention specifically may further include:
S51, optimizing the optimized cache by adopting a second optimization algorithm, and re-executing the operation of acquiring the access statistical information of the test program to obtain third access statistical information; the third access statistical information comprises the optimization state of the first optimization algorithm and the optimization state of the second optimization algorithm when each access request hits the optimized cache.
S52, based on the second access statistical information and the third access statistical information, evaluating the optimization effect of the first optimization algorithm and the second optimization algorithm.
The second optimization algorithm refers to an optimization technology for the cache, and may be any prefetcher, a prefetching technology, a prefetching algorithm, a replacement policy or the like different from the first optimization algorithm, and the second optimization algorithm may be selected according to actual requirements, which is not limited in the embodiment of the present invention.
Specifically, since in some cases, two or more optimization techniques may be simultaneously used by one cache system, the effects of different optimization techniques may overlap each other or cancel each other to cause a worse cache effect. For example, if a hardware prefetch technique and a replacement policy both effectively optimize computer system performance when compared to baseline alone, but when both methods are used simultaneously, the effect may not be as good as when either method is used alone, possibly because hardware prefetching increases memory access traffic and access opportunities are advanced over normal read data, not conforming to the assumptions of the replacement algorithm design.
On the basis, after the cache is optimized by adopting the first optimization algorithm, the cache is further optimized by adopting the second optimization algorithm, and the first optimization algorithm and the second optimization algorithm are simultaneously applied to the cache at the moment.
Further, according to the embodiment of the invention, the optimization effect of the whole of the first optimization algorithm and the second optimization algorithm can be evaluated through the second access statistical information and the third access statistical information. Specifically, the optimization state in the second access statistical information can be compared with the optimization state of the first optimization algorithm in the third access statistical information, and if the optimization state of the first optimization algorithm in the third access statistical information is poor, the optimization effect of simultaneously applying the first optimization algorithm and the second optimization algorithm can be obtained.
For example, taking the first optimization algorithm as a replacement algorithm, taking the second optimization algorithm as a hardware prefetching technology, for example, before the hardware prefetching technology is added, the re-reference interval (optimization state) predicted by the replacement algorithm for a certain access request is longer, but after the hardware prefetching is added, the predicted re-reference interval of the access instruction is shorter, so that the fact that the replacement algorithm is influenced by the prefetching can be estimated to cause poor comprehensive optimization effect of the two.
In summary, the embodiment of the invention provides a cache performance evaluation method, which acquires access statistical information of a test program through multiple access operations responding to each access request in the test program; the access statistical information at least comprises hit times of each access request in a cache to be evaluated; and evaluating the performance of the cache to be evaluated based on the access statistical information and the access mode of each access request. In this way, performance evaluation of the cache may be achieved by obtaining the number of hits in the cache for the memory access request. Meanwhile, through the access statistical information and the access modes of each access request, the performance of the cache can be evaluated from the dimensions of different access modes, multi-dimensional evaluation is realized, and the accuracy of the performance evaluation of the cache is improved.
Further, in the embodiment of the present invention, the performance of the cache is evaluated based on the number of hits and the access mode of the access request, and compared with a method of evaluating the cache miss rate or the average instruction number (Instruction Per Cycle, IPC) executed per clock cycle, the cache miss rate can only reflect the miss proportion in the access request received by the cache, which cannot locate the miss timing, program fragments with the same miss rate may have different effects on IPC due to different miss timings, and further cannot determine program fragments with larger effects on the performance of the cache. IPC is sensitive to branch prediction and other factors, and cannot directly reflect the coverage degree of a cache system to memory access. The embodiment of the invention can evaluate the performance of the cache from the dimensions of different access modes through the hit times of each access request and the access modes of each access request, realize multi-dimensional evaluation, determine the access mode with larger influence on the performance of the cache and improve the accuracy and the interpretability of the performance evaluation of the cache.
Furthermore, the embodiment of the invention can also provide basis for the design optimization of the cache system and can also provide basis for the improvement of an optimization algorithm. The embodiment of the invention can assist in the design of a hardware cache system and can be conveniently used in a simulator or a simulation environment.
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Device embodiment
Referring to FIG. 6, which illustrates a block diagram of an embodiment of a cache performance evaluation apparatus of the present invention, the apparatus 20 may specifically comprise:
An obtaining module 201, configured to obtain access statistics information of a test program in response to multiple access operations of each access request in the test program; the access statistical information at least comprises hit times of each access request in a cache to be evaluated;
a first evaluation module 202, configured to evaluate the performance of the cache to be evaluated based on the access statistics information and the access mode of each access request.
Optionally, the storage hierarchy of the cache to be evaluated comprises at least two hierarchies; the obtaining module 201 is specifically configured to:
And responding to multiple access operations of each access request in the test program, and acquiring the hit times of each access request in caches of all levels as the access statistical information.
Optionally, the acquiring module 201 includes:
A setting submodule, configured to set a plurality of hit counters for each of the access requests; different hit counters correspond to different levels of cache;
the packaging sub-module is used for packaging the access requests into request packets for any access operation of each access request, and setting level parameters in the request packets;
A parameter sub-module, configured to add 1 to a hierarchy parameter in the request packet every time a hierarchy is passed in a process of returning the request packet from a target hierarchy; the target level is the level of the cache in which the request packet hits;
The determining submodule is used for determining a target level of the access request based on the value of the level parameter in the returned request packet, and adding 1 to a hit counter corresponding to the target level from hit counters corresponding to the access request;
The number acquisition sub-module is used for acquiring the hit number of each access request in each level of cache based on the current value of each hit counter corresponding to each access request under the condition that the evaluation condition is met.
Optionally, the access statistical information further includes an index value of each access request; the apparatus further comprises:
the output module is used for outputting source codes corresponding to the index values to the information display interface based on the index values of the access requests;
And the receiving module is used for receiving the mode information input by the user for the source codes corresponding to the index values based on the information display interface and determining the mode information as the access mode of the access request corresponding to the index values.
Optionally, the first evaluation module includes:
the reference acquisition sub-module is used for acquiring the reference hit times corresponding to the access modes of the access requests aiming at each access request;
And the evaluation sub-module is used for evaluating the performance of the cache to be evaluated based on the hit times of the access requests and the reference hit times.
Optionally, the apparatus further comprises:
The first optimizing module is used for optimizing the cache to be evaluated by adopting a first optimizing algorithm, re-executing the operation of acquiring the access statistical information of the test program based on the optimized cache to obtain second access statistical information, and taking the access statistical information corresponding to the cache before optimization as first access statistical information; the second access statistical information comprises hit times of each access request in the optimized cache and an optimization state of the first optimization algorithm when each access request hits the optimized cache;
and the second evaluation module is used for evaluating the optimization effect of the first optimization algorithm based on the first access statistical information, the second access statistical information and the access modes of all access requests.
Optionally, the apparatus further comprises:
the second optimization module is used for optimizing the optimized cache by adopting a second optimization algorithm and re-executing the operation of acquiring the access statistical information of the test program to obtain third access statistical information; the third access statistical information comprises the optimization state of the first optimization algorithm and the optimization state of the second optimization algorithm when each access request hits the optimized cache;
And the third evaluation module is used for evaluating the optimization effects of the first optimization algorithm and the second optimization algorithm based on the second access statistical information and the third access statistical information.
In summary, the embodiment of the invention provides a cache performance evaluation device, which acquires access statistical information of a test program through multiple access operations responding to each access request in the test program; the access statistical information at least comprises hit times of each access request in a cache to be evaluated; and evaluating the performance of the cache to be evaluated based on the access statistical information and the access mode of each access request. In this way, performance evaluation of the cache may be achieved by obtaining the number of hits in the cache for the memory access request. Meanwhile, through the access statistical information and the access modes of each access request, the performance of the cache can be evaluated from the dimensions of different access modes, multi-dimensional evaluation is realized, and the accuracy of the performance evaluation of the cache is improved.
For system embodiments, the description is relatively simple as it is substantially similar to method embodiments, and reference is made to the description of method embodiments for relevant points.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other.
The specific manner in which the respective modules perform operations in the cache performance evaluation apparatus in the above-described embodiments has been described in detail in the embodiments regarding the method, and will not be described in detail here.
The embodiment of the invention also provides electronic equipment, which comprises: a processor, a memory for storing processor-executable instructions, wherein the processor is configured to perform the cache performance evaluation method described above.
Referring to fig. 7, a schematic structural diagram of an electronic device according to an embodiment of the present invention is shown. As shown in fig. 7, the electronic device includes: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus; the memory is configured to store at least one executable instruction that causes the processor to perform the cache performance evaluation method of the foregoing embodiment.
It should be noted that, the electronic device in the embodiment of the present application includes a mobile electronic device and a non-mobile electronic device.
The Processor may be a CPU (Central Processing Unit ), general purpose Processor, DSP (DIGITAL SIGNAL Processor ), ASIC (Application SPECIFIC INTEGRATED Circuit), FPGA (Field Programmble GATE ARRAY, field programmable gate array) or other editable device, transistor logic device, hardware component, or any combination thereof. The processor may also be a combination that performs the function of a computation, e.g., a combination comprising one or more microprocessors, a combination of a DSP and a microprocessor, etc.
The communication bus may include a path to transfer information between the memory and the communication interface. The communication bus may be a PCI (PERIPHERAL COMPONENT INTERCONNECT, peripheral component interconnect standard) bus or an EISA (Extended Industry Standard Architecture ) bus, or the like. The communication bus may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one line is shown in fig. 7, but not only one bus or one type of bus.
The memory may be a ROM (Read Only memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (ELECTRICALLY ERASABLE PROGRAMMABLE READ ONLY, electrically erasable programmable Read Only memory), a CD-ROM (Compact Disa Read Only, compact disc Read Only), a magnetic tape, a floppy disk, an optical data storage device, and the like.
Embodiments of the present invention also provide a non-transitory computer-readable storage medium, which when executed by a processor of an electronic device (server or terminal), enables the processor to perform the cache performance evaluation method shown in fig. 1.
Embodiments of the present invention also provide a computer program product comprising instructions that, when executed on a computer, cause the computer to perform the cache performance evaluation method shown in fig. 1.
The embodiment of the application also provides a chip, which comprises a processor and a communication interface, wherein the communication interface is coupled with the processor, and the processor is used for running programs or instructions to realize the processes of the cache performance evaluation method embodiment, and the same technical effects can be achieved, so that repetition is avoided, and the description is omitted here.
It should be understood that the chips referred to in the embodiments of the present application may also be referred to as system-on-chip chips, chip systems, or system-on-chip chips, etc.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may be implemented, in whole or in part, in software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, produces a flow or function in accordance with embodiments of the present invention, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in or transmitted from one computer-readable storage medium to another, for example, by wired (e.g., coaxial cable, optical fiber, digital Subscriber Line (DSL)), or wireless (e.g., infrared, wireless, microwave, etc.). The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk Solid STATE DISK (SSD)), etc.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications to those embodiments may occur to those skilled in the art once they become aware of the basic inventive concept. It is therefore intended that the appended claims be interpreted as covering the preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
In this specification, the embodiments are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively simply because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding parts of the description of the method embodiments.
It should be noted that, in the embodiments of the present application, the processes of obtaining various data are all performed in compliance with the applicable data protection laws and regulations of the relevant jurisdiction and with the authorization of the owner of the corresponding device.
Finally, it is further noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a(n) …" does not exclude the presence of other identical elements in the process, method, article, or terminal device that comprises the element.
The cache performance evaluation method, the apparatus, the electronic device, and the readable storage medium provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present invention, and the above description of the embodiments is only intended to help understand the method and its core idea. Meanwhile, those skilled in the art may make changes to the specific implementations and the application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (7)

1. A method of cache performance evaluation, the method comprising:
in response to multiple access operations of each access request in a test program, setting a plurality of hit counters for each access request; the storage hierarchy of the cache to be evaluated comprises at least two levels, and different hit counters correspond to caches of different levels;
for any access operation of each access request, packaging the access request into a request packet, and setting a level parameter in the request packet;
in the process of returning the request packet from a target level, adding 1 to the level parameter in the request packet every time one level is passed; the target level is the level of the cache in which the request packet hits;
determining the target level of the access request based on the value of the level parameter in the returned request packet, and adding 1 to the hit counter corresponding to the target level among the hit counters corresponding to the access request;
in a case where an evaluation condition is met, acquiring the hit times of each access request in the cache of each level as access statistical information, based on the current value of each hit counter corresponding to each access request;
for each access request, acquiring the reference hit times corresponding to the access mode of the access request; and
evaluating the performance of the cache to be evaluated based on the hit times of each access request and the reference hit times.
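As an illustration only (not part of the claimed subject matter), the per-level hit counting recited in claim 1 can be sketched as follows. All identifiers, the three-level hierarchy, and the interpretation that the returned level parameter equals the hit level plus one are assumptions made for this sketch:

```python
# Illustrative sketch of claim 1's counting scheme: each access request
# gets one hit counter per cache level; a request packet carries a level
# parameter that is incremented once per level passed on the return path,
# so the returned value identifies the level that hit. All names and the
# exact increment convention are assumptions, not the patent's code.

NUM_LEVELS = 3  # claim 1 requires at least two levels; three assumed here


class RequestPacket:
    def __init__(self, request_id):
        self.request_id = request_id
        self.level_param = 0  # level parameter set when the packet is built


def access(packet, hierarchy):
    """Walk the hierarchy until some level hits, then model the return
    path by adding 1 to the level parameter per level passed."""
    hit_level = None
    for level in range(NUM_LEVELS):
        if packet.request_id in hierarchy[level]:  # hit at this level
            hit_level = level
            break
    if hit_level is None:
        return None  # missed every cache level (served by main memory)
    # One increment per level between the hit level and the requester,
    # inclusive of the hit level itself.
    for _ in range(hit_level + 1):
        packet.level_param += 1
    return packet


def record_hit(packet, hit_counters):
    """Derive the target level from the returned level parameter and
    add 1 to the matching counter for this request."""
    target_level = packet.level_param - 1
    hit_counters[packet.request_id][target_level] += 1
```

Under this convention, when the evaluation condition is met, `hit_counters[req]` directly yields the per-level hit times that form the access statistical information.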
2. The method of claim 1, wherein the access statistical information further comprises an index value of each access request; after acquiring the hit times of each access request in the cache of each level as the access statistical information, the method further comprises:
outputting the source code corresponding to each index value to an information display interface based on the index value of each access request; and
receiving, through the information display interface, mode information input by a user for the source code corresponding to the index value, and determining the mode information as the access mode of the access request corresponding to the index value.
3. The method according to any one of claims 1-2, wherein the method further comprises:
Optimizing the cache to be evaluated by adopting a first optimization algorithm, re-executing the operation of acquiring the access statistical information of the test program based on the optimized cache to obtain second access statistical information, and taking the access statistical information corresponding to the cache before optimization as first access statistical information; the second access statistical information comprises hit times of each access request in the optimized cache and an optimization state of the first optimization algorithm when each access request hits the optimized cache;
And evaluating the optimization effect of the first optimization algorithm based on the first access statistical information, the second access statistical information and the access modes of all access requests.
4. A method according to claim 3, characterized in that the method further comprises:
optimizing the optimized cache by adopting a second optimization algorithm, and re-executing the operation of obtaining the access statistical information of the test program to obtain third access statistical information; the third access statistical information comprises the optimization state of the first optimization algorithm and the optimization state of the second optimization algorithm when each access request hits the optimized cache;
and evaluating the optimization effects of the first optimization algorithm and the second optimization algorithm based on the second access statistic information and the third access statistic information.
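As a hedged illustration (not recited in the claims), the comparison step of claims 3 and 4 — evaluating an optimization algorithm by contrasting hit statistics collected before and after the optimization, grouped by access mode — could look like this; all names and the aggregation choice are assumptions:

```python
# Illustrative sketch of the optimization-effect evaluation in claims 3-4:
# hit statistics are gathered before and after applying an optimization
# algorithm, and the change in hit times is aggregated per access mode.
# The data shapes and the delta metric are assumptions for this sketch.
from collections import defaultdict


def evaluate_optimization(stats_before, stats_after, access_modes):
    """stats_before / stats_after map request id -> total hit times;
    access_modes maps request id -> access mode (e.g. 'sequential').
    Returns the aggregate change in hit times per access mode."""
    delta_by_mode = defaultdict(int)
    for req_id, mode in access_modes.items():
        before = stats_before.get(req_id, 0)
        after = stats_after.get(req_id, 0)
        delta_by_mode[mode] += after - before
    return dict(delta_by_mode)
```

The same comparison applied to the second and third access statistical information would isolate the incremental effect of the second optimization algorithm on top of the first.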
5. A cache performance evaluation apparatus, the apparatus comprising:
the acquisition module is used for responding to multiple access operations of each access request in the test program and acquiring access statistical information of the test program; the access statistical information at least comprises hit times of each access request in a cache to be evaluated;
the first evaluation module is used for evaluating the performance of the cache to be evaluated based on the access statistical information and the access modes of the access requests;
the storage hierarchy of the cache to be evaluated comprises at least two hierarchies; the acquisition module is specifically configured to:
responding to multiple access operations of each access request in a test program, and acquiring hit times of each access request in caches of all levels as the access statistical information;
the acquisition module comprises:
a setting sub-module, configured to set a plurality of hit counters for each of the access requests, wherein different hit counters correspond to caches of different levels;
a packaging sub-module, configured to, for any access operation of each access request, package the access request into a request packet and set a level parameter in the request packet;
a parameter sub-module, configured to add 1 to the level parameter in the request packet every time one level is passed in the process of returning the request packet from the target level, wherein the target level is the level of the cache in which the request packet hits;
a determining sub-module, configured to determine the target level of the access request based on the value of the level parameter in the returned request packet, and add 1 to the hit counter corresponding to the target level among the hit counters corresponding to the access request;
a count acquisition sub-module, configured to acquire, in a case where the evaluation condition is met, the hit times of each access request in the cache of each level based on the current value of each hit counter corresponding to each access request;
the first evaluation module comprises:
the reference acquisition sub-module is used for acquiring the reference hit times corresponding to the access modes of the access requests aiming at each access request;
and the evaluation sub-module is used for evaluating the performance of the cache to be evaluated based on the hit times of the access requests and the reference hit times.
6. An electronic device, comprising a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other via the communication bus; the memory is configured to hold executable instructions that cause the processor to perform the cache performance assessment method according to any one of claims 1 to 4.
7. A readable storage medium, wherein instructions in the readable storage medium, when executed by a processor of an electronic device, enable the processor to perform the cache performance evaluation method of any one of claims 1 to 4.
CN202410634793.6A 2024-05-21 Cache performance evaluation method and device, electronic equipment and readable storage medium Active CN118227446B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410634793.6A CN118227446B (en) 2024-05-21 Cache performance evaluation method and device, electronic equipment and readable storage medium


Publications (2)

Publication Number Publication Date
CN118227446A CN118227446A (en) 2024-06-21
CN118227446B true CN118227446B (en) 2024-08-02


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016012288A (en) * 2014-06-30 2016-01-21 富士通株式会社 Test device, test program, and test method
CN114631082A (en) * 2019-10-31 2022-06-14 超威半导体公司 Cache access measurement skew correction


Similar Documents

Publication Publication Date Title
Huynh et al. Scope-aware data cache analysis for WCET estimation
US9003384B2 (en) Methods and apparatuses for automatic type checking via poisoned pointers
Da Silva et al. Performance Modeling for FPGAs: Extending the Roofline Model with High‐Level Synthesis Tools
US20080184011A1 (en) Speculative Throughput Computing
US7779393B1 (en) System and method for efficient verification of memory consistency model compliance
US8612944B2 (en) Code evaluation for in-order processing
EP3835944B1 (en) Apparatus and method for source code optimisation
US11636122B2 (en) Method and apparatus for data mining from core traces
US20070150660A1 (en) Inserting prefetch instructions based on hardware monitoring
Pan et al. A modeling framework for reuse distance-based estimation of cache performance
CN107436834A (en) Estimate method, product and the system of power consumption of processing unit
US9753731B1 (en) Methods and systems for analyzing and improving performance of computer codes
US8612952B2 (en) Performance optimization based on data accesses during critical sections
CN118227446B (en) Cache performance evaluation method and device, electronic equipment and readable storage medium
CN106649143B (en) Cache access method and device and electronic equipment
CN118227446A (en) Cache performance evaluation method and device, electronic equipment and readable storage medium
CN116149917A (en) Method and apparatus for evaluating processor performance, computing device, and readable storage medium
Huber et al. WCET driven design space exploration of an object cache
KR20230052821A (en) Prefetching
US9916164B2 (en) Methods and apparatus to optimize instructions for execution by a processor
US8438003B2 (en) Methods for improved simulation of integrated circuit designs
CN111190644A (en) Embedded Flash on-chip read instruction hardware acceleration method and device
Feljan et al. The impact of intra-core and inter-core task communication on architectural analysis of multicore embedded systems
Gebhard Static timing analysis tool validation in the presence of timing anomalies
CN118295936B (en) Management method and device of cache replacement policy and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant