CN116049034A - Verification method and device for cache consistency of multi-core processor system - Google Patents

Info

Publication number
CN116049034A
Authority
CN
China
Prior art keywords
cache
state
cache line
memory address
request
Prior art date
Legal status
Pending
Application number
CN202210467692.5A
Other languages
Chinese (zh)
Inventor
马擎堃
张克松
陈元
Current Assignee
Haiguang Information Technology Co Ltd
Original Assignee
Haiguang Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Haiguang Information Technology Co Ltd
Priority to CN202210467692.5A
Publication of CN116049034A
Status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 - Addressing or allocation; Relocation
    • G06F 12/08 - Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0806 - Multiuser, multiprocessor or multiprocessing cache systems
    • G06F 12/0815 - Cache consistency protocols
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 15/00 - Digital computers in general; Data processing equipment in general
    • G06F 15/76 - Architectures of general purpose stored program computers
    • G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F 15/7839 - Architectures of general purpose stored program computers comprising a single central processing unit with memory
    • G06F 15/7842 - Architectures of general purpose stored program computers comprising a single central processing unit with memory on one IC chip (single chip microcontrollers)
    • G06F 15/7846 - On-chip cache and off-chip main memory
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The embodiment of the application discloses a method and a device for verifying cache consistency of a multi-core processor system, relates to the technical field of multi-core processor function verification, and aims to improve the verification efficiency of the cache consistency of the multi-core processor system. The method comprises the following steps: monitoring a first request received or sent by a first cache of a first processor core; monitoring first response information for the first request; determining the state of a first cache line from the corresponding relation between a preset memory address and the state of the cache line; predicting the state of the first cache line based on the state of the first cache line, the command type and a preset state transition rule based on a cache consistency protocol; the state transition rule based on the cache consistency protocol is consistent with the cache consistency state transition rule used in the multi-core processor system; and comparing the current state of the cache line with the predicted state to determine whether the first cache accords with cache consistency. The application verifies the cache consistency of the multi-core processor.

Description

Verification method and device for cache consistency of multi-core processor system
Technical Field
The present disclosure relates to the technical field of functional verification of multi-core processors, and in particular, to a method and apparatus for verifying cache consistency of a multi-core processor system, an electronic device, and a readable storage medium.
Background
Computer physical memory typically uses DRAM (Dynamic Random Access Memory), which is relatively inexpensive but also relatively slow. The processing speed of a Central Processing Unit (CPU) is limited by the access speed of the computer's physical memory, so to increase the access speed a Cache is introduced between the two as a bridge for their communication.
The Cache of a multi-core processor and the storage system exchange data in fixed-size blocks, referred to as cache lines (Cachelines). The Cache stores data cached from memory and records the current state of that data.
A multi-core processor manages the state of each Cache line through a Cache-coherence protocol, thereby avoiding data loss or data inconsistency problems. Cache coherence is an important feature of on-chip multi-core processor memory systems; it governs the correctness of parallel program execution results by defining the relationships between memory access operations in the multi-core processor.
Verification of cache coherence is an important part of the functional verification of an on-chip multi-core processor. However, in the prior art there is no suitable scheme for verifying cache coherence, which results in low efficiency of cache coherence verification for multi-core processor systems.
Disclosure of Invention
In view of this, embodiments of the present application provide a method, an apparatus, an electronic device, and a readable storage medium for verifying cache coherence of a multi-core processor system, which can improve the efficiency of verifying cache coherence of the multi-core processor system.
In a first aspect, an embodiment of the present application provides a method for verifying cache consistency of a multi-core processor system, where the multi-core processor system includes two or more processor cores, each processor core has its own first-level cache and second-level cache, and the processor cores share a third-level cache; the verification method for cache consistency comprises the following steps: monitoring a first request received or sent by a first cache of a first processor core, where the first request comprises the memory address of the data to be operated on by the first request and a command type; monitoring first response information for the first request, where the first response information includes a current state of a first cache line in the first cache, and the first cache line is used for storing the data to be operated on by the first request; determining the state of the first cache line, according to the memory address, from a preset correspondence between memory addresses and cache line states; predicting the state of the first cache line based on the state of the first cache line, the command type, and a preset state transition rule based on a cache consistency protocol, to obtain a predicted state of the first cache line, where the preset state transition rule based on the cache consistency protocol is consistent with the cache consistency state transition rule used in the multi-core processor system; and comparing the current state of the cache line with the predicted state to determine whether the first cache accords with cache consistency.
According to a specific implementation manner of the embodiment of the present application, the monitoring the first request received or sent by the first cache of the first processor core includes: monitoring fetch requests, probe requests and/or victim (eviction) requests received or sent by the first cache of the first processor core.
According to a specific implementation manner of the embodiment of the present application, the first cache is a second-level cache; the monitoring the first request received or sent by the first cache of the first processor core includes: monitoring requests between the second-level cache of the first processor core and the instruction fetch unit, the first-level data cache, or the third-level cache.
According to a specific implementation manner of the embodiment of the present application, the preset correspondence between memory addresses and cache line states includes: a plurality of correspondences between preset memory addresses and cache line states, the correspondences being in one-to-one correspondence with the processor cores; after monitoring the first request received or sent by the first cache of the first processor core, and before determining the state of the first cache line from the preset correspondence between memory addresses and cache line states according to the memory address, the method further includes: marking the first request so that the marked first request includes first processor core information; and determining, according to the first processor core information included in the marked first request, a first preset correspondence between memory addresses and cache line states that corresponds to the first processor core from the plurality of correspondences between preset memory addresses and cache line states; and the determining the state of the first cache line from the preset correspondence between memory addresses and cache line states according to the memory address includes: determining the state of the first cache line from the first preset correspondence between memory addresses and cache line states according to the memory address.
According to a specific implementation manner of the embodiment of the application, the method further includes: and under the condition that the current state of the cache line is consistent with the predicted state, updating the corresponding relation between the preset memory address and the cache line state information by using the memory address and the current state of the cache line.
In a second aspect, an embodiment of the present application provides a device for verifying cache consistency of a multi-core processor system, where the multi-core processor system includes two or more processor cores, each processor core has a first level cache and a second level cache, and each processor core has a third level cache in common; the verification device for cache consistency comprises: the first monitoring module is used for monitoring a first request received or sent by a first cache of the first processor core; the first request comprises a memory address and a command type of data to be operated by the first request; a second monitoring module for monitoring first response information for the first request; wherein the first response information includes a current state of a first cache line in the first cache; the first cache line is used for storing the data to be operated by the first request; the first determining module is used for determining the state of the first cache line from the corresponding relation between the preset memory address and the state of the cache line according to the memory address; the prediction module is used for predicting the state of the first cache line based on the state of the first cache line, the command type and a preset state transition rule based on a cache consistency protocol to obtain a predicted state of the first cache line; the preset state transition rule based on the cache consistency protocol is consistent with the cache consistency state transition rule used in the multi-core processor system; and the second determining module is used for comparing the current state of the cache line with the predicted state and determining whether the first cache accords with cache consistency.
According to a specific implementation manner of the embodiment of the present application, the first monitoring module is specifically configured to: monitor fetch requests, probe requests and/or victim requests received or sent by the first cache of the first processor core.
According to a specific implementation manner of the embodiment of the present application, the first cache is a second-level cache; the first monitoring module is specifically configured to: monitor requests between the second-level cache of the first processor core and the instruction fetch unit, the first-level data cache, or the third-level cache.
According to a specific implementation manner of the embodiment of the present application, the corresponding relationship between the preset memory address and the cache line state includes: a plurality of preset memory addresses and corresponding relations between the cache line states; the corresponding relation between each preset memory address and the cache line state corresponds to each processor core one by one; the apparatus further comprises: the marking module is used for marking the first request before determining the state of the first cache line from the corresponding relation between the preset memory address and the state of the cache line according to the memory address after the first monitoring module monitors the first request received or sent by the first cache of the first processor core, so that the marked first request comprises first processor core information; a third determining module, configured to determine, according to first processor core information included in the marked first request, a correspondence between a first preset memory address corresponding to the first processor core and a cache line state from correspondence between a plurality of preset memory addresses and cache line states; the first determining module is specifically configured to: and determining the state of the first cache line from the corresponding relation between the first preset memory address and the state of the cache line according to the memory address.
According to a specific implementation manner of the embodiment of the application, the apparatus further includes: and the updating module is used for updating the corresponding relation between the preset memory address and the cache line state information by using the memory address and the current state of the cache line under the condition that the current state of the cache line is consistent with the predicted state.
In a third aspect, an embodiment of the present application provides an electronic device, including: the device comprises a shell, a processor, a memory, a circuit board and a power circuit, wherein the circuit board is arranged in a space surrounded by the shell, and the processor and the memory are arranged on the circuit board; a power supply circuit for supplying power to each circuit or device of the electronic apparatus; the memory is used for storing executable program codes; the processor executes a program corresponding to the executable program code by reading the executable program code stored in the memory, and is configured to execute the method for verifying cache consistency of the multi-core processor system according to any one of the foregoing implementations.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing one or more programs executable by one or more processors to implement a method for verifying cache coherence of a multi-core processor system according to any one of the foregoing implementations.
According to the verification method and device for cache consistency of a multi-core processor system, the electronic device, and the readable storage medium provided by the embodiments of the present application, a first request received or sent by a first cache of a first processor core and first response information for the first request are monitored; the state of the first cache line is determined, according to the memory address in the first request, from a preset correspondence between memory addresses and cache line states; the state of the first cache line is then predicted based on the state of the first cache line, the command type in the first request, and a preset state transition rule based on a cache consistency protocol, to obtain a predicted state of the first cache line; finally, the current state of the cache line carried in the first response information is compared with the predicted state to determine whether the first cache accords with cache consistency. Because the predicted state of the first cache line is obtained from the state of the first cache line, the command type, and the preset state transition rule based on the cache consistency protocol, and this state transition rule is consistent with the cache consistency state transition rule used in the multi-core processor system, the predicted state can be regarded as the state that a cache conforming to cache consistency should have. Comparing the current state of the cache line with the predicted state therefore determines whether the first cache accords with cache consistency, and the verification efficiency of cache consistency of the multi-core processor system is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a multi-core processor system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of signal paths between a single processor core and various levels of caches according to one embodiment of the present disclosure;
FIG. 3 is a flowchart illustrating a method for verifying cache coherency in a multi-core processor system according to an embodiment of the present disclosure;
FIG. 4 is a Cache coherency transition diagram based on the MOESI protocol according to an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating a relationship between data stored in the first level cache and data stored in the second level cache according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an embodiment of the present application;
FIG. 7 is a schematic diagram of a Cache consistency check device according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of an embodiment of the present application;
FIG. 9 is a schematic diagram of a device for verifying cache coherency in a multi-core processor system according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below with reference to the accompanying drawings. It should be understood that the described embodiments are merely some, but not all, of the embodiments of the present application. All other embodiments, based on the embodiments herein, which would be apparent to one of ordinary skill in the art without making any inventive effort, are intended to be within the scope of the present application.
As described in the background art, in order to increase the access speed of the multi-core processor, a Cache is provided between the multi-core processor and the storage system. Access to the Cache is fast, but its cost is also high, and because of the limitations of processor cost, power consumption, area and the like, the Cache capacity used by the processor is small. The Cache exploits temporal locality and spatial locality: if a piece of data is accessed now, it is likely to be accessed again soon, and the data around it is also likely to be accessed soon. The computer therefore places the memory regions recently accessed by the CPU into the Cache; when the CPU accesses memory, it first looks in the Cache and only accesses memory if the data is not found there, which greatly improves the processing speed of the computer.
Referring to fig. 1 and 2, the caches of a multi-core processor generally adopt a three-level Cache structure: each core has a private first-level cache (L1 Cache) and second-level cache (L2 Cache), a third-level cache (L3 Cache) is shared among the cores, and the L1 Cache is divided into an instruction Cache and a data Cache. When a core needs to access a program or data, it first looks in the L1 Cache; if the data is not found there, it looks in the L2 Cache, then in the L3 Cache, and finally accesses main memory if the L3 Cache also misses.
The Cache and the storage system exchange data in fixed-size blocks, referred to as cache lines (Cachelines). The Cache stores copies of data from memory and records whether the data is currently valid or modified, together with the corresponding memory block address.
The processor first fetches instructions from the L1 instruction Cache through the fetch unit (Instruction Fetch, I-Fetch) and stores the instructions in the instruction register. The instruction is then decoded, and the register file is read according to the decoded value to obtain the source operands of the instruction. After the execution unit (EX) generates the effective address for a load/store operation, the load store and data cache (Load Store and Data Cache, LSDC) unit is responsible for completing all memory accesses of the core.
Multi-core processors typically manage the state of each Cache line through a Cache-coherence protocol, thereby avoiding data loss or data inconsistency problems. Cache coherency is an important feature of on-chip multi-core processor memory systems; it governs the correctness of parallel program execution results by defining the relationships between memory access operations in a multiprocessor.
With the development and scaling of modern integrated circuits, verification is becoming one of the greatest challenges in System On Chip (SOC) development. Cache coherency verification is an important content of on-chip multi-core processor functional verification.
In view of the above, the method for verifying cache consistency of a multi-core processor system provided by the present application offers engineers a convenient verification scheme and improves the verification efficiency of cache consistency of the multi-core processor system.
In order that those skilled in the art will better understand the technical concepts, embodiments and advantages of the examples of the present application, a detailed description will be given below by way of specific examples.
The embodiment of the application provides a verification method for cache consistency of a multi-core processor system, where the multi-core processor system comprises more than two processor cores, each processor core has its own first-level cache and second-level cache, and the processor cores share a third-level cache; the verification method for cache consistency comprises the following steps: monitoring a first request received or sent by a first cache of a first processor core, where the first request comprises the memory address of the data to be operated on by the first request and a command type; monitoring first response information for the first request, where the first response information includes a current state of a first cache line in the first cache, and the first cache line is used for storing the data to be operated on by the first request; determining the state of the first cache line from a preset correspondence between memory addresses and cache line states according to the memory address; predicting the state of the first cache line based on the state of the first cache line, the command type, and a preset state transition rule based on a cache consistency protocol to obtain a predicted state of the first cache line; and comparing the current state of the cache line with the predicted state to determine whether the first cache accords with cache consistency. In this way, the verification efficiency of cache consistency of the multi-core processor system can be improved.
FIG. 3 is a flowchart illustrating a method for verifying cache coherency in a multi-core processor system according to an embodiment of the present application. As shown in FIG. 3, the multi-core processor system of this embodiment comprises more than two processor cores, wherein each processor core has its own first-level cache and second-level cache, and the processor cores share a third-level cache; the method for verifying cache consistency of the multi-core processor system of this embodiment may include:
s101, monitoring a first request received or sent by a first cache of a first processor core.
In this embodiment, the first request may include a memory address and a command type where data to be operated on by the first request is located.
Command types may include read, write, and probe types.
The first processor core may be one of more than two processor cores.
The first cache may be a first level cache, a second level cache, or a third level cache.
In this embodiment, during operation of the multi-core processor system, a request received or sent by the cache corresponding to one of the processor cores is monitored. Taking a data read as an example, the first request includes a read command and the memory address of the data, and the request may be sent to the second-level cache or to the third-level cache.
S102, monitoring first response information for the first request.
The first response information may be feedback information for the first request.
In this embodiment, the first response information includes the current state of a first cache line (Cacheline) in the first cache; the first cache line is the cache line in which the data to be operated on by the first request is located.
After the first request is monitored, the first request may be stored at a predetermined location for subsequent operations once the first response information for the first request is received.
It is understood that the Cacheline in this embodiment is the Cacheline in the first cache. The current state of the first cache line (Cacheline) may be Invalid, Shared, Exclusive, Modified, or Owned.
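For illustration only, the request, response and cache line state information described above can be modeled with a few simple data types. The C sketch below is not part of the patented implementation (which, as described later, is written in a hardware description language such as Verilog); the type names and fields are assumptions chosen for readability.

/* Illustrative sketch only: names and layout are assumptions, not the patented RTL. */
#include <stdint.h>

/* Cache line states mentioned above (MOESI). */
typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED, OWNED } line_state_t;

/* Command types carried by a monitored request. */
typedef enum { CMD_READ, CMD_WRITE, CMD_PROBE } cmd_type_t;

/* A monitored first request: memory address of the data plus the command type. */
typedef struct {
    uint64_t   mem_addr;  /* memory (cache line) address of the data to be operated on */
    cmd_type_t cmd;       /* read, write, or probe */
    int        core_id;   /* added when the request is marked with processor core info */
} request_t;

/* The first response information: the current state of the first cache line. */
typedef struct {
    uint64_t     mem_addr;
    line_state_t current_state;  /* state of the cache line in the monitored cache */
} response_t;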
S103, determining the state of the first cache line from the corresponding relation between the preset memory address and the state of the cache line according to the memory address.
According to the memory address in the first request, the state of the first cache line can be determined in the corresponding relation between the preset memory address and the state of the cache line.
The correspondence between the predetermined memory address and the cache line state may correspond to the first processor core.
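As a rough illustration of S103, the preset correspondence between memory addresses and cache line states can be viewed as a table indexed by the memory address. The sketch below uses a plain linear table in C and the hypothetical types from the previous sketch; it is an assumption for clarity, not the actual model, and an address with no entry is treated as Invalid (the same convention described later for the Cache consistency model).

/* Sketch of the per-core address-to-state correspondence table (names are assumptions;
 * uses line_state_t from the previous sketch). */
#define MAX_ENTRIES 4096

typedef struct {
    uint64_t     addr[MAX_ENTRIES];
    line_state_t state[MAX_ENTRIES];
    int          count;
} state_table_t;

/* S103: look up the recorded state of the cache line holding mem_addr.
 * Returns INVALID when the address has no entry yet. */
line_state_t lookup_state(const state_table_t *t, uint64_t mem_addr)
{
    for (int i = 0; i < t->count; i++) {
        if (t->addr[i] == mem_addr)
            return t->state[i];
    }
    return INVALID;
}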
S104, predicting the state of the first cache line based on the state of the first cache line, the command type and a preset state transition rule based on a cache consistency protocol, and obtaining a predicted state of the first cache line.
The state of the first cache line is predicted according to the state of the first cache line determined in step S103, the command type in the first request, and a preset state transition rule based on a cache consistency protocol, so as to obtain the predicted state of the first cache line.
Referring to fig. 4, the state transition rule based on the cache coherence protocol may be a state transition rule derived from the cache coherence protocol; the rule may take the form of a state transition diagram. The corresponding state transition table is shown in Table 1, and the interpretation of each state is shown in Table 2.
TABLE 1 (cache line state transition table; reproduced as an image in the original publication)
TABLE 2 (interpretation of the cache line states; reproduced as an image in the original publication)
It will be appreciated that the state transition rules based on the cache coherency protocol of the present embodiment are consistent with state transition rules based on the cache coherency protocol used in the multi-core processor system.
In some examples, the cache coherence protocol based state transition rule may be a MOESI protocol based state transition table that includes a cache line state, a command type, and a predicted cache line state correspondence.
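The prediction step can be pictured as a pure function from the current state and the command type to the next state. Since Table 1 is published only as an image, the transitions in the following C sketch are a simplified, generic MOESI-style example rather than the patent's actual table; a real checker would encode exactly the rule used by the processor under verification. It uses the hypothetical types from the earlier sketch.

/* Sketch of S104: predict the next cache line state from the current state and the
 * command type, using a transition rule derived from a MOESI-style protocol.
 * The transitions below are a simplified generic example, NOT the patent's Table 1. */
line_state_t predict_state(line_state_t current, cmd_type_t cmd)
{
    switch (cmd) {
    case CMD_READ:   /* a local read gives the core a readable copy */
        return (current == INVALID) ? SHARED : current;
    case CMD_WRITE:  /* a local write leaves the line locally modified */
        return MODIFIED;
    case CMD_PROBE:  /* a (read) probe from another core: dirty copies become OWNED,
                        clean copies become SHARED; an invalidating probe would give INVALID */
        if (current == MODIFIED || current == OWNED) return OWNED;
        if (current == INVALID) return INVALID;
        return SHARED;
    default:
        return current;
    }
}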
S105, comparing the current state of the cache line with the predicted state, and determining whether the first cache accords with cache consistency.
The current state of the first cache line in the first cache included in the first response information in S102 is compared with the predicted state of the first cache line determined in S104, thereby determining whether the first cache accords with cache consistency. Specifically, when the current state of the cache line is consistent with the predicted state, the first cache accords with cache consistency; when the current state of the cache line is inconsistent with the predicted state, the first cache does not accord with cache consistency.
It can be understood that, for each request received or sent by the first cache, the consistency determination may be performed according to the procedure of the foregoing embodiment; further, for each processor core, a determination of coherency may be made in accordance with the processes of the above embodiments.
In this embodiment, a first request received or sent by a first cache of a first processor core and first response information for the first request are monitored; the state of the first cache line is determined, according to the memory address in the first request, from a preset correspondence between memory addresses and cache line states; the state of the first cache line is then predicted based on the state of the first cache line, the command type in the first request, and a preset state transition rule based on a cache consistency protocol, to obtain a predicted state of the first cache line; finally, the current state of the cache line is compared with the predicted state to determine whether the first cache accords with cache consistency. Because the predicted state of the first cache line is obtained from the state of the first cache line, the command type, and the preset state transition rule based on the cache consistency protocol, and this state transition rule is consistent with the cache consistency state transition rule used in the multi-core processor system, the predicted state can be regarded as the state that a cache conforming to cache consistency should have; comparing the current state of the cache line with the predicted state therefore determines whether the first cache accords with cache consistency, and the verification efficiency of cache consistency of the multi-core processor system is improved.
In some examples, monitoring the first cache of the first processor core for the first request received or sent (S101) may include:
S101a, monitoring a fetch request, a probe request and/or a victim request received or sent by the first cache of the first processor core.
The fetch request (Fetch) may be a request to read, write, or modify the data corresponding to a memory address.
The probe request (Probe) may be a request sent to another processor core when the data in the first cache of the first processor core changes, so that the cache coherency protocol is satisfied for the copy of the data corresponding to that memory address held in the cache of the other processor core.
The victim request (Victim) may be a request that, when the first cache is full, removes the data that has not been accessed for the longest time from the first cache and stores it to the next-level cache.
Referring to FIG. 5, in some examples, the L2 Cache of each processor core contains a backup of all the data in the L1 Cache, i.e., the L2 Cache is inclusive, which simplifies coherence management. For example, in a multi-core processor, when one of the processor cores changes the data at an address in memory (e.g., executes a store instruction), if the data at that address is also held in the private caches of other processor cores (in a multi-core processor, both the L1 Cache and the L2 Cache are typically private), those copies need to be invalidated to prevent the processors from using stale data. With an inclusive Cache, only the lower-level L2 Cache needs to be checked. At the same time, the L2 Cache is also connected to the L3 Cache, so all traffic between the core and the storage system can be observed at the L2 Cache; selecting the L2 Cache for coherence monitoring therefore saves computing resources.
In some examples, the first cache is a second level cache. Monitoring the first request received or sent by the first cache of the first processor core (S101) may include:
Requests between the second-level cache of the first processor core and the instruction fetch unit, the first-level data cache, or the third-level cache are monitored.
Requests between the second-level cache and the instruction fetch unit of the first processor core may be monitored; requests between the second-level cache and the first-level cache of the first processor core may be monitored; and requests between the second-level cache and the third-level cache of the first processor core may be monitored.
The cache line state of the same data may differ between the caches of different processor cores; for example, the cache line may be in the S state in the cache of one processor core while it is in the O state in the cache of another processor core. To handle this situation, in some examples, the preset correspondence between memory addresses and cache line states may include: a plurality of correspondences between preset memory addresses and cache line states.
Each correspondence between a preset memory address and a cache line state corresponds to one processor core, i.e., each processor core has its own correspondence between preset memory addresses and cache line states.
Correspondingly, in some examples, after monitoring the first request received or sent by the first cache of the first processor core, before determining the state of the first cache line from the preset correspondence between the memory address and the state of the cache line according to the memory address, the method may further include:
s106, marking the first request so that the marked first request comprises first processor core information.
The first request may be marked with a letter, a number, or other core identification information, so that the marked first request also includes the first processor core information.
S107, according to the first processor core information included in the marked first request, determining the corresponding relation between the first preset memory address corresponding to the first processor core and the cache line state from the corresponding relation between the plurality of preset memory addresses and the cache line state.
According to the memory address, determining the state of the first cache line from the corresponding relation between the preset memory address and the cache line state (S103) may include:
and determining the state of the first cache line from the corresponding relation between the first preset memory address and the state of the cache line according to the memory address.
To facilitate determining whether the first cache is cache coherent for a request subsequent to the first request in accordance with the steps of the above embodiments, in some examples, the method may further include:
When the current state of the cache line is consistent with the predicted state, the memory address and the current state of the cache line are used to update the preset correspondence between memory addresses and cache line states.
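Putting S103 to S105 together with the update step just described, the following C sketch shows one way the check-and-update cycle could look. It builds on the hypothetical types and functions from the earlier sketches, selects a per-core table through the core identifier added when the request is marked, and is an illustrative assumption rather than the patented RTL.

/* Sketch of the S103-S105 cycle plus the update step (assumed names, not the RTL;
 * relies on request_t, response_t, state_table_t, lookup_state and predict_state
 * from the earlier sketches). */
#include <stdio.h>
#include <stdbool.h>

#define NUM_CORES 4                       /* arbitrary example value */
static state_table_t core_tables[NUM_CORES];  /* one correspondence table per core */

static void update_state(state_table_t *t, uint64_t mem_addr, line_state_t s)
{
    for (int i = 0; i < t->count; i++) {
        if (t->addr[i] == mem_addr) { t->state[i] = s; return; }
    }
    if (t->count < MAX_ENTRIES) {
        t->addr[t->count]  = mem_addr;
        t->state[t->count] = s;
        t->count++;
    }
}

/* Returns true when the monitored cache behaves according to cache coherence. */
bool check_coherence(const request_t *req, const response_t *rsp)
{
    state_table_t *table   = &core_tables[req->core_id];         /* per-core table */
    line_state_t recorded  = lookup_state(table, req->mem_addr); /* S103 */
    line_state_t predicted = predict_state(recorded, req->cmd);  /* S104 */

    if (rsp->current_state == predicted) {                       /* S105 */
        update_state(table, req->mem_addr, rsp->current_state);  /* update on a match */
        return true;
    }
    fprintf(stderr, "coherence mismatch at 0x%llx: predicted %d, observed %d\n",
            (unsigned long long)req->mem_addr, (int)predicted, (int)rsp->current_state);
    return false;
}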
The method may be written in a hardware description language such as Verilog, and the verification speed can be greatly increased by using a hardware accelerator, which also supports verification at a larger scale. Simulation speeds at the kHz to MHz level can be achieved using hardware accelerators such as ZeBu and Veloce.
In actual project development, the code written in the hardware description language is loaded onto a hardware platform; correspondingly, a device that supports checking the Cache consistency of the multi-core processor is obtained, and the Cache consistency check of the CPU RTL is finally completed on the emulator while the running speed is guaranteed.
The following describes the scheme of the present application in detail with reference to a specific embodiment from the perspective of a hardware device.
Referring to fig. 6 and fig. 7, the Cache consistency check device of the present embodiment may include:
Upstream read ports of the Cache consistency model: the signals from the I-Fetch unit and the LSDC unit are listened to. The specific manner is as follows:
the fetch monitor 1 (Fetch Bundle Monitor 1) monitors request and response information between the L2 and LSDC units, such as requests and responses generated when the LSDC needs to read data at a certain address.
The probe monitor 1 (Probe Bundle Monitor 1) monitors probe information between the L2 and the LSDC unit, and in the case of multiple processors, the existence of the private cache may cause multiple copies of the same memory data, that is, the data corresponding to the same memory address may be stored in multiple locations. To ensure data consistency, when a certain CPU core needs to operate on data of a certain address, the state of the address data in other positions (such as the Cache of another CPU core) is obtained through a Probe operation.
A Victim Monitor (Victim Monitor) monitors Victim information between the L2 and the LSDC unit, and mainly refers to information such as cache address, state, data and the like of the L2 written by the L1 through a Victim operation.
The fetch monitor 2 (Fetch Bundle Monitor 2) and the probe monitor 2 (Probe Bundle Monitor 2) monitor the request and response information between L2 and I-Fetch, respectively.
Downstream read port of the cache consistency model: signals from L3 are listened to. The specific manner is as follows:
finger monitor 0 (Fetch Bundle Monitor 0) and probe monitor 0 (Probe Bundle Monitor 0) monitor request and response information between L2 and L3, respectively.
Cache consistency model (Cache Coherence Model)
3.1 The information that L2 receives from L1 and L3 is buffered into the packet queues (Bundle Queues).
3.2 A response corresponding to a request is received; the response contains the current state of the Cacheline in L2 and identification information. The corresponding request is looked up in the packet queue through the identification information, the memory address (Cacheline address) is obtained from the request, and the state of the Cacheline corresponding to that Cacheline address is looked up in the Cache consistency model according to the memory address. The Cache consistency model is similar to a table that records the latest Cacheline states of the private L2 caches of all cores of the multi-core processor; the index of the table is the Cacheline address (memory address), and the indexed value is the state of the Cacheline.
The consistency update handlers comprise a Fetch request/response handler (Fetch Req/Rsp Handler), a Probe request/response handler (Probe Req/Resp Handler) and a Victim handler (Victim Handler). Based on a Cache consistency transition table (diagram) derived from a Cache consistency protocol such as the MOESI protocol, the handler processes the command type issued for the Cacheline of a given memory address (the command type is contained in the request information buffered in the Bundle queues) together with the state of that Cacheline indexed from the Cache consistency model, and thereby predicts the state of the Cacheline;
The predicted state is then compared with the state of the Cacheline in the response to determine whether cache consistency is met.
After the predicted state is compared with the state of the Cacheline in the response, if the two are consistent, the Cacheline address and the Cacheline state are used to update the Cache consistency model.
When the Cache consistency model is looked up, if no entry is found for the address, the Cacheline state is regarded as the Invalid state.
Specifically, in the case of a multi-core processor, whenever L2 receives a request/response from L1 or L3, there may be multiple copies of the target Cacheline in the whole storage system. To ensure the Cache consistency of the multi-core processor during actual processing, every time data in a certain Cacheline of L2 is read or written, the initiation of the corresponding request is monitored, the Cacheline state for the corresponding Cacheline address is found in the large table of the Cache consistency model, and a prediction update is performed. When the processor obtains the response information corresponding to the request, the actual state of the Cacheline after the operation can be obtained from the response information. The Cacheline state obtained from the response is then compared with the previously predicted state, so as to verify whether the actual multi-core processor meets the requirement of Cache consistency during operation.
When an emulator is used for verification, the Cache consistency model is a large table caching the correspondence between Cacheline addresses and Cacheline states, and this step occupies a large amount of storage resources. To save hardware resources, the data can therefore be exported to C through the DPI, and the correspondence between Cacheline addresses and Cacheline states can be maintained in the C environment.
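For the emulator flow described above, the address-to-state table can live on the C side and be reached from the hardware model through the SystemVerilog DPI (the hardware side would declare the functions with import "DPI-C"). The C sketch below is a hypothetical illustration of such a boundary; the function names and the simple array storage are assumptions, not the actual project code.

/* Hypothetical C side of a DPI boundary keeping the Cacheline-address/state table
 * off the emulator to save hardware resources (names are assumptions). */
#include <stdint.h>

#define MODEL_ENTRIES 65536
static uint64_t model_addr[MODEL_ENTRIES];
static int      model_state[MODEL_ENTRIES];  /* 0 = Invalid, per the model's convention */
static int      model_count;

/* Called from the hardware side via DPI: return the recorded state, Invalid on a miss. */
int cc_model_get_state(uint64_t cacheline_addr)
{
    for (int i = 0; i < model_count; i++)
        if (model_addr[i] == cacheline_addr)
            return model_state[i];
    return 0;  /* no entry found: treated as Invalid, as described above */
}

/* Called from the hardware side via DPI: record or update the state after a match. */
void cc_model_set_state(uint64_t cacheline_addr, int state)
{
    for (int i = 0; i < model_count; i++)
        if (model_addr[i] == cacheline_addr) { model_state[i] = state; return; }
    if (model_count < MODEL_ENTRIES) {
        model_addr[model_count]  = cacheline_addr;
        model_state[model_count] = state;
        model_count++;
    }
}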
Referring to FIG. 8, the Cache consistency model uses the state transition table derived from the MOESI protocol and is consistent with the protocol that the actual multi-core processor must follow; the state of the corresponding address is updated on each request of the actual processor, so the predicted state can be regarded as a completely correct reference result.
The embodiment of the application provides a verification device for cache consistency of a multi-core processor system, wherein the multi-core processor system comprises more than two processor cores, each processor core is provided with a first-level cache and a second-level cache, and each processor core is provided with a third-level cache together; the verification device for cache consistency comprises: the first monitoring module is used for monitoring a first request received or sent by a first cache of the first processor core; the first request comprises a memory address and a command type of data to be operated by the first request; a second monitoring module for monitoring first response information for the first request; wherein the first response information includes a current state of a first cache line in the first cache; the first cache line is used for storing the data to be operated by the first request; the first determining module is used for determining the state of the first cache line from the corresponding relation between the preset memory address and the state of the cache line according to the memory address; the prediction module is used for predicting the state of the first cache line based on the state of the first cache line, the command type and a preset state transition rule based on a cache consistency protocol to obtain a predicted state of the first cache line; the preset state transition rule based on the cache consistency protocol is consistent with the cache consistency state transition rule used in the multi-core processor system; and the second determining module is used for comparing the current state of the cache line with the prediction state to determine whether the first cache accords with the cache consistency, so that the verification efficiency of the cache consistency of the multi-core processor system can be improved.
Fig. 9 is a schematic structural diagram of a device for verifying cache coherence of a multi-core processor system according to an embodiment of the present application, where, as shown in fig. 9, the multi-core processor system includes more than two processor cores, each processor core has a first-level cache and a second-level cache, and each processor core has a third-level cache in common; the verification device for cache consistency comprises: a first monitoring module 11, configured to monitor a first request received or sent by a first cache of a first processor core; the first request comprises a memory address and a command type of data to be operated by the first request; a second monitoring module 12 for monitoring first response information for the first request; wherein the first response information includes a current state of a first cache line in the first cache; the first cache line is used for storing the data to be operated by the first request; a first determining module 13, configured to determine, according to the memory address, a state of the first cache line from a corresponding relationship between a preset memory address and a cache line state; a prediction module 14, configured to predict a state of the first cache line based on the state of the first cache line, the command type, and a preset state transition rule based on a cache coherence protocol, to obtain a predicted state of the first cache line; the preset state transition rule based on the cache consistency protocol is consistent with the cache consistency state transition rule used in the multi-core processor system; and a second determining module 15, configured to compare the current state of the cache line with the predicted state, and determine whether the first cache accords with cache consistency.
The device of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 3, and its implementation principle and technical effects are similar, and are not described here again.
With the device of this embodiment, a first request received or sent by a first cache of a first processor core and first response information for the first request are monitored; the state of the first cache line is determined, according to the memory address in the first request, from a preset correspondence between memory addresses and cache line states; the state of the first cache line is then predicted based on the state of the first cache line, the command type in the first request, and a preset state transition rule based on a cache consistency protocol, to obtain a predicted state of the first cache line; finally, the current state of the cache line is compared with the predicted state to determine whether the first cache accords with cache consistency. Because the predicted state of the first cache line is obtained from the state of the first cache line, the command type, and the preset state transition rule based on the cache consistency protocol, and this state transition rule is consistent with the cache consistency state transition rule used in the multi-core processor system, the predicted state can be regarded as the state that a cache conforming to cache consistency should have; comparing the current state of the cache line with the predicted state therefore determines whether the first cache accords with cache consistency, and the verification efficiency of cache consistency of the multi-core processor system is improved.
As an optional implementation manner, the first monitoring module is specifically configured to: monitor fetch requests, probe requests and/or victim requests received or sent by the first cache of the first processor core.
As an optional implementation manner, the first cache is a second-level cache; the first monitoring module is specifically configured to: monitor requests between the second-level cache of the first processor core and the instruction fetch unit, the first-level data cache, or the third-level cache.
As an optional implementation manner, the corresponding relationship between the preset memory address and the cache line state includes: a plurality of preset memory addresses and corresponding relations between the cache line states; the corresponding relation between each preset memory address and the cache line state corresponds to each processor core one by one; the apparatus further comprises: the marking module is used for marking the first request before determining the state of the first cache line from the corresponding relation between the preset memory address and the state of the cache line according to the memory address after the first monitoring module monitors the first request received or sent by the first cache of the first processor core, so that the marked first request comprises first processor core information; a third determining module, configured to determine, according to first processor core information included in the marked first request, a correspondence between a first preset memory address corresponding to the first processor core and a cache line state from correspondence between a plurality of preset memory addresses and cache line states; the first determining module is specifically configured to: and determining the state of the first cache line from the corresponding relation between the first preset memory address and the state of the cache line according to the memory address.
As an alternative embodiment, the apparatus further comprises: and the updating module is used for updating the corresponding relation between the preset memory address and the cache line state information by using the memory address and the current state of the cache line under the condition that the current state of the cache line is consistent with the predicted state.
The device of the above embodiment may be used to implement the technical solution of the above method embodiment, and its implementation principle and technical effects are similar, and are not repeated here.
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 10, the electronic device may include: a shell 61, a processor 62, a memory 63, a circuit board 64 and a power supply circuit 65, wherein the circuit board 64 is arranged in a space surrounded by the shell 61, and the processor 62 and the memory 63 are arranged on the circuit board 64; the power supply circuit 65 supplies power to the respective circuits or devices of the electronic apparatus; the memory 63 stores executable program code; and the processor 62 runs a program corresponding to the executable program code by reading the executable program code stored in the memory 63, so as to perform the cache consistency verification method of any of the multi-core processor systems provided in the foregoing embodiments, so that the corresponding beneficial technical effects can be achieved, which have been described in detail above and will not be repeated here.
Such electronic devices exist in a variety of forms including, but not limited to:
(1) Ultra mobile personal computer device: such devices are in the category of personal computers, having computing and processing functions, and generally also having mobile internet access characteristics. Such terminals include: PDA, MID, and UMPC devices, etc., such as iPad.
(2) And (3) a server: the configuration of the server includes a processor, a hard disk, a memory, a system bus, and the like, and the server is similar to a general computer architecture, but is required to provide highly reliable services, and thus has high requirements in terms of processing capacity, stability, reliability, security, scalability, manageability, and the like.
(3) Other electronic devices with data interaction functions.
Accordingly, embodiments of the present application further provide a computer readable storage medium, where one or more programs are stored, where the one or more programs may be executed by one or more processors, so as to implement the cache consistency verification method of any of the multi-core processor systems provided in the foregoing embodiments, and therefore, the foregoing embodiments have been described in detail, and are not repeated herein.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, each embodiment is described in a related manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.
In particular, for the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments in part.
For convenience of description, the above apparatus is described as being functionally divided into various units/modules, respectively. Of course, the functions of each unit/module may be implemented in one or more pieces of software and/or hardware when implementing the present application.
Those skilled in the art will appreciate that implementing all or part of the above-described methods in accordance with the embodiments may be accomplished by way of a computer program stored on a computer readable storage medium, which when executed may comprise the steps of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), or the like.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions easily conceivable by those skilled in the art within the technical scope of the present application should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. The verification method of cache consistency of the multi-core processor system is characterized in that the multi-core processor system comprises more than two processor cores, each processor core is provided with a first-level cache and a second-level cache, and the processor cores are provided with a third-level cache in common;
The verification method for cache consistency comprises the following steps:
monitoring a first request received or sent by a first cache of a first processor core, wherein the first request comprises a memory address of data to be operated on by the first request and a command type;
monitoring first response information for the first request, wherein the first response information includes a current state of a first cache line in the first cache, and the first cache line is used to store the data to be operated on by the first request;
determining the state of the first cache line from a preset correspondence between memory addresses and cache line states according to the memory address;
predicting the state of the first cache line based on the determined state of the first cache line, the command type, and a preset state transition rule based on a cache consistency protocol, so as to obtain a predicted state of the first cache line; wherein the preset state transition rule based on the cache consistency protocol is consistent with the cache consistency state transition rule used in the multi-core processor system;
and comparing the current state of the first cache line with the predicted state to determine whether the first cache conforms to cache consistency.
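For illustration only, the following Python sketch shows one way the checking flow of claim 1 could be modeled in a verification environment. It is not part of the claimed subject matter: the simplified MESI transition table, the record types, and the function name check_coherency are hypothetical, and a real environment would drive these steps from monitored bus transactions.

```python
# Minimal sketch of the checking flow in claim 1 (hypothetical names, simplified MESI protocol).
from dataclasses import dataclass

# Assumed state transition rule: (recorded state, command type) -> predicted state.
MESI_TRANSITIONS = {
    ("I", "ReadShared"): "S",
    ("I", "ReadExclusive"): "E",
    ("S", "Write"): "M",
    ("E", "Write"): "M",
    ("M", "ProbeShared"): "S",
    ("M", "ProbeInvalidate"): "I",
    ("E", "ProbeInvalidate"): "I",
    ("S", "ProbeInvalidate"): "I",
}

@dataclass
class Request:          # the monitored first request
    addr: int           # memory address of the data to be operated on
    cmd: str            # command type

@dataclass
class Response:         # the monitored first response information
    current_state: str  # current state of the first cache line

def check_coherency(req: Request, rsp: Response, addr_state_map: dict) -> bool:
    """Compare the monitored current state with the state predicted by the transition rule."""
    recorded = addr_state_map.get(req.addr, "I")                     # state looked up by memory address
    predicted = MESI_TRANSITIONS.get((recorded, req.cmd), recorded)  # predicted state of the cache line
    return rsp.current_state == predicted                            # True means the cache conforms
```

Under these assumed transitions, check_coherency(Request(0x1000, "ReadShared"), Response("S"), {}) returns True, while a monitored state of "M" for the same request would flag a consistency violation.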
2. The method of claim 1, wherein the monitoring a first request received or sent by a first cache of a first processor core comprises:
monitoring instruction fetch requests, probe requests, and/or eviction requests received or sent by the first cache of the first processor core.
3. The method of claim 1, wherein the first cache is a second-level cache, and the monitoring a first request received or sent by a first cache of a first processor core comprises:
monitoring requests between the second-level cache of the first processor core and an instruction fetch unit, a first-level data cache, or the third-level cache.
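As a purely illustrative companion to claims 2 and 3, the sketch below models the monitoring scope with hypothetical enum names; the actual interfaces and request kinds depend on the processor implementation, and the PREFETCH kind is added here only to show a request type that the checker would ignore.

```python
# Sketch of the monitoring scope in claims 2 and 3 (hypothetical names).
from enum import Enum, auto

class RequestKind(Enum):
    INSTRUCTION_FETCH = auto()  # instruction fetch requests
    PROBE = auto()              # probe (snoop) requests
    EVICTION = auto()           # eviction requests
    PREFETCH = auto()           # example of a request kind not forwarded to the checker

class Interface(Enum):
    L2_TO_FETCH_UNIT = auto()   # between the second-level cache and the instruction fetch unit
    L2_TO_L1_DATA = auto()      # between the second-level cache and the first-level data cache
    L2_TO_L3 = auto()           # between the second-level cache and the shared third-level cache

def is_monitored(kind: RequestKind) -> bool:
    """Only the request kinds named in claim 2 are forwarded to the checker."""
    return kind in (RequestKind.INSTRUCTION_FETCH, RequestKind.PROBE, RequestKind.EVICTION)
```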
4. The method of claim 1, wherein the preset correspondence between memory addresses and cache line states comprises a plurality of preset correspondences between memory addresses and cache line states, the plurality of preset correspondences being in one-to-one correspondence with the processor cores;
after monitoring the first request received or sent by the first cache of the first processor core and before determining the state of the first cache line from the preset correspondence between memory addresses and cache line states according to the memory address, the method further comprises:
marking the first request so that the marked first request includes first processor core information;
determining, according to the first processor core information included in the marked first request, a first preset correspondence between memory addresses and cache line states corresponding to the first processor core from the plurality of preset correspondences between memory addresses and cache line states;
the determining the state of the first cache line from the preset correspondence between memory addresses and cache line states according to the memory address comprises:
determining the state of the first cache line from the first preset correspondence between memory addresses and cache line states according to the memory address.
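The per-core bookkeeping of claim 4 could, for illustration, be kept as one address-to-state table per processor core, selected by the core information with which the request is marked. The names below (per_core_maps, tag_request, lookup_state) and the core count are assumptions, not part of the claims; requests are represented as plain dicts for brevity.

```python
# Sketch of the per-core correspondence lookup in claim 4 (hypothetical names).
NUM_CORES = 4                                                  # assumed core count for illustration
per_core_maps = {core_id: {} for core_id in range(NUM_CORES)}  # one address -> state table per core

def tag_request(request: dict, core_id: int) -> dict:
    """Mark the monitored request with the information of the processor core that issued it."""
    request["core_id"] = core_id
    return request

def lookup_state(request: dict) -> str:
    """Select the table of the tagged core, then look up the cache line state by memory address."""
    core_map = per_core_maps[request["core_id"]]
    return core_map.get(request["addr"], "I")                  # unseen addresses default to Invalid
```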
5. The method according to claim 1, further comprising:
in a case that the current state of the cache line is consistent with the predicted state, updating the preset correspondence between memory addresses and cache line states by using the memory address and the current state of the cache line.
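Finally, the update step of claim 5 could be sketched as follows; the function name and argument shapes are again hypothetical. The recorded state for a memory address is refreshed only when the monitored current state matches the prediction, so that later checks start from the newly observed state.

```python
# Sketch of the update step in claim 5 (hypothetical names).
def update_on_match(addr_state_map: dict, addr: int, current_state: str, predicted_state: str) -> None:
    """Refresh the preset address -> state correspondence only when the check passes."""
    if current_state == predicted_state:
        addr_state_map[addr] = current_state
```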
6. A verification device for cache consistency of a multi-core processor system, wherein the multi-core processor system comprises two or more processor cores, each processor core is provided with its own first-level cache and second-level cache, and the processor cores share a third-level cache;
The verification device for cache consistency comprises:
a first monitoring module, configured to monitor a first request received or sent by a first cache of a first processor core, wherein the first request comprises a memory address of data to be operated on by the first request and a command type;
a second monitoring module, configured to monitor first response information for the first request, wherein the first response information includes a current state of a first cache line in the first cache, and the first cache line is used to store the data to be operated on by the first request;
a first determining module, configured to determine the state of the first cache line from a preset correspondence between memory addresses and cache line states according to the memory address;
a prediction module, configured to predict the state of the first cache line based on the determined state of the first cache line, the command type, and a preset state transition rule based on a cache consistency protocol, so as to obtain a predicted state of the first cache line, wherein the preset state transition rule based on the cache consistency protocol is consistent with the cache consistency state transition rule used in the multi-core processor system;
and a second determining module, configured to compare the current state of the first cache line with the predicted state to determine whether the first cache conforms to cache consistency.
7. The apparatus of claim 6, wherein the first monitoring module is specifically configured to:
monitor instruction fetch requests, probe requests, and/or eviction requests received or sent by the first cache of the first processor core.
8. The apparatus of claim 6, wherein the first cache is a second-level cache, and the first monitoring module is specifically configured to:
monitor requests between the second-level cache of the first processor core and an instruction fetch unit, a first-level data cache, or the third-level cache.
9. The apparatus of claim 6, wherein the preset correspondence between memory addresses and cache line states comprises a plurality of preset correspondences between memory addresses and cache line states, the plurality of preset correspondences being in one-to-one correspondence with the processor cores;
the apparatus further comprises:
a marking module, configured to mark the first request, after the first monitoring module monitors the first request received or sent by the first cache of the first processor core and before the state of the first cache line is determined from the preset correspondence between memory addresses and cache line states according to the memory address, so that the marked first request includes first processor core information;
a third determining module, configured to determine, according to the first processor core information included in the marked first request, a first preset correspondence between memory addresses and cache line states corresponding to the first processor core from the plurality of preset correspondences between memory addresses and cache line states;
the first determining module is specifically configured to:
determine the state of the first cache line from the first preset correspondence between memory addresses and cache line states according to the memory address.
10. The apparatus of claim 6, wherein the apparatus further comprises:
an updating module, configured to update the preset correspondence between memory addresses and cache line states by using the memory address and the current state of the cache line, in a case that the current state of the cache line is consistent with the predicted state.
11. An electronic device, comprising: a housing, a processor, a memory, a circuit board, and a power supply circuit, wherein the circuit board is disposed inside a space enclosed by the housing, and the processor and the memory are disposed on the circuit board; the power supply circuit is configured to supply power to each circuit or component of the electronic device; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the verification method for cache consistency of a multi-core processor system according to any one of claims 1-5.
12. A computer-readable storage medium storing one or more programs, wherein the one or more programs are executable by one or more processors to implement the verification method for cache consistency of a multi-core processor system according to any one of claims 1-5.
CN202210467692.5A 2022-04-29 2022-04-29 Verification method and device for cache consistency of multi-core processor system Pending CN116049034A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210467692.5A CN116049034A (en) 2022-04-29 2022-04-29 Verification method and device for cache consistency of multi-core processor system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210467692.5A CN116049034A (en) 2022-04-29 2022-04-29 Verification method and device for cache consistency of multi-core processor system

Publications (1)

Publication Number Publication Date
CN116049034A true CN116049034A (en) 2023-05-02

Family

ID=86124362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210467692.5A Pending CN116049034A (en) 2022-04-29 2022-04-29 Verification method and device for cache consistency of multi-core processor system

Country Status (1)

Country Link
CN (1) CN116049034A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116775509A (en) * 2023-05-31 2023-09-19 合芯科技有限公司 Cache consistency verification method, device, equipment and storage medium
CN116775509B (en) * 2023-05-31 2024-03-22 合芯科技有限公司 Cache consistency verification method, device, equipment and storage medium
CN117687928A (en) * 2024-01-29 2024-03-12 中电科申泰信息科技有限公司 Multiprocessor core cache consistency verification module and method based on UVM
CN117687928B (en) * 2024-01-29 2024-04-19 中电科申泰信息科技有限公司 Multiprocessor core cache consistency verification module and method based on UVM

Similar Documents

Publication Publication Date Title
US8209499B2 (en) Method of read-set and write-set management by distinguishing between shared and non-shared memory regions
US10394714B2 (en) System and method for false sharing prediction
US6918012B2 (en) Streamlined cache coherency protocol system and method for a multiple processor single chip device
US8140828B2 (en) Handling transaction buffer overflow in multiprocessor by re-executing after waiting for peer processors to complete pending transactions and bypassing the buffer
KR100933820B1 (en) Techniques for Using Memory Properties
CN116049034A (en) Verification method and device for cache consistency of multi-core processor system
US20080086599A1 (en) Method to retain critical data in a cache in order to increase application performance
US8296518B2 (en) Arithmetic processing apparatus and method
CN108268385B (en) Optimized caching agent with integrated directory cache
JP2011013858A (en) Processor and address translating method
CN115130402B (en) Cache verification method, system, electronic equipment and readable storage medium
JP2021524966A (en) How to verify access to Level 2 cache for multi-core interconnects
JP5625809B2 (en) Arithmetic processing apparatus, information processing apparatus and control method
US7058767B2 (en) Adaptive memory access speculation
US6965972B2 (en) Real time emulation of coherence directories using global sparse directories
CN116167310A (en) Method and device for verifying cache consistency of multi-core processor
CN115269199A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN115372791A (en) Test method and device of integrated circuit based on hardware simulation and electronic equipment
JP2010128698A (en) Multiprocessor system
CN115202738A (en) Verification method and system of multi-core system under write-through strategy
CN112380013B (en) Cache preloading method and device, processor chip and server
US7290085B2 (en) Method and system for flexible and efficient protocol table implementation
US20230112575A1 (en) Accelerator for concurrent insert and lookup operations in cuckoo hashing
US20230099256A1 (en) Storing an indication of a specific data pattern in spare directory entries
US20230280904A1 (en) Monitoring memory locations to identify whether data stored at the memory locations has been modified

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination