CN116049035A - Verification and debugging realization method for cache consistency - Google Patents


Publication number
CN116049035A
Authority
CN
China
Prior art keywords
cache
target
core
nct
nest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211683813.6A
Other languages
Chinese (zh)
Other versions
CN116049035B (en)
Inventor
徐继辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hexin Technology Suzhou Co ltd
Hexin Technology Co ltd
Original Assignee
Hexin Technology Suzhou Co ltd
Hexin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hexin Technology Suzhou Co ltd, Hexin Technology Co ltd filed Critical Hexin Technology Suzhou Co ltd
Priority to CN202211683813.6A priority Critical patent/CN116049035B/en
Publication of CN116049035A publication Critical patent/CN116049035A/en
Application granted granted Critical
Publication of CN116049035B publication Critical patent/CN116049035B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815 Cache consistency protocols
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2247 Verification or detection of system hardware configuration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2273 Test methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504 Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a method for implementing verification and debugging of cache consistency, comprising the following steps: first, Core information and Nest information of a target POWER processor are acquired, an NCT target module is constructed, and the NCT target module is configured and driven through a target Tcl interface; then, the states of the caches at all levels corresponding to the Core information are configured through the configured NCT target module, and, once the Nest in the target POWER processor has been driven, the state transitions of the caches whose states were configured are verified and debugged. By adding a new NCT target module, the method bypasses the Core and drives the storage system directly, which makes cache consistency verification and debugging easier to implement. Arbitrary cache consistency test scenarios can be configured through Tcl scripts and simple Tcl commands, which shortens the verification and debugging cycle and supports automatic debugging and batch automatic debugging.

Description

Verification and debugging realization method for cache consistency
Technical Field
The application relates to the field of information technology, in particular to a method for verifying and debugging cache consistency.
Background
Cache consistency, also known as cache coherence, refers to the mechanism that guarantees that the data in a cache is identical to the data in main memory in a computer system that uses a hierarchical storage system.
When many different devices in a system share a common memory resource, the data in their caches can become inconsistent. This problem is particularly likely to occur in multiprocessor systems with several CPUs, so guaranteeing cache consistency is important. When a cache consistency protocol is designed, the prior art usually verifies it in one of two ways. In the first, an existing physical processor is used: a suitable test set is selected to trigger the state configuration of the CPU cache, the test is run on multiple cores and multiple chips, and the changes in the cache state are analyzed by means of Performance Monitor Unit (PMU) events. In the second, a simulator is used to model the storage system and buses of the entire processor; a suitable test set must again be selected, and the changes in the cache state are collected from the statistics produced by the simulation.
In both schemes, however, it is difficult to select a test set that covers every case of the cache consistency protocol. When the design of the physical machine is complex and a request must pass through many paths, the test set becomes harder to design and multi-core, multi-chip scenarios become harder to construct. Existing methods for debugging and verifying cache consistency are therefore inefficient and difficult to implement.
Disclosure of Invention
The application provides a method for implementing verification and debugging of cache consistency. Any scenario for testing cache consistency can be configured through simple Tcl commands, which greatly shortens the verification and debugging cycle and reduces the complexity of consistency debugging. The technical scheme is as follows.
In one aspect, a method for implementing verification and debugging of cache consistency is provided, where the method includes:
obtaining Core information and Nest information of a target POWER processor;
constructing an NCT target module; the NCT target module is used for replacing the Core in the target POWER processor in providing the data required by the Nest in the target POWER processor, so as to drive the operation of the Nest;
configuring and driving the NCT target module through a target Tcl interface;
and configuring, through the configured NCT target module, the states of the caches at all levels corresponding to the Core information, and, after the Nest in the target POWER processor has been driven, verifying and debugging the state transitions of the caches whose states were configured.
In yet another aspect, an implementation apparatus for verifying and debugging cache consistency is provided, where the apparatus includes:
the Core and Nest acquisition module is used for acquiring Core information and Nest information of the target POWER processor;
the NCT target module construction module is used for constructing an NCT target module; the NCT target module is used for replacing the Core in the target POWER processor in providing the data required by the Nest in the target POWER processor, so as to drive the operation of the Nest;
the NCT target module configuration module is used for configuring and driving the NCT target module through a target Tcl interface;
the verification and debugging module is used for configuring, through the configured NCT target module, the states of the caches at all levels corresponding to the Core information, and, after the Nest in the target POWER processor has been driven, verifying and debugging the state transitions of the caches whose states were configured.
In one possible implementation, the NCT target module configuration module is further to:
configuring and driving the NCT target module through a group of Tcl expansion commands, and implementing a processing function for each of the Tcl expansion commands; the Tcl expansion commands include a command for setting cache consistency, a state configuration command for a specified address, and a command for setting the request address to a fixed address.
In one possible implementation, the command for setting cache consistency includes settings for the number of the Cache, the Cache state, whether the Core runs, the run priority of the Core, the load or store operation on the Cache, the number of runs, and the number of delay cycles.
In one possible implementation, the state configuration command for a specified address includes settings for the number of the Cache, the specified address, the Cache state, and the run priority of the Core.
In a possible implementation manner, the verification and debug module is further configured to:
acquiring and storing, through a target linked list, the setting requests for the caches at all levels based on the Tcl expansion commands;
issuing the setting requests to the corresponding Cores, and completing the state configuration of the Cache in each Core according to the setting requests; driving the Nest, the buses, and the MC memory controller in the target POWER processor;
and verifying, according to the operations run in each Core, the state transitions of the Caches whose state configuration has been completed, and reclaiming from the target linked list the configuration records whose configuration has been completed.
In a possible implementation manner, the verification and debug module is further configured to:
constructing a target linked list;
receiving, through the target linked list, the cache consistency commands corresponding to the setting requests for the caches at all levels;
and sorting the received cache consistency commands by Cache number and, for equal numbers, by the order in which they were added.
In a possible implementation manner, the verification and debug module is further configured to:
traversing the cache consistency command in the target linked list according to a target condition, and issuing the setting request to a corresponding Core;
and when the corresponding Core is scheduled, performing state configuration on the Cache in the corresponding Core.
In yet another aspect, a computer device is provided, the computer device comprising a processor and a memory having stored therein at least one instruction that is loaded and executed by the processor to implement a cache coherence verification and debug implementation method as described above.
In yet another aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, the at least one instruction being loaded and executed by a processor to implement the method for implementing verification and debugging of cache consistency as described above.
The technical scheme provided by this application can have the following beneficial effects:
First, Core information and Nest information of a target POWER processor are acquired, an NCT target module is constructed, and the NCT target module is configured and driven through a target Tcl interface; then, the states of the caches at all levels corresponding to the Core information are configured through the configured NCT target module, and, once the Nest in the target POWER processor has been driven, the state transitions of the caches whose states were configured are verified and debugged. In this scheme, the NCT target module replaces the Core in the target POWER processor in providing the data required by the Nest, so as to drive the operation of the Nest, which makes cache consistency verification and debugging easier to implement. Any scenario for testing cache consistency can be configured through simple Tcl commands, which greatly shortens the verification and debugging cycle and reduces the complexity of consistency debugging. In addition, because configuration is done through Tcl scripts, automatic debugging and batch automatic debugging are supported, which further shortens verification time.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a method flow diagram illustrating a method of implementing verification and debugging of cache coherency according to an exemplary embodiment.
FIG. 2 is a flow chart illustrating a state configuration of a Cache, according to an example embodiment.
FIG. 3 is a method flow diagram illustrating a method of implementing verification and debugging of cache coherency according to an exemplary embodiment.
Fig. 4 is a diagram illustrating the overall structure of Tcl according to an example embodiment.
Fig. 5 is a schematic diagram illustrating the Tcl command parsing process according to an example embodiment.
FIG. 6 is a schematic diagram illustrating the structure of a linked list according to an example embodiment.
Fig. 7 is a schematic diagram illustrating the overall operational flow after the NCT and Tcl interfaces are added, according to an example embodiment.
FIG. 8 is a flowchart illustrating interaction of a Cache with a Bus, according to an example embodiment.
Fig. 9 is a block diagram illustrating a structure of an implementation apparatus for verifying and debugging cache coherence according to an exemplary embodiment.
Fig. 10 shows a block diagram of a computer device according to an exemplary embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and fully below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the application. All other embodiments obtained by one of ordinary skill in the art from the present disclosure without inventive effort fall within the scope of the present disclosure.
It should be understood that, in the description of the embodiments of the present application, the term "corresponding" may indicate a direct or an indirect correspondence between two items, an association between them, or a relationship such as indicating and being indicated or configuring and being configured.
FIG. 1 is a method flow diagram illustrating a method of implementing verification and debugging of cache coherency according to an exemplary embodiment. As shown in fig. 1, the method may include the steps of:
S101, Core information and Nest information of the target POWER processor are acquired.
In one possible implementation, the drawbacks of existing cache consistency verification and debugging need to be addressed from two aspects: 1) Shortening the data execution path. A cache consistency protocol concerns only the storage system, so it should involve only the modules of that part, which is much simpler. 2) Providing an interface whose input data is simple yet able to cover every case of the cache consistency protocol, so that the state of data in the Cache can be modified and set directly, preferably triggered by configuration commands or files.
Further, Core: a chip may contain a plurality of Cores, and a Core may contain a plurality of threads (processors).
Further, the POWER architecture divides the implementation of a processor into two major parts, core and Nest (including memory, bus, peripherals, etc.).
S102, constructing an NCT target module; the NCT target module is used for replacing the Core in the target POWER processor to provide required data for Nest in the target POWER processor so as to drive the operation of the Nest.
Further, this embodiment solves the above problems in a simulator (denoted M), which is more economical and convenient than a physical processor. 1) For the first aspect, shortening the data execution path, the storage system needs to be driven on its own. The current POWER architecture divides the implementation of a POWER processor into two parts, Core and Nest (including storage, bus, peripherals, etc.), so the two parts can also run separately in the simulator implementation, and the Nest can run on its own without the Core being started. The data required by the Nest can therefore be provided by an NCT target module (here named NCT Model) in place of the Core, so as to drive the operation of the entire Nest. 2) For the second aspect, an interface is provided whose input data is simple and which can cover all cache consistency protocol cases. On top of the existing Nest driving framework, a set of interfaces for debugging consistency is added, and a Tcl script is used to configure, as required, the running condition of each Core (here the Core is simulated by the NCT Model and is not actually started) and the data and state in the Cache.
Furthermore, the simulator is used for simulating a complete CPU and all modules of the whole system, and can simulate the execution process of instructions in the CPU, and is mainly used for designing the CPU and performing performance analysis on the CPU, such as open-source gem5.
Furthermore, a new NCT target module is added to skip the Core and directly drive the storage system, so that the realization difficulty of verifying and debugging the cache consistency is simplified.
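The idea of letting a stand-in module drive the Nest in place of the Core can be sketched as follows. This is a minimal illustration, not the simulator's actual code: the classes `Request`, `Nest`, and `NctModel` and their methods are hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    core: int   # simulated Core on whose behalf the request is issued
    op: str     # "LOAD" or "STORE"
    addr: int   # physical address of the access


@dataclass
class Nest:
    """Hypothetical stand-in for the Nest (caches, buses, memory controller)."""
    log: list = field(default_factory=list)

    def handle(self, req: Request) -> None:
        # A real Nest model would walk L2/L3, the bus and the MC here;
        # this sketch only records what it was asked to do.
        self.log.append((req.core, req.op, req.addr))


class NctModel:
    """Hypothetical NCT module: issues the requests a Core would have issued."""

    def __init__(self, nest: Nest):
        self.nest = nest

    def drive(self, core: int, op: str, addr: int, times: int = 1) -> None:
        for _ in range(times):
            self.nest.handle(Request(core, op, addr))


nest = Nest()
NctModel(nest).drive(core=0, op="STORE", addr=0x100000, times=2)
print(nest.log)
```

No Core object exists anywhere in the sketch; the Nest only ever sees requests, which is why a separate module can supply them.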
S103, configuring and driving the NCT target module through a target Tcl interface.
Further, Tcl ("Tool Command Language") is a simple, embeddable, interpreted programming language that also provides an interface for user-defined expansion commands.
Further, the implementation of this embodiment is divided into two parts: 1) the implementation of the Tcl interface; 2) the implementation of the NCT Model (NCT target module) and the expansion of the consistency interface of each level of Cache.
Further, referring to the flowchart of the state configuration of the Cache shown in fig. 2, the Tcl interface is first used to configure and drive the NCT Model (NCT target module). The NCT Model then completes two pieces of work: it configures the state of each level of Cache (L2 Cache and L3 Cache), and it drives the whole Nest, which comprises the Caches at each level, the buses, and the MC memory controller.
S104, configuring the states of all levels of caches corresponding to the Core information through the configured NCT target module, and verifying and debugging the state conversion of the caches with the configured states after the Nest in the target POWER processor drives.
Furthermore, configuring the states of the caches at all levels corresponding to the Core information through the configured NCT target module requires driving the Nest, the buses, and the MC memory controller in the target POWER processor. After the Nest in the target POWER processor has been driven, the state transitions of the Caches whose states were configured are verified and debugged; that is, the changes in Cache state produced once the driven system runs are the protocol behavior that needs to be verified and debugged. The buses comprise the buses that connect the different Cores and memories inside a chip and the buses through which different chips are interconnected and interact.
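To make "verifying a state transition" concrete, the sketch below compares observed Cache states against an expected-transition table. It assumes a simplified textbook MSI protocol, which is not the POWER protocol (the examples in this document also use a T state); the table and both helper functions are illustrative assumptions.

```python
# Simplified textbook MSI transitions (an assumption for illustration only).
# Key: (state before, operation, "local" or "remote") -> state after.
EXPECTED = {
    ("I", "LOAD", "local"): "S",
    ("S", "LOAD", "local"): "S",
    ("M", "LOAD", "local"): "M",
    ("I", "STORE", "local"): "M",
    ("S", "STORE", "local"): "M",
    ("M", "STORE", "local"): "M",
    ("I", "LOAD", "remote"): "I",
    ("S", "LOAD", "remote"): "S",
    ("M", "LOAD", "remote"): "S",
    ("I", "STORE", "remote"): "I",
    ("S", "STORE", "remote"): "I",
    ("M", "STORE", "remote"): "I",
}


def expected_states(states, actor, op):
    """Return the expected state of every cache after `actor` performs `op`."""
    return {
        cache: EXPECTED[(state, op, "local" if cache == actor else "remote")]
        for cache, state in states.items()
    }


def mismatches(states, actor, op, observed):
    """List the caches whose observed state differs from the expected one."""
    want = expected_states(states, actor, op)
    return sorted(c for c in want if observed.get(c) != want[c])


before = {0: "S", 2: "S"}  # two caches sharing one line
after = expected_states(before, actor=0, op="STORE")
bad = mismatches(before, 0, "STORE", {0: "M", 2: "S"})
print(after, bad)
```

A verification run would preset `before` through the configuration commands, let the Nest process the request, and then report any cache listed by `mismatches`.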
In summary, Core information and Nest information of the target POWER processor are acquired first, an NCT target module is constructed, and the NCT target module is configured and driven through a target Tcl interface; then, the states of the caches at all levels corresponding to the Core information are configured through the configured NCT target module, and, once the Nest in the target POWER processor has been driven, the state transitions of the caches whose states were configured are verified and debugged. In this scheme, the NCT target module replaces the Core in the target POWER processor in providing the data required by the Nest, so as to drive the operation of the Nest, which makes cache consistency verification and debugging easier to implement. Any scenario for testing cache consistency can be configured through simple Tcl commands, which greatly shortens the verification and debugging cycle and reduces the complexity of consistency debugging. In addition, because configuration is done through Tcl scripts, automatic debugging and batch automatic debugging are supported, which further shortens verification time.
FIG. 3 is a method flow diagram illustrating a method of implementing verification and debugging of cache coherency according to an exemplary embodiment. As shown in fig. 3, the method may include the steps of:
S301, Core information and Nest information of the target POWER processor are acquired.
S302, constructing an NCT target module; the NCT target module is used for replacing the Core in the target POWER processor to provide required data for Nest in the target POWER processor so as to drive the operation of the Nest.
S303, configuring and driving the NCT target module through a group of Tcl expansion commands, and realizing a processing function of each command in the Tcl expansion commands; the Tcl expansion command includes a command to set cache coherency, a state configuration command to specify an address, and a command to set a request address to a fixed address.
In one possible implementation, for the implementation of the Tcl interface, refer to the overall structure diagram of Tcl shown in fig. 4 and the structure diagram of the Tcl command parsing process shown in fig. 5. With Tcl, an application consists of a large body of compiled code and a small amount of Tcl code used for configuration and for writing high-level commands. Programming follows a component-based approach, in which different components provide different functions for different purposes. Tcl is highly extensible, which makes it convenient for users to add new functional modules to it.
Further, the basic principle of Tcl is as follows. The Tcl language is driven by commands. Tcl has its own built-in commands and also provides an interface for user-defined expansion commands. To use Tcl, an application must first create an interpreter (interp), which consists of a set of commands, a set of variable bindings, and a command execution state. When a command is added, its name and a pointer to the corresponding processing function are registered together in the interpreter. When the command needs to be executed, it is simply passed to the interpreter, which automatically calls the corresponding processing function.
Therefore, based on the above principle, the present embodiment implements setting for the Nest and the Cache through a set of Tcl expansion commands, and implements a processing function (callback function interface callback) of each command.
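The register-then-dispatch principle described above can be illustrated with a toy interpreter. This is a sketch of the principle only and does not use the real Tcl C API; `MiniInterp`, `create_command`, and `nct_handler` are hypothetical names.

```python
class MiniInterp:
    """Toy interpreter: a table that maps command names to handler functions."""

    def __init__(self):
        self.commands = {}

    def create_command(self, name, handler):
        # Register the command name together with its processing function.
        self.commands[name] = handler

    def eval(self, line):
        # Split the line into words and dispatch on the first word.
        name, *args = line.split()
        if name not in self.commands:
            raise KeyError(f'invalid command name "{name}"')
        return self.commands[name](*args)


def nct_handler(subcommand, *args):
    # Hypothetical handler for the NCT command family; it only echoes its input.
    return subcommand, list(args)


interp = MiniInterp()
interp.create_command("NCT", nct_handler)
result = interp.eval("NCT fixed_addr 0x100000")
print(result)
```

The application never calls `nct_handler` directly; it hands the command text to the interpreter, which looks up and calls the registered function.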
Further, the Tcl expansion command mainly implements three commands: a command to set cache coherency (i.e., a command to set coherence), a state configuration command to specify an address, and a command to set a request address to a fixed address.
In the first aspect, for a cache coherence command (i.e., a command to set a coherence), the coherence command is defined as NCT coh_set, and contains 7 parameters in total, as shown in table 1:
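TABLE 1
(Table 1 appears only as an image, Figure BDA0004020022380000091, in the original publication; its parameters are described in the paragraph that follows.)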
As can be seen from table 1, the command for setting cache consistency includes settings for the number of the Cache, the Cache state (i.e., Cache state in table 1), whether the Core runs, the run priority of the Core, the load or store operation on the Cache (i.e., Cache op in table 1), the number of runs, and the number of delay cycles (i.e., Delay cycles in table 1). In table 1, I is an abbreviation of Invalid and indicates that the data is invalid; S is an abbreviation of Shared and indicates that the data is shared with other Caches; M is an abbreviation of Modified and indicates that the data in the Cache has been modified and has not yet been written to other storage. Cache op is either LOAD or STORE; LOAD means reading data from the Cache and STORE means writing data to the Cache.
Illustratively, the cache coherency command is further illustrated by the following example:
NCT coh_set 0 T on 0 STORE 1 5
sets the state of the Core 0 L2 to T and runs Core 0 with priority 0, performing one STORE operation that starts after a delay of 5 cycles;
NCT coh_set 1 S off
sets the state of the Core 0 L3 to S, without initiating a request;
NCT coh_set 4 S off
sets the state of the Core 2 L2 to S, without initiating a request;
NCT coh_set 2 S on 0 LOAD 1 0
sets the state of the Core 1 L2 to S and runs Core 1 with priority 0, performing one LOAD operation.
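A processing function for this command could parse its arguments along the following lines. This is a sketch under assumptions: it assumes the arguments are separated by whitespace, as in NCT coh_set 0 T on 0 STORE 1 5; the `CohSet` record and both functions are hypothetical; and the mapping from Cache number to Core and level is only inferred from the four examples (0 is Core 0 L2, 1 is Core 0 L3, 2 is Core 1 L2, 4 is Core 2 L2).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CohSet:
    cache_no: int
    state: str
    core_on: bool
    priority: Optional[int] = None
    op: Optional[str] = None
    times: Optional[int] = None
    delay: Optional[int] = None


def parse_coh_set(line: str) -> CohSet:
    tokens = line.split()
    if tokens[:2] != ["NCT", "coh_set"] or len(tokens) < 5:
        raise ValueError("not a complete NCT coh_set command")
    cache_no, state, run = int(tokens[2]), tokens[3], tokens[4]
    if run == "off":
        # The Core does not run, so the remaining parameters are absent.
        return CohSet(cache_no, state, False)
    if run != "on" or len(tokens) != 9:
        raise ValueError("expected: on <priority> <LOAD|STORE> <times> <delay>")
    priority, op, times, delay = tokens[5:9]
    if op not in ("LOAD", "STORE"):
        raise ValueError(f"unknown Cache op {op!r}")
    return CohSet(cache_no, state, True, int(priority), op, int(times), int(delay))


def cache_owner(cache_no: int):
    # Inferred numbering: two caches per Core, L2 first and then L3.
    core, level = divmod(cache_no, 2)
    return core, "L2" if level == 0 else "L3"


first = parse_coh_set("NCT coh_set 0 T on 0 STORE 1 5")
print(first, cache_owner(first.cache_no))
```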
Further, the group of NCT coh_set commands issued for different Caches all point to the same address, so as to trigger the consistency processing flow. If no address is specified, a random address is generated by default, for example 0x100000.
In a second aspect, for a state configuration command specifying an address, it is defined as NCT addr_set, containing a total of 4 parameters, as shown in table 2:
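TABLE 2
(Table 2 appears only as an image, Figure BDA0004020022380000101, in the original publication; its parameters are described in the paragraph that follows.)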
In one possible implementation, the state configuration command for a specified address includes settings for the number of the Cache, the specified address, the Cache state (i.e., Cache state in table 2), and the run priority of the Core. This command provides only the function of setting the state of a specified address. Because some instructions have side effects, that is, they may change several addresses at the same time, a function for setting the state of a specified address individually is needed. The state configuration command for a specified address is illustrated by the following example:
NCT addr_set 0 0x4567898 T 1
sets the state of the line corresponding to the specified address 0x4567898 in the Core 0 L2 to T, with priority 1.
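The per-address nature of this command can be sketched as an update to a table keyed by Cache number and address. The function `apply_addr_set` and the `line_states` dictionary are hypothetical, and the arguments are assumed to be separated by whitespace.

```python
def apply_addr_set(line, line_states):
    """Parse one NCT addr_set command and record the state it requests.

    line_states maps (cache number, address) -> state, so a setting affects
    exactly one line in exactly one cache.
    """
    tokens = line.split()
    if tokens[:2] != ["NCT", "addr_set"] or len(tokens) != 6:
        raise ValueError("expected: NCT addr_set <cache> <addr> <state> <priority>")
    cache_no = int(tokens[2])
    addr = int(tokens[3], 16)   # base 16 accepts an optional 0x prefix
    state = tokens[4]
    priority = int(tokens[5])
    line_states[(cache_no, addr)] = state
    return cache_no, addr, state, priority


line_states = {}
parsed = apply_addr_set("NCT addr_set 0 0x4567898 T 1", line_states)
print(parsed, line_states)
```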
In a third aspect, the command that sets the request address to a fixed address is defined as NCT fixed_addr, for example NCT fixed_addr 0x100000. If no address is specified, every access uses a randomly generated address. Because the first command (the cache consistency command described above) only sets the state of a Cache and does not specify an address, this command is used to assist it: once a fixed address has been set, that address is used as the address whose state the cache consistency commands configure. The state configuration command for a specified address, by contrast, sets the state of one address in one Cache individually and is not global, whereas the command for setting cache consistency and the command for setting the request address to a fixed address are global.
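The choice between a fixed and a random request address can be sketched as a small address source shared by all requests. The class `AddressSource`, the 32-bit address range, and the 128-byte alignment are assumptions made for the sketch.

```python
import random


class AddressSource:
    """Hypothetical global source of request addresses."""

    LINE_BYTES = 128  # assumed cache-line size, used only to align addresses

    def __init__(self, seed=None):
        self.fixed = None
        self.rng = random.Random(seed)

    def set_fixed(self, addr):
        # Corresponds to a command such as: NCT fixed_addr 0x100000
        self.fixed = addr

    def next(self):
        if self.fixed is not None:
            return self.fixed
        # No fixed address: generate a random line-aligned address.
        return self.rng.randrange(0, 1 << 32) & ~(self.LINE_BYTES - 1)


source = AddressSource(seed=1)
random_addr = source.next()
source.set_fixed(0x100000)
fixed_addrs = [source.next(), source.next()]
print(hex(random_addr), [hex(a) for a in fixed_addrs])
```

Because the source is shared, every coherence setting issued after `set_fixed` targets the same line, which is what lets settings for different Caches interact.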
Any scenario for testing consistency can be configured through simple Tcl commands, which greatly shortens the verification and debugging cycle, and the simple interface reduces the complexity of consistency debugging. By constructing a group of commands, the whole flow of cache consistency verification and debugging is driven and configured through scripts.
S304, acquiring and storing setting requests of caches at all levels through a target linked list based on the Tcl expansion command.
In one possible implementation, a target linked list is constructed;
receiving, through the target linked list, the cache consistency commands corresponding to the setting requests for the caches at all levels;
and sorting the received cache consistency commands by Cache number and, for equal numbers, by the order in which they were added.
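The ordering rule (by Cache number, then by the order of addition) can be kept by inserting each new node after every node whose number is less than or equal to its own. The sketch below is a minimal singly linked list; the class name borrows cohDebugRecorder from the description, but the implementation is an assumption.

```python
class Node:
    def __init__(self, cache_no, command):
        self.cache_no = cache_no
        self.command = command
        self.next = None


class CohDebugRecorder:
    """Singly linked list ordered by cache number, then by insertion order."""

    def __init__(self):
        self.head = None

    def add(self, cache_no, command):
        node = Node(cache_no, command)
        prev, cur = None, self.head
        # Skip nodes with a smaller or equal number, so that commands for the
        # same cache stay in the order in which they were added.
        while cur is not None and cur.cache_no <= cache_no:
            prev, cur = cur, cur.next
        node.next = cur
        if prev is None:
            self.head = node
        else:
            prev.next = node

    def __iter__(self):
        cur = self.head
        while cur is not None:
            yield cur.cache_no, cur.command
            cur = cur.next


recorder = CohDebugRecorder()
for number, text in [(2, "a"), (0, "b"), (2, "c"), (0, "d")]:
    recorder.add(number, text)
print(list(recorder))
```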
Furthermore, the simulator is software that simulates the whole processor and is composed of a plurality of modules. A newly added module must be registered with the software before it can be called, and this embodiment adds a new NCT target module. First, the NCT target module registers its own initialization function so that its initialization can be completed. The simulator M provides a registration mechanism for newly added modules: each module has a callback_register_xxx function, which may contain the module's own processing functions and which calls callback_register_insert to insert them into the corresponding callback chains. A callback chain is the function registration chain of the newly added modules. The advantage is that a unified interface, typically driven through do_callback, calls back all of the processing functions registered in a chain. Here, registration means that a module exposes some of its interfaces to the system and waits for the system to call them when they are needed. This mechanism is also called a callback, which is why the interfaces are named callback_register_xxx, that is, the callback registration of xxx. In programming, a callback is a function that is passed to another function to be called by it. The function registered through callback_register_xxx is placed in a callback chain of the system, and when the system runs, the previously registered function is called through DO_callback at a certain trigger time; this is the callback mechanism.
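The registration mechanism can be sketched as named chains of functions that a single entry point invokes. The names callback_register_insert and do_callback come from the description, but their signatures, the chain name "init", and `nct_model_init` are assumptions made for illustration.

```python
# Each chain name maps to the list of functions registered on that chain.
callback_chains = {}


def callback_register_insert(chain, func):
    # Append the function to the named chain, creating the chain if needed.
    callback_chains.setdefault(chain, []).append(func)


def do_callback(chain, *args):
    # Call every function registered on the chain, in registration order.
    return [func(*args) for func in callback_chains.get(chain, [])]


def nct_model_init(config):
    # Hypothetical initialisation function of the NCT module.
    return f"NCT Model initialised with {config['cores']} cores"


callback_register_insert("init", nct_model_init)
results = do_callback("init", {"cores": 2})
print(results)
```

The simulator core only knows the chain name; it needs no direct reference to the NCT module, which is what allows new modules to be added without changing the caller.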
Further, the required configuration is obtained through the Tcl interface implemented above, in order to set up the system and the consistency of the Nest part. The important function is NCTModel_coh_Core_init, which completes the setting of the Cores: according to the configuration of chips and Cores in the current machine, it adds the agents corresponding to each Core. Each agent corresponds to one Cache or mc, and the Cores are used to set all of the current Caches and mc. Here, the machine refers to a computer system, which can be understood as a PC; a chip is a processor, that is, a CPU in the general sense; a Core is a core inside the CPU; and an agent represents, within the NCT target module, a module that may be affected, which means that every Cache and memory needs to be described in the NCT target module. mc is the memory controller, the interface inside the CPU that controls memory; coh_debug turns on the coherence debug function.
Further, referring to the schematic structure of the linked list shown in fig. 6, a linked list (cohDebugRecorder) receives all of the coherence settings. The settings are ordered by Cache number, and settings with the same number are ordered by the order in which they were added. Each number in the cohDebugRecorder linked list represents a setting for a particular Cache, for example NCT coh_set 0 T on 0 STORE 1 5. Because the same Cache can be set in several ways, it may have multiple nodes. The linked list stores the Tcl commands at the initial stage, when they are loaded into the system; the commands are later distributed to the individual Cores, and the state of the Cache in the system is set when the system schedules the corresponding Core. NCT_Model checks whether the coh_debug mode is turned on; if it is, the received cohDebugRecorder is traversed, the Cores are started according to the configuration, and each started Core is assigned the records that belong to it. Records corresponding to Cores that are not started remain in the linked list, and their configuration is completed uniformly by Core 0 (so Core 0 must be selected first whenever any Core is active).
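The ordering rule for the linked list (sorted by Cache number, with equal numbers kept in the order they were added) can be illustrated with a short sketch. It is written in Python for brevity and is not the embodiment's code: the class and field names, the cache_to_core mapping, and the simplification that records of non-started Cores are handed directly to Core 0 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class CohRecord:
    cache_id: int                       # number of the Cache being configured
    command: str                        # the Tcl command text, stored verbatim
    next: Optional["CohRecord"] = None  # next node in the list


class CohDebugRecorder:
    """Singly linked list sorted by cache_id; equal ids keep insertion order."""

    def __init__(self) -> None:
        self.head: Optional[CohRecord] = None

    def add(self, cache_id: int, command: str) -> None:
        node = CohRecord(cache_id, command)
        # Insert before the first node with a strictly larger cache_id, so a
        # new record lands after existing records with the same number.
        if self.head is None or self.head.cache_id > cache_id:
            node.next, self.head = self.head, node
            return
        cur = self.head
        while cur.next is not None and cur.next.cache_id <= cache_id:
            cur = cur.next
        node.next, cur.next = cur.next, node

    def records(self) -> List[CohRecord]:
        out, cur = [], self.head
        while cur is not None:
            out.append(cur)
            cur = cur.next
        return out


def distribute(recorder: CohDebugRecorder,
               cache_to_core: Dict[int, int],
               started_cores: Set[int]) -> Dict[int, List[str]]:
    """Give each record to its Core if that Core is started, else to Core 0."""
    per_core: Dict[int, List[str]] = {core: [] for core in started_cores}
    per_core.setdefault(0, [])
    for rec in recorder.records():
        core = cache_to_core.get(rec.cache_id, 0)
        target = core if core in started_cores else 0
        per_core[target].append(rec.command)
    return per_core


recorder = CohDebugRecorder()
recorder.add(1, "set-b")
recorder.add(0, "set-a")
recorder.add(1, "set-c")
order = [(r.cache_id, r.command) for r in recorder.records()]
```

Inserting in sorted position keeps traversal simple: all records for one Cache are adjacent, so they can be handed over together when the owning Core is scheduled.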
Furthermore, by constructing storage requests, cache consistency is verified and debugged by driving the storage system directly, which reduces the number of interfering paths.
S305, issuing the setting request to the corresponding Core, completing the state configuration of the Cache in each Core according to the setting request, and driving the Nest, the Bus, and the MC memory controller in the target POWER processor.
In one possible implementation, the cache consistency commands in the target linked list are traversed according to a target condition, and the setting requests are issued to the corresponding Cores;
and when a corresponding Core is scheduled, the state of the Cache in that Core is configured.
Further, when the Nest portion is driven, the issued request must be constructed, including its address, operation type, and so on. The NCTModel_coh_driver function is the interface that actually generates a request and sends it to the Nest; within this function, the coherence records in the linked list are fetched in order, and for each record the request is set up and executed.
Furthermore, according to the configured conditions, this embodiment can directly control the state of a specific address in the storage system, so as to simulate arbitrary cache consistency scenarios.
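A rough sketch of such a driver loop is given below. It is only an illustration in Python of the fetch, set, and execute sequence; the types NestRequest and CohSetting, the send_to_nest callable, and the use of a single fixed address for all requests are assumptions for the example and do not reproduce NCTModel_coh_driver itself.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class NestRequest:
    cache_id: int   # Cache on whose behalf the request is issued
    address: int    # target address of the access
    op: str         # operation type: "LOAD" or "STORE"


@dataclass(frozen=True)
class CohSetting:
    cache_id: int
    op: str
    times: int      # how many times the operation is issued


def coh_driver(settings: Iterable[CohSetting],
               address: int,
               send_to_nest: Callable[[NestRequest], None]) -> int:
    """Fetch each record in order, build its requests, and pass them to the Nest.

    Returns the number of requests issued.
    """
    issued = 0
    for setting in settings:
        if setting.op not in ("LOAD", "STORE"):
            raise ValueError(f"unknown operation type: {setting.op!r}")
        for _ in range(setting.times):
            send_to_nest(NestRequest(setting.cache_id, address, setting.op))
            issued += 1
    return issued


# A list stands in for the Nest here; each issued request is simply recorded.
log: list = []
count = coh_driver(
    [CohSetting(0, "STORE", 1), CohSetting(1, "LOAD", 2)],
    0x1000,
    log.append,
)
```

Passing the Nest interface in as a callable keeps the driver independent of how requests are actually delivered, which makes it easy to substitute a recording stub during debugging.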
S306, according to the operation content in each Core, verifying the state conversion of the Cache whose state configuration has been completed, and reclaiming the completed configuration information in the target linked list.
Further, when the Cache interface is operated, the state of the address in each Cache is controlled: in the NCTModel_fake_cache_state function, the same address is generated through a static spa variable, and the state is configured according to the configuration provided through Tcl, so that a coherence scenario can be simulated for verification and debugging. Finally, the configured cohDebugRecorder linked list is reclaimed by the NCTModel_Core_exit function. The overall purpose of this embodiment is to set the states of the different Caches and drive the Nest to run: first, the requests for setting the Caches are stored; then the requests are issued to the corresponding Cores, which is the issuing process in the figure above; then the state configuration of the corresponding Cache in each Core is completed, the storage system is driven according to the operation in a given Core, such as a read or a write, and the correctness of the state conversion is verified; finally, the linked list is reclaimed.
Further, referring to the schematic overall operation flow after the NCT and Tcl interfaces are added, shown in fig. 7, fig. 7 mainly adds the new NCT expansion commands and the process of using them. First, each new command must be added through addcommand so that the system can recognize it as an NCT command. Second, after the system is started, the NCT commands in the Tcl script are parsed and the operation corresponding to each command added in the previous step is carried out, such as setting a state or driving the operation of the Nest storage system. From the run-time behavior and the statistical information, the interaction flow of the Cache and the Bus shown in fig. 8 can be obtained; further analysis yields the processing flow of the protocol, which can be used to verify the correctness of the protocol. As shown in fig. 8, arbitration and scheduling for memory coherence are performed by the Bus in the middle. This module is the core of the coherence design, and it is exactly this part of the system that the tool of this embodiment debugs in order to verify that it functions correctly. The general working process is as follows. When a Core needs to perform an access operation, such as a read or a write, it issues the operation to its Cache.
1) The Cache sends a request to the Bus, and the Bus broadcasts the request to the Caches of all Cores and to mem.
2) Each Cache and mem that receives the request returns a response according to the state it holds for the address.
3) After receiving all responses, the Bus arbitrates to select the latest state and location of the data, and broadcasts the result again to all Caches and mem.
4) On receiving the combineresp, each Cache and mem makes the corresponding state transition; the requesting Cache now knows where the data is located, so the data transfer can proceed.
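The four steps above can be walked through with a toy model of a read request. The sketch below is in Python and deliberately uses a simplified three-state protocol (M for modified, S for shared, I for invalid), which is an assumption for illustration only and not the protocol of the target POWER processor; the function names and the response strings are likewise hypothetical.

```python
from typing import Dict


def snoop_response(state: str) -> str:
    """Step 2: a Cache answers according to the state it holds for the address."""
    return {"M": "modified", "S": "shared", "I": "null"}[state]


def combine(responses: Dict[str, str]) -> str:
    """Step 3: the Bus arbitrates and picks where the latest data lives."""
    for name, resp in responses.items():
        if resp == "modified":
            return name
    for name, resp in responses.items():
        if resp == "shared":
            return name
    return "mem"  # no Cache holds the line, so memory supplies it


def bus_read(requester: str, states: Dict[str, str]) -> str:
    """Run one read request through the Bus and update the states in place.

    Returns the name of the agent that supplies the data.
    """
    # Steps 1 and 2: the request is broadcast and every other Cache responds.
    responses = {name: snoop_response(state)
                 for name, state in states.items() if name != requester}
    # Step 3: the combined response names the data source.
    source = combine(responses)
    # Step 4: every participant applies its state transition.
    if source != "mem":
        states[source] = "S"   # a modified or shared holder ends up shared
    states[requester] = "S"    # the requester now holds a shared copy
    return source


states = {"cache0": "I", "cache1": "M", "cache2": "I"}
source = bus_read("cache0", states)

cold = {"cache0": "I", "cache1": "I"}
cold_source = bus_read("cache0", cold)
```

Checking the final state of every Cache after each request, as the assertions on such a model would, mirrors how the embodiment verifies that a state conversion is correct.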
Because the Tcl script configuration mode is adopted, automatic execution can be realized, automatic debugging and batch automatic debugging are supported, the state change in the running process is recorded through a log for subsequent analysis, and batch simultaneous execution can be realized through a multi-instance mode so as to shorten the verification time.
In summary, Core information and Nest information of the target POWER processor are acquired first, an NCT target module is constructed, and the NCT target module is configured and driven through a target Tcl interface; the states of the caches at each level corresponding to the Core information are then configured through the configured NCT target module, and the state conversion of the caches with the configured states is verified and debugged after the Nest in the target POWER processor is driven. In this scheme, the NCT target module replaces the Core in the target POWER processor and provides the data required by the Nest in the target POWER processor so as to drive the operation of the Nest, which simplifies the implementation of cache consistency verification and debugging; simple Tcl commands can configure arbitrary cache consistency test scenarios, which greatly shortens the verification and debugging cycle and reduces the complexity of consistency debugging; in addition, because configuration is done through Tcl scripts, automatic debugging and batch automatic debugging are supported, which shortens verification time.
Fig. 9 is a block diagram illustrating a structure of an implementation apparatus for verifying and debugging cache coherence according to an exemplary embodiment. The device comprises:
the Core and Nest acquisition module 901 is configured to acquire Core information and Nest information of the target POWER processor;
the NCT target module construction module 902 is configured to construct an NCT target module; the NCT target module is used for replacing the Core in the target POWER processor and providing needed data for the Nest in the target POWER processor so as to drive the operation of the Nest;
the NCT target module configuration module 903 is configured to configure and drive the NCT target module through a target Tcl interface;
the verification and debug module 904 is configured to configure the states of the caches at each level corresponding to the Core information through the configured NCT target module, and to verify and debug the state conversion of the caches with the configured states after the Nest in the target POWER processor is driven.
In one possible implementation, the NCT target module configuration module 903 is further configured to:
configuring and driving the NCT target module through a group of Tcl expansion commands, and realizing a processing function of each command in the Tcl expansion commands; the Tcl expansion command includes a command to set cache coherency, a state configuration command to specify an address, and a command to set a request address to be a fixed address.
In one possible implementation, the command for setting Cache consistency includes settings for the number of the Cache, the Cache state, whether the Core runs, the run priority of the Core, whether the Cache operation is a load or a store, the number of runs, and the number of delay cycles.
In one possible implementation, the state configuration command of the specified address includes settings for the number of the Cache, the specified address, the Cache state, and the run priority of the Core.
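To make the field list of the command for setting Cache consistency concrete, the sketch below parses one such command into a structured record. It is written in Python purely for illustration; the assumed syntax NCT coh_set <cache> <state> <on|off> <priority> <LOAD|STORE> <times> <delay> is inferred from the field order listed above and is not confirmed by the text, and the names CohSet and parse_coh_set are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohSet:
    cache_id: int      # number of the Cache
    state: str         # Cache state to configure
    core_runs: bool    # whether the Core runs
    priority: int      # run priority of the Core
    op: str            # "LOAD" or "STORE"
    times: int         # number of runs
    delay_cycles: int  # number of delay cycles


def parse_coh_set(line: str) -> CohSet:
    """Parse a command of the assumed form described in the lead-in."""
    tokens = line.split()
    if len(tokens) != 9 or tokens[:2] != ["NCT", "coh_set"]:
        raise ValueError(f"not a coh_set command: {line!r}")
    cache_id, state, run, priority, op, times, delay = tokens[2:]
    if run not in ("on", "off"):
        raise ValueError(f"expected on/off, got {run!r}")
    if op not in ("LOAD", "STORE"):
        raise ValueError(f"expected LOAD/STORE, got {op!r}")
    return CohSet(int(cache_id), state, run == "on", int(priority),
                  op, int(times), int(delay))


setting = parse_coh_set("NCT coh_set 0 T on 0 STORE 1 5")
```

Validating the command at parse time means a malformed script line is rejected before any Cache state is touched.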
In one possible implementation, the verification and debug module 904 is further configured to:
acquiring and storing setting requests of caches at all levels through a target linked list based on the Tcl expansion command;
issuing the setting request to the corresponding cores, and completing the state configuration of the Cache in each Core according to the setting request;
and verifying the state conversion of the Cache whose state configuration has been completed according to the operation content in each Core, and reclaiming the completed configuration information in the target linked list.
In one possible implementation, the verification and debug module 904 is further configured to:
constructing a target linked list;
receiving the Cache consistency command corresponding to the setting request of each level of Cache through the target linked list;
and sorting the received Cache consistency commands by the serial number of the Cache and, for equal serial numbers, by the order of addition.
In one possible implementation, the verification and debug module 904 is further configured to:
traversing the cache consistency command in the target linked list according to a target condition, and issuing the setting request to a corresponding Core;
and when the corresponding Core is scheduled, performing state configuration on the Cache in the corresponding Core.
In summary, Core information and Nest information of the target POWER processor are acquired first, an NCT target module is constructed, and the NCT target module is configured and driven through a target Tcl interface; the states of the caches at each level corresponding to the Core information are then configured through the configured NCT target module, and the state conversion of the caches with the configured states is verified and debugged after the Nest in the target POWER processor is driven. In this scheme, the NCT target module replaces the Core in the target POWER processor and provides the data required by the Nest in the target POWER processor so as to drive the operation of the Nest, which simplifies the implementation of cache consistency verification and debugging; simple Tcl commands can configure arbitrary cache consistency test scenarios, which greatly shortens the verification and debugging cycle and reduces the complexity of consistency debugging; in addition, because configuration is done through Tcl scripts, automatic debugging and batch automatic debugging are supported, which shortens verification time.
Referring to fig. 10, a block diagram of a computer device according to an exemplary embodiment of the present application is provided, where the computer device includes a memory and a processor, and the memory is configured to store a computer program, and when the computer program is executed by the processor, the method for implementing verification and debug of cache consistency is implemented.
The processor may be a central processing unit (Central Processing Unit, CPU). The processor may also be any other general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the methods in embodiments of the present application. The processor executes various functional applications of the processor and data processing, i.e., implements the methods of the method embodiments described above, by running non-transitory software programs, instructions, and modules stored in memory.
The memory may include a memory program area and a memory data area, wherein the memory program area may store an operating system, at least one application program required for a function; the storage data area may store data created by the processor, etc. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some implementations, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
In an exemplary embodiment, a computer readable storage medium is also provided for storing at least one computer program that is loaded and executed by a processor to implement all or part of the steps of the above method. For example, the computer readable storage medium may be Read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), compact disc Read-Only Memory (CD-ROM), magnetic tape, floppy disk, optical data storage device, and the like.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. The method for realizing verification and debugging of cache consistency is characterized by comprising the following steps:
obtaining Core information and Nest information of a target POWER processor;
constructing an NCT target module; the NCT target module is used for replacing the Core in the target POWER processor and providing needed data for Nest in the target POWER processor so as to drive the operation of the Nest;
configuring and driving the NCT target module through a target Tcl interface;
and configuring the states of all levels of caches corresponding to the Core information through the configured NCT target module, and verifying and debugging the state conversion of the caches with the configured states after the Nest in the target POWER processor is driven.
2. The method of claim 1, wherein configuring and driving the NCT target module over a target Tcl interface comprises:
configuring and driving the NCT target module through a group of Tcl expansion commands, and realizing a processing function of each command in the Tcl expansion commands; the Tcl expansion command includes a command to set cache coherency, a state configuration command to specify an address, and a command to set a request address to be a fixed address.
3. The method of claim 2, wherein the command to set Cache coherency includes settings for the number of the Cache, the Cache state, whether the Core runs, the run priority of the Core, whether the Cache operation is a load or a store, the number of runs, and the number of delay cycles.
4. The method of claim 2, wherein the state configuration command specifying an address includes settings for the number of the Cache, the specified address, the Cache state, and the run priority of the Core.
5. The method of claim 1, wherein configuring, through the configured NCT target module, the states of all levels of caches corresponding to the Core information, and verifying and debugging the state conversion of the caches with the configured states after the Nest in the target POWER processor is driven, comprises:
acquiring and storing setting requests of caches at all levels through a target linked list based on the Tcl expansion command;
issuing the setting request to the corresponding Core, completing the state configuration of the Cache in each Core according to the setting request, and driving the Nest, the Bus, and the MC memory controller in the target POWER processor;
and verifying the state conversion of the Cache whose state configuration has been completed according to the operation content in each Core, and reclaiming the completed configuration information in the target linked list.
6. The method of claim 5, wherein the obtaining and storing, based on the Tcl expansion command, a set request for each level of Cache through a target linked list includes:
constructing a target linked list;
receiving the Cache consistency command corresponding to the setting request of each level of Cache through the target linked list;
and sorting the received Cache consistency commands by the serial number of the Cache and, for equal serial numbers, by the order of addition.
7. The method according to claim 6, wherein issuing the setting request to the corresponding Core and completing the configuration of the state of the Cache in each Core according to the setting request includes:
traversing the cache consistency command in the target linked list according to a target condition, and issuing the setting request to a corresponding Core;
and when the corresponding Core is scheduled, performing state configuration on the Cache in the corresponding Core.
8. An implementation device for verifying and debugging cache consistency, which is characterized by comprising:
the Core and Nest acquisition module is used for acquiring Core information and Nest information of the target POWER processor;
the NCT target module construction module is used for constructing an NCT target module; the NCT target module is used for replacing the Core in the target POWER processor and providing needed data for the Nest in the target POWER processor so as to drive the operation of the Nest;
the NCT target module configuration module is used for configuring and driving the NCT target module through a target Tcl interface;
the verification and debugging module is used for configuring the states of all levels of caches corresponding to the Core information through the configured NCT target module, and verifying and debugging the state conversion of the caches with the configured states after the Nest in the target POWER processor is driven.
9. A computer device comprising a processor and a memory, wherein the memory has stored therein at least one instruction that is loaded and executed by the processor to implement a cache coherence verification and debug implementation method according to any of claims 1 to 7.
10. A computer readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement a method of cache coherence verification and debug according to any of claims 1 to 7.
CN202211683813.6A 2022-12-27 2022-12-27 Verification and debugging realization method for cache consistency Active CN116049035B (en)

Publications (2)

Publication Number    Publication Date
CN116049035A (en)    2023-05-02
CN116049035B (en)    2024-02-09


