CN116821038A - Lock step control apparatus and method for processor - Google Patents


Info

Publication number
CN116821038A
Authority
CN
China
Prior art keywords
processor
data
cache
interrupt
processor core
Prior art date
Legal status
Granted
Application number
CN202311092193.3A
Other languages
Chinese (zh)
Other versions
CN116821038B (en)
Inventor
张志远
纪海涛
刘凌
吴向斌
Current Assignee
Intel China Research Center Co ltd
Original Assignee
Intel China Research Center Co ltd
Priority date
Filing date
Publication date
Application filed by Intel China Research Center Co ltd
Priority to CN202311092193.3A
Publication of CN116821038A
Application granted
Publication of CN116821038B
Active
Anticipated expiration

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application provides a lockstep control apparatus and a lockstep control method for a processor. There is provided a processor comprising: two processor cores, including a first processor core and a second processor core serving as a redundant processor core; a last level cache (LLC) shared by the two processor cores; and a lockstep control unit configured to: periodically trigger a checkpoint interrupt to instruct each processor core to save its operating data at the checkpoint into the LLC and a shadow register set; and, if it is determined that the output data or addresses of the two processor cores do not match, trigger a rollback interrupt to restore, based on the operating data stored in the LLC and the shadow register set, the operating data of the two processor cores to their operating data at the checkpoint prior to the rollback interrupt.

Description

Lock step control apparatus and method for processor
Technical Field
The present application relates generally to the field of processors, and more particularly to lockstep control apparatus and methods for processors.
Background
Processor core lockstep (LockStep) is a technique for achieving high reliability in microprocessor systems. A lockstep-enabled microprocessor system is a redundant system of two or more processor cores that monitor each other. Lockstep requires the processor cores and the memory to be kept precisely synchronized and to execute the same instructions in the same clock cycles. The correctness of program execution is therefore checked continuously so that processor errors are detected promptly, and a fault containment region is established to prevent faults from propagating.
Currently, processor lockstep technology is applied to a Micro Controller Unit (MCU) widely used in the fields of automobile industry, industrial controllers, alternative energy sources, etc. For small-sized, low-performance MCUs that do not require internal caches, a relatively easy-to-implement three-core lockstep technique may be used to build a high-reliability redundant system, but for relatively large-sized high-performance processors (e.g., processors with internal caches), a dual-core lockstep technique is typically used to build a redundant system. The currently commonly adopted dual-core lockstep technology requires single-error correction dual-error detection (SECDED) protection for all internal data or address register files and cached data in the processor core, so the hardware cost and performance loss for implementing lockstep on a high-performance processor core are very high.
Disclosure of Invention
In view of the above, the present application provides a new lockstep control mechanism for a processor, which can implement lockstep control of a processor core with low hardware cost and performance loss.
According to an aspect of the present application, there is provided a processor comprising: two processor cores, including a first processor core and a second processor core serving as a redundant processor core; a last level cache (LLC) shared by the two processor cores; and a lockstep control unit configured to: periodically trigger a checkpoint interrupt to instruct each of the two processor cores to save its operating data at the checkpoint into the LLC and a shadow register set; and, if it is determined that the output data or addresses of the two processor cores do not match, trigger a rollback interrupt to restore, based on the operating data stored in the LLC and the shadow register set, the operating data of the two processor cores to their operating data at the checkpoint prior to the rollback interrupt.
According to another aspect of the present application, there is provided a lockstep control method for a processor, wherein the processor includes two processor cores including a first processor core and a second processor core as a redundant processor core. The lock step control method comprises the following steps: periodically triggering a checkpoint interrupt to instruct each of the two processor cores to save its operational data at the checkpoint into the LLC and shadow register set; and triggering a rollback interrupt if it is determined that the output data or addresses of the two processor cores do not match to restore the operational data of the two processor cores to their operational data at the checkpoint prior to the rollback interrupt based on the LLC and the operational data stored in the shadow register set, wherein the LLC is shared for use by the two processor cores and is configured to receive and store cache data from each processor core having a modified M cache state therein when the checkpoint interrupt is triggered. The shadow register set is configured to receive and store register data and core state data for each processor core when a checkpoint interrupt is triggered; and the operational data includes cache data, register data, and core state data.
With the lockstep control mechanism proposed by the present application, SECDED protection is not required for register data and cache data (e.g., data in the level one data cache L1D) within a processor core, but only for data in the Last Level Cache (LLC). Furthermore, according to the lockstep control mechanism, in order to achieve lockstep control of the processor core, there is no need to change the design of the processor core itself, that is, the proposed lockstep control mechanism can be used for lockstep control of the processor core even if the processor core itself does not support lockstep operation. Thus, the lockstep control mechanism has lower hardware cost and less impact on the performance of the processor core, and is particularly suitable for implementing lockstep control of high performance processor cores with internal caches.
Drawings
The application will be better understood from the following description of specific embodiments thereof, taken in conjunction with the accompanying drawings, in which:
FIG. 1 illustrates a block diagram of an example processor and/or System on a Chip (SoC), which may have one or more cores and an integrated memory controller;
FIG. 2 shows a schematic block diagram of a dual core processor employing a lockstep control mechanism in accordance with an embodiment of the application;
FIG. 3 shows a schematic flow chart of a lockstep control method for a processor core, according to an embodiment of the application.
Detailed Description
Features and exemplary embodiments of various aspects of the application are described in detail below. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the application. It will be apparent, however, to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the application by showing examples of the application. The present application is in no way limited to any particular configuration set forth below, but rather covers any modification, substitution, or improvement of elements, components, and algorithms without departing from the spirit of the application. In the drawings and the following description, well-known structures and techniques have not been shown in order to avoid unnecessarily obscuring the present application.
FIG. 1 illustrates a block diagram of an example processor and/or SoC 100, which may have one or more cores and an integrated memory controller. The processor 100 illustrated in solid-line boxes has a single core 102(A), system agent unit circuitry 110, and a set of one or more interface controller unit circuits 116, while the optional dashed-line boxes illustrate an alternative processor 100 with a plurality of cores 102(A)-(N), a set of one or more integrated memory controller unit circuits 114 in the system agent unit circuitry 110, dedicated logic 108, and a set of one or more interface controller unit circuits 116.
Different implementations of the processor 100 may include: 1) a CPU, in which the dedicated logic 108 is integrated graphics and/or scientific (throughput) logic (which may include one or more cores, not shown) and the cores 102(A)-(N) are one or more general-purpose cores (e.g., general-purpose in-order cores, general-purpose out-of-order cores, or a combination of both); 2) a coprocessor, in which cores 102(A)-(N) are a large number of special-purpose cores intended primarily for graphics and/or scientific (throughput) workloads; and 3) a coprocessor, in which cores 102(A)-(N) are a large number of general-purpose in-order cores. Thus, processor 100 may be a general-purpose processor, a coprocessor, or a special-purpose processor, such as, for example, a network or communication processor, compression engine, graphics processor, GPGPU (general purpose graphics processing unit), a high-throughput many integrated core (MIC) coprocessor (including 30 or more cores), embedded processor, or the like. The processor may be implemented on one or more chips. The processor 100 may be part of one or more substrates and/or may be implemented on one or more substrates using any of a variety of process technologies, such as complementary metal oxide semiconductor (CMOS), bipolar CMOS (BiCMOS), P-type metal oxide semiconductor (PMOS), or N-type metal oxide semiconductor (NMOS).
The memory hierarchy includes one or more levels of cache unit circuitry 104(A)-(N) within cores 102(A)-(N), a set of one or more shared cache unit circuits 106, and external memory (not shown) coupled to the set of integrated memory controller unit circuits 114. The set of one or more shared cache unit circuits 106 may include one or more intermediate-level caches, such as level 2 (L2), level 3 (L3), level 4 (L4), or other levels of cache, such as a Last Level Cache (LLC), and/or combinations of these. While in some examples the interface network circuitry 112 (e.g., a ring interconnect) provides an interface to the dedicated logic 108 (e.g., integrated graphics logic), the set of shared cache unit circuits 106, and the system agent unit circuitry 110, alternative examples use any number of well-known techniques to provide an interface to these units. In some examples, coherency is maintained between one or more of the shared cache unit circuits 106 and cores 102(A)-(N). In some examples, the interface controller unit circuits 116 couple the cores 102 to one or more other devices 118, such as one or more I/O devices, storage, one or more communication devices (e.g., wireless network, wired network, etc.), and so forth.
In some examples, one or more of cores 102(A)-(N) have multi-threading capabilities. The system agent unit circuitry 110 includes those components that coordinate and operate cores 102(A)-(N). The system agent unit circuitry 110 may include, for example, a power control unit (PCU) circuit and/or a display unit circuit (not shown). The PCU may be (or may include) the logic and components required to adjust the power states of cores 102(A)-(N) and/or dedicated logic 108 (e.g., integrated graphics logic). The display unit circuit is used to drive one or more externally connected displays.
Cores 102(A)-(N) may be homogeneous in terms of instruction set architecture (ISA). For example, processor cores 102(A)-(N) may each employ the RISC-V instruction set architecture. RISC-V is an open-source instruction set architecture based on reduced instruction set computer (RISC) principles. The letter V carries two meanings: first, it marks the fifth generation of instruction set architectures designed at Berkeley, starting from RISC-I; second, it stands for variation and vector. Alternatively, cores 102(A)-(N) may be heterogeneous with respect to the ISA; that is, a subset of cores 102(A)-(N) may be capable of executing one ISA, while other cores may be capable of executing only a subset of that ISA or of executing another ISA. However, when two or more processor cores are used to construct a redundant system, the ISA of these processor cores should be homogeneous.
As described above, for relatively large-sized high-performance processors (e.g., the processor shown in fig. 1), a dual-core lockstep technique is typically used to build a redundant system to achieve a highly reliable computing or control system. The redundant system may be made up of two or more processor cores that monitor each other. The currently commonly employed dual core lockstep technique requires SECDED protection of all internal data or address register files and cached data in the processor core, so the hardware cost and performance penalty of implementing lockstep on a high performance processor core is very high.
Embodiments of the present application provide a lockstep control mechanism that does not require SECDED protection of register data and cache data (e.g., data in the level one data cache L1D) within a processor core, but rather only requires SECDED protection of data in the LLC. Furthermore, according to the lockstep control mechanism, in order to achieve lockstep control of the processor core, there is no need to change the design of the processor core itself, that is, the proposed lockstep control mechanism can be used for lockstep control of the processor core even if the processor core itself does not support lockstep operation.
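The application does not detail how SECDED protection is implemented. As a reminder of what such protection involves, the following is a minimal sketch using an extended Hamming(8,4) code on a 4-bit value. The function names `secded_encode` and `secded_decode` and the bit layout are illustrative assumptions, and caches commonly use wider codes, such as a (72,64) code per 64-bit word.

```python
def secded_encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                       # index 0: overall parity, 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the total parity even
    return code


def secded_decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'double_error'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos              # XOR of the positions of all set bits
    overall = sum(code) % 2              # 0 when the total parity is intact
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        code[syndrome] ^= 1              # single error: syndrome is its position
        status = "corrected"
    else:
        return None, "double_error"      # two flipped bits: detect, do not correct
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = secded_encode(0b1011)
word[5] ^= 1                             # inject a single-bit error
print(secded_decode(word))
```

Every stored word carries extra check bits and needs encode and decode logic, which is why restricting SECDED to the LLC, as proposed here, reduces hardware cost relative to protecting every register file and internal cache.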
FIG. 2 shows a schematic block diagram of a dual core processor employing a lockstep control mechanism in accordance with an embodiment of the application. As shown in FIG. 2, processor core 0 and processor core 1 are used to build a high-reliability redundant system, with processor core 1 acting, for example, as the redundant processor core. Core 0 and core 1 share instruction code from the instruction cache of core 0 and, in the absence of errors, perform the same operations based on this instruction code and output the same execution result data. Typically, execution of the instruction code by core 1 is delayed by several clock cycles (e.g., 2 clock cycles, by means of a delay unit) relative to execution by core 0. Lockstep requires the two processor cores and the memory to be kept precisely synchronized and to execute the same instructions in the corresponding clock cycles. The correctness of program execution is therefore checked continuously so that processor errors are detected promptly, and a fault containment region is established to prevent faults from propagating.
As shown in fig. 2, the lockstep control mechanism according to an embodiment of the present application may be implemented by a lockstep control apparatus composed of a lockstep control unit, an LLC, and a shadow register set. In order to achieve lockstep control of a processor core, two high priority interrupt processes, namely, a checkpoint interrupt and a rollback interrupt, are introduced according to embodiments of the present application. Specifically, the lockstep control unit may be configured to: periodically triggering a checkpoint interrupt to instruct processor cores 0 and 1 to save their operational data at the checkpoint into the LLC and shadow register set, and triggering a rollback interrupt if it is determined that the output data or addresses of the two processor cores do not match to restore the operational data of the two processor cores to their operational data at the checkpoint prior to the rollback interrupt based on the operational data stored in the LLC and shadow register set.
First, a checkpoint interrupt may be used to take checkpoints periodically, so that the operating data of the processor cores can be restored when a rollback interrupt is triggered. When a checkpoint interrupt is triggered, each processor core may flush the cache data in its internal cache (e.g., L1D) that has the modified (M) cache state into the LLC, while changing the cache state of that data in its internal cache to the exclusive (E) or invalid (I) cache state.
For example, each processor core may be based on the RISC-V instruction set architecture and configured to use a predefined flush instruction (e.g., flush M2E) to flush the cache data in the processor core that has the M cache state into the LLC when a checkpoint interrupt is triggered, while changing the cache state of that data in its internal cache to the E or I cache state.
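The flush step can be sketched in software as follows. This is a toy model under stated assumptions: the `Line` class, the dictionary-based caches, and the function name `flush_m2e` are illustrative and are not taken from the application, and the sketch shows only the M-to-E variant.

```python
from dataclasses import dataclass


@dataclass
class Line:
    data: int
    state: str  # one of "M", "E", "S", "I"


def flush_m2e(l1d, llc):
    """Copy every Modified line of an internal cache into the LLC and mark it Exclusive."""
    flushed = []
    for addr, line in l1d.items():
        if line.state == "M":
            llc[addr] = line.data   # write the dirty data back to the shared LLC
            line.state = "E"        # the internal copy is now clean
            flushed.append(addr)
    return flushed


# Hypothetical cache contents at the moment a checkpoint interrupt arrives.
l1d = {0x100: Line(7, "M"), 0x140: Line(9, "S"), 0x180: Line(3, "M")}
llc = {}
print(flush_m2e(l1d, llc), llc)
```

Only lines in the M state are written out; clean lines already match the next level of the hierarchy and need no copy.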
The M, E and I cache states referred to herein may be cache states defined in the standard MESI cache protocol. As shown in Table 1, the MESI cache protocol defines four cache states.
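TABLE 1
State | Meaning
M (Modified) | The cache line is valid only in this cache and has been modified, so it differs from main memory.
E (Exclusive) | The cache line is valid only in this cache and matches main memory.
S (Shared) | The cache line matches main memory and may also be held in other caches.
I (Invalid) | The cache line is invalid.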
In addition, when a checkpoint interrupt is triggered, each processor core also saves its register data and core state data into the shadow register set. The shadow register set may be located in a control register space outside of the internal cache of each processor core. When all of the register data and core state data of the processor core at the checkpoints have been saved into the shadow register set, the processor core may notify the lockstep control unit that checkpointing has been completed.
According to some embodiments of the application, the set of shadow registers may include a first subset of shadow registers and a second subset of shadow registers. The two shadow register subsets may be configured to alternately store register data and core state data for each processor core at a plurality of checkpoints in a ping-pong (ping-pong) mode. For example, a first shadow register subset may store register data and core state data for each processor core at a first checkpoint, a second shadow register subset may store register data and core state data for each processor core at a second checkpoint, then the first shadow register subset may store register data and core state data for each processor core at a third checkpoint, a second shadow register subset may store register data and core state data for each processor core at a fourth checkpoint, and so on.
Storing the register data and the core state data of the processor core in ping-pong mode may protect the register data and the core state data of the processor core at a last checkpoint during a checkpoint interrupt such that recovery of the operating data of the processor core may be achieved based on the data at the last checkpoint even if a current checkpoint interrupt process is problematic.
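The ping-pong arrangement can be illustrated with a small software model. The class name `PingPongShadowRegisters`, the `valid` index, and the dictionary layout of a snapshot are assumptions made for this sketch; the application describes only the alternating use of two shadow register subsets.

```python
class PingPongShadowRegisters:
    """Two banks written alternately; the last completed bank stays readable."""

    def __init__(self):
        self.banks = [None, None]
        self.valid = None  # index of the bank holding the last completed checkpoint

    def save(self, registers, core_state):
        # Write into the bank that does NOT hold the last completed checkpoint.
        target = 0 if self.valid is None else 1 - self.valid
        self.banks[target] = {
            "registers": dict(registers),
            "core_state": dict(core_state),
        }
        # The bank becomes the valid one only after the whole snapshot is written.
        self.valid = target
        return target

    def restore(self):
        if self.valid is None:
            raise RuntimeError("no completed checkpoint")
        snapshot = self.banks[self.valid]
        return dict(snapshot["registers"]), dict(snapshot["core_state"])


shadow = PingPongShadowRegisters()
shadow.save({"x1": 1}, {"pc": 0x10})   # first checkpoint goes to bank 0
shadow.save({"x1": 2}, {"pc": 0x20})   # second checkpoint goes to bank 1
print(shadow.banks[0], shadow.restore())
```

While the second snapshot is being written, bank 0 still holds the first checkpoint untouched, which is the property the ping-pong mode relies on.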
Furthermore, the lockstep control unit needs to continuously check the correctness of program execution in both processor cores. That is, the lockstep control unit may compare the data or addresses output after processor cores 0 and 1 execute the same instruction code and, when it determines that the output data or addresses of the two cores do not match, trigger a rollback interrupt to restore the operating data of processor cores 0 and 1 to their operating data at the checkpoint prior to the rollback interrupt.
Specifically, when a rollback interrupt is triggered, each processor core changes the cache state of the cache data in its internal cache to an I state, reads the cache data stored in the LLC by the processor core at the checkpoint prior to the rollback interrupt from the LLC, and reads the register data and core state data at the checkpoint prior to the rollback interrupt from the shadow register set, thereby restoring the operating data of the processor core to the operating data at the checkpoint prior to the rollback interrupt.
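The rollback sequence on one core can be sketched as below. The `Core` class, the refill-on-miss behaviour, and the choice of the E state for refilled lines are assumptions made for illustration; the application states only that the internal cache is invalidated and that the data is read back from the LLC and the shadow register set.

```python
class Core:
    """Toy core with an internal cache, a register file and a program counter."""

    def __init__(self):
        self.l1d = {}        # addr -> [data, state]
        self.registers = {}
        self.pc = 0

    def load(self, addr, llc):
        line = self.l1d.get(addr)
        if line is None or line[1] == "I":
            line = [llc[addr], "E"]    # miss: refill from the LLC (assumed state E)
            self.l1d[addr] = line
        return line[0]

    def rollback(self, shadow):
        for line in self.l1d.values():
            line[1] = "I"              # discard everything written since the checkpoint
        self.registers = dict(shadow["registers"])
        self.pc = shadow["pc"]


core = Core()
core.l1d[0x100] = [99, "M"]            # value produced after the checkpoint
core.registers = {"x1": 42}
llc = {0x100: 7}                       # value saved at the checkpoint
core.rollback({"registers": {"x1": 5}, "pc": 0x40})
print(core.load(0x100, llc), core.registers, hex(core.pc))
```

Because every internal line is invalid after the rollback, the first access to each address fetches the checkpointed copy from the LLC.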
According to an embodiment of the application, the LLC is shared by both processor cores 0 and 1 and is configured to receive and store cache data having a modified M cache state in each processor core when a checkpoint interrupt is triggered.
In addition, in the LLC, a new data cache state, the temporary T-cache state, is defined in addition to the four data cache states defined in the standard MESI/MOESI cache protocol, and cache data stored in the LLC having the T-cache state is prohibited from being moved out of the LLC. As shown in table 2, there may be five cache states for cache data in the LLC.
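TABLE 2
State | Meaning
M (Modified) | As defined in the standard MESI protocol.
E (Exclusive) | As defined in the standard MESI protocol.
S (Shared) | As defined in the standard MESI protocol.
I (Invalid) | As defined in the standard MESI protocol.
T (Temporary) | The cache line holds data flushed from a processor core during a checkpoint interrupt and is prohibited from being moved out of the LLC.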
According to an embodiment of the application, the LLC may be configured to: store cache data received from each processor core with the M cache state in the temporary (T) cache state during a checkpoint interrupt; change the cache state of the stored cache data having the T cache state to the modified (M) cache state at the end of the checkpoint interrupt; and change the cache state of the stored cache data having the T cache state to the invalid (I) cache state at the beginning of a rollback interrupt. In this way, storage space in the LLC can be freed in time to save the operating data of a newer checkpoint.
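The T-state transitions described above can be modelled as a small state machine. The class name `LLCModel` and its method names are hypothetical; the sketch encodes only the three transitions and the eviction rule stated in the text.

```python
class LLCModel:
    """Toy LLC tracking only the data and cache state of each line."""

    def __init__(self):
        self.lines = {}  # addr -> [data, state]

    def write_from_core(self, addr, data):
        # Data flushed by a core during a checkpoint interrupt is held in T.
        self.lines[addr] = [data, "T"]

    def end_checkpoint(self):
        # The checkpoint completed: T lines become ordinary Modified lines.
        for line in self.lines.values():
            if line[1] == "T":
                line[1] = "M"

    def begin_rollback(self):
        # Lines still in T belong to an unfinished checkpoint and are dropped.
        for line in self.lines.values():
            if line[1] == "T":
                line[1] = "I"

    def can_evict(self, addr):
        # T lines are prohibited from being moved out of the LLC.
        return self.lines[addr][1] != "T"


llc = LLCModel()
llc.write_from_core(0x100, 7)
print(llc.can_evict(0x100))   # False while the line is in T
llc.end_checkpoint()
print(llc.lines[0x100])       # [7, 'M']
```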
It should be noted that cache data in the LLC needs to be SECDED protected. In addition, when the storage space in the LLC is about to become full (e.g., when the proportion of occupied storage space in the LLC exceeds a predetermined threshold), a checkpoint interrupt may be triggered so that storage space in the LLC is freed in time to save the operating data of a newer checkpoint.
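The two trigger conditions for a checkpoint interrupt, the periodic one and the occupancy-based one, can be combined as in the following sketch. The function name, the parameter names, and the default threshold of 0.75 are illustrative assumptions; the application does not give concrete values.

```python
def should_trigger_checkpoint(cycles_since_checkpoint, period,
                              occupied_lines, total_lines, threshold=0.75):
    """Return True when a checkpoint interrupt should be raised."""
    periodic = cycles_since_checkpoint >= period            # regular interval elapsed
    nearly_full = occupied_lines / total_lines > threshold  # LLC is about to fill up
    return periodic or nearly_full


print(should_trigger_checkpoint(10, 100, 80, 100))   # occupancy exceeds the threshold
```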
In order to facilitate a better understanding of the specific application of the proposed lockstep control mechanism in a processor, the working process of the processor according to one embodiment of the present application will be described further below with reference to fig. 2, which will relate to the proposed lockstep control mechanism.
As shown in FIG. 2, processor core 0 acquires the instruction code to be executed from its instruction cache, and the same instruction code is also supplied, via a delay unit, to processor core 1, which is the redundant processor core, so that processor core 1 executes the same instructions as processor core 0. In general, however, execution of the instruction code by processor core 1 may be delayed by a number of clock cycles (e.g., 2 clock cycles, by means of the delay unit) relative to execution by processor core 0.
When no error occurs, processor core 0 and processor core 1 perform the same operations based on this instruction code and output the same execution result data; when an error occurs while one of the processor cores executes an instruction, however, the outputs of the two processor cores may become inconsistent. Lockstep is therefore needed to continuously check the correctness of program execution, detect processor errors promptly, and establish a fault containment region that prevents faults from propagating.
According to the lockstep control mechanism proposed by the present application, the lockstep control unit may periodically trigger a checkpoint interrupt to take a checkpoint. The lockstep control unit may trigger the checkpoint interrupt for both processor core 0 and processor core 1, except that the trigger for processor core 1 may be delayed by a number of clock cycles relative to the trigger for processor core 0 by means of a delay unit, in the same way that execution of the instruction code by processor core 1 is delayed. It should be noted that the two delay units shown in FIG. 2 may be two independent delay units or may be one shared delay unit.
When a checkpoint interrupt is triggered, processor cores 0 and 1 save their operating data at the checkpoint into the LLC and the shadow register set. For example, processor core 0 and processor core 1 save their operating data into the LLC and the shadow register set via the core 0 store data line and the core 1 store data line, respectively. As described previously, the operating data may include cache data to be flushed into the LLC and register data and processor core state data to be saved into the shadow register set.
Furthermore, according to the lockstep control mechanism proposed by the present application, the lockstep control unit will compare the data and addresses output after the two processor cores 0 and 1 execute the same instruction code without interruption (e.g., every clock cycle). As shown in fig. 2, the instruction address of processor core 0 and the internal cache data (including the address associated with the data) may be transferred to the lockstep control unit via an address transfer line and a core 0 store data line, respectively, and similarly, the instruction address of processor core 1 and the internal cache data (including the address associated with the data) may be transferred to the lockstep control unit via an address transfer line and a core 1 store data line, respectively.
The lockstep control unit continues to monitor the state of the two processor cores if it determines that their output data and addresses match, and triggers a rollback interrupt if it determines that their output data or addresses do not match. The lockstep control unit triggers the rollback interrupt for both processor core 0 and processor core 1, except that the trigger for processor core 1 may be delayed by a number of clock cycles relative to the trigger for processor core 0 by means of a delay unit, in the same way that the checkpoint interrupt and the execution of the instruction code by processor core 1 are delayed.
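The comparison step can be sketched as follows. Since core 1 runs a fixed number of cycles behind core 0, this sketch assumes that the comparator buffers the outputs of core 0 in a FIFO so that each one is compared against the matching output of core 1; that buffering, the function name `first_mismatch`, and the `(address, data)` tuple format are assumptions, not details given in the application.

```python
from collections import deque


def first_mismatch(core0_outputs, core1_outputs, delay=2):
    """Return the cycle at which the comparator sees a mismatch, or None.

    core0_outputs[k] is emitted by core 0 at cycle k; core1_outputs[k] is the
    corresponding output of core 1, emitted at cycle k + delay.
    """
    fifo = deque()
    for cycle in range(len(core0_outputs) + delay):
        if cycle < len(core0_outputs):
            fifo.append(core0_outputs[cycle])      # hold core 0's output
        if cycle >= delay:
            expected = fifo.popleft()              # core 0's output from `delay` cycles ago
            actual = core1_outputs[cycle - delay]  # core 1's output arriving now
            if expected != actual:
                return cycle                       # a rollback interrupt would be raised here
    return None


good = [(0x100, 1), (0x104, 2), (0x108, 3)]
bad = [(0x100, 1), (0x104, 9), (0x108, 3)]   # core 1 produced wrong data once
print(first_mismatch(good, good), first_mismatch(good, bad))
```

With a delay of 2 cycles, an error in the second output of core 1 is observed at cycle 3, that is, when that output actually reaches the comparator.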
When a rollback interrupt is triggered, each processor core changes the cache state of the cache data in its internal cache to an I state, reads the cache data stored by the processor core into the LLC at the checkpoint prior to the rollback interrupt from the LLC, and reads the register data and core state data at the checkpoint prior to the rollback interrupt from the shadow register set, thereby restoring the operating data of the processor core to the operating data at the checkpoint prior to the rollback interrupt.
For example, processor core 0 may restore the operating data at the checkpoint prior to the rollback interrupt into processor core 0 through the core load data line, and the same operating data is also restored into processor core 1 after passing through the delay unit. In this way, after the lockstep control unit triggers the rollback interrupt for processor core 0 and processor core 1, respectively, the operating data of the two processor cores match again and synchronized operation can continue. From this point on, the lockstep control unit continues to monitor the state of both processor cores.
An example of a dual core processor employing a lockstep control mechanism in accordance with an embodiment of the present application is described above in connection with FIG. 2. A lockstep control method for a processor according to an embodiment of the present application will be described with reference to fig. 3.
Fig. 3 shows a schematic flow chart of a lockstep control method for a processor according to an embodiment of the application. The processor may include two processor cores, such as a first processor core and a second processor core that is a redundant processor core. As shown in fig. 3, the lockstep control method may be performed by a lockstep control unit and include operations 310 to 340.
At operation 310, the lockstep control unit may periodically trigger a checkpoint interrupt to instruct each of the two processor cores to save the operating data of that processor core at the checkpoint into the LLC and shadow register set.
According to an embodiment of the application, the LLC is shared by the two processor cores and is configured to receive and store the cache data having the modified (M) cache state in each processor core when a checkpoint interrupt is triggered; the shadow register set is configured to receive and store the register data and core state data of each processor core when a checkpoint interrupt is triggered; and the operating data includes the cache data, the register data, and the core state data.
According to some embodiments of the application, the shadow register set may include a first shadow register subset and a second shadow register subset configured to alternately store register data and core state data for each processor core at a plurality of checkpoints in a ping-pong mode. The shadow register set may be located in a control register space outside of the internal cache of each processor core.
Furthermore, according to some embodiments of the application, each processor core may be based on the RISC-V instruction set architecture and configured to: when a checkpoint interrupt is triggered, flush the cache data in the processor core that has the M cache state into the LLC using a predefined flush instruction (e.g., flush M2E), and change the cache state of that cache data in the processor core to the exclusive (E) or invalid (I) cache state.
At operation 320, the lockstep control unit may compare the output data and addresses of the two processor cores. As described above, when no error occurs, the two processor cores perform the same operations based on the same instruction code and output the same execution result data; when an error occurs while one of the processor cores executes an instruction, however, the outputs of the two processor cores may become inconsistent. Lockstep is therefore needed to continuously check the correctness of program execution, detect processor errors promptly, and establish a fault containment region that prevents faults from propagating.
According to embodiments of the present application, the lockstep control unit may compare the output data and addresses of two processor cores without interruption. In order to maintain an exact match between the two processor cores, the comparison operation is typically performed every clock cycle, although embodiments of the application are not limited in this regard.
Then, at operation 330, the lockstep control unit determines whether the output data and address of the two processor cores match. In the event that it is determined that the output data and address of the two processor cores match, the lockstep control unit will return to operation 320 to continue monitoring the state of the two processor cores, while in the event that it is determined that the output data or address of the two processor cores do not match, the lockstep control unit will proceed to operation 340 to trigger a rollback interrupt.
At operation 340, the lockstep control unit may trigger a rollback interrupt to restore the operating data of the two processor cores to the operating data of the two processor cores at the checkpoint prior to the rollback interrupt, based on the operating data stored in the LLC and the shadow register set.
According to an embodiment of the application, the LLC is configured to: store cache data having a modified (M) cache state from each processor core in a temporary (T) cache state during a checkpoint interrupt; change the cache state of the stored cache data having the T cache state to the M cache state at the end of the checkpoint interrupt; and change the cache state of the stored cache data having the T cache state to the invalid (I) cache state at the beginning of the rollback interrupt. Cache data stored in the LLC having the T cache state is prohibited from being moved out of the LLC, and cache data in the LLC is SECDED protected.
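The T cache state transitions described above might be illustrated with the following minimal model. The class `CheckpointLLC` and its method names are assumptions made for this sketch. The model keeps a single copy per address and does not show how an older committed copy of the same address would be preserved, nor does it model SECDED protection.

```python
class CheckpointLLC:
    """Toy model of the LLC state transitions around checkpoint and rollback."""

    def __init__(self):
        self.lines = {}  # address -> [state, data]; state is "T", "M" or "I"

    def store_from_core(self, addr, data):
        # Data flushed by a core during a checkpoint interrupt arrives as T.
        self.lines[addr] = ["T", data]

    def end_checkpoint(self):
        # Checkpoint completed: commit every temporary line (T -> M).
        for line in self.lines.values():
            if line[0] == "T":
                line[0] = "M"

    def begin_rollback(self):
        # Rollback started: discard every uncommitted line (T -> I).
        for line in self.lines.values():
            if line[0] == "T":
                line[0] = "I"

    def can_evict(self, addr):
        # Lines in the T state must not be moved out of the LLC.
        return self.lines[addr][0] != "T"


llc = CheckpointLLC()
llc.store_from_core(0x100, 7)
evictable_during_checkpoint = llc.can_evict(0x100)  # False while in T
llc.end_checkpoint()                                # 0x100 becomes M
llc.store_from_core(0x200, 9)                       # next checkpoint begins
llc.begin_rollback()                                # 0x200 is discarded
```

Keeping T lines pinned until they are either committed or invalidated means an interrupted checkpoint can never leak partial state to memory in this model.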
According to an embodiment of the application, the lockstep control unit may also trigger a checkpoint interrupt when the proportion of occupied storage space in the LLC exceeds a predetermined threshold.
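Combining the periodic trigger with the occupancy-based trigger described above, the decision might be sketched as follows. The function name, its parameters and the use of a line count to measure occupancy are assumptions for illustration; the application does not specify how occupancy is measured or what threshold is used.

```python
def should_checkpoint(cycles_since_last, period_cycles,
                      occupied_lines, total_lines, occupancy_threshold):
    """Return True if a checkpoint interrupt should be triggered.

    A checkpoint is due either because the periodic interval has elapsed
    or because the occupied fraction of the LLC exceeds the threshold.
    All parameters are hypothetical names introduced for this sketch.
    """
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    periodic_due = cycles_since_last >= period_cycles
    pressure_due = (occupied_lines / total_lines) > occupancy_threshold
    return periodic_due or pressure_due


# Example: the period has not elapsed, but the LLC is 60% occupied
# against a 50% threshold, so a checkpoint is triggered early.
early = should_checkpoint(cycles_since_last=100, period_cycles=1000,
                          occupied_lines=60, total_lines=100,
                          occupancy_threshold=0.5)
```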
After operation 340 is complete, the lockstep control unit may return to operation 320 to continue monitoring the state of the two processor cores.
Further, it should be noted that the operations in the lockstep control method described above merely represent respective operations to be performed by the lockstep control unit, and these operations need not be performed in the order shown in the figures, but may be performed in other suitable order or in parallel. For example, operation 310 and operation 320 may be performed in parallel.
As can be seen from the above description in connection with FIGS. 2 and 3, with the lockstep control mechanism proposed by the present application, SECDED protection is required only for data in the LLC, and not for register data and cache data (e.g., data in the level one data cache L1D) inside the processor core. Furthermore, lockstep control of the processor core can be achieved without changing the design of the processor core itself; that is, the proposed lockstep control mechanism may be used for lockstep control of the processor core even if the processor core itself does not support lockstep operations. Thus, the lockstep control mechanism has a lower hardware cost and less impact on the performance of the processor core, and is particularly suitable for implementing lockstep control of high performance processor cores with internal caches.
The following paragraphs describe examples of various embodiments.
Example 1 includes a processor comprising: two processor cores including a first processor core and a second processor core as a redundant processor core; a last level cache (LLC) shared by the two processor cores; and a lockstep control unit configured to: periodically trigger a checkpoint interrupt to instruct each of the two processor cores to save the operating data of the processor core at the checkpoint to the LLC and a shadow register set, and trigger a rollback interrupt if it is determined that the output data or addresses of the two processor cores do not match, to restore the operating data of the two processor cores to the operating data of the two processor cores at the checkpoint prior to the rollback interrupt based on the operating data stored in the LLC and the shadow register set.
Example 2 includes the processor of example 1, wherein the LLC is configured to receive and store cache data in each processor core having a modified M cache state when a checkpoint interrupt is triggered; and the shadow register set is configured to receive and store register data and core state data for each processor core when the checkpoint interrupt is triggered, wherein the operating data includes the cache data, the register data and the core state data.
Example 3 includes the processor of example 2, wherein the LLC is configured to: store cache data from each processor core having the M cache state in a temporary T cache state during a checkpoint interrupt; change the cache state of the stored cache data having the T cache state to the M cache state when the checkpoint interrupt ends; and change the cache state of the stored cache data having the T cache state to an invalid I cache state at the beginning of the rollback interrupt.
Example 4 includes the processor of example 3, wherein cache data stored in the LLC having a T-cache state is prohibited from being moved out of the LLC.
Example 5 includes the processor of example 1, wherein the lockstep control unit is further configured to: trigger a checkpoint interrupt when the proportion of occupied storage space in the LLC exceeds a predetermined threshold.
Example 6 includes the processor of example 1, wherein the cache data in the LLC is protected by single error correction, double error detection (SECDED).
Example 7 includes the processor of example 1, wherein the shadow register set includes a first shadow register subset and a second shadow register subset configured to alternately store register data and core state data for each processor core at a plurality of checkpoints in a ping-pong mode.
Example 8 includes the processor of example 1, wherein the shadow register set is located in a control register space outside of the internal cache of each processor core.
Example 9 includes the processor of example 1, wherein the first processor core executes the same instructions as the second processor core, and execution of the instructions by the second processor core is delayed by a predetermined number of clock cycles relative to execution of the instructions by the first processor core.
Example 10 includes the processor of any one of examples 1 to 9, wherein each processor core is configured to: when a checkpoint interrupt is triggered, flush cache data in the processor core having a modified M cache state to the LLC and change the cache state of the cache data in the processor core to an exclusive E or invalid I cache state.
Example 11 includes the processor of any one of examples 1 to 9, wherein each processor core is based on a RISC-V instruction set architecture and is configured to: when a checkpoint interrupt is triggered, flush cache data in the processor core having a modified M cache state to the LLC using a predefined flush instruction, and change the cache state of the cache data in the processor core to an exclusive E or invalid I cache state.
Example 12 includes a lockstep control method for a processor including two processor cores including a first processor core and a second processor core that is a redundant processor core, the lockstep control method comprising: periodically triggering a checkpoint interrupt to instruct each of the two processor cores to save the operating data of the processor core at the checkpoint to a last level cache (LLC) and a shadow register set; and triggering a rollback interrupt if it is determined that the output data or addresses of the two processor cores do not match, to restore the operating data of the two processor cores to the operating data of the two processor cores at the checkpoint prior to the rollback interrupt based on the operating data stored in the LLC and the shadow register set, wherein the LLC is shared by the two processor cores and is configured to receive and store cache data from each processor core having a modified M cache state when the checkpoint interrupt is triggered; the shadow register set is configured to receive and store register data and core state data for each processor core when the checkpoint interrupt is triggered; and the operating data includes the cache data, the register data, and the core state data.
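As a non-limiting illustration of the ping-pong storage mentioned in Example 7, the two shadow register subsets might be modeled as follows. The class and method names are hypothetical, and the stated rationale (the subset holding the last complete checkpoint is left untouched while the other subset is being written) is one reading of the ping-pong mode rather than a detail confirmed by the examples above.

```python
class PingPongShadowRegisters:
    """Two shadow register subsets written alternately at successive checkpoints."""

    def __init__(self):
        self.banks = [None, None]  # each bank holds (registers, core_state)
        self.valid = None          # index of the last completely written bank

    def save(self, registers, core_state):
        # Write to the bank that does NOT hold the last complete checkpoint.
        target = 0 if self.valid != 0 else 1
        self.banks[target] = (dict(registers), dict(core_state))
        self.valid = target        # flip only after the write has finished

    def restore(self):
        # Return copies of the data saved at the most recent checkpoint.
        if self.valid is None:
            raise RuntimeError("no checkpoint has been saved yet")
        registers, core_state = self.banks[self.valid]
        return dict(registers), dict(core_state)


shadow = PingPongShadowRegisters()
shadow.save({"x1": 10, "x2": 20}, {"pc": 0x8000})  # checkpoint 1 -> bank 0
shadow.save({"x1": 11, "x2": 21}, {"pc": 0x8040})  # checkpoint 2 -> bank 1
restored_registers, restored_state = shadow.restore()
```

In this model a rollback always restores from the bank indicated by `valid`, so a checkpoint that was interrupted partway through a write would not be selected.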
Reference has been made above to "an embodiment" and "some embodiments". It should be understood, however, that features mentioned in connection with one embodiment are not necessarily applicable only to that embodiment, but may also be used in other embodiments. Features of one embodiment may be applied to or included in another embodiment.
Ordinal terms such as "first" and "second" are used above. It should be understood, however, that such expressions are merely for convenience of description and reference, and that no ordering relation exists between the objects so designated.
The present application may be embodied in other specific forms without departing from its spirit or essential characteristics. For example, the algorithms described in particular embodiments may be modified without departing from the basic spirit of the application. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (12)

1. A processor, comprising:
two processor cores including a first processor core and a second processor core as a redundant processor core;
a last level cache (LLC) shared by the two processor cores; and
a lockstep control unit configured to:
periodically trigger a checkpoint interrupt to instruct each of the two processor cores to save its operating data at a checkpoint to the LLC and a shadow register set, and
trigger a rollback interrupt if it is determined that the output data or addresses of the two processor cores do not match, to restore the operating data of the two processor cores to the operating data of the two processor cores at the checkpoint prior to the rollback interrupt based on the operating data stored in the LLC and the shadow register set.
2. The processor of claim 1, wherein,
the LLC is configured to receive and store cache data having a modified M cache state in each processor core when the checkpoint interrupt is triggered; and
the shadow register set is configured to receive and store register data and core state data for each processor core when the checkpoint interrupt is triggered,
wherein the operating data includes the cache data, the register data, and the core state data.
3. The processor of claim 2, wherein the LLC is configured to:
store the cache data having the modified M cache state from each processor core in a temporary T cache state during the checkpoint interrupt;
change the cache state of the stored cache data having the T cache state to the M cache state when the checkpoint interrupt ends; and
change the cache state of the stored cache data having the T cache state to an invalid I cache state at the beginning of the rollback interrupt.
4. The processor of claim 3, wherein cache data stored in the LLC having the T cache state is prohibited from being moved out of the LLC.
5. The processor of claim 1, wherein the lockstep control unit is further configured to: trigger the checkpoint interrupt when the proportion of occupied storage space in the LLC exceeds a predetermined threshold.
6. The processor of claim 1, wherein the cache data in the LLC is protected by single error correction, double error detection (SECDED).
7. The processor of claim 1, wherein the shadow register set comprises a first shadow register subset and a second shadow register subset configured to alternately store register data and core state data for each processor core at a plurality of checkpoints in a ping-pong mode.
8. The processor of claim 1, wherein the shadow register set is located in a control register space outside of an internal cache of each processor core.
9. The processor of claim 1, wherein the first processor core executes the same instructions as the second processor core, and execution of the instructions by the second processor core is delayed by a predetermined number of clock cycles relative to execution of the instructions by the first processor core.
10. The processor of any one of claims 1 to 9, wherein each processor core is configured to: flush cache data in the processor core having a modified M cache state to the LLC when the checkpoint interrupt is triggered, and change the cache state of the cache data in the processor core to an exclusive E or invalid I cache state.
11. The processor of any one of claims 1 to 9, wherein each processor core is based on a RISC-V instruction set architecture and is configured to: flush cache data in the processor core having a modified M cache state to the LLC with a predefined flush instruction when the checkpoint interrupt is triggered, and change the cache state of the cache data in the processor core to an exclusive E or invalid I cache state.
12. A lockstep control method for a processor, wherein the processor includes two processor cores including a first processor core and a second processor core as a redundant processor core, the lockstep control method comprising:
periodically triggering a checkpoint interrupt to instruct each of the two processor cores to save the operating data of the processor core at the checkpoint to the last level cache LLC and the shadow register set; and
triggering a rollback interrupt if it is determined that the output data or addresses of the two processor cores do not match, to restore the operating data of the two processor cores to the operating data of the two processor cores at the checkpoint prior to the rollback interrupt based on the LLC and the operating data stored in the shadow register set,
wherein the LLC is shared by the two processor cores and is configured to receive and store cache data having a modified M cache state in each processor core when the checkpoint interrupt is triggered;
the shadow register set is configured to receive and store register data and core state data for each processor core when the checkpoint interrupt is triggered; and
the operating data includes the cache data, the register data, and the core state data.
CN202311092193.3A 2023-08-28 2023-08-28 Lock step control apparatus and method for processor Active CN116821038B (en)


Publications (2)

Publication Number Publication Date
CN116821038A true CN116821038A (en) 2023-09-29
CN116821038B CN116821038B (en) 2023-12-26





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant