CN114610519B - Real-time recovery method and system for abnormal errors of processor register set - Google Patents

Info

Publication number
CN114610519B
Authority
CN
China
Prior art keywords
processor
register group
instruction
module
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210262087.4A
Other languages
Chinese (zh)
Other versions
CN114610519A (en)
Inventor
周婉婷
李磊
袁世伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202210262087.4A
Publication of CN114610519A
Application granted
Publication of CN114610519B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The invention discloses a method and a system for real-time recovery from abnormal errors of a processor register set, applied in the field of processor intrinsic security. It addresses two groups of problems: security threats to the processor register set during operation, such as soft errors, transient fault injection, and malicious tampering; and the shortcomings of existing error recovery techniques, namely high resource overhead, poor real-time performance, and the lack of an effective means of recovering register set errors. The proposed method ensures, to a certain extent, the reliability and security of the processor at the hardware level.

Description

Real-time recovery method and system for abnormal errors of processor register set
Technical Field
The invention belongs to the field of processor intrinsic (endogenous) security, and particularly relates to a technique for recovering from an abnormal state of a processor.
Background
A hardware-based fault injection attack injects faults by changing environmental parameters, interfering with the hardware, or changing the pin inputs of an integrated circuit chip. A hardware Trojan is a particular module, either deliberately implanted in a chip or electronic system or left behind unintentionally as a design defect, that an attacker can exploit under special conditions to perform a destructive function. An inserted hardware Trojan may leak information, change the circuit function, or even destroy the circuit. The general-purpose register set of a processor carries all intermediate data during operation, so a malicious attack on the register set is devastating for the processor. Such attacks include hardware Trojans, hidden backdoors, design vulnerabilities, electromagnetic pulse injection, laser injection, and the like. In a secure processor, real-time fault recovery effectively ensures stable and reliable system operation and is as important as detecting the abnormal state of the processor.
Existing methods for recovering from an abnormal processor state mainly include the following:
(1) Checkpoint-and-rollback recovery: X. Wang, et al., "An M-Cache-Based Security Monitoring and Fault Recovery Architecture for Embedded Processor," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 28, no. 11, pp. 2314-2327, Nov. 2020 (DOI: 10.1109/TVLSI.2020.3021533). This work restores the abnormal processor state by using function entries as checkpoints for backup and rollback. Its main disadvantages are: 1) it recovers from abnormal errors in processor instructions and cannot be applied to abnormal states of the processor register set; 2) because it detects and recovers in real time per partitioned program block, it must back up the processor registers, data RAM, instruction RAM, and memory in real time, which consumes a large amount of resources and incurs very high chip-area and delay costs; 3) its recovery is not sufficiently real-time.
(2) Attack-detection-and-error-recovery methods: A. Chaudhari, et al., "A framework for low overhead hardware based control flow error detection and recovery," IEEE 31st VLSI Test Symposium (2013) (DOI: 10.1109/VTS.2013.6548908). Its main disadvantages are: 1) it mainly addresses errors in the currently executing basic block and ignores basic blocks that are not executing; 2) every program must be analyzed in advance to extract the features of its basic blocks, so feature extraction must be repeated whenever a different program is executed, which adds a large amount of work.
Disclosure of Invention
To solve the above technical problems, the present invention provides a real-time recovery method for abnormal errors of a processor register set.
The first technical scheme adopted by the invention is a real-time recovery method for an abnormal state of a processor register set, comprising:
A1, collecting the instruction stream that enters the decode stage from the instruction fetch stage of the processor, and reordering the collected instruction stream to obtain its real execution order in the execute stage;
A2, latching the write channel signals of the processor register set according to the instruction stream reordered in step A1;
A3, when an abnormal state of the processor register set is found, acquiring the instruction and PC value corresponding to the moment the error occurred, and generating a corresponding error early warning signal;
A4, restoring the abnormal state of the processor register set in real time according to the abnormal state detection result, the instruction and PC value corresponding to the moment the abnormal state occurred, and the latched backup register set.
The second technical scheme adopted by the invention is a system for real-time recovery of an abnormal state of a processor register set, comprising a latch backup module, an instruction rearrangement module, a register set abnormal state detection module, and a rollback recovery module;
the instruction rearrangement module collects the instruction stream that enters the decode stage from the instruction fetch stage of the processor, reorders it to obtain its real execution order in the execute stage, and sends the rearranged instruction stream to the register set abnormal state detection module and the latch backup module;
the latch backup module latches and backs up the channel signals that the processor writes to the register set, and then selects the write channel data sources for the main and secondary register sets according to the real-time detection result of the register set abnormal state detection module;
the register set abnormal state detection module detects abnormal states of the register set in real time, outputs a signal indicating the real-time detection result, and also outputs the instruction and instruction PC value corresponding to the moment the register set error occurred;
the rollback recovery module restores the abnormal state of the processor register set in real time according to the register set abnormal state detection result, the instruction and PC value corresponding to the moment the abnormal state occurred, and the latched backup register set, so as to ensure safe and reliable execution of the processor.
The beneficial effects of the proposed method are reflected mainly in two aspects:
(1) The method recovers the abnormal state of the processor register set effectively by combining latched backup of the register set with rollback replacement of the PC value corresponding to the error moment. Verification covered reliability (100 percent real-time recovery of register set abnormal states), resource overhead (the latch backup function uses 38 registers), and delay overhead (an abnormal state is recovered within 12 clock cycles). The method outperforms existing processor abnormal state recovery methods in both real-time performance and resource consumption;
(2) The real-time recovery module designed according to the proposed method can be embedded into a processor easily: with only slight modification of the processor structure, and in combination with a register set abnormal state detection module, an abnormal state can be recovered within 12 clock cycles. The method is simple and efficient, and ensures the hardware-level reliability and security of the processor to a certain extent without occupying excessive hardware resources.
Drawings
FIG. 1 is a flowchart of a method for real-time recovery of an exception condition of a register set of a processor according to an embodiment of the present invention;
FIG. 2 is a block diagram of a system for real-time recovery of abnormal states of a processor register set according to the present invention;
FIG. 3 is an illustration of the malicious attack implanted on the register set to test the effectiveness of the present invention;
FIG. 4 shows simulation results of real-time recovery, using the method of the present invention, from the abnormal state caused by the implanted malicious attack on the register set;
wherein (a) shows register x15 being tampered with by the implanted hardware Trojan; (b) shows the NOP instructions inserted in step S13; and (c) shows, at recovery time, the PC address rolled back to 0x420, its value at the moment the error occurred, together with the value of x15 in the backup register set that is used.
Detailed Description
In order to facilitate understanding of the technical contents of the present invention by those skilled in the art, the present invention will be further explained with reference to the accompanying drawings.
Example 1
The invention provides a real-time recovery method for abnormal states of a processor register set, comprising the following steps:
S1, establishing a processor simulation execution environment for real-time recovery of abnormal errors of the processor register set;
S2, acquiring all write channel signals of the processor register set;
S3, delaying the write channel signals acquired in step S2 by 3 clock cycles using flip-flops;
S4, sampling the instruction stream of the processor decode stage in real time and then rearranging it;
S5, latching the write channel signals, according to the three-cycle-delayed write channel signals obtained in step S3 and the rearranged instruction stream obtained in step S4, so that the register values written into the backup register set correspond to the instruction preceding the one the processor is actually executing;
S6, obtaining a reference model of the register set state from the instruction stream obtained in step S4 and the architecture and instruction set information of the processor, then acquiring the actual register set state in real time, and finally comparing it with the register set state of the reference model to detect abnormal register set states in real time; specifically, if the register set state acquired in real time matches that of the reference model, the state is normal; otherwise, the state is abnormal;
S7, based on the real-time detection in step S6, once an abnormal state of the processor register set is found, acquiring the instruction and PC value corresponding to the moment the error occurred and generating a corresponding error early warning signal;
S8, if the register set abnormal error early warning signal generated in step S7 is valid, pausing the pipeline of the fetch, decode, execute, and write-back stages so that the processor stalls;
S9, selecting the values written into the main and secondary register sets according to the early warning signal generated in step S7, which serves as the selection signal for the register set inputs;
S10, when the processor starts, the early warning signal is invalid and the register set in use is the main register set; the write channel signals acquired in step S2 are written into the main register set in Bypass mode;
S11, when the processor starts, the early warning signal is invalid and the register set used for latch backup is the secondary register set; the write channel signals acquired in step S2 are delayed and then latched according to the rearranged instruction stream, as in step S5;
S12, selecting the register set output according to the early warning signal generated in step S7: if the early warning signal is invalid, the main register set of step S10 is selected as the output of the register set MUX; if the early warning signal is valid and the pipeline has been flushed, the secondary register set value produced in step S11 is selected as the output of the register set MUX;
S13, inserting a number of NOP instructions between the fetch stage and the decode stage;
S14, resuming the pipelines of the decode, execute, and write-back stages while keeping the fetch stage paused, so that the processor executes the NOP instructions inserted in step S13 but performs no instruction fetch;
S15, replacing the PC value with which the fetch stage reads the instruction RAM with the PC value corresponding to the moment the error occurred, as acquired in step S7;
S16, when the early warning signal is valid, the instructions held in the instruction prefetch FIFO of the fetch stage are instructions that follow the error moment; for rollback recovery the prefetch FIFO must be emptied, so that the instructions previously stored in it are no longer executed when the processor recovers from the abnormal state;
S17, resuming the fetch-stage pipeline once steps S12, S15, and S16 are complete;
S18, completing the real-time recovery of the abnormal error of the processor register set.
In step S3, flip-flops delay the write channel signals of the register set by 3 clock cycles for the following reason: the real-time detection technique used by the invention raises the early warning signal 2 clock cycles after the moment an error occurs. Delaying by 3 clock cycles therefore ensures that the backed-up register set values correspond to a time earlier than the error, so the backup register set contains no erroneous values.
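The timing argument above can be checked with a small behavioral model. The following Python sketch is only an illustration under stated assumptions: the class `DelayLine` and the helper `backup_is_clean` are hypothetical names, and only the two constants (a 2-cycle detection latency and a 3-cycle write-channel delay) come from the text.

```python
from collections import deque

DETECT_LATENCY = 2  # cycles between an error and the warning signal (from the text)
BACKUP_DELAY = 3    # flip-flop stages on the write channel (from the text)


class DelayLine:
    """Hypothetical model of a flip-flop chain: input at cycle t leaves at t + depth."""

    def __init__(self, depth, reset_value=None):
        self.stages = deque([reset_value] * depth, maxlen=depth)

    def tick(self, value):
        out = self.stages[0]       # oldest stage is presented at the output
        self.stages.append(value)  # new value enters; maxlen discards the oldest
        return out


def backup_is_clean(error_cycle):
    """True if the warning fires before a write issued at error_cycle leaves the chain."""
    warning_cycle = error_cycle + DETECT_LATENCY
    arrival_cycle = error_cycle + BACKUP_DELAY
    return warning_cycle < arrival_cycle


line = DelayLine(BACKUP_DELAY)
outputs = [line.tick(w) for w in ["w0", "w1", "w2", "w3", "w4"]]
```

In this model a write issued in the error cycle would only reach the backup one cycle after the warning, which is the margin the 3-cycle delay is meant to provide.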
Step S4 specifically comprises: sampling the instruction stream of the decode stage in real time; formulating a rearrangement rule for the decode-stage instruction stream according to the instruction set and the structure of the processor; and rearranging the decode-stage instruction stream according to that rule. For the instruction rearrangement, refer to paragraphs 59-72 of the specification of application No. 202110162587.6, which describe the specific details; it is not described further here.
Step S5 specifically comprises: collecting the rearranged instruction stream and delaying it by one beat; comparing the one-beat-delayed instruction stream with the undelayed one to obtain a signal indicating an instruction change, where 1 indicates that the processor has finished executing the current instruction and started executing the next one, and 0 indicates that the processor is still executing the current instruction; and latching the 3-beat-delayed register set write channel signals according to this indication signal: when it is 1, the 3-beat-delayed write channel signals are output to the backup register set, and when it is 0, they are held in the latch.
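The change detection and latching of step S5 can be sketched behaviorally as follows. This is a minimal Python illustration, not the patented circuit: `instruction_changed`, `BackupLatch`, and `run` are hypothetical names, and instructions and write-channel values are represented as plain strings.

```python
def instruction_changed(prev_instr, curr_instr):
    """1 when the rearranged stream differs from its one-beat-delayed copy, else 0."""
    return 1 if curr_instr != prev_instr else 0


class BackupLatch:
    """Hypothetical latch in front of the backup register set."""

    def __init__(self, initial=None):
        self.held = initial

    def tick(self, changed, delayed_write):
        # changed == 1: pass the 3-beat-delayed write through to the backup set.
        # changed == 0: keep holding the previously latched value.
        if changed:
            self.held = delayed_write
        return self.held


def run(stream, delayed_writes):
    """Drive the latch with one instruction and one delayed write per beat."""
    latch, prev, out = BackupLatch(), None, []
    for instr, write in zip(stream, delayed_writes):
        out.append(latch.tick(instruction_changed(prev, instr), write))
        prev = instr  # one-beat delay of the instruction stream
    return out
```

For example, an instruction that occupies the pipeline for several beats produces a single update of the backup value, on the beat where it first appears.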
The pipeline paused in step S8 comprises the instruction fetch stage, the decode stage, the execute stage, and the write-back stage.
In step S9, the input signals and the selection signal of the register set write channel MUX are:
1) Input signal 1: the processor write channel signals from step S2;
2) Input signal 2: the delayed and latched signals from step S5;
3) Selection signal: generated jointly from the error early warning signal of step S7 and the label of the register set the processor is using. When the processor starts, the register set in use is the main register set, and the signal written into it is selected as the signal in 1); the backup register set is the secondary register set, and the signal written into it is selected as the signal in 2);
When the early warning signal is 1 and the register set in use is the main register set, the processor switches to the backup (secondary) register set, and the signal written into it is selected as the signal in 1); the main register set now becomes the backup, and the signal written into it is selected as the signal in 2). When the early warning signal is 1 and the register set in use is the secondary register set, the processor switches to the backup (main) register set, and the signal written into it is selected as the signal in 1); the secondary register set becomes the backup, and the signal written into it is selected as the signal in 2).
As those skilled in the art will understand, "the abnormal error early warning signal is invalid" in the present invention means that no abnormality has occurred in the register set and the early warning signal is 0; correspondingly, "the abnormal error early warning signal is valid" means that the register set is abnormal and the early warning signal is 1.
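Combining this 0/1 convention with the reference-model comparison of step S6 gives a simple picture of how the warning could be derived. The Python sketch below is an assumption-laden illustration: the toy instruction encoding, the 16-entry register file, and the names `reference_step` and `warning_signal` are hypothetical and are not taken from the patent or from application No. 202110162587.6.

```python
from typing import List, Tuple

# Toy instruction format (opcode, rd, a, b), assumed purely for illustration.
Instr = Tuple[str, int, int, int]


def reference_step(regs: List[int], instr: Instr) -> List[int]:
    """Register state the reference model expects after executing one instruction."""
    op, rd, a, b = instr
    nxt = list(regs)
    if op == "addi":    # rd = regs[a] + immediate b
        nxt[rd] = (regs[a] + b) & 0xFFFFFFFF
    elif op == "add":   # rd = regs[a] + regs[b]
        nxt[rd] = (regs[a] + regs[b]) & 0xFFFFFFFF
    elif op != "nop":
        raise ValueError(f"unsupported opcode: {op}")
    return nxt


def warning_signal(observed: List[int], expected: List[int]) -> int:
    """0 (invalid) when the observed state matches the model, 1 (valid) otherwise."""
    return 0 if observed == expected else 1


expected = reference_step([0] * 16, ("addi", 15, 0, 4))  # model: x15 becomes 4
tampered = list(expected)
tampered[15] = 5                                          # register altered outside the model
```

Here `warning_signal(expected, expected)` yields 0 and `warning_signal(tampered, expected)` yields 1, mirroring the invalid and valid cases described above.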
The NOP instructions inserted in step S13 serve two purposes: 1) pipeline flushing, which clears the invalid state and useless control signals left in the processor by pausing the pipeline in step S8; 2) providing time for the processing of steps S15 and S16.
Step S15 specifically comprises: counting the NOP instructions inserted in step S13; generating, from this count, an instruction PC address jump signal that lasts one beat; and, on this jump signal, replacing the PC address that the fetch stage outputs to the instruction RAM with the PC address corresponding to the moment of the register set error while setting the read request signal of the instruction RAM to 1, so that the instruction at that PC address can be fetched correctly once the fetch-stage pipeline is resumed one clock cycle later.
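The one-beat jump pulse of step S15 can be modeled as a counter that fires exactly once. The following Python sketch makes several assumptions that the text does not fix: the NOP total of 3, the class name `PcReplacer`, and the idea that on non-jump beats the fetch stage's own PC and read request are simply passed through.

```python
class PcReplacer:
    """Hypothetical model of the PC replacement logic in step S15."""

    def __init__(self, nop_total):
        self.nop_total = nop_total  # number of NOPs to insert (assumed parameter)
        self.nop_count = 0
        self.fired = False          # ensures the jump pulse lasts a single beat

    def tick(self, nop_inserted, fetch_pc, fetch_req, error_pc):
        """Return (pc_to_instruction_ram, read_request, jump_pulse) for this beat."""
        if nop_inserted:
            self.nop_count += 1
        jump = self.nop_count == self.nop_total and not self.fired
        if jump:
            self.fired = True
            return error_pc, 1, 1   # drive the error-moment PC and force a read
        return fetch_pc, fetch_req, 0


replacer = PcReplacer(nop_total=3)
trace = [replacer.tick(True, 0x430, 0, 0x420) for _ in range(3)]
trace.append(replacer.tick(False, 0x434, 0, 0x420))
```

In this trace the third beat outputs the error-moment PC 0x420 with the read request set, and the pulse is gone again on the fourth beat.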
Step S16 resumes the fetch-stage pipeline in the clock cycle following step S15; the signal indicating fetch-stage pipeline resumption is obtained by XORing the abnormal error early warning signal with the PC replacement completion signal.
Step S17 specifically comprises: obtaining the depth FIFO_Depth of the instruction prefetch FIFO from the processor architecture; starting a counter at the moment the fetch-stage pipeline is resumed in step S16 and stopping it when the count equals FIFO_Depth; and generating from the count a gating signal lasting FIFO_Depth clock cycles, during which instructions entering the decode stage are temporarily supplied directly by the instruction output of the instruction RAM rather than by the prefetch FIFO.
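The counter-based gating described here can be sketched as follows. This Python model is illustrative only: `FifoBypassGate` and `select_instruction` are hypothetical names, and the depth of 4 used in the example is an assumed value, since FIFO_Depth depends on the processor architecture.

```python
class FifoBypassGate:
    """Hypothetical counter that gates the prefetch FIFO for FIFO_Depth cycles."""

    def __init__(self, fifo_depth):
        self.fifo_depth = fifo_depth
        self.count = fifo_depth  # idle: counter already sits at its stop value

    def start(self):
        """Called when the fetch-stage pipeline is resumed."""
        self.count = 0

    def tick(self):
        """Return 1 while decode should bypass the prefetch FIFO, else 0."""
        if self.count < self.fifo_depth:
            self.count += 1
            return 1
        return 0


def select_instruction(gate_active, fifo_out, ram_out):
    """Decode-stage input: instruction RAM output while gated, FIFO output otherwise."""
    return ram_out if gate_active else fifo_out


gate = FifoBypassGate(fifo_depth=4)  # depth of 4 is an assumed example value
gate.start()
gating = [gate.tick() for _ in range(6)]
```

With a depth of 4 the gating signal is high for four cycles after the restart and low afterwards, so stale FIFO entries never reach the decode stage in this model.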
Example 2
The invention provides a real-time recovery system for abnormal states of a processor register set, as shown in FIG. 2, comprising a latch backup module, an instruction rearrangement module, a register set abnormal state detection module, and a rollback recovery module;
the instruction rearrangement module collects the instruction stream that enters the decode stage from the fetch stage, reorders it to obtain its real execution order in the execute stage, and sends the rearranged instruction stream to the register set abnormal state detection module and the latch backup module;
the latch backup module latches and backs up the channel signals that the processor writes to the register set, and then selects, according to the real-time detection result of the register set abnormal state detection module, whether the write channel data entering the main and secondary register sets come from Bypass or from the delay latch module;
the register set abnormal state detection module detects abnormal states of the register set in real time, outputs a signal indicating the real-time detection result, and also outputs the instruction and instruction PC value corresponding to the moment the register set error occurred.
The rollback recovery module restores the abnormal state of the processor register set in real time according to the register set abnormal state detection result, the instruction and PC value corresponding to the moment the abnormal state occurred, and the latched backup register set, so as to ensure safe and reliable execution of the processor.
As shown in FIG. 2, the processor pipeline comprises four stages: an instruction fetch stage, a decode stage, an execute stage, and a write-back stage.
In this embodiment, the latch backup module operates as follows: the first multiplexer selects, according to the real-time detection result of the register set abnormal state detection module, whether the data source of the write channel entering the main register set is the first Bypass or the first delay latch module; the second multiplexer selects, according to the same detection result after it has passed through an inverter, whether the data source of the write channel entering the secondary register set is the second Bypass or the second delay latch module; the third multiplexer selects, according to the detection result, whether the register set output comes from the main register set or the secondary register set;
the first delay latch module and the second delay latch module latch and release the write channel signals delayed by three clock cycles according to the instruction stream generated by the instruction rearrangement module.
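The three multiplexers of the latch backup module can be expressed as a small combinational function. The Python sketch below is an illustration under an assumed polarity, chosen to agree with steps S10 to S12: a warning of 0 routes Bypass to the main register set and the delay latch to the secondary one. The function names are hypothetical.

```python
def mux(sel, in0, in1):
    """Two-input multiplexer: returns in0 when sel is 0, in1 when sel is 1."""
    return in1 if sel else in0


def latch_backup_routing(warning, bypass, delayed_latched, main_out, secondary_out):
    """Return (main_write, secondary_write, regfile_out) for one warning value."""
    # First multiplexer: write source of the main register set.
    main_write = mux(warning, bypass, delayed_latched)
    # Second multiplexer: same inputs, select passed through an inverter.
    secondary_write = mux(1 - warning, bypass, delayed_latched)
    # Third multiplexer: which register set drives the processor.
    regfile_out = mux(warning, main_out, secondary_out)
    return main_write, secondary_write, regfile_out


normal = latch_backup_routing(0, "bypass", "latched", "main", "secondary")
alarm = latch_backup_routing(1, "bypass", "latched", "main", "secondary")
```

Under this assumption `normal` is `("bypass", "latched", "main")` and `alarm` is `("latched", "bypass", "secondary")`, so the two register sets exchange roles when the warning is asserted.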
The instruction rearrangement module in this embodiment collects the instruction stream entering the decode stage from the fetch stage, rearranges it according to the instruction rearrangement rule to restore its real execution order in the execute stage, and sends the rearranged instructions to the register set abnormal state detection module and the delay latch modules. For the rearrangement rule, refer to patent application No. 202110162587.6. In essence, this module provides the indication signal used to latch the write channel signals.
The rollback recovery module in this embodiment specifically comprises an NOP instruction insertion module, a PC address replacement module, an instruction prefetch FIFO processing module, and an early warning signal reset module;
(1) NOP instruction insertion module: this module acts on the fetch stage of the processor pipeline. After the pipeline is paused, it inserts a number of NOP instructions into the instruction output path of the fetch stage; the number of NOP instructions is determined by the pipeline flush cycles and the PC address replacement cycles. When NOP insertion is complete, it generates the enable signals that resume the decode, execute, and write-back stage pipelines. This module clears the invalid state and useless control signals caused by pausing the processor pipeline and also provides time for the subsequent processing of the rollback recovery module.
(2) PC address replacement module: this module generates an instruction PC address jump signal, lasting one beat, from a count of the NOP instructions, whose number is determined by the clock cycles consumed by the pipeline flush. It then replaces the PC address that the processor prefetch module outputs to the instruction RAM with the PC address corresponding to the moment of the register set error, and sets the instruction read request signal and the PC address valid signal to 1, so that the instruction at that PC address can be fetched correctly once the fetch stage is resumed one clock cycle later. The fetch-stage pipeline is resumed immediately after the replacement is complete.
(3) Instruction prefetch FIFO processing module: this module obtains the depth FIFO_Depth of the instruction prefetch FIFO from the processor architecture; when the fetch-stage pipeline is resumed, it starts a counter that stops once the count equals FIFO_Depth, and from the count it generates a gating signal lasting FIFO_Depth clock cycles; while this signal is active, instructions entering the decode stage are temporarily supplied directly by the instruction output of the instruction RAM rather than by the prefetch FIFO.
(4) Early warning signal reset module: this module detects whether the abnormal state of the processor register set has been recovered successfully and reports the result to the register set abnormal state detection module so that the early warning signal can be reset.
The method recovers the abnormal state of the processor register set effectively by combining latched backup of the register set with rollback replacement of the PC value corresponding to the error moment. Verification covered reliability (100 percent real-time recovery of register set abnormal states), resource overhead (the latch backup function uses 38 registers), and delay overhead (an abnormal state is recovered within 12 clock cycles). The method outperforms existing processor abnormal state recovery methods in both real-time performance and resource consumption. The real-time recovery module designed according to the proposed method can be embedded into a processor easily: with only slight modification of the processor structure, and in combination with a register set abnormal state detection module, an abnormal state can be recovered within 12 clock cycles. The method is simple and efficient, and ensures the hardware-level reliability and security of the processor to a certain extent without occupying excessive hardware resources.
To verify the effectiveness of the invention, this embodiment gives an example, shown in fig. 3, in which the processor register set is subjected to a malicious tampering attack. The inserted hardware Trojan maliciously tampers with register x15 of the register set, changing its correct value from 4 to 5; as a result the program entry address is wrong, the program jumps to the wrong entry, and the function executes incorrectly. The abnormal state of the processor register set is then recovered according to the method of the invention, and the waveforms of the recovery result are shown in fig. 4. Fig. 4 (a) shows that the inserted hardware Trojan has maliciously tampered with register x15, changing the correct value from 4 to 5; the corresponding PC value is 0x420 from the previous cycle, and the x15 value in the backup register set is correct. Fig. 4 (b) shows the NOP instructions inserted during the recovery process in S13, with the backup register holding the correct value. Fig. 4 (c) shows that at recovery time the PC address is rolled back to 0x420, its value at the moment the error occurred, and the x15 value of the backup register set now in use is 4, which is the correct value.
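The x15 scenario can be replayed with a small software model of the two register sets. This is a minimal sketch, not the patented circuit: the class `DualRegisterBanks` and its method names are hypothetical, detection is assumed to happen elsewhere, and the hold and release of the delayed write by the reordered instruction stream is simplified to a fixed three-cycle delay.

```python
from collections import deque

DELAY_CYCLES = 3  # write-channel delay before the backup copy is updated


class DualRegisterBanks:
    """Two register sets: one used by the processor, one kept as a delayed backup."""

    def __init__(self, num_regs: int = 32) -> None:
        self.banks = [[0] * num_regs, [0] * num_regs]
        self.active = 0             # index of the set the processor currently uses
        self.delay_line = deque()   # entries: [cycles_left, reg, value]

    @property
    def backup(self) -> int:
        return 1 - self.active

    def write(self, reg: int, value: int) -> None:
        """Bypass write to the active set; queue the same write for the backup."""
        self.banks[self.active][reg] = value
        self.delay_line.append([DELAY_CYCLES, reg, value])

    def tick(self) -> None:
        """Advance one clock cycle and apply writes whose delay has elapsed."""
        for entry in self.delay_line:
            entry[0] -= 1
        while self.delay_line and self.delay_line[0][0] <= 0:
            _, reg, value = self.delay_line.popleft()
            self.banks[self.backup][reg] = value

    def read(self, reg: int) -> int:
        return self.banks[self.active][reg]

    def tamper(self, reg: int, value: int) -> None:
        """Model a hardware Trojan that corrupts the active set directly."""
        self.banks[self.active][reg] = value

    def recover(self, error_pc: int) -> int:
        """Swap the roles of the two sets and return the PC to refetch from."""
        self.active = self.backup
        return error_pc


def demo() -> tuple[int, int]:
    regs = DualRegisterBanks()
    regs.write(15, 4)                 # legitimate write of x15 = 4
    for _ in range(DELAY_CYCLES):
        regs.tick()                   # the backup set now also holds x15 = 4
    regs.tamper(15, 5)                # Trojan changes x15 from 4 to 5
    pc = regs.recover(0x420)          # roll back to the PC of the faulty instruction
    return pc, regs.read(15)


if __name__ == "__main__":
    pc, x15 = demo()
    print(hex(pc), x15)
```

Because the Trojan is modelled as bypassing the write channel, the tampered value never enters the delay line, so the backup set still holds the last legitimately written value when the roles are swapped.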
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and that the scope of protection of the invention is not limited to these specific statements and embodiments. Various modifications and variations of the invention will be apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the invention shall fall within the scope of the claims of the invention.

Claims (2)

1. A real-time recovery method for an abnormal state of a processor register set, characterized by comprising the following steps:
A1, at the instruction-fetch stage of the processor, collecting the instruction stream entering the decode stage, and reordering the collected instruction stream to obtain the actual execution order of the instruction stream at the execution stage;
A2, latching the write-channel signal of the processor according to the instruction stream reordered in step A1; step A2 comprises the following steps:
A21, using two register sets, one serving as the register set used by the processor and the other serving as the backup register set;
A22, if the state of the register set used by the processor is normal at the current moment, writing the processor write-channel signal into the register set used by the processor in Bypass mode, and writing the processor write-channel signal into the backup register set after delayed latching;
A23, if the state of the register set used by the processor is abnormal at the current moment, taking the backup register set of the previous moment as the register set currently used by the processor, and writing the processor write-channel signal into the register set currently used by the processor in Bypass mode; taking the register set used by the processor at the previous moment as the current backup register set, and writing the processor write-channel signal into the current backup register set after delayed latching;
wherein the delayed latching of the processor write-channel signal is specifically: delaying the processor write-channel signal by three clock cycles and then performing the latch operation;
the write-channel signal delayed by three clock cycles is latched and released according to the reordered instruction stream; specifically:
delaying the reordered instruction stream by one beat; comparing the instruction stream delayed by one beat with the undelayed instruction stream; if the processor has completed the current instruction, releasing the write-channel signal delayed by three clock cycles; if the processor is still executing the current instruction, latching the write-channel signal delayed by three clock cycles;
A3, when an abnormal state of the processor register set is detected, acquiring the instruction and the PC value corresponding to the moment the error occurred, and generating a corresponding error early-warning signal;
A4, recovering the abnormal state of the processor register set in real time according to the abnormal-state detection result of the processor register set, the instruction and the PC value corresponding to the moment the abnormal state occurred, and the latched backup register set; step A4 comprises the following steps:
A41, if the state of the register set used by the processor is abnormal at the current moment, acquiring the instruction and the PC value corresponding to the moment the error occurred;
A42, stalling the pipelines of the instruction-fetch stage, decode stage, execution stage, and write-back stage of the processor;
A43, inserting a plurality of NOP instructions between the instruction-fetch stage and the decode stage;
A44, resuming the pipelines of the decode stage, execution stage, and write-back stage of the processor, so that the processor executes the NOP instructions inserted in step A43;
A45, according to the instruction and the PC value corresponding to the moment the error occurred, replacing the fetch PC value of the instruction-fetch RAM in the instruction-fetch stage of the processor with the PC value corresponding to the moment the error occurred;
A46, resuming the pipeline of the instruction-fetch stage of the processor;
and A47, acquiring the depth FIFO_Depth of the instruction-prefetch FIFO, and generating a gating signal lasting FIFO_Depth clock cycles, so that the instructions entering the decode stage are supplied directly by the instruction output of the instruction-fetch RAM.
2. A real-time recovery system for an abnormal state of a processor register set, characterized by comprising: a latch backup module, an instruction reordering module, a register set abnormal-state detection module, and a rollback recovery module;
the instruction reordering module collects, at the instruction-fetch stage of the processor, the instruction stream entering the decode stage, reorders the collected instruction stream to obtain the actual execution order of the instruction stream at the execution stage, and sends the reordered instruction stream to the register set abnormal-state detection module and the latch backup module;
the latch backup module latches and backs up the signals of the channel through which the processor writes to the register set, and determines which register set is output according to the real-time detection result of the register set abnormal-state detection module; the latch backup module comprises: a first multiplexer, which selects the data source of the write channel of the first register set as either the first Bypass or the first delay latch module according to the real-time detection result of the register set abnormal-state detection module; an inverter, whose input is the real-time detection result of the register set abnormal-state detection module; a second multiplexer, which selects the data source of the write channel of the second register set as either the second Bypass or the second delay latch module according to the output of the inverter; and a third multiplexer, which selects either the first register set or the second register set as the register set output according to the real-time detection result of the register set abnormal-state detection module;
one of the first register set and the second register set serves as the register set used by the processor, and the other serves as the backup register set;
if the real-time detection result of the register set abnormal-state detection module indicates that the state of the register set is normal, the multiplexer corresponding to the register set used by the processor selects the Bypass corresponding to that register set as the data source of its write channel, and the multiplexer corresponding to the backup register set selects the delay latch module corresponding to the backup register set as the data source of its write channel;
if the real-time detection result of the register set abnormal-state detection module indicates that the state of the register set is abnormal, the backup register set of the previous moment is taken as the register set currently used by the processor, and the multiplexer corresponding to the register set currently used by the processor selects the Bypass corresponding to that register set as the data source of its write channel; the register set used by the processor at the previous moment is taken as the current backup register set, and the multiplexer corresponding to the current backup register set selects the delay latch module corresponding to the current backup register set as the data source of its write channel;
the first delay latch module or the second delay latch module latches and releases the write-channel signal delayed by three clock cycles according to the instruction stream generated by the instruction reordering module; specifically:
the reordered instruction stream is delayed by one beat; the instruction stream delayed by one beat is compared with the undelayed instruction stream; if the processor has completed the current instruction, the first delay latch module or the second delay latch module releases the write-channel signal delayed by three clock cycles; if the processor is still executing the current instruction, the first delay latch module or the second delay latch module latches the write-channel signal delayed by three clock cycles;
the register set abnormal-state detection module detects abnormal states of the register set in real time, outputs the real-time detection result, and at the same time outputs the instruction and the instruction PC value corresponding to the moment the register set error occurred;
the rollback recovery module recovers the abnormal state of the processor register set in real time according to the abnormal-state detection result of the register set, the instruction and the PC value corresponding to the moment the abnormal state occurred, and the latched backup register set; the rollback recovery module comprises: a NOP instruction insertion module, a PC address replacement module, and an instruction-prefetch FIFO processing module;
after the processor pipeline is stalled, the NOP instruction insertion module inserts a plurality of NOP instructions into the instruction output path of the instruction-fetch stage of the processor pipeline; when insertion of the NOP instructions is complete, it generates an enable signal to resume the pipelines of the decode stage, execution stage, and write-back stage of the processor;
the PC address replacement module replaces the fetch PC address of the instruction-fetch RAM in the instruction-prefetch module of the processor with the PC address corresponding to the moment the register set error occurred, and the fetch-stage pipeline is resumed immediately after the replacement is complete;
the instruction-prefetch FIFO processing module generates a gating signal lasting FIFO_Depth clock cycles according to the depth FIFO_Depth of the instruction-prefetch FIFO.
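The selection logic of the latch backup module in claim 2 can be sketched as two small pure functions. Everything named here is hypothetical: the select polarity (`use_second_bank`), the string labels, and the function names are illustrative assumptions. In particular, `release_delayed_write` encodes one possible reading of the comparison step, namely that a difference between the one-beat-delayed instruction stream and the undelayed stream indicates that the previous instruction has completed.

```python
def select_paths(use_second_bank: bool) -> dict[str, str]:
    """Model the three multiplexers and the inverter of the latch backup module.

    use_second_bank is an assumed select level derived from the detection
    result: False means the first register set serves the processor,
    True means the second register set serves the processor.
    """
    inverted = not use_second_bank   # inverter output drives the second multiplexer
    mux1 = "delay_latch_1" if use_second_bank else "bypass_1"
    mux2 = "delay_latch_2" if inverted else "bypass_2"
    mux3 = "register_set_2" if use_second_bank else "register_set_1"
    return {
        "set_1_write_source": mux1,
        "set_2_write_source": mux2,
        "output": mux3,
    }


def release_delayed_write(current_instr: int, delayed_instr: int) -> bool:
    """Decide whether a delay latch module releases its held write.

    Assumption: if the undelayed instruction differs from the instruction
    delayed by one beat, the earlier instruction has completed and the write
    delayed by three clock cycles may be released; otherwise it stays latched.
    """
    return current_instr != delayed_instr


if __name__ == "__main__":
    print(select_paths(False))
    print(select_paths(True))
    print(release_delayed_write(0x00000013, 0x00000013))
```

Under these assumptions the register set selected for output is always written through its Bypass path while the other set is written through its delay latch module, so toggling the select level swaps the roles of the two sets without any copying.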
CN202210262087.4A 2022-03-17 2022-03-17 Real-time recovery method and system for abnormal errors of processor register set Active CN114610519B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210262087.4A CN114610519B (en) 2022-03-17 2022-03-17 Real-time recovery method and system for abnormal errors of processor register set

Publications (2)

Publication Number Publication Date
CN114610519A CN114610519A (en) 2022-06-10
CN114610519B true CN114610519B (en) 2023-03-14

Family

ID=81863794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210262087.4A Active CN114610519B (en) 2022-03-17 2022-03-17 Real-time recovery method and system for abnormal errors of processor register set

Country Status (1)

Country Link
CN (1) CN114610519B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115328690B (en) * 2022-10-13 2023-02-17 北京登临科技有限公司 Exception handling method, computer readable medium and electronic device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4805095A (en) * 1985-12-23 1989-02-14 Ncr Corporation Circuit and a method for the selection of original data from a register log containing original and modified data
DE102005055067A1 (en) * 2005-11-18 2007-05-24 Robert Bosch Gmbh Device and method for correcting errors in a system having at least two execution units with registers
US8972782B2 (en) * 2012-11-09 2015-03-03 International Business Machines Corporation Exposed-pipeline processing element with rollback
US10515049B1 (en) * 2017-07-01 2019-12-24 Intel Corporation Memory circuits and methods for distributed memory hazard detection and error recovery
CN112905995B (en) * 2021-02-05 2022-08-05 电子科技大学 Method and system for detecting abnormal behaviors of register group in processor in real time

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant