EP4205010A1 - Fault resistant memory access - Google Patents

Fault resistant memory access

Info

Publication number
EP4205010A1
Authority
EP
European Patent Office
Prior art keywords
data
memory system
location
error check
execution core
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP20774955.7A
Other languages
German (de)
French (fr)
Inventor
Andrew Dellow
Mark Bowen Hill
Tariq Kurd
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/78 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
    • G06F21/79 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data in semiconductor storage media, e.g. directly-addressable memories

Definitions

  • This invention relates to the detection of faults in data processing devices, for example when accessing the memory of a security central processing unit.
  • Faults can occur in data processing devices such as security central processing units (CPU). These faults can lead to undesirable behaviour of the device. For example, a fault in such a device can lead to an incorrect instruction being run, which can give an unexpected result.
  • CPU: central processing unit
  • Faults can be introduced accidentally or deliberately.
  • An example of an accidental fault is one caused by a high energy particle or beam, such as a cosmic ray, that strikes a part of the processing device, causing it to malfunction.
  • Variations in characteristics or performance of a component device of a processing device can introduce faults into the processing device.
  • Examples of deliberate fault injection are those caused by laser probing (such as heating up at least a portion of the processing device using a laser) and introducing glitches in the power supply to the processing device.
  • Deliberate fault injection can be used to try to get unexpected behaviour in the processing device. If a processing device runs incorrect instructions, an attacker can use this to try and compromise the processing device. This can help an attacker gain control of the processing device and/or reveal details of the operation of the processing device.
  • a security CPU is implemented within a secure subsystem, where it has a local memory.
  • the local memory is accessible from outside the subsystem, but the access from other components is very restricted.
  • the security CPU authenticates code and copies it to the local memory before it executes it. If it is possible to tamper with the local memory, then the integrity of the device is broken.
  • One method of breaking the integrity is to cause the CPU to fetch from the wrong memory location. This could, for example, cause the CPU to fail to execute a branch which checks whether security criteria have been met. The CPU then fails to detect a security problem and continues executing.
  • Injected faults can also corrupt the memory contents, corrupt memory write data between the core and memory, corrupt read data between the memory and the core and corrupt the memory address used to access the memory.
  • One method of fault protection is to use parity protection on the data busses. However, this may only offer weak protection and does not check whether the memory address has been corrupted.
  • US 7603609 B2 describes a method and system for optimized instruction fetch to protect against soft and hard errors. This method replaces faulty read data with a known value.
  • a data processing device comprising an execution core and a memory system communicatively coupled to the execution core via a data path, the memory system being configured to receive from the execution core a request to fetch data from a location in the memory system, the request including data specifying the location, and in response to that request: retrieve a data unit from the location; form an error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit; and transmit the retrieved data unit and the error check value to the execution core.
  • Only every n-th location in the memory may be addressable by a request from the execution core, and the memory system may be configured to, in response to the request: retrieve n data units from each of n contiguous locations including the location specified in the request; form n error check values each computed over both of (i) at least part of the data specifying the location and (ii) at least part of a respective one of the retrieved data units; and transmit the n retrieved data units and the n error check values to the execution core. This may allow the integrity of the n values to be verified in response to such a request.
  • n may be 2. This may provide an efficient balance between transferred data content and error check data.
  • the length of each data unit may be two bytes. This may provide an efficient balance between transferred data content and error check data.
  • the data processing device may comprise a bus for conveying the or each retrieved data unit and the or each error check value from the memory system to the execution core.
  • the width of the bus may be at least n times the total number of bits in a data unit and an error check value. This may allow the data units and the error check values responsive to a single request to be passed simultaneously.
  • the memory system may be configured to generate the or each error check value using a cyclic redundancy check computed over the inputs thereto. This may be an efficient manner of computing the error check value.
  • the error check value may be a remainder of the cyclic redundancy check. This may efficiently compress the error check information.
  • the memory system may be configured to transmit the or each data unit and the or each error check value responsive to a single request in parallel to the execution core. This may be an efficient way of transporting data between the components of the data processing device.
  • the memory system may comprise a storage block and an output interface, wherein the storage block stores data at the location and the output interface is configured to form the error check value. This may allow the memory system to store the data internally.
  • the memory system may comprise an input interface configured to receive the request and scramble the data specifying the location to determine the location specified by the data. This may conveniently allow the memory address and data to be scrambled before accessing the memory.
  • the execution core may be configured to: form the request including data specifying the location; store, locally to the execution core, the data specifying the location; transmit the request to the memory system; receive from the memory system a data unit and an error check value; form a local error check value computed over both of (i) at least part of the data specifying the location as stored locally to the execution core and (ii) at least part of the data unit as received from the memory system; check whether the local error check value matches the error check value received from the memory system; and if the local error check value does not match the error check value received from the memory system, raise an error.
  • the execution core may be configured to raise the error such that execution is halted. This may prevent a fault from corrupting the system or prevent an attacker from being able to influence or control the operation of a processing module in fetching instructions from memory without such influence or control being detected. Once a fault has been detected, steps can be taken to reassert authorised control and limit the effect of the attack or other fault.
  • the execution core may have an execution pipeline and the execution core may be configured to locally store the data specifying the location in the execution pipeline.
  • the execution core may be configured to: interpret the data unit as received from the memory system as an instruction; execute that instruction; and perform the said check before completing the instruction. This may prevent a faulty instruction from updating state in the CPU as a result of an accidental or deliberate fault.
  • the memory system may be parity protected. This may provide additional fault protection.
  • Figure 1 shows an example of a data processing device comprising an execution core and a memory.
  • Figure 2 shows an exemplary flowchart detailing the method steps performed by the memory.
  • the present disclosure relates to a method for ensuring that the processing module accesses the correct location in the memory, write data reaches the memory without corruption and read data is returned from the memory without corruption.
  • additional signals are added to the memory bus interface to protect the data and the memory address.
  • the write data is protected on the memory request path.
  • the read data is protected on the memory response path.
  • the accessed address is looped back to the core processor. The core processor then checks that the accessed address matches the expected address.
  • the memory receives a request from the core to fetch data from a location in the memory.
  • the request includes data specifying the location of the requested data in the memory.
  • the memory is configured to retrieve a data unit from the location.
  • An error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit is formed. The retrieved data unit and the error check value are then transmitted to the execution core.
  • the error check is preferably a cyclic redundancy check (CRC). However, other suitable error checks may also be used.
  • CRC: cyclic redundancy check
  • CRC protection is added to the read data and the write data.
  • the error check value may be a CRC remainder.
  • the address can be included in the CRC calculation so that the CRC check only passes if the address was correctly received.
  • a CRC check can detect multiple bit errors; the number of bit errors that is guaranteed to be detected depends on the polynomial used for the CRC check.
  • CRC remainders are generated from not only the data but also from the requested address. This means that when the data is received (write data at the memory, read data at the core) then the CRC check includes whether the address was correctly received or not. For the write data check at the memory, this is simple because the address is sent as part of the request.
  • the load-store unit checks the CRC to confirm that data was read from the correct address.
  • the CRC can also be used to show that instructions issue at the correct PC, by pushing instructions into the fetch buffer with the CRC remainder and checking the CRC remainder at execution. This allows faults in the instruction fetch buffer and surrounding logic to also be detected, as these faults often cause instructions to issue at an incorrect PC.
  • a CRC remainder may be calculated for every 16 bits of data, because RISC-V instructions are of variable length but are a multiple of 16 bits.
  • Figure 1 shows an example of a Hi2120CS NB-IoT data processing device 100 comprising a HiMiDeerSV100 RISC-V execution core 101 and a memory 105 communicatively coupled to the execution core via a data path, which in this example is a bus.
  • the memory 105 is configured to store a respective one of a plurality of data units at a respective one of a plurality of locations.
  • the memory 105 is parity protected.
  • the memory is configured to receive, from the execution core 101, a request to fetch data from a location in the memory 105.
  • the request includes data specifying the location of the data in the memory 105.
  • the memory system is configured to retrieve a data unit from the location and form an error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit.
  • the memory system then transmits the retrieved data unit and the error check value to the execution core 101.
  • the bus conveys the retrieved data unit(s) and the error check value(s) from the memory system to the execution core.
  • the width of the bus is preferably at least n times the total number of bits in a data unit and an error check value to allow the data units and the error check values responsive to a single request to be passed simultaneously.
  • n is 2.
  • the length of each data unit may be two bytes. This may provide an efficient balance between transferred data content and error check data.
  • the AHB (Advanced High-performance Bus) signals are as follows.
  • a HWDATA_CRC is generated from HWDATA (write/store data) and HADDR (request address).
  • the CRC is checked at the memory to detect address/write data corruption.
  • HRDATA_CRC is generated at the memory from HRDATA (read data) and HADDR (address) and is checked in the core. For loads, the HRDATA_CRC check ensures that the address reaches the memory and is returned to the core correctly.
  • the SV100 core 101 makes a bus request (I-BUS and/or D-BUS).
  • HWDATA_CRC signals are added to the AHB, which in this example are CRC12 (polynomial 0x8f8) generated from HADDR and HWDATA and are invalid for loads.
  • the arbiter 102 selects a bus request, and passes it to block 103.
  • in block 103, which may act as an input interface for the memory system, the memory address and data are scrambled before accessing the memory.
  • Block 104 generates parity to write to the memory.
  • the memory 105 is then accessed.
  • HADDR, HWDATA and HRDATA (memory read data)
  • HWDATA_CRC is checked for stores.
  • the parity of the load data read from the memory is checked, and HRDATA_CRC for the loads is generated from HADDR and HRDATA (read data) to return to the core.
  • the response signals are registered, if necessary for timing.
  • the core receives the response and checks the CRCs against the read data, and the requested address.
  • a bus sideband signal relating to the system therefore loops back the memory address to the core.
  • Fault detection logic relating to the signal checks that the actual address accessed matches the requested address.
  • the sideband signal may be a CRC calculated across the instruction and the returned address.
  • the returned CRC may be checked using the returned data and an independently calculated version of the expected address, and may cause an alarm to fire on failure of the check.
  • CRC12 remainders are generated for each 16 bits of data.
  • Variable length encoding means that 16-bit quantities are issued (one or more at a time).
  • CRC12 remainders can be pushed into the instruction-fetch buffer, one for each 16 bits of instruction (much smaller than recording the looped back PC).
  • the CRC remainder is checked against the local execute PC and the issued instruction. This confirms that the instruction fetch unit (IFU) issued the instruction at the correct PC. This protects the instruction-fetch buffer against fault injection, for example missing a push due to a fault.
  • HRDATA_CRC is also checked in the load-store unit (LSU). Therefore, this check confirms that loads accessed the correct address and also means that the memory system can’t return a CRC for stores.
  • longer CRC lengths may give improved protection.
  • other CRC lengths such as CRC6 remainders, may be used.
  • each instruction is preferably issued in 16-bit elements.
  • Each instruction may have 1 to m 16-bit elements (1 to 3 in this embodiment). Therefore, it is desirable to be able to check the CRC across any pair of bytes. Preferably, the lowest numbered byte should be even.
  • the above bytes may be 12 bytes in the memory.
  • [1:0] could be an instruction
  • [5:4, 3:2] could be an instruction
  • [b:a, 9:8, 7:6] could be an instruction, etc. Therefore, when the instruction is issued, 1-3 pairs of bytes are issued, each pair with its CRC. The CRC can then be checked for each pair. Therefore, each pair may have its own CRC, and four bytes may be read from the memory so as to generate two CRCs to return to the core.
  • the memory system may be configured to, in response to the request: retrieve n data units from each of n contiguous locations including the location specified in the request; form n error check values each computed over both of (i) at least part of the data specifying the location and (ii) at least part of a respective one of the retrieved data units; and transmit the n retrieved data units and the n error check values to the execution core. This may allow the integrity of the n values to be verified in response to such a request.
  • the memory transmits the fetched data unit(s) and the error check value(s) responsive to a single request in parallel to the execution core.
  • the method described herein therefore provides a low cost, simple method for detecting address faults in the memory access, write data faults between the core and the memory, read data faults between the memory and core and fetch buffer faults.
  • the execution core may be configured to: form the request including data specifying the location; store, locally to the execution core, the data specifying the location; transmit the request to the memory; receive from the memory a data unit and an error check value; form a local error check value computed over both of (i) at least part of the data specifying the location as stored locally to the execution core and (ii) at least part of the data unit as received from the memory; check whether the local error check value matches the error check value received from the memory; and if the local error check value does not match the error check value received from the memory, raise an error.
  • the execution core may be configured to raise the error such that execution is halted.
  • the execution core may have an execution pipeline and the execution core may be configured to locally store the data specifying the location in the execution pipeline.
  • the execution core may be configured to interpret the data unit as received from the memory as an instruction, and perform the error check before completing the instruction. If the error check indicates a fault, the instruction may be prevented from completing, so as to prevent the faulty instruction from updating state in the CPU.
  • Figure 2 summarizes a method 200 in which the following steps are performed by the memory system.
  • the method comprises retrieving a data unit from the location.
  • an error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit is formed.
  • the retrieved data unit and the error check value are transmitted to the execution core.
  • the method described herein may therefore make it possible, in the presence of faults, to check that the CPU accessed the correct location in the memory when making memory requests, that write data reached the memory without corruption, and that read data was returned to the core from the memory without corruption. It may also enable protection of the long wires between the memory system and the core, which may have injected faults.
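As a rough illustration of the per-pair CRC scheme described in the bullets above, the following Python sketch reads 12 bytes as six 16-bit pairs, each carrying its own CRC12 remainder, and groups them into instructions of one, two and three pairs. Several details are assumptions rather than statements from the source: the 32-bit address width, little-endian byte order, a zero initial CRC value, computing each CRC over the pair's own address, and reading polynomial 0x8f8 in Koopman notation (full generator 0x11F1). The names `crc12`, `read_word` and `check_instruction` are hypothetical.

```python
POLY = 0x1F1  # assumption: low 12 bits of generator 0x11F1 (0x8f8 read in Koopman notation)


def crc12(value: int, nbits: int) -> int:
    """Remainder of value * x^12 divided by the generator (MSB first, zero initial value)."""
    rem = 0
    for i in range(nbits - 1, -1, -1):
        feedback = ((rem >> 11) ^ (value >> i)) & 1
        rem = (rem << 1) & 0xFFF
        if feedback:
            rem ^= POLY
    return rem


def read_word(memory: bytes, addr: int):
    """Read four bytes and return two (pair_address, pair, crc) tuples."""
    result = []
    for pair_addr in (addr, addr + 2):
        pair = int.from_bytes(memory[pair_addr:pair_addr + 2], "little")
        # Assumption: a 32-bit address is concatenated with the 16-bit pair (48 bits).
        result.append((pair_addr, pair, crc12((pair_addr << 16) | pair, 48)))
    return result


def check_instruction(parcels) -> bool:
    """Recompute the CRC of every 16-bit pair of one instruction."""
    return all(crc12((a << 16) | p, 48) == c for a, p, c in parcels)


memory = bytes(range(12))  # the 12 bytes [b:0]
parcels = [p for addr in range(0, 12, 4) for p in read_word(memory, addr)]

# Group the pairs into instructions of 1, 2 and 3 pairs:
# [1:0], [5:4, 3:2] and [b:a, 9:8, 7:6].
lengths = [1, 2, 3]
instructions, start = [], 0
for n in lengths:
    instructions.append(parcels[start:start + n])
    start += n
```

Because every pair carries its own remainder, an instruction of any length can be checked pair by pair when it is issued, without needing a CRC that spans a variable number of bytes.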

Abstract

Described herein is a data processing device (100) comprising an execution core (101) and a memory system (102-109) communicatively coupled to the execution core via a data path, the memory system being configured to receive from the execution core a request to fetch data from a location in the memory system, the request including data specifying the location, and in response to that request: retrieve a data unit from the location; form an error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit; and transmit the retrieved data unit and the error check value to the execution core. This can usefully reveal that a fault has occurred in accessing the memory system.

Description

FAULT RESISTANT MEMORY ACCESS
FIELD OF THE INVENTION
This invention relates to the detection of faults in data processing devices, for example when accessing the memory of a security central processing unit.
BACKGROUND
Faults can occur in data processing devices such as security central processing units (CPU). These faults can lead to undesirable behaviour of the device. For example, a fault in such a device can lead to an incorrect instruction being run, which can give an unexpected result.
Faults can be introduced accidentally or deliberately. An example of an accidental fault is one caused by a high energy particle or beam, such as a cosmic ray, that strikes a part of the processing device, causing it to malfunction. Variations in characteristics or performance of a component device of a processing device, for example due to small feature size, can introduce faults into the processing device.
Examples of deliberate fault injection are those caused by laser probing (such as heating up at least a portion of the processing device using a laser) and introducing glitches in the power supply to the processing device. Deliberate fault injection can be used to try to get unexpected behaviour in the processing device. If a processing device runs incorrect instructions, an attacker can use this to try and compromise the processing device. This can help an attacker gain control of the processing device and/or reveal details of the operation of the processing device.
A security CPU is implemented within a secure subsystem, where it has a local memory. The local memory is accessible from outside the subsystem, but the access from other components is very restricted. The security CPU authenticates code and copies it to the local memory before it executes it. If it is possible to tamper with the local memory, then the integrity of the device is broken. One method of breaking the integrity is to cause the CPU to fetch from the wrong memory location. This could, for example, cause the CPU to fail to execute a branch which checks whether security criteria have been met. The CPU then fails to detect a security problem and continues executing.
Injected faults can also corrupt the memory contents, corrupt memory write data between the core and memory, corrupt read data between the memory and the core and corrupt the memory address used to access the memory.
It is desirable to provide secure devices which can prevent or reduce the number of instances of an attacker gaining access to the secret information.
One method of fault protection is to use parity protection on the data busses. However, this may only offer weak protection and does not check whether the memory address has been corrupted.
In US2017/0185535 A1, the contents of the memory are protected, but the method does not provide for detection of corruption on the bus network.
US 7603609 B2 describes a method and system for optimized instruction fetch to protect against soft and hard errors. This method replaces faulty read data with a known value.
In the presence of faults, it is desirable to be able to check that the CPU accessed the correct location in the memory when making memory requests and to check that write data reached the memory without corruption and read data was returned from the memory without corruption. It is further desirable to protect the long wires between the memory system and the core which may have injected faults.
SUMMARY
According to one aspect there is provided a data processing device comprising an execution core and a memory system communicatively coupled to the execution core via a data path, the memory system being configured to receive from the execution core a request to fetch data from a location in the memory system, the request including data specifying the location, and in response to that request: retrieve a data unit from the location; form an error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit; and transmit the retrieved data unit and the error check value to the execution core. This can usefully reveal that a fault has occurred in accessing the memory system and may provide a low cost, simple method for detecting address faults in the memory access, write data faults between the core and the memory, read data faults between the memory and core and fetch buffer faults.
Only every n-th location in the memory may be addressable by a request from the execution core, and the memory system may be configured to, in response to the request: retrieve n data units from each of n contiguous locations including the location specified in the request; form n error check values each computed over both of (i) at least part of the data specifying the location and (ii) at least part of a respective one of the retrieved data units; and transmit the n retrieved data units and the n error check values to the execution core. This may allow the integrity of the n values to be verified in response to such a request. n may be 2. This may provide an efficient balance between transferred data content and error check data.
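The n = 2 case can be sketched in Python as follows. This is only an illustration under stated assumptions: two-byte data units, a 32-bit address, little-endian byte order, a CRC12 error check with polynomial 0x8f8 read in Koopman notation, and each check value computed over the address of its own unit (the text only requires at least part of the data specifying the location). `fetch_group` and `crc12` are hypothetical names.

```python
POLY = 0x1F1  # assumption: low 12 bits of generator 0x11F1 (0x8f8 in Koopman notation)
UNIT_BYTES = 2  # length of one data unit
N = 2           # units returned per request


def crc12(value: int, nbits: int) -> int:
    """Remainder of value * x^12 divided by the generator (MSB first, zero initial value)."""
    rem = 0
    for i in range(nbits - 1, -1, -1):
        feedback = ((rem >> 11) ^ (value >> i)) & 1
        rem = (rem << 1) & 0xFFF
        if feedback:
            rem ^= POLY
    return rem


def fetch_group(memory: bytes, addr: int):
    """Return N (data unit, error check value) pairs for one request."""
    # Only every N-th unit location is addressable by a request.
    if addr % (N * UNIT_BYTES) != 0:
        raise ValueError("request address is not addressable")
    response = []
    for i in range(N):
        unit_addr = addr + i * UNIT_BYTES
        unit = int.from_bytes(memory[unit_addr:unit_addr + UNIT_BYTES], "little")
        response.append((unit, crc12((unit_addr << 16) | unit, 48)))
    return response


memory = bytes(range(16))
group = fetch_group(memory, 4)  # units at byte addresses 4 and 6
```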
The length of each data unit may be two bytes. This may provide an efficient balance between transferred data content and error check data.
The data processing device may comprise a bus for conveying the or each retrieved data unit and the or each error check value from the memory system to the execution core. The width of the bus may be at least n times the total number of bits in a data unit and an error check value. This may allow the data units and the error check values responsive to a single request to be passed simultaneously.
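As a worked example of the width requirement, the Python sketch below takes n = 2, two-byte data units and a 12-bit error check value (the CRC12 used in the example implementation), giving a minimum width of 2 × (16 + 12) = 56 bits. The lane layout, with the check value placed above its data unit, is an assumption made for illustration, and `pack` and `unpack` are hypothetical helpers.

```python
UNIT_BITS = 16   # two-byte data unit
CHECK_BITS = 12  # assumption: CRC12 error check value
N = 2

LANE_BITS = UNIT_BITS + CHECK_BITS  # 28 bits per (unit, check) lane
BUS_BITS = N * LANE_BITS            # 56 bits carry a whole response at once


def pack(lanes) -> int:
    """Place N (unit, check) pairs side by side in one bus word."""
    word = 0
    for i, (unit, check) in enumerate(lanes):
        lane = (check << UNIT_BITS) | unit
        word |= lane << (i * LANE_BITS)
    return word


def unpack(word: int):
    """Recover the N (unit, check) pairs from a bus word."""
    lanes = []
    for i in range(N):
        lane = (word >> (i * LANE_BITS)) & ((1 << LANE_BITS) - 1)
        lanes.append((lane & ((1 << UNIT_BITS) - 1), lane >> UNIT_BITS))
    return lanes


lanes = [(0x0504, 0xABC), (0x0706, 0x123)]  # arbitrary illustrative values
word = pack(lanes)
```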
The memory system may be configured to generate the or each error check value using a cyclic redundancy check computed over the inputs thereto. This may be an efficient manner of computing the error check value.
The error check value may be a remainder of the cyclic redundancy check. This may efficiently compress the error check information.
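To make the idea of a CRC remainder concrete, the Python sketch below divides the concatenated address and data unit by a generator polynomial using XOR long division and keeps the 12-bit remainder. The generator 0x11F1 is an assumption: it is the CRC12 polynomial 0x8f8 from the example implementation read in Koopman notation. The 32-bit address width is also assumed, and `poly_mod` is a hypothetical helper name.

```python
GENERATOR = 0x11F1  # assumption: (0x8F8 << 1) | 1, i.e. 0x8f8 in Koopman notation
WIDTH = 12          # number of remainder bits


def poly_mod(value: int) -> int:
    """Remainder of dividing value by GENERATOR over GF(2) (XOR long division)."""
    while value.bit_length() > WIDTH:
        shift = value.bit_length() - (WIDTH + 1)
        value ^= GENERATOR << shift  # cancel the current leading bit
    return value


addr, unit = 0x100, 0xBEEF
message = (addr << 16) | unit         # assumption: 32-bit address, 16-bit data unit
rem = poly_mod(message << WIDTH)      # the 12-bit error check value
codeword = (message << WIDTH) | rem   # message followed by its remainder
```

Dividing `codeword` by the generator leaves a zero remainder, and flipping any single bit of it leaves a non-zero one, which is what the receiving side relies on when it recomputes the check.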
The memory system may be configured to transmit the or each data unit and the or each error check value responsive to a single request in parallel to the execution core. This may be an efficient way of transporting data between the components of the data processing device. The memory system may comprise a storage block and an output interface, wherein the storage block stores data at the location and the output interface is configured to form the error check value. This may allow the memory system to store the data internally.
The memory system may comprise an input interface configured to receive the request and scramble the data specifying the location to determine the location specified by the data. This may conveniently allow the memory address and data to be scrambled before accessing the memory.
The execution core may be configured to: form the request including data specifying the location; store, locally to the execution core, the data specifying the location; transmit the request to the memory system; receive from the memory system a data unit and an error check value; form a local error check value computed over both of (i) at least part of the data specifying the location as stored locally to the execution core and (ii) at least part of the data unit as received from the memory system; check whether the local error check value matches the error check value received from the memory system; and if the local error check value does not match the error check value received from the memory system, raise an error.
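A minimal Python sketch of this core-side check is given below. It assumes a CRC12 error check (polynomial 0x8f8 read in Koopman notation), a 32-bit address and 16-bit data units; the `address_fault` parameter is a hypothetical way of modelling a fault that corrupts the address on its way to the memory, and the function names are illustrative only.

```python
POLY = 0x1F1  # assumption: low 12 bits of generator 0x11F1 (0x8f8 in Koopman notation)


class FaultDetected(Exception):
    """Raised when the local and received error check values differ."""


def crc12(value: int, nbits: int) -> int:
    """Remainder of value * x^12 divided by the generator (MSB first, zero initial value)."""
    rem = 0
    for i in range(nbits - 1, -1, -1):
        feedback = ((rem >> 11) ^ (value >> i)) & 1
        rem = (rem << 1) & 0xFFF
        if feedback:
            rem ^= POLY
    return rem


def memory_fetch(storage: dict, addr_seen: int):
    """Memory side: the check value covers the address the memory actually saw."""
    unit = storage[addr_seen]
    return unit, crc12((addr_seen << 16) | unit, 48)


def core_load(storage: dict, addr: int, address_fault: int = 0) -> int:
    """Core side: recompute the check from the locally stored address."""
    requested = addr  # stored locally to the core; not returned on the bus
    unit, received = memory_fetch(storage, addr ^ address_fault)
    local = crc12((requested << 16) | unit, 48)
    if local != received:
        raise FaultDetected(f"check mismatch for address {requested:#x}")
    return unit


# Both locations hold the same value, so a check over the data alone
# could not tell them apart.
storage = {0x100: 0xBEEF, 0x104: 0xBEEF}
```

Calling `core_load(storage, 0x100, address_fault=0x4)` raises `FaultDetected` even though the returned data is identical, because the memory computed its check value over address 0x104.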
The execution core may be configured to raise the error such that execution is halted. This may prevent a fault from corrupting the system or prevent an attacker from being able to influence or control the operation of a processing module in fetching instructions from memory without such influence or control being detected. Once a fault has been detected, steps can be taken to reassert authorised control and limit the effect of the attack or other fault.
The execution core may have an execution pipeline and the execution core may be configured to locally store the data specifying the location in the execution pipeline.
The execution core may be configured to: interpret the data unit as received from the memory system as an instruction; execute that instruction; and perform the said check before completing the instruction. This may prevent a faulty instruction from updating state in the CPU as a result of an accidental or deliberate fault.
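The ordering of the check relative to instruction completion can be sketched as below. This is a toy model, not the pipeline of the described core: the "instruction" simply adds its low byte to register `x1`, which is a hypothetical semantic chosen for illustration, and the CRC12 parameters (polynomial 0x8f8 in Koopman notation, 32-bit address) are assumptions.

```python
POLY = 0x1F1  # assumption: low 12 bits of generator 0x11F1 (0x8f8 in Koopman notation)


class FaultDetected(Exception):
    """Raised when the error check fails before an instruction completes."""


def crc12(value: int, nbits: int) -> int:
    """Remainder of value * x^12 divided by the generator (MSB first, zero initial value)."""
    rem = 0
    for i in range(nbits - 1, -1, -1):
        feedback = ((rem >> 11) ^ (value >> i)) & 1
        rem = (rem << 1) & 0xFFF
        if feedback:
            rem ^= POLY
    return rem


def execute_and_complete(regs: dict, pc: int, unit: int, check: int) -> None:
    """Execute a toy instruction, but only update state if the check passes."""
    result = regs["x1"] + (unit & 0xFF)  # execute: result is held, not yet written
    if crc12((pc << 16) | unit, 48) != check:
        raise FaultDetected(f"instruction at {pc:#x} failed its check")
    regs["x1"] = result                  # complete: architectural state is updated


regs = {"x1": 10}
good_check = crc12((0x200 << 16) | 0x0005, 48)
execute_and_complete(regs, 0x200, 0x0005, good_check)  # x1 becomes 15
```

If the same unit and check value arrive while the core expects PC 0x202, the exception is raised before `regs` is written, so the faulty instruction leaves no trace in the register state.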
The memory system may be parity protected. This may provide additional fault protection.
BRIEF DESCRIPTION OF THE FIGURES
The present invention will now be described by way of example with reference to the accompanying drawings.
In the drawings:
Figure 1 shows an example of a data processing device comprising an execution core and a memory.
Figure 2 shows an exemplary flowchart detailing the method steps performed by the memory.
DETAILED DESCRIPTION
It is desirable to provide secure devices which can prevent or reduce the number of instances of an attacker gaining access to secret information. In general, it is desirable to prevent an attacker from being able to influence or control the operation of a processing module in fetching instructions from memory without such influence or control being detected. Once it is detected, steps can be taken to reassert authorised control and limit the effect of the attack or other fault.
The present disclosure relates to a method for ensuring that the processing module accesses the correct location in the memory, write data reaches the memory without corruption and read data is returned from the memory without corruption.
As will be described in more detail below, additional signals are added to the memory bus interface to protect the data and the memory address. The write data is protected on the memory request path. The read data is protected on the memory response path. The accessed address is looped back to the core processor. The core processor then checks that the accessed address matches the expected address.
In general, the memory receives a request from the core to fetch data from a location in the memory. The request includes data specifying the location of the requested data in the memory. In response to that request, the memory is configured to retrieve a data unit from the location. An error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit is formed. The retrieved data unit and the error check value are then transmitted to the execution core.
The error check is preferably a cyclic redundancy check (CRC). However, other suitable error checks may also be used.
In a preferred implementation, CRC protection is added to the read data and the write data. The error check value may be a CRC remainder. The address can be included in the CRC calculation so that the CRC check only passes if the address was correctly received. The number of bit errors that a CRC check can detect depends on the polynomial used for the CRC check.
CRC remainders are generated from not only the data but also from the requested address. This means that when the data is received (write data at the memory, read data at the core) then the CRC check includes whether the address was correctly received or not. For the write data check at the memory, this is simple because the address is sent as part of the request.
For the read data, this means that the address is effectively looped back to the core: the core records the originally requested address and uses that to check the CRC. The address loopback is achieved without directly adding the address to the bus response.
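The address loopback can be illustrated with a minimal sketch. It is not taken from the source: it assumes a 32-bit address, a 16-bit data unit, a zero initial value with no final XOR, and that the CRC12 polynomial 0x8f8 mentioned for the Figure 1 example is written with an implicit +1 term (full divisor 0x11F1). The names `crc12`, `check_value`, `memory_respond` and `core_accepts` are hypothetical.

```python
# Illustrative sketch only; widths and polynomial notation are assumptions.
POLY = 0x11F1      # assumed full 13-bit divisor derived from 0x8f8
ADDR_BITS = 32     # assumed address width
DATA_BITS = 16     # assumed data unit width


def crc12(message: int, nbits: int) -> int:
    """Remainder of message * x^12 divided by POLY over GF(2)."""
    reg = message << 12
    for bit in reversed(range(nbits)):
        if reg & (1 << (bit + 12)):
            reg ^= POLY << bit
    return reg & 0xFFF


def check_value(addr: int, data: int) -> int:
    """Error check value computed over both the address and the data unit."""
    return crc12((addr << DATA_BITS) | data, ADDR_BITS + DATA_BITS)


def memory_respond(accessed_addr: int, data: int) -> tuple:
    """Memory side: return the data and the remainder, but not the address."""
    return data, check_value(accessed_addr, data)


def core_accepts(requested_addr: int, response: tuple) -> bool:
    """Core side: recompute the remainder using the address it recorded."""
    data, remainder = response
    return check_value(requested_addr, data) == remainder


# Correct access: the check passes.
print(core_accepts(0x2000_0104, memory_respond(0x2000_0104, 0xBEEF)))
# A fault makes the memory access a neighbouring address: the check fails
# even though the returned data happens to be identical.
print(core_accepts(0x2000_0104, memory_respond(0x2000_0100, 0xBEEF)))
```

Because the remainder depends on the address, the core learns whether the accessed address matched the requested one without the address itself being carried on the response path.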
For load data, the load-store unit (LSU) checks the CRC to confirm that data was read from the correct address. For instructions, as well as checking that the fetch was received from the correct address from the bus, the CRC can also be used to show that instructions issue at the correct PC, by pushing instructions into the fetch buffer with the CRC remainder and checking the CRC remainder at execution. This allows faults in the instruction fetch buffer and surrounding logic to also be detected, as these faults often cause instructions to issue at an incorrect PC.
In a preferred embodiment where the execution core is a RISC-V core, a CRC remainder may be calculated for every 16 bits of data, because RISC-V instructions are of variable length but are a multiple of 16 bits.
Figure 1 shows an example of a Hi2120CS NB-IoT data processing device 100 comprising a HiMiDeerSV100 RISC-V execution core 101 and a memory 105 communicatively coupled to the execution core via a data path, which in this example is a bus. The memory 105 is configured to store a respective one of a plurality of data units at a respective one of a plurality of locations. In this example, the memory 105 is parity protected. As will be described in more detail below, the memory is configured to receive, from the execution core 101, a request to fetch data from a location in the memory 105. The request includes data specifying the location of the data in the memory 105.
In response to that request, the memory system is configured to retrieve a data unit from the location and form an error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit. The memory system then transmits the retrieved data unit and the error check value to the execution core 101.
The bus conveys the retrieved data unit(s) and the error check value(s) from the memory system to the execution core. The width of the bus is preferably at least n times the total number of bits in a data unit and an error check value to allow the data units and the error check values responsive to a single request to be passed simultaneously. Preferably, n is 2. The length of each data unit may be two bytes. This may provide an efficient balance between transferred data content and error check data.
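As a worked example of the width relationship, the following sketch combines the preferred n = 2 and two-byte data units with the 12-bit CRC remainder used in the Figure 1 example. The combination is chosen here for illustration only, the function name `min_bus_width` is hypothetical, and only the data and error check lanes of the response path are counted.

```python
def min_bus_width(n: int, data_unit_bits: int, check_bits: int) -> int:
    """Minimum width needed to pass n data units and n check values at once."""
    return n * (data_unit_bits + check_bits)


# n = 2, 16-bit data units, 12-bit CRC remainders (illustrative combination)
print(min_bus_width(n=2, data_unit_bits=16, check_bits=12))  # 56
```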
In the example shown in Figure 1, the AHB (Advanced High-performance Bus) signals are as follows. A HWDATA_CRC is generated from HWDATA (write/store data) and HADDR (request address). The CRC is checked at the memory to detect address/write data corruption. HRDATA_CRC is generated at the memory from HRDATA (read data) and HADDR (address) and is checked in the core. For loads, the HRDATA_CRC check ensures that the address reaches the memory and is returned to the core correctly.
As shown in Figure 1, in a first step, the SV100 core 101 makes a bus request (I-BUS and/or D-BUS). HWDATA_CRC signals are added to the AHB, which in this example are CRC12 (polynomial 0x8f8) generated from HADDR and HWDATA and are invalid for loads. The arbiter 102 selects a bus request and passes it to block 103. At block 103, which may act as an input interface for the memory system, the memory address and data are scrambled before accessing the memory. Block 104 generates parity to write to the memory. The memory 105 is then accessed. At block 106, HADDR, HWDATA and HRDATA (memory read data) are descrambled. At block 107, HWDATA_CRC is checked for stores. At block 108, which may act as an output interface for the memory system, the load data parity from the memory is checked and HRDATA_CRC for loads is generated from HADDR and HRDATA (read data) to return to the core. At block 109, the response signals are registered, if necessary for timing. Back at the core 101, the core receives the response and checks the CRCs against the read data and the requested address. A bus sideband signal relating to the system therefore loops back the memory address to the core. Fault detection logic relating to the signal checks that the actual address accessed matches the requested address. As described above, the sideband signal may be a CRC calculated across the instruction and the returned address. The returned CRC may be checked using the returned data and an independently calculated version of the expected address, and may cause an alarm to fire on failure of the check.
In this example, CRC12 remainders are generated for each 16 bits of data. Variable length encoding means that 16-bit quantities are issued (one or more at a time). CRC12 remainders can be pushed into the instruction-fetch buffer, one for each 16 bits of instruction (much smaller than recording the looped back PC). In the execute stage, the CRC remainder is checked against the local execute PC and the issued instruction. This confirms that the instruction fetch unit (IFU) issued the instruction at the correct PC. This protects the instruction-fetch buffer against fault injection, for example missing a push due to a fault. HRDATA_CRC is also checked in the load-store unit (LSU). Therefore, this check confirms that loads accessed the correct address and also means that the memory system can’t return a CRC for stores.
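The store path of the Figure 1 example (HWDATA_CRC generated at the core from HADDR and HWDATA, then checked at block 107) might be sketched as follows. This is an illustration under assumptions not stated in the source: a 32-bit HADDR, a single 16-bit write data unit, a zero initial value, and 0x8f8 read as a polynomial with an implicit +1 term (full divisor 0x11F1). Scrambling, parity and arbitration are omitted, and the names `StoreRequest`, `core_make_store` and `memory_check_store` are hypothetical.

```python
from dataclasses import dataclass, replace

POLY = 0x11F1  # assumed full 13-bit divisor derived from 0x8f8


def crc12(message: int, nbits: int) -> int:
    """Remainder of message * x^12 divided by POLY over GF(2)."""
    reg = message << 12
    for bit in reversed(range(nbits)):
        if reg & (1 << (bit + 12)):
            reg ^= POLY << bit
    return reg & 0xFFF


@dataclass(frozen=True)
class StoreRequest:
    haddr: int        # request address (assumed 32 bits)
    hwdata: int       # write data (assumed one 16-bit unit)
    hwdata_crc: int   # sideband CRC12 over HADDR and HWDATA


def core_make_store(haddr: int, hwdata: int) -> StoreRequest:
    """Core side: attach a CRC computed over the address and the write data."""
    return StoreRequest(haddr, hwdata, crc12((haddr << 16) | hwdata, 48))


def memory_check_store(req: StoreRequest) -> bool:
    """Memory side: recompute the CRC from what actually arrived."""
    return crc12((req.haddr << 16) | req.hwdata, 48) == req.hwdata_crc


request = core_make_store(0x0000_1000, 0x1234)
print(memory_check_store(request))                               # intact request
print(memory_check_store(replace(request, haddr=0x0000_1004)))   # address bit flipped
print(memory_check_store(replace(request, hwdata=0x1235)))       # data bit flipped
```

A single flipped bit on either the address or the write data lines changes the recomputed remainder, so the store can be rejected before the memory is updated.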
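The fetch-buffer protection can be sketched as below. This is a simplified model rather than the described hardware: it assumes every instruction is a single 16-bit parcel, that each remainder covers the byte address (PC) of its own parcel, a 32-bit PC, and 0x8f8 read with an implicit +1 term (full divisor 0x11F1). The remainder computed at push time stands in for the HRDATA_CRC returned by the memory, and the function names are hypothetical.

```python
from collections import deque

POLY = 0x11F1  # assumed full 13-bit divisor derived from 0x8f8


def crc12(message: int, nbits: int) -> int:
    """Remainder of message * x^12 divided by POLY over GF(2)."""
    reg = message << 12
    for bit in reversed(range(nbits)):
        if reg & (1 << (bit + 12)):
            reg ^= POLY << bit
    return reg & 0xFFF


def check_value(pc: int, parcel: int) -> int:
    """CRC12 over an assumed 32-bit PC and one 16-bit instruction parcel."""
    return crc12((pc << 16) | parcel, 48)


def run(fetched, drop_push_at=None):
    """Push (parcel, remainder) pairs, then check each one at execute.

    Returns the execute PC at which the check fails, or None if all pass.
    """
    fetch_buffer = deque()
    for pc, parcel in fetched:
        if pc == drop_push_at:
            continue  # injected fault: this push never happens
        fetch_buffer.append((parcel, check_value(pc, parcel)))

    execute_pc = fetched[0][0]
    while fetch_buffer:
        parcel, remainder = fetch_buffer.popleft()
        if check_value(execute_pc, parcel) != remainder:
            return execute_pc  # instruction issued at the wrong PC
        execute_pc += 2        # one 16-bit parcel per instruction (assumed)
    return None


# Three identical parcels, so the data alone cannot reveal a skipped entry.
parcels = [(0x100, 0x0001), (0x102, 0x0001), (0x104, 0x0001)]
print(run(parcels))                      # no fault
print(run(parcels, drop_push_at=0x102))  # missing push
```

Storing only the 12-bit remainder alongside each parcel is what makes this cheaper than recording the full looped-back PC in the buffer.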
Longer CRC lengths may give improved protection. However, other CRC lengths, such as CRC6 remainders, may be used.
In one embodiment, four bytes are read from the memory at once and then the CRCs are calculated across each pair of bytes. For example, bytes 0-3 are read and a CRC is formed from 0-1 and 2-3. Instructions are preferably issued in 16-bit elements. Each instruction may have 1 to m 16-bit elements (1 to 3 in this embodiment). Therefore, it is desirable to be able to check the CRC across any pair of bytes. Preferably, the lowest numbered byte should be even.
For example:
Addr 0: 3 2 1 0
Addr 1: 7 6 5 4
Addr 2: b a 9 8
The above bytes may be 12 bytes in the memory. For example, [1:0] could be an instruction, [5:4, 3:2] could be an instruction, [b:a, 9:8, 7:6] could be an instruction, etc. Therefore, when the instruction is issued, one to three pairs of bytes are issued, each pair with its CRC. The CRC can then be checked for each pair. Therefore, each pair may have its own CRC, and four bytes may be read from the memory so as to generate two CRCs to return to the core.
In a preferred implementation, only every n-th location in the memory is addressable by a request from the execution core. In this case, the memory system may be configured to, in response to the request: retrieve n data units from each of n contiguous locations including the location specified in the request; form n error check values each computed over both of (i) at least part of the data specifying the location and (ii) at least part of a respective one of the retrieved data units; and transmit the n retrieved data units and the n error check values to the execution core. This may allow the integrity of the n values to be verified in response to such a request.
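The n = 2 case (four bytes read, one CRC per pair of bytes) might look like the sketch below. The source does not say exactly which address bits feed each CRC; the sketch assumes that each pair's CRC covers the byte address of that pair, and it also assumes little-endian byte order, a 32-bit address, and 0x8f8 read with an implicit +1 term (full divisor 0x11F1). The function name `fetch` is hypothetical.

```python
POLY = 0x11F1    # assumed full 13-bit divisor derived from 0x8f8
N = 2            # data units returned per request
UNIT_BYTES = 2   # each data unit is two bytes


def crc12(message: int, nbits: int) -> int:
    """Remainder of message * x^12 divided by POLY over GF(2)."""
    reg = message << 12
    for bit in reversed(range(nbits)):
        if reg & (1 << (bit + 12)):
            reg ^= POLY << bit
    return reg & 0xFFF


def check_value(addr: int, unit: int) -> int:
    """CRC12 over an assumed 32-bit address and one 16-bit data unit."""
    return crc12((addr << 16) | unit, 48)


def fetch(memory: bytes, addr: int) -> list:
    """Return N (data unit, error check value) pairs for one request."""
    if addr % (N * UNIT_BYTES) != 0:
        raise ValueError("only every n-th location is addressable")
    response = []
    for i in range(N):
        unit_addr = addr + i * UNIT_BYTES   # assumed: per-pair byte address
        unit = int.from_bytes(memory[unit_addr:unit_addr + UNIT_BYTES], "little")
        response.append((unit, check_value(unit_addr, unit)))
    return response


memory = bytes(range(12))   # bytes 0x0 to 0xb, as in the layout above
for unit, remainder in fetch(memory, 4):
    print(hex(unit), hex(remainder))
```

Giving each pair its own remainder lets the core check any 16-bit element independently, which suits instructions made up of one to three such elements.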
For efficiency, preferably, the memory transmits the fetched data unit(s) and the error check value(s) responsive to a single request in parallel to the execution core.
The method described herein therefore provides a low cost, simple method for detecting address faults in the memory access, write data faults between the core and the memory, read data faults between the memory and core and fetch buffer faults.
Where a fault occurs, it may be desirable to try to correct that fault and to carry on with accessing the memory. An alternative is to stop the process on fault detection. Where a fault is deliberately introduced, processing can be stopped, the fault reviewed and the processing core or module reset to clear the fault.
The execution core may be configured to: form the request including data specifying the location; store, locally to the execution core, the data specifying the location; transmit the request to the memory; receive from the memory a data unit and an error check value; form a local error check value computed over both of (i) at least part of the data specifying the location as stored locally to the execution core and (ii) at least part of the data unit as received from the memory; check whether the local error check value matches the error check value received from the memory; and if the local error check value does not match the error check value received from the memory, raise an error. For example, in response to detecting an error, the execution core may be configured to raise the error such that execution is halted. In one implementation, the execution core may have an execution pipeline and the execution core may be configured to locally store the data specifying the location in the execution pipeline.
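The core-side sequence can be sketched as follows. The sketch makes assumptions beyond the source: a 32-bit address, a 16-bit data unit, 0x8f8 read with an implicit +1 term (full divisor 0x11F1), and a memory system modelled as a plain callable. `MemoryFaultError`, `ExecutionCore`, `make_memory_system` and the `addr_fault_mask` parameter (which simulates an injected address fault) are hypothetical names.

```python
POLY = 0x11F1  # assumed full 13-bit divisor derived from 0x8f8


def crc12(message: int, nbits: int) -> int:
    """Remainder of message * x^12 divided by POLY over GF(2)."""
    reg = message << 12
    for bit in reversed(range(nbits)):
        if reg & (1 << (bit + 12)):
            reg ^= POLY << bit
    return reg & 0xFFF


def check_value(addr: int, data: int) -> int:
    """CRC12 over an assumed 32-bit address and one 16-bit data unit."""
    return crc12((addr << 16) | data, 48)


class MemoryFaultError(Exception):
    """Raised when the local and received error check values differ."""


def make_memory_system(contents: dict, addr_fault_mask: int = 0):
    """Model a memory system; a non-zero mask corrupts the accessed address."""
    def respond(addr: int):
        accessed = addr ^ addr_fault_mask
        data = contents[accessed]
        return data, check_value(accessed, data)
    return respond


class ExecutionCore:
    def __init__(self, memory_system):
        self._memory_system = memory_system

    def load(self, addr: int) -> int:
        stored_addr = addr                           # kept locally to the core
        data, received = self._memory_system(addr)   # transmit and receive
        local = check_value(stored_addr, data)       # local error check value
        if local != received:
            raise MemoryFaultError(f"check failed for address {addr:#x}")
        return data


# Both locations hold the same value, so only the CRC can expose the fault.
contents = {0x100: 0xABCD, 0x102: 0xABCD}
print(hex(ExecutionCore(make_memory_system(contents)).load(0x100)))
try:
    ExecutionCore(make_memory_system(contents, addr_fault_mask=0x2)).load(0x100)
except MemoryFaultError as err:
    print("fault detected:", err)
```

Raising an exception here plays the role of raising the error; in a real core this could halt execution or trigger an alarm.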
In one implementation, the execution core may be configured to interpret the data unit as received from the memory as an instruction, and perform the error check before completing the instruction. If the error check indicates a fault, the instruction may be prevented from completing, so as to prevent the faulty instruction from updating state in the CPU.
Figure 2 summarizes a method 200 in which the following steps are performed by the memory system. At step 201, the method comprises retrieving a data unit from the location. At step 202, an error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit is formed. At step 203, the retrieved data unit and the error check value are transmitted to the execution core.
The techniques described herein may be useful in various security CPUs, for example in HiMiDeerSVxxx security CPUs which form part of the Huawei product line.
The method described herein may therefore allow it to be checked, in the presence of faults, that the CPU accessed the correct location in the memory when making memory requests (i.e. that the address reached the memory without corruption), that write data reached the memory without corruption, and that read data was returned to the core from the memory without corruption. It may also enable protection of the long wires between the memory system and the core, which may be subject to injected faults.
The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such individual feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.

Claims

1. A data processing device (100) comprising an execution core (101) and a memory system (102-109) communicatively coupled to the execution core via a data path, the memory system being configured to receive from the execution core a request to fetch data from a location in the memory system, the request including data specifying the location, and in response to that request: retrieve a data unit from the location; form an error check value computed over both of (i) at least part of the data specifying the location and (ii) at least part of the retrieved data unit; and transmit the retrieved data unit and the error check value to the execution core.
2. A data processing device as claimed in claim 1, wherein only every n-th location in the memory system is addressable by a request from the execution core, and the memory system is configured to, in response to the request: retrieve n data units from each of n contiguous locations including the location specified in the request; form n error check values each computed over both of (i) at least part of the data specifying the location and (ii) at least part of a respective one of the retrieved data units; and transmit the n retrieved data units and the n error check values to the execution core.
3. A data processing device as claimed in claim 2, wherein n is 2.
4. A data processing device as claimed in claim 2 or 3, wherein the length of each data unit is two bytes.
5. A data processing device as claimed in any of claims 2 to 4, comprising a bus for conveying the or each retrieved data unit and the or each error check value from the memory system to the execution core, the width of the bus being at least n times the total number of bits in a data unit and an error check value.
6. A data processing device as claimed in any preceding claim, wherein the memory system is configured to generate the or each error check value using a cyclic redundancy check computed over the inputs thereto.
7. A data processing device as claimed in claim 6, wherein the error check value is a remainder of the cyclic redundancy check.
8. A data processing device as claimed in any preceding claim, wherein the memory system is configured to transmit the or each data unit and the or each error check value responsive to a single request in parallel to the execution core.
9. A data processing device as claimed in any preceding claim, wherein the memory system comprises a storage block (105) and an output interface (108), the storage block (105) stores data at the location and the output interface (108) is configured to form the error check value.
10. A data processing device as claimed in claim 9, wherein the memory system (105) comprises an input interface (103) configured to receive the request and scramble the data specifying the location to determine the location specified by the data.
11. A data processing device as claimed in any preceding claim, wherein the execution core is configured to: form the request including data specifying the location; store, locally to the execution core, the data specifying the location; transmit the request to the memory system; receive from the memory system a data unit and an error check value; form a local error check value computed over both of (i) at least part of the data specifying the location as stored locally to the execution core and (ii) at least part of the data unit as received from the memory system; check whether the local error check value matches the error check value received from the memory system; and if the local error check value does not match the error check value received from the memory system, raise an error.
12. A data processing device as claimed in claim 11, wherein the execution core is configured to raise the error such that execution is halted.
13. A data processing device as claimed in claim 11 or 12, wherein the execution core has an execution pipeline and the execution core is configured to locally store the data specifying the location in the execution pipeline.
14. A data processing device as claimed in any of claims 11 to 13, wherein the execution core is configured to: interpret the data unit as received from the memory system as an instruction; execute that instruction; and perform the said check before completing the instruction.
EP20774955.7A 2020-09-17 2020-09-17 Fault resistant memory access Pending EP4205010A1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2020/075924 WO2022058010A1 (en) 2020-09-17 2020-09-17 Fault resistant memory access

Publications (1)

Publication Number Publication Date
EP4205010A1 (en) 2023-07-05

Family

ID=72560593

Family Applications (1)

Application Number Title Priority Date Filing Date
EP20774955.7A Pending EP4205010A1 (en) 2020-09-17 2020-09-17 Fault resistant memory access

Country Status (2)

Country Link
EP (1) EP4205010A1 (en)
WO (1) WO2022058010A1 (en)

Also Published As

Publication number Publication date
WO2022058010A1 (en) 2022-03-24

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20230327

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)