WO2005124506A2 - Cryptographic architecture with instruction masking and other techniques for thwarting differential power analysis - Google Patents

Cryptographic architecture with instruction masking and other techniques for thwarting differential power analysis

Publication number
WO2005124506A2
Authority
WO
WIPO (PCT)
Prior art keywords
instructions
processor
bus
random number
register
Application number
PCT/US2005/020093
Other languages
French (fr)
Other versions
WO2005124506A3 (en)
Inventor
David B. Shu
Lap-Wai Chow
William M. Clark, Jr.
Original Assignee
HRL Laboratories, LLC
Application filed by HRL Laboratories, LLC
Priority to US11/628,920 (patent US8095993B2)
Priority to GB0623489A (patent GB2430515B)
Priority to JP2007527677A (patent JP2008502283A)
Publication of WO2005124506A2
Publication of WO2005124506A3
Priority to US13/296,740 (patent US20120144205A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/002 Countermeasures against attacks on cryptographic mechanisms
    • H04L9/003 Countermeasures against attacks on cryptographic mechanisms for power analysis, e.g. differential power analysis [DPA] or simple power analysis [SPA]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30101 Special purpose registers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
    • G06F21/72 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information in cryptographic circuits
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
    • G06F21/75 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information by inhibiting the analysis of circuitry or operation
    • G06F21/755 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information by inhibiting the analysis of circuitry or operation with measures against power attack
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/82 Protecting input, output or interconnection devices
    • G06F21/85 Protecting input, output or interconnection devices interconnection devices, e.g. bus-connected or in-line devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32 Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/321 Program or instruction counter, e.g. incrementing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09C CIPHERING OR DECIPHERING APPARATUS FOR CRYPTOGRAPHIC OR OTHER PURPOSES INVOLVING THE NEED FOR SECRECY
    • G09C1/00 Apparatus or methods whereby a given sequence of signs, e.g. an intelligible text, is transformed into an unintelligible sequence of signs by transposing the signs or groups of signs or by replacing them by others according to a predetermined system
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618 Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0625 Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation with splitting of the data block into left and right halves, e.g. Feistel based algorithms, DES, FEAL, IDEA or KASUMI
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2123 Dummy operation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/08 Randomization, e.g. dummy operations or using noise
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/12 Details relating to cryptographic hardware or logic circuitry

Definitions

  • the present invention relates to the security of cryptographic methods and the cryptographic architecture of a processor used in microelectronic assemblies, such as Smart Cards and the like, in order to prevent security breaches of the same when a Differential Power Analysis (DPA) attack is utilized.
  • DPA Differential Power Analysis
  • Cryptographic techniques are well-known in the art. Indeed, they date from at least the time of Caesar when the need to keep certain information secret from prying eyes became important enough for people to find ways to disguise the information by means of codes and ciphers.
  • Cryptographic techniques are used in a wide array of applications, both governmental and private.
  • One application of cryptographic techniques is to protect information stored in a Smart Card and/or to protect the capabilities of the Smart Card from unauthorized use or modifications.
  • Cryptographic devices, such as Smart Cards, use secret keys to process input information and/or to produce output information. It has been assumed that the information stored in a cryptographic device, such as a Smart Card, is relatively safe from attack provided that an especially strong cryptographic technique is utilized.
  • Modern cryptography utilizes transposition and substitution of digital data.
  • Messages to be encrypted, known as plaintext, are transformed by a function that is parameterized by a key.
  • the output of the encryption process is known as the ciphertext.
  • the received ciphertext is then decrypted, using a key, back into plaintext.
  • FIG. 1 depicts a cryptographic system.
  • An attacker may attack the smart card or security processor by looking for information related to the secret keys that may be leaked via EM radiation, power consumption, timing, etc.
  • the leaked information, commonly referred to as side channel information, can then be used by attackers in order to determine the secret key used.
  • Unfortunately, there is no way to guarantee that power consumption, EM radiation, etc. will not leak information about the cryptographic process being performed by a device, and thus allow an attacker to obtain information about the secret keys. Therefore, defensive techniques are needed that produce leaked information that is unusable by hackers using correlation techniques such as DPA.
  • the well-known DES cipher utilizes a number, typically 16, of substitution box (S-Box) functions.
  • S-Box functions are non-linear and can be implemented by using table lookups, Boolean logic or appropriately programmed computers.
  • Thomas Messerges in U.S. Patent Number 6,208,135, uses a randomized starting point in the set of target bits. Mr. Messerges processes the corresponding target bits in a different order; thus it becomes difficult for a DPA attacker to group related target bits from all the plaintexts of interest in order to perform statistical analyses associated with given target bit positions. However, not only does this approach not conceal the information leaked by a data bus; it also cannot prevent a malicious attacker from using this information to reorder the target bit into the correct bit position. Mr. Messerges also developed another technique, as discussed in U.S. Patent Number 6,295,606, that uses a random mask to keep the message and key hidden both while they are stored in memory, and during processing by the cryptographic algorithm itself.
  • This invention proposes a unique Random Instruction Mask (RIM) as a countermeasure to the DPA process, effectively making power consumption uncorrelatable to cipher bit values.
  • RIM Random Instruction Mask
  • the present invention has the following advantages over the techniques of Messerges, Boeckler and others: (1) More Efficient Calculations: The techniques taught by Messerges et al. slow down the DES algorithm by 300 to 500% due to the regular update of the S-boxes. In the present invention, the DES algorithm will be slowed down by approximately 15%. (2) More Robust: Even in the presence of leaked information for multiple address locations.
  • the DES algorithm is an example of an iterative-block cipher.
  • DES is described in detail in ANSI X3.92, "American National Standard for Data Encryption Algorithm (DEA)," American National Standards Institute, 1981, which is incorporated by reference herein.
  • the DES cipher is well known and utilizes a number, typically sixteen, of substitution-permutation box (SP-Box) functions instituted in program sequences called rounds.
  • SP-Box substitution-permutation box
  • the SP-box functions are non-linear and are conventionally implemented using lookup tables or Boolean logic gates or appropriately programmed computers.
  • the DES encryption algorithm performs eight SP box operations, in turn, by accessing sequentially each lookup table (or by using equivalent logic gates).
  • the eight SP boxes each take, as input, a scrambled 6-bit key (here, scrambled means that the key has been XOR-ed and shifted) and produce a 4-bit output target to be accessed by the CPU for OR-ing operations.
  • Each such 6-bit scrambled key is an SP box's entry address.
  • Table 1 shows the C-language representation of SP boxes 1 and 2 in a 32-bit implementation of DES. DES can run with 16, 32, and 64 bits, but we have chosen the 32-bit representation as a nominal example. From Table 1, note that each SP lookup table contains 64 elements. Each element in a nominal DES implementation is 32 bits and embeds a given 4-bit output target. This embedding will now be described in greater detail.
  • since the data bus is typically 32 bits wide, this 4-bit output target is distributed somewhere within a 32-bit word according to the permutation rules (one per SP box) as implied in Table 1, where the data is presented in a hexadecimal format. That is, each SP lookup table will have a different embedding position for a given 4-bit output target.
  • lookup table SP1, shown in Table 1, embeds a 4-bit output target at bit positions 24, 16, 10 and 2 in a 32-bit word.
  • Lookup table SP2 embeds a 4-bit output target at bit positions 20, 5, 31 and 15, where bit 20 is the most significant bit (MSB) and bit 15 is the least significant bit (LSB) for a given 4-bit output.
  • SP1[0] (0x01010400L) is embedded with a 4-bit output target value of 14 (i.e., 1110).
  • the 32-bit binary word is 0000 0001 0000 0001 0000 0100 0000 0000.
  • the rightmost digit is the LSB while the leftmost digit is the MSB for a given 32-bit binary word.
  • the values of the bits at 24, 16, 10, and 2 are used.
  • the 4-bit output target is 1110. This is determined by looking for the MSB value of the 4-bit output target at position 24, the next bit is at position 16, the third bit is at position 10, and finally the LSB of 0 is at position 2 of the 32-bit binary word SP1[0].
  • the bit positions 24, 16, 10 and 2 are underlined in the binary representation given above.
  • the fourth entry SP1[3] (0x01010404L), which differs from the 1110 of SP1[0] only at the LSB, has a 4-bit output target value of 15 (i.e., 1111).
  • the fourth entry SP2[3] (0x00108020L), which differs from the 1111 of SP2[0] only at the 2nd LSB, has a 4-bit target value of 13 (i.e., 1101). Having established the relationship between the 4-bit output target and its corresponding SP box's entry, next the calculation of a given SP box's entry address is discussed.
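  • The embedding described above can be checked with a minimal sketch, written here in Python for illustration rather than in the C of Table 1. The helper name `extract_target` and the constant names are assumptions introduced for this example; only the three table entries and the bit positions are taken from the text.

```python
def extract_target(word, positions):
    """Collect the bits of `word` at `positions` (MSB first) into one integer."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((word >> pos) & 1)
    return value

# Embedding positions quoted in the text, most significant bit first.
SP1_POSITIONS = (24, 16, 10, 2)
SP2_POSITIONS = (20, 5, 31, 15)

sp1_0 = extract_target(0x01010400, SP1_POSITIONS)  # SP1[0]
sp1_3 = extract_target(0x01010404, SP1_POSITIONS)  # SP1[3]
sp2_3 = extract_target(0x00108020, SP2_POSITIONS)  # SP2[3]
print(sp1_0, sp1_3, sp2_3)  # 14 15 13
```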
  • a DES algorithm uses shifting instructions running in the CPU to calculate a box's entry address. Both the number of shifting instructions used in a specific SP box's entry address calculation and the time interval between each consecutive access of an SP box will be well known to anyone who is familiar with the DES algorithm.
  • a DPA attacker looks for patterns in the power trace.
  • SP5 SP box 5
  • the DPA attacker looks for a pattern indicating eight shifts as seen in Table 2.
  • the DPA attacker would know that the time from the beginning of the eight shifts (see numeral 131) to the beginning of a next set of shifts is equal to a time T5 as shown in Figure 2a.
  • the DPA attacker, when finding this pattern in a power trace, would know that the SP address calculation for SP5 has been found (at numeral 123).
  • the attacker would also know that the information in the power trace for the time slot following the end of the eight shifts would contain the corresponding 4-bit output target information.
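  • The pattern search described above can be sketched as follows, assuming an idealized trace in which each instruction's power signature has already been classified into a label. The function name `find_shift_runs`, the labels, and the four-shift count used for the other box are hypothetical; only the eight-shift signature of SP5 comes from the text.

```python
def find_shift_runs(trace, run_length):
    """Return start indices of runs of exactly `run_length` consecutive 'SHIFT' entries."""
    starts = []
    i = 0
    while i < len(trace):
        if trace[i] == "SHIFT":
            j = i
            while j < len(trace) and trace[j] == "SHIFT":
                j += 1
            if j - i == run_length:
                starts.append(i)
            i = j
        else:
            i += 1
    return starts

# Hypothetical instruction-level trace: labels stand in for power signatures.
trace = (["LOAD", "XOR"] + ["SHIFT"] * 4 + ["LOOKUP", "OR"]
         + ["LOAD", "XOR"] + ["SHIFT"] * 8 + ["LOOKUP", "OR"])

start = find_shift_runs(trace, 8)[0]  # beginning of the eight shifts
lookup_slot = start + 8               # time slot right after the eight shifts
```

  • In this toy trace the slot found after the eight shifts is the table lookup, which is where the text says the 4-bit output target information would appear.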
  • Figure 2b shows the time line with randomized accessing order for the eight SP boxes.
  • the processing order of SP1 and SP3 has been swapped, and similarly for SP4 and SP6.
  • a DPA attacker will have to identify these shifting instruction signatures in order to align power traces by re-shuffling the SP box accessing order. After alignment for a given SP box, statistical averaging and other analysis of these power traces can be performed. Thus, the DPA attacker can ultimately align the power traces to determine the 6-bit key.
  • the present invention provides a method of inhibiting a successful DPA of a cryptographic device comprising: randomly varying an amount of time required to determine at least one lookup table address; and randomly varying an amount of time occurring between one access of at least one lookup table and a subsequent access of another lookup table.
  • the present invention provides a cryptographic architecture comprising: a processor; a memory module containing an encryption algorithm coupled to said processor; a control flag register coupled to said processor for controlling the state operation of the processor; and a random number generator coupled to said control flag register, wherein said processor sets said control flag register and said random number generator resets said control flag register.
  • the present invention provides a system for thwarting DPA, said system comprising: means for running an encryption algorithm and means for inserting a random number of pseudo instructions into said encryption algorithm.
  • the present invention provides a system for decorrelating side channel information, said system comprising: means for running a Data Encryption Standard (DES) algorithm, said DES algorithm comprising a plurality of substitution/permutation box entry address evaluations and means for inserting a random number of shifting instructions run in each of said plurality of substitution/permutation box entry address evaluations.
  • DES Data Encryption Standard
  • the present invention provides a method of altering a power trace of a cryptographic architecture comprising the steps of: running an encryption algorithm; setting a control flag; and performing a random number of instructions when said control flag is set.
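  • A minimal software sketch of the control-flag idea, assuming a simplified model in which the processor sets a flag before each real instruction and a random number generator decides when the flag is reset. The function `run_with_rim`, the `max_pseudo` bound, and the use of Python's `random.Random` in place of a hardware generator are all assumptions made for illustration; this is not the disclosed hardware.

```python
import random

def run_with_rim(real_instructions, rng, max_pseudo=7):
    """Run `real_instructions`, padding each with a random number of pseudo instructions.

    Returns the executed sequence as (kind, name) tuples.
    """
    executed = []
    for name in real_instructions:
        control_flag = True                     # the processor sets the flag
        remaining = rng.randint(0, max_pseudo)  # stand-in for the hardware RNG
        while control_flag:
            if remaining == 0:
                control_flag = False            # the RNG side resets the flag
            else:
                executed.append(("pseudo", name))  # results would be discarded
                remaining -= 1
        executed.append(("real", name))
    return executed

program = ["shift", "shift", "lookup", "or"]
trace = run_with_rim(program, random.Random(1))
real_only = [name for kind, name in trace if kind == "real"]
```

  • The real instructions still run in their original order, while the total length of the executed sequence varies from run to run.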
  • the present invention provides a method of inhibiting a successful differential power analysis of a cryptographic device comprising randomly increasing an amount of time required to determine at least one lookup table address; and randomly increasing an amount of time occurring between one access of at least one lookup table and a subsequent access of another lookup table.
  • the present invention provides a system for thwarting differential power analysis, said system comprising: means for running an encryption algorithm and means for inserting a random number of pseudo instructions into said encryption algorithm.
  • the present invention provides a system for de-correlating side channel information, said system comprising: means for running a Data Encryption Standard (DES) algorithm, said DES algorithm comprising a plurality of substitution/permutation box entry address evaluations and means for inserting a random number of shifting instructions run in each of said plurality of substitution/permutation box entry address evaluations.
  • the present invention provides a cryptographic CPU architecture comprising: an ALU; a control flag; a plurality of registers for normally receiving output of the ALU in response to an arithmetic instruction; and an additional register for receiving output of the ALU, in lieu of one of the plurality of registers, in response to an arithmetic instruction when the control flag is set.
  • the present invention provides a method of concealing data processing occurring in a CPU from power analysis during the execution of a program, the method comprising: (i) at a point during the execution of the program, inserting a random number of program counter cycles instruction fetch cycles; (ii) while the random number of instruction fetch cycles are occurring, fetching instructions from memory, executing those instructions in program sequence, but inhibiting updating of normal memory locations based on the execution of those instructions; and (iii) at the conclusion of said random number of instructions, then recommencing normal program execution by refetching the same instructions which were initially fetched while the random number of instruction fetch cycles were occurring, but when the instructions are refetched, updating memory locations in a normal manner for the CPU.
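  • Steps (i) to (iii) above can be sketched as a small interpreter, assuming a toy program made of (destination, function) pairs acting on a register dictionary. The names `run`, `mask_at` and `max_cycles`, and the example program itself, are hypothetical and chosen only to show that the masked fetch cycles add time without changing the result.

```python
import random

def run(program, regs, rng=None, mask_at=None, max_cycles=4):
    """Execute `program` on a copy of `regs`; return (final_regs, fetch_cycles).

    If `rng` and `mask_at` are given, a random number of masked fetch cycles is
    inserted when the program counter first reaches `mask_at`.
    """
    regs = dict(regs)  # work on a copy
    cycles = 0
    pc = 0
    while pc < len(program):
        if rng is not None and pc == mask_at:
            n = rng.randint(1, max_cycles)
            for idx in range(pc, min(pc + n, len(program))):
                _dest, func = program[idx]
                func(regs)   # fetched and executed, but the write-back is inhibited
                cycles += 1
            mask_at = None   # mask once; pc is unchanged, so the same instructions are refetched
            continue
        dest, func = program[pc]
        regs[dest] = func(regs)  # normal execution updates state
        cycles += 1
        pc += 1
    return regs, cycles

PROGRAM = [
    ("a", lambda r: r["a"] + 1),
    ("b", lambda r: r["a"] * 2),
    ("a", lambda r: r["a"] ^ r["b"]),
]
START = {"a": 3, "b": 0}

plain_regs, plain_cycles = run(PROGRAM, START)
masked_regs, masked_cycles = run(PROGRAM, START, rng=random.Random(7), mask_at=1)
```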
  • the present invention provides a method of concealing data processing occurring in a CPU from power analysis during the execution of a program, the method comprising: (i) at a point during the execution of the program, inserting a random number of program counter cycles instruction fetch cycles; and (ii) while the random number of instruction fetch cycles are occurring, mimicking power consumption associated with (a) fetching instructions from memory, (b) executing those instructions in program sequence, and (c) writing results to memory registers.
  • the present invention provides a data processor comprising: an arithmetic logic unit; a control flag register; a plurality of registers for normally receiving output of the arithmetic logic unit in response to an arithmetic instruction and in response to a first state of said control flag register; and a dummy register for receiving output of the arithmetic logic unit, in lieu of one of the plurality of registers, in response to an instruction and in response to a second state of said control flag register.
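  • The dummy-register behavior can be illustrated with a minimal sketch, assuming a four-register machine with a single add instruction. The class `TinyCPU` and its fields are hypothetical and model only the routing of the ALU output, not power consumption.

```python
class TinyCPU:
    """Toy model: ALU output goes to a normal register or to a dummy register."""

    def __init__(self):
        self.regs = [0] * 4        # normal register file
        self.dummy = 0             # additional (dummy) register
        self.control_flag = False  # second state (True) selects the dummy register

    def add(self, dest, a, b):
        result = (self.regs[a] + self.regs[b]) & 0xFFFFFFFF
        if self.control_flag:
            self.dummy = result       # architectural state is left untouched
        else:
            self.regs[dest] = result  # normal write-back
        return result

cpu = TinyCPU()
cpu.regs[0], cpu.regs[1] = 5, 7

cpu.control_flag = True
cpu.add(2, 0, 1)                 # pseudo instruction: result lands in the dummy register
after_pseudo = list(cpu.regs)

cpu.control_flag = False
cpu.add(2, 0, 1)                 # real instruction: result lands in register 2
```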
  • the present invention provides a cryptographic bus architecture comprising: a random number generator having a plurality of random number outputs at which a multi-bit random number is output; a plurality of bidirectional bus drivers, each bi-directional bus driver having at least one input for receiving at least one of said random number outputs; and a bus coupling at least one of said plurality of bi-directional bus drivers to at least another of said bi-directional bus drivers; wherein bi-directional bus drivers that are coupled to a common line of said bus are controlled by a common selected one of said random number outputs.
  • the present invention provides a method of preventing a breach of security comprising the steps of: sending encrypted bits over a bus; and randomly toggling the polarity of said encrypted bits on said bus.
  • the present invention provides a method for protecting secret keys comprising: providing a plurality of bi-directional bus drivers; coupling a line of a bus between at least a first bi-directional bus driver of said plurality of bi-directional bus drivers and a second bi-directional bus driver of said plurality of bi-directional bus drivers; signaling said first bi-directional bus driver to provide a first set of bits to said bus, said bits having a first polarity; signaling said second bi-directional bus driver to receive said first set of bits having said first polarity; randomly signaling said first bi-directional bus driver to provide a second set of bits to said bus, said second set of bits having an opposite polarity than said first set of bits; and signaling said second bi-directional bus driver to receive said second set of bits having said opposite polarity.
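  • A minimal sketch of random polarity toggling on a bus, assuming an 8-bit bus in which one random control bit per line is shared by the sending and receiving drivers. The functions `drive` and `receive`, the bus width, and the software random generator are assumptions for illustration; the disclosed drivers are hardware.

```python
import random

WIDTH = 8  # hypothetical bus width

def drive(bits, invert_lines):
    """Sender side: invert each bus line whose random control bit is 1."""
    return bits ^ invert_lines

def receive(bus_value, invert_lines):
    """Receiver side: the same control bits undo the inversion."""
    return bus_value ^ invert_lines

rng = random.Random(0)
word = 0b10110010
seen_on_bus = []
recovered = []
for _ in range(16):
    mask = rng.getrandbits(WIDTH)  # one random control bit per bus line
    on_bus = drive(word, mask)
    seen_on_bus.append(on_bus)
    recovered.append(receive(on_bus, mask))
```

  • The receiver always recovers the same word, while the value actually present on the bus changes from transfer to transfer.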
  • Figure 1 depicts a prior art diagram of information available to attackers
  • Figure 2a is a prior art timeline corresponding to the normal accesses of eight SP lookup tables for a given round
  • Figure 2b is a prior art timeline corresponding to a randomized accessing order of the eight SP lookup tables for a given round
  • Figure 3 is a time line with both the time intervals and SP boxes accessing orders being randomized by Random Instruction Masking (RIM) in accordance with the present disclosure
  • Figure 4 is a time line with the shifting instructions being equalized in accordance with the present disclosure
  • Figure 5 is a block diagram of a first embodiment of a hardware architecture for implementing the DES algorithm in accordance with the present disclosure
  • Figure 6 is a block diagram of a second embodiment of a hardware architecture for implementing the DES algorithm in accordance with the present disclosure
  • Figure 7 is a block diagram of a third embodiment of a hardware architecture for implementing the DES algorithm in accordance with the present disclosure.
  • Figure 8 is a time line associated with the embodiment of Figure 7.
  • Figure 9 is a block diagram of a fourth embodiment of a hardware architecture for implementing the DES algorithm in accordance with the present disclosure.
  • Figure 10 is a block diagram of a prior art RISC CPU.
  • FIG. 11 is a block diagram of a RISC CPU in accordance with a sixth embodiment of the present invention.
  • Figure 12 is a block diagram of a system in accordance with a cryptographic bus architecture embodiment
  • FIG. 13 is a detailed block diagram of a bus architecture in accordance with the cryptographic bus architecture embodiment.
  • Figure 14 depicts a block diagram of bit writing with dual rails in accordance with the cryptographic bus architecture embodiment.
  • Table 1 shows values, expressed in the C language, for SP-boxes 1 and 2 implemented as lookup tables of 64 elements.
  • Table 2 is a C language program that sequentially accesses DES's eight SP lookup tables for a given round.
  • Table 3 is an assembly language program to implement C program statement number 5 of Table 2.
  • Table 4 is an assembly language program to implement a portion of the DES encryption algorithm that performs eight S and P boxes' operations in turn by accessing sequentially each lookup table.
  • Table 5 is an assembly language program to implement C program statement number 5 of Table 2 using the embodiment of Figure 7.
  • Figure 9 depicts a fourth embodiment which is basically a combination of the embodiments of Figures 6 and 7.
  • Figure 11 depicts a fifth embodiment which is based on a modified RISC CPU design, but the modifications discussed may also be used with non-RISC CPUs if desired.
  • Figures 12-14 relate to a cryptographic bus architecture which may be used independently or in combination with the other embodiments.
  • any encryption algorithm is a series of instructions executed by a processor. While the inputs and outputs of these instructions will vary, the amount of time required to complete each instruction is determined by the clock speed of the processor or a bus over which the data is transmitted to and from the processor. Some instructions take more clock cycles than others.
  • the knowledge of the encryption algorithm used to encrypt/decrypt the data provides hackers with knowledge about the timing of the algorithm, i.e. knowledge about which instructions are used and thus how long each instruction should take. This knowledge about timing can then be used to align side channel information. Thus, the side channel information can then be processed by sophisticated statistical approaches that allow the attacker to break the encryption.
  • a system and method for randomizing the number of instructions within the encryption algorithm is disclosed herein.
  • the instructions and timing within the encryption algorithm are no longer known to the DPA attacker. Therefore, the timing of the algorithm will be unknown to the attackers and they will be unable to align the side channel information. Without the alignment of the side channel information, the sophisticated statistical approaches will fail and the encrypted information will be protected.
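  • Why misalignment defeats averaging can be seen in a minimal sketch, assuming idealized noise-free traces that each contain a single unit spike marking the data-dependent event. The trace length, spike position, jitter range and helper names are hypothetical; real traces are noisy, which makes the loss of the averaged peak even more damaging.

```python
import random

def make_trace(length, spike_at):
    """Idealized power trace: zero everywhere except one unit spike."""
    trace = [0.0] * length
    trace[spike_at] = 1.0
    return trace

def average(traces):
    """Point-by-point mean over a list of equal-length traces."""
    n = len(traces)
    return [sum(column) / n for column in zip(*traces)]

rng = random.Random(42)
N, LENGTH, SPIKE = 200, 32, 10

# Known timing: every trace has its spike at the same sample.
aligned = [make_trace(LENGTH, SPIKE) for _ in range(N)]
# Random delays of 0 to 7 samples stand in for inserted pseudo instructions.
jittered = [make_trace(LENGTH, SPIKE + rng.randint(0, 7)) for _ in range(N)]

aligned_peak = max(average(aligned))
jittered_peak = max(average(jittered))
```

  • With aligned traces the averaged peak keeps its full height, whereas with random delays the same energy is smeared over several sample positions and the peak shrinks.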
  • This specification provides information specific to an on-chip Random Instruction Masking (RIM) architecture on a microprocessor that is used to perform cryptographic operations. Furthermore, this specification provides an architectural approach for securing existing cryptographic algorithms (including RSA, DES, AES and non-linear algorithms) from Side-Channel Attacks, i.e., attacks based on leaked power information. The motivation is to keep systems secure even though the underlying circuits will very likely always be leaking such information.
  • RIM Random Instruction Masking
  • a software approach to randomizing the order of the processing of the target bit is not enough to secure an algorithm completely. It is also necessary to destroy all instruction signatures or power patterns that may allow the DPA attackers to reorder the target bits to their original sequences. Consequently, one approach is to complement a software approach with hardware protection, preferably by means of an architecture that implements the randomizing instructions and time delays as disclosed herein.
  • a DPA selection function can simultaneously select for values of four target bits rather than just one bit, because low-level instructions often manipulate four bits (due to common use of six key bits).
  • the resulting DPA characteristics tend to have larger peaks, but do not have better signal-to-noise ratios because proportionately fewer samples are included in the averaging.
  • Figure 3 depicts how the time line relationship between SP box's entry address calculation 131 and the generation of a given 4-bit output target 123 may be modified.
  • the modification comprises the insertion of random numbers of pseudo shifting instructions 133 (according to the embodiment of Figure 5, for example) or random numbers of randomized pseudo instructions in each SP box's entry address calculation subroutine (according to the embodiment of Figure 6, for example).
  • the numbers of inserted pseudo instructions need not necessarily be random, since if each SP box ends up having the same numbers of real and pseudo instructions, then the attacker is still left with little or no information to ascertain which box is which.
  • the pseudo shift instructions include the shift and that they exactly mimic the power signature of the real shift instructions. Unless these pseudo instructions include a shift, their effect could probably be observed and thus ignored by a DPA attacker.
  • most encryption algorithms do utilize shift instructions somewhere, and assuming that the algorithm is known by the DPA attacker, then a similar correlation can be found unless the disclosed technique of inserting random numbers of shift instructions is utilized.
  • the insertion of the pseudo shifting instructions 133 or other pseudo instructions 133 changes not only the number of instructions run in each SP box's entry address evaluation, but also the time interval between consecutive SP box accesses T_n.
  • a random number of pseudo shifting instructions 133 have been inserted in SP5, thus changing the time interval T_5 between the access of SP5 and SP1 compared to Figure 2b.
  • a random number of pseudo instructions 133 are inserted in SP4, thus changing the time interval T_4 between the access of SP4 and SP6 compared to Figure 2b.
  • a random number of pseudo shifting instructions 133 could also be inserted in one or more of the other SP boxes.
  • the instructions are called 'pseudo' since they preferably mimic the power consumption trace of a real counterpart instruction (and, indeed, in certain embodiments, they may in fact be real instructions), but the execution of the pseudo instruction does not result in any data being updated by the processor.
  • both the shifting instruction signatures and the time interval signatures are camouflaged or even eliminated. This will cause a DPA attacker to be unable to identify which SP box SP1 - SP8 is being accessed in the program. This will make the re-shuffling (randomization) of the SP box access order an effective way of hiding information from DPA attackers; therefore, they can no longer align different power traces to the same reference for statistical averaging and analysis. If the pseudo instructions exactly mimic real shift instructions from a power use point of view, then the attacker can find it very difficult to identify which SP box is which. If the pseudo instructions mimic a set of randomized instructions, then the SP boxes may well be very difficult to recognize at all. The attacker may well wonder whether the encryption protocol used by the device is the same protocol that the attacker assumes the attacked device utilizes.
  • Figure 5 depicts a first embodiment of a hardware architecture for implementing the DES algorithm which may be used to insert a random number of pseudo shifting instructions 133 (as discussed with reference to Figure 3) or an equalized number of shifting instructions 133 (as discussed with reference to Figure 4).
  • the system illustrated in Figure 5 includes a 32-bit processor or Central Processing Unit (CPU) 101 with RAM 103 and ROM 105 memories on a single chip.
  • the CPU could be a 16-bit or 64-bit processor, respectively.
  • the system also contains substitution/permutation boxes (SP1 - SP8) 107, which can be implemented as lookup tables, as discussed above.
  • the CPU 101 runs an encryption/decryption program stored in the ROM 105, while the RAM 103 is for intermediate storage of the cipher text data.
  • the 6-bit key (or a guessed key) 121 and SP boxes 107 are used to calculate the Cipher Function
  • a Random Number generator 115 is coupled to a Random Instruction Mask (RIM) control flag register 113 which is coupled to the CPU 101.
  • the random number generator 115 and the RIM control flag register 113 are used to camouflage the power trace so that this power trace cannot be time-aligned to yield statistical material for any given 6-bit key 121.
  • a random number of pseudo shifting instructions 133 are generated through the interaction of the CPU 101, the RIM Control Flag Register 113 and the Random Number Generator 115.
  • the CPU 101 runs the encryption/decryption program stored in the ROM 105. Embedded in this encryption/decryption program (to be discussed later) is an instruction to set the RIM Control Flag Register 113. Upon processing this instruction, the CPU 101 sends a signal on bus 109 to the RIM Control Flag Register 113 that sets it.
  • the RIM Control Flag Register 113 then sends a RIM Control Flag signal on a control line 111 to the CPU 101 causing the CPU 101 state machine to halt (to stop updating registers in response to calculations). This may be accomplished by sending a signal from the RIM Control Flag Register 113 to the program counter register within the CPU 101 that will disable the program counter. Effectively, the state machine of the CPU 101 is halted.
  • the state machine of the CPU 101 remains halted until the RIM Control Flag Register 113 is reset. This will cause the RIM Control Flag Register 113 to send a signal to the CPU 101 on control line 111 to enable the program counter in CPU 101.
  • the RIM Control Flag Register 113 is preferably reset through the use of the Random Number Generator 115.
  • the Random Number Generator 115 is preferably a 1-bit random number generator.
  • the Random Number Generator 115 is synchronized with the timing of the instruction cycle of the CPU 101.
  • the Random Number Generator 115 may provide an output every clock cycle, or may be gated to ensure that an output is provided to the RIM Control Flag Register after a random number of X cycles, where X is any number such as 5.
  • the RIM Control Flag Register 113 is programmed to reset when either a zero or one is received from the one-bit Random Number Generator 115 depending upon the logic used. For example, assume that a zero from the one-bit Random Number Generator 115 will reset the RIM Control Flag Register 113. Because the RIM Control Flag Register 113 is reset only after receiving a zero from the one-bit Random Number Generator 115, and the one-bit Random Number Generator 115 will generate a zero after a random number of cycles, the time the state machine of the CPU 101 is halted will also be random. Thus, a random number of pseudo instructions 133 is generated, affecting the time line of the algorithm.
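  • The set/reset behavior described above can be illustrated with a minimal software sketch. It is not the disclosed hardware: the function name `halted_cycles`, the use of Python's `random` module in place of the one-bit Random Number Generator 115, and the convention that each halted cycle corresponds to one pseudo instruction are all assumptions made for illustration.

```python
import random

def halted_cycles(rng, reset_value=0):
    """Count instruction cycles during which the state machine stays halted.

    Assumption: the one-bit generator emits one bit per instruction cycle,
    and the flag register is reset on the first cycle that emits
    `reset_value` (zero here, matching the example in the text).
    """
    cycles = 0
    while True:
        cycles += 1
        if rng.getrandbits(1) == reset_value:
            return cycles

rng = random.Random(1)  # seeded only to make the sketch repeatable
samples = [halted_cycles(rng) for _ in range(10000)]
mean_halt = sum(samples) / len(samples)
print(min(samples), max(samples), round(mean_halt, 2))
```

  • Under these assumptions the halt length follows a geometric distribution with a mean of about two cycles, so the number of pseudo instructions differs from one run to the next.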
  • a pseudo instruction 133 is an instruction producing the same power signature on power traces as the original instruction but the write back of the execution result to the destination register in the CPU 101 is inhibited since the state machine of CPU 101 is halted.
  • the inhibiting of the CPU 101 preserves the CPU's state. Thus, inhibiting write back prevents the CPU from moving onto the next step in the algorithm; however, the power traces suggest otherwise. Thus, the attacker will be unable to use the power traces to decrypt the keys.
  • the CPU 101 in Figure 5 is preferably modified to accommodate these pseudo instructions with a RIM control flag signal sent on the bus 111, generated by a RIM control flag register 113, which, when activated, will disable the update of the CPU 101 destination register or the CPU 101 program counter (details of an embodiment of a modified CPU are disclosed in U.S. Patent Application No. 10/864,568 filed June 8, 2004 entitled "Cryptographic CPU Architecture with Random Instruction Masking to Thwart Differential Power Analysis").
  • the assembly statement #4 (i.e., jal link rshft) in Table 3 jumps and links to the subroutine labeled as "rshft" or Statement #13 (thus the mnemonic jal).
  • the term "link" in this statement represents a register that contains the return address.
  • RIM statements of variable block size are inserted before (or after, or both) an actual shifting instruction statement like #15 (i.e., sra 1 1).
  • the instruction #13 in Table 3 allows the insertion of RIM instructions, when the RIM Control Flag Register 113 is set by the CPU 101 until the RIM Control Flag Register 113 is reset by the Random Number Generator 115. After execution of statement #15, and the completion of the RIM block, the "useful" execution of the program resumes.
  • statements #13 and #14 in Table 3 are for illustrative purposes only. These statements can occur anywhere, before, between or after an actual shifting instruction statement like #15. Preferably, for design simplicity, statements #13 and #14 are located within the scope of the shifting routine.
  • This random insertion thwarts a DPA attacker's attempt to track the shift instruction signatures because the number of discrete samples of a power trace is no longer fixed, but random. Hence, power traces cannot be time-aligned by the attacker for each 4-bit output target 123.
  • this insertion of random instructions also changes the time interval, for example T_5, further thwarting the attempts of the DPA attacker.
  • the random number of pseudo shift statements are preferably inserted in the middle of a loop, so that their effect is magnified by the loop. If these statements were inserted outside the loop, then adding only one or two pseudo shifts would not help much: changing a >>8 to a >>10 may not camouflage it enough in the context of the DES algorithm. If you are trying to hide a >>8 from a >>16 or >>24, this requires that enough pseudo shift instructions be added to confuse the >>8 with a >>16 or a >>24. Putting the added random number of pseudo shift statements in the loop ensures that the added number of pseudo shift statements will be an integer multiple of 8. If a random number of pseudo shift statements is inserted outside the loop, then other techniques can be used to ensure that the added number of pseudo shift instructions will be 8, 16, 24 (or other number sufficiently close thereto to confuse the DPA attacker).
  • Table 4 is an assembly language program for a 16-bit CPU that implements the portion of the DES encryption algorithm that performs the eight S and P box operations in turn by accessing sequentially each lookup table 107 as shown in Figure 5.
  • Lines starting with ";" are comment lines.
  • Underlined statements are the corresponding C language statements for comment purposes.
  • Figure 6 depicts another embodiment of a hardware architecture for implementing the DES algorithm which may be used to insert a random number of random pseudo instructions 133 (see Figure 3).
  • the first embodiment of Figure 5 disables this tracking ability by inserting a random number of RIM instructions in each SP box's entry address calculation subroutine. In this embodiment, however, not only the number but also the content of these instructions will be altered, as described in detail below.
  • This second embodiment, as shown in Figure 6, is very similar to the first embodiment of Figure 5 and therefore common elements are identified by common reference numerals. As in the case of the embodiment of Figure 5, this embodiment preferably has a 32-bit CPU 101 with RAM memories 103 and ROM memories 105 disposed on a single chip.
  • This chip also preferably contains substitution/permutation boxes (SP1 - SP8) 107, which can be implemented as lookup tables.
  • the CPU 101 runs the program stored in the ROM 105, while the RAM 103 is for intermediate storage of the cipher text data.
  • the CPU 101 fetches not only the normal encryption program from the ROM 105, but also the camouflaged, randomized instructions by means of a 32-bit pseudo random number generator 117.
  • a MUX 119 selected by a RIM control flag register 113, determines the type of instructions fetched by the CPU 101, real instructions from ROM 105 or randomized instructions generated by the 32-bit pseudo random number generator 117.
  • a conventional CPU is modified to include the RIM control flag register 113 which, when activated, will disable the update of the CPU's destination register(s).
  • all the instructions executed inside the RIM statements block will camouflage the power trace so that the number of discrete samples of a power trace is no longer fixed for a given 4-bit output target.
  • the number and type of these instructions are determined on the fly by the random number generators.
  • the program address is also constantly being substituted for by another 32-bit pseudo Random number, since the Program Counter is not updated until the CPU resumes normal execution after the RIM control flag has been reset by the 1-bit random number generator.
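  • The instruction selection performed by the MUX 119 can be sketched in software as follows. This is only an illustration: the function name `fetch`, the placeholder ROM words, and the use of Python's `random` module in place of the 32-bit pseudo random number generator 117 are assumptions, not part of the disclosed circuit.

```python
import random

def fetch(rom, pc, rim_flag, prng):
    """Return the next 32-bit instruction word seen by the CPU.

    rim_flag == 0: real instruction from ROM at the program counter.
    rim_flag == 1: randomized word from the pseudo random number generator.
    """
    if rim_flag:
        return prng.getrandbits(32)
    return rom[pc]

rom = [0x20010005, 0x00011043, 0x03E00008]  # arbitrary placeholder words
prng = random.Random(7)                      # stands in for generator 117
pc = 1

real = fetch(rom, pc, 0, prng)
masked = [fetch(rom, pc, 1, prng) for _ in range(3)]
# The program counter is not advanced while the flag is set, so the real
# instruction at the same address is fetched once the flag is reset.
resumed = fetch(rom, pc, 0, prng)
print(hex(real), [hex(w) for w in masked], hex(resumed))
```

  • Because the program counter is left untouched while the flag is set, the real program continues from the same address after the masked block, whatever random words were executed in between.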
  • the RIM control line 111 of Figures 5 or 6 should be made to be "probe-proof" by burying it deeply in the layers of the semiconductor device. However, if the RIM control line 111 can be probed, then the afore-described techniques for dealing with a DPA attack will be overcome if the DPA attacker disables the RIM control signal on line 111 by tying it to ground (or high, depending on its logic) throughout the attack.
Detailed Description of a Third Embodiment
  • Figure 7 depicts a third embodiment that is more resistant to probing than the embodiments of either Figures 5 or 6 and Figure 8 presents a time line for this embodiment.
  • This embodiment overcomes a single point failure attack, that is, an attack on line 111 of the foregoing embodiments, by introducing a Shift Control Counter (SCC) 140 and other changes discussed below.
  • This embodiment is described with reference to an embodiment in which the total number of shift instructions (both real and pseudo) is fixed at twenty-four. However, those skilled in the art should now appreciate that the total number of real and pseudo instructions can be fixed at some other number or can be randomized utilizing the techniques previously described with reference to Figures 5 and 6.
  • Figure 7 anticipates an attack will occur on line 111 and the previously disclosed design of line 111 is modified so that even in the event of a successful attack, the system does not revert back to an unprotected design (such as the designs described with reference to Figures 2a and 2b).
  • the SCC 140 will be set (for example by a suitable software instruction or set of software instructions - see, e.g., instructions 3 and 4 in Table 5) to a count corresponding to that of the SP box.
  • Each decoded shift instruction will decrement this counter 140 by one until it reaches zero using, for example, its own decoder hardware.
  • a zero count will activate the "RIM_shift" signal at its output that will make any subsequent shift instruction a RIM instruction (i.e., a pseudo shift instruction with a camouflaged power signature).
  • each SP box has 24 right bit shifts associated therewith.
  • the shifts which are pseudo shifts in Figure 8 are identified by hatching lines. For example, for box SP5, eight shifts are real right bit shift instructions while sixteen shifts are pseudo shift instructions. If a DPA attacker attacking line 111 disables the "RIM_shift" signal, then the normal execution of the encryption algorithm will be disrupted due to the fact that extra shifts will be performed because the pseudo shift instructions are then turned into real instructions due to the interference with line 111. Thus, instead of merely inhibiting the production of pseudo shift instructions, interference with line 111 causes the inhibited pseudo shift instructions to be replaced with real shift instructions.
  • Table 5 is similar to Table 3, but shows the SCC 140 augmented RIM implementation in an assembly language subroutine.
  • the same assembly statement #3 (in an italic font) first loads register C with the number of shifts to be used to initialize Shift Control Counter (SCC) as indicated by the assembly statement #4 (i.e., sw_SCC C) which stores word SCC with the content of register C (thus the mnemonic sw).
  • Assembly statement #3 is not intended to tell the CPU how many shifts to execute; instead, assembly statement #5 is used for this purpose to provide identical shifting instruction power signatures for every SP box access.
  • the SCC control circuitry will decode each shifting instruction and decrement its counter until it reaches zero.
  • the zeroed SCC counter will then convert subsequent real shift instructions into pseudo instructions by asserting the "RIM_shft" signal to camouflage their power signatures.
  • a non shifting instruction will never activate the "RIM_shft" signal.
  • SCC circuitry will only be active when it is running the encryption algorithm during SP box access, so that normal shift instruction decoding is in effect for non-SP box operations.
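  • The counter behavior can be modeled with a short sketch, assuming a total of twenty-four one-bit right shifts per SP box and eight real shifts for SP5 as in the text. The function `run_shifts` and its parameter names are hypothetical, and forcing the control line off is modeled simply as a boolean.

```python
def run_shifts(value, real_shifts, total_shifts=24, rim_line_enabled=True):
    """Execute `total_shifts` right-shift instructions for one SP box.

    The counter starts at `real_shifts`; each decoded shift decrements it.
    Once it reaches zero the RIM shift signal is asserted and later shifts
    become pseudo shifts (no write back), unless the control line has been
    forced inactive by an attacker.
    """
    scc = real_shifts
    for _ in range(total_shifts):
        rim_shift = (scc == 0) and rim_line_enabled
        if not rim_shift:
            value >>= 1          # real shift: the result is written back
        if scc > 0:
            scc -= 1
    return value

word = 0xABCDEF12
expected = word >> 8                                   # SP5: eight real shifts
protected = run_shifts(word, real_shifts=8)
attacked = run_shifts(word, real_shifts=8, rim_line_enabled=False)
print(hex(expected), hex(protected), hex(attacked))
```

  • In this model the protected run yields the intended eight-bit shift, while the run with the control line disabled performs all twenty-four shifts for real and so produces a value that no longer matches the algorithm.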
  • the physical protection of the RIM control line 111 on the chip from direct probing is no longer critical (although it would make sense to protect it nevertheless in order to make the DPA attacker think he will obtain meaningful results by attacking it - something which will turn out to be an exercise in futility).
  • DPA calculates and plots the difference of the sum of two groups of power traces.
  • DPA can be effective due to the fact that there is a statistical correlation between the difference of the sum of the two groups of power traces and the content of a target bit (b) getting through the data path of the system at a specific order. Because of the introduction of SCC augmented RIM in this embodiment, this statistical correlation is no longer valid as target bits are now getting through the data path of the system at a random order rather than at a specific order, and it cannot be disabled without disrupting normal execution of the encryption algorithm. Disruption of the encryption algorithm by attacking the RIM control line yields no useful statistical key material to be gathered by the attacker.
  • DPA can only be effective if there is a statistical correlation between the difference between the sums of two groups of power traces and the content of a single target bit that exits the system at a specific time. With this RIM embedded embodiment, this statistical correlation is no longer valid due to the fact that target bits now exit the data path of the system at random rather than at specific times.
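  • The dependence of DPA on timing can be illustrated with a small simulation, offered as a sketch under stated assumptions rather than as a model of any real device. The traces are synthetic (Gaussian noise plus a unit-height leak of the target bit), the group means are compared instead of the group sums so the two groups may differ in size, and the names `trace` and `dpa_peak` and the parameter `jitter` (the maximum number of inserted pseudo-instruction slots) are hypothetical.

```python
import random

LENGTH = 40  # samples per synthetic trace

def trace(bit, leak_time, jitter, rng, noise=0.5):
    """Noisy trace whose sample at leak_time (+ random delay) depends on bit."""
    samples = [rng.gauss(0.0, noise) for _ in range(LENGTH)]
    samples[leak_time + rng.randint(0, jitter)] += float(bit)
    return samples

def dpa_peak(jitter, n=4000, seed=3):
    """Largest absolute difference between the mean traces of the two groups."""
    rng = random.Random(seed)
    sums = {0: [0.0] * LENGTH, 1: [0.0] * LENGTH}
    counts = {0: 0, 1: 0}
    for _ in range(n):
        bit = rng.getrandbits(1)            # target bit selects the group
        for i, s in enumerate(trace(bit, 5, jitter, rng)):
            sums[bit][i] += s
        counts[bit] += 1
    diff = [sums[1][i] / counts[1] - sums[0][i] / counts[0]
            for i in range(LENGTH)]
    return max(abs(d) for d in diff)

aligned_peak = dpa_peak(jitter=0)    # target bit leaks at a fixed time
masked_peak = dpa_peak(jitter=20)    # leak time randomized per trace
print(round(aligned_peak, 3), round(masked_peak, 3))
```

  • With a fixed leak time the differential trace shows a peak close to the full leak amplitude; with the leak time randomized the same leak is spread over many sample positions and the peak falls toward the noise floor.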
  • the introduction of embedded RIM results in the random variation of two features. The first is a variation in the number/type of instructions run in each SP box's entry address evaluation. The second is a variation in the time interval between each consecutive SP box access. These two features will cause a DPA attacker to be unable to identify which SP box is being accessed in the program. This will, in turn, make the re-shuffling of the SP box access an effective way of hiding information from DPA attackers because they can no longer align different power traces to the same reference for statistical averaging and analysis.
  • the total number of real and pseudo shifts associated with each SP box totals twenty four shifts.
  • eight real shifts are associated with sixteen pseudo shifts.
  • the eight real shifts are the correct number of shifts for box SP5 according to the DES algorithm. If line 111 is attacked, then twenty four real shifts will occur in box SP5 instead (and in the other SP boxes as well), making a "mess", so to speak, of the DES algorithm.
  • the total number of shifts in each SP box need not be fixed at twenty four (or some other number, for that matter), but may be varied or randomized, if desired. That complicates the design of the CPU shown in Figure 7 somewhat, for example, by incorporating the design of either Figure 5 or 6, but the modification needed to randomize the total number of shift instructions is rather straightforward, as can be seen by reference to Figure 9, which shows a fourth embodiment as a combination of the embodiments of Figures 6 and 7.
  • a modified RISC Processor (CPU) architecture can be used, for example, to generate identical power signatures for both normal instructions and special camouflaged "pseudo" instructions controlled by the Random Instruction Masking (RIM) flag.
  • This specific processor architecture is intended to work in an on-chip cryptographic system embedded with Random Instruction Masking (RIM), and this architecture, combined with the S/W-specific RIM concepts, is intended to protect the cryptographic system from piracy through Power Analysis and Differential Power Analysis.
  • Camouflaged instructions are those instructions that have the same instruction code and the same power signature as those typically used in encryption, but when running in this specific processor architecture, will not change the content of any processor register or alter the processor status.
  • the Random Instruction Masking is a technique to create a camouflaged encryption program to protect the cryptographic device from reverse engineering through Power Analysis or Differential Power Analysis.
  • Figure 10 is a general (simplified) RISC Processor (CPU) architecture 200.
  • a RISC instruction is an arithmetic or logic function performed by the ALU (Arithmetic Logic Unit) 210 taking two operands from two registers of the Register File 220 and the result of the operation being written back into a third register of the Register File 220.
  • the Register File 220 consists of a number of registers with the same width (number of bits, e.g. 32-bits) that can be accessed with an address selection.
  • the processor gets its instruction sequentially from the ROM 240 and loads it into the Instruction Register 245.
  • the ROM 240 stores all the instruction codes of the whole program including the encryption algorithm.
  • the Control Logic 250 decodes the instruction code in the Instruction Register 245 and gives the correct control commands to the ALU 210 and other parts of the processor 200. Addresses of the operands (Source A and B) and the destination are also defined in the instruction code.
  • An address decoder 260 decodes the address information from the Instruction Register 245 and provides the access control of the specific register in the Register File 220.
  • the ALU 210 controlled by the Control Logic 250, gets the two operands (sources A and B) from the register file 220 with the specified addresses and performs the instruction-specified arithmetic or logical operation. The result of the ALU operation is written back to another register in the Register File 220 with the destination address on a data bus 215.
  • a Program Counter 230 that stores the index reference of the instruction in the whole program will be incremented or updated by the Control Logic during the execution of the instruction. Some specific instructions of the processor will not increment or update the Program Counter 230. The updating of some other Flag Registers (not shown) in the processor, similar to the Program Counter 230, is also instruction dependent.
  • CMOS circuits do not draw static current so that power is dissipated only when charging and discharging the load capacitance (switching).
  • the current consumption of a CMOS circuit depends mainly on the capacitive loading, the driving capability of the driver and the frequency of the switching.
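  • As a point of reference, the standard first-order textbook expression for dynamic (switching) power in CMOS logic captures the capacitive loading and switching frequency dependence mentioned above. The symbols are not taken from this specification: \alpha is the switching activity factor, C_L the load capacitance, V_{DD} the supply voltage and f the clock frequency. To first order, the driving capability of the driver affects how quickly the charge is delivered (the shape and peak of the current pulse) rather than the energy per transition.

```latex
P_{\text{dyn}} \approx \alpha \, C_L \, V_{DD}^{2} \, f
```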
  • a complete instruction cycle run in the processor involves the operation of different circuits at different times. Different parts of the processor circuits, due to their differences in device dimension, parasitic loading, and switching speed, will generate a unique current pattern (power signature) with respect to time on the power bus when activated. Power Analysis or Differential Power Analysis (DPA) uses these power signature patterns to correlate the instructions.
  • One very important condition for the RIM approach to successfully prevent DPA attacks is to eliminate any power signature of these RIM instructions. The best way to do this is to make the power signature of the RIM instruction identical to the normal instruction so that they are not differentiable in Power Analysis or Differential Power Analysis (DPA).
  • Figure 11 shows an improved version of the RISC Processor 200 shown in Figure 10.
  • the random number generator is also depicted in Figure 5 in connection with the first embodiment.
  • the RISC Processor of Figure 11 has extra AND gates compared to the Processor of Figure 5 for controlling the Destination Address and the Program Counter Increment Enable.
  • An extra register 222 is attached to the data bus 215. This register 222 is designed in such a way that it is identical to a register in the Register File 220 at least from a power consumption viewpoint.
  • a pseudo program counter 232 is present to duplicate the original Program Counter 230 in the processor in terms of power consumption. While the RIM control flag 202 is set, the pseudo program counter 232 fetches instructions from the ROM 240 and those instructions enter the Instruction Register 245 and are decoded by the Address Decoder 260 as usual. But the results of the instruction are directed to the additional register 222 instead of a register in the Register File 220.
  • When the RIM control flag 202 equals a logical '0', the processor 200 will be under normal operation (that is, it functions as depicted by Figure 5 as unmodified).
  • the extra AND gates 221, 231 at the destination address and the program counter simply pass the original signals from the Address Decoder 260 and the Control Logic unit 250.
  • the added register 222 and the pseudo program counter 232 are disabled. Since all the circuit components involved during the execution of an instruction are the same as in Figure 10, the power signature (i.e. the consumed current pattern with respect to time) of every instruction run in the modified processor of Figure 11 will be the same as the processor of Figure 10.
  • the ALU is directed to load the results of the instruction being executed into added register 222 instead of one of the normal destination registers in register file 220. Since the physical design of the added register 222 is identical to a destination register in register file 220, the consumed current pattern of loading this added register 222 will be the same as loading the results into a real destination register in the register file 220.
  • the AND gate 223 arranged at the front of the added register is for the purpose of emulating the power of one AND gate 221 used for the destination registers during normal operation.
  • the RIM flag 202 also disables the real Program Counter 230, and the pseudo program counter 232 is activated to be incremented or updated.
  • When the RIM flag 202 goes back to logical '0', the processor will resume its normal operation to continue running the original program.
  • Whatever instructions (no restriction of what kind) run while the RIM flag is at logical '1' have no effect on the processor nor on the programming other than just producing a camouflage effect of executing an associated normal instruction in the power trace.
  • the instructions that were fetched while the RIM flag was at a logical '1' are basically re-fetched. Of course, the sequence may vary somewhat since the outcomes of branch instructions could be different. In any event, the processing basically continues from where it was interrupted while the RIM flag was at a logical '1'.
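  • The effect of the RIM flag on write back and on the program counter can be sketched with a toy register machine. This is a behavioral illustration only and says nothing about power signatures; the class `MaskedCpu`, its add-only instruction format, and the choice to let the pseudo program counter start from the real program counter's value are assumptions made for the sketch.

```python
class MaskedCpu:
    """Toy register machine with a RIM flag (all names are illustrative)."""

    def __init__(self, program, num_regs=4):
        self.program = program        # list of (dest, src_a, src_b) adds
        self.regs = [1] * num_regs    # register file
        self.pc = 0                   # real program counter
        self.dummy_reg = 0            # extra register for masked results
        self.pseudo_pc = 0            # pseudo program counter
        self.rim = 0                  # RIM control flag

    def step(self):
        if self.rim:
            # Masked mode: fetch via the pseudo counter, discard the result.
            dest, a, b = self.program[self.pseudo_pc % len(self.program)]
            self.dummy_reg = self.regs[a] + self.regs[b]
            self.pseudo_pc += 1
        else:
            # Normal mode: write back and advance the real counter.
            dest, a, b = self.program[self.pc]
            self.regs[dest] = self.regs[a] + self.regs[b]
            self.pc += 1
            self.pseudo_pc = self.pc  # assumption: pseudo PC tracks real PC

prog = [(0, 1, 2), (1, 0, 0), (2, 1, 1)]
cpu = MaskedCpu(prog)
cpu.step()                            # real: r0 = r1 + r2
cpu.rim = 1
cpu.step()
cpu.step()                            # two masked instructions
regs_during_mask, pc_during_mask = list(cpu.regs), cpu.pc
cpu.rim = 0
cpu.step()                            # real: resumes at the saved address
print(regs_during_mask, pc_during_mask, cpu.regs, cpu.pc)
```

  • In this sketch the register file and the real program counter are unchanged by the masked instructions, and the instruction at the interrupted address is executed for real once the flag returns to '0'.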
  • the power traces will contain a random variation of the number of certain instructions and also a variety of different kinds of instructions executed in the subroutine. Thus, DPA attackers can no longer identify and align the power traces of the SP box subroutine.
  • the extra register 222 is a dummy register in that it receives and stores data, but the data received thereby is preferably not used to influence subsequent data processing by processor 200.
  • In Figure 11 it is shown separated from register file 220, but it could be implemented as a part of register file 220, if desired.
  • the protection of the RIM control line at the output of the RIM control flag 202 on the chip from direct probing is important. If the RIM control line were easily accessed, some knowledgeable attackers may use this technique to force the RIM control line to be always at logical '0' so as to disable the RIM. A number of camouflage techniques are available to protect the physical design of CMOS circuits from reverse engineering.
  • the RIM control line can be made very difficult to probe by burying it deep into the silicon implant level and shielding it with actively connected higher Poly and metal layers. It will be very difficult to locate this RIM control line and any attempt to remove the higher protecting layers will damage the functionality of the chip.
  • the state of the RIM flag 202 is assumed to be at a logical '1' when the pseudo program counter 232 is being used to fetch instructions. As is well known to those skilled in the art, this logic shown on Figure 11 may be easily modified so that a logical '0' would cause the pseudo program counter 232 to come into play and then a logical '1' would represent normal CPU operation.
  • the circuit shown in Figure 11 is not intended for a pipelined ALU. However, it is straightforward to adapt the circuit of Figure 11 for a pipelined ALU.
  • a pipelined ALU has four stages: prefetch, instruction decode, execute, and writeback.
  • the RIM control signal from the RIM flag may be synchronized with the pipeline through a delay circuit.
  • the RIM control flag 202 should be synchronized with added register 222, AND gates 221 and pseudo program counter 232 when used with a pipelined ALU.
  • a processor 200 may have additional status flag registers that should not be updated when running in RIM mode.
  • the control of such registers may be modified in the same way as the registers (by providing dummy flag registers - analogous to extra register 222 - for writing results to when in RIM mode) resulting in a duplicated power signature component for updating these flag registers without really updating them.
  • These flag registers are not depicted in Figure 11 for the purpose of simplicity.
  • high capacitive loading and high speed mean that the switching of the data bus and the read/write of the Register File (Memory) will dominate the power consumption.
  • the switching power of updating the flag registers is not significant in comparison to the total power. Even the program counter switching power may not be significant enough to cause an observable difference in the power traces. Leaving these flag registers untouched may be a convenient way to reduce the extra circuitry required.
  • This embodiment prevents usage of side channel information by DPA attackers by randomly toggling the polarity of the target bit at the data bus driver while maintaining the equal probability of having a '0' or '1' value. In other words, the power traces no longer statistically correlate with the secret key. Thus, side channel information cannot be used to determine the keys being used by the cryptographic system.
  • This embodiment may be used with the other embodiments or may be used alone.
  • the result is that within each group of messages having the same target bit values computed from the selection function with a correctly guessed key K_s, the corresponding power traces will not always be '0' or '1'. The chance of having a '0' or '1' at the target bit will be approximately 0.5 due to the randomization of polarity.
  • the selection function D is effectively un-correlatable to the actual power trace measurement. The selection function D has thus been deprived of a way of predicting the power consumption of the actual target bit. In the case of K_s being incorrectly guessed, randomization will maintain the un-correlation between D and the corresponding power traces.
  • FIG. 12 depicts a Cryptographic Bus Architecture 311 (CBA) in accordance with the present invention, preferably having bi-directional drivers 315, 317 at both ends and a typically heavily loaded bus 316 in between. Bi-directional drivers are preferred since the use of non-bi-directional drivers would tend to increase the number of bus drivers needed to practice the invention.
  • the bus 311 connects CPU 301 to its memories 321, 323. The CPU 301 runs the program stored in the ROM 321 and the RAM 323 is for intermediate storage of the cipher text data and the key.
  • the N-bit random number generator 313 controls the N-bit bi-directional drivers 315, 317.
  • the random number generator 313 has N outputs 314, wherein each output comprises one bit.
  • Each bit 314_0 - 314_N controls one bus driver 315, 317.
  • the random number generator 313 generates a new set of N-bit random numbers 314_0 - 314_N whenever an "activate signal" is received from the CPU 301 through the enable line 303.
  • the activate signal is preferably sent by the CPU 301 at the beginning of each DES round and is preferably software invoked.
  • the value of each random bit 314_0 - 314_N is used to determine the way to toggle a driver 315, 317, i.e.
  • the polarity control line 313 is preferably made to be "probe-resistant" because it is preferably buried beneath those circuit features readily visible to the reverse engineer. That is, this control line can be made with implanted layers in the substrate, using the techniques of U.S. Patent Nos. 5,866,933; 6,294,816 or 6,613,661 (each of which is hereby incorporated herein by reference), and therefore is buried beneath oxide, polysilicon and/or metal, making the possibility of connecting to the control line a much more difficult proposition.
  • the required polarity changes are infrequent enough to thwart the statistical analysis by a reverse engineer. For example, the polarity can be changed at the beginning of each DES round, or at the beginning of fetching each new plaintext for encryption.
  • FIG. 13 depicts a more detailed block diagram of the preferred embodiment.
  • the 'CPU Read' 401_0 - 401_N and 'CPU Write' 403_0 - 403_N lines are used to control the data flow direction.
  • the bi-directional bus drivers 315, 317 are inverting or non-inverting tri-state buffers, as determined by the value of the associated random bit 314_0 - 314_N of the random number generated by random number generator 313. For example, when the random bit 314_0 is '0' for bi-directional bus driver 315 during a 'CPU write' operation, the signal at 305_0 will be inverted on the data bus 316.
  • bi-directional bus driver 317 will pick up the inverted signal from the data bus 316 for bit 305_0 and invert the bit again to ensure the integrity of the original data signal. This occurs for each bit of the data signal 305, typically with some bits being inverted and others not.
  • the non-inverting buffer 319 will drive the data bus 316 instead of the inverting one 320. Since the signals 314_0 - 314_N are random, the chance of having a value of '0' or '1' will be approximately 0.5 and 0.5. The result is that all the deterministic power information associated with the content of the data bus will be lost. Thus, even in the case of a DPA attack having a correctly guessed key, the tip-off correlation between the content of the target bit over the data bus and the corresponding power traces is lost.
  • a set of dual rails (d and d_bar) is preferably used to write a given register bit as shown in Figure 14. Because of the symmetry of this design, the dual rails simultaneously contain both the new data 'd' and its complement 'd_bar', thus masking the external power consumption, which is normalized at 0.5 as a result of averaging 'd' and 'd_bar'. Note the presence of complementary read amplifiers and complementary write amplifiers.
  • the set of dual rails contains '0,1'; for a data value D_0 of '1' the data value for the set of dual rails is '1,0'. Therefore, independent of the data value D_0, this circuit (including the rails d and d_bar as well as the complementary read and complementary write amplifiers) will always have the same average power consumption and thus will make the data value D_0 un-correlatable to the power consumption of the circuit.
  • the data value D_0 of the circuit of Figure 14 can have a '0' value or a '1' value, but, in either case, one of d and d_bar will be equal to '0' and the other of d and d_bar will be equal to '1' and their average will, of course, be equal to 0.5.
  • the result is that the power signature of the circuit is independent of the data value content of the ALU register bit.
  • a given register has multiple bits and each bit of storage is preferably constructed in accordance with the design according to Figure 14.
  • the present invention is preferably implemented in an on-chip bus and/or chip architecture of a microprocessor that is used to perform cryptographic operations.
  • This architectural approach enables securing existing cryptographic algorithms (including RSA, DES, AES and non-linear algorithms).
  • fval SP8[ work & 0x3fL]; 10 fval
  • SP6[(work >> 8) & 0x3fL]; 11 fval
  • SP4[(work >> 16) & 0x3fL]; 12 fval
  • rshft sw RIM_start I/O to start RIM by allowing insertion of random instructions with CPU ; registers update disabled, (i.e., begin of RIM statements block) ; random instruction from random number generator ; random instruction from random number generator 16.
  • sw RIM_stop I/O to stop Random Instruction Masking by enabling update of registers; ; (i.e., end of RIM statements block)
  • Table 5 The corresponding Assembly language program to implement the C program statement #5 of Table 2 for the embodiment of Figure 7 - lines starting with a ";" are the comment lines.
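The dual-rail register write described above for Figure 14 can be illustrated with a small sketch. The Python below is only a toy model under stated assumptions: the function names are hypothetical, and the average logic level over the two rails is used as a crude stand-in for power consumption rather than as a circuit simulation.

```python
def dual_rails(d):
    """Return the pair (d, d_bar) placed on the two rails for data bit d."""
    if d not in (0, 1):
        raise ValueError("d must be 0 or 1")
    return d, d ^ 1


def rail_average(d):
    """Average logic level over both rails (assumed proxy for power)."""
    rails = dual_rails(d)
    return sum(rails) / len(rails)


if __name__ == "__main__":
    # Both data values give the same average, so the proxy carries no
    # information about the stored bit.
    for d in (0, 1):
        print(d, dual_rails(d), rail_average(d))
```

In this model the average is the same for both data values, which is the property the text relies on when it says the register bit is un-correlatable to the power consumption.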

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Storage Device Security (AREA)

Abstract

An apparatus and method for preventing information leakage attacks that utilize timeline alignment. The apparatus and method insert a random number of instructions into an encryption algorithm such that the leaked information cannot be aligned in time to allow an attacker to break the encryption.

Description

Cryptographic Architecture with Instruction Masking and Other Techniques for Thwarting Differential Power Analysis
CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part of U.S. Patent Application Nos. 10/864,569; 10/864,556 and 10/864,568 all filed on June 8, 2004 and respectively entitled "Cryptographic Architecture with Random Instruction Masking to Thwart Differential Power Analysis", "Cryptographic Bus Architecture for the Prevention of Differential Power Analysis" and "Cryptographic CPU Architecture with Random instruction Masking to Thwart Differential Power Analysis." The disclosure of each of these related applications is hereby incorporated by reference.
BACKGROUND OF THE INVENTION
1. Field of the invention
The present invention relates to the security of cryptographic methods and the cryptographic architecture of a processor used in microelectronic assemblies, such as Smart Cards and the like, in order to prevent security breaches of the same when a Differential Power Analysis (DPA) attack is utilized.
2. Description of Related Art
Cryptographic techniques are well-known in the art. Indeed, they date from at least the time of Caesar when the need to keep certain information secret from prying eyes became important enough for people to find ways to disguise the information by means of codes and ciphers.
Today, cryptographic techniques are used in a wide array of applications, both governmental and private. One application of cryptographic techniques is to protect information stored in a Smart Card and/or to protect the capabilities of the Smart Card from unauthorized use or modifications. Cryptographic devices, such as Smart Cards, use secret keys to process input information and/or to produce output information. It has been assumed that the information stored in a cryptographic device, such as a Smart Card, is relatively safe from attack provided that an especially strong cryptographic technique is utilized.
Modern cryptography utilizes transposition and substitution of digital data. Messages to be encrypted, known as plaintext, are transformed by a function that is parameterized by a key. The output of the encryption process, known as the ciphertext, is then transmitted. The received ciphertext is then decrypted, using a key, back into plaintext.
One example where modern cryptography is used is in pay-TV conditional-access systems such as pay channels for cable and satellite television. Smart cards and/or security processors (containing secret keys) are used to decrypt the television signals. Attackers buy a cable or satellite receiver and then attack the smart card or security processor inside in order to determine the secret keys. The ciphertext is the information sent from the cable or satellite provider, and the plaintext is the decrypted television signal sent to the television. Thus, it is generally assumed that the input and output information, i.e. the plaintext and ciphertext, is available to attackers, and information about the secret keys is unavailable. Figure 1 depicts a cryptographic system. An attacker may attack the smart card or security processor by looking for information related to the secret keys that may be leaked via EM radiation, power consumption, timing, etc. The leaked information, commonly referred to as side channel information, can then be used by attackers in order to determine the secret key used. One common technique for determining a secret key from leaked or side channel information is known as Differential Power Analysis (DPA). Unfortunately, there is no way to guarantee that power consumption, EM radiation, etc. will not leak certain information about the cryptographic process being performed by a device and thus allow an attacker to obtain information about the secret keys. Therefore, defensive techniques are needed that produce leaked information that is unusable by hackers using correlation techniques such as DPA.
The following background discussion is provided in order to supply a context for one application of the presently disclosed technology, which involves a well-known cipher, the Data Encryption Standard (DES), which DPA is commonly used to break. One skilled in the art will appreciate that this discussion is for illustration purposes only, and that the present invention may be utilized to protect secret keys of a number of data encryption formats from a number of hacking techniques in which side channel information is used in order to determine the secret keys.
The well-known DES cipher utilizes a number, typically 16, of substitution box (S-Box) functions. The S-Box functions are non-linear and can be implemented by using table lookups, Boolean logic or appropriately programmed computers.
It has been discovered within the past several years that DPA can be utilized by attackers to determine the secret keys used in cryptographic devices such as Smart Cards, in particular where the Data Encryption Standard (DES) is used. See, for example, Differential Power Analysis published by Paul Kocher, et al., Cryptography Research of San Francisco, California. A tutorial on DPA is also provided in the article, Power Analysis Tutorial, published by Manfred Aigner, et al., of the Institute for Applied Information Processing and Communication, University of Technology, Graz, Austria. As described in these references, in order to utilize the DPA technique, the attacker monitors the power consumption of the cryptographic device. The fluctuations in the power used by the device reflect the operations going on within the device and that, in turn, can be used to glean information about the secret keys stored within the device.
It is emphasized, however, that side channel information other than power consumption information may be studied by DPA to extract encryption keys. Some examples are electro-magnetic (EM) radiation and faulty outputs. Unfortunately, there is no way to guarantee that power consumption, EM radiation, and the like, will not leak certain information, and it is believed that it is impractical to expect cryptographic devices, such as Smart Cards, to be completely leak-free in terms of information being able to be discerned by their power consumption, EM radiation or the like. However, defensive techniques can be used that make whatever information is leaked uncorrelatable, even if sophisticated statistical approaches are used, for example, in the DPA process. As such, the present invention is concerned with a solution to the problem of making power consumption information uncorrelatable to the secret keys stored within a cryptographic device, such as a Smart Card.
In the prior art, certain decorrelation techniques do exist. See, for example, U.S. Patent Numbers 6,295,606 and 6,298,153 to Messerges, et al., and published European Patent Application Number 1,098,469 of Boeckeler.
The decorrelation techniques discussed in published European Patent Application Number 1,098,469 by Gregor Boeckeler superimpose a random current profile based on a secondary clock CLK2, inserted upon the existing profile of a CPU, which is based on a master clock CLK1. Each clock is randomly adjusted in a range of 3-7 MHz. Because the two clocks differ from one another with respect to their center frequencies, the combined current profile is randomized, which makes a DPA attacker's job more difficult.
Thomas Messerges, in U.S. Patent Number 6,208,135, uses a randomized starting point in the set of target bits. Mr. Messerges processes the corresponding target bits in a different order; thus it becomes difficult for a DPA attacker to group related target bits from all the plaintexts of interest in order to perform statistical analyses associated with given target bit positions. However, not only does this approach not conceal the information leaked by a data bus; it also cannot prevent a malicious attacker from using this information to reorder the target bit into the correct bit position. Mr. Messerges also developed another technique, as discussed in U.S. Patent Number 6,295,606, that uses a random mask to keep the message and key hidden both while they are stored in memory and during processing by the cryptographic algorithm itself. However, since the mask is randomly changed, new S-boxes must be updated accordingly, and this takes time. The disadvantage is that this kind of masking operation slows down the DES algorithm by a factor of three to five. In addition, this kind of masking operation cannot prevent an attacker from gathering a 48-bit partial key from Round Sixteen when the results must be eventually unmasked to provide the correct output of the cipher. Thus Messerges' approach becomes vulnerable to DPA after unmasking. With 48 bits now known at Round Sixteen, the remaining eight key bits to make 56 can then be exhaustively searched by an attacker. The present approach is computationally faster, and it also can prevent an attacker from gathering the partial key from Round Sixteen of the DES algorithm.
These prior art approaches have certain limitations and therefore need improvement. This invention proposes a unique Random Instruction Mask (RIM) as a countermeasure to the DPA process, effectively making power consumption uncorrelatable to cipher bit values. The present invention has the following advantages over the techniques of Messerges, Boeckeler and others: (1) More Efficient Calculations: The techniques taught by Messerges et al. slow down the DES algorithm by 300 to 500% due to the regular update of the S-boxes. In the present invention, the DES algorithm will be slowed down by approximately 15%. (2) More Robust: The approach remains effective even in the presence of leaked information for multiple address locations. (3) Better Protection: 48 bits of a key can be completely concealed in the last DES round (in DES the output is unmasked at the end of the algorithm, thereby exposing the key, a problem not solved by the prior art). (4) Low Power Consumption: Power consumption increases by less than 1%, compared to Boeckeler's random current profiling, which increases power consumption to about 200% during cryptographic operations.
Before discussing the details of the preferred embodiments disclosed herein, additional details related to the DES algorithm and DPA attacks will be provided. If the reader is new to this area, further information may be found in the following articles: P. Kocher, J. Jaffe, and B. Jun, "Introduction to Differential Power Analysis and Related Attacks," 1998; Thomas S. Messerges, Ezzy A. Dabbish, and Robert H. Sloan, "Investigations of Power Analysis Attacks on Smartcards", in Proceedings of USENIX Workshop on Smartcard Technology, Chicago, Illinois, May 1999, pp. 151-161; and Manfred Aigner and Elisabeth Oswald, "Power Analysis Tutorial," Institute for Applied Information Processing and Communication, University of Technology, Graz, Austria.
The following discussion is offered to provide a context for a detailed explanation ofthe presently disclosed technology.
The DES algorithm is an example of an iterative-block cipher. DES is described in detail in ANSI X3.92, "American National Standard for Data Encryption Algorithm (DEA)," American National Standards Institute, 1981, which is incorporated by reference herein. The DES cipher is well known and utilizes a number, typically sixteen, of substitution-permutation box (SP-Box) functions instituted in program sequences called rounds. The SP box functions are non-linear and are conventionally implemented using lookup tables or Boolean logic gates or appropriately programmed computers. In each of the sixteen rounds, the DES encryption algorithm performs eight SP box operations, in turn, by accessing sequentially each lookup table (or by using equivalent logic gates). The eight SP boxes each take, as input, a scrambled 6-bit key (here, scrambled means that the key has been XOR-ed and shifted) and produce a 4-bit output target to be accessed by the CPU for OR-ing operations. Each such 6-bit scrambled key is an SP box's entry address. Table 1 shows the C-language representation of SP boxes 1 and 2 in a 32-bit implementation of DES. DES can run with 16, 32, and 64 bits but we have chosen the 32-bit representation as a nominal example. From Table 1 note that each SP lookup table contains 64 elements. Each element in a nominal DES implementation is 32 bits and embeds a given 4-bit output target. This embedding will now be described in greater detail. Because the data bus is typically 32 bits wide, this 4-bit output target is distributed somewhere within a 32-bit word according to the permutation rules (one per SP box) as implied in Table 1, where the data is presented in a hexadecimal format. That is, each SP lookup table will have a different embedding position for a given 4-bit output target. For example, lookup table SP1, shown in Table 1, embeds a 4-bit output target at bit positions 24, 16, 10 and 2 in a 32-bit word.
Lookup table SP2 embeds a 4-bit output target at bit positions 20, 5, 31 and 15, where bit 20 is the most significant bit (MSB) and bit 15 is the least significant bit (LSB) for a given 4-bit output. As a further illustration, the first four entries of lookup table SP1, i.e., SP1[0:3] = {0x01010400L, 0x00000000L, 0x00010000L, 0x01010404L}, have 4-bit output target values of 14, 0, 4, 15. Specifically, SP1[0] = {0x01010400L} is embedded with a 4-bit output target value of 14 (i.e., 1110). For example, for SP1[0] the 32-bit binary word is 0000 0001 0000 0001 0000 0100 0000 0000. The rightmost digit is the LSB while the leftmost digit is the MSB for a given 32-bit binary word. To derive the 4-bit output target, the values of the bits at 24, 16, 10, and 2 are used. For example, for SP1[0] the 4-bit output target is 1110. This is determined by looking for the MSB value of the 4-bit output target at position 24, the next bit is at position 16, the third bit is at position 10, and finally the LSB of 0 is at position 2 of the 32-bit binary word SP1[0]. In the binary representation given above, the bits at positions 24, 16, 10 and 2 are 1, 1, 1 and 0, respectively. The fourth entry SP1[3] = {0x01010404L} (which differs from the 1110 of SP1[0] only at the LSB) has a 4-bit output target value of 15 (i.e., 1111).
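The embedding rule just described can be checked with a short sketch. The Python below is illustrative only: the function and constant names are hypothetical, and only the bit positions and the four SP1 entries quoted in the text are taken from the description.

```python
# Bit positions of the embedded 4-bit output target, MSB first, as given
# in the text for lookup tables SP1 and SP2.
SP1_POSITIONS = (24, 16, 10, 2)
SP2_POSITIONS = (20, 5, 31, 15)

# First four entries of SP1 as quoted in the text.
SP1_FIRST_FOUR = (0x01010400, 0x00000000, 0x00010000, 0x01010404)


def output_target(word, positions):
    """Collect the bits of a 32-bit word at the given positions, MSB first."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((word >> pos) & 1)
    return value


if __name__ == "__main__":
    targets = [output_target(w, SP1_POSITIONS) for w in SP1_FIRST_FOUR]
    print(targets)  # prints [14, 0, 4, 15]
```

The same function applies to any other SP table once its four bit positions are supplied.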
On the other hand, the lookup table SP2 illustrates a different embedding bit position scheme, as shown in the first four entries of lookup table SP2, i.e., SP2[0:3] = {0x80108020L, 0x80008000L, 0x00008000L, 0x00108020L}. Only the contents at bit positions 20, 5, 31 and 15 are changed to reflect the values of 15, 3, 1, 13 for the corresponding 4-bit blocks. In particular, the first entry of lookup table SP2, SP2[0] = {0x80108020L}, has a 4-bit output target value of 15 (i.e., 1111) because bits 20, 5, 31 and 15 all have a value of 1. The fourth entry SP2[3] = {0x00108020L} (which differs from the 1111 of SP2[0] only at the 2nd LSB) has a 4-bit target value of 13 (i.e., 1101).
Having established the relationship between the 4-bit output target and its corresponding SP box's entry, next the calculation of a given SP box's entry address is discussed. In general, a DES algorithm uses shifting instructions running in the CPU to calculate an SP box's entry address. Both the number of shifting instructions used in a specific SP box's entry address calculation and the time interval between each consecutive access of an SP box will be well known to anyone who is familiar with the DES algorithm. In view of this fact, DPA attacks are focused on aligning the power traces of each 4-bit output target of an SP box by referencing the preceding shifting instruction signature unique to that box. As shown in Table 2, under conventional operation, the accessing of each SP box is preceded by a different amount of shifts: >>8, >>16 or >>24 ('>>' stands for a right shift in the C computer language and thus '>>n' stands for a right shift of n bits). One skilled in the art will recognize that the routine in Table 2 is written in the C computer language. Figure 2a shows a corresponding time line with normal accessing order for eight SP boxes [SP1 ... SP8].
Since each shift instruction normally shifts one bit at a time, >>8 normally implies eight right bit shift instructions, >>16 normally implies sixteen right bit shift instructions, and so forth. The shifts for SP5 are identified by numeral 131.
In order to align the power traces, a DPA attacker looks for patterns in the power trace. To determine an SP address calculation for SP box 5 (SP5), the DPA attacker looks for a pattern indicating eight shifts as seen in Table 2. In addition, the DPA attacker would know that the time from the beginning of the eight shifts (see numeral 131) to the beginning of a next set of shifts is equal to a time TI5 as shown in Figure 2a. Thus, the DPA attacker, when finding this pattern in a power trace, would know that the SP address calculation for SP5 has been found (at numeral 123). In addition, the attacker would also know that the information in the power trace for the time slot following the end of the eight shifts would contain the corresponding 4-bit output target information. This information allows for the alignment of the power traces for statistical averaging, which provides information regarding the 6-bit key. One skilled in the art will appreciate that power traces are noisy, thus finding instruction signatures and other patterns may not guarantee the success of a DPA attack. However, the instruction signatures and other patterns are available in the prior art for an attacker to use. By destroying these instruction signatures and time patterns, the success of a DPA attack is even more unlikely.
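The alignment argument can be made concrete with a toy timing model. The sketch below is an assumption-laden illustration, not the timing of any real processor: it assumes one cycle per single-bit shift, one cycle per inserted dummy instruction and one cycle per table lookup, and the function name and the shift sequence 0, 8, 16, 24 are chosen only for the example.

```python
import random


def sp_access_cycles(shift_counts, dummy_counts=None):
    """Return the cycle at which each SP lookup happens.

    shift_counts[i] is the number of single-bit shifts before lookup i;
    dummy_counts[i] is the number of extra instructions inserted before it.
    """
    if dummy_counts is None:
        dummy_counts = [0] * len(shift_counts)
    cycles = []
    t = 0
    for shifts, dummies in zip(shift_counts, dummy_counts):
        t += shifts + dummies  # entry address calculation
        cycles.append(t)       # the lookup occurs in this cycle
        t += 1
    return cycles


if __name__ == "__main__":
    shifts = [0, 8, 16, 24]
    print(sp_access_cycles(shifts))  # prints [0, 9, 26, 51] on every run
    rng = random.Random()
    for _ in range(3):
        dummies = [rng.randint(0, 7) for _ in shifts]
        print(sp_access_cycles(shifts, dummies))  # varies from run to run
```

With fixed shift counts the lookups fall on the same cycles in every trace, which is what lets an attacker line traces up; once a random number of extra instructions precedes each lookup, both the lookup positions and the gaps between them change from trace to trace.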
Figure 2b shows the time line with randomized accessing order for the eight SP boxes. As an illustration in Figure 2b, the processing order of SP1 and SP3 has been swapped, and similarly for SP4 and SP6. In this case, it is obvious that a DPA attacker will have to identify these shifting instruction signatures in order to align power traces by re-shuffling the SP box accessing order. After alignment for a given SP box, statistical averaging and other analysis of these power traces can be performed. Thus, the DPA attacker can ultimately align the power traces to determine the 6-bit key.
Summary of the Disclosed Technology
In one embodiment, the present invention provides a method of inhibiting a successful DPA of a cryptographic device comprising: randomly varying an amount of time required to determine at least one lookup table address; and randomly varying an amount of time occurring between one access of at least one lookup table and a subsequent access of another lookup table.
In another embodiment, the present invention provides a cryptographic architecture comprising: a processor; a memory module containing an encryption algorithm coupled to said processor; a control flag register coupled to said processor for controlling the state operation of the processor; and a random number generator coupled to said control flag register, wherein said processor sets said control flag register and said random number generator resets said control flag register.
In yet another embodiment, the present invention provides a system for thwarting DPA, said system comprising: means for running an encryption algorithm and means for inserting a random number of pseudo instructions into said encryption algorithm.
In still another embodiment, the present invention provides a system for decorrelating side channel information, said system comprising: means for running a Data Encryption Standard (DES) algorithm, said DES algorithm comprising a plurality of substitution/permutation box entry address evaluations and means for inserting a random number of shifting instructions run in each of said plurality of substitution/permutation box entry address evaluations.
In yet another embodiment, the present invention provides a method of altering a power trace of a cryptographic architecture comprising the steps of: running an encryption algorithm; setting a control flag; and performing a random number of instructions when said control flag is set.
In still yet another embodiment the present invention provides a method of inhibiting a successful differential power analysis of a cryptographic device comprising randomly increasing an amount of time required to determine at least one lookup table address; and randomly increasing an amount of time occurring between one access of at least one lookup table and a subsequent access of another lookup table.
In still yet another embodiment, the present invention provides a cryptographic architecture comprising: a processor; a memory module containing an encryption algorithm coupled to said processor; a control flag register coupled to said processor for controlling the state operation of the processor; and a random number generator coupled to said control flag register, wherein said processor sets said control flag register and said random number generator resets said control flag register.
In yet another embodiment, the present invention provides a system for thwarting differential power analysis, said system comprising: means for running an encryption algorithm and means for inserting a random number of pseudo instructions into said encryption algorithm.
In still yet another embodiment, the present invention provides a system for de-correlating side channel information, said system comprising: means for running a Data Encryption Standard (DES) algorithm, said DES algorithm comprising a plurality of substitution/permutation box entry address evaluations and means for inserting a random number of shifting instructions run in each of said plurality of substitution/permutation box entry address evaluations.
In yet another embodiment, the present invention provides a method of altering a power trace of a cryptographic architecture comprising the steps of: running an encryption algorithm; setting a control flag; and performing a random number of instructions when said control flag is set.
In yet another embodiment, the present invention provides a cryptographic CPU architecture comprising: an ALU; a control flag; a plurality of registers for normally receiving output of the ALU in response to an arithmetic instruction; and an additional register for receiving output of the ALU, in lieu of one of the plurality of registers, in response to an arithmetic instruction when the control flag is set.
In yet another embodiment, the present invention provides a method of concealing data processing occurring in a CPU from power analysis during the execution of a program, the method comprising: (i) at a point during the execution of the program, inserting a random number of program counter cycles instruction fetch cycles; (ii) while the random number of instruction fetch cycles are occurring, fetching instructions from memory, executing those instructions in program sequence, but inhibiting updating of normal memory locations based on the execution of those instructions; and (iii) at the conclusion of said random number of instructions, then recommencing normal program execution by refetching the same instructions which were initially fetched while the random number of instruction fetch cycles were occurring, but when the instructions are refetched, updating memory locations in a normal manner for the CPU.
In still yet another embodiment, the present invention provides a method of concealing data processing occurring in a CPU from power analysis during the execution of a program, the method comprising: (i) at a point during the execution of the program, inserting a random number of program counter cycles instruction fetch cycles; and (ii) while the random number of instruction fetch cycles are occurring, mimicking power consumption associated with (a) fetching instructions from memory, (b) executing those instructions in program sequence, and (c) writing results to memory registers.
In still another embodiment, the present invention provides a data processor comprising: an arithmetic logic unit; a control flag register; a plurality of registers for normally receiving output of the arithmetic logic unit in response to an arithmetic instruction and in response to a first state of said control flag register; and a dummy register for receiving output of the arithmetic logic unit, in lieu of one of the plurality of registers, in response to an instruction and in response to a second state of said control flag register.
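The fetch, execute and refetch behavior recited in these embodiments can be sketched with a toy interpreter. Everything in the Python below is a hypothetical illustration: the three-operation instruction set, the tuple format (dest, op, src, immediate), the names `run`, `rim_at` and `dummy_count`, and the wrap-around of the pseudo program counter at the end of the program are assumptions made for the example and are not taken from the figures.

```python
import operator

OPS = {"add": operator.add, "xor": operator.xor, "shr": operator.rshift}


def run(program, regs, rim_at=None, dummy_count=0):
    """Execute program; optionally insert dummy fetch cycles at index rim_at.

    Returns the final registers and the list of fetched instruction indices.
    """
    regs = dict(regs)
    dummy_reg = 0  # stand-in for the additional (dummy) register
    fetched = []
    pc = 0
    while pc < len(program):
        if pc == rim_at:
            # Control flag set: fetch and execute as usual, but send the
            # result to dummy_reg so the normal registers stay untouched.
            pseudo_pc = pc
            for _ in range(dummy_count):
                _, op, src, imm = program[pseudo_pc % len(program)]
                dummy_reg = OPS[op](regs[src], imm)
                fetched.append(pseudo_pc % len(program))
                pseudo_pc += 1
            rim_at = None  # flag reset: resume at the held program counter
        dest, op, src, imm = program[pc]
        regs[dest] = OPS[op](regs[src], imm)
        fetched.append(pc)
        pc += 1
    return regs, fetched


if __name__ == "__main__":
    program = [("r1", "shr", "r0", 8), ("r2", "xor", "r1", 0x3F), ("r3", "add", "r2", 1)]
    start = {"r0": 0x1234, "r1": 0, "r2": 0, "r3": 0}
    print(run(program, start))
    print(run(program, start, rim_at=1, dummy_count=2))
```

In this model the final register contents are identical with and without the inserted cycles, while the sequence of fetches grows by `dummy_count` entries; in a real device `dummy_count` would come from the random number generator rather than from the caller.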
In another embodiment, the present invention provides a cryptographic bus architecture comprising: a random number generator having a plurality of random number outputs at which a multi-bit random number is output; a plurality of bi-directional bus drivers, each bi-directional bus driver having at least one input for receiving at least one of said random number outputs; and a bus coupling at least one of said plurality of bi-directional bus drivers to at least another of said bi-directional bus drivers; wherein bi-directional bus drivers that are coupled to a common line of said bus are controlled by a common selected one of said random number outputs.
In another embodiment, the present invention provides a method of preventing a breach of security comprising the steps of: sending encrypted bits over a bus; and randomly toggling the polarity of said encrypted bits on said bus.
In another embodiment, the present invention provides a method for protecting secret keys comprising: providing a plurality of bi-directional bus drivers; coupling a line of a data bus between at least a first bi-directional bus driver of said plurality of bi-directional bus drivers and a second bi-directional bus driver of said plurality of bi-directional bus drivers; signaling said first bi-directional bus driver to provide a first set of bits to said bus, said bits having a first polarity; signaling said second bi-directional bus driver to receive said first set of bits having said first polarity; randomly signaling said first bi-directional bus driver to provide a second set of bits to said bus, said second set of bits having an opposite polarity than said first set of bits; and signaling said second bi-directional bus driver to receive said second set of bits having said opposite polarity.
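The random polarity toggling described in the three bus embodiments above can be sketched in software as an exclusive-OR with a shared random mask, one mask bit per bus line. This is an illustrative model only: the function names `drive`, `receive` and `transfer`, the 8-bit width and the use of Python's `secrets` module as the random number generator are assumptions, not details taken from the disclosure.

```python
import secrets


def drive(word, mask):
    """Sending driver: invert each bus line whose mask bit is 1."""
    return word ^ mask


def receive(bus_value, mask):
    """Receiving driver: undo the inversion using the same per-line mask bit."""
    return bus_value ^ mask


def transfer(word, width=8):
    """Send one word over the modeled bus; return (value seen on bus, value received)."""
    mask = secrets.randbits(width)  # fresh multi-bit random number for this transfer
    bus_value = drive(word, mask)
    return bus_value, receive(bus_value, mask)
```

Because both drivers on a given line see the same random bit, the receiver always recovers the original word, while an observer of the bus lines sees values whose polarity changes at random from one transfer to the next.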
Brief Description of the Figures
Figure 1 depicts a prior art diagram of information available to attackers;
Figure 2a is a prior art timeline corresponding to the normal accesses of eight SP lookup tables for a given round;
Figure 2b is a prior art timeline corresponding to a randomized accessing order of the eight SP lookup tables for a given round;
Figure 3 is a time line with both the time intervals and SP boxes accessing orders being randomized by Random Instruction Masking (RIM) in accordance with the present disclosure;
Figure 4 is a time line with the shifting instructions being equalized in accordance with the present disclosure;
Figure 5 is a block diagram of a first embodiment of a hardware architecture for implementing the DES algorithm in accordance with the present disclosure;
Figure 6 is a block diagram of a second embodiment of a hardware architecture for implementing the DES algorithm in accordance with the present disclosure;
Figure 7 is a block diagram of a third embodiment of a hardware architecture for implementing the DES algorithm in accordance with the present disclosure;
Figure 8 is a time line associated with the embodiment of Figure 7;
Figure 9 is a block diagram of a fourth embodiment of a hardware architecture for implementing the DES algorithm in accordance with the present disclosure;
Figure 10 is a block diagram of a prior art RISC CPU;
Figure 11 is a block diagram of a RISC CPU in accordance with a sixth embodiment of the present invention;
Figure 12 is a block diagram of a system in accordance with a cryptographic bus architecture embodiment;
Figure 13 is a detailed block diagram of a bus architecture in accordance with the cryptographic bus architecture embodiment; and
Figure 14 depicts a block diagram of bit writing with dual rails in accordance with the cryptographic bus architecture embodiment.
Brief Description of the Tables
Table 1 shows values, expressed in the C language, for SP-boxes 1 and 2 implemented as lookup tables of 64 elements.
Table 2 is a C language program that sequentially accesses DES's eight SP lookup tables for a given round.
Table 3 is an assembly language program to implement C program statement number 5 of Table 2.
Table 4 is an assembly language program to implement a portion of the DES encryption algorithm that performs eight S and P boxes' operations in turn by accessing sequentially each lookup table.
Table 5 is an assembly language program to implement C program statement number 5 of Table 2 using the embodiment of Figure 7.
Introduction
The presently disclosed technology now will be described more fully hereinafter with reference to the accompanying drawings, in which preferred embodiments of the technology are described with reference to Figures 7 and 8. However, before discussing Figures 7 and 8, this detailed description leads the reader through Figures 3 - 6, which repeat the description of some of the material presented in the related applications noted above. These descriptions are useful in better understanding the improvements disclosed by Figures 7 and 8.
Figure 9 depicts a fourth embodiment which is basically a combination of the embodiments of Figures 6 and 7.
Figure 11 depicts a fifth embodiment which is based on a modified RISC CPU design, but the modifications discussed may also be used with non-RISC CPUs if desired.
Figures 12-14 relate to a cryptographic bus architecture which may be used independently or in combination with the other embodiments.
The presently disclosed technology may be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein. The following discussion provides one context for using the present disclosure in connection with a well-known cipher, the Data Encryption Standard (DES), and thwarting the DPA analysis that is commonly used to break DES. Those skilled in the art should appreciate that this discussion is for illustrative purposes only, and that the presently disclosed technology may be utilized to protect secret keys of a number of data encryption formats from a number of hacking techniques in which side channel information is used in order to determine the secret keys.
In general, any encryption algorithm is a series of instructions executed by a processor. While the inputs and outputs of these instructions will vary, the amount of time required to complete each instruction is determined by the clock speed of the processor or of a bus over which the data is transmitted to and from the processor. Some instructions take more clock cycles than others. Knowledge of the encryption algorithm used to encrypt/decrypt the data provides hackers with knowledge about the timing of the algorithm, i.e. knowledge about which instructions are used and thus how long each instruction should take. This knowledge about timing can then be used to align side channel information, which can then be processed by sophisticated statistical approaches that allow the attacker to break the encryption.
A system and method for randomizing the number of instructions within the encryption algorithm is disclosed herein. By randomizing the number of instructions executed, and thereby inserting random delay times, the instructions and timing within the encryption algorithm are no longer known to the DPA attacker. Therefore, the timing of the algorithm will be unknown to the attackers and they will be unable to align the side channel information. Without the alignment of the side channel information, the sophisticated statistical approaches will fail and the encrypted information will be protected.
The following discussion illustrates how certain embodiments of the presently disclosed technology may be incorporated with a system using DES to prevent a DPA attack. One skilled in the art will appreciate that the present invention may be incorporated into other systems that use other encryption algorithms in order to randomize the time between given instructions. This randomization of time may be used to defeat any attack that relies upon understanding the timing of the algorithm in order to break the encryption.
This specification provides information specific to an on-chip Random Instruction Masking (RIM) architecture on a microprocessor that is used to perform cryptographic operations. Furthermore, this specification provides an architectural approach for securing existing cryptographic algorithms (including RSA, DES, AES and non-linear algorithms) from Side-Channel Attacks, i.e., attacks based on leaked power information. The motivation is to keep systems secure even though the underlying circuits will very likely always be leaking such information.
A software approach to randomizing the order of the processing of the target bit is not enough to secure an algorithm completely. It is also necessary to destroy all instruction signatures or power patterns that may allow the DPA attackers to reorder the target bits to their original sequences. Consequently, one approach is to complement a software approach with hardware protection, preferably by means of an architecture that implements the randomizing instructions and time delays as disclosed herein.
Several embodiments of an architectural or hardware approach to prevent DPA attacks from extracting information correlated to secret keys to the DES or other cryptographic algorithm are described below. Moreover, certain background information regarding DES is provided above. If the reader is new to this field, the reader should refer first to the documents mentioned in the introductory portion of this disclosure. In addition, the following illustration is dependent upon a thorough knowledge of the DES algorithm. Comparable detailed knowledge of the appropriate algorithm would be required to attempt an attack on one of the other algorithms. The present discussion starts by discussing the defensive RIM techniques for preventing DPA and related attacks. It is assumed that a DPA selection function can simultaneously select for values of four target bits rather than just one bit, because low-level instructions often manipulate four bits (due to common use of six key bits). The resulting DPA characteristics tend to have larger peaks, but do not have better signal-to-noise ratios because proportionately fewer samples are included in the averaging.
Figure 3 depicts how the time line relationship between an SP box's entry address calculation 131 and the generation of a given 4-bit output target 123 may be modified. The modification comprises the insertion of random numbers of pseudo shifting instructions 133 (according to the embodiment of Figure 5, for example) or random numbers of randomized pseudo instructions in each SP box's entry address calculation subroutine (according to the embodiment of Figure 6, for example).
The numbers of inserted pseudo instructions need not necessarily be random, since if each SP box ends up having the same numbers of real and pseudo instructions, then the attacker is still left with little or no information to ascertain which box is which.
It is desirable that the pseudo shift instructions include the shift and that they exactly mimic the power signature of the real shift instructions. Unless these pseudo instructions include a shift, their effect could probably be observed and thus ignored by a DPA attacker. There is a fixed relationship between the number of shifts and the SP box index (when the presently disclosed technology is not used) and as long as the attacker can identify that number of shifts somewhere, then the attacker can identify the specific SP box being addressed. The attacker can do this via statistical reordering of the data to find the correct number of shifts. In addition to DES, most encryption algorithms do utilize shift instructions somewhere, and assuming that the algorithm is known by the DPA attacker, then a similar correlation can be found unless the disclosed technique of inserting random numbers of shift instructions is utilized.
As shown in Figure 3, the insertion of the pseudo shifting instructions 133 or other pseudo instructions 133 changes not only the number of instructions run in each SP box's entry address evaluation, but also the time interval between consecutive SP box accesses TIn. In the example shown in Figure 3, a random number of pseudo shifting instructions 133 have been inserted in SP5, thus changing the time interval TI5 between the access of SP5 and SP1 compared to Figure 2b. Further, a random number of pseudo instructions 133 are inserted in SP4, thus changing the time interval TI4 between the access of SP4 and SP6 compared to Figure 2b. Of course, a random number of pseudo shifting instructions 133 could also be inserted in one or more of the other SP boxes. The instructions are called 'pseudo' since they preferably mimic the power consumption trace of a real counterpart instruction (and, indeed, in certain embodiments, they may in fact be real instructions), but the execution of the pseudo instruction does not result in any data being updated by the processor.
Due to the insertion of a random number of pseudo instructions 133 that preferably mimic the real shift instruction from a power use point of view, both the shifting instruction signatures and the time interval signatures are camouflaged or even eliminated. This will cause a DPA attacker to be unable to identify which SP box SP1 - SP8 is being accessed in the program. This will make the re-shuffling (randomization) of the SP box access order an effective way of hiding information from DPA attackers; they can no longer align different power traces to the same reference for statistical averaging and analysis. If the pseudo instructions exactly mimic real shift instructions from a power use point of view, then the attacker will find it very difficult to identify which SP box is which. If the pseudo instructions mimic a set of randomized instructions, then the SP boxes may well be very difficult to recognize at all. The attacker may well wonder whether the encryption protocol used by the device is the same protocol that the attacker assumes the attacked device utilizes.
As mentioned above, instead of randomizing the number of shift instructions run in each (or some) SP box's entry address evaluation, it is possible to equalize the number of shift instructions, such that there appears (for example) to be a total of twenty four shifts before each output, as shown in Figure 4. However, it may be preferable to randomize the number of instructions, which also randomizes the time interval between each consecutive SP box access. Thus, the randomization helps to thwart an attacker's use of the time interval as a signature to identify the SP box access. This added uncertainty further complicates the attacker's task. However, as can be seen with reference to Figure 4, randomization of the number of inserted pseudo instructions 133 is not critical to the present disclosure.
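The difference between the equalized scheme of Figure 4 and the randomized scheme of Figure 3 comes down to how the number of pseudo shifts per SP box is chosen. A minimal sketch follows. Only the SP5 value of eight real shifts and the total of twenty four are taken from this description; the other entries of `REAL_SHIFTS`, the function names and the `max_extra` bound are assumptions made for illustration.

```python
import random

TOTAL_SHIFTS = 24  # equalized total number of shifts per SP box (Figure 4)

# Real right-shift counts per SP box. Only SP5 = 8 comes from the text;
# the other values are illustrative assumptions.
REAL_SHIFTS = {"SP1": 24, "SP3": 16, "SP5": 8, "SP7": 0}


def equalized_pseudo_shifts(real):
    """Pseudo shifts needed so that every box shows TOTAL_SHIFTS shifts."""
    return TOTAL_SHIFTS - real


def randomized_pseudo_shifts(rng, max_extra=24):
    """Pseudo shifts drawn at random; max_extra is a hypothetical bound."""
    return rng.randint(0, max_extra)
```

With equalization every box presents the same shift count, so the count carries no information about the box index. With randomization the count, and hence the time interval to the next SP box access, also varies from run to run.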
Detailed Description of a First Embodiment
Figure 5 depicts a first embodiment of a hardware architecture for implementing the DES algorithm which may be used to insert a random number of pseudo shifting instructions 133 (as discussed with reference to Figure 3) or an equalized number of shifting instructions 133 (as discussed with reference to Figure 4). The system illustrated in Figure 5 includes a 32-bit processor or Central Processing Unit (CPU) 101 with RAM 103 and ROM 105 memories on a single chip. One skilled in the art will appreciate that the presently disclosed technology may be implemented for other hardware architectures such as 2-bit or 8-bit architectures. Accordingly, the CPU could be a 16-bit or 64-bit processor, respectively.
The system also contains substitution/permutation boxes (SP1 - SP8) 107, which can be implemented as lookup tables, as discussed above. The CPU 101 runs an encryption/decryption program stored in the ROM 105, while the RAM 103 is for intermediate storage of the cipher text data. The 6-bit key (or a guessed key) 121 and SP boxes 107 are used to calculate the Cipher Function. A Random Number Generator 115 is coupled to a Random Instruction Mask (RIM) Control Flag Register 113 which is coupled to the CPU 101. In this embodiment, the Random Number Generator 115 and the RIM Control Flag Register 113 are used to camouflage the power trace so that this power trace cannot be time-aligned to yield statistical material for any given 6-bit key 121. Since an attacker is focused on aligning the power trace associated with each 4-bit output target 123 by tracking the shifting instruction signatures, the present RIM approach is devoted to disabling this tracking ability.
A random number of pseudo shifting instructions 133 are generated through the interaction of the CPU 101, the RIM Control Flag Register 113 and the Random Number Generator 115. The CPU 101 runs the encryption/decryption program stored in the ROM 105. Embedded in this encryption/decryption program (to be discussed later) is an instruction to set the RIM Control Flag Register 113. Upon processing this instruction, the CPU 101 sends a signal on bus 109 to the RIM Control Flag Register 113 that sets it. The RIM Control Flag Register 113 then sends a RIM Control Flag signal on a control line 111 to the CPU 101, causing the CPU 101 state machine to halt (to stop updating registers in response to calculations). This may be accomplished by sending a signal from the RIM Control Flag Register 113 to the program counter register within the CPU 101 that will disable the program counter. Effectively, the state machine of the CPU 101 is halted.
The state machine of the CPU 101 remains halted until the RIM Control Flag Register 113 is reset. This will cause the RIM Control Flag Register 113 to send a signal to the CPU 101 on control line 111 to enable the program counter in CPU 101. The RIM Control Flag Register 113 is preferably reset through the use of the Random Number Generator 115. For design simplicity, the Random Number Generator 115 is preferably a 1-bit random number generator. The Random Number Generator 115 is synchronized with the timing of the instruction cycle of the CPU 101. The Random Number Generator 115 may provide an output every clock cycle, or may be gated to ensure that an output is provided to the RIM Control Flag Register after a random number of X cycles, where X is any number such as 5. For a one-bit Random Number Generator 115, the RIM Control Flag Register 113 is programmed to reset when either a zero or one is received from the one-bit Random Number Generator 115, depending upon the logic used. For example, assume that a zero from the one-bit Random Number Generator 115 will reset the RIM Control Flag Register 113. Because the RIM Control Flag Register 113 is reset only after receiving a zero from the one-bit Random Number Generator 115, and the one-bit Random Number Generator 115 will generate a zero after a random number of cycles, the time the state machine of the CPU 101 is halted will also be random. Thus, a random number of pseudo instructions 133 is generated, affecting the time line of the algorithm.
Preferably, a pseudo instruction 133 is an instruction producing the same power signature on power traces as the original instruction, but the write back of the execution result to the destination register in the CPU 101 is inhibited since the state machine of CPU 101 is halted. The inhibiting of the CPU 101 preserves the CPU's state. Thus, inhibiting write back prevents the CPU from moving on to the next step in the algorithm; however, the power traces suggest otherwise. Thus, the attacker will be unable to use the power traces to recover the keys.
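The reset mechanism just described implies that the number of pseudo instructions follows a geometric distribution when the 1-bit generator is unbiased and produces one bit per instruction cycle. The sketch below models only that counting behaviour. The function name `pseudo_instruction_count`, the use of Python's `random.Random` in place of a hardware generator, and the assumption of one bit per cycle with a zero as the reset value are illustrative choices.

```python
import random


def pseudo_instruction_count(rng, reset_bit=0):
    """Count instruction cycles spent halted until the 1-bit RNG emits reset_bit."""
    cycles = 0
    while True:
        cycles += 1  # one pseudo instruction executes in this cycle
        if rng.getrandbits(1) == reset_bit:
            return cycles  # RIM control flag register is reset, CPU resumes
```

Under these assumptions the halt lasts at least one cycle and two cycles on average, with an unbounded tail. Gating the generator output, as mentioned above, would stretch these figures.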
The CPU 101 in Figure 5 is preferably modified to accommodate these pseudo instructions with a RIM control flag signal sent on the control line 111, generated by a RIM control flag register 113, which, when activated, will disable the update of the CPU 101 destination register or the CPU 101 program counter (details of an embodiment of a modified CPU are disclosed in U.S. Patent Application No. 10/864,568 filed June 8, 2004 entitled "Cryptographic CPU Architecture with Random Instruction Masking to Thwart Differential Power Analysis").
As a result of this RIM control flag signal on control line 111, all the instructions executed while the state machine of the CPU 101 is halted will have no material effect except to alter the power trace so that the number of discrete samples of a power trace is no longer fixed for a given 4-bit output target 123. While the RIM control flag register 113 is set, a random number of instructions will be executed. When the RIM control flag register 113 is reset, the 4-bit output target 123 is supplied to the RAM 103. The introduction of RIM results in the random variation of not only the number of shifting instructions run in each SP box's entry address evaluation but also of the time interval between each consecutive SP box access TIn. For further details regarding the random instructions executed while the RIM control flag is activated see U.S. Patent Application No. 10/864,556 filed on June 8, 2004 and entitled "Cryptographic Bus Architecture for the Prevention of Differential Power Analysis".
A description follows of how the insertion of pseudo shifting instructions works. As shown in Table 2, the DES C language statement #5 (i.e., fval |= SP5[(work >> 8) & 0x3fL]) contains an 8-bit right shifting instruction (i.e., "work >> 8") as part of the entry address calculation to access the SP5 lookup table. Table 3 shows the expansion of this single C language statement into the corresponding assembly language subroutine.
The assembly statement #4 (i.e., jal link rshft) in Table 3 jumps and links to the subroutine labeled as "rshft" or Statement #13 (thus the mnemonic jal). The term "link" in this statement represents a register that contains the return address. When the program executes statement #13, i.e. the program counter pc <- pc + 1, the program counter stops advancing. The program counter tries to prefetch statement #14 but is halted until the RIM control flag is reset by the random number generator 115. The "rshft" subroutine will right shift register 1 by 8 places as specified in the register C. To camouflage the power trace segment associated with the shifting instruction, RIM statements of variable block size (indicated between statement #13 and #14) are inserted before (or after, or both) an actual shifting instruction statement like #15 (i.e., sra 1 1). The instruction #13 in Table 3 allows the insertion of RIM instructions from when the RIM Control Flag Register 113 is set by the CPU 101 until the RIM Control Flag Register 113 is reset by the Random Number Generator 115. After execution of statement #15, and the completion of the RIM block, the "useful" execution of the program resumes.
The location of statements #13 and #14 in Table 3 are for illustrative purposes only. These statements can occur anywhere, before, between or after an actual shifting instruction statement like #15. Preferably, for design simplicity, statements #13 and #14 are located within the scope of the shifting routine. This random insertion thwarts a DPA attacker's attempt to track the shift instruction signatures because the number of discrete samples of a power trace is no longer fixed, but random. Hence, power traces cannot be time-aligned by the attacker for each 4-bit output target 123. In addition, this insertion of random instructions also changes the time interval, for example TI5, further thwarting the attempts of the DPA attacker. The random number of pseudo shift statements are preferably inserted in the middle of a loop, so that their effect is magnified by the loop. If these statements were inserted outside the loop, then adding only one or two pseudo shifts really won't help: changing a >>8 to a >>10 may not camouflage it enough in the context of the DES algorithm. If you are trying to hide a >>8 from a >>16 or >>24, this requires that enough pseudo shift instructions be added to confuse the >>8 with a >>16 or a >>24. Putting the added random number of pseudo shift statements in the loop ensures that the added number of pseudo shift statements will be an integer multiple of 8. If a random number of pseudo shift statements is inserted outside the loop, then other techniques can be used to ensure that the added number of pseudo shift instructions will be 8, 16, 24 (or other number sufficiently close thereto to confuse the DPA attacker).
In terms of providing additional information, Table 4 is an assembly language program with a 16-bit CPU to implement the portion of the DES encryption algorithm that performs eight S and P boxes' operations in turn by accessing sequentially each lookup table 107 as shown in Figure 5. Lines starting with ";" are comment lines. Underlined statements are the corresponding C language statements for comment purposes.
Detailed Description of a Second Embodiment
Figure 6 depicts another embodiment of a hardware architecture for implementing the DES algorithm which may be used to insert a random number of random pseudo instructions 133 (see Figure 3).
Since a DPA attacker is focused on aligning the power trace associated with each 4-bit output target by tracking the shifting instruction signatures, the first embodiment of Figure 5 disables this tracking ability by inserting a random number of RIM instructions in each SP box's entry address calculation subroutine. In this embodiment, however, not only the number but also the content of these instructions will be altered, as described in detail below.
This second embodiment, as shown in Figure 6, is very similar to the first embodiment of Figure 5 and therefore common elements are identified by common reference numerals. As in the case of the embodiment of Figure 5, this embodiment preferably has a 32-bit CPU 101 with RAM memories 103 and ROM memories 105 disposed on a single chip. This chip also preferably contains substitution/permutation boxes (SP1 - SP8) 107, which can be implemented as lookup tables. The CPU 101 runs the program stored in the ROM 105, while the RAM 103 is for intermediate storage of the cipher text data. In this embodiment, the CPU 101 fetches not only the normal encryption program from the ROM 105, but also the camouflaged, randomized instructions by means of a 32-bit pseudo random number generator 117. As shown in Figure 6, a MUX 119, selected by a RIM control flag register 113, determines the type of instructions fetched by the CPU 101: real instructions from ROM 105 or randomized instructions generated by the 32-bit pseudo random number generator 117.
As in the case of the first embodiment, a conventional CPU is modified to include the RIM control flag register 113 which, when activated, will disable the update of the CPU's destination register(s). As a result of this flag being set, all the instructions executed inside the RIM statements block will camouflage the power trace so that the number of discrete samples of a power trace is no longer fixed for a given 4-bit output target. The number and type of these instructions are determined on the fly by the random number generators. The program address is also constantly being substituted for by another 32-bit pseudo random number, since the Program Counter is not updated until the CPU resumes normal execution after the RIM control flag has been reset by the 1-bit random number generator.
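The instruction selection performed by MUX 119 can be modeled as a fetch loop that returns either a ROM word or a random 32-bit word, depending on the RIM control flag. The following sketch is illustrative only: the generator `fetch_stream`, the parameter `set_flag_at` (the ROM addresses at which the program is assumed to set the flag) and the use of a single software generator for both the 32-bit words and the 1-bit reset decision are assumptions, whereas the description above uses separate generators 117 and 115.

```python
import random


def fetch_stream(rom, rng, set_flag_at):
    """Yield ("real", word) from ROM or ("pseudo", word) from the RNG, like MUX 119."""
    pc = 0
    rim_flag = False
    armed = set(set_flag_at)
    while pc < len(rom):
        if pc in armed:
            armed.discard(pc)  # each set point triggers once in this sketch
            rim_flag = True
        if rim_flag:
            # MUX selects the random source; the program counter is not updated
            yield ("pseudo", rng.getrandbits(32))
            if rng.getrandbits(1) == 0:  # a zero from the 1-bit RNG resets the flag
                rim_flag = False
        else:
            yield ("real", rom[pc])  # MUX selects ROM; normal execution
            pc += 1
```

In this model the real instructions still appear in their original order, but they are separated by a random number of random 32-bit words, so both the count and the content of the inserted instructions vary from run to run.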
The RIM control line 111 of Figures 5 or 6 should be made "probe-proof" by burying it deeply in the layers of the semiconductor device. However, if the RIM control line 111 can be probed, then the afore-described techniques for dealing with a DPA attack will be overcome if the DPA attacker disables the RIM control signal on line 111 by tying it to ground (or high, depending on its logic) throughout the attack.
Detailed Description of a Third Embodiment
Figure 7 depicts a third embodiment that is more resistant to probing than the embodiments of either Figures 5 or 6, and Figure 8 presents a time line for this embodiment. This embodiment overcomes a single point failure attack, that is, an attack on line 111 of the foregoing embodiments, by introducing a Shift Control Counter (SCC) 140 and other changes discussed below. This embodiment is described with reference to an embodiment in which the total number of shift instructions (both real and pseudo) is fixed at twenty-four. However, those skilled in the art should now appreciate that the total number of real and pseudo shift instructions can be fixed at some other number or can be randomized utilizing the techniques previously described with reference to Figures 5 and 6. The embodiment of Figure 7 anticipates that an attack will occur on line 111, and the previously disclosed design of line 111 is modified so that even in the event of a successful attack, the system does not revert back to an unprotected design (such as the designs described with reference to Figures 2a and 2b).
During the calculation of a given SP box's entry address, as defined in the Data Encryption Standard (DES) algorithm, the SCC 140 will be set (for example by a suitable software instruction or set of software instructions - see, e.g., instructions 3 and 4 in Table 5) to a count corresponding to that of the SP box. Each decoded shift instruction will decrement this counter 140 by one until it reaches zero using, for example, its own decoder hardware. A zero count will activate the "RIM_shift" signal at its output, which will make any subsequent shift instruction a RIM instruction (i.e., a pseudo shift instruction with a camouflaged power signature). In Figure 8, each SP box has 24 right bit shifts associated therewith. However, some or all of the right bit shift instructions are RIM_shifts (i.e. pseudo shifts). The shifts which are pseudo shifts in Figure 8 are identified by hatching lines. For example, for box SP5, eight shifts are real right bit shift instructions while sixteen shifts are pseudo shift instructions. If a DPA attacker attacking line 111 disables the "RIM_shift" signal, then the normal execution of the encryption algorithm will be disrupted due to the fact that extra shifts will be performed, because the pseudo shift instructions are then turned into real instructions due to the interference with line 111. Thus, instead of merely inhibiting the production of pseudo shift instructions, interference with line 111 causes the inhibited pseudo shift instructions to be replaced with real shift instructions.
Therefore, the attacker can gather no useful statistical key material. In other words, disturbing the RIM flag will disrupt the normal execution of the encryption algorithm and the DPA attack fails because it does not yield correct results (due to the extra real shifts which occur). On the other hand, if the DPA attacker leaves the "RIM_shift" signal alone, the activated "RIM_shift" signal will camouflage the shift instructions' power signatures as previously described with reference to Figures 5 or 6. This means, then, that the randomizing of the SP box accessing order will be an effective way to thwart a DPA attacker's attempt because the grouping and reordering of target bits required by DPA is made much more difficult.
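The behaviour of the Shift Control Counter, including the effect of an attacker forcing the RIM_shift signal inactive, can be sketched with a small software model. Everything named here is hypothetical: the class `ShiftControlCounter`, the function `shift_with_scc`, the parameter `rim_line_forced_low` and the 32-bit logical shift are assumptions made for illustration, and the model captures only the data effect of the shifts, not their power signatures.

```python
class ShiftControlCounter:
    """Toy model of SCC 140: counts down once per decoded shift instruction."""

    def __init__(self):
        self.count = 0

    def load(self, real_shifts):
        self.count = real_shifts  # set by software before the shift sequence

    def rim_shift(self):
        return self.count == 0  # a zero count activates the RIM_shift signal

    def on_shift_decoded(self):
        if self.count > 0:
            self.count -= 1


def shift_with_scc(work, real_shifts, total_shifts=24, rim_line_forced_low=False):
    """Issue total_shifts one-bit right shifts; return (result, pseudo shift count)."""
    scc = ShiftControlCounter()
    scc.load(real_shifts)
    work &= 0xFFFFFFFF
    pseudo = 0
    for _ in range(total_shifts):
        masked = scc.rim_shift() and not rim_line_forced_low
        if masked:
            pseudo += 1   # pseudo shift: result is discarded, state unchanged
        else:
            work >>= 1    # real shift: result is written back
        scc.on_shift_decoded()
    return work, pseudo
```

For SP5 this gives eight real and sixteen pseudo shifts, so the table index is still computed from `work >> 8`. If the signal is forced inactive, all twenty four shifts become real and the index is computed from `work >> 24`, which corrupts the result instead of exposing an unprotected computation.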
Table 5 is similar to Table 3, but shows the SCC 140 augmented RIM implementation in an assembly language subroutine. The same assembly statement #3 (in an italic font) first loads register C with the number of shifts to be used to initialize Shift Control Counter (SCC) as indicated by the assembly statement #4 (i.e., sw_SCC C) which stores word SCC with the content of register C (thus the mnemonic sw). Assembly statement #3 is not intended to tell the CPU to execute how many shifts; instead, assembly statement #5 is used for this purpose to provide identical shifting instruction power signatures for every SP box access. The SCC control circuitry will decode each shifting instruction and decrement its counter until it reaches zero. The zeroed SSC counter will then convert subsequent real shift instructions into pseudo instructions by asserting "RIM_shft" signal to camouflage their power signatures. A non shifting instruction will never activate the "RLM_shft" signal. SCC circuitry will only be active when it is running encryption algorithm during SP box access, so that normal shift instruction decoding is in effect for non-SP box operations. The physical protection ofthe RLM control line 111 on the chip from direct probing is no longer critical (although it would make sense to protect it nevertheless in order to make the DPA attacker think he will obtain meaningful results by attacking it - something which will turn out to be an exercise in futility). So some knowledgeable attackers may be able to force the RLM control line 11 lto be always at logical '0' (whether it is physically protected or not) so as to disable the RIM. In this embodiment, the DPA attack ofthe chip is protected by a novel approach - the conversion of unnecessary pseudo shifts into real shifts that just render the data meaningless.
In summary, the principle of DPA is to calculate and plot the difference of the sums of two groups of power traces. DPA can be effective because there is a statistical correlation between the difference of the sums of the two groups of power traces and the content of a target bit (b) passing through the data path of the system in a specific order. Because of the introduction of SCC augmented RIM in this embodiment, this statistical correlation is no longer valid, as target bits now pass through the data path of the system in a random order rather than in a specific order, and the RIM cannot be disabled without disrupting normal execution of the encryption algorithm. Disruption of the encryption algorithm by attacking the RIM control line yields no useful statistical key material for the attacker.
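The differential computation summarized above can be sketched in C. This is a minimal illustration rather than the code of any particular attack tool: the function name `dpa_difference`, the row-major trace layout, and the use of group means instead of raw sums (so that unequal group sizes do not bias the result) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: difference-of-means trace for a DPA partition.
 * traces : n_traces rows of n_samples power samples, row-major
 * sel    : selection-function output (0 or 1) for each trace
 * diff   : receives mean(group 1) - mean(group 0) per sample point
 * Returns 0 on success, -1 if either group is empty. */
int dpa_difference(const double *traces, const int *sel,
                   size_t n_traces, size_t n_samples, double *diff)
{
    size_t count1 = 0;
    assert(traces && sel && diff);

    for (size_t t = 0; t < n_traces; t++)
        count1 += (sel[t] != 0);
    size_t count0 = n_traces - count1;
    if (count0 == 0 || count1 == 0)
        return -1;

    for (size_t s = 0; s < n_samples; s++) {
        double sum0 = 0.0, sum1 = 0.0;
        for (size_t t = 0; t < n_traces; t++) {
            if (sel[t])
                sum1 += traces[t * n_samples + s];
            else
                sum0 += traces[t * n_samples + s];
        }
        diff[s] = sum1 / (double)count1 - sum0 / (double)count0;
    }
    return 0;
}
```

A pronounced peak in `diff` at some sample index is what an attacker looks for. When the target bit is processed at a different sample index in every trace, as the randomized ordering is intended to ensure, such a peak would be expected to average away.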
DPA can only be effective if there is a statistical correlation between the difference between the sums of two groups of power traces and the content of a single target bit that exits the system at a specific time. With this RIM embedded embodiment, this statistical correlation is no longer valid because target bits now exit the data path of the system at random rather than at specific times. The introduction of embedded RIM results in the random variation of two features. The first is a variation in the number/type of instructions run in each SP box's entry address evaluation. The second is a variation in the time interval between each consecutive SP box access. These two features will cause a DPA attacker to be unable to identify which SP box is being accessed in the program. This will, in turn, make the re-shuffling of the SP box access an effective way of hiding information from DPA attackers because they can no longer align different power traces to the same reference for statistical averaging and analysis.
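Re-shuffling the SP box access order does not change the result of a round, because the eight lookups are combined with a bitwise OR, which is commutative. The sketch below illustrates this under stated assumptions: `shuffle_order` and `combine_lookups` are hypothetical names, the tables are supplied by the caller, and the C library `rand()` merely stands in for an on-chip random number generator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_SP 8

/* Fisher-Yates shuffle of the SP box visiting order.
 * rand() is only a placeholder for a hardware random source. */
void shuffle_order(int order[NUM_SP])
{
    for (int i = 0; i < NUM_SP; i++)
        order[i] = i;
    for (int i = NUM_SP - 1; i > 0; i--) {
        int j = rand() % (i + 1);
        int tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }
}

/* OR together one entry from each SP table, visiting the tables in the
 * given order. idx[b] is the 6-bit entry address for box b. */
uint32_t combine_lookups(const uint32_t *const sp[NUM_SP],
                         const uint8_t idx[NUM_SP],
                         const int order[NUM_SP])
{
    uint32_t fval = 0;
    for (int i = 0; i < NUM_SP; i++) {
        int box = order[i];
        assert(box >= 0 && box < NUM_SP);
        fval |= sp[box][idx[box] & 0x3f];
    }
    return fval;
}
```

Because the result is independent of the visiting order, a fresh order can be drawn for every round without touching the rest of the algorithm.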
In the embodiment of Figure 7, the real and pseudo shifts associated with each SP box total twenty four shifts. For example, for box SP5 in Figure 8, eight real shifts are associated with sixteen pseudo shifts. The eight real shifts are the correct number of shifts for box SP5 according to the DES algorithm. If line 111 is attacked, then twenty four real shifts will occur in box SP5 instead (and in the other SP boxes as well), making a "mess", so to speak, of the DES algorithm.
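The counting behavior described for the embodiment of Figure 7 can be modeled in a few lines of C. This is only a behavioral sketch under stated assumptions: `scc_shift_sequence` is a hypothetical name, and the `rim_line_ok` parameter models whether control line 111 is left intact (nonzero) or forced to logical '0' by an attacker (zero).

```c
#include <assert.h>
#include <stdint.h>

#define TOTAL_SHIFTS 24u  /* fixed number of shift instructions per SP box */

/* Behavioral model of the Shift Control Counter (SCC).
 * Every call issues TOTAL_SHIFTS shift instructions. The SCC is loaded
 * with the number of real shifts; once it reaches zero, later shifts
 * become pseudo shifts that leave the register unchanged. If the RIM
 * line is forced low, masking is disabled and all shifts are real. */
uint32_t scc_shift_sequence(uint32_t reg, unsigned real_shifts, int rim_line_ok)
{
    unsigned scc = real_shifts;
    assert(real_shifts <= TOTAL_SHIFTS);

    for (unsigned i = 0; i < TOTAL_SHIFTS; i++) {
        int pseudo = (scc == 0) && rim_line_ok;
        if (!pseudo)
            reg >>= 1;          /* real shift: register is updated */
        if (scc > 0)
            scc--;              /* SCC decrements on each decoded shift */
    }
    return reg;
}
```

For box SP5, a call with `real_shifts` equal to 8 yields the intended `work >> 8` when the line is intact, but `work >> 24` when the line is forced low, so the wrong SP entry would be addressed.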
In Figure 7 the pseudo shifts are shown as occurring after the real shifts, but the order can be changed, if desired, so that the pseudo shifts would occur before or even mixed among the real shifts.
Detailed Description of a Fourth Embodiment
The total number of shifts in each SP box need not be fixed at twenty four (or some other number, for that matter), but may be varied or randomized, if desired. That complicates the design of the CPU shown in Figure 7 somewhat, for example, by incorporating the design of either Figure 5 or 6, but the modification needed to randomize the total number of shift instructions is rather straightforward, as can be seen by reference to Figure 9, which shows a fourth embodiment as a combination of the embodiments of Figures 6 and 7.
Detailed Description of a Fifth Embodiment
A modified RISC Processor (CPU) architecture can be used, for example, to generate identical power signatures for both normal instructions and special camouflaged "pseudo" instructions controlled by the Random Instruction Masking (RIM) flag. This specific processor architecture is intended to work in an on-chip cryptographic system embedded with Random Instruction Masking (RIM), and this architecture, combined with the S/W-specific RIM concepts, is intended to protect the cryptographic system from piracy through Power Analysis and Differential Power Analysis. Camouflaged instructions are those instructions that have the same instruction code and the same power signature as those typically used in encryption but, when running in this specific processor architecture, will not change the content of any processor register or alter the processor status. Random Instruction Masking is a technique to create a camouflaged encryption program to protect the cryptographic device from reverse engineering through Power Analysis or Differential Power Analysis.
Figure 10 is a general (simplified) RISC Processor (CPU) architecture 200. A RISC instruction is an arithmetic or logic function performed by the ALU (Arithmetic Logic Unit) 210, taking two operands from two registers of the Register File 220, with the result of the operation being written back into a third register of the Register File 220. The Register File 220 consists of a number of registers with the same width (number of bits, e.g. 32-bits) that can be accessed with an address selection. In each instruction cycle, the processor gets its instruction sequentially from the ROM 240 and loads it into the Instruction Register 245. The ROM 240 stores all the instruction codes of the whole program including the encryption algorithm. The Control Logic 250 decodes the instruction code in the Instruction Register 245 and gives the correct control commands to the ALU 210 and other parts of the processor 200. Addresses of the operands (Source A and B) and the destination are also defined in the instruction code. An address decoder 260 decodes the address information from the Instruction Register 245 and provides the access control of the specific register in the Register File 220. The ALU 210, controlled by the Control Logic 250, gets the two operands (sources A and B) from the register file 220 with the specified addresses and performs the instruction-specified arithmetic or logical operation. The result of the ALU operation is written back to another register in the Register File 220 with the destination address on a data bus 215. Depending on the type of instruction, a Program Counter 230, which stores the index reference of the instruction in the whole program, will be incremented or updated by the Control Logic during the execution of the instruction. Some specific instructions of the processor will not increment or update the Program Counter 230. The updating of some other Flag Registers (not shown) in the processor, similar to the Program Counter 230, is also instruction dependent.
Most modern processors are built in CMOS technology. CMOS circuits do not draw static current, so power is dissipated only when charging and discharging the load capacitance (switching). The current consumption of a CMOS circuit depends mainly on the capacitive loading, the driving capability of the driver and the frequency of the switching. A complete instruction cycle run in the processor involves the operation of different circuits at different times. Different parts of the processor circuits, due to their differences in device dimension, parasitic loading, and switching speed, will each generate a unique current pattern (power signature) with respect to time on the power bus when activated. Power Analysis or Differential Power Analysis (DPA) uses these power signature patterns to correlate the instructions.
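As a rough illustration of why switching activity shows up on the power bus, a widely used first-order estimate of CMOS dynamic power is P = alpha * C * Vdd^2 * f. The sketch below simply evaluates that textbook formula; the function and parameter names are chosen for this example and do not come from the disclosure.

```c
#include <assert.h>

/* First-order CMOS dynamic power estimate: P = alpha * C * Vdd^2 * f.
 * alpha   : switching activity factor (0..1)
 * cap_f   : switched load capacitance in farads
 * vdd_v   : supply voltage in volts
 * freq_hz : switching frequency in hertz
 * Returns power in watts. */
double cmos_dynamic_power(double alpha, double cap_f, double vdd_v, double freq_hz)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    return alpha * cap_f * vdd_v * vdd_v * freq_hz;
}
```

Under this estimate, a heavily loaded node such as a data bus line contributes far more per transition than a lightly loaded internal wire, which is consistent with the emphasis the later embodiments place on the data bus.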
An embedded Random Instruction Masking (RIM) approach is used to randomly vary both the number and also the content of the RIM instructions in each SP box's entry address calculation subroutine as disclosed above. One very important condition for the RIM approach to successfully prevent DPA attacks is to eliminate any distinguishing power signature of these RIM instructions. The best way to do this is to make the power signature of the RIM instruction identical to that of the normal instruction so that they cannot be differentiated in Power Analysis or Differential Power Analysis (DPA). Figure 11 shows an improved version of the RISC Processor 200 shown in Figure 10. A RIM control flag 202 generated from a random number generator 223, for example, controls the activation of the RIM instructions. The random number generator is also depicted in Figure 5 in connection with the first embodiment. The RISC Processor of Figure 11 has extra AND gates compared to the Processor of Figure 5 for controlling the Destination Address and the Program Counter Increment Enable. An extra register 222 is attached to the data bus 215. This register 222 is designed in such a way that it is identical to a register in the Register File 220, at least from a power consumption viewpoint. A pseudo program counter 232 is present to duplicate the original Program Counter 230 in the processor in terms of power consumption. While the RIM control flag 202 is set, the pseudo program counter 232 fetches instructions from the ROM 240, and those instructions enter the Instruction Register 245 and are decoded by the Address Decoder 260 as usual. But the results of the instruction are directed to the additional register 222 instead of a register in the Register File 220.
When the RIM control flag 202 equals a logical '0', the processor 200 will be under normal operation (that is, it functions as depicted by Figure 5 as unmodified). The extra AND gates 221, 231 at the destination address and the program counter just pass the original signals from the Address Decoder 260 and the Control Logic unit 250. At the same time, the added register 222 and the pseudo program counter 232 are disabled. Since all the circuit components involved during the execution of an instruction are the same as in Figure 10, the power signature (i.e. the consumed current pattern with respect to time) of every instruction run in the modified processor of Figure 11 will be the same as in the processor of Figure 10.
When the RIM control flag 202 is activated (equal to logical '1'), fetching an instruction from ROM 240, decoding and sourcing the A and B operands from the register file 220, and the operation on the operands in ALU 210 continue as usual. However, AND gates 221, which are responsive to the state of the RIM flag 202, disable the selection of the destination register in register file 220, so none of the destination registers in the register file 220 is selected to receive the results from the ALU 210. Rather, AND gate 223 causes the data on data bus 215 from ALU 210 to be directed to extra register 222 instead. The result is that the ALU is directed to load the results of the instruction being executed into added register 222 instead of one of the normal destination registers in register file 220. Since the physical design of the added register 222 is identical to a destination register in register file 220, the consumed current pattern of loading this added register 222 will be the same as loading the results into a real destination register in the register file 220. The AND gate 223 arranged at the front of the added register is for the purpose of emulating the power of one AND gate 221 used in one of the destination registers during normal operation. At the same time, the RIM flag 202 also disables the real Program Counter 230, and the pseudo program counter 232 is activated to be incremented or updated. Again, because of the identical physical design of the two program counters 230, 232, the power pattern of incrementing or updating the program counter by the executed instruction will be maintained. At the end of such an instruction cycle, none of the contents of the destination registers in register file 220 or the real program counter 230 is modified. That is, the status of the processor 200 remains the same as before the instruction was executed while the RIM flag 202 is set.
When the RIM flag 202 is set, the processor 200 acts, from a data processing standpoint, as if it were processing NOP (no operation) instructions. But from a power consumption standpoint, the processor appears to be processing real instructions.
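The data-path behavior described for Figure 11 can be illustrated with a toy software model. It is a sketch only: the structure and function names are hypothetical, a single add instruction stands in for the whole instruction set, and power consumption itself is of course not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 8

/* Toy model of the Figure 11 data path (names are illustrative). */
struct rim_cpu {
    uint32_t regs[NUM_REGS];  /* Register File 220 */
    uint32_t pc;              /* Program Counter 230 */
    uint32_t dummy_reg;       /* extra register 222 */
    uint32_t pseudo_pc;       /* pseudo program counter 232 */
};

/* Execute one "add dst, src_a, src_b" instruction.
 * With rim_flag == 0 the result and PC update go to the real registers.
 * With rim_flag == 1 the same work is done, but the result lands in the
 * dummy register and only the pseudo PC advances. */
void rim_cpu_add(struct rim_cpu *cpu, unsigned dst, unsigned src_a,
                 unsigned src_b, int rim_flag)
{
    assert(cpu && dst < NUM_REGS && src_a < NUM_REGS && src_b < NUM_REGS);

    uint32_t result = cpu->regs[src_a] + cpu->regs[src_b];  /* ALU 210 */
    if (rim_flag) {
        cpu->dummy_reg = result;
        cpu->pseudo_pc++;
    } else {
        cpu->regs[dst] = result;
        cpu->pc++;
    }
}
```

In this model, an instruction executed with the flag set leaves `regs` and `pc` untouched, which corresponds to the NOP-like behavior from a data processing standpoint.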
When RIM flag 202 goes back to logical '0', the processor will resume its normal operation to continue running the original program. Whatever instructions (no restriction of what kind) run while the RIM flag is at logical '1' have no effect on the processor or the program other than producing a camouflage effect of executing an associated normal instruction in the power trace. Thus, the instructions that were fetched while the RIM flag was at a logical '1' are basically re-fetched. Of course, the sequence may vary somewhat since the outcomes of branch instructions could be different. In any event, the processing basically continues from where it was interrupted while the RIM flag was at a logical '1'. When this processor with the RIM flag controlled instructions in the SP box address calculation subroutine is used, the power traces will contain a random variation of the number of certain instructions and also a variety of different kinds of instructions executed in the subroutine. Thus, DPA attackers can no longer identify and align the power traces of the SP box subroutine.
The extra register 222 is a dummy register in that it receives and stores data, but the data received thereby is preferably not used to influence subsequent data processing by processor 200. In Figure 11 it is shown separated from register file 220, but it could be implemented as a part of register file 220, if desired. The protection of the RIM control line at the output of the RIM control flag 202 on the chip from direct probing is important. If the RIM control line were easily accessed, some knowledgeable attackers might force the RIM control line to be always at logical '0' so as to disable the RIM. A number of camouflage techniques are available to protect the physical design of CMOS circuits from reverse engineering. Using these techniques, the RIM control line can be made very difficult to probe by burying it deep into the silicon implant level and shielding it with actively connected higher Poly and metal layers. It will be very difficult to locate this RIM control line, and any attempt to remove the higher protecting layers will damage the functionality of the chip.
The state of the RIM flag 202 is assumed to be at a logical '1' when the pseudo program counter 232 is being used to fetch instructions. As is well known to those skilled in the art, the logic shown in Figure 11 may be easily modified so that a logical '0' would cause the pseudo program counter 232 to come into play and a logical '1' would represent normal CPU operation.
The circuit shown in Figure 11 is not intended for a pipelined ALU. However, it is straightforward to adapt the circuit of Figure 11 for a pipelined ALU. In general, a pipelined ALU has four stages: prefetch, instruction decode, execute, and writeback. The RIM control signal from the RIM flag may be synchronized with the pipeline through a delay circuit. Thus, the RIM control flag 202 should be synchronized with added register 222, AND gates 221 and pseudo program counter 232 when used with a pipelined ALU.
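The disclosure does not detail the delay circuit. One plausible reading, shown here purely as an assumption, is a shift-register delay line that carries the flag sampled at prefetch along with its instruction until writeback. The names `rim_delay` and `rim_delay_tick` are hypothetical.

```c
#include <assert.h>

#define PIPE_STAGES 4  /* prefetch, decode, execute, writeback */

/* Delay line that carries the RIM flag alongside an instruction.
 * stage[0] is the flag sampled at prefetch; stage[PIPE_STAGES - 1] is
 * the flag seen by the writeback logic for that same instruction. */
struct rim_delay {
    int stage[PIPE_STAGES];
};

/* Advance the pipeline by one clock. flag_in is the RIM flag for the
 * instruction entering prefetch. Returns the flag for the instruction
 * now in writeback. */
int rim_delay_tick(struct rim_delay *d, int flag_in)
{
    assert(d);
    for (int i = PIPE_STAGES - 1; i > 0; i--)
        d->stage[i] = d->stage[i - 1];
    d->stage[0] = flag_in ? 1 : 0;
    return d->stage[PIPE_STAGES - 1];
}
```

With such a delay line, the gating of the destination register would use the flag value that was current when the instruction was fetched, not the value at the time of writeback.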
Of course, a processor 200 may have additional status flag registers that should not be updated when running in RIM mode. The control of such registers may be modified in the same way as the registers (by providing dummy flag registers - analogous to extra register 222 - for writing results to when in RIM mode), resulting in a duplicated power signature component for updating these flag registers without really updating them. These flag registers are not depicted in Figure 11 for the purpose of simplicity.
Within a processor, high capacitive loading and high speed mean that the switching of the data bus and the read/write of the Register File (Memory) will dominate the power consumption. The switching power of updating the flag registers (usually single-bit registers) is not significant in comparison to the total power. Even the program counter switching power may not be significant enough to cause an observable difference in the power traces. Leaving these flag registers untouched may be a convenient way to reduce the extra circuitry required.
Detailed Description of a Bus Architecture Embodiment
This embodiment prevents usage of side channel information by DPA attackers by randomly toggling the polarity of the target bit at the data bus driver while maintaining an equal probability of having a '0' or '1' value. In other words, the power traces no longer statistically correlate with the secret key. Thus, side channel information cannot be used to determine the keys being used by the cryptographic system. This embodiment may be used with the other embodiments or may be used alone.
Specifically, with reference to DPA, the result is that within each group of messages having the same target bit values computed from the selection function with correctly guessed key Ks, the corresponding power traces will not be always '0' or '1'. The chance of having a '0' or '1' at the target bit will be approximately 0.5 due to the randomization of polarity. Thus, the selection function D is effectively un-correlatable to the actual power trace measurement. The selection function D has thus been deprived of a way of predicting the power consumption of the actual target bit. In the case of Ks being incorrectly guessed, randomization will maintain the un-correlation between D and the corresponding power traces. Figure 12 depicts a Cryptographic Bus Architecture 311 (CBA) in accordance with the present invention, preferably having bi-directional drivers 315, 317 at both ends and a typically heavily loaded bus 316 in between. Bi-directional drivers are preferred since the use of non-bi-directional drivers would tend to increase the number of bus drivers needed to practice the invention. The bus 311 connects CPU 301 to its memories 321, 323. The CPU 301 runs the program stored in the ROM 321, and the RAM 323 is for intermediate storage of the cipher text data and the key.
The N-bit random number generator 313 controls the N-bit bi-directional drivers 315, 317. The random number generator 313 has N outputs 314, wherein each output consists of one bit. Each bit 314_0 - 314_N controls one bus driver 315, 317. The random number generator 313 generates a new set of N-bit random numbers 314_0 - 314_N whenever an "activate signal" is received from the CPU 301 through the enable line 303. The activate signal is preferably sent by the CPU 301 at the beginning of each DES round and is preferably software invoked. The value of each random bit 314_0 - 314_N is used to determine the way to toggle a driver 315, 317, i.e. change its polarity, and drive the heavily loaded internal data bus 316 so as to defeat correlation. The polarity control line 313 is preferably made to be "probe-resistant" because it is preferably buried beneath those circuit features readily visible to the reverse engineer. That is, this control line can be made with implanted layers in the substrate, using the techniques of U.S. Patent Nos. 5,866,933; 6,294,816 or 6,613,661 (each of which is hereby incorporated herein by reference), and therefore is buried beneath oxide, polysilicon and/or metal, making the possibility of connecting to the control line a much more difficult proposition. The required polarity changes are infrequent enough to thwart the statistical analysis by a reverse engineer. For example, the polarity can be changed at the beginning of each DES round, or at the beginning of fetching each new plaintext for encryption.
Figure 13 depicts a more detailed block diagram of the preferred embodiment. The 'CPU Read' 401_0 - 401_N and 'CPU Write' 403_0 - 403_N lines are used to control the data flow direction. The bi-directional bus drivers 315, 317 are inverting or non-inverting tri-state buffers, as determined by the value of the associated random bit 314_0 - 314_N of the random number generated by random number generator 313. For example, when the random bit 314_0 is '0' for bi-directional bus driver 315 during a 'CPU write' operation, the signal at 305_0 will be inverted on the data bus 316. At the other end, bi-directional bus driver 317 will pick up the inverted signal from the data bus 316 for bit 305_0 and invert the bit again to ensure the integrity of the original data signal. This occurs for each bit of the data signal 305, typically with some bits being inverted and others not. For the case when the bit 314_0 is a random '1', the non-inverting buffer 319 will drive the data bus 316 instead of the inverting one 320. Since the signals 314_0 - 314_N are random, the chances of having a value of '0' or '1' will be approximately 0.5 and 0.5. The result is that all the deterministic power information associated with the content of the data bus will be lost. Thus, even in the case of a DPA attack having a correctly guessed key, the tip-off correlation between the content of the target bit over the data bus and the corresponding power traces is lost.
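The per-bit polarity selection can be expressed compactly with bitwise operations. The following is a sketch assuming a 32-bit bus (the disclosure speaks of N bits); the function names are hypothetical. As in the description of Figure 13, a random bit of '0' selects the inverting buffer and a random bit of '1' selects the non-inverting buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Sending driver: for each bit, a random bit of 0 selects the inverting
 * buffer and a random bit of 1 selects the non-inverting buffer. */
uint32_t cba_drive(uint32_t data, uint32_t rnd)
{
    return data ^ ~rnd;
}

/* Receiving driver: applies the same per-bit polarity again, which
 * restores the original data. */
uint32_t cba_receive(uint32_t bus, uint32_t rnd)
{
    return bus ^ ~rnd;
}

/* Round trip over the bus; the receiver must recover the sent data. */
uint32_t cba_transfer(uint32_t data, uint32_t rnd)
{
    uint32_t bus = cba_drive(data, rnd);
    uint32_t out = cba_receive(bus, rnd);
    assert(out == data);
    return out;
}
```

Only the value returned by `cba_drive` appears on the heavily loaded bus. With `rnd` uniformly random, each bus bit is '0' or '1' with equal probability regardless of the data bit it carries.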
After the logical content of a data bus 316, which tends to have heavy capacitive loading in processor designs, is made un-correlatable to a power trace measurement, any remaining correlation could be coming from the lightly loaded capacitive wires connecting the ALU and register files. To minimize detection of this correlation, a set of dual rails (d and d_bar) is preferably used to write a given register bit as shown in Figure 14. Because of the symmetry of this design, the dual rails simultaneously contain both the new data 'd' and its complement 'd_bar', thus masking the external power consumption to be normalized at 0.5 as a result of averaging 'd' and 'd_bar'. Note the presence of complementary read amplifiers and complementary write amplifiers. Specifically, for a data value D0 of '0', the set of dual rails contains '0,1'; for a data value D0 of '1', the data value for the set of dual rails is '1,0'. Therefore, independent of the data value D0, this circuit (including the rails d and d_bar as well as the complementary read and complementary write amplifiers) will always have the same average power consumption and thus will make the data value D0 un-correlatable to the power consumption of the circuit. The data value D0 of the circuit of Figure 14 can have a '0' value or a '1' value, but, in either case, one of d and d_bar will be equal to '0' and the other of d and d_bar will be equal to '1', and their average will, of course, be equal to 0.5. The result is that the power signature of the circuit is independent of the data value content of the ALU register bit. Of course, a given register has multiple bits and each bit of storage is preferably constructed in accordance with the design according to Figure 14.
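The balancing property of the dual rails can be checked with a short sketch. It assumes an 8-bit register for brevity, and the packing of d into the low byte and d_bar into the high byte is an arbitrary choice for this example; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Count the rails driven to logical 1. */
unsigned rails_high(uint16_t rails)
{
    unsigned n = 0;
    while (rails) {
        n += rails & 1u;
        rails >>= 1;
    }
    return n;
}

/* Encode an 8-bit value onto dual rails: the low byte carries d and
 * the high byte carries d_bar (the bitwise complement). */
uint16_t dual_rail_encode(uint8_t d)
{
    uint16_t rails = (uint16_t)(((uint16_t)(uint8_t)~d << 8) | d);
    assert(rails_high(rails) == 8);  /* always half of the 16 rails */
    return rails;
}
```

Whatever the value of `d`, exactly eight of the sixteen rails are high, which is the software analogue of the 0.5 average stated above.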
The present invention is preferably implemented in an on-chip bus and/or chip architecture of a microprocessor that is used to perform cryptographic operations. This architectural approach enables securing existing cryptographic algorithms (including RSA, DES, AES and non-linear algorithms).
Having described the presently disclosed technology in connection with different embodiments thereof, modification will now suggest itself to those skilled in the art. As such, the invention as defined in the appended claims is not to be limited to the disclosed embodiments except as specifically required by the appended claims.
static unsigned long SP1[64] = {
    0x01010400L, 0x00000000L, 0x00010000L, 0x01010404L,
    0x01010004L, 0x00010404L, 0x00000004L, 0x00010000L,
    0x00000400L, 0x01010400L, 0x01010404L, 0x00000400L,
    0x01000404L, 0x01010004L, 0x01000000L, 0x00000004L,
    0x00000404L, 0x01000400L, 0x01000400L, 0x00010400L,
    0x00010400L, 0x01010000L, 0x01010000L, 0x01000404L,
    0x00010004L, 0x01000004L, 0x01000004L, 0x00010004L,
    0x00000000L, 0x00000404L, 0x00010404L, 0x01000000L,
    0x00010000L, 0x01010404L, 0x00000004L, 0x01010000L,
    0x01010400L, 0x01000000L, 0x01000000L, 0x00000400L,
    0x01010004L, 0x00010000L, 0x00010400L, 0x01000004L,
    0x00000400L, 0x00000004L, 0x01000404L, 0x00010404L,
    0x01010404L, 0x00010004L, 0x01010000L, 0x01000404L,
    0x01000004L, 0x00000404L, 0x00010404L, 0x01010400L,
    0x00000404L, 0x01000400L, 0x01000400L, 0x00000000L,
    0x00010004L, 0x00010400L, 0x00000000L, 0x01010004L };

static unsigned long SP2[64] = {
    0x80108020L, 0x80008000L, 0x00008000L, 0x00108020L,
    0x00100000L, 0x00000020L, 0x80100020L, 0x80008020L,
    0x80000020L, 0x80108020L, 0x80108000L, 0x80000000L,
    0x80008000L, 0x00100000L, 0x00000020L, 0x80100020L,
    0x00108000L, 0x00100020L, 0x80008020L, 0x00000000L,
    0x80000000L, 0x00008000L, 0x00108020L, 0x80100000L,
    0x00100020L, 0x80000020L, 0x00000000L, 0x00108000L,
    0x00008020L, 0x80108000L, 0x80100000L, 0x00008020L,
    0x00000000L, 0x00108020L, 0x80100020L, 0x00100000L,
    0x80008020L, 0x80100000L, 0x80108000L, 0x00008000L,
    0x80100000L, 0x80008000L, 0x00000020L, 0x80108020L,
    0x00108020L, 0x00000020L, 0x00008000L, 0x80000000L,
    0x00008020L, 0x80108000L, 0x00100000L, 0x80000020L,
    0x00100020L, 0x80008020L, 0x80000020L, 0x00100020L,
    0x00108000L, 0x00000000L, 0x80008000L, 0x00008020L,
    0x80000000L, 0x80100020L, 0x80108020L, 0x00108000L };
Table 1. Expressed in C language, for example, SP-Box 1 & 2 are implemented as lookup tables of 64 elements
1.  {
2.  work = (right << 28) | (right >> 4);
3.  work ^= *keys++;
4.  fval = SP7[ work & 0x3fL];
5.  fval |= SP5[(work >> 8) & 0x3fL];
6.  fval |= SP3[(work >> 16) & 0x3fL];
7.  fval |= SP1[(work >> 24) & 0x3fL];
8.  work = right ^ *keys++;
9.  fval |= SP8[ work & 0x3fL];
10. fval |= SP6[(work >> 8) & 0x3fL];
11. fval |= SP4[(work >> 16) & 0x3fL];
12. fval |= SP2[(work >> 24) & 0x3fL];
13. leftt ^= fval;
14. }
Table 2. C language program that sequentially accesses DES's eight SP lookup tables for a given round.
The C language statement fval |= SP5[(work >> 8) & 0x3fL] becomes, in assembly language:
1.  li A 0x3f          ; A = 0x3f
2.  add 1 work 0       ; 1 = work
3.  li C 8             ; C = 8; initialize shifting counter to 8
4.  jal link rshft     ; jump to subroutine to right shift register 1 by C (reg.) places; 1 = (work >> 8)
5.  and 1 1 A          ; 1 = (work >> 8) & 0x3fL
6.  li B SP5           ; B = &SP5
7.  add B B 1          ; B = &SP5[(work >> 8) & 0x3fL]
8.  Lw B B             ; B = SP5[(work >> 8) & 0x3fL];
9.  Lw C fval          ; C = fval
10. or C C B           ; fval = C |= SP5[(work >> 8) & 0x3fL];
11. sw fval C          ; fval = C
12. ; "rshft" is the routine to right shift register 1 by C (reg.) places with Random Instruction Masking (RIM) enabled
13. rshft sw RIM_start ; I/O to start RIM by allowing insertion of random instructions with CPU
                       ; registers update disabled (i.e., begin of RIM statements block)
                       ; random instruction from random number generator
                       ; random instruction from random number generator
14. sw RIM_stop        ; I/O to stop Random Instruction Masking by enabling update of registers
                       ; (i.e., end of RIM statements block)
15. sra 1 1            ; register 1 is shifted right by one place
16. sub C C const1     ; C--; decrement count register by one
17. bnz C rshft        ; (C > 0) loop
18. jr link            ; return to caller
Table 3. The corresponding assembly language program to implement the C program statement #5 of Table 2 - lines starting with ";" are comment lines.
1.  ; for( round = 0; round < 8; round++ ) {
2.  ; works[0] = (rights[1] << 12) | ((rights[0] >> 4) & 0x0fff);
3.  ; works[1] = (rights[0] << 12) | ((rights[1] >> 4) & 0x0fff);
4.  li round 0            ; round = 0
5.  li A edf              ; A = edf
6.  Lw B A                ; B = &edf
7.  Lw C B                ; C = edf
8.  li A keys             ; A = keys, i.e. en0ks
9.  add A A C             ; A = en0ks + edf
10. Lw j A                ; j = &keys // initialize the pointer to the key schedules
11. rndbk4 li A desmsk    ; A = desmsk
12. Lw A A                ; A = &desmsk[0]
13. li B 4                ; B = 4
14. add B B A             ; B = &desmsk[4]
15. Lw fval0 B            ; fval0 = desmsk[4] = 0x0fff
16. li A 0                ; A = 0
17. add 1 right0 A        ; 1 = right0
18. li C 4                ; 1 = (rights[0] >> 4)
19. jal lnk rshft
20. and work0 1 fval0     ; work0 = (rights[0] >> 4) & 0x0fff
21. add 1 right1 A        ; 1 = right1
22. li C 12               ; 1 = (rights[1] << 12)
23. jal lnk rtls
24. or work0 work0 1
25. add 1 right1 A        ; 1 = right1
26. li C 4                ; 1 = (rights[1] >> 4)
27. jal lnk rshft
28. and work1 1 fval0     ; work1 = (rights[1] >> 4) & 0x0fff
29. add 1 right0 A        ; 1 = right0
30. li C 12               ; 1 = (rights[0] << 12)
31. jal lnk rtls
32. or work1 work1 1
33. ; works[0] ^= *keys++;
34. ; works[1] ^= *keys++;
35. Lw C j                ; C = *keys++
36. add j j const1        ; j++
37. xor work0 C work0     ; works[0] ^= *keys++
38. Lw C j                ; C = *keys++
39. add j j const1        ; j++
40. xor work1 C work1     ; works[1] ^= *keys++
41. ; fvals[0] = SP7LL[ works[1] & 0x3fL];
42. ; fvals[1] = SP7RR[ works[1] & 0x3fL];
43. li fval0 fval         ; initialize variables address for &fvals[0]
44. Lw fval0 fval0        ; fval0 = &fvals[0]
45. li A 0x3f             ; A = 0x3f
46. and 1 work1 A         ; 1 = works[1] & 0x3fL
47. li B SP7LL            ; B = SP7LL
48. Lw B B                ; B = &SP7LL
49. add B B 1             ; B = &SP7LL[ works[1] & 0x3fL]
50. Lw B B                ; B = SP7LL[ works[1] & 0x3fL];
51. sw fval0 B            ; fvals[0] = SP7LL[ works[1] & 0x3fL];
52. li B SP7RR            ; B = SP7RR
53. Lw B B                ; B = &SP7RR
54. add B B 1             ; B = &SP7RR[ works[1] & 0x3fL]
55. Lw B B                ; B = SP7RR[ works[1] & 0x3fL];
56. add 1 fval0 const1    ; 1 = &fvals[1]
57. sw 1 B                ; fvals[1] = SP7RR[ works[1] & 0x3fL];
58. ; fvals[0] |= SP5LL[ (works[1] >> 8) & 0x3fL];
59. ; fvals[1] |= SP5RR[ (works[1] >> 8) & 0x3fL];
60. li 1 0                ; 1 = 0
61. add 1 work1 1         ; 1 = works[1]
62. li C 8                ; 1 = (works[1] >> 8)
63. jal lnk rshft
64. and 1 1 A             ; 1 = (works[1] >> 8) & 0x3fL
65. li B SP5LL            ; B = SP5LL
66. Lw B B                ; B = &SP5LL
67. add B B 1             ; B = &SP5LL[(works[1] >> 8) & 0x3fL]
68. Lw B B                ; B = SP5LL[(works[1] >> 8) & 0x3fL];
69. Lw C fval0            ; C = fvals[0]
70. or C C B              ; fvals[0] |= SP5LL[(works[1] >> 8) & 0x3fL];
71. sw fval0 C            ; fvals[0] = C
72. li B SP5RR            ; B = SP5RR
73. Lw B B                ; B = &SP5RR
74. add B B 1             ; B = &SP5RR[(works[1] >> 8) & 0x3fL]
75. Lw B B                ; B = SP5RR[(works[1] >> 8) & 0x3fL];
76. or fval1 fval1 B      ; fvals[1] |= SP5RR[(works[1] >> 8) & 0x3fL]
77. ; routine to left shift register 1 by C (reg.) places
78. rtls sla 1 1
79. sub C C const1        ; C--
80. bnz C rtls            ; (C > 0) loop
81. jr lnk                ; return to caller
82. ; routine to right shift register 1 by C (reg.) places
83. ; warning: need to convert arithmetic shift to unsigned right shift
84. ; used reg k as temporary var
85. rshft Lw B const1     ; B = sign bit to extract
86. and B 1 B             ; B contains the sign bit of 1
87. sra 1 1
88. xor 1 1 B
89. sub C C const1        ; C--
90. bnz C rshft1          ; (C > 0) loop
91. jr lnk                ; return to caller
92. rshft1 sra 1 1
93. sub C C const1        ; C--
94. bnz C rshft1          ; (C > 0) loop
95. jr lnk                ; return to caller
Table 4

    ; fval |= SP5[(work >> 8) & 0x3fL];
1.  li A 0x3f          ; A = 0x3f
2.  add 1 work 0       ; 1 = work
3.  li C 8             ; C = 8; initialize shifting counter to 8
4.  sw SCC C           ; I/O to set external Shift Control Counter (SCC) to 8; when zero, it enables RIM_shft
5.  li C 24            ; C = 24; initialize internal shifting counter to 24 to provide extra pseudo instructions
6.  jal link rshft     ; jump to subroutine to right shift register 1 by C (reg.) places; 1 = (work >> 24)
7.  and 1 1 A          ; 1 = (work >> 8) & 0x3fL
8.  li B SP5           ; B = &SP5
9.  add B B 1          ; B = &SP5[(work >> 8) & 0x3fL]
10. Lw B B             ; B = SP5[(work >> 8) & 0x3fL];
11. Lw C fval          ; C = fval
12. or C C B           ; fval = C |= SP5[(work >> 8) & 0x3fL];
13. sw fval C          ; fval = C
14. ; "rshft" is the routine to right shift register 1 by C (reg.) places with Random Instruction Masking (RIM) enabled
15. rshft sw RIM_start ; I/O to start RIM by allowing insertion of random instructions with CPU
                       ; registers update disabled (i.e., begin of RIM statements block)
                       ; random instruction from random number generator
                       ; random instruction from random number generator
16. sw RIM_stop        ; I/O to stop Random Instruction Masking by enabling update of registers
                       ; (i.e., end of RIM statements block)
17. sra 1 1            ; register 1 is shifted right by one place
18. sub C C const1     ; C--; decrement count register by one
19. bnz C rshft        ; (C > 0) loop
20. jr link            ; return to caller
Table 5. The corresponding assembly language program to implement the C program statement #5 of Table 2 for the embodiment of Figure 7 - lines starting with a ";" are comment lines.
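The warning in the right-shift routine at the end of Table 4, that the arithmetic shift must be converted to an unsigned right shift, can be illustrated in C. The sketch below assumes a 16-bit word (suggested by the 12-bit and 4-bit splits of rights[] in Table 4, but not stated explicitly) and models the sra instruction by hand, since right-shifting a negative signed value in C is implementation-defined. The function names are chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SIGN_BIT 0x8000u

/* Model of a 16-bit arithmetic right shift by one place (the "sra"
 * instruction): the sign bit is copied into the vacated position. */
uint16_t sra1(uint16_t x)
{
    return (uint16_t)((x >> 1) | (x & SIGN_BIT));
}

/* Unsigned right shift by count places built from sra, following the
 * structure of the Table 4 routine: the first shift clears the copied
 * sign bit with an XOR, after which plain sra behaves like a logical
 * shift because the top bit is zero. */
uint16_t rshft(uint16_t x, unsigned count)
{
    assert(count >= 1 && count <= 16);
    uint16_t sign = x & SIGN_BIT;      /* and B 1 B */
    x = sra1(x);                       /* sra 1 1   */
    x ^= sign;                         /* xor 1 1 B */
    while (--count > 0)
        x = sra1(x);                   /* rshft1 loop */
    return x;
}
```

Only the first shift needs the correction, which is why the assembly routine falls through into a second loop that uses sra alone.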

Claims

1. A cryptographic architecture comprising: a processor; a memory containing an encryption algorithm coupled to said processor; and a control flag register and a shift control counter coupled to said processor for controlling the state operation of the processor, the shift control counter adapted to count a number of desired real shift instructions for carrying out the encryption algorithm; the control flag register being set and/or reset by instructions stored in said memory and issued by the processor, the control flag register assuming a particular state when shift instructions are to be performed as pseudo shift instructions by the processor.
2. The cryptographic architecture of claim 1 wherein the control flag register and a shift control counter are interconnected by a pair of gates, a first gate of said pair of gates having an output coupled to a first input of a second gate of said pair of gates, the first gate having one input thereof coupled to an output ofthe control flag register and having another input thereof coupled to an output ofthe second gate, the second gate having another input coupled to the shift control counter, the output ofthe second gate also being coupled to the processor for halting state operation ofthe processor.
3. The cryptographic architecture of claim 2 wherein the desired shift instructions and the pseudo shift instructions occur in a plurality of groups, each group of shift instructions comprising a fixed number of shift instructions, with the number of pseudo shift instructions in each group varying by group.
4. The cryptographic architecture of claim 3 wherein at least one group comprises all pseudo shift instructions and at least one another group comprises all real shift instructions.
5. The cryptographic architecture of claim 1 wherein the desired shift instructions and the pseudo shift instructions occur in a plurality of groups, each group of instructions comprising a fixed number of shift instructions, with the number of pseudo shift instructions in each group varying by group.
6. The cryptographic architecture of claim 5 wherein at least one group comprises all pseudo shift instructions and at least one another group comprises all real shift instructions.
7. The cryptographic architecture of claim 1 wherein said processor is a 16-bit, 32-bit or 64-bit processor.
8. The cryptographic architecture of claim 1 wherein said encryption algorithm is a Data Encryption Standard (DES) algorithm.
9. A system for thwarting differential power analysis, said system comprising: means for running an encryption algorithm; and means for inserting a random or a predetermined number of pseudo instructions into said encryption algorithm, said pseudo instruction mimicking real instructions in terms of at least energy consumption without affecting the encryption algorithm being run.
10. The system of claim 9 wherein said means for running an encryption algorithm comprises: a processor; and a memory storing the encryption algorithm coupled to said processor.
11. The system of claim 10 wherein said processor is a 16-bit, 32-bit or 64-bit processor.
12. The system of claim 10 wherein said encryption algorithm is a Data Encryption Standard (DES) algorithm.
13. The system of claim 9 wherein the pseudo instructions emulate bit-wise shift instructions power consumption wise.
14. The system of claim 9 wherein the pseudo instructions comprise a set of randomized instructions.
15. The system of claim 9 wherein said means for inserting comprises: a control flag register coupled to said processor; and a random number generator coupled to said control flag register.
16. The system of claim 15 wherein said random number generator is a one-bit random number generator.
17. A system for decorrelating side channel information, said system comprising: means for running a Data Encryption Standard (DES) algorithm, said DES algorithm comprising a plurality of substitution/permutation box entry address evaluations; and means for inserting a number of pseudo instructions in at least one of said plurality of substitution/permutation box entry address evaluations, the pseudo instructions mimicking, energy consumption-wise, corresponding real instructions, but without affecting the running of the DES algorithm.
18. The system of claim 17 wherein said means for running a DES algorithm comprises: a processor; and a memory containing an encryption algorithm coupled to said processor and a plurality of lookup tables coupled to said processor, said plurality of substitution/permutation boxes being implemented in said plurality of lookup tables.
19. The system of claim 18 wherein said processor is a 16-bit, 32-bit or 64-bit processor.
20. The system of claim 17 wherein said means for inserting includes a control flag register coupled to said processor for causing said processor to issue pseudo instructions, which do not update registers associated with the processor, rather than corresponding real instructions which would update at least one register associated with the processor.
21. The system of claim 20 wherein said means for inserting further includes a shift control counter for inserting additional real instructions into the DES algorithm if a connection between the control flag register and the processor is successfully probed by an attacker, the additional real inserted instructions being effective to disable calculations performed by the DES algorithm.
22. The system of claim 20 wherein said means for inserting further includes a random number generator coupled to said control flag register.
23. The system of claim 22 wherein said random number generator is a one-bit random number generator.
24. A system for decorrelating side channel information, said system comprising: means for running a Data Encryption Standard (DES) algorithm, said DES algorithm comprising a plurality of substitution/permutation box entry address evaluations; and means for inserting a fixed and/or a random number of pseudo instructions in at least one of said plurality of substitution/permutation box entry address evaluations.
25. The system of claim 24 wherein said means for running a DES algorithm comprises: a processor; and a memory module containing an encryption algorithm coupled to said processor and a plurality of lookup tables coupled to said processor, said plurality of substitution/permutation boxes being implemented in said plurality of lookup tables.
26. The system of claim 25 wherein said processor is a 16-bit, 32-bit or 64-bit processor.
27. A method of altering a power trace of a cryptographic architecture comprising: running an encryption algorithm; setting a control flag; and performing a number of pseudo instructions when said control flag is set, said pseudo instructions mimicking corresponding real instructions energy consumption wise without affecting calculations performed according to said encryption algorithm.
28. The method of claim 27 wherein in the setting a control flag further comprises halting a state machine of a processor running said encryption algorithm.
29. The method of claim 28 wherein the halting of the state machine further comprises disabling a destination register in said state machine.
30. The method of claim 27 further comprising modifying said encryption algorithm to shuffle an access order of a plurality of lookup tables.
31. The method of claim 27 wherein said encryption algorithm is a Data Encryption Standard (DES) algorithm.
32. The method of claim 27 further comprising resetting said control flag, wherein said step of resetting further comprises sending a signal from a random number generator to a control flag register.
33. A cryptographic CPU architecture comprising: an ALU; a control flag; a plurality of registers for normally receiving output of the ALU in response to an arithmetic instruction; and an additional register for receiving output of the ALU, in lieu of one of the plurality of registers, in response to an arithmetic instruction when the control flag is set.
34. The cryptographic CPU architecture of claim 33 further comprising: a first program counter; and a second program counter; wherein the first and second program counters are responsive to the state of said control flag so that the first program counter is enabled where said control flag is not set and so that the second program counter is enabled where said control flag is set; and wherein an enabled one of said first and second program counters fetches instructions from an instruction memory.
35. The cryptographic CPU architecture of claim 34 wherein the ALU outputs the results of an arithmetic instruction fetched by the first program counter to one of said plurality of registers and the ALU outputs the results of an arithmetic instruction fetched by the second program counter to said additional register.
36. The cryptographic CPU architecture of claim 35 wherein the additional register is a dummy register having no output for transferring data to the ALU.
37. The cryptographic CPU architecture of claim 36 wherein the registers and the additional register each have an associated gate for controlling the transfer of data to the registers and to the additional register, the associated gates being controlled by the state of said control flag.
38. The cryptographic CPU architecture of claim 33 wherein the additional register is a dummy register having no output for transferring data to the ALU.
39. The cryptographic CPU architecture of claim 33 wherein the registers and the additional register each have an associated gate for controlling the transfer of data to the registers and to the additional register, the associated gates being controlled by the state of said control flag.
40. A method of concealing data processing occurring in a CPU from power analysis during the execution of a program, the method comprising: (i) at a point during the execution of the program, inserting a random number of program counter cycles instruction fetch cycles, (ii) while the random number of instruction fetch cycles are occurring, fetching instructions from memory, executing those instructions in program sequence, but inhibiting updating of normal memory locations based on the execution of those instructions; and (iii) at the conclusion of said random number of instructions, then recommencing normal program execution by refetching the same instructions which were initially fetched while the random number of instruction fetch cycles were occurring, but when the instructions are refetched, updating memory locations in a normal manner for the CPU.
41. The method of claim 40 wherein the insertion of said random number of program counter cycles instruction fetch cycles is controlled by a state of a random instruction mask control flag.
42. The method of claim 40 wherein, while the random number of instruction fetch cycles are occurring, updating a dummy memory location based on the execution of instructions.
43. A method of concealing data processing occurring in a CPU from power analysis during the execution of a program, the method comprising: (i) at a point during the execution of the program, inserting a random number of program counter cycles instruction fetch cycles; and (ii) while the random number of instruction fetch cycles are occurring, mimicking power consumption associated with (a) fetching instructions from memory, (b) executing those instructions in program sequence, and (c) writing results to memory registers.
44. A data processor comprising: an arithmetic logic unit; a control flag register; a plurality of registers for normally receiving output of the arithmetic logic unit in response to an arithmetic instruction and in response to a first state of said control flag register; and a dummy register for receiving output of the arithmetic logic unit, in lieu of one of the plurality of registers, in response to an instruction and in response to a second state of said control flag register.
45. The data processor of claim 44 further comprising: a first program counter; a second program counter; the first and second program counters being responsive to the state of said control flag register so that the first program counter is enabled when said control flag register is in said first state and so that the second program counter is enabled when said control flag register is in said second state; and wherein an enabled one of said first and second program counters fetches instructions from an instruction memory.
46. The data processor of claim 45 wherein the arithmetic logic unit outputs the results of an arithmetic instruction fetched by the first program counter to one of said plurality of registers and the arithmetic logic unit outputs the results of an arithmetic instruction fetched by the second program counter to said dummy register.
47. The data processor of claim 46 wherein the dummy register has no output for transferring data to the arithmetic logic unit.
48. The data processor of claim 47 wherein the registers and the dummy register each have an associated logic gate for controlling the transfer of data to the registers and to the dummy register, the associated logic gates being controlled by the state of said control flag register.
49. The data processor of claim 44 wherein the dummy register has no output for transferring data to the arithmetic logic unit.
50. The data processor of claim 44 wherein the registers and the dummy register each have an associated logic gate for controlling the transfer of data to the registers and to the dummy register, the associated logic gates being controlled by the state of said control flag register.
51. A cryptographic bus architecture comprising: a random number generator having a plurality of random number outputs at which a multi-bit random number is output; a plurality of bi-directional bus drivers, each bi-directional bus driver having at least one input for receiving at least one of said random number outputs; and a bus coupling at least one of said plurality of bi-directional bus drivers to at least another of said bi-directional bus drivers; wherein bi-directional bus drivers that are coupled to a common line of said bus are controlled by a common selected one of said random number outputs.
52. The cryptographic bus architecture as claimed in claim 51 wherein said plurality of random number outputs is camouflaged.
53. The cryptographic bus architecture as claimed in claim 51 wherein at least one of said plurality of bi-directional bus drivers comprises a normally inverting tri-state buffer and at least another one of said plurality of bi-directional bus drivers comprises a normally non-inverting tri-state buffer.
54. The cryptographic bus architecture as claimed in claim 51 further comprising a set of dual rails coupled to said plurality of bi-directional bus drivers, the set of dual rails coupling said bus to a CPU or to memory.
55. The cryptographic bus architecture as claimed in claim 51 wherein the random number generator is responsive to a control signal for causing said random number generator to emit a new random number.
56. The cryptographic bus architecture as claimed in claim 55 wherein the control signal is generated by a processor.
57. The cryptographic bus architecture as claimed in claim 56 wherein the control signal is generated by said processor in response to a software instruction.
58. A method of preventing a breach of security comprising the steps of: sending encrypted bits over a bus; and randomly toggling the polarity of said encrypted bits on said bus.
59. The method as claimed in claim 58 wherein said bus has dual rails for each bit transmitted in a parallel manner on said bus, one rail of said dual rails being inverted compared to the other rail of said dual rails.
60. A method of preventing a breach of security comprising sending encrypted bits over a bus having dual rails for each bit transmitted in a parallel manner on said bus, one rail of said dual rails being inverted compared to the other rail of said dual rails.
61. A method for protecting secret keys comprising: providing a plurality of bi-directional bus drivers; coupling a line of a data bus between at least a first bi-directional bus driver of said plurality of bi-directional bus drivers and a second bi-directional bus driver of said plurality of bi-directional bus drivers; signaling said first bi-directional bus driver to provide a first set of bits to said bus, said bits having a first polarity; signaling said second bi-directional bus driver to receive said first set of bits having said first polarity; randomly signaling said first bi-directional bus driver to provide a second set of bits to said bus, said second set of bits having an opposite polarity than said first set of bits; and signaling said second bi-directional bus driver to receive said second set of bits having said opposite polarity.
62. The method as claimed in claim 61 further comprising the step of camouflaging said signaling of said first and second bi-directional bus drivers.
63. The method as claimed in claim 61 further including: coupling a second line of said data bus between at least a third bi-directional bus driver of said plurality of bi-directional bus drivers and a fourth bi-directional bus driver of said plurality of bi-directional bus drivers; signaling said third bi-directional bus driver to provide a third set of bits to said bus, said bits having a first polarity; signaling said fourth bi-directional bus driver to receive said third set of bits having said first polarity; randomly signaling said third bi-directional bus driver to provide a fourth set of bits to said bus, said fourth set of bits having an opposite polarity than said second set of bits; and signaling said fourth bi-directional bus driver to receive said fourth set of bits having said opposite polarity.
64. A method for preventing information leakage attacks comprising the steps of: randomly inverting a polarity of at least one of a plurality of signals on a first end of a bus; and signaling to a second end of said bus that said random inverting has occurred at said first end of said bus.
65. A cryptographic bus architecture comprising: a random number generator for generating a multi-bit random number; first and second pluralities of bi-directional bus drivers, each bi-directional bus driver having a control input responsive to a selected bit of said random number; and a bus coupling said first plurality of bi-directional bus drivers to said second plurality of bi-directional bus drivers, each of said bi-directional bus drivers being associated with a single line of said bus and wherein the bi-directional bus drivers coupled to a common line of said bus are responsive to a common bit of random number.
66. The cryptographic bus architecture as claimed in claim 65 wherein said random number generator has a plurality of camouflaged random number output ports.
67. The cryptographic bus architecture as claimed in claim 65 wherein said bidirectional bus drivers comprise an inverting tri-state buffer or a non-inverting tri-state buffer as determined by a state of data at its control input.
68. The cryptographic bus architecture as claimed in claim 65 further comprising a first and second sets of dual rails coupled to said first and second pluralities of bi-directional bus drivers, the first and second sets of dual rails coupling said bus to a CPU and to memory.
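The bus claims (51, 58 to 60, 64 and 65) describe two ideas that a short C sketch can illustrate: per-line polarity toggling under control of a shared random bit, and dual rails in which one rail carries the inverse of the other. The code below is a software analogy under stated assumptions, not the claimed circuit: the bus is assumed to be 8 bits wide, the random number is passed in as a plain polarity_mask argument rather than produced by a hardware generator, and the function names bus_drive, bus_receive, dual_rail and popcount16 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sending side: each bus line is driven inverted where the matching bit
 * of the random number is 1 and non-inverted where it is 0. */
uint8_t bus_drive(uint8_t data, uint8_t polarity_mask)
{
    return (uint8_t)(data ^ polarity_mask);
}

/* Receiving side: drivers on the same line see the same random bit, so
 * applying the same mask again restores the original data. */
uint8_t bus_receive(uint8_t lines, uint8_t polarity_mask)
{
    return (uint8_t)(lines ^ polarity_mask);
}

/* Dual rails: the high byte models the true rail and the low byte the
 * inverted rail for the same 8 data bits. */
uint16_t dual_rail(uint8_t value)
{
    return (uint16_t)(((uint16_t)value << 8) | (uint8_t)~value);
}

/* Number of lines at logic 1, used here as a rough stand-in for the
 * data-dependent part of bus power. */
unsigned popcount16(uint16_t v)
{
    unsigned n = 0;
    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}
```

With a mask of 0xA5, the data byte 0x3C appears on the bus as 0x99 and is recovered as 0x3C at the far end. For every 8-bit value, popcount16(dual_rail(value)) is 8, so in this simplified model the number of rails at logic 1 does not depend on the data being sent.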
PCT/US2005/020093 2004-06-08 2005-06-07 Cryptographic architecture with instruction masking and other techniques for thwarting differential power analysis WO2005124506A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US11/628,920 US8095993B2 (en) 2004-06-08 2005-06-07 Cryptographic architecture with instruction masking and other techniques for thwarting differential power analysis
GB0623489A GB2430515B (en) 2004-06-08 2005-06-07 A cryptographic CPU architecture for thwarting differential power analysis
JP2007527677A JP2008502283A (en) 2004-06-08 2005-06-07 Cryptographic architecture with instruction masks and other techniques that interfere with differential power analysis
US13/296,740 US20120144205A1 (en) 2004-06-08 2011-11-15 Cryptographic Architecture with Instruction Masking and other Techniques for Thwarting Differential Power Analysis

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US10/864,568 US7949883B2 (en) 2004-06-08 2004-06-08 Cryptographic CPU architecture with random instruction masking to thwart differential power analysis
US10/864,569 2004-06-08
US10/864,569 US8065532B2 (en) 2004-06-08 2004-06-08 Cryptographic architecture with random instruction masking to thwart differential power analysis
US10/864,568 2004-06-08
US10/864,556 2004-06-08
US10/864,556 US8296577B2 (en) 2004-06-08 2004-06-08 Cryptographic bus architecture for the prevention of differential power analysis

Publications (2)

Publication Number Publication Date
WO2005124506A2 true WO2005124506A2 (en) 2005-12-29
WO2005124506A3 WO2005124506A3 (en) 2006-05-11

Family

ID=35058184

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/020093 WO2005124506A2 (en) 2004-06-08 2005-06-07 Cryptographic architecture with instruction masking and other techniques for thwarting differential power analysis

Country Status (4)

Country Link
US (5) US7949883B2 (en)
JP (5) JP2008502283A (en)
GB (6) GB2430515B (en)
WO (1) WO2005124506A2 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008146384A (en) * 2006-12-11 2008-06-26 Nec Electronics Corp Information processor and instruction fetch control method
EP2343664A1 (en) * 2009-12-21 2011-07-13 Nxp B.V. Cryptographic device
FR2956764A1 (en) * 2010-02-19 2011-08-26 St Microelectronics Rousset PROTECTION OF REGISTERS AGAINST UNILATERAL DISTURBANCES
CN102523085A (en) * 2011-12-15 2012-06-27 北京握奇数据系统有限公司 Data encryption method, data encrypting device and smart card
EP2675105A1 (en) * 2012-06-12 2013-12-18 Electronics and Telecommunications Research Institute Apparatus and method for providing security service
FR3060789A1 (en) * 2016-12-19 2018-06-22 Commissariat A L'energie Atomique Et Aux Energies Alternatives METHOD FOR EXECUTING A MICROPROCESSOR OF A POLYMORPHIC MACHINE CODE OF A PREDETERMINED FUNCTION
CN108270427A (en) * 2017-01-03 2018-07-10 意法半导体(鲁塞)公司 The device and method being managed for the current drain to integration module
EP2738974B1 (en) * 2012-11-29 2020-08-12 Spirtech Method for deriving multiple cryptographic keys from a master key in a security microprocessor
US11126432B2 (en) 2010-02-11 2021-09-21 Nxp B.V. Computer processor and method with short forward jump instruction inhibiting

Families Citing this family (101)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2365153A (en) * 2000-01-28 2002-02-13 Simon William Moore Microprocessor resistant to power analysis with an alarm state
DE10202700A1 (en) * 2002-01-24 2003-08-07 Infineon Technologies Ag Device and method for generating a command code
FR2838262B1 (en) * 2002-04-08 2004-07-30 Oberthur Card Syst Sa METHOD FOR SECURING ELECTRONICS WITH ENCRYPTED ACCESS
US7949883B2 (en) * 2004-06-08 2011-05-24 Hrl Laboratories, Llc Cryptographic CPU architecture with random instruction masking to thwart differential power analysis
FR2874440B1 (en) * 2004-08-17 2008-04-25 Oberthur Card Syst Sa METHOD AND DEVICE FOR PROCESSING DATA
CN101147182B (en) * 2005-03-31 2010-09-01 松下电器产业株式会社 Data encryption device and data encryption method
DE602006020010D1 (en) * 2005-12-19 2011-03-24 St Microelectronics Sa Protection of the execution of a DES algorithm
US7647486B2 (en) * 2006-05-02 2010-01-12 Atmel Corporation Method and system having instructions with different execution times in different modes, including a selected execution time different from default execution times in a first mode and a random execution time in a second mode
US7774616B2 (en) * 2006-06-09 2010-08-10 International Business Machines Corporation Masking a boot sequence by providing a dummy processor
US20070288761A1 (en) * 2006-06-09 2007-12-13 Dale Jason N System and method for booting a multiprocessor device based on selection of encryption keys to be provided to processors
US20070288740A1 (en) * 2006-06-09 2007-12-13 Dale Jason N System and method for secure boot across a plurality of processors
US20070288739A1 (en) * 2006-06-09 2007-12-13 Dale Jason N System and method for masking a boot sequence by running different code on each processor
US20070288738A1 (en) * 2006-06-09 2007-12-13 Dale Jason N System and method for selecting a random processor to boot on a multiprocessor system
US7594104B2 (en) * 2006-06-09 2009-09-22 International Business Machines Corporation System and method for masking a hardware boot sequence
US8365310B2 (en) * 2006-08-04 2013-01-29 Yeda Research & Development Co. Ltd. Method and apparatus for protecting RFID tags from power analysis
US8301890B2 (en) * 2006-08-10 2012-10-30 Inside Secure Software execution randomization
US7613907B2 (en) * 2006-08-11 2009-11-03 Atmel Corporation Embedded software camouflage against code reverse engineering
US8321666B2 (en) * 2006-08-15 2012-11-27 Sap Ag Implementations of secure computation protocols
US7984301B2 (en) * 2006-08-17 2011-07-19 Inside Contactless S.A. Bi-processor architecture for secure systems
JP4960044B2 (en) * 2006-09-01 2012-06-27 株式会社東芝 Cryptographic processing circuit and IC card
US7554865B2 (en) * 2006-09-21 2009-06-30 Atmel Corporation Randomizing current consumption in memory devices
JP5203594B2 (en) * 2006-11-07 2013-06-05 株式会社東芝 Cryptographic processing circuit and cryptographic processing method
US7822207B2 (en) * 2006-12-22 2010-10-26 Atmel Rousset S.A.S. Key protection mechanism
EP2000936A1 (en) * 2007-05-29 2008-12-10 Gemplus Electronic token comprising several microprocessors and method of managing command execution on several microprocessors
US8781111B2 (en) * 2007-07-05 2014-07-15 Broadcom Corporation System and methods for side-channel attack prevention
DE102007038763A1 (en) * 2007-08-16 2009-02-19 Siemens Ag Method and device for securing a program against a control flow manipulation and against a faulty program sequence
US8473751B2 (en) * 2007-12-13 2013-06-25 Oberthur Technologies Method for cryptographic data processing, particularly using an S box, and related device and software
FR2925968B1 (en) * 2007-12-26 2011-06-03 Ingenico Sa MICROPROCESSOR SECURING METHOD, COMPUTER PROGRAM AND CORRESPONDING DEVICE
US20090245510A1 (en) * 2008-03-25 2009-10-01 Mathieu Ciet Block cipher with security intrinsic aspects
JP5146156B2 (en) * 2008-06-30 2013-02-20 富士通株式会社 Arithmetic processing unit
US8175265B2 (en) * 2008-09-02 2012-05-08 Apple Inc. Systems and methods for implementing block cipher algorithms on attacker-controlled systems
JP2010288233A (en) * 2009-06-15 2010-12-24 Toshiba Corp Encryption processing apparatus
KR101646705B1 (en) * 2009-12-01 2016-08-09 삼성전자주식회사 Cryptographic device for implementing s-box
US9213835B2 (en) * 2010-04-07 2015-12-15 Xilinx, Inc. Method and integrated circuit for secure encryption and decryption
US8522052B1 (en) 2010-04-07 2013-08-27 Xilinx, Inc. Method and integrated circuit for secure encryption and decryption
US8522016B2 (en) * 2010-06-18 2013-08-27 Axis Technology Software, LLC On-the-fly data masking
KR101665562B1 (en) * 2010-11-05 2016-10-25 삼성전자주식회사 Detection circuit, detecting method thereof, and memory system having the detection Circuit
US20120124669A1 (en) * 2010-11-12 2012-05-17 International Business Machines Corporation Hindering Side-Channel Attacks in Integrated Circuits
TWI422203B (en) * 2010-12-15 2014-01-01 Univ Nat Chiao Tung Electronic device and method for protecting against differential power analysis attack
US8525545B1 (en) 2011-08-26 2013-09-03 Lockheed Martin Corporation Power isolation during sensitive operations
US8624624B1 (en) 2011-08-26 2014-01-07 Lockheed Martin Corporation Power isolation during sensitive operations
GB2494731B (en) 2011-09-06 2013-11-20 Nds Ltd Preventing data extraction by sidechannel attack
US8958550B2 (en) * 2011-09-13 2015-02-17 Combined Conditional Access Development & Support. LLC (CCAD) Encryption operation with real data rounds, dummy data rounds, and delay periods
US8924740B2 (en) * 2011-12-08 2014-12-30 Apple Inc. Encryption key transmission with power analysis attack resistance
TWI464593B (en) * 2012-03-03 2014-12-11 Nuvoton Technology Corp Output input control apparatus and control method thereof
DE102012209404A1 (en) * 2012-06-04 2013-12-05 Robert Bosch Gmbh Apparatus for executing a cryptographic method and method of operation therefor
JP5926655B2 (en) * 2012-08-30 2016-05-25 ルネサスエレクトロニクス株式会社 Central processing unit and arithmetic unit
DE102012018924A1 (en) * 2012-09-25 2014-03-27 Giesecke & Devrient Gmbh Side channel protected masking
JP2014096644A (en) * 2012-11-08 2014-05-22 Mitsubishi Electric Corp Semiconductor integrated circuit and data transfer method
JPWO2014073214A1 (en) * 2012-11-12 2016-09-08 日本電気株式会社 Information processing system and personal information analysis method for analyzing personal information
DE102013100572B4 (en) * 2013-01-21 2020-10-29 Infineon Technologies Ag BUS ARRANGEMENT AND METHOD OF SENDING DATA OVER A BUS
US9755822B2 (en) * 2013-06-19 2017-09-05 Cryptography Research, Inc. Countermeasure to power analysis attacks through time-varying impedance of power delivery networks
FR3011354A1 (en) * 2013-10-01 2015-04-03 Commissariat Energie Atomique METHOD FOR EXECUTING A MICROPROCESSOR OF A POLYMORPHIC BINARY CODE OF A PREDETERMINED FUNCTION
EP2884387B1 (en) * 2013-12-13 2016-09-14 Thomson Licensing Efficient modular addition resistant to side-channel attacks
US9892089B2 (en) * 2014-01-03 2018-02-13 Infineon Technologies Ag Arithmetic logical unit array, microprocessor, and method for driving an arithmetic logical unit array
DE102014001647A1 (en) * 2014-02-06 2015-08-06 Infineon Technologies Ag Operation based on two operands
US9838198B2 (en) * 2014-03-19 2017-12-05 Nxp B.V. Splitting S-boxes in a white-box implementation to resist attacks
TWI712915B (en) 2014-06-12 2020-12-11 美商密碼研究公司 Methods of executing a cryptographic operation, and computer-readable non-transitory storage medium
CN104168266B (en) * 2014-07-21 2018-02-13 苏州大学 A kind of encryption method for taking precautions against lasting leakage attack
CN104698954B (en) * 2015-02-04 2017-04-19 四川长虹电器股份有限公司 Method for controlling using time of electronic product and electronic product for realizing method
JP6467246B2 (en) * 2015-02-26 2019-02-06 株式会社メガチップス Data processing system
US10530566B2 (en) * 2015-04-23 2020-01-07 Cryptography Research, Inc. Configuring a device based on a DPA countermeasure
US9934041B2 (en) * 2015-07-01 2018-04-03 International Business Machines Corporation Pattern based branch prediction
US10489611B2 (en) * 2015-08-26 2019-11-26 Rambus Inc. Low overhead random pre-charge countermeasure for side-channel attacks
DE102016119750B4 (en) * 2015-10-26 2022-01-13 Infineon Technologies Ag Devices and methods for multi-channel scanning
NL2015745B1 (en) * 2015-11-09 2017-05-26 Koninklijke Philips Nv A cryptographic device arranged to compute a target block cipher.
US10789358B2 (en) 2015-12-17 2020-09-29 Cryptography Research, Inc. Enhancements to improve side channel resistance
US10649690B2 (en) * 2015-12-26 2020-05-12 Intel Corporation Fast memory initialization
DE102016201262A1 (en) * 2016-01-28 2017-08-17 Robert Bosch Gmbh Method and device for providing a computer program
EP3203460B1 (en) * 2016-02-05 2021-04-07 Nxp B.V. Secure data storage
CN105871536B (en) * 2016-06-14 2019-01-29 东南大学 A kind of anti-power consumption attack method towards aes algorithm based on random delay
EP3267354A1 (en) * 2016-07-04 2018-01-10 Gemalto Sa Secure loading of secret data to non-protected hardware registers
US10771235B2 (en) * 2016-09-01 2020-09-08 Cryptography Research Inc. Protecting block cipher computation operations from external monitoring attacks
US20180089426A1 (en) * 2016-09-29 2018-03-29 Government Of The United States As Represented By The Secretary Of The Air Force System, method, and apparatus for resisting hardware trojan induced leakage in combinational logics
CN108073837B (en) * 2016-11-15 2021-08-20 华为技术有限公司 Bus safety protection method and device
CN108242993B (en) * 2016-12-26 2020-12-22 航天信息股份有限公司 Method and device for aligning side channel signal and reference signal
WO2018174819A1 (en) 2017-03-20 2018-09-27 Nanyang Technological University Hardware security to countermeasure side-channel attacks
FR3065556B1 (en) * 2017-04-19 2020-11-06 Tiempo ELECTRONIC CIRCUIT SECURE BY DISRUPTION OF ITS POWER SUPPLY.
US10650156B2 (en) * 2017-04-26 2020-05-12 International Business Machines Corporation Environmental security controls to prevent unauthorized access to files, programs, and objects
CN107425976A (en) * 2017-04-26 2017-12-01 美的智慧家居科技有限公司 Key chip system and internet of things equipment
EP3422176A1 (en) * 2017-06-28 2019-01-02 Gemalto Sa Method for securing a cryptographic process with sbox against high-order side-channel attacks
US10678927B2 (en) 2017-08-31 2020-06-09 Texas Instruments Incorporated Randomized execution countermeasures against fault injection attacks during boot of an embedded device
US11082432B2 (en) 2017-12-05 2021-08-03 Intel Corporation Methods and apparatus to support reliable digital communications without integrity metadata
US11586778B2 (en) * 2017-12-07 2023-02-21 Bar-Ilan University Secured memory
CN108804340B (en) * 2018-04-02 2020-09-04 武汉斗鱼网络科技有限公司 Android system data recovery method, storage medium, electronic device and system
GB2578317B (en) * 2018-10-23 2021-11-24 Advanced Risc Mach Ltd Generating a test sequence of code based on a directed sequence of code and randomly selected instructions
CN109800181B (en) * 2018-12-12 2021-05-04 深圳市景阳科技股份有限公司 Disk-based data writing method, data writing device and terminal equipment
CN109947479A (en) * 2019-01-29 2019-06-28 安谋科技(中国)有限公司 Instruction executing method and its processor, medium and system
CN110098916B (en) * 2019-04-08 2021-07-20 武汉大学 High-order side channel analysis method based on software instruction positioning
CN110098799B (en) * 2019-04-30 2022-02-11 西安电子科技大学 Equivalent restart frequency modulation true random number generator and true random number generation method
EP3767849A1 (en) * 2019-07-18 2021-01-20 Nagravision SA A hardware component and a method for implementing a camouflage of current traces generated by a digital system
CN112422272B (en) * 2019-08-20 2022-10-21 深圳市航顺芯片技术研发有限公司 AES encryption method and circuit for preventing power consumption attack
US11604873B1 (en) * 2019-12-05 2023-03-14 Marvell Asia Pte, Ltd. Noisy instructions for side-channel attack mitigation
JP7433931B2 (en) * 2020-01-27 2024-02-20 キヤノン株式会社 Information processing device and its control method and program
US11449642B2 (en) * 2020-09-04 2022-09-20 Arm Limited Attack protection by power signature blurring
CN112417525B (en) * 2020-11-28 2022-03-22 郑州信大捷安信息技术股份有限公司 Side channel attack resisting method for SoC (System on chip) security chip and side channel attack resisting electronic system
US11449606B1 (en) * 2020-12-23 2022-09-20 Facebook Technologies, Llc Monitoring circuit including cascaded s-boxes for fault injection attack protection
US20220416997A1 (en) * 2021-06-24 2022-12-29 Intel Corporation Handling unaligned transactions for inline encryption
US11934327B2 (en) * 2021-12-22 2024-03-19 Microsoft Technology Licensing, Llc Systems and methods for hardware acceleration of data masking using a field programmable gate array
EP4224341A1 (en) * 2022-02-07 2023-08-09 nCipher Security Limited A device and a method for performing a cryptographic algorithm
US11860703B1 (en) * 2022-08-04 2024-01-02 Intel Corporation Code-based technique to mitigate power telemetry side-channel leakage from system buses

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19936939A1 (en) * 1998-09-30 2000-04-06 Philips Corp Intellectual Pty Data processing device and method for its operation to prevent differential power consumption analysis
US6060908A (en) * 1997-08-04 2000-05-09 Siemens Aktiengesellschaft Databus
EP1006492A1 (en) * 1998-11-30 2000-06-07 Hitachi, Ltd. Information processing equipment and IC card
US20020169968A1 (en) * 1999-12-02 2002-11-14 Berndt Gammel Microprocessor configuration with encryption
US20030005321A1 (en) * 2001-06-28 2003-01-02 Shuzo Fujioka Information processing device
US20030110390A1 (en) * 2000-05-22 2003-06-12 Christian May Secure data processing unit, and an associated method
US20030118190A1 (en) * 1998-05-29 2003-06-26 Siemens Aktiengesellschaft Method and apparatus for processing data where a part of the current supplied is supplied to an auxiliary circuit
WO2004053662A2 (en) * 2002-12-12 2004-06-24 Arm Limited Processing activity masking in a data processing system
FR2862150A1 (en) * 2003-11-12 2005-05-13 Innova Card Integrated circuit for performing confidential transaction, has central processing unit, random access memory and read only memory that are connected by data bus that routes encrypted data produced from plain data

Family Cites Families (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US131596A (en) * 1872-09-24 Improvement in plows
US169969A (en) * 1875-11-16 Improvement in wagon-brakes
US273630A (en) * 1883-03-06 Base rocking-chair
JPS63152241A (en) * 1986-12-17 1988-06-24 Fujitsu Ltd Data bus cryptographic system
CA2008071A1 (en) * 1989-01-27 1990-07-27 Jeffrey S. Watters Pump bus to avoid indeterminacy in reading variable bit field
US4978955A (en) * 1989-11-09 1990-12-18 Archive Corporation Data randomizing/de-randomizing circuit for randomizing and de-randomizing data
US5222040A (en) * 1990-12-11 1993-06-22 Nexcom Technology, Inc. Single transistor eeprom memory cell
JPH05151114A (en) * 1991-11-26 1993-06-18 Fujitsu Ltd Information security system for radio
US5572722A (en) * 1992-05-28 1996-11-05 Texas Instruments Incorporated Time skewing arrangement for operating random access memory in synchronism with a data processor
IL106513A (en) * 1992-07-31 1997-03-18 Hughes Aircraft Co Integrated circuit security system and method with implanted interconnections
US6208135B1 (en) * 1994-07-22 2001-03-27 Steve J. Shattil Inductive noise cancellation circuit for electromagnetic pickups
US6014446A (en) * 1995-02-24 2000-01-11 Motorola, Inc. Apparatus for providing improved encryption protection in a communication system
FR2745924B1 (en) 1996-03-07 1998-12-11 Bull Cp8 IMPROVED INTEGRATED CIRCUIT AND METHOD FOR USING SUCH AN INTEGRATED CIRCUIT
US6061451A (en) * 1996-09-03 2000-05-09 Digital Vision Laboratories Corporation Apparatus and method for receiving and decrypting encrypted data and protecting decrypted data from illegal use
US5878135A (en) * 1996-11-27 1999-03-02 Thomson Consumer Electronics, Inc. Decoding system for processing encrypted broadcast, cable or satellite video data
US6076161A (en) 1997-08-25 2000-06-13 National Semiconductor Corporation Microcontroller mode selection system and method upon reset
JP4168209B2 (en) * 1997-12-02 2008-10-22 忠弘 大見 A material in which a fluororesin is formed on the surface of a fluorinated passive film and various devices and parts using the material
JPH11191149A (en) 1997-12-26 1999-07-13 Oki Electric Ind Co Ltd Lsi for ic card and using method therefor
US6298153B1 (en) 1998-01-16 2001-10-02 Canon Kabushiki Kaisha Digital signature method and information communication system and apparatus using such method
WO1999063419A1 (en) * 1998-05-29 1999-12-09 Infineon Technologies Ag Method and device for processing data
IL139935A (en) * 1998-06-03 2005-06-19 Cryptography Res Inc Des and other cryptographic processes with leak minimization for smartcards and other cryptosystems
US6317820B1 (en) * 1998-06-05 2001-11-13 Texas Instruments Incorporated Dual-mode VLIW architecture providing a software-controlled varying mix of instruction-level and task-level parallelism
JP3600454B2 (en) * 1998-08-20 2004-12-15 株式会社東芝 Encryption / decryption device, encryption / decryption method, and program storage medium therefor
DE19845073C2 (en) 1998-09-30 2001-08-30 Infineon Technologies Ag Procedure for securing DES encryption against spying on the keys by analyzing the current consumption of the processor
US6408075B1 (en) 1998-11-30 2002-06-18 Hitachi, Ltd. Information processing equipment and IC card
JP2000187618A (en) * 1998-12-22 2000-07-04 Casio Comput Co Ltd Information processor
US6298135B1 (en) * 1999-04-29 2001-10-02 Motorola, Inc. Method of preventing power analysis attacks on microelectronic assemblies
US6295606B1 (en) * 1999-07-26 2001-09-25 Motorola, Inc. Method and apparatus for preventing information leakage attacks on a microelectronic assembly
JP4233709B2 (en) * 1999-09-30 2009-03-04 大日本印刷株式会社 IC chip and IC card
EP1098469B1 (en) 1999-11-03 2007-06-06 Infineon Technologies AG Coding device
FR2801751B1 (en) * 1999-11-30 2002-01-18 St Microelectronics Sa ELECTRONIC SAFETY COMPONENT
FR2802669B1 (en) 1999-12-15 2002-02-08 St Microelectronics Sa NON-DETERMINED METHOD FOR SECURE DATA TRANSFER
FR2803459B1 (en) 1999-12-30 2002-02-15 Itis TRANSMISSION SYSTEM WITH SPATIAL, TEMPORAL AND FREQUENTIAL DIVERSITY
JP4168305B2 (en) * 2000-01-12 2008-10-22 株式会社ルネサステクノロジ IC card and microcomputer
JP2001266103A (en) * 2000-01-12 2001-09-28 Hitachi Ltd Ic card and microcomputer
JP4310878B2 (en) 2000-02-10 2009-08-12 ソニー株式会社 Bus emulation device
NL1016269C2 (en) 2000-09-26 2002-03-27 Corus Staal Bv Segment of a highway construction, and method for applying it.
US6678707B1 (en) * 2000-10-30 2004-01-13 Hewlett-Packard Development Company, L.P. Generation of cryptographically strong random numbers using MISRs
DE10061998A1 (en) * 2000-12-13 2002-07-18 Infineon Technologies Ag The cryptographic processor
JP3977592B2 (en) * 2000-12-28 2007-09-19 株式会社東芝 Data processing device
US7243117B2 (en) * 2001-02-07 2007-07-10 Fdk Corporation Random number generator and probability generator
US7987510B2 (en) * 2001-03-28 2011-07-26 Rovi Solutions Corporation Self-protecting digital content
JP4009437B2 (en) 2001-05-09 2007-11-14 株式会社ルネサステクノロジ Information processing device
DE10128573A1 (en) * 2001-06-13 2003-01-02 Infineon Technologies Ag Prevent unwanted external detection of operations in integrated digital circuits
US7142670B2 (en) * 2001-08-14 2006-11-28 International Business Machines Corporation Space-efficient, side-channel attack resistant table lookups
JP4045777B2 (en) * 2001-10-30 2008-02-13 株式会社日立製作所 Information processing device
US7194633B2 (en) * 2001-11-14 2007-03-20 International Business Machines Corporation Device and method with reduced information leakage
US7062606B2 (en) * 2002-11-01 2006-06-13 Infineon Technologies Ag Multi-threaded embedded processor using deterministic instruction memory to guarantee execution of pre-selected threads during blocking events
KR100530372B1 (en) 2003-12-20 2005-11-22 삼성전자주식회사 Cryptographic method capable of protecting elliptic curve code from side channel attacks
US7899190B2 (en) * 2004-04-16 2011-03-01 Research In Motion Limited Security countermeasures for power analysis attacks
US7949883B2 (en) * 2004-06-08 2011-05-24 Hrl Laboratories, Llc Cryptographic CPU architecture with random instruction masking to thwart differential power analysis
US7127320B1 (en) * 2004-12-01 2006-10-24 Advanced Micro Devices, Inc. Render-resolve method of obtaining configurations and formatting it for use by semiconductor equipment interfaces
US20060282678A1 (en) * 2005-06-09 2006-12-14 Axalto Sa System and method for using a secure storage device to provide login credentials to a remote service over a network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HOLLMANN H D L ET AL: "Protection of software algorithms executed on secure modules" FUTURE GENERATIONS COMPUTER SYSTEMS, ELSEVIER SCIENCE PUBLISHERS. AMSTERDAM, NL, vol. 13, no. 1, July 1997 (1997-07), pages 55-63, XP004081709 ISSN: 0167-739X *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008146384A (en) * 2006-12-11 2008-06-26 Nec Electronics Corp Information processor and instruction fetch control method
EP2343664A1 (en) * 2009-12-21 2011-07-13 Nxp B.V. Cryptographic device
US11126432B2 (en) 2010-02-11 2021-09-21 Nxp B.V. Computer processor and method with short forward jump instruction inhibiting
US9020148B2 (en) 2010-02-19 2015-04-28 Stmicroelectronics (Rousset) Sas Protection of registers against unilateral disturbances
FR2956764A1 (en) * 2010-02-19 2011-08-26 St Microelectronics Rousset PROTECTION OF REGISTERS AGAINST UNILATERAL DISTURBANCES
EP2369521A1 (en) 2010-02-19 2011-09-28 STMicroelectronics (Rousset) SAS Protection of records against unilateral disruptions
US9558375B2 (en) 2010-02-19 2017-01-31 Stmicroelectronics (Rousset) Sas Protection of registers against unilateral disturbances
CN102523085A (en) * 2011-12-15 2012-06-27 北京握奇数据系统有限公司 Data encryption method, data encrypting device and smart card
US8855309B2 (en) 2012-06-12 2014-10-07 Electronics And Telecommunications Research Institute Apparatus and method for providing security service
EP2675105A1 (en) * 2012-06-12 2013-12-18 Electronics and Telecommunications Research Institute Apparatus and method for providing security service
EP2738974B1 (en) * 2012-11-29 2020-08-12 Spirtech Method for deriving multiple cryptographic keys from a master key in a security microprocessor
FR3060789A1 (en) * 2016-12-19 2018-06-22 Commissariat A L'energie Atomique Et Aux Energies Alternatives METHOD FOR EXECUTING A MICROPROCESSOR OF A POLYMORPHIC MACHINE CODE OF A PREDETERMINED FUNCTION
WO2018115650A1 (en) 2016-12-19 2018-06-28 Commissariat à l'énergie atomique et aux énergies alternatives Method for executing a polymorphic machine code of a predetermined function by a microprocessor
US11157659B2 (en) 2016-12-19 2021-10-26 Commissariat A L'energie Atomique Et Aux Energies Alternatives Method for executing a polymorphic machine code of a predetermined function by a microprocessor
CN108270427A (en) * 2017-01-03 2018-07-10 意法半导体(鲁塞)公司 The device and method being managed for the current drain to integration module
CN108270427B (en) * 2017-01-03 2021-06-15 意法半导体(鲁塞)公司 Apparatus and method for managing current consumption of integrated module

Also Published As

Publication number Publication date
GB2449576A (en) 2008-11-26
JP2012095345A (en) 2012-05-17
US7949883B2 (en) 2011-05-24
GB0816396D0 (en) 2008-10-15
GB2447795B (en) 2009-03-18
US8095993B2 (en) 2012-01-10
GB2451359B (en) 2009-05-20
GB2445652A (en) 2008-07-16
GB2447804B (en) 2009-03-18
JP5283735B2 (en) 2013-09-04
GB0623489D0 (en) 2007-01-03
US20050271202A1 (en) 2005-12-08
US20120144205A1 (en) 2012-06-07
GB2430515B (en) 2008-08-20
JP5414780B2 (en) 2014-02-12
JP2011239461A (en) 2011-11-24
US8296577B2 (en) 2012-10-23
JP2013167897A (en) 2013-08-29
GB0810628D0 (en) 2008-07-16
US20050273631A1 (en) 2005-12-08
GB0814566D0 (en) 2008-09-17
US20070180541A1 (en) 2007-08-02
GB2445652B (en) 2009-02-25
US20050273630A1 (en) 2005-12-08
GB2430515A (en) 2007-03-28
US8065532B2 (en) 2011-11-22
GB0724643D0 (en) 2008-01-30
GB2447804A (en) 2008-09-24
JP2013141323A (en) 2013-07-18
GB2449576B (en) 2009-03-18
JP2008502283A (en) 2008-01-24
GB2451359A (en) 2009-01-28
WO2005124506A3 (en) 2006-05-11
GB0807135D0 (en) 2008-05-21
GB2447795A (en) 2008-09-24

Similar Documents

Publication Publication Date Title
US8095993B2 (en) Cryptographic architecture with instruction masking and other techniques for thwarting differential power analysis
JP2008502283A5 (en)
US8332634B2 (en) Cryptographic systems for encrypting input data using an address associated with the input data, error detection circuits, and methods of operating the same
US9582650B2 (en) Security of program executables and microprocessors based on compiler-architecture interaction
JP2013138496A (en) Method and apparatus for minimizing differential power attacks on processors
WO2001061915A2 (en) Method and system for resistance to statistical power analysis
Ambrose et al. MUTE-AES: A multiprocessor architecture to prevent power analysis based side channel attack of the AES algorithm
CN111046381A (en) Embedded CPU anti-differential power consumption analysis device and method
JP2007328789A (en) Cryptographic system for encrypting input data by using address associated with input data, error detection circuit, and operation method of the same
Lee et al. Processor accelerator for AES
US20220237304A1 (en) Data Processing Device and Method for Processing Secret Data
Clavier Attacking block ciphers
EP3479287B1 (en) Secure loading of secret data to non-protected hardware registers
US11593111B2 (en) Apparatus and method for inhibiting instruction manipulation
US20240193300A1 (en) Data processing device and method for processing secret data
WO2002027478A1 (en) Instruction issue in a processor
WO2002027474A1 (en) Executing a combined instruction
Kim et al. POSTER: Stopping Run-Time Countermeasures in Cryptographic Primitives
WO2002027479A1 (en) Computer instructions
WO2002027476A1 (en) Register assignment in a processor

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
WWE WIPO information: entry into national phase

Ref document number: 0623489.2

Country of ref document: GB

Ref document number: 0623489

Country of ref document: GB

WWE WIPO information: entry into national phase

Ref document number: 2007527677

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

WWW WIPO information: withdrawn in national office

Ref document number: DE

WWE WIPO information: entry into national phase

Ref document number: 11628920

Country of ref document: US

Ref document number: 2007180541

Country of ref document: US

122 EP: PCT application non-entry in European phase
WWP WIPO information: published in national office

Ref document number: 11628920

Country of ref document: US