WO2002054228A1 - Register renaming - Google Patents

Register renaming

Info

Publication number
WO2002054228A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
register
registers
physical
instructions
Prior art date
Application number
PCT/GB2001/004795
Other languages
English (en)
Inventor
Nigel Paul Smart
Michael David May
Hendrik Lambertus Muller
Original Assignee
University Of Bristol
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB0029735A
Application filed by University Of Bristol
Publication of WO2002054228A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838 Dependency mechanisms, e.g. register scoreboarding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838 Dependency mechanisms, e.g. register scoreboarding
    • G06F9/384 Register renaming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/002 Countermeasures against attacks on cryptographic mechanisms
    • H04L9/003 Countermeasures against attacks on cryptographic mechanisms for power analysis, e.g. differential power analysis [DPA] or simple power analysis [SPA]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2207/00 Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/72 Indexing scheme relating to groups G06F7/72 - G06F7/729
    • G06F2207/7219 Countermeasures against side channel or fault attacks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/08 Randomization, e.g. dummy operations or using noise
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/12 Details relating to cryptographic hardware or logic circuitry
    • H04L2209/125 Parallelization or pipelining, e.g. for accelerating processing of cryptographic operations

Definitions

  • DPA provides the most powerful attack using very cheap resources. Many people have started to examine this problem, and S. Chari et al provide a worrying analysis regarding the weakness of AES (Advanced Encryption Standard) algorithms on smart cards; see the article entitled "A Cautionary Note Regarding the Evaluation of AES Candidates on Smart-Cards" in the Second Advanced Encryption Standard Conference, Rome, March 1999.
  • AES Advanced Encryption Standard
  • the present invention seeks to improve tamper resistance according to the third approach, that is, by decorrelating the timing of power traces on successive program executions.
  • Kocher et al also describe two ways of producing the required temporal misalignment: i) introducing random clock signals, and ii) introducing randomness into the execution order.
  • Kocher et al in "Differential Power Analysis" mention that randomising execution order can help defeat DPA, but can lead to other problems if not done carefully.
  • One randomising approach uses the idea of randomised multi-threading at an instruction level using a set of essentially "shadow" registers. This allows auxiliary threads to execute random encryptions, hence hoping to mask the correct encryption operation.
  • the disadvantage is that additional computational tasks are again required and this requires a more complex processor architecture having separate banks of registers, one for each thread.
  • DES Data Encryption Standard
  • block cipher which operates on plaintext blocks of a given size (64-bits) and returns ciphertext blocks of the same size.
  • DES operates on the 64-bit blocks using key sizes of 56 bits.
  • the keys are actually stored as being 64 bits long, but every 8th bit in the key is not used (i.e. bits numbered 7, 15, 23, 31, 39, 47, 55, and 63).
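As an illustration of the key layout just described, the following minimal Python sketch drops every 8th bit of a 64-bit stored key and packs the remaining 56 bits. The function name `effective_key_bits` is hypothetical, and the sketch assumes that bit 0 in the numbering above is the most significant bit of the stored key; the text does not state which end the numbering starts from.

```python
# Bit positions that the text lists as unused. Assumption: position 0 is the
# most significant (leftmost) bit of the 64-bit stored key.
UNUSED_BITS = {7, 15, 23, 31, 39, 47, 55, 63}


def effective_key_bits(stored_key: int) -> int:
    """Return the 56 bits of a 64-bit stored key that are actually used."""
    if not 0 <= stored_key < (1 << 64):
        raise ValueError("stored key must fit in 64 bits")
    result = 0
    for position in range(64):          # walk the key from left to right
        if position in UNUSED_BITS:
            continue                    # skip every 8th bit
        bit = (stored_key >> (63 - position)) & 1
        result = (result << 1) | bit    # append the bit to the packed key
    return result


if __name__ == "__main__":
    # A key whose only set bits are the unused ones carries no key material.
    print(hex(effective_key_bits(0x0101010101010101)))
```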
  • a software randomiser would work at too high a level of abstraction.
  • the randomised multi-threading idea is close to a solution but suffers from increased CPU time and requires a more complex processor with separate banks of registers, one for each thread.
  • a method of issuing instructions to an execution unit in a processor comprising: identifying in an ordered sequence of instructions a set of instructions for which the order of execution is not critical; and selecting instructions in said set for successive execution on a random basis each time the ordered sequence of instructions is executed.
  • the identification of a set of instructions for which the order of execution is not critical relies on establishing instructions which do not have dependencies on other instructions in the sequence.
  • the first mentioned dependency is a real dependency in that the computation of the later instruction cannot proceed until the previous result value has been computed.
  • the second dependency is however a resource constraint. The computation could proceed if a different register was available to hold the result value. It is one aim of the invention to reduce the number of dependencies of the second type in a processor so as to maximise the possibility of issuing instructions in a random sequence to increase non-determinism.
  • a further object of the invention is to increase non-determinism in a processor regardless of whether or not instructions are issued in a random sequence.
  • a method of allocating registers in a processor wherein an instruction set to be executed on the processor defines a set of virtual register identifiers and the processor includes a set of physical registers, the method comprising: reading virtual register identifiers specified in an instruction; and identifying physical registers associated respectively with said virtual register identifiers, said physical registers being used for execution of the instruction, wherein at least one of said physical registers is selected at random from the set of physical registers during said identifying step.
  • a register renaming unit for use in a processor, the register renaming unit comprising: a register mapper which maintains a mapping between a set of virtual register identifiers and a set of physical registers, wherein virtual register identifiers are specified in instructions to be executed by a processor and physical registers are provided in the processor for holding values relating to execution of the instruction; means for allocating selected physical registers from said second set to be used for the virtual register identifiers specified in an instruction; a random selection unit for selecting at least one of said physical registers at random from the set of physical registers; and means for updating the register mapper after random selection of said physical register.
  • a method of executing instructions in a processor comprising: issuing successive instructions from an ordered sequence of instructions, wherein said instructions specify virtual register identifiers identifying registers from a first set of virtual registers; for each instruction, reading said virtual register identifiers and allocating corresponding physical registers from a second set of physical registers provided in the processor for execution of the instruction; wherein at least one of said physical registers is allocated at random from said second set; and executing said instruction using the physical registers which have been allocated to the virtual registers specified in the instruction.
  • Figure 1 shows a block diagram of a generic CPU architecture
  • Figure 2 shows a non-deterministic processor executing two instructions compared to other processors
  • Figure 3 is an example of a random register renaming unit
  • Figure 5 shows an embodiment of the random issue unit
  • Figure 6 shows a flow chart explaining how instructions are issued at random
  • Figure 7 shows an example of a two-input random selection unit
  • Figures 8A and 8B show a generic model and a 16 input random selection unit
  • Figure 9 shows a flow chart describing a method for choosing which random instruction in the issue buffer to execute.
  • FIG. 1 is a block diagram illustrating the standard functional units that make up a pipelined computer system.
  • a program memory 2 contains program instructions, which are addressable at different memory locations.
  • An ADDRESS bus 6 and a DATA bus 4 transfer information to and from the various elements that make up the processor 8.
  • the system contains an instruction fetch unit 16 having a program counter 12 that stores the address of the next instruction to be fetched. For sequential execution of instructions the program counter will normally be incremented by a single addressing unit. However, if a branch instruction is encountered, the program flow is broken and the program counter 12 needs to be loaded with the address of a target instruction (that is, the first instruction of the branch sequence).
  • the instructions are fetched from the program memory and stored in an instruction issue buffer 14.
  • the program counter referred to herein is used to control instruction fetches from memory. There may also be an execution counter which is used by the execution unit 18 to specify which instruction is currently being executed.
  • the instructions are decoded and supplied to relevant execution units. In this example, only one execution unit 18 or pipeline is shown, however the present invention is intended to be used in conjunction with modern processors which may have several execution units allowing parallel execution paths. Encryption algorithms need a substantial level of computational power and modern processor architectures such as superscalar, VLIW (Very Long Instruction Word) and SIMD (Single Instruction Multiple Data) are ideally suited to the present invention.
  • the results of the operations are written back by a result write stage 22 into temporary registers of a register file 20, which is used to load and store data in and out of main memory.
  • the present invention is concerned mainly with the register renaming unit denoted by the reference numeral 100. Also, the present description deals with a modified issue buffer 14 which will be described in more detail later.
  • the issue buffer generates an instruction fetch signal 13 to control which instructions are supplied from the fetch unit 10. Furthermore, part of the decode circuitry may be used to decode the instruction dependencies. This will also be described in more detail at a later stage.
  • the processor of the described embodiment is a non-deterministic processor. Non-deterministic processing as described herein means that for successive runs of the program, although the result will be the same, the order of execution of the instructions will be random. This reduces the impact of a DPA-type attack in that the power traces resulting from successive program runs will be different.
  • Figure 2 serves to highlight the differences between a non-deterministic processor and other known processors when executing a simple program consisting of the following two lines of code:
  • the non-deterministic processor allows the instructions to be executed in any order provided that it has been established that the instructions are independent. So in the first cycle either the ADD or the XOR instruction can be carried out and in the second cycle the other instruction will be executed.
  • the standard processor executes instructions sequentially and although there is a little "out of order" execution to help with branch prediction, this occurs on a small scale. In any event, in such a processor each time a program is run containing a certain sequence of instructions, the execution sequence will be identical.
  • Although the Pentium processor has a plurality of execution units (A) and (B), which execute the independent instructions in parallel, the processor is still deterministic in that the ADD and the XOR instructions are executed concurrently in pipes (A) and (B).
  • a slightly more complex code sequence comprising eight instructions is shown in Table 1.
  • the non-deterministic processor described herein makes use of the fact that in many code sequences a number of instructions are independent and thus can, in theory, be executed in any order. This is exploited by executing the instructions in a random order at run time. This causes the access patterns to memory for either data or instructions to be uncorrelated for successive program executions, and thus causes the power trace to be different each time.
  • Reference numeral 100 in Figure 1 denotes a register renaming unit which is located in the fetch stage 10 of the processor.
  • the detailed construction of the register renaming unit 100 is shown in Figure 3.
  • the register renaming unit 100 allows registers to be renamed during execution of a program.
  • a virtual register set is defined, which are the registers specified by a particular instruction set. These are referred to herein as R1, R2 etc. Exemplary code sequences therefore use this notation to denote virtual registers.
  • Registers in the register file 20, that is the so-called physical registers, are specified herein using Reg1, Reg2 etc. It is these physical registers which are the ones actually used by the processor for execution of instructions. To implement the present invention, the number of physical registers should be greater than the number of virtual registers.
  • the register renaming unit 100 comprises a translation look-up table 102 which maintains a mapping from virtual to physical registers.
  • the translation look-up table 102 contains an index for every virtual register specified in the instruction set.
  • One such fetched instruction is labelled 212 in Figure 3, supplied along an instruction fetch path 214. Using this index it is possible to locate a physical register associated with the specified virtual register.
  • Physical registers can be pre-allocated to virtual registers, or can be allocated (or reallocated) on the fly as discussed more particularly in the following in relation to destination registers.
  • a random selection unit 106 selects physical registers at random using a random number generator 108 as described more fully in the following.
  • the random register renaming unit 100 is operative in relation to each instruction 212 which is fetched by the fetch unit 12. These instructions define virtual registers R1, R2 etc.
  • Each instruction 212 is labelled as 212 in Figure 3.
  • the instruction 212 has an opcode field OPC, a first source operand field VSR0, a second source operand field VSR1 and a destination field VDR.
  • the virtual source operand fields VSR0, VSR1 (of course taking the form of virtual register identifiers R1, R2, etc) are supplied as indexes into the register translation look-up table 102.
  • the set of 'free' registers is determined using an issue window matrix 200 whose purpose is to keep track of the physical registers being used by the instructions.
  • the issue window matrix 200 has I rows, where each row of the matrix corresponds to an instruction, and 2^p columns, where each column corresponds to a physical register (i.e. Reg1, Reg2, etc).
  • a matrix of size I x 2^p maintains for each instruction in the issue window which physical registers that instruction uses.
  • a one in element (i, p) in the matrix indicates that physical register p will be used by instruction i.
  • a zero in element (i, p) indicates that register p will not be used by instruction i.
  • a flag bit corresponding to the relevant matrix element is set to one whereas if the physical register is not being used then the flag bit is a zero.
  • Figure 3 also illustrates a set of physical register usage flags (PRUF) labelled as row 201. These can be implemented as an extra row associated with the issue window matrix (as illustrated) or separately. These flags represent which physical register may be used by future instructions. A "1" in column (p) indicates that register p may be used by future instructions. A "0" in column (p) indicates that register p cannot be used by future instructions.
  • PRUF physical register usage flags
  • the flag bits associated with each of the physical registers used by the renamed instruction 210 will be set 202 in the issue window matrix 200.
  • an instruction with renamed physical registers is stored in location 202 of the issue window 200 such that the bits in row 202 that are set to one correspond to the instruction's physical source and destination registers.
  • the physical register usage flags 201 are updated by clearing the bit associated with the old mapping of the destination register along path 221, and by setting the bit associated with the new mapping of the destination register.
  • An OR gate performs a logical OR operation across all the flag bits of a particular column of the issue window matrix 200.
  • Each column corresponds to a physical register such that if the OR'ed result for a column is zero then this indicates that the relevant physical register is not being used by any instructions.
  • the result is checked such that if a one is output then the physical register is being used by at least one instruction, whereas if the result is a zero this indicates that the corresponding physical register is free and is available for selection by the random selection unit 106.
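The column-wise OR described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `free_registers`, `issue_window` and `usage_flags` are hypothetical, registers are indexed from 0 rather than named Reg1, Reg2, and the PRUF row 201 is treated as one more row of the matrix feeding the same OR, which is one reading of the "extra row" remark rather than something the text states outright.

```python
def free_registers(issue_window, usage_flags):
    """Return the indices of physical registers whose column ORs to zero.

    issue_window: list of rows, one per instruction; each row holds one
                  0/1 flag per physical register.
    usage_flags:  the PRUF row, one 0/1 flag per physical register
                  (assumed here to take part in the same column-wise OR).
    """
    # zip(...) yields one tuple per column: (PRUF bit, row 0 bit, row 1 bit, ...)
    columns = zip(usage_flags, *issue_window)
    return [p for p, column in enumerate(columns) if not any(column)]


if __name__ == "__main__":
    window = [
        [1, 1, 0, 0, 0, 0],   # instruction 0 uses registers 0 and 1
        [0, 0, 0, 1, 0, 0],   # instruction 1 uses register 3
    ]
    pruf = [1, 0, 1, 0, 0, 0]  # registers 0 and 2 are current mappings
    print(free_registers(window, pruf))   # -> [4, 5]
```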
  • the random selection unit 106 uses the random number generator 108 to randomly select one of the set of physical registers marked as being free. Two processes then occur:
  • the translation look-up table 102 is updated to include an identification along path 220 of the newly mapped randomly selected physical register as corresponding to the virtual destination register identified in the instruction;
  • the destination virtual register as specified in the fetched instruction is renamed by supplying the identifier along path 222 to specify the randomly selected physical register.
  • the flag bits of all the physical registers of the row corresponding to the completed instruction in the issue window table are marked as free 204 (i.e. set to zero).
  • a preferred solution would be to reset (i.e. set to zero) the flag bits corresponding to the source registers when the source registers have been read by the execution unit 18, and to reset the destination register when the result value has been written.
  • the operands of the fetched instruction that originally identified virtual registers have been renamed to specify physical register identifiers 210.
  • the instruction 210 with renamed registers is then sent via path 208 to the issue buffer 14. It should be understood that the non-deterministic nature of the processor can be increased if the instructions in the issue buffer can be executed at random using a random issue unit as described in more detail later with reference to Figure 5. Alternatively, these instructions can be executed normally using a conventional processor architecture where the issue unit does not execute instructions in a random order.
  • step S100 denotes the initial mapping step.
  • Step S102 denotes the maintenance of the used bits, where the row of matrix 200 associated with a completed instruction is cleared and the flag bits of the matrix 200 are set for the physical source and destination registers used. This includes setting the flag bits in row 201 relating to the use of future instructions.
  • Step S104 denotes the final updating of the register translation look-up table 102.
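The renaming steps described above (initial mapping, look-up of the source registers, random choice of a free destination register, and the table and flag updates) might be modelled as follows. This is a behavioural sketch, not the hardware of Figure 3: the class `RandomRenamer`, its method names and the use of Python sets in place of the bit matrix are all assumptions made for illustration, and registers are indexed from 0.

```python
import random


class RandomRenamer:
    """Toy model of a random register renaming unit (hypothetical names)."""

    def __init__(self, num_virtual, num_physical, rng=None):
        if num_physical <= num_virtual:
            raise ValueError("need more physical than virtual registers")
        self.rng = rng or random.Random()
        self.num_physical = num_physical
        # Initial mapping (S100): virtual register v -> physical register v.
        self.table = list(range(num_virtual))
        # Stand-in for the issue window matrix: tag -> registers in use.
        self.window = {}

    def free_registers(self):
        busy = set(self.table)             # current mappings (role of the PRUF row)
        for regs in self.window.values():  # registers used by pending instructions
            busy |= regs
        return [p for p in range(self.num_physical) if p not in busy]

    def rename(self, tag, opcode, vsr0, vsr1, vdr):
        # Sources are translated through the look-up table first.
        psr0, psr1 = self.table[vsr0], self.table[vsr1]
        free = self.free_registers()
        if not free:
            raise RuntimeError("no free physical register (a real unit would stall)")
        pdr = self.rng.choice(free)            # random destination register
        self.table[vdr] = pdr                  # update the look-up table (S104)
        self.window[tag] = {psr0, psr1, pdr}   # set the flag bits (S102)
        return (opcode, psr0, psr1, pdr)

    def complete(self, tag):
        del self.window[tag]                   # clear the completed instruction's row


if __name__ == "__main__":
    unit = RandomRenamer(num_virtual=4, num_physical=8)
    print(unit.rename("i0", "ADD", 0, 1, 2))
    print(unit.rename("i1", "XOR", 2, 3, 0))
```

Running the example twice will usually print different destination registers, which is the point of the technique: the same virtual code maps to different physical registers on each run.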
  • a renaming unit will check for dependencies and rename registers at random, so that for the example given above the code sequence can be renamed in any of the ways shown below.
  • Random register renaming has another advantage. Power dissipation of overwriting a register from one value with another depends on both data values (that is the number of bits which need to be flipped). If it is not predetermined how registers are allocated, then the power dissipation between write-backs into the register file is not deterministic.
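The data dependence mentioned above can be made concrete by counting the bits that differ between the old and new register contents. The sketch below uses that count (the Hamming distance) as a rough proxy for the switching activity of a write; the function name `bits_flipped` is hypothetical and the link between bit flips and measured power is taken from the text, not modelled here.

```python
def bits_flipped(old_value: int, new_value: int) -> int:
    """Count the bit positions that change when new_value overwrites old_value."""
    return bin(old_value ^ new_value).count("1")


if __name__ == "__main__":
    result = 0b1111
    # The same result written into two different physical registers flips a
    # different number of bits, depending on what each register held before.
    print(bits_flipped(0b0000, result))   # -> 4
    print(bits_flipped(0b1110, result))   # -> 1
```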
  • the random register renaming technique described herein can be used by itself to improve non-determinism in a processor. Alternatively, it can be used in conjunction with the random issue of instructions so that the two effects together provide a synergistic result in increasing tamper-resistance.
  • the random issue of instructions is now described in the following. It will be appreciated that the instruction which is supplied to the random issue unit discussed below is the renamed instruction 210 of Figure 3.
  • FIG. 5 shows an example of the implementation of a random issue unit.
  • the random issue unit comprises an instruction table 32 with an associated dependency matrix table 30. Instructions are prefetched into the instruction table 32 using conventional instruction fetch circuitry.
  • the dependency matrix table has slots and columns, where the slots represent bit-masks associated with each instruction in the instruction table 32.
  • the bit-masks or dependency bits are an indication as to whether an instruction has a dependency on another instruction. Broadly speaking there are two types of dependencies that need to be considered for an instruction:
  • the Used and Defined Register tables 34, 36 shown in Figure 5 each comprise a number of rows and columns. Each row corresponds to a register (or operand) and each column corresponds to a particular slot (or instruction) in the instruction issue table 32. Each register comprises a plurality of slots corresponding to the number of instructions in the instruction table 32 and is the so-called bit-mask for a register.
  • the bit-mask for a register is a binary stream where a "1" indicates which instruction has a dependency on that register.
  • each table has five rows corresponding to registers Reg1 to Reg5, i.e. Reg1 corresponds to the top row and Reg5 to the bottom row.
  • the processor performs a logical OR operation 38 of the bit mask of the Used Registers table 34 and the Defined Registers table 36 thereby creating a new bit-mask stored in a free slot of the dependency matrix 30.
  • a test can be performed by OR-ing with OR gates 40 each of the dependency bits of a slot of the dependency matrix. If all the dependency bits of a slot associated with a particular instruction are set to zero, then the instruction can be executed and a FIRE signal 42 is generated to the Random Selection Unit 44. Given the result of the OR for each row of the table, a number of zeros (indicating instructions to be executed) and a number of ones (indicating instructions that are blocked) are obtained. The random selection unit 44 selects one of the slots which is indicated at value zero, at random, and causes that instruction to be executed next. In the described embodiment, the dependency bits are overwritten with new values when the dependencies of the next instruction are loaded into the matrix.
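The selection step just described can be sketched as follows: OR the dependency bits of each slot, collect the slots whose result is zero, and pick one of them at random. The names `ready_slots`, `issue_one` and the `valid` list are hypothetical, Python's `random` module stands in for the random selection unit 44, and clearing the issued slot's column follows the later remark that the column of the issued instruction is cleared.

```python
import random


def ready_slots(dependency_matrix, valid):
    """Slots holding an instruction whose dependency bits OR to zero."""
    return [slot for slot, bits in enumerate(dependency_matrix)
            if valid[slot] and not any(bits)]


def issue_one(dependency_matrix, valid, rng):
    """Pick one ready slot at random, or return None if every slot is blocked."""
    ready = ready_slots(dependency_matrix, valid)
    if not ready:
        return None
    chosen = rng.choice(ready)
    for bits in dependency_matrix:
        bits[chosen] = 0       # nothing depends on the issued instruction any more
    valid[chosen] = False      # the slot is now empty
    return chosen


if __name__ == "__main__":
    # Slot 1 depends on slot 0; slots 0 and 2 are independent.
    matrix = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
    valid = [True, True, True]
    rng = random.Random()
    order = []
    while (slot := issue_one(matrix, valid, rng)) is not None:
        order.append(slot)
    print(order)   # varies between runs, but 0 always precedes 1
```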
  • the random issue unit supplies an instruction to be executed from the instruction table 32 along instruction supply path 50 and loads an instruction into the instruction table 32 along instruction load path 52 at the same time.
  • Figure 6 is a flow chart indicating how the instructions in the instruction issue buffer 14 are issued for execution and loaded concurrently.
  • the load operations are represented by the left branch flow (C), while the issue operations are represented by the right branch flow (D).
  • the left branch flow (C) of figure 6 relates to an instruction load operation starting at step S1 where the next instruction, specified by the program counter 12, is loaded into the instruction table 32 of the issue buffer 14.
  • the load operations will firstly be described in general terms, and then more specifically in relation to one example.
  • Each instruction defines two source operands 54 and a destination operand 56. These will nearly always be defined as registers although that is not necessary. Direct addresses or immediates are possible.
  • the source and destination operands 54,56 are simultaneously decoded.
  • the decoded information is translated into bit-masks that are set in the Used Registers and Defined Registers tables 34,36. These bit-masks are OR-ed by OR gate 38 (Figure 5) to create dependency bits indicating on which instructions the loaded instruction depends.
  • the empty slot E associated with the loaded instruction is then selected for replacement by setting the InValid flag 58 to zero.
  • the dependency bits are loaded into the selected slot E of the dependency matrix.
  • the bit-masks in column E of the Registers Used and Registers Defined tables 34,36 are set to "1" along path 62 for the corresponding rows of these tables to ensure that future instructions that use those registers are going to wait for the instruction to finish.
  • the Used and Defined Register tables 34, 36 are set-up during the instruction fetch or LOAD sequence, as already indicated.
  • the fetched instruction is decoded and the bit-masks associated with each of the registers specified in the instruction are checked for dependencies with other instructions. For example, assume the instruction: ADD Reg2, Reg3, Reg4 is fetched.
  • the bit masks associated with the registers Reg2 and Reg3 in the Used Registers table 34 (i.e. the source registers) are sent to the OR gate 38.
  • the bit mask associated with register Reg4 in the Defined Registers table 36 (i.e. the destination register) is also sent to the OR gate 38.
  • each bit mask has N slots where each slot corresponds to a particular instruction.
  • the OR gate 38 receives the bit-masks and performs a bit-wise logical OR operation for each slot simultaneously. For example, assume the following bit-masks exist:
  • the first step includes simultaneously performing a second OR operation 40 across all the dependency bits for each slot of the dependency matrix 30 to determine which instructions have no dependencies. For the example, a "1" set in the third bit of the dependency mask for the instruction in question means that the OR'ed result will be a "1". Therefore this instruction still has dependencies at this stage and cannot be fired to the random selection unit 44.
  • the final step is to set the appropriate bit masks associated with the currently loaded instruction.
  • the appropriate bit-masks being the registers that cannot be used by future instructions until the current instruction has been issued.
  • register Reg4 in the Used Registers table 34 for the present instruction column is set to "1" to inform all future instructions that Reg4 cannot be used as a source register (i.e. read from), because the present instruction uses this as a destination register (i.e. write to).
  • registers Reg2 and Reg3 are source registers for the present instruction and thus these registers are set to "1" in the Defined Registers table 36 to indicate that these registers cannot be written to until the present instruction has completed.
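The load sequence for ADD Reg2, Reg3, Reg4 can be traced with a small Python sketch. It models only the two table updates the text describes, and everything else is an assumption for illustration: the function name `load_instruction`, the dictionary layout of the tables, the choice of four slots, and the starting bit-masks, which are invented values and not those of the original example.

```python
def load_instruction(used, defined, sources, dest, slot):
    """Load an instruction into `slot` and return its dependency bits.

    used / defined: dict mapping a register name to a list of 0/1 flags,
                    one flag per slot of the instruction table.
    """
    num_slots = len(used[dest])
    deps = [0] * num_slots
    # OR together the Used Registers masks of the source registers ...
    for reg in sources:
        deps = [d | bit for d, bit in zip(deps, used[reg])]
    # ... and the Defined Registers mask of the destination register.
    deps = [d | bit for d, bit in zip(deps, defined[dest])]
    # Future instructions may not read the destination until this one is done.
    used[dest][slot] = 1
    # Future instructions may not overwrite the sources until this one is done.
    for reg in sources:
        defined[reg][slot] = 1
    return deps


if __name__ == "__main__":
    regs = ["Reg1", "Reg2", "Reg3", "Reg4", "Reg5"]
    used = {r: [0, 0, 0, 0] for r in regs}
    defined = {r: [0, 0, 0, 0] for r in regs}
    used["Reg2"][2] = 1    # invented state: the instruction in slot 2 writes Reg2
    deps = load_instruction(used, defined, ["Reg2", "Reg3"], "Reg4", slot=0)
    print(deps)            # -> [0, 0, 1, 0]: blocked by the instruction in slot 2
```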
  • the right branch flow (D) of Figure 6 relates to random instruction issue starting at S1 where the dependency bits associated with each instruction are checked using an OR operation via OR gate 40. Then all of the independent instructions are flagged as ready for issue and appropriate fire signals are sent to the Random Selection Unit.
  • the Random Selection Unit 44 selects one of the instructions 46 for example the instruction X, which is issued along instruction supply path 50 to the relevant execution unit.
  • column X is then cleared (i.e.
  • step S4 a pointer E is initialised for the next iteration.
  • E is a pointer that points to an empty slot which is available in the issue table. After every instruction has been loaded, E must point to another free slot. One could, for example, use the instruction previously executed to initialise E. In that way, the pointer E would follow the executed instructions around the table.
  • Figure 7 represents a two input example of how a random selection unit 44 may be implemented.
  • the truth table for the random selector is shown below:
  • Figure 7 shows two inputs 70 and 72 for the random selection unit 44. It should be apparent from Figure 5 that each input I0 or I1 will be either a '0' or a '1'. More generally, a '0' will appear if all of the dependency bits of the relevant slot are '0'. Thus, a '0' indicates an independent instruction, which can be selected by the Random Selection Unit 44. An inspection of truth table 2 reveals that if one of the inputs is a '1', then the output 46 of the random selector will always take the logical value of the other input. Input I1 is shown coupled to an AND gate 76 through an inverting element 75. The AND gate 76 accepts two other inputs, i.e. a random signal R 80 and an enable signal E 78. The output of the AND gate is OR-ed 74 with input I0 to produce the selected output 46 of the random selection unit 44.
  • the random signal R does not have to be truly random. It could typically be generated using a pseudo-random generator that is reseeded regularly with some entropy.
  • the enable signal 78 allows random issue to be disabled, i.e. non-determinism can be turned off, for example to allow a programmer to debug code by stepping through the instructions.
  • Figures 8A and 8B show a slightly more complex example of a random selection unit having 16 inputs.
  • a 16 input random issue unit can be provided by adapting the simple two input structure shown in Figure 7 and connecting it in a cascaded structure.
  • Figure 8A shows a generalised stage of one of the random selection units. The inputs run from I0 up to the input with index 2^(k+1) - 1. The generalised stage can be applied to the 16 input random selector shown by Figure 8B.
  • the 16 inputs are divided in half with the even inputs I0, I2, ... I14 being input to a first multiplexer 82 and the odd inputs I1, I3, ... I15 being input to a second multiplexer 84.
  • Each multiplexer selects 1 output from 2^k inputs (i.e. 8:1 in the final stage) and each multiplexer accepts control signals from the lower stages A0 ... A(k-1) (i.e. A0, A1, A2 in the final stage). This is confirmed by the diagram on the right, which shows the selected signals from the lower stages being fed back into the higher stages. Then the relevant stage behaves as the two input model shown in Figure 7.
  • Figure 9 is a flow chart illustrating a method to choose which instruction in the instruction buffer to execute.
  • the issue buffer is assigned the symbol B.
  • step S13a issues this instruction to the relevant execution unit and the program sequence is completed i.e. EXIT. If however, there is more than one instruction in the buffer, step S13b involves dividing the buffer into two sets of roughly equal size and assigning the symbols L and R respectively. Then at S14, the instructions within the L buffer are examined to see if any independent instructions can be issued. If not, step S15b sets the active issue buffer B to look at buffer R and the process is repeated from step S12.
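
The gate arrangement described for the two-input selector of Figure 7 can be written out as a small sketch. The function name `random_select_2` is hypothetical. The code assumes that the inverted input is I1, and that the output is read as the index (0 or 1) of the selected slot; neither point is stated outright in the text above.

```python
def random_select_2(i0: int, i1: int, r: int, enable: int = 1) -> int:
    """Two-input random selector sketch: returns I0 OR (NOT I1 AND R AND E).

    i0, i1: dependency flags of the two slots (0 = independent, 1 = waiting).
    r:      random bit (signal R 80); enable: enable bit (signal E 78).
    Assumption: the result is interpreted as the index of the chosen slot.
    """
    for bit in (i0, i1, r, enable):
        if bit not in (0, 1):
            raise ValueError("all inputs must be 0 or 1")
    inverted_i1 = i1 ^ 1                  # inverting element 75
    and_out = inverted_i1 & r & enable    # AND gate 76
    return i0 | and_out                   # OR gate 74 drives output 46


# Enumerate every input combination with random issue enabled.
for i0 in (0, 1):
    for i1 in (0, 1):
        outs = sorted({random_select_2(i0, i1, r) for r in (0, 1)})
        print(f"I0={i0} I1={i1} -> possible outputs {outs}")
```

Under these assumptions, the random bit only matters when both slots are independent. When exactly one slot is independent, that slot's index is produced whatever R is, and with the enable bit at 0 the choice between two independent slots always falls on slot 0. When both inputs are '1' the output is 1 although neither slot is ready, so a caller would need a separate check that at least one instruction is independent.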
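
The buffer-halving method of Figure 9 might look roughly like the following sketch. The excerpt above only states what happens when the left half L holds no independent instruction. The handling of the other cases (take L when R has nothing to issue, otherwise choose a half at random) is an assumption, as are the name `choose_instruction` and the representation of the buffer as a list of booleans in which `True` marks an independent instruction.

```python
import random


def choose_instruction(ready, rng=random):
    """Return the index of one independent instruction, or None if there is none.

    ready: list of booleans, True = independent (assumed representation).
    rng:   any object with a random() method returning a float in [0, 1).
    """
    if not any(ready):
        return None
    lo, hi = 0, len(ready)            # active buffer B is ready[lo:hi]
    while hi - lo > 1:                # more than one instruction left in B
        mid = (lo + hi) // 2          # split B into halves L and R
        left_ok = any(ready[lo:mid])
        right_ok = any(ready[mid:hi])
        if not left_ok:               # nothing issuable in L: continue with R
            lo = mid
        elif not right_ok:            # assumed: nothing issuable in R, take L
            hi = mid
        elif rng.random() < 0.5:      # assumed: both usable, pick one at random
            hi = mid
        else:
            lo = mid
    return lo                         # single remaining instruction is issued


print(choose_instruction([False, True, False, False]))
```

The loop keeps the invariant that the active range always contains at least one independent instruction, so the single slot left at the end can be issued.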

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Advance Control (AREA)

Abstract

The invention relates to a random register renaming unit for renaming the virtual registers associated with fetched instructions, using an instruction set that corresponds to the physical registers, for execution on the processor. The source registers specified in each fetched instruction are used as an index into a translation look-up table, which assigns a physical register to each source register. For the destination register specified in each fetched instruction, a random selection unit assigns an available physical register to the destination register and updates the translation look-up table accordingly. Renaming the registers specified by each fetched instruction increases the level of non-determinism of the processor; this level can be increased further when the issue unit issues the register-renamed instructions for execution in a random order.
PCT/GB2001/004795 2000-12-06 2001-10-30 Register renaming WO2002054228A1 (fr)
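
The renaming scheme summarised in the abstract can be illustrated with a minimal software sketch. It is not the hardware unit of the application. The class name `RandomRenamer`, the identity mapping used at start-up and the immediate release of the previously mapped physical register are all assumptions made to keep the example short.

```python
import random


class RandomRenamer:
    """Toy model: virtual registers mapped to physical ones via a look-up table."""

    def __init__(self, num_virtual, num_physical, rng=None):
        if num_physical <= num_virtual:
            raise ValueError("need at least one spare physical register")
        self.rng = rng or random.Random()
        # Assumed start-up state: virtual register i lives in physical register i.
        self.table = list(range(num_virtual))
        self.free = set(range(num_virtual, num_physical))

    def rename(self, dest, sources):
        """Rename one instruction; returns (physical dest, physical sources)."""
        # Source registers index the translation table before it is updated.
        phys_sources = [self.table[s] for s in sources]
        # The destination gets a randomly chosen available physical register.
        new_phys = self.rng.choice(sorted(self.free))
        self.free.remove(new_phys)
        old_phys = self.table[dest]
        self.table[dest] = new_phys
        # Simplification: the old register is released at once. Real hardware
        # would wait until no in-flight instruction still needs its value.
        self.free.add(old_phys)
        return new_phys, phys_sources


renamer = RandomRenamer(num_virtual=4, num_physical=8, rng=random.Random(0))
print(renamer.rename(dest=1, sources=[1, 2]))  # e.g. r1 = r1 op r2
```

Because the destination is drawn at random from the free set, two runs of the same instruction sequence with different seeds generally place results in different physical registers, which is the source of non-determinism the abstract refers to.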

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
GB0029735A GB0029735D0 (en) 2000-12-06 2000-12-06 Register renaming
GB0029735.8 2000-12-06
GB0102467A GB0102467D0 (en) 2000-12-06 2001-01-31 Register Renaming
GB0102467.8 2001-01-31

Publications (1)

Publication Number Publication Date
WO2002054228A1 (fr) 2002-07-11

Family

ID=26245387

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2001/004795 WO2002054228A1 (fr) 2000-12-06 2001-10-30 Register renaming

Country Status (1)

Country Link
WO (1) WO2002054228A1 (fr)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2489405A (en) * 2011-03-22 2012-10-03 Advanced Risc Mach Ltd Data storage circuitry generating encryption key based on physical data storage location
WO2013079912A1 (fr) * 2011-12-02 2013-06-06 Arm Limited Data processing apparatus and method for performing register renaming without additional registers
WO2013079911A1 (fr) * 2011-12-02 2013-06-06 Arm Limited Register renaming data processing apparatus and method for performing register renaming
EP2860656A2 (fr) 2013-10-01 2015-04-15 Commissariat à l'Énergie Atomique et aux Énergies Alternatives Method of execution by a microprocessor of a polymorphic binary code of a predetermined function
US20170083355A1 (en) * 2015-09-22 2017-03-23 Qualcomm Incorporated Dynamic register virtualization
WO2018115650A1 (fr) 2016-12-19 2018-06-28 Commissariat à l'énergie atomique et aux énergies alternatives Method of execution by a microprocessor of a polymorphic machine code of a predetermined function

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5745726A (en) * 1995-03-03 1998-04-28 Fujitsu, Ltd Method and apparatus for selecting the oldest queued instructions without data dependencies
WO2000007097A1 (fr) * 1998-07-31 2000-02-10 Advanced Micro Devices, Inc. Processor configured to selectively free physical registers upon retirement of instructions
US6138230A (en) * 1993-10-18 2000-10-24 Via-Cyrix, Inc. Processor with multiple execution pipelines using pipe stage state information to control independent movement of instructions between pipe stages of an execution pipeline

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6138230A (en) * 1993-10-18 2000-10-24 Via-Cyrix, Inc. Processor with multiple execution pipelines using pipe stage state information to control independent movement of instructions between pipe stages of an execution pipeline
US5745726A (en) * 1995-03-03 1998-04-28 Fujitsu, Ltd Method and apparatus for selecting the oldest queued instructions without data dependencies
WO2000007097A1 (fr) * 1998-07-31 2000-02-10 Advanced Micro Devices, Inc. Processor configured to selectively free physical registers upon retirement of instructions

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FORREST ET AL: "Building diverse computer systems", OPERATING SYSTEMS, 1997., THE SIXTH WORKSHOP ON HOT TOPICS IN CAPE COD, MA, USA 5-6 MAY 1997, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 5 May 1997 (1997-05-05), pages 67 - 72, XP010226847, ISBN: 0-8186-7834-8 *
LEIBHOLZ D ET AL: "THE ALPHA 21264: A 500 MHZ OUT-OF-ORDER EXECUTION MICROPROCESSOR", PROCEEDINGS OF IEEE COMPCON '97. SAN JOSE, FEB. 23 - 26, 1997, LOS ALAMITOS, IEEE COMP. SOC. PRESS, US, 23 February 1997 (1997-02-23), pages 28 - 36, XP000751757, ISBN: 0-8186-7805-4 *
SRINIVASAN S ET AL: "ON THE USE OF PSEUDORANDOM SEQUENCES FOR HIGH SPEED RESOURCE ALLOCATORS IN SUPERSCALAR PROCESSORS", IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN: VLSI IN COMPUTERS & PROCESSORS, ICCD '99, 10 October 1999 (1999-10-10) - 13 October 1999 (1999-10-13), AUSTIN, TX,US, pages 124 - 130, XP001004645 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2489405A (en) * 2011-03-22 2012-10-03 Advanced Risc Mach Ltd Data storage circuitry generating encryption key based on physical data storage location
JP2012199922A (ja) * 2011-03-22 2012-10-18 Arm Ltd Encrypting and storing confidential data
GB2489405B (en) * 2011-03-22 2018-03-07 Advanced Risc Mach Ltd Encrypting and storing confidential data
US9280675B2 (en) 2011-03-22 2016-03-08 Arm Limited Encrypting and storing confidential data
US9201656B2 (en) 2011-12-02 2015-12-01 Arm Limited Data processing apparatus and method for performing register renaming for certain data processing operations without additional registers
CN103988174A (zh) * 2011-12-02 2014-08-13 Arm Limited Data processing apparatus and method for performing register renaming without additional registers
US8914616B2 (en) 2011-12-02 2014-12-16 Arm Limited Exchanging physical to logical register mapping for obfuscation purpose when instruction of no operational impact is executed
CN103988462A (zh) * 2011-12-02 2014-08-13 Arm Limited Register renaming data processing apparatus and method for performing register renaming
WO2013079911A1 (fr) * 2011-12-02 2013-06-06 Arm Limited Register renaming data processing apparatus and method for performing register renaming
CN103988462B (zh) * 2011-12-02 2017-03-08 Arm Limited Register renaming data processing apparatus and method for performing register renaming
WO2013079912A1 (fr) * 2011-12-02 2013-06-06 Arm Limited Data processing apparatus and method for performing register renaming without additional registers
EP2860656A2 (fr) 2013-10-01 2015-04-15 Commissariat à l'Énergie Atomique et aux Énergies Alternatives Method of execution by a microprocessor of a polymorphic binary code of a predetermined function
US9489315B2 (en) 2013-10-01 2016-11-08 Commissariat à l'énergie atomique et aux énergies alternatives Method of executing, by a microprocessor, a polymorphic binary code of a predetermined function
US20170083355A1 (en) * 2015-09-22 2017-03-23 Qualcomm Incorporated Dynamic register virtualization
CN108027728A (zh) * 2015-09-22 2018-05-11 Qualcomm Incorporated Dynamic register virtualization
US10282224B2 (en) * 2015-09-22 2019-05-07 Qualcomm Incorporated Dynamic register virtualization
WO2018115650A1 (fr) 2016-12-19 2018-06-28 Commissariat à l'énergie atomique et aux énergies alternatives Method of execution by a microprocessor of a polymorphic machine code of a predetermined function

Similar Documents

Publication Publication Date Title
May et al. Random register renaming to foil DPA
EP3757854B1 (fr) Microprocessor pipeline circuitry to support cryptographic computing
May et al. Non-deterministic processors
US8654970B2 (en) Apparatus and method for implementing instruction support for the data encryption standard (DES) algorithm
US20100250965A1 (en) Apparatus and method for implementing instruction support for the advanced encryption standard (aes) algorithm
US7949883B2 (en) Cryptographic CPU architecture with random instruction masking to thwart differential power analysis
US7620821B1 (en) Processor including general-purpose and cryptographic functionality in which cryptographic operations are visible to user-specified software
US8417961B2 (en) Apparatus and method for implementing instruction support for performing a cyclic redundancy check (CRC)
US7673152B2 (en) Microprocessor with program and data protection function under multi-task environment
US20030163718A1 (en) Tamper resistant software-mass data encoding
US8090934B2 (en) Systems and methods for providing security for computer systems
US8356185B2 (en) Apparatus and method for local operand bypassing for cryptographic instructions
US9317286B2 (en) Apparatus and method for implementing instruction support for the camellia cipher algorithm
US8745407B2 (en) Virtual machine or hardware processor for IC-card portable electronic devices
Albert et al. Combatting software piracy by encryption and key management
US20100246815A1 (en) Apparatus and method for implementing instruction support for the kasumi cipher algorithm
US7570760B1 (en) Apparatus and method for implementing a block cipher algorithm
CN113673002A (zh) Memory overflow defence method based on a pointer encryption mechanism and a RISC-V coprocessor
WO2002054228A1 (fr) Register renaming
Gilmont et al. Architecture of security management unit for safe hosting of multiple agents
WO2002027478A1 (fr) Instruction issue in a processor
WO2002027474A1 (fr) Executing a combined instruction
US20120216020A1 (en) Instruction support for performing stream cipher
Hossain et al. Hexon: Protecting firmware using hardware-assisted execution-level obfuscation
US7711955B1 (en) Apparatus and method for cryptographic key expansion

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PH PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP