WO2013142981A1 - Securing accessible systems using base function encoding - Google Patents

Securing accessible systems using base function encoding

Info

Publication number
WO2013142981A1
Authority
WO
WIPO (PCT)
Prior art keywords
function
input
vector
code
encoding
Prior art date
Application number
PCT/CA2013/000305
Other languages
French (fr)
Inventor
Harold Johnson
Yuan Xiang Gu
Michael Wiener
Yongxin Zhou
Original Assignee
Irdeto Canada Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Irdeto Canada Corporation
Priority to EP13767371.1A (patent EP2831794B1)
Priority to CN201380028121.0A (patent CN104335218B)
Priority to US14/389,361 (patent US9965623B2)
Publication of WO2013142981A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G06F21/54 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow by adding security routines or objects to programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/14 Protection against unauthorised use of memory or access to memory
    • G06F12/1408 Protection against unauthorised use of memory or access to memory by using cryptography
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G06F21/14 Protecting executable software against software analysis or reverse engineering, e.g. by obfuscation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618 Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618 Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0631 Substitution permutation network [SPN], i.e. cipher composed of a number of stages or rounds each involving linear and nonlinear transformations, e.g. AES algorithms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/14 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using a plurality of keys or algorithms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/034 Test or assess a computer or a system
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/04 Masking or blinding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/16 Obfuscation or hiding, e.g. involving white box

Definitions

  • the present invention relates generally to electronic computing devices and computer systems, and more specifically, to securing software and firmware on devices and systems which are accessible to attack.
  • a white box attack is an attack on a software algorithm in which it is assumed that the attacker has full visibility into the execution of the algorithm.
  • protection systems have met with reasonable success, but as such protection systems have become more and more sophisticated, so has the sophistication of the attacking techniques (such as encoding reduction attacks, statistical bucketing attacks and homomorphic mapping attacks).
  • Embodiments of the present invention aim generally at providing more effective secret-hiding and tamper-resistance techniques, providing protection of software code and data without fear that security will be breached.
  • the methods and systems disclosed herein are not limited to any particular underlying program. They may be applied to cryptographic systems, but equally, may be applied to non-cryptographic systems. As well, the software code that is being protected does not dictate what is done to protect it, so the protection techniques are not constrained by the underlying code. This may provide an advantage over other protection techniques which can leave or create patterns that are based on the underlying code. Such patterns may provide weaknesses that can be exploited by attackers.
  • Some embodiments disclosed herein provide "profound data dependence", which can make it difficult or impossible to unentangle or distinguish the protected code and the code which is providing protection.
  • AES algorithms typically execute the same way all the time, no matter what the input data is. This makes it straightforward for an attacker to know what he is looking for and where to find it.
  • Most white box protection systems have a rigid equation structure which does not address this type of problem. That is, an attacker may know what types of operations or effects to look for, and where in code or execution to look to find those operations or effects.
  • embodiments disclosed herein may provide coding which is not rigid, such as where each iteration of a protection algorithm results in a different encoding. Thus, the system is extremely non-repeatable.
  • this may make embodiments disclosed herein more resistant to a "compare" type attack, in which an attacker changes 1 bit and observes how the targeted program changes. In some embodiments disclosed herein, if an attacker changes 1 bit, then the protected code will look completely different.
  • a compare attack is an attack where two iterations of code execution are compared to see the difference, such as changing a single input bit to see how the operation and output change. Protection algorithms as disclosed herein may result in dramatically different functions with each iteration of the protected code, so a compare attack does not provide any useful information.
  • Some embodiments include systems and techniques for software protection that operate by applying bijective "base" functions to the targeted code. These base functions are pairs of mutually-inverse functions f_K, f_K^{-1} which are used, for example, to encode an operation, and then un-encode the operation at a later point in a software application. The encoding obscures the original function and the data which it generates. There is no loss of information, as the unencoding operation compensates for the encoding operation.
  • Base function pairs may be chosen such that an attacker cannot easily find or determine the inverse function. That is, given a function f_K, the inverse f_K^{-1} may not be found easily without the key K.
  • the key K may be used at code generation time, but then discarded once the functions f_K, f_K^{-1} have been generated and applied to the targeted code.
  • These base function pairs are also lossless, i.e. mathematically invertible.
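The relationship between the key K and the pair f_K, f_K^{-1} can be illustrated with a deliberately tiny sketch. Everything below is an assumption made for illustration: the affine form, the 32-bit word size, the name `make_base_pair`, and the use of Python's non-cryptographic `random` module. A single affine map is trivially invertible by an attacker and is nothing like the large, irregular base functions described here; the sketch only shows a key being consumed at generation time and a lossless pair being left behind.

```python
import random

W = 32                       # illustrative word size in bits
MASK = (1 << W) - 1

def make_base_pair(key: int):
    """Derive a toy pair (f, f_inv) over Z/(2^W) from a key.

    The key is only used here, at generation time; the returned
    functions carry the derived constants, not the key itself.
    """
    rng = random.Random(key)             # illustration only, not secure
    a = rng.getrandbits(W) | 1           # odd multiplier => invertible mod 2^W
    b = rng.getrandbits(W)
    a_inv = pow(a, -1, 1 << W)           # modular inverse (Python 3.8+)

    def f(x: int) -> int:
        return (a * x + b) & MASK

    def f_inv(y: int) -> int:
        return (a_inv * (y - b)) & MASK

    return f, f_inv

f, f_inv = make_base_pair(key=0xC0FFEE)
```

Because the multiplier is odd, `f` is a bijection on 32-bit words and `f_inv(f(x)) == x` for every `x`, which is the "lossless" property the text refers to.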
  • the protected software application does not need to decode a function or process completely to use it elsewhere in the targeted code, as the encoding and unencoding changes are included within the encoded application.
  • base function pairs may include permutation polynomial encodings.
  • a permutation polynomial is a polynomial which is invertible (a polynomial bijection).
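As a small worked example of a permutation polynomial, consider a quadratic a·x² + b·x + c over Z/(2^w) with a even and b odd. The difference p(x) − p(y) factors as (x − y)(a(x + y) + b), and the second factor is odd and therefore invertible modulo 2^w, so distinct inputs give distinct outputs. The coefficients and the 8-bit modulus below are arbitrary choices for illustration, not values from the source.

```python
W = 8
MOD = 1 << W

def p(x: int) -> int:
    """Quadratic with an even x^2 coefficient and an odd x coefficient."""
    return (6 * x * x + 5 * x + 3) % MOD

def q(x: int) -> int:
    """Counter-example: x^2 + x maps both 0 and MOD-1 to 0."""
    return (x * x + x) % MOD

p_is_bijection = len({p(x) for x in range(MOD)}) == MOD
q_is_bijection = len({q(x) for x in range(MOD)}) == MOD
```

Exhaustive enumeration is only feasible for small moduli; for realistic word sizes the parity argument above is what guarantees invertibility.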
  • Some embodiments may generate or use base function pairs in such a manner that they generate “instance diversity” and “dynamic diversity”.
  • each base function pair may create a secure "communication channel", such as between portions of a software application, between two software applications or platforms, or the like.
  • Dynamic diversity may be created by linking operation of the software to the input data. Each time an encoding is performed, such as for communication between two encoded applications, instance and dynamic diversity may be generated between the two applications.
  • the base functions may be highly "text dependent" so they offer good resistance to plaintext and perturbation attacks. If an attacker changes anything, even making a very small change such as the value of 1 bit, the change will result in a very large behavioural change.
  • the diversity provided by embodiments disclosed herein may provide a variable, randomly-chosen structure to protected code.
  • An engine which generates the base function pairs and encodings may rely on a random or pseudo-random key to choose the underlying function and/or the key.
  • a key according to embodiments disclosed herein may not be as small as the keys of many conventional security systems (e.g., 64 or 128 bits); rather, it may be thousands or tens of thousands of bits. For example, a prototype was developed which uses 2,000 bits.
  • the base functions disclosed herein may include bijections used to encode, decode, or recode data.
  • Such bijections may include the following characteristics: 1) Encoding wide data elements (typically four or more host computer words wide), unlike typical scalar encodings (see [5, 7] listed in the Appendix), but like block ciphers.
  • a scalar encoding generally employs one or at most a few mathematical constructions.
  • a cipher typically employs a slightly larger number, but the number is still small.
  • a variety of encodings are applied to an entire function, creating an intricately interlaced structure resulting from the interaction of many forms of protection with one another.
  • Embodiments may have no fixed number of rounds, no fixed widths for operands of various substeps, no fixed interconnection of the various substeps, and no predetermined number of iterations of any kind.
  • Some embodiments may use a large quantity of real entropy (i.e., a large truly random input). However, if an engine which generates the base function pairs is not itself exposed to attackers, it may be safe to employ significantly smaller keys which then generate much larger pseudo-random keys by means of a pseudo-random number generator, since in that case, the attacker must contend with both the real key entropy (that for the seed to the pseudo-random number generator) and the randomness inevitably resulting from the programming of the generator.
  • In some embodiments, biased permutations may also be used. If internal data is used to generate base function pairs or other encoding data/functions rather than random numbers, then the resulting encoding will contain bias.
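The idea of stretching a small seed into a much larger pseudo-random key can be sketched as follows. The source does not say which generator is used; SHAKE-256 from Python's standard `hashlib` is swapped in here purely as a convenient extendable-output function, and `expand_key` and the 128-bit seed are hypothetical names and values.

```python
import hashlib

def expand_key(seed: bytes, n_bits: int) -> int:
    """Stretch a short seed into an n_bits-bit pseudo-random key (sketch)."""
    n_bytes = (n_bits + 7) // 8
    stream = hashlib.shake_256(seed).digest(n_bytes)   # deterministic expansion
    excess = n_bytes * 8 - n_bits                      # drop surplus low bits
    return int.from_bytes(stream, "big") >> excess

seed = bytes.fromhex("00112233445566778899aabbccddeeff")   # illustrative seed
key = expand_key(seed, 2000)                               # 2,000-bit key
```

The expanded key carries no more entropy than the seed, which is why the text ties this approach to a generator engine that is not itself exposed to attackers.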
  • Some embodiments may include techniques for binding pipe-starts and pipe-ends, so that the targeted software code is tied to applications or platforms at both ends. This may be useful, for example, in a peer-to-peer data transfer environment or a digital rights management (DRM) environment. Systems and techniques disclosed herein also may be used to tie ciphers to other software applications or platforms, which is generally difficult to do using conventional techniques.
  • Some embodiments may use "function-indexed interleaving". This technique provides deep nonlinearity from linear components, and nonlinear equation solving. It can be used in many ways, such as boundary protection, dynamic constant generation (e.g. key-to-code), providing dynamic diversity (data-dependent functionality), self-combining ciphers, cipher mixing and combining ciphers and non-ciphers. For example, it may be used to mix black box ciphers with the other protection code disclosed herein, providing the long term security of a black box cipher, with the other benefits of white box security. As noted above, the encoding of the embodiments disclosed herein may be highly dependent on run-time data.
  • a key which determines the base functions and structure
  • R which determines which obfuscations are to be applied to the "defining implementations".
  • the key, K may be augmented from the context, though in some examples described herein, only R is augmented in this way.
  • semi-consistent information or data from a user or his device such as a smart phone, tablet computer, PDA, server or desktop computer system, or the like
  • Function-indexed interleaving typically interleaves arbitrary functions. If some of these functions are themselves functions obtained by function-indexed interleaving, then that is a recursive use of function-indexed interleaving.
  • Some embodiments may include random cross-linking, cross-trapping, dataflow duplication, random cross-connection, and random checks which, combined with code-reordering, create omni-directional cross-dependencies and variable-dependent coding.
  • Some embodiments may use memory-shuffling with fractured transforms (dynamic data mangling) to hide dataflow.
  • In dynamic data mangling, an array A of memory cells may be used which can be viewed as having virtual indices 0, 1, 2, ..., M-1, where M is the size of the array and the modulus of a permutation polynomial p on the finite ring Z/(M) (i.e., the integers modulo M), as in a C program array.
  • The locations A[p(0)], A[p(1)], ..., A[p(M-1)] may be considered "pseudo-registers" R_0, ..., R_{M-1} extending those of the host machine.
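The pseudo-register view of the array can be sketched in a few lines. The array size M = 16, the particular polynomial, and the helper names `write_reg` and `read_reg` are assumptions for illustration; the polynomial has an even x² coefficient and an odd x coefficient, which makes it a bijection on Z/(16).

```python
M = 16                                   # array size, also the modulus of p

def p(i: int) -> int:
    """Permutation polynomial on Z/(M) used to scatter the virtual indices."""
    return (4 * i * i + 3 * i + 5) % M

A = [0] * M                              # backing array of memory cells

def write_reg(i: int, value: int) -> None:
    """Store into pseudo-register R_i, which lives at A[p(i)]."""
    A[p(i % M)] = value

def read_reg(i: int) -> int:
    """Load from pseudo-register R_i."""
    return A[p(i % M)]

for i in range(M):
    write_reg(i, 100 + i)
```

Because p is a permutation, no two pseudo-registers collide, yet consecutive virtual indices land in non-consecutive cells of `A`.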
  • Some embodiments may use "spread and blend" encoding. This is another way of describing the use of base functions plus code interleaving, which "smears out" the boundaries of the base functions to make them more difficult for an attacker to discern. General data blending may have portions of base functions that are mixed with other code, making it more difficult to identify and lift the code.
  • Some embodiments provide security lifecycle management. Black box security provides good long-term protection, but is not very useful in today's applications.
  • Embodiments disclosed herein may refresh implementations faster than they can be cracked on unprotected devices. Different devices and applications have different needs. For example, a pay-per-view television broadcast such as a sporting event, may have very little value several days after the event, so it may only be necessary to provide sufficient security to protect the broadcast data for a day or so. Similarly, the market for computer games may tail off very quickly after several weeks, so it may be critical only to protect the game for the first few weeks or months. Embodiments disclosed herein may allow a user to apply the level of security that is required, trading off the security against performance. Literally, an adjustable "obfuscation dial" can be placed on the control console.
  • the intensity with which obfuscating methods are applied may be controlled. Generally, these settings may be adjusted when the application is created with its embedded base function, as part of a software development process.
  • Security analysis may provide an estimate of how difficult the application will be to crack given a specific level of obfuscation. Based on the estimate, an engineering decision may be made of how to balance performance needs against the need for security, and the "obfuscation dial" may be set accordingly. This kind of flexibility is not available with other protection systems. With AES, for example, a fixed key length and fixed code is used, which cannot be adjusted.
  • Some embodiments may provide a flexible security refresh rate, allowing for a trade-off of complexity for the "moving target" of refreshing code. In many cases, the need is to refresh fast enough to stay ahead of potential attackers.
  • Some embodiments may not have a primary aim of providing long-term data security in hacker-exposed environments.
  • the solution is not to expose the data to hackers, but only to expose means of access to the data by, e.g., providing a web presence for credential-protected (SecurID(TM), pass-phrases, etc.) clients which access the data via protected conversations which can expose, at most, a small portion of the data.
  • white-box cryptography has proven to be vulnerable to attacks which can be executed very swiftly by cryptographically-sophisticated attackers with expert knowledge of the analysis of executable programs, since the cryptographic algorithms employed are amongst the most thoroughly examined algorithms in existence, and the tools for analysing programs have become very sophisticated of late as well.
  • ciphers have peculiar computational properties in that they are often defined over arithmetic domains not normally used in computation: for example, AES is defined over a Galois field, RSA public-key cryptosystems are defined by modular arithmetic over extremely large moduli, 3DES over bit operations, table lookups, and bit-permutations extended with duplicated bits.
  • Some embodiments may provide much stronger short-term resistance to attack. Such protection may be suitable for systems where the time over which resistance is needed is relatively short, because longer term security is addressed by means of refreshing the software which resides on the exposed platforms. This addresses a specific unfilled need which focusses at the point of tension created by highly sophisticated cryptanalytic tools and knowledge, extremely well studied ciphers, limited protections affordable via software obfuscation, highly sophisticated tools for the analysis of executable programs, and the limited exposure times for software in typical commercial content distribution environments.
  • the goal is to prevent the kinds of attacks which experience with white-box cryptography has shown to be within the state of the art: swift cryptanalytic attacks and/or code-lifting attacks so swift that they have value even given the limited lifespans of validity between refreshes of the exposed programs (such as STB programs).
  • Some embodiments may provide significantly larger encodings which can resist attacks for longer periods of time, by abandoning the notion of computing with encoded operands (as is done with the simpler encodings above) and replacing it with something more like a cipher.
  • Ciphers themselves can be, and are, used for this purpose, but often they cannot easily be interlocked with ordinary software because (1) their algorithms are rigidly fixed by cipher standards, and (2) their computations are typically very different from ordinary software and therefore are neither readily concealed within it, nor readily interlocked with it.
  • the base-functions described herein provide an alternative which permits concealment and interlocking: they make use of conventional operations, and their algorithms are enormously more flexible than is the case with ciphers. They can be combined with ciphers to combine a level of black-box security as strong as conventional cryptography with a level of white-box security significantly superior to both simple encodings as above and known white-box cryptography.
  • a base function may be created by selecting a word size w and a vector length N, and generating an invertible state-vector function configured to operate on an N-vector of w-element words, which includes a combination of multiple invertible operations.
  • the state-vector function may receive an input of at least 64 bits and provide an output of at least 64 bits.
  • a first portion of steps in the state-vector function may perform linear or affine computations over Z/(2^w). Portions of steps in the state-vector function may be indexed using first and second indexing techniques. At least one operation in an existing computer program may then be modified to execute the state-vector function instead of the selected operation.
  • Each of the indexing techniques may control a different indexing operation, such as if-then-else constructs, switches, element-permutation selections, iteration counts, element rotation counts, function-indexed key indexes, or the like.
  • Some of the steps in the state-vector function may be non-T-function operations.
  • each step in the state-vector function may be invertible, such that the entire state-vector function is invertible by inverting each step.
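The observation that a state-vector function built from invertible steps is inverted by undoing the steps in reverse order can be sketched as follows. The word size, vector length, constants, and the particular four steps are all illustrative assumptions, not the construction of any Mark I/II/III design; the bit rotation stands in for a non-T-function step, since it moves high-order bits into low-order positions.

```python
W = 32
MASK = (1 << W) - 1
N = 4                                    # vector length (illustrative)

def affine(a, b):
    """Element-wise x -> a*x + b over Z/(2^W); a must be odd."""
    a_inv = pow(a, -1, 1 << W)
    return (lambda v: [(a * x + b) & MASK for x in v],
            lambda v: [(a_inv * (x - b)) & MASK for x in v])

def rotate_elements(k):
    """Cyclically rotate the vector's elements by k positions."""
    return (lambda v: v[k:] + v[:k],
            lambda v: v[-k:] + v[:-k])

def mix_neighbours():
    """Prefix sums: a unit lower-triangular (hence invertible) linear map."""
    def fwd(v):
        out = list(v)
        for i in range(1, len(out)):
            out[i] = (out[i] + out[i - 1]) & MASK
        return out
    def inv(v):
        out = list(v)
        for i in range(len(out) - 1, 0, -1):
            out[i] = (out[i] - out[i - 1]) & MASK
        return out
    return fwd, inv

def rotate_bits(r):
    """Rotate each word left by r bits (0 < r < W); not a T-function."""
    return (lambda v: [((x << r) | (x >> (W - r))) & MASK for x in v],
            lambda v: [((x >> r) | (x << (W - r))) & MASK for x in v])

STEPS = [affine(0x9E3779B1, 0x7F4A7C15), rotate_elements(1),
         mix_neighbours(), rotate_bits(7)]

def state_vector(v):
    for fwd, _ in STEPS:
        v = fwd(v)
    return v

def state_vector_inv(v):
    for _, inv in reversed(STEPS):       # undo the steps last-to-first
        v = inv(v)
    return v
```

For example, `state_vector_inv(state_vector([1, 2, 3, 4]))` returns `[1, 2, 3, 4]`. With N = 4 and w = 32 the function maps 128 bits to 128 bits, which satisfies the 64-bit minimum mentioned above.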
  • the state-vector function may be keyed using, for example, a run-time key, a generation-time key, or a function-indexed key.
  • the state-vector function may be implemented by various operation types, such as linear operations, matrix operations, random swaps, or the like.
  • Various encoding schemes also may be applied to inputs and/or outputs of the state-vector function, and/or operations within the state-vector function. In some configurations, different encodings may be applied so as to produce fractures at various points associated with the state-vector function.
  • base functions as disclosed herein may be executed by, for example, receiving an input having a word size w, applying an invertible state-vector function configured to operate on N-vectors of w-element words to the input, where the state-vector function includes multiple invertible operations, and a first portion of steps in the state-vector function perform linear or affine computations over Z/(2^w). Additional operations may be applied to the output of the invertible state-vector function, where each is selected based upon a different indexing technique.
  • the state-vector function may have any of the properties disclosed herein with respect to the state-vector function and base functions.
  • a first operation may be executed by performing a second operation, for example, by receiving an input X encoded as A(X) with a first encoding A, performing a first plurality of computer-executable operations on the input using the value of B^{-1}(X), where B^{-1} is the inverse of a second encoding mechanism B, the second encoding B being different from the first encoding A, and providing an output based upon B^{-1}(X).
  • Such operation may be considered a "fracture", and may allow for an operation to be performed without being accessible or visible to an external user, or to a potential attacker.
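One way to read the fracture idea is that a value encoded with A is "decoded" with the inverse of a different encoding B, so that the mismatch B^{-1}∘A silently computes some function g without g ever appearing as a separate step. The sketch below illustrates that reading with affine maps over Z/(2^32); the choice of g(x) = 5x + 2, the constants, and the helper names are assumptions made for illustration.

```python
W = 32
MOD = 1 << W

def affine(a, b):
    """Return the map x -> a*x + b over Z/(2^W)."""
    return lambda x: (a * x + b) % MOD

# First encoding A, applied where the value is produced.
a, b = 0x2545F491, 0x01234567            # a is odd, so A is invertible
A = affine(a, b)

# Hidden operation to be performed "in the gap": g(x) = 5x + 2.
# Choosing B = A o g^{-1} gives B^{-1} = g o A^{-1}, which collapses
# into a single affine map with merged constants c and d.
a_inv = pow(a, -1, MOD)
c = (5 * a_inv) % MOD
d = (2 - 5 * a_inv * b) % MOD
B_inv = affine(c, d)                     # g itself never appears in the code

def hidden_g(x: int) -> int:
    """Encode with A, then 'decode' with B^{-1}; the net effect is g."""
    return B_inv(A(x))
```

Here `hidden_g(10)` returns 52, the value of 5·10 + 2, although the only constants present in the code are those of A and of the merged map B^{-1}.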
  • the output of the first operation may not be provided external to executable code with which the first operation is integrated.
  • the input may be permuted according to a sorting-network topology.
  • the matrix operation may be executed using the permuted input to generate the output, and the output permuted according to the sorting-network topology.
  • the permuted output then may be provided as the output of the matrix operation.
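A permutation that follows a sorting-network topology can be sketched by taking the comparator positions of a sorting network and replacing each compare-and-swap with a swap controlled by a selection bit. The extract does not spell out the network or how the bits are chosen, so the odd-even transposition network, the seeded `random.Random`, and all names below are assumptions for illustration.

```python
import random

def comparator_positions(n: int):
    """Wire pairs of an odd-even transposition sorting network on n wires."""
    return [(i, i + 1) for r in range(n) for i in range(r % 2, n - 1, 2)]

def permute(values, swap_bits):
    """Walk the network; swap each wire pair when its control bit is set."""
    out = list(values)
    for (i, j), bit in zip(comparator_positions(len(out)), swap_bits):
        if bit:
            out[i], out[j] = out[j], out[i]
    return out

def unpermute(values, swap_bits):
    """Undo permute() by applying the same swaps in reverse order."""
    out = list(values)
    pairs = list(zip(comparator_positions(len(out)), swap_bits))
    for (i, j), bit in reversed(pairs):
        if bit:
            out[i], out[j] = out[j], out[i]
    return out

rng = random.Random(2013)                # illustrative, fixed for repeatability
data = list(range(8))
bits = [rng.getrandbits(1) for _ in comparator_positions(len(data))]
shuffled = permute(data, bits)
```

Any setting of the control bits yields a valid permutation, and because the network sorts every input, every permutation is reachable by some setting of the bits. Random bits do not, however, give a uniform distribution over permutations.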
  • a first input may be received, and a function-indexed interleaved first function applied to the first input to generate a first output having a left portion and a right portion.
  • a function-indexed interleaved second function may be applied to the first output to generate a second output, where the left portion of the first output is used as a right input to the second function, and the right portion of the first output is used as a left input to the second function.
  • the second output may then be provided as an encoding of the first input.
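The two-stage arrangement with swapped halves can be sketched under one plausible reading of function-indexed interleaving: the left input passes through an invertible function, and its value also selects which of several invertible functions is applied to the right input. The 16-bit halves, the affine component functions, the `l % len(...)` selector, and the name `make_fii` are assumptions for illustration.

```python
W = 16
MOD = 1 << W

def affine(a, b):
    """(forward, inverse) pair for x -> a*x + b over Z/(2^W); a must be odd."""
    a_inv = pow(a, -1, MOD)
    return (lambda x: (a * x + b) % MOD,
            lambda y: (a_inv * (y - b)) % MOD)

def make_fii(left_pair, right_pairs):
    """Build a function-indexed interleaved function and its inverse."""
    f, f_inv = left_pair
    def fwd(l, r):
        g, _ = right_pairs[l % len(right_pairs)]      # index chosen by left input
        return f(l), g(r)
    def inv(l_out, r_out):
        l = f_inv(l_out)                              # recover left input first
        _, g_inv = right_pairs[l % len(right_pairs)]  # then re-derive the index
        return l, g_inv(r_out)
    return fwd, inv

F1, F1_inv = make_fii(affine(0x6A5D, 0x01F3),
                      [affine(3, 1), affine(5, 7), affine(9, 2), affine(11, 4)])
F2, F2_inv = make_fii(affine(0x3C6F, 0x00B1),
                      [affine(13, 6), affine(7, 3), affine(15, 8), affine(17, 5)])

def encode(l, r):
    l1, r1 = F1(l, r)
    return F2(r1, l1)                    # halves swapped between the two stages

def decode(l2, r2):
    r1, l1 = F2_inv(l2, r2)
    return F1_inv(l1, r1)
```

Swapping the halves means each half influences the function applied to the other in one of the two stages, so neither half is processed independently of the other.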
  • a key K may be generated, and a pair of base functions f_K, f_K^{-1} generated based upon the key K and a randomization information R.
  • the base function f_K may be applied to a first end of a communication pipe, and the inverse f_K^{-1} to a second end of the communication pipe, after which the key K may be discarded.
  • the communication pipe may span applications on a single platform, or on separate platforms.
  • one or more operations to be executed by a computer system during execution of a program may be duplicated to create a first copy of the operation or operations.
  • the program may then be modified to execute the first operation copy instead of the first operation.
  • Each operation and the corresponding copy may be encoded using a different encoding. Pairs of operations also may be used to create a check value, such as where the difference between execution of an operation result and execution of the copy is added to the result of the operation or the result of the operation copy. This may allow for detection of a modification made by an attacker during execution of the program.
  • either a copy or the original operation may be selected randomly and executed by the program.
  • the result of the randomly-selected operations may be equivalent to a result that would have been obtained had only a single copy of the operations been performed.
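The check value built from an operation and its differently-encoded copy can be sketched as follows. The operation x² + 3, the additive offset used as the copy's encoding, and the `tamper` parameter (which models an attacker altering one copy) are assumptions for illustration.

```python
W = 32
MASK = (1 << W) - 1
OFFSET = 0x5A5A5A5A                      # encoding used by the duplicate

def op(x: int) -> int:
    """Original operation (illustrative)."""
    return (x * x + 3) & MASK

def op_copy_encoded(x: int) -> int:
    """Duplicate of op whose result is held under a different encoding."""
    return (x * x + 3 + OFFSET) & MASK

def checked(x: int, tamper: int = 0) -> int:
    r1 = (op(x) + tamper) & MASK                 # possibly modified by an attacker
    r2 = (op_copy_encoded(x) - OFFSET) & MASK    # decode the duplicate
    diff = (r1 - r2) & MASK                      # zero when both copies agree
    return (r1 + diff) & MASK                    # disagreement corrupts the result
```

With no tampering the difference is zero and `checked(7)` equals `op(7)`; if one copy is altered, the non-zero difference propagates into the result instead of raising an explicit alarm. This simple additive check misses a tamper value of exactly 2^31, since doubling it wraps to zero.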
  • an input may be received from an application.
  • An array of size M may be defined with a number of M-register locations c_1, ..., c_n, with n ≤ M.
  • a permutation polynomial p, an input-based 1×n vector mapping matrix A yielding z from the input, and a series of constants p(z+i) also may be defined.
  • a series of operations may then be performed, with each operation providing an intermediate result that is stored in an M-register selected randomly from the M-registers.
  • a final result may then be provided to the application based upon the series of intermediate results from a final M-register storing the final result.
  • Each intermediate result stored in an M-register may have a separate encoding applied to the intermediate result prior to storing the intermediate result in the corresponding M-register.
  • the different encodings applied to intermediate results may be randomly chosen from among multiple different encodings.
  • different decodings which may or may not correspond to the encodings used to store intermediate results in the M-registers, may be applied to intermediate results stored in M-registers.
  • New M-registers may be allocated as needed, for example, only when required according to a graph-coloring allocation algorithm.
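Storing each intermediate result in a randomly selected M-register under its own encoding can be sketched as below. The affine encodings, the seeded `random.Random`, and the explicit `codecs` table are assumptions for illustration; in a protected program the matching decodings would presumably be fixed into the generated code rather than kept in a visible table, and register selection could depend on the input via p(z+i).

```python
import random

W = 32
MOD = 1 << W
M = 8
rng = random.Random(7)                   # illustrative; fixed seed for repeatability

regs = [0] * M                           # the M-registers
codecs = {}                              # slot -> (a, b) of the encoding held there

def store(value: int) -> int:
    """Encode value with a fresh affine encoding and store it in a random slot."""
    slot = rng.randrange(M)
    a, b = rng.getrandbits(W) | 1, rng.getrandbits(W)   # a odd => invertible
    regs[slot] = (a * value + b) % MOD
    codecs[slot] = (a, b)
    return slot

def load(slot: int) -> int:
    """Decode the value currently held in the given slot."""
    a, b = codecs[slot]
    return (pow(a, -1, MOD) * (regs[slot] - b)) % MOD

# A series of operations: a running sum of squares, each step held encoded.
slot = store(0)
for x in (3, 4, 12):
    slot = store((load(slot) + x * x) % MOD)
final = load(slot)
```

The final value is 169 (9 + 16 + 144), but no M-register ever holds an intermediate sum in the clear.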
  • a first operation g(y) that produces at least a first value a as an output may be executed, and a first variable x encoded as aX+b, using a and a second value b.
  • a second operation f(aX+b) may be executed using aX+b as an input, and a decoding operation using a and b may be performed, after which a and b may be discarded.
  • the value b also may be the output of a third operation h(z). Different encodings may be used for multiple input values encoded as aX+b, using different execution instances of g(y) and/or h(z).
  • the values may be selected from any values stored in a computer-readable memory, based upon the expected time that the constant(s) are stored in the memory.
  • existing computer-readable program code containing instructions to execute operations f(aX+b) and g(y), where g(y) produces at least a first value c when executed, may be modified to encode x as cX+d.
  • the operation f(cX+d) may be executed for at least one x, and c and d subsequently discarded.
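The use of values produced by other operations as the coefficients of an aX+b encoding can be sketched as follows. The bodies of `g` and `h`, the operation f(x) = x + 1, and the step that forces a to be odd (so that the encoding is invertible modulo 2^32) are assumptions for illustration.

```python
W = 32
MOD = 1 << W

def g(y: int) -> int:
    """Stands in for an existing computation whose output supplies a."""
    return (y * 2654435761 + 12345) % MOD

def h(z: int) -> int:
    """Stands in for an existing computation whose output supplies b."""
    return (z ^ 0xDEADBEEF) % MOD

def f_on_encoded(enc: int, a: int) -> int:
    """Compute f(x) = x + 1 directly on aX+b: a*x + b + a == a*(x + 1) + b."""
    return (enc + a) % MOD

def run(x: int, y: int, z: int) -> int:
    a = g(y) | 1                         # assumption: force a odd for invertibility
    b = h(z)
    enc = (a * x + b) % MOD              # encode x as aX+b
    enc = f_on_encoded(enc, a)           # operate without decoding
    return (pow(a, -1, MOD) * (enc - b)) % MOD   # decode; a and b are then dead
```

`run(41, 10, 20)` and `run(41, 11, 99)` both return 42, although the encoded intermediate values differ between the two calls because a and b come from different executions of `g` and `h`.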
  • At least one base function may be blended with executable program code for an existing application.
  • the base function may be blended with the executable program code by replacing at least one operation in the existing program code with the base function.
  • the base function also may be blended with the existing application by applying one, some, or all of the techniques disclosed herein, including fractures, variable dependent coding, dynamic data mangling, and/or cross-linking.
  • the base functions and/or any blending techniques used may include, or may exclusively include, operations which are similar or indistinguishable from the operations present in the portion of the existing application program code with which they are blended.
  • a computer system and/or computer program product may be provided that includes a processor and/or a computer-readable storage medium storing instructions which cause the processor to perform one or more of the techniques disclosed herein.
  • Figure 1 shows a commutative diagram for an encrypted function, in accordance with the present invention
  • Figure 2 shows a Virtual Machine General Instruction Format, in accordance with the present invention
  • Figure 3 shows a Virtual Machine Enter/Exit Instruction Format, in accordance with the present invention
  • Figure 4 shows a Mark I 'Woodenman' Construction, in accordance with the present invention.
  • Figures 5 and 6 show the first and second half respectively, of a Mark II
  • Figure 7 shows a graphical representation of a sorting network, in accordance with the present invention.
  • Figure 8 shows a flow chart of method of performing function-indexed
  • Figure 9 shows a flow chart of method of performing control-flow duplication, in accordance with the present invention.
  • FIG. 10 shows a flow chart of method of performing data-flow duplication, in accordance with the present invention.
  • Figure 11 shows a flow chart of method of creating ⁇ segments, in accordance with the present invention.
  • Figure 12 presents a process flow diagram for implementation of the Mark II protection system of the invention
  • Figure 13 shows a graphical representation of the irregular structure of segment design in a Mark III implementation of the invention
  • Figure 14 shows a graphical representation of the granularity that may be achieved with T-function splitting in a Mark III implementation of the invention
  • Figure 15 shows a graphical representation of the overall structure of a Mark III implementation of the invention
  • Figure 16 shows a graphical representation of the defensive layers of a Mark III implementation of the invention
  • Figure 17 shows a graphical representation of mass data encoding in an
  • Figures 18 and 19 show graphical representations of control flow encoding in an implementation of the invention.
  • Figure 20 shows a graphical representation of dynamic data mangling in an implementation of the invention
  • Figure 21 shows a graphical representation of cross-linking and cross-trapping in an implementation of the invention
  • Figure 22 shows a graphical representation of context dependent coding in an implementation of the invention
  • Figure 23 presents a process flow diagram for implementation of the Mark II protection system of the invention
  • Figure 24 shows a graphical representation of a typical usage of Mass Data Encoding or Dynamic Data Mangling in an implementation of the invention.
  • Figure 25 shows an exemplary block diagram setting out the primary problems that the embodiments of the invention seek to address
  • Table 25 presents a table which categorizes software boundary problems
  • Figure 26 shows a block diagram of an exemplary software system in unprotected form, under white box protection, and protected with the system of the invention
  • Figure 27 shows a bar diagram contrasting the levels of protection provided by black box security, white box security and protection under an exemplary embodiment of the invention
  • Figure 28 shows a process flow diagram contrasting ciphers, hashes and exemplary base functions in accordance with the present invention
  • Figure 29 shows an exemplary block diagram of how base functions of the invention may be used to provide secure communication pipes
  • Figure 30 shows a process flow diagram for function-indexed interleaving in accordance with the present invention.
  • Figure 31 presents a process flow diagram for implementation of the Mark I protection system of the invention.
  • Embodiments disclosed herein describe systems, techniques, and computer program products that may allow for securing aspects of computer systems that may be exposed to attackers. For example, software applications that have been distributed on commodity hardware for operation by end users may come under attack from entities that have access to the code during execution.
  • embodiments disclosed herein provide techniques to create a set of base functions, and integrate those functions with existing program code in ways that make it difficult or impossible for a potential attacker to isolate, distinguish, or closely examine the base functions and/or the existing program code.
  • processes disclosed herein may receive existing program code, and combine base functions with the existing code.
  • the base functions and existing code also may be combined using various techniques such as fractures, dynamic data mangling, cross-linking, and/or variable dependent coding as disclosed herein, to further blend the base functions and existing code.
  • the base functions and other techniques may use operations that are computationally similar, identical, or indistinguishable from those used by the existing program code, which can increase the difficulty for a potential attacker to distinguish the protected code from the protection techniques applied. As will be described herein, this can provide a final software product that is much more resilient to a variety of attacks than is possible using conventional protection techniques.
  • embodiments disclosed herein may provide solutions for several fundamental problems that may arise when software is to be protected from attack, such as software boundary protection, advanced diversity and renewability problems and protection measurability problems.
  • Software boundary problems may be organized into five groups as shown in Table 1: skin problems, data boundaries, code boundaries, boundaries between protected data and protected code, and boundaries between protected software and secured hardware.
  • whitebox cryptography components can be identified by
  • Code vulnerabilities to component-based attacks such as code lifting, code replacement, code cloning, replay, code sniffing, and code spoofing.
  • Data Boundaries may be categorized as one of three types: data type boundaries, data dependency boundaries and data crossing functional component boundaries.
  • For data type boundaries, current data transformation techniques are limited to individual data types, not multiple data types or mass data. The boundaries among distinct protected data items stand out, permitting identification and partitioning.
  • For data dependency boundaries, data diffusion via existing data flow protections is limited: the original data flow and computational logic are exposed. Most current white box cryptography weaknesses are related to both data type and data dependency boundary problems.
  • For data crossing functional component boundaries, data communications among functional components of an application system, whether running on the same or different devices, or as client and server, are made vulnerable because the communication boundaries are clearly evident.
  • embodiments disclosed herein may address some or all of these data boundary issues because both the data and the boundaries themselves may be obscured.
  • Code Boundaries may be categorized into two types: functional boundaries among protected components, and boundaries between injected code and the protected version of the original application code.
  • Functional boundaries among protected components are a weakness because boundaries among functional components are still visible after protecting those components. That is, with white box protection, the white box cryptographic components can generally be identified by their distinctive computations. In general, such protected computation segments can be easily partitioned, creating vulnerabilities to component-based attacks such as code lifting, code replacement, code cloning, replay, code sniffing, and code spoofing.
  • boundaries between injected protection code and the protected version of the original application code are also generally visible.
  • Current individual protection techniques create secured code that is localized to particular computations. Code boundaries resulting from use of different protection techniques are not effectively glued and interlocked.
  • the use of base function encodings and function-indexed interleaving by embodiments disclosed herein may address all of these code boundary issues, because code may be obscured and interleaved with the protection code itself. Because basic computer processing and arithmetic functions are used for the protection code, there is no distinctive code which the attacker will quickly identify.
  • FIG. 26 shows a block diagram of an example software system protected under a known white box model, and under an example embodiment as disclosed herein.
  • the original code and data functions, modules and storage blocks to be protected are represented by the geometric shapes labeled F1, F2, F3, D1 and D2.
  • Existing white box and similar protection techniques may be used to protect the various code and data functions, modules and storage blocks, but even in a protected form they will (at the very least) disclose unprotected data and other information at their boundaries.
  • embodiments of the present invention may resolve these boundary problems.
  • an observer cannot tell which parts are F1, F2, F3, D1, D2 and data from the original program, even though the observer has access to the program and can observe and alter its operation.
  • Bijections are described in greater detail hereinafter, but in short, they are lossless pairs of functions, $f_K$, $f_K^{-1}$, which perform a transposition of a function, which is undone later in the protected code.
  • the transposition may be done in thousands or millions of different ways, each transposition generally being done in a completely different and non-repeatable manner.
  • Various techniques may be used to conceal existing programs, achieving massive multicoding of bijective functions, which are not humanly programmed, but are generated by random computational processes. This includes bijective functions which can be used in cipher- and hash-like ways to solve boundary problems.
  • Embodiments disclosed herein may provide improved security and security guarantees (i.e. validated security and validated security metrics) relative to conventional techniques.
  • FIG. 27 contrasts conventional black box and white box models with properties of the embodiments disclosed herein, in terms of the long-term security and resistance to hostile attacks.
  • Cryptography is largely reliant on Ciphers and Hashes; Ciphers enable transfer of secrets over unsecured or public channels, while Hashes validate provenance. These capabilities have enormous numbers of uses. In a black-box environment, such cryptographic techniques may have very good long term security.
  • Ciphers and Hashes have a rigid structure and very standardized equations which are straightforward to attack.
  • White box protection may be used to improve the level of resistance to attacks, but even in such an environment the protected code will still reveal patterns and equations from the original Cipher-code and Hash-code, and boundaries will not be protected. As well, white box protection will not provide diversity which protects code against perturbation attacks.
  • embodiments disclosed herein may incorporate Cipher-like and Hash-like encodings, which give the protective encodings the security and strength of Ciphers and Hashes.
  • the process of applying white box encodings to Ciphers and Hashes typically uses simple encodings in an attempt to protect and obscure very distinctive code.
  • the techniques disclosed herein may use strong, diverse encodings to protect any code. With the diverse encodings and interleaving as disclosed, distinctiveness in the targeted code will be removed. Thus, as shown, the disclosed techniques may provide a much stronger security profile than conventional black box and white box protection.
  • FIG. 1 shows a commutative diagram for an encrypted function using encodings, in accordance with embodiments of the present invention.
  • a bijection $d: D \to D'$ and a bijection $r: R \to R'$ may be selected.
  • $F' = r \circ F \circ d^{-1}$ is an encoded version of F; d is an input encoding or a domain encoding, and r is an output encoding or a range encoding.
  • a bijection such as d or r is simply called an encoding.
  • the diagram shown in Figure 1 then commutes, and computation with F' is computation with an encrypted function. Additional details regarding the use of such encodings generally are provided in Section 2.3 of the Appendix.
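  • The commuting property can be checked exhaustively on a small domain. The following Python sketch assumes 8-bit values and uses hypothetical choices for F, d and r (a quadratic map, an affine bijection and an XOR mask); none of these specific functions come from the text.

```python
MOD = 1 << 8                # assumed: D, D', R and R' are all the 8-bit values

def F(x):
    """Hypothetical original function F: D -> R."""
    return (x * x + 3) % MOD

def d(x):
    """Input (domain) encoding: an affine bijection with an odd multiplier."""
    return (5 * x + 17) % MOD

def d_inv(x):
    """Inverse of d; 5 is odd, so it is invertible modulo 256 (Python 3.8+)."""
    return ((x - 17) * pow(5, -1, MOD)) % MOD

def r(y):
    """Output (range) encoding: XOR with a fixed mask, also a bijection."""
    return y ^ 0xA5

def F_enc(x_enc):
    """Encoded function F' = r o F o d^{-1}, written out step by step."""
    return r(F(d_inv(x_enc)))

# The diagram commutes: encoding the input and applying F' gives the same
# result as applying F and then encoding the output.
commutes = all(F_enc(d(x)) == r(F(x)) for x in range(MOD))
```

  In a protected program the three stages of F_enc would be merged so that d_inv, F and r are not separately visible; the sketch keeps them apart only to show the algebra.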
  • FIG. 28 contrasts the properties of conventional Ciphers and Hashes with those of the bijective base functions disclosed herein.
  • Ciphers are non-lossy functions; they preserve all of the information that they encode, so the information can be unencoded and used in the same manner as the original.
  • Ciphers are invertible provided that one is given the key(s), but it is hard to determine the key or keys K1, K2 from instances of plain and encrypted information ("PLAIN" and "ENCRYPTED" in Figure 28).
  • Hashes are lossy above a certain length, but this typically is not a problem because hashes are generally used just for validation. With a hash it is hard to determine the optional key, K, from instances of the original data and the hash ("PLAIN" and "HASHED” in Figure 28).
  • the base functions disclosed herein may serve in place of either ciphers or hashes, as it is hard to determine the key or keys from consideration of the encoding and unencoding functions $f_K$, $f_K^{-1}$.
  • the advantage that the base functions provide over the use of Ciphers or Hashes, is that the computations used by the base functions are more similar to ordinary code, which makes it easier to blend the code of the base functions with the targeted code.
  • Ciphers and Hashes use very distinctive code and structure which is difficult to obscure or hide, resulting in vulnerability.
  • Mutually-inverse base function pairs as disclosed herein may employ random secret information (entropy) in two ways: as key information K which is used to determine the mutually inverse functions $f_K$, $f_K^{-1}$, and as randomization information R which determines how the $f_K$, $f_K^{-1}$ implementations are obscured.
  • two mutually inverse base functions may be represented by subroutines G and H, written in C.
  • the base functions may be constructed by an automated base function generator program or system, with G being an obfuscated implementation of the mathematical function $f_K$ and H being an obfuscated implementation of the mathematical function $f_K^{-1}$.
  • G can be used to 'encrypt' data or code, which can then be 'decrypted' with H (or vice versa).
  • run-time keys can be provided in addition to the build-time key K.
  • the extra input vector elements can be used as a run-time key. This is much like the situation with a cipher such as AES-128.
  • a typical run of AES-128 has two inputs: one is a 128-bit key, and one is a 128-bit text. The implementation performs encipherment or decipherment of the text under control of the key.
  • a base-function can be constructed to encrypt differently depending on the content of its extra inputs, so that the extra inputs in effect become a runtime key (as opposed to the software generation time key K controlling the static aspects of the base function).
  • the building blocks of base functions disclosed herein make it relatively easy to dictate whether the runtime key is the same for the implementations of both $f_K$ and $f_K^{-1}$, or is different for $f_K$ than for $f_K^{-1}$: if the runtime key is added to the selector vector, it is the same for $f_K$ and $f_K^{-1}$, and if it is added elsewhere, it differs between $f_K$ and $f_K^{-1}$.
  • Key information K can be used to select far more varied encoding functions than in known white box systems, permitting much stronger spatial and temporal diversity. Diversity is also provided with other techniques used in embodiments of the invention such as Function-Indexed Interleaving, which provides dynamic diversity via text-dependence. Further diversity may also be provided by variants of Control-Flow Encoding and Mass-Data Encoding described hereinafter.
  • Base functions as disclosed herein may incorporate or make use of state vector functions.
  • a state-vector function is organized around a vector of N elements, each element of which is a w-bit quantity.
  • the state vector function may be executed using a series of steps, in each of which a number between zero and N of the elements of the vector are modified. In a step in which zero elements are modified, the step essentially applies the identity function on the state-vector.
  • a state-vector function used in constructing a base function may be invertible.
  • a state-vector function is invertible if, for each and every step in the state-vector function, a step-inverse exists such that applying the step-algorithm and then applying the step-inverse algorithm has no net effect. Any finite sequence of invertible steps is invertible by performing the inverse-step algorithms in the reverse order of their originals.
  • Illustrative examples of invertible steps on a vector of w-bit elements include adding two elements, such as adding i to j to obtain i+j, multiplying an element by an odd constant over $\mathbb{Z}/(2^w)$, and mapping a contiguous or non-contiguous sub-vector of the elements to new values by taking the product with an invertible matrix over $\mathbb{Z}/(2^w)$.
  • the associated inverse steps for these examples are subtracting element i from element j, multiplying the element by the multiplicative inverse of the original constant multiplier over $\mathbb{Z}/(2^w)$, and mapping the sub-vector back to its original values by multiplying by the inverse of that matrix, respectively.
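  • The three example steps and their inverses can be sketched in Python as follows. The word size w = 32, the 4-element vector, the constant 0x9E3779B1 and the 2 x 2 matrix (chosen with determinant 1 so that its inverse is easy to write down) are illustrative assumptions.

```python
W = 32
MOD = 1 << W

def add_step(v, i, j):
    """v[j] += v[i]; requires i != j, otherwise the step would double v[j]."""
    out = list(v)
    out[j] = (out[j] + out[i]) % MOD
    return out

def add_step_inv(v, i, j):
    out = list(v)
    out[j] = (out[j] - out[i]) % MOD
    return out

def mul_step(v, i, c):
    """Multiply element i by an odd constant c."""
    out = list(v)
    out[i] = (out[i] * c) % MOD
    return out

def mul_step_inv(v, i, c):
    """Multiply by the inverse of c modulo 2^W (exists because c is odd)."""
    return mul_step(v, i, pow(c, -1, MOD))

def matrix_step(v, idxs, m):
    """Replace the sub-vector at positions idxs by m times that sub-vector."""
    sub = [v[k] for k in idxs]
    out = list(v)
    for row, k in enumerate(idxs):
        out[k] = sum(m[row][col] * sub[col] for col in range(len(sub))) % MOD
    return out

M = [[3, 2], [1, 1]]            # determinant 1, hence invertible over Z/(2^W)
M_INV = [[1, -2], [-1, 3]]      # exact inverse of M

v0 = [1, MOD - 1, 123456789, 42]
v1 = matrix_step(mul_step(add_step(v0, 0, 1), 2, 0x9E3779B1), (0, 3), M)
# Undo the steps in the reverse order of their originals.
v2 = add_step_inv(mul_step_inv(matrix_step(v1, (0, 3), M_INV), 2, 0x9E3779B1), 0, 1)
```

  The sub-vector passed to matrix_step is non-contiguous (positions 0 and 3), which the text explicitly permits.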
  • Some embodiments may use one or more state-vector functions that have one or more indexed steps.
  • a step is indexed if, in addition to its normal inputs, it takes an additional index input such that changing the index changes the computed function.
  • the step of adding a constant vector could be indexed by the constant vector, or the step of permuting a sub-vector could be indexed by the permutation applied.
  • the specific function executed is determined at least in part by the index provided to the function.
  • Indexed steps also may be invertible.
  • an indexed step is invertible if it computes an invertible step for each index, and the index used to compute the step, or information from which that index can be derived, is available when inverting the step.
  • $S_{17}$ is invertible if $S_{17}^{-1}$ is defined, and the index (17) is available at the appropriate time to ensure that $S_{17}^{-1}$ is computed when inverting the state-vector function.
  • a step may operate on some elements of the state. To index this step, other elements of the state may be used to compute the index. If invertible steps are then performed on the other elements, the index may be retrieved by inverting those steps, as long as the two sets of elements do not overlap.
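  • A minimal Python sketch of an indexed step whose index is computed from other elements of the state. The split into elements 0 and 1 (modified by the indexed step) and elements 2 and 3 (used for the index), the constant table and the index formula are all hypothetical.

```python
W = 32
MOD = 1 << W
CONSTANTS = [0x11111111, 0x22222222, 0x33333333, 0x44444444]  # hypothetical family

def index_from(v):
    """Index derived only from elements 2 and 3, which the indexed step leaves alone."""
    return (v[2] ^ v[3]) % len(CONSTANTS)

def indexed_step(v):
    """Modify elements 0 and 1 in a way chosen by the index."""
    k = index_from(v)
    out = list(v)
    out[0] = (out[0] + CONSTANTS[k]) % MOD
    out[1] = (out[1] * (2 * k + 1)) % MOD      # odd multiplier selected by k
    return out

def other_step(v):
    """A later invertible step that touches only elements 2 and 3."""
    out = list(v)
    out[3] = (out[3] + out[2]) % MOD
    return out

def other_step_inv(v):
    out = list(v)
    out[3] = (out[3] - out[2]) % MOD
    return out

def forward(v):
    return other_step(indexed_step(v))

def inverse(v):
    v = other_step_inv(v)        # restores elements 2 and 3, and with them the index
    k = index_from(v)
    out = list(v)
    out[1] = (out[1] * pow(2 * k + 1, -1, MOD)) % MOD
    out[0] = (out[0] - CONSTANTS[k]) % MOD
    return out
```

  The inverse works only because the indexed step never writes to the elements from which the index is derived, which is the non-overlap condition stated above.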
  • Function-Indexed Interleaving as disclosed herein is a specific example of the principle of the use of indexed steps within a base function.
  • Other uses of indexed steps as disclosed herein may include allowing the creation of keyed state-vector functions: the set of indexes used in some of the indexed steps can be used as a key. In that case, the index is not obtained from within the computation, but is provided by an additional input; i.e., the function takes the state-vector plus the key as an input. If the indexed steps are invertible, and the ordinary, non-indexed steps are invertible, then the whole state-vector function is invertible, rather like a keyed cipher.
  • the index information may provide or may serve as a key for the generated base functions. If the state-vector function is partially evaluated with respect to the index information when the state-vector function is generated, so that the index does not appear in the execution of the generated function explicitly, it is a generation-time key. If code to handle the index information is generated during execution of the state-vector function, so that the index does appear in the execution of the generated function explicitly, it is a run-time key. If the code internally generates the index within the state-vector function, it is a function-indexed key.
  • In an embodiment, a base function may be constructed based upon an initial selected or identified word-size w.
  • the default integer size of the host platform may be used as the word size w.
  • the default integer size typically is 32 bits.
  • the short integer length as used, for example, in C may be used, such as 16 bits.
  • a 64-bit word size may be used.
  • a vector length N is also selected for the base function, which represents the length of inputs and outputs in the w-sized words, typically encompassing four or more words internally. In some embodiments, such as where interleaving techniques as disclosed herein are used, it may be preferred for the word size w to be twice the internal word size of the N-vector.
  • the state-vector function then may be created by concatenating a series of steps or combinations of steps, each of which performs invertible steps on N-vectors of w-bit words.
  • the inverse of the state-vector function may be generated by concatenating the inverses of the steps in the reverse order.
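  • The construction by concatenation, and the inverse by reversed concatenation, can be sketched in Python as below. Each step is modelled as a (forward, inverse) pair of functions; the particular step factories, the 16-bit word size and the constants are assumptions for illustration.

```python
W = 16
MOD = 1 << W

def make_add_const(c):
    """Step pair: add c to every element / subtract c from every element."""
    return (lambda v: [(x + c) % MOD for x in v],
            lambda v: [(x - c) % MOD for x in v])

def make_mul_odd(c):
    """Step pair: multiply every element by odd c / by its inverse modulo 2^W."""
    c_inv = pow(c, -1, MOD)
    return (lambda v: [(x * c) % MOD for x in v],
            lambda v: [(x * c_inv) % MOD for x in v])

def make_rotate(k):
    """Step pair: rotate the vector left by k positions / right by k positions."""
    return (lambda v: v[k:] + v[:k],
            lambda v: v[-k:] + v[:-k])

def compose(steps):
    """Build the state-vector function and its inverse from a list of step pairs."""
    def forward(v):
        for fwd, _ in steps:
            v = fwd(v)
        return v
    def inverse(v):
        for _, inv in reversed(steps):     # inverses applied in reverse order
            v = inv(v)
        return v
    return forward, inverse

steps = [make_add_const(0x1234), make_rotate(1), make_mul_odd(0x6487), make_add_const(7)]
fwd, inv = compose(steps)
```

  Keeping each step paired with its own inverse means the generator never has to derive the inverse of the whole function; it only reverses the list.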
  • one or more keys also may be incorporated into the state-vector function.
  • Various types of keying may be applied to, or integrated with, the state-vector function, including run-time keying, generation-time keying, and function-indexed keying as previously described.
  • the function may be modified to receive the key explicitly as an additional input to the function.
  • code in the state-vector function may be partially evaluated with respect to a provided key. For many types of operations, this alone or in conjunction with typical compiler optimizations may be sufficient to make the key unrecoverable or unapparent within the generated code.
  • the state-vector function may be constructed such that appropriate keys for inverse operations are provided as needed within the state-vector function.
  • steps for the state-vector function which have wide variety
  • the initial and/or final steps of the state-vector function may be steps which mix input entropy across the entire state-vector, typically other than any separate key-input.
  • the state-vector function such that at least every few steps, a non-T-function step is performed.
  • examples of T-function steps include addition, subtraction, multiplication, bitwise AND
  • a state-vector function pair includes the state-vector function as described herein and the complete inverse of the state-vector function.
  • construction of the state-vector function pair may, but need not, be performed by, for example, combining a series of parameterized algorithms and/or inverse algorithms in the form of language source such as C++ code or the like.
  • substitution of generation-time keys may, but need not, be performed by a combination of macro substitution in the macro preprocessor, function in-lining, and use of parameterized templates.
  • such operations may be automated within a state-vector generating system as disclosed herein.
  • Once the state-vector function pair has been generated, one or both members may be protected using binary- and/or compiler-level tools to further modify the generated code.
  • the specific modifications made to one or both functions in the state-vector function pair may be selected based upon whether or not each member is expected to execute in an environment likely to be subject to attack.
  • the function or a part of the function that is expected to be in an exposed environment may be bound near a point at which an input vector is provided to the state-vector function, and/or near the point where an output vector is consumed by its invoking code.
  • the code may be bound by, for example, the use of dynamic data mangling and/or fractures as disclosed herein.
  • the inputs provided may be from a mangled store, and outputs may be fetched by an invoker from the mangled store.
  • Other techniques may be used to bind code at these points, such as data-flow duplication with cross-linking and cross-trapping as disclosed herein. Different combinations may be used, such as where dynamic data mangling, fractures, and data-flow duplication are all applied at the same point to bind the code at that point.
  • the protections applied to code expected to be in an exposed environment may be applied within one or both members of the state-vector function pair, with the portion of the code affected determined by the needed level of security.
  • applying multiple additional protection types at each possible point or almost each possible point may provide maximal security; applying a single protection at multiple points, or multiple protection types at only a single code point, may provide a lower level of security but improved performance during code generation and/or execution.
  • fractures may be applied at multiple points throughout the generation and binding process, because many opportunities for fracture creation may exist due to generation of many linear and affine operations among the steps of the state-vector function during its construction.
  • In some embodiments, it may be useful to make one member of a state-vector function pair more compact than the other. This may be done, for example, by making the other member of the pair more expensive to compute.
  • When one member of a state-vector function pair is to be used on exposed and/or limited-power hardware such as a smart card or the like, it may be preferred for a hardware-resident member of the state-vector function pair to be significantly more compact than in other embodiments disclosed herein. To do so, a corresponding server-resident or other non-exposed member of the state-vector function pair may be made significantly more costly to compute. As a specific example, rather than using the relatively high number of coefficients that would be expected for a state-vector function generation technique as disclosed previously, a repetitious algorithm may be used.
  • the repetitious algorithm may use coefficients supplied by a predictable stream generation process or similar source, such as a pseudo-random number generator that uses a seed which completely determines the generated sequence.
  • a suitable example of such a generator is a pseudo-random generator based on ARC4.
  • the pseudo-random number generator may be used to generate all matrix elements and displacement-vector elements. Appropriate constraints may be applied to ensure invertibility of the resulting function.
  • a limited-resource device such as a smart card may be adapted to execute one of a state-vector function pair, while the system as a whole still receives at least some of the benefits of a complete state-vector function system as disclosed herein.
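  • A minimal Python sketch of seed-driven coefficient generation with an invertibility constraint. Python's random.Random is used purely as a stand-in for the ARC4-based generator mentioned above (it is a different algorithm and is not suitable for security use), and the constraint chosen here, a lower-triangular matrix with an odd diagonal, is only one assumed way to guarantee invertibility over $\mathbb{Z}/(2^w)$.

```python
import random

W = 32
MOD = 1 << W
N = 4                            # assumed vector length

def coefficients(seed):
    """Regenerate the matrix and displacement vector from the seed alone."""
    rng = random.Random(seed)    # stand-in for an ARC4-based stream generator
    matrix = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i):
            matrix[i][j] = rng.randrange(MOD)
        matrix[i][i] = rng.randrange(MOD) | 1    # constraint: odd diagonal
    disp = [rng.randrange(MOD) for _ in range(N)]
    return matrix, disp

def forward(seed, x):
    """y = matrix * x + disp (mod 2^W)."""
    matrix, disp = coefficients(seed)
    return [(sum(matrix[i][j] * x[j] for j in range(i + 1)) + disp[i]) % MOD
            for i in range(N)]

def inverse(seed, y):
    """Recover x by forward substitution, using inverses of the odd diagonal."""
    matrix, disp = coefficients(seed)
    x = []
    for i in range(N):
        acc = (y[i] - disp[i] - sum(matrix[i][j] * x[j] for j in range(i))) % MOD
        x.append((acc * pow(matrix[i][i], -1, MOD)) % MOD)
    return x
```

  In this sketch the compact member of the pair needs to store only the seed and the short loop, at the cost of regenerating the coefficients on every call.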
  • base functions as disclosed herein may be used to provide a secure communication pipe from one or more applications on one or more platforms, to one or more applications on one or more other platforms (i.e. an e-link). The same process may be used to protect communication from one sub-application to another sub-application on a single platform.
  • a base function pair $f_K$, $f_K^{-1}$ may be used to protect a pipe by performing a cipher-like encrypt and decrypt at respective ends of the pipe.
  • the base function may be applied to the pipe start and pipe end, and also applied to the application and its platform, thus binding them together and binding them to the pipe. This secures (1) the application to the pipe-start, (2) the pipe-start to the pipe-end, and (3) the pipe-end to the application information flow.
  • a key K is generated using a random or pseudo-random process.
  • the base-functions $f_K$, $f_K^{-1}$ are then generated using the key K and randomization information R.
  • the base functions are then applied to pipe-start and pipe-end so that at run time, the pipe-start computes $f_K$, and the pipe-end computes $f_K^{-1}$.
  • the key K can then be discarded as it is not required to execute the protected code.
  • the base-function specifications will be cipher-based specifications for $f_K$, $f_K^{-1}$ (similar to FIPS-197 for AES encrypt and decrypt).
  • Cloaked base-functions are specific implementations (pipe-start and pipe-end above) of the smooth base-functions designed to foil attempts by attackers to find K, invert a base-function (i.e., break encryption), or break any of the bindings shown above. That is, a smooth base function is one which implements $f_K$ or $f_K^{-1}$ straightforwardly, with no added obfuscation. A cloaked base function still computes $f_K$ or $f_K^{-1}$, but it does so in a far less straightforward manner. Its implementation makes use of the obfuscation entropy R to find randomly chosen, hard-to-follow techniques for implementing $f_K$ or $f_K^{-1}$. Further examples of techniques for creating and using cloaked base functions are provided herein.
  • embodiments disclosed herein may replace matrix functions with functions which are (1) wide-input; that is, the number of bits comprising a single input is large, so that the set of possible input values is extremely large, and (2) deeply nonlinear; that is, functions which cannot possibly be converted into linear functions by i/o encoding (i.e., by individually recoding individual inputs and individual outputs).
  • Making the inputs wide makes brute force inversion by tabulating the function over all inputs consume infeasibly vast amounts of memory, and deep nonlinearity prevents homomorphic mapping attacks.
  • Some embodiments may use "Function-Indexed Interleaving", which may provide diffusion and/or confusion components which are deeply nonlinear.
  • a function from vectors to vectors is deeply nonlinear if and only if it cannot be implemented by a matrix together with arbitrary individual input- and output-encodings. If it is not deeply nonlinear, then it is "linear up to I/O encoding" ("linearity up to I/O encoding" is a weakness exploited in the BGE attack on WhiteBox AES.)
  • Function-Indexed Interleaving allows conformant deeply nonlinear systems of equations to be solved by linear-like means. It can be used to foster data-dependent processing, a form of dynamic diversity, in which not only the result of a computation, but the nature of the computation itself, is dependent on the data.
  • Figure 30 shows a process flow diagram of an example Function-Indexed Interleaving process, which interleaves a single 4 x 4 function with a family of 4 x 4 functions.
  • the 1 x 1 function with 1 x 1 function-family case permits combining of arbitrary kinds of functions, such as combining a cipher with itself (in the spirit of 3DES) to increase key-space; combining different ciphers with one another;
  • the square boxes represent bijective functions, typically but not necessarily implemented by matrices.
  • the triangle has the same inputs as the square box it touches and is used to control a switch which selects among multiple right-side functions, with inputs and outputs interleaving left-side and right-side inputs and outputs as shown: if the left-side box and right-side boxes are 1-to-1, so is the whole function; if the left-side box and right-side boxes are bijective, so is the whole function; if the left-side box and right-side boxes are MDS (maximum distance separable), so is the whole function, whether bijective or not.
  • If the triangle and all boxes are linear and chosen at random, then (by observation) over 80% of the constructions are deeply nonlinear.
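  • A small Python sketch of function-indexed interleaving and its inverse. For brevity it interleaves a 2 x 2 left function with a family of four 2 x 2 right functions rather than the 4 x 4 case of Figure 30; the matrix, the selector formula and the right-side family are hypothetical choices.

```python
W = 16
MOD = 1 << W

# Hypothetical right-side family: R_k(x) = MULS[k] * x + ADDS[k], with odd MULS[k].
MULS = [3, 5, 7, 9]
ADDS = [11, 22, 33, 44]

def left(u):
    """Left-side bijection: the matrix [[1, 3], [2, 7]] (determinant 1)."""
    a, b = u
    return ((a + 3 * b) % MOD, (2 * a + 7 * b) % MOD)

def left_inv(p):
    """Inverse matrix [[7, -3], [-2, 1]] modulo 2^W."""
    s, t = p
    return ((7 * s - 3 * t) % MOD, (t - 2 * s) % MOD)

def selector(u):
    """The 'triangle': sees the same inputs as left() and picks an index."""
    return (u[0] ^ u[1]) & 3

def right(k, v):
    return tuple((MULS[k] * x + ADDS[k]) % MOD for x in v)

def right_inv(k, r):
    m_inv = pow(MULS[k], -1, MOD)
    return tuple(((y - ADDS[k]) * m_inv) % MOD for y in r)

def forward(x):
    """Left inputs are x[0], x[2]; right inputs are x[1], x[3]; outputs interleave."""
    u, v = (x[0], x[2]), (x[1], x[3])
    l, r = left(u), right(selector(u), v)
    return [l[0], r[0], l[1], r[1]]

def inverse(y):
    """Invert the left side first, which recovers the selector's inputs."""
    u = left_inv((y[0], y[2]))
    v = right_inv(selector(u), (y[1], y[3]))
    return [u[0], v[0], u[1], v[1]]
```

  Changing only a left-side input changes which right-side function is applied: forward([0, 5, 0, 6]) uses R_0 on the right half, while forward([1, 5, 0, 6]) uses R_1.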
  • function-indexed interleaving also may be nested, such that the left- function or right-function-family may themselves be instances of function-indexed
  • Three specific example embodiments are described in detail herein, referred to as the Mark I, II and III systems.
  • An exemplary implementation of the Mark I system is presented in the process flow diagram of Figure 31.
  • the square boxes represent mixed Boolean arithmetic (MBA) polynomial encoded matrices.
  • Each matrix is encoded independently, and the interface encodings need not match. Thus, 2 x 2 recodings cannot be linearly merged with predecessors and successors.
  • the central construction is function-indexed interleaving which causes the text processing to be text-dependent.
  • the number of interleaved functions can be very large with low overhead. For example, permuting rows and columns of 4 x 4 matrices gives 4! x 4! = 576 choices. As another example, XORing with initial and final constants gives a very large number of choices.
  • Initial and final recodings mix the entropy across corresponding inputs/outputs of the left fixed matrix and the right selectable matrices. Internal input/output recodings on each matrix raise the homomorphic mapping work factor from order 2^(3w/2) to order 2^(5w/2), allowing for full 'birthday paradox' vulnerability - the work factor may be higher, but is unlikely to be lower.
  • Static dependency analysis can be used to isolate the components.
  • a function which is a T-function will have the property that a change to an input element's 2^i bit never affects an output element's 2^j bit when i > j.
  • the bit-order numbering within words is considered to be from low-order (2^0) to high-order (2^(w-1)) bits, regarding words as representing binary magnitudes, so this may be restated as: an output bit can only depend on input bits of the same or lower order. So it may be possible to "slice off" or ignore higher bits and still get valid data.
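  • The T-function property can be checked empirically. The sketch below is illustrative only: `t_func` and `non_t_func` are hypothetical functions chosen for the demonstration, and the check is a random test rather than a proof.

```python
import random

W = 32
MASK = (1 << W) - 1

def t_func(x, y):
    """Built only from *, +, ^, |, & on w-bit words, so it is a T-function."""
    return ((x * y) + (x ^ 0x9E3779B9) + ((x | y) & 0x5A5A5A5A)) & MASK

def non_t_func(x, y):
    """A right shift moves high-order input bits into low-order output bits."""
    return ((x >> 3) + y) & MASK

def low_bits_independent(f, k, trials=2000, seed=1):
    """Test whether the low k output bits ignore all input bits above position k."""
    rng = random.Random(seed)
    low = (1 << k) - 1
    for _ in range(trials):
        x, y = rng.getrandbits(W), rng.getrandbits(W)
        if (f(x, y) & low) != (f(x & low, y & low) & low):
            return False
    return True
```

  • For `t_func`, slicing off the high-order bits of the inputs leaves the low-order output bits unchanged, whereas `non_t_func` fails the same check because of its right shift.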
  • Some embodiments also may incorporate tens of millions of T-functions, in contrast to known implementations which only use hundreds of T-functions. As a result, embodiments disclosed herein may be more resistant to bit slicing attacks and statistical attacks.
  • Functions composable from ∧, ∨, ⊕, ¬ computed over B^w together with +, -, × over Z/(2^w), so that all operations operate on w-bit words, are T-functions. Obscure constructions with the T-function property are vulnerable to bit-slice attacks, since it is possible to obtain, from any T-function, another legitimate T-function, by dropping high-order bits from all words in input and output vectors. The T-function property does not hold for right bit-shifts, bitwise rotations, division operations, or remainder/modulus operations based on a
  • a less severe external vulnerability may exist if the functions of the pair have the property that each acts as a specific T-function on specific domains, and the number of distinct T-functions is low.
  • a statistical bucketing attack can characterize each T-function.
  • the domains can similarly be characterized. Again without any examination of the code, using an adaptive known-plaintext attack, an attacker can fully characterize the functionality of a member of the pair, completely bypassing its protections, using only black-box methods. Plainly, it may be desirable to have an effective number of distinct T-functions to foil the above attack.
  • In Mark III type implementations, for example, there are over 10^8 distinct T-functions per segment and over 10^40 T-functions overall. Mark III type
  • the pair of implementations may include functions which achieve full cascade, that is, every output depends on every input, and on average, changing one input bit changes half of the output bits.
  • An example of an internal vulnerability may occur in a Mark II type implementation where, by 'cutting' the implementation at certain points, it may be possible to find a sub-implementation (a component) corresponding to a matrix such that the level of dependency is exactly 2 x 2 (in which case the component is a mixer matrix) or 4 x 4 (in which case it is one of the L, S, or R matrices). Once these have been isolated, properties of linear functions allow very efficient characterization of these matrices. This is an internal attack because it requires non-black-box methods: it actually requires examination of internals of the implementations, whether static (to determine the dependencies) or dynamic (to characterize the matrices by linearity-based analyses).
  • embodiments disclosed herein may provide, by means of variable and increasingly intricate internal structures and increasingly variegated defenses, an environment in which any full crack of an instance requires many sub-cracks, the needed sub-cracks vary from instance to instance, the structure and number of the attacked components varies from instance to instance, and the protection mechanisms employed vary from instance to instance.
  • automating an attack becomes a sufficiently large task to discourage attackers from attempting it.
  • the deployed protections may have been updated or otherwise may have moved on to a new technology for which the attack-tool's algorithm no longer suffices.
  • FIG. 23 presents the processing of a "base core function" which appears four times in Figure 12.
  • the complete execution flow for a Mark II type system is shown in Figures 5 and 6, and described in further detail with reference to Figures 5 and 6 in Section 5.1 of the Appendix.
  • the Mark II proposal is similar to Mark I in that it has a fixed internal structure, with only coefficient variations among the base function implementation pairs. Further description regarding the example embodiment of a Mark II implementation and a
  • a Mark III base function design may include the following properties: an irregular and key-determined structure, so that the attacker cannot know the details of the structure in advance; highly data-dependent functionality: varying the data varies the processing of the data, making statistical bucketing attacks resource-intensive; an extremely high T-function count (the number of separate sub-functions susceptible to a recursive bit-slice attack), making a blind bit-slice attack on its T-functions infeasible; redundant and implicitly cross-checked data-flow, making code-modification attacks highly resource-intensive; and omni-directional obfuscation-induced dependencies, making dependency-based analysis resource-intensive.
  • Figure 13 shows a schematic representation of execution flow in a portion of an example Mark III type implementation. Similar to the example execution flows described with respect to the Mark I and Mark II type implementations, each component may represent a function, process, algorithm or the like, with arrows representing potential execution paths between them. Where different arrows lead to different points within the components, it will be understood that different portions of the component may be executed, or different execution paths within the component may be selected. As shown in Figure 13, a Mark III type implementation may provide an irregular, key-dependent, data-dependent, dataflow-redundant, cross-linked, cross-checked, tamper-chaotic structure, containing a nested function-indexed interleaving within a function-indexed interleaving.
  • FIG. 15 shows another example schematic of a portion of a Mark III type implementation as disclosed herein.
  • the initial and final mixing may use linear transforms of 32-bit words having widths of 3 to 6. Five to seven segments may be used, each of which contains a 3-band recursive instance of function-indexed interleaving.
  • Each band is 3 to 6 elements wide, with a total of 12 elements for all three bands.
  • Matrices are I/O permuted and I/O rotated, giving over 100 million T-subfunctions per segment: the whole base function has over 10^40 T-subfunctions.
  • Dataflow duplication, random cross-connection, and random checks, combined with code-reordering also may be used, creating omnidirectional cross-dependencies. A number of the different defenses that may be used in a Mark III type system are shown graphically in Figure 16.
  • memory-shuffling with fractured transforms (dynamic data mangling)
  • random cross-linking, cross-trapping, and variable-dependent coding which causes pervasive inter-dependence and chaotic tamper response
  • permutation polynomial encodings and function-indexed interleaving which hobble linear attacks
  • variable, randomly-chosen structure which hobbles advance-knowledge attacks
  • functionality is highly dependent on run-time data, reducing repeatability and hobbling statistical bucketing attacks.
  • Some embodiments may include data flow duplication techniques. For example, as described below, for every instruction which is not a JUMP. . . , ENTER, or EXIT, the instruction may be copied so that an original instruction is immediately followed by its copy, and new registers may be chosen for all of the copied instructions such that, if x and y are instructions, with y being the copy of x,
  • JUMPA ('jump arbitrarily'), which is an unconditional branch with two destinations in control-flow graph (cfg) form, just like a conditional branch, but with no input: instead, JUMPA chooses between its two destinations at random. JUMPA is not actually part of the VM instruction set, and no JUMPA will occur in the final obfuscated implementation of f or f^(-1).
  • fractures may be useful in obfuscation because the computation which they perform effectively does not appear in the encoded code - the amount and form of code to perform a normal networked encoding and one which adds an operation by means of a fracture is identical, and there appears to be no obvious way to disambiguate these cases, since encodings themselves tend to be somewhat ambiguous.
  • the defining property of a fracture is the fracture function, for example v^(-1) ∘ u.
  • specifying the fracture function does not necessarily specify the producing and consuming encodings which imply it.
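  • A fracture can be illustrated with two mismatched affine encodings. This is a hedged sketch: the affine form, the 32-bit word size, and the constants are assumptions made for the example, and `make_affine` is a hypothetical helper. The producer encodes with u, the consumer decodes as if the encoding were v, and the composition v^(-1) ∘ u performs a hidden addition.

```python
MASK = 0xFFFFFFFF

def make_affine(a, b):
    """Return (encode, decode) for x -> a*x + b mod 2**32; a must be odd to be invertible."""
    a_inv = pow(a, -1, 1 << 32)
    encode = lambda x: (a * x + b) & MASK
    decode = lambda e: (a_inv * (e - b)) & MASK
    return encode, decode

u_encode, _ = make_affine(3, 35)  # producer's encoding u(x) = 3x + 35
_, v_decode = make_affine(3, 5)   # consumer assumes encoding v(x) = 3x + 5

def fracture(x):
    """The fracture function v^(-1) o u; with these constants it equals x + 10."""
    return v_decode(u_encode(x))
```

  • Note that many different pairs (u, v) give the same fracture function x + 10, which is why specifying the fracture function does not pin down the encodings that imply it.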
  • Mass Data Encoding is described in United States Patent No. 7,350,085, the contents of which are incorporated herein by reference. In short, MDE scrambles memory locations in a hash-like fashion, dynamically recoding memory cells on each store and dynamically recoding and relocating memory cells by background processing.
  • a fetch or store can perform an add or multiply while continuing to look like a simple fetch or store. This makes it hard for an attacker to disambiguate between mere obfuscation and useful work.
  • MDE is compiled, not just interpreted, so supporting data structures are partially implicit and hence, well-obscured. Actual addresses are always scrambled and rescrambled by background activity.
  • the code accessing the Virtual MDE memory is initially written as if it were accessing an ordinary piece of memory.
  • the code is then modified by the methods described in US patent 7,350,085 to employ a mapping technique which encodes both the data and locations in the memory.
  • the locations accessed move around over time, and the encodings applied to the data likewise change over time, under the feet of the running code.
  • This technique of protection has substantial overhead, but its highly dynamic nature makes it arduous for an attacker to penetrate the meaning of software which uses it.
  • Cells are recoded when stored, and are recoded periodically by background activity. Mismatching recode on store and corresponding recode on fetch can do a covert add or multiply (key-controllable).
  • Fetched items are recoded, but not to smooth (i.e., not to unencoded).
  • Stored items are not smooth prior to store, and are recoded on store to a dynamically chosen new cell encoding.
  • Stored data are meaningless without the code which accesses them.
  • One program can have any number of distinct, nonoverlapping MDE memories. An MDE memory can be moved as a block from one place to another or can be transmitted from one program to another via a transmission medium. That is, messages of sufficient bulk can be transmitted in MDE-memory form.
  • the initial state of the memory is not produced by hacker-visible activity, and hence conceals how its contents were derived. That is, the initial state is especially obscure.
  • Control Flow Encoding is described in United States Patent No. 6,779,114, the contents of which are incorporated herein by reference.
  • CFE combines code-fragments into multi-function lumps with functionality controlled by register-switching: many-to-many mapping of functionality to code locations; execution highly unrepeatable if external entropy available: the same original code turns into many alternative executions in CFE code.
  • by way of the register-switching and dispatch code, key information can control what is executed and therefore control the computation performed by embodiments of the invention.
  • Code represented by the control-flow graph of Figure 18, where the letters denote code fragments, can be encoded as shown in Figure 19.
  • the protected control-flow encoding shows lumps created by combining pieces, executed under the control of the dispatcher, with the 'active' piece(s) selected by register switching.
  • CFE is compiled, not just interpreted, so supporting data structures are partially implicit, and hence, well-obscured.
  • Lumps combine multiple pieces; that is, they have multiple possible functionalities. When a lump is executed, which piece(s) is/are active is determined by which operate via registers pointing to real data, not dummy data. The same piece may occur in multiple lumps, with different data-encodings: mapping from
  • the dispatcher can be arranged to select pieces which embody a background process, making it hard to distinguish background and foreground activity. Available entropy is used to determine which alternative way of executing a sequence of pieces is employed, providing dynamic execution diversity (nonrepeating execution). As well, key information can be used to influence dispatch and hence vary the represented algorithm.
  • a modulus M, a permutation polynomial p over the mod-M ring, an input-based 1 x n vector mapping matrix A yielding z from the inputs, and a series of constants c_i = p(z+i) for 1 ≤ i ≤ M, may be selected, where the c_i values are distinct since p is a mod-M perm-polynomial.
  • Locations c_1, ..., c_n (with n ≤ M) are treated in an array of size M as 'M-registers'.
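  • The selection of M-register locations can be sketched as follows. The modulus M = 16, the polynomial 2x^2 + 3x + 5, and the helper names are assumptions chosen for illustration; the polynomial satisfies Rivest's criterion for permutation polynomials modulo a power of two (odd linear coefficient, even sums of the even-degree and of the odd-degree higher coefficients).

```python
M = 16  # toy modulus, assumed to be a power of two for this example

def p(x):
    """2x^2 + 3x + 5 mod 16, a permutation polynomial over Z/(16)."""
    return (2 * x * x + 3 * x + 5) % M

def mix(inputs, A):
    """z = A . inputs mod M: a 1 x n matrix applied to the input vector."""
    return sum(a * v for a, v in zip(A, inputs)) % M

def register_locations(z, n):
    """c_i = p(z + i) for 1 <= i <= n, with n <= M; the c_i are distinct."""
    return [p(z + i) for i in range(1, n + 1)]

z = mix([7, 9, 200], [3, 1, 5])
locations = register_locations(z, 5)
```

  • Since p permutes Z/(M) and the arguments z+1, ..., z+n are distinct mod M, no two M-registers collide, while their positions still depend on the input-derived value z.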
  • data may be moved randomly into and out of M-registers, and from M-register to M-register, changing encoding at each move. Some embodiments also may randomly cause either the encodings to form an unbroken sequence, or may inject fractures as disclosed herein where encodings do not match.
  • the fracture function is e3 = e2^(-1) ∘ e1. If e1, e2 are linear, so is e3. If e1, e2 are permutation polynomials, so is e3.
  • fractures may provide a means of injecting hidden computations such that the code looks much the same before and after it is added.
  • Cross-Linking and Cross-Trapping: the generous application of cross-linking and cross-trapping can provide aggressive chaotic response to tampering and perturbation attacks, with much stronger transcoding and massive static analysis resistance.
  • cross-linking and cross-trapping may be effected as follows, as illustrated in Figure 21: 1) copy computations at least once;
  • the context in which base function pairs are implemented may be an integral part of the operation of the base-function.
  • Context includes information from the application, hardware, and/or communication.
  • Context of one base-function component can also include information from other components, which are part of the application in which it resides.
  • an implementation of a base-function pair or a similar construct may be hosted on a platform from which hardware or other platform signature constants can be derived and on which the implementation can be made to depend. It may be preferred for the implementation to reside in a containing application from which an application signature or other application constants can be derived and on which the implementation can be made to depend.
  • the implementation may also take inputs from which further constant signature information can be derived and on which the implementation can be made to depend.
  • Permutations may provide a basis for storing enormous numbers of alternatives in limited space. For example, row/column permutations may be used to turn a non-repeating 4x4 matrix into 576 non-repeating 4x4 matrices. In some embodiments, the order of computations may be permuted, deep dependence of computations on run-time data may be generated, and the like.
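  • The count of 576 can be reproduced by brute force. The sketch assumes that "non-repeating" means a matrix whose 16 entries are all distinct; `row_col_variants` is a hypothetical helper name.

```python
from itertools import permutations

def row_col_variants(matrix):
    """Collect every matrix obtainable by permuting the rows and the columns."""
    n = len(matrix)
    variants = set()
    for rows in permutations(range(n)):
        for cols in permutations(range(n)):
            variants.add(tuple(tuple(matrix[r][c] for c in cols) for r in rows))
    return variants

# A 4 x 4 matrix with 16 distinct entries.
base = [[4 * r + c for c in range(4)] for r in range(4)]
variants = row_col_variants(base)
```

  • Only the base matrix and a pair of permutation indices need to be stored, which is the sense in which permutations hold many alternatives in limited space.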
  • in a sorting-network-based approach, to sort, each cross-link compares its two elements and swaps them on greater-than. To permute, swaps are instead performed with probability 1/2. It is easy to show that if the network sorts correctly with a compare-swap, then it permutes with random swap, with the full range of permutations as possible outputs. Some embodiments may use, as a recommended probability-1/2 Boolean generator, a comparison of two text-based full-range permutation-polynomial encoded values.
  • such a process may yield a bias: the permutation count is the factorial of the number of elements to permute, which in general does not evenly divide the number of swap-configurations (a power of two).
  • the advantage is simplicity and high dependency count with non-T functionality.
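  • The sorting-network approach and its bias can be seen on a small case. The three-comparator network below is an assumed toy example, and the function names are hypothetical; enumerating all swap configurations replaces the random choices so that the result is deterministic.

```python
from collections import Counter
from itertools import product

NETWORK = [(0, 1), (1, 2), (0, 1)]  # a 3-input sorting network

def run(items, decide):
    """Walk the network; decide(k, a, b) says whether cross-link k swaps."""
    items = list(items)
    for k, (i, j) in enumerate(NETWORK):
        if decide(k, items[i], items[j]):
            items[i], items[j] = items[j], items[i]
    return tuple(items)

def sort_with_network(items):
    """Compare-and-swap on greater-than at each cross-link."""
    return run(items, lambda k, a, b: a > b)

def permutation_histogram():
    """Count the outputs over all 2**3 swap configurations (each equally likely)."""
    counts = Counter()
    for swaps in product([False, True], repeat=len(NETWORK)):
        counts[run((0, 1, 2), lambda k, a, b: swaps[k])] += 1
    return counts
```

  • All 3! = 6 permutations appear, but because 6 does not divide the 8 configurations, two of them occur twice as often as the others.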
  • unbiased permutations can also be generated by selecting a 1st element at random by taking the (r_1 mod n)-th element among the elements (zero origin), selecting a 2nd element at random by taking the (r_2 mod (n-1))-th element from the remaining elements, and the like.
  • each r_i is a full-range text-based perm-poly value. This may provide almost perfectly bias-free and non-T-function behavior. However, operations may be harder to hide in or interleave with ordinary code than for sorting-network-based permutation.
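  • The selection-based method can be sketched directly. The function name and the sample r values are hypothetical; in the described embodiments the r values would be full-range encoded values rather than small literals.

```python
def permute_by_selection(items, r_values):
    """Pick element r_1 mod n, then r_2 mod (n-1) from what remains, and so on."""
    remaining = list(items)
    if len(r_values) < len(remaining):
        raise ValueError("need one r value per element")
    out = []
    for r in r_values[:len(items)]:
        # Zero-origin index into the elements not yet chosen.
        out.append(remaining.pop(r % len(remaining)))
    return out

shuffled = permute_by_selection("abcd", [7, 2, 5, 0])
```

  • The remainder operations with non-power-of-two moduli are what make this computation a non-T-function.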
  • bit-slice attacks are a common attack tool: repeatedly executing a function and ignoring all but the lowest-order bit, and then the lowest-order two bits, the three lowest-order bits, etc. This allows the attacker to gain information until the full word size (say 32 bits) is reached, at which point complete information has been obtained on how the function behaves.
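  • A bit-slice attack of this kind can be sketched against a toy keyed T-function. Everything here is an assumption made for illustration: the black box ((x xor k1) + k2), the 16-bit word size, and the names `make_box` and `bit_slice_attack`. The attacker extends key candidates one bit position at a time, keeping only those consistent with the low-order bits of observed outputs.

```python
import random

W = 16  # toy word size; the text uses 32 bits as its example
MASK = (1 << W) - 1

def make_box(k1, k2):
    """Black box built only from xor and add, hence a T-function of x."""
    return lambda x: ((x ^ k1) + k2) & MASK

def bit_slice_attack(box, samples):
    """Recover consistent keys bit by bit using only input/output pairs."""
    observed = [(x, box(x)) for x in samples]
    candidates = [(0, 0)]
    for j in range(1, W + 1):
        low = (1 << j) - 1   # keep only the j lowest-order bits
        bit = 1 << (j - 1)   # the new bit position being guessed
        survivors = []
        for k1, k2 in candidates:
            for c1 in (k1, k1 | bit):
                for c2 in (k2, k2 | bit):
                    if all((((x ^ c1) + c2) & low) == (y & low) for x, y in observed):
                        survivors.append((c1, c2))
        candidates = survivors
    return candidates

rng = random.Random(0)
samples = [rng.getrandbits(W) for _ in range(64)]
secret = (0xBEEF, 0x1234)
recovered = bit_slice_attack(make_box(*secret), samples)
```

  • Each step tests only four extensions per surviving candidate instead of searching the whole key space at once, which is why the T-function property is a liability. More than one key may survive, since different keys can describe the same function.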
  • a function constructed using T-function and non-T-function components has subdomains over which it is a T-function embedded in an entire domain in which the function is not.
  • liberal use also may be made of non-T-function computations at other points, such as at decision points, in permutations, in recodings, and the like.
  • Figure 24 shows a graphical representation of a typical usage of Mass Data Encoding or Dynamic Data Mangling as described above. If inputs to a base function are provided by such an obscured memory array, by either of these two techniques, and the results are also obtained by the application from the obscured memory array, it becomes difficult for an attacker to analyse the data-flow of information entering or leaving the base function, making attacks on the base function more arduous.
  • Security-Refresh Rate: for effective application security lifecycle management, applications typically must be capable of resisting attacks on an ongoing basis. As part of this resistance, such applications may be configured to self-upgrade in response to security-refresh messages containing security renewal information. Such upgrades may involve patch files, table replacements, new cryptographic keys, and other security-related information.
  • a viable level of security is one in which application security is refreshed frequently enough so that the time taken to compromise an instance's security is longer than the time to the security-refresh which invalidates the compromise; i.e., instances are refreshed faster than they can typically be broken. This is certainly achievable at very high security-refresh rates. However, such frequent refresh actions consume bandwidth, and as we raise the refresh rate, the proportion of bandwidth allocated to security-refresh messages increases, and available non-security payload bandwidth decreases.
  • variable-dependent coding may be used to further obscure the operation of related code.
  • One way of doing so is to use values that are used or generated by other operations in nearby or related sections of code.
  • values may be used repeatedly for different purposes within a region of code, which may make it more difficult for an attacker to discern any individual use, or to extract information about the specific operations being performed in relation to those values. For example, if a value x is encoded as ax+b, there may be a great deal of leeway in the specific values used for the constants a and b. In this example, if there are values available within the executing code that remain constant over the life of x, they may be used as one or more of the constants a and/or b.
  • a first operation f(Y) may return values a and b and a second operation g(Z) may return values c and d, each of which is stored in memory for a period of time.
  • variable x may be encoded as ax+b during the time that a and b are stored in memory, and as cx+d during the time that c and d are stored in memory.
  • the appropriate constants will be available via the memory to allow for decoding or otherwise manipulating x in the appropriate encoding.
  • the values may be overwritten or discarded after that time, since the encoding constants need only be available during the time that x is used by operations within the executing program.
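  • The two-phase encoding described above can be sketched as follows. The 32-bit modulus, the forcing of multipliers to odd values, and the concrete constants are assumptions for the example; the constants stand in for values that hypothetical operations f(Y) and g(Z) happened to leave in memory.

```python
MOD = 1 << 32

def as_multiplier(value):
    """Force a run-time value to be odd so that it is invertible mod 2**32."""
    return (value | 1) % MOD

def encode(x, a, b):
    """Encode x as a*x + b."""
    return (a * x + b) % MOD

def decode(e, a, b):
    """Recover x from a*x + b."""
    return (pow(a, -1, MOD) * (e - b)) % MOD

def recode(e, a, b, c, d):
    """Go from a*x + b directly to c*x + d without materialising plain x."""
    scale = (c * pow(a, -1, MOD)) % MOD
    return (scale * (e - b) + d) % MOD

# Hypothetical values left in memory by nearby operations f(Y) and g(Z).
a, b = as_multiplier(0x1234), 0xCAFE
c, d = as_multiplier(0x9E3779B8), 0x0BAD

x = 123456789
first = encode(x, a, b)               # while a and b are live
second = recode(first, a, b, c, d)    # once c and d have replaced them
```

  • After the switch, a and b are no longer needed and may be overwritten, since the value is recoverable from c and d alone.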
  • variable values generated during execution of code may be used for other purposes in addition to or as an alternative to the finite encoding example provided.
  • variable values may be used to select a random item from a list or index, as a seed for a pseudo-random number generator, as an additive, multiplicative, or other scaling factor, or the like.
  • variable values generated by one portion of executed code may be used in any place where a constant value is needed at another portion of executed code, for a duration not more than the generated variable values are expected to be available.
  • Embodiments of the invention described herein may be used to provide the following, where a "sufficient period of time" may be selected based on, or otherwise determined by, the needs of security lifecycle management: 1) Black-Box Security: security as a keyed black-box cipher against attacks up to adaptive known plaintext for a sufficient period of time;
  • Anti-Partitioning: cannot partition implementation into its construction blocks for a sufficient period of time; 6) Application-Locking: cannot extract implementation from its containing application for a sufficient period of time; and 7) Node-Locking: cannot extract implementation from its host platform for a sufficient period of time.
  • embodiments disclosed herein relate to base-function encoding, using various techniques and systems as disclosed. Specific embodiments also may be referred to herein, such as in the Appendix, as "ClearBox" implementations.
  • the various techniques as disclosed herein may use operations that are similar in nature to those used in an application that is being protected by the disclosed techniques, as previously described. That is, the protection techniques such as base functions, fractures, dynamic data mangling, cross-linking, and variable dependent coding may use operations that are similar to those used by the original application code, such that it may be difficult or impossible for a potential attacker to distinguish between the original application code and the protective measures disclosed herein.
  • base functions may be constructed using operations that are the same as, or computationally similar to, the operations performed by the original application code with which the base functions are integrated, in contrast to the distinctive functions typically employed by, for example, known encryption techniques. Such operations and techniques that are difficult or impossible to distinguish may be described herein as "computationally similar."
  • a method is generally conceived to be a self-consistent sequence of steps leading to a desired result. These steps require physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It is convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, parameters, items, elements, objects, symbols, characters, terms, numbers, or the like. It should be noted, however, that all of these terms and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. The description of the present invention has been presented for purposes of illustration but is not intended to be exhaustive or limited to the disclosed embodiments.
  • FIG. 32 is an example computer system 3200 suitable for implementing embodiments disclosed herein.
  • the computer 3200 may include a communication bus 3201 which interconnects major components of the system, such as a central processor 3210; a fixed storage 3240, such as a hard drive, flash storage, SAN device, or the like; a memory 3220; an input/output module 3230, such as a display screen connected via a display adapter, and/or one or more controllers and associated user input devices such as a keyboard, mouse, and the like; and a network interface 3250, such as an Ethernet or similar interface to allow communication with one or more other computer systems.
  • the bus 3201 allows data communication between the central processor 3210 and other components.
  • Applications resident with the computer 3200 generally may be stored on and accessed via a computer readable medium, such as the storage 3240 or other local or remote storage device.
  • each module shown may be integral with the computer or may be separate and accessed through other interfaces.
  • the storage 3240 may be local storage such as a hard drive, or remote storage such as a network-attached storage device.
  • various embodiments disclosed herein may include or be embodied in the form of computer-implemented processes and apparatuses for practicing those processes.
  • Embodiments also may be embodied in the form of a computer program product having computer program code containing instructions embodied in non-transitory and/or tangible media, such as floppy diskettes, CD-ROMs, hard drives, USB (universal serial bus) drives, or any other machine readable storage medium.
  • When such computer program code is loaded into and executed by a computer, the computer may become an apparatus for practicing embodiments disclosed herein.
  • Embodiments also may be embodied in the form of computer program code, for example, whether stored in a storage medium, loaded into and/or executed by a computer, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing embodiments disclosed herein.
  • the computer program code may configure the processor to create specific logic circuits.
  • a set of computer-readable instructions stored on a computer-readable storage medium may be implemented by a general-purpose processor, which may transform the general-purpose processor or a device containing the general-purpose processor into a special-purpose device configured to implement or carry out the instructions.
  • Embodiments may be implemented using hardware that may include a processor, such as a general purpose microprocessor and/or an Application Specific Integrated Circuit (ASIC) that embodies all or part of the techniques according to embodiments of the disclosed subject matter in hardware and/or firmware.
  • a computer system may include one or more modules configured to receive existing computer executable code, to modify the code as disclosed herein, and to output the modified code.
  • Each module may include one or more sub-modules, such as where a module configured to modify existing computer executable code includes one or more modules to generate base functions, blend the base functions with the code, and output the blended code.
  • other modules may be used to implement other functions disclosed herein.
  • Each module may be configured to perform a single function, or a module may perform multiple functions.
  • each function may be implemented by one or more modules operating individually or in coordination.
  • N the set of natural numbers
  • S ⊆ T: set S is contained in or equal to set T
  • u ‖ v is their concatenation:
  • the tuple of length |u| + |v| obtained by creating a tuple containing the elements of u in order and then the elements of v in order; e.g., (a, b, c, d) ‖ (x, y, z) = (a, b, c, d, x, y, z).
  • R indicates that R ⊆ A × B; i.e., that R is a binary relation on A × B.
  • This notation is similar to that used for functions below. Its intent is to indicate that the binary relation is interpreted as a multi-function (MF), the relational abstraction of a computation, not necessarily deterministic, which takes an input from set A and returns an output in set B. In the case of a function, this computation must be deterministic, whereas in the case of an MF, the computation need not be deterministic, and so it is a better mathematical model for much software in which external events may affect the progress of execution within a given process.
  • R^(-1) is the inverse of R.
  • for binary relations Q, R, S: (S ∘ R) ∘ Q = S ∘ (R ∘ Q).
  • [R_1, ..., R_k] is the aggregation of R_1, ..., R_k.
  • a directed graph (DG) G = (N, A), where set N is the node-set and binary relation A ⊆ N × N is the arc-relation or edge-relation. (x, y) ∈ A is an arc or edge of G.
  • in a DG G = (N, A), a node y ∈ N is reachable from a node x ∈ N if there is a path in G which begins with x and ends with y. (Hence every node is reachable from itself.)
  • the reach of x ∈ N is {y ∈ N | y is reachable from x}.
  • Two nodes x, y are connected in G iff one of the two following conditions holds recursively:
  • path (x) is a path from x to y, so every node n ∈ N of G is connected to itself.
  • a DG G = (N, A) is a connected DG iff every pair of nodes x, y ∈ N of G is connected.
  • a source node in a DG G = (N, A) is a node whose in-degree is zero, and a sink node in a DG G = (N, A) is a node whose out-degree is zero.
  • a DG G = (N, A) is a control-flow graph (CFG) iff it has a distinguished source node n_0 ∈ N from which every node n ∈ N is reachable.
  • G = (N, A) is a CFG with source node n_0.
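  • The definitions of reach and control-flow graph translate into a short check. This is a sketch under assumptions: the graph is given as a set of nodes and a set of (source, destination) arcs, the function names are hypothetical, and the distinguished node is required to have in-degree zero in line with the definition of a source node.

```python
def reach(arcs, x):
    """Set of nodes reachable from x, including x itself (the length-one path)."""
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        for src, dst in arcs:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def is_cfg(nodes, arcs, source):
    """True iff source has in-degree zero and every node is reachable from it."""
    in_degree_zero = all(dst != source for _, dst in arcs)
    return source in nodes and in_degree_zero and reach(arcs, source) == set(nodes)

nodes = {"a", "b", "c", "d"}
arcs = {("a", "b"), ("b", "c"), ("c", "b"), ("b", "d")}
```

  • In this example graph, node a qualifies as the distinguished source, while node b does not, because a is not reachable from b and b has incoming arcs.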
  • Z denotes the set of all integers and N denotes the set of all integers greater than zero (the natural numbers).
  • Z/(m) denotes the ring of the integers modulo m, for some integer m > 0.
  • Z/(m) = GF(m), the Galois field of the integers modulo m, when m is prime.
  • B denotes the set {0, 1} of bits, which may be identified with the two elements of the ring Z/(2) = GF(2).
  • x = 0 iff (−(x ∨ (−x)) − 1) < 0
  • Bitwise Computer Instructions and (Bⁿ, ∨, ∧, ¬).
  • Bⁿ: the set of all length-n bit-vectors
  • a computer with n-bit words typically provides bitwise and (∧), inclusive or (∨), and not (¬). Then (Bⁿ, ∨, ∧, ¬) is a Boolean algebra. In (B, ∨, ∧, ¬), in which the vector-length is one, 0 is false and 1 is true.
  • the lowest numbered bit in an element word at which y and y′ differ is not lower than the lowest numbered bit in an element word at which x and x′ differ.
  • this numbering within words to be from low-order (2^0) to high-order (2^(w−1)) bits, regarding words as representing binary magnitudes, so we can restate this as: an output bit can only depend on input bits of the same or lower order.
  • the T-function property does not hold for right bit-shifts, bitwise rotations, division operations, or remainder/modulus operations based on a divisor/modulus which is not a power of two, nor does it hold for functions in which conditional branches make decisions in which higher-order condition bits affect the value of lower-order output bits.
  • a non-constant polynomial is irreducible if it cannot be written as the product of two or more non-constant polynomials. Irreducible polynomials play a role for polynomials similar to that played by primes for the integers.
  • variable x has no special significance: as regards a particular polynomial, it is just a place-holder. Of course, we may substitute a value for x to evaluate the polynomial; that is, variable x is only significant when we substitute something for it.
  • Polynomials over GF(2) = Z/(2) have special significance in cryptography, since the (d + 1)-vector of coefficients is simply a bit-string and can efficiently be represented on a computer (e.g., polynomials of degrees up to 7 can be represented as 8-bit bytes); addition and subtraction are identical; and the sum of two such polynomials in bit-string representation is computed using bitwise ⊕ (exclusive or).
  • Encodings may be derived from algebraic structures (see §2.2). For example, finite ring encoding is based on the fact that affine functions x′ = sx + b over Z/(2^w), where w is the word width, which can be implemented by ignoring overflow so that the modulus is the natural machine integer modulus, are lossless whenever s is odd.
  • the key to encoded computation is that inputs, outputs, and computation are all encoded.
  • w is the preferred word-width for some computer (typically 8, 16, 32, or 64, with a trend over time towards the higher widths).
  • the units of Z/(2^w) are the odd elements 1, 3, 5, ..., 2^w − 1.
  • Polynomials of higher order may also be used: in general [27], for 1 < w ∈ N, over Z/(2^w),
  • P⁻¹(x) = ex³ + fx² + gx + h.
  • P′ denotes an encoded implementation derived from function P.
  • to indicate that P maps m-vectors to n-vectors, we write ⁿₘP.
  • P is then called an n × m function or an n × m transformation.
  • ⁿₘM indicates that M has m columns and n rows.
  • ⁿₘE (mnemonic: entropy-transfer function) is any function from m-vectors over B to n-vectors over B which loses no bits of information for m ≤ n and at most m − n bits for m > n. A function ⁿₘf which is not an instance of ⁿₘE is lossy. Multiple occurrences of ⁿₘE in a given formula or equation denote the same function.
  • ⁿe (mnemonic: entropy vector) is an arbitrary vector selected from Bⁿ. Multiple occurrences of ⁿe in a given formula or equation denote the same vector.
  • An affine function or affine transformation is a vector-to-vector function V defined for all vectors v ∈ Sᵐ for some set S by V(v) = Mv + d, where
  • M is a constant matrix
  • d is a constant displacement vector.
  • a function f: F^k → F^m from k-vectors to m-vectors over (F, +, ·) = GF(q) for some prime power q is deeply nonlinear iff there is no linear function g: F^k → F^m and encodings
  • fractures are potentially useful in obfuscation because the computation which they perform effectively does not appear in the code: the amount and form of code to perform a normal networked encoding and one which adds an operation by means of a fracture is identical, and there appears to be no obvious way to disambiguate these cases, since encodings themselves tend to be somewhat ambiguous.
  • we may also refer to the MF g derived by PE of f as a partial evaluation (PE) of f. That is, the term partial evaluation may be used to refer to either the derivation process or its result.
  • PE partial evaluation
  • X the input set which the PE retains
  • Y the input set which the PE removes by choosing a specific member of it
  • D x T the Cartesian product of the set D of source semantic descriptions and the set T of target platform semantic descriptions
  • Z the output set
  • E the set of object code files
  • PE is used in [10, 11]: the AES-128 cipher [16] and the DES cipher are partially evaluated with respect to the key in order to hide the key from attackers.
  • Optimizing compilers perform PE when they replace general computations with more specific ones by determining where operands will be constant at run-time, and then replacing their operations with constants or with more specific operations which no longer need to input the (effectively constant) operands.
  • RPE Reverse Partial Evaluation
  • RPE offers an indefinitely large number of alternatives: for a given g, there can be any number of different tuples (f, c, Y) every one of which qualifies as an RPE of g.
  • Finding an efficient program which is the PE of a more general program may be very difficult; that is, the problem is very tightly constrained.
  • Finding an efficient RPE of a given specific program is normally quite easy because we have so many legitimate choices; that is, the problem is very loosely constrained.
  • Control Flow Graphs in Code Compilation.
  • CFGs Control Flow Graphs
  • a basic block (BB) of executable code (a 'straight line' code sequence which has a single start point, a single end point, and is executed sequentially from its start point to its end point) is represented by a graph node, and an arc connects the node corresponding to a BB U to the node corresponding to a BB V if, during the execution of the containing program, control either would always, or could possibly, flow from the end of BB U to the start of BB V. This can happen in multiple ways:
  • Control flow may naturally fall through from BB U to BB V.
  • control flow naturally falls through from U to V:
  • Control flow may be directed from U to V by an intra-procedural control construct such as a while-loop, an if-statement, or a goto-statement.
  • Control flow may be directed from U to V by a call or a return.
  • control is directed from B to A by the call to f() in the body of g(), and from A to C by the return from the call to f():
  • Control flow may be directed from U to V by an exceptional control-flow event.
  • control is potentially directed from U to V by a failure of the dynamic_cast of, say, a reference y to a reference to an object in class A:
  • f is a function, but if f makes use of nondeterministic inputs such as the current reading of a high-resolution hardware clock, f is an MF but not a function.
  • some computer hardware includes instructions which may produce nondeterministic results, which, again, may cause f to be an MF, but not a function.
  • For an entire program having a CFG C = (N, T) and start node n₀, we identify N with the set of BBs of the program, we identify n₀ with the BB appearing at the starting point of the program (typically the beginning BB of the routine main() for a C or C++ program), and we identify T with every feasible transfer of control from one BB of the program to another.
  • a sorting network may be represented as shown in Figure 7, by a series of parallel wires 72 in which, at certain points, at right angles to these wires, one wire is connected to another wire by a cross-connection 74 (representing a compare-and-swap-if-greater operation on the data elements being carried by the two connected wires). If, irrespective of the inputs, the output emerges in sorted order, the resulting network is a sorting network.
  • the comparisons in a sorting network are data-independent: correct sorting results at the right ends of the wires irrespective of the data introduced at the left ends of the wires. Compare-and-swap-if-greater operations can be reordered so long as the relative order of the comparisons sharing an end-point is preserved.
  • Reducing bias requires that we ensure that the number of ways of reaching any permutation is roughly the same for each permutation. Since 2^b(n) is a power of two for any n, this cannot be done simply by adding extra stages. It is necessary in some cases to use other methods for reducing bias.
  • The first method of removing bias might be called attenuation.
  • that we need to choose one of 12 elements.
  • the second method of removing bias might be called reselection.
  • reselection: the second method of removing bias.
  • we need to choose one of 12 elements.
  • with probability 3/4 we succeed on the first try.
  • Reselection has the advantage that it can almost completely eliminate bias. It has the disadvantage that, while spatially compact, it involves redoing some steps.
  • The third method of removing bias might be called reconfiguration.
  • a bias of 2:1 with 8 nodes reachable 1 way and 4 reachable 2 ways.
  • this method provides the best combination of compactness and speed when the number of configurations needed to eliminate bias is small. (The maximum possible number of configurations is bounded above by the number of elements from which to choose, but at that number of configurations, using reconfiguration is pointless, because choosing the configuration is simply another instance of the problem whose bias we are trying to reduce.)
  • nonlinear encodings (arbitrary 1-to-1 functions, themselves representable as substitution boxes; i.e., as lookup tables) on values used to index such boxes and on element values retrieved from such boxes are likewise restricted to limited ranges due to space limitations.
  • any data transformation computed by an input-output-encoded implementation of such a blocked matrix representation which is implemented as a network of substitution boxes, or similar devices for representing essentially arbitrary random functions, is linear up to I/O encoding; that is, any such transformation can be converted to a linear function by individually recoding each input vector element and individually recoding each output vector element.
  • the attack method in [4] is a particular instance of a class of attacks based on homomorphic mapping.
  • the attack takes advantage of the known properties of linear functions, in this case over GF(2^8) since that is the algebraic basis of the computations in the AES.
  • addition in GF(2^n) is performed using bitwise ⊕ (exclusive or), and this function defines a Latin square of precisely known form.
  • elements of G, G_u, G_v are all bit-strings (of lengths n, u, v, respectively).
  • elements of G are 8-bit bytes and elements of G_u and G_v are 4-bit nybbles (half-bytes).
  • If the test in step 88 does not show that f is deeply nonlinear (or, for the variant immediately following this list, sufficiently deeply nonlinear), we return to step 80 and try again.
  • in step 88 we may increase the number of g functions with randomly selected distinct groups of three inputs and one output, for which we must show that the instance count is not obtainable by matrix. The more of these we test, the more we ensure that f is not only deeply nonlinear, but is deeply nonlinear over all parts of its domain. We must balance the cost of such testing against the importance of obtaining a deeply nonlinear function which is guaranteed to be deeply nonlinear over more and more of its domain.
  • Δ(u, v), for u = (u_1, ..., u_k) and v = (v_1, ..., v_k), is the number of element positions at which u and v differ; i.e., it is Δ(u, v) = |{i ∈ N | i ≤ k and u_i ≠ v_i}|.
  • S is a finite set and |S| ≥ 2
  • y is a function for which for any x.
  • MDS functions are important in cryptography: they are used to perform a kind of "ideal mixing".
  • employs an MDS function as one of the two state-element mixing functions in each of its rounds except the last.
  • When p = q, this is just the ordinary inverse of f. When p < q, the function behaves like an inverse only for vectors in the image of f.
  • (2) there is a TM which recognizes a language not in S.
  • Rice's theorem applies only to linguistic properties, not operational ones. E.g., it is decidable whether a TM halts on a given input in ≤ k steps, it is decidable whether a TM halts on every input in ≤ k steps, and it is decidable whether a TM ever halts in ≤ k steps.
  • the general impossibility of virus recognition is linguistic, and Rice's Theorem implies that a perfectly general virus recognizer is impossible.
  • Patents in Irdeto's patent portfolio include software obfuscation and tamper-resistance implemented by means of data-flow encoding [5, 7, 8] (the encoding of scalars and vectors and operations upon them), control-flow encoding [6] (modification of control-flow in a program to make it input-dependent with a many-to-many mapping from functions computed and chunks of code to compute them), and mass-data encoding [20] (software-based virtual memory or memories in which logical addresses are physically scattered and are also dynamically recoded and physically moved around over time by a background process).
  • data-flow encoding is primarily a static process (although variable-dependent coding, in which the coefficients for the encoding of certain variables and vectors are provided by other variables and vectors, renders it potentially somewhat dynamic)
  • control-flow encoding and mass-data encoding are primarily dynamic: data structures are planned statically, but the actual operation of these software protections is largely a result of the dynamic operations performed on these data structures at run-time.
  • the control-flow encoding was primarily aimed at (1) reducing repeatability to foil dynamic attacks, and (2) protecting disambiguation of control-flow by burying the normal control-flow of a program in considerable extra control-flow.
  • the mass-data encoding was originally aimed at finding an encoding which would work correctly in the presence of a high degree of dynamic aliasing: e.g., in C programs making aggressive use of pointers.
  • a difficulty with the above forms of dynamic encoding is that the support data structures (dispatch and register-mapping tables for control-flow encoding, virtual-memory en/de-code tables and address-mapping tables for mass-data encoding) themselves leak information.
  • Virtual Machine Root Instruction Set inner sequence is modified by the outer macro instruction of which the sequence is a parameter.
  • the macro instructions comprise the following:
  • s_1, ..., s_m are the registers from which the sequence inputs and d_1, ..., d_n are the registers to which it outputs;
  • T is the label to which control transfers when register s > n. and the value of n at;
  • the tree root is 2^31 (splitting the upper and lower half of the values), its left descendant is 2^30 (splitting the first and second quarter of the values), and its right descendant is 2^31 + 2^30 (splitting the third and fourth quarter of the values), and so on;
  • wiih 1 ⁇ 2 c ming from a fxf S via a ⁇ - ⁇ ⁇ ⁇ , m"tter e « ii r, is a ⁇ 'H macro ta-trwt ion of the* fonti with ail e, «'H « ! ⁇ BS in t!te KEooiiis choen randomly for / ⁇ , ,
  • i ⁇ HMirr [ i cf ⁇ , .... * u : .. , , « ⁇ with i!or ⁇ a single, randomly elioscn petiiiiifatioit, and eight iwiircaiees of a 2x2 fitter, each of the forei
  • the rule is that any MOVE which can be eliminated by ordinary levels of optimization must be eliminated; i.e., the final version of an obfuscated f_K or f_K⁻¹ implementation must contain the smallest number of MOVE instructions achievable using ordinary (i.e., non-heroic) levels of optimization.
  • a MOVE can be elided when it forms an arc in a tree of operations in which MOVEs form the arcs, the original value producer (possibly a φ-assignment) is the root, the root dominates the MOVE arcs and the consumers, and no consumer is itself a φ-assignment.
  • Copy elision can be performed at various points in the following process, and is certainly done as a final step to remove redundant MOVE instructions.
  • Branch-to-branch elision can be performed at various points in the following process, and is certainly done as a final step to eliminate branch-to-unconditional-branch sequences.
  • the single output is a 'hash' of the inputs, and will be used to generate distinct values for memory shuffling (see §5.2.13).
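
One of the definitions in the list above, the property that an output bit can only depend on input bits of the same or lower order, can be checked exhaustively for small word widths. The following is a minimal sketch, assuming 8-bit words and a one-input function; the helper name `is_t_function` and the sample functions are hypothetical and chosen only for illustration.

```python
def is_t_function(f, w=8):
    """Exhaustively test the bit-order property for a one-input function on w-bit words.

    The property holds when inputs that agree on bits 0..i always give
    outputs that agree on bits 0..i, for every bit position i.
    """
    for i in range(w):
        low = (1 << (i + 1)) - 1            # mask selecting bits 0..i
        seen = {}                           # low input bits -> low output bits
        for x in range(1 << w):
            out = f(x) & low
            if seen.setdefault(x & low, out) != out:
                return False                # a higher input bit affected a lower output bit
    return True


MASK = 0xFF
poly = lambda x: (x * x + 3 * x + 1) & MASK      # add and multiply modulo 2**8
shl_xor = lambda x: (x ^ (x << 1)) & MASK        # left shift combined with xor
shr = lambda x: x >> 1                           # right shift
rotl = lambda x: ((x << 1) | (x >> 7)) & MASK    # bitwise rotation
mod3 = lambda x: x % 3                           # modulus that is not a power of two

for name, fn in [("poly", poly), ("shl_xor", shl_xor), ("shr", shr),
                 ("rotl", rotl), ("mod3", mod3)]:
    print(name, is_t_function(fn))
```

Under these assumptions the first two functions pass and the last three fail, which matches the list's statement that right shifts, rotations, and non-power-of-two modulus operations do not have the property.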

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Storage Device Security (AREA)

Abstract

Systems and techniques for securing accessible computer-executable program code and systems are provided. One or more base functions may be generated and blended with existing program code, such that it may be difficult or impossible for a potential attacker to distinguish the base functions from the existing code. The systems and code also may be protected using a variety of other blending and protection techniques, such as fractures, variable dependent coding, dynamic data mangling, and cross-linking, which may be used individually or in combination, and/or may be blended with the base functions.

Description

SECURING ACCESSIBLE SYSTEMS USING BASE FUNCTION
ENCODING
TECHNICAL FIELD
[1] The present invention relates generally to electronic computing devices and computer systems, and more specifically, to securing software and firmware on devices and systems which are accessible to attack.
BACKGROUND
[2] The use of computers, electronic computing devices and computer software in all of their various forms is recognized to be very common and is growing every day. As well, with the pervasiveness of powerful communication networks, the ease with which computer software programs and data files may be accessed, exchanged, copied and distributed is also growing daily. In order to take advantage of these computer and communication systems and the efficiencies that they offer, there is a need for a method of storing and exchanging computer software and data securely.
[3] One method of maintaining confidentiality or privacy that has demonstrated widespread use and acceptance is encryption of data using secret cryptographic keys. Existing encryption systems are designed to protect their secret keys or other secret data against a "black box attack". This is a situation where an attacker has knowledge of the algorithm and may examine various inputs to and outputs from the algorithm (such as in an adaptive chosen input/output attack), but has no visibility into the execution of the algorithm itself.
[4] While cryptographic systems relying on the black box model are very common, it has been shown that this model does not reflect reality. Often, the attacker is in a position to observe at least some aspect of the execution of the algorithm, and has sufficient access to the targeted algorithm to mount a successful attack (e.g., side-channel attacks such as timing analysis, power analysis, cache attacks, fault injection, etc.). Such attacks are often referred to as "grey-box" attacks, the assumption being that the attacker is able to observe at least part of the system execution.
[5] Recognizing this, an effort has been made to design encryption algorithms and data channels which are resistant to a more powerful attack model: the "white box attack". A white box attack is an attack on a software algorithm in which it is assumed that the attacker has full visibility into the execution of the algorithm. To date, such protection systems have met with reasonable success, but as such protection systems have become more and more sophisticated, so have the attacking techniques (such as encoding reduction attacks, statistical bucketing attacks and homomorphic mapping attacks). Thus, many existing white box protection systems are being shown to be ineffective against concerted attacks.
[6] Obfuscation of software by means of simple encodings has been in use for some time. In order to be useful, applications of such encodings in software obfuscation must not increase the time and space consumption of the software unduly, so such encodings are typically relatively simple. Hence, while they can protect software in bulk, they do not provide a high degree of security. There are many communication boundaries in software which represent particular vulnerabilities: passage of data in unprotected form into or out of an obfuscated program, passage of data into or out of a cipher implementation in software or hardware, and the like. The strength of prior encoding strategies typically is sharply limited by the data sizes which they protect. For conventional encodings, such protected items are on the order of 32 bits, sometimes 64 bits, and sometimes smaller pieces of data such as characters or bytes. Given the limitations of encodings and the operand sizes, fairly swift brute-force cracking of such encodings cannot be prevented in general.
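
The brute-force concern in [6] can be made concrete with a toy case. The sketch below assumes, purely for illustration, that a byte is protected by an affine encoding y = a·x + b modulo 2^8 with a odd, and that the attacker has observed the encoded forms of the inputs 0 and 1; the function name `recover_affine` and the secret values are hypothetical.

```python
def recover_affine(pairs, w=8):
    """Return every (a, b) with a odd such that a*x + b == y (mod 2**w) for all pairs."""
    mod = 1 << w
    return [(a, b)
            for a in range(1, mod, 2)       # only odd multipliers are invertible
            for b in range(mod)
            if all((a * x + b) % mod == y for x, y in pairs)]


# Hypothetical secret parameters of the toy encoding.
secret_a, secret_b = 0xA5, 0x3C
encode = lambda x: (secret_a * x + secret_b) % 256

# Two observed input/output pairs are enough to pin the parameters down.
observed = [(x, encode(x)) for x in (0, 1)]
candidates = recover_affine(observed)
print(candidates)
```

The search visits only 128 × 256 candidate pairs, so it finishes almost instantly. This toy case does not measure real encodings, which may use wider operands and other forms, but it shows why small operand sizes limit the protection a simple encoding can offer.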
[7] There is therefore a need for more effective secret-hiding and tamper-resistance techniques, providing protection of software code and data in general, as well as protection of secret cryptographic keys, biometric data, encrypted data and the like. It also is desirable to provide a much stronger form of protection for software boundaries than conventional simple encodings.
SUMMARY
[8] Embodiments of the present invention aim generally at providing more effective secret-hiding and tamper-resistance techniques, providing protection of software code and data without fear that security will be breached.
[9] The methods and systems disclosed herein are not limited to any particular underlying program. They may be applied to cryptographic systems, but equally, may be applied to non-cryptographic systems. As well, the software code that is being protected does not dictate what is done to protect it, so the protection techniques are not constrained by the underlying code. This may provide an advantage over other protection techniques which can leave or create patterns that are based on the underlying code. Such patterns may provide weaknesses that can be exploited by attackers.
[10] Some embodiments disclosed herein provide "profound data dependence", which can make it difficult or impossible to unentangle or distinguish the protected code and the code which is providing protection. For example, AES algorithms typically execute the same way all the time, no matter what the input data is. This makes it straightforward for an attacker to know what he is looking for and where to find it. Most white box protection systems have a rigid equation structure which does not address this type of problem. That is, an attacker may know what types of operations or effects to look for, and where in code or execution to look to find those operations or effects. In contrast, embodiments disclosed herein may provide coding which is not rigid, such as where each iteration of a protection algorithm results in a different encoding. Thus, the system is extremely non-repeatable. Among other things, this may make embodiments disclosed herein more resistant to a "compare" type attack, in which an attacker changes 1 bit and observes how the targeted program changes. In some embodiments disclosed herein, if an attacker changes 1 bit, then the protected code will look completely different.
[11] As a matter of overview, the embodiments of tools, families of tools, and techniques described herein may generally be grouped as follows:
1) Systems and techniques for blurring boundaries between modules of targeted code, and between the targeted code and the protection code. This may be accomplished, for example, by blending code together with surrounding code, and interleaving ciphers with other code, which is usually not done in other protective systems.
2) Systems and techniques for ensuring that a crack requires human intervention. Humans look for patterns that they have seen before. By introducing random functions according to embodiments disclosed herein, repetitive and/or common patterns can be removed so that automated attacks are largely ineffective.
3) Systems and techniques for protecting against "compare attacks". As noted above, a compare attack is an attack where two iterations of code execution are compared to see the difference, such as changing a single input bit to see how the operation and output change. Protection algorithms as disclosed herein may result in dramatically different functions with each iteration of the protected code, so a compare attack does not provide any useful information.
[12] The obfuscation techniques described herein may be implemented wherever the overhead can be accommodated. White box protection systems typically have larger overheads than the techniques described herein, and are therefore at a disadvantage.
[13] Some embodiments include systems and techniques for software protection that operate by applying bijective "base" functions to the targeted code. These base functions are pairs of mutually-inverse functions f_K, f_K⁻¹ which are used, for example, to encode an operation, and then un-encode the operation at a later point in a software application. The encoding obscures the original function and the data which it generates. There is no loss of information, as the unencoding operation compensates for the encoding operation, "undoing" or "reversing" its effect later in the encoded application. Base function pairs may be chosen such that an attacker cannot easily find or determine the inverse function. That is, given a function f_K, the inverse f_K⁻¹ may not be found easily without the key K. The key K may be used at code generation time, but then discarded once the functions f_K, f_K⁻¹ have been generated and applied to the targeted code. These base function pairs are also lossless, i.e. mathematically invertible. The protected software application does not need to decode a function or process completely to use it elsewhere in the targeted code, as the encoding and unencoding changes are included within the encoded application. In some embodiments it may be preferred that the base functions are "deeply non-linear", thus making homomorphic attacks more difficult. In some embodiments, base function pairs may include permutation polynomial encodings. A permutation polynomial is a polynomial which is invertible (a polynomial bijection).
[14] Some embodiments may generate or use base function pairs in such a manner that they generate "instance diversity" and "dynamic diversity". To achieve "instance diversity", each base function pair may create a secure "communication channel", such as between portions of a software application, between two software applications or platforms, or the like. Dynamic diversity may be created by linking operation of the software to the input data. Each time an encoding is performed, such as for communication between two encoded applications, instance and dynamic diversity may be generated between the two applications. The base functions may be highly "text dependent" so they offer good resistance to plaintext and perturbation attacks. If an attacker changes anything, even making a very small change such as the value of 1 bit, the change will result in a very large behavioural change. This feature is a significant contrast to conventional cipher code, which typically results in the same patterns and structure with each iteration of the code, regardless of the changes that an attacker makes. By making small changes and observing the impact, the attacker is able to gather information about the operation of cipher code, but he is not able to do the same with software encoded using systems and techniques disclosed herein. The diversity provided by embodiments disclosed herein also provides resistance to a "class crack". That is, it is not possible to provide an attack methodology which can systematically and automatically crack each embodiment of the invention in all cases. Note also that conventional white box implementations and code optimizers will not provide sufficient diversity to gain any effective protection.
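
The idea of a mutually inverse base function pair generated from a key that is then discarded, as described in [13], can be illustrated with a deliberately simple stand-in. The sketch below assumes a 32-bit word and uses a single affine map x → a·x + b modulo 2^32 with a odd. This is far simpler than the base functions the document describes, and the name `make_affine_pair` is hypothetical.

```python
import secrets

W = 32  # assumed word width for this sketch


def make_affine_pair(w=W):
    """Return (encode, decode), a mutually inverse pair over the integers modulo 2**w.

    The random values a and b play the role of the key K: they are used here,
    at generation time, and are not exposed to the caller afterwards.
    """
    mod = 1 << w
    a = secrets.randbits(w) | 1          # force a odd so that it is a unit modulo 2**w
    b = secrets.randbits(w)
    a_inv = pow(a, -1, mod)              # modular inverse (Python 3.8+)

    def encode(x):
        return (a * x + b) % mod

    def decode(y):
        return (a_inv * (y - b)) % mod

    return encode, decode


encode, decode = make_affine_pair()
value = 12345
assert decode(encode(value)) == value    # the pair is lossless
```

Each call to `make_affine_pair` draws fresh parameters, so two generated pairs will almost certainly differ, a small-scale analogue of the instance diversity discussed in [14]. A single affine map is linear up to a constant and would not resist the homomorphic attacks the document mentions; it is shown only to make the encode/un-encode relationship concrete.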
[15] The diversity and non-invertibility of the inventive base functions increase the complexity of the attack problem immensely. In contrast to conventional software code or code protection systems, when attempting to defeat the systems and techniques disclosed herein, an attacker must first figure out what function, code portion, application, or the like he is attacking, then how to invert it, and then how to exploit it.
[16] The diversity provided by embodiments disclosed herein may provide a variable, randomly-chosen structure to protected code. An engine which generates the base function pairs and encodings may rely on a random or pseudo-random key to choose the underlying function and/or the key. However, a key according to embodiments disclosed herein may not be as small as the keys of many conventional security systems (e.g., 64 or 128 bits); rather, it may be thousands or tens of thousands of bits. For example, a prototype was developed which uses 2,000 bits.
[17] The base functions disclosed herein may include bijections used to encode, decode, or recode data. Such bijections may include the following characteristics:
[18] 1) Encoding wide data elements (typically four or more host computer words wide), unlike typical scalar encodings (see [5, 7] listed in the Appendix), but like block ciphers.
[19] 2) Encoding data only: unlike typical scalar encodings, but like ciphers, they are not required to protect computations other than those involved in their own recoding of data elements.
[20] 3) Concealing blocks or streams of data, and/or producing fixed-length hashes of blocks or streams of data for authentication purposes, similar to block ciphers, but unlike scalar encodings.
[21] 4) Employing forms of operations purposely chosen from the operation repertoire of the software in which they will reside and with which they will be interlocked; i.e., they are designed to resemble the code in the context of which they are embedded, unlike ciphers.
[22] 5) Unlike both ciphers and scalar encodings, employing massive multicoding. A scalar encoding generally employs one or at most a few mathematical constructions. A cipher typically employs a slightly larger number, but the number is still small. In some embodiments of the invention, a variety of encodings are applied to an entire function, creating an intricately interlaced structure resulting from the interaction of many forms of protection with one another.
[23] 6) Unlike both ciphers and scalar encodings, providing a massively diverse algorithmic architecture. Embodiments may have no fixed number of rounds, no fixed widths for operands of various substeps, no fixed interconnection of the various substeps, and no predetermined number of iterations of any kind.
[24] 7) Unlike both ciphers and scalar encodings, providing massive dynamic diversity by means of highly data-dependent algorithms: i.e., for any particular employment of a base function bijection, the path through its substeps, its iteration counts, and the like, depend intensely on the actual data input to be encoded, decoded, or recoded.
[25] 8) Unlike both ciphers and scalar encodings, providing massive interdependence with their embedding context; i.e., their behavior may depend strongly on the software in which they are embedded, and the software in which they are embedded can be made to depend strongly on them.
[26] Some embodiments may use a large quantity of real entropy (i.e., a large truly random input). However, if an engine which generates the base function pairs is not itself exposed to attackers, it may be safe to employ significantly smaller keys which then generate much larger pseudo-random keys by means of a pseudo-random number generator, since in that case, the attacker must contend with both the real key entropy (that for the seed to the pseudo-random number generator) and the randomness inevitably resulting from the programming of the generator.
[27] In some embodiments, biased permutations may also be used. If internal data is used to generate base function pairs or other encoding data/functions rather than random numbers, then the resulting encoding will contain bias. If code is introduced to create unbiased permutations, that code may be readily apparent, resulting in a weakness in the system. In contrast, embodiments disclosed herein may generate biased permutations, but then use various tools to make them less biased. This approach has been shown to be much less apparent than known techniques.
[28] Some embodiments may include techniques for binding pipe-starts and pipe-ends, so that the targeted software code is tied to applications or platforms at both ends. This may be useful, for example, in a peer-to-peer data transfer environment or a digital rights management (DRM) environment. Systems and techniques disclosed herein also may be used to tie ciphers to other software applications or platforms, which is generally difficult to do using conventional techniques.
[29] Some embodiments may use "function-indexed interleaving". This technique provides deep nonlinearity from linear components, and nonlinear equation solving. It can be used in many ways, such as boundary protection, dynamic constant generation (e.g. key-to-code), providing dynamic diversity (data-dependent functionality), self-combining ciphers, cipher mixing and combining ciphers and non-ciphers. For example, it may be used to mix black box ciphers with the other protection code disclosed herein, providing the long term security of a black box cipher, with the other benefits of white box security. As noted above, the encoding of the embodiments disclosed herein may be highly dependent on run-time data. With function-indexed interleaving, two kinds of information are used: a key, K, which determines the base functions and structure, and R, which determines which obfuscations are to be applied to the "defining implementations". Typically the client does not see R. The key, K, may be augmented from the context, though in some examples described herein, only R is augmented in this way. Optionally, semi-consistent information or data from a user or his device (such as a smart phone, tablet computer, PDA, server or desktop computer system, or the like) such as an IP address, could be used to encode and decode as a runtime key.
[30] Recursive function-indexed interleaving also may be used. Function-indexed interleaving typically interleaves arbitrary functions. If some of these functions are themselves functions obtained by function-indexed interleaving, then that is a recursive use of function-indexed interleaving.
[31] Some embodiments may include random cross-linking, cross-trapping, dataflow duplication, random cross-connection, and random checks which, combined with code-reordering, create omni-directional cross-dependencies and variable-dependent coding.
[32] Some embodiments may use memory-shuffling with fractured transforms (dynamic data mangling) to hide dataflow. In dynamic data mangling, an array A of memory cells may be used which can be viewed as having virtual indices 0, 1, 2, ..., M-1, where M is the size of the array and the modulus of a permutation polynomial p on the finite ring Z/(M) (i.e., the integers modulo M), as in a C program array. However, for any given index i, there is no fixed position in the array to which it corresponds, since it is addressed as p(i), and p employs coefficients determined from the inputs to the program. The locations A[p(0)], A[p(1)], ..., A[p(M-1)] may be considered "pseudo-registers" R_0, ..., R_{M-1} extending those of the host machine. By moving data in and out of these registers, recoding the moved data at every move, and by re-using these "pseudo-registers" for many different values (e.g., by employing graph-coloring register allocation), the difficulty for an attacker to follow the data-flow of the program may be greatly increased.
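The addressing scheme of paragraph [32] can be sketched as follows. This is a minimal illustration only, assuming M is a power of two (at least 2) and using a quadratic polynomial p(x) = c*x^2 + a*x + b (mod M) with a odd and c even, which is a known sufficient condition for a permutation polynomial modulo a power of two. The class name PseudoRegisters, the way coefficients are derived from the program input, and the additive mask used as the per-move recoding are all hypothetical choices, not taken from the specification.

```python
WORD = (1 << 32) - 1  # values are kept to 32 bits in this sketch


class PseudoRegisters:
    """Array A of M cells whose virtual index i lives at physical slot p(i)."""

    def __init__(self, size_log2, program_input):
        self.M = 1 << size_log2
        # Hypothetical derivation of the coefficients from the program input.
        self.a = (2 * program_input + 1) % self.M     # odd linear coefficient
        self.b = (7 * program_input + 3) % self.M     # arbitrary constant term
        self.c = (2 * (program_input >> 3)) % self.M  # even quadratic coefficient
        self.cells = [0] * self.M

    def p(self, i):
        """Permutation polynomial over Z/(M): maps a virtual index to a slot."""
        return (self.c * i * i + self.a * i + self.b) % self.M

    def store(self, i, value, mask):
        """Move a value into pseudo-register R_i, recoding it with a mask."""
        self.cells[self.p(i)] = (value + mask) & WORD

    def load(self, i, mask):
        """Move a value out of pseudo-register R_i, undoing the recoding."""
        return (self.cells[self.p(i)] - mask) & WORD
```

Because the coefficients depend on the input, the same virtual index generally lands in a different cell from one run to the next, while the mapping remains a bijection on the array.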
[33] Some embodiments may use "spread and blend" encoding. This is another way of describing the use of base functions plus code interleaving, which "smears out" the boundaries of the base functions to make them more difficult for an attacker to discern. General data blending may have portions of base functions that are mixed with other code, making it more difficult to identify and lift the code.
[34] Some embodiments provide security lifecycle management. Black box security provides good long-term protection, but is not very useful in today's applications. Embodiments disclosed herein may refresh implementations faster than they can be cracked on unprotected devices. Different devices and applications have different needs. For example, a pay-per-view television broadcast, such as a sporting event, may have very little value several days after the event, so it may only be necessary to provide sufficient security to protect the broadcast data for a day or so. Similarly, the market for computer games may tail off very quickly after several weeks, so it may be critical only to protect the game for the first few weeks or months. Embodiments disclosed herein may allow a user to apply the level of security that is required, trading off the security against performance. Literally, an adjustable "obfuscation dial" can be placed on the control console. Although the specific defined level of security achieved may be unknown, the intensity with which obfuscating methods are applied may be controlled. Generally, these settings may be adjusted when the application is created with its embedded base function, as part of a software development process. Security analysis may provide an estimate of how difficult the application will be to crack given a specific level of obfuscation. Based on the estimate, an engineering decision may be made of how to balance performance needs against the need for security, and the "obfuscation dial" may be set accordingly. This kind of flexibility is not available with other protection systems. With AES, for example, a fixed key length and fixed code is used, which cannot be adjusted.
[35] Some embodiments may provide a flexible security refresh rate, allowing for a trade-off of complexity for the "moving target" of refreshing code. In many cases, the need is to refresh fast enough to stay ahead of potential attackers.
[36] Some embodiments may not have a primary aim of providing long-term data security in hacker-exposed environments. For that, the solution is not to expose the data to hackers, but only to expose means of access to the data by, e.g., providing a web presence for credential-protected (SecurID(TM), pass-phrases, etc.) clients which access the data via protected conversations which can expose, at most, a small portion of the data. In a hacker-exposed environment, it may be expected that a process of refreshing the exposed software in some fashion will be deployed. For example, in satellite TV conditional access systems, cryptographic keys embedded in the software in the set-top boxes (STBs) are refreshed on a regular basis, so that any compromise of the keys has value for only a limited period of time. Currently, such cryptographic keys may be protected over this limited exposure period by means of software obfuscation and/or white-box cryptography.
[37] However, white-box cryptography has proven to be vulnerable to attacks which can be executed very swiftly by cryptographically-sophisticated attackers with expert knowledge of the analysis of executable programs, since the cryptographic algorithms employed are amongst the most thoroughly examined algorithms in existence, and the tools for analysing programs have become very sophisticated of late as well. Moreover, ciphers have peculiar computational properties in that they are often defined over arithmetic domains not normally used in computation: for example, AES is defined over a Galois field, RSA public-key cryptosystems are defined by modular arithmetic over extremely large moduli, 3DES over bit operations, table lookups, and bit-permutations extended with duplicated bits.
[38] In fact, the sophisticated analysis of programs has created a method of attack which sometimes can bypass the need for cryptanalysis altogether: the code-lifting attack, whereby the attacker simply extracts the cryptographic algorithm and employs it with no further analysis (since it is, after all, an operational piece of software, however obfuscated it may be) to crack a software application's functionality.
[39] Some embodiments may provide much stronger short-term resistance to attack. Such protection may be suitable for systems where the time over which resistance is needed is relatively short, because longer term security is addressed by means of refreshing the software which resides on the exposed platforms. This addresses a specific unfilled need which focusses at the point of tension created by highly sophisticated cryptanalytic tools and knowledge, extremely well studied ciphers, limited protections affordable via software obfuscation, highly sophisticated tools for the analysis of executable programs, and the limited exposure times for software in typical commercial content distribution environments. The goal is to prevent the kinds of attacks which experience with white-box cryptography has shown to be within the state of the art: swift cryptanalytic attacks and/or code-lifting attacks so swift that they have value even given the limited lifespans of validity between refreshes of the exposed programs (such as STB programs).
[40] In many cases, it is only necessary to resist analysis for the duration of a refresh cycle, and to tie cipher-replacement so tightly to the application in which it resides that code-lifting attacks are also infeasible for the duration of a refresh cycle. The refresh cycle rate is determined by engineering and cost considerations: how much bandwidth can be allocated to refreshes, how smoothly we can integrate refreshes with ongoing service without loss of quality-of-service, and so on: these are all problems very well understood in the art of providing conditional access systems. These considerations indicate roughly how long our protections must stand up to analytic and lifting attacks.
[41] Some embodiments may provide significantly larger encodings which can resist attacks for longer periods of time, by abandoning the notion of computing with encoded operands— as is done with the simpler encodings above— and replacing it with something more like a cipher. Ciphers themselves can be, and are, used for this purpose, but often they cannot easily be interlocked with ordinary software because (1) their algorithms are rigidly fixed by cipher standards, and (2) their computations are typically very different from ordinary software and therefore are neither readily concealed within it, nor readily interlocked with it. The base-functions described herein provide an alternative which permits concealment and interlocking: they make use of conventional operations, and their algorithms are enormously more flexible than is the case with ciphers. They can be combined with ciphers to combine a level of black-box security as strong as conventional cryptography with a level of white-box security significantly superior to both simple encodings as above and known white-box cryptography.
[42] In some embodiments, a base function may be created by selecting a word size w and a vector length N, and generating an invertible state-vector function configured to operate on an N-vector of w-element words, which includes a combination of multiple invertible operations. The state-vector function may receive an input of at least 64 bits and provide an output of at least 64 bits. A first portion of steps in the state-vector function may perform linear or affine computations over Z/(2^w). Portions of steps in the state-vector function may be indexed using first and second indexing techniques. At least one operation in an existing computer program may then be modified to execute the state-vector function instead of the selected operation. Each of the indexing techniques may control a different indexing operation, such as if-then-else constructs, switches, element-permutation selections, iteration counts, element rotation counts, function-indexed key indexes, or the like. Some of the steps in the state-vector function may be non-T-function operations. Generally, each step in the state-vector function may be invertible, such that the entire state-vector function is invertible by inverting each step. In some configurations the state-vector function may be keyed using, for example, a run-time key, a generation-time key, or a function-indexed key. The state-vector function may be implemented by various operation types, such as linear operations, matrix operations, random swaps, or the like. Various encoding schemes also may be applied to inputs and/or outputs of the state-vector function, and/or operations within the state-vector function. In some configurations, different encodings may be applied so as to produce fractures at various points associated with the state-vector function.
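The principle that a state-vector function is invertible because each of its steps is invertible can be illustrated with a small sketch. It assumes w = 32 bits and N = 4 (so inputs and outputs are 128 bits wide) and uses only three hypothetical step types: an affine step over Z/(2^w), a linear mixing step, and a keyed element rotation. The function names and the use of a seeded generator to stand in for a generation-time key are assumptions for illustration; the indexed, data-dependent, and non-T-function steps described above are not modeled.

```python
import random

W = 32                 # word size w, assumed here to be counted in bits
N = 4                  # vector length N; N * W = 128 bits
MASK = (1 << W) - 1


def make_steps(key, rounds=3):
    """Derive a list of invertible steps from a (hypothetical) generation-time key."""
    rng = random.Random(key)
    steps = []
    for _ in range(rounds):
        a = [rng.getrandbits(W) | 1 for _ in range(N)]  # odd, so invertible mod 2^w
        b = [rng.getrandbits(W) for _ in range(N)]
        steps.append(("affine", a, b))
        steps.append(("mix",))
        steps.append(("rotate", rng.randrange(N)))
    return steps


def forward(steps, v):
    v = list(v)
    for step in steps:
        if step[0] == "affine":
            _, a, b = step
            v = [(a[i] * v[i] + b[i]) & MASK for i in range(N)]
        elif step[0] == "mix":
            for i in range(1, N):            # each word absorbs its predecessor
                v[i] = (v[i] + v[i - 1]) & MASK
        else:                                # rotate elements left by r
            r = step[1]
            v = v[r:] + v[:r]
    return v


def inverse(steps, v):
    v = list(v)
    for step in reversed(steps):             # undo the steps in reverse order
        if step[0] == "affine":
            _, a, b = step
            v = [((v[i] - b[i]) * pow(a[i], -1, 1 << W)) & MASK for i in range(N)]
        elif step[0] == "mix":
            for i in range(N - 1, 0, -1):
                v[i] = (v[i] - v[i - 1]) & MASK
        else:                                # rotate elements right by r
            r = step[1]
            v = v[N - r:] + v[:N - r]
    return v
```

A different key yields a different sequence of constants and rotations, which is one simple source of the diversity the text describes.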
[43] In some embodiments, base functions as disclosed herein may be executed by, for example, receiving an input having a word size w, applying an invertible state-vector function configured to operate on N-vectors of w-element words to the input, where the state-vector function includes multiple invertible operations, and a first portion of steps in the state-vector function perform linear or affine computations over Z/(2^w). Additional operations may be applied to the output of the invertible state-vector function, where each is selected based upon a different indexing technique. Generally, the state-vector function may have any of the properties disclosed herein with respect to the state-vector function and base functions.
[44] In some embodiments, a first operation may be executed by performing a second operation, for example, by receiving an input X encoded as A(X) with a first encoding A, performing a first plurality of computer-executable operations on the input using the value of B^-1(X), where B^-1 is the inverse of a second encoding mechanism B, the second encoding B being different from the first encoding A, and providing an output based upon B^-1(X). Such an operation may be considered a "fracture", and may allow for an operation to be performed without being accessible or visible to an external user, or to a potential attacker. In some configurations, the output of the first operation may not be provided external to executable code with which the first operation is integrated.
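One way to picture the "fracture" of paragraph [44] is with two mismatched affine encodings over Z/(2^32). In this sketch, a value is encoded with A and then consumed as though it had been encoded with B, so the computation B^-1(A(x)) happens implicitly at the mismatch and never appears as a separate step. The constants and function names are hypothetical, and affine encodings are chosen only because their composition is easy to verify.

```python
MOD = 1 << 32
INV7 = pow(7, -1, MOD)          # modular inverse of B's scale factor


def encode_a(x):
    """First encoding A (hypothetical constants)."""
    return (5 * x + 3) % MOD


def encode_b(x):
    """Second encoding B, deliberately different from A."""
    return (7 * x + 11) % MOD


def decode_b(y):
    """B^-1, the inverse of the second encoding."""
    return ((y - 11) * INV7) % MOD


def fracture(x):
    """Encode with A, then decode as if the value had been encoded with B."""
    return decode_b(encode_a(x))


# The operation performed implicitly by the mismatch is itself affine:
# B^-1(A(x)) = (5 * 7^-1) * x + (3 - 11) * 7^-1  (mod 2^32)
FUSED_SCALE = (5 * INV7) % MOD
FUSED_BIAS = ((3 - 11) * INV7) % MOD


def fused(x):
    return (FUSED_SCALE * x + FUSED_BIAS) % MOD
```

In protected code only the encode and decode sides would be present; the fused function is shown here solely to confirm what the mismatch computes.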
[45] In some embodiments, for a matrix operation configured to receive an input and provide an output, prior to performing the operation, the input may be permuted according to a sorting-network topology. The matrix operation may be executed using the permuted input to generate the output, and the output permuted according to the sorting-network topology. The permuted output then may be provided as the output of the matrix operation.
[46] In some embodiments, a first input may be received, and a function-indexed interleaved first function applied to the first input to generate a first output having a left portion and a right portion. A function-indexed interleaved second function may be applied to the first output to generate a second output, where the left portion of the first output is used as a right input to the second function, and the right portion of the first output is used as a left input to the second function. The second output may then be provided as an encoding of the first input.
[47] In some embodiments, a key K may be generated, and a pair of base functions f_K, f_K^-1 generated based upon the key K and randomization information R. The base function f_K may be applied to a first end of a communication pipe, and the inverse f_K^-1 to a second end of the communication pipe, after which the key K may be discarded. The communication pipe may span applications on a single platform, or on separate platforms.
[48] In some embodiments, one or more operations to be executed by a computer system during execution of a program may be duplicated to create a first copy of the operation or operations. The program may then be modified to execute the first operation copy instead of the first operation. Each operation and the corresponding copy may be encoded using a different encoding. Pairs of operations also may be used to create a check value, such as where the difference between execution of an operation result and execution of the copy is added to the result of the operation or the result of the operation copy. This may allow for detection of a modification made by an attacker during execution of the program.
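The duplication and check value of paragraph [48] can be sketched as below. The operation, the affine encoding applied to the copy, the constants, and the tamper parameter (which simulates an attacker altering the original's result) are all hypothetical. The copy works on an encoded input and returns an encoded result; the difference between the two results is folded back into the output, so an untampered run is unaffected while a tampered run yields a perturbed value.

```python
MOD = 1 << 32
A = 0x9E3779B1                 # hypothetical odd scale of the copy's encoding
B = 0x7F4A7C15                 # hypothetical offset of the copy's encoding
A_INV = pow(A, -1, MOD)


def op(x):
    """Original operation: 3x + 17 (mod 2^32)."""
    return (3 * x + 17) % MOD


def op_copy_encoded(ex):
    """Copy of op that takes ex = A*x + B and returns A*op(x) + B.

    A*(3x + 17) + B = 3*(A*x + B) + 17*A - 2*B, so x is never decoded here.
    """
    return (3 * ex + 17 * A - 2 * B) % MOD


def protected_op(x, tamper=0):
    r1 = (op(x) + tamper) % MOD                 # result of the original
    ex = (A * x + B) % MOD                      # differently encoded input
    r2 = ((op_copy_encoded(ex) - B) * A_INV) % MOD
    check = (r1 - r2) % MOD                     # zero when nothing was modified
    return (r1 + check) % MOD
```

For example, protected_op(10) returns 47, whereas protected_op(10, tamper=1) returns 49, so downstream code that depends on the result misbehaves when one branch has been altered.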
[49] In some embodiments, during execution of a program that includes multiple operations and a copy of each operation, upon reaching an execution point at which an operation of the plurality of operations should be performed, either a copy or the original operation may be selected randomly and executed by the program. The result of the randomly-selected operations may be equivalent to a result that would have been obtained had only a single copy of the operations been performed.
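A minimal sketch of the random selection in paragraph [49] follows, assuming two functionally equivalent implementations of one operation. The function names are hypothetical, and Python's random module stands in for whatever selection mechanism an implementation would actually use.

```python
import random


def scale_original(x):
    """Original operation."""
    return 8 * x


def scale_copy(x):
    """Equivalent copy, written with a different instruction."""
    return x << 3


def run(values, rng=random):
    """At each execution point, pick the original or the copy at random."""
    out = []
    for v in values:
        chosen = rng.choice((scale_original, scale_copy))
        out.append(chosen(v))
    return out
```

Whichever path is taken, the results match those of a program containing a single copy of the operation, while the executed trace varies from run to run.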
[50] In some embodiments, an input may be received from an application. An array of size M may be defined with a number of M-register locations c_1, ..., c_n, with n < M. A permutation polynomial p, an input-based 1×n vector mapping matrix A yielding z from the input, and a series of constants c_i = p(z+i) also may be defined. A series of operations may then be performed, with each operation providing an intermediate result that is stored in an M-register selected randomly from the M-registers. A final result may then be provided to the application based upon the series of intermediate results from a final M-register storing the final result. Each intermediate result stored in an M-register may have a separate encoding applied to the intermediate result prior to storing the intermediate result in the corresponding M-register. The different encodings applied to intermediate results may be randomly chosen from among multiple different encodings. Similarly, different decodings, which may or may not correspond to the encodings used to store intermediate results in the M-registers, may be applied to intermediate results stored in M-registers. New M-registers may be allocated as needed, for example, only when required according to a graph-coloring allocation algorithm.
[51] In some embodiments, a first operation g(y) that produces at least a first value a as an output may be executed, and a first variable x encoded as aX+b, using a and a second value b. A second operation f(aX+b) may be executed using aX+b as an input, and a decoding operation using a and b may be performed, after which a and b may be discarded. The value b also may be the output of a third operation h(z). Different encodings may be used for multiple input values encoded as aX+b, using different execution instances of g(y) and/or h(z). The values may be selected from any values stored in a computer-readable memory, based upon the expected time that the constant(s) are stored in the memory. Similarly, existing computer-readable program code containing instructions to execute operations f(aX+b) and g(y), where g(y) produces at least a first value c when executed, may be modified to encode x as cX+d. The operation f(cX+d) may be executed for at least one x, and c and d subsequently discarded.
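The encoding with transient values described in paragraph [51] can be sketched as follows, working over Z/(2^32). The bodies of g, h, and f are hypothetical stand-ins; in particular, forcing the output of g to be odd is an assumption made here so that a is invertible modulo 2^32 and the decoding step is well defined.

```python
MOD = 1 << 32


def g(y):
    """Existing operation whose output is reused as the scale a (forced odd here)."""
    return ((y * 2654435761) % MOD) | 1


def h(z):
    """Existing operation whose output is reused as the offset b."""
    return (z ^ 0x5A5A5A5A) % MOD


def f_encoded(ex, a):
    """Computes x + 1 on the encoded value: a*(x + 1) + b = (a*x + b) + a."""
    return (ex + a) % MOD


def run(x, y, z):
    a = g(y)                       # dynamic constant from the first operation
    b = h(z)                       # dynamic constant from the third operation
    ex = (a * x + b) % MOD         # x encoded as aX + b
    er = f_encoded(ex, a)          # second operation executed on encoded data
    result = ((er - b) * pow(a, -1, MOD)) % MOD   # decode using a and b
    del a, b                       # the constants are discarded after use
    return result
```

For instance, run(41, 7, 9) returns 42 regardless of the particular values g and h produce, while the intermediate values held in memory change with y and z.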
[52] In some embodiments, at least one base function may be blended with executable program code for an existing application. For example, the base function may be blended with the executable program code by replacing at least one operation in the existing program code with the base function. The base function also may be blended with the existing application by applying one, some, or all of the techniques disclosed herein, including fractures, variable dependent coding, dynamic data mangling, and/or cross-linking. The base functions and/or any blending techniques used may include, or may exclusively include, operations which are similar or indistinguishable from the operations present in the portion of the existing application program code with which they are blended. Thus, it may be difficult or impossible for an attacker to distinguish the base function and/or the blending technique operations from those that would be present in the existing executable program code in the absence of the base function.
[53] In some embodiments, a computer system and/or computer program product may be provided that includes a processor and/or a computer-readable storage medium storing instructions which cause the processor to perform one or more of the techniques disclosed herein.
[54] Moreover, because the algorithms used with base functions disclosed herein may be relatively flexible and open-ended, they permit highly flexible schemes of software diversity, and the varied instances can differ more deeply than is possible with white-box cryptography. Thus, they are far less vulnerable to automated attacks. Whenever attacks can be forced to require human participation, it is highly advantageous, because new instances of protected code and data may be automatically generated at computer speeds, but they can only be compromised at human speeds.
[55] Other systems, methods, features and advantages of the invention will be, or will become, apparent to one with skill in the art upon examination of the following figures and detailed description. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the invention, and be protected by the following claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[56] In the appended drawings:
[57] Figure 1 shows a commutative diagram for an encrypted function, in accordance with the present invention;
[58] Figure 2 shows a Virtual Machine General Instruction Format, in accordance with the present invention;
[59] Figure 3 shows a Virtual Machine Enter/Exit Instruction Format, in accordance with the present invention;
[60] Figure 4 shows a Mark I 'Woodenman' Construction, in accordance with the present invention;
[61] Figures 5 and 6 show the first and second half respectively, of a Mark II Construction, in accordance with the present invention;
[62] Figure 7 shows a graphical representation of a sorting network, in accordance with the present invention;
[63] Figure 8 shows a flow chart of method of performing function-indexed interleaving, in accordance with the present invention;
[64] Figure 9 shows a flow chart of method of performing control-flow duplication, in accordance with the present invention;
[65] Figure 10 shows a flow chart of method of performing data-flow duplication, in accordance with the present invention; and
[66] Figure 11 shows a flow chart of method of creating f_K segments, in accordance with the present invention.
[67] Figure 12 presents a process flow diagram for implementation of the Mark II protection system of the invention;
[68] Figure 13 shows a graphical representation of the irregular structure of segment design in a Mark III implementation of the invention;
[69] Figure 14 shows a graphical representation of the granularity that may be achieved with T-function splitting in a Mark III implementation of the invention;
[70] Figure 15 shows a graphical representation of the overall structure of a Mark III implementation of the invention;
[71] Figure 16 shows a graphical representation of the defensive layers of a Mark III implementation of the invention;
[72] Figure 17 shows a graphical representation of mass data encoding in an implementation of the invention;
[73] Figures 18 and 19 show graphical representations of control flow encoding in an implementation of the invention;
[74] Figure 20 shows a graphical representation of dynamic data mangling in an implementation of the invention;
[75] Figure 21 shows a graphical representation of cross-linking and cross-trapping in an implementation of the invention;
[76] Figure 22 shows a graphical representation of context dependent coding in an implementation of the invention;
[77] Figure 23 presents a process flow diagram for implementation of the Mark II protection system of the invention;
[78] Figure 24 shows a graphical representation of a typical usage of Mass Data Encoding or Dynamic Data Mangling in an implementation of the invention.
[79] Figure 25 shows an exemplary block diagram setting out the primary problems that the embodiments of the invention seek to address;
[80] Table 25 presents a table which categorizes software boundary problems;
[81] Figure 26 shows a block diagram of an exemplary software system in unprotected form, under white box protection, and protected with the system of the invention;
[82] Figure 27 shows a bar diagram contrasting the levels of protection provided by black box security, white box security and protection under an exemplary embodiment of the invention;
[83] Figure 28 shows a process flow diagram contrasting ciphers, hashes and exemplary base functions in accordance with the present invention;
[84] Figure 29 shows an exemplary block diagram of how base functions of the invention may be used to provide secure communication pipes;
[85] Figure 30 shows a process flow diagram for function-indexed interleaving in accordance with the present invention;
[86] Figure 31 presents a process flow diagram for implementation of the Mark I protection system of the invention;
DETAILED DESCRIPTION
[87] Embodiments disclosed herein describe systems, techniques, and computer program products that may allow for securing aspects of computer systems that may be exposed to attackers. For example, software applications that have been distributed on commodity hardware for operation by end users may come under attack from entities that have access to the code during execution.
[88] Generally, embodiments disclosed herein provide techniques to create a set of base functions, and integrate those functions with existing program code in ways that make it difficult or impossible for a potential attacker to isolate, distinguish, or closely examine the base functions and/or the existing program code. For example, processes disclosed herein may receive existing program code, and combine base functions with the existing code. The base functions and existing code also may be combined using various techniques such as fractures, dynamic data mangling, cross-linking, and/or variable dependent coding as disclosed herein, to further blend the base functions and existing code. The base functions and other techniques may use operations that are computationally similar, identical, or indistinguishable from those used by the existing program code, which can increase the difficulty for a potential attacker to distinguish the protected code from the protection techniques applied. As will be described herein, this can provide a final software product that is much more resilient to a variety of attacks than is possible using conventional protection techniques.
[89] As shown in Figure 25, embodiments disclosed herein may provide solutions for several fundamental problems that may arise when software is to be protected from attack, such as software boundary protection, advanced diversity and renewability problems and protection measurability problems.
[90] Software boundary problems may be organized into five groups as shown in Table 1: skin problems, data boundaries, code boundaries, boundaries between protected data and protected code, and boundaries between protected software and secured hardware.
Table 1
Skin problems
Boundaries: data flows from unprotected to protected domains; data flows from protected to unprotected domains; computation boundary between unprotected and protected domains.
Description: [...] problems typically are hard to solve without introducing a trusted enabling mechanism at the boundary.

Data Boundary
Data type boundary: Current data transformation techniques are limited to individual data types, not multiple data types or mass data. The boundaries among distinct protected data items stand out, permitting identification and partitioning.
Data dependence boundary: Data diffusion via existing data flow protections is limited. Original data flow and computational logic is exposed. Most current whitebox cryptographic weaknesses are related to both data type and data dependency boundary problems.
Data crossing functional boundaries: Data communications among functional components of an application system, whether running on the same or different devices, or as client and server, are made vulnerable because the communication boundaries are clearly evident.

Code Boundary
Functional boundaries among protected components: Boundaries among functional components are still visible after protecting those components. For example, whitebox cryptography components can be identified by their distinctive computations. In general, such protected computation segments can be easily partitioned, creating vulnerabilities to component-based attacks such as code lifting, code replacement, code cloning, replay, code sniffing, and code spoofing.
Boundaries between injected code and the protected version of the original application code: Current individual protection techniques create secured code that is localized to particular computations. Code boundaries resulting from use of different protection techniques are not effectively glued and interlocked.

Boundary between protected data and protected code
Protected data and protected code are not effectively locked together to prevent code or data lifting attacks. Current whitebox cryptographic implementations are vulnerable to such lifting attacks in the field.

Boundary between protected software and secured hardware
We lack effective techniques to lock protected hardware and protected software to one another. The boundary between protected software and secure hardware is vulnerable, since data crossing the boundary is unprotected or weakly protected.
[91] There are three types of "skin problems" which may be addressed by embodiments disclosed herein: data flows from unprotected to protected domains, data flows from protected to unprotected domains, and computation boundaries between unprotected and protected domains. Ultimately, data and user interaction should be performed in an unencoded form, so that the user can understand the information. In each case, attacks on unprotected data and computations can be the starting point for compromising their data and computation counterparts in the protected domain. These problems conventionally are hard to solve without introducing a trusted enabling mechanism at the boundary. However, the diversity provided by embodiments disclosed herein, and encoding at the boundary itself, provides a degree of protection that is not provided by known systems.
[92] Data Boundaries may be categorized as one of three types: data type boundaries, data dependency boundaries and data crossing functional component boundaries. With regard to data type boundaries, current data transformation techniques are limited to individual data types, not multiple data types or mass data. The boundaries among distinct protected data items stand out, permitting identification and partitioning. With regard to data dependency boundaries, data diffusion via existing data flow protections is limited: original data flow and computational logic is exposed. Most current white box cryptography weaknesses are related to both data type and data dependency boundary problems. Finally, with regard to data crossing functional component boundaries, data communications among functional components of an application system, whether running on the same or different devices, or as client and server, are made vulnerable because the communication boundaries are clearly evident. The use of base function encodings and function-indexed interleaving by embodiments disclosed herein may address some or all of these data boundary issues because both the data and the boundaries themselves may be obscured.
[93] Code Boundaries may be categorized into two types: functional boundaries among protected components, and boundaries between injected code and the protected version of the original application code. Functional boundaries among protected components are a weakness because boundaries among functional components are still visible after protecting those components. That is, with white box protection, the white box cryptographic components can generally be identified by their distinctive computations. In general, such protected computation segments can be easily partitioned, creating vulnerabilities to component-based attacks such as code lifting, code replacement, code cloning, replay, code sniffing, and code spoofing. Similarly, boundaries between injected protection code and the protected version of the original application code are also generally visible. Current individual protection techniques create secured code that is localized to particular computations. Code boundaries resulting from use of different protection techniques are not effectively glued and interlocked. In contrast, the use of base function encodings and function-indexed interleaving by embodiments disclosed herein may address all of these code boundary issues, because code may be obscured and interleaved with the protection code itself. Because basic computer processing and arithmetic functions are used for the protection code, there is no distinctive code which the attacker will quickly identify.
[94] The boundary between protected data and protected code presents another weakness which can be exploited by an attacker as current white box techniques do not secure the boundary between protected data and protected code. In contrast, embodiments disclosed herein may lock together the protected data and protected code, to prevent code or data lifting attacks. Current white box cryptography implementations are vulnerable to such lifting attacks in the field.
[95] Similarly, the boundary between protected software and secured hardware presents a vulnerability as existing white box techniques do not protect the boundary between protected software and secure hardware - data crossing such a boundary is unprotected or weakly protected. In contrast, embodiments disclosed herein may lock protected hardware and protected software to one another.
[96] There are also logistical issues associated with security, in particular, diversity and renewability problems. Current program diversity is limited by program constructs and structures, and by limitations of the individual protection techniques applied. As a result, diversified instances do not vary deeply (e.g., program structure variation is extremely limited), and instances may be sufficiently similar to permit attacks based on comparing diversified instances. Current protection techniques are limited to static diversity and fixed security. In contrast, embodiments as disclosed herein may provide dynamic diversity which may allow for intelligent control and management of the level of security provided by diversity and renewability. As disclosed in further detail herein, resolving advanced diversity and renewability problems may be fundamental to security lifecycle management.
[97] Figure 26 shows a block diagram of an example software system protected under a known white box model, and under an example embodiment as disclosed herein. The original code and data functions, modules and storage blocks to be protected are represented by the geometric shapes labeled F1, F2, F3, D1 and D2. Existing white box and similar protection techniques may be used to protect the various code and data functions, modules and storage blocks, but even in a protected form they will (at the very least) disclose unprotected data and other information at their boundaries. In contrast, embodiments of the present invention may resolve these boundary problems. In some cases, once an instance of an embodiment as disclosed herein has been executed, an observer cannot tell which parts are F1, F2, F3, D1, D2 and data from the original program, even though the observer has access to the program and can observe and alter its operation.
[98] This may be accomplished, for example, by interleaving the code together between different code and data functions, modules and storage blocks, thus "gluing" these components together. With the code closely tied in this way, true boundary protection can be provided. As described above, diversity and renewability are provided in terms of 1) much greater flexibility than past systems; 2) easy and powerful control; 3) dynamic diversity and security; and 4) measurable and manageable diversity. Embodiments disclosed herein also may provide a "complexity property" of one-way bijection functions, as well as a measurable, controllable and auditable mechanism to guarantee required security for the user. Bijections are described in greater detail hereinafter, but in short, they are lossless pairs of functions, f_K, f_K^{-1}, which perform a transposition of a function, which is undone later in the protected code. The transposition may be done in thousands or millions of different ways, each transposition generally being done in a completely different and non-repeatable manner. Various techniques may be used to conceal existing programs, achieving massive multicoding of bijective functions, which are not humanly programmed, but are generated by random computational processes. This includes bijective functions which can be used in cipher- and hash-like ways to solve boundary problems.
[99] Embodiments disclosed herein may provide improved security and security guarantees (i.e. validated security and validated security metrics) relative to conventional techniques. Greater diversity in time and space than is provided by white box cryptography also may be achieved. The security metrics are based on computational complexity of known attacks, the basic primitive being the generation of mutually inverse function pairs. Other primitives can be constructed as described herein, with or without symmetric or asymmetric auxiliary keys.
[100] Figure 27 contrasts conventional black box and white box models with properties of the embodiments disclosed herein, in terms of the long-term security and resistance to hostile attacks. Cryptography is largely reliant on Ciphers and Hashes; Ciphers enable transfer of secrets over unsecured or public channels, while Hashes validate provenance. These capabilities have enormous numbers of uses. In a black-box environment, such cryptographic techniques may have very good long term security. However, in terms of resistance to attacks, such systems have a very short life. As explained above, Ciphers and Hashes have a rigid structure and very standardized equations which are straightforward to attack. White box protection may be used to improve the level of resistance to attacks, but even in such an environment the protected code will still reveal patterns and equations from the original Cipher-code and Hash-code, and boundaries will not be protected. As well, white box protection will not provide diversity which protects code against perturbation attacks.
[101] In contrast, embodiments disclosed herein may incorporate Cipher-like and Hash-like encodings, which give the protective encodings the security and strength of Ciphers and Hashes. In other words, the process of applying white box encodings to Ciphers and Hashes typically uses simple encodings in an attempt to protect and obscure very distinctive code. The techniques disclosed herein, however, may use strong, diverse encodings to protect any code. With the diverse encodings and interleaving as disclosed, distinctiveness in the targeted code will be removed. Thus, as shown, the disclosed techniques may provide a much stronger security profile than conventional black box and white box protection.
[102] Figure 1 shows a commutative diagram for an encrypted function using encodings, in accordance with embodiments of the present invention. For an F where F :: D → R is total, a bijection d: D → D' and a bijection r: R → R' may be selected. F' = r ∘ F ∘ d^{-1} is an encoded version of F; d is an input encoding or a domain encoding, and r is an output encoding or a range encoding. A bijection such as d or r is simply called an encoding. In the particular case where F is a function, the diagram shown in Figure 1 then commutes, and computation with F' is computation with an encrypted function. Additional details regarding the use of such encodings generally are provided in Section 2.3 of the Appendix.
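As a minimal sketch of the encoded-function construction above, the following Python fragment builds F' = r ∘ F ∘ d^{-1} for a toy F on 8-bit values. The particular F, d and r are hypothetical choices made only for illustration and are not taken from this disclosure; real encodings would be far less transparent.

```python
# Toy illustration of an encoded function F' = r o F o d^-1 over 8-bit values.
# F, d and r are arbitrary illustrative choices (assumptions, not from the text).
MOD = 256

def F(x):
    """Original total function F: D -> R (here D = R = 0..255)."""
    return (x * x + 1) % MOD

def d(x):
    """Input (domain) encoding: an affine bijection, because 5 is odd."""
    return (5 * x + 17) % MOD

def d_inv(y):
    """Inverse of d; pow(5, -1, MOD) is the inverse of 5 modulo 256."""
    return ((y - 17) * pow(5, -1, MOD)) % MOD

def r(x):
    """Output (range) encoding: XOR with a constant is a bijection."""
    return x ^ 0xA5

def r_inv(y):
    """Inverse of r (XOR with the same constant)."""
    return y ^ 0xA5

def F_encoded(x_enc):
    """F' = r o F o d^-1: takes an encoded input, returns an encoded output."""
    return r(F(d_inv(x_enc)))
```

Computing with F_encoded on d(x) gives r(F(x)), so the diagram commutes: the caller never has to handle the unencoded input or output between the two encodings.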
[103] Figure 28 contrasts the properties of conventional Ciphers and Hashes with those of the bijective base functions disclosed herein. Ciphers are non-lossy functions; they preserve all of the information that they encode, so the information can be unencoded and used in the same manner as the original. Ciphers are invertible provided that one is given the key(s), but it is hard to determine the key or keys K1, K2 from instances of plain and encrypted information ("PLAIN" and "ENCRYPTED" in Figure 28). Hashes are lossy above a certain length, but this typically is not a problem because hashes are generally used just for validation. With a hash it is hard to determine the optional key, K, from instances of the original data and the hash ("PLAIN" and "HASHED" in Figure 28).
[104] The base functions disclosed herein may serve in place of either ciphers or hashes, as it is hard to determine the key or keys from consideration of the encoding and unencoding functions f_K, f_K^{-1}. The advantage that the base functions provide over the use of Ciphers or Hashes is that the computations used by the base functions are more similar to ordinary code, which makes it easier to blend the code of the base functions with the targeted code. As noted above, Ciphers and Hashes use very distinctive code and structure which is difficult to obscure or hide, resulting in vulnerability.
[105] Mutually-inverse base function pairs as disclosed herein may employ random secret information (entropy) in two ways: as key information K which is used to determine the mutually inverse functions f_K, f_K^{-1}, and as randomization information R which determines how the f_K, f_K^{-1} implementations are obscured.
[106] For example, two mutually inverse base functions may be represented by subroutines G and H, written in C. The base functions may be constructed by an automated base function generator program or system, with G being an obfuscated implementation of the mathematical function f_K and H being an obfuscated implementation of the mathematical function f_K^{-1}. Thus, G can be used to 'encrypt' data or code, which can then be 'decrypted' with H (or vice versa).
[107] Optionally, run-time keys can be provided in addition to the build-time key K. For example, if the input of a given base function is wider than the output, the extra input vector elements can be used as a run-time key. This is much like the situation with a cipher such as AES-128. A typical run of AES-128 has two inputs: one is a 128-bit key, and one is a 128-bit text. The implementation performs encipherment or decipherment of the text under control of the key. Similarly, a base-function can be constructed to encrypt differently depending on the content of its extra inputs, so that the extra inputs in effect become a runtime key (as opposed to the software generation time key K controlling the static aspects of the base function). The building blocks of base functions disclosed herein make it relatively easy to dictate whether the runtime key is the same for the implementations of both f_K and f_K^{-1}, or is different for f_K than for f_K^{-1}: if the runtime key is added to the selector vector, it is the same for f_K and f_K^{-1}, and if it is added elsewhere, it differs between f_K and f_K^{-1}.
[108] Key information K can be used to select far more varied encoding functions than in known white box systems, permitting much stronger spatial and temporal diversity. Diversity is also provided with other techniques used in embodiments of the invention such as Function-Indexed Interleaving, which provides dynamic diversity via text-dependence. Further diversity may also be provided by variants of Control-Flow Encoding and Mass-Data Encoding described hereinafter.
[109] Base functions as disclosed herein may incorporate or make use of state vector functions. In general, as used herein a state-vector function is organized around a vector of N elements, each element of which is a w-bit quantity. The state vector function may be executed using a series of steps, in each of which a number between zero and N of the elements of the vector are modified. In a step in which zero elements are modified, the step essentially applies the identity function on the state-vector.
[110] In some embodiments, one or more of the state-vector functions used in constructing a base function may be invertible. A state-vector function is invertible if, for each and every step in the state-vector function, a step-inverse exists such that applying the step-algorithm and then applying the step-inverse algorithm has no net effect. Any finite sequence of invertible steps is invertible by performing the inverse-step algorithms in the reverse order of their originals.
[111] Illustrative examples of invertible steps on a vector of w-bit elements include adding two elements, such as adding i to j to obtain i+j; multiplying an element by an odd constant over Z/(2^w); and mapping a contiguous or non-contiguous sub-vector of the elements to new values by taking the product with an invertible matrix over Z/(2^w). The associated inverse steps for these examples are subtracting element i from element j, multiplying the element by the multiplicative inverse of the original constant multiplier over Z/(2^w), and mapping the sub-vector back to its original values by multiplying by the inverse of that matrix, respectively.
[112] Some embodiments may use one or more state-vector functions that have one or more indexed steps. A step is indexed if, in addition to its normal inputs, it takes an additional index input such that changing the index changes the computed function. For example, the step of adding a constant vector could be indexed by the constant vector, or the step of permuting a sub-vector could be indexed by the permutation applied. In each case, the specific function executed is determined at least in part by the index provided to the function.
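The invertible steps described above, and the rule that a sequence of invertible steps is inverted by applying the step-inverses in reverse order, can be sketched as follows. This is a minimal Python illustration under assumed parameters: the word size of 32 bits, the helper names (add_elem, mul_odd, add_const, compose) and the particular constants are hypothetical and are not part of this disclosure.

```python
W = 32                 # word size in bits (assumed for this sketch)
MOD = 1 << W
MASK = MOD - 1

def add_elem(i, j):
    """Add element i into element j (i != j); the inverse subtracts it again."""
    def fwd(v):
        v = list(v)
        v[j] = (v[j] + v[i]) & MASK
        return v
    def inv(v):
        v = list(v)
        v[j] = (v[j] - v[i]) & MASK
        return v
    return fwd, inv

def mul_odd(i, c):
    """Multiply element i by an odd constant c over Z/(2^W)."""
    assert c & 1, "only odd constants are invertible modulo 2^W"
    c_inv = pow(c, -1, MOD)          # multiplicative inverse of c modulo 2^W
    def fwd(v):
        v = list(v)
        v[i] = (v[i] * c) & MASK
        return v
    def inv(v):
        v = list(v)
        v[i] = (v[i] * c_inv) & MASK
        return v
    return fwd, inv

def add_const(index):
    """Indexed step: the constant vector `index` selects the function computed."""
    def fwd(v):
        return [(x + k) & MASK for x, k in zip(v, index)]
    def inv(v):
        return [(x - k) & MASK for x, k in zip(v, index)]
    return fwd, inv

def compose(steps):
    """Chain steps; the inverse runs the step-inverses in reverse order."""
    def fwd(v):
        for f, _ in steps:
            v = f(v)
        return v
    def inv(v):
        for _, g in reversed(steps):
            v = g(v)
        return v
    return fwd, inv

# Example: a four-step state-vector function on a vector of four words.
steps = [add_elem(0, 1), mul_odd(2, 0x9E3779B1), add_const([1, 2, 3, 4]), add_elem(3, 0)]
encode, decode = compose(steps)
```

Each helper returns a (step, step-inverse) pair, so the inverse of the whole function falls out of the construction rather than being written separately.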
[113] Indexed steps also may be invertible. Generally, an indexed step is invertible if it computes an invertible step for each index, and the index used to compute the step, or information from which that index can be derived, is available when inverting the step. For example, S_17 is invertible if S_17^{-1} is defined, and the index (17) is available at the appropriate time to ensure that S_17^{-1} is computed when inverting the state-vector function. As an example, a step may operate on some elements of the state. To index this step, other elements of the state may be used to compute the index. If invertible steps are then performed on the other elements, the index may be retrieved by inverting those steps, as long as the two sets of elements do not overlap.
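A minimal Python sketch of the last example follows: a step on elements 0 and 1 is indexed by a value computed from elements 2 and 3, an invertible step is then applied to elements 2 and 3, and the inverse recovers the index by first undoing that later step. The word size, the multiplier table and all function names are assumptions chosen for illustration only.

```python
W = 16
MOD = 1 << W
MASK = MOD - 1
MULTIPLIERS = [3, 5, 7, 9]       # odd, hence invertible modulo 2^W (assumed values)

def index_of(v):
    """Index derived only from elements 2 and 3 of the state."""
    return (v[2] ^ v[3]) & 3

def indexed_step(v):
    """Modify elements 0 and 1 using a constant selected by the index."""
    c = MULTIPLIERS[index_of(v)]
    return [(v[0] * c) & MASK, (v[1] + c) & MASK, v[2], v[3]]

def indexed_step_inv(v):
    """Elements 2 and 3 are unchanged by the step, so the index is recomputable."""
    c = MULTIPLIERS[index_of(v)]
    return [(v[0] * pow(c, -1, MOD)) & MASK, (v[1] - c) & MASK, v[2], v[3]]

def mix_others(v):
    """Later invertible step that touches only the index-source elements."""
    return [v[0], v[1], (v[2] + v[3]) & MASK, v[3] ^ 0x1234]

def mix_others_inv(v):
    v3 = v[3] ^ 0x1234
    return [v[0], v[1], (v[2] - v3) & MASK, v3]

def forward(v):
    return mix_others(indexed_step(v))

def inverse(v):
    # Undo the later step first; this restores elements 2 and 3 and thus the index.
    return indexed_step_inv(mix_others_inv(v))
```

The inverse works only because the indexed step and the index source use disjoint elements; if the indexed step also changed element 2 or 3, the index could no longer be recomputed this way.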
[114] Function-Indexed Interleaving as disclosed herein is a specific example of the principle of the use of indexed steps within a base function. Other uses of indexed steps as disclosed herein may include allowing the creation of keyed state-vector functions: the set of indexes used in some of the indexed steps can be used as a key. In that case, the index is not obtained from within the computation, but is provided by an additional input; i.e., the function takes the state-vector plus the key as an input. If the indexed steps are invertible, and the ordinary, non-indexed steps are invertible, then the whole state-vector function is invertible, rather like a keyed cipher.
[115] In some embodiments, the index information may provide or may serve as a key for the generated base functions. If the state-vector function is partially evaluated with respect to the index information when the state-vector function is generated, so that the index does not appear in the execution of the generated function explicitly, it is a generation-time key. If code to handle the index information is generated during execution of the state-vector function, so that the index does appear in the execution of the generated function explicitly, it is a run-time key. If the code internally generates the index within the state-vector function, it is a function-indexed key.
[116] In an embodiment, a base function may be constructed based upon an initial selected or identified word-size w. In some configurations, the default integer size of the host platform may be used as the word size w. For example, on modern personal computers the default integer size typically is 32 bits. As another example, the short integer length as used, for example, in C may be used, such as 16 bits. In other configurations, a 64-bit word size may be used. A vector length N is also selected for the base function, which represents the length of inputs and outputs in the w-sized words, typically encompassing four or more words internally. In some embodiments, such as where interleaving techniques as disclosed herein are used, it may be preferred for the word size w to be twice the internal word size of the N-vector. The state-vector function then may be created by concatenating a series of steps or combinations of steps, each of which performs invertible steps on N-vectors of w-bit words. The inverse of the state-vector function may be generated by concatenating the inverses of the steps in the reverse order.
[117] In some embodiments, one or more keys also may be incorporated into the state-vector function. Various types of keying may be applied to, or integrated with, the state-vector function, including run-time keying, generation-time keying, and function-indexed keying as previously described. To generate a run-time keyed state-vector function, the function may be modified to receive the key explicitly as an additional input to the function. To generate a generation-time keyed state-vector function, code in the state-vector function may be partially evaluated with respect to a provided key. For many types of operations, this alone or in conjunction with typical compiler optimizations may be sufficient to make the key unrecoverable or unapparent within the generated code. To generate a function-indexed keyed state-vector function, the state-vector function may be constructed such that appropriate keys for inverse operations are provided as needed within the state-vector function.
[118] In some embodiments, it may be preferred to select an implementation for the state-vector function that accepts a relatively wide input and provides a relatively wide output, and which includes a complex set of invertible steps. Specifically, it may be preferred to construct an implementation that accepts at least a 64-bit wide input and output. It also may be preferred for a significant number of steps in the state-vector function, such as at least 50% or more, to be linear or affine operations over Z/(2^w). It also may be preferred to select steps for the state-vector function which have wide variety.
[119] In some embodiments, it may be preferred to index a significant portion of the steps, such as at least 50% or more, using multiple forms of indexing. Suitable forms of indexing include if-then-else or switch constructs, element-permutation selection, iteration counts, element rotation counts, and the like. It also may be preferred for some or all of the indexes to be function-indexed keys as disclosed herein.
[120] In some embodiments, it may be preferred for the initial and/or final steps of the state-vector function to be steps which mix input entropy across the entire state-vector, typically other than any separate key-input.
[121] In some embodiments, it may be preferred to construct the state-vector function such that at least every few steps, a non-T-function step is performed. Referring to programming operations, examples of T-function steps include addition, subtraction, multiplication, bitwise AND, bitwise XOR, bitwise NOT, and the like; examples of non-T-function steps include division, modulo assignment, bitwise right shift assignment, and the like. Other examples of non-T-function steps include function-indexed keyed element-wise rotations, sub-vector permutations, and the like. As previously disclosed, the inclusion of non-T-function steps can prevent or reduce the efficacy of certain types of attacks, such as bit-slice attacks.
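The distinction between T-function and non-T-function steps can be checked mechanically on a small word size. The sketch below assumes the usual definition of a T-function on a single w-bit word (each output bit depends only on input bits at the same or lower positions) and tests it exhaustively for 8-bit words; the helper name is_t_function and the sample functions are illustrative assumptions, not part of this disclosure.

```python
W = 8                      # small word size so an exhaustive check is cheap
MASK = (1 << W) - 1

def is_t_function(f):
    """Return True if the low k bits of f(x) depend only on the low k bits of x,
    for every k and every W-bit x (exhaustive check)."""
    for k in range(1, W + 1):
        low = (1 << k) - 1
        for x in range(1 << W):
            if (f(x) & low) != (f(x & low) & low):
                return False
    return True

def rotl1(x):
    """Rotate a W-bit word left by one position."""
    return ((x << 1) | (x >> (W - 1))) & MASK

# T-function steps: information only flows from low bits toward high bits.
affine = lambda x: (x * 3 + 5) & MASK
xor_c  = lambda x: x ^ 0x5A
not_x  = lambda x: ~x & MASK

# Non-T-function steps: high bits influence lower bits.
shr1   = lambda x: x >> 1
mod3   = lambda x: x % 3
div3   = lambda x: x // 3
```

Under this definition is_t_function accepts affine, xor_c and not_x and rejects shr1, mod3, div3 and rotl1, which matches the classification given above.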
[122] As previously described, a state-vector function pair includes the state-vector function as described herein and the complete inverse of the state-vector function. In operation, construction of the state-vector function pair may, but need not, be performed by, for example, combining a series of parameterized algorithms and/or inverse algorithms in the form of language source such as C++ code or the like. Similarly, substitution of generation-time keys may, but need not, be performed by a combination of macro substitution in the macro preprocessor, function in-lining, and use of parameterized templates. Such combinations, substitutions, and other operations may be automated within a state-vector generating system as disclosed herein. Once the state-vector function pair has been generated, one or both may be protected using binary- and/or compiler-level tools to further modify the generated code. In some embodiments, the specific modifications made to one or both functions in the state-vector function pair may be selected based upon whether or not each member is expected to execute in an environment likely to be subject to attack.
[123] For example, in some embodiments, the function or a part of the function that is expected to be in an exposed environment may be bound near a point at which an input vector is provided to the state-vector function, and/or near the point where an output vector is consumed by its invoking code. The code may be bound by, for example, the use of dynamic data mangling and/or fractures as disclosed herein. For example, the inputs provided may be from a mangled store, and outputs may be fetched by an invoker from the mangled store. Other techniques may be used to bind code at these points, such as data-flow duplication with cross-linking and cross-trapping as disclosed herein. Different combinations may be used, such as where dynamic data mangling, fractures, and data-flow duplication are all applied at the same point to bind the code at that point. The protections applied to code expected to be in an exposed environment may be applied within one or both of the state-vector functions, with the portion of the code affected determined by the needed level of security. For example, applying multiple additional protection types at each possible point or almost each possible point may provide maximal security; applying a single protection at multiple points, or multiple protection types at only a single code point, may provide a lower level of security but improved performance during code generation and/or execution. In some embodiments, fractures may be applied at multiple points throughout the generation and binding process, because many opportunities for fracture creation may exist due to generation of many linear and affine operations among the steps of the state-vector function during its construction.
[124] In some embodiments, it may be useful to make one member of a state-vector function pair more compact than the other. This may be done, for example, by making the other member of the pair more expensive to compute. As a specific example, when one member of a state-vector function pair is to be used on exposed and/or limited-power hardware such as a smart card or the like, it may be preferred for a hardware-resident member of the state-vector function pair to be significantly more compact than in other embodiments disclosed herein. To do so, a corresponding server-resident or other non-exposed member of the state-vector function pair may be made significantly more costly to compute. As a specific example, rather than using a relatively high number of coefficients as disclosed and as would be expected for a state-vector function generation technique as disclosed previously, a repetitious algorithm may be used. The repetitious algorithm may use coefficients supplied by a predictable stream generation process or similar source, such as a pseudo-random number generator that uses a seed which completely determines the generated sequence. A suitable example of such a generator is a pseudo-random generator based on ARC4. In some embodiments, such as where the available RAM or similar memory is relatively limited, a variant that uses a smaller element size may be preferred. The pseudo-random number generator may be used to generate all matrix elements and displacement-vector elements. Appropriate constraints may be applied to ensure invertibility of the resulting function. To invert, the generated matrices can be reproduced by knowledge of the seed, at the cost of creating the complete stream used in the exposed pair member, reading it in reverse, multiplicatively inverting each matrix, and additively inverting each vector element in a displacement, over Z/(2^w). Thus, a limited-resource device such as a smart card may be adapted to execute one of a state-vector function pair, while the system as a whole still receives at least some of the benefits of a complete state-vector function system as disclosed herein.
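The seed-driven construction in the preceding paragraph can be sketched as below. This is an illustrative Python toy under stated assumptions: Python's random.Random stands in for the ARC4-based generator mentioned above, the invertibility constraint is enforced by building each matrix as a triangular product with an odd diagonal, and the sizes W, N and ROUNDS as well as all function names are hypothetical.

```python
import random

W, N, ROUNDS = 16, 3, 4          # word size, vector length, rounds (assumed)
MOD = 1 << W
MASK = MOD - 1

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) & MASK
             for j in range(N)] for i in range(N)]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) & MASK for row in A]

def next_round(rng):
    """Draw one invertible matrix and one displacement vector from the stream."""
    # L is unit lower-triangular and U is upper-triangular with an odd diagonal,
    # so det(L*U) is odd and the product is invertible over Z/(2^W).
    L = [[1 if i == j else (rng.randrange(MOD) if j < i else 0)
          for j in range(N)] for i in range(N)]
    U = [[(rng.randrange(MOD) | 1) if i == j else (rng.randrange(MOD) if j > i else 0)
          for j in range(N)] for i in range(N)]
    d = [rng.randrange(MOD) for _ in range(N)]
    return mat_mul(L, U), d

def mat_inv(M):
    """Gauss-Jordan inversion over Z/(2^W); pivots must be odd to be invertible."""
    A = [row[:] + [int(i == j) for j in range(N)] for i, row in enumerate(M)]
    for c in range(N):
        p = next(r for r in range(c, N) if A[r][c] & 1)   # find an odd pivot
        A[c], A[p] = A[p], A[c]
        inv = pow(A[c][c], -1, MOD)
        A[c] = [(x * inv) & MASK for x in A[c]]
        for r in range(N):
            if r != c and A[r][c]:
                f = A[r][c]
                A[r] = [(a - f * b) & MASK for a, b in zip(A[r], A[c])]
    return [row[N:] for row in A]

def encode(seed, v):
    """Compact member: consumes the coefficient stream once, in order."""
    rng = random.Random(seed)
    for _ in range(ROUNDS):
        M, d = next_round(rng)
        v = [(y + e) & MASK for y, e in zip(mat_vec(M, v), d)]
    return v

def decode(seed, v):
    """Costlier member: regenerates the whole stream, then walks it in reverse."""
    rng = random.Random(seed)
    rounds = [next_round(rng) for _ in range(ROUNDS)]
    for M, d in reversed(rounds):
        v = mat_vec(mat_inv(M), [(y - e) & MASK for y, e in zip(v, d)])
    return v
```

In this sketch the encoding side needs only the seed and one round of coefficients at a time, while the decoding side must hold every round and invert each matrix, which mirrors the compact versus costly split described above.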
[125] Securing Communication Pipes
[126] As shown in the block diagram of Figure 29, base functions as disclosed herein may be used to provide a secure communication pipe from one or more applications on one or more platforms, to one or more applications on one or more other platforms (i.e. an e-link). The same process may be used to protect communication from one sub-application to another sub-application on a single platform. In short, a base function pair f_K, f_K^{-1} may be used to protect a pipe by performing a cipher-like encrypt and decrypt at respective ends of the pipe. In an embodiment, the base function pair f_K, f_K^{-1} may be applied to the pipe start and pipe end, and also applied to the application and its platform, thus binding them together and binding them to the pipe. This secures (1) the application to the pipe-start, (2) the pipe-start to the pipe-end, and (3) the pipe-end to the application information flow.
[127] An illustrative way of effecting such a process is as follows. Firstly, a key K is generated using a random or pseudo-random process. The base-functions f_K, f_K^{-1} are then generated using the key K and randomization information R. The base functions are then applied to pipe-start and pipe-end so that at run time, the pipe-start computes f_K, and the pipe-end computes f_K^{-1}. The key K can then be discarded as it is not required to execute the protected code. In an application such as this, the base-function specifications will be cipher-based specifications for f_K, f_K^{-1} (similar to FIPS-197 for AES encrypt and decrypt). Cloaked base-functions are specific implementations (pipe-start and pipe-end above) of the smooth base-functions designed to foil attempts by attackers to find K, invert a base-function (i.e., break encryption), or break any of the bindings shown above. That is, a smooth base function is one which implements f_K or f_K^{-1} straightforwardly, with no added obfuscation. A cloaked base function still computes f_K or f_K^{-1}, but it does so in a far less straightforward manner. Its implementation makes use of the obfuscation entropy R to find randomly chosen, hard to follow techniques for implementing f_K or f_K^{-1}. Further examples of techniques for creating and using cloaked base functions are provided in further detail herein.
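The generate-then-discard flow for a pipe can be sketched as follows. This Python toy produces only a "smooth" pair in the sense used above: it applies no cloaking, uses no randomization information R, and is not cryptographically strong. The generator name generate_pair, the step structure and the use of random.Random seeded with K are assumptions made for illustration.

```python
import random

W = 32
MOD = 1 << W
MASK = MOD - 1

def generate_pair(K, n_steps=8, n=4):
    """Derive a mutually inverse pair (f, f_inv) on n-word vectors from key K.
    K is used only here, at generation time; the returned functions hold
    constants derived from K but never receive K itself."""
    rng = random.Random(K)
    steps = []
    for _ in range(n_steps):
        i = rng.randrange(n)
        j = (i + 1 + rng.randrange(n - 1)) % n      # guarantees j != i
        c = rng.randrange(MOD) | 1                  # odd, so invertible mod 2^W
        steps.append((i, j, c, pow(c, -1, MOD)))

    def f(v):
        v = list(v)
        for i, j, c, _ in steps:
            v[j] = (v[j] + v[i]) & MASK
            v[i] = (v[i] * c) & MASK
        return v

    def f_inv(v):
        v = list(v)
        for i, j, _, c_inv in reversed(steps):
            v[i] = (v[i] * c_inv) & MASK
            v[j] = (v[j] - v[i]) & MASK
        return v

    return f, f_inv

# Pipe-start is given f, pipe-end is given f_inv; K itself can then be discarded.
pipe_start, pipe_end = generate_pair(K=0xC0FFEE)
```

A message vector passed through pipe_start and then pipe_end is returned unchanged, while only the encoded form appears between the two ends of the pipe.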
[128] Function-Indexed Interleaving
[129] To guard against homomorphic mapping attacks, embodiments disclosed herein may replace matrix functions with functions which are (1) wide-input; that is, the number of bits comprising a single input is large, so that the set of possible input values is extremely large, and (2) deeply nonlinear; that is, functions which cannot possibly be converted into linear functions by I/O encoding (i.e., by individually recoding individual inputs and individual outputs). Making the inputs wide makes brute force inversion by tabulating the function over all inputs consume infeasibly vast amounts of memory, and deep nonlinearity prevents homomorphic mapping attacks.
[130] Some embodiments may use "Function-Indexed Interleaving", which may provide diffusion and/or confusion components which are deeply nonlinear. A function from vectors to vectors is deeply nonlinear if and only if it cannot be implemented by a matrix together with arbitrary individual input- and output-encodings. If it is not deeply nonlinear, then it is "linear up to I/O encoding" ("linearity up to I/O encoding" is a weakness exploited in the BGE attack on WhiteBox AES.)
[131] Function-Indexed Interleaving allows conformant deeply nonlinear systems of equations to be solved by linear-like means. It can be used to foster data-dependent processing, a form of dynamic diversity, in which not only the result of a computation, but the nature of the computation itself, is dependent on the data. Figure 30 shows a process flow diagram of an example Function-Indexed Interleaving process, which interleaves a single 4 x 4 function with a family of 4 x 4 functions. The 1 x 1 function with 1 x 1 function-family case permits combining of arbitrary kinds of functions, such as combining a cipher with itself (in the spirit of 3DES) to increase key-space; combining different ciphers with one another; combining standard ciphers with other functions; and combining hardware and software functions into a single function.
[132] In the example implementation shown in Figure 30, the square boxes represent bijective functions, typically but not necessarily implemented by matrices. The triangle has the same inputs as the square box it touches and is used to control a switch which selects among multiple right-side functions, with inputs and outputs interleaving left-side and right-side inputs and outputs as shown: if the left-side box and right-side boxes are 1-to-1, so is the whole function; if the left-side box and right-side boxes are bijective, so is the whole function; if the left-side box and right-side boxes are MDS (maximum distance separable), so is the whole function, whether bijective or not.
[133] If the triangle and all boxes are linear and chosen at random, then (by observation) over 80% of the constructions are deeply nonlinear.
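The selection mechanism described for Figure 30 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the 4-bit word size, the affine left-side function, the four-member right-side family and the names `left`, `select`, `interleave` and `interleave_inverse` do not come from the specification, and the input/output interleaving (permutation) step is omitted. The sketch only shows why choosing the right-side function from the left-side input keeps the combined function invertible when every box is bijective. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
W = 4                     # toy word size; real embodiments use wide inputs
M = 1 << W

def left(x):
    """Fixed left-side bijection on W-bit words (5 is odd, so invertible mod 2^W)."""
    return (5 * x + 3) % M

# Hypothetical right-side family: affine bijections given as (odd multiplier, offset).
RIGHT_FAMILY = [(1, 0), (3, 7), (9, 2), (13, 11)]

def select(xl):
    """The 'triangle': picks a right-side function using the LEFT-side input."""
    return xl % len(RIGHT_FAMILY)

def interleave(xl, xr):
    """Apply the left function and the selected right function."""
    a, b = RIGHT_FAMILY[select(xl)]
    return left(xl), (a * xr + b) % M

def interleave_inverse(yl, yr):
    """Invert by undoing the left side first, then re-deriving the selection."""
    xl = (pow(5, -1, M) * (yl - 3)) % M
    a, b = RIGHT_FAMILY[select(xl)]
    return xl, (pow(a, -1, M) * (yr - b)) % M
```

Because the selector reads the left-side input rather than the left-side output, the inverse can recover the left input first and then knows which right-side function to undo.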
[134] In an example embodiment disclosed herein, function-indexed interleaving appears four times in an f_K, f_K^{-1} specification. Each time it includes three 4 x 4 linear mappings for some 4 x 4 matrix M. Each instance of function-indexed interleaving has a single left-side function and 2^4 = 16 right-side functions.
[135] Notably, function-indexed interleaving also may be nested, such that the left-function or right-function-family may themselves be instances of function-indexed interleaving. In such a configuration, the result is a recursive instance of function-indexed interleaving. In general, such instances typically are more difficult for an attacker to understand than non-recursive instances; that is, increasing the level of recursion in function-indexed interleaving should increase the level of obscurity.
[136] A further example embodiment and corresponding mathematical treatment of function-indexed interleaving is provided in Section 2.9, and specifically in Section 2.9.2, of the Appendix, and Figure 8.
[137] Mark I System
[138] Three specific example embodiments are described in detail herein, referred to as the Mark I, II and III systems. An exemplary implementation of the Mark I system is presented in the process flow diagram of Figure 31. In this example, the square boxes represent mixed Boolean arithmetic (MBA) polynomial encoded matrices. The ambiguity of MBA polynomial data- and operation-encodings is likely to be very high and to increase rapidly with the degree of the polynomial. Each matrix is encoded independently, and the interface encodings need not match. Thus, 2 x 2 recodings cannot be linearly merged with predecessors and successors. The central construction is function-indexed interleaving, which causes the text processing to be text-dependent. Using simple variants with shifts, the number of interleaved functions can be very large with low overhead. For example, permuting rows and columns of 4 x 4 matrices gives 576 choices. As another example, XORing with initial and final constants gives a very large number of choices. Initial and final recodings mix the entropy across corresponding inputs/outputs of the left fixed matrix and the right selectable matrices. Internal input/output recodings on each matrix raise the homomorphic mapping work factor from order 2^{3w/2} to order 2^{5w/2}, allowing for full 'birthday paradox' vulnerability - the work factor may be higher, but is unlikely to be lower.
[139] An example embodiment of a Mark I system and corresponding mathematical treatment is provided in Sections 3.5 and 4 of the Appendix and in Figure 4.
[140] However, it has been found that a Mark I type implementation may have two weaknesses that can be exploited in some circumstances:
1) Static dependency analysis can be used to isolate the components.
2) Only shift operations and comparisons in the 'switch' are non-T-functions. All of the other components are T-functions and therefore may be recursively analysable using a bit-slice attack.
[141] T-Functions
[142] A function f: (B^w)^k → (B^w)^m mapping from a k-vector of w-bit words to an m-vector of w-bit words is a T-function if for every pair of vectors x ∈ (B^w)^k, y ∈ (B^w)^m :- y = f(x), with x' ≠ x and y' = f(x'), and with bits numbered from 0 to w - 1 in the w-bit words, the lowest numbered bit in an element word at which y and y' differ is not lower than the lowest numbered bit in an element word at which x and x' differ.
[143] Thus, a function which is a T-function will have the property that a change to an input element's 2^i bit never affects an output element's 2^j bit when i > j. Typically, the bit-order numbering within words is considered to be from low-order (2^0) to high-order (2^{w-1}) bits, regarding words as representing binary magnitudes, so this may be restated as: an output bit can only depend on input bits of the same or lower order. So it may be possible to "slice off" or ignore higher bits and still get valid data. Some embodiments also may incorporate tens of millions of T-functions, in contrast to known implementations which only use hundreds of T-functions. As a result, embodiments disclosed herein may be more resistant to bit slicing attacks and statistical attacks.
[144] Functions composable from ∧, ∨, ⊕, ¬ computed over B^w together with +, -, × over Z/(2^w), so that all operations operate on w-bit words, are T-functions. Obscure constructions with the T-function property are vulnerable to bit-slice attacks, since it is possible to obtain, from any T-function, another legitimate T-function, by dropping high-order bits from all words in input and output vectors. The T-function property does not hold for right bit-shifts, bitwise rotations, division operations, or remainder/modulus operations based on a divisor/modulus which is not a power of two, nor does it hold for functions in which conditional branches make decisions in which higher-order condition bits affect the value of lower-order output bits. For conditional branches and comparison-based conditional execution, conditional execution on the basis of conditions formed using any one of the six standard comparisons =, ≠, <, >, ≤, ≥ all can easily violate the T-function condition, and indeed, in normal code using comparison-based branching logic, it is easier to violate the T-function condition than it is to conform to it.
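The T-function condition can be checked exhaustively for small word sizes. The following is a minimal sketch for single-word functions on 8-bit words; the checker `is_t_function` and the four sample functions are illustrative names chosen here, not part of the specification. It uses the restated form of the property: the low k bits of the output may depend only on the low k bits of the input, for every k.

```python
W = 8
MASK = (1 << W) - 1

def is_t_function(f, w=W):
    """True if, for every k, the low k output bits depend only on the low k input bits."""
    for k in range(1, w + 1):
        low = (1 << k) - 1
        for x in range(1 << w):
            # Clearing the high input bits must not change the low k output bits.
            if (f(x) & low) != (f(x & low) & low):
                return False
    return True

def arith_mix(x):
    """Built only from multiply, add and bitwise OR on w-bit words."""
    return (x * x + (x | 5)) & MASK

def right_shift(x):
    """A right shift moves higher-order bits into lower-order positions."""
    return x >> 1

def rotate_left(x):
    """A rotation moves the top bit into the lowest position."""
    return ((x << 1) | (x >> (W - 1))) & MASK

def compare_branch(x):
    """A comparison lets high-order bits decide a low-order output bit."""
    return 1 if x > 100 else 0
```

Under these assumptions only `arith_mix` passes the check, which matches the list of operations that do and do not preserve the property.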
[145] External and Internal Vulnerabilities and Attack-Resistance
[146] By repeatedly applying either of a pair of bijective functions f_K, f_K^{-1}, where f_K, f_K^{-1} are T-functions, it may be possible to precisely characterize the computations using a bit-slice attack. In such an attack, the operation of these functions is considered ignoring all but the low-order bit, and then the low-order two bits, and so on. This provides information until the full word size (e.g., 32 bits) is reached, at which point complete information on how the function behaves may be available, which is tantamount to knowledge of the key K. This is an external vulnerability. While the attack gains knowledge of implementation details, it does so without any examination of the code implementing those details, and could be performed as an adaptive known plaintext attack on a black-box implementation.
[147] A less severe external vulnerability may exist if the functions of the pair have the property that each acts as a specific T-function on specific domains, and the number of distinct T-functions is low. In this case, a statistical bucketing attack can characterize each T-function. Then if the domains can similarly be characterized, again, without any examination of the code, using an adaptive known plaintext attack, an attacker can fully characterize the functionality of a member of the pair, completely bypassing its protections, using only black-box methods. Plainly, it may be desirable to have a large enough number of distinct T-functions to foil the above attack. In Mark III type implementations, for example, there are over 10^8 distinct T-functions per segment and over 10^40 T-functions over all. Mark III type implementations are described in further detail herein.
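A toy illustration of the bit-slice idea follows. It assumes, purely for illustration, that the black box is a single affine T-function x -> a*x + b over 16-bit words with secret a and b; the names `make_oracle` and `bit_slice_recover` are hypothetical. The sketch extends candidate keys one bit at a time and keeps only those consistent with the observed outputs truncated to the same number of low-order bits, so the work grows with the word size rather than with the size of the key space.

```python
W = 16
MASK = (1 << W) - 1

def make_oracle(a, b):
    """Black-box T-function with secret parameters (a, b)."""
    return lambda x: (a * x + b) & MASK

def bit_slice_recover(oracle, samples, w=W):
    """Recover (a, b) by considering 1 low bit, then 2 low bits, and so on."""
    candidates = [(0, 0)]
    for k in range(1, w + 1):
        low = (1 << k) - 1
        high_bit = 1 << (k - 1)
        observed = [(x & low, oracle(x) & low) for x in samples]
        survivors = []
        for a0, b0 in candidates:
            # Try both values of the newly exposed bit in each parameter.
            for a in (a0, a0 | high_bit):
                for b in (b0, b0 | high_bit):
                    if all(((a * x + b) & low) == y for x, y in observed):
                        survivors.append((a, b))
        candidates = survivors
    return candidates
```

For example, `bit_slice_recover(make_oracle(0x6B2D, 0x1F43), range(8))` narrows the candidates to the single secret pair. Non-T-function components break the truncation step this sketch relies on.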
[148] In some cases, the pair of implementations may include functions which achieve full cascade, that is, every output depends on every input, and on average, changing one input bit changes half of the output bits. An example of an internal vulnerability may occur in a Mark II type implementation where, by 'cutting' the implementation at certain points, it may be possible to find a sub-implementation (a component) corresponding to a matrix such that the level of dependency is exactly 2 x 2 (in which case the component is a mixer matrix) or 4 x 4 (in which case it is one of the L, S, or R matrices). Once these have been isolated, properties of linear functions allow very efficient characterization of these matrices. This is an internal attack because it requires non-black-box methods: it actually requires examination of internals of the implementations, whether static (to determine the dependencies) or dynamic (to characterize the matrices by linearity-based analyses).
[149] As a general rule, the more external attacks are prevented, and a potential attacker is forced to rely on increasingly fine-grained internal attacks, the harder the attacker's job becomes, and most especially, the harder the attacks become to automate. Automated attacks are especially dangerous because they can effectively provide class cracks which allow all instances of a given technology to be broken by tools which can be widely distributed.
[150] Thus embodiments disclosed herein may provide, by means of variable and increasingly intricate internal structures and increasingly variegated defenses, an environment in which any full crack of an instance requires many sub-cracks, the needed sub-cracks vary from instance to instance, the structure and number of the attacked components varies from instance to instance, and the protection mechanisms employed vary from instance to instance. In this case, automating an attack becomes a sufficiently large task to discourage attackers from attempting it. In the substantial time it would take to build such an attack tool, the deployed protections may have been updated or otherwise may have moved on to a new technology for which the attack-tool's algorithm no longer suffices.
[151] Mark II System
[152] A block diagram of an example Mark II type implementation according to an embodiment is presented in Figures 23 and 12. Figure 23 presents the processing of a "base core function" which appears four times in Figure 12. The complete execution flow for a Mark II type system is shown in Figures 5 and 6, and described in further detail with reference to Figures 5 and 6 in Section 5.1 of the Appendix.
[153] In an implementation according to a Mark II type embodiment, explicit use of recoding is part of the functionality chosen by K. Right-side recodes and permutations are chosen text-dependently from pairs for a total of 16 configurations per core and 65,536 configurations over all. However, a T-function count of 65,536 over all may be much too low for many cases; even a blind bit-slice attack, which ignores the internal structure and uses statistical bucketing, might suffice to crack the Mark II implementation given sufficient attack time.
[154] The balance of a Mark II type implementation is shown in Figure 12. Initial and final permutations and recodes as shown are statically chosen at random. Swapping sides between cores 1 & 2 and between cores 3 & 4, and half-swapping between cores 2 & 3, ensure text dependence across the entire text width. However, the highly regular structure facilitates component-isolation by interior dependency analysis. Once the components are isolated, the T-functions can be analysed by bit-slice analysis. The non-T-function parts are simple and can be cracked using straightforward attacks. Thus, the Mark II implementation is effective and is useful in many applications, but could be compromised with sufficient access and effort.
[155] The Mark II proposal is similar to Mark I in that it has a fixed internal structure, with only coefficient variations among the base function implementation pairs. Further description regarding the example embodiment of a Mark II implementation and a corresponding mathematical treatment is provided in Section 5.1 of the Appendix.
[156] Mark III System
[157] In contrast to the Mark I and Mark II implementations described above, a Mark III base function design according to an embodiment disclosed herein may include the following properties: an irregular and key-determined structure, so that the attacker cannot know the details of the structure in advance; highly data-dependent functionality: varying the data varies the processing of the data, making statistical bucketing attacks resource-intensive; an extremely high T-function count (the number of separate sub-functions susceptible to a recursive bit-slice attack), making a blind bit-slice attack on its T-functions infeasible; redundant and implicitly cross-checked data-flow, making code-modification attacks highly resource-intensive; and omni-directional obfuscation-induced dependencies, making dependency-based analysis resource-intensive.
[158] Figure 13 shows a schematic representation of execution flow in a portion of an example Mark III type implementation. Similar to the example execution flows described with respect to the Mark I and Mark II type implementations, each component may represent a function, process, algorithm or the like, with arrows representing potential execution paths between them. Where different arrows lead to different points within the components, it will be understood that different portions of the component may be executed, or different execution paths within the component may be selected. As shown in Figure 13, a Mark III type implementation may provide an irregular, key-dependent, data-dependent, dataflow-redundant, cross-linked, cross-checked, tamper-chaotic structure, containing a nested function-indexed interleaving within a function-indexed interleaving. Cross-linking can be omnidirectional because right-side selection depends on the inputs, not the outputs, of the left-side in each interleaving, so that simple code reordering within each segment allows right-to-left cross connections as well as left-to-right ones. As shown in Figure 14, irregular, extremely fine-grained T-function splitting makes an overall T-function partitioning attack ineffective.
[159] Figure 15 shows another example schematic of a portion of a Mark III type implementation as disclosed herein. As shown in Figure 15, the initial and final mixing may use linear transforms of 32-bit words having widths of 3 to 6. Five to seven segments may be used, each of which contains a 3-band recursive instance of function-indexed interleaving. Each band is 3 to 6 elements wide, with a total of 12 elements for all three bands. Matrices are I/O permuted and I/O rotated, giving over 100 million T-subfunctions per segment: the whole base function has over 10^40 T-subfunctions. Dataflow duplication, random cross-connection, and random checks, combined with code-reordering also may be used, creating omnidirectional cross-dependencies.
[160] A number of the different defenses that may be used in a Mark III type system are shown graphically in Figure 16. They include features such as the following: memory-shuffling with fractured transforms (dynamic data mangling) which hides dataflow; random cross-linking, cross-trapping, and variable-dependent coding which causes pervasive inter-dependence and chaotic tamper response; permutation polynomial encodings and function-indexed interleaving which hobble linear attacks; variable, randomly-chosen structure which hobbles advance-knowledge attacks; and functionality that is highly dependent on run-time data, reducing repeatability and hobbling statistical bucketing attacks.
[161] Further details regarding a Mark III type implementation are provided in Section 6 of the Appendix. A related process for creating an invertible matrix over Z/(2^w) is provided in Section 3.3 of the Appendix. As shown and described, initial and/or final mixing steps also may be used, examples of which are provided in Section 2.8 of the Appendix.
[162] By replacing conditional swaps with 2 x 2 bijective matrices mixing each input into each output, we can take precisely the same network topology and produce a mixing network which mixes every input of a base function with every other initially, and we can employ another such network finally to mix every output of the base function with every other. As noted above, the mixing is not entirely even, and its bias can be reduced with conditional swaps replaced by mixing steps. A segment's input and output vectors also may be subdivided, for example as described in further detail in Sections 6.2.3-6.2.7 of the Appendix, and as illustrated in Figure 11.
[163] Data-Flow Duplication
[164] Some embodiments may include data flow duplication techniques. For example, as described below, for every instruction which is not a JUMP..., ENTER, or EXIT, the instruction may be copied so that an original instruction is immediately followed by its copy, and new registers may be chosen for all of the copied instructions such that, if x and y are instructions, with y being the copy of x,
[165] 1) if x inputs the output of an ENTER instruction, then the corresponding y input uses the same output;
[166] 2) if x inputs the output of an original instruction u with copy v, then the corresponding input of y inputs from the v output corresponding to the u output from which x inputs; and
[167] 3) if x outputs to an EXIT instruction, then the corresponding output of y outputs to a special unused sink node indicating that its output is discarded.
[168] Thus, all of the computations except for the branches have an original and a copy occurrence.
[169] To accomplish this transformation, we proceed as follows.
[170] We add a new instruction JUMPA ('jump arbitrarily'), which is an unconditional branch with two destinations in control-flow graph (CFG) form, just like a conditional branch, but with no input: instead, JUMPA chooses between its two destinations at random. JUMPA is not actually part of the VM instruction set, and no JUMPA will occur in the final obfuscated implementation of f_K or f_K^{-1}.
[171] We use JUMPA in the following transformation procedure:
[172] 1) If the implementation is not in SMA (static multi-assignment) form already, convert it to SMA form;
[173] 2) For each BB X_i of the BBs X_1, ..., X_k in the implementation, replace it with three BBs C_i, X_i, X'_i by creating a new BB X'_i which is identical to X_i, and adding a new BB C_i which contains only a single JUMPA instruction targeting both X_i and X'_i, making X_i and X'_i the two targets of C_i's JUMPA, and making every non-JUMPA branch-target pointing to X_i point to C_i instead.
[174] 3) Convert the implementation to SSA form (static single assignment), isolating the local data-flow in each X_i and X'_i, although corresponding instructions in X_i and X'_i still compute identical values.
[175] 4) Merge all of the code in each X'_i back into its X_i, alternating instructions from X_i and X'_i in the merge so that corresponding pairs of instructions are successive: first the X_i instruction, and then the corresponding X'_i instruction.
[176] 5) Make each branch-target which is a C_i point to the corresponding X_i instead, and remove all of the C_i and X'_i BBs. At this point, the data-flow has been duplicated, the original shape of the CFG has been restored, and the implementation is free of JUMPA instructions. Remember which instructions correspond in each X_i for future use.
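The end result of the procedure above, for a single basic block, can be sketched directly from the three register rules given earlier. This is a simplified illustration only: it assumes the block is already in SSA form, skips the JUMPA and CFG manipulation entirely, and uses a made-up instruction format `(opcode, destination registers, source registers)` together with hypothetical helpers `duplicate_dataflow` and `run`.

```python
import operator

OPS = {"ADD": operator.add, "MUL": operator.mul, "XOR": operator.xor}
MASK = 0xFFFFFFFF

def duplicate_dataflow(block):
    """Follow each non-branch, non-ENTER/EXIT instruction with a renamed copy."""
    copy_of = {}                      # original register -> register of its copy
    out = []
    for op, dsts, srcs in block:
        out.append((op, dsts, srcs))
        if op in ("ENTER", "EXIT") or op.startswith("JUMP"):
            continue
        # Rule 1: ENTER outputs have no copy, so the copy reads the same register.
        # Rule 2: otherwise the copy reads from the copy of the producing instruction.
        dup_srcs = tuple(copy_of.get(s, s) for s in srcs)
        # Rule 3: EXIT keeps reading originals, so a copy's last output is simply unused.
        dup_dsts = tuple(d + "_dup" for d in dsts)
        copy_of.update(zip(dsts, dup_dsts))
        out.append((op, dup_dsts, dup_srcs))
    return out

def run(block, inputs):
    """Tiny interpreter used only to check that copies compute identical values."""
    env, result = {}, None
    for op, dsts, srcs in block:
        if op == "ENTER":
            env.update(zip(dsts, inputs))
        elif op == "EXIT":
            result = tuple(env[s] for s in srcs)
        else:
            env[dsts[0]] = OPS[op](*(env[s] for s in srcs)) & MASK
    return result, env
```

Running the duplicated block yields the same result as the original, with every intermediate value available twice for later cross-linking.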
[177] Further details regarding control flow duplication are provided in Section 5.2.6 of the Appendix, and described with respect to Figure 9, which shows an example process for control flow duplication according to embodiments disclosed herein.
[178] Fractures and Fracture Functions
[179] Generally when an encoded output is produced, it is consumed with exactly the same encoding assumed, so that an encoded operation z = f(x, y) becomes z' = f'(x', y') where (x', y', z') = (e_x(x), e_y(y), e_z(z)), for encodings e_x, e_y, e_z, and where f' = e_z ∘ f ∘ [e_x^{-1}, e_y^{-1}].
[180] In some embodiments, it may be advantageous to output a value with one encoding, and subsequently input assuming some other encoding. If x is output as e_1(x), and later consumed assuming encoding e_2, in effect we have applied e_2^{-1} ∘ e_1 to the unencoded value. Such an intentional mismatch between the encoding in which a value is produced and the encoding assumed when it is consumed is referred to herein as a "fracture." If the encodings are linear, so is the fracture function e_2^{-1} ∘ e_1, and if they are permutation polynomials, so is the fracture function e_2^{-1} ∘ e_1.
[181] In some embodiments, fractures may be useful in obfuscation because the computation which they perform effectively does not appear in the encoded code - the amount and form of code to perform a normal networked encoding and one which adds an operation by means of a fracture is identical, and there appears to be no obvious way to disambiguate these cases, since encodings themselves tend to be somewhat ambiguous.
[182] Note that the defining property of a fracture is the fracture function, for example v^{-1} ∘ u. Generally, there are many different choices of consuming encoding v and producing encoding u which produce exactly the same fracture function. It is quite possible, for example, to have u_1, ..., u_k; v_1, ..., v_k such that v_i^{-1} ∘ u_i is the same fracture function for i = 1, ..., k. Thus, specifying the fracture function does not necessarily specify the producing and consuming encodings which imply it.
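A fracture can be demonstrated with affine encodings over 32-bit words. The sketch below is an illustration under stated assumptions: the encodings are of the form x -> a*x + b mod 2^32 with odd a, and the helper names `encoder`, `decoder` and `fracture_pair` are hypothetical. Given a target function x -> c*x + d (with c odd) and any consuming encoding, it derives a producing encoding whose mismatch computes the target. Python 3.8 or later is needed for the modular inverse via `pow`.

```python
M = 1 << 32

def encoder(a, b):
    """Producing encoding u(x) = a*x + b mod 2^32 (a must be odd)."""
    return lambda x: (a * x + b) % M

def decoder(a, b):
    """Inverse v^{-1} of the consuming encoding v(x) = a*x + b mod 2^32."""
    a_inv = pow(a, -1, M)
    return lambda y: (a_inv * (y - b)) % M

def fracture_pair(c, d, a2, b2):
    """Return (produce, consume) such that consume(produce(x)) = c*x + d mod 2^32."""
    a1 = (c * a2) % M            # chosen so that a2^{-1} * a1 = c
    b1 = (d * a2 + b2) % M       # chosen so that a2^{-1} * (b1 - b2) = d
    return encoder(a1, b1), decoder(a2, b2)
```

Calling `fracture_pair(3, 5, ...)` with two unrelated consuming encodings gives two different encoding pairs that both hide the same covert computation 3*x + 5, which illustrates that the fracture function does not determine the encodings that imply it.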
[183] Data Scrambling via Mass Data Encoding
[184] Mass Data Encoding (MDE) is described in United States Patent No. 7,350,085, the contents of which are incorporated herein by reference. In short, MDE scrambles memory locations in a hash-like fashion, dynamically recoding memory cells on each store and dynamically recoding and relocating memory cells by background processing. By mismatching fetch and store recodings, a fetch or store can perform an add or multiply while continuing to look like a simple fetch or store. This makes it hard for an attacker to disambiguate between mere obfuscation and useful work.
[185] MDE is compiled, not just interpreted, so supporting data structures are partially implicit and hence, well-obscured. Actual addresses are always scrambled and rescrambled by background activity. As shown in Figure 17, the code accessing the Virtual MDE memory is initially written as if it were accessing an ordinary piece of memory. The code is then modified by the methods described in US patent 7,350,085 to employ a mapping technique which encodes both the data and locations in the memory. Thus, the locations accessed move around over time, and the encodings applied to the data likewise change over time, under the feet of the running code. This technique of protection has substantial overhead, but its highly dynamic nature makes it arduous for an attacker to penetrate the meaning of software which uses it. Cells are recoded when stored, and are recoded periodically by background activity. Mismatching recode on store and corresponding recode on fetch can do a covert add or multiply (key-controllable). Fetched items are recoded, but not to smooth (i.e., not to unencoded). Stored items are not smooth prior to store, and are recoded on store to a dynamically chosen new cell encoding. Stored data are meaningless without the code which accesses them. One program can have any number of distinct, nonoverlapping MDE memories. An MDE memory can be moved as a block from one place to another or can be transmitted from one program to another via a transmission medium. That is, messages of sufficient bulk can be transmitted in MDE-memory form.
[186] The initial state of the memory is not produced by hacker-visible activity, and hence conceals how its contents were derived. That is, the initial state is especially obscure.
[187] Control Confusion via Control Flow Encoding
[188] Control Flow Encoding (CFE) is described in United States Patent No. 6,779,114, the contents of which are incorporated herein by reference. CFE combines code-fragments into multi-function lumps with functionality controlled by register-switching. This gives a many-to-many mapping of functionality to code locations, and execution that is highly unrepeatable if external entropy is available: the same original code turns into many alternative executions in CFE code. By modifying the register-switching and dispatch code, key information can control what is executed and therefore control the computation performed by embodiments of the invention.
[189] Code represented by the control-flow graph of Figure 18, where the letters denote code fragments, can be encoded as shown in Figure 19. The protected control-flow encoding shows lumps created by combining pieces, executed under the control of the dispatcher, with the 'active' piece(s) selected by register switching.
[190] CFE is compiled, not just interpreted, so supporting data structures are partially implicit, and hence, well-obscured. Lumps combine multiple pieces; that is, they have multiple possible functionalities. When a lump is executed, which piece(s) is/are active is determined by which pieces operate via registers pointing to real data rather than dummy data. The same piece may occur in multiple lumps, with different data-encodings: mapping from functionalities to code-locations is many-to-many.
[191] The dispatcher can be arranged to select pieces which embody a background process, making it hard to distinguish background and foreground activity. Available entropy is used to determine which alternative way of executing a sequence of pieces is employed, providing dynamic execution diversity (nonrepeating execution). As well, key information can be used to influence dispatch and hence vary the represented algorithm.
[192] Dynamic Data Mangling
[193] As shown in Figure 20, re-use of M-registers may be maximized, allocating separate M-registers only where required, using Chaitin's graph-coloring allocation algorithm. As a result, M-registers are re-used frequently, making data-flow harder for attackers to follow.
[194] To do so, first a modulus M, a permutation polynomial p over the mod-M ring, an input-based 1 x n vector mapping matrix A yielding z from the inputs, and a series of constants c_i = p(z + i) for 1 ≤ i ≤ M, may be selected, where the c_i values are distinct since p is a mod-M perm-polynomial. Locations c_1, ..., c_n (with n ≤ M) are treated in an array X of size M as 'M-registers'.
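The selection of M-register locations can be sketched as follows. The concrete choices are assumptions for illustration only: M = 256, the quadratic p(x) = 7 + 5x + 4x^2 (a permutation polynomial modulo a power of two because its linear coefficient is odd and its quadratic coefficient is even), the mapping vector `A`, and the helper name `register_locations`.

```python
M = 256                      # modulus; a power of two for this sketch

def p(x):
    """Hypothetical permutation polynomial over the mod-M ring."""
    return (7 + 5 * x + 4 * x * x) % M

def register_locations(inputs, A, n):
    """Return locations c_1..c_n = p(z + i), where z = A . inputs mod M."""
    z = sum(a * v for a, v in zip(A, inputs)) % M
    return [p((z + i) % M) for i in range(1, n + 1)]
```

Because p permutes the ring, the n locations never collide, while a change in the inputs moves the whole set of M-registers to different cells of the array.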
[195] During computation, data may be moved randomly into and out of M-registers, and from M-register to M-register, changing encoding at each move. Some embodiments also may randomly cause either the encodings to form an unbroken sequence, or may inject fractures as disclosed herein where encodings do not match.
[196] Given a fracture with data in encoding e_1, the input is assumed to be in encoding e_2, thus computing the fracture function e_3 = e_2^{-1} ∘ e_1. If e_1, e_2 are linear, so is e_3. If e_1, e_2 are permutation polynomials, so is e_3. The code has identical form whether a fracture is present or not; i.e., it is ambiguous whether or not a fracture is present. Thus, as previously described, fractures may provide a means of injecting hidden computations such that the code looks much the same before and after it is added.
[197] Additional details and mathematical treatment of the use of dynamic data mangling are provided in Section 7.8.14 of the Appendix.
[198] Cross-Linking and Cross-Trapping
[199] The generous application of cross-linking and cross-trapping can provide aggressive chaotic response to tampering and perturbation attacks, with much stronger transcoding and massive static analysis resistance. In an embodiment, cross-linking and cross-trapping may be effected as follows, as illustrated in Figure 21:
1) copy computations at least once;
2) randomly swap connections between the original and the copy. Because they are duplicates, the results will not change;
3) encode all of the resulting computations so that duplicates are independently encoded;
4) randomly take duplicate results and inject computations adding their difference (= 0) or multiplying one by the ring inverse of the other (= 1) and then adding the 0 or multiplying by the 1 (in encoded form). The injected encoded 0-adds and 1-multiplies have no functional effect unless tampering occurs, in which case the code behaves chaotically.
[200] An added benefit is that the static dependency graph becomes much denser than that for the original program, making static analysis attacks difficult. Thus, effective tampering requires that the (differently encoded) duplicates be correctly identified and the correct duplicates be changed in effectively the same way under different encodings. This is much harder to accomplish than ordinary tampering without cross-linking and cross-trapping.
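The effect of the injected 0-adds and 1-multiplies can be seen in a small unencoded sketch. It is an illustration only: the independent encodings of step 3 are left out, the computation `x*x + 7` and the function name `compute` are made up, tampering is modelled by a `tamper` offset applied to one duplicate, and the low bit is forced to 1 before inversion because only odd values have a ring inverse modulo 2^32.

```python
M = 1 << 32

def compute(x, tamper=0):
    """Compute 3*(x*x + 7) + 1 with cross-traps tying together two duplicates."""
    orig = (x * x + 7) % M
    dup = (x * x + 7 + tamper) % M       # an attacker altering one copy is modelled here
    zero = (orig - dup) % M              # difference of duplicates: 0 when untampered
    u, v = orig | 1, dup | 1             # force odd so the ring inverse exists
    one = (u * pow(v, -1, M)) % M        # ratio of duplicates: 1 when untampered
    y = (3 * orig + 1) % M
    y = (y + zero) % M                   # injected 0-add
    y = (y * one) % M                    # injected 1-multiply
    return y
```

With `tamper=0` the traps vanish and `compute(5)` returns 97; with one copy altered, the "0" and "1" become unrelated values and the result is corrupted.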
[201] An example implementation of data-flow duplication is provided in Section 5.2.8- 5.2.10 of the Appendix, and illustrated in Figure 10. In addition to its normal use within the entry and exit base-functions, data flow duplication and cross-checking or trapping also may be performed using these transformations for the data-flow within the decision-block
including the transfer of information from the outputs of the entry base-function to inputs of the decision-block and the transfer of information from the outputs of the decision-block to the inputs of the exit base-function.
[202] Context-Dependent Coding
[203] In some embodiments, the context in which base function pairs are implemented may be an integral part of the operation of the base-function. Context includes information from the application, hardware, and/or communication. Context of one base-function component can also include information from other components, which are part of the application in which it resides.
[204] Referring to Figure 22, an implementation of a base-function pair or a similar construct may be hosted on a platform from which hardware or other platform signature constants can be derived and on which the implementation can be made to depend. It may be preferred for the implementation to reside in a containing application from which an application signature or other application constants can be derived and on which the implementation can be made to depend.
[205] The implementation may also take inputs from which further constant signature information can be derived and on which the implementation can be made to depend.
[206] Biased Permutations via Sorting Networks
[207] Permutations may provide a basis for storing enormous numbers of alternatives in limited space. For example, row/column permutations may be used to turn a non-repeating 4x4 matrix into 576 non-repeating 4x4 matrices. In some embodiments, the order of computations may be permuted, deep dependence of computations on run-time data may be generated, and the like.
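The figure of 576 follows from 4! row orders times 4! column orders (24 x 24). A short enumeration confirms it for a matrix whose entries are all distinct; the matrix `base` and the helper `variants` are illustrative names only.

```python
from itertools import permutations

# A 4x4 matrix with sixteen distinct ("non-repeating") entries.
base = [[4 * r + c for c in range(4)] for r in range(4)]

def variants(m):
    """All matrices obtainable from m by permuting its rows and its columns."""
    n = len(m)
    seen = set()
    for rows in permutations(range(n)):
        for cols in permutations(range(n)):
            seen.add(tuple(tuple(m[r][c] for c in cols) for r in rows))
    return seen
```

A single stored matrix plus two small permutation indices therefore stands in for 576 distinct matrices.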
[208] Referring to Figure 7, to sort, some embodiments may, at each cross-link, compare and swap on greater-than. To permute, swaps are instead performed with probability ½. It is easy to show that if the network sorts correctly with a compare-swap, then it permutes with random swap with the full range of permutations as possible outputs. Some embodiments may use a recommended probability ½ Boolean generator to compare two text-based full-range permutation polynomial encoded values.
[209] Such sorting networks permute in a biased fashion, that is, some permutations are more probable than others, since the number of swap configurations is 2^(number of stages). However, the permutation count is the factorial of the number of elements to permute, which (for three or more elements) does not evenly divide the number of swap-configurations. In spite of the biased output, the advantage is simplicity and high dependency count with non-T functionality.
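The bias can be observed on the smallest interesting case. The sketch below uses a three-element sorting network with three compare-swap positions, chosen here for illustration, and enumerates all 2^3 = 8 swap configurations; `NETWORK`, `apply_swaps` and `permutation_histogram` are hypothetical names.

```python
from collections import Counter
from itertools import product

# Three-input sorting network: compare-swap positions applied in order.
NETWORK = [(0, 1), (1, 2), (0, 1)]

def apply_swaps(items, swap_bits):
    """Apply the network, swapping at each position whose bit is 1."""
    items = list(items)
    for (i, j), do_swap in zip(NETWORK, swap_bits):
        if do_swap:
            items[i], items[j] = items[j], items[i]
    return tuple(items)

def permutation_histogram():
    """Count how many of the 2^3 swap configurations yield each permutation."""
    configs = product((0, 1), repeat=len(NETWORK))
    return Counter(apply_swaps(range(3), bits) for bits in configs)
```

All 3! = 6 permutations appear, but since 6 does not divide 8, two of them occur twice and the remaining four occur once.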
[210] Unbiased Permutations via Simple Selection
[211] In some embodiments, unbiased permutations can also be generated by selecting a 1st element at random by taking the $r_1 \bmod n$ element among the elements (zero origin), selecting a 2nd element at random by taking the $r_2 \bmod (n-1)$ element from the remaining elements, and the like. With this process each $r_i$ is a full-range text-based perm-poly value. This may provide an almost perfectly bias-free, non-T-function permutation. However, the operations may be harder to hide in or interleave with ordinary code than for sorting-network-based permutation.
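The selection procedure can be sketched as follows. The function name `select_permutation` is hypothetical, and the values in `rs` stand in for the full-range encoded values $r_i$ described above; how those values are generated is outside this sketch.

```python
from itertools import product

def select_permutation(elements, rs):
    """Pick element (r_1 mod n), then (r_2 mod (n-1)) of the rest, and so on."""
    remaining = list(elements)
    chosen = []
    for r in rs:
        chosen.append(remaining.pop(r % len(remaining)))
    return chosen

print(select_permutation("abcd", [5, 7, 2, 9]))  # ['b', 'c', 'a', 'd']

# If each r_i is uniform over its residues, every permutation occurs exactly once.
all_outputs = {
    tuple(select_permutation("abcd", rs))
    for rs in product(range(4), range(3), range(2), range(1))
}
```

The residual bias mentioned in the text arises because a full-range word value reduced modulo n is only approximately uniform when n does not divide the word range.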
[212] Hobbling Bit-Slice Analysis
[213] As explained above, bit-slice attacks are a common attack tool: repeatedly executing a function and ignoring all but the lowest-order bit, and then the lowest-order two bits, the three lowest-order bits, etc. This allows the attacker to gain information until the full word size (say 32 bits) is reached, at which point complete information has been obtained on how the function behaves.
[214] A function constructed using T-function and non-T-function components has subdomains over which it is a T-function embedded in an entire domain in which the function is not. In some embodiments it may be advantageous to make the number of such subdomains very large (for example, in a Mark III type system as described herein, there may be over $10^{40}$ such subdomains) to make bucketing attacks on the subdomains highly resource-intensive. In some embodiments, liberal use also may be made of non-T-function computations at other points, such as at decision points, in permutations, in recodings, and the like.
[215] An Example General Data Blending Mechanism
[216] Figure 24 shows a graphical representation of a typical usage of Mass Data Encoding or Dynamic Data Mangling as described above. If inputs to a base function are provided by such an obscured memory array, by either of these two techniques, and the results are also obtained by the application from the obscured memory array, it becomes difficult for an attacker to analyse the data-flow of information entering or leaving the base function, making attacks on the base function more arduous.
[217] Security-Refresh Rate
[218] For effective application security lifecycle management, applications typically must be capable of resisting attacks on an ongoing basis. As part of this resistance, such applications may be configured to self-upgrade in response to security-refresh messages containing security renewal information. Such upgrades may involve patch files, table replacements, new cryptographic keys, and other security-related information.
[219] A viable level of security is one in which application security is refreshed frequently enough so that the time taken to compromise an instance's security is longer than the time to the security-refresh which invalidates the compromise; i.e., instances are refreshed faster than they can typically be broken. This is certainly achievable at very high security-refresh rates. However, such frequent refresh actions consume bandwidth, and as we raise the refresh rate, the proportion of bandwidth allocated to security-refresh messages increases, and available non-security payload bandwidth decreases.
[220] Plainly, then, engineering the appropriate security-refresh rate is required for each kind of application, since the tolerable overheads vary greatly depending on context. For example, if we expect only gray-box attacks (neighbor side-channel attacks) in a cloud application, we would use a lower refresh rate than if we expected white-box attacks (insider attacks by malicious cloud-provider staff).
[221] Authentication of Equality With Chaotic Failure
[222] Suppose we have an application in which authentication is password-like: authentication succeeds where G, the supplied value, matches a reference value Γ; i.e., when G = Γ. Further suppose that we care about what happens when G = Γ, but if not, we only insist that whatever the authentication authorized is no longer feasible. That is, we succeed when G = Γ, but if G ≠ Γ, further computation may simply fail.
[223] The authenticating equality is not affected by applying any non-lossy function to both sides: for any bijection φ, we can equivalently test whether φ(G) = φ(Γ). The authenticating equality may remain valid with high probability even if φ is lossy, if φ is carefully chosen so that the probability that φ(G) = φ(Γ) when G ≠ Γ is sufficiently low (as it is in Unix password authentication, for example). Based on technology previously described herein, we can easily perform such a test. We previously described a method for foiling tampering by duplicating data-flow, randomly cross-connecting the data-flow between duplicate instances, and performing encoded checking to ensure that the equalities have not been compromised. We can adapt this approach to test whether G = Γ, or in encoded form, whether φ(G) = φ(Γ).
[224] We note that a data-flow yielding φ(G) already duplicates a data-flow yielding φ(Γ) along the success path where G = Γ. We therefore omit, for this comparison, the data-flow duplication step. Then we simply cross-connect as described above and insert checks. By using these computations as coefficients for future encoded computations, we ensure that, if φ(G) = φ(Γ), all will proceed normally, but if φ(G) ≠ φ(Γ), while further computation will proceed, the results will be chaotic and its functionality will fail. Moreover, since φ is a function, if φ(G) ≠ φ(Γ), we can be sure that G ≠ Γ.
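The idea of failing chaotically rather than branching on a comparison can be illustrated with a deliberately simplified sketch. It does not reproduce the cross-connected data-flow described above: as an assumption for illustration, it takes SHA-256 (truncated and forced odd) as the lossy φ and uses φ of the supplied value directly as a multiplicative coefficient modulo $2^{32}$. The names `phi`, `protect`, and `use` are hypothetical.

```python
import hashlib

MOD = 2**32

def phi(value: bytes) -> int:
    """Lossy phi (assumed): truncated SHA-256, forced odd so it is invertible."""
    digest = hashlib.sha256(value).digest()
    return int.from_bytes(digest[:4], "big") | 1

def protect(x: int, reference: bytes) -> int:
    """Encode x using phi(reference value) as a coefficient."""
    return (phi(reference) * x) % MOD

def use(y: int, supplied: bytes) -> int:
    """Decode with phi(supplied value); there is no explicit equality test."""
    return (pow(phi(supplied), -1, MOD) * y) % MOD

stored = protect(1234, b"reference-secret")
print(use(stored, b"reference-secret"))  # recovers 1234
print(use(stored, b"wrong-guess"))       # some unrelated 32-bit value
```

Computation proceeds in both cases; only the matching value yields a meaningful result, so there is no single comparison instruction for an attacker to patch.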
[225] Variable-Dependent Coding
[226] In some embodiments that incorporate operations which make use of one or more variables which need not have a specific value during their use in the operation, variable-dependent coding may be used to further obscure the operation of related code. One way of doing so is to use values that are used or generated by other operations in nearby or related sections of code. Thus, such values may be used repeatedly for different purposes within a region of code, which may make it more difficult for an attacker to discern any individual use, or to extract information about the specific operations being performed in relation to those values. For example, if a value x is encoded as ax+b, there may be a great deal of leeway in the specific values used for the constants a and b. In this example, if there are values available within the executing code that remain constant over the life of x, they may be used as one or more of the constants a and/or b.
[227] Further, for a single defined operation, different values may be used during each execution of the operation, such that the specific values used may change each time the operation is executed. This may act as an additional barrier to a potential attacker, who may not be able to track values from one execution to another as might be expected for other types of clear, encrypted, or obfuscated code. Continuing the example above, a first operation f(Y) may return values a and b and a second operation g(Z) may return values c and d, each of which is stored in memory for a period of time. The variable x may be encoded as ax+b during the time that a and b are stored in memory, and as cx+d during the time that c and d are stored in memory. Thus, the appropriate constants will be available via the memory to allow for decoding or otherwise manipulating x in the appropriate encoding. The values may be overwritten or discarded after that time, since the encoding constants need only be available during the time that x is used by operations within the executing program.
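A minimal sketch of the ax+b example, assuming 32-bit modular arithmetic and an odd multiplier (an odd a is what makes the encoding invertible modulo $2^{32}$; that restriction is an assumption of this sketch, not a statement from the text). The function names are illustrative.

```python
MOD = 2**32

def encode(x, a, b):
    """Encode x as a*x + b (mod 2**32); a must be odd to be invertible."""
    return (a * x + b) % MOD

def decode(y, a, b):
    """Invert the encoding using the same constants a and b."""
    return ((y - b) * pow(a, -1, MOD)) % MOD

def recode(y, a, b, c, d):
    """Move a value from the (a, b) encoding to the (c, d) encoding."""
    return encode(decode(y, a, b), c, d)

# a, b stand for values left in memory by f(Y); c, d for values left by g(Z).
a, b = 7, 100
c, d = 9, 5
y1 = encode(42, a, b)          # 394
y2 = recode(y1, a, b, c, d)    # 383
print(y1, y2, decode(y2, c, d))
```

In a protected program the decode and encode steps of `recode` would normally be folded into one affine map so that the plain value never appears; they are kept separate here only for readability.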
[228] Similarly, variable values generated during execution of code may be used for other purposes in addition to or as an alternative to the finite encoding example provided. For example, variable values may be used to select a random item from a list or index, as a seed for a pseudo-random number generator, as an additive, multiplicative, or other scaling factor, or the like. More generally, variable values generated by one portion of executed code may be used in any place where a constant value is needed at another portion of executed code, for a duration not more than the generated variable values are expected to be available.
[229] Example Advantages
[230] Embodiments of the invention described herein may be used to provide the following, where a "sufficient period of time" may be selected based on, or otherwise determined by, the needs of security lifecycle management:
1) Black-Box Security: security as a keyed black-box cipher against attacks up to adaptive known plaintext for a sufficient period of time;
2) Secure Boundary: securely pass information in and out to/from surrounding code in encoded form for a sufficient period of time;
3) Key-Hiding: prevent key-extraction from implementations for a sufficient period of time;
4) Secure Weakest-Path: cryptographically secure even on weakest data path for a sufficient period of time;
5) Anti-Partitioning: cannot partition implementation into its construction blocks for a sufficient period of time;
6) Application-Locking: cannot extract implementation from its containing application for a sufficient period of time; and
7) Node-Locking: cannot extract implementation from its host platform for a sufficient period of time.
[231] Generally, embodiments disclosed herein relate to base-function encoding, using various techniques and systems as disclosed. Specific embodiments also may be referred to herein, such as in the Appendix, as "ClearBox" implementations.
[232] The various techniques as disclosed herein may use operations that are similar in nature to those used in an application that is being protected by the disclosed techniques, as previously described. That is, the protection techniques such as base functions, fractures, dynamic data mangling, cross-linking, and variable dependent coding may use operations that are similar to those used by the original application code, such that it may be difficult or impossible for a potential attacker to distinguish between the original application code and the protective measures disclosed herein. As a specific example, base functions may be constructed using operations that are the same as, or computationally similar to, the operations performed by the original application code with which the base functions are integrated, in contrast to the distinctive functions typically employed by, for example, known encryption techniques. Such operations and techniques that are difficult or impossible to distinguish may be described herein as "computationally similar."
[233] A method is generally conceived to be a self-consistent sequence of steps leading to a desired result. These steps require physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It is convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, parameters, items, elements, objects, symbols, characters, terms, numbers, or the like. It should be noted, however, that all of these terms and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. The description of the present invention has been presented for purposes of illustration but is not intended to be exhaustive or limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen to explain the principles of the invention and its practical applications and to enable others of ordinary skill in the art to understand the invention in order to implement various embodiments with various modifications as might be suited to other contemplated uses.
[234] Embodiments disclosed herein may be implemented in and used with a variety of computer systems and architectures. Figure 32 is an example computer system 3200 suitable for implementing embodiments disclosed herein. The computer 3200 may include a communication bus 3201 which interconnects major components of the system, such as a central processor 3210; a fixed storage 3240, such as a hard drive, flash storage, SAN device, or the like; a memory 3220; an input/output module 3230, such as a display screen connected via a display adapter, and/or one or more controllers and associated user input devices such as a keyboard, mouse, and the like; and a network interface 3250, such as an Ethernet or similar interface to allow communication with one or more other computer systems.
[235] As will be readily understood by one of skill in the art, the bus 3201 allows data communication between the central processor 3210 and other components. Applications resident with the computer 3200 generally may be stored on and accessed via a computer readable medium, such as the storage 3240 or other local or remote storage device. Generally, each module shown may be integral with the computer or may be separate and accessed through other interfaces. For example, the storage 3240 may be local storage such as a hard drive, or remote storage such as a network-attached storage device.
[236] Many other devices or components may be connected in a similar manner. Conversely, all of the components shown need not be present to practice embodiments disclosed herein. The components can be interconnected in different ways from that shown. The operation of a computer such as that shown is readily known in the art and is not discussed in detail in this application. Code to implement embodiments of the present disclosure may be stored in a computer-readable storage medium such as one or more of the memory 3220, the storage 3240, or combinations thereof.
[237] More generally, various embodiments disclosed herein may include or be embodied in the form of computer-implemented processes and apparatuses for practicing those processes. Embodiments also may be embodied in the form of a computer program product having computer program code containing instructions embodied in non-transitory and/or tangible media, such as floppy diskettes, CD-ROMs, hard drives, USB (universal serial bus) drives, or any other machine readable storage medium. When such computer program code is loaded into and executed by a computer, the computer may become an apparatus for practicing embodiments disclosed herein. Embodiments also may be embodied in the form of computer program code, for example, whether stored in a storage medium, loaded into and/or executed by a computer, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing embodiments disclosed herein. When implemented on a general-purpose processor, the computer program code may configure the processor to create specific logic circuits. In some configurations, a set of computer-readable instructions stored on a computer-readable storage medium may be implemented by a general-purpose processor, which may transform the general-purpose processor or a device containing the general-purpose processor into a special-purpose device configured to implement or carry out the instructions.
Embodiments may be implemented using hardware that may include a processor, such as a general purpose microprocessor and/or an Application Specific Integrated Circuit (ASIC) that embodies all or part of the techniques according to embodiments of the disclosed subject matter in hardware and/or firmware.
[238] In some embodiments, the various features and functions disclosed herein may be implemented by one or more modules within a computer system, and/or within software executed by the computer system. For example, a computer system according to some embodiments disclosed herein may include one or more modules configured to receive existing computer executable code, to modify the code as disclosed herein, and to output the modified code. Each module may include one or more sub-modules, such as where a module configured to modify existing computer executable code includes one or more modules to generate base functions, blend the base functions with the code, and output the blended code. Similarly, other modules may be used to implement other functions disclosed herein. Each module may be configured to perform a single function, or a module may perform multiple functions. Similarly, each function may be implemented by one or more modules operating individually or in coordination.
[239] One or more currently preferred embodiments have been described by way of example. It will be apparent to persons skilled in the art that a number of variations and modifications can be made without departing from the scope of the invention as defined in the claims.
1. INTRODUCTION
This document addresses the problem of creating pairs of programmatic implementations, $F$, $G$, for pairs of bijective functions $f$, $f^{-1}$, respectively, such that
(1) given white-box access to $F$ and a value $y$, it is 'hard' to find $x$ for which $y = f(x)$;
(2) given white-box access to $G$ and a value $x$, it is 'hard' to find $y$ for which $x = f^{-1}(y)$;
(3) given white-box access to $F$, it is 'hard' to find an implementation $Q$ for $f^{-1}$; and
(4) given white-box access to $G$, it is 'hard' to find an implementation $P$ for $f$.
We note that information $K$ sufficient to readily determine $f$, $f^{-1}$ can be regarded as a key for a symmetric cipher, with $F$ and $G$ being encryption and decryption according to key $K$.
We have not specified what we mean by 'hard'. At a minimum, we want it to be significantly less effortful to choose $K$ and generate such $F$, $G$ than it is to solve any of problems (1)-(4) above.
APPENDIX
Notation    Meaning
$\mathbb{B}$    the set of bits = {0, 1}
$\mathbb{N}$    the set of natural numbers = {1, 2, 3, ...}
$\mathbb{N}_0$    the set of finite cardinal numbers = {0, 1, 2, ...}
$\mathbb{Z}$    the set of integers = {..., -1, 0, 1, ...}
$x$ :- $y$    $x$ such that $y$
$x \leftarrow y$    set $x$ to $y$
$x$ iff $y$    $x$ if and only if $y$
$[A]$    1 if assertion $A$ is true; 0 otherwise
$x \parallel y$    concatenation of tuples or vectors $x$ and $y$
$x \land y$    logical or bitwise and of $x$ and $y$
$x \lor y$    logical or bitwise inclusive-or of $x$ and $y$
$x \oplus y$    logical or bitwise exclusive-or of $x$ and $y$
$\lnot x$ or $\bar{x}$    logical or bitwise not of $x$
$x^{-1}$    inverse of $x$
$\lceil x \rceil$    smallest integer $k$ such that $x \le k$
$\lfloor x \rfloor$    largest integer $k$ such that $k \le x$
$f\{S\}$    image of set $S$ under MF $f$
$f(x) = y$    applying MF $f$ to $x$ yields $y$ and only $y$
$f(x) \rightarrow y$    applying MF $f$ to $x$ may yield $y$
$f(x) \nrightarrow y$    applying MF $f$ to $x$ cannot yield $y$
$f(x) = \bot$    the result of applying MF $f$ to $x$ is undefined
$M^{T}$    transpose of matrix $M$
$|S|$    cardinality of set $S$
$|V|$    length of tuple or vector $V$
$|n|$    absolute value of number $n$
$(x_1, \ldots, x_k)$    $k$-tuple or $k$-vector with elements $x_1, \ldots, x_k$
$[m_1, \ldots, m_k]$    $k$-aggregation of MFs $m_1, \ldots, m_k$
$\langle m_1, \ldots, m_k \rangle$    $k$-conglomeration of MFs $m_1, \ldots, m_k$
$\{x_1, \ldots, x_k\}$    set of $x_1, \ldots, x_k$
$\{x \mid C\}$    set of $x$ such that $C$
$\{x \in S \mid C\}$    set of members $x$ of set $S$ such that $C$
$\Delta(x, y)$    Hamming distance (= number of changed element positions) from $x$ to $y$
$S_1 \times \cdots \times S_k$    Cartesian product of sets $S_1, \ldots, S_k$
$m_1 \circ \cdots \circ m_k$    composition of MFs $m_1, \ldots, m_k$
$x \in S$    $x$ is a member of set $S$
$S \subseteq T$    set $S$ is contained in or equal to set $T$
$S \subsetneq T$    set $S$ is properly contained in set $T$
Figure imgf000055_0001
$\mathrm{GF}(n)$    Galois field (= finite field) with $n$ elements
$\mathbb{Z}/(k)$    finite ring of the integers modulo $k$
$\mathrm{id}_S$    identity function on set $S$
rand(n)
extract[a, b](x)
extract[a, b](v)
interleave(u, v)
Figure imgf000055_0002
TABLE 1. Notations
Abbreviation    Expansion
AES    Advanced Encryption Standard
agg    aggregation
API    application procedural interface
BA    Boolean-arithmetic
BB    basic block
CFG    control-flow graph
DES    Data Encryption Standard
DG    directed graph
dll    dynamically linked library
GF    Galois field (= finite field)
IA    intervening aggregation
iff    if and only if
MBA    mixed Boolean-arithmetic
MDS    maximum distance separable
MF    multi-function
OE    output extension
PE    partial evaluation
PLPB    point-wise linear partitioned bijection
RSA    Rivest-Shamir-Adleman
RNS    residual number system
RPE    reverse partial evaluation
TR    tamper resistance
SB    substitution box
SBE    software-based entity
SO    shared object
VHDL    very high speed integrated circuit hardware description language
TABLE 2. Abbreviations
2. TERMINOLOGY AND NOTATION
We write ":-" to denote "such that" and we write "iff" to denote "if and only if". Table 1 summarizes many of the notations, and Table 2 summarizes many of the abbreviations, employed herein.
2.1. Sets, Tuples, Relations, and Functions. For a set $S$, we write $|S|$ to denote the cardinality of $S$ (i.e., the number of members in set $S$). We also use $|n|$ to denote the absolute value of a number $n$.
We write $\{m_1, m_2, \ldots, m_k\}$ to denote the set whose members are $m_1, m_2, \ldots, m_k$. (Hence if $m_1, m_2, \ldots, m_k$ are all distinct, $|\{m_1, m_2, \ldots, m_k\}| = k$.) We also write $\{x \mid C\}$ to denote the set of all entities of the form $x$ such that the condition $C$ holds, where $C$ is normally a condition depending on $x$.
2.1.1. Cartesian Products, Tuples, and Vectors. Where $A$ and $B$ are sets, $A \times B$ is the Cartesian product of $A$ and $B$; i.e., the set of all pairs $(a, b)$ where $a \in A$ (i.e., $a$ is a member of $A$) and $b \in B$ (i.e., $b$ is a member of $B$). Thus we have $(a, b) \in A \times B$. In general, for sets $S_1, S_2, \ldots, S_k$, a member of $S_1 \times S_2 \times \cdots \times S_k$ is a $k$-tuple of the form $(s_1, s_2, \ldots, s_k)$ where $s_i \in S_i$ for $i = 1, 2, \ldots, k$. If $t = (s_1, \ldots, s_k)$ is a tuple, we write $|t|$ to denote the length of $t$ (in this case, $|t| = k$; i.e., the tuple has $k$ element positions). For any $x$, we consider $x$ to be the same as $(x)$, a tuple of length one whose sole element is $x$. If all of the elements of a tuple belong to the same set, we call it a vector over that set.
If $u$ and $v$ are two tuples, then $u \parallel v$ is their concatenation: the tuple of length $|u| + |v|$ obtained by creating a tuple containing the elements of $u$ in order and then the elements of $v$ in order; e.g., $(a, b, c, d) \parallel (x, y, z) = (a, b, c, d, x, y, z)$.
We consider parentheses to be significant in Cartesian products: for sets $A$, $B$, $C$, members of $(A \times B) \times C$ look like $((a, b), c)$ whereas members of $A \times (B \times C)$ look like $(a, (b, c))$, where $a \in A$, $b \in B$, and $c \in C$. Similarly, members of $A \times (B \times B) \times C$ look like $(a, (b_1, b_2), c)$ where $a \in A$, $b_1, b_2 \in B$, and $c \in C$.
2.1.2. Relations, Multi-functions (MFs), and Functions. A $k$-ary relation on a Cartesian product $S_1 \times \cdots \times S_k$ of sets (where we must have $k \ge 2$) is any set $R \subseteq S_1 \times \cdots \times S_k$. Usually, we will be interested in binary relations; i.e., relations $R \subseteq A \times B$ for two sets $A$, $B$ (not necessarily distinct). For such a binary relation, we write $a \mathrel{R} b$ to indicate that $(a, b) \in R$. For example, where $\mathbb{R}$ is the set of real numbers, the binary relation $< \;\subseteq \mathbb{R} \times \mathbb{R}$ on pairs of real numbers is the set of all pairs of real numbers $(x, y)$ such that $x$ is smaller than $y$, and when we write $x < y$ it means that $(x, y) \in \;<$.
The notation $R :: A \mapsto B$ indicates that $R \subseteq A \times B$; i.e., that $R$ is a binary relation on $A \times B$. This notation is similar to that used for functions below. Its intent is to indicate that the binary relation is interpreted as a multi-function (MF), the relational abstraction of a computation, not necessarily deterministic, which takes an input from set $A$ and returns an output in set $B$. In the case of a function, this computation must be deterministic, whereas in the case of an MF, the computation need not be deterministic, and so it is a better mathematical model for much software in which external events may affect the progress of execution within a given process. $A$ is the domain of MF $R$, and $B$ is the codomain of MF $R$. For any set $X \subseteq A$, we define $R\{X\} = \{y \in B \mid \exists x \in X \text{ :- } (x, y) \in R\}$. $R\{X\}$ is the image of $X$ under $R$. For an MF $R :: A \mapsto B$ and $a \in A$, we write $R(a) = b$ to mean $R\{\{a\}\} = \{b\}$, we write $R(a) \rightarrow b$ to mean that $b \in R\{\{a\}\}$, we write $R(a) \nrightarrow b$ to mean that $b \notin R\{\{a\}\}$, and we write $R(a) = \bot$ (read "$R(a)$ is undefined") to mean that there is no $b \in B$ :- $(a, b) \in R$.
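These definitions map directly onto sets of pairs. A small sketch, assuming an MF is represented as a Python set of `(input, output)` tuples; the function names are illustrative, not notation from the original.

```python
def image(relation, xs):
    """R{X}: every y related to some x in X."""
    return {y for (x, y) in relation if x in xs}

def may_yield(relation, a, b):
    """R(a) -> b: b is among the possible outputs for a."""
    return b in image(relation, {a})

def yields_only(relation, a, b):
    """R(a) = b: b is the one and only possible output for a."""
    return image(relation, {a}) == {b}

def is_undefined(relation, a):
    """R(a) is undefined: no output is related to a."""
    return not image(relation, {a})

# A non-deterministic MF: input 1 may produce "a" or "b".
R = {(1, "a"), (1, "b"), (2, "c")}
print(image(R, {1}), may_yield(R, 1, "a"), yields_only(R, 2, "c"))
```

Here `R` is an MF but not a function, because input 1 has two possible outputs.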
For a binary relation $R :: A \mapsto B$, we define
$$R^{-1} = \{(b, a) \mid (a, b) \in R\}.$$
$R^{-1}$ is the inverse of $R$.
For binary relations $R :: A \mapsto B$ and $S :: B \mapsto C$, we define $S \circ R :: A \mapsto C$ by
$$S \circ R = \{(a, c) \mid \exists b \in B \text{ :- } a \mathrel{R} b \text{ and } b \mathrel{S} c\}.$$
$S \circ R$ is the composition of $S$ with $R$. Composition of binary relations is associative; i.e., for binary relations $Q$, $R$, $S$, $(S \circ R) \circ Q = S \circ (R \circ Q)$. Hence for binary relations $R_1, R_2, \ldots, R_k$ we may freely write $R_k \circ \cdots \circ R_2 \circ R_1$ without parentheses because the expression has the same meaning no matter where we put them. Note that
$$(R_k \circ \cdots \circ R_2 \circ R_1)\{X\} = R_k\{\cdots\{R_2\{R_1\{X\}\}\}\cdots\}$$
in which we first take the image of $X$ under $R_1$, and then that image's image under $R_2$, and so on up to the penultimate image's image under $R_k$, which is the reason that the $R_i$'s in the composition on the left are written in the reverse order of the imaging operations, just like the $R_i$'s in the imaging expression on the right.
Where $R_i :: A_i \mapsto B_i$ for $i = 1, \ldots, k$, $R = [R_1, \ldots, R_k]$ is that binary relation :-
$$R :: A_1 \times \cdots \times A_k \mapsto B_1 \times \cdots \times B_k$$
and
$$R(x_1, \ldots, x_k) \rightarrow (y_1, \ldots, y_k) \quad\text{iff}\quad R_i(x_i) \rightarrow y_i \text{ for } i = 1, \ldots, k.$$
$[R_1, \ldots, R_k]$ is the aggregation of $R_1, \ldots, R_k$.
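Where $R_i :: A_1 \times \cdots \times A_m \mapsto B_i$ for $i = 1, \ldots, n$, $R = \langle R_1, \ldots, R_n \rangle$ is that binary relation :-
$$R :: A_1 \times \cdots \times A_m \mapsto B_1 \times \cdots \times B_n$$
and
$$R(x_1, \ldots, x_m) \rightarrow (y_1, \ldots, y_n) \quad\text{iff}\quad R_i(x_1, \ldots, x_m) \rightarrow y_i \text{ for } i = 1, \ldots, n.$$
$\langle R_1, \ldots, R_n \rangle$ is the conglomeration of $R_1, \ldots, R_n$.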
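Inverse, composition, aggregation, and conglomeration can be sketched on the same set-of-pairs representation. This is an illustration under that assumed representation; the function names are hypothetical, and the conglomeration treats each shared input as one opaque value (which may itself be a tuple).

```python
from itertools import product

def inverse(r):
    """R^-1: swap every pair."""
    return {(b, a) for (a, b) in r}

def compose(s, r):
    """S o R: apply R first, then S."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def aggregate(*rels):
    """[R_1, ..., R_k]: each R_i acts on its own input component."""
    return {
        (tuple(x for x, _ in combo), tuple(y for _, y in combo))
        for combo in product(*rels)
    }

def conglomerate(*rels):
    """<R_1, ..., R_n>: every R_i acts on the same input."""
    inputs = {x for r in rels for (x, _) in r}
    result = set()
    for x in inputs:
        outputs = [[y for (x2, y) in r if x2 == x] for r in rels]
        for ys in product(*outputs):
            result.add((x, tuple(ys)))
    return result

Q = {(0, 1)}
R = {(1, 2), (1, 3)}
S = {(2, "p"), (3, "q")}
print(compose(S, R))  # the pairs (1, 'p') and (1, 'q')
```

Note the argument order of `compose(S, R)`: it mirrors the written order $S \circ R$, in which the right-hand relation is applied first.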
We write $f : A \mapsto B$ to indicate that $f$ is a function from $A$ to $B$; i.e., that $f :: A \mapsto B$ :- for any $a \in A$ and $b \in B$, if $f(a) \rightarrow b$, then $f(a) = b$. For any set $S$, $\mathrm{id}_S$ is the function for which $\mathrm{id}_S(x) = x$ for every $x \in S$.
2.1.3. Directed Graphs, Control-Flow Graphs, and Dominators. A directed graph (DG) is an ordered pair $G = (N, A)$ where set $N$ is the node-set and binary relation $A \subseteq N \times N$ is the arc-relation or edge-relation. $(x, y) \in A$ is an arc or edge of $G$.
A path in a DG $G = (N, A)$ is a sequence of nodes $(n_1, \ldots, n_k)$ where $n_i \in N$ for $i = 1, \ldots, k$ and $(n_i, n_{i+1}) \in A$ for $i = 1, \ldots, k-1$. $k - 1 \ge 0$ is the length of the path. The shortest possible path has the form $(n_1)$ with length zero. A path $(n_1, \ldots, n_k)$ is acyclic iff no node appears twice in it; i.e., iff there are no indices $i, j$ with $1 \le i < j \le k$ for which $n_i = n_j$. For a set $S$, we define $S^r = S \times \cdots \times S$ where $S$ appears $r$ times and $\times$ appears $r - 1$ times (so that $S^1 = S$), and we define
$$S^+ = S^1 \cup S^2 \cup S^3 \cup \cdots,$$
the infinite union of all Cartesian products for $S$ of all possible lengths. Then every path in $G$ is an element of $N^+$.
In a directed graph (DG) $G = (N, A)$, a node $y \in N$ is reachable from a node $x \in N$ if there is a path in $G$ which begins with $x$ and ends with $y$. (Hence every node is reachable from itself.) The reach of $x \in N$ is $\{y \in N \mid y \text{ is reachable from } x\}$. Two nodes $x$, $y$ are connected in $G$ iff one of the two following conditions holds recursively:
(1) there is a path of $G$ in which both $x$ and $y$ appear, or
(2) there is a node $z \in N$ in $G$ such that $x$ and $z$ are connected and $y$ and $z$ are connected.
(If $x = y$, then the singleton (i.e., length one) path $(x)$ is a path from $x$ to $y$, so every node $n \in N$ of $G$ is connected to itself.) A DG $G = (N, A)$ is a connected DG iff every pair of nodes $x, y \in N$ of $G$ is connected.
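Reachability and connectedness can be sketched as follows, assuming the arc-relation is given as a set of `(x, y)` pairs. The recursive definition of "connected" amounts to ignoring arc direction, which is how this sketch computes it; the function names are illustrative.

```python
def reach(arcs, start):
    """All nodes reachable from start by following arcs (start included)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for (x, y) in arcs:
            if x == node and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def connected(arcs, x, y):
    """x and y are connected iff they are linked when direction is ignored."""
    undirected = set(arcs) | {(b, a) for (a, b) in arcs}
    return y in reach(undirected, x)

arcs = {("a", "b"), ("b", "c"), ("d", "c")}
print(reach(arcs, "a"))           # the nodes a, b and c
print(connected(arcs, "a", "d"))  # True, via the shared node c
```

Node `d` is not reachable from `a`, yet the two are connected, because both lie on paths through `c`.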
For every node $x \in N$, $|\{y \mid (x, y) \in A\}|$, the number of arcs in $A$ which start at $x$ and end at some other node, is the out-degree of node $x$, and for every node $y \in N$, $|\{x \mid (x, y) \in A\}|$, the number of arcs in $A$ which start at some node and end at $y$, is the in-degree of node $y$. The degree of a node $n \in N$ is the sum of $n$'s in- and out-degrees.
A source node in a DG $G = (N, A)$ is a node whose in-degree is zero, and a sink node in a DG $G = (N, A)$ is a node whose out-degree is zero.
A DG $G = (N, A)$ is a control-flow graph (CFG) iff it has a distinguished source node $n_0 \in N$ from which every node $n \in N$ is reachable.
Let $G = (N, A)$ be a CFG with source node $n_0$. A node $x \in N$ dominates a node $y \in N$ iff every path beginning with $n_0$ and ending with $y$ contains $x$. (Note that, by this definition and the remarks above, every node dominates itself.) A set of nodes $X$ dominates a set of nodes $Y$ in a CFG iff every path beginning with a start node and ending with an element of $Y$ contains an element of $X$.
With $G = (N, A)$ and $n_0$ as above, a nonempty node set $X \subseteq N$ dominates a nonempty node set $Y \subseteq N$ iff every path starting with $n_0$ and ending with an element of $Y$ contains an element of $X$. (Note that the case of a single node dominating another single node is the special case of this definition where $|X| = |Y| = 1$.)
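The path-based definition of dominance can be checked directly: $X$ dominates $Y$ exactly when no element of $Y$ can be reached from $n_0$ once the nodes of $X$ are removed. The sketch below uses that observation rather than a production dominator-tree algorithm; the names and the sample diamond-shaped CFG are illustrative.

```python
def reachable(arcs, start, removed=frozenset()):
    """Nodes reachable from start along paths that avoid the removed nodes."""
    if start in removed:
        return set()
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for (x, y) in arcs:
            if x == node and y not in removed and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def set_dominates(arcs, n0, xs, ys):
    """X dominates Y iff every path from n0 to a node of Y meets X."""
    survivors = reachable(arcs, n0, removed=frozenset(xs))
    return not (survivors & set(ys))

def dominates(arcs, n0, x, y):
    """Single-node case: |X| = |Y| = 1."""
    return set_dominates(arcs, n0, {x}, {y})

# Diamond-shaped CFG: entry branches to a or b, both rejoin at exit.
cfg = {("entry", "a"), ("entry", "b"), ("a", "exit"), ("b", "exit")}
print(dominates(cfg, "entry", "entry", "exit"))         # True
print(dominates(cfg, "entry", "a", "exit"))             # False
print(set_dominates(cfg, "entry", {"a", "b"}, {"exit"}))  # True
```

Neither branch alone dominates `exit`, but the set of both branches does, which is the situation the set-based definition is meant to capture.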
2.2. Algebraic Structures. $\mathbb{Z}$ denotes the set of all integers and $\mathbb{N}$ denotes the set of all integers greater than zero (the natural numbers). $\mathbb{Z}/(m)$ denotes the ring of the integers modulo $m$, for some integer $m > 0$. Whenever $m$ is a prime number, $\mathbb{Z}/(m) = \mathrm{GF}(m)$, the Galois field of the integers modulo $m$. $\mathbb{B}$ denotes the set $\{0, 1\}$ of bits, which may be identified with the two elements of the ring $\mathbb{Z}/(2) = \mathrm{GF}(2)$.
2.2.1. Identities. Identities (i.e., equations) play a crucial role in obfuscation: if for two expressions $X$, $Y$, we know that $X = Y$, then we can substitute the value of $Y$ for the value of $X$, and we can substitute the computation of $Y$ for the computation of $X$, and vice versa.
That such substitutions based on algebraic identities are crucial to obfuscation is easily seen by the fact that their use is found to varying extents in every one of [5, 7, 8, 10, 11, 12, 21, 22, 23, 24, 28, 29, 30].
Sometimes we wish to identify (equate) Boolean expressions, which may themselves involve equations. For example, in typical computer arithmetic,
$$x = 0 \quad\text{iff}\quad (-(x \lor (-x)) - 1) < 0$$
(using signed comparison). Thus "iff" equates conditions, and so expressions containing "iff" are also identities: specifically, condition identities or Boolean identities.
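The zero-test identity can be checked numerically. Because Python integers are unbounded, the sketch below emulates 32-bit two's complement arithmetic with a mask; the 32-bit width and the helper names are assumptions for illustration.

```python
MASK = 0xFFFFFFFF

def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement number."""
    value &= MASK
    return value - 0x100000000 if value & 0x80000000 else value

def is_zero_via_identity(x):
    """Evaluate (-(x | -x) - 1) < 0 with 32-bit wraparound, signed compare."""
    x &= MASK
    neg_x = (-x) & MASK
    result = (-(x | neg_x) - 1) & MASK
    return to_signed32(result) < 0

samples = [0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 12345]
print([is_zero_via_identity(x) for x in samples])  # True only for 0
```

For any nonzero x, either x or −x has its sign bit set, so x ∨ (−x) is negative and the left side becomes nonnegative; for x = 0 the left side is −1.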
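2.2.2. Matrices. We denote an $r \times c$ ($r$ rows, $c$ columns) matrix $M$ by
$$M = \begin{bmatrix} m_{1,1} & m_{1,2} & \cdots & m_{1,c} \\ m_{2,1} & m_{2,2} & \cdots & m_{2,c} \\ \vdots & \vdots & \ddots & \vdots \\ m_{r,1} & m_{r,2} & \cdots & m_{r,c} \end{bmatrix},$$
where its transpose is denoted by $M^{T}$ where
$$M^{T} = \begin{bmatrix} m_{1,1} & m_{2,1} & \cdots & m_{r,1} \\ m_{1,2} & m_{2,2} & \cdots & m_{r,2} \\ \vdots & \vdots & \ddots & \vdots \\ m_{1,c} & m_{2,c} & \cdots & m_{r,c} \end{bmatrix},$$
so that, for example,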
Figure imgf000059_0001
2.2.3. Relationship of $\mathbb{Z}/(2^n)$ to Computer Arithmetic. On $\mathbb{B}^n$, the set of all length-$n$ bit-vectors, define addition ($+$) and multiplication ($\cdot$) as usual for computers with 2's complement fixed point arithmetic (see [25]). Then $(\mathbb{B}^n, +, \cdot)$ is the finite two's complement ring of order $2^n$. The modular integer ring $\mathbb{Z}/(2^n)$ is isomorphic to $(\mathbb{B}^n, +, \cdot)$, which is the basis of typical computer fixed-point computations (addition, subtraction, multiplication, division, and remainder) on computers with an $n$-bit word length.
(For convenience, we may write $x \cdot y$ ($x$ multiplied by $y$) by $xy$; i.e., we may represent multiplication by juxtaposition, a common convention in algebra.)
In view of this isomorphism, we use these two rings interchangeably, even though we can view $(\mathbb{B}^n, +, \cdot)$ as containing signed numbers in the range $-2^{n-1}$ to $2^{n-1} - 1$ inclusive. The reason that we can get away with ignoring the issue of whether the elements of $(\mathbb{B}^n, +, \cdot)$ occupy the signed range above or the range of magnitudes from $0$ to $2^n - 1$ inclusive, is that the effect of the arithmetic operations "$+$" and "$\cdot$" on bit-vectors in $\mathbb{B}^n$ is identical whether we interpret the numbers as two's complement signed numbers or binary magnitude unsigned numbers.
The issue of whether we interpret the numbers as signed arises only for the inequality operators $<$, $>$, $\le$, $\ge$, which means that we should decide in advance how particular numbers are to be treated: inconsistent interpretations will produce anomalous results, just as incorrect use of signed and unsigned comparison instructions by a C or C++ compiler will produce anomalous code.
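A small exhaustive check illustrates both points: addition and multiplication give the same bit pattern under either interpretation, while comparisons do not. The 8-bit word size is chosen only to keep the enumeration small; the helper names are illustrative.

```python
BITS = 8
MASK = (1 << BITS) - 1

def as_signed(bits):
    """Read an 8-bit pattern as a two's complement signed number."""
    return bits - (1 << BITS) if bits & (1 << (BITS - 1)) else bits

def add(u, v):
    return (u + v) & MASK

def mul(u, v):
    return (u * v) & MASK

# Same resulting bit pattern whether operands are read as signed or unsigned.
same_bits = all(
    add(u, v) == (as_signed(u) + as_signed(v)) & MASK
    and mul(u, v) == (as_signed(u) * as_signed(v)) & MASK
    for u in range(1 << BITS)
    for v in range(1 << BITS)
)
print(same_bits)  # True

# Comparison depends on the interpretation: 0xFF is 255 unsigned, -1 signed.
print(0xFF > 0x01, as_signed(0xFF) > as_signed(0x01))  # True False
```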
2.2.4. Bitwise Computer Instructions and $(B^n, \vee, \wedge, \neg)$. On $B^n$, the set of all length-$n$ bit-vectors, a computer with $n$-bit words typically provides bitwise and ($\wedge$), inclusive or ($\vee$) and not ($\neg$). Then $(B^n, \vee, \wedge, \neg)$ is a Boolean algebra. In $(B, \vee, \wedge, \neg)$, in which the vector-length is one, 0 is false and 1 is true.
For any two vectors $u, v \in B^n$, we define the bitwise exclusive or ($\oplus$) of $u$ and $v$ by $u \oplus v = (u \wedge (\neg v)) \vee ((\neg u) \wedge v)$. For convenience, we typically represent $\neg x$ by $\bar{x}$. For example, we can also express this identity as $u \oplus v = (u \wedge \bar{v}) \vee (\bar{u} \wedge v)$.
Since vector multiplication (bitwise and, $\wedge$) in a Boolean algebra is associative, $(B^n, \oplus, \wedge)$ is a ring (called a Boolean ring).
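As a sanity check of the exclusive-or identity and of one ring law, the sketch below enumerates all 4-bit vectors. The width 4 and the function names are assumptions made for the illustration only.

```python
from itertools import product

N = 4                        # illustrative vector length (an assumption)
MASK = (1 << N) - 1

def bnot(x):
    """Bitwise not restricted to N bits."""
    return ~x & MASK

def xor_via_and_or(u, v):
    """u xor v written as (u and not v) or (not u and v)."""
    return (u & bnot(v)) | (bnot(u) & v)

vectors = range(1 << N)

# The defining identity agrees with the built-in xor on every pair.
identity_holds = all(xor_via_and_or(u, v) == u ^ v
                     for u, v in product(vectors, repeat=2))

# In the Boolean ring, 'and' distributes over 'xor' (multiplication over addition).
distributive = all(u & (v ^ t) == (u & v) ^ (u & t)
                   for u, v, t in product(vectors, repeat=3))
```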
2.2.5. T-Functions and Non-T-Functions. A function $f: (B^w)^k \to (B^w)^m$ mapping from a $k$-vector of $w$-bit words to an $m$-vector of $w$-bit words is a T-function if for every pair of vectors $x \in (B^w)^k$, $y \in (B^w)^m$ :- $y = f(x)$, with $x' \ne x$ and $y' = f(x')$, and with bits numbered from $0$ to $w - 1$ in the $w$-bit words, the lowest numbered bit in an element word at which $y$ and $y'$ differ is not lower than the lowest numbered bit in an element word at which $x$ and $x'$ differ. Typically we consider this numbering within words to be from low-order ($2^0$) to high-order ($2^{w-1}$) bits, regarding words as representing binary magnitudes, so we can restate this as: an output bit can only depend on input bits of the same or lower order.
Functions composable from $\wedge, \vee, \oplus, \neg$ computed over $B^w$ together with $+, -, \times$ over $\mathbb{Z}/(2^w)$, so that all operations operate on $w$-bit words, are T-functions. Obscure constructions with the T-function property are vulnerable to bit-slice attacks, since we can obtain from any T-function another legitimate T-function by dropping high-order bits from all words in input and output vectors.
The T-function property does not hold for right bit-shifts, bitwise rotations, division operations, or remainder/modulus operations based on a divisor/modulus which is not a power of two, nor does it hold for functions in which conditional branches make decisions in which higher-order condition bits affect the value of lower-order output bits.
For conditional branches and comparison-based conditional execution, note that conditional execution on the basis of conditions formed using any one of the six standard comparisons $=, \ne, <, >, \le, \ge$ can easily violate the T-function condition, and indeed, in normal code using comparison-based branching logic, it is easier to violate the T-function condition than it is to conform to it.
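The T-function condition can be tested exhaustively on small words. The sketch below is an illustration under stated assumptions: it handles only one-word functions ($k = m = 1$) on 4-bit words, and the sample functions `mixed`, `shift_right`, and `rotate_left` are hypothetical examples, not functions taken from the text.

```python
W = 4                        # illustrative word width (an assumption)
MASK = (1 << W) - 1

def lowest_diff_bit(a, b):
    """Index of the lowest bit at which a and b differ, or None if a == b."""
    d = a ^ b
    return None if d == 0 else (d & -d).bit_length() - 1

def is_t_function(f):
    """Brute-force check of the T-function condition for a one-word function."""
    for x in range(1 << W):
        for x2 in range(1 << W):
            if x == x2:
                continue
            out_bit = lowest_diff_bit(f(x), f(x2))
            # Outputs may differ only at or above the lowest differing input bit.
            if out_bit is not None and out_bit < lowest_diff_bit(x, x2):
                return False
    return True

def mixed(x):
    """Built only from +, * and bitwise or on W-bit words."""
    return (x + ((x * x) | 5)) & MASK

def shift_right(x):
    """Right shift moves high-order input bits into lower-order output bits."""
    return x >> 1

def rotate_left(x):
    """Rotation feeds the top bit back into bit 0."""
    return ((x << 1) | (x >> (W - 1))) & MASK

results = {f.__name__: is_t_function(f) for f in (mixed, shift_right, rotate_left)}
```

Under these assumptions `mixed` passes the check, while the shift and the rotation fail it, in line with the exclusions listed above.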
2.2.6. Polynomials. A polynomial is an expression of the form $f(x) = \sum_{i=0}^{d} a_i x^i = a_d x^d + \cdots + a_2 x^2 + a_1 x + a_0$ (where $x^0 = 1$ for any $x$). If $a_d \ne 0$, then $d$ is the degree of the polynomial. Polynomials can be added, subtracted, multiplied, and divided, and the results of such operations are themselves polynomials. If $d = 0$, the polynomial is constant; i.e., it consists simply of the scalar constant $a_0$. If $d > 0$, the polynomial is non-constant. We can have polynomials over finite and infinite rings and fields.
A non-constant polynomial is irreducible if it cannot be written as the product of two or more non-constant polynomials. Irreducible polynomials play a role for polynomials similar to that played by primes for the integers.
The variable $x$ has no special significance: as regards a particular polynomial, it is just a place-holder. Of course, we may substitute a value for $x$ to evaluate the polynomial; that is, variable $x$ is only significant when we substitute something for it.
We may identify a polynomial with its coefficient $(d + 1)$-vector $(a_d, \ldots, a_1, a_0)$.
Polynomials over $GF(2) = \mathbb{Z}/(2)$ have special significance in cryptography, since the $(d + 1)$-vector of coefficients is simply a bit-string and can efficiently be represented on a computer (e.g., polynomials of degrees up to 7 can be represented as 8-bit bytes); addition and subtraction are identical; and the sum of two such polynomials in bit-string representation is computed using bitwise $\oplus$ (exclusive or).
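To make the bit-string representation concrete, here is a small sketch in which bit $i$ of an integer holds the coefficient of $x^i$. The function name `gf2_add` and the two sample polynomials are assumptions chosen for the illustration.

```python
def gf2_add(p, q):
    """Sum of two polynomials over GF(2); bit i holds the coefficient of x^i."""
    return p ^ q

p = 0b1011   # x^3 + x + 1
q = 0b0110   # x^2 + x

total = gf2_add(p, q)               # the x terms cancel: x^3 + x^2 + 1
difference = gf2_add(total, q)      # subtracting q is the same xor, giving p back
doubled = gf2_add(p, p)             # every polynomial is its own additive inverse
```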
2.3. Encodings. We formally introduce encodings here.
Let $F:: D \mapsto R$ be total. Choose a bijection $d: D \to D'$ and a bijection $r: R \to R'$. We call $F' = r \circ F \circ d^{-1}$ an encoded version of $F$. $d$ is an input encoding or a domain encoding and $r$ is an output encoding or a range encoding. A bijection such as $d$ or $r$ is simply called an encoding. In the particular case where $F$ is a function, the diagram shown in Fig. 1 then commutes, and computation with $F'$ is simply computation with an encrypted function [28, 29]. As shown in Fig. 1, only $D'$, $F'$ (the encrypted function) and $R'$ are visible to the attacker. None of the original information is visible to the attacker ($D$, $F$, $R$), nor is the information used to perform the encoding.
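Let $f_i :: S_i^{m} \mapsto {S_i'}^{n}$ where $m, n \in \mathbb{N}$, for $i = 1, 2, \ldots, k$. Then the relation aggregation $[f_1, f_2, \ldots, f_k]$ is that relation $B$ :- $\forall x_i \in S_i^{m}$, $\forall x_i' \in {S_i'}^{n}$, we have $(x_i, x_i') \in f_i$ for $i = 1, \ldots, k$ iff $((x_1, \ldots, x_k), (x_1', \ldots, x_k')) \in [f_1, \ldots, f_k]$.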
2.3.1. Network Encoded Computations. Generally, the output of a transformation will become the input to another subsequent transformation, which means the output encoding of the first must match the input encoding of the second as follows.
A networked encoding for computing $Y \circ X$ (i.e., transformation $X$ followed by transformation $Y$) is an encoding of the form $Y' \circ X' = (H \circ Y \circ G^{-1}) \circ (G \circ X \circ F^{-1}) = H \circ (Y \circ X) \circ F^{-1}$.
In the general case, we have encoded networks, which are data-flow networks in which the node functions are encoded functions.
Encodings may be derived from algebraic structures (see §2.2). For example, finite ring encoding (FR) is based on the fact that affine functions $x' = e_x(x) = sx + b$ over $\mathbb{Z}/(2^w)$, where $w$ is the word width, which can be implemented by ignoring overflow so that the modulus is the natural machine integer modulus, are lossless whenever $s$ is odd.
We note from Fig. 1 that the key to encoded computation is that inputs, outputs, and computation are all encoded. For example, consider elements of $\mathbb{Z}/(2^w)$, the ring of the integers modulo $2^w$, where $w$ is the preferred word-width for some computer (typically 8, 16, 32, or 64, with a trend over time towards the higher widths). The units of $\mathbb{Z}/(2^w)$ (i.e., those with a multiplicative inverse) are the odd elements $1, 3, 5, \ldots, 2^w - 1$.
Suppose we want to encode additions, subtractions, and multiplications on a binary computer with word-width $w$ so that the unencoded computations are performed over $\mathbb{Z}/(2^w)$. We could use affine encodings over $\mathbb{Z}/(2^w)$. With unencoded variables $x, y, z$ and corresponding encoded variables $x', y', z'$ where
$x' = e_x(x) = s_x x + b_x$, $y' = e_y(y) = s_y y + b_y$, and $z' = e_z(z) = s_z z + b_z$,
we want to determine how to compute
$z' = x' +' y'$ and $z' = x' \times' y'$;
i.e., we need representations for $+'$, $\times'$. (Over a network of such operations, we would have many different encodings $+'$, $\times'$, with the requirement being that the result of an operation employs the same encoding as the corresponding input encoding of the consuming operation.)
(Footnote: The $[f_1, \ldots, f_k]$ notation was introduced for function aggregation by John Backus in his ACM Turing Award lecture. I have taken it to apply to binary relations in general.)
Here juxtaposition represents $\times$ over $\mathbb{Z}/(2^w)$; $-x$ in 2's complement is $-x$ in $\mathbb{Z}/(2^w)$, and $xy$ in 2's complement is $xy$ in $\mathbb{Z}/(2^w)$. Thus if $e_v$ is the encoding of $v$ and $e_v^{-1}$ its inverse, $v' = e_v(v) = s_v v + b_v$ and $e_v^{-1}(v') = (v' - b_v) s_v^{-1} = s_v^{-1} v' + (-b_v s_v^{-1})$ (another affine encoding over $\mathbb{Z}/(2^w)$). Then
$z' = x' +' y' = e_z(e_x^{-1}(x') + e_y^{-1}(y')) = (s_x^{-1} s_z) x' + (s_y^{-1} s_z) y' + (b_z - s_x^{-1} s_z b_x - s_y^{-1} s_z b_y)$
which has the general form $z' = c_1 x' + c_2 y' + c_3$ with constants $c_1, c_2, c_3$: the original data and encoding coefficients have vanished. If $y$ is a positive or negative constant $k$, we may choose $e_y = \mathrm{id}$ (i.e., $s_y = 1$ and $b_y = 0$), and the above reduces to
$z' = x' +' k = e_z(e_x^{-1}(x') + k) = (s_x^{-1} s_z) x' + (b_z - s_x^{-1} s_z b_x + s_z k)$
which has the general form $z' = c_1 x' + c_2$ for constants $c_1, c_2$. Alternatively, we can compute $z' = x' +' k$ as $z' = x'$ where we define $s_z = s_x$ and $b_z = b_x - s_x k$, so that we can compute $z' = x' +' k$ with no computation at all. To make $z' = -'x'$ without computation, we simply define $s_z = -s_x$ and $b_z = b_x$ and set $z' = x'$ (since then $e_z(-x) = s_x x + b_x = x'$). Similarly, for subtraction
$z' = x' -' y' = e_z(e_x^{-1}(x') - e_y^{-1}(y')) = (s_x^{-1} s_z) x' + (-s_y^{-1} s_z) y' + (b_z - s_x^{-1} s_z b_x + s_y^{-1} s_z b_y)$
which again has the general form $z' = c_1 x' + c_2 y' + c_3$ with constants $c_1, c_2, c_3$: the original data and encoding coefficients have vanished. If $y'$ is a constant $c$, we proceed as above for addition, setting $k = -c$. To subtract $z' = k -' x'$, we can compute it by negating $x'$ without computation, and then adding $k$ as described above. For multiplication,
$z' = x' \times' y' = e_z(e_x^{-1}(x')\, e_y^{-1}(y')) = (s_x^{-1} s_y^{-1} s_z) x' y' + (-s_x^{-1} s_y^{-1} s_z b_y) x' + (-s_x^{-1} s_y^{-1} s_z b_x) y' + (b_z + s_x^{-1} s_y^{-1} s_z b_x b_y)$
which has the general form $z' = c_1 x' y' + c_2 x' + c_3 y' + c_4$ for constants $c_1, c_2, c_3, c_4$. Finally, if $x$ is a constant $k$, we may choose $e_x = \mathrm{id}$ (i.e., we may choose $s_x = 1$ and $b_x = 0$), in which case the above multiplication formula reduces to
$z' = k \times' y' = e_z(k\, e_y^{-1}(y')) = (s_y^{-1} s_z k) y' + (b_z - s_y^{-1} s_z k b_y)$
which has the general form $z' = c_1 y' + c_2$ for constants $c_1, c_2$. Alternatively, if $k$ is invertible (i.e., odd) in $\mathbb{Z}/(2^w)$, we can compute $z' = k \times' y'$ as $z' = y'$ by defining $s_z = k^{-1} s_y$ and $b_z = b_y$, which has the standard affine form for FR encoding, and allows us to take $y'$, but with encoding $e_z$ rather than its own encoding $e_y$, to be $z'$, so that we can compute $z' = k \times' y'$ with no computation at all.
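The encoded addition and multiplication can be checked numerically. The following is a minimal sketch, assuming a 32-bit word and arbitrarily chosen odd scales and offsets; the function names and all constants are hypothetical. It precomputes the constants $c_1, c_2, c_3$ (and $c_4$ for multiplication) from the three encodings, applies them to encoded operands only, and then decodes the result purely for verification. It relies on Python 3.8 or later for modular inverses via `pow(s, -1, MOD)`.

```python
W = 32                               # illustrative word width (an assumption)
MOD = 1 << W

def make_encoding(s, b):
    """Affine encoding e(v) = s*v + b over Z/(2^W); s must be odd to be invertible."""
    assert s % 2 == 1
    return (s % MOD, b % MOD)

def encode(enc, v):
    s, b = enc
    return (s * v + b) % MOD

def decode(enc, v_enc):
    s, b = enc
    return ((v_enc - b) * pow(s, -1, MOD)) % MOD

def add_coeffs(ex, ey, ez):
    """Constants (c1, c2, c3) with z' = c1*x' + c2*y' + c3 when z = x + y."""
    (sx, bx), (sy, by), (sz, bz) = ex, ey, ez
    c1 = sz * pow(sx, -1, MOD) % MOD
    c2 = sz * pow(sy, -1, MOD) % MOD
    c3 = (bz - c1 * bx - c2 * by) % MOD
    return c1, c2, c3

def mul_coeffs(ex, ey, ez):
    """Constants (c1, c2, c3, c4) with z' = c1*x'*y' + c2*x' + c3*y' + c4 when z = x*y."""
    (sx, bx), (sy, by), (sz, bz) = ex, ey, ez
    c1 = sz * pow(sx, -1, MOD) * pow(sy, -1, MOD) % MOD
    c2 = -c1 * by % MOD
    c3 = -c1 * bx % MOD
    c4 = (bz + c1 * bx * by) % MOD
    return c1, c2, c3, c4

# Arbitrary odd scales and offsets, chosen only for the illustration.
ex = make_encoding(0x9E3779B1, 12345)
ey = make_encoding(0x85EBCA6B, 67890)
ez = make_encoding(0xC2B2AE35, 424242)

x, y = 1000003, 0xDEADBEEF
xe, ye = encode(ex, x), encode(ey, y)

a1, a2, a3 = add_coeffs(ex, ey, ez)
z_add = (a1 * xe + a2 * ye + a3) % MOD          # works on encoded values only

m1, m2, m3, m4 = mul_coeffs(ex, ey, ez)
z_mul = (m1 * xe * ye + m2 * xe + m3 * ye + m4) % MOD
```

Decoding `z_add` and `z_mul` with `ez` recovers $x + y$ and $xy$ modulo $2^{32}$, although neither the plain operands nor the encoding coefficients appear in the two encoded computations.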
Polynomials of higher order may also be used: in general [27], for $1 < w \in \mathbb{N}$, over $\mathbb{Z}/(2^w)$,
$P(x) = a_0 \circ_1 a_1 x \circ_2 a_2 x^2 \circ_3 \cdots \circ_d a_d x^d$
is a permutation polynomial (i.e., bijective or lossless polynomial) iff
(1) $\circ_i$ is $+$ (modulo $2^w$) for $i = 1, \ldots, d$,
(2) $a_1$ is odd,
(3) $a_2 + a_4 + \cdots$ is even, and
(4) $a_3 + a_5 + a_7 + \cdots$ is even;
a characterization due to Rivest. (Only a subset of the bijections on $\mathbb{Z}[0..2^w - 1]$ can be written as such a permutation polynomial over $\mathbb{Z}/(2^w)$.) The higher the degree, the more entropy contained in a choice of polynomial, but time and space complexity rise correspondingly.
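Rivest's parity conditions can be compared against a brute-force bijectivity test on a small ring. The sketch below assumes ordinary addition between terms (condition (1)), word width $w = 3$, and degree at most 3; the function names are hypothetical.

```python
from itertools import product

def rivest_is_permutation(coeffs):
    """Parity test on coeffs = [a0, a1, ..., ad], intended for w > 1."""
    a = list(coeffs) + [0, 0]            # pad so a[1] and the slices always exist
    return (a[1] % 2 == 1                # a1 is odd
            and sum(a[2::2]) % 2 == 0    # a2 + a4 + ... is even
            and sum(a[3::2]) % 2 == 0)   # a3 + a5 + ... is even

def brute_force_is_permutation(coeffs, w):
    """Evaluate the polynomial on all of Z/(2^w) and test whether it is a bijection."""
    mod = 1 << w
    images = {sum(c * pow(x, i, mod) for i, c in enumerate(coeffs)) % mod
              for x in range(mod)}
    return len(images) == mod

W = 3
# Compare both tests on every polynomial of degree <= 3 with coefficients in Z/(2^3).
agree = all(rivest_is_permutation(c) == brute_force_is_permutation(c, W)
            for c in product(range(1 << W), repeat=4))
```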
Permutation polynomials exist over many rings and fields. Klimov [17] extended this characterization to what he called generalized permutation polynomials, which are like those described above except that any given $\circ_i$ may be $+$ or $-$ (modulo $2^w$) or bitwise exclusive-or ($\oplus$) on $B^w$ and the $\circ$ operations can be applied in any fixed order.
While we can write polynomials of arbitrary degree, every polynomial over $\mathbb{Z}/(2^w)$ is equivalent to a polynomial of very limited degree. In fact, it has been known that, for $w \in \mathbb{N}$, every permutation polynomial $P$ over $\mathbb{Z}/(2^w)$ has an equivalent permutation polynomial $Q$ over $\mathbb{Z}/(2^w)$ of degree $\le w + 1$.
A difficulty with permutation polynomials, whether generalized or not, is that they only become truly useful when their inverses are known and computationally convenient. It is known that most permutation polynomials have inverses of high degree (close to $2^w$ for permutation polynomials over $\mathbb{Z}/(2^w)$). However, using Rivest's (non-generalized) characterization above, if $a_i^2 = 0$ for $i = 2, \ldots, d$, then the degree of the inverse is the same as the degree of the polynomial to be inverted. Formulas for inverses for permutation polynomials are as follows (see section C for a more rigorous treatment):
2.3.2. Quadratic Polynomials and Inverses. If $P(x) = ax^2 + bx + c$ where $a^2 = 0$ and $b$ is odd, then $P$ is invertible, and
$P^{-1}(x) = dx^2 + ex + f$,
where the constant coefficients are defined by
$d = -\dfrac{a}{b^3}$, $e = 2\dfrac{ac}{b^3} + \dfrac{1}{b}$, and $f = -\dfrac{c}{b} - \dfrac{ac^2}{b^3}$.
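The quadratic inverse can be verified numerically. The sketch below assumes $w = 32$ and takes the coefficient formulas $d = -a/b^3$, $e = 2ac/b^3 + 1/b$, $f = -c/b - ac^2/b^3$, with division meaning multiplication by a modular inverse; these follow from composing $P^{-1} \circ P$ under the condition $a^2 = 0$. The sample coefficients and function names are hypothetical, and Python 3.8 or later is assumed for `pow(v, -1, MOD)`.

```python
W = 32                               # illustrative word width (an assumption)
MOD = 1 << W

def inv(v):
    """Multiplicative inverse of an odd v in Z/(2^W)."""
    return pow(v, -1, MOD)

def quadratic(a, b, c):
    """Return the function x -> a*x^2 + b*x + c over Z/(2^W)."""
    return lambda x: (a * x * x + b * x + c) % MOD

def quadratic_inverse(a, b, c):
    """Inverse of a*x^2 + b*x + c, assuming a*a == 0 (mod 2^W) and b odd."""
    assert (a * a) % MOD == 0 and b % 2 == 1
    b_inv = inv(b)
    b3_inv = b_inv ** 3 % MOD
    d = -a * b3_inv % MOD
    e = (2 * a * c * b3_inv + b_inv) % MOD
    f = (-c * b_inv - a * c * c * b3_inv) % MOD
    return quadratic(d, e, f)

a = 5 << 16          # a*a = 25 * 2^32, which is 0 modulo 2^32
b = 0x2545F491       # odd
c = 0x01234567

P = quadratic(a, b, c)
P_inv = quadratic_inverse(a, b, c)

samples = [0, 1, 2, 12345, 0xDEADBEEF, MOD - 1]
round_trip = all(P_inv(P(x)) == x and P(P_inv(x)) == x for x in samples)
```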
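2.3.3. Cubic Polynomials and Inverses. If $P(x) = ax^3 + bx^2 + cx + d$ where $a^2 = b^2 = 0$ and $c$ is odd, then $P$ is invertible, and
$P^{-1}(x) = ex^3 + fx^2 + gx + h$,
where the constant coefficients are defined by
$e = -\dfrac{a}{c^4}$, $f = 3\dfrac{ad}{c^4} - \dfrac{b}{c^3}$, $g = \dfrac{1}{c} - 3\dfrac{ad^2}{c^4} + 2\dfrac{bd}{c^3}$, and $h = \dfrac{ad^3}{c^4} - \dfrac{bd^2}{c^3} - \dfrac{d}{c}$.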
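2.3.4. Quartic Polynomials and Inverses. If $P(x) = ax^4 + bx^3 + cx^2 + dx + e$ where $a^2 = b^2 = c^2 = 0$ and $d$ is odd, then $P$ is invertible, and
$P^{-1}(x) = fx^4 + gx^3 + hx^2 + ix + j$,
where the constant coefficients are defined by
$f = -\dfrac{a}{d^5}$,
$g = 4\dfrac{ae}{d^5} - \dfrac{b}{d^4}$,
$h = -6\dfrac{ae^2}{d^5} + 3\dfrac{be}{d^4} - \dfrac{c}{d^3}$,
$i = 4\dfrac{ae^3}{d^5} - 3\dfrac{be^2}{d^4} + 2\dfrac{ce}{d^3} + \dfrac{1}{d}$, and
$j = -\dfrac{ae^4}{d^5} + \dfrac{be^3}{d^4} - \dfrac{ce^2}{d^3} - \dfrac{e}{d}$.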
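A numerical check of a quartic inverse is sketched below. It is an illustration under stated assumptions rather than a transcription of the original figures: with $u = (y - e)/d$ and $a^2 = b^2 = c^2 = 0$ over $\mathbb{Z}/(2^{32})$, the inverse is $x = u - (au^4 + bu^3 + cu^2)/d$, and expanding in powers of $y$ gives the five coefficients used in the code. The sample coefficients and helper names are hypothetical; Python 3.8 or later is assumed for `pow(v, -1, MOD)`.

```python
W = 32                               # illustrative word width (an assumption)
MOD = 1 << W

def evaluate(coeffs, x):
    """Horner evaluation over Z/(2^W); coeffs are ordered from highest degree down."""
    acc = 0
    for k in coeffs:
        acc = (acc * x + k) % MOD
    return acc

def quartic_inverse_coeffs(a, b, c, d, e):
    """Coefficients (f, g, h, i, j) of the inverse of a*x^4 + b*x^3 + c*x^2 + d*x + e.

    Assumes a*a == b*b == c*c == 0 (mod 2^W) and d odd.
    """
    assert all((v * v) % MOD == 0 for v in (a, b, c)) and d % 2 == 1
    d1 = pow(d, -1, MOD)
    d3, d4, d5 = d1 ** 3 % MOD, d1 ** 4 % MOD, d1 ** 5 % MOD
    f = -a * d5
    g = 4 * a * e * d5 - b * d4
    h = -6 * a * e ** 2 * d5 + 3 * b * e * d4 - c * d3
    i = 4 * a * e ** 3 * d5 - 3 * b * e ** 2 * d4 + 2 * c * e * d3 + d1
    j = -a * e ** 4 * d5 + b * e ** 3 * d4 - c * e ** 2 * d3 - e * d1
    return tuple(v % MOD for v in (f, g, h, i, j))

# Multiples of 2^16 square to zero modulo 2^32; d is odd.
P = (3 << 16, 7 << 16, 11 << 16, 0x6C8E9CF5, 0x0BADF00D)
P_inv = quartic_inverse_coeffs(*P)

samples = [0, 1, 2, 99991, 0xCAFEBABE, MOD - 1]
round_trip = all(evaluate(P_inv, evaluate(P, x)) == x for x in samples)
```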
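2.3.5. Notes on Permutation Polynomials Over $\mathbb{Z}/(p^w)$. Let $p$ be a prime and $w \in \mathbb{N}$. The properties of permutation polynomials over $\mathbb{Z}/(p^w)$ are explored in a 1984 paper by Mullen and Stevens [19], which teaches us the following.
(1) With $\tau(m)$ being the number of PPs over $\mathbb{Z}/(m)$, for arbitrary $m \in \mathbb{N}$ with $m > 0$, where $m = p_1^{n_1} \cdots p_k^{n_k}$ with $p_1, \ldots, p_k$ distinct primes and $n_1, \ldots, n_k \in \mathbb{N}$, we have $\tau(m) = \prod_{i=1}^{k} \tau(p_i^{n_i})$.
(2) The number of functionally distinct PPs over $\mathbb{Z}/(p^n)$ is, for $n > 1$,
$\tau(p^n) = p!\,(p - 1)^p\, p^{\,n(N_p(n) + 1) - S_p(n) - 2p}$
where $S_p(n) = \sum_{i=0}^{N_p(n)} \nu_p(i)$, $N_p(n)$ is the greatest integer $\rho$ :- $\nu_p(\rho) < n$, and $\nu_p(s) = \sum_{i \ge 1} \lfloor s / p^i \rfloor$. Note that $N_p(n)$ is that integer $\kappa$ :- $p^n \mid (\kappa + 1)!$ but $p^n \nmid \kappa!$, and we have $N_p(n) \equiv -1 \pmod{p}$.
(3) Every polynomial function $f(x) = \sum_{i=0}^{n} a_i x^i$ can be expressed in the falling factorial form
$f(x) = \sum_{t=0}^{n} c_t x^{(t)}$
where $x^{(t)} = \prod_{i=0}^{t-1} (x - i)$ with $x^{(0)} = 1$. $x^{(t)}$ is a falling factorial.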
2.3.6. Notes on Permutation Polynomials Over $\mathbb{Z}/(2^w)$. For a computer with $w$-bit words, PPs over $\mathbb{Z}/(2^w)$ are especially convenient, since addition, multiplication, and subtraction mod $2^w$ can be performed by simply ignoring overflows, underflows, and the distinction between magnitude and 2's complement computations, and taking the ordinary uncorrected hardware result for such machines. Moreover, this is also the default behavior of computations on operands of type int, signed int, or unsigned int in programs written in C, C++, or Java™.
Adjusting the results from [19] above, the number of functionally distinct PPs over $\mathbb{Z}/(2^w)$ is, for $w > 1$,
$\tau(2^w) = 2^{\,w(N_2(w) + 1) - S_2(w) - 3}$
where $S_2(w) = \sum_{i=0}^{N_2(w)} \nu_2(i)$, $N_2(w)$ is the greatest integer $\rho$ :- $\nu_2(\rho) < w$, and $\nu_2(s) = \sum_{i \ge 1} \lfloor s / 2^i \rfloor$. Note that $N_2(w)$ is that integer $\kappa$ :- $2^w \mid (\kappa + 1)!$ but $2^w \nmid \kappa!$, and we have $N_2(w) \equiv -1 \pmod{2}$.
2.3.7. General Notes on Encodings. $P'$ denotes an encoded implementation derived from function $P$. To emphasize that $P$ maps $m$-vectors to $n$-vectors, we write ${}_m^n P$. $P$ is then called an $n \times m$ function or an $n \times m$ transformation. For a matrix $M$, ${}_m^n M$ indicates that $M$ has $m$ columns and $n$ rows. (These notations naturally correspond, taking application of $M$ to a vector as function application.)
${}_m^n E$ (mnemonic: entropy-transfer function) is any function from $m$-vectors over $B$ to $n$-vectors over $B$ which loses no bits of information for $m \le n$ and at most $m - n$ bits for $m > n$. A function ${}_m^n f$ which is not an instance of ${}_m^n E$ is lossy. Multiple occurrences of ${}_m^n E$ in a given formula or equation denote the same function.
${}_n e$ (mnemonic: entropy vector) is an arbitrary vector selected from $B^n$. Multiple occurrences of ${}_n e$ in a given formula or equation denote the same vector.
An affine function or affine transformation (AT) is a vector-to-vector function $V$ defined for all vectors ${}_m v \in S^m$ for some set $S$ by ${}_m^n V({}_m v) = {}_m^n M\, {}_m v + {}_n d$ (concisely: $V(x) = Mx + d$), where $M$ is a constant matrix, and $d$ a constant displacement vector. If $A$ and $B$ are ATs, then so are $A \parallel B$, $[A, B]$, and $A \circ B$ where defined. An AT $V(x) = Mx + d$ is a linear function or linear transformation (LT) iff $d = \vec{0}$.
A function $f: F^k \to F^m$ from $k$-vectors to $m$-vectors over $(F, +, \cdot) = GF(n)$ for some prime power $n$ is deeply nonlinear iff there is no linear function $g: F^k \to F^m$ and encodings $d_1, \ldots, d_k, r_1, \ldots, r_m: F \to F$ :- $f = [r_1, \ldots, r_m] \circ g \circ [d_1^{-1}, \ldots, d_k^{-1}]$.
(Note that if $\exists\, g'$ :- $f = [r_1', \ldots, r_m'] \circ g' \circ [d_1^{-1}, \ldots, d_k^{-1}]$ where $g'$ is affine, then certainly $\exists$ linear $g, r_1, \ldots, r_m$ :- $f = [r_1, \ldots, r_m] \circ g \circ [d_1^{-1}, \ldots, d_k^{-1}]$, since we can choose $r_1, \ldots, r_m$ to perform the elementwise addition of the vector displacement of $g'$.)
If $g: A^k \to A^m$ is not deeply nonlinear for a prime power $|A| > 1$, we say that $g$ is linear up to I/O encoding.
We have proven the following regarding linearity and identity up to I/O encoding.
(1) If a function is linear up to I/O encoding, then so are all of its projections.
(2) Two matrices are identical up to I/O encoding iff one can be converted into the other by a sequence of multiplications of a row or a column by a nonzero scalar.
(3) If two functions are linear up to I/O encoding and identical up to I/O encoding, then they are I/O encodings of matrices which are also identical up to I/O encoding.
(4) If two matrices are identical up to I/O encoding, then so are their corresponding submatrices.
(5) If $M$ is a nonzero matrix over $GF(n)$, there is a matrix $\bar{M}$ over $GF(n)$ so that $M, \bar{M}$ are identical up to I/O encoding, where $\bar{M}$ (the I/O-canonical form of $M$) has a leading row and column which contain only 0's and 1's.
2.3.8. *Fractures and Fracture Functions. As noted in §2.3.1, generally when an encoded output is produced, it is consumed with exactly the same encoding assumed, so that an encoded operation $z = f(x, y)$ becomes $z' = f'(x', y')$ where $(x', y', z') = (e_x(x), e_y(y), e_z(z))$, for encodings $e_x, e_y, e_z$, and where $f' = e_z \circ f \circ [e_x^{-1}, e_y^{-1}]$.
It is sometimes advantageous to output a value with one encoding, and subsequently input it assuming some other encoding. If we output $x$ as $e_1(x)$, and later consume it assuming encoding $e_2$, in effect we have applied $e_2^{-1} \circ e_1$ to the unencoded value. We call such an intentional mismatch between the encoding in which a value is produced and the encoding assumed when it is consumed a fracture. If the encodings are linear, so is the fracture function $e_2^{-1} \circ e_1$, and if they are permutation polynomials, so is the fracture function $e_2^{-1} \circ e_1$.
Fractures are potentially useful in obfuscation because the computation which they perform effectively does not appear in the code: the amount and form of code to perform a normal networked encoding and one which adds an operation by means of a fracture is identical, and there appears to be no obvious way to disambiguate these cases, since encodings themselves tend to be somewhat ambiguous.
Note that the defining property of a fracture is the fracture function, say $v^{-1} \circ u$. Generally, there are many different choices of consuming encoding $v$ and producing encoding $u$ which produce exactly the same fracture function: it is quite possible, for example, to have $u_1, \ldots, u_k, v_1, \ldots, v_k$ such that $v_i^{-1} \circ u_i$ is the same fracture function for $i = 1, \ldots, k$. Thus specifying the fracture function does not nail down the producing and consuming encodings which imply it.
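A fracture can be illustrated with two affine encodings. The sketch below assumes 16-bit words and hypothetical encoding parameters: the producer encodes with scale 7, the consumer decodes as if the scale were 21, and the mismatch silently multiplies the value by $3^{-1}$. A second, different pair of encodings is shown to yield the same fracture function. Python 3.8 or later is assumed for `pow(s, -1, MOD)`.

```python
W = 16                               # illustrative word width (an assumption)
MOD = 1 << W

def affine(s, b):
    """Encoding v -> s*v + b over Z/(2^W); s must be odd."""
    assert s % 2 == 1
    return lambda v: (s * v + b) % MOD

def affine_inverse(s, b):
    """Decoding for the encoding v -> s*v + b."""
    s_inv = pow(s, -1, MOD)
    return lambda v: ((v - b) * s_inv) % MOD

# First pair: produced with e1(v) = 7v + 100, consumed assuming e2(v) = 21v + 100.
e1, e2_inv = affine(7, 100), affine_inverse(21, 100)
# Second pair: different encodings, same mismatch ratio 35/105 = 1/3.
u1, v1_inv = affine(35, 5), affine_inverse(105, 5)

def fracture_a(v):
    return e2_inv(e1(v))

def fracture_b(v):
    return v1_inv(u1(v))

probe = range(0, MOD, 257)
# Both fractures multiply by the inverse of 3, so tripling the result recovers v.
divides_by_three = all((3 * fracture_a(v)) % MOD == v for v in probe)
same_fracture = all(fracture_a(v) == fracture_b(v) for v in probe)
```

No instruction in either pipeline performs the multiplication by $3^{-1}$ explicitly; it arises only from the mismatch, and knowing the fracture function does not reveal which of the two encoding pairs produced it.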
2.4. Partial Evaluation (PE). A partial evaluation (PE) of an MF is the generation of an MF by freezing some of the inputs of some other MF (or the MF so generated). More formally, let $f:: X \times Y \mapsto Z$ be an MF. The partial evaluation (PE) of $f$ for constant $c \in Y$ is the derivation of that MF $g:: X \mapsto Z$ such that, for any $x \in X$ and $z \in Z$, $g(x) \to z$ iff $f(x, c) \to z$. To indicate this PE relationship, we may also write $g(\cdot) \equiv f(\cdot, c)$. We may also refer to the MF $g$ derived by PE of $f$ as a partial evaluation (PE) of $f$. That is, the term partial evaluation may be used to refer to either the derivation process or its result.
To provide a specific example, let us consider the case of compilation.
Without PE, for a compiler program $p$, we may have $p: S \mapsto E$ where $S$ is the set of all source code files and $E$ is the set of object code files. Then $e = p(s)$ would denote an application of the compiler program $p$ to the source code file $s$, yielding the object code file $e$. (We take $p$ to be a function, and not just a multi-function, because we typically want compilers to be deterministic.)
Now suppose we have a very general compiler $q$, which inputs a source program $s$, together with a pair of semantic descriptions: a source language semantic description $d$ and a description of the semantics of executable code on the desired target platform $t$. It compiles the source program according to the source language semantic description into executable code for the desired target platform. We then have $q: S \times (D \times T) \mapsto E$ where $S$ is the set of source code files, $D$ is the set of source semantic descriptions, $T$ is the set of platform executable code semantic descriptions, and $E$ is the set of object code files for any platform. Then a specific compiler is a PE $p$ of $q$ with respect to a constant tuple $(d, t) \in D \times T$, i.e., a pair consisting of a specific source language semantic description and a specific target platform semantic description: that is, $p(s) = q(s, (d, t))$ for some specific, constant $(d, t) \in D \times T$. In this case, $X$ (the input set which the PE retains) is $S$ (the set of source code files), $Y$ (the input set which the PE removes by choosing a specific member of it) is $D \times T$ (the Cartesian product of the set $D$ of source semantic descriptions and the set $T$ of target platform semantic descriptions), and $Z$ (the output set) is $E$ (the set of object code files).
PE is used in [10, 11]: the AES-128 cipher [16] and the DES cipher are partially evaluated with respect to the key in order to hide the key from attackers. A more detailed description of the underlying methods and system is given in [21, 22].
Optimizing compilers perform PE when they replace general computations with more specific ones by determining where operands will be constant at run-time, and then replacing their operations with constants or with more specific operations which no longer need to input the (effectively constant) operands.
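Partial evaluation with respect to a fixed input can be sketched in a few lines. The example below is a toy under stated assumptions: `keyed_round` is a hypothetical stand-in for a keyed operation (it is not AES or DES), `KEY` is an arbitrary constant, and the table-based variant illustrates how a specialized function need not mention the key at all.

```python
from functools import partial

def keyed_round(x, key):
    """Hypothetical toy keyed operation on 32-bit words (not a real cipher)."""
    return ((x ^ key) * 0x9E3779B1 + key) & 0xFFFFFFFF

KEY = 0x0F1E2D3C                     # the constant c in Y that is frozen

# PE by closure: g(x) = f(x, c) with the second input fixed.
g = partial(keyed_round, key=KEY)

# PE by tabulation over a small domain: the key no longer appears in the evaluator.
TABLE = [keyed_round(x, KEY) for x in range(256)]

def g_table(x):
    """Specialized function for 8-bit inputs; only the precomputed table is consulted."""
    return TABLE[x]

consistent = all(g(x) == keyed_round(x, KEY) == g_table(x) for x in range(256))
```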
2.5. Output Extension (OE). Suppose we have a function $f: U \to V$. Function $g: U \to V \times W$ is an output extension (OE) of $f$ iff for every $u \in U$ we have $g(u) = (f(u), w)$ for some $w \in W$. That is, $g$ gives us everything that $f$ does, and in addition produces extra output information.
We may also use the term output extension (OE) to refer to the process of finding such a function $g$ given such a function $f$.
Where function $f$ is implemented as a routine or other program fragment, it is generally straightforward to determine a routine or program fragment implementing a function $g$ which is an OE of function $f$, since the problem of finding such a function $g$ is very loosely constrained.
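A minimal sketch of an output extension follows. Both functions are hypothetical; the extra output component is arbitrary by design, which reflects how loosely constrained the choice of $g$ is.

```python
def f(u):
    """Original function f: U -> V (a hypothetical example)."""
    return (u * u) % 251

def g(u):
    """An output extension of f: returns (f(u), w) for some extra value w."""
    extra = (7 * u + 3) % 16         # any additional information would do
    return (f(u), extra)

# Dropping the extra component of g recovers f on every tested input.
extends_f = all(g(u)[0] == f(u) for u in range(1000))
```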
2.6. Reverse Partial Evaluation (RPE). To create general, low-overhead, effective interlocks for binding protections to SBEs, we will employ a novel method based on reverse partial evaluation (RPE).
Plainly, for almost any MF or program $g:: X \mapsto Z$, there is an extremely large set of programs or MFs $f$, sets $Y$, and constants $c \in Y$, for which, for any arbitrary $x \in X$, we always have $g(x) = f(x, c)$.
We call the process of finding such a tuple $(f, c, Y)$ (or the tuple which we find by this process) a reverse partial evaluation (RPE) of $g$.
Notice that PE tends to be specific and deterministic, whereas RPE offers an indefinitely large number of alternatives: for a given $g$, there can be any number of different tuples $(f, c, Y)$ every one of which qualifies as an RPE of $g$.
Finding an efficient program which is the PE of a more general program may be very difficult; that is, the problem is very tightly constrained. Finding an efficient RPE of a given specific program is normally quite easy because we have so many legitimate choices; that is, the problem is very loosely constrained.
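To illustrate how loosely constrained RPE is, the sketch below gives three unrelated pairs $(f, c)$ for one specific function $g$. All functions and constants are hypothetical examples over 32-bit words.

```python
MOD = 1 << 32

def g(x):
    """The specific program whose RPEs we look for."""
    return (3 * x + 5) % MOD

# Three different generalizations f(x, c), each paired with a constant c.
def f1(x, c):
    return (c * x + 5) % MOD                 # c is the multiplier

def f2(x, c):
    scale, offset = c
    return (scale * x + offset) % MOD        # c is a pair (multiplier, offset)

def f3(x, c):
    return (3 * x + 5 + (c ^ 0xABCD)) % MOD  # c cancels a mask

RPES = [(f1, 3), (f2, (3, 5)), (f3, 0xABCD)]

samples = [0, 1, 17, 0xFFFFFFFF, 123456789]
all_match = all(f(x, c) == g(x) for f, c in RPES for x in samples)
```

Each pair satisfies $g(x) = f(x, c)$, yet each $f$ behaves differently for other values of $c$.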
2.7. Control Flow Graphs (CFGs) in Code Compilation. In compilers, we typically represent the possible flow of control through a program by a control flow graph (CFG; see the definition in §2.1.3), where a basic block (BB) of executable code (a 'straight line' code sequence which has a single start point, a single end point, and is executed sequentially from its start point to its end point) is represented by a graph node, and an arc connects the node corresponding to a BB U to the node corresponding to a BB V if, during the execution of the containing program, control either would always, or could possibly, flow from the end of BB U to the start of BB V. This can happen in multiple ways:
(1) Control flow may naturally fall through from BB U to BB V.
For example, in the C code fragment below, control flow naturally falls through from U to V:
switch (radix) {
case HEX :
U
case OCT :
V
}
(2) Control flow may be directed from U to V by an intra-procedural control construct such as a while-loop, an if-statement, or a goto-statement.
For example, in the C code fragment below, control is directed from A to Z by the break-statement:
switch (radix) {
case HEX :
A
break ;
case OCT :
B
}
Z
(3) Control flow may be directed from U to V by a call or a return.
For example, in the C code fragment below, control is directed from B to A by the call to f() in the body of g(), and from A to C by the return from the call to f():
void f (void) {
A
return ;
}
int g (int a, float x) {
B
f () ;
C
}
(4) Control flow may be directed from U to V by an exceptional control-flow event.
For example, in the C++ code fragment below, control is potentially directed from U to V by a failure of the dynamic_cast of, say, a reference y to a reference to an object in class A:
#include <typeinfo>
int g (int a, float x) {
try {
U
A& x = dynamic_cast<A&> (y) ;
} catch (bad_cast c) {
V
}
}
For each node $n \in N$ in a CFG $C = (N, T)$ ($C$ for control, $T$ for transfer), node $n$ is taken to denote a specific BB, and that BB computes an MF determined by the code which BB $n$ contains: some function $f:: X \mapsto Y$, where $X$ represents the set of all possible values read and used by the code of $n$ (and hence the inputs to function $f$), and $Y$ represents the set of all possible values written out by the code of $n$ (and hence the outputs from function $f$). Typically $f$ is a function, but if $f$ makes use of nondeterministic inputs such as the current reading of a high-resolution hardware clock, $f$ is an MF but not a function. Moreover, some computer hardware includes instructions which may produce nondeterministic results, which, again, may cause $f$ to be an MF, but not a function.
For an entire program having a CFG $C = (N, T)$ and start node $n_0$, we identify $N$ with the set of BBs of the program, we identify $n_0$ with the BB appearing at the starting point of the program (typically the beginning BB of the routine main() for a C or C++ program), and we identify $T$ with every feasible transfer of control from one BB of the program to another.
Sometimes, instead of a CFG for an entire program, we may have a CFG for a single routine. In that case, we identify $N$ with the set of BBs of the routine, we identify $n_0$ with the BB appearing at the beginning of the routine, and we identify $T$ with every possible transfer of control from one BB of the routine to another.
2.8. *Permuting by Pair-Swapping. Here we consider how to produce permutations of $n$ elements using only random $2 \times 2$ switch elements which compute either $y_1, y_2 \leftarrow x_1, x_2$ (no swap) or $y_1, y_2 \leftarrow x_2, x_1$ (swap), each with probability $\frac{1}{2}$.
2.8.1. *Permuting by Blind Pair-Swapping. A sorting network may be represented as shown in Figure 7, by a series of parallel wires 72 in which, at certain points, at right angles to these wires, one wire is connected to another wire by a cross-connection 74 (representing a compare-and-swap-if-greater operation on the data elements being carried by the two connected wires). If, irrespective of the inputs, the output emerges in sorted order, the resulting network is a sorting network. The comparisons in a sorting network are data-independent: correct sorting results at the right ends of the wires irrespective of the data introduced at the left ends of the wires. Compare-and-swap-if-greater operations can be reordered so long as the relative order of the comparisons sharing an end-point is preserved.
An efficient way of constructing such a sorting network for $n$ nodes is given by Batcher's Odd-Even Mergesort [2], which is a data-independent sort: exactly the same compare-and-swap-if-greater operations are performed irrespective of the data to be sorted. The algorithm performs $O(n (\log n)^2)$ comparisons in sorting a set of $n$ elements. Details of the algorithm can be found at these URLs: [3].
If such a network will sort arbitrary data into order, it follows that, if the compare-and-swap-if-greater operations are replaced with operations which swap with probability $\frac{1}{2}$, the same network configuration will permute a sequence of $n$ distinct elements into a random order, probably a biased one. This is the basis of the mechanism we will use to implement permutations, using pseudo-random true/false variates created using computations.
Note that the number of permutations of $n$ elements is $n!$, whereas the number of swap-configurations, using one bit in position $2^i$ to indicate whether the $i$th swap was done or not done, is $2^{b(n)}$ where $b(n)$ is the number of stages in a Batcher network sorting $n$ elements.
For example, for $n = 3$, $n! = 6$, $b(n) = 3$, and $2^{b(n)} = 8$, so there must be permutations which can be selected more than one way, but some of them cannot. Similarly, for $n = 5$, $n! = 120$, $b(n) = 9$, and $2^{b(n)} = 512$, so there must be permutations which can be selected more than one way, but the number of swap configurations which select a given permutation cannot always be the same because $120 \nmid 512$.
Reducing bias requires that we ensure that the number of ways of reaching any permutation is roughly the same for each permutation. Since $2^{b(n)}$ is a power of two for any $n$, this cannot be done simply by adding extra stages. It is necessary in some cases to use other methods for reducing bias.
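The bias for $n = 3$ can be made visible by enumerating all swap configurations. The sketch below assumes a three-comparator sorting network on wires (0,1), (1,2), (0,1); this particular layout is an assumption for the illustration and is not claimed to be the exact network produced by Batcher's construction.

```python
from collections import Counter
from itertools import product

# A 3-comparator sorting network on 3 wires (layout assumed for illustration).
NETWORK = [(0, 1), (1, 2), (0, 1)]

def apply_swaps(network, swap_bits, items):
    """Replace each compare-and-swap by an unconditional swap / no-swap decision."""
    out = list(items)
    for (i, j), do_swap in zip(network, swap_bits):
        if do_swap:
            out[i], out[j] = out[j], out[i]
    return tuple(out)

# Enumerate all 2^3 swap configurations and count the permutation each one selects.
counts = Counter(apply_swaps(NETWORK, bits, (0, 1, 2))
                 for bits in product((0, 1), repeat=len(NETWORK)))

num_permutations = len(counts)                   # distinct permutations reached
max_ways, min_ways = max(counts.values()), min(counts.values())
```

With this network all six permutations are reachable, two of them by two configurations each and the other four by one, so uniformly random swap decisions give a 2:1 bias.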
2.8.2. *Permuting by Controlled Pair-Swapping. The method described in §2.8.1 suffers from significant bias (the bias can easily exceed two to one). The problem was that the number of random swap/no-swap decisions is always a power of 2, whereas the number of permutations is always a factorial, and for element counts above 2, the number of permutations never evenly divides the number of swap/no-swap lineups, which form a number between $0$ and $2^k - 1$ inclusive which can be viewed as a string of $k$ bits: one bit per swap/no-swap decision.
There are two different mechanisms which we can deploy to obtain results from the same kinds of decision elements (comparing two pseudo-random numbers). We begin with a straight selection problem: to generate a permutation, we choose one of $n$ elements for a given position, and then choose one of $n - 1$ remaining elements for another, and so on, until we are forced to choose the remaining element for the remaining position.
The first method of removing bias might be called attenuation. Suppose, for example, that we need to choose one of 12 elements. We could create a binary tree of decisions with 16 leaf nodes, and map the leaf nodes onto the 12 elements, with 8 being reachable via one leaf and 4 being reachable by two leaf nodes each. (We simply wrap the 16 leaf nodes around the 12 choices until all leaf nodes have been used; i.e., we create a sequence of 16 elements by repeating the element numbers from 1 to 12 until we have an element number for each of the 16 leaf nodes: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4).) At this point, we have a maximum 2:1 bias in our selection. If we use a tree with 32 leaf nodes, we will have 4 elements reachable from two leaf nodes and 8 reachable from 3, and our maximum bias is reduced to 3:2. If we use a tree with 64 leaf nodes, we will have 8 elements reachable from 5 leaf nodes and 4 reachable from 6, and our maximum bias is reduced to 6:5. Attenuation produces fast but bulky code.
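The effect of deeper trees on the bias can be computed directly. This is a small sketch of the wrap-around mapping described above; the function name is hypothetical and elements are numbered from 0 rather than 1.

```python
from collections import Counter

def attenuation_bias(num_elements, tree_depth):
    """Wrap 2^tree_depth leaves around the elements; return (max, min) leaves per element."""
    leaves = 1 << tree_depth
    counts = Counter(leaf % num_elements for leaf in range(leaves))
    return max(counts.values()), min(counts.values())

bias_16 = attenuation_bias(12, 4)    # 16 leaves
bias_32 = attenuation_bias(12, 5)    # 32 leaves
bias_64 = attenuation_bias(12, 6)    # 64 leaves
```

For 12 elements this yields leaf counts of (2, 1), (3, 2), and (6, 5) per element for 16, 32, and 64 leaves respectively, so the worst-case ratio shrinks towards 1 as the tree grows.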
The second method of removing bias might be called reselection. Suppose, as above, that we need to choose one of 12 elements. We could create a binary tree of decisions with 16 leaf nodes, and map 12 leaf nodes to the 12 choices. The other four choices map to looping back and performing the whole selection over again. With probability $\frac{3}{4}$, we succeed on the first try. With probability $\frac{15}{16}$, we succeed in the first two tries. With probability $\frac{63}{64}$, we succeed in the first three. Reselection has the advantage that it can almost completely eliminate bias. It has the disadvantage that, while spatially compact, it involves redoing some steps. Also, since we would want to limit the number of iterations, it involves counting the repetitions and defaulting to a (slightly) biased choice on those rare cases where the count is exceeded. Reselection is more compact, slower, and eliminates more of the bias than attenuation.
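Reselection is a bounded form of rejection sampling, which can be sketched as follows. The sketch assumes that the tree of binary decisions is modelled by drawing `tree_depth` random bits from Python's `random.Random`; the function names and the retry limit of 8 are hypothetical.

```python
import random
from fractions import Fraction

def reselect(num_elements, tree_depth, rng, max_tries=8):
    """Pick an element by drawing a leaf and retrying when the leaf is unmapped."""
    for _ in range(max_tries):
        leaf = rng.getrandbits(tree_depth)       # one walk down the decision tree
        if leaf < num_elements:
            return leaf
    # Retry limit exceeded: fall back to a (slightly) biased wrap-around choice.
    return leaf % num_elements

def success_within(num_elements, tree_depth, tries):
    """Exact probability that some try among the first `tries` hits a mapped leaf."""
    leaves = 1 << tree_depth
    p_fail = Fraction(leaves - num_elements, leaves)
    return 1 - p_fail ** tries

probabilities = [success_within(12, 4, t) for t in (1, 2, 3)]

rng = random.Random(2013)                        # fixed seed for repeatability
draws = [reselect(12, 4, rng) for _ in range(1000)]
```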
The third method of removing bias might be called reconfiguration. Suppose, as above, that we need to choose one of 12 elements. We note that using 16 leaf nodes, we have a bias of 2:1 with 8 nodes reachable 1 way and 4 reachable 2 ways. Suppose we set up three ways to identify the mapping of leaf nodes to elements. For each of the 12 elements, it appears in the "1 way" set in two configurations and in the "2 ways" set in one configuration. We then (using reselection) choose one of the three configurations, and then select using the tree of 16, resulting in an almost perfectly unbiased selection, and we only had to apply reselection on 3 elements, not 12. At the cost of a little extra data, this method provides the best combination of compactness and speed when the number of configurations needed to eliminate bias is small. (The maximum possible number of configurations is bounded above by the number of elements from which to choose, but at that number of configurations, using reconfiguration is pointless, because choosing the configuration is simply another instance of the problem whose bias we are trying to reduce.)
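The probabilities under reconfiguration can be computed exactly. The sketch below assumes that the configuration is chosen exactly uniformly (in the text it is chosen by reselection, which is only nearly uniform) and uses a hypothetical grouping in which configuration $j$ doubles the elements congruent to $j$ modulo 3.

```python
from fractions import Fraction

NUM_ELEMENTS, LEAVES, NUM_CONFIGS = 12, 16, 3

def leaf_map(config):
    """Map 16 leaves to 12 elements; the 4 elements of group `config` get two leaves."""
    doubled = [e for e in range(NUM_ELEMENTS) if e % NUM_CONFIGS == config]
    return list(range(NUM_ELEMENTS)) + doubled

def element_probability(element):
    """Exact selection probability, averaging over a uniformly chosen configuration."""
    total = Fraction(0)
    for config in range(NUM_CONFIGS):
        hits = leaf_map(config).count(element)
        total += Fraction(1, NUM_CONFIGS) * Fraction(hits, LEAVES)
    return total

probabilities = [element_probability(e) for e in range(NUM_ELEMENTS)]
```

Each element is doubled in exactly one of the three configurations, so its probability is $\frac{1}{3} \cdot \frac{2}{16} + \frac{2}{3} \cdot \frac{1}{16} = \frac{1}{12}$; any residual bias comes only from the choice among the three configurations.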
2.9. Deep Nonlinearity: Function-Indexed Interleaving. The AES-128 implementation described in [10], built using the methods of [21], has been penetrated using the attack in [4]. While this attack succeeded, the attack is quite complex, and would require significant human labor to apply to any particular software implementation, so even without modifications, the methods of [21] are useful. It would be extremely difficult to make the attack of [4] succeed against an implementation according to [21] fortified according to [22]. However, we now seek stronger protection, and so it behooves us to find ways to further bulwark the methods of [21, 22] in order to render attacks such as those in [4] infeasible.
2.9.1. Shallow Nonlinearity and Homomorphic Mapping Attacks. Much use is made in implementations according to [21, 22] of wide-input linear transformations (§4.0 in [21]) and the matrix blocking method described in §4.1 on pp. 9-10 (paragraphs [0195]-[0209] in [21]). It is true that the methods of [21] produce non-linear encoded implementations of such linear transformation matrices. However, the implementations are shallowly nonlinear. That is, such a matrix is converted into a network of substitution boxes (lookup tables) which necessarily have a limited number of elements due to space limitations. The nonlinear encodings (arbitrary 1-to-1 functions, themselves representable as substitution boxes; i.e., as lookup tables) on values used to index such boxes and on element values retrieved from such boxes are likewise restricted to limited ranges due to space limitations.
Thus any data transformation computed by an input-output-encoded implementation of such a blocked matrix representation, which is implemented as a network of substitution boxes, or a similar device for representing essentially arbitrary random functions, is linear up to I/O encoding; that is, any such transformation can be converted to a linear function by individually recoding each input vector element and individually recoding each output vector element.
The attack method in [4] is a particular instance of a class of attacks based on homomorphic mapping. The attack takes advantage of the known properties of linear functions, in this case over GF(2^8), since that is the algebraic basis of the computations in the AES. In particular, addition in GF(2^n) is performed using bitwise ⊕ (exclusive or), and this function defines a Latin square of precisely known form. Thus it is possible to search for a homomorphism from an encoded table-lookup version of ⊕ to an unencoded one, and it is possible, in the case of any function f = Q ∘ ⊕ ∘ Q^{-1} where ⊕ is bitwise, to find an approximate solution Q̂ = Q ∘ A for a particular affine A (i.e., an approximation Q̂ which is within an affine mapping A of the real Q) with reasonable efficiency. These facts are exploited in the attack of [4], and there are other attacks which could similarly exploit the fact that the blocked matrix function implementations of [21, 22] are linear up to I/O encoding. While such attacks yield only partial information, they may narrow the search for exact information to the point where the remaining possibilities can be explored by exhaustive search. For example, a white-box implementation of encryption or decryption using the building blocks provided by [21, 22] may be vulnerable to key-extraction attacks such as that in [4], or related attacks based on homomorphic mapping.
2.9.2. Foiling Homomorphic Mapping: Deeply Nonlinear Functions. The solution to homomorphic mapping attacks is to replace such matrix functions with functions which are (1) wide-input: that is, the number of bits comprising a single input is large, so that the set of possible input values is extremely large, and (2) deeply nonlinear: that is, functions which cannot possibly be converted into linear functions by I/O encoding (i.e., by individually recoding individual inputs and individual outputs).
Making the inputs wide makes brute force inversion by tabulating the function over all inputs consume infeasibly vast amounts of memory, and deep nonlinearity prevents homomorphic mapping attacks such as that in [4].
For example, we could replace the MixColumns and InvMixColumns transformations in AES, which input and output 32-bit (4-byte) values, with deeply nonlinear MDS transforms which input and output 64-bit (8-byte) values, rendering brute-force inversion of either of these impossible. Call these variants MixColumns_64 and InvMixColumns_64. (Since encryption of a message is done at the sender and decryption at the recipient, these would not normally be present on the same network node, so an attacker normally has access only to one of them.)
Suppose, for example, that we want to construct such a deeply nonlinear vector-to-vector function over GF(2^n) (where n is the polynomial size, i.e., the bit-string size, for the implementation) or, respectively, over Z/(2^n) (where n is the desired element width). Let u + v = n, where u and v are positive nonzero integers. Let G = our chosen representation of GF(2^n) (respectively, of Z/(2^n)), G_u = our chosen representation of GF(2^u) (respectively, of Z/(2^u)), and G_v = our chosen representation of GF(2^v) (respectively, of Z/(2^v)).
Suppose we need to implement a deeply nonlinear function f: G^p → G^q, with p ≥ 3 and q ≥ 2; i.e., one mapping p-vectors to q-vectors over our chosen representation G of GF(2^n).
If we wanted a linear function, we could construct one using a q × p matrix over G, and if we wanted one which was nonlinear, but linear up to I/O encoding, we could use a blocked encoded implementation of such a matrix according to [21, 22]. These methods do not suffice to obtain deep nonlinearity, however.
We note that elements of G, G_u, G_v are all bit-strings (of lengths n, u, v, respectively). E.g., if n = 8 and u = v = 4, then elements of G are 8-bit bytes and elements of G_u and G_v are 4-bit nybbles (half-bytes).
The following construction is called function-indexed interleaving. We introduce operations extract[r, s](·) and interleave(·, ·) which are readily implementable on virtually any modern computer, as would be evident to those versed in code generation by compiler. For a bit-string
S = (b_0, b_1, ..., b_t),
we define
extract[r, s](S) = (b_r, b_{r+1}, ..., b_s);
i.e., extract[r, s] returns bits r to s, inclusive. For a vector of bit-strings
V = (S_1, S_2, ..., S_z),
we define
extract[r, s](V) = (extract[r, s](S_1), extract[r, s](S_2), ..., extract[r, s](S_z));
i.e., extract[r, s] returns a new vector containing bits r to s, inclusive, of each of the old vector elements. For two vectors of bit-strings of the same length, say V = (S_1, ..., S_z) and W = (T_1, ..., T_z), we define
interleave(V, W) = (S_1 ∥ T_1, S_2 ∥ T_2, ..., S_z ∥ T_z);
i.e., each element of interleave(V, W) is the concatenation of the corresponding element of V with the corresponding element of W.
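The two operations just defined might be realized as follows. This sketch makes assumptions the text does not fix: bit-strings are held as Python integers with bit 0 as the least significant bit, and interleave takes an explicit width_s argument (a hypothetical parameter) giving the bit length of the first vector's elements so that concatenation is unambiguous.

```python
def extract(r, s, value):
    """Bits r..s inclusive of an integer bit-string (bit 0 = least significant)."""
    return (value >> r) & ((1 << (s - r + 1)) - 1)

def extract_vec(r, s, vec):
    """Apply extract[r, s] to every element of a vector of bit-strings."""
    return [extract(r, s, x) for x in vec]

def interleave(vec_s, vec_t, width_s):
    """Element-wise concatenation S_i || T_i; S_i occupies the low width_s bits."""
    if len(vec_s) != len(vec_t):
        raise ValueError("vectors must have the same length")
    return [s | (t << width_s) for s, t in zip(vec_s, vec_t)]
```

With n = 8 and u = v = 4, splitting a byte vector with extract_vec(0, 3, V) and extract_vec(4, 7, V) and recombining the halves with interleave(..., 4) returns the original vector.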
To obtain our deeply nonlinear function f: G^p → G^q above, we proceed as follows per the flow chart of Figure 8.
80 Select a linear function L: G_u^p → G_u^q, or equivalently, select a q × p matrix over G_u. (Since singular square submatrices can create vulnerabilities to homomorphic mapping, it is preferred that most square submatrices of the matrix representation of L be nonsingular. If L is MDS, no square submatrix of L is singular, so this preference is certainly satisfied.)
82 Select k ≥ 2 linear functions R_i: G_v^p → G_v^q, for i = 0, ..., k − 1, or equivalently, select k ≥ 2 q × p matrices over G_v. (Since singular square submatrices can create vulnerabilities to homomorphic mapping, it is preferred that most square submatrices of the matrix representations of R_0, ..., R_{k−1} be nonsingular. If R_0, ..., R_{k−1} are MDS, no square submatrix of any R_i is singular, so this preference is certainly satisfied.)
84 Select a function s: G_u^p → {0, 1, ..., k − 1} for which
s{G_u^p} = {0, 1, ..., k − 1}
(i.e., choose an s that is "onto" or "surjective").
Other than the requirement that s be onto, we could choose s at random. However, even simple constructions suffice for obtaining s. As an example, we give our preferred construction for s, as follows.
If k ≤ u, we choose a linear function s1: G_u^p → G_u (or equivalently, a 1 × p matrix over G_u) and a function s2: G_u → {0, 1, ..., k − 1}. Similarly, if u < k ≤ 2u, we can choose a linear function s1: G_u^p → G_u^2 and a function s2: G_u^2 → {0, 1, ..., k − 1}, and so on. Then let s = s2 ∘ s1. In the preferred embodiment, k is 2, 4, 8, or some other power of two.
Suppose k = 2. Then s2 could return the low-order bit of the bit-string representation of an element of G_u; if k = 4, s2 could return the low-order 2 bits, and in general, if k ≤ u, s2 could return the value of the bit-string modulo k, which for our preferred choice of k = 2^m, say, is obtained by extracting the m low-order bits of the s1 output.
The above preferred method permits us to use a blocked matrix function implementation for s1, so that the methods of [21, 22] apply to it. Moreover, we can straightforwardly obtain an implementation of f^{-1} when f is invertible, using this preferred construction, by the method disclosed below, which generates an f^{-1} function whose construction is similar to that of f.
86 For any V ∈ G^p, let
V_u = extract[0, u − 1](V),
V_v = extract[u, n − 1](V), and
f(V) = interleave(L(V_u), R_j(V_v))
where j = s(V_u).
88 The function f defined in step 86 above may or may not be deeply nonlinear. The next step, then, is to check for deep nonlinearity.
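Steps 80 through 86 can be sketched concretely. The example below works in the ring variant, with G_u = G_v = Z/(2^4), n = 8, p = q = 3, and k = 2. The matrices L, R, and S1 are hypothetical values chosen for readability; they are triangular and therefore not MDS, so they do not meet the stated preference for nonsingular submatrices.

```python
U_BITS, V_BITS = 4, 4                       # u + v = n = 8
MASK_U = (1 << U_BITS) - 1
MASK_V = (1 << V_BITS) - 1

def mat_vec(matrix, vec, mask):
    """Matrix-vector product over Z/(2^bits), where mask = 2^bits - 1."""
    return [sum(a * x for a, x in zip(row, vec)) & mask for row in matrix]

# Hypothetical example matrices (not taken from the source).
L = [[1, 2, 3], [0, 1, 4], [0, 0, 1]]       # step 80: q x p matrix over G_u
R = [[[1, 1, 0], [0, 1, 1], [0, 0, 1]],     # step 82: R_0 over G_v
     [[3, 0, 0], [1, 5, 0], [2, 1, 7]]]     #          R_1 over G_v
S1 = [1, 1, 1]                              # step 84: 1 x p matrix for s1
K = len(R)

def s(v_u):
    """Selector s = s2 o s1; s2 keeps the low-order bit of the s1 result (k = 2)."""
    s1 = sum(a * x for a, x in zip(S1, v_u)) & MASK_U
    return s1 % K

def f(vec):
    """Step 86: f(V) = interleave(L(V_u), R_j(V_v)) with j = s(V_u)."""
    v_u = [x & MASK_U for x in vec]               # extract[0, u - 1]
    v_v = [(x >> U_BITS) & MASK_V for x in vec]   # extract[u, n - 1]
    left = mat_vec(L, v_u, MASK_U)
    right = mat_vec(R[s(v_u)], v_v, MASK_V)
    return [a | (b << U_BITS) for a, b in zip(left, right)]
```

Changing only the low nybble of the first input from 0 to 1 flips the selector, so the high nybbles of the output are computed by R_1 instead of R_0. This dependence of the right-hand matrix on the left-hand data is what the construction relies on; whether a given choice is deeply nonlinear must still be checked as in step 88.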
If f is deeply nonlinear, then if we freeze all of its inputs but one to constant values, and ignore all of its outputs but one, we obtain a 1 × 1 projection f'. If we choose different values for the frozen inputs, we may obtain different f' functions. For a linear function, or a function linear up to I/O encoding, the number of distinct f' functions obtainable by choosing different values for the frozen inputs is easily computed. For example, if p = q and f is 1-to-1 (i.e., if L, R_0, ..., R_{k−1} are 1-to-1) then there are exactly |G| such functions. f can only be 1-to-1 in this construction if q ≥ p.
We simply count such f' functions, represented as |G|-vectors over G (e.g., by using a hash table to store the number of occurrences of each vector as the p − 1 frozen-input constants are varied over all possibilities). If the number of distinct f' functions could not be obtained by replacing f with a q × p matrix, then f is deeply nonlinear.
We can accelerate this test by noticing that we may perform the above test, not on f, but on arbitrary 1 × 3 projections g of f, where g is obtained by freezing all but three of the inputs to constant values and ignoring all but one of the outputs. This reduces the number of function instances to count for a given unfrozen input and a given unignored output from |G|^{p−1} to |G|^2, which may provide a substantial speedup. Moreover, if f is deeply nonlinear, we generally discover this fairly soon during testing: the very first time we find a projection function count not obtainable from a matrix, we know that g is deeply nonlinear, and therefore f is deeply nonlinear. If we use the acceleration using g with a random selection of three inputs and one output, and we do not succeed in demonstrating deep nonlinearity of f, then f is probably linear up to I/O encoding.
(Note that it is possible that the projection instance counts are obtainable by matrix but that f is still deeply nonlinear. However, this is unlikely to occur by chance and we may ignore it. In any case, if the above test indicates that f is deeply nonlinear, then it certainly is deeply nonlinear. That is, in testing for deep nonlinearity, the above test may generate a false negative, but never a false positive.)
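The counting step of the test can be sketched as below. The function name count_projections and its parameters are hypothetical. The sketch only counts distinct 1 × 1 projections for one choice of varying input and observed output; comparing that count with what a matrix could produce is left to the caller.

```python
from itertools import product

def count_projections(f, p, size, in_idx, out_idx):
    """Count distinct 1 x 1 projections f' of f: G^p -> G^q, where |G| = size.

    Input in_idx varies, the other p - 1 inputs are frozen to every possible
    combination of constants, and only output out_idx is observed.
    """
    seen = set()                       # plays the role of the hash table
    for frozen in product(range(size), repeat=p - 1):
        table = []
        for x in range(size):
            vec = list(frozen)
            vec.insert(in_idx, x)      # place the varying input at its position
            table.append(f(vec)[out_idx])
        seen.add(tuple(table))         # f' represented as a |G|-vector over G
    return len(seen)
```

For instance, for the linear map (x, y, z) to x + 3y + 5z over Z/(2^4), every frozen pair (y, z) yields a projection x to x + c, and c takes all 16 values, so the count is 16. To apply the accelerated variant, pass a 1 × 3 projection g as f with p = 3.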
If the test in step 88 does not show that f is deeply nonlinear (or, for the variant immediately following this list, sufficiently deeply nonlinear), we return to step 80 and try again.
Otherwise, we terminate the construction, having obtained the desired deeply nonlinear function f.
As a variant of the above, we may wish to obtain a function f which is deeply nonlinear, and not only that, but whose projections are also deeply nonlinear. In that case, in step 88 above, we may increase the number of g functions with randomly selected distinct groups of three inputs and one output, for which we must show that the f' instance count is not obtainable by matrix. The more of these we test, the more we ensure that f is not only deeply nonlinear, but is deeply nonlinear over all parts of its domain. We must balance the cost of such testing against the importance of obtaining a deeply nonlinear function which is guaranteed to be deeply nonlinear over more and more of its domain.
2.9.3. Experimental Verification. 1 000 pseudo-random trials of the preferred embodiment of the method for constructing deeply nonlinear functions f were tried with pseudo-randomly generated MDS matrices L and R_0, R_1 (k = 2), where f: G^3 → G^3, G = GF(2^8), and G_u = G_v = GF(2^4). The MDS matrices were generated using the Vandermonde matrix method with pseudo-randomly selected distinct coefficients. Of the resulting 1 000 functions, 804 were deeply nonlinear; i.e., in 804 of the executions of the construction method, step 88 indicated that the method had produced a deeply nonlinear function on its first try.
A similar experiment was performed in which, instead of using the selector function s = s2 ∘ s1 according to the preferred embodiment, function s2 was implemented as a table of 16 1-bit elements with each element chosen pseudo-randomly from the set {0, 1}. Of 1 000 such functions, 784 were deeply nonlinear; i.e., in 784 of the constructions, step 88 indicated that the construction method's first try had produced a deeply nonlinear function.
Finally, a similar experiment was performed in which s was created as a table mapping from G_u^3 to pseudo-randomly selected elements of {0, 1}. In 1 000 pseudo-random trials, this produced 997 deeply nonlinear functions. Thus this method produces the highest proportion of deeply nonlinear functions. However, it requires a sizable table (512 bytes for this small experiment, and 2048 bytes for a similar function f: G^4 → G^4 with the same I/O dimensions as the MixColumns matrix of AES) to store s.
We see, then, that the construction method given above for creating deeply nonlinear functions over finite fields and rings, and in particular, its preferred embodiment, are quite efficient. Moreover, creating inverses of the generated deeply nonlinear functions is straightforward, as we will see below.
2.9.4. Properties of the Above Construction. A function f: G^p → G^q constructed as described above has the following properties:
(1) if L and R_0, ..., R_{k−1} are 1-to-1, then f is 1-to-1;
(2) if L and R_0, ..., R_{k−1} are bijective (i.e., if they are 1-to-1 and onto, so that p = q), then f is bijective; and
(3) if L and R_0, ..., R_{k−1} are all maximum distance separable (MDS; see below), then f is MDS.
The Hamming distance between two k-vectors, say u = (u_1, ..., u_k) and v = (v_1, ..., v_k), is the number of element positions at which u and v differ; i.e., it is
Δ(u, v) = |{i ∈ N | i ≤ k and u_i ≠ v_i}|.
A maximum distance separable (MDS) function f: S^p → S^q, where S is a finite set and |S| ≥ 2, is a function for which, for any x, y ∈ S^p, if Δ(x, y) = d > 0, then Δ(f(x), f(y)) ≥ q − d + 1. If p = q, such an MDS function is always bijective. Any projection f' of an MDS function f: S^p → S^q obtained by freezing m < p of the inputs to constant values and ignoring all but n < q of the outputs, with n ≥ 1 (so that f': S^m → S^n) is also an MDS function. If S is a finite field or finite ring and f is a function computed by a q × p matrix (an MDS matrix, since the vector transform it computes is MDS), say M, then any z × z matrix M' obtained by deleting all but z of the rows of M and then deleting all but z of the columns (where z ≥ 1) is nonsingular; i.e., every square submatrix of M is nonsingular.
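The two definitions above translate directly into a brute-force check, feasible only for very small S, p, and q. The names hamming and is_mds are hypothetical, and elements of S are assumed to be the integers 0 to size − 1.

```python
from itertools import product

def hamming(u, v):
    """Number of positions at which equal-length vectors u and v differ."""
    return sum(1 for a, b in zip(u, v) if a != b)

def is_mds(f, p, q, size):
    """Brute-force check of the MDS condition for f: S^p -> S^q with |S| = size."""
    points = list(product(range(size), repeat=p))
    images = {x: tuple(f(x)) for x in points}
    for x, y in product(points, repeat=2):
        d = hamming(x, y)
        if d > 0 and hamming(images[x], images[y]) < q - d + 1:
            return False
    return True
```

As an illustration over the prime field Z/5 (an assumption made to avoid GF(2^n) arithmetic), the matrix with rows (1, 1) and (1, 2) has all entries nonzero and determinant 1, so every square submatrix is nonsingular and the check succeeds, whereas the matrix with rows (1, 0) and (1, 1) contains a zero entry and fails.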
Such MDS functions are important in cryptography: they are used to perform a kind of "ideal mixing". For example, the AES cipher [16] employs an MDS function as one of the two state-element mixing functions in each of its rounds except the last.
2.9.5. Inverting the Constructed Function. When we employ a 1-to-1 (usually deeply nonlinear) function f: G^p → G^q for some finite field or finite ring G, we often need an inverse, or at least a relative inverse, of f as well. (In terms of [21, 22], the corresponding situation is that we have a 1-to-1 linear function f: G^p → G^q, which will be shallowly nonlinear after I/O encoding, whose inverse or relative inverse we require. However, we can strengthen [21, 22] significantly by using deeply nonlinear functions and (relative) inverses instead.)
We now give a method by means of which such an inverse (if p = q) or relative inverse (if p < q) is obtained for a 1-to-1 function f (deeply nonlinear or otherwise) created according to our method.
For any bijective function f: S^n → S^n, there is a unique function f^{-1}: S^n → S^n such that f ∘ f^{-1} = f^{-1} ∘ f = id_{S^n}. If f: S^m → S^n and m < n, f cannot be bijective. However, f may still be 1-to-1, in which case there is a unique relative inverse f^{-1}: f{S^m} → S^m such that f^{-1} ∘ f = id_{S^m}. That is, if we ignore vectors in S^n which cannot be produced by calling f, then f^{-1} acts like an inverse for vectors which can be produced by calling f.
We now disclose a method for constructing such a relative inverse for the functions f which we construct, whenever L and all of R_0, ..., R_{k−1} are 1-to-1 (in which case q ≥ p). If p = q, then L and all of R_0, ..., R_{k−1} are bijective, and such a relative inverse of f is also the (ordinary) inverse of f.
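This method can be employed when function s (see step 84 of Figure 8) is constructed from a linear function s1 and a final function s2 is employed to map the output of s1 onto {0, ..., k − 1}, where s2 is computed as the remainder from dividing the s1 result by k. (If k is a power of two, we may compute s2 by taking the log_2 k low-order bits of the s1 result, which is a convenience, but is not actually required for our current purpose.) We define linear functions L^{-1} and R_0^{-1}, ..., R_{k−1}^{-1} to be the relative inverses of L and R_0, ..., R_{k−1}, respectively. (Since these functions are computed by matrices, their relative inverses can be obtained easily and efficiently by solving simultaneous linear equations by Gaussian elimination or the like, i.e., by methods well known in the art of linear algebra over finite fields and finite rings.)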
We have s = s2 ∘ s1 from the construction of f. We define s1' = s1 ∘ L^{-1}, where L^{-1} is the relative inverse of L. (Thus s1' is computed by a 1 × q matrix over G_u, easily discovered by methods well known in the art of linear algebra over finite fields and finite rings.) We define s' = s2 ∘ s1'. We now have an onto function s': G_u^q → {0, ..., k − 1}.
The desired relative inverse, or ordinary inverse if p = q, is the function f^{-1}: G^q → G^p defined as follows.
For any W ∈ G^q, let
W_u = extract[0, u − 1](W),
W_v = extract[u, n − 1](W), and
f^{-1}(W) = interleave(L^{-1}(W_u), R_j^{-1}(W_v))
where j = s'(W_u).
When p = q, this is just the ordinary inverse of f. When p < q, the function behaves like an inverse only for vectors in f{G^p} ⊆ G^q.
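The inversion method can be exercised end to end on a small example. The sketch below is illustrative only: it uses the ring Z/(2^4) for both G_u and G_v, takes p = q = 2 purely to keep the matrix inversion short (the construction in the text assumes p ≥ 3), and uses hypothetical matrices with odd determinants. It relies on Python 3.8 or later, where pow accepts a negative exponent with a modulus.

```python
U_BITS = 4
MASK = (1 << U_BITS) - 1          # u = v = 4, so one mask serves G_u and G_v
MOD = 1 << U_BITS

def mat_vec(m, vec):
    """Matrix-vector product over Z/(2^4)."""
    return [sum(a * x for a, x in zip(row, vec)) & MASK for row in m]

def inv2x2(m):
    """Inverse of a 2 x 2 matrix over Z/(2^4); requires an odd determinant."""
    (a, b), (c, d) = m
    det_inv = pow((a * d - b * c) % MOD, -1, MOD)
    return [[(d * det_inv) % MOD, (-b * det_inv) % MOD],
            [(-c * det_inv) % MOD, (a * det_inv) % MOD]]

# Hypothetical example matrices with odd determinants (not from the source).
L = [[1, 2], [3, 5]]
R = [[[3, 1], [2, 1]], [[5, 4], [1, 3]]]
S1 = [1, 1]                       # 1 x p matrix for s1
K = len(R)

L_INV = inv2x2(L)
R_INV = [inv2x2(r) for r in R]
# s1' = s1 o L^{-1}: the row vector S1 times the matrix L^{-1}.
S1_PRIME = [sum(S1[i] * L_INV[i][j] for i in range(2)) & MASK for j in range(2)]

def split(vec):
    """Return (extract[0, u-1](vec), extract[u, n-1](vec))."""
    return [x & MASK for x in vec], [(x >> U_BITS) & MASK for x in vec]

def join(low, high):
    """interleave: low nybbles concatenated with high nybbles."""
    return [a | (b << U_BITS) for a, b in zip(low, high)]

def f(vec):
    v_u, v_v = split(vec)
    j = (sum(a * x for a, x in zip(S1, v_u)) & MASK) % K        # j = s(V_u)
    return join(mat_vec(L, v_u), mat_vec(R[j], v_v))

def f_inv(vec):
    w_u, w_v = split(vec)
    j = (sum(a * x for a, x in zip(S1_PRIME, w_u)) & MASK) % K  # j = s'(W_u)
    return join(mat_vec(L_INV, w_u), mat_vec(R_INV[j], w_v))
```

Because W_u = L(V_u), applying s1' to W_u gives the same value as applying s1 to V_u, so f_inv selects the inverse of the same R_j that f used.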
If we have an unrestricted form for s, i.e., if it is not constructed as in the preferred embodiment above, we can still invert or relatively invert a bijective or 1-to-1 f. For example, if s is simply a table over elements of G_u^p, then if we define a new table s' = s ∘ L^{-1}, the formula above for f^{-1}, but using this different s', remains correct. This new table s' can be obtained by traversing all elements e of G_u^p, determining L(e), and filling in element L(e) of s' with the contents of element e of s.
2.10. Static Single Assignment (SSA) Form. Suppose we write a program in a language such as C. Data will comprise scalar variables, arrays, and structures.
The routines (functions in C terminology) typically act as MFs. Each routine can be represented by a control-flow graph (CFG) in which the nodes denote basic blocks (BBs; i.e., straight-line code segments ending in a transfer of control).
A routine is in static single assignment (SSA) form with respect to its scalar variables if and only if every scalar variable has exactly one dominating assignment. Otherwise, it is in static multi-assignment (SMA) form.
We note that it is not possible, in general, to convert an arbitrary C code routine to SSA form with respect to its scalar variables within the C language itself. The reason is that there can be locations in the code, such as following an if-construct with both a then-alternative and an else-alternative, where two different data-flow paths merge, i.e., where a variable, say x, is assigned in both the then-path and the else-path. A similar problem arises with respect to loops, which can be entered from the top or reentered from the bottom.
To handle this problem, a special form of assignment is added: a φ-assignment. E.g., in the case of the then-path and else-path assignments to x, we could rename the assigned variables x_1 and x_2, respectively, and then, immediately after the merging of those paths at the bottom of the if-construct, insert the φ-assignment
x_3 = φ(x_1, x_2).
With the extension of φ-assignments, it is now possible to convert an arbitrary C routine into SSA form as long as every variable is initialized at the start. (If not, a further refinement, mentioned below, suffices to make the method completely general.) We need not perform the conversion with C extended with φ-assignments (and ω-assignments; see below), but can rather perform it in a further extension in which variables can be subscripted as shown above, so that the corresponding original variable can always be found by removing the subscript.
2.10.1. Conversion from SMA Form to SSA Form. Conversion to SSA form involves
(1) computation of dominators in the CFG, best achieved using the dominance algorithm in [13];
(2) computation of the dominance frontier for assignments in the CFG, which determines optimal placements for φ-assignments, best achieved using the dominance frontier algorithm in [13]; and
(3) conversion of the code to SSA form, best achieved using the algorithm in [14] (minus its dominators and dominance frontier parts, which are replaced by algorithms from [13]), noting that the conversion method in the paper fails unless every variable in the original C routine is defined (i.e., assigned a value) in a position dominating all other uses. If this is not the case for any variable v, we can address it by adding an initial ω-assignment of the form v = ω (denoting initialization to an undefined value) at the beginning of the routine, immediately following the ENTER instruction, prior to the conversion of the routine from SMA to SSA form.
2.10.2. Conversion from SSA Form to SMA Form. The reverse conversion from SSA form to SMA form is trivial.
(1) Each φ-assignment takes its inputs from various basic blocks (BBs) which can easily be identified in SSA form, since there is only one static location at which any scalar variable is set to a value. When the output of an instruction other than a φ-assignment computes a variable which is an input to a φ-assignment, insert a MOVE instruction immediately following the instruction which moves the output of the instruction to the variable which is the output of the φ-assignment.
(2) Remove all φ-assignments and ω-assignments.
Note that the reverse conversion does not recover the original program. It does, however, yield a semantically equivalent program which is in SMA (i.e., executable) form. That is, if we convert an original routine, which is always in SMA form when written in a language such as C, to SSA form, and then to SMA form, the final SMA-form of the program is unlikely to be identical to the SMA-original, although it is functionally equivalent to it.
3. BASE FUNCTION PAIRS AND THEIR IMPLEMENTATION
[Figures imgf000080_0004 and imgf000080_0005: opening paragraph rendered as images in the source; its final words are "inversions for the provided implementation."]
3.1. White-Box Trapdoor One-Way Functions: Definitions. We define what a trapdoor one-way function is in general, then deal with the white-box case, and finish with separation of the entropy in implementations into key and randomization entropy.
3.1.1. Trapdoor One-Way Functions. We start with the following definitions taken from [31].
f: X → Y is a one-way function iff f(x) is "easy" to compute for all x ∈ X, but it is "computationally infeasible" to compute f^{-1}(y) for "almost all" y ∈ f{X}.
f above is a trapdoor one-way function iff f is a one-way function and, given s, it is computationally feasible for any y ∈ f{X} to find x ∈ X such that f(x) = y. (If f is a bijection, f^{-1} is such an s.)
3.1.2. White-Box Trapdoor One-Way Functions. To the above standard definitions from [31], we add the following non-standard definitions for the white-box attack context.
f above is a white-box one-way function iff f is designed to be implementable so as to have the one-way function property under white-box attack. Similarly, f is a white-box trapdoor one-way function iff f is designed to be implementable so as to have the trapdoor one-way function property under white-box attack.
Bijection f: X → Y is a one-way bijection iff f(x) is "easy" to compute for all x ∈ X, but it is "computationally infeasible" to find x = f^{-1}(y) for "almost all" y ∈ Y.
Bijection f above is a trapdoor one-way bijection iff f is a one-way bijection and, given s, it is computationally feasible for any y ∈ Y to find x = f^{-1}(y). (For example, a symmetric cipher f which has been partially evaluated with respect to a key s is a trapdoor one-way bijection: the secret information is the key s.)
Bijection f above is a white-box one-way bijection iff f is designed to be implementable so as to have the one-way bijection property under white-box attack. Similarly, f is a white-box trapdoor one-way bijection iff f is designed to be implementable so as to have the trapdoor one-way bijection property under white-box attack. (For example, an effective white-box fixed-key implementation of a symmetric cipher f is a white-box one-way trapdoor bijection: the secret information s is the key.)
N.B.: A white-box one-way bijection implementation is an implementation of a bijection f such that, given that implementation of f, without secret information, it is hard to find a computationally feasible implementation of f^{-1}. (The specific secret information could itself be a computationally feasible implementation of f^{-1}.)
3.1.3. Key-Entropy and Randomization-Entropy. We find two functions f_K, f_K^{-1} such that, given a particular construction algorithm, a definition of f_K and f_K^{-1} can be found given only K. K is the key-entropy.
We then find two implementations p_{R1} and q_{R2}, where p_{R1} implements f_K and q_{R2} implements f_K^{-1}. R1 and R2 provide randomization-entropy which does not affect functionality: it is only used to determine how the functionality is represented, in hopes of obscuring the implementation sufficiently to prevent cracking of the implementations for a period of time.
p_{R1} and q_{R2} comprise our mutually inverse base function implementation pair.
3.2. Security: History, Theory, and Proposed Approach. Our initial at-
Figure imgf000081_0008
li has brfo «i<*nn*ii».ir«ti:¾i «¾ir|tp:i%**| ifui-i. f his «-Xf si:«|.i<ni »*. W**»: miiv st h f;tils ΙΝ¼*Ι%ΙΙΛ -!ΙΛ«Ι»ΙΪ¾Ϊ> ¾ΪΙΪ|*!Ι:« <«ίΛ¾ίι t» «<* !j*i!n|ili* s-itmigti to «ί»Ι>
Figure imgf000081_0009
!":.;* k l ns 1<> s««¾ <'&¾I *«, IIJ wl«**ii «Ί« κϊϊη 3ί ia ««t pley * i*rt, l«af in wli!rtt th*'V w« k to ivci ri wifls i!yaniia*- |if¾i*BBJiatie iwsikAat«iffe Ί 'stuffs **.!¾»¾ . ro t iifM' ili}., t»r*«iwt»"r «l*t,*
Figure imgf000081_0010
.· · · Mai »yh!»ii» i»¾t«| to J.;ffli¾IeJlii» ilff* *llll)*:!||t. V«* llaV*' firiAVll, «ftJiJ|ii *,. thai frtltt!iiliHK-)' itu r*-.vi hnUt"ity f f.I nis fv*r *ηίκ4-Ηο»· flatif-tiiaf In ih*>
Figure imgf000081_0011
*il f*ur«li ;f - APPENDIX
More generally, we have Rice's Theorem, which states that, for any non-trivial property of partial functions, there is no general and effective method to decide whether a given algorithm computes a function with that property. ("Trivial" means that the property holds either for all partial functions or for no partial functions.) The theorem is named after Henry Gordon Rice, and is also known as the "Rice-Myhill-Shapiro Theorem" after Rice, John Myhill, and Norman Shapiro.
An alternative statement of Rice's Theorem is the following. Let S be a set of languages² that is non-trivial, meaning that:
(1) there is a Turing machine (TM) which recognizes a language in S, and
(2) there is a TM which recognizes a language not in S.
Then it is undecidable whether the language decided by an arbitrary TM lies in S.
Rice's Theorem applies only to linguistic properties, not operational ones. E.g., it is decidable whether a TM halts on a given input in < k steps, it is decidable whether a TM halts on every input in < k steps, and it is decidable whether a TM ever halts in < k steps. However, the general impossibility of virus recognition is linguistic, and Rice's Theorem implies that a perfectly general virus recognizer is impossible.
Patents in Irdeto's patent portfolio include software obfuscation and tamper-resistance implemented by means of data-flow encoding [5, 7, 8] (the encoding of scalars and vectors and operations upon them), control-flow encoding [6] (modification of control-flow in a program to make it input-dependent with a many-to-many mapping from functions computed and chunks of code to compute them), and mass-data encoding [20] (software-based virtual memory or memories in which logical addresses are physically scattered and are also dynamically recoded and physically moved around over time by a background process).
Of the above methods for obfuscating software and rendering it tamper-resistant, data-flow encoding is primarily a static process (although variable-dependent coding, in which the coefficients for the encoding of certain variables and vectors are provided by other variables and vectors, renders it potentially somewhat dynamic), whereas control-flow encoding and mass-data encoding are primarily dynamic: data structures are planned statically, but the actual operation of these software protections is largely a result of the dynamic operations performed on these data structures at run-time.
The control-flow encoding was primarily aimed at (1) reducing repeatability to foil dynamic attacks, and (2) protecting disambiguation of control-flow by burying the normal control-flow of a program in considerable extra control-flow. The mass-data encoding was originally aimed at finding an encoding which would work correctly in the presence of a high degree of dynamic aliasing: e.g., in C programs making aggressive use of pointers.
A difficulty with the above forms of dynamic encoding is that the support data structures (dispatch and register-mapping tables for control-flow encoding, virtual-memory en/de-code tables and address-mapping tables for mass-data encoding) themselves leak information.
We propose to use protections similar to a combination of control-flow encoding and mass-data encoding, but with a sharp reduction in specialized data-structures, by moving most of the specialized processing which they support from run-time to compile-time. Let us call this new form of protection, with much of the dynamic
² A language is a set of strings over an alphabet. An alphabet is a finite, non-empty set.
Wif|i*t«iny ©ί nrntt >i-fi&w mm m§ mA
Figure imgf000083_0001
efm.o4m , h wit h !*| ¾*ΙΑ1Ι «| llmi f!ic* Itiii!H to littgiii ti
Figure imgf000083_0002
3.3. Choosing Invertible Matrices over Z/(2^w). To create an invertible matrix M over Z/(2^w) and its inverse M^{-1}, we proceed as follows.
Choose an upper-triangular invertible matrix with nonzero elements on and above the diagonal, where an n × n upper-triangular invertible matrix U = [u_{i,j}]_{n×n} is chosen so that
• if i < j, u_{i,j} ← 1 + rand(2^w − 1),
• if i = j, u_{i,i} ← 1 | rand(2^w), and
• if i > j, u_{i,j} ← 0.
Since all diagonal elements are odd, U is certainly invertible.
Independently choose two such random upper-triangular matrices X, Y. Then M ← X Y^T and M^{-1} ← (Y^T)^{-1} X^{-1}.
This approach ensures that the computation of inverses is very easy, since all inverses are computed on upper-triangular matrices, which are already in row-echelon form where all leading row elements are units of Z/(2^w).
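The procedure of this section might be sketched as follows. The helper names are hypothetical, rand(n) is assumed to return a uniform integer in the range 0 to n − 1, and the modular inverse of each odd diagonal element is computed with Python's three-argument pow (Python 3.8 or later).

```python
import secrets

def rand(n):
    """Uniform integer in range(n); assumed meaning of rand() in the text."""
    return secrets.randbelow(n)

def random_upper(n, w):
    """Upper-triangular matrix over Z/(2^w): nonzero above, odd on the diagonal."""
    mod = 1 << w
    return [[(1 + rand(mod - 1)) if i < j else (1 | rand(mod)) if i == j else 0
             for j in range(n)] for i in range(n)]

def mat_mul(a, b, w):
    """Matrix product over Z/(2^w)."""
    mask = (1 << w) - 1
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) & mask
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def upper_inverse(u, w):
    """Invert an upper-triangular matrix with odd diagonal by back substitution."""
    n, mod = len(u), 1 << w
    inv = [[0] * n for _ in range(n)]
    for j in range(n):                      # solve U x = e_j, column by column
        for i in range(n - 1, -1, -1):
            acc = (1 if i == j else 0) - sum(u[i][k] * inv[k][j]
                                             for k in range(i + 1, n))
            inv[i][j] = (acc * pow(u[i][i], -1, mod)) % mod
    return inv

def random_invertible(n, w):
    """Return (M, M^{-1}) with M = X Y^T and M^{-1} = (Y^T)^{-1} X^{-1}."""
    x, y = random_upper(n, w), random_upper(n, w)
    m = mat_mul(x, transpose(y), w)
    m_inv = mat_mul(transpose(upper_inverse(y, w)), upper_inverse(x, w), w)
    return m, m_inv
```

The sketch uses the identity (Y^T)^{-1} = (Y^{-1})^T, so only upper-triangular matrices are ever inverted, as the text describes.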
3.4, Virtual Miicliiiie awl Itwtiwtk Set, Th«- jJi |«w«l fatis tif «ii|>li'}.sj*-tti*tki¾« « *t.¾i«i!iJsiitfir mid »ΐ|»«»ιΐ*«ί.·ι! «*» tti fri i*' (i<H8w« if frfiiii t IN* poiiwl iip ia it-mm *4 * vittmi m&thi ( uji NT,-!.-
Figure imgf000083_0005
.«4
Figure imgf000083_0006
¾1if»f»* li * fl«|irt||! Λ if ίδΐ <*f USSJ #4 l»t t» 32*
*ΰ!β... Tls<- vvt ifi*{nirtK>!i c|»f«f m'ifl^ssi! *i . <¾m ,. nifi*tf*f¾*iH¾ 3f-lj|!
3,1.1. l»i>frw'ftni«„ Itcot
Figure imgf000083_0007
et iipst fldkl*. a 2*Wiit fhf*» l-hil ifi'inl llag:s U, l„2, . awl ihr.c 32-t»il o|«¾«atts c w aad J, optv ml 2, .".,· , - ./ ' .» I of wiiirlt is Ι«#τ&! if tbf «*r¾«¾|M:*» tt«j¾: M ml *g t* «*t »wi * rt¾is!c*r iiiiiiilMT » Iwml^- (ΗΨ fig.2 ) All \ %i iii*tfitrtic»w i» t tils for ιιιιιΐ. , rtcf'f.rt ffif EMjji , κϊ l' rr, fiittrii ti t)«- ff»fti:»i Λί**ΐι Hi Fig ΙΪ ), *i!h a 2i«htf R»:iW !¾>!<!ΙΪ¾ a, ««ii.rt k, S-bi? t» . :*¾|.\ i¼rw l-!»it literal Is^ ^II «:>t to 0, alul : 32- Hit r»¾i*t?f Ran,i »r»,. APPENDIX
3.4.2. Implied Instructions. Implied instructions, each based on special use of a basic instruction, are shown in Table 4.
The set comprising the root and implied instructions comprises the VM's basic instruction set.
Figure imgf000084_0002
!rttigiiiig*' ίΐΐί|ΐ|ι*ϊ«ιΐ«ΐι*η*,. *s nil un'^*:.«-t;«- -^u^! Ι*Ϊ - <·Ι ? «tbi»i«t«' otPtb t AS«j, Mil.. im\ ANO, OR, xofi, Kit, tm u;i\ t'cr:., t sw, ¾tt*i l.l^l-l «»?Ϊ*§ΜΜΙ to t* ? iu.r» >·}« J ;»*··: * ». /, X, *, I . **,— s <. <·, >. >«, «, », r»|wiiwlv with asst *i 1st t. ^ ^ ;,:^ ttt ι,Γ.?.1(Τ4*1 int*. x::.::.,rlv \f · vt; «.,;r, ,|.,«;Ί!..';:...ι (*.γ-: s'.-r- - i -h *,US. ::.M.;T usilgs^j «r.X !·. iV.-.*>h *-! ; -I ί - . J .
1*1 <, <·. >. >· with 32- Wl Sat ci|Mfii»:fe, Haft, axst ·. - .- > L-I *<> C* << > witli S2-I«t fat o vntwls and a jxxsitivf* iitiifl*«rttet,. »n l « «
€| Ι,ΟΛΙΙ, STliilf:, Jt Ml*, Π Mi t! M„'\/ -J* UP- lV,-> ]::<> , . Vt
Figure imgf000084_0003
Tliiw t *» .» · < · tr«*i«»iM!i ·{«·»· fjt«t*w»ti the atww* VM iii»tiu- » »
34,1. ,¾ir«» ·;.- 'ί<·.\ U-..- ·· »;,>!Μί r:.»is- t,- >;>^. n*. in
tiirtiiaffon of in ikm perw. Ei«:¾ iiiiw ** Iteiiu. <>:....n wty *«ssil lw*
«a p..i i;.
Figure imgf000084_0004
iintf jmwliifiii:!* ».n titfiijt vi tor -«»|»rij«iog rt*¾is:i**r ll. , , , . ,4, '?,
Til» if!ijf* t.if IJKffsi4} ;·,'!:'.·.»*ι"' \ .'¾;·;■·- « ·ί ■·»»<- Of li M* MA3i:V 5»'^;-?"f - APPENDIX
Opcode  Mnemonic  Operands             Effect
0       HALT      (ignored)            halt execution
1       ADD       d_1 ← i_2, i_3       d_1 ← i_2 + i_3 in Z/(2^{32})
2       SUB       d_1 ← i_2, i_3       d_1 ← i_2 − i_3 in Z/(2^{32})
3       MUL       d_1 ← i_2, i_3       d_1 ← i_2 × i_3 in Z/(2^{32})
4       DIV       d_1 ← i_2, i_3       d_1 ← ⌊i_2 / i_3⌋ in N_0
5       REM       d_1 ← i_2, i_3       d_1 ← i_2 mod i_3 in N_0
6       AND       d_1 ← i_2, i_3       d_1 ← i_2 ∧ i_3 bitwise
7       OR        d_1 ← i_2, i_3       d_1 ← i_2 ∨ i_3 bitwise
8       XOR       d_1 ← i_2, i_3       d_1 ← i_2 ⊕ i_3 bitwise
9       EQ        d_1 ← i_2, i_3       d_1 ← [i_2 = i_3]
10      NE        d_1 ← i_2, i_3       d_1 ← [i_2 ≠ i_3]
11      ULT       d_1 ← i_2, i_3       d_1 ← [i_2 < i_3 unsigned]
12      ULE       d_1 ← i_2, i_3       d_1 ← [i_2 ≤ i_3 unsigned]
13      SLT       d_1 ← i_2, i_3       d_1 ← [i_2 < i_3 signed]
14      SLE       d_1 ← i_2, i_3       d_1 ← [i_2 ≤ i_3 signed]
15      LLSH      d_1 ← i_2, i_3       d_1 ← (i_2 × 2^{i_3}) mod 2^{32} in N_0
16      LRSH      d_1 ← i_2, i_3       d_1 ← ⌊i_2 / 2^{i_3}⌋ in N_0
17      ARSH      d_1 ← i_2, i_3       d_1 ← ⌊i_2 / 2^{i_3}⌋ in N_0; then set (i_3 mod 32) high-order d_1 bits ← ⌊i_2 / 2^{31}⌋
18      LOAD      d_1 ← i_2, i_3       d_1 ← M[(i_2 + i_3) mod 2^{32}]
19      STORE     s_1 → i_2, i_3       M[(i_2 + i_3) mod 2^{32}] ← s_1
20      JUMPZ     i_1? i_2, i_3        PC ← i_2 + i_3 in Z/(2^{32}) if i_1 = 0
21      JUMPNZ    i_1? i_2, i_3        PC ← i_2 + i_3 in Z/(2^{32}) if i_1 ≠ 0
22      JUMPSUB   d ← i_2, i_3         d ← PC + 4; PC ← i_2 + i_3 in Z/(2^{32})
23      ENTER     d_1, ..., d_k        enter routine, setting registers d_1, ..., d_k to the routine inputs
24      EXIT      s_1, ..., s_k        exit routine, taking the routine outputs from registers s_1, ..., s_k
Legend:
s_1, ..., s_k   source operands: registers; can't be literal
d_1, ..., d_k   destination operands: registers; can't be literal
i_1, i_2, i_3   input operands: register contents or literal
M[a]            source/dest memory location, address = a
x ← v           replace contents of x with value v
PC              program counter (address of next instruction)
TABLE 3. Virtual Machine Root Instruction Set
inner sequence is modified by the outer macro instruction of which the sequence is a parameter.
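The effect column of Table 3 can be read as executable semantics. The following is a minimal sketch in Python covering only the three-operand arithmetic, logical, comparison and shift instructions; the names ROOT, execute, signed and arsh are illustrative assumptions, and operands are assumed to be already-resolved 32-bit words (register contents or literals).

```python
MASK = 0xFFFFFFFF  # 32-bit words, no overflow checking

def signed(x):
    """Interpret a 32-bit word as a two's-complement integer."""
    return x - (1 << 32) if x >> 31 else x

def arsh(i2, i3):
    """ARSH as in Table 3: logical shift, then copy the sign bit into
    the (i3 mod 32) high-order bits of the result."""
    d = i2 >> i3
    k = i3 % 32
    if (i2 >> 31) and k:
        d |= (MASK << (32 - k)) & MASK
    return d

ROOT = {
    "ADD":  lambda a, b: (a + b) & MASK,
    "SUB":  lambda a, b: (a - b) & MASK,
    "MUL":  lambda a, b: (a * b) & MASK,
    "DIV":  lambda a, b: a // b,
    "REM":  lambda a, b: a % b,
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "EQ":   lambda a, b: int(a == b),
    "NE":   lambda a, b: int(a != b),
    "ULT":  lambda a, b: int(a < b),
    "ULE":  lambda a, b: int(a <= b),
    "SLT":  lambda a, b: int(signed(a) < signed(b)),
    "SLE":  lambda a, b: int(signed(a) <= signed(b)),
    "LLSH": lambda a, b: (a << b) & MASK,
    "LRSH": lambda a, b: a >> b,
    "ARSH": arsh,
}

def execute(op, i2, i3):
    """Return the value written to d_1 by root instruction op."""
    return ROOT[op](i2, i3)
```

For shift counts below 32, arsh agrees with an ordinary arithmetic right shift of the signed interpretation; the literal two-step form is kept so that the sketch follows the table row rather than a particular host language's shift operator.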
The macro instructions comprise the following:
SAVEREGS [] s_1, ..., s_n → r
Store s_i in M[d + i − 1] for i = 1, ..., n; then set r ← r + n. Typically, r would specify the stack pointer register.
RESTOREREGS [] d_1, ..., d_n ← r
Set r ← r − n; then load d_i from M[d + i − 1] for i = 1, ..., n. Typically, r would specify the stack pointer register.
Implied Instruction      Translation
MOVE d_1 ← i_2           OR d_1 ← i_2, 0
NEG d_1 ← i_3            SUB d_1 ← 0, i_3
NOT d_1 ← i_2            XOR d_1 ← i_2, (−1 in Z/(2^{32}))
UGT d_1 ← i_3, i_2       ULT d_1 ← i_2, i_3
UGE d_1 ← i_3, i_2       ULE d_1 ← i_2, i_3
SGT d_1 ← i_3, i_2       SLT d_1 ← i_2, i_3
SGE d_1 ← i_3, i_2       SLE d_1 ← i_2, i_3
JUMP i_2, i_3            JUMPZ 0? i_2, i_3
TABLE 4. Implied Virtual Machine Instructions
LINEARMAP [M] d_1, ..., d_m ← s_1, ..., s_n where M is an m × n matrix over Z/(2^{32}), which, where s = (s_1, ..., s_n) and d = (d_1, ..., d_m), computes d ← Ms, with d, s treated as column vectors during the computation;
SHORTROTATE 0 dn — ½ : stt
where, using register or literal io and letting
*— (- l)totBlxl2( ¾/2J Λ31) .
with Λ computed bitwise over B"u, which computes, over Z/(2'i ),
d, < fc + I2fc-;¾tj
for ί = 1..... n when k > 0. and computes
Figure imgf000086_0001
for i = 1 , .. , . n when * < 0:
LONGROTATE [] rf3..... n <- ½ : 8|,„.,in where, osing register or literal o and letting
{ l>t nfc'da (|ij A3l) ,
wit h Λ computed bitwise over B '-\ which, letting
S =232(fi"'* ¾ and ^ =∑23¾s ¾ i~l t'=l APPENDIX over No; computes, over Ζ ίί3"),
D *- 2*S + )2*""32nSi
for = I n wh n k > 0, and compotes
/? <_ 232n+*s +· !2*s:
for i = 1..... fi when fr < 0;
SEQUENCE [i_1, ..., i_k] d_1, ..., d_n ← s_1, ..., s_m where i_1, ..., i_k are instructions, which computes the instructions sequentially, where s_1, ..., s_m are the registers from which the sequence inputs and d_1, ..., d_n are the registers to which it outputs;
SPLIT [c_1, ..., c_k] d_1, ..., d_k ← s_1 where 2 ≤ c_i ≤ 2^{32} − 1 for i = 1, ..., k and ∏_{i=1}^{k} c_i ≤ 2^{32} − 1, assigns
Figure imgf000087_0001
for i = 1, ..., k, where we assume ∏_{i=1}^{0} e(i) = 1, where e is an arbitrary function; i.e., the product of no elements at all is always 1;
CHOOSE [i_1, ..., i_k] d_1, ..., d_n ← s_0 : s_1, ..., s_m where i_1, ..., i_k are instructions, which computes the single instruction i_{⌊2^{-32} s_0 k⌋ + 1}, where s_1, ..., s_m are all the registers which are used as inputs by any of i_1, ..., i_k, d_1, ..., d_n are all the registers which are used as outputs by any of i_1, ..., i_k, and any register which may or may not be used as an input or output by the CHOOSE instruction must appear as both an input in s_1, ..., s_m and an output in d_1, ..., d_n; where the expansion of a CHOOSE macro instruction into basic VM instructions uses a binary tree of comparisons as close to balanced as possible: for example, if k = 16, there are 15 branches in a balanced binary tree, each having the form
UGE d ← s, n
JUMPNZ d? T
where T is the label to which control transfers when register s ≥ n, and the value of n at the tree root is 2^{31} (splitting the upper and lower half of the values), at its left descendant is 2^{30} (splitting the first and second quarter of the values), and at its right descendant is 2^{31} + 2^{30} (splitting the third and fourth quarter of the values), and so on;
REPEAT [a, b, i] d_1, ..., d_n ← i_0 : s_1, ..., s_m where i_0 is either a literal or a register, executes instruction i a + ⌊2^{-32}(b − a + 1) i_0⌋ times;
PERMUTE [JH ..... pfc] rfi , .... dn— Sol «ι,.,.,ί» APPENDIX wbtw tmrfi of |¾ , ... , |½ is a Μτιηι.«»ί n of ( I , .. ,,M!. m urii f*»r Imi 4,■ »|tt , f : J'*r I !.. . fi, wbn * (2 :i ^ * I :Μ· , it pa* tfi «j„ . .. *{> |¾·πϋΐ*ι«ίΐ*»ίι i'f rfj,. -i. t4 :i*>4
Figure imgf000088_0001
with ail .'Xfi»fr<*>B ss»o ki*ie % %t $t»Ii«Cl!ts|.» t»«I¾ tb*« «uii*' ¾s nearly H> § »«iMy ataa*.«! Irtnitf* fr»v of .¾i§Mi MJtia »»! t mi: l( i, ..... rB 1 <i ........ rf„ · .« *
ENCODE [rt , ... , cj rfj .... , * *,
Figure imgf000088_0002
1, . ,ί'ι, with the iOi I fiMrtMMi *r<:*fc*fi*i«-** !»*li»g an jiii/ii Λ|ρ1ι«ιί>ί«η «4 fit*-* fAitef thus AH
Figure imgf000088_0003
y.c... fiartur*-} t hcM»' fU
Figure imgf000088_0004
FILV t'lf EfV J , di ,dn - .*$,.....»„
«f¾. Mi« rf, · <:,i*. ' f<»f l 1, . , »ϊ!, wit I hi* litij!i-iti* ti*<t*t|tJll §>D''it*f!'¾*''«' i'*Pttl¾ «¾l.i tmplmt %i «»„ ?·.*♦ a|>|>!;. .;;· >s* ·? *!.«· * , »ΙΙι« iti« A» * jf»Ii«i A | i«*ti«.»fi of its* # a Jul
!u i tlifjt St ill** ikfinr, a, III I tc> *»lit*Bi tf * rshjA r«-s.ilt, by a «tn|ililW \.>r..t;* >*f .>·■ *.■·*.■ * > « ---.i I*.4* 'ΰ.:;·,*.,^ slifii ¾ »>t
Figure imgf000088_0005
Mj« Uv««, { iiinlv i» hijwrtiv** u ΐί»ιΛ.Ι. ]·».►· 5. &mt uutoj
Figure imgf000088_0006
ifi.f. f j. for i * {Wt, ngfii with ·, rtunpiiiing · h|j«;*iiv»- n«i ii,u»I Willi i> projwltiJii t €(Ι»!ί«ϊΐ«Ι Ιή' aibilfanly ftxj.}i* <? « · ;'i; j!tts¾ jt
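Some of the macro-instruction semantics listed above are easy to check numerically. The sketch below is an illustration only: it assumes the CHOOSE selector is read as ⌊2^{-32} s_0 k⌋ + 1 and the REPEAT count as a + ⌊2^{-32}(b − a + 1) i_0⌋, it assumes k is a power of two for the comparison tree, and all function names are hypothetical.

```python
MOD = 1 << 32

def linearmap(M, s):
    """LINEARMAP: d = M s over Z/(2^32), with s treated as a column vector."""
    return [sum(m * x for m, x in zip(row, s)) % MOD for row in M]

def choose_index(s0, k):
    """CHOOSE: 1-based index floor(2^-32 * s0 * k) + 1 of the instruction run."""
    return (s0 * k >> 32) + 1

def choose_by_tree(s0, k):
    """Same selection via a balanced tree of UGE-style comparisons.
    Assumes k is a power of two, as in the k = 16 example."""
    lo, hi = 0, MOD          # range of s0 values still under consideration
    first, count = 1, k      # candidate instructions first .. first+count-1
    while count > 1:
        n = (lo + hi) // 2   # 2^31 at the root, then 2^30 or 2^31 + 2^30, ...
        if s0 >= n:          # UGE d <- s, n ; JUMPNZ d? T
            lo, first = n, first + count // 2
        else:
            hi = n
        count //= 2
    return first

def repeat_count(a, b, i0):
    """REPEAT: number of executions, between a and b inclusive."""
    return a + ((b - a + 1) * i0 >> 32)
```

With k = 16 the tree performs four comparisons per selection and contains 15 comparison nodes in total, matching the count given in the CHOOSE description.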
MoA'Mlng Program Strticmro.
Figure imgf000088_0007
n4
4m§<m Imhik I ' .,
f' *r i*v*'f V i! tirti s otlw fli«n !lie lifsiiA iiistftrfi«i» MI*. Mvmt ntt'i J«;. I*N'Z. t * ispxi itiiilf«rii« !i «w!it«! I* tb eft1 sn«a<«ils *-i fi!I«>¾:»¾ t * f.-iiir«!t ii^irwiiua, 1 «s, in u-raw of «..rtitri.il aw, ?!ϊ·* »rc¾rAiii imt « *rapli i.a *¾trb ifni* iifriiiglil-iifti8 lmaki* o «K1C* wfaicti *r<* unly **ntt-fi««i at lit** !«¾ρτ$!.ϊΗ§¾ ¾»1 only It'll hi tin' ml, vi fxml i*> imim in m g aph
Figure imgf000088_0008
ii'tili a fte*i<" lmk (Hiif: a »¾ iipncf* * V\i ίί ϊΐτΙΐ'**»» n tilcfe (Ι « list iii-uwikm * «t!« » >- ttrs. iiiit}itirt«:«i tfi lotiliw {always* nn r. TTM !i.*irii<l»je r t$ I!M» APPENDIX
niways a !>r«jj<ii <ii «n f¾tf instruct kw.
W<*
Figure imgf000089_0001
.«ΙΪΪΙ, fin rtufe tit ».» i;x?T ii»ft tirticei, ^stlta* t!««t«:* ts rtartlv $¾tf.» s»tf η«.-ΐΗ»!ΐ f..«w* $xrr iwtfttcikw hi ί!κ» roiilas*.
Within A IT. *«t t»i#*acli i»rfr¾-fto», ify* tnM w!ut h flu* *kfiti£ <>| iii*fi'ti< !Mt*
Figure imgf000089_0002
t *«» IT*ws»- a e *4 Jhw ksrtd
i i > An i!¾ifiieik« f i* tripii!»p¾!«!.iij*-i*i fil «11 AM a&i i iirtiee jr Iff §< Ι.Λ¾« m m Input * viiJiii' j»f«iiif«i *» «i t!itifpiif I >, - ..* -, ir |-Γ·- ->if ',<-:' :<:* ;, , lis Λ ! .¾inti?»!n fi*i¾i y
Figure imgf000089_0003
« vr"UL* iB,*ti¾* i« i / 1 If ;·/ Ιι<·κ1> f» !ro!si » ¾ti*>« m!« wtu«-lt
Figure imgf000089_0004
j M<*t«f tb#»t VHI***. *
If y t» fcKui-Sf fi* »J*'| *!iili>Tii «8 mil its lfirt sfi **!lict» l!.*:£l« f»r*:»<}r-j:«' . ttt ί¾
Figure imgf000089_0005
T.
T¾<w ilffwfcww n^'ar tiwii j \v> *«x«rnmi afirt 1, Tb^ <>*.i< .*::->; ··! « l»lcM¾ I» IIMSJ i%.%t l to »» «^««ΐ»!**»! with a* *-.-<-i^-„ ¾- «¾t %4 ϊΐκ*
Figure imgf000089_0006
Tt*» |i:K¾:»»fiI ipjiJM-s ¾»» of ti» kik mg.
|i :
Figure imgf000089_0007
< li«M'w «!*f iiutatwtv tt» MM «1 ir;<tlf ifily ηύ #ti t»!»ig» to rtepiirt'. ►·*,«· '
Tli will i!Mtfc4 ilii* sJ«:*p ii :oli:iamriiy ι>ί Ιΐκ* ϊϊ«ίί.ιΐΑΪ¥ iijwr«* f¾i :tkiii
Figure imgf000089_0008
»iiikiti¾ dif vf ilf«ti cat I lie a!i' «fiys¾ iiiii tk** ΪΗΙ »*£» .iitlv'iA ... tl≠' j s '.«.. f¾» not «»ΐίι|Μ»Α' ! be* loi; f# tis«*y ar** all 1-tolJ,
Tti# *|i»rt of ifiM'ttifti* I» t tet f **t^n tint* ι«»¾Ιϊθ*«ιίι. o| i»s- APPENDIX
( I > Paii *** tailing of miiiiti mpmt% m*I final i i
Figure imgf000090_0001
tbr* «sf» turn- Ut n ft«ilttftg fr«*m t « .< i. «·ν-»|ν iiwc ft |i»|?tii tuitpiit of *»\ lll ti!* a»| utput* IM } |, .: , , ?r f ¾K*tK*ft> «r|*h ' itU*|>|*lf!g
Figure imgf000090_0002
in i'tg, 4 «i p. $0 ί«ι rh * w!* r<- tb> ami ii lly
; >i τ-°· Ή ' sn.*;> ·χ·' ···;-
Figure imgf000090_0003
«*ί ' Is- ;Λ Ι
<-t<** for
Figure imgf000090_0004
40 IV to|s«t. <-1·-«ί mr »ss«! In pair* >«· 2 » 2 arixisisi
i«!i(ift¾, a!iirti <*t»«S4 !«· fjrfft.f js»i| !*y
Figure imgf000090_0005
ί §(|>ι
a m¾* !i 2 -(2¾) «r 1 Ί ssw ix .t ! <>a«
»!r ,f H|»». -fstl tic <-;«*χίί h!f rmfi-twMi!
:* nm8«.tt<.« j*ilyih*m.*K tta* SsieA* <-|» s?s«>si>» ><<«! i« !«> **- <<ο&Μ &f- '«si t>¾ tlis» oa r* t* isw*f or
iwMsiifin «i wte vi*ii«?ii.- ;mfc|wirf, M im *4 * .·<·«*·{«« to *
j«14ceitK m*\ n«<j!s|4is»t?krti «hl»- » wtik*»«J.
tt |»i»«i« km fr ιηρκ* 1.3.5.7 3.3 t»
i.ssfmt* 2J,e.» s* M. Bv B inx κιρηί»; I 2, 3 !, ·". 7 s, «»i a |10| f»-ki« tnixlai wt!j-at1* S i, S 1. < .7 , vi' > ι¾«ίϊ·· *wh pr«j« st«* ff*»!st sii!<¾is t<. mi »«»f '<»;■ s«» i 1 » « :i«i»<i;!.j .|»>
Afl«t »t«t» * ti««>n.i*f¾4s,iit~, tif *i& Is !,'i.S,7 wii - wit! «l v the I f mt^>i |¾ily»twil¾I!y «»»· itiiiif4 tmg. 'f lu* ΒΙ*?| <-x»cf!.¥ its*1 wsit
Mi ·«*!!*· lias ··;ι |2.9 **
2 '{ $wit<A (wrkiin the «' -t «<»tjf¾, i>> (i!f i¾i'ri!i«Sie!«
Figure imgf000090_0006
<¾!! .'i 576 »1 »». W f its tt,v i -i«atig «{ l-xtuiiilMiic*. ί !η¾ί',,¾ί«·
nu,n.fn'f in i utfi - y mMi A- <<( wt rtw ΊΪΜ. ¾;¾S W»>
\η·*.
by ..i ! sl* 'Switch',
4.1 OtMt It-ft rf<ii'«s Fig. S -tss«tlsi:'is:!|>:i«*to/il:i Ous s f«|
Βΐ«|.ί|ίι¾ Iii >wi a!rjt t, a to»»¾ kind
; j ¾> fof «a¾ ϋ.βί'»<:ίι!«| ί:ϊ*¾»!.>.?β <. tin*** *¾n»
t ifH- to fill Tro !![<* lisinilxf «f p:s««it(|i» gi «*«^ !«ί!«ί<
r' bwp r tin- w»f stsi< i ,· 2'" , Xw APPENDIX
i'inplrt' an / ίΐΐβ|:>|»ίηΕ 4- fc§«« I vm- m wl §t kuwtk !a»|ipii¾ i-v«-|* rs to |-*wfi>rs ff» f !.,,,,«. tt*- mtil <!>^ u-.^ fin.-I ,.Jjt*-t;.,« r th«< * ·ι1ρ! ton
Figure imgf000091_0001
411 Th » ίΐι|κΐί vvfiif ilc tK-Bfs *rr iiilxwl ?:i |»i,if» by 2 χ 2 u xilig. efM-*- >*.:·*;** wliic'li IM- |»*fftaiiK*t !w «ϊα:ꫧ wi&fxix OiApfi«¾o», hop* th*:*
) o? 1 'lit'*}. Τ1<· maim opwafion?* wr ** ?iso♦·?.« <· in - HT«» mypfMs ri
Figure imgf000091_0002
»»rmntiif: tt p yiwms O ly rtiw* kmd> < { ιφ^Γ?»ί }*¾.*> n>i«t to I»· on- ct«t«<! *» te»¾¾t tb** «ia |»¾¾ t/*ibw *l w
Figure imgf000091_0003
t* U«m? «-r nitm**; »*|*Itii«»i» of two i taNci IMMVW). ί*<ϊ<1ι? ο∑ι « ii mit^i iit t" Λ mriiM*'5
Tin* iHirpei* of tip8- 2 x t mixing «* litis. If < ,· n- 1 ;:.·.1 ",. *viug i* ιρ ΙΜ Hiavofy, t!ic ·ί ]«ιί«ι from Hi *;!*. I «1,5.7 to otiipijfs 1,3,5,7 | iiM-itr up tt* i/ oticfidwig, ϊΙί**«ΐΙι itn* t>>j^ii«it fio!tt irs us^ JJAS t« output?.2.!.&,·$ is* in* Hv mixjfif MEijmt 1 3 Ϊ.■> S, * < an«f. m > lib 4t
Figure imgf000091_0004
firr«i.liiig ft* III**
Figure imgf000091_0005
luteU* hv tl«* fijiirtk¾;*ii t*x«l $iilcft«*«v::i¾, i'i «'t€if.
I «'
Figure imgf000091_0006
fiinrtu .. «hifli f¼»*¾ u'hj<!i <-f itt«* riglit-**il in ii& * mil! !w< U .- I i* **l t)v tl«* 1 .» I !ϋ«|»¼'ΐ, ni til -.J i¾ usually «<ti- os.,!*!**! «ΙΙΐκ* ϊψ-ΐΜ Μ <>\<t f ri!¾. Ί¾.; i?u*pp«*:f !:;4*'» «ικΐΐ t !ι«· tw* Iiiptri iiws iift***i<i- n<m, tnmmvap thmt tiv* iuxvr m
Figure imgf000091_0007
mirks.
42 i * Switch | ¾ lor s i » ι ho tuil » *l« « t of ·, by 4 tt « 11 ί tg i n * <t ri i«Ii*t
2, 1.Γ«.^ fiotu st#> 40 to f apf»¾fi«li* ί/ί «Ut|.iS' ·::.!;,' ,?}·
ttii* » *!i«>*n i Γιχ. I«m Mn k* Mftin htftg«»f i!!t»-f!SK«liJ»fo* it* lit*- vlf«*-n {t iiit|.i!oa*if>n*id , m ytacifc*. t!»» «♦{.»·! alion sini fj UMM tiiiiii !ltH. I» |:M«lc*nI.i*i . u t* h¾My *l»irai l** t -' "¾ > t; ;;.J- »»! · \< ι· *; bo oxii«¾K»|t* largo, t!¾*t mm fiiuttially Wj*^l!',. ^ r . !*.A tr«j*.itK¾r *..«.io- ¾¾y fiiis' ksi i.iBploiia¾p*t.ioii5 cxfiitMl Ι1ιι«*-¾ι»ιΐ}«1 doof* iioiiii«oiirily. I» «f<¾!f f«< e lii vp tliw, «¾» fopfswiii a !»&o wi»t»s-r <»i s !}tipi<*tiiot$i#*« t«. o in ttiiiiti l 5 «». Λ> a
Figure imgf000091_0008
-r ΛΪΙΤ iw.it.o::ji:¾al*f I tmtnx.
*i*<-ti toft" i*fi*l « si!nii fc» ufik{!¾-, by »?m*mri£ ! *o ro s i i-1 «*«1*«κϊ«»» «'o onii i.-iijj 576 iwiiiirt nwfrkfs. VV oiin «J*-» %¾ty iho
Figure imgf000091_0009
»* ΙκΜΐΐίίΙαι**» (i.o., *l |*»»ts w!»ofi« aif<i«¾ .fiiiiii.i. :il»!# froin i t« to it»otfj*c HI Fig,. I;, i*it ii«y
Figure imgf000091_0010
ΧΙΛ. Ί'Ι»Ι» W mn i/ ilv r«¾«ii.ly * lii**vo ¾ «.ip y «*| /^ in A 1ΙΙ»ΙΙ»·«Ι <. :,< ·ΊΙΛ of »|*III:*O by t>r -f!\ ii:ii|>!«ii¾r!iig this 'S¾Ptdt'- <*« f <- l*-!s F¾.. , »** itipirt* {o / in | !f:«. T¾s*
Figure imgf000091_0011
ί*«ί<Λ ηο·ιιι**πο ίί¾Γ iBn|*|>iii; iinrfc*. In s«r¾ «« ,at o h*v*« * kn* Utpd of €ϊ|Μ·ί»!ί«ιι : j■? y, iiisl f<* an givc¾ m.»H ·',ΜΙ*»*1 Ο| *«Ι|Ι?Π ·* , itffo af tlirw *1«? fill Tte ίί»# !sU5 «-: <-S p. i— )!■;.· ϋ·;«»-· N ·*::(·!.-! SIMJW t*- rs «-l¾fo r is flic oMiotwr of of ring.:;. i..o„, by i****, Xo* APPENDIX Λψ - ** that the* mtiiuh't *¾f«y fiif trtiiXitii*! %*> ?>;ti -i »v
miv ^ p* V I t«i « ¾*. ί nil «I tads ty of 2*t%" wirfrti f« f a 'Λ'2 i* ϊη*¾* for tr <&i s 2** UM»
ow sa if¼»* tti«t (ma mix input* in j¾*if Tbw w m fi
for
44 »
s>
45
Figure imgf000092_0001
ti*i, '*niiJlJ«,«.
6 1!;*· I * I ttwp|«*f» m ! «' issla x«J*> *f«- «'h tiiiiHialiji oii»:*ioP«t U m <* Alfttl** ! ran*fc«:i»f k* .
41 Tfio ittipifin* ·,! »ί;^ϋ >·! / . u ;>v Ui.xt .: otitpfilx III air* PaiU tills t« *!«ii.< fo iiiako ifcjiiwiwif ti iti*i»p¾g. »ttii fa imw dilictili diiprtlyi partly it i* 4<ί¾· to*«nsiift« f!sAt tlw tfi stj aio
Figure imgf000092_0002
ui |. iif» in t!it* ΐπ κ * fiiiicti«e ft* tt»«* tvt im i ifi in 42 iho c, I,1M« / ' 1 :>*;..» :;.· $.t-»ri"tt iuv<*Iv«
!b-
Figure imgf000092_0003
<·Η* JiU'* <*J s.i IWf WPM
4i ¾!!ii*aily; tbt> iiUiM»«*ti?4f wn «I ««¾ t|:l itm «I tli*» BV ·« ·:π.< :ι' » bot to iiiiii* |is*tntiti:ior:»bi ia.ri > .a.; \ h- nm*? «»!iy SI I l«#W(> tliitt flw tnpsi* itri* B$tx«| its ¾ttr» in ¾: ¾, *ia«* tlii* ( * ii}i|>l»«a}* itfiik¾si
! Λ<4 *' ίί»' |f*l'«»- i IUl|«*ltk« of ttti»t <<!}.>¾, fj Bi f g, I, "*«» ittll itlptlt*
49 |-!!»!!y. f fiiiiil oti!pm* Ate »i *¾i Itt imir*. |«-rh«|M ΙΛ* Ϊ««ΪΪ&¼1¾ **B« « x!"I 2 *2 iiia rix ««-Tati<>i«, j¾si » tb
Figure imgf000092_0004
i.o|»«» « «* in m. m oixfcf {o«¾¾i ft»t m pi !■:·> f .-a-
Figure imgf000092_0005
to !nttiilxfs < >l · exts^f irli!r itft* Itfioiif tip to i o c'liiwling, hut s : f h- : ν-, * κϋ * :< 5ι pr-'- .
The- ttaa adiusliti it*; toodifv ι»¾· ^!ii itafi ic«iu iti I;¼.4 a* Ι*»ϋ<«¾..
By tidng ϊ4*Μΐιι«.,. iDc silig MUX ii|i¾ititio»„ f*mii Ptrl«
Figure imgf000092_0006
in I Ity Ainm^ d<iH'tm <Ιιιΐ»-Λο«Ί «<> b]urt«:.!, iu nl*»r f4i mix of¾s-
Figure imgf000092_0007
-iifiiot«! b Ϊ,ΉΜΚ* m th<* itli i»patfiti<.>n* m
Wss- nu > lilMfil <♦! ff*' ffjM.fww iioiwt in ¾ t to help o mir*' itat tip* ί!'Β|.«ρβίί»ΐίίι»ι.ιοΐ to t»iilitt«r at «w k«%ri
¾¾* « 'ptB HlC" t!>« IIiipk*Bi*¾t*liOll t© *¾e*ltf# %*!« ΐ1ΐ*|*' |«'ί!( ΓΪΒί1Ι*-0. I¼ *¾·
*i pi*«, «ΙΚ¾Ϊ m ny i®£n > vA *«* #¾η « ν·*!. tl** p mw?* **ί -»n itijwif arc
Figure imgf000092_0008
imev l*if nil uv oi that ia :»¾t. APPENDIX
:> vitK 11
Tti*« Mark II |tf p*js*J is MKHIAI to Mark I »'«*■ ♦ ».» p, J!lj n that 11 l»d .¾
Figure imgf000093_0001
su ttal strurtur*'. itli oaly
Figure imgf000093_0002
iitiMig Base function
iiiiplaneftafioti pairs. » 1 Initial StrtKtttm Choosing /JV end /Λ : . Ti * mm/*I prt:*ii»i« !*<·>*
ΛΠΗ\ . Ι¾«·*·|»ι for t *- tr- i «» tr»|»«t »u th** t!mU cMivt-ry oi ut jmt ί Ι..*:·... ι*χ¾ν§ΐΐ It*r iko
Figure imgf000093_0003
of thr VM i»irti ?i<ifi*
ΓΥΪΕΒ Mid fi l!'. ;kl .. >«Τ«η^Π·. 'Λί *h< :> T !l! lfO «'»kf do fiot ma of u¾M>of ORK it»lriK*tti¾?.:, »r k „;> <>\k -fr h tv-tit.' ¾ Bom**y
All «»!ΐϊ|«!.η!ι<ί ·* <>;■· . 8lJ n¾r| |ίί»κ!ί«-«* it«¾it. fr*»ut 8'-'. 1*ω frsf sin*. <t ·«. t-f,. >«- it vaJn*-
Figure imgf000093_0004
*s tfrttwttt* «*t Ζ ί1·" » «i i««tiJf? ι «· tlx* s>|jf«|jf i*t t , , * fm l f r thnt ii *l«lar niif, Thai t\ C e tpnief «| *r»t
*4 i , * , ΙΙΪΙΪΙ ' , I ftfif jfof » i:i:..:t.** for uaiiga«l 1st i|*-mwis is a t jwml
3i-Wt m»ftl3l|# ίΙΪΙ|Λ¾|Μ·{Λ»ΙΙΟ| With !iO Wf §€»" eh«¾it¾,
coii5iia¾> v ikv ) mttt-p tti ifa** «iasii¾t turn of , «»! f!i*« ^prnttrafkm • 4 *· w.iuit* «>t on!v «***
Figure imgf000093_0005
*A fh ' *tth <*x«rt ly ttw
;f«it Willi dil! tcti! ;<-:,* v
Tin* liiiMe Mi?K"tur«* of ikk!ior fh- >!>»*~ii in F¾, '» |tb# k ¾t half i AW! Fig: . iJ |tli#> «¾I kali k wl«*rt* cijrloi a b. c, 3. t , §« |. h *!«Ί»Ι«*
."V.1,1. £tee»i''ki«'. IS Fig*. '» aJi <,« t* i««y?s «j»
«-k«tf«rt f i in !*¾ΙίϊΒίϊΐ8. <·? tl» aifoW to t\¥i itirp tifiiii Wlk'n
Figure imgf000093_0006
tli«« w I* fiM t«i f« lh# «nt
tf i4*' :nii i oi ιΐ!ΐ< *f fs*f i iiwfewl, rtw » ·πί|- .-ίι* t* !i**^*n f¾ijft?»:.il# eli<i
Figure imgf000093_0007
S.1.3. n x « T«* » .« n : n . Ss«* «-rtJ3 w!.i M i» Fi¾», S AI :1 β tsp an « χ n Mm4„
»i»"ti iM* 4x$ I. nr ^x^- |w*riiiiitAt{<»3i, wh'tem *;rfl»'f* η*·*> »fl n : » l.ilw-k sa h IIH
Figure imgf000093_0008
ΊΊΐί» « i; wntk»t«* ik.«l t!w- f»»mjwf M'iit « i»|> t awl » ««it}Mii>, «\«<«-iit- iihi t i* ¾¾itf»i!< at < nil iiioi¾4 «;!f'»|.A' ffoiu an ntpm !o a iM. i-«'i>r >f«.jiiiBii:g o:i |»it i:&f* in iaii|»f»t!¾ ¾ v-. *··? t:.r-<';ch .«
Figure imgf000093_0009
Tli** M fi iitAffiii* f!mf tin* os.*ia «>!:i«il l» n iiiptli.^ l n mitpnts, mml <*
Ι !¾! it
Figure imgf000093_0010
5.1.4. Selection Components. The selection components in Figs. 5 and 6 are labelled select 1 of 2 4x4 permut'ns or select 1 of 2 4:4 recodes.
Those labelled select 1 of 2 4x4 permut'ns have the form
PERMUTE [p_1, p_2] d_1, d_2, d_3, d_4 ← s_0 : s_1, s_2, s_3, s_4
where p_1, p_2 are randomly chosen 4 × 4 permutations, with s_0 coming from a 4x4 S via a 4:4 recode.
Those labelled select 1 of 2 4:4 recodes have the form
CHOOSE [r_1, r_2] d_1, d_2, d_3, d_4 ← s_0 : s_1, s_2, s_3, s_4
with s_0 coming from a 4x4 S via a 4:4 recode, where each r_i is a VM macro instruction of the form
RECODE [e_1, e_2, e_3, e_4] d_1, d_2, d_3, d_4 ← s_1, s_2, s_3, s_4
with all e_i encodings in the RECODEs chosen randomly for f_K.
5.1.5. Four Occurrences of Function-Indexed Interleaving. Function-indexed interleaving, described in §2.9.2, appears four times in an f_K or f_K^{-1} specification. Each time it comprises three 4 × 4 linear mappings (VM macro instruction LINEARMAP [M] ... for some 4 × 4 matrix M), labelled 4x4 L, 4x4 S, and 4x4 R in Figs. 5 and 6, whose 4 × 4 matrices are chosen independently using the method in §3.3; together with one 4:4 recode (VM macro instruction RECODE [e_1, e_2, e_3, e_4] ... with the four encodings chosen randomly from the available permutation polynomial encodings), two occurrences of select 1 of 2 4x4 permut'ns, and two occurrences of select 1 of 2 4:4 recodes (see §5.1.4).
Each instance of function-indexed interleaving has a single left-side function and 2^4 = 16 right-side functions.
5.1.6. Other Components. The remaining components are not within instances of function-indexed interleaving, comprising three
Figure imgf000095_0001
occurrences of an 8:8 recode, each of the form
Figure imgf000095_0002
and two occurrences of an 8x8 permutation, each of the form
PERMUTE [p] d_1, ..., d_8 ← 0 : s_1, ..., s_8
with (for f_K) a single, randomly chosen permutation, and eight occurrences of a 2x2 mixer, each of the form
LINEARMAP [M] d_1, d_2 ← s_1, s_2
where M is a 2 × 2 matrix chosen, for f_K, by the method given in §3.3.
5.2. Obfuscating f_K or f_K^{-1} Implementations. The following methods are employed to obscure a program implementing f_K or f_K^{-1}, where implementations have the common structure detailed in §5.1 on p. 43 and diagrammed in Fig. 5 and Fig. 6.
The transformations in the following sections are performed one after the other except where otherwise noted in the body of the sections.
5.2.1. Copy Elision. Naive code for f_K or f_K^{-1} implementations contains many MOVE instructions. When a value is transferred from one register to another by MOVE through intermediate registers, it is often possible to eliminate the intermediate steps and
Figure imgf000095_0003
This is especially effective in the case of PERMUTEs, which are naively implemented using sequences of MOVEs. E.g., after simplification of the initial and final 8x8 permutations, the randomly chosen permutation means only that which data-flow source arc first receives a particular input is chosen at random, and which data-flow sink arc finally delivers a particular output is chosen at random.
The rule is that any MOVE which can be eliminated by ordinary levels of optimization must be eliminated; i.e., the final version of an obfuscated f_K or f_K^{-1} implementation must contain the smallest number of MOVE instructions achievable using ordinary (i.e., non-heroic) levels of optimization.
More specifically, assuming that we can readily associate a value producer with its value consumers in SSA form, those MOVEs which must be elided are those which can be removed in SSA by renumbering outputs and reconverting to SSA until no further copy elisions occur.
A MOVE can be elided when it forms an arc in a tree of operations in which MOVEs form the arcs, the original value producer (possibly a φ-assignment) is the root, the root dominates the MOVE arcs and the consumers, and no consumer is itself a φ-assignment.
Copy elision can be performed at various points in the following process, and is certainly done as a final step to remove redundant MOVE instructions.
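For straight-line code the elision rule reduces to renaming each consumer's input to the original producer. The sketch below is a simplification under explicit assumptions: instructions are (dest, op, args) tuples in SSA form within a single basic block, there are no φ-assignments, and no MOVE destination is itself a routine output. The function name is hypothetical.

```python
def elide_copies(instrs):
    """Remove MOVEs from straight-line SSA code by renaming their consumers.

    instrs: list of (dest, op, args) tuples; args may be registers or literals.
    """
    alias = {}                      # MOVE destination -> register it copies

    def resolve(r):
        while r in alias:           # follow chains of MOVEs back to the producer
            r = alias[r]
        return r

    out = []
    for dest, op, args in instrs:
        args = tuple(resolve(a) for a in args)
        if op == "MOVE":
            alias[dest] = args[0]   # consumers will read the producer directly
        else:
            out.append((dest, op, args))
    return out
```

A chain of MOVEs through intermediate registers collapses in one pass, since each alias is resolved transitively before it is recorded.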
5.2.2. Branch-to-Branch Elision. A related form of elision can be performed on branches. If a basic block (BB) contains only an unconditional branch instruction (i.e., an unconditional JUMP basic instruction), then the target of any branch which branches to that BB can be modified to branch directly to the target of the JUMP to which it branches, eliminating a branch to an unconditional branch. This can be repeated until no such branch-to-branch occurrences remain, and any unreachable BBs which contain only an unconditional JUMP can then be removed.
Branch-to-branch elision can be performed at various points in the following process, and is certainly done as a final step to eliminate branch-to-unconditional-branch sequences.
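A minimal sketch of this elision on a control-flow graph follows. The representation is an assumption made for illustration: each block maps to a pair (body, targets), and a block with an empty body and exactly one target stands for a BB containing only an unconditional JUMP.

```python
def elide_branch_to_branch(blocks, entry):
    """Retarget branches past jump-only blocks, then drop unreachable ones.

    blocks: {name: (body, targets)} with body a list of instructions and
    targets a list of successor block names.
    """
    def final_target(t):
        seen = set()                # guards against a cycle of jump-only blocks
        while not blocks[t][0] and len(blocks[t][1]) == 1 and t not in seen:
            seen.add(t)
            t = blocks[t][1][0]
        return t

    redirected = {name: (body, [final_target(t) for t in targets])
                  for name, (body, targets) in blocks.items()}

    reachable, stack = set(), [entry]
    while stack:
        name = stack.pop()
        if name not in reachable:
            reachable.add(name)
            stack.extend(redirected[name][1])

    # Only unreachable jump-only blocks are removed, as in the text above.
    return {name: blk for name, blk in redirected.items()
            if name in reachable or blk[0]}
```

Because final_target follows whole chains, a single pass reaches the fixed point that the text describes as repeating until no branch-to-branch occurrences remain.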
5.2.3. Unused Code Elimination. When code is in SSA form, if any register is the output of an instruction x, but never the input of an instruction y, instruction x is unused code. We can repeatedly remove all such instructions until no unused code remains.
This can be done at various times during obfuscation, and is certainly done as a final step to eliminate unused code.
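The repeated removal can be sketched as a fixed-point loop. The sketch assumes side-effect-free instructions in (dest, op, args) form and a set live_out of registers read by the EXIT instruction; both the representation and the function name are illustrative.

```python
def eliminate_unused(instrs, live_out):
    """Repeatedly drop instructions whose output is never read.

    instrs: list of (dest, op, args) in SSA form, with no side effects.
    live_out: registers consumed by EXIT, which must be kept.
    """
    instrs = list(instrs)
    while True:
        used = set(live_out)
        for _, _, args in instrs:
            used.update(args)
        kept = [ins for ins in instrs if ins[0] in used]
        if len(kept) == len(instrs):    # fixed point: nothing more to remove
            return kept
        instrs = kept
```

Each pass can expose further unused instructions, because removing a consumer may leave its producers without readers.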
5.2.4. Hash Insertion and Generation of Distinct Dynamic Values. Choose a 1 × 8 hash matrix of randomly chosen distinct odd elements over Z/(2^{32}) and generate code which maps the original inputs through this matrix, yielding a single output. Place the code for the hash-matrix computation (initially a LINEARMAP macro-instruction) immediately following the initial ENTER instruction. The single output is a 'hash' of the inputs, and will be used to generate distinct dynamic values C_1, C_2, C_3, ... for memory shuffling (see §5.2.13).
Next, choose a permutation polynomial (PP) P, and, where z is the output register containing the output from the above matrix computation, insert code to generate N values C_i = P(z + i) over Z/(2^{ρ}), where ρ = ⌈log_2 N⌉ and N's derivation will be described later. Initially, the PP computations are inserted as a RECODE macro-instruction in which all the encodings are identically P.
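The hash and the distinct values can be illustrated as follows. This is a sketch under assumptions not fixed by the text: the permutation polynomial is taken to be a quadratic with odd linear coefficient and even quadratic coefficient (a standard sufficient condition for a polynomial to permute Z/(2^w)), the word width of P is passed in as a parameter, and every function name is hypothetical.

```python
import random

MOD = 1 << 32

def make_hash_row(rng, n=8):
    """1 x n hash matrix of distinct odd elements of Z/(2^32)."""
    row = set()
    while len(row) < n:
        row.add(1 + 2 * rng.randrange(1 << 31))
    return list(row)

def hash_inputs(row, inputs):
    """Single 'hash' output z: the 1 x n matrix applied to the input vector."""
    return sum(h * x for h, x in zip(row, inputs)) % MOD

def make_pp(rng, bits):
    """Quadratic permutation polynomial over Z/(2^bits), bits >= 1."""
    m = 1 << bits
    a2 = 2 * rng.randrange(m // 2)       # even quadratic coefficient
    a1 = 1 + 2 * rng.randrange(m // 2)   # odd linear coefficient
    a0 = rng.randrange(m)
    return lambda x: (a2 * x * x + a1 * x + a0) % m

def dynamic_values(P, z, n):
    """C_i = P(z + i) for i = 1..n; distinct whenever n <= 2^bits."""
    return [P(z + i) for i in range(1, n + 1)]
```

Since P is a bijection and the arguments z + 1, ..., z + N are distinct modulo 2^bits when N ≤ 2^bits, the generated values are distinct for every input, although which values appear depends on the hash z.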
5.2.5. Macro-instruction Expansion. All macro-instructions are expanded to sequences of basic instructions. The expansions are trivial and therefore omitted here.
After expansion, only basic instructions remain.
5.2.6. Control-Flow Duplication. One matrix is added in §5.2.4, mapping all eight inputs to one output, and there are 20 matrices shown in Figs. 5 and 6, each denoting the mapping of two or four inputs to two or four outputs, respectively, each of the 21 matrix-mappings expanding into a contiguous sequence X of basic instructions with no internal branch (i.e., JUMP...) instructions.
The hash matrix-computation is initially the first matrix-mapping computation. We take the code for matrices in a horizontal row, such as the 2x2 mixer row near the beginning and end of the structure in Figs. 5 and 6, or the 4x4 L, 4x4 S sequence in each 'round', to be computed sequentially; i.e., first the code for the leftmost matrix, then the code for the one to its right, and so on. Moreover, in each 'round', we note that the code for the computation of the 4x4 L and 4x4 S matrix mappings necessarily precedes the code for the computation of the select 1 of 2 4x4 permut'ns, followed by the select 1 of 2 4:4 recodes, followed by the 4x4 R code, followed by the select 1 of 2 4:4 recodes, followed by the select 1 of 2 4x4 permut'ns at the right side of the figure.
In terms of the representation noted in §3.5, depending on which of the 20 matrices we are dealing with, the code for computing the effect of the matrix on the 2- or 4-vector to which it is applied appears in the representation as a straight-line sequence of instructions which
(1) occupies an entire basic block (BB) except for a final conditional branch sequence (a comparison followed by a conditional branch, taking the form of an ULT, JUMPNZ instruction pair) immediately following the matrix computation; e.g., this is initially the case for the matrices labelled 4x4 R in Figs. 5 and 6, since they immediately follow the point at which the if-then-else structure, which contains a then-BB and an else-BB each ending with a JUMP to the BB containing the code for the 4x4 R matrix mapping followed by the conditional sequence (an ULT followed by the JUMPNZ ending the BB), selects one of two recodes as indicated in the diagram by a select 1 of 2 4:4 recodes following the 4x4 R; or
(2) occurs in a BB with computational instructions both preceding and following it in the straight-line code of the BB; e.g., this is initially the case for each of the matrices labelled 4x4 L and 4x4 S in Figs. 5 and 6, and for all of the matrices labelled 2x2 mixer in Fig. 5 and all but the leftmost matrix labelled 2x2 mixer in Fig. 6; or
(3) occurs at the beginning of a BB and is followed by further computational instructions: this is the case for the leftmost matrix labelled 2x2 mixer in Fig. 6; or
(4) occurs at the beginning of a BB and is followed by a branch instruction (JUMP...) or a conditional branch instruction sequence (ULT, JUMPNZ): this does not initially occur in Figs. 5 and 6, but might after the processing described below.
In a manner described below, we replace each such block of matrix code with a branch to one of two copies of the block of code, terminated by a branch to a common point: i.e., in effect, replace the code
X
with the code
if r < 2^{31} then X_1 else X_2
where r is an output of an instruction dominating X (and hence the above if-construct) and the dominating instruction is chosen uniformly at random from the possible predecessors. X_1, X_2 are the then- and else-instances, respectively, of the block X of straight-line code.
To accomplish the above transformation, we proceed as follows (per Figure 9):
900 If the implementation is not currently in SSA form, convert it to SSA form as described in §2.10.1.
905 Choose an instruction i uniformly at random from all of the instructions which have a single output and dominate the first instruction of the code for the subject matrix mapping.
910 Convert the implementation to SMA form as described in §2.10.2 on p. 30.
915 Isolate the contiguous sequence of instructions comprising the matrix mapping code for the subject matrix. That is, if the first matrix-mapping instruction for the subject matrix does not begin a BB, place an unconditional JUMP instruction immediately before said first instruction, splitting the BB into two at that point, and if the last matrix-mapping instruction for the subject matrix does not immediately precede a JUMP... or EXIT instruction, insert an unconditional JUMP instruction immediately after said last instruction, splitting its BB into two at that point. At this point, the original BB has been longitudinally cut into multiple BBs zero, one, or two times, and the code for the subject matrix mapping is isolated in its own BB containing only that mapping code followed by a single unconditional JUMP instruction.
920 Create two new BBs. The first, C (for 'choose'), contains only an ULT, JUMPNZ sequence implementing an
if r < 2^{31} then ... else ...
decision. For register r, we use the output of our selected instruction i above. Letting X be our isolated BB above, the second is X′, an exact copy of X in every respect except that it is a distinct, and initially isolated, graph node in what was originally a control-flow graph (CFG), but currently is not since CFGs do not have isolated nodes.
925 Replace the original X in the CFG with C, X, X′ as follows. Cause all branch-targets which originally pointed to X to point to C. Cause C's final branch to branch to X on the < 2^{31} and to X′ on the ≥ 2^{31} alternative, respectively.
930 Convert the implementation to SSA form, isolating the computations in X and X′ from one another: at this point, they are distinct, but compute identical results.
935 Perform branch-to-branch elision (see §5.2.2).
Note that, while this replicates computations, it does not produce distinct copies for comparison, because, on each execution, only one of the two paths performing the matrix-mapping for a given matrix is executed.
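Steps 920 and 925 amount to a small graph rewrite. The sketch below shows only that rewrite, under assumptions made for illustration: the CFG is a dictionary of blocks with "code" and "targets" entries, the block X has already been isolated (step 915), the SSA/SMA conversions are omitted, and the naming scheme for the new blocks is hypothetical.

```python
import copy

def duplicate_block(cfg, x, r):
    """Replace block x by a chooser C and two copies X, X'.

    cfg: {name: {"code": [...], "targets": [...]}}
    r:   register produced by an instruction dominating x.
    """
    c, x2 = x + "_C", x + "_copy"               # hypothetical block names
    for blk in cfg.values():                    # step 925: retarget branches to C
        blk["targets"] = [c if t == x else t for t in blk["targets"]]
    cfg[x2] = copy.deepcopy(cfg[x])             # step 920: X' is an exact copy of X
    # C holds only: ULT d <- r, 2^31 ; JUMPNZ d? X   (falls to X' otherwise)
    cfg[c] = {"code": [("ULT", "d_" + c, r, 1 << 31), ("JUMPNZ", "d_" + c)],
              "targets": [x, x2]}               # [taken when r < 2^31, otherwise]
    return cfg
```

Both copies branch to the same successor, so whichever path is taken, the code after the duplicated block sees the same computation.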
5.2.7. Come-From Insertion. If the implementation is not currently in SMA form, convert it to SMA form (see §2.10.2).
Then, for each Boolean comparison ULT instruction computing r_B ← [r_A < 2^{31}], yielding 1 (if true) or 0 (if false) and providing its input to a JUMPNZ ('jump if true') instruction, randomly choose two constants c_1 = rand(2^{32}) and c_2 = rand(2^{32}), and insert code after the comparison ULT but before the JUMPNZ instruction taking its input, where the inserted code computes r_C ← c_2 + (c_1 − c_2) r_B.
At the true destination of the JUMPNZ, insert r_D ← c_1, and at the false destination of the JUMPNZ, insert r_D ← c_2. (Recall that, when the code is in CFG form, each conditional branch has two targets.)
Remember that the outputs of the instructions computing r_C and r_D should be identical, for future use.
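The identity between the value computed before the branch and the value set at the branch destination can be checked directly. This sketch assumes the inserted computation is c_2 + (c_1 − c_2) r_B over Z/(2^{32}); the function name and the Python modelling of the branch are illustrative.

```python
MOD = 1 << 32

def come_from(r_a, c1, c2):
    """Return (r_C, r_D) for one execution of the instrumented branch."""
    r_b = int(r_a < (1 << 31))              # ULT r_B <- r_A, 2^31
    r_c = (c2 + (c1 - c2) * r_b) % MOD      # inserted before the JUMPNZ
    r_d = c1 if r_b else c2                 # inserted at the destination reached
    return r_c, r_d
```

Because r_B is 0 or 1, r_C equals c_1 on the true path and c_2 on the false path, which is exactly the constant assigned to r_D at the destination actually reached.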
5.2 A *i?s!a-Fl0ii* Buphmium , In a. maimer
Figure imgf000099_0001
Now, for every instruction which is not a JUMP..., ENTER, or EXIT, the instruction is copied (so that an original instruction is immediately followed by its copy), and new registers are chosen for all of the copied instructions such that, if x and y are instructions, with y being the copy of x,
(1) if x inputs the output of an ENTER instruction, then the corresponding y input uses the same output;
(2) if x inputs the output of an original instruction u with copy v, then the corresponding input of y inputs from the v output corresponding to the u output from which x inputs; and
(3) if x outputs to an EXIT instruction, then the corresponding output of y outputs to a special unused sink node indicating that its output is discarded.
Thus all of the computations except for the branches have an original and a copy occurrence.
To accomplish this transformation, we proceed as follows (per Figure 10):
We add a new instruction JUMPA ('jump arbitrarily'), which is an unconditional branch with two destinations in control-flow graph (CFG) form, just like a conditional branch (see §3.5), but with no input: instead, JUMPA chooses between its two destinations at random. JUMPA is not actually part of the VM instruction set, and no JUMPA will occur in the final obfuscated implementation.
We use JUMPA in the following transformation procedure.
1000 If the implementation is not in SSA form already, convert it to SSA form (see §2.10.2).
1005 For each BB X_i of the BBs in the implementation X_1, ..., X_k, replace it with three BBs C_i, X_i, X_i' by creating a new BB X_i' which is identical to X_i, and adding a new BB C_i which contains only a single JUMPA instruction targeting both X_i and X_i', making X_i and X_i' the two targets of C_i's JUMPA, and making every non-JUMPA branch-target pointing to X_i point to C_i instead.
1010 Convert the implementation to SSA form (see §2.10.1), isolating the local data-flow in each X_i and X_i', although corresponding instructions in X_i and X_i' still compute identical values.
1015 Merge all of the code in each X_i' back into its X_i, alternating instructions from X_i and X_i' in the merge so that corresponding pairs of instructions are successive: first the X_i instruction, and then the corresponding X_i' instruction.
1020 Make each branch-target which is a C_i point to the corresponding X_i instead, and remove all of the C_i and X_i' BBs. At this point, the data-flow has been duplicated, the original shape of the CFG has been restored, and the implementation is free of JUMPA instructions. Remember which instructions correspond in each X_i, for future use.
5.2.9. *Random Cross-Connection. If the implementation is not currently in SSA form, convert the code to SSA form (see §2.10.1). Due to the use of SSA form, some of the instructions u_i below known to produce identical outputs may be phi-assignments taking their inputs from non-phi-assignment instructions known to produce identical outputs.
Due to transformations applied in §5.2.6, §5.2.8, and §5.2.7, many instructions belong to pairs which are statically known to produce identical outputs. Such cases were noted in those sections, with the added remark that the information on such identical outputs should be retained for future use. We now make use of this saved information. The number of copies is always two: two for data-flow duplication, two for exported control-flow duplication (since both control- and data-flow duplication have been applied to them, but only one instance of a control-flow duplicated computation occurs in an execution of the implementation), and two for 'come-from' insertion.
There are two ways in which a pair of instructions in SSA form in the implementation can be known to have identical results as a result of the actions in §5.2.6,
§5.2.8, and §5.2.7. Two instructions can be data-flow duplicates of one another, or two phi-assignments can have inputs known to be data-flow duplicates of one another due to control-flow duplication of matrix-mapping computations.
Let u_1, u_2 be a pair of instructions which are known on the basis of such saved information to have identical outputs, each u_i taking k inputs, where k is either one or two if the instruction is a basic instruction (e.g., NEG or MUL) and is two for a phi-assignment following a control-flow duplicated matrix-mapping.
With probability 1/2, we flip use of the u_1, u_2 outputs as follows: for every instruction consuming the u_1 output (if any), we modify it to take the u_2 output instead, and vice versa.
We repeat this for every possible such pair u_1, u_2 until no such pairs remain to be considered for flipping.
The effect of this transformation is as follows. As a result of data-flow duplication, except for the very beginning and end of the implementation, data-flow is split into two distinct subgraphs which never overlap. After random cross-connection, these two data-flow graphs have been thoroughly merged into a single data-flow graph.
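The flipping step can be sketched in Python as follows. This is only an illustration under assumed data structures: instructions are represented by hypothetical string identifiers, and `consumers` maps each consuming instruction to the list of producers it reads; none of these names come from the original text.

```python
import random


def cross_connect(pairs, consumers, rng):
    """Randomly cross-connect consumers of duplicated instructions.

    pairs:     list of (u1, u2) identifiers known to yield identical outputs.
    consumers: dict mapping an instruction id to the list of producer ids it reads.
    rng:       any object with a random() method returning a float in [0, 1).
    """
    for u1, u2 in pairs:
        if rng.random() < 0.5:                      # flip with probability 1/2
            swap = {u1: u2, u2: u1}
            for inputs in consumers.values():
                # every reader of u1 now reads u2, and vice versa
                inputs[:] = [swap.get(p, p) for p in inputs]
    return consumers
```

A typical call would pass `random.Random()` as `rng`. Because the two members of each pair compute the same value, the swap leaves the computed function unchanged while merging the two data-flow subgraphs.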
5.2.10. *Check Insertion. If the implementation is not currently in SSA form, convert it to SSA form (see §2.10.1).
As in random cross-connection in §5.2.9, we proceed through the pairs, say u_a, u_b, of instructions known to have identical outputs as a result of value duplication due to the processing in §5.2.6, §5.2.8, and §5.2.7.
As in §5.2.9, such instructions may either be basic instructions or phi-assignments, and we use exactly the same criterion for identifying such pairs as in §5.2.9.
Successively choose such pairs of instructions with known identical outputs due to duplication resulting from steps §5.2.6, §5.2.8, and §5.2.7, until each pair has been chosen.
For each such instruction pair u_a, u_b, say, select, uniformly at random from all choices of u_c not previously used as a u_c in such processing, or if no such choice exists, from all choices of u_c including those previously used as a u_c in such processing, a single instruction u_c such that u_c is dominated by both of u_a, u_b. (If no such u_c exists at all, do not further process the u_a, u_b pair: simply proceed to the next pair, or if none remains, terminate the processing according to this section.)
Let o_a, o_b, o_c be the outputs of u_a, u_b, u_c, respectively. Immediately following u_c, place code to compute o_d <- o_c + o_a - o_b, and cause all inputters of o_c to input o_d instead. (Since o_a = o_b, we should have o_d = o_c, so this should have no net effect unless an attacker tampers with the code.)
Continue such processing until all such pairs u_a, u_b have been selected.
(The r_c, r_d checks in connection with §5.2.7 help prevent branch jamming;
the others help to foil perturbation attacks by causing downstream computations to malfunction if one member of a pair is modified without modifying the other.)
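A minimal Python sketch of the inserted check follows, assuming 32-bit modular arithmetic and assuming the check has the form o_d = o_c + o_a - o_b; the function name is hypothetical.

```python
MASK = 0xFFFFFFFF  # arithmetic modulo 2**32


def checked_output(o_a, o_b, o_c):
    """Fold a duplicate-pair comparison into a downstream value.

    o_a and o_b are outputs of two instructions that should be identical;
    o_c is the output of an instruction dominated by both producers.
    """
    return (o_c + o_a - o_b) & MASK
```

When the pair agrees, the result equals o_c and the program is unaffected. When an attacker perturbs only one member of the pair, the difference propagates into every consumer of o_c.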
5.2.11. Transcoding. If the implementation is not currently in SSA form, convert it to SSA form (see §2.10.2).
Take each binary operation computing
z <- f(x, y)
where f is one of +, -, or *, and replace it with a computation which is the algebraic simplification of
Figure imgf000101_0001
or equivalently, replace the operation f with an algebraic simplification of e_z ∘ f ∘ [e_x^-1, e_y^-1] such that for each arc connecting a producer to a consumer, the encoding (the e_i function) of the produced value matches the encoding assumed by the consumer (where the inverse of the encoding is applied). That is, perform network encoding (see §2.3.1).
When the output z above is used as the input of a comparison EQ, NE, ULT, UGT, ULE, UGE, SLT, SGT, SLE, SGE, or a conditional branch JUMPZ or JUMPNZ, e_z must be the identity encoding. Moreover, any output derived as the final output of a RECODE macro-instruction, or the expansion of a RECODE macro-instruction, in the original program, cannot be further modified; i.e., the RECODE is taken to do a plain computation whose output cannot be encoded. Initial inputs and final outputs also use the identity encoding. That is, any output whose transcoding would change the function computed by the program is left unencoded.
Where an input is a constant c, replace it with some constant e_c(c), and treat it as if it came from a producer which produced it with encoding e_c.
Sometimes it is not possible to make all the producer and consumer encodings match everywhere they should. Where this occurs, produce with an output encoding e_a and an input encoding e_b and insert e_b ∘ e_a^-1 on the arc to resolve the conflict.
Each e_i is a bijective quadratic permutation polynomial (PP) function over Z/(2^32), or the inverse of such a PP, according to the scheme chosen for this purpose as described in section [...]. Let us simply refer to them as PPs. Since PPs involve only multiplications and additions, a PP can be computed as a sequence of affine steps, which we now assume to be the case.
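The idea of replacing an operation by the algebraic simplification of its encoded form can be sketched in Python. This sketch deliberately swaps in simple affine encodings e(v) = a*v + b over Z/(2^32) with odd a, rather than the quadratic permutation polynomials named above; all function names and the tuple representation of an encoding are assumptions made for illustration.

```python
M = 2**32


def inv(a):
    """Multiplicative inverse modulo 2**32 (a must be odd; Python 3.8+)."""
    return pow(a, -1, M)


def encode(enc, v):
    a, b = enc
    return (a * v + b) % M


def decode(enc, w):
    a, b = enc
    return (inv(a) * (w - b)) % M


def encoded_add_coeffs(ex, ey, ez):
    """Simplify e_z(e_x^-1(x') + e_y^-1(y')) to p*x' + q*y' + r (mod 2**32)."""
    (ax, bx), (ay, by), (az, bz) = ex, ey, ez
    p = az * inv(ax) % M
    q = az * inv(ay) % M
    r = (bz - p * bx - q * by) % M
    return p, q, r


def encoded_add(x_enc, y_enc, coeffs):
    p, q, r = coeffs
    return (p * x_enc + q * y_enc + r) % M
```

The transformed code only ever evaluates `encoded_add` with precomputed constants p, q, r, so the plain values of x, y, and z never appear in registers.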
5.2.12. Register Minimization. If the implementation is not in SSA form, convert it to SSA form (see §2.10.2 on p. 30).
We now derive a conflict graph for the lifespans in registers. A lifespan begins when a value is produced (i.e., is output to the register by an instruction) and ends at the point where no further uses of the value are made (i.e., after the last time the value output by an instruction is used as an input with no intermediate changes to the register in between placing the value in the register and using it): that is, after the last consumer connected to the producer by a data-flow arc has completed execution. Two lifespans conflict (i.e., overlap) if they begin at different producers and there is a point in execution at which both have been started and neither of them has ended.
We can view this as a graph where the lifespans are the nodes and an arc connects two nodes if and only if the lifespans conflict. The significance of the graph is that, if two lifespans conflict, their produced values must be stored in different registers, whereas if they do not, their produced values may be stored in the same register.
The VM permits an indefinite number of registers (well, 2^32 of them, at any rate), but our purpose is to minimize the number of registers to increase the obscurity of shuffling values through memory by potentially making many different operations use the same location.
Starting with the nodes of minimal degree in the graph, we remove nodes one at a time with their incident arcs, until all nodes have been removed. We then reinsert them in reverse order, with any arcs which were removed with them, choosing registers for them as we reinsert them. This is a variant on Chaitin's algorithm, and tends to produce efficient graph colorings (i.e., register allocations) in the sense that the number of distinct colors (registers) tends towards the minimal number.
Retain the lifespan information and the conflict graph for further use in §5.2.13 on p. 53.
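The remove-then-reinsert coloring described above can be sketched in Python as follows. The representation of the conflict graph as a dictionary of sets, the tie-breaking rule, and the choice of the lowest free register on reinsertion are assumptions for illustration only.

```python
def allocate_registers(conflicts):
    """Color a conflict graph by minimal-degree removal and reverse reinsertion.

    conflicts: dict mapping each lifespan to the set of lifespans it conflicts
               with (the relation is assumed to be symmetric).
    Returns a dict mapping each lifespan to a register number.
    """
    graph = {n: set(adj) for n, adj in conflicts.items()}
    removed = []
    while graph:
        # pick a node of minimal remaining degree (ties broken by name)
        node = min(graph, key=lambda n: (len(graph[n]), str(n)))
        for other in graph[node]:
            graph[other].discard(node)
        del graph[node]
        removed.append(node)

    registers = {}
    for node in reversed(removed):
        in_use = {registers[n] for n in conflicts[node] if n in registers}
        reg = 0
        while reg in in_use:          # lowest register not used by a neighbor
            reg += 1
        registers[node] = reg
    return registers
```

For example, three mutually conflicting lifespans plus a fourth that conflicts with only one of them need three registers, and the fourth lifespan reuses one already allocated.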
5.2.13. *Memory Shuffling. If the implementation is not in SSA form, convert it to SSA form (see §2.10.1).
Include in the implementation a memory array A containing 2^p binary words, where p is determined as described in §5.2.4 on p. 47.
Identify in the code all instances of a contiguous pair of instructions (MUL, ADD) or (MUL, SUB) implementing an affine mapping y <- sx + b or y <- s(x + b) (depending on which instruction precedes the other), where s and b are constant inputs and x is a non-constant input, and in which either the MUL is the only instruction which inputs the output of the ADD or SUB, or vice versa, and in which the affine output value y is subsequently used as an input by some other instruction. Once such a pair has been found, remove it from further consideration, but continue until all such pairs have been found. Call these pairs P_1, ..., P_N. Note that each such pair has only a single non-constant input x.
Associate with P_1, ..., P_N values K_1, ..., K_N, initially all equal to 1.
Traverse K_1, ..., K_N. At each K_i, with probability 1/2, change the traversee's value to 2. Traverse the K_i's of value 2, with probability 1/2 changing each traversee's value to 3. Traverse the K_i's of value 3, with probability 1/2 changing each traversee's value to 4.
At this point, for each pair P_i, there is a value K_i belonging to the set {1, 2, 3, 4}.
Define R_1, ..., R_N as follows. Let R_i be the number of points in the live range of the pair's output (the y computed by the affine mapping above) at which a non-branch basic instruction can be inserted which would lie in the live range if it were inserted. R_i is a measure of the size of the live range.
Define W_1, ..., W_N as follows. Let W_i be the cardinality of the largest set of instructions overlapping the live range (i.e., either in the live range, or starting the live range by outputting its value, or terminating a path in the live range by consuming its value) such that no member of the set dominates any of the others. This estimates the 'width' or path-multiplicity of the live range.
For each such live range of a y which is the output of a pair P_i, with a probability which is the smaller of 1 and K_i W_i / R_i, select each point at which an instruction can be inserted into the range as noted above, so that, in the fairly common case where W_i = 1, there is an expected number K_i of such selected points in the live range for the y output of P_i. Let the set of selected points for a given live range be S_i, so that the expected value of |S_i| = K_i where W_i = 1. (Of course, the actual value of |S_i| may differ.)
Define F_1, ..., F_N as follows. In the live range of the y-output of pair P_i, plainly each instruction, say w, inputting this y-output is dominated by the pair-member producing the y-output; call it m. If for each such w, there is an s in S_i so that m dominates s which in turn dominates w, F_i = 1. Otherwise, F_i = 0.
We now restrict our attention to those pairs P_i for which F_i = 1. For each such pair P_i, we allocate a new set of |S_i| + 1 indices in A, C_1, ..., C_(|S_i|+1) (see §5.2.4), so that P_i, and each member of S_i, has its own assigned index. We reuse indices as much as possible among P_i, S_i pairs under the constraint that the set of indices for a given P_i, S_i pair can overlap with the set of indices for another only if their corresponding P_i y-outputs are not connected by an arc in the conflict graph (i.e., if their live ranges do not overlap; see §5.2.12).
Remove the pair P_i, a (MUL, ADD) or (MUL, SUB), replacing it with a RECODE, STORE, LOAD, RECODE sequence. Each RECODE maps one input to one output, so we recode a value on each store and load. There is then a sequence s_1, ..., s_k, w
such that the final RECODE above dominates s_1, ..., s_k, w, where s_1, ..., s_k are in S_i, w is an instruction inputting the y-output of the removed P_i, and each element of the sequence s_1, ..., s_k, w dominates its successors. As a result, we take the x-input of the removed sequence, map it through 2(k + 1) RECODEs, and then pass it to w. We modify the final RECODE so that the net effect of the series of recodings is to provide y to w with the input-encoding expected by w; i.e., we introduce a fracture which computes y <- sx + b by modifying the last encoding in the sequence. We repeat this for all instructions w, never changing an intermediate encoding once it has been chosen (since some s_i's may appear on paths to multiple y-consumers w); i.e., if recodings have been chosen for one path, don't change them for another overlapping path.
We proceed as above for every pair P_i for which F_i = 1. We then convert the implementation to SSA form (see §2.10.2) and expand all of the RECODE macro-instructions.
5.2.14. Random Instruction Reordering. If the implementation is not in SSA form, convert it to SSA form (see §2.10.1).
Ensuring that redundant MOVE instructions are first elided (see §5.2.1), reorder the instructions in each BB as a topological sort of its instructions using the dependency-based partial order in §[...]. Among the candidate
Figure imgf000104_0001
of an instruction during the sort, the successor is chosen uniformly at random.
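A randomized topological sort of this kind can be sketched in Python. The dictionary-of-sets representation of the dependency partial order and the function name are assumptions for illustration; instruction identifiers are assumed to be mutually comparable (for example, strings).

```python
import random


def random_topological_order(deps, rng):
    """Return a random ordering of instructions that respects dependencies.

    deps: dict mapping each instruction to the set of instructions that must
          precede it within the same basic block.
    rng:  a random.Random instance used to pick among eligible candidates.
    """
    remaining = {i: set(d) for i, d in deps.items()}
    order = []
    while remaining:
        # candidates are instructions whose predecessors have all been emitted
        ready = sorted(i for i, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle or unknown predecessor")
        choice = rng.choice(ready)        # uniform choice among the candidates
        order.append(choice)
        del remaining[choice]
        for d in remaining.values():
            d.discard(choice)
    return order
```

Each run can produce a different instruction order, but every order preserves the producer-before-consumer constraints of the block.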
5.2.15. Final Cleanups and Code Emission. Perform copy elision (see §5.2.1), branch-to-branch elision (see §5.2.2), and unused code elimination
(see §5.2.3).
Perform register minimization (see §5.2.12) to minimize the number of registers (temporary variables), but making no attempt to change the number of locations used in the shuffling array A (see §5.2.13). When minimization completes, the code is in SSA form.
Emit the code.
6. ClearBox Mark III
The Mark III proposal differs from that for Mark I (see §4) and Mark II
(see §5) in that it has a variable internal structure in which both coefficients and structure vary among Base function implementation pairs.
As previously, the primary vehicle is a mutually inverse pair of cipher- or hash-like function implementations formed in mutually inverse pairs according to an algorithm, belonging to a very large family of such pairs, with the precise pair determined both by the algorithm and a (typically large) key K. In addition to the key information K, the algorithm which forms the pairs consumes randomization information R, which is used to specify those obfuscational aspects of the implementation which do not affect the external behavior of the pair, but only the internal processing by which this external behavior is achieved.
6.1. Design Principles. We expect Mark III to be used in environments in which the implementation is exposed to white- and/or grey-box attacks, and in which the operation of the applications making use of Mark III involves communication across a network.
6.1.1. Security-Refresh Rate. For effective applications security lifecycle management, applications must resist attacks on an ongoing basis. As part of this resistance, we expect such applications to self-upgrade in response to security-refresh messages containing security renewal information. Such upgrades may involve patch files, table replacements, new cryptographic keys, and other security-related information.
A viable level of security is one in which application security is refreshed frequently enough that the time taken to compromise an instance's security is longer than the time to the security-refresh which invalidates the compromise; i.e., instances are refreshed faster than they can typically be broken.
This is certainly achievable at very high security-refresh rates. However, such frequent refresh actions consume bandwidth, and as we raise the refresh rate, the proportion of bandwidth allocated to security-refresh messages increases, and available non-security payload bandwidth decreases.
Plainly, then, engineering the appropriate security-refresh rate is required for each kind of application, since the tolerable overheads vary greatly depending on context. For example, if we expect only gray-box attacks (neighbor side-channel attacks) in a cloud application, we would use a lower refresh rate than if we expected white-box attacks (insider attacks by malicious cloud-provider staff).
6.1.2. External and Internal Vulnerabilities and Attack-Resistance. Suppose that our pair of implementations implement functions f_K, f_K^-1 where f_K, f_K^-1 are T-functions. Then by repeatedly applying either of these functions, we can precisely characterize its computations using a bit-slice attack in which we first consider the operation of these functions ignoring all but the low-order bits, and then the low-order two bits, and so on, gaining information until the full word size (say 32 bits) is reached, at which point we have complete information on how the function behaves, which is tantamount to knowledge of the key K.
This is an external vulnerability. While the attack gains knowledge of implementation details, it does so without any examination of the code implementing those details, and could be performed as an adaptive known plaintext attack on a black-box implementation.
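The property that makes the bit-slice attack possible can be checked with a short Python sketch. The particular function below, x + (x*x OR 5) modulo 2^32, is only a hypothetical example of a T-function and is not taken from any Mark implementation; the helper name is likewise an assumption.

```python
MASK32 = 0xFFFFFFFF


def example_t_function(x):
    """A sample T-function on 32-bit words: x + (x*x | 5) mod 2**32."""
    return (x + ((x * x) | 5)) & MASK32


def low_bits_depend_only_on_low_bits(f, x, k):
    """Check that the low k output bits of f(x) are fixed by the low k input bits."""
    mask = (1 << k) - 1
    return f(x) & mask == f(x & mask) & mask
```

Because the low k output bits never depend on higher input bits, an attacker can characterize the function one bit-slice at a time, starting from the low-order bit, using only black-box queries.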
A less severe external vulnerability exists if the functions of the pair have the property that each acts as a specific T-function on specific domains, and the number of distinct T-functions is low. In this case, a statistical bucketing attack can characterize each T-function. Then if the domains can similarly be characterized, again, without any examination of the code, using an adaptive known plaintext attack, an attacker can fully characterize the functionality of a member of the pair, completely bypassing its protections, using only black-box methods.
Plainly, we must ensure that the effective number of distinct T-functions is sufficient to foil the above attack. (In Mark III implementations, there are over 10^8 distinct T-functions per segment and over 10^40 T-functions over all.)
Now, suppose that the pair of implementations comprises functions which achieve full cascade (every output depends on every input, and on average, changing one input bit changes half of the output bits). An example of an internal vulnerability occurs in the Mark II implementation where, by 'cutting' the implementation at certain points, we can find a sub-implementation (a component) corresponding to a matrix such that the level of dependency is exactly 2 x 2 (in which case the component is a mixer matrix) or 4 x 4 (in which case it is one of the L, S, or R matrices). Once these have been isolated, properties of linear functions allow very efficient characterization of these matrices.
This is an internal attack because it requires non-black-box methods: it actually requires examination of internals of the implementations, whether static (to determine the dependencies) or dynamic (to characterize the matrices by linearity-based analyses).
As a general rule, the more we can completely foil external attacks, and force the attacker into increasingly fine-grained internal attacks, the harder the attacker's job becomes, and most especially, the harder the attacks become to automate.
Automated attacks are especially dangerous because they can effectively provide class cracks which allow all instances of a given technology to be broken by tools which can be widely distributed.
Thus, we seek, by using variable and increasingly intricate internal structures and increasingly variegated defenses, to create an environment in which
(1) any full crack of an instance requires many sub-cracks;
(2) the needed sub-cracks vary from instance to instance;
(3) the structure and number of the attacked components varies from instance to instance; and
(4) the protection mechanisms employed vary from instance to instance;
so that automating the attack becomes a sufficiently large task to discourage attackers from attempting it (since, in the substantial time it would take to build such an attack tool, the deployed protections would have moved on to a new technology where the attack-tool's algorithm no longer suffices).
6.2. Initial Structure: Choosing f_K and f_K^-1. Mark III functions have an input and output width of 12 32-bit words for a total width of 384 bits. The implementation consists primarily of a series of segments, in which each segment is an instance of function-indexed interleaving (FII). The series of segments is preceded and followed by initial and final mixing steps intended to foil blind dependency-analysis attacks (ones performed starting at the inputs, used to derive dependency relations without considering the structural details of the implementation under attack).
We will mainly deal with f_K. The segment inverses are fairly obvious given our well-known FII methods, and the over-all f_K^-1 is found by concatenating the inverses of the segments in reverse order, sandwiched between the initial and final mixing steps in reverse order.
Each such segment has a left-function, a selector computation which uses the same inputs as the left-function, and a right-function. The right-function is a nested instance of FII. Hence each segment divides the input- and output-vectors into three subvectors: the portion which enters and exits the outer left-function, the portion which enters and exits the inner left-function, and the portion which enters and exits the inner right-function. We will call these the left, middle, and right subvectors.
6.2.1. Selecting Matrices Over Z/(2^32). We select matrices in two different ways:
• general: select an m x n matrix over Z/(2^32) at random under the constraints that no element is a 0 or 1 and all elements are distinct; or
• invertible: select an n x n matrix over Z/(2^32) according to the method given in §3.3, but with the additional constraints that the resulting matrix contains no element with value 0 or 1 and all elements are distinct.
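The general selection method can be sketched in Python as below. Only the general case is shown, since the invertible construction of §3.3 is not reproduced here; the function name and the list-of-rows representation are assumptions for illustration.

```python
import random


def general_matrix(m, n, rng, modulus=2**32):
    """Pick an m x n matrix over Z/(modulus) with distinct entries, none 0 or 1."""
    wanted = m * n
    values = set()
    while len(values) < wanted:
        values.add(rng.randrange(2, modulus))   # range excludes 0 and 1
    flat = list(values)
    rng.shuffle(flat)                           # set order is not random; shuffle
    return [flat[row * n:(row + 1) * n] for row in range(m)]
```

For instance, `general_matrix(1, 4, random.Random())` yields a 1 x 4 selector-style row of four distinct words.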
6.2.2. *Initial and Final Mixing Steps. In §2.8 we give techniques for permuting a sequence of elements or other forms of choices using decisions having a sorting-network topology. By replacing conditional swaps with 2 x 2 bijective matrices mixing each input into each output, we can take precisely the same network topology and produce a mixing network which mixes every input of the f_K function with every other initially, and we can employ another such network finally to mix every output of the function with every other. As was the case with permutations, the mixing is not entirely even, and its
Figure imgf000106_0003
using the techniques in §2.8, but again, with conditional swaps replaced by mixing steps.
6.2.3. *Subdividing a Segment's Input and Output
Figure imgf000106_0004
The following choices are an example only: many other choices are possible with different widths and wider choices of division-sizes.
If input subvectors for a segment are statically divided in a particular way, say 3-5-4 for left, middle, and right, respectively, then so are its outputs.
The permitted subvector lengths for the above are three, four, five, and six 32-bit words. Since each input and output vector has length twelve 32-bit words (384 bits), it follows that the ten permitted configurations, in lexicographic order from left to right and then down, are
3-3-6 3-4-5 3-5-4 3-6-3 4-3-5
4-4-4 4-5-3 5-3-4 5-4-3 6-3-3
If we number the above ten configurations in order from 0 to 9, then the number of the configuration chosen for each segment we generate is chosen statically by rand(10); i.e., we choose from the ten possibilities above uniformly at random at construction time.
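The list of ten configurations can be reproduced with a short Python sketch that enumerates all ways of splitting twelve words into three parts of length three to six. The function names are hypothetical, and `rng.randrange(10)` stands in for the rand(10) of the text.

```python
import random


def permitted_configurations(total=12, lengths=(3, 4, 5, 6)):
    """All (left, middle, right) subvector lengths summing to `total`,
    in lexicographic order."""
    return [(r, s, t)
            for r in lengths for s in lengths for t in lengths
            if r + s + t == total]


def choose_configuration(rng):
    """Pick one configuration uniformly at random at construction time."""
    configs = permitted_configurations()
    return configs[rng.randrange(len(configs))]
```

Enumerating in nested-loop order gives exactly the lexicographic numbering 0 to 9 used above.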
6.2.4. *Selecting the Ordering for a Segment's Inputs and Outputs. The first segment inputs the initial inputs to f_K or f_K^-1, and hence is input-unconstrained. Similarly, the last segment outputs to f_K or f_K^-1, and hence is output-unconstrained. Thus the inputs of the first segment, or outputs of the last, are attached to the initial inputs or final outputs, respectively, uniformly at random.
In all other cases, we select the ordering of inputs and outputs of segments as follows.
We note that, for any segment, the outputs of its left output-subvector depend only on the inputs of its left input-subvector, the outputs of its middle output-subvector depend only on the inputs of its left and middle input-subvectors, and the outputs of its right output-subvector depend on the inputs of the left, middle, and right subvectors.
We therefore statically link the inputs of a segment Y to the outputs of its preceding segment X uniformly at random under the following constraints.
(1) Segment X's right output-vector outputs must have the maximal number of links to segment Y's left input-vector inputs. For example, if X is a 6-3-3 segment and Y is a 3-4-5 segment, then three of X's right output-subvector outputs are linked to three of Y's left input-subvector inputs.
(2) Any of segment X's right output-vector outputs which are not linked to segment Y's left input-vector inputs under constraint (1) above must be linked to segment Y's middle input-vector inputs. For example, in the 6-3-3 X, 3-4-5 Y case above, the remaining three of X's right output-vector outputs are linked to three of Y's middle input-vector inputs.
(3) After satisfaction of constraints (1) and (2) above, segment X's middle output-vector outputs are linked into the leftmost possible of Y's input-subvectors, where those in Y's left input-subvector are leftmost, those in Y's middle input-subvector are intermediate between leftmost and rightmost, and those in Y's right input-subvector are rightmost.
A summary of the above is that we statically attempt to maximize dependencies on inputs as we transfer information from one segment to the next. We are always guaranteed to achieve 'full cascade' after two segments (so that every output depends on every input), and we also try to maximize the width of the data-flow carrying these dependencies (hence constraints (2) and (3) above).
6.2.5. *A Concatenation of Segments. f_K (and hence f_K^-1) is a sequence of segments. Each segment's basic configuration r-s-t is chosen statically according
to §6.2.3, and each is statically linked from the original inputs or from its predecessor according to §6.2.4. The number of successive segments making up an f_K (and hence f_K^-1) implementation is uniformly randomly chosen
Figure imgf000108_0001
6.2.6. *Immutable Encodings. At certain points in the code, we use immutable encodings. The significance of an immutable encoding, whether it is the identity encoding, a linear encoding, or a permutation polynomial encoding, is that it cannot be changed when obfuscation is applied to the f_K or f_K^-1 implementation: its presence is part of the semantics and therefore cannot be modified.
If immutable encodings are mentioned without specification of the encoding, a permutation polynomial is assumed.
6.2.7. *Creating a Segment. Given a configuration r-s-t (for example), we create an f_K segment as follows (per Figure 11):
1100 Using the invertible method of §6.2.1, choose an r x r matrix
L, an s x s matrix M, and a t x t matrix R, and 24 uniformly randomly chosen immutable encodings: 12 applied to the inputs to these matrices and 12 applied to their outputs (viewing the matrices as vector-mapping functions). Let us refer to these three matrices with their input and output immutable encodings as functions ℒ, ℳ, ℛ. Then ℒ = L_out ∘ L ∘ L_in, ℳ = M_out ∘ M ∘ M_in, and ℛ = R_out ∘ R ∘ R_in, where X_out performs the immutable output encodings, and X_in performs the immutable input encodings, attached to matrix X, for X in {L, M, R}.
1105 Using the general method of §6.2.1, choose a 1 x r selector matrix
C with corresponding function 𝒞 which takes the same inputs as ℒ and has input encodings L_in and output encoding C_out; i.e., 𝒞 = C_out ∘ C ∘ L_in. (The corresponding f_K^-1 segment will have a selector of the form C_out ∘ C ∘ L^-1 ∘ L_out^-1, which will be simplified, of course.)
Take the two high-order bits of 𝒞's output and add 2 to form an iteration count in the range 2 to 5. One less than this iteration count is a number in the range 1 to 4 which is the number of times the entire right side function (taking the s-t inputs and producing the s-t outputs) has its outputs fed directly back into its inputs and is executed all over again, before its outputs are passed on to a succeeding segment.
1110 Choose 2s selector functions, s for the inputs to M and s for its outputs, each similar to 𝒞 above. The high-order four bits of these provide numbers in the range 0 to 15, which added to 8, provide rotation counts in the range 8 to 23. The rotation counts from the first s selectors are applied to the inputs to M, and the rotation counts from the remaining s selectors are applied to the outputs from M.
These rotations are not permuted when the inputs and outputs of M are permuted in the next step.
1115 Clawjse p ¾ s(Io «)^ se!wtor pairs Ai,...,Ap. Bi,...,Bp, Wi ,....¾», V|,..., V^, mdi similar to C above, which provide ji t sufficient: t#-to- T coiitfMrisofis and sufficient t»$o- coiBparis:>ns to control our random periipitaiion of the inputs to and outputs front M. res|wtivi:4y. hy the inethod of ¾2,8 by eoutrolliag its random swaps, E*w?h eomparion geaerates » Boolean decision { s*¾p or don't s¾¾pj with a swap probabilit of atKatt ~, APPENDIX
The logical ordering of selected functionality around the s inputs and outputs for M is: initial rotations, then initial permutation, then input encodings, then matrix mapping, then t-input-output functionality (the R-part functionality), then output encodings, then output permutation, then final rotations. When a selector uses M inputs, it uses them with the same encoding as M does (namely, the $M_{in}$ encodings), so irrespective of any permutations, a selector always sees exactly the same inputs in exactly the same order as M does.
Note that all of the above steps are inside the loop for the M functionality; i.e., everything from initial rotations to final rotations is performed on each iteration.
As a result, simplifications are possible: for example, the input encodings need not be done separately for M and the selectors which use M's inputs: they can share the same encoded values.
We now proceed with the inner FII implementation composed of the s-input-output part and the t-input-output part (the M-functionality part and the R-functionality part).
Using the general method of §0,2.1, choose a $1 \times s$ selector matrix $C'$ with corresponding function $\mathcal{C}'$ which takes the same inputs as $\mathcal{M}$ and has input encodings $M_{in}$ and output encoding $C'_{out}$; i.e., $\mathcal{C}' = C'_{out} \circ C' \circ M_{in}$. (The corresponding $f_K^{-1}$ segment will have a selector of the form $C'_{out} \circ C' \circ M^{-1} \circ M_{out}^{-1}$ which will be simplified, of course.)
Take the two high-order bits of $\mathcal{C}'$'s output and add 2 to form an iteration count in the range 2 to 5. One less than this iteration count is a number in the range 1 to 4 which is the number of times the R functionality (taking the t inputs and producing the t outputs) has its outputs fed directly back into its inputs and is executed all over again, during one iteration of the s-inputs-outputs, M-part loop. I.e., in one iteration for the middle s-inputs-outputs part, all iterations for the t-inputs-outputs are performed, so if the s-part iterates four times and the t-part iteration count is three, the t-part is repeated 12 times: three times for each s-part iteration.
Choose 2t selector functions, in two groups of t, each similar to $\mathcal{C}'$ above. The high-order four bits of these provide numbers in the range 0 to 15, which added to 8, provide rotation counts in the range 8 to 23. The rotation counts from the first group are applied to the inputs to R and the rotation counts from the second group are applied to the outputs from R.
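The iteration and rotation counts described above reduce to a few shifts. The following is a minimal sketch, assuming 32-bit words (consistent with the 12-words-wide, 384-bit Mark III vectors) and ignoring any output encoding on the selector word. The function names and the choice of a left rotation are illustrative assumptions, not taken from the source.

```python
WORD_BITS = 32                      # assumption: 32-bit machine words
MASK = (1 << WORD_BITS) - 1

def iteration_count(sel: int) -> int:
    """Two high-order bits of the selector output, plus 2: range 2..5."""
    return ((sel & MASK) >> (WORD_BITS - 2)) + 2

def rotation_count(sel: int) -> int:
    """Four high-order bits of the selector output, plus 8: range 8..23."""
    return ((sel & MASK) >> (WORD_BITS - 4)) + 8

def rotate_left(x: int, n: int) -> int:
    """Rotate a WORD_BITS-wide word left by n positions (direction assumed)."""
    n %= WORD_BITS
    x &= MASK
    return ((x << n) | (x >> (WORD_BITS - n))) & MASK

def t_part_executions(s_iterations: int, t_iterations: int) -> int:
    """The t-part loop runs in full inside every s-part iteration."""
    return s_iterations * t_iterations
```

With the figures used in the text, an s-part count of four and a t-part count of three give `t_part_executions(4, 3)`, i.e. 12 executions of the t-part.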
Figure imgf000109_0001
permutation of the inputs to and outputs from R, respectively, by the method of §2.8 by controlling its random swaps. Each comparison generates a Boolean decision (swap or don't swap) with a swap probability of about 1/2.
The logical ordering of selected functionality around the t inputs and outputs for R is: initial rotations, then initial permutation, then input encodings, then matrix mapping, then output encodings, then output permutation, then final rotations. When a selector uses R inputs, it uses them with the same encoding as R does (namely, the $R_{in}$ encodings), so irrespective of any permutations, a selector always sees exactly the same inputs in exactly the same order as R does.
Note that all of the above steps are inside the loop for the t-inputs-outputs (R-part) functionality; i.e., everything from initial rotations to final rotations is performed on each iteration.
As a result, simplifications are possible: for example, the input encodings need not be done separately for R and the selectors which use R's inputs: they can share the same encoded values.
6.3. Obfuscating $f_K$ or $f_K^{-1}$ Implementations. The following methods are employed to obscure a program implementing $f_K$ or $f_K^{-1}$, where implementations have the structure given in §6.2 above.
The transformations in the following sections are performed one after the other except where otherwise noted in the body of the sections.
6.3.1. Cleanups. The cleanups listed in §5.2.1, §5.2.2, and §5.2.3 are performed as needed, as in the Mark II implementation.
6.3.2. *Hash Insertion and Generation of Distinct Dynamic Values. The transform of §5.2.4 is performed, but using a $1 \times 12$ matrix taking all of the inputs.
Otherwise, this is very similar to the corresponding Mark II step.
6.3.3. Macro Instruction Expansion. This is done as in the Mark II implementation.
6.3.4. *Come-From Insertion. This is done as in the Mark II implementation. Note that, in Mark III, all of the control-flow exists to create the nested per-segment loops.
6.3.5. Random Instruction Reordering. We note that, in function-indexed interleaving (FII) as employed in the generation of segments of the implementation, we have divided inputs and outputs into possibly irregular groups of widths r-s-t, respectively. In $f_K$ and $f_K^{-1}$,
• the r outputs depend only on the r inputs;
• the s outputs depend on the r and s inputs; and
• the t outputs depend on the r, s, and t inputs;
where the selector computation for the FII between r and s is considered part of the s computation, and the selector computation for the FII between s and t is considered part of the t computation. Note that, with this understanding, the s outputs do not depend on the r outputs, and the t outputs do not depend on the r and s outputs.
If the implementation is not in SSA form, convert it to SSA form (see §2.10.1) and remove redundant MOVE instructions (see §5.2.1).
We now topologically sort each segment by itself, thereby mixing the r, s, and t instruction sequences randomly.
We similarly topologically sort the initial mixing by itself, and the final mixing by itself.
We concatenate the sorted orderings: initial mixing, segment 1, segment 2, ..., segment k, final mixing. Create a new relation R representing the 'precedes' relationship in this concatenated ordering. Create a new relation R' by removing one in every two arcs $(x, y) \in R$ uniformly at random, and, uniting R' with the execution constraints to form the overall 'precedes' relation, topologically sort the entire sequence again.
Instruction reordering is now complete.
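The final relaxation step can be sketched as follows. This is an illustration under stated assumptions rather than the implementation described in the source: the 'precedes' relation is modelled as the arcs between consecutive instructions of the concatenated ordering, the execution constraints are supplied as a list of (earlier, later) arcs that the concatenated ordering already satisfies, and the names `random_topological_sort` and `relaxed_reorder` are hypothetical.

```python
import random

def random_topological_sort(nodes, arcs, rng):
    """Kahn's algorithm, picking uniformly at random among ready nodes."""
    successors = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for x, y in dict.fromkeys(map(tuple, arcs)):   # drop duplicate arcs
        successors[x].append(y)
        indegree[y] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:
        n = ready.pop(rng.randrange(len(ready)))
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(nodes):
        raise ValueError("constraints contain a cycle")
    return order

def relaxed_reorder(sequence, constraints, rng):
    """Drop half of the ordering arcs at random, keep every execution
    constraint, and topologically sort the whole sequence again."""
    chain = list(zip(sequence, sequence[1:]))      # assumed form of R
    removed = set(rng.sample(range(len(chain)), len(chain) // 2))
    kept = [arc for i, arc in enumerate(chain) if i not in removed]
    return random_topological_sort(sequence, kept + list(constraints), rng)
```

Because the execution constraints are never removed, any ordering returned still respects every data dependency; only the incidental ordering inherited from concatenation is loosened.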
6.3.6. *Data-Flow Duplication. This is performed as in the Mark II implementation (see §5.2.8).
6.3.7. *Random
Figure imgf000111_0001
This is done as in the Mark II implementation (see §5.2.9).
6.3.8. *Check Insertion. This is done as in the Mark II implementation (see §5.2.10), with the following change: among the candidate placements for a check, with a certain probability, a candidate within the current segment is chosen (where such a candidate exists), and with a certain probability, a candidate in a later segment is chosen (where such a candidate exists). As a result of this change, and of the modified reordering scheme in §6.3.5, with high probability all of the r, s, t segments are cross-connected and made dependent on one another by the inserted checks.
6.3.9. Transcoding. This is done as in the Mark II implementation (see §5.2.11).
6.3.10. Register Minimization. This is performed as in the Mark II implementation (see §5.2.12).
6.3.11. *Memory Shuffling. This is performed as in the Mark II implementation (see §5.2.13). Note that, since we have loops but no if-then-else, some anomalies which could arise in the Mark II implementation are eliminated.
6.3.12. Final Cleanups and Code Emission. These are performed as in the Mark II implementation (see §5.2.15).
7. BLENDING AND ANCHORING TECHNIQUES
The value of a member of a pair of mutually inverse base functions is greatly increased if it can be anchored to the application which employs it and the platform on which that application resides, and if its data and code can be blended with the data and code of the application of which it forms a part, in the sense that the boundary between different kinds of data or code becomes blurred.
The effect of such anchoring and blending is to
(1) foil code- and data-lifting attacks,
(2) foil input-point, output-point, and other boundary attacks by obfuscating the exact position of the boundary, and
(3) increase dependencies between the protective code and data and their surrounding contextual code and data, thereby increasing tamper-resistance by increased fragility under tampering.
The kinds of data and code boundaries to be addressed are:
(1) input boundaries, where unencoded data must be encoded and brought from its unprotected domain into the protected domain (see §7.3),
(2) output boundaries, where protected data must be decoded and brought from the protected domain into the unprotected domain (see §7.3),
(3) entry boundaries, where control passes from unprotected code into protected and obfuscated code (see §7.4),
(4) exit boundaries, where control passes from protected and obfuscated code into unprotected code (see §7.4),
(5) separating boundaries, where data changes from a form in which entropy from multiple variables is evenly mixed across a sizable bit-vector to a form in which entropy from individual variables is more isolated, although still encoded, and computations are performed on these more isolated variables, and
(6) mixing boundaries, where data changes from a form in which entropy from individual variables is encoded but relatively isolated, generally containing the results of computations on such variables, to a form in which entropy from multiple variables is evenly mixed across a sizable bit-vector.
The challenge for protecting separating and mixing boundaries is that frequently, the data on which computations are performed after a separation or before mixing come from other sites where the data were also separated into relatively few variables. This permits an attack in which isolated variables are perturbed, with the result that, after mixing and reseparation, isolated variables respond to the perturbation, revealing the connection of the values at the perturbing site to those at the responding site.
In addition to the above forms of blending, we seek to anchor code and data to their context by means of interlocking techniques including
(1) data-dependent coefficients, where data-flow at some code sites in the code provides variables used to compute coefficients which control encodings at subsequently executed code sites in the code (see §7.1 below),
(2) data-flow duplication with cross-checking and cross-trapping, where certain parts of the data-flow are replicated (but with different encodings), data-flow links are swapped among duplicates, and computations are injected which, if the duplicates match with respect to their unencoded values, have no net effect, but if the duplicates fail to match with respect to their unencoded values, cause computation to subtly fail or degrade,
(3) data-flow corruption and repair, where errors are injected into data-flow at certain code sites, and these errors are corrected at subsequently executed code sites,
(4) control-flow corruption and repair, where only executing code needs to be in a correctly executable state, so long as, as part of its execution, it ensures the correct executability of the subsequent state; in effect, there is a moving window of correctness including the currently executing code, and code is corrupted when left but corrected before entry, via changes to data such as routine-variables, case-indices, and the like, to avoid problems inherent in self-modifying code; i.e., all such corruption should affect data used in control, not the actual code itself,
(5) shared blackboards, where multiple pieces of code make use of an instance of dynamic data mangling, where a dynamically addressed store with ongoing data-shuffling and recoding is shared among multiple pieces of code, making it harder for attackers to separate the data-flow of one of the pieces of code from that belonging to other pieces of code,
(6) parallel functions, where code for performing some function is interleaved with that performing one or more others so that the two or more functions
Figure imgf000113_0001
other techniques above makes separating the code for the two functions difficult, and
(7) subset lumps-and-pieces, where we deploy subsets of the functionality of the lumps-and-pieces control-flow protection patent (US 6,779,114), including switchable very large routines which combine multiple routines into a single routine.
7.1. Data-Dependent Coefficients. Examining the formulas for permutation-polynomial inverses (see §2.3.2, §2.3.3, and §2.3.4),
we note that multiplicative ring inverses over the word ring (typically $\mathbb{Z}/(2^{32})$ or $\mathbb{Z}/(2^{64})$ on current computers) are used extensively. (They are the denominators in the fractions in the formulas: e.g., $a/c^4$ means $ac^{-4}$ which means $a(c^{-1})^4$, where $c^{-1}$ is the multiplicative ring inverse of $c$.)
For a machine with a $w$-bit word, $c$ and $c^{-1}$ are two numbers such that $c \cdot c^{-1} = 1$, where $\cdot$ is the multiplication operation within the ring; i.e., it is multiplication of two $w$-bit binary words with overflow ignored as in C and C++.
While we can easily find such inverses computationally by employing the Extended Euclidean Algorithm [15, 26], this is undesirable because the algorithm is easily recognizable. Thus we need another means of converting some of the entropy provided by input data into a choice of random coefficients for permutation polynomials.
We recommend an approach along these lines: in advance, randomly choose numbers $a_0, a_1, \ldots, a_{w-1}$ in which each $a_i$ is an odd number in the range 3 to $2^w - 1$ inclusive, all $a_i$'s are pairwise distinct, and there is no pair $a_i, a_j$ in the list such that $a_i \cdot a_j = 1$. We will also employ their multiplicative inverses, found in advance using the Extended Euclidean Algorithm noted above.
Then for any random nonzero word value $v$ chosen from the early computations within a CB function, we choose a product $c$ using the bits of $v$: if bit $2^i$ is set in
Figure imgf000113_0002
i.e., replace $w$ by some smaller number and have a smaller number of $a_i$'s and their inverses, which is probably still adequate: 18 bits instead of 32 still would permit a selection of over 200,000 coefficient + inverse coefficient pairs.
Note that we have only provided a means for generating odd coefficients. Other kinds of coefficients are easier to generate, since we only require their additive inverses (even elements don't have multiplicative inverses). In order to generate a coefficient which is even, we simply double a generated value $v$. In order to create one whose square is 0, we simply logically left shift $v$ by $\lceil w/2 \rceil$ positions (i.e., we multiply it by $2^{\lceil w/2 \rceil}$ with overflow ignored).
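The coefficient scheme of this section can be sketched as below. Part of the source passage describing the product is lost, so the selection rule used here is an assumed reading: bit i of v decides whether a_i enters the product c, and the same bit decides whether the precomputed inverse of a_i enters the product for the inverse of c. The table size, the shift of ⌈w/2⌉ for square-zero coefficients, and all function names are likewise assumptions for illustration.

```python
import random

def make_factor_tables(rng, count=32, word_bits=32):
    """Precompute `count` odd factors and their inverses mod 2**word_bits."""
    mod = 1 << word_bits
    factors = []
    while len(factors) < count:
        a = rng.randrange(3, mod, 2)        # odd value in 3 .. 2**w - 1
        a_inv = pow(a, -1, mod)             # done in advance (Python 3.8+)
        # Reject duplicates and any value whose inverse is already listed
        # (or is itself), so that no two entries multiply to 1.
        if a in factors or a_inv in factors or a == a_inv:
            continue
        factors.append(a)
    inverses = [pow(a, -1, mod) for a in factors]
    return factors, inverses

def coefficient_pair(v, factors, inverses, word_bits=32):
    """Build c and its inverse from the bits of v using only multiplies."""
    mask = (1 << word_bits) - 1
    c = c_inv = 1
    for i in range(len(factors)):
        if (v >> i) & 1:
            c = (c * factors[i]) & mask
            c_inv = (c_inv * inverses[i]) & mask
    return c, c_inv

def even_coefficient(v, word_bits=32):
    """An even coefficient: double v with overflow ignored."""
    return (2 * v) & ((1 << word_bits) - 1)

def square_zero_coefficient(v, word_bits=32):
    """A coefficient whose square is 0: shift left by ceil(w / 2)."""
    shift = (word_bits + 1) // 2
    return (v << shift) & ((1 << word_bits) - 1)
```

At run time only multiplications and bit tests appear, so nothing resembling the Extended Euclidean Algorithm is visible in the protected code. Passing `count=18` gives the reduced table mentioned in the text, with 2^18 (over 200,000) selectable pairs.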
7.2. Fine Control for Encoding Intensity. In the current Transcoder, there are settings called data-flow level and control-flow level which nominally run from 0 to 100, indicating how strongly encoded the data- or control-flow should be.
Traditionally, the mechanisms used to effect this ostensibly fine-grained control are of two varieties:
(1) sudden differences in behavior which occur at certain specific numeric thresholds in the data- or control-flow level, so that, below the threshold, a certain transformation is not applied, and above the threshold, it is, and
(2) fine-grained differences in the probability threshold for performing a certain transformation on a certain code fragment, so that at a lower data-flow level, a pseudo-random variate might have to fall above a high threshold in order to cause it to be transformed, whereas at a higher one, it might only have to fall above a much lower threshold in order to cause the transformation.
We have no problem with method (1), but we can improve on (2). The problem with method (2) is that, simply by chance, the actual level achieved may not fall near the intended level. We therefore recommend the following improvement.
We keep a running tally of total sites to be covered, sites covered so far, and how many received the probability-controlled transform (the included sites) and how many did not (the excluded sites). When the ratio of included sites is below the desired ratio, we increase the probability of performing the transform above its nominal value, and when it is above the desired ratio, we decrease the probability of performing the transform below its nominal value. A proper setting for the degree of increase and decrease can be gauged by experimentation. This can effectively cause the actual ratio for an affected region of code to be protected to closely track the desired ratio, instead of wandering away from that ratio due to chance effects.
This will have its best effect when the total number of potential sites is large. Little fine-grained control can be exercised if only a few sites exist, short of massive code duplication to increase the effective number of sites.
This readily extends to cases with more than two choices. Consider for example the use of permutation polynomial encodings over $\mathbb{Z}/(2^{32})$ (or over $\mathbb{Z}/(2^{64})$ for recent, more powerful platforms). If we vary among no encoding or encoding of degrees 1 through 6, then there are seven possible choices to be covered, amongst which we apportion the probabilities according to the desired ratios. The same principle applies: if we are getting too little of something, we push its probability up; if we are getting too much, we push its probability down.
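The running-tally correction can be sketched as a small feedback loop. This is a minimal illustration, assuming a linear correction with a tunable `gain` (the text leaves the degree of increase and decrease to experiment); the class and method names are hypothetical.

```python
import random

class RatioTracker:
    """Choose among alternatives so that achieved ratios track desired ones."""

    def __init__(self, desired, gain=4.0):
        total = float(sum(desired.values()))
        self.desired = {k: v / total for k, v in desired.items()}
        self.counts = {k: 0 for k in desired}
        self.gain = gain                    # assumed correction strength

    def probabilities(self):
        seen = sum(self.counts.values())
        weights = {}
        for choice, target in self.desired.items():
            actual = self.counts[choice] / seen if seen else target
            # Too little so far: raise above nominal; too much: lower it.
            weights[choice] = max(target + self.gain * (target - actual), 0.0)
        norm = sum(weights.values())
        return {k: w / norm for k, w in weights.items()}

    def choose(self, rng):
        probs = self.probabilities()
        choices = list(probs)
        pick = rng.choices(choices, weights=[probs[c] for c in choices])[0]
        self.counts[pick] += 1
        return pick
```

The two-choice case uses keys such as `"included"` and `"excluded"`; the seven-choice polynomial example would use one key for no encoding and one for each of degrees 1 through 6.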
7.3. Input and Output Boundaries. At an input boundary, unencoded data must be encoded and brought from its unprotected domain into the protected domain. At an output boundary, protected data must be decoded and brought from the protected domain into its unprotected domain.
This is an appropriate place to deploy fine control of encoding intensity (see §7.2) for data-flow. Measuring data-flow distance in number of graph arcs in the data-flow graph, where an arc connects a value-producing operation to the operation which consumes it, we proceed as follows.
(1) Protect the implementation at its chosen intensity (normally fairly high).
(2) Protect operations which are 1, 2, 3, ..., k arcs away from operations inside the implementation with diminishing intensities until the normal transcoded intensity of surrounding code is reached, for both the input boundary and the output boundary.
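One way to realize the diminishing intensities of step (2) is a ramp over the arc distance. The sketch below assumes a linear ramp on the 0 to 100 level scale of §7.2; the source only requires that intensity diminish with distance, so the linear shape and the name `blended_intensity` are assumptions.

```python
def blended_intensity(distance, core_level, ambient_level, k):
    """Encoding level for an operation `distance` arcs from the implementation.

    distance 0 gets core_level; distance k or more (on either side, so the
    sign is ignored) gets ambient_level; in between the level is interpolated.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    d = min(abs(distance), k)
    return core_level + (ambient_level - core_level) * d / k
```

For example, with a core level of 90, surrounding code at 30, and k = 6, an operation three arcs before the input boundary would be encoded at level 60.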
This requires that we be able to gauge such distances, which requires some extra support from the Transcoder to add such selective distance information and to respond to it by controlling encoding intensity as outlined in §7.2.
Additional protections which apply to input/output boundaries are data-dependent coefficients, to increase the dependency of the computations at the entry of the base function on the code which provides the inputs and to increase the dependency of the computations receiving the base function outputs on the code within the implementation which provides those outputs, and shared blackboards (if data can enter and leave via a shared blackboard, an instance of dynamic data mangling, then it is much harder for an attacker to follow the data-flow for that data).
7.4. Entry and Exit Boundaries. Typically, the point at which an implementation receives its inputs immediately follows the point at which control enters the implementation, and the point at which an implementation provides its outputs immediately precedes the point at which control leaves the implementation.
As a result, all of the protections in §7.3 typically also protect the entry and exit boundaries. However, the implementation will typically have stronger control-flow protections than regular transcoded code.
Therefore we need to perform fine-grained stepwise increment on entry, and stepwise diminishment on exit, of the control-flow level. Our metric for distance here is the estimated number of FABRIC or machine-code instructions to be executed along the shortest path leading to the entry-point (for entry) or departing from the exit-point (for exit), with blending for a distance of, say, one or two hundred instruction units for each of entry and exit.
This would be an excellent place to deploy control-flow corruption and repair to tie together the code nearing the entry and the PROTECTIVE entry code, and the PROTECTIVE exit code and the code moving away from the exit, to increase the level of protection in the vicinity of the PROTECTIVE entry and exit.
7.5. Separating and Mixing Boundaries. The general situation in which we encounter separating and mixing boundaries is one in which structured data is output in lightly encoded or unencoded form from an implementation, or lightly encoded or unencoded structured data enters an implementation, or a pair of implementations of the invention are used to sandwich a decision-making computation which we wish to hide.
The effect of separating and/or mixing is that we have potentially exposed data after the separation or prior to the mixing, creating an attack point. Thus, in addition to the relevant protections for these situations already covered in §7.3 and §7.4, we need stronger protections for any computations which we need to sandwich between base functions used as separating or mixing functions.
If the decision is based on checking a password, or some similar equality comparison, we strongly recommend the method of §A as the best choice for its protection. However, we are rarely so fortunate.
The more common case is that we need to perform some arithmetic, some comparisons, some bitwise Boolean operations, and so on. For these, we recommend the following protections (see §7):
(1) first and most important, data-flow duplication with cross-checking and cross-trapping to massively increase the data-dependencies between the initial base-function and the decision code, and the decision code and the final base-function;
(2) liberal use of data-flow-dependent coefficients, with coefficients in the decision block set by the preceding base-function and coefficients in the following base-function set by the code in the decision block;
(3) use of shared blackboards (dynamic data mangling arrays) as sites used to communicate from the initial base-function to the decision code and from the decision code to the final base-function; and
(4) if possible, parallel functions, so that the decision code is mixed with other irrelevant code, making it hard for the attacker to analyse and distinguish from the code with which it is computed in parallel by interleaving their computations.
7.6. General Protections. Certain protections can be applied at every boundary, and between boundaries, in order to further protect code in contexts where base functions are deployed, namely control- and data-flow corruption and repair, parallel functions, and subset lumps and pieces. Where feasible, these added protections will increase the analytical difficulties faced by the attacker: in particular, they will render static analysis infeasible and dynamic analysis costly.
7.7. Exemplary Protection Scenario. Data is provided in a form encoded via a base function, so that information is smeared across its entire state-vector. The data thus encoded comprises
(1) a 128-bit key
(2) a 128-bit data structure with a variety of fields, some only a few bits wide, one 32 bits wide, and some up to 16 bits wide
A computation is to be performed on the fields of the data structure and information on the current platform, as a result of which either the key will be delivered in a form ready for use (indicating that the current platform information plus the delivered data structure lead to a decision to release the key), or a nonsense string of the same size as the key will be delivered in a form which appears to be ready for use but in fact will fail (indicating that the current platform information plus the delivered data structure lead to a decision not to release the key).
The attacker's goal is to obtain the 128-bit key irrespective of the content of the fields and the information on the current platform. The defender's goal is to ensure that the key is correctly delivered on a 'yes' decision unless the defender's implementation is tampered with, but is not obtainable by the attacker in situations where the decision would be 'no' in the absence of tampering.
This captures the primary blending needs of the protective system: there are input, output, entry, exit, separating, and mixing boundaries to protect.
7.8. Implementing the Protections with Blending. Here we describe the implementation of the protections from data-flow-dependent coefficients through subset lumps-and-pieces, listed starting in §7, for the protection scenario described in §7.7, as proposed in §7.2 through §7.6.
7.8.1. Starting Configuration. Our starting configuration comprises the Transcoder intermediate representation of the PROTECTIVE SYSTEM CORE, a 128-bit × 256-bit function into which the containing application inputs a 256-bit value containing in encoded form a 128-bit encoded key and a 128-bit data structure, and from which the application receives a 128-bit differently encoded key ready for use.
The core comprises
(1) an entry 256-bit × 256-bit base-function accepting 256 bits in which entropy is mixed across the entire 256 bits, decoding this to a structure with a 128-bit encoded key (encoded by some other 128-bit × 128-bit base-function beforehand) and a 128-bit data structure with separated fields in smooth (unencoded) form;
(2) a decision-block accepting the 128-bit data structure and the 128-bit key, performing computations on fields of the 128-bit data structure, deciding whether to release the key or not, and providing to a second 128-bit × 128-bit base-function either the 128-bit encoded key itself (if the decision is 'proceed') or a value which uses the key and further information from the 128-bit data structure as an entropy source, and provides to a second 128-bit × 128-bit base-function either the encoded key or a nonsense value of the same width;
(3) an exit 128-bit × 128-bit base-function which returns a differently encoded key ready for use in some white-box cryptographic function (e.g., AES-128).
The entry and exit base-functions are constructed according to the Mark III design (see §6). The core is inline code in its containing routine; i.e., it is not entered by a routine call nor exited by a routine return: rather, surrounding functionality is included within a routine containing both the surrounding functionality and the core.
The combination of the core and the application is called the program.
7.8.2.
Figure imgf000117_0002
Segment Input and Output Vectors. The following material is exemplary: far wider choices exist. Here we choose doubly recursive function-indexed interleaving, yielding three-part division of segments. It could also be singly recursive (two-part division), triply recursive (four-part division), or order n recursive ((n + 1)-part division).
In §6.2.3 a division was given for I/O vectors in the 12-words-wide (384-bit I/O) Mark III implementation. According to the above, we have an 8-words-wide entry base-function and a 4-words-wide exit base-function. We subdivide entry
Figure imgf000117_0003
2-2-4 2-3-3 3-2-4 3-3-2 4-2-2
If we number the above four configurations in order from 0 to 3, then the number of the configuration chosen for each segment we generate is chosen statically by rand(4); i.e., we choose from the four possibilities above uniformly at random at construction time.
For the exit base-function, the segments are subdivided as follows:
1-1-2 1-2-1 2-1-1
If we number the above three configurations in order from 0 to 2, then the number of the configuration chosen for each segment we generate is chosen statically by rand(3); i.e., we choose from the three possibilities above uniformly at random at construction time.
7.8.3. *Distance Metrics. We make use of measures of distance from an operation to a core input and from a core output to an operation. There are four metrics: two for data-flow distance and two for control-flow distance.
*Data-Flow Distances. We collect input-distances down to -200 and output-distances up to +200. Beyond that point, we can ignore greater distances, and apply heuristic methods to avoid computing them.
The distance of every computational instruction in the core from the core is zero. (A computational instruction is one which either inputs one or more values or outputs one or more values.)
The distance of every other computational instruction (ci) from the core is either negative (if it provides values which affect values which are consumed by the core) or positive (if it consumes values affected by values which are produced by the core).
We assume that, for the most part, both are not true; i.e., that either the core is not in the body of a loop in which the core is repeatedly employed, or it is in a loop, but that the loop is sufficiently extensive that we can ignore any information fed into the core which is affected by outputs from previous core executions. However, instructions may reside in routines which may be called both before and after the execution of the core, in which case the data-flow distance is a pair comprising its input-distance and its output-distance.
Input-distance is determined as follows.
• If a ci outputs a value which is a direct input to the core, or loads a value which is a direct input to the core which the core accepts as a data-flow edge (i.e., as a 'virtual register'), or stores an input which the core loads from memory, its input-distance is -1. Otherwise,
• if a ci x outputs a value which is a direct input to a ci y which has an input-distance of -k (and possibly an output-distance of +k' as well due to instructions in routines called both before and after the core, as noted above), or x loads a value which is input by such a y, or stores an input which such a y loads from memory, its input-distance is -k - 1.
When the above considerations give an instruction multiple distinct input-distances, the one closest to zero is correct.
Output-distance is determined as follows.
• If the core outputs a value which is a direct input to a ci, or loads a value which is a direct input to the ci which the ci accepts as a data-flow edge (i.e., as a 'virtual register'), or stores an input which the ci loads from memory, its output-distance is +1. Otherwise,
• if a ci x with output-distance +k (and possibly an input-distance of -k' as well due to instructions in routines called both before and after the core, as noted above) outputs a value which is a direct input to a ci y, or such a ci x loads a value which is input by such a y, or stores an input which such a y loads from memory, y has output-distance +k + 1.
When the above considerations give an instruction multiple distinct output-distances, the one closest to zero is correct.
This metric completely ignores control-flow. A return-with-value instruction is treated as a load for this purpose. A routine-entry instruction inputting values to variables within a routine is considered to be a store instruction for this purpose.
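Because the distance closest to zero wins, both data-flow metrics amount to breadth-first searches outward from the core. The sketch below assumes the data-flow graph is given as (producer, consumer) arcs, with transfers through memory or virtual registers already represented as arcs, and that `core` is a set of instruction identifiers; the function names are hypothetical, and a plain cutoff stands in for the unspecified heuristics beyond distance 200.

```python
from collections import deque

def _nearest(core, neighbours, limit):
    """Breadth-first search from the core; returns unsigned arc counts."""
    dist = {}
    queue = deque((node, 0) for node in core)
    while queue:
        node, d = queue.popleft()
        if d == limit:                      # ignore anything farther away
            continue
        for nxt in neighbours.get(node, ()):
            if nxt in core or nxt in dist:  # first visit is the closest
                continue
            dist[nxt] = d + 1
            queue.append((nxt, d + 1))
    return dist

def data_flow_distances(arcs, core, limit=200):
    """Return (input_distance, output_distance) maps for non-core ci's."""
    consumers, producers = {}, {}
    for producer, consumer in arcs:
        consumers.setdefault(producer, []).append(consumer)
        producers.setdefault(consumer, []).append(producer)
    input_dist = {n: -d for n, d in _nearest(core, producers, limit).items()}
    output_dist = {n: d for n, d in _nearest(core, consumers, limit).items()}
    return input_dist, output_dist
```

An instruction in a routine used both before and after the core appears in both maps, which yields the pair of distances described above.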
*Control-Flow Distances. We collect entry-distances down to -200 and exit-distances up to +200. Beyond that point, we can ignore greater distances, and apply heuristic methods to avoid computing them.
We view instructions as connected by directed arcs in the control-flow graph of the program, conditional branches having two outgoing arcs (leading to the successor if the tested condition is true or false) and indexed branches (case- or switch-statement branches) having multiple successors chosen by a controlling index which is tested against the case-labels of the control-structure. For a routine return-instruction, its successors are determined by the sites from which the routine is called; i.e., they are all of the instructions which may be executed immediately after return from the routine, and the return instruction is considered to be an indexed branch to those post-return instructions.
Any instruction within the core has a control-flow distance of zero from the core. As above, we assume a non-looping scenario in which any looping involving the core is sufficiently large-scale and infrequent to permit us to ignore it. However, in the case of control-flow distance, instructions may reside in routines which may be called both before and after the execution of the core, in which case the control-flow distance is a pair comprising its entry-distance and its exit-distance.
Entry-distance is determined as follows.
• If an instruction has a successor instruction in the core or is a branch with a destination in the core, its entry control-flow distance is −1. Otherwise,
• if an instruction x has an immediate successor instruction y which has an entry control-flow distance of −k (and possibly an exit control-flow distance of +k′ as well, due to instructions in routines called both before and after the core, as noted above), or x is a branch one of whose destination instructions is such a y, its entry control-flow distance is −k − 1.
When the above considerations give an instruction multiple distinct entry-distances, the one closest to zero is correct.
Exit-distance is determined as follows.
• If a core instruction has a successor instruction outside the core or is a branch with a destination instruction outside the core, that instruction outside the core has an exit control-flow distance of +1. Otherwise,
• if an instruction x which has an exit control-flow distance of +k (and possibly an entry control-flow distance of −k′ as well, due to instructions in routines called both before and after the core, as noted above) has an immediate successor instruction y, or if such an x branches to an instruction y, then y has an exit control-flow distance of +k + 1.
When the above considerations give an instruction multiple distinct exit-distances, the one closest to zero is correct.
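The entry- and exit-distance rules above amount to a shortest-path computation outward from the core, which can be sketched as a breadth-first search. This is only an illustrative sketch: the graph representation (`succ`), the function name `control_flow_distances`, and the tiny example program are assumptions, not part of the described implementation. Breadth-first order guarantees that the first distance assigned to an instruction is the one closest to zero.

```python
from collections import deque

def control_flow_distances(succ, core, limit=200):
    """Return (entry, exit) distance maps for instructions outside the core.

    succ  -- dict: instruction -> list of possible successor instructions
    core  -- set of instructions inside the core (distance zero, not listed)
    limit -- distances beyond this magnitude are not collected
    """
    # Reverse the arcs so entry-distances can be found by walking backwards.
    pred = {}
    for x, ys in succ.items():
        for y in ys:
            pred.setdefault(y, []).append(x)

    def bfs(neighbours, step):
        dist = {}
        queue = deque()
        # Instructions adjacent to the core get distance -1 (entry) or +1 (exit).
        for c in core:
            for y in neighbours.get(c, ()):
                if y not in core and y not in dist:
                    dist[y] = step
                    queue.append(y)
        while queue:
            x = queue.popleft()
            if abs(dist[x]) >= limit:
                continue
            for y in neighbours.get(x, ()):
                if y not in core and y not in dist:
                    dist[y] = dist[x] + step
                    queue.append(y)
        return dist

    return bfs(pred, -1), bfs(succ, +1)

# Hypothetical program: routine r is reached both before and after the core {k1, k2}.
succ = {"a": ["r"], "r": ["b", "d"], "b": ["k1"], "k1": ["k2"],
        "k2": ["c"], "c": ["r"], "d": []}
entry, exit_ = control_flow_distances(succ, {"k1", "k2"})
```

In this example the routine instruction `r` receives both an entry-distance and an exit-distance, matching the pair described for routines called on both sides of the core.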
7.8.4. Cleanups. The cleanups listed in §5.2.1, §5.2.2, and §5.2.3 are performed as needed, as in the Mark II implementation, not only for the entry and exit base-function implementations, but for the decision-block and the application as well.
7.8.5. *Hash Insertion and Generation of Distinct Dynamic Values. Instead of performing the transform of §5.2.4 once per base-function (i.e., performing it for the entry function, and performing it separately for the exit function), we perform the transform within the application using a 1 × 16 matrix whose inputs are selected from data available in the application some time before the entry base-function is computed. We use it to create an array for dynamic data mangling which will serve the application, both the entry and exit base-functions, and the decision-block, so that they all use one shared blackboard.
7.8.6. Macro Instruction Expansion. This is done as in the Mark II and Mark III implementations.
7.8.7. Come-From Insertion. This is done as in the Mark III implementation, but extends to every branch with an entry-distance or exit-distance from the core which has an absolute value of 100 or less; i.e., it extends well beyond the limits of the ClearBox implementations in the core.
7.8.8. *Control-Flow Corruption and Repair. As part of processing the code, the code is flattened: the branch labels are made destination points in a switch-statement-like construct, and destinations are reached by branching to this switch, passing it an index which corresponds to the switch case-label which leads to the desired destination.
This should be done for all code which is in the core or which is in a basic block with any instructions having an entry-distance or exit-distance from the core with an absolute value of 100 or less.
We consider the destinations' corresponding indices to be stored in variables $v_1, \ldots, v_n$, corresponding to nodes in the control-flow graph which represent basic blocks $B_1, \ldots, B_n$ (entered via a label and exited via a final branch, return, or call).
Prior to flattening, we randomly label each basic block in the corruption region with a total bijective function $L_i : \{1, \ldots, n\} \to \{1, \ldots, n\}$, for $i = 1, \ldots, n$, under the following constraints.
(1) If a basic block $B_i$ can exit to blocks $B_{j_1}, \ldots, B_{j_m}$, then its labelling function $L_i$ has the property that $L_i(j_k) = j_k$, for $k = 1, \ldots, m$.
(2) If two distinct basic blocks $B_i$, $B_j$ can both exit to block $B_k$, then $L_i = L_j$.
(3) If a basic block $B_i$ can exit to a block $B_j$, then the number of points $k$ at which $L_i(k) \neq L_j(k)$ is bounded above by four times the number of destination basic blocks possessed by any basic block which can exit to $B_j$.
After flattening, every basic block $B_m$ is entered with the variables $v_1, \ldots, v_n$ in a state such that, for its predecessors' $L_i$, $L_i(j) = v_j$ for $j = 1, \ldots, n$. (This does not mean that the variables' states are correct, but only that they agree with the predecessors' $L_i$.) It then proceeds to swap variables so that, for each variable, $L_m(j) = v_j$; this almost certainly is a different state of $v_1, \ldots, v_n$ than the one with which it was entered, although, in view of the constraints, the number of changes has a reasonable bound.
Thus, by the time the end of a basic block is reached, the variables correspond to destinations in such a way that current destinations are correct, but most others are usually incorrect.
7.8.9. Random Instruction Reordering. We note that, in function-indexed interleaving (FII), as employed in the generation of segments of the implementation, we have divided inputs and outputs into possibly irregular groups of widths r-s-t, respectively. In $f$ and $f^{-1}$ for each of the entry and exit base-functions,
• the r outputs depend only on the r inputs;
• the s outputs depend on the r and s inputs; and
• the t outputs depend on the r, s, and t inputs;
where the selector computation for the FII between r and s is considered part of the s computation, and the selector computation for the FII between s and t is considered part of the t computation. Note that, with this understanding, the s outputs do not depend on the r outputs, and the t outputs do not depend on the r and s outputs.
If the program is not in SSA form, convert it to SSA form (see §2.10.1) and remove redundant MOVE instructions (see §5.2.1).
We now topologically sort each segment in each of the entry and exit base-functions by itself, thereby mixing the r, s, and t instruction sequences randomly. We similarly topologically sort the initial mixing by itself, and the final mixing by itself, in each base-function. We likewise topologically sort every basic block (every straight-line stretch of code without any branches or routine calls or returns) in the application and the decision-block.
For each of the entry and exit base-functions, we concatenate the sorted orderings: initial mixing, segment 1, segment 2, ..., segment k, final mixing. Create a new relation R representing the 'precedes' relationship in this concatenated ordering. Create a new relation R′ by removing one in every two arcs (x, y) ∈ R uniformly at random, and, uniting R′ with the execution constraints to form the overall 'precedes' relation, topologically sort the entire sequence again.
Instruction reordering for the program is now complete.
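The reordering procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the described implementation: the names `random_topological_sort` and `reorder` are hypothetical, the relation R is taken to be the arcs between consecutive instructions of the concatenated ordering, and the execution constraints are assumed to be consistent with the order in which the parts are concatenated.

```python
import random

def random_topological_sort(nodes, arcs, rng):
    """Kahn's algorithm, choosing uniformly among the currently ready nodes."""
    indegree = {v: 0 for v in nodes}
    outgoing = {v: [] for v in nodes}
    for x, y in arcs:
        outgoing[x].append(y)
        indegree[y] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    order = []
    while ready:
        v = ready.pop(rng.randrange(len(ready)))
        order.append(v)
        for w in outgoing[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)
    if len(order) != len(nodes):
        raise ValueError("constraints contain a cycle")
    return order

def reorder(parts, constraints, rng):
    """parts: lists of instructions (initial mixing, segments, final mixing);
    constraints: set of (x, y) pairs meaning x must execute before y."""
    # Sort each part by itself, respecting only the constraints inside it.
    sorted_parts = []
    for part in parts:
        members = set(part)
        local = [(x, y) for x, y in constraints if x in members and y in members]
        sorted_parts.append(random_topological_sort(part, local, rng))
    concatenated = [ins for part in sorted_parts for ins in part]
    # R: 'precedes' arcs of the concatenated ordering (assumed: consecutive pairs).
    r = list(zip(concatenated, concatenated[1:]))
    # R': drop half of the arcs of R, chosen uniformly at random.
    r_prime = rng.sample(r, len(r) - len(r) // 2)
    # Unite R' with the execution constraints and sort the whole sequence again.
    overall = set(r_prime) | set(constraints)
    return random_topological_sort(concatenated, overall, rng)

parts = [["a1", "a2", "a3"], ["b1", "b2", "b3"]]
constraints = {("a1", "a3"), ("a2", "b1"), ("b1", "b3")}
order = reorder(parts, constraints, random.Random(0))
```

Keeping only half of R retains some of the segment structure while still letting instructions from neighbouring parts interleave; the execution constraints are always honoured.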
7.8.10. *Data-Flow Duplication with Cross-Checking/Trapping. The method for these transformations is as in Mark II (see §5.2.8, §5.2.9, and §5.2.10), with the modifications in Mark III (section reference illegible in the source), but it is also done for additional pieces of code.
Specifically, in addition to its normal use within the entry and exit base-functions, we also perform these transformations for the data-flow within the decision-block, including the transfer of information from the outputs of the entry base-function to inputs of the decision-block and the transfer of information from the outputs of the decision-block to the inputs of the exit base-function.
There are further changes to these steps for our blending scenario (see §7.7), covered in the next section.
7.8.11. *Decision Hiding. In the decision-block, the fields of the 128-bit structure are examined, computations are performed upon them, and a pass-fail decision is reached and delivered as a value. We duplicate some of these computations so that the decision value, consisting of one of a pair of arbitrary constants $c_{pass}$ and $c_{fail}$, is generated at least eight times. Transcoding will make these decision values look distinct.
Since they are duplicates, cross-linking and cross-checking apply to them. In particular, we can assume that they will yield $c_{pass}$, and on that basis, perform operations on data-flow words in the key as it is input to the exit base-function which cancel on a pass but do not cancel on a fail. The cancelling values can make use of further values from the structure (if $c_1 - c_2$ cancels, then so does $(c_1 - c_2)c_3$).
The combination of this with cross-linking and cross-checking as in §7.8.10 will cause the key to vary chaotically in a data-dependent fashion, but will, with high probability, cause a nonsense value of the same size as the key to be delivered to the application code following the exit base-function whenever the decision-block's tests on the structure lead to a 'fail' decision. (This method is related to the password-checking technique in §A.)
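The cancelling idea can be illustrated with a small sketch. Everything named here is an assumption made for illustration: 32-bit words, the constants `C_PASS` and `C_FAIL`, the function `deliver_key`, and the use of modular addition as the perturbing operation. The sketch omits transcoding, cross-linking and cross-checking entirely; it only shows a term that vanishes on a pass and corrupts every key word on a fail.

```python
MASK32 = 0xFFFFFFFF
C_PASS = 0x3C6EF372  # hypothetical arbitrary constant meaning 'pass'
C_FAIL = 0xA54FF53A  # hypothetical arbitrary constant meaning 'fail'

def deliver_key(key_words, decisions, structure_words):
    """Perturb each key word by a term that cancels only when the decision is C_PASS.

    decisions       -- the duplicated decision values (each C_PASS or C_FAIL)
    structure_words -- further values taken from the examined structure
    """
    out = []
    for i, word in enumerate(key_words):
        d = decisions[i % len(decisions)]
        # Force the multiplier odd so a non-zero difference stays non-zero mod 2**32.
        c3 = structure_words[i % len(structure_words)] | 1
        # (d - C_PASS) is zero on a pass, hence so is (d - C_PASS) * c3.
        out.append((word + (d - C_PASS) * c3) & MASK32)
    return out

key = [0x01234567, 0x89ABCDEF, 0x0BADF00D, 0xFEEDFACE]
good = deliver_key(key, [C_PASS] * 8, [10, 11])
bad = deliver_key(key, [C_FAIL] * 8, [10, 11])
```

On a pass the key is delivered unchanged; on a fail the result has the same size as the key but every word differs from the true key.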
7.8.12. Transcoding. This is done as in the Mark II implementation (see §5.2.11), but with the following changes.
We divide the protection level as follows:
(1) finite-ring protection using linear mappings;
(2) perm-poly protection using quadratic polynomials;
(3) 2-vector function-matrix protection using quadratic polynomial elements.
Code in the core is protected at level 3 (strong). Code out to an input-distance or output-distance with an absolute value not exceeding 100 is protected at level 2 (intermediate), and the remainder of the application is protected at level 1 (weak).
In addition, transcoding makes use of data-dependent coefficients as follows.
(1) constants derived in the application code leading up to the entry base-function set one eighth of the coefficients in the entry base-function's transcoding;
(2) constants derived in the entry base-function set one quarter of the coefficients in the transcoding of the decision-block;
(3) constants derived in the decision-block set one quarter of the coefficients in the exit base-function;
(4) constants derived in the exit base-function set at least one eighth of the coefficients in the application code receiving the outputs from the exit base-function.
7.8.13. Register Minimization. This is performed as in the Mark II implementation (see §5.2.12), but for the whole program.
7.8.14. *Dynamic Data Mangling (Memory Shuffling). This is performed as in the Mark II implementation (see §5.2.13), but affects code beyond the core.
In particular, the shared blackboard provided by the shuffled memory is used to provide the inputs from the application to the entry base-function, the outputs from the entry base-function to the decision-block, the inputs from the decision-block to the exit base-function, and the outputs from the exit base-function to the application.
7.8.15. Final Cleanups and Code Emission. These are performed as in the Mark II implementation (see §5.2.15), but for the entire program.
SECTION A. AUTHENTICATION BY EQUALITY WITH CHAOTIC FAILURE
Suppose we have an application in which authentication is password-like: authentication succeeds where G, the supplied value, matches a reference value Γ; i.e., when G = Γ.
Further suppose that we care about what happens when G = Γ, but if not, we only insist that whatever the authentication authorized is no longer feasible. That is, we succeed when G = Γ, but if G ≠ Γ, further computation may simply fail.
The authenticating equality is not affected by applying any non-lossy function to both sides: for any bijection φ, we can equivalently test whether φ(G) = φ(Γ). The authenticating equality may remain valid with high probability even if φ is lossy, if φ is carefully chosen so that the probability that φ(G) = φ(Γ) when G ≠ Γ is sufficiently low (as it is in Unix password authentication, for example).
Based on technology previously described herein, we can easily perform such a test. We previously described a method for foiling tampering by duplicating data-flow (see §5.2.8), randomly cross-connecting the data-flow between duplicate instances (see §5.2.9), and performing encoded checking to ensure that the equalities have not been compromised (see §5.2.10).
We can adapt this approach to test whether G = Γ (in encoded form, whether φ(G) = φ(Γ)). We note that a data-flow yielding φ(G) already duplicates a data-flow yielding φ(Γ) along the success path where G = Γ. We therefore omit, for this comparison, the data-flow duplication step. Then we simply cross-connect as in §5.2.9 and insert checks as in §5.2.10. By using these computations as coefficients for future encoded computations, we ensure that, if φ(G) = φ(Γ), all will proceed normally, but if φ(G) ≠ φ(Γ), while further computation will proceed, the results will be chaotic and its functionality will fail. Moreover, since φ is a function, if φ(G) ≠ φ(Γ), we can be sure that G ≠ Γ.
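A toy sketch of authentication by equality with chaotic failure follows. All names and constants are assumptions chosen for illustration: φ is taken to be an affine bijection on 32-bit words (an odd multiplier makes it invertible), `PHI_REF` stands for the encoded reference φ(Γ), and `protected_scale` stands for some later computation whose coefficient absorbs the difference φ(G) − φ(Γ). No branch ever tests the password; a wrong value simply yields a wrong result.

```python
MASK32 = 0xFFFFFFFF

def phi(value):
    """Hypothetical bijection on 32-bit words (odd multiplier, so it is invertible)."""
    return (0x9E3779B1 * value + 0x7F4A7C15) & MASK32

# Only the encoded reference phi(Gamma) needs to appear in the program.
PHI_REF = phi(0xC0FFEE)

def protected_scale(supplied, x):
    """Intended behaviour: return (5 * x + 7) mod 2**32 when supplied == Gamma."""
    diff = (phi(supplied) - PHI_REF) & MASK32  # zero exactly when supplied == Gamma
    a = (5 + diff) & MASK32                    # the coefficient absorbs the comparison
    return (a * x + 7) & MASK32

ok = protected_scale(0xC0FFEE, 3)
wrong = protected_scale(0xC0FFEF, 3)
```

With the correct value the computation proceeds normally; with any other value the coefficient is perturbed and the result is unrelated to the intended one.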
SECTION C.
Polynomial inverse computations over $\mathbb{Z}/(2^n)$, program component transformations and
ClearBox Cryptography *
In the theory and practice of software obfuscation and protection, transformations over $\mathbb{Z}/(2^n)$ and more generally over $\mathbb{B}^n$ play an important role. ClearBox research is not an exception. In this note we present algorithms to compute the permutation inverse $f^{-1}(x)$ of a given permutation polynomial $f(x)$ and the multiplicative inverse $f(x)^{-1}$ of a given invertible polynomial $f(x)$ over $\mathbb{Z}/(2^n)$. Results of special polynomial functions for efficient implementation to cooperate with general obfuscation principles are discussed and presented.
We also investigate algorithms to generate matrices over $\mathbb{B}^n$ with polynomials as their determinants and describe algorithms to use permutation polynomials and matrix functions over $\mathbb{B}^n$ to transform arithmetic operations and data arrays. These transformations can be composed with existing MBA transformations for the protection of software operations in the general ClearBox cryptography settings. Examples are given to illustrate new algorithms.
C.1 Introduction and notations
Let $\mathbb{N}$ be the set of natural numbers and $\mathbb{Z}$ the integer ring. Let $\mathbb{B} = \{0, 1\}$.
The mathematical base of the arithmetic logic unit of a microprocessor is abstracted in the following algebra system.
Definition 1. With $n \in \mathbb{N}$, we define the algebraic system $(\mathbb{B}^n, \wedge, \vee, \oplus, \neg, \le, \ge, >, <, \le^s, \ge^s, >^s, <^s, \ne, =, \gg^s, \gg, \ll, +, -, *)$, a Boolean-arithmetic algebra (BA-algebra), or BA[n], where $\ll$, $\gg$ denote left and right shifts, $*$ denotes multiply, and signed compares and arithmetic right shift are indicated by $^s$. $n$ is the dimension of the algebra.
BA[n] includes the Boolean algebra $(\mathbb{B}^n, \wedge, \vee, \neg)$, the integer modular ring $\mathbb{Z}/(2^n)$, and the Galois field $GF(2^n)$.
Note that a very basic requirement of protection design is to make its implementation easily mix with application code. Therefore, building transformations on BA[n] becomes an efficient approach. We also argue that it is sufficient because there are enough computationally hard problems directly related to BA[n].
* Version: January 30, 2012.
C.2 Polynomials over $\mathbb{Z}/(2^n)$
Let $f(x)$ be a function over $\mathbb{Z}/(2^n)$, where $n \in \mathbb{N}$. If $f(x)$ is representable as $\sum_{i=0}^{m-1} a_i x^i$, where $a_i \in \mathbb{Z}/(2^n)$, $i = 0, \ldots, m-1$, and $m \in \mathbb{N}$, then $f(x)$ is a polynomial function, or a polynomial. $\mathbb{Z}/(2^n)[x]$ is the set of all polynomials over the ring $\mathbb{Z}/(2^n)$.
Let $PP(2^n)[x]$ be the set of all permutation polynomials in $\mathbb{Z}/(2^n)[x]$.
Let $x^{(k)}$ be the falling factorial power $x(x-1)(x-2)\cdots(x-k+1)$, where $k \in \mathbb{N}$. Any polynomial $f(x)$ can be represented as $\sum_{k=0}^{m-1} a_k x^{(k)}$, where $x^{(0)}$ is 1.
For $i \in \mathbb{N}$, let $v(i) = \sum_{r \ge 1} \lfloor i/2^r \rfloor$. Each polynomial over $\mathbb{Z}/(2^n)$ can be uniquely expressed in the form $f(x) = \sum_{j=0}^{w} a_j x^{(j)}$, where $a_j \in \mathbb{Z}/(2^{n-v(j)})$ and $w$ is the unique integer for which $2^n \mid (w+1)!$ but $2^n \nmid w!$. Because of this uniqueness, $w$ is called the degree of $f(x)$, denoted by $\deg(f(x))$, or $\partial(f(x))$.
Note that $v(i)$ equals the 2-adic order of $i!$, which is $i - s$, where $s$ is the sum of all digits of $i$ in binary number representation, or the Hamming weight of $i$. This is quite useful in several algorithms in this note.
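The quantities defined above are easy to compute, as the following sketch shows. The function names (`v`, `max_degree`, `log2_function_count`) are illustrative choices; the count is derived here from the unique representation with $a_j \in \mathbb{Z}/(2^{n-v(j)})$ for $j = 0, \ldots, w$, which gives $\prod_j 2^{n-v(j)}$ distinct polynomial functions.

```python
def v(i):
    """2-adic order of i!, computed as the sum of floor(i / 2**r) for r >= 1."""
    total = 0
    i >>= 1
    while i:
        total += i
        i >>= 1
    return total

def max_degree(n):
    """Largest w with 2**n not dividing w!, i.e. v(w) < n <= v(w + 1)."""
    w = 0
    while v(w + 1) < n:
        w += 1
    return w

def log2_function_count(n):
    """Base-2 logarithm of the number of distinct polynomial functions over Z/(2**n)."""
    return sum(n - v(j) for j in range(max_degree(n) + 1))

degree_32 = max_degree(32)
bits_32 = log2_function_count(32)
bits_64 = log2_function_count(64)
```

The Hamming-weight identity offers a quick cross-check: `v(i)` should equal `i - bin(i).count("1")` for every non-negative `i`.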
For polynomials over $\mathbb{Z}/(2^n)$, the upper bound of their degrees is the number $w$ such that $v(w+1) \ge n$ but $v(w) < n$. Assume $n$ is a power of 2, and $n = 2^t$. Because $v(n+1) = v(2^t+1) = 2^t + 1 - 2 = 2^t - 1 < n$ and $v(n+2) = v(2^t+2) = 2^t + 2 - 2 = 2^t = n$, we have $w = 2^t + 1 = n + 1$. For example, for polynomials over $\mathbb{Z}/(2^{32})$, the highest possible degree is 33.
Because the highest degree of polynomials in $\mathbb{Z}/(2^n)[x]$ is about $n$, the computation cost is greatly reduced compared with polynomials over the finite field $GF(2^n)$.
There are quite a number of permutation polynomials over the ring. The cardinality of $\mathbb{Z}/(2^n)[x]$ is $2^{n(n+2) - \sum_{k=2}^{n+1} v(k)}$. One eighth, or $2^{n(n+2) - \sum_{k=2}^{n+1} v(k) - 3}$, are permutations. For $n = 32, 64$, there are $2^{607}$, $2^{2271}$ permutations, respectively.
C.3 Permutation polynomials
For a given permutation polynomial $f(x) \in PP(2^n)[x]$, we consider its permutation inverse $f^{-1}(x)$, which is also referred to as the composition inverse.
C.3.1 Compute preimages (roots) of a permutation polynomial
For a given permutation polynomial, we have an algorithm to compute its preimages.
Proposition 1. Let $y = f(x) = \sum_{i=0}^{m} a_i x^i$ be a permutation polynomial over $\mathbb{Z}/(2^n)$. For any given value $\beta \in \mathbb{Z}/(2^n)$, we can find $\alpha \in \mathbb{Z}/(2^n)$ such that $f(\alpha) = \beta$ by the following steps:
1. Input $f(x)$ and $\beta$;
2. $\alpha = 0$;
3. the 0th bit of $\alpha$ is the 0th bit of $\beta - a_0$;
(a) for $i$ from 1 to $n - 1$
(b) the $i$th bit of $\alpha$ is the $i$th bit of $\beta - a_0 - (a_1 - 1)\alpha - \sum_{j=2}^{m} a_j \alpha^j$;
4. Output $\alpha$.
The computation is correct because, for a permutation polynomial $f(x)$, the $i$th bit of $(f(x) - x)$ is fully determined by the bit values of $x$ at positions $0, 1, \ldots, i-1$ and the coefficients of $f(x)$.
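The bit-by-bit recovery in Proposition 1 can be sketched as below. The helper names are hypothetical, and the example polynomial is simply one that satisfies the well-known criterion for permutation polynomials modulo $2^n$ (odd $a_1$, even $a_2 + a_4 + \cdots$, even $a_3 + a_5 + \cdots$). At step $i$ the expression $\beta - (f(\alpha) - \alpha)$ is evaluated with the bits of $\alpha$ found so far, and its $i$th bit is copied into $\alpha$.

```python
def poly_eval(coeffs, x, n):
    """Evaluate a_0 + a_1*x + ... modulo 2**n by Horner's rule."""
    mask = (1 << n) - 1
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * x + a) & mask
    return acc

def preimage(coeffs, beta, n):
    """Find alpha with f(alpha) = beta (mod 2**n) for a permutation polynomial f."""
    mask = (1 << n) - 1
    alpha = 0
    for i in range(n):
        # beta - a_0 - (a_1 - 1)*alpha - sum_{j>=2} a_j*alpha**j == beta - f(alpha) + alpha
        t = (beta - poly_eval(coeffs, alpha, n) + alpha) & mask
        alpha |= t & (1 << i)  # bit i depends only on the lower bits already fixed
    return alpha

# Hypothetical example: f(x) = 5 + 7x + 4x^2 + 2x^3 over Z/(2^8).
f = [5, 7, 4, 2]
recovered = [preimage(f, poly_eval(f, x, 8), 8) for x in range(256)]
```

Each call costs $n$ polynomial evaluations, so no table of the whole permutation is ever needed.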
C.3.2 The inverse of a permutation polynomial
In this section we present the following algorithm to compute the composition inverse of any given permutation polynomial in $\mathbb{Z}/(2^n)[x]$.
Proposition 2. Let $f(x) = \sum_{i=0}^{m} a_i x^i$ be a permutation polynomial over $\mathbb{Z}/(2^n)$, and let $f^{-1}(x) = \sum_{j=0}^{n+1} b_j x^{(j)}$ be its permutation inverse. The following steps provide a method to compute the coefficients of $f^{-1}(x)$.
1. for $i$ from 0 to $n + 1$
(a) Input $f(x)$ and $i$ to Proposition 1;
(b) Output $x_i$ (Note: $f(x_i) = i$);
2. $b_0 = x_0$;
3. for $j$ from 1 to $n + 1$
(a) $t_1 = \sum_{k=0}^{j-1} b_k \, j^{(k)} \bmod 2^n$
(b) $t_2 = (j! \gg v(j)) \bmod 2^n$
(c) $b_j = ((x_j - t_1) \gg v(j)) * t_2^{-1} \bmod 2^n$
4. Output $b_0, b_1, \ldots, b_{n+1}$.
The correctness of the algorithm is based on the following arguments.
[Figure imgf000128_0004: the argument is given as an image in the source.]
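A sketch of the interpolation behind Proposition 2 follows. It is a reading of the steps under stated assumptions: the inverse is interpolated in the falling factorial basis at the points $0, 1, \ldots, n+1$, each coefficient $b_j$ is reduced modulo $2^{n - v(j)}$, and the helper names are hypothetical. The block repeats the preimage routine so that it is self-contained.

```python
import math

def v(i):
    """2-adic order of i! (i minus the Hamming weight of i)."""
    return i - bin(i).count("1")

def poly_eval(coeffs, x, n):
    mask = (1 << n) - 1
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * x + a) & mask
    return acc

def preimage(coeffs, beta, n):
    mask = (1 << n) - 1
    alpha = 0
    for i in range(n):
        t = (beta - poly_eval(coeffs, alpha, n) + alpha) & mask
        alpha |= t & (1 << i)
    return alpha

def falling_power(x, k):
    """x * (x - 1) * ... * (x - k + 1), with the empty product equal to 1."""
    out = 1
    for t in range(k):
        out *= x - t
    return out

def inverse_coefficients(coeffs, n):
    """Falling factorial coefficients b_0..b_{n+1} of the composition inverse."""
    modulus = 1 << n
    xs = [preimage(coeffs, i, n) for i in range(n + 2)]  # f(xs[i]) == i
    b = [xs[0]]
    for j in range(1, n + 2):
        t1 = sum(b[k] * falling_power(j, k) for k in range(j)) % modulus
        t2 = (math.factorial(j) >> v(j)) % modulus  # odd part of j!
        bj = (((xs[j] - t1) % modulus) >> v(j)) * pow(t2, -1, modulus)
        b.append(bj % (1 << max(n - v(j), 0)))
    return b

def eval_falling(b, x, n):
    return sum(c * falling_power(x, k) for k, c in enumerate(b)) % (1 << n)

f = [5, 7, 4, 2]  # hypothetical permutation polynomial over Z/(2^8)
b = inverse_coefficients(f, 8)
round_trip = [eval_falling(b, poly_eval(f, x, 8), 8) for x in range(256)]
```

Only $n + 2$ preimages are computed, yet the resulting coefficients reproduce the inverse permutation on all $2^n$ inputs.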
C.4 Multiplicative inverse function of a polynomial
For a given polynomial function $f(x)$ over $\mathbb{Z}/(2^n)$, we want to determine if $f(x)$ has a multiplicative inverse function $g(x)$ such that $f(x) * g(x) = 1$ for all $x \in \mathbb{Z}/(2^n)$, denoted by $f(x)^{-1}$ if it exists. We also want to compute $f(x)^{-1}$.
Let $MIP(2^n)[x]$ be the set of all multiplicatively invertible polynomials in $\mathbb{Z}/(2^n)[x]$.
C.4.1 Criterion of $f(x) \in MIP(2^n)[x]$
Proposition 3. Let $f(x)$ be a polynomial function over $\mathbb{Z}/(2^n)$.
1. $f(x)$ has a multiplicative inverse function $f(x)^{-1}$ if and only if the coefficient $c_0$ is odd and $c_1$ is even in a falling factorial power expression $f(x) = \sum_k c_k x^{(k)}$;
2. There are $2^{n(n+2) - \sum_{k=2}^{n+1} v(k) - 2}$ multiplicatively invertible polynomials in $\mathbb{Z}/(2^n)[x]$;
3. $f(x)^{-1}$ is a polynomial and can be computed by an efficient algorithm.
[Figure imgf000129_0002: the proof is given as an image in the source; its surviving fragment ends "$x \in \mathbb{Z}/(2^n)$".]
The efficient algorithm is stated in the following proposition.
Proposition 4. Let $f(x)$ be a multiplicatively invertible polynomial in $\mathbb{Z}/(2^n)[x]$. Its multiplicative inverse $f(x)^{-1}$ can be generated by the following steps:
1. Set $x_0 = f(x) + 8 * g(x)$, where $g(x) \in \mathbb{Z}/(2^n)[x]$ is any polynomial;
2. Execute the recurrence equation
$x_{k+1} = x_k (2 - f(x) * x_k)$
$\log_2(n)$ times to generate a new polynomial $t(x)$;
3. Standardize $t(x)$ to its falling factorial representation of degree at most $(n + 1)$;
4. Output $t(x)$.
The correctness of the algorithm is based on the following observation: for any invertible element $\alpha \in \mathbb{Z}/(2^n)$, the Newton iteration used in the process doubles the number of bits in terms of accuracy of computing $\alpha^{-1}$. The number 8 is used because the first 3 bits of $f(x)$ and $f(x)^{-1}$ are identical for all $x$, due to the fact that the first 3 bits of $a$ and $a^{-1}$ are identical for all odd numbers $a \in \mathbb{Z}/(2^n)$.
Since polynomials are closed under the composition operation, we have the inverse in polynomial format.
Note that the algorithm with different initial values produces different intermediate computations, and therefore diversified code.
The algorithm is efficient since it takes only $\log_2(n)$ iterations. This symbolic computation produces a formula of the polynomial inverse which can be used to compute the coefficient instances of the inverse.
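The Newton recurrence of Proposition 4 can be sketched on coefficient lists as follows. The function names and the example polynomial are assumptions, and step 3 (standardizing to a falling factorial representation of degree at most $n + 1$) is deliberately omitted, so the returned polynomial is correct as a function but not in shortest form. The loop doubles the number of correct low-order bits from 3 until it reaches $n$.

```python
def poly_mul(p, q, n):
    """Multiply two coefficient lists modulo 2**n."""
    mask = (1 << n) - 1
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) & mask
    return out

def poly_eval(coeffs, x, n):
    mask = (1 << n) - 1
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * x + a) & mask
    return acc

def newton_inverse(f, n, g=(0,)):
    """Return a polynomial t with f(x) * t(x) == 1 (mod 2**n) for every x."""
    mask = (1 << n) - 1
    # x_0 = f(x) + 8*g(x): already correct in its 3 low-order bits for any g.
    size = max(len(f), len(g))
    x = [((f[i] if i < len(f) else 0) + 8 * (g[i] if i < len(g) else 0)) & mask
         for i in range(size)]
    bits = 3
    while bits < n:
        fx = poly_mul(f, x, n)
        two_minus = [(-c) & mask for c in fx]
        two_minus[0] = (two_minus[0] + 2) & mask
        x = poly_mul(x, two_minus, n)  # x_{k+1} = x_k * (2 - f * x_k)
        bits *= 2
    return x

f = [3, 4, 2]  # hypothetical: f(x) = 3 + 4x + 2x^2 is odd for every x
inv_a = newton_inverse(f, 32)
inv_b = newton_inverse(f, 32, g=(1, 1))  # a different starting point
samples = [0, 1, 2, 12345, 2**31 + 7, 2**32 - 1]
```

The two starting points give different intermediate polynomials but agree as functions, which is the diversification effect noted above.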
C.4.2 An algorithm to compute $f(x)^{-1}$
Another method to compute the inverse of a given multiplicatively invertible polynomial $f(x)$ is due to the fact that any polynomial $f(x)$ over $\mathbb{Z}/(2^n)$ can be determined by the set of values $\{f(0), f(1), \ldots, f(n+1)\}$. The following is a simple algorithm.
1. Take a multiplicatively invertible $f(x)$ as input
2. Compute $\{f(0), f(1), \ldots, f(n+1)\}$
3. Compute the modular inverses $\{f(0)^{-1}, f(1)^{-1}, \ldots, f(n+1)^{-1}\}$
4. Compute the coefficients of polynomial $g(x) = f(x)^{-1}$
(a) Compute falling factorial format coefficients $\{a_i\}$ of $g(x)$ based on the value set $\{g(0) = f(0)^{-1}, g(1) = f(1)^{-1}, \ldots, g(n+1) = f(n+1)^{-1}\}$
(b) Trim coefficients $a_i$ by modulo $2^{n - v(i)}$
(c) Convert falling factorial format to normal format
5. Output $g(x) = f(x)^{-1}$
The algorithm holds because of the simple fact that $f(i) * g(i) \equiv 1 \bmod (2^n)$, $i = 0, 1, \ldots, n + 1$. The step of trimming the coefficients is necessary in order to produce zero coefficients to have the shortest representation, which is needed for efficient computations.
C.4.3 Multiplicatively invertible polynomials with nilpotent coefficients
All multiplicatively invertible polynomials over $\mathbb{Z}/(2^n)$ form a group, the unit group of $\mathbb{Z}/(2^n)[x]$. Its subgroups can be investigated for efficient computations in terms of a reduced number of non-zero coefficients. For example, is the inverse of a nilpotent coefficient polynomial still a nilpotent one? If so, this can be an efficient subset. The following result is what we expected.
Lemma 1. Let $f(x) = a_0 + a_1 x + \cdots + a_m x^m \in \mathbb{Z}/(2^n)[x]$ with nilpotent coefficients: $a_i^2 \equiv 0 \bmod (2^n)$, $i = 1, 2, \ldots, m$. Then
1. $\partial(f(x)^s) \le \partial(f(x))$, for any $s \in \mathbb{N}$;
2. If $f(x)$ is multiplicatively invertible, that is, $a_0$ is odd, we have $\partial(f(x)^{-1}) \le \partial(f(x))$;
3. For any integer $m \in \mathbb{N}$, the set
$\{ \sum_{i=0}^{m} a_i x^i \in \mathbb{Z}/(2^n)[x] \mid a_0 \text{ odd},\ a_i^2 \equiv 0 \bmod (2^n),\ i = 1, \ldots, m \}$
is a subgroup of the unit group $(U((\mathbb{Z}/(2^n))[x]), *)$.
Here is a short proof. If we let $l(x) = f(x) - a_0$, then $l(x)^2 \equiv 0$. Therefore $f(x)^2 = a_0^2 + 2 a_0 l(x)$. Similarly $f(x)^3 = a_0^3 + 3 a_0^2 l(x)$. The first result follows from an induction on $s \in \mathbb{N}$. The second result can be proved by the Newton iteration process, $x_{k+1} = x_k (2 - f(x) * x_k)$ with $x_0 = 1$, and the first result. As a matter of fact, $x_1 = x_0 (2 - f(x) * x_0) = 2 - f(x)$, and $x_2 = x_1 (2 - f(x) * x_1)$. By induction $x_k$ is a polynomial of $f(x)$ and has a degree not greater than that of $f(x)$. The third result is easy to check thanks again to the nilpotent property of coefficients, and the proof is complete.
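The lemma can be checked numerically with a short sketch. The closed form used here, $f(x)^{-1} = a_0^{-1} - a_0^{-2}\, l(x)$, is our own consequence of $l(x)^2 \equiv 0$ from the proof, not a formula stated in the source, and the function names and example coefficients are hypothetical. Over $\mathbb{Z}/(2^{32})$ any multiple of $2^{16}$ is a nilpotent coefficient.

```python
def nilpotent_inverse(coeffs, n):
    """Inverse of a_0 + l(x) over Z/(2**n), assuming a_0 odd and a_i**2 == 0 for i >= 1."""
    modulus = 1 << n
    a0 = coeffs[0]
    if a0 % 2 == 0 or any((a * a) % modulus for a in coeffs[1:]):
        raise ValueError("need odd a_0 and nilpotent higher coefficients")
    inv = pow(a0, -1, modulus)
    scale = (-inv * inv) % modulus
    # (a_0 + l) * (inv + scale*l) = 1 + (a_0*scale + inv)*l + scale*l^2 = 1
    return [inv] + [(scale * a) % modulus for a in coeffs[1:]]

def poly_mul(p, q, n):
    mask = (1 << n) - 1
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) & mask
    return out

f = [7, 3 << 16, 5 << 16, 9 << 16]  # hypothetical cubic with nilpotent coefficients
g = nilpotent_inverse(f, 32)
product = poly_mul(f, g, 32)
```

The inverse has the same number of coefficients as $f$, and its higher coefficients are again nilpotent, in line with results 2 and 3 of the lemma.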
A Smalltalk implementation of the algorithm in the previous subsection provides us with the following examples of inverses of nilpotent coefficient polynomials over $\mathbb{Z}/(2^{32})$:
1. A quadratic polynomial f(x) = «3248235+ l7268340424704**+2342188231416* x1 with inverse f(x)^{-1} = 1416251459 + 2SS7631744*x240386048*x2, which is also a quadratic polynomial;
2. A cubic polynomial f(x) = l2352357S9 + 68S737G59lG16*x4G7S34£031G80* x2 + f &!7?% 339 1 * x3 with inverse f(x)^{-1} = C4644¾¾» - *9:1362688♦ x 4· 2192018128 * x* 1208221696 * x3, which is also a cubic polynomial;
3. A quartic polynomial f(x) = 9865323! + 1 20291 263114 » 23:&2 l 9299*40* 2 + 1393677070761984 * x!l + 1393816906304 * x'1 with inverse f(x)^{-1} = 2846913231 + 37<ϊ045¾½ί0·χ + 3063152640* x2 + 1 iS61.72!6*x;l +2^540160* χ, which is also a quartic polynomial.
Remark 1. The nilpotent coefficients condition could be relaxed to $a_i^t \equiv 0 \bmod (2^n)$, $t \ge 2$. More detailed investigation is needed.
C.5 Decomposition, factorization of permutation and multiplicatively invertible polynomials
In this section we study polynomials $f(x)$ with small representations. This is useful because a general polynomial over $\mathbb{Z}/(2^n)$ has degree $(n + 1)$, and if a high degree permutation is used as a transformation, the transformed code would become inefficient. Moreover, code obfuscation can also benefit from small representations, based on the rationale that small language components make the management of code diversity and uniformity easier. Note that in the context of data transformation, small representations are required for both $f(x)$ and its inverse $f^{-1}(x)$ ($f(x)^{-1}$ in the case of $MIP(2^n)$), which turns out to be a challenging issue.
For a given permutation polynomial $f(x)$, the number of its non-zero terms in conventional polynomial representation is defined as its weight $wt(f(x))$ (in falling factorial representation, we can have a similar definition of weight, but it will be treated differently since there is no repeated squaring algorithm that works here). Obviously $wt(f(x)) \le \deg(f(x)) + 1$. To have both $f(x)$ and $f^{-1}(x)$ in small representations, putting restrictions on degree is an obvious option; a known result provides a class of permutation polynomials $f(x)$ such that $\deg(f(x)) = \deg(f^{-1}(x))$. On the other hand, finding $f(x)$ with small $wt(f(x))$ and $wt(f^{-1}(x))$ is an option to find useful small representations, because of the existence of efficient exponentiation computations such as the repeated squaring method.
Polynomial function decomposition provides us another means to find f(x) with small representations. If f(x) = g(h(x)), then deg(g(x)) and deg(h(x)) are integer factors of deg(f(x)). We have a similar case for multivariate polynomials, which can be the morpher code of an arithmetic operation.
Polynomial factorization is our third approach to have small representations. Note that about 1/m of the polynomials of degree m in GF(2)[x] are irreducible.
For example, only 99 polynomials of degree 10 over GF(2) are irreducible. Fortunately, permutation polynomials, 1/8 of GF(2)[x], are far from irreducible. Existing mathematical rules make any factorization of f(x) over GF(2) extend to factorizations (not unique) over Z/(2^n), for any n ≥ 2, and the coefficient conditions over GF(2) for a polynomial to be a permutation (or multiplicatively invertible) do not restrict it to being irreducible. In this context, for f(x) and f^{-1}(x), ideal small representations have a small number of factors, where each factor is a power of a polynomial with a small representation, such as a low degree one or a low weight one (a kind of recursive definition).
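The count of 99 irreducible polynomials of degree 10 over GF(2) can be checked with the standard Möbius-function counting formula. The following is a minimal sketch; the function names are illustrative and do not come from the source.

```python
def mobius(n: int) -> int:
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # squared prime factor
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


def count_irreducible_gf2(m: int) -> int:
    """Number of irreducible polynomials of degree m over GF(2):
    (1/m) * sum over divisors d of m of mu(d) * 2**(m/d)."""
    total = sum(mobius(d) * 2 ** (m // d)
                for d in range(1, m + 1) if m % d == 0)
    return total // m


if __name__ == "__main__":
    m = 10
    count = count_irreducible_gf2(m)
    # Fraction of all degree-m polynomials (2**m of them) that are irreducible.
    print(count, count / 2 ** m)
```

For m = 10 the fraction is close to 1/10, in line with the "about 1/m" estimate.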
Representing polynomials using mixed addition terms, composition components and multiplication factors provides another excellent example that the same set of techniques serves both efficient computation and software obfuscation purposes (an existing example is the addition chain for RSA key obfuscation). Since our ultimate goal is the morpher code, which is a composition of three permutation polynomials with arithmetic operations and a bivariate polynomial as the last result, we have good chances to construct or find a large number of permutation polynomials of small representations to have optimized result code (the general algorithm is to be determined; we have some basic ideas).
In the following subsections we describe algorithms for these three approaches as well as their mixture.
5.1 Low degree or low weight f(x)
We have obtained a sufficient condition on the coefficients of a permutation polynomial f(x) such that both f(x) and f^{-1}(x) have the same degree, which can be small.
Here is the result about the degree of h(f(x) + g(y)), where f(x), g(y), h(z) ∈
Figure imgf000132_0001
a positive integer and let P_m(Z/(2^n)) be a set of polynomials over
P_m(Z/(2^n))
Figure imgf000132_0002
The degree is m and there are 2m - 1 coefficients.
We investigated the case of less restricted conditions, and possible necessary and sufficient conditions for deg(f(x)) = deg(f^{-1}(x)), but it turned out that the theoretical condition based on a system of coefficient equations is complicated, although it does shine some light on the computation of such polynomials (details omitted here). At this point we use computation to do the search and would resume theoretical studies if the computation results could provide further information.
The basic computation algorithm is to apply the algorithm in section C.3.2 for the computation of the inverse and tune the coefficients to find a small representation.
Over Galois fields, low weight polynomials, such as low weight irreducible polynomials, are studied through computation. In this Galois ring Z/(2^n) case we shall use the algorithm in section C.3.2 again, to compute and find low weight ones. Again, the coefficient tuning process happens at runtime.
5.2 Decomposition
Polynomial time decomposition methods over fields (not necessarily finite) are known, but over Galois rings no convincing general algorithm has been found or discovered yet, to my knowledge so far. On the other hand, methods and ideas over fields provide valuable information for work on rings.
A special class of polynomials called Chebyshev polynomials (of the first kind) T_n(x) deserves our attention. Recall that T_n(x) can be defined by the following recurrence relation:
Figure imgf000133_0001
= 2x T_n(x) - T_{n-1}(x). A property of Chebyshev polynomials concerns composition: T_{nm}(x) = T_n(T_m(x)). An interesting observation is that all odd indexed polynomials T_{2k+1}(x), k = 1, 2, ···, are permutation polynomials over Z/(2^n). Therefore big odd indexed Chebyshev polynomials can be decomposed into low degree Chebyshev permutation polynomials.
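The two Chebyshev properties used here (composition, and odd indexed polynomials being permutations modulo a power of two) can be checked by brute force on a small ring. This is a minimal sketch, assuming the standard initial values T_0(x) = 1 and T_1(x) = x; the helper names and the 8-bit word size are illustrative choices, not taken from the source.

```python
def chebyshev_t(n: int) -> list[int]:
    """Integer coefficients of T_n, lowest degree first,
    built from T_{k+1}(x) = 2x*T_k(x) - T_{k-1}(x)."""
    prev, cur = [1], [0, 1]          # T_0 = 1, T_1 = x
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in cur]   # 2x * T_k
        for i, c in enumerate(prev):       # subtract T_{k-1}
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur


def evaluate(coeffs: list[int], x: int, modulus: int) -> int:
    """Horner evaluation of the polynomial modulo `modulus`."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % modulus
    return acc


def is_permutation(coeffs: list[int], bits: int) -> bool:
    """Brute-force test: does the polynomial permute Z/(2^bits)?"""
    modulus = 1 << bits
    return len({evaluate(coeffs, x, modulus) for x in range(modulus)}) == modulus


if __name__ == "__main__":
    for k in (3, 5, 7, 15):
        print(k, is_permutation(chebyshev_t(k), 8))
    # T_15 evaluated directly versus as the composition T_3(T_5(x)).
    m = 1 << 8
    t3, t5, t15 = chebyshev_t(3), chebyshev_t(5), chebyshev_t(15)
    print(all(evaluate(t15, x, m) == evaluate(t3, evaluate(t5, x, m), m)
              for x in range(m)))
```

The composition check illustrates how a degree 15 permutation could be stored as two low degree components.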
Note that if f(x) = g(h(x)), and g(x) and h(x) ∈ Z/(2^n)[x], these components g(x) and h(x) are still permutations. Decomposition of multiplicatively invertible polynomials will be interesting, because components are not
Figure imgf000133_0002
multiplicatively invertible.
5.3 Factorization of f(x) and f(x, y)
Factorization of a given polynomial f(x) ∈ Z/(2^n)[x] starts at Z/(2)[x] = GF(2)[x]. Then various forms of Hensel lifting can be chosen to factor f(x) over Z/(2^n). Algorithms in this area are well studied (except for factoring multivariate polynomials) and we will use existing algorithms.
Most permutation polynomials are not basic primitive polynomials and have non-trivial factors. For example, the permutation polynomial
f(x) = x + 8 * x^2 + 16 * x^3 = x * (4 * x + 1)^2.
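The example factorization can be checked numerically. The sketch below assumes an 8-bit ring purely to keep the exhaustive loop small; the function names are illustrative.

```python
BITS = 8                 # assumed word size, small enough for exhaustive checks
MOD = 1 << BITS


def f_expanded(x: int) -> int:
    """x + 8x^2 + 16x^3 over Z/(2^BITS)."""
    return (x + 8 * x**2 + 16 * x**3) % MOD


def f_factored(x: int) -> int:
    """x * (4x + 1)^2 over Z/(2^BITS)."""
    return (x * (4 * x + 1) ** 2) % MOD


if __name__ == "__main__":
    # Both forms agree on every input.
    print(all(f_expanded(x) == f_factored(x) for x in range(MOD)))
    # The polynomial is a permutation of Z/(2^BITS) ...
    print(len({f_expanded(x) for x in range(MOD)}) == MOD)
    # ... while its factor 4x + 1 is not one.
    print(len({(4 * x + 1) % MOD for x in range(MOD)}) == MOD)
```

The last line prints False: the factor 4x + 1 maps x and x + 2^(BITS-2) to the same value, so a factor of a permutation polynomial need not itself be a permutation.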
For any f(x) ∈ Z/(2^n)[x], the square-free factorization algorithm and Berlekamp's Q-matrix algorithm are used to factor f(x) ∈ Z/(2)[x]. Note that we may just have a partial factorization, finding two coprime factors, to go to the next step to factor f(x) ∈ Z/(2^i)[x], i > 1.
The following form of Hensel's Lemma is the one having the essence of the technique.
Figure imgf000134_0001
Also note that h* and g* can be constructed directly from the Bezout identity of gcd(g, h) ≡ 1 (mod p).
Iterating this process, we can obtain the desired results.
Note that factors of a permutation polynomial are not necessarily permutations. This provides another flavor in terms of diversity among different species. However, factors of a multiplicatively invertible polynomial are still multiplicatively invertible.
5.4 Mixing addition terms, multiplication factors, and composition components
We know that all permutation polynomials over Z/(2^n) form a group PP(2^n)[x] based on the function composition operation. The unit group U(Z/(2^n)[x], ·) of the polynomial ring Z/(2^n)[x], based on ring multiplication, is the set of all multiplicatively invertible polynomials MIP(2^n)[x] (see section C.4).
Here is a simple but interesting observation:
Proposition 5. Let f(x)
zero constant term. Then
Figure imgf000134_0002
invertible.
Note that the coefficients of x^(1), x^(2) and x^(3) of f(x) must be odd, even and even, respectively. In that format, these conditions enable the constant term of h(x) to be odd and the coefficient of x^(1) to be even. The correctness of the observation follows from Proposition 3.
Another observation concerns the intersection PP(2^n)[x] ∩ MIP(2^n)[x], which is empty (containing only odd constant functions if we allow PP(2^n)[x] to have constant functions). This implies that the two function sets are orthogonal in some sense.
Back to the set of Chebyshev polynomials (of the first kind) T_n(x). Previously we mentioned that odd indexed ones are permutations. It is easy to see that even indexed ones (also even indexed Chebyshev polynomials of the second kind) are multiplicatively invertible polynomials. Therefore big even indexed ones can be decomposed into small ones based on T_{nm}(x) = T_n(T_m(x)), and alternatively they can be factored into small factors for reducible ones.
More studies can be done in this area, including algorithms to select suitable transformations for the purpose of generating highly obfuscated code.
6 Generalized polynomials f(x) =
Figure imgf000134_0003
For a generalized polynomial f(x) to be a permutation, Klimov gave an interesting if and only if condition based on its coefficient set {a_0, ···, a_d}. That is the same condition for the polynomial function f(x)
Figure imgf000135_0001
obtained by replacing all ⊕ operations with +, referred to as the reduced polynomial of f(x).
Because the associative law is invalid in f(x), it actually represents a set of functions, generated from all possible orders of operations as well as combinations of the operators + and ⊕.
An interesting new result was published at the end of 2011 on conditions for the single cycle property, assuming the operation order is from left to right, that is, f(x) has functions of the form (···((a_0 ⊕ a_1 x) ⊕ a_2 x^2) ⊕ ··· ⊕ a_d x^d).
Proposition 6. With the order restriction and an assumption that there are no consecutive + operators, f(x) = a_0 ⊕ a_1 x ⊕ ··· ⊕ a_d x^d is a single cycle permutation if and only if it is a single cycle permutation over the ring Z/(2^{4+2l}), where l is the number of odd numbers in {i_1, i_2, ···, i_m} ⊂ {1, 2, ···, d}, which is the set of degree indices of terms a_i x^i with a + operator before them.
This is an interesting result although Z/(2^{4+2l}) could be a big ring.
7 Matrix transformations
Matrix functions over Z/(2^n) with predetermined determinant functions are constructed in this section for the transformations of arithmetic operations and vectors. The following is a set of matrices we try to work on:
Ω = { M_{s×s}(a_{i,j}(x, y, z, ···)) | s ∈ N, a_{i,j}(x, y, z, ···) ∈ B^n[x, y, z, ···],
Figure imgf000135_0002
where B^n[x, y, z, ···] are multivariate functions over BA[n].
Recall that MIP(2^n) is the set of all multiplicatively invertible polynomials.
A few lines of explanation. This is a set of matrices whose determinants are multiplicatively invertible polynomials. With a predetermined determinant, a matrix in Ω can be constructed based on elementary row and column operations over the ring Z/(2^n). Note that other operations in BA[n] are also involved, but we "favor" multiplication and addition. The following standard algorithm, which is remarkably similar to the one for matrices over fields, offers more details.
Proposition 7. Let m, n ∈ N. Let Λ_n be a set of functions from BA[n][x, y, z, ···], referred to as the context function set. Let Λ̄_n be the set of functions generated from Λ_n. The following process generates an invertible matrix M = (m_{i,j}(x, y, z, ···))_{m×m} over BA[n] (more precisely, entries m_{i,j}(x, y, z, ···) ∈ (Λ̄_n ∪ MIP(2^n))[x, y, z, ···]) whose determinant is a polynomial f(x) ∈ MIP(2^n):
1. Inputs: BA algebra dimension n, Λ_n, MIP(2^n) and matrix dimension m;
2. Randomly choose m polynomials f_1(x), f_2(x), ···, f_m(x) from MIP(2^n) and set m_{i,i}(x, y, z, ···) = f_i(x);
3. Repeat a finite number of steps of the following process:
(a) Randomly pick (i, j) ∈ {1, 2, ···, m} × {1, 2, ···, m};
(b) Randomly pick r(x, y, z, ···) ∈ Λ̄_n;
(c) Randomly perform a row operation R_i * r(x, y, z, ···) + R_j or a column operation C_i * r(x, y, z, ···) + C_j;
4. Output: an m × m matrix with determinant ∏ f_i(x).
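The construction can be sketched numerically. The sketch below is a simplification, not the source's algorithm as such: it works with integer values instead of function entries (think of the context functions and the MIP polynomials as already evaluated at some point, so diagonal values are odd), and it assumes i ≠ j in each step so that the determinant is preserved. All names are hypothetical.

```python
import random


def det_mod(matrix, modulus):
    """Determinant by cofactor expansion along the first row (small m only)."""
    m = len(matrix)
    if m == 1:
        return matrix[0][0] % modulus
    total = 0
    for j in range(m):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        sign = -1 if j % 2 else 1
        total += sign * matrix[0][j] * det_mod(minor, modulus)
    return total % modulus


def build_matrix(diagonal, multipliers, steps, modulus, rng):
    """Start from a diagonal matrix and apply random elementary operations.

    diagonal    -- stand-ins for values of polynomials from MIP(2^n) (odd)
    multipliers -- stand-ins for values of context functions
    """
    m = len(diagonal)
    a = [[diagonal[i] if i == j else 0 for j in range(m)] for i in range(m)]
    for _ in range(steps):
        i, j = rng.sample(range(m), 2)          # two distinct indices (assumption)
        r = rng.choice(multipliers)
        if rng.random() < 0.5:
            # Row operation: R_j <- R_i * r + R_j
            a[j] = [(a[i][k] * r + a[j][k]) % modulus for k in range(m)]
        else:
            # Column operation: C_j <- C_i * r + C_j
            for k in range(m):
                a[k][j] = (a[k][i] * r + a[k][j]) % modulus
    return a


if __name__ == "__main__":
    rng = random.Random(7)
    modulus = 1 << 16
    mat = build_matrix([3, 5, 7], [2, 9, 14], 20, modulus, rng)
    # The determinant stays equal to the product of the diagonal seeds.
    print(det_mod(mat, modulus) == (3 * 5 * 7) % modulus)
```

Because the determinant remains odd, the resulting matrix is invertible over Z/(2^16) no matter how dense its entries become.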
In the algorithm, the context function set Λ_n is an interesting concept. This set of functions from BA[n][x, y, z, ···] defines the required 'similarity' for the matrix transformations to seamlessly blend into the application code environment. Λ can be predefined based on the existing application code format and the projected code format. A typical example of Λ is a set of expressions in the code context.
This concept can also help us introduce dependencies between code variables in both the application and the transformations. See the example in Section D.
Remark 2. An alternative algorithm to construct the matrix is to use upper (or lower) triangular matrices, with polynomials in MIP(2^n) forming the diagonal entries and elements from Λ̄_n forming the upper (or lower) entries. The determinant of the product of these matrices is still in MIP(2^n).
Remark 3. About the uniformity of inverses of matrices in Ω. There are two types of applications: with or without the matrix inverse code in the application code. To transform vectors (data arrays) it may not be necessary, because the inverse computation can happen on the server side, which does not need code obfuscation. But to transform operands in operations, using the matrix inverse becomes necessary to keep the original functionality.
Matrices in Ω serve the latter case well because entries of the inverse matrix are composed of elements from Λ̄_n and polynomials (the inverses of determinants).
Remark 4. The algorithm can be fine tuned based on precise criteria of the defined code uniformity level and criteria of performance of the transformed code.
8 Block-invertible function matrices
In this section we construct a special set of square block matrices with block invertible properties, to be applied to the coding context in which both internal and external transformations keep multiplicative invertibility.
Note that this is also an extension of the constant case construction that was applied in White-Box AES key hiding.
8.1 The existence of block-invertible function matrices
In this section, we refer to an even polynomial
coefficient of x and constant term are both
Figure imgf000136_0001
nilpotent radical of the polynomial ring (Z/(2^n))[x]). Let the subset Ψ ⊂ (Z/(2^n))[x] be the union of multiplicatively invertible polynomials and even polynomials. Then
Figure imgf000137_0001
is a subring (Ψ, +, *). Let Ξ = <2(Z/(2^n))x, (Z/(2^n))x^i, i = 2, 3, ···, n + 1>, the ideal generated in the ring Ψ.
It is easy to verify that Ψ/Ξ is isomorphic to Z/(2^n): the set of multiplicatively invertible polynomials becomes the odd numbers and even polynomials are turned into even numbers. We will see that this isomorphism transforms the construction method over the field Z/(2) in [10] to the ring Ψ.
Note that Ψ contains the subring generated by the unit group
Figure imgf000137_0002
and the nilpotent radical ideal
Figure imgf000137_0003
The matrix ring M(Ψ)_{s×s} is what we work on in this section. First we have the following result:
Lemma 2. For a given non-zero square matrix A ∈ M(Ψ)_{s×s}, there exist two invertible matrices P, Q ∈ M(Ψ)_{s×s} such that A = P * D * Q, where D is a diagonal matrix with r ones and s - r zeros, where r ∈ N.
Lemma 3. For any s, r ∈ N with s ≥ r, there exist two invertible matrices T, A ∈ M(Ψ)_{s×s} such that T = D + A, where D is a diagonal matrix with r ones and s - r zeros, where r ∈ N.
The correctness of these two lemmas follows from the isomorphism above. For this subset of polynomials in Ψ, the basic idea of the construction algorithm works fine here in the function matrices case (even over the BA algebra,
Figure imgf000137_0004
over the modular ring Z/(2^n)).
10 Transformations of program components in general
We describe an approach to transform program components (see the next section for arithmetic operations and data arrays, with permutation polynomials, invertible polynomials and matrices as primary transformations). Note that the approach described in this section is essentially the complexity extension of the data transformation concept of the IRDETO/Cloakware data transformation technology.
10.1 Transformation process and configurations
Definition of a configuration. In this note, a configuration is defined as a state of a set of variables (no operations) involved in the transformation process. In most cases, the variable set includes
1. Variables in program components;
2. Transformed variables of the program components;
3. Transformations represented by their types;
4. Coefficients and/or their values of transformations;
5. Coefficients and/or their values of inverses of transformations;
6. Variables representing the configuration itself;
7. Other variables.
The transformation process comprises a series of:
1. Input configuration
on section
The management of configurations, which is the major part of the transformation process, can be represented by a Finite State Machine. Note that the transformation (encryption) of a program is much more complicated than that of binary strings for cryptography purposes.
Example: existing data transformations
10.2 Complexity analysis of transformed program components and program
1. Search space of all possible compositions;
2. Complexity of propagation of dependencies of transformed program components;
3. System of equations for each instance.
11 Transformations of program components in
Mark I / Woodenman constructions
Based on the Woodenman construction requirement, we have the following program component transformations.
11.1 Transformation of addition
Let z = x + y be the addition to be transformed. The basic process is
1. Decode the input configurations to find out the input expressions of both x and y;
2. Transform the two operands by permutation polynomials in PP(2^n) and generate new expressions;
3. Create a vector with the encoded operands and other terms as its entries;
4. Transform the vector by a matrix in Ω;
5. Compose the decoding of the matrix and the two operands with the encoding of the addition operation by a polynomial in PP(2^n), and/or a matrix in Ω;
6. Apply a permutation polynomial to transform z and save the results in a vector;
7. Save information about the encoding in the final vector for the consumer of the addition.
The interfaces between those steps are arrays of variables with specified different configurations.
11.2 Transformation of multiplication
Similar to the steps above, just replacing addition by multiplication.
11.3 Transformation of vectors/arrays
Similar to the steps of addition without the addition coding.
11.4 Transformation of addition, multiplication and vector
Unify the three transformations by selections of matrix transformations.
O Attacks
For permutation polynomials - simplified representation
Proposition 8. Possible attacks: if all permutation polynomials can be represented by low degree permutation polynomials then there are (simplification) attacks. HINT: count the number of low degree permutations over a finite field to show there is no such attack.
Proposition 9. Possible attack: using the value set {f(i) | i = 0, 1, ···, n + 1} to represent polynomials and composition to compute the probability
14 Remarks
Java code implementation of the algorithms tested.
15 The concept of dynamic permutation transformations
In the existing data transformation techniques in Irdeto/Cloakware patents, permutations are used to transform operands of an operation. For example, operands u, v, and w of an addition operation w = u + v can be transformed by a linear function y = a * x + b, where a and b are constants (a must be odd to be a permutation). In the computation of the transformed code, variables are related to u and v only, because transformed operations have fixed constants as coefficients of the newly transformed operations, such as those for the add operation. Therefore we will refer to these transformations as static permutation transformations. Note that data transformations used in standard public key cryptosystems such as RSA can also be regarded as static permutation transformations, because the private and public key pair becomes fixed constants at the time an entity uses them to encrypt or decrypt data.
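A static linear transformation of an addition can be sketched as follows. The constants, the 32-bit word size and all names are hypothetical choices made for illustration; the point is only that the transformed add works directly on encoded operands and that its coefficients are fixed at generation time.

```python
BITS = 32
MOD = 1 << BITS
MASK = MOD - 1


def encode(value: int, a: int, b: int) -> int:
    """Static linear encoding y = a*x + b over Z/(2^BITS); a must be odd."""
    return (a * value + b) & MASK


def decode(value: int, a: int, b: int) -> int:
    """Inverse of encode, using the modular inverse of the odd coefficient a."""
    return ((value - b) * pow(a, -1, MOD)) & MASK


# Hypothetical fixed encodings for the operands u, v and the result w.
AU, BU = 39, 42
AV, BV = 67, 981
AW, BW = 1335, 6539

# Constants of the transformed add, precomputed once ("static").
CU = (AW * pow(AU, -1, MOD)) & MASK
CV = (AW * pow(AV, -1, MOD)) & MASK
C0 = (BW - CU * BU - CV * BV) & MASK


def transformed_add(u_enc: int, v_enc: int) -> int:
    """Computes the encoding of u + v from the encodings of u and v,
    without ever exposing u, v, or u + v in the clear."""
    return (CU * u_enc + CV * v_enc + C0) & MASK


if __name__ == "__main__":
    u, v = 123456, 987654
    w_enc = transformed_add(encode(u, AU, BU), encode(v, AV, BV))
    print(decode(w_enc, AW, BW) == (u + v) & MASK)
```

In a dynamic variant, the six coefficients would be run-time variables rather than precomputed constants.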
In contrast, dynamic permutation transformations are permutations that have new variables which can be introduced into the computation context of transformations. For example, a linear function y = a * x + b can be used to transform an operand x. But in this dynamic case, the coefficients a and b are variables (with a one-bit restriction on variable a). In this way, the transformed add operation will have three sets of new variables (6 in total in this case).
Static permutation transformations are designed to transform the operands of each individual operation. These micro transformations always have small code size. Although dynamic permutation transformations can be used as micro transformations, their main goal is to introduce connections between variables for code integrity. Therefore the code size can be and should be bigger than for most micro transformations. Note that these large size transformations are still within the boundary of polynomial time computational complexity in terms of the original size.
Dynamic and static permutation transformations can work together to achieve a level of code obfuscation and code protection with reasonable code size expansion.
16 Dynamic permutation transformations
A permutation polynomial f(x) = y_0 + y_1 * x + ··· + y_m * x^m over Z/(2^n) can be used as a dynamic permutation transformation, where y_0, y_1, ···, y_m are variables with the conditions that y_1 is odd, and y_2 + y_4 + ··· and y_3 + y_5 + ··· are even numbers. As in the static data transformation case, permutation inverses have to be computed.
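The coefficient conditions for a polynomial to permute Z/(2^n) are those of Rivest's criterion (reference 8): the linear coefficient is odd, and the sums of the even-indexed and of the odd-indexed higher coefficients are both even. A minimal sketch that compares the criterion with a brute-force check on a small ring follows; the function names are illustrative.

```python
def satisfies_permutation_conditions(coeffs: list[int]) -> bool:
    """coeffs[i] multiplies x**i; the constant term coeffs[0] is unrestricted."""
    if len(coeffs) < 2:
        return False
    linear_is_odd = coeffs[1] % 2 == 1
    even_index_sum_is_even = sum(coeffs[2::2]) % 2 == 0   # y_2 + y_4 + ...
    odd_index_sum_is_even = sum(coeffs[3::2]) % 2 == 0    # y_3 + y_5 + ...
    return linear_is_odd and even_index_sum_is_even and odd_index_sum_is_even


def is_permutation(coeffs: list[int], bits: int) -> bool:
    """Brute-force check that the polynomial permutes Z/(2^bits)."""
    modulus = 1 << bits
    images = set()
    for x in range(modulus):
        acc = 0
        for c in reversed(coeffs):      # Horner evaluation
            acc = (acc * x + c) % modulus
        images.add(acc)
    return len(images) == modulus


if __name__ == "__main__":
    for coeffs in ([42, 39], [0, 1, 8, 16], [0, 1, 1, 0], [5, 2, 0, 0]):
        print(coeffs,
              satisfies_permutation_conditions(coeffs),
              is_permutation(coeffs, 8))
```

In the dynamic setting the same parity conditions would be imposed on run-time variables rather than on constants.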
Besides the general permutation polynomials over Z/(2^n), special dynamic permutation polynomials, such as those with nilpotent coefficients, can reduce the size of the transformed code. Formulas for computing their inverses are known, in which all coefficients are variables in this dynamic case. The special properties of coefficient variables, such as the nilpotent properties, can also be used for interlocking.
Note that nothing prevents the coefficient variables in dynamic permutation transformations from having some constant coefficients, as long as the permutation conditions are satisfied. They will facilitate the composition with existing static permutation transformations.
17 Properties of dynamic permutation transformations and interlocking
In a dynamic permutation transformation, there are two kinds of coefficient variables: conditional ones, such as a in y = a * x + b, and unconditional ones, such as b in the example above. For code obfuscation purposes, unconditional coefficient variables can be any variables from the original computation context, or variables in any transformation formula, etc. The conditional ones are more interesting: the conditions can be composed with interlocking properties for the purpose of code protection. The code can be protected this way because the conditions on the coefficient variables are exactly the conditions for the transformation to be a permutation. Therefore breaking the conditions implies a non-permutation transformation for an operand of an operation, resulting in a faulty computation, which is the way we want it to happen when tampering occurs.
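The interlocking effect of a conditional coefficient can be sketched with the linear case y = a * x + b. The names, the 32-bit word size and the way the odd coefficient is derived from another variable are assumptions made for illustration only.

```python
BITS = 32
MOD = 1 << BITS


def make_odd(s: int) -> int:
    """Derive a conditional coefficient from an arbitrary program variable s;
    2*s + 1 is always odd, so the permutation condition holds by construction."""
    return (2 * s + 1) % MOD


def dyn_encode(x: int, a: int, b: int) -> int:
    """Dynamic linear transformation: a and b are run-time variables."""
    return (a * x + b) % MOD


def dyn_decode(y: int, a: int, b: int) -> int:
    """Inverse transformation; only defined when a is odd."""
    return ((y - b) * pow(a, -1, MOD)) % MOD


if __name__ == "__main__":
    a, b = make_odd(12345), 777
    print(dyn_decode(dyn_encode(99, a, b), a, b) == 99)   # round trip works

    # Tampering that makes the coefficient even destroys the permutation:
    # two different inputs now collide, so later computation is faulty.
    tampered = a - 1
    print(dyn_encode(1, tampered, b) == dyn_encode(1 + MOD // 2, tampered, b))
```

With an even coefficient, `pow(a, -1, MOD)` also raises `ValueError`, which is another way the broken condition surfaces.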
Because dynamic permutation conditions are represented by a property of a set of variables, it becomes hard to distinguish these properties from original code properties. It is also hard to figure out the coefficient variable set from all the variables in the transformed code.
Besides properties of coefficient variables, conditions for the correctness of formulas can also be composed with integrity verification properties: the identity f(f^{-1}(x)) = x will break if the condition is broken.
Permutation polynomials are also fully determined by their roots. In addition to the normal coefficient representation, the root representation format can also be used. Special root structures and value properties can also reduce the code size for more efficient computations. Note that in this dynamic context, roots are variables, not fixed values. Conditions for the correctness of the root computation process can also be composed with verification properties.
Other dynamic properties in the process of computing the inverse can also be used to compose with integrity properties. For instance, the algorithm based on Newton iteration to compute the modular inverse over the ring Z/(2^n) works correctly only for odd variables, a nice property.
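The Newton iteration for the inverse modulo a power of two can be sketched as follows. This is a common formulation and is not necessarily the exact algorithm the source has in mind; the function name is illustrative.

```python
def newton_inverse(a: int, bits: int) -> int:
    """Inverse of a modulo 2^bits by Newton iteration.

    For odd a, a*a is congruent to 1 modulo 8, so `a` itself is an inverse
    correct to 3 bits; every step inv <- inv * (2 - a*inv) doubles the
    number of correct low-order bits.
    """
    modulus = 1 << bits
    inv = a % modulus
    correct_bits = 3
    while correct_bits < bits:
        inv = (inv * (2 - a * inv)) % modulus
        correct_bits *= 2
    return inv


if __name__ == "__main__":
    modulus = 1 << 32
    for a in (39, 67, 1335, 716):
        inv = newton_inverse(a, 32)
        # The product is 1 only when a is odd; for even a the iteration
        # silently produces a wrong value.
        print(a, (a * inv) % modulus == 1)
```

For an even input no error is raised, but the returned value is not an inverse, so any computation that depends on it goes wrong, which is the property the text suggests composing with integrity checks.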
17.1 Identities and dynamic equations
An equation that involves multiple variables inherently has a dynamic property: the identity itself. Mixed Boolean arithmetic (MBA) identities are examples of these equations: breaking of MBA identities implies the occurrence of tampering.
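One well-known mixed Boolean arithmetic identity is x + y = (x ⊕ y) + 2 * (x & y). The sketch below uses it as an example of an identity whose two sides could be computed along different code paths and compared; the specific identity and all names are illustrative choices, not taken from the source.

```python
BITS = 32
MASK = (1 << BITS) - 1


def add_plain(x: int, y: int) -> int:
    """Ordinary addition over Z/(2^BITS)."""
    return (x + y) & MASK


def add_mba(x: int, y: int) -> int:
    """The same sum written as a mixed Boolean arithmetic expression:
    XOR gives the carry-less sum, AND (shifted by one) gives the carries."""
    return ((x ^ y) + 2 * (x & y)) & MASK


def identity_holds(x: int, y: int) -> bool:
    """True as long as neither side of the identity has been tampered with."""
    return add_plain(x, y) == add_mba(x, y)


if __name__ == "__main__":
    print(all(identity_holds(x, y)
              for x in (0, 1, 5, MASK)
              for y in (0, 3, 123456789, MASK)))
```

If an attacker alters a constant or an operator on one side, the two sides disagree for most inputs, which signals tampering.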
Multiplicatively invertible dynamic polynomials f(x) over the ring Z/(2^n) also provide a set of equations f(x) * f(x)^{-1} = 1. Similar to dynamic permutation polynomials, conditions on the coefficient variables also provide a property to compose with integrity verification properties. Polynomials with special coefficient variables provide implementations of flexible code size.
Dynamic permutation transformations can be composed with MBA identities. This can be done either by transforming variables in the equations or by transformation of the equations themselves.
17.2 Permutation T-functions
In general, any permutation T-functions from the Boolean arithmetic algebraic system can also be used as dynamic permutation transformations. The computation of their inverses can be achieved through bit by bit computations. One example is the generalized permutation polynomials. The computations of their inverses are still efficient computer programs.
NOTE: Not all program components are necessary to obfuscate: not transformation of data variables but integrity code, just composed with variables and operations.
18 Block data protection
Block data variables, such as members in a data structure or fields in a class, can be transformed using (1) dynamic permutation polynomials, whose coefficients can be individual data variables, or (2) composition with individual static permutation transformations of each data variable.
References
1. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, Oxford Press.
2. Zhaopeng Dai and Zhuojun Liu, The Single Cycle T-functions. Online: http://eprint.iacr.org/2011/547.pdf
3. Alexander Klimov, Applications of T-functions in Cryptography, PhD Thesis, Weizmann Institute of Science, 2004.
4. Dexter Kozen, Susan Landau, Polynomial decomposition algorithms, J. Symb. Comp. 7(5) (1989), 445-456.
5. A. Menezes, P. Oorschot, S. Vanstone, Handbook of Applied Cryptography, CRC Press, 1996.
6. Madhu Sudan, Algebra and computation, MIT lecture notes. Online: .mit.edu/~madhu/FT98/course.html
7. , Polynomial functions (mod m), Acta Math. Hungar., 44(3-
Figure imgf000142_0001
8. Ronald L. Rivest, Permutation Polynomials Modulo 2^w, Finite Fields and their Applications, vol. 7, 2001, pp. 287-292.
9. Henry S. Warren, Jr., Hacker's Delight,
Figure imgf000143_0001
Boston, 2002. Online: www.hackersdelight.org.
10. James Xiao, Y. Zhou, Generating large non-singular matrices over an arbitrary field with blocks of full rank, 2002. Online: http://eprint.iacr.org/2002/096.pdf.
11. Kejian Xu, Zhaopeng Dai and Zongduo Dai, The formulas of coefficients of sum and product of p-adic integers with applications to Witt vectors, Acta Arithmetica 150 (2011), 361-384. Online: http://arxiv.org/abs/1007.0878, http://journals.impan.pl/cgi-bin/doi?aa150-4-3.
12. Y. Zhou, S. Chow, System and method of hiding cryptographic private keys, U.S. Patent No. 7,634,091, 2009.
13. Y. Zhou, A. Main, Y. Gu and H. Johnson, Information Hiding in Software with Mixed Boolean-Arithmetic Transforms, Information Security Applications, 8th International Workshop, WISA 2007, LNCS 4867, 2008.
D. An Example of transformations of 2-dimensional vector
Here is a very simple example about the transformation of 2-dimensional vectors (X, Y) ∈ (B^n)^2. The code context, we assume, is
The first step is to pick two permutation polynomials 39 * x + 42 and 67 * x + 981 to transform X and Y:
X1 = 39 * X + 42,
and
Y1 = 67 * Y + 981.
The next step is to pick a matrix whose determinant is a multiplicatively invertible polynomial (1335 + 2860 * x + 1657 * x * (x - 1)) * (6539 + 8268 * x):
Figure imgf000143_0002
A column operation
C = [[1, 0], [716 * z + 93 * (x & y), 1]] on A produces
A = A × C = Matrix([[1335 + 2860 * x + 1657 * x * (x - 1) + (67 * (x ⊕ y) + 8 * y^2) * (716 * z + 93 * (x & y)), 67 * (x ⊕ y) + 8 * y^2], [(6539 + 8268 * x) * (716 * z + 93 * (x & y)), 6539 + 8268 * x]]).
Then a row operation
Figure imgf000144_0001
on A produces
A = R × A = Matrix([[1335 + 2860 * x + 1657 * x * (x - 1) + (67 * (x ⊕ y) + 8 * y^2) * (716 * z + 93 * (x & y)), 67 * (x ⊕ y) + 8 * y^2], [34 * (x & y) * (1335 + 2860 * x + 1657 * x * (x - 1) + (67 * (x ⊕ y) + 8 * y^2) * (716 * z + 93 * (x & y))) + (6539 + 8268 * x) * (716 * z + 93 * (x & y)), 34 * (x & y) * (67 * (x ⊕ y) + 8 * y^2) + 6539 + 8268 * x]]).
Applying this invertible matrix A to transform (X1, Y1), we have the transformed vector (X2, Y2) with
X2 = 52065 * X + 56070 + 46917 * x * X + 50526 * x + 64623 * x^2 * X
+ 69594 * x^2 + 1870908 * (x ⊕ y) * z * X + 2014824 * (x ⊕ y) * z
+ 243009 * (x ⊕ y) * (x & y) * X + 261702 * (x ⊕ y) * (x & y)
+ 223392 * y^2 * z * X + 240576 * y^2 * z + 29016 * y^2 * (x & y) * X
+ 31248 * y^2 * (x & y) + 4489 * (x ⊕ y) * Y + 65727 * (x ⊕ y)
Figure imgf000144_0002
and
Y2 = 8110908 * x + 7595328 * (x & y) * y^2 * z * X
+ 63610872 * (x & y) * (x ⊕ y) * z * X + 2234718 * (x ⊕ y) * (x & y)
+ 266832 * y^2 * (x & y) + 182595036 * z * X + 553956 * x * Y
+ 1062432 * y^2 * (x & y)^2 + 248635296 * x * z + 25487163 * (x & y) * X
+ 34012692 * (x & y) * x + 2366196 * (x & y) * x^2 + 8897868 * (x ⊕ y) * (x & y)^2
+ 8179584 * (x & y) * y^2 * z + 68504016 * (x & y) * (x ⊕ y) * z
+ 18224 * y^2 * (x & y) * Y + 152626 * (x ⊕ y) * (x & y) * Y + 230875632 * x * z * X
+ 986544 * y^2 * (x & y)^2 * X + 8262306 * (x ⊕ y) * (x & y)^2 * X
+ 2197182 * (x & y) * x^2 * X + 31583214 * (x & y) * x * X
+ 6414759 + 27447714 * (x & y) + 196640808 * z + 438113 * Y.
Then we can replace x, y and z by any expressions of any variables, including X and Y, or constants in the code context, to inject new dependencies into the transformed code. Further code optimization could become necessary depending on the expressions chosen.
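The example can be replayed numerically. The sketch below makes two assumptions that are inferred rather than stated in the surviving text: the starting matrix is taken to be upper triangular, [[p, q], [0, r]], and the row-operation multiplier is taken to be 34 * (x & y). The arithmetic is done over the integers, without reduction modulo 2^n; the function and variable names are illustrative.

```python
def transform(X: int, Y: int, x: int, y: int, z: int):
    """Apply the two permutation polynomials and the 2x2 function matrix.

    Returns (X2, Y2, det), where det is the determinant of the final matrix.
    """
    u, v = x & y, x ^ y                 # bitwise AND and XOR of context values

    # Step 1: permutation polynomials on the vector entries.
    X1 = 39 * X + 42
    Y1 = 67 * Y + 981

    # Step 2: assumed starting matrix [[p, q], [0, r]].
    p = 1335 + 2860 * x + 1657 * x * (x - 1)
    q = 67 * v + 8 * y * y
    r = 6539 + 8268 * x
    t = 716 * z + 93 * u

    # Column operation: column 1 <- column 1 + t * column 2.
    a11, a12 = p + q * t, q
    a21, a22 = r * t, r

    # Row operation: row 2 <- row 2 + 34*(x & y) * row 1.
    a21, a22 = 34 * u * a11 + a21, 34 * u * a12 + a22

    det = a11 * a22 - a12 * a21
    X2 = a11 * X1 + a12 * Y1
    Y2 = a21 * X1 + a22 * Y1
    return X2, Y2, det


if __name__ == "__main__":
    X, Y, x, y, z = 11, 22, 5, 9, 3
    X2, Y2, det = transform(X, Y, x, y, z)
    p = 1335 + 2860 * x + 1657 * x * (x - 1)
    r = 6539 + 8268 * x
    # The elementary operations leave the determinant equal to p * r.
    print(det == p * r)
    print(X2, Y2)
```

With x = y = z = 0 the result reduces to X2 = 52065 * X + 56070 and Y2 = 438113 * Y + 6414759, matching the leading terms of the expanded expressions.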
E. Polynomial representation of carry bit values
Figure imgf000145_0001
Obviously, the second term above is a polynomial representation of a carry bit value. A similar formula for multiplication can also be derived.

Claims

1. A method comprising:
selecting a word size w;
selecting a vector length N;
generating an invertible state-vector function configured to operate on an N-vector of w-element words, the invertible state-vector function comprising a combination of a plurality of invertible operations, wherein the state-vector function receives an input of at least 64 bits and provides an output of at least 64 bits, and a first portion of steps in the state-vector function perform linear or affine computations over Z/(2^w);
indexing a first portion of steps in the state-vector function using a first indexing technique;
indexing a second portion of steps in the state-vector function using a second indexing technique;
selecting at least one operation in an existing computer-executable program to modify; and
modifying the existing computer program to execute the state-vector function instead of the selected at least one operation.
2. The method of claim 1, wherein each of the first and second indexing techniques controls an operation type independently selected from the group consisting of: an if-then-else construct; a switch construct, an element-permutation selection, an iteration count, an element rotation count, and a function-indexed key index.
3. The method of any one of claims 1 or 2, wherein each step in a third portion of steps in the state-vector function comprises a non-T-function operation.
4. The method of claim 3, wherein each of the steps in the third portion of steps is an operation type selected from the group consisting of: a function-indexed keyed element-wise rotation, and a function-indexed keyed sub-vector permutation.
5. The method of any previous claim, wherein the invertible state-vector function comprises a concatenation of the plurality of invertible operations.
6. The method of any previous claim, wherein w is selected from the group consisting of: 16 bits, 32 bits, and 64 bits.
7. The method of any previous claim, wherein w is selected as a default integer size of a host computing platform.
8. The method of any previous claim, wherein the word size w is twice the internal word size of the N-vector.
9. The method of any previous claim, further comprising:
generating an inverse of the invertible state-vector function, the inverse of the invertible state-vector function comprising a concatenation of an inverse of each of the plurality of invertible operations.
10. The method of any previous claim, further comprising:
selecting a key type for the invertible state-vector function from the group consisting of: a run-time key, a generation-time key, and a function-indexed key.
11. The method of claim 10, wherein the selected key type is a run-time key, said method further comprising:
modifying the state-vector function to accept a run-time input providing a key k.
12. The method of claim 10, wherein the selected key type is a generation-time key, said method further comprising partially evaluating the state-vector function with respect to a key K.
13. The method of claim 10, wherein the selected key type is a function-indexed key, said method further comprising, for each of the plurality of invertible operations A, providing a key RA for the associated inverse of the invertible operation.
14. The method of any previous claim, wherein the state-vector function is implemented at least in part by a plurality of matrix operations.
15. The method of any previous claim, wherein at least one of the first and second indexing techniques controls a plurality of operations comprising random swaps performed according to a sorting-network topology.
16. The method of claim 15, wherein the sorting-network topology is selected from the group consisting of: a Batcher network, a Banyan network, a perfect-shuffle network, and an Omega network.
17. The method of any previous claim, further comprising:
encoding an input to the state-vector function with a first encoding mechanism;
wherein each operation in the state-vector function is adapted and configured to operate when the input to the state-vector function is encoded with a second encoding mechanism different from the first encoding mechanism.
18. The method of claim 17, wherein the first encoding mechanism encodes the input as aM + b, wherein a and b are constants.
19. The method of claim 18, wherein M is an invertible matrix.
20. The method of any one of claims 18 or 19, wherein the second encoding mechanism, when applied to the input, encodes the input as cN + d, wherein c and d are constants different than a and b, respectively.
21. The method of claim 20, wherein N is an invertible matrix.
22. The method of any previous claim, wherein the at least one operation in the existing computer-executable program and the state-vector function use computationally-similar operations.
23. The method of any previous claim, wherein the step of modifying the existing computer program further comprises applying, to a combination of the state-vector function and the existing computer program, at least one technique selected from the group consisting of: a fracture, variable-dependent coding, dynamic data mangling, and cross-linking.
24. The method of claim 23, wherein each of the state-vector function and code implementing the at least one technique uses operations computationally similar to those present in the existing computer program.
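As an illustration of claims 5 and 9 only, the following minimal Python sketch builds a state-vector function as a chain of invertible steps over Z/(2^w) and derives its inverse by applying the inverse of each step in reverse order. The step types (a scalar affine map with an odd multiplier and an element rotation), the names `make_affine_step`, `make_rotation_step`, and `build_state_vector_function`, and the use of a seeded `random.Random` are all assumptions made for this example; they are not taken from the specification.

```python
import random

W = 32                   # assumed word size w in bits
MOD = 1 << W             # arithmetic is performed modulo 2**W


def make_affine_step(rng):
    """Return (forward, inverse) for x -> a*x + b (mod 2**W), applied per element."""
    a = rng.randrange(1, MOD, 2)      # odd multiplier, hence invertible mod 2**W
    b = rng.randrange(MOD)
    a_inv = pow(a, -1, MOD)           # modular inverse (Python 3.8+)

    def forward(vec):
        return [(a * x + b) % MOD for x in vec]

    def inverse(vec):
        return [(a_inv * (x - b)) % MOD for x in vec]

    return forward, inverse


def make_rotation_step(rng, n):
    """Return (forward, inverse) for a rotation of the n vector elements."""
    r = rng.randrange(n)

    def forward(vec):
        return vec[r:] + vec[:r]            # rotate left by r positions

    def inverse(vec):
        return vec[n - r:] + vec[:n - r]    # rotate right by r positions

    return forward, inverse


def build_state_vector_function(seed, n, rounds=4):
    """Chain invertible steps; the inverse applies each step's inverse in reverse."""
    rng = random.Random(seed)
    steps = []
    for _ in range(rounds):
        steps.append(make_affine_step(rng))
        steps.append(make_rotation_step(rng, n))

    def f(vec):
        for forward, _ in steps:
            vec = forward(vec)
        return vec

    def f_inv(vec):
        for _, inverse in reversed(steps):
            vec = inverse(vec)
        return vec

    return f, f_inv
```

Calling `f, f_inv = build_state_vector_function(1234, 4)` yields a pair for which `f_inv(f(v)) == v` for any 4-element vector `v` of 32-bit words.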
25. A method comprising: receiving an input having a word size w;
applying an invertible state-vector function configured to operate on N-vectors of w-element words to the input, the invertible state-vector function comprising a combination of a plurality of invertible operations, wherein a first portion of steps in the state-vector function perform linear or affine computations over Z/(2^w);
applying, to the output of the invertible state-vector function, a first operation from among a first plurality of operations, the first operation being selected based upon a first indexing technique;
applying, to the output of the first operation, a second operation from among a second plurality of operations, the second operation being selected based upon a second indexing technique different from the first indexing technique; and
providing an output of the second operation.
26. The method of claim 25, wherein the second operation is selected from the second plurality of operations based upon an index derived from execution of the first operation.
27. The method of any one of claims 25 or 26, wherein each of the first and second indexing techniques controls an operation type independently selected from the group consisting of: an if-then-else construct, a switch construct, an element-permutation selection, an iteration count, an element rotation count, and a function-indexed key index.
28. The method of any one of claims 25-27, wherein each step in a third portion of steps in the state-vector function comprises a non-T-function operation.
29. The method of claim 28, wherein each of the steps in the third portion of steps is an operation type selected from the group consisting of: a function-indexed keyed element-wise rotation, and a function-indexed keyed sub-vector permutation.
30. The method of any one of claims 25-29, wherein the invertible state-vector function comprises a concatenation of the plurality of invertible operations.
31. The method of any one of claims 25-30, wherein w is selected from the group consisting of: 16 bits, 32 bits, and 64 bits.
32. A method of executing a first operation by performing a second operation, the method comprising:
performing the second operation by:
receiving an input X encoded as A(X) with a first encoding A;
performing a first plurality of computer-executable operations on the input using the value of B^-1(X), where B^-1 is the inverse of a second encoding mechanism B, the second encoding B being different from the first encoding A; and
providing an output based upon B^-1(X).
33. The method of claim 32, wherein the first operation is performed on the value B^-1 A(X).
34. The method of claim 33, wherein an output of the first operation is not provided external to executable code with which the first operation is integrated.
35. The method of any one of claims 32-34, wherein the first encoding mechanism encodes the input as aM + b, wherein a and b are constants.
36. The method of claim 35, wherein M is an invertible matrix.
37. The method of any one of claims 35 or 36, wherein the second encoding mechanism, if applied to the input, encodes the input as cN + d, wherein c and d are constants different than a and b, respectively.
38. The method of claim 37, wherein N is an invertible matrix.
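The idea of computing on values that stay encoded, with inputs and output under different affine encodings, can be sketched as follows. This is a simplified reading made for illustration: it takes the affine form in its scalar shape x -> a*x + b (mod 2^w) with an odd multiplier, rather than the matrix form recited in claims 36 and 38, and it shows an addition. The names `make_encoding` and `make_encoded_add` and all constants are hypothetical.

```python
W = 32             # assumed word size in bits
MOD = 1 << W


def make_encoding(a, b):
    """Return (encode, decode) for the affine encoding x -> a*x + b (mod 2**W)."""
    if a % 2 == 0:
        raise ValueError("multiplier must be odd to be invertible mod 2**W")
    a_inv = pow(a, -1, MOD)    # modular inverse (Python 3.8+)

    def encode(x):
        return (a * x + b) % MOD

    def decode(y):
        return (a_inv * (y - b)) % MOD

    return encode, decode


def make_encoded_add(enc_x, enc_y, enc_out):
    """Build an adder that works directly on encoded operands.

    enc_x, enc_y, enc_out are (multiplier, offset) pairs. The returned function
    maps encoded x and encoded y to the encoded sum without ever producing the
    plain values: only three precomputed constants p, q, r appear in the code.
    """
    (a1, b1), (a2, b2), (c, d) = enc_x, enc_y, enc_out
    p = (c * pow(a1, -1, MOD)) % MOD
    q = (c * pow(a2, -1, MOD)) % MOD
    r = (d - p * b1 - q * b2) % MOD

    def encoded_add(x_enc, y_enc):
        # p*x_enc + q*y_enc + r == c*(x + y) + d  (mod 2**W)
        return (p * x_enc + q * y_enc + r) % MOD

    return encoded_add
```

For example, with encodings `(0x9E3779B1, 12345)`, `(0x85EBCA6B, 999)`, and output encoding `(0xC2B2AE35, 7)`, decoding the result of `encoded_add` under the output encoding recovers `x + y` modulo 2^32.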
39. A method comprising:
for a matrix operation configured to receive an input and provide an output, prior to performing the operation, permuting the input according to a sorting-network topology;
executing the matrix operation using the permuted input to generate the output;
permuting the output according to the sorting-network topology; and
providing the permuted output as the output of the matrix operation.
40. The method of claim 39, wherein the sorting-network topology is selected from the group consisting of: a Batcher network, a Banyan network, a perfect-shuffle network, and an Omega network.
41. The method of any one of claims 39 or 40, wherein, for each of a plurality of subsequent operations, an input for the subsequent operation is permuted according to the sorting-network topology.
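One possible reading of claims 39-41 is sketched below: the comparator positions of a Batcher odd-even merge sorting network are reused as candidate swap positions, a seeded generator decides which swaps are actually performed, and the resulting permutation is applied to the input and to the output of a matrix-vector product. The function names, the seeded `random.Random` used as the key source, and the choice of a matrix-vector product modulo 2^32 are assumptions made for this example.

```python
import random


def batcher_pairs(n):
    """Comparator positions of Batcher's odd-even merge sort network on n wires."""
    pairs = []
    p = 1
    while p < n:
        k = p
        while k >= 1:
            for j in range(k % p, n - k, 2 * k):
                for i in range(min(k - 1, n - j - k - 1) + 1):
                    if (i + j) // (2 * p) == (i + j + k) // (2 * p):
                        pairs.append((i + j, i + j + k))
            k //= 2
        p *= 2
    return pairs


def keyed_permutation(n, seed):
    """Keep each network position as a swap with probability 1/2 (keyed by seed)."""
    rng = random.Random(seed)
    swaps = [pair for pair in batcher_pairs(n) if rng.random() < 0.5]

    def permute(vec):
        out = list(vec)
        for i, j in swaps:
            out[i], out[j] = out[j], out[i]
        return out

    def unpermute(vec):
        out = list(vec)
        for i, j in reversed(swaps):     # undo the swaps in reverse order
            out[i], out[j] = out[j], out[i]
        return out

    return permute, unpermute


def mat_vec(matrix, vec, mod=1 << 32):
    """Plain matrix-vector product modulo mod."""
    return [sum(m * x for m, x in zip(row, vec)) % mod for row in matrix]


def permuted_mat_vec(matrix, vec, seed, mod=1 << 32):
    """Permute the input, run the matrix operation, then permute the output."""
    permute, _ = keyed_permutation(len(vec), seed)
    return permute(mat_vec(matrix, permute(vec), mod))
```

Because every step is a transposition, the selected swaps always compose to a permutation, and `unpermute` recovers the original ordering when a consumer needs it.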
42. A method comprising:
receiving a first input;
applying a function-indexed interleaved first function to the first input to generate a first output having a left portion and a right portion;
applying a function-index interleaved second function to the first output to generate a second output, wherein the left portion of the first output is used as a right input to the second function and the right portion of the first output is used as a left input to the second function; and
providing the second output as an encoding of the first input.
43. The method of claim 42, wherein the first input is encoded with a first encoding, further comprising:
applying the function-index interleaved first function and the function-index interleaved second function based upon a second encoding different from the first encoding.
44. The method of claim 42, further comprising:
encoding the input with a first encoding; and
performing each other recited step using operations that are adapted and configured to operate on the input when the first input is encoded with a second encoding mechanism different from the first encoding mechanism.
45. The method of claim 44, wherein the first encoding mechanism encodes the first input as aM + b, wherein a and b are constants.
46. The method of claim 45, wherein M is an invertible matrix.
47. The method of claim 45, wherein the second encoding mechanism, if applied to the first input, encodes the input as cN + d, wherein c and d are constants different than a and b, respectively.
48. The method of claim 47, wherein N is an invertible matrix.
49. A method comprising:
generating a key K;
generating a pair of base functions f, f^-1 based upon the generated key K and a randomization information R;
applying the base function f to a first end of a communication pipe;
applying the base function inverse f^-1 to a second end of the communication pipe; and
discarding the key K.
50. The method of claim 49, wherein the key K is generated using a random or pseudorandom process.
51. The method of any one of claims 49-50, wherein the first end of the communication pipe is accessed by a first application on a first platform.
52. The method of any one of claims 49-51, wherein the second end of the communication pipe is accessed by a second application on the first platform.
53. The method of any one of claims 49-51, wherein the second end of the communication pipe is accessed by a second application on a second platform.
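The sequence in claim 49 (generate a key, derive a function pair from the key and randomization information, place the two functions at opposite ends of a pipe, discard the key) can be illustrated with a deliberately simple stand-in. In this sketch the "base function" is just a keyed byte substitution table, which is a toy and offers no real protection; `generate_base_pair` and `set_up_pipe` are hypothetical names, and `random.Random` is used only to derive the table deterministically, not as a cryptographic generator.

```python
import random
import secrets


def generate_base_pair(key, r):
    """Derive a toy invertible function pair (f, f_inv) from key K and randomness R."""
    rng = random.Random(key + r)      # deterministic derivation, NOT cryptographic
    table = list(range(256))
    rng.shuffle(table)                # keyed byte substitution
    inverse_table = [0] * 256
    for plain, substituted in enumerate(table):
        inverse_table[substituted] = plain

    def f(data):
        return bytes(table[b] for b in data)

    def f_inv(data):
        return bytes(inverse_table[b] for b in data)

    return f, f_inv


def set_up_pipe():
    """Generate K and R, build the pair, then drop the key."""
    key = secrets.token_bytes(16)     # random key K
    r = secrets.token_bytes(16)       # randomization information R
    f, f_inv = generate_base_pair(key, r)
    del key                           # drops the local reference only
    return f, f_inv                   # f for one pipe end, f_inv for the other
```

A sender would apply `f` before writing to the pipe and the receiver would apply `f_inv` after reading, so that `f_inv(f(message)) == message` while neither end holds the key itself.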
54. A method comprising:
receiving at least one base function;
receiving application code for an existing computer program; and
blending the at least one base function and the application code for the existing computer program by replacing at least one operation in the application code with the at least one base function.
55. The method of claim 54, further comprising:
applying at least one blending technique to the at least one base function and the application code, the at least one blending technique selected from the group consisting of: a fracture, variable dependent coding, dynamic data mangling, and cross-linking.
56. A computer system comprising:
a processor; and
a computer-readable storage medium storing instructions which cause the processor to perform a method as recited in any previous claim.
PCT/CA2013/000305 2012-03-30 2013-03-28 Securing accessible systems using base function encoding WO2013142981A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP13767371.1A EP2831794B1 (en) 2012-03-30 2013-03-28 Securing accessible systems using base function encoding
CN201380028121.0A CN104335218B (en) 2012-03-30 2013-03-28 Addressable system is protected using basic function coding
US14/389,361 US9965623B2 (en) 2012-03-30 2013-03-28 Securing accessible systems using base function encoding

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201261617991P 2012-03-30 2012-03-30
US201261618010P 2012-03-30 2012-03-30
US61/618,010 2012-03-30
US61/617,991 2012-03-30

Publications (1)

Publication Number Publication Date
WO2013142981A1 WO2013142981A1 (en) 2013-10-03

Family

ID=49258006

Family Applications (4)

Application Number Title Priority Date Filing Date
PCT/CA2013/000309 WO2013142983A1 (en) 2012-03-30 2013-03-28 Securing accessible systems using cross-linking
PCT/CA2013/000304 WO2013142980A1 (en) 2012-03-30 2013-03-28 Securing accessible systems using variable dependent coding
PCT/CA2013/000303 WO2013142979A1 (en) 2012-03-30 2013-03-28 Securing accessible systems using dynamic data mangling
PCT/CA2013/000305 WO2013142981A1 (en) 2012-03-30 2013-03-28 Securing accessible systems using base function encoding

Family Applications Before (3)

Application Number Title Priority Date Filing Date
PCT/CA2013/000309 WO2013142983A1 (en) 2012-03-30 2013-03-28 Securing accessible systems using cross-linking
PCT/CA2013/000304 WO2013142980A1 (en) 2012-03-30 2013-03-28 Securing accessible systems using variable dependent coding
PCT/CA2013/000303 WO2013142979A1 (en) 2012-03-30 2013-03-28 Securing accessible systems using dynamic data mangling

Country Status (4)

Country Link
US (4) US9965623B2 (en)
EP (4) EP2831795B1 (en)
CN (4) CN104335218B (en)
WO (4) WO2013142983A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109640299A (en) * 2019-01-31 2019-04-16 浙江工商大学 It is a kind of to guarantee that M2M communication is complete and the polymerization and system of failure tolerant
CN110248035A (en) * 2018-03-09 2019-09-17 株式会社理光 Information processing unit, image forming apparatus, image processing system, image processing method and program
WO2019229234A1 (en) 2018-05-31 2019-12-05 Irdeto B.V. Shared secret establishment
WO2021032792A1 (en) 2019-08-20 2021-02-25 Irdeto B.V. Securing software routines
CN112989421A (en) * 2021-03-31 2021-06-18 支付宝(杭州)信息技术有限公司 Method and system for processing safety selection problem
CN114208359A (en) * 2019-08-15 2022-03-18 高通股份有限公司 New in-radio coexistence in broadband systems
US11281769B2 (en) 2016-12-15 2022-03-22 Irdeto B.V. Software integrity verification
US11606211B2 (en) 2017-03-10 2023-03-14 Irdeto B.V. Secured system operation

Families Citing this family (101)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3015726B1 (en) * 2013-12-24 2016-01-08 Morpho SECURE COMPARATIVE PROCESSING METHOD
EP2913772A1 (en) * 2014-02-28 2015-09-02 Wibu-Systems AG Method and computer system for protecting a computer program against influence
CN106415566A (en) 2014-03-31 2017-02-15 爱迪德技术有限公司 Protecting an item of software
EP3127269B1 (en) * 2014-03-31 2018-07-11 Irdeto B.V. Protecting an item of software
JP6260442B2 (en) * 2014-05-02 2018-01-17 富士通株式会社 Information processing method and program
WO2015178896A1 (en) * 2014-05-20 2015-11-26 Hewlett-Packard Development Company, L.P. Point-wise protection of application using runtime agent and dynamic security analysis
WO2015178895A1 (en) * 2014-05-20 2015-11-26 Hewlett-Packard Development Company, L.P. Point-wise protection of application using runtime agent
US9646160B2 (en) * 2014-09-08 2017-05-09 Arm Limited Apparatus and method for providing resilience to attacks on reset of the apparatus
US10657262B1 (en) 2014-09-28 2020-05-19 Red Balloon Security, Inc. Method and apparatus for securing embedded device firmware
BR112017006236A2 (en) 2014-09-30 2017-12-12 Koninklijke Philips Nv electronic calculating device, ring coding device, table calculating device, electronic calculating method, computer program, and, computer readable media
DE102014016548A1 (en) * 2014-11-10 2016-05-12 Giesecke & Devrient Gmbh Method for testing and hardening software applications
RU2710310C2 (en) 2014-12-12 2019-12-25 Конинклейке Филипс Н.В. Electronic forming device
US20160182472A1 (en) * 2014-12-19 2016-06-23 Nxp, B.V. Binding White-Box Implementation To Reduced Secure Element
US10262161B1 (en) * 2014-12-22 2019-04-16 Amazon Technologies, Inc. Secure execution and transformation techniques for computing executables
JP6387466B2 (en) * 2014-12-22 2018-09-05 コーニンクレッカ フィリップス エヌ ヴェKoninklijke Philips N.V. Electronic computing device
US10311229B1 (en) 2015-05-18 2019-06-04 Amazon Technologies, Inc. Mitigating timing side-channel attacks by obscuring alternatives in code
US10868665B1 (en) 2015-05-18 2020-12-15 Amazon Technologies, Inc. Mitigating timing side-channel attacks by obscuring accesses to sensitive data
US10437525B2 (en) * 2015-05-27 2019-10-08 California Institute Of Technology Communication efficient secret sharing
FR3039733B1 (en) * 2015-07-29 2017-09-01 Sagemcom Broadband Sas DEVICE AND METHOD FOR MODIFYING A STREAMED MEDIA DATA STREAM
US9942038B2 (en) * 2015-11-04 2018-04-10 Nxp B.V. Modular exponentiation using randomized addition chains
FR3047373B1 (en) * 2016-01-28 2018-01-05 Morpho SECURE MULTIPARTITE CALCULATION METHOD PROTECTED AGAINST A MALICIOUS PART
EP3208968A1 (en) * 2016-02-22 2017-08-23 HOB GmbH & Co. KG Computer implemented method for generating a random seed with high entropy
EP3208788B1 (en) * 2016-02-22 2020-06-03 Eshard Method of protecting a circuit against a side-channel analysis
WO2017203992A1 (en) * 2016-05-23 2017-11-30 ソニー株式会社 Encryption device, encryption method, decryption device, and decryption method
JP2017211842A (en) * 2016-05-25 2017-11-30 富士通株式会社 Information processor, compilation management method, and compilation program
RU2621181C1 (en) * 2016-06-02 2017-05-31 Олег Станиславович Когновицкий Cycle synchronization method with dynamic addressing recipient
US10201026B1 (en) 2016-06-30 2019-02-05 Acacia Communications, Inc. Forward error correction systems and methods
US10243937B2 (en) * 2016-07-08 2019-03-26 Nxp B.V. Equality check implemented with secret sharing
CN107623568B (en) * 2016-07-15 2022-09-06 青岛博文广成信息安全技术有限公司 SM4 white box implementation method based on S box dependent on secret key
US10771235B2 (en) * 2016-09-01 2020-09-08 Cryptography Research Inc. Protecting block cipher computation operations from external monitoring attacks
US20180130108A1 (en) 2016-11-09 2018-05-10 Morphotrust Usa, Llc Embedding security information in an image
KR102594656B1 (en) 2016-11-25 2023-10-26 삼성전자주식회사 Security Processor, Application Processor having the same and Operating Method of Security Processor
CN106778101B (en) * 2016-12-08 2019-05-14 合肥康捷信息科技有限公司 It is a kind of that method is obscured with the Python code that shape is obscured based on control stream
WO2018126187A1 (en) 2016-12-30 2018-07-05 Jones Robert L Embedded variable line patterns
US11615285B2 (en) 2017-01-06 2023-03-28 Ecole Polytechnique Federale De Lausanne (Epfl) Generating and identifying functional subnetworks within structural networks
US10579495B2 (en) 2017-05-18 2020-03-03 California Institute Of Technology Systems and methods for transmitting data using encoder cooperation in the presence of state information
US10902098B2 (en) * 2017-07-11 2021-01-26 Northwestern University Logic encryption for integrated circuit protection
US10521585B2 (en) * 2017-10-02 2019-12-31 Baidu Usa Llc Method and apparatus for detecting side-channel attack
US11323247B2 (en) 2017-10-27 2022-05-03 Quantropi Inc. Methods and systems for secure data communication
EP3701664A4 (en) * 2017-10-27 2021-07-28 Quantropi Inc. Methods and systems for secure data communication
CN108009429B (en) * 2017-12-11 2021-09-03 北京奇虎科技有限公司 Patch function generation method and device
US11461435B2 (en) * 2017-12-18 2022-10-04 University Of Central Florida Research Foundation, Inc. Techniques for securely executing code that operates on encrypted data on a public computer
CN109995518A (en) * 2017-12-29 2019-07-09 北京京东尚科信息技术有限公司 Method for generating cipher code and device
US11218291B2 (en) * 2018-02-26 2022-01-04 Stmicroelectronics (Rousset) Sas Method and circuit for performing a substitution operation
FR3078463A1 (en) * 2018-02-26 2019-08-30 Stmicroelectronics (Rousset) Sas METHOD AND DEVICE FOR REALIZING SUBSTITUTED TABLE OPERATIONS
FR3078464A1 (en) * 2018-02-26 2019-08-30 Stmicroelectronics (Rousset) Sas METHOD AND CIRCUIT FOR IMPLEMENTING A SUBSTITUTION TABLE
US11480991B2 (en) * 2018-03-12 2022-10-25 Nippon Telegraph And Telephone Corporation Secret table reference system, method, secret calculation apparatus and program
CN108509774B (en) * 2018-04-09 2020-08-11 北京顶象技术有限公司 Data processing method and device, electronic equipment and storage medium
US11032061B2 (en) * 2018-04-27 2021-06-08 Microsoft Technology Licensing, Llc Enabling constant plaintext space in bootstrapping in fully homomorphic encryption
GB2574261B (en) * 2018-06-01 2020-06-03 Advanced Risc Mach Ltd Efficient unified hardware implementation of multiple ciphers
US11893471B2 (en) 2018-06-11 2024-02-06 Inait Sa Encoding and decoding information and artificial neural networks
US11663478B2 (en) 2018-06-11 2023-05-30 Inait Sa Characterizing activity in a recurrent artificial neural network
US11972343B2 (en) 2018-06-11 2024-04-30 Inait Sa Encoding and decoding information
EP3591550A1 (en) * 2018-07-06 2020-01-08 Koninklijke Philips N.V. A compiler device with masking function
US10505676B1 (en) * 2018-08-10 2019-12-10 Acacia Communications, Inc. System, method, and apparatus for interleaving data
US10545850B1 (en) 2018-10-18 2020-01-28 Denso International America, Inc. System and methods for parallel execution and comparison of related processes for fault protection
CN109614582B (en) * 2018-11-06 2020-08-11 海南大学 Lower triangular part storage device of self-conjugate matrix and parallel reading method
CN109614149B (en) * 2018-11-06 2020-10-02 海南大学 Upper triangular part storage device of symmetric matrix and parallel reading method
US11764940B2 (en) 2019-01-10 2023-09-19 Duality Technologies, Inc. Secure search of secret data in a semi-trusted environment using homomorphic encryption
CN111459788A (en) * 2019-01-18 2020-07-28 南京大学 Test program plagiarism detection method based on support vector machine
US11403372B2 (en) * 2019-01-29 2022-08-02 Irdeto Canada Corporation Systems, methods, and storage media for obfuscating a computer program by representing the control flow of the computer program as data
JP7287743B2 (en) * 2019-02-26 2023-06-06 インテル・コーポレーション Workload-oriented constant propagation for compilers
JP7233265B2 (en) * 2019-03-15 2023-03-06 三菱電機株式会社 Signature device, verification device, signature method, verification method, signature program and verification program
US11652603B2 (en) 2019-03-18 2023-05-16 Inait Sa Homomorphic encryption
US11569978B2 (en) * 2019-03-18 2023-01-31 Inait Sa Encrypting and decrypting information
JP7107432B2 (en) * 2019-03-28 2022-07-27 日本電気株式会社 Analysis system, method and program
US10764029B1 (en) * 2019-04-02 2020-09-01 Carey Patrick Atkins Asymmetric Encryption Algorithm
US11654635B2 (en) 2019-04-18 2023-05-23 The Research Foundation For Suny Enhanced non-destructive testing in directed energy material processing
EP3959840A4 (en) 2019-04-23 2023-01-11 Quantropi Inc. Enhanced randomness for digital systems
CN110196819B (en) * 2019-06-03 2021-08-24 海光信息技术股份有限公司 Memory access method and hardware
US11314996B1 (en) 2019-06-04 2022-04-26 Idemia Identity & Security USA LLC Embedded line patterns using square-wave linecode
CN112068799B (en) * 2019-06-11 2022-08-02 云南大学 Optimal signed binary system fast calculation method and elliptic curve scalar multiplication
US11323255B2 (en) 2019-08-01 2022-05-03 X-Logos, LLC Methods and systems for encryption and homomorphic encryption systems using Geometric Algebra and Hensel codes
CN110609831B (en) * 2019-08-27 2020-07-03 浙江工商大学 Data link method based on privacy protection and safe multi-party calculation
US11086714B2 (en) * 2019-09-20 2021-08-10 Intel Corporation Permutation of bit locations to reduce recurrence of bit error patterns in a memory device
JP7274155B2 (en) * 2019-09-26 2023-05-16 日本電気株式会社 Calculation system and calculation method
US11509460B2 * 2019-10-02 2022-11-22 Samsung Sds Co., Ltd. Apparatus and method for performing matrix multiplication operation being secure against side channel attack
US11651210B2 (en) 2019-12-11 2023-05-16 Inait Sa Interpreting and improving the processing results of recurrent neural networks
US11816553B2 (en) 2019-12-11 2023-11-14 Inait Sa Output from a recurrent neural network
US11580401B2 (en) 2019-12-11 2023-02-14 Inait Sa Distance metrics and clustering in recurrent neural networks
US11797827B2 (en) 2019-12-11 2023-10-24 Inait Sa Input into a neural network
US12099997B1 (en) 2020-01-31 2024-09-24 Steven Mark Hoffberg Tokenized fungible liabilities
JP7380843B2 (en) * 2020-03-24 2023-11-15 日本電気株式会社 Secure computation system, secure computation server device, secure computation method, and secure computation program
US11204985B2 (en) * 2020-03-31 2021-12-21 Irdeto Canada Corporation Systems, methods, and storage media for creating secured computer code having entangled transformations
US11552789B2 (en) * 2020-05-27 2023-01-10 Volodymyr Vasiliovich Khylenko System for an encoded information transmission
CN111881462A (en) * 2020-07-17 2020-11-03 张睿 Online analysis technology for commercial password application encryption effectiveness
WO2022035909A1 (en) 2020-08-10 2022-02-17 X-Logos, LLC Methods for somewhat homomorphic encryption and key updates based on geometric algebra for distributed ledger technology
TW202215237A (en) * 2020-09-02 2022-04-16 美商賽發馥股份有限公司 Memory protection for vector operations
US11683151B2 (en) 2020-09-17 2023-06-20 Algemetric, Inc. Methods and systems for distributed computation within a fully homomorphic encryption scheme using p-adic numbers
FR3118233B1 (en) * 2020-12-18 2024-01-19 St Microelectronics Alps Sas METHOD FOR DETECTING REVERSE ENGINEERING ON A PROCESSING UNIT USING AN INSTRUCTION REGISTER AND CORRESPONDING INTEGRATED CIRCUIT
US11502819B2 (en) * 2021-01-21 2022-11-15 Nxp B.V. Efficient masked polynomial comparison
CN112861331B (en) * 2021-01-28 2022-02-25 西南石油大学 Method for rapidly constructing coefficient matrix of oil and gas reservoir numerical simulator
CN112863132B (en) * 2021-04-23 2021-07-13 成都中轨轨道设备有限公司 Natural disaster early warning system and early warning method
IT202100012488A1 (en) * 2021-05-14 2022-11-14 Torino Politecnico Method of configuring neural networks and method of processing binary files
US11930074B2 (en) * 2021-10-26 2024-03-12 King Fahd University Of Petroleum And Minerals Content distribution over a network
TW202324967A (en) * 2021-11-03 2023-06-16 美商艾銳勢企業有限責任公司 White-box processing for encoding with large integer values
CN114117420B (en) * 2021-11-25 2024-05-03 北京邮电大学 Intrusion detection system of distributed multi-host network based on artificial immunology
CN114091624B (en) * 2022-01-18 2022-04-26 蓝象智联(杭州)科技有限公司 Federal gradient lifting decision tree model training method without third party
US12013970B2 (en) 2022-05-16 2024-06-18 Bank Of America Corporation System and method for detecting and obfuscating confidential information in task logs
US20240080186A1 (en) * 2022-09-07 2024-03-07 Google Llc Random Trigger for Automatic Key Rotation
CN116436473B (en) * 2023-06-09 2023-10-03 电子科技大学 Rule F-LDPC code parameter blind identification method based on check matrix

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6668325B1 (en) * 1997-06-09 2003-12-23 Intertrust Technologies Obfuscation techniques for enhancing software security
US20040139340A1 (en) * 2000-12-08 2004-07-15 Johnson Harold J System and method for protecting computer software from a white box attack
US6779114B1 (en) * 1999-08-19 2004-08-17 Cloakware Corporation Tamper resistant software-control flow encoding
US6842862B2 (en) * 1999-06-09 2005-01-11 Cloakware Corporation Tamper resistant software encoding
EP1947584B1 (en) * 2006-12-21 2009-05-27 Telefonaktiebolaget LM Ericsson (publ) Obfuscating computer program code
WO2009108245A2 * 2007-12-21 2009-09-03 University Of Virginia Patent Foundation System, method and computer program product for protecting software via continuous anti-tampering and obfuscation transforms
US20110214179A1 (en) * 2001-11-26 2011-09-01 Irdeto Canada Corporation Secure method and system for computer protection

Family Cites Families (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2009572A (en) * 1935-03-21 1935-07-30 Joseph H Dunn Rotary photograph printing machine
US5095525A (en) * 1989-06-26 1992-03-10 Rockwell International Corporation Memory transformation apparatus and method
US5081675A (en) 1989-11-13 1992-01-14 Kitti Kittirutsunetorn System for protection of software in memory against unauthorized use
US6088452A (en) 1996-03-07 2000-07-11 Northern Telecom Limited Encoding technique for software and hardware
US5974549A (en) * 1997-03-27 1999-10-26 Soliton Ltd. Security monitor
US6192475B1 (en) * 1997-03-31 2001-02-20 David R. Wallace System and method for cloaking software
US7430670B1 (en) 1999-07-29 2008-09-30 Intertrust Technologies Corp. Software self-defense systems and methods
CA2305078A1 (en) 2000-04-12 2001-10-12 Cloakware Corporation Tamper resistant software - mass data encoding
US6983365B1 (en) * 2000-05-05 2006-01-03 Microsoft Corporation Encryption systems and methods for identifying and coalescing identical objects encrypted with different keys
FR2811093A1 (en) * 2000-06-30 2002-01-04 St Microelectronics Sa DEVICE AND METHOD FOR EVALUATING ALGORITHMS
JP2002049310A (en) * 2000-08-04 2002-02-15 Toshiba Corp Ciphering and deciphering device, authentication device and storage medium
US20020092003A1 (en) * 2000-11-29 2002-07-11 Brad Calder Method and process for the rewriting of binaries to intercept system calls in a secure execution environment
GB2371125A (en) * 2001-01-13 2002-07-17 Secr Defence Computer protection system
CA2348355A1 (en) 2001-05-24 2002-11-24 Cloakware Corporation General scheme of using encodings in computations
CA2369304A1 (en) 2002-01-30 2003-07-30 Cloakware Corporation A protocol to hide cryptographic private keys
JP2003312286A (en) 2002-04-23 2003-11-06 Toyoda Mach Works Ltd Wheel driving force allocation controlling system
US7366914B2 (en) 2003-08-29 2008-04-29 Intel Corporation Source code transformation based on program operators
KR100506203B1 (en) * 2003-09-17 2005-08-05 삼성전자주식회사 Booting and boot code update method and system thereof
US7966499B2 (en) 2004-01-28 2011-06-21 Irdeto Canada Corporation System and method for obscuring bit-wise and two's complement integer computations in software
US7512936B2 (en) * 2004-12-17 2009-03-31 Sap Aktiengesellschaft Code diversification
DE602005018736D1 (en) * 2004-12-22 2010-02-25 Ericsson Telefon Ab L M Watermarking a computer program code using equivalent mathematical expressions
GB2435531A (en) * 2006-02-27 2007-08-29 Sharp Kk Control Flow Protection Mechanism
JP4938766B2 (en) * 2006-04-28 2012-05-23 パナソニック株式会社 Program obfuscation system, program obfuscation apparatus, and program obfuscation method
US20080016339A1 (en) * 2006-06-29 2008-01-17 Jayant Shukla Application Sandbox to Detect, Remove, and Prevent Malware
EP2107489A3 (en) * 2006-12-21 2009-11-04 Telefonaktiebolaget L M Ericsson (PUBL) Obfuscating computer program code
US8752032B2 (en) * 2007-02-23 2014-06-10 Irdeto Canada Corporation System and method of interlocking to protect software-mediated program and device behaviours
WO2008101340A1 (en) * 2007-02-23 2008-08-28 Cloakware Corporation System and method for interlocking to protect software-mediated program and device behaviours
US8245209B2 (en) * 2007-05-29 2012-08-14 International Business Machines Corporation Detecting dangling pointers and memory leaks within software
EP2009572B1 (en) * 2007-06-29 2010-01-27 Telefonaktiebolaget LM Ericsson (publ) Obfuscating execution traces of computer program code
US8271424B2 (en) * 2008-05-15 2012-09-18 International Business Machines Corporation Privacy and confidentiality preserving reporting of URLs
US8312249B1 (en) * 2008-10-10 2012-11-13 Apple Inc. Dynamic trampoline and structured code generation in a signed code environment
US8874928B2 (en) * 2008-10-31 2014-10-28 Apple Inc. System and method for obfuscating constants in a computer program
JP5322620B2 (en) 2008-12-18 2013-10-23 株式会社東芝 Information processing apparatus, program development system, program verification method, and program
CN101477610B (en) * 2008-12-25 2011-05-18 中国人民解放军信息工程大学 Software watermark process for combined embedding of source code and target code
FR2950721B1 (en) * 2009-09-29 2011-09-30 Thales Sa METHOD FOR EXECUTING A PROTECTIVE ALGORITHM OF AN AFFIN-MASKING ELECTRONIC DEVICE AND ASSOCIATED DEVICE
US20110167407A1 (en) * 2010-01-06 2011-07-07 Apple Inc. System and method for software data reference obfuscation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KLIMOV, A. ET AL.: "Cryptographic Applications of T-functions", SELECTED AREAS IN CRYPTOGRAPHY, SAC 2003, LECTURE NOTES IN COMPUTER SCIENCE, vol. 3006, 2004, pages 248 - 261, XP019004160 *
See also references of EP2831794A4 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11281769B2 (en) 2016-12-15 2022-03-22 Irdeto B.V. Software integrity verification
US11606211B2 (en) 2017-03-10 2023-03-14 Irdeto B.V. Secured system operation
CN110248035A (en) * 2018-03-09 2019-09-17 株式会社理光 Information processing unit, image forming apparatus, image processing system, image processing method and program
WO2019229234A1 (en) 2018-05-31 2019-12-05 Irdeto B.V. Shared secret establishment
US10797868B2 (en) 2018-05-31 2020-10-06 Irdeto B.V. Shared secret establishment
CN109640299A (en) * 2019-01-31 2019-04-16 浙江工商大学 Aggregation method and system for ensuring M2M communication integrity and fault tolerance
CN109640299B (en) * 2019-01-31 2021-09-21 浙江工商大学 Aggregation method and system for ensuring M2M communication integrity and fault tolerance
CN114208359A (en) * 2019-08-15 2022-03-18 高通股份有限公司 New in-radio coexistence in broadband systems
WO2021032792A1 (en) 2019-08-20 2021-02-25 Irdeto B.V. Securing software routines
CN112989421A (en) * 2021-03-31 2021-06-18 支付宝(杭州)信息技术有限公司 Method and system for processing safety selection problem

Also Published As

Publication number Publication date
WO2013142983A1 (en) 2013-10-03
US9698973B2 (en) 2017-07-04
EP2831797B1 (en) 2018-05-02
EP2831791A1 (en) 2015-02-04
CN104981813B (en) 2018-08-07
US20150067875A1 (en) 2015-03-05
US20150067874A1 (en) 2015-03-05
EP2831797A1 (en) 2015-02-04
CN104335218B (en) 2017-08-11
EP2831791A4 (en) 2015-11-25
US9906360B2 (en) 2018-02-27
EP2831797A4 (en) 2015-11-11
CN104335219B (en) 2018-06-05
CN104662549A (en) 2015-05-27
CN104335218A (en) 2015-02-04
EP2831795A4 (en) 2015-11-25
EP2831794B1 (en) 2021-11-10
EP2831794A4 (en) 2016-03-09
EP2831791B1 (en) 2020-10-21
EP2831794A1 (en) 2015-02-04
US20150326389A1 (en) 2015-11-12
EP2831795A1 (en) 2015-02-04
WO2013142980A1 (en) 2013-10-03
CN104981813A (en) 2015-10-14
US20150082425A1 (en) 2015-03-19
CN104335219A (en) 2015-02-04
US9965623B2 (en) 2018-05-08
EP2831795B1 (en) 2019-01-09
WO2013142979A1 (en) 2013-10-03
CN104662549B (en) 2019-02-19

Similar Documents

Publication Publication Date Title
WO2013142981A1 (en) Securing accessible systems using base function encoding
US9910971B2 (en) System and method of interlocking to protect software-mediated program and device behaviours
JP5861018B1 (en) Computing device configured by table network
EP3513348A1 (en) Efficient obfuscation of program control flow
US10331896B2 (en) Method of protecting secret data when used in a cryptographic algorithm
WO2008101340A1 (en) System and method for interlocking to protect software-mediated program and device behaviours
Löfström et al. Hiding Information in Software With Respect to a White-box Security Model

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13767371; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 14389361; Country of ref document: US)
NENP Non-entry into the national phase (Ref country code: DE)
WWE WIPO information: entry into national phase (Ref document number: 2013767371; Country of ref document: EP)