WO2022047540A1

WO2022047540A1 - Device fingerprinting

Info

Publication number: WO2022047540A1
Application number: PCT/AU2021/051024
Authority: WO
Inventors: Damith Ranasinghe; Yang Su; Yansong Gao
Original assignee: The University Of Adelaide
Priority date: 2020-09-03
Filing date: 2021-09-03
Publication date: 2022-03-10

Abstract

A method of generating a device fingerprint for a device that comprises at least one integrated circuit comprises: obtaining a plurality of raw bit values from the at least one integrated circuit; generating noise-resistant bits from the raw bit values by: grouping the raw bit values into a plurality of groups; and applying a transformation to each of the groups, wherein for each group, the transformation either generates a single noise-resistant bit for the group or does not output any value for the group, based on a noise tolerance threshold; and outputting a string of the noise-resistant bits as the device fingerprint.

Description

DEVICE FINGERPRINTING

RELATED APPLICATION

The present application is related to Australian Provisional Patent Application No. 2020903156, filed 3 September 2020 in the name of THE UNIVERSITY OF ADELAIDE, entitled "Noise Tolerant Memory Fingerprints from Commodity Devices" , the originally filed specification of which is hereby incorporated by reference herein in its entirety.

TECHNICAL FIELD

The present disclosure relates to device fingerprinting, for example through use of random bits generated by a memory or other integrated circuit component of a device.

BACKGROUND

Hardware instance-specific fingerprints can act as the basis for a Root of Trust in a cryptographic system. There have been various attempts to devise fingerprinting methods for commercial-off-the-shelf (COTS) devices, for various applications such as anticounterfeiting and key-based authentication and attestation. Such attempts have relied on device components such as on-board sensors, CPUs, and memories, including static random access memory (SRAM), dynamic random access memory (DRAM), and Flash memory.

Memory components are ubiquitous in COTS devices. This is especially true for low-end Internet of Things (IoT) devices, projected to grow to 75.44 billion worldwide by 2025. Therefore, fingerprinting embedded memories is an extremely attractive proposition for provisioning security functions, due to their wide availability, and the fact that memory-based fingerprinting does not impose additional hardware cost.

Whenever a fingerprint is regenerated from a given device, the digitized fingerprint should be consistent for it to be able to be used in security functions. However, in reality, reliable fingerprint regeneration is usually infeasible since the measurements used to generate the data for fingerprinting are susceptible to thermal noise, environmental parameters (supply voltage and temperature), and aging-induced variations over time. Current memory fingerprinting mechanisms cannot inherently deal with unreliability. Consequently, extracted fingerprints (digitized as binary bits) cannot directly support security functions. Current memory fingerprinting methods extract a 1-bit fingerprint from a single cell of memory, for example the power-up state of one SRAM cell. However, the fingerprints generated by such methods are susceptible to noise, such as thermal noise. Such susceptibility is inherently unpredictable. Hence, it is infeasible to ensure that the regenerated fingerprint is identical to the reference fingerprint which is enrolled at a secure server (for example) and used in a security function implemented between the server and a device.

To deal with the above issue, existing fingerprinting methods rely on fuzzy extractors, which are based on error correction coding schemes. Fuzzy extractors are implemented on-device and use associated helper data to correct bit errors. However, a fuzzy extractor has fundamental limitations. The computation overhead introduced on-device by fuzzy extractor logic is high. Further, the associated helper data can be actively manipulated, in helper data manipulation (HDM) attacks, to weaken or even compromise the security of the derived fingerprint. A general countermeasure against HDM attacks remains an open challenge. Therefore, both from utility and security perspectives, there are still significant practical issues to overcome to be able to implement security functions using hardware fingerprints.

There thus remains a need for a method of generating device fingerprints that can address one or more of the above difficulties.

SUMMARY

Disclosed herein is a method of generating a device fingerprint for a device that comprises at least one integrated circuit, the method comprising: obtaining a plurality of raw bit values from the at least one integrated circuit; generating noise-resistant bits from the raw bit values by:

(i) grouping the raw bit values into a plurality of groups; and

(ii) applying a transformation to each of the groups, wherein for each group, the transformation either generates a single noise-resistant bit for the group or does not output any value for the group, based on a noise tolerance threshold; and outputting a string of the noise-resistant bits as the device fingerprint.

Disclosed herein is a device attestation method comprising: generating a device fingerprint according to the method above; and enrolling the device fingerprint at a server.

Disclosed herein is a secure key derivation method comprising: generating a device fingerprint by the method above; and enrolling the device fingerprint as a reference fingerprint at a server.

Disclosed herein is a device comprising at least one integrated circuit, the at least one integrated circuit comprising at least one processor configured to carry out the method above.

BRIEF DESCRIPTION OF THE DRAWINGS

Some embodiments of a method and system for device fingerprinting, in accordance with present teachings will now be described, by way of non-limiting example only, with reference to the accompanying drawings in which:

Figure 1 is a schematic illustration of transformation of a raw fingerprint extracted from an integrated circuit to a noise-resistant fingerprint;

Figure 2 is a flow diagram of an example method for generating a device fingerprint;

Figure 3 is a schematic illustration of a first example of a transformation for use in a method for generating a device fingerprint;

Figure 4 shows empirical distributions of raw noisy fingerprints, and the effect of the choice of a noise-tolerance threshold in the first example transformation;

Figure 5 is a schematic illustration of a second example of a transformation for use in a method for generating a device fingerprint;

Figure 6 shows empirical distributions of raw noisy fingerprints, and the effect of the choice of a noise-tolerance threshold in the second example transformation;

Figure 7 is a schematic illustration of a third example of a transformation for use in a method for generating a device fingerprint;

Figure 8 is a schematic illustration of a fourth example of a transformation for use in a method for generating a device fingerprint;

Figure 9 is a schematic illustration of a fifth example of a transformation for use in a method for generating a device fingerprint;

Figure 10 is a block diagram of a system for device attestation according to an embodiment;

Figure 11 shows graphs of validation data for a method using the first example transformation;

Figure 12 shows graphs of validation data for a method using the second example transformation;

Figure 13 shows graphs of validation data on different memory types for a method using the first example transformation;

Figure 14 shows graphs of validation data on different memory types for a method using the second example transformation;

Figure 15 shows simulation data for a method using the second example transformation;

Figure 16 is pseudocode for a remote attestation protocol according to certain embodiments;

Figure 17 shows simulation data for extraction efficiency as a function of number of words per block of memory cells;

Figure 18 schematically illustrates different scenarios encountered in a method using the first example transformation;

Figure 19 schematically illustrates different scenarios encountered in a method using the second example transformation;

Figure 20 shows images of evaluated memories (bottom) and products (top) that integrate them;

Figure 21 schematically illustrates quantities in a derivation of extraction efficiency for the second example extraction method according to an embodiment;

Figure 22 shows data flow on a Device in a remote attestation method according to an embodiment;

Figure 23 shows an example protocol that enables lightweight session key establishment between a resource-tight device and a server;

Figure 24 shows simulation results for key reliability of the fifth example method, Method-A, of device fingerprinting;

Figure 25 shows simulation results for extraction efficiency of the fifth example method, Method-A;

Figure 26 shows simulation results for key reliability of the sixth example method, Method-B, of device fingerprinting;

Figure 27 shows simulation results for extraction efficiency of the sixth example method, Method-B;

Figure 28 shows a probability density function for a random chip model with the fifth example method, Method-A with larger selection threshold, θ, i.e.

Figure 29 shows a probability density function for a random chip model with the fifth example method, Method-A with smaller selection threshold θ,

Figure 30 shows simulated and predicted key reliability as a function of threshold, θ, for Method-A (the fifth example method) and Method-B (the sixth example method);

Figure 31 shows simulated and predicted key extraction efficiency as a function of threshold, θ, for Method-A (the fifth example method) and Method-B (the sixth example method);

Figure 32 shows the bias of raw bits extracted by Method-A (the fifth example method) and Method-B (the sixth example method) for two physical chips, nRF52 and MSP430;

Figure 33 shows the distribution of raw bits extracted by Method-A (the fifth example method) and Method-B (the sixth example method) for nRF52 and MSP430;

Figure 34 shows SRAM usage of an implementation of the present disclosure, compared to a prior art method, SecuCode;

Figure 35 shows FRAM usage compared to SecuCode;

Figure 36 shows clock cycle consumption compared to SecuCode;

Figure 37 shows one example implementation of Global TRE on a server;

Figure 38 shows one example implementation of Regional TRE on a server;

Figure 39 shows the results from a large-scale simulation of the run time of one example Global TRE implementation when Method-A is employed; and

Figure 40 shows the results from a large-scale simulation of the run time of one example Global TRE implementation when Method-B is employed.

DETAILED DESCRIPTION

Embodiments of the present disclosure are directed to extracting noise-resistant bits for device fingerprinting, by applying a transformation rule to raw bits that are obtained directly from one or more integrated circuits of a device. The raw bits may be obtained in a wide variety of different ways, based on any hardware component, or set of hardware components that is or are subject to variability, for example due to their physical properties.

Certain embodiments will be discussed below in relation to fingerprinting using memory components such as SRAM, Flash and EEPROM. However, it will be appreciated that other non-memory circuits may also be used for fingerprinting according to the teachings of the present disclosure. For example, fingerprinting may make use of gate and wire delays in circuits, with the delays being digitised to generate raw bits that can then be used to extract noise-resistant bits. In another example, CPU execution units may be used to generate the raw bits, for example based on time differences for two of the execution units to process instructions. Accordingly, while specific examples will be described in detail in relation to memory-based fingerprinting, it will be understood that any source of hardware variability in a device may be used to generate raw bits for fingerprinting.

In some examples, SRAM, Flash or EEPROM memory may be used for device fingerprinting. When SRAM is powered up, each cell exhibits a favoured power-up state; such an initial state varies from cell to cell, and chip to chip. Therefore, each SRAM cell's power-up state can be treated as a fingerprint bit. To extract fingerprints from Flash memory, all Flash cells in the same page are first reset, for example to '0'. Then partial programming is applied. As a result of tiny fabrication variations, some cells will remain in state '0' while others flip to '1'. Whether a cell remains in the same state or flips is determined by fabrication process variations. One can treat whether a cell flips as the fingerprint bit: a flip as logic '1', otherwise '0'. The same procedure can be applied to fingerprint EEPROM.

In contrast to extracting a single fingerprint bit from each memory cell, a many-to-1 bit transformation that is invariant to the unpredictable, complex and dynamic processes generating the fingerprint bits is used to generate the device fingerprint. This is illustrated schematically in Figure 1, in which an integrated circuit 10, such as a memory device, comprises a plurality of addressable units, for example memory cells 12 that are arranged in rows 14. A group 14 of raw fingerprint bits f (here n = 7 raw bits) extracted from measurements of the memory cells 12 shows slight variations in bits due to noise. (Bold, upright Times New Roman is used here to denote vectors explicitly; with a slight abuse of notation, and for simplicity, italic Times New Roman is used for vectors where the distinction is not important or can be inferred from the context.) The position and the number of raw bit flips can differ under different measurement conditions at different times t = 0, 1, 2, 3, .... However, these raw bits at different measurement instances are transformed by a transformation function 16 into the same noise-resistant bit F (e.g. '0') due to invariance of the transform 16 to the illustrated bit patterns.

Turning now to Figure 2, an example process flow for a method 100 of generating a device fingerprint is shown.

The method begins at step 102 by reading values from one or more integrated circuits (such as integrated circuit 10 of Figure 1). For example, the values may be power-up states of memory cells in SRAM, or bit-flip values for memory cells in a Flash memory or EEPROM subject to partial programming. The values may be read by a processor that is part of integrated circuit 10 or that is in communication with integrated circuit 10, for example via an I/O bus. Step 104 is optional and comprises digitizing the values read at step 102 to obtain a raw fingerprint comprising a plurality of raw bit values. If raw bit values are already available in digital format from step 102, then step 104 need not be carried out. However, in some cases, digitization may be required. For example, if a non-binary measurement such as residual charge in DRAM is used as the source of raw bits, then this should first be digitized at step 104.

At step 106, the raw bits are grouped into a plurality of groups. For example, these may be wordlines or sets of (contiguous or non-contiguous) wordlines, though it will be appreciated that raw bits need not be obtained from spatially adjacent or otherwise related measurements of the integrated circuit(s).

Next, at step 108, a transformation is applied to each group of raw bits. The transformation maps a group of raw bits to a single noise-resistant bit. A number of different transformations are possible, some of which will be described in detail below. The examples described below are l1-norm-based methods. However, it will be appreciated that many other types of metrics may be used.

Finally, at step 110, once the groups of raw bits have been transformed to respective noise-resistant bits, a string of the noise-resistant bits may be output as the device fingerprint.

The method 100 may be executed a plurality of times, with the first execution generating a fingerprint for enrolment, and subsequent executions generating fingerprints for comparison to the enrolled fingerprint for authentication of the device, for example.

Turning now to Figures 3 to 8, some specific examples of transformations applicable at step 108 of the method 100 will be described.

In general, a transformation according to the present disclosure may be written as:

F ← T(f) (1)

In Equation (1), f is a raw bit vector, T is a transformation function, and F is a transformed bit.

The l1-norm of a raw fingerprint can be considered to be the distance of the fingerprint from an all-zero vector. This is also referred to as the Hamming distance, and is a permutation-invariant quantity, i.e. multiple permutations of fingerprints f may possess the same l1-norm.

Let f be a raw noisy fingerprint binary vector of length n, and f_i be the i^th bit in f. The l1-norm is defined as:

||f||₁ = Σ_{i=1}^{n} |f_i| (2)

In a first example of a transformation, referred to herein as S-Norm, a transformation from raw fingerprint space f to noise-resistant fingerprint space F may be expressed as:

T_SNorm: F = '0' if ||f||₁ ≤ ⌊n/2⌋; F = '1' if ||f||₁ ≥ ⌈n/2⌉ (3)

In the above, f is a binary vector of length n bits (for example, from one wordline of the memory or several continuous wordlines), where n is any positive odd integer.

Figure 3 illustrates the S-Norm transformation using a simple example. Here, f is a 7-bit raw fingerprint vector. Accordingly, if the norm of f is less than or equal to ⌊3.5⌋ = 3, the transformed bit F is '0', while if the norm is greater than or equal to ⌈3.5⌉ = 4, the transformed bit F is '1'. As shown in Figure 3, for row 14b, the norm of the raw fingerprint ||f⁰||₁ and ||f¹||₁ (at t = 0 and t = 1) generates a transformed bit F = '1' which is invariant to the raw fingerprint bit patterns. Across time, we further observe the transformation to be invariant to two different combinations of bit patterns, from that at t = 0 and t = 1 to that at t = 2 for row 14a, which generates a transformed bit F = '0'. Consequently, the F bits obtained from the transformed space can tolerate a range of bit error patterns in raw bits.
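The 7-bit example can be sketched in a few lines of Python. This is only an illustration of the majority rule described above; the function names `l1_norm` and `s_norm` and the sample bit patterns are hypothetical and are not taken from Figure 3.

```python
from typing import Sequence


def l1_norm(bits: Sequence[int]) -> int:
    """Count the 1s in a binary vector (its distance from the all-zero vector)."""
    return sum(bits)


def s_norm(bits: Sequence[int]) -> int:
    """Map an odd-length group of raw bits to a single bit by majority."""
    n = len(bits)
    if n % 2 == 0:
        raise ValueError("S-Norm expects an odd group length")
    # For odd n, norm <= floor(n/2) gives 0 and norm >= ceil(n/2) gives 1.
    return 1 if l1_norm(bits) > n // 2 else 0


# Hypothetical readings of one 7-bit group at two times; two positions differ.
reading_t0 = [1, 1, 0, 1, 1, 0, 1]
reading_t1 = [1, 1, 0, 1, 0, 1, 1]
same_bit = s_norm(reading_t0) == s_norm(reading_t1)
```

Both readings have an l1-norm of 5, so they map to the same transformed bit even though the raw patterns differ.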

Although the transform can mitigate the errors from underlying noise processes, in the worst case, a transformed bit cannot tolerate a single bit flip from a vector norm adjacent to the decision boundary. For instance, for row 14b (F = '1') at t = 0, 1, 2, all ||f||₁ values are at the decision boundary of Equation (3). Here, a single change in the underlying fingerprint bit vector is sufficient to affect the value of the transformed fingerprint bit F. Therefore, we may consider these vectors as unreliable and eliminate the reliance on such raw noisy fingerprint vectors for generating noise tolerant fingerprints. In one variant, therefore, the method may be generalized based on a noise tolerance parameter θ.

Let θ ∈ ℕ⁰. Then an extracted raw fingerprint bit vector f_i⁰ of n bits extracted at t = 0 is selected for fingerprinting the device using S-Norm, if ∀i ∈ {1, ... , |N|}:

||f_i⁰||₁ ≤ ⌊n/2⌋ − θ or ||f_i⁰||₁ ≥ ⌈n/2⌉ + θ (4)

To demonstrate the significance of transforming fingerprints into the space of l1-norm, and the role of the noise tolerance parameter θ, we consider the distribution of ||f||₁. We used one fingerprint dataset obtained from Nordic Semiconductor chips (detailed in Table I). Figure 4 illustrates the resulting distribution of initial measurements (at t = 0). As expected, the distribution of ||f||₁ approximates a bell curve. Figures 4(a) and 4(b) respectively show two cases in which a small and a large θ is applied. The groups of f⁰ fingerprint vectors (indicated by arrows) represent those closest to the decision boundary of the transform T_SNorm; consequently, these raw bit vectors represent those most likely to exhibit a bit error in a transformed bit F due to the noisy raw fingerprint measurements. When θ is small (θ = 1), at least two raw fingerprint bit changes in a given pattern are required to reduce the l1-Norm of f to approach and cross the decision boundary defined by S-Norm in Equation (3), resulting in an F bit flip in a subsequent fingerprint evaluation.

In contrast, when θ is large and set to 4, ||f⁰||₁ is further away from the decision boundary. Consequently, at least 5 bit changes are required to effect an F bit flip in a subsequent fingerprint evaluation. Thus: i) applying the S-Norm based selection in (4) to the raw noisy fingerprints measured at t = 0, together with the S-Norm transform, generates a noise tolerant reference template fingerprint for a device memory; and ii) the same F fingerprint can subsequently be extracted by simply applying the S-Norm transformation in (3) to re-evaluations of the n-bit raw noisy fingerprints. Ideally, we would expect the transformed fingerprints in the noise-tolerant space to always match the template enrolled at t = 0.
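A minimal sketch of enrolment with a noise tolerance parameter is given below. It assumes that a group is kept only when its l1-norm lies at least θ beyond the majority decision boundary, which is consistent with the bit-change counts quoted above (two changes for θ = 1, five for θ = 4). The names `s_norm_enroll` and `s_norm_regenerate` and the sample data are hypothetical.

```python
from typing import Dict, List, Sequence


def s_norm_enroll(groups: Sequence[Sequence[int]], theta: int) -> Dict[int, int]:
    """Return {group index: bit} for groups far enough from the boundary."""
    selected: Dict[int, int] = {}
    for idx, group in enumerate(groups):
        n, norm = len(group), sum(group)
        if norm <= n // 2 - theta:            # well below the boundary
            selected[idx] = 0
        elif norm >= (n + 1) // 2 + theta:    # well above the boundary (odd n)
            selected[idx] = 1
        # otherwise: no value is output for this group
    return selected


def s_norm_regenerate(groups: Sequence[Sequence[int]], indices: Sequence[int]) -> List[int]:
    """Re-evaluate only the enrolled groups with the plain majority rule."""
    return [1 if sum(groups[i]) > len(groups[i]) // 2 else 0 for i in indices]


# Hypothetical enrolment data: four 7-bit groups with norms 6, 4, 1 and 3.
enrol_groups = [[1, 1, 1, 1, 1, 1, 0], [1, 1, 1, 1, 0, 0, 0],
                [0, 0, 0, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0]]
template = s_norm_enroll(enrol_groups, theta=1)

# Later reading with one flipped bit in each of the selected groups.
later_groups = [[1, 1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 0, 0, 0],
                [0, 1, 0, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0]]
regenerated = s_norm_regenerate(later_groups, sorted(template))
```

Only the indices of the selected groups need to be retained for regeneration in this sketch; the groups with norms 4 and 3 sit next to the boundary and are skipped.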

Increasing the threshold θ increases the noise tolerance of the transform T_SNorm . However, the trade-off is that fewer noise tolerant fingerprint bits are available, as seen in Figure 4. Accordingly, in a second example transformation, a distance measure capable of presenting a bimodal distribution may be used, as this provides an intrinsic separation of groups of underlying raw fingerprint bits. In particular, a differential distance measure may result in such a desirable distribution. The second example transformation is referred to herein as the D-Norm transform.

Let the lowest and highest l1-Norm of m groups (each group is an n-bit vector) be l and h, respectively. Then:

l = min_{i ∈ {1, ..., m}} ||f_i||₁ (5)

h = max_{i ∈ {1, ..., m}} ||f_i||₁ (6)

Then, following the general definition in Equation (1), the D-Norm transform is defined as:

T_DNorm: F = '1' if [h] < [l]; F = '0' otherwise (7)

Here, we denote the spatial index (memory address, in practice) of i for i ∈ {1, ... , m} chosen for h or l using a square bracket, "[ ]".

The D-Norm transformation is illustrated schematically in Figure 5. In the simplified example, the l1-Norms of m = 3 groups of n = 8-bit vectors are first evaluated at t = 0. For the D-Norm, n can be an even or odd integer, which is different from the S-Norm (for which n has to be an odd integer). Secondly, out of a block with m vectors, the lowest and highest l1-Norm, i.e. l and h respectively, are determined as given in Equations (5) and (6). Then the difference between l and h and the spatial relationship between the index of h and l is evaluated as expressed by T_DNorm in Equation (7) to extract an F bit. If h has a lower index number (for example, a lower physical memory address) than l within the memory block, then F = '1' as in the example shown in Figure 5; otherwise F = '0'.

In subsequent evaluations of the fingerprint at time t = 1, the permutation invariance property is observed, similar to the S-Norm. For example, the highest l1-Norm at t = 0 and t = 1 is h = 5 for the third 8-bit vector, despite repeated generation of the raw bits not being exact for the F = '0' block. It can further be observed that, for the F = '1' block, the transformed bit is invariant to two different combinations of bit patterns measured for h and l, with the difference in the norm of the raw fingerprint h − l being 6 − 1 = 5 at t = 0 and, in the extreme case, 3 − 3 = 0 at t = 1. In both of these illustrations, the fingerprint remains invariant to the raw fingerprint bit error patterns.
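The address comparison at the heart of D-Norm can be sketched as follows. The function name `d_norm` and the sample block are hypothetical, and ties are resolved by taking the first occurrence of the highest and lowest norm, which is an assumption made for this sketch only.

```python
from typing import Sequence


def d_norm(block: Sequence[Sequence[int]]) -> int:
    """Map a block of m bit vectors to one bit from the positions of h and l."""
    norms = [sum(group) for group in block]
    idx_h = norms.index(max(norms))  # address of the highest l1-norm
    idx_l = norms.index(min(norms))  # address of the lowest l1-norm
    return 1 if idx_h < idx_l else 0


# Hypothetical block of m = 3 groups of n = 8 bits with norms 6, 3 and 1.
block_t0 = [[1, 1, 1, 1, 1, 1, 0, 0],
            [1, 1, 1, 0, 0, 0, 0, 0],
            [1, 0, 0, 0, 0, 0, 0, 0]]
# A noisy re-reading of the same block with norms 5, 3 and 2.
block_t1 = [[1, 1, 1, 1, 1, 0, 0, 0],
            [1, 1, 1, 0, 0, 0, 0, 0],
            [1, 1, 0, 0, 0, 0, 0, 0]]
bit_t0, bit_t1 = d_norm(block_t0), d_norm(block_t1)
```

The individual norms change between the two readings, but the highest norm stays at the lower address, so the extracted bit is unchanged.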

A noise tolerance threshold may be incorporated into D-Norm as follows. Consider a block of m groups, where each group is an n-bit raw noisy fingerprint vector providing an l1-Norm ||f_i⁰||₁ at t = 0, for i ∈ {1, ..., m}. The transformed F is selected for fingerprinting the device using D-Norm conditional on h and l, as defined in Equations (5) and (6), satisfying

|h − l| ≥ θ (8)

where θ bounds the noise tolerated by F.
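As a rough illustration of the threshold condition, the sketch below walks over precomputed l1-norms in fixed blocks of m words and outputs a bit only for blocks that satisfy |h − l| ≥ θ. The function name `d_norm_fingerprint`, the return format and the sample norms are hypothetical.

```python
from typing import List, Sequence, Tuple


def d_norm_fingerprint(norms: Sequence[int], m: int, theta: int) -> List[Tuple[int, int]]:
    """Return (block index, bit) for every fixed-size block passing the threshold."""
    result = []
    for block_idx in range(len(norms) // m):
        block = list(norms[block_idx * m:(block_idx + 1) * m])
        h, l = max(block), min(block)
        if h - l >= theta:  # selection condition on the block
            bit = 1 if block.index(h) < block.index(l) else 0
            result.append((block_idx, bit))
        # blocks below the threshold produce no output
    return result


# Hypothetical l1-norms of nine words, grouped into three blocks of m = 3.
word_norms = [6, 3, 1, 4, 3, 2, 0, 2, 7]
selected_bits = d_norm_fingerprint(word_norms, m=3, theta=4)
```

Here the middle block has h − l = 2 and is skipped, while the first and last blocks each contribute one bit.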

Unlike S-Norm, the h and l selection at t = 0 maximizes or enlarges the distances between the n-bit vectors at [h] and [l] for each block of m groups. Further, the absolute value of h − l determines the noise degree tolerated by the transformed bit F in the D-Norm transform space. Correspondingly, the new fingerprint F from a D-Norm transformation can tolerate a larger range of noise bounded by θ; a fact that becomes more apparent when we consider the distribution of D-Norm. Specifically, Figure 6 demonstrates the bimodal distribution of D-Norm from one fingerprint dataset obtained from Nordic Semiconductor chips (specifications of which are detailed in Table I). The two clear groupings of m × n bit blocks based on the D-Norm distance measure result in an intrinsic separation. The D-Norm transformation method, intuitively, appears to sacrifice more of the available entropy than S-Norm, since m × n raw fingerprint bits are transformed into a 1-bit F. However, the differential distance measure h − l is bimodal. This, in fact, provides a significantly higher number of noise-tolerant F bits, as validated and detailed below.

In the previously described transformation, D-Norm with fixed m, the selection rate is capped since only two l1-Norms (corresponding to two groups) are actually used to produce the noise-tolerant bit F. In that case, a fixed m leads to a fixed amount of (m − 2) × n raw bits being wasted without contributing to the output. Accordingly, a modified version of D-Norm, referred to herein as Elastic D-Norm, may be applied in the transformation step 108 of method 100. Elastic D-Norm is illustrated schematically in Figure 7. Instead of slicing l1-Norms into blocks with fixed size m, we take a more flexible strategy:

1) Define a maximum allowed block size m_max, and a selection condition θ.

2) Start from block size m = 2.

3) Evaluate the difference between Max (h) and Min (l). If |h - l | ≥ θ select this block and calculate F following the same rule as defined by Equations (5) to (8). Else if |h - l | < θ, increase m by one.

4) If m < m_max, repeat 3) - 4), else discard the current block and repeat 2) - 4).
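The four steps above can be sketched as a loop over precomputed l1-norms. Several details are assumptions of this sketch rather than statements of the method: the block is allowed to grow up to and including m_max words, a discarded block skips all of its words, and ties in h or l take the first occurrence. The name `elastic_d_norm` and the sample data are hypothetical.

```python
from typing import List, Sequence


def elastic_d_norm(norms: Sequence[int], theta: int, m_max: int) -> List[int]:
    """Grow each block from 2 words until |h - l| >= theta or m_max is reached."""
    if m_max < 2:
        raise ValueError("m_max must be at least 2")
    bits: List[int] = []
    start = 0
    while start + 2 <= len(norms):
        selected = False
        for m in range(2, m_max + 1):
            if start + m > len(norms):
                break  # not enough words left to grow the block further
            block = list(norms[start:start + m])
            h, l = max(block), min(block)
            if h - l >= theta:
                bits.append(1 if block.index(h) < block.index(l) else 0)
                start += m  # the next block begins after the selected one
                selected = True
                break
        if not selected:
            start += min(m_max, len(norms) - start)  # discard the block
    return bits


# Hypothetical l1-norms of ten consecutive words.
example_norms = [9, 4, 3, 8, 2, 5, 5, 5, 1, 7]
elastic_bits = elastic_d_norm(example_norms, theta=4, m_max=3)
```

With these inputs, two blocks are selected at m = 2, one block is discarded after growing to three words, and a final two-word block is selected.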

In this method, the size of m varies from 2 to a certain (user-specified) value m_max. The Elastic D-Norm method thus introduces a new parameter m_max to limit the maximum number of words that can be included in a block. Elastic D-Norm can be further improved upon by removing the extra parameter m_max. Accordingly, in a fourth example of a transformation in step 108 of method 100, an alternative that will be referred to herein as sliding-window D-Norm may be used. Sliding-window D-Norm is illustrated schematically in Figure 8 and may be summarized as follows.

1) Define a selection condition θ.

2) Select the n-by-m block.

3) Evaluate the difference between Max (h) and Min (l). If |h - l | ≥ θ select this block and calculate F following the same rule as defined by Equations (5) to (8). Else if |h - l | < θ, move the block down by one word.

4) Repeat 3) until the end of the memory is reached.
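A minimal sketch of the sliding-window steps is shown below, again operating on precomputed l1-norms. It assumes that after a block is selected the window jumps past all m words of that block, which the steps above do not state explicitly. The name `sliding_window_d_norm` and the sample norms are hypothetical.

```python
from typing import List, Sequence


def sliding_window_d_norm(norms: Sequence[int], theta: int, m: int) -> List[int]:
    """Slide an m-word window; emit a bit when |h - l| >= theta, else move one word."""
    bits: List[int] = []
    start = 0
    while start + m <= len(norms):
        block = list(norms[start:start + m])
        h, l = max(block), min(block)
        if h - l >= theta:
            bits.append(1 if block.index(h) < block.index(l) else 0)
            start += m   # assumed: skip the words consumed by this block
        else:
            start += 1   # move the block down by one word
    return bits


# Hypothetical l1-norms of eight consecutive words, window of m = 2 words.
window_norms = [5, 5, 6, 1, 2, 7, 7, 3]
window_bits = sliding_window_d_norm(window_norms, theta=4, m=2)
```

The first two window positions fail the threshold and cost only one word each, so fewer words are discarded than with fixed, non-overlapping blocks.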

In a yet further alternative, a fifth example of a transformation that will be referred to herein as Sorted D-Norm where the l1-Norm difference between the selected (h, l) pairs is always θ (Method-A for short) is illustrated schematically in Figure 9. This is an example of a method that maximises the extraction efficiency (the number of F bits extracted from a given memory size is maximised). In this method, we only pick up pairs of l1-Norm with a difference |h - l | exactly equal to θ. Sort D-Norm with fixed θ may be summarized as follows.

1) Calculate the l1-Norm from the raw bits.

2) Sort the l1-Norm in increasing order.

3) Pick the smallest available l1-Norm from the sorted list as l.

4) Search for the first l1-Norm exactly θ larger than l, and denote this l1-Norm as h.

5) Mark h and l as used, and calculate F using the original memory addresses of h and l.

6) Repeat 3) -5) until reaching a key bit requirement or until no remaining h, l combination satisfies the selection condition θ.
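The six steps of Method-A can be sketched as below. The sketch assumes that when the smallest available norm has no partner exactly θ larger, that word is simply skipped and the next smallest is tried; this detail, the name `method_a` and the sample norms are assumptions for illustration.

```python
from typing import List, Optional, Sequence


def method_a(norms: Sequence[int], theta: int, max_bits: Optional[int] = None) -> List[int]:
    """Pair words whose l1-norms differ by exactly theta; one bit per pair."""
    # Memory addresses ordered by increasing l1-norm (stable for equal norms).
    order = sorted(range(len(norms)), key=lambda addr: norms[addr])
    used = set()
    bits: List[int] = []
    for pos, addr_l in enumerate(order):
        if max_bits is not None and len(bits) >= max_bits:
            break  # key bit requirement reached
        if addr_l in used:
            continue
        target = norms[addr_l] + theta
        for addr_h in order[pos + 1:]:
            if addr_h in used:
                continue
            if norms[addr_h] > target:
                break  # sorted list: no exact match can follow
            if norms[addr_h] == target:
                used.update((addr_l, addr_h))
                # F comes from the original addresses of h and l.
                bits.append(1 if addr_h < addr_l else 0)
                break
    return bits


# Hypothetical l1-norms indexed by memory address.
sorted_example = [1, 4, 7, 4, 1, 9]
method_a_bits = method_a(sorted_example, theta=3)
```

In this example the words with norms 7 and 9 find no partner exactly 3 away and remain unused.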

In a yet further alternative, a sixth example of a transformation will be referred to herein as Min-Max Sorted D-Norm, where the l1-Norm difference between the selected (h, l) pairs can be any value above a minimum θ (Method-B for short). This alternative method example maximizes the number of highly reliable F bits from a given memory size and for a given minimum θ and, compared to Method-A, minimises the key failure rate (given in Equation 19) of the generated key. In Sort D-Norm with auto adjusting θ, an initial value of θ is selected and the words (or other groupings of raw bits) in the memory are sorted according to their l1-Norm values. Next, a first word (or other grouping of raw bits) with small l1-Norm value is selected, and a second word (or other grouping of raw bits) with l1-Norm value that differs by at least θ relative to the first word is searched for. The first word and the second word are marked as used such that they are not selected again. This process is repeated until no remaining words satisfy the selection condition. If the initial value of θ cannot produce enough fingerprint bits for the desired application, θ is reduced, and the entire process is repeated, discarding all selections from the previous higher θ.
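One possible reading of the auto-adjusting procedure is sketched below. It follows the description literally (smallest available word, then the first word in sorted order whose norm is at least θ larger) and lowers θ by one per retry; the step size, the lower bound `theta_min`, the function names and the sample norms are all assumptions. Each bit is returned together with its h − l value, which could serve as a per-bit reliability indicator.

```python
from typing import List, Optional, Sequence, Tuple


def pair_at_least(norms: Sequence[int], theta: int) -> List[Tuple[int, int]]:
    """Return (bit, h - l) for greedily chosen pairs with h - l >= theta."""
    order = sorted(range(len(norms)), key=lambda addr: norms[addr])
    used = set()
    pairs: List[Tuple[int, int]] = []
    for pos, addr_l in enumerate(order):
        if addr_l in used:
            continue
        for addr_h in order[pos + 1:]:
            diff = norms[addr_h] - norms[addr_l]
            if addr_h not in used and diff >= theta:
                used.update((addr_l, addr_h))
                pairs.append((1 if addr_h < addr_l else 0, diff))
                break
    return pairs


def method_b(norms: Sequence[int], theta_start: int, bits_needed: int,
             theta_min: int = 1) -> Optional[Tuple[int, List[Tuple[int, int]]]]:
    """Lower theta until enough bits are found; earlier selections are discarded."""
    for theta in range(theta_start, theta_min - 1, -1):
        pairs = pair_at_least(norms, theta)
        if len(pairs) >= bits_needed:
            return theta, pairs[:bits_needed]
    return None


# Hypothetical l1-norms indexed by memory address.
auto_norms = [5, 2, 6, 1]
auto_result = method_b(auto_norms, theta_start=6, bits_needed=2)
```

For these inputs no pair exists at θ = 6, only one at θ = 5, and two at θ = 4, so the search stops at θ = 4.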

In some embodiments, which may be suitable for device fingerprinting for highly resource-constrained devices such as IoT devices, e.g. field devices used in Industrial Control Systems such as Supervisory Control and Data Acquisition (SCADA) systems or Device-to-Cloud Systems (DCS), a transformation method may be applied that has a lower noise tolerance level, to thereby increase the bit extraction rate of the transformation. This may compensate for the fact that such devices typically have less memory, and thus lower capacity for generation of raw bits that can be used for key generation. Additionally, in such embodiments, a reliability of each generated (transformed) bit F may be determined when generating the device fingerprint F. The device fingerprint F and the reliability of the bits of the device fingerprint F may be collected in a secure environment by an external entity/device such as the Verifier 1000 and securely stored by the Verifier 1000 (in Figure 10) as part of an enrolment process.

The device may regenerate the fingerprint using any of the methods described above. Due to the looser threshold (lower noise tolerance threshold), and thus the likelihood of bit errors when regenerating the fingerprint, the regenerated fingerprint F' will typically differ from the stored reference fingerprint F.

A process for recovering the key F generated by the device is shown in Figure 23. The Device (Prover 1030) may send a hash of the fingerprint, tag' = CMAC_F'(nonce), to the Server (Verifier 1000). The hash may be generated using a nonce that is generated by the Verifier 1000 and sent to the Prover 1030. Because the Verifier 1000 has previously stored the bit-specific noise tolerance of each fingerprint bit F of the fingerprint F at enrolment, it is able to use this information to attempt to recover the hash tag'. In particular, the Verifier 1000 ranks each bit F according to its bit-specific noise-tolerance degree. For the bits with smaller reliability_i within F that have the lowest degree of noise tolerance, the Verifier 1000 exhaustively tries possible values (0 or 1) for those bits without changing the enrolled values for the remaining bits with relatively higher reliability. For each possible combination of F (denoted "F_tre" in Figure 23), the Server 1000 computes tag = CMAC_F_tre(nonce) and compares this with the received tag'. If any trial matches, the key generation succeeds, thus recovering the tag' sent from the device side whilst also achieving authentication of the Device (Prover 1030) in the challenge-response mechanism embedded in the process; hence the Server (Verifier 1000) and the Device (Prover 1030) are able to establish a shared secret key between the two parties. If no match occurs after a certain number of exhaustive trials, the key regeneration and authentication is considered to have failed.
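The server-side trial loop can be sketched as follows. Note the substitutions: HMAC-SHA256 from the Python standard library stands in for the CMAC named above (the standard library has no CMAC), the key is formed naively as one byte per fingerprint bit, and the parameter `k` (how many of the least reliable bits are tried) is hypothetical. This is an illustration of the idea, not the protocol of Figure 23.

```python
import hashlib
import hmac
from itertools import product
from typing import List, Optional, Sequence


def tag_for(bits: Sequence[int], nonce: bytes) -> bytes:
    """Keyed tag over the nonce; HMAC-SHA256 is a stand-in for CMAC."""
    return hmac.new(bytes(bits), nonce, hashlib.sha256).digest()


def recover_key(enrolled: Sequence[int], reliability: Sequence[int], nonce: bytes,
                received_tag: bytes, k: int) -> Optional[List[int]]:
    """Try all values of the k least reliable bits; keep the other bits as enrolled."""
    weakest = sorted(range(len(enrolled)), key=lambda i: reliability[i])[:k]
    for trial in product((0, 1), repeat=len(weakest)):
        candidate = list(enrolled)
        for position, value in zip(weakest, trial):
            candidate[position] = value
        if hmac.compare_digest(tag_for(candidate, nonce), received_tag):
            return candidate  # matches the fingerprint regenerated on the device
    return None  # regeneration and authentication fail


# Hypothetical enrolment record held by the server.
enrolled_bits = [1, 0, 1, 1, 0, 0, 1, 0]
bit_reliability = [9, 2, 8, 1, 7, 9, 3, 8]
# Device side: the regenerated fingerprint has an error in bit 3 (least reliable).
device_bits = [1, 0, 1, 0, 0, 0, 1, 0]
nonce = b"server-nonce"
device_tag = tag_for(device_bits, nonce)
recovered = recover_key(enrolled_bits, bit_reliability, nonce, device_tag, k=3)
```

The device only computes one tag; the 2^k trial computations are carried by the server.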

By storing the bit-specific noise tolerance degree and using this to distinguish reliable bits from unreliable bits for recovering the CMAC tag, it is possible to generate one or more reliable secure keys without using error correction codes at the device side, by exploiting the rich computational resources owned by the server. Therefore, the computational overhead imposed on the server to establish the reliable secure key is negligible, and the scheme remains secure and lightweight for the device.

The server (Verifier 1000) may attempt to recover the hash tag' by trial and error in several different ways.

In one example, referred to herein as Global TRE, it is assumed that each F bit in the key string has an equal probability of being erroneous. Global TRE will be able to correct up to x error bits by exhaustively trying the power set of binary substrings in the key string. For example, given a set S = {a, b, c}, the power set of S is expressed as P(S) = {Φ, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}, where Φ denotes an empty set. One implementation example of the Global TRE is given in Appendix I, and an evaluation of run time is in Figure 39 (based on using Method-A) and Figure 40 (based on using Method-B).
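As a rough illustration of the Global TRE search space (not the implementation of Appendix I), the following Python sketch enumerates every candidate key that differs from the enrolled key in at most x positions, which is one way to read "correct up to x error bits". The function names are hypothetical.

```python
from itertools import combinations
from math import comb


def global_tre_candidates(enrolled_bits, max_errors):
    """Yield every key differing from enrolled_bits in at most max_errors positions."""
    n = len(enrolled_bits)
    for weight in range(max_errors + 1):  # fewest flips first
        for positions in combinations(range(n), weight):
            candidate = list(enrolled_bits)
            for pos in positions:
                candidate[pos] ^= 1  # flip the suspected error bit
            yield candidate


def trial_count(n, max_errors):
    """Number of candidates the enumeration above produces for an n-bit key."""
    return sum(comb(n, w) for w in range(max_errors + 1))
```

For a 128-bit key and x = 2, trial_count(128, 2) gives 1 + 128 + 8128 = 8257 candidates, each of which costs one MAC computation on the server.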

Another example is referred to herein as Regional TRE. In Global TRE, the reliability of all F bits is assumed to be equal. However, as observed in non-fixed θ methods, such as Sort D-Norm with self-adjusting θ, the reliability of each bit F varies as a function of the D-Norm value. In other words, the D-Norm value can be treated as a direct measure of the noise tolerance degree or reliability of the F. By recognizing this fact, a Regional TRE method can adaptively and non-uniformly distribute x across different sub-strings within the F string (the key string). The Regional TRE can thus allocate the majority of computing power to correcting the x bits with lower noise-tolerance degree or reliability in the entire key string.

If there are s different D-Norm values, the key string can be split into s sub-strings, where the D-Norm values are equal within each sub-string. In a sub-string with length I, the probability of no more than x error bits if every bit has an error rate of BER_F is expressed as: p = binocdf(x, I, BER_F)

Let binoinv() be the inverse function of binocdf(). We can express x as: x = binoinv(p, I, BER_F), where x is the upper bounded number of expected error bits in a sub-string of length I with confidence p. In other words, performing TRE for x bits in this sub-string can recover all error bits.

If there are s sub-strings in total, assuming each has an equally successful regeneration confidence of p, the expected successful regeneration confidence of the entire key string is P_key = p^s. Therefore, the expected p can be estimated by p = P_key^(1/s).

Based on this, we can consequentially work out the number of expected error bits x for each sub-string.
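The per-sub-string budget can be worked through numerically. The Python sketch below implements binocdf and binoinv directly from the definition of the binomial distribution (the names mirror those used in the text) and derives x for each sub-string from a whole-key confidence. The helper per_substring_budget and its argument names are hypothetical, and this is not the implementation of Appendix J.

```python
from math import comb


def binocdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


def binoinv(conf, n, p):
    """Smallest x such that binocdf(x, n, p) >= conf."""
    for x in range(n + 1):
        if binocdf(x, n, p) >= conf:
            return x
    return n  # only reached through floating-point rounding


def per_substring_budget(lengths, bers, p_key):
    """Error-bit budget x for each sub-string, given a whole-key confidence p_key."""
    s = len(lengths)
    p = p_key ** (1.0 / s)  # from P_key = p^s
    return [binoinv(p, length, ber) for length, ber in zip(lengths, bers)]
```

With two 32-bit sub-strings, bit error rates of 0.001 and 0.05 and a whole-key confidence of 0.99, the more reliable sub-string receives a budget of one bit and the less reliable one a larger budget, which is the non-uniform allocation the Regional TRE relies on.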

Within each sub-string, the same power set as that described in the Global TRE is applied. Between sub-strings we also use a power set to enumerate all possible combinations, but the elements are sub-strings instead of F bits. One implementation example of the Regional TRE is given in Appendix J.

Bit Reliability and Extraction Efficiency

The usability of fingerprints for security requires: i) a formal treatment to understand the degree of noise tolerance afforded or the reliability of a transformed fingerprint bit F; ii) an analysis of the expected number of fingerprint bits of sufficient reliability that can be extracted from a given on-chip memory capacity.

Reliability. The reliability of a fingerprint bit vector F is quantified by bit error rate (BER), expressed as:

BER_F = FHD(F, F') (9)

where F and F' are two distinct and random fingerprint evaluations from the same physical memory. The function FHD() is the fractional Hamming distance between the (binary) vectors F and F'. Commonly, F is a reference fingerprint template measured at t = 0 and F' is the re-evaluation under a potentially different device operating condition such as temperature, therefore being subject to thermal noise. A lower BER_F would indicate a higher tolerance to noise in the raw fingerprint space. We formally derive an upper bound for the BER_F.
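Equation (9) can be evaluated directly from two fingerprint read-outs. A minimal Python sketch follows; the function name fhd and the list-of-bits representation are assumptions made for illustration.

```python
def fhd(f_ref, f_re):
    """Fractional Hamming distance between two equal-length bit vectors."""
    if len(f_ref) != len(f_re):
        raise ValueError("fingerprints must have the same length")
    differing = sum(a != b for a, b in zip(f_ref, f_re))
    return differing / len(f_ref)


# BER_F for one re-evaluation: one of four bits differs from the reference.
ber_f = fhd([0, 1, 1, 0], [0, 1, 0, 0])  # 0.25
```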

S-Norm. The BER_F of F through the S-Norm is formulated in Equation (10), which is a function of both the selection criterion θ and BER_f, the BER of raw bits. This provides a worst-case (upper-bound) assessment of BER_F.

The binopdf and binocdf are the probability mass function and cumulative distribution function of the binomial distribution, respectively.

D-Norm. For D-Norm, a block of m groups, each group having n raw fingerprint bits, is transformed into a 1-bit F. The BER_F of D-Norm, which is a function of the tolerance bound θ and n, is expressed in Equation (11). Again, this provides an upper-bound estimation.

Notably, the BER_F is independent of the number of groups m within the block.

Method-A: Sorted D-Norm (which maximises extraction efficiency). Method-A uses the same selection condition as D-Norm, which is d ≥ θ, and so the upper bound reliability of BER_F can also be estimated with Equation (11).

Method-B: Min-Max Sorted D-Norm (which maximises key reliability). In Method-B, we select all pairs above a minimum θ. Here we consider three representative cases for F bit reliability: i) the best BER; ii) the worst BER; and iii) the average BER.

For the best case, we assume all the requested k bits can be extracted with the maximum possible distance, i.e. d = |h - l| = n. The average bit error rate of the best case of Method-B is given in Equation (12). This equation is derived by substituting θ with n in Equation (11).

(12)

On the other hand, the worst case is that all selected h, l pairs fall exactly on the minimal selection condition, i.e. d = |h - l| = θ. Because the problem now reduces to that of Method-A, the worst-case bit error rate of Method-B can be estimated with Equation (11) from Method-A.

The best case and the worst case are two extreme situations; the actual case will lie between the two. When requesting a fixed-length key, the smaller the memory size is, the closer the BER is to the worst case, and vice versa.

We can use Equation (13) to predict the available number of each l1-Norm with a given memory size. As illustrated in Equation (14) the number of each l1-Norm P( ||f ||₁ = q) can be calculated from the memory size (in bytes) and the group size n.

( )

For example, suppose we want to know, for a given memory of size 16 KiB, the expected number of groups with l1-Norm less than or equal to 3. Plugging j = 3 into Equation (14), we expect there to be up to 87 groups with l1-Norm less than or equal to 3.
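The figure of 87 groups can be checked with a short Python sketch. It assumes, as elsewhere in this section, that each raw bit is '1' with probability 0.5, so that the l1-Norm of an n-bit group follows a binomial distribution. The group size for the example is not stated in the text; n = 16 is an assumption chosen because it reproduces the quoted figure. The function name is hypothetical.

```python
from math import comb


def expected_groups(memory_bytes, n, j):
    """Expected number of n-bit groups whose l1-Norm is <= j.

    Assumes independent raw bits that are '1' with probability 0.5.
    """
    groups = memory_bytes * 8 // n  # number of n-bit groups in the memory
    prob = sum(comb(n, q) for q in range(j + 1)) / 2**n  # P(l1-Norm <= j)
    return groups * prob


# 16 KiB of memory, assumed group size n = 16, l1-Norm <= 3.
count = expected_groups(16 * 1024, 16, 3)  # 87.125
```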

Based on this information, we can predict an expected bit error rate for the Method-B for a given is a function of d

Extraction Efficiency

The second metric is the extraction efficiency η. When the tolerated noise (or equivalently BER_F) is bounded, η indicates, out of the total number of raw bits f (ultimately, the memory size), the number of bits F that are invariant to the bounded noise (set by the practitioner's preset selection criterion θ). This part presents the final formulation with a concise formulating strategy for each metric, while the formulating steps are deferred to Sections D and E of the Appendix.

Strictly, the extraction efficiency is the ratio between the number of extracted transformed bits (F) and the number of raw bits (f) given a selection bound (θ). This work quantifies and further normalizes it as the number of extracted reliable F out of 1 KiB of raw bits, i.e. a 1 KiB (1024 bytes) memory size.

S-Norm. The extraction efficiency of S-Norm is expressed as below; details are deferred to Appendix C:

The first term covers the case in which the l1-Norm of the n-bit f is larger than the upper selection bound, assuming that the probability of each bit being '1'/'0' is 0.5, while the binocdf term covers the other case, in which the l1-Norm of the n-bit f is less than or equal to the lower selection bound. Both cases meet the selection criterion. Clearly, the overall extraction efficiency should be the sum of the above two cases divided by n; recall that n raw bits transform into one new bit. The 1024x8 term normalizes the extraction efficiency to be expressed in bit/KiB, i.e. the number of selected reliable bits F out of a 1 KiB memory.

D-Norm. For the D-Norm, one new bit is transformed from a block of m groups: each group with n raw bits presents one l1-Norm. In a block as illustrated in Figure 19, the probability that the block meets the selection criterion (|h - l| ≥ θ) is referred to here as the block selection probability.

The normalized extraction efficiency, that is, the number of noise-tolerant fingerprint bits selected from a 1 KiB memory under the D-Norm, is then given by:

The 1/(n x m) term indicates that n x m raw bits produce one new bit F, while the 1024x8 term indicates a 1 KiB memory. The direct derivation of the block selection probability is non-trivial. We instead solve it through a different but equivalent problem. The details are deferred to Appendix Section E. The fitness of the formalized η_DNorm is validated through running extensive simulation tests (defined in the next Section), as detailed in Figure 17.

Method-A: Sorted D-Norm (which maximises extraction efficiency). Let b(q) = binopdf(q, n, 0.5).

We provide an analytic equation (see Equation (17)) to predict the extraction efficiency of Method-A. For a detailed derivation please refer to Appendix H.

Method-B: Min-Max Sorted D-Norm (which maximises key reliability). The extraction efficiency of Method-B is straightforward to analyse, as there are no overlapping areas, and all selected l1-Norms are sparsely and symmetrically located at the two sides of the histogram.

Experimental Validation

Some numerical and physical measurement experiments conducted in relation to methods making use of the first and second example transformation methods will now be described. We take 29 commodity chips embedded with SRAM memories from 4 different manufacturers to evaluate embodiments of the present disclosure comprehensively. The size of the SRAM chips ranges from 64 KiB to 512 KiB. We have also evaluated Flash and EEPROM memories to corroborate the generalization of the present disclosure. These three types of memories are pervasive, especially in the most interesting low-end IoT devices. COTS IoT devices with the evaluated memories are exemplified in Figure 20.

We consider three test settings:

• Prediction: numerical calculations based on formalized equations in Section IV are presented.

• Simulation: based on an ideal chip model (see the definition in Appendix Section A) with the same size as the dataset measured from the actual physical chip. The ideal chip model assumes that each bit flips across repeated measurements with a fixed probability equal to the worst-case BER measured from the physical chip; see Table I.

• Measurement (Chip): measurements from physical chips are used to validate the prediction.

For evaluation of the BER_F of transformed bits, we expect the results from simulation and measurement to be lower (better) than the prediction, because the formalized equations provide a conservative, upper-bound estimation. For the extraction efficiency η, we expect the results from the measurement to be higher (better) than those of the prediction and the simulation, both of which assign each bit a 50% probability of being '1'/'0' independently. This is because of group-level bias in the raw bits: bits from physically proximate areas may have a slightly similar tendency to be '1'/'0'. More specifically, the assumption of ideal raw-bit independence is, in fact, conservative for assessing the extraction efficiency, since local spatial bias among the grouped raw bits can exist on a physical device, which is beneficial for extracting a higher number of noise-tolerant transformed bits.

Memory Datasets

Specifications of four SRAM memory datasets, one Flash memory dataset, and one EEPROM dataset are summarized in Table I; each dataset is from a different manufacturer. Overall, the total number of raw SRAM bits evaluated is 69,206,016, while the evaluated numbers of Flash bits and EEPROM bits are 3,902,972 and 32,786, respectively. Furthermore, they are repeatedly measured multiple times under each operating condition. When we evaluate the unreliability of transformed bits, BER_F, the average over the repeated re-evaluations is reported.

However, we would like to emphasize that repeated measurements are solely for evaluation purposes. Reliable fingerprint provision according to embodiments of the present disclosure only requires a single measurement, which we follow in later implementation.

Table I: Memory Datasets.

In Table I, the abbreviations map to the following manufacturer/model:

• NORDIC - Nordic Semiconductor nRF52832

• ISSI - ISSI IS61WV25616BL

• IDT - IDT IDT71V416S

• CY - Cypress CY62146EV30

• Flash - Winbond W29N02GV

• EEPROM - Microchip 24LC256

For Flash, the public dataset used tested only 69,696 bytes, whereas the total memory is 256 MiB. For EEPROM, only the first 2 KiB was evaluated out of 32 KiB total memory.

We use our NORDIC dataset to extensively validate the reliability (BER_F) and extraction efficiency (η) metrics of both the S-Norm and D-Norm, because it was collected with the broadest operating range and the highest number of repeated measurements. The remaining datasets then corroborate the generality. The collected NORDIC dataset is briefly described below, while details of the remaining public datasets are available in references [16], [28], [29]. Our method for collecting EEPROM data is similar to that for the Flash [15].

NORDIC. This dataset is collected from 12 nRF52832 chips. The nRF52832 is a popular RF-enabled MCU that supports various protocols, including Bluetooth 5, Bluetooth mesh, ANT, and NFC. The chip has a 64 KiB SRAM memory. NORDIC was chosen because it is a typical low-cost IoT device. Three temperature corners (-15 °C, 25 °C and 80 °C) are evaluated to measure the reliability of raw bits: 100 repeated measurements are taken under each operating corner. The worst-case BER_f of 6.09% occurs at 80 °C when the referenced enrolment is at 25 °C.

S-Norm Validation on NORDIC Chips

We concatenate all 12 chips to gain a large sample population: the concatenated size is 12 x 64 KiB = 768 KiB. Results of S-Norm are detailed in Figure 11 under various n and θ settings.

Reliability. We can confirm that the formalization of Equation (10) provides a conservative estimation of the selected new bits F, which is validated as the chip measurement is always smaller than the prediction. Notably, as indicated by the arrow 1, we find no error for the chip test and simulation that are both empirical tests with a finite sample population.

Extraction Efficiency. The simulation and prediction (Equation (15)) are in good agreement, whereas the extraction efficiency of the chip measurement is higher. As we expected earlier, the reason is the local spatial bias among the n physical memory raw bits (reference [22]): prediction and simulation were based on the ideal memory fingerprint model, which assumes ideal independence across all individual raw bits, whereas the chip test has more like-valued '1'/'0' raw bits in nearby locations. In other words, the local spatial bias has improved the extraction efficiency η.

D-Norm Validation on NORDIC Chips

The validation results of D-Norm are shown in Figure 12.

Reliability. As expected, the reliability increases substantially as the noise-tolerance bound θ becomes larger. Again, we can confirm that the formalized BER_F is a conservative estimation because it is always shown to be higher than both the simulation and chip measurement.

Extraction Efficiency. As expected, the simulation and prediction (Equation (16)) agree well, whereas the extraction efficiency of the chip measurement is usually higher. The higher chip measurement is again attributable to the fact that any local spatial bias among raw bits in the real chip helps more reliable new bits F to meet the selection criterion, whereas in the prediction formulation each raw bit is assumed to be independent under an ideal memory fingerprint model.

Generalizability

The three remaining public SRAM datasets (ISSI, CY, and IDT), one Flash dataset, and one EEPROM dataset are further utilized for generality validation. Results of S-Norm and D-Norm validated on those datasets are detailed in Figure 13 and Figure 14, respectively. They follow the same observations as the extensively analyzed NORDIC dataset above. Based on comprehensive experimental validations on SRAM memories from four different manufacturers, Flash memories and EEPROM memories, we can now conclude that our formalization of the unreliability BER_F and extraction efficiency η above matches the real chip measurement well. Most importantly, the prediction serves as an upper bound in practice.

CRYPTOGRAPHIC KEYS FOR SECURITY FUNCTIONS

We demonstrate the expressive power of our formalization by investigating the derivation of root keys from commodity memory chips. The dynamic and direct generation of cryptographic keys from memory fingerprint transformations into noise-tolerant bits is a basis for building security functions because: i) memory biometrics are a true source of randomness; and ii) it removes the need for a protected non-volatile memory, since a key can be generated on demand and "forgotten" after use. Here, we compare the formal model analysis with fingerprint measurements from chips under test; we begin our systematic investigation with the following question.

S-Norm and D-Norm

What is the reliability of a k-bit noise-tolerant fingerprint? Transformed bits F can be directly utilized as a cryptographic key because they are invariant to a desirably high number of noise-induced bit error patterns; these bits exhibit a high noise tolerance. The overall failure rate of a k-bit noise-tolerant key can be expressed as:

1 - (1 - BER_F)^k (19)

Recall that the formalized BER_F discussed above is a conservative formulation. Therefore, the key failure rate in Equation (19) will yield a conservative evaluation. We expect the key failure rate in practice to be lower than our estimation here; this hypothesis is validated with a random chip test in Figure 15. Considering a practitioner's desire for a key failure rate below 10^-6 for typical industrial applications [30], we investigate the following question.

Unlike the other methods, for the average case of Method-B the k-bit reliability is no longer the 1 - (1 - BER_F)^k given in Equation (19); it is instead given in Equation (20) below.

The probability that a specific group has an l1-Norm value equal to q is given analytically by the function P( ||f ||₁ = q)

The probability provided in the Equation above can predict the available number of each l1-Norm with a given memory size.

As illustrated in the equation below, the number of each l1-Norm can be calculated as a function of the memory size and the group size n.

Thus:

What is the most efficient transformation method presenting the highest extraction efficiency while maintaining the desired capability to utilize F directly as a cryptographic key with a failure rate lower than 10^-6 ?

We employ NORDIC SRAM fingerprints— the largest memory biometric dataset under test with 69,206,016 bits— to perform extensive evaluations to address the question. The evaluation process is described below:

1) Evaluate the minimum θ for the required BER_F (Equation (10) for S-Norm and (11) for D-Norm) to achieve the target key failure rate (Equation (16)) for a selected n in S-Norm or n and m parameters in D-Norm. Specifically, θ is searched by gradually increasing it by 1 from zero.

2) Use the parameters n, m and minimum θ to extract noise-tolerant bits F from the physical chips. The number of extracted F bits is counted and normalized to report the extraction efficiency η in F bits extracted per KiB.

3) Uniformity reports the proportion of '1' bits in the F bits obtained from the physical chips.
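The first step of the process above can be sketched in Python. The sketch uses the key failure rate expression 1 - (1 - BER_F)^k quoted in this section, computed with log1p/expm1 for numerical stability at very small rates. The BER model passed to the search is a placeholder: toy_model is not Equation (10) or (11), and all function names are hypothetical.

```python
import math


def key_failure_rate(ber_f, k):
    """Failure rate of a k-bit key, 1 - (1 - BER_F)^k, computed stably."""
    return -math.expm1(k * math.log1p(-ber_f))


def max_bit_error_rate(target_failure, k):
    """Largest per-bit BER_F for which a k-bit key meets target_failure."""
    return -math.expm1(math.log1p(-target_failure) / k)


def minimum_theta(ber_model, n, k, target_failure):
    """Increase theta by 1 from zero until the key failure target is met."""
    for theta in range(0, n + 1):
        if key_failure_rate(ber_model(theta), k) <= target_failure:
            return theta
    return None  # target not reachable for this n


def toy_model(theta):
    # Placeholder only: NOT Equation (10) or (11) from the text.
    return 0.05 ** (theta + 1)
```

For a 128-bit key and a failure target of 10^-6, max_bit_error_rate(1e-6, 128) is roughly 7.8 x 10^-9, which shows how small the per-bit error rate of the transformed bits has to be before they can be used directly as a key.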

From the results summarized in Table II, we can conclude that the D-Norm can provide significantly higher extraction efficiencies under a key failure rate constraint. Therefore, in the following discussions, we focus on the D-Norm transform.

Table II: Extraction efficiency η (bit/KiB) of F when the key failure rate target is just met for a 128-bit key. NORDIC SRAM dataset is used.

In Table II, the bold entries represent the best results among the tested parameter settings. The m setting for S-Norm is not applicable. The extraction efficiency η is measured from the physical chip and predicted based on Equation (15) for S-Norm and Equation (16) for D-Norm. For entries indicated as *, it is difficult to meet the key failure rate target under that specific setting. For uniformity, since the number of selected F becomes too small, it has large variance and is thus invalid.

Given different: i) sizes of memories embedded within various COTS electronics; and ii) BER_f characteristics of noisy fingerprints from different memory technologies, we investigate the following question next.

What is the lowest failure rate achievable for a 128-bit key from each memory technology and manufacturer?

This scenario resembles a practical application setting where the available memory is fixed (determined by the computing platform or micro-controller unit) and the inherent (worst-case) BER_f of raw fingerprints is known (from published measurement studies on memory technologies). Thus, to assess the practicality of deriving a robust key using the presently disclosed method, we test our suite of memory technologies using the following approach:

1) For each manufacturer listed in Table III, we conduct a parameter search using possible combinations of parameter values where n, m ∈ {8, 16, 32, 64, 128} and θ ∈ [1, n]. This process identifies the (n, m, θ) combination exhibiting the lowest key failure rate while still providing an F with at least 128 bits.

2) We employ the model in Equation (11) to obtain the BER_F of the extracted F bits using the identified m, n, θ and known BER_f.

3) We use BER_F in Equation (19) to determine the best key failure rate for obtaining a transformed F with at least 128 bits.

Results are summarized in Table III. Given a key length requirement of 128 bits, the best key failure rate can be extremely low for SRAM memory technologies with a worst-case BER_f ranging from 5.42% to 6.88%, when a key is extracted directly from the transformed space with highly noise-tolerant F bits by virtue of the transformation.

Table III: Lowest key failure rate achievable for obtaining at least a 128-bit key for each investigated memory chip, using the D-Norm method.

In practice, it is desirable to have prior determinations of which memory chips can adopt methods according to the present disclosure to directly provide a root key. So we are further concerned with the following question.

What is the minimum memory size (MMR) required to obtain a 128-bit key with a key failure rate below 10^-6, conditional on the inherent BER_f of raw fingerprints for each memory technology?

To answer this, we test all types of chips by the below steps:

1) First, perform an exhaustive search for the largest possible number of F bits while keeping the key failure rate below 10^-6.

2) Second, measure the extraction efficiency η by dividing the number of F bits yielded by the total memory size.

3) Finally, the MMR can be calculated by dividing the desired key length, e.g. 128 bits, by η.
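Steps 2) and 3) above amount to two divisions, sketched below in Python with the extraction efficiency expressed in bit/KiB as elsewhere in this section. The function names and the example counts are illustrative assumptions, not measured values.

```python
def extraction_efficiency(num_f_bits, memory_bytes):
    """Extraction efficiency in F bits per KiB of raw memory."""
    return num_f_bits / (memory_bytes / 1024)


def minimum_memory_kib(key_length_bits, eta_bits_per_kib):
    """Minimum memory size (MMR) in KiB needed to yield key_length_bits."""
    return key_length_bits / eta_bits_per_kib


# Illustrative numbers only: 4096 F bits extracted from a 64 KiB memory.
eta = extraction_efficiency(4096, 64 * 1024)  # 64.0 bit/KiB
mmr = minimum_memory_kib(128, eta)            # 2.0 KiB
```

An efficiency of 64 bit/KiB therefore corresponds to extracting a 128-bit key from a 2 KiB memory.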

Results are summarized in Table IV. This is a practical guideline to quickly determine whether functionality according to embodiments of the present disclosure can be deployed immediately given the memory technology and manufacturer. Whenever the memory size is larger than the MMR, which tends to be increasingly likely given the ever-increasing memory volume even in low-end IoT devices, methods according to the present disclosure provide a practical option for providing a usable root key for commodity computing devices.

Table IV: Minimal memory size requirement (MMR) for at least a 128-bit key subject to a key failure rate below 10^-6, conditioned on the inherent BER_f of raw memory technology. The D-Norm method was applied.

In Table IV, under the listed (n, m, θ) settings, the number of extracted F may be higher than 128 bits. The MMR is scaled down to truncate to the first 128 bits. Values for MMR are not reported for Flash or EEPROM because for the tested size (69 KiB for Flash Winbond and 2 KiB for Microchip) it is hard to provide a 128-bit key subject to the key failure rate constraint. Predicted values are obtained using Equation (16).

Method-A and Method-B

What is the reliability of a k-bit noise-tolerant fingerprint from Method-A and Method-B?

To evaluate the validity of the equations derived in the Reliability section with an ample sample size, we first leverage the ideal chip model with the same worst-case BER_f = 5.30% observed when the MSP430 is repeatedly measured within the range of 0 °C ~ 40 °C. The simulated memory size of the ideal chip model can be infinite, which is the main reason for using the chip model for simulation. In other words, it will automatically generate more random bits until the requested number k of F bits satisfying the preset key reliability has been extracted. The key reliability is plotted in Figure 30, where the simulations for both Method-A (Simulation-A) and Method-B (Simulation-B) are plotted. Here the Baseline method refers to the first D-Norm method. Both methods follow the trend of our analytic prediction, and Method-B shows slightly better performance in terms of a lower key failure rate.

Notably, the Baseline, the prediction of Method-A and the worst-case prediction of Method-B (Prediction-B w.c.) are calculated from the same equation, so they coincide and are plotted as the Baseline. In addition to the worst-case condition, we also have a best-case (Prediction-B b.c.) and an average-case (Prediction-B ave.) prediction for Method-B. The best case suggests how much better a result can be expected if there is unlimited memory space. Moreover, the average case takes the actual memory size into consideration and provides an expected estimation.

What is the extraction efficiency of Method-A and Method-B?

As shown in Figure 31, the Baseline method can never achieve a 64 bit/KiB extraction efficiency (equivalent to extracting a 128-bit device fingerprint from a 2 KiB memory) regardless of which θ is used. One of the most critical design targets of embodiments of the present disclosure is to optimize the extraction efficiency. The bar chart indicates that both Method-A and Method-B show improved extraction efficiency compared to the Baseline method. In addition, our derived analytic equation accurately predicts the extraction efficiency of the two newly proposed key extraction methods.

In summary, both Method-A and Method-B can achieve an extraction efficiency above 64 bit/KiB. Method-A performs slightly better, with a higher extraction efficiency across all values of the selection threshold θ.

What is the uniformity (bias) and bit-aliasing artefacts of key bits extracted from Method-A and Method-B?

Herein, we measure, using Measurement (Chip) outlined previously, and compare the uniformity/bias of key bits F extracted by three methods (the Baseline, Method-A and Method-B) over two physical chip datasets: MSP430 and nRF52. We first examine the bias of raw bits, as shown in Figure 32. It can be seen that both chips have a raw bias close to the ideal 0.5.

For the Baseline method, MSP430 exhibits an ideal mean bias of 0.50, with a standard deviation of 0.06. However, the Baseline method has a notable bias when using nRF52 chips. The mean bias is 0.61, with a standard deviation of 0.19; the Baseline method cannot always retain a good bias after transformation from the raw to the noise-tolerant space. It appears to be sensitive to chip characteristics, e.g., spatial correlation.

With Method-A and Method-B, there is no noticeable bias encountered. For both chips, the mean bias is within 0.47 ~ 0.53, with a standard deviation below 0.05.

In summary, presently disclosed Method-A and Method-B outperform the Baseline method in relation to bias.

Uniqueness evaluation for Method-A and Method-B

Uniqueness describes the ability for a device to uniquely distinguish itself from a large population. In the context of key extraction, the better the uniqueness, the lower the chance of key collision (repetition). The uniqueness is measured via inter-class fractional Hamming distance (HD).

Ideally, the uniqueness should be 0.5.
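As an illustration of the metric, the Python sketch below computes uniqueness as the mean pairwise fractional Hamming distance between keys extracted from different chips. Averaging over all chip pairs is an assumption about how the inter-class distances are aggregated, and the function names are hypothetical.

```python
from itertools import combinations


def fhd(a, b):
    """Fractional Hamming distance between two equal-length bit vectors."""
    if len(a) != len(b):
        raise ValueError("keys must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def uniqueness(keys):
    """Mean inter-class fractional Hamming distance over all chip pairs."""
    pairs = list(combinations(keys, 2))
    return sum(fhd(a, b) for a, b in pairs) / len(pairs)


# Three toy 4-bit keys, each pair differing in exactly two positions.
u = uniqueness([[0, 0, 0, 0], [1, 1, 0, 0], [0, 1, 1, 0]])  # 0.5
```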

Again, we applied the three key extraction methods to the two physical chip datasets. The uniqueness evaluations are presented in Figure 33.

In summary, all three methods can retain a good uniqueness with a mean value close to 0.5. In addition, their performance is comparable.

Bit-aliasing evaluation for Method-A and Method-B

Bit aliasing is a common problem in many physical key extractions. In detail, if there are bit patterns with the same '0'/'1' across several keys generated from different hardware instances, the uniqueness would be degraded. We examine bit-aliasing by measuring the inter-class bit frequency. If the frequency is close to 1, all chips are likely to generate '1' at this specific bit address in the key string. Otherwise, if the frequency is close to 0, all devices prefer to output '0' at that bit. Ideally, a bit frequency close to 0.5 is preferred. In Table V below, we summarize some key numbers.

Table V: Comparing the bit-aliasing between proposed key extraction methods

In Table V above, we have applied all three methods to the two physical chips and obtained six sets of results. For each of them, we show Aliasing statistics and Key bit quality. The former describes the overall bit aliasing over the tested physical chip population. The latter indicates how many bits are affected by bit-aliasing and how many bits are less affected in the key extracted.

In the Aliasing statistics, we examine three values: the mean μ, the standard deviation δ and the worst case (w.c.). The mean indicates the overall '0'/'1' preference of all key bits, which is quite similar to the bit bias discussed in the bias evaluation. Ideally, μ closer to 0.5 is desired. The standard deviation describes how much the extent of bit aliasing varies from one key bit to another. The worst case highlights the bits farthest from the ideal 0.5.

In the Key bit quality, we examine the bit-aliasing of each bit in the key and classify the bits according to the following rules. If the bit-aliasing is within the 0.5 ± 0.1 margin, we regard this bit as 'less aliasing', which means it is less affected by bit aliasing. Next to the 'less aliasing' region, we have the 'intermediate' region with an outer border of 0.5 ± 0.3. The rest, which includes 0.8 to 1.0 and 0.2 to 0.0, are considered 'highly aliasing'. Ideally, more key bits in the 'less aliasing' region is better.
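The classification rules above can be sketched in Python as follows. The per-bit frequency is taken as the fraction of chips that output '1' at a given key position. How values exactly on a border are assigned is an assumption here: 0.4 and 0.6 are treated as 'less aliasing', and 0.2 and 0.8 as 'highly aliasing', following the ranges quoted in the text. The function names are hypothetical.

```python
def bit_aliasing(keys):
    """Per-position frequency of '1' across keys from different chips."""
    n_chips = len(keys)
    return [sum(column) / n_chips for column in zip(*keys)]


def classify(freq):
    """Map a bit frequency to the three regions described in the text."""
    dev = round(abs(freq - 0.5), 9)  # rounding avoids float noise at borders
    if dev <= 0.1:
        return "less aliasing"
    if dev < 0.3:
        return "intermediate"
    return "highly aliasing"


# Four toy chips with 3-bit keys; bit 0 is '1' on every chip.
freqs = bit_aliasing([[1, 0, 1], [1, 1, 0], [1, 0, 1], [1, 1, 0]])
labels = [classify(f) for f in freqs]
```

In this toy population the first key bit has a frequency of 1.0 and is classified as 'highly aliasing', which is the kind of bit a designer may consider excluding from key derivation.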

In summary, with regards to bit-aliasing, both Method-A and Method-B show improved results relative to the Baseline method, when tested with the nRF52 physical chip data set. However, with the MSP430 data set, the mean value of bit-aliasing slightly degenerates. One thing that requires special attention is that the Method-A records worst-case = 1.00, which means in all 25 chips, the corresponding bit always outputs value '1'. A designer may consider excluding such a bit from the key derivation to avoid entropy leakage.

END-TO-END SECURITY FUNCTIONS

Implementation Study of Remote Attestation

Remote attestation allows one party to establish trust in another. Herein, building upon F that naturally tolerates a high degree of noise, we use it to attest the firmware integrity of the deployed IoT device during boot-up by a verifier [31].

Notably, we experimentally show that the SRAM fingerprint can be accessed even during run-time by exploiting the separate memory bank power control feature made available by the COTS chip.

System Overview

The entities, a Verifier 1000 and a Prover 1030, involved in this case study are illustrated in Figure 10. In this example, the Verifier 1000 comprises a server 1010 and a wireless network gateway (smartphone) 1020, and the Prover is a wireless sensor node (containing a Bluetooth chip) 1030. In this setup, the server 1010 serves as a coordinator, holds the enrolled Prover's 1030 information in the database, and issues commands to instruct the Prover 1030 to perform remote attestation. The gateway 1020 bridges the communication between the server 1010 and the Prover 1020. The traffic between the server 1010 and the gateway 1020 is assumed secure by applying standard security protection mechanisms. The Prover 1030 is wirelessly deployed in the (insecure) environment.

Details of the corresponding attestation protocol are provided below. Our case study carries out lightweight remote attestation of the resource-constrained Prover, following [31].

Protocol. The protocol overview of remote attestation is illustrated in Figure 16. It has two phases: enrollment (a one-time task) and attestation (typically requested multiple times).

Enrollment. The enrollment is executed only once, in a secure environment, by the Verifier 1000. The fingerprint of a dedicated SRAM section (termed the FingerPrint zone), specified by a starting address addr_f and a length leng_f, is read out from the Prover 1030 and termed fprt_i. The Prover's immutable identification number ID_i and fprt_i are then stored in the Verifier's database (DB). In this case study, D-Norm is applied to extract highly noise-tolerant bits F_i and the corresponding mask_i (the mask points to the positions of F_i, as detailed in Appendix Section G). Both F_i and mask_i are stored in the Verifier's DB. The mask_i is then stored in the Prover's readable but non-writable memory, which is realized by instantiating an immutable bootloader [32]. After that, the Prover is deployed and is only wirelessly accessible.

Attestation. Remote attestation can be requested at any time. First, the Verifier scans for visible Provers by sending the message "hello". Once a Prover within range responds with its ID_i, the Verifier fetches the Prover's information from the DB by indexing on ID_i. Second, if the received ID_i matches one of those stored in the Verifier's DB, the Verifier instructs the Prover to perform attestation. In this context, the Prover power-cycles only the memory banks corresponding to the FingerPrint zone to regenerate F_i under the enrolled mask_i at run-time. After receiving a ready acknowledgement from the Prover, the Verifier randomly generates a challenge/nonce chal and sends it to the Prover along with the address addr and the length leng of the target App code bin in the Prover's memory. Meanwhile, the Verifier also loads the enrolled F_i from the DB. The Prover's response resp is generated using a Cipher-based Message Authentication Code (CMAC) with the noise-tolerant fingerprint F_i, as detailed in Figure 22. The Verifier compares the received response resp with a locally calculated reference response resp'. The remote attestation is accepted if resp and resp' match, and rejected otherwise.
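The challenge-response step of the attestation phase can be sketched as follows. This is only an illustration under stated assumptions: HMAC-SHA256 from the Python standard library stands in for the CMAC named in the text, the message layout (challenge followed by the attested code region) is assumed because Figure 22 is not reproduced here, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def bits_to_bytes(bits):
    """Pack a list of 0/1 values (length a multiple of 8) into bytes, MSB first."""
    if len(bits) % 8:
        raise ValueError("bit count must be a multiple of 8")
    out = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | (bit & 1)
        out.append(byte)
    return bytes(out)


def new_challenge(num_bytes=16):
    """Verifier side: fresh random challenge/nonce (chal)."""
    return secrets.token_bytes(num_bytes)


def prover_response(f_bits, chal, memory, addr, leng):
    """Prover side: MAC over chal and the code region, keyed with the fingerprint.

    HMAC-SHA256 is a stand-in for CMAC; the layout chal || code is assumed.
    """
    code = memory[addr:addr + leng]
    return hmac.new(bits_to_bytes(f_bits), chal + code, hashlib.sha256).digest()


def verifier_accepts(enrolled_f_bits, chal, reference_memory, addr, leng, resp):
    """Verifier side: recompute resp' from enrolled data and compare with resp."""
    expected = prover_response(enrolled_f_bits, chal, reference_memory, addr, leng)
    return hmac.compare_digest(expected, resp)
```

In this sketch, a single differing fingerprint bit or a single modified byte in the attested region changes the response, so the comparison fails; this is why the regenerated fingerprint needs to be highly noise-tolerant.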

Implementation Overhead Comparison

Implementation Details. The complete remote attestation system implementation is detailed in Appendix Section F.

Performance Comparison. The overhead is mainly measured by the number of clock cycles used. Overall, to obtain 128 reliable bits F, the implementation of the D-Norm-based method 100 takes 44,082 clock cycles. In contrast, achieving 128 reliable bits F via error correction incurs a much higher overhead. Specifically, as evaluated in Table VI, using FE (counting only the decoding overhead) and RFE (counting only the encoding overhead) costs 553,278 and 166,296 clock cycles, respectively. Therefore, an embodiment of the present disclosure reduces clock overhead by 73.50% even in comparison with the state-of-the-art RFE case. As for the complete remote attestation (including the key derivation and subsequent operations such as CMAC), an embodiment of the present disclosure requires 91,518 clock cycles, a reduction of 84.76% and 57.18% in comparison with using FE (600,714) and RFE (213,732), respectively. Notably, we compare an embodiment of the present disclosure with (R)FE under a relaxed setting: (R)FE gives a key with a failure rate below 10^-6, while the embodiment can provide a key with a much lower failure rate, e.g. 10^-13, given abundant free memory. If such a failure rate is required using (R)FE, a higher computational overhead is needed.
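The reported reductions can be re-derived from the quoted cycle counts with a short Python check. The helper name `cycle_reduction_percent` is hypothetical; the figures are those stated in the text.

```python
def cycle_reduction_percent(ours, baseline):
    """Relative saving in clock cycles, as a percentage of the baseline."""
    return 100.0 * (baseline - ours) / baseline


# Cycle counts quoted in the text.
key_vs_rfe = cycle_reduction_percent(44_082, 166_296)   # key extraction vs RFE
full_vs_fe = cycle_reduction_percent(91_518, 600_714)   # full attestation vs FE
full_vs_rfe = cycle_reduction_percent(91_518, 213_732)  # full attestation vs RFE
```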

It will be appreciated that many modifications are possible within the scope of the present disclosure.

For example, in all the above experimental validations, only a single fingerprint enrollment measurement was used. In one variant, majority voting may be performed over multiple repeated measurements to reduce the unreliability BER_f of the raw bits. Consequently, the unreliability BER_F of the transformed bits will be reduced, which in turn reduces the failure rate P_fail^key of the root key given a fixed memory size.
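The effect of majority voting on the raw bit error rate can be sketched in Python. The sketch assumes that each repeated measurement of a bit flips independently with probability BER_f relative to its nominal value, and that an odd number of measurements is used; the function names are hypothetical.

```python
from math import comb


def majority_vote(measurements):
    """Bitwise majority over an odd number of equal-length 0/1 lists."""
    t = len(measurements)
    if t % 2 == 0:
        raise ValueError("use an odd number of measurements")
    return [1 if 2 * sum(column) > t else 0 for column in zip(*measurements)]


def majority_vote_ber(ber_f, t):
    """Probability that the majority of t i.i.d. measurements of a bit is wrong."""
    if t % 2 == 0:
        raise ValueError("use an odd number of measurements")
    return sum(
        comb(t, j) * ber_f ** j * (1.0 - ber_f) ** (t - j)
        for j in range(t // 2 + 1, t + 1)
    )
```

Under these assumptions, a raw BER_f of 0.1 drops to 0.028 with three measurements, since the vote is only wrong when two or three of them flip.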

In another example, in the context of D-Norm and its variants as discussed above, the new bit F can be obtained across different blocks rather than from a single block consisting of contiguous words, which could greatly improve the extraction efficiency for the same unreliability setting. This is expected to provide sufficient highly reliable transformed bits F even given an extremely tight memory size, e.g. several KiB. In this case, it is important to ensure that the mask is stored on-chip and is read-only to prevent tampering.

Although the above discussion focuses on SRAM memory, considering its ubiquity in low-end IoT devices, the fingerprinting method according to the present disclosure is applicable to other memories, including Flash and EEPROM memories. Besides, in principle, it can be applied to any other hardware fingerprints (see e.g. references [33], [34]) provided that the raw digital fingerprint bit space is abundant.

The method according to the present disclosure fundamentally obviates computationally heavy on-device ECC logic. It is thus immune to the helper data manipulation (HDM) attack [24], [25], which strategically tampers with the helper data associated with the ECC to weaken or compromise the key extracted via the (R)FE. The vulnerability is induced by the use of error correction codes. Various error correction codes have previously been examined and shown to be vulnerable to HDM attacks [25]. A generic countermeasure against HDM attacks does not yet exist, and thus appears to be an open challenge. The presently disclosed method removes the need for helper data associated with ECC, and thus avoids the HDM attack that exploits ECC helper data [25]. In embodiments of the present disclosure, the position mask (examples of provision of the mask are detailed in Appendix Section G) is placed in the immutable bootloader, thus preventing tampering. If the mask is stored in a writable memory, a MAC of the mask can be produced with the key derived from the transformed bits, for later mask integrity checking to detect any tampering.

Implementation Study of Secure Key Derivation

Secure key derivation protocol

The overview of the secure key derivation protocol is illustrated in Figure 23. The protocol consists of two stages: enrolment and regeneration.

Enrolment: This stage is conducted in a secure environment and is a one-off task throughout the entire lifetime of the device. The start-up state of the device's SRAM memory (raw fingerprints) is dumped and stored by the Server. The raw fingerprints and the mask of positions of raw bits contributing to each F are fed into the NoisFre transformation function FPReg() to acquire new bits F, each bit having an accurate bit-specific reliability.

The triplet (F_i, mask_i, reliability) is added to the Server's database DB, from which it can be retrieved according to the device's identification number i. The mask_i is also stored in the read-only memory (ROM) of the device.

Regeneration: During this stage, session key establishment can be requested at any time after the device is deployed. First, the Server sends 'Hello' to the device and the device replies with its ID number ID_i. Then the Server generates a random number nonce using a random number generation function RNG(). The nonce is then transferred to the device. The device regenerates new bits F' according to the mask_i stored in its ROM. Notably, F' is not necessarily exactly the same as the F_i enrolled with the Server; some bits may differ. F' is directly used as the key of a cipher-based message authentication code CMAC(), and the nonce received from the Server is also fed into CMAC() as a refreshing component to generate a tag. The tag is then sent back to the Server.

The protocol above utilizes the device's ROM to store the mask for extraction. In case such memory is not supported, the mask can also be transmitted during the Regeneration. To prevent the mask from being modified through the unsecured air interface, a similar technique to prevent a helper data modification attack [24] can be used.

Performance comparison. To demonstrate the practicality of embodiments of the presently disclosed method (referred to below as NoisFre-Lite) in the real world, we integrated it with SecuCode, an SRAM PUF entangled secure firmware update scheme for battery-free CRFID devices. NoisFre-Lite serves as the secure and reliable key extraction building block, in place of SecuCode's inbuilt key derivation, which relies on BCH code-based error correction. In particular, a quantitative comparison with Reverse Fuzzy Extractor (FE) based key derivation is carried out. The main overhead considered is the number of clock cycles consumed by the key derivation. In addition, memory usage, including run-time data memory and executable code memory, is also considered.

The MSP430 we tested is equipped with an AES accelerator. Hence, we use HW-AES-CMAC as our CMAC function. However, an AES accelerator may not be available in other microcontrollers, such as the ATmega328p widely used in Arduino boards and many smart cards. Therefore, we also consider and evaluate a software implementation of a lightweight cryptographic hash function such as BLAKE2s, as included in Figure 34.

Both NoisFre-Lite and the FE-based key generator are implemented on the same 16-bit RISC embedded platform, the MSP430, with only 2 KiB of SRAM for run-time data and 64 KiB of FRAM for executable code. NoisFre-Lite greatly outperforms the FE in all three overheads: number of clock cycles, data memory (SRAM) usage, and code size (FRAM), as summarized in Figure 35.

Clock Cycles: NoisFre-Lite consumes only 20,238 clock cycles to produce a 128-bit key under the 10^-6 industry-standard requirement. By comparison, the FE-based key derivation requires 112,762 clock cycles, more than 5 times as many. NoisFre-Lite with the software implementation of the MAC consumes 78,109 clock cycles, which is still 36.4% better than the FE-based method.

Memory Usage: The data memory (RAM, run-time variable storage) usage of NoisFre-Lite is 86 bytes, compared with 117 bytes for FE-based key generation, a drop of 31 bytes (26.5%). The code memory usage also improves noticeably, reducing from 3,369 bytes to 3,146 bytes (6.62%). However, when using the software implementation of the MAC function, the data memory usage increases by 17.9% to 138 bytes, and the code memory increases slightly, by 2.4%, to 3,450 bytes.

In summary, for embodiments of NoisFre-Lite, the improvement in computational overhead measured by clock cycles is substantial. Fewer clock cycles reduce the possibility of power loss in intermittently powered energy-harvesting devices, such as an RFID token. As an additional benefit, the reduced memory occupation releases more system resources for other applications, such as user functions.

APPENDICES

The following appendices include derivations of the Equations above. For these derivations, the below ideal chip model is adopted.

A. Random Chip Model

Because of the limited size of a physical chip and the difficulty of taking an extremely large number of repeated measurements from it, a randomly generated chip model is adopted to evaluate the analytic predictions.

The random chip model is based on the following settings:

1) Each bit has a 50% chance to be logic '1' or '0' during the enrollment phase.

2) Each bit has equal BER_f for being flipped during regeneration.

3) The value of each bit is independent and identically distributed (i.i.d.), so that there is no spatial or temporal correlation.
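The three settings of the random chip model can be sketched in Python as follows. The function names and the use of a seeded `random.Random` instance are illustrative assumptions; the model itself follows the settings listed above.

```python
import random


def enroll_chip(num_bits, rng):
    """Setting 1: each enrolled bit is '1' or '0' with 50% probability."""
    return [rng.getrandbits(1) for _ in range(num_bits)]


def regenerate(enrolled, ber_f, rng):
    """Settings 2 and 3: every bit flips independently with probability ber_f."""
    return [bit ^ int(rng.random() < ber_f) for bit in enrolled]


def observed_ber(enrolled, regenerated):
    """Fraction of bits that differ between enrollment and regeneration."""
    flips = sum(a != b for a, b in zip(enrolled, regenerated))
    return flips / len(enrolled)
```

For example, `chip = enroll_chip(16 * 1024, random.Random(1))` followed by `regenerate(chip, 0.05, random.Random(2))` produces one simulated re-measurement of a 2 KiB chip.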

B. Unreliability Formalization of S-Norm Transformation

All possible cases of noise-tolerant S-Norm transformed bit F are shown below.

To ease understanding, the formalization is visualized in Figure 18. Recall that a new bit F is transformed from n raw bits and that the l1-Norm of those raw bits lies in [0, n]. To assess the worst-case BER_F, we consider the condition where the selected word's l1-Norm sits exactly on the selection boundary, as shown in boundary condition F = '1' in Figure 18. Here, θ is a threshold used to select highly reliable bits F.

Each raw bit has probability BER_f of being flipped under re-evaluation. Using boundary condition F = '1' as an example, on the one hand (case 1 in Figure 18), if raw bits of '0' (marked as section B in Figure 18) flip, this increases the number of raw bits of '1' (section A in Figure 18) that are allowed to flip without influencing bit F. On the other hand (case 2 in Figure 18), flipping raw bits of '1' (marked as section A in Figure 18) will potentially result in an error in the new bit F. Furthermore (case 3 in Figure 18), supposing that the raw bits of '0' (section B) remain unchanged, if more than θ raw bits of '1' flip, the new bit F will exhibit an error, flipping from '1' to '0'. To be precise, the transformed new bit F will not exhibit an error unless more than θ + i raw bits of '1' flip.

Overall, bit flipping within raw bits of '0' (section B) increases the reliability of the extracted F. In contrast, bit flipping within raw bits of '1' (section A) decreases the reliability of the extracted F. The boundary condition F = '0' is logically equivalent to the case F = '1'; it only inverts F's '0'/'1' value rather than its BER.

Without loss of generality, we focus on one case, shown as case 1 in Figure 18. The probability of having exactly x error bits in section A can be expressed as binopdf(x, |A|, BER_f), given that each raw bit has a BER_f probability of flipping. Similarly, the probability of y bits in section B being flipped is formulated as binopdf(y, |B|, BER_f).

Although bit flips can occur in either section A or B, their consequences for the BER_F of bits F are opposite: flipped bits in section A reduce the margin and potentially increase BER_F (shown as the dashed boundary line in Figure 18 moving towards the left); in contrast, flipped bits in section B increase the margin and potentially decrease BER_F (the boundary moves towards the right). If the boundary crosses the middle point of the word, ||f_i||₁ falls below that middle point and consequently the new bit F flips, thus exhibiting an error.

Starting from the extreme but straightforward condition that there is no bit flip in section B, i.e. y = 0, the maximum number of erroneous bits that can be tolerated is θ, as discussed above. The error probability for this condition can therefore be expressed through the term Pr(x ≥ θ), which follows from the binomial probability of x flips in section A. Substituting that binomial expression yields BER_F for the y = 0 condition.

However, there is more than one case that satisfies x - y ≥ θ for {(x, y): |x| ∈ A, |y| ∈ B}. Since A and B are finite sets, the combinations of x and y are enumerable. Another property of note is |A| > |B|, meaning that the total number of combinations has an upper bound set by the cardinalities of the two sets, where "| |" denotes the cardinality of a set. If we enumerate and sum up all possible combinations, we get the complete form of Equation (10).

C. Unreliability Formalization of D-Norm Transformation

As discussed above, the transformed bit generated by D-Norm is determined as follows:

In order to apply the same derivation strategy as the S-Norm, as illustrated in Figure 19(b), the two l1-Norms are reshaped into a single row, and the four partitions are rearranged as two sections: A and B. The length of section A can be written as h + (n - l); since h = l + θ, the length of A is n + θ. The largest number of errors/flips within raw bits f that still do not result in an error or flip in the new transformed bit F is (n + θ) - n = θ.

The rest of the formulating strategy is identical to the S-Norm. Using Case 1 of Figure 19(b) as an example, on the one hand, if one raw bit in section B is flipped, it increases the number of raw bits in section A that are allowed to flip without influencing the output bit F. On the other hand, in Case 2 of Figure 19(b), flipping one raw bit in section A will potentially result in an error in the new bit F. Furthermore, in Case 3 of Figure 19(b), supposing that section B's raw bits remain unchanged, if more than θ raw bits are flipped in section A, F will exhibit an error, flipping from '0' to '1'. The transformed bit F will therefore not exhibit an error unless more than θ + i raw bits in section A are flipped. Now, an extreme condition is considered as a starting point: as shown in the 3rd column in Figure 19 (second column of Figure 19(b)), we have two words, labeled with the spatial indices "first" and "second". We denote their raw bits as f_first and f_second respectively.

In the exemplified case, f_first has the lowest l1-norm while f_second has the highest l1-norm in the m-word block, i.e. ||f_first||₁ = l, ||f_second||₁ = h. From the diagram, we can write down the following equation:

||f_second||₁ - ||f_first||₁ = θ

By substituting ||f_second||₁ = h and ||f_first||₁ = l into the equation above (Case 1 in Figure 19(a)), we get:

h - l = θ

Adding n (the number of bits in one word/group) to both sides of the equation, we get:

h + (n - l) = n + θ

If we reshuffle the four partitions in Figure 19(b), the error rate of D-Norm can be formalized in a similar manner as the S-Norm (Equation 10). The margin (denoted as a dashed line) reduces and results in an unstable trend if any bit flips in the section A. Once the margin crosses n (marked as a solid line) from the right to the left, the new bit F exhibits errors. In contrast, bits flipped in section B increase the margin and stabilize the F.

The probability of x error bits occurring in section A can be expressed as binopdf(x, |A|, BER_f), where |A| = n + θ. Similarly, the probability of y bits in section B being flipped can be expressed as binopdf(y, |B|, BER_f).

Now consider the special case where there is no bit flip in section B. The highest number of bits allowed to be flipped in section A is then simply θ; otherwise, F will exhibit an error. Consequently, BER_F for this case is the probability that more than θ bits flip in section A.

If the number of flipped bits in section B is non-zero, the same reasoning applies for each y, where y ∈ [0, |B|], |B| = n - θ. In other words, flipped bits in section B allow more tolerance of error bits in section A before F exhibits an error. Therefore, the D-Norm BER_F is the summation of the per-y contributions over all possible y, finally formulated as in Equation (11).
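The summation described above can be evaluated numerically with a short Python sketch. It is not a transcription of Equation (11): it assumes that F is in error exactly when the number of flips x in section A exceeds θ plus the number of flips y in section B, with |A| = n + θ and |B| = n - θ as stated in the text. The function names are hypothetical.

```python
from math import comb


def binopdf(x, n, p):
    """Binomial probability of exactly x successes in n trials."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def dnorm_boundary_ber(n, theta, ber_f):
    """Worst-case BER_F of a D-Norm bit whose two l1-Norms differ by exactly theta.

    Assumption: an error occurs when x - y > theta, where x counts flips in
    section A (n + theta bits) and y counts flips in section B (n - theta bits).
    """
    size_a = n + theta
    size_b = n - theta
    total = 0.0
    for y in range(size_b + 1):
        # Probability that more than theta + y bits flip in section A.
        tail_a = sum(binopdf(x, size_a, ber_f)
                     for x in range(theta + y + 1, size_a + 1))
        total += binopdf(y, size_b, ber_f) * tail_a
    return total
```

Under this assumption, increasing θ for a fixed n and BER_f lowers the resulting BER_F, which matches the role of θ as a reliability threshold.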

D. Extraction Efficiency of S-Norm Transformation

For the S-Norm, if one group of raw bits (e.g. a word) f is selected, it must satisfy the selection criterion ||f||₁ ≤ i or ||f||₁ ≥ k. Hence, the probability of a group being selected can be expressed as:

Pr(selected) = Pr(||f||₁ ≤ i) + Pr(||f||₁ ≥ k)

By substituting Pr(||f||₁ ≤ i) = binocdf(i, n, 0.5) and Pr(||f||₁ ≥ k) = 1 - binocdf(k, n, 0.5), we get:

Pr(selected) = binocdf(i, n, 0.5) + 1 - binocdf(k, n, 0.5)

This formulates the probability that one group is selected under S-Norm. The extraction efficiency η_SNorm can be directly expressed via

η_SNorm = Pr(selected) × (1/n) × 1024 × 8

where the factor 1/n reflects that a transformed bit F is derived from n raw bits, and the last term 1024 × 8 is the conversion factor between bits and KiB (bit/KiB). By substituting Pr(selected) into η_SNorm, we finally obtain Equation (16).
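A numerical sketch of the S-Norm extraction efficiency is given below in Python. The parameter names `low` and `high` stand for the two selection bounds (i and k in the text) and are hypothetical. Note one assumption of this sketch: for an integer-valued l1-Norm, Pr(||f||₁ ≥ k) is computed as the exact complement 1 - binocdf(k - 1, n, 0.5).

```python
from math import comb


def binocdf(x, n, p):
    """Binomial probability of at most x successes in n trials."""
    if x < 0:
        return 0.0
    return sum(comb(n, j) * p ** j * (1.0 - p) ** (n - j)
               for j in range(min(x, n) + 1))


def snorm_selection_probability(n, low, high):
    """Probability that a random n-bit word has l1-Norm <= low or >= high."""
    if low >= high:
        raise ValueError("expected low < high")
    return binocdf(low, n, 0.5) + (1.0 - binocdf(high - 1, n, 0.5))


def snorm_bits_per_kib(n, low, high):
    """Expected number of transformed bits F per KiB of raw memory."""
    return snorm_selection_probability(n, low, high) / n * 1024 * 8
```

As a sanity check, with n = 8 and only the all-zero and all-one words selected (low = 0, high = 8), the selection probability is 2/256 and a KiB of raw memory is expected to yield 8 transformed bits.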

E. Extraction Efficiency of D-Norm Transformation

To estimate the extraction efficiency of D-Norm, we first estimate the probability P(a, z) that, among m groups of raw bits (e.g. words) f₁, f₂, ..., f_m, the minimum l1-Norm l = min ||f_i||₁ is a given value a from 0 to n, and the maximum l1-Norm h = max ||f_i||₁ is another given value z from 0 to n.

Recall that a block is selected only if its maximum and minimum l1-Norms differ by no less than θ. Under this principle, the probability P_block that one block is selected for noise-tolerant fingerprint extraction is simply the sum of all P(a, z) over z - a ≥ θ.

Although P(a, z) is non-trivial to estimate directly, it is possible to solve an easier and related problem first: the probability Q(a, z) that l ≥ a and h ≤ z.

Another way to look at Q(a, z) is: what is the probability that, among the m words in a block, all l1-Norms are at least a and at most z? That question can be answered because it poses an independent question on each word f_i: is a ≤ ||f_i||₁ ≤ z? The answer must be "yes" for all m words. It is "yes" for a single word with probability equal to the sum of binopdf(i, n, 0.5) over i from a to z (the usual formula: the number of words that meet the condition divided by the number of all possible words), and because those events are independent, the probabilities can be multiplied. The question becomes: how do we get from Q(a, z) to P(a, z)? Note that

{(f₁, f₂, ..., f_m): (l = a ∧ h = z)}

= {(f₁, f₂, ..., f_m): (l ≥ a ∧ h = z)} - {(f₁, f₂, ..., f_m): (l ≥ a + 1 ∧ h = z)}

because asking for l to be equal to a is equivalent to asking for l to be at least a but not at least a + 1. Also, the set we are subtracting is a subset of the set we are subtracting from, so we get:

P(a, z) = Pr{(f₁, f₂, ..., f_m): (l = a ∧ h = z)}

= Pr{(f₁, f₂, ..., f_m): (l ≥ a ∧ h = z)} - Pr{(f₁, f₂, ..., f_m): (l ≥ a + 1 ∧ h = z)}

Our two operands are of the same type, so we can apply the same operation to reduce each probability to something expressible by some Q(r, s):

{(f₁, f₂, ..., f_m): (l ≥ r ∧ h = z)}

= {(f₁, f₂, ..., f_m): (l ≥ r ∧ h ≤ z)} - {(f₁, f₂, ..., f_m): (l ≥ r ∧ h ≤ z - 1)}

and we get:

Pr{(f₁, f₂, ..., f_m): (l ≥ r ∧ h = z)}

= Pr{(f₁, f₂, ..., f_m): (l ≥ r ∧ h ≤ z)} - Pr{(f₁, f₂, ..., f_m): (l ≥ r ∧ h ≤ z - 1)} = Q(r, z) - Q(r, z - 1)

and finally, for P(a, z), by substituting this into the above formula:

P(a, z) = (Q(a, z) - Q(a, z - 1)) - (Q(a + 1, z) - Q(a + 1, z - 1))

The normalized D-Norm extraction efficiency η _DNorm is finally given by:

To be concise, we keep P_block and Q(l, h) expressed separately.
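The inclusion-exclusion step from Q(a, z) to P(a, z), and the resulting block selection probability, can be checked numerically with the following Python sketch. The function names are hypothetical, and the sketch stops at the block selection probability rather than the normalized efficiency, whose final expression is not reproduced in the text.

```python
from math import comb


def norm_pmf(i, n):
    """Probability that a random n-bit word has l1-Norm exactly i."""
    return comb(n, i) / 2 ** n if 0 <= i <= n else 0.0


def q(a, z, n, m):
    """Q(a, z): probability that all m word norms in a block lie in [a, z]."""
    if a > z:
        return 0.0
    single_word = sum(norm_pmf(i, n) for i in range(a, z + 1))
    return single_word ** m


def p(a, z, n, m):
    """P(a, z): probability that the minimum norm is a and the maximum is z."""
    return (q(a, z, n, m) - q(a, z - 1, n, m)) - (
        q(a + 1, z, n, m) - q(a + 1, z - 1, n, m))


def block_selection_probability(n, m, theta):
    """Sum of P(a, z) over all pairs with z - a >= theta."""
    return sum(p(a, z, n, m)
               for a in range(n + 1)
               for z in range(a, n + 1)
               if z - a >= theta)
```

With θ = 0 every block qualifies, so the sum of all P(a, z) is 1, which is a convenient consistency check on the derivation.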

F. Remote attestation

During the one-time enrollment conducted by the trusted Verifier 1000, we use a cabled JTAG interface and the Segger J-Link command-line tool to read out the start-up state (fingerprints) of the SRAM of the Prover 1030 (Figure 10), a Nordic Semiconductor nRF52832. The raw fingerprints read out are saved as binary files and then processed (using MATLAB) to perform the S-Norm/D-Norm transform (step 108 of process 100 of Figure 2). Such a process may produce i) a database entry containing a Prover ID for the Prover 1030 and the selected reliable noise-tolerant bits F and ii) a C language .h file containing the mask pointing to the addresses of F, to be compiled with the sensor node code.

During the attestation phase, a command-line Verifier tool can be used to randomly generate a challenge/nonce, to look up the database according to the ID returned by the Prover 1030, and to compute the expected response. In order to visualize the data exchange for demonstration purposes, a Gateway 1020 can be built using an Android demo app based on the FastBLE library (https://github.com/Jasonchenlijian/FastBle), and the smartphone's built-in Bluetooth-LE interface can be used to communicate with the Prover 1030. Note that in practice, the Gateway 1020 could be realized by a USB wireless dongle. The Prover 1030 in this case study is a representative low-end sensor node equipped with an nRF52832 Bluetooth-LE SoC. The code to be attested on the Prover 1030 is statically allocated with a linker preprocessor command (for example, __attribute__((section(".ARM.__at_0x50000"))) in Keil uVision specifies placing the function in memory starting from address 0x50000). The noise-tolerant fingerprint regeneration function, the mask, and the immutable bootloader are placed in read-only memory protected by the ARM MPU.

G. Noise-resistant Fingerprint Mask

Positions of transformed bits F are provisioned during the key enrollment phase and provided during the key regeneration phase. These positions are used to form a mask. Recall that we refer to the raw bits that produce a 1-bit F as a block. For the S-Norm, one block has n raw bits, while for the D-Norm, one block has n × m raw bits. For both methods, n raw bits form one l1-Norm.

S-Norm: The starting address is set to be zero. To save storage, only the offsets rather than the actual addresses of selected blocks (each block transforms into a 1-bit F) are recorded. The offsets monotonically increase.

D-Norm: Again, the starting address is set to be zero. Here, besides the (outer) offset of each block, each block has two inner offsets that point to the positions/indices of l and h, respectively. Both the outer and the inner offsets are organized to increase monotonically. For both methods, once the mask is determined, its MAC can be produced over the derived key for later integrity checking. The mask and MAC can be publicly stored off-chip and/or on-chip.
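To make the S-Norm mask concrete, the following Python sketch records the offsets of selected blocks at enrollment and reuses them at regeneration. Several details are assumptions made only for illustration: a block is selected when its l1-Norm is at least n/2 + θ (F = 1) or at most n/2 - θ (F = 0), and a regenerated block is read as '1' when its l1-Norm is at least n/2. The function names are hypothetical.

```python
def enroll_snorm(raw_bits, n, theta):
    """Return (mask, f_bits): block offsets from address zero and their bits F.

    Assumed selection rule: l1-Norm >= n/2 + theta gives F = 1,
    l1-Norm <= n/2 - theta gives F = 0, anything else is skipped.
    """
    mask, f_bits = [], []
    for offset in range(len(raw_bits) // n):
        norm = sum(raw_bits[offset * n:(offset + 1) * n])
        if 2 * norm >= n + 2 * theta:
            mask.append(offset)
            f_bits.append(1)
        elif 2 * norm <= n - 2 * theta:
            mask.append(offset)
            f_bits.append(0)
    return mask, f_bits


def regenerate_snorm(raw_bits, n, mask):
    """Re-derive F' from a fresh read-out using only the stored offsets.

    Assumed decision rule: a block reads as '1' when its l1-Norm is >= n/2.
    """
    return [1 if 2 * sum(raw_bits[o * n:(o + 1) * n]) >= n else 0 for o in mask]
```

Because blocks are scanned from address zero upwards, the recorded offsets increase monotonically, as described for the S-Norm mask.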

H. Derivation of Method-A extraction efficiency

The analysis in this section is based on the aforementioned random chip model. In addition, the following constraints are taken into consideration.

Constraint 1 : The difference between the selected pair of l1-Norms is no less than θ.

Constraint 2: Each l1-Norm can be selected only once.

Constraint 3: Maximising the extraction efficiency is the first priority.

Constraint 4: Maximising the extracted key's reliability is the second priority.

To analyse the extraction efficiency η_A for Method-A, we apply the steps described below with an example setting: d = 2048, n = 16, θ = 8. The selected l1-Norms are visualized in Figure 28 to ease understanding: selected l (low l1-Norms) are coloured in red (at the far left of the graph), and selected h (high l1-Norms) are coloured in yellow (at the far right of the graph), where |h - l| = θ.

Best Extraction Efficiency Proof. For example, as shown in Figure 28, for ||f||₁ = 5 there are l1-Norms left over (marked with the green dashed line and label (A)). According to Constraint 1, any l1-Norm that could be matched with them must be located in the range [0, ||f||₁ - θ] ∪ [||f||₁ + θ, n]. For case (A), with ||f||₁ = 5 and θ = 8, possible matching candidates should be within [0, -3] or [13, 15]. The former is invalid, as the domain of ||f||₁ is {0, ..., n}. For the latter, as shown in Figure X, all l1-Norms with ||f||₁ = 13, 14, 15 have been used (coloured in yellow). According to Constraint 2, every l1-Norm can only be used once. Hence no element in (A) can be used to produce more l1-Norm pairs that satisfy |h - l| = θ.

We can repeat the above process for ||f||₁ = 6, 7, ..., 11. The conclusion is that no new pair can be matched subject to the constraints. For any given n ∈ ℤ, the number of l1-Norms is finite.

In the probability density function plot of Figure 28, area indicates probability. Hence the upper bound of η_A can be derived from the area coloured by Method-A, which has been shown above to be the best method in terms of extraction efficiency. Because D-Norm produces a 1-bit noise-tolerant F from a pair of l1-Norms, only one coloured area needs to be counted: either red (at the far left of the graph) or yellow (at the far right of the graph).

Extraction Efficiency Formulation. For better visualization, Figure 28 has a different setting: n = 16, θ = 3. Regarding Constraint 1, the "red area" (columns at the far left of the graph) may span q ∈ {0, 1, ..., n - θ}. The reproduced results are depicted in Figure 29. To ease the analysis, we split the red area into two partitions. First, for q ∈ [0, θ), all l1-Norms are selected as l. Therefore the area of partition one at each q is simply binopdf(q, n, 0.5). Partition two of the red area is more complex to analyse. Considering Constraint 2, some l1-Norms in the second partition have already been used to match l1-Norms in the first partition. For example, at q = 6 in Figure 29, the total probability labeled (A) can be calculated with the same formula binopdf(q, n, 0.5) as in partition one. However, some l1-Norms, labeled (C), have already been used to match ||f||₁ = q - θ, labeled (D) in the figure. The areas of (C) and (D) are both binopdf(q - θ, n, 0.5), hence the area of (B) can be calculated by deducting (C) from (A), i.e. binopdf(q, n, 0.5) - binopdf(q - θ, n, 0.5). Nevertheless, the area of (B) is also capped by the available l1-Norms at ||f||₁ = q + θ: whichever of (E) and (B) is smaller is kept as the final red area at ||f||₁ = q.

Based on the analysis above, the extraction efficiency η_A of Method-A can be formulated as shown in Equation (17).

I. Global TRE exemplar implementation

Figure 37 shows an exemplar implementation for the Global TRE, and Figure 39 and Figure 40 show the run-time evaluation results for the method.

A re-generated fingerprint contains several error bits, shown with highlighted borders. The re-generated fingerprint is kept inside the remote device and is never exposed; neither the attacker nor the server can see it. Instead, the remote device calculates the hash of the re-generated fingerprint and makes it available to the public, including the server.

x = binoinv(p, k, BER_F), (21)

The server may compute the hash over the reference fingerprint. However, due to the error bits, the two hashes would not match. The server may use Equation (21) to predict how many error bits may exist in the re-generated fingerprint. In this example, consider x = 2, which assumes that the number of error bits is less than or equal to 2. The server may apply the power set described in the DETAILED DESCRIPTION section to try and examine every possible location of the error bits. Inside each trial, the server may flip the bits, calculate the hash and test all 2^x possible combinations. If the hash value of the current TRE fingerprint matches the hash received from the remote device, the current TRE fingerprint is the same as the re-generated fingerprint.

To evaluate TRE complexity, we measure the required number of trials rather than the absolute time taken, as the former is invariant to varying computation platforms, system loads, and CPU temperatures.
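A server-side Global TRE search that counts its trials can be sketched in Python as follows. This is an illustration under assumptions, not the implementation of Figure 37: SHA-256 is used as the hash, candidate error locations are enumerated with `itertools.combinations` from the fewest flips upwards, and all names are hypothetical.

```python
import hashlib
from itertools import combinations


def fingerprint_hash(bits):
    """Hash of a fingerprint given as a list of 0/1 values (SHA-256 assumed)."""
    return hashlib.sha256(bytes(bits)).digest()


def global_tre(reference_bits, target_hash, max_errors):
    """Search for the re-generated fingerprint by flipping up to max_errors bits.

    Returns (recovered_bits, trials); recovered_bits is None if no candidate
    within the error budget matches the hash published by the device.
    """
    trials = 0
    for num_flips in range(max_errors + 1):
        for positions in combinations(range(len(reference_bits)), num_flips):
            candidate = list(reference_bits)
            for pos in positions:
                candidate[pos] ^= 1   # flip one suspected error bit
            trials += 1
            if fingerprint_hash(candidate) == target_hash:
                return candidate, trials
    return None, trials
```

When the re-generated fingerprint is noise-free, the first trial (zero flips) already matches, so a single iteration suffices; the trial count only grows when error bits are actually present.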

The upper bound (worst-case) complexity of performing Global TRE (Glo. TRE) over a bit string of length k with tolerance of up to x errors is expressed as:

C(Glo. TRE(k, x)) = (2 · k)^x

Since the key length k is a constant, we can substitute 2 · k with the constant symbol c and refer to the number of possible error bits x as the scaling factor n; the worst-case complexity of the Global TRE is then generally represented as O(cⁿ).

O(Glo.TRE(x,k)) = O(cⁿ)

The Method-A with Global TRE. According to Figure 25, with Method-A, the largest θ that allows a 128-bit key to be extracted from a 2 KiB memory space is 7. The predicted key failure rate is 1.115 x 10^-2, and the corresponding simulated key failure rate is 5.985 x 10^-3; neither immediately meets the 10^-6 requirement.

To further reduce the key failure rate, we can apply the Global TRE to restore the few unmatched bits. Here we can use the function x = binoinv(p, k, BER_A) to estimate the number of tolerated error bits, x, needed to meet a key failure rate below 10^-6. In this context, for a key failure rate p = 10^-6 and a key bit error rate BER_A = 8.759 x 10^-5 (obtained with Equation (11)), x = 2 is calculated. Therefore the maximum number of trials or iterations needed by Glo. TRE to achieve a key failure rate below 10^-6 can be estimated by the function C(Glo. TRE(k, x)) above, giving 65,536.

The above presents the estimated maximum number of trials of Glo. TRE through formulated equations; we now validate it through empirical evaluation. To be precise, we perform 1,000,000 key generations and record the number of TRE routine iterations or trials for each regeneration. Ideally, there should be no more than one failure, and the number of trials of each TRE should be below 65,536.

The result is illustrated in Figure 39. The Method-A could manage a failure rate down to 6.64 x 10^-3 with 6,458 errors out of 1,000,000 repeated key extractions. In addition, when the Gio. TRE is employed, it is shown to be capable of correcting all those errors with a maximum complexity of 7,862 iterations, well below the 65,536 upper bound estimated above. Consequently, using the combination of Method-A and the Gio. TRE, a key failure rate below 10^-6 is achievable.

The Method-B with Global TRE. Similarly, as shown in Figure 27, to obtain a 128-bit key from a 2 KIB memory with the Method-B method, the largest θ is 6. Moreover, the predicted key failure rate is 6.10 x 10^-2,, and the simulated data is 3.94 x 10^-2,.

We can apply the Glo.TRE as we did above. However, we need to consider the worst-case key bit error rate, which is BER_B = 4.92 x 10^-4 (calculated with Equation (11)), and the equation x = binoinv(p, k, BER_B) gives an expected number of error bits x = 3. The predicted upper bound of the TRE complexity is C(Glo.TRE(128, 2)) = (2 · 128)³ = 4,194,304.

The result is shown in Figure 40. The vast majority of key generations are noise-free and complete in a single iteration. Only 3.40% (33,972 out of 1,000,000) of the key generations require TRE to be activated. The mean number of TRE iterations is 6.54, and the maximum number of iterations in our tested 1,000,000 key generations is 301,308, well below the estimated 4,194,304 upper bound. Therefore, the Global TRE could reduce the key failure rate of Method-B key extraction from 3.40% to below 10^-6 at minimal cost.

J. Regional TRE exemplar implementation

Figure 38 shows an exemplar implementation for the Regional TRE. In the Regional TRE, instead of treating all bits in the fingerprint as having equal reliability, the system utilizes the reliability information, which is d = |h - l|. As shown in step ®, the fingerprint is split into multiple pieces, which can be referred to as "sub-strings". In each sub-string, the expected number of error bits can be predicted with Equation (22):

x_i = binoinv(p, l_i, BER_F)    (22)

where x_i is the upper bound on the number of expected error bits in a sub-string of length l_i, and p is the expected reliability of the sub-string. In other words, performing TRE for x_i bits in this sub-string can recover all error bits.

In this example, the fingerprint is split into three sub-strings. In a real case, the bits of sub-strings with different x_i values may be distributed across the entire fingerprint. In this illustrative example, the bits that belong to the same sub-string are grouped together for easier understanding.

In the left-most sub-string, x₀ = 0, which means that no errors at all are expected under the current reliability grade defined by p in Equation (22). The other two sub-strings have x₁ = 1 and x₂ = 3.

The remaining steps are the same as those in the Global TRE.
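The search over sub-strings described in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of Figure 38. It assumes the verifier holds the enrolled fingerprint, the bit positions of each sub-string, and the per-sub-string error budgets x_i, and that it compares hashes of candidate fingerprints against a received hash. SHA-256, the function names, and the order in which candidates are tried are all assumptions introduced for the example.

```python
import hashlib
from itertools import combinations, product
from typing import List, Optional, Sequence


def digest(bits: Sequence[int]) -> bytes:
    """Hash a sequence of 0/1 values (SHA-256 is an assumed choice)."""
    return hashlib.sha256(bytes(bits)).digest()


def flip_sets(indices: Sequence[int], max_errors: int):
    """Yield every set of at most `max_errors` positions from `indices`."""
    for n in range(max_errors + 1):
        yield from combinations(indices, n)


def regional_tre(
    enrolled: Sequence[int],
    substrings: Sequence[Sequence[int]],
    budgets: Sequence[int],
    target: bytes,
) -> Optional[List[int]]:
    """Try flipping at most budgets[i] bits inside substrings[i] until the
    hash of the candidate fingerprint equals `target`."""
    per_substring = [flip_sets(idx, x) for idx, x in zip(substrings, budgets)]
    # The first combination flips nothing, i.e. the plain hash comparison.
    for choice in product(*per_substring):
        candidate = list(enrolled)
        for positions in choice:
            for pos in positions:
                candidate[pos] ^= 1
        if digest(candidate) == target:
            return candidate
    return None


# Hypothetical 8-bit fingerprint with three sub-strings and the budgets
# x0 = 0, x1 = 1, x2 = 3 from the example above.
enrolled = [1, 0, 1, 1, 0, 0, 1, 0]
substrings = [[0, 1, 2], [3, 4], [5, 6, 7]]
budgets = [0, 1, 3]
observed = [1, 0, 1, 1, 1, 0, 0, 0]  # bits 4 and 6 differ from enrolment
recovered = regional_tre(enrolled, substrings, budgets, digest(observed))
```

Restricting flips to the per-sub-string budgets keeps the number of candidates far smaller than trying the same total number of flips anywhere in the fingerprint, which is the motivation for using the reliability information.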

INTERPRETATION

The term "substantially" can indicate a percentage greater than or equal to 90%, for instance, 92.5%, 95%, 97.5%, 99%, or 100%.

The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that the prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.

Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.

Claims

1. A method of generating a device fingerprint for a device that comprises at least one integrated circuit, the method comprising: obtaining a plurality of raw bit values from the at least one integrated circuit; generating noise-resistant bits from the raw bit values by: grouping the raw bit values into a plurality of blocks; and applying a transformation to each of the blocks, wherein for each group, the transformation either generates a single noise-resistant bit for the group or does not output any value for the group, based on a noise tolerance threshold; and outputting a string of the noise-resistant bits as the device fingerprint.

2. A method according to claim 1, wherein the transformation is arranged to provide a bimodal reliability distribution for the noise-resistant bits.

3. A method according to claim 1 or claim 2, wherein one or more of the integrated circuits comprises a plurality of addressable units, and wherein at least a subset of the raw bit values are obtained from at least a subset of the plurality of addressable units.

4. A method according to claim 3, wherein each group comprises one or more wordlines of the addressable units.

5. A method according to claim 3 or claim 4, comprising storing locations of the addressable units from which the raw bit values are obtained.

6. A method according to any one of claims 1 to 5, wherein applying the transformation comprises: computing a summary value for the group; and determining whether the summary value is greater than half the group size plus the noise tolerance threshold, or is less than half the group size minus the noise tolerance threshold.

7. A method according to any one of claims 1 to 5, wherein operation (i) further comprises: dividing the block into a plurality of groups; and operation (ii) further comprises: computing a summary value for each subgroup; determining a highest summary value and a lowest summary value for the subgroups; and if the difference between the highest summary value and the lowest summary value is greater than or equal to the noise tolerance threshold, determining whether the subgroup having the highest summary value occurs later in the group than the subgroup having the lowest summary value.

8. A method according to claim 7, wherein operation (i) comprises generating a reconfigured group by increasing a size of the group and/or a number of subgroups in the group if the difference between the highest summary value and the lowest summary value is less than the noise tolerance threshold; and operation (ii) is re-performed on the reconfigured group.

9. A method according to claim 8, wherein the size of the group has a user-defined upper limit.

10. A method according to claim 7, wherein each group has a fixed size and operation (i) comprises dividing the raw bit values into a sequence of sliding windows of raw bits, each sliding window corresponding to a group.

11. A method according to any one of claims 1 to 5, wherein applying the transformation comprises: computing a summary value for each group to generate a list of summary values; sorting the summary values in ascending order or descending order; determining pairs of the groups that have an absolute difference that is equal to the noise tolerance threshold; and for each pair, generating a single noise-resistant bit based on the positions of the groups of the pair in the list.

12. A method according to claim 11, wherein the method is repeated until a key bit requirement for the digital fingerprint is reached or until no pairs of the groups have an absolute difference that is equal to the noise tolerance threshold.

13. A method according to any one of claims 1 to 5, wherein the noise tolerance threshold is an initial noise tolerance threshold and applying the transformation comprises:

(a) computing a summary value for each group to generate a list of summary values;

(b) determining pairs of the groups that have an absolute difference that is equal to or greater than the initial noise tolerance threshold;

(c) for each pair, generating a single noise-resistant bit based on the positions of the groups of the pair in the list;

(d) repeating (a)-(c) until all pairs are exhausted; and if a key bit requirement for the digital fingerprint is not reached, decreasing the noise tolerance threshold below the initial noise tolerance threshold, and repeating (a)-(d) with the reduced noise tolerance threshold.

14. A method according to any one of claims 6 to 13, wherein the summary value is the l-norm of the group or subgroup.

15. A method according to any one of the preceding claims, wherein the noise tolerance threshold is selected to provide a minimum number of noise-resistant bits.

16. A method according to any one of claims 3 to 15, wherein the at least one integrated circuit comprises at least one memory unit.

17. A method according to claim 16, wherein the at least a subset of the addressable cells is a predefined area of the memory unit.

18. A method according to claim 16 or claim 17, wherein the raw bit values are obtained from power-up states of the addressable units.

19. A method according to any one of the preceding claims, wherein the device fingerprint is generated on demand.

20. A device attestation method comprising: generating a device fingerprint according to any one of claims 1 to 19; and enrolling the device fingerprint at a server.

21. A device attestation method according to claim 20, wherein said enrolling comprises transmitting reliability data comprising a reliability of each bit of the device fingerprint to the server for storage at the server.

22. A device attestation method according to claim 21, comprising predicting the total number of possible error bits in the entire device fingerprint.

23. A device attestation method according to claim 21 or claim 22, comprising predicting possible locations of error bits in the device fingerprint.

24. A device attestation method according to claim 21, comprising: splitting the device fingerprint into a plurality of sub-strings; and predicting the number of possible error bits in each sub-string.

25. A device attestation method according to claim 24, comprising predicting possible locations of error bits in each sub-string.

26. A device attestation method according to any one of claims 21 to 25, comprising: receiving, at the server, a hash of a device fingerprint to be verified for a device; comparing a computed hash of the enrolled device fingerprint for the device to the received hash; if the computed hash matches the received hash, outputting a positive authentication result; if the computed hash does not match the received hash: determining one or more unreliable bits from the stored reliability data; modifying values of the one or more unreliable bits for the enrolled device fingerprint to generate a modified device fingerprint; recomputing the hash for the modified device fingerprint; and comparing the recomputed hash to the received hash.

27. A secure key derivation method comprising: generating a device fingerprint by a method according to any one of claims 1 to 19; and enrolling the device fingerprint as a reference fingerprint at a server.

28. A secure key derivation method according to claim 27, wherein said enrolling comprises transmitting reliability data comprising a reliability of each bit of the device fingerprint to the server for storage at the server.

29. A secure key derivation method according to claim 28, comprising predicting the total number of possible error bits in the entire device fingerprint.

30. A secure key derivation method according to claim 28 or claim 29, comprising predicting possible locations of error bits in the device fingerprint.

31. A secure key derivation method according to claim 28, comprising: splitting the device fingerprint into a plurality of sub-strings; and predicting the number of possible error bits in each sub-string.

32. A secure key derivation method according to claim 31, comprising predicting possible locations of error bits in each sub-string.

33. A secure key derivation method according to claim 30 or claim 32, comprising reverting the identified error bits to obtain a highly reliable secure key.

34. A device comprising at least one integrated circuit, the at least one integrated circuit comprising at least one processor configured to carry out a method according to any one of claims 1 to 25.