CA2129203C

CA2129203C - Public key cryptography utilizing elliptic curves

Info

Publication number: CA2129203C
Application number: CA 2129203
Authority: CA
Inventors: Gordon B. Agnew; Ronald C. Mullin; Scott A. Vanstone
Original assignee: Certicom Corp
Current assignee: Certicom Corp
Priority date: 1994-07-29
Filing date: 1994-07-29
Publication date: 2010-01-12
Anticipated expiration: 2014-07-29
Also published as: CA2129203A1

Abstract

An elliptic curve encryption system represents coordinates of a point on the curve as a vector of binary digits in a normal basis representation. A key is generated from multiple additions of one or more points in a finite field. Inverses of values are computed using a finite field multiplier and successive exponentiations.

Description


PUBLIC KEY CRYPTOGRAPHY UTILIZING ELLIPTIC CURVES
FIELD OF THE INVENTION

The present invention relates to public key cryptography.

The increasing use and sophistication of data transmission in such fields as telecommunications, networking, cellular communication, wireless communications, "smart card" applications, audio-visual and video communications has led to an increasing need for systems that permit data encryption, authentication and verification.

It is well known that data can be encrypted by utilizing a pair of keys, one of which is public and one of which is private. The keys are mathematically related such that data encrypted with the public key may only be decrypted with the private key and conversely, data encrypted with the private key can only be decrypted with the public key. In this way, the public key of a recipient may be made available so that data intended for that recipient may be encrypted with the public key and only decrypted by the recipient's private key, or conversely, encrypted data sent can be verified as authentic when decrypted with the sender's public key.

The most well known and accepted public key cryptosystems are those based on integer factorization and discrete logarithms in finite groups. In particular, the RSA system for modulus $n = p \cdot q$ where p and q are primes, the Diffie-Hellman key exchange and the ElGamal protocol in $Z_p$ (p a prime) have been implemented worldwide.

The RSA encryption scheme, where two primes p and q are multiplied to provide a modulus n, is based on the integer factorization problem. The public key e and private key d are related such that their product $e \cdot d$ equals 1 (mod $\phi$) where $\phi = (p-1)(q-1)$. A message M is encrypted by exponentiating it with the public key e to the modulus n [$C = M^e \pmod{n}$] and decrypted by exponentiating with the private key d mod n [$M = C^d \pmod{n}$]. This technique requires the transmission of the modulus n and the public key, and the security of the system is based on the difficulty of factoring a large number that has no relatively small factors. Accordingly both p and q must be relatively large primes.
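The relationship between e, d and the modulus can be illustrated with a small round trip. This is a minimal sketch only: the primes 61 and 53, the exponent 17 and the message 65 are toy values assumed for illustration and offer no security.

```python
# Toy RSA round trip; all numeric values are illustrative assumptions.
p, q = 61, 53
n = p * q                    # modulus n = p*q
phi = (p - 1) * (q - 1)      # phi = (p-1)(q-1)
e = 17                       # public key, chosen coprime to phi
d = pow(e, -1, phi)          # private key with e*d = 1 (mod phi); Python 3.8+

M = 65                       # message, 0 <= M < n
C = pow(M, e, n)             # encryption: C = M^e (mod n)
recovered = pow(C, d, n)     # decryption: M = C^d (mod n)
print(n, d, C, recovered)
```

Only n and e need to be published; recovering d from them requires phi, and hence the factors p and q.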

One disadvantage of this system is that p and q must be relatively large (at least 512 bits) to attain an adequate level of security. With the RSA protocol this results in a 1024 bit modulus and a 512 bit public key, which require significant bandwidth and storage capabilities. For this reason researchers have looked for public key schemes which reduce the size of the public key. Moreover, recent advances in analytical techniques and associated algorithms have rendered the RSA encryption scheme potentially vulnerable and accordingly raised concerns about the security of such schemes. This implies that larger primes, and therefore a larger modulus, need to be employed in order to maintain an acceptable level of security. This in turn increases the bandwidth and storage requirements for the implementation of such a scheme.

Since the introduction of the concept of public key cryptography by Diffie and Hellman in 1976, the potential for the use of the discrete logarithm problem in public key cryptosystems has been recognized. In 1985, ElGamal described an explicit methodology for using this problem to implement a fully functional public key cryptosystem, including digital signatures. This methodology has been refined and incorporated with various protocols to meet a variety of applications, and one of its extensions forms the basis for a proposed U.S. digital signature standard (DSS). Although the discrete logarithm problem, as first employed by Diffie and Hellman in their public key exchange algorithm, referred explicitly to the problem of finding logarithms with respect to a primitive element in the multiplicative group of the field of integers modulo a prime p, this idea can be extended to arbitrary groups (with the difficulty of the problem apparently varying with the representation of the group).

The discrete logarithm problem assumes that G is a finite group, and a and b are elements of G. Then the discrete logarithm problem for G is to determine a value x (when it exists) such that $a^x = b$. The value for x is called a logarithm of b to the base a, and is denoted by $\log_a b$.

The difficulty of determining this quantity depends on the representation of G. For example, if the abstract cyclic group of order m is represented in the form of the integers modulo m, then the solution to the discrete logarithm problem reduces to the extended Euclidean algorithm, which is relatively easy to solve. However, the problem is made much more difficult if m+1 is a prime, and the group is represented in the form of the multiplicative group of the finite field $F_{m+1}$. This is because the computations must be performed according to the special calculations required for operating in finite fields.
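The contrast between the two representations can be sketched as follows, assuming the toy order m = 100 (so that m+1 = 101 is prime) and hypothetical helper names. In the additive group the "logarithm" is obtained from a modular inverse, which is what the extended Euclidean algorithm computes; in the multiplicative group this sketch simply falls back on exhaustive search.

```python
from math import gcd

def dlog_additive(a, b, m):
    """Solve x*a = b (mod m) in the integers modulo m under addition."""
    if gcd(a, m) != 1:
        raise ValueError("this sketch only handles a generator a")
    return (b * pow(a, -1, m)) % m       # modular inverse (extended Euclid)

def dlog_multiplicative(a, b, p):
    """Find x with a**x = b (mod p) by trying every exponent in turn."""
    value = 1
    for x in range(p - 1):
        if value == b:
            return x
        value = (value * a) % p
    return None

m = 100                                   # cyclic group of order 100
x_add = dlog_additive(7, 35, m)           # 7 generates the integers mod 100
x_mul = dlog_multiplicative(2, 35, m + 1) # 2 generates the nonzero residues mod 101
print(x_add, x_mul)
```

Both groups are cyclic of order 100, yet the first computation takes a handful of arithmetic steps while the search in the second grows with the size of the group.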

It is also known that by using computations in a finite field whose members lie on an elliptic curve, that is, by defining a group structure G on the solutions of $y^2 + xy = x^3 + ax^2 + b$ over a finite field, the problem is again made much more difficult because of the attributes of elliptic curves. Therefore, it is possible to attain an increased level of security for a given size of key.

Alternatively a reduced key may be used to maintain a required degree of security.

The inherent security provided by the use of elliptic curves is derived from the characteristic that an addition of two points on the curve can be defined as a further point that itself lies on the curve. Likewise the result of the addition of a point to itself will result in another point on the curve. Therefore, by selecting a starting point on the curve and multiplying it by an integer, a new point is obtained that lies on the curve. This means that where P = (x, y) is a point on an elliptic curve over a finite field [$E(F_{q^n})$], with x and y each represented by a vector of n elements of $F_q$, then, for any other point $R \in \langle P \rangle$ (the subgroup generated by P), dP = R for some integer d. To attack such a scheme, the task is to determine an efficient method to find an integer d, $0 \le d \le (\text{order of } P) - 1$, such that dP = R. To break such a scheme, the best algorithms known to date have running times no better than $O(\sqrt{p})$, where p is the largest prime dividing the order of the curve (the number of points on the curve).

Thus, in a cryptographic system where the integer d remains secret, the difficulty of determining d can be exploited.

An ElGamal protocol of key exchange based on elliptic curves takes advantage of this characteristic in its definition of private and public keys. Such an ElGamal protocol operates as follows:

1. In order to set up the protocol, where a message is to be sent from A to B, an elliptic curve must be selected and a point P = (x, y), known as the generating point, must be selected.

Encryption

2. The receiver, B, then picks a random integer d as his private key. He then computes dP, which is another point on the curve, which becomes his public key that is made available to the sender and the public. Although the sender knows the combination dP, due to the characteristic of elliptic curves noted above, he has great difficulty determining the private key d.

3. The sender, A, chooses another random integer k, the session seed, and computes another point on the curve, kP, which serves as a session key. This also exploits the characteristic of elliptic curves mentioned above.

4. The sender, A, then retrieves the public key dP and computes kdP, another point on the curve, which serves as the encryption key.

5. The sender, A, then encrypts the message M with the encryption key to obtain the ciphertext C.

6. The sender then sends the session key kP and the ciphertext C to the receiver B.

Decryption

7. The receiver, B, determines the encryption key kdP by multiplying his private key d by kP.

8. The receiver, B, can then extract the message by decrypting the ciphertext C with the encryption key kdP.

During the entire exchange, the private key d and the session seed k remain secret.

Elliptic curve cryptosystems can thus be implemented employing public and private keys and using the ElGamal protocol.

The elliptic curve cryptography method has a number of benefits. First, each person can define their own elliptic curve for encryption and decryption, which gives rise to increased security. If the private key security is compromised, the elliptic curve can be easily redefined and new public and private keys can be generated to return to a secure system. In addition, to decrypt data encoded with the method, only the parameters for the elliptic curve and the session key need be transmitted. One of the drawbacks of other public key systems is the large bandwidth and storage requirements for the public keys. The implementation of a public key system using elliptic curves reduces the bandwidth and storage requirements of the public key system because the parameters can be stored in fewer bits. Until now, however, such a scheme was considered impractical due to the computational difficulties involved and the requirement for high speed calculations. The computation of kP, dP and kdP used in a key exchange protocol requires complex calculations due to the mathematics involved in adding points in elliptic curve fields.
Computations on an elliptic curve are performed according to a well known set of relationships. If K defines any field, then an equation of the form $y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$, where the $a_i$ lie in K, defines an elliptic curve over K. If E is the set of points on this curve, then an abelian group can be defined on the set $E \cup \{O\}$, where O is a special element not occurring in E. O acts as the zero element of the group. If P = (x, y), then -P = (x, -y) in the case of an odd characteristic, and for two points P and Q on the curve where $Q \ne P$, the sum P + Q is the negative of the third point on the curve where the line joining P and Q again meets the curve. If P = Q, then the tangent line is used. As in any abelian group, we use the notation nP to denote P added to itself n times if n is positive, and -P added to itself |n| times if n is negative, and 0P = O.

If $F_q$ is a finite field, then elliptic curves over $F_q$ can be divided into two classes, namely supersingular and non-supersingular curves. If $F_q$ is of characteristic 2, i.e. $q = 2^m$, then the classes are defined as follows.
The set of all solutions to the equation $y^2 + ay = x^3 + bx + c$ where $a, b, c \in F_q$, $a \ne 0$, together with a special point called the point at infinity O, is a supersingular curve over $F_q$.
The set of all solutions to the equation $y^2 + xy = x^3 + ax^2 + b$ where $a, b \in F_q$, $b \ne 0$, together with a special point called the point at infinity O, is a nonsupersingular curve over $F_q$. By defining an appropriate addition on these points, we obtain an additive abelian group.

The addition of two points $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ for the supersingular elliptic curve E with $y^2 + ay = x^3 + bx + c$ is given by the following.
If $P = (x_1, y_1) \in E$, then define $-P = (x_1, y_1 + a)$, and $P + O = O + P = P$ for all $P \in E$.

If $Q = (x_2, y_2) \in E$ and $Q \ne -P$, then the point representing the sum P + Q is denoted $(x_3, y_3)$, where

$$x_3 = \left(\frac{y_1 + y_2}{x_1 + x_2}\right)^2 + x_1 + x_2 \quad (P \ne Q) \qquad \text{or} \qquad x_3 = \frac{x_1^4 + b^2}{a^2} \quad (P = Q)$$

and

$$y_3 = \left(\frac{y_1 + y_2}{x_1 + x_2}\right)(x_1 + x_3) + y_1 + a \quad (P \ne Q) \qquad \text{or} \qquad y_3 = \left(\frac{x_1^2 + b}{a}\right)(x_1 + x_3) + y_1 + a \quad (P = Q)$$

The addition of two points $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ for the nonsupersingular elliptic curve $y^2 + xy = x^3 + ax^2 + b$ is given by the following.

If $P = (x_1, y_1) \in E$, then define $-P = (x_1, y_1 + x_1)$. For all $P \in E$, $O + P = P + O = P$. If $Q = (x_2, y_2) \in E$ and $Q \ne -P$, then P + Q is a point $(x_3, y_3)$, where

$$x_3 = \left(\frac{y_1 + y_2}{x_1 + x_2}\right)^2 + \frac{y_1 + y_2}{x_1 + x_2} + x_1 + x_2 + a \quad (P \ne Q) \qquad \text{or} \qquad x_3 = x_1^2 + \frac{b}{x_1^2} \quad (P = Q)$$

and

$$y_3 = \left(\frac{y_1 + y_2}{x_1 + x_2}\right)(x_1 + x_3) + x_3 + y_1 \quad (P \ne Q) \qquad \text{or} \qquad y_3 = x_1^2 + \left(x_1 + \frac{y_1}{x_1}\right)x_3 + x_3 \quad (P = Q)$$

Accordingly it can be seen that computing the sum of two points on E requires several multiplications, additions, and inverses in the underlying field $F_q$. In turn, each of these operations requires a sequence of elementary bit operations.

When implementing an ElGamal or Diffie-Hellman scheme with elliptic curves, one is required to compute $kP = P + P + \dots + P$ (P added k times) where k is a positive integer and $P \in E$. This requires the computation of $(x_3, y_3)$ k-1 times. For large values of k, this has previously been considered impractical for data communication.
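The nonsupersingular addition rules can be exercised on a very small curve. The following is a sketch under stated assumptions: the field $F_{2^4}$ with the polynomial $x^4 + x + 1$, a polynomial basis encoding (bit i of an integer is the coefficient of $x^i$), the coefficients `A` and `B`, and all function names are illustrative choices, not values taken from this description.

```python
M = 4                      # toy field F_{2^4}; an assumption for illustration
POLY = 0b10011             # x^4 + x + 1, irreducible over F(2)
A, B = 0b1000, 0b0001      # hypothetical curve coefficients a and b (b != 0)
O = None                   # the point at infinity

def gf_mul(u, v):
    """Multiply in F_{2^M}; bit i of an integer is the coefficient of x^i."""
    r = 0
    while v:
        if v & 1:
            r ^= u
        v >>= 1
        u <<= 1
        if u >> M:         # degree reached M, so reduce modulo POLY
            u ^= POLY
    return r

def gf_inv(u):
    """Inverse of a nonzero element, computed as u^(2^M - 2)."""
    r = 1
    for _ in range(2**M - 2):
        r = gf_mul(r, u)
    return r

def on_curve(P):
    if P is O:
        return True
    x, y = P
    x2 = gf_mul(x, x)
    lhs = gf_mul(y, y) ^ gf_mul(x, y)
    rhs = gf_mul(x2, x) ^ gf_mul(A, x2) ^ B
    return lhs == rhs

def neg(P):
    return O if P is O else (P[0], P[0] ^ P[1])     # -P = (x1, y1 + x1)

def add(P, Q):
    if P is O:
        return Q
    if Q is O:
        return P
    if Q == neg(P):
        return O
    (x1, y1), (x2, y2) = P, Q
    if P != Q:
        lam = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2))
        x3 = gf_mul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
        y3 = gf_mul(lam, x1 ^ x3) ^ x3 ^ y1
    else:
        x1sq = gf_mul(x1, x1)
        x3 = x1sq ^ gf_mul(B, gf_inv(x1sq))
        y3 = x1sq ^ gf_mul(x1 ^ gf_mul(y1, gf_inv(x1)), x3) ^ x3
    return (x3, y3)

def scalar_mult_naive(k, P):
    """kP as P added k times, i.e. k-1 additions of two curve points."""
    R = O
    for _ in range(k):
        R = add(R, P)
    return R

points = [(x, y) for x in range(2**M) for y in range(2**M) if on_curve((x, y))]
print(len(points) + 1, "points including O")
print(scalar_mult_naive(5, points[-1]))
```

Every call to `add` costs one field inversion plus a few multiplications, which is why the cost of the underlying field arithmetic dominates the computation of kP.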

It is an object of the present invention to provide a method of encryption utilizing elliptic curves that facilitates the computation of additions of points while providing an adequate level of security in an efficient and effective manner.

The applicants have developed a method using a modified version of the Diffie-Hellman and ElGamal protocols defined in the group associated with the points on an elliptic curve over a finite field. The method involves formulating the elliptic curve calculations so as to make elliptic curve cryptography efficient, practical and viable, and employs the use of a finite field processor such as the Computational Method and Apparatus for Finite Field Multiplication as disclosed in U.S. Patent 4,745,568. The method exploits the strengths of such a processor with its computational abilities in finite fields. The inventive method structures the elliptic curve calculations as normal basis finite field multiplication and exponentiation over the field $F_{2^m}$, where $a, b \in F_{2^m}$, which can readily be calculated on a finite field processor.

The inventors have recognized that the computations necessary to implement the elliptic curve calculations can be performed efficiently where such computations are expressed in a normal basis representation over a finite field. The inventors have further recognized that the elliptic curve calculations are further simplified where a finite field of characteristic 2 is chosen.

With the computations presented in this form, the applicants have realized that specialized semiconductor devices can be fabricated to perform the calculations.
With the calculations presented in such a form, additions can be efficiently performed in one clock cycle utilizing a simple XOR operation. Multiplications can be performed very efficiently in only n clock cycles where n is the number of bits being multiplied. Furthermore, squaring can be efficiently performed in 1 clock cycle as a cyclic shift of the bit register. Finally, inverses can easily be computed, requiring approximately $\log_2 n$ multiplications rather than the approximately 2n multiplications required in other arithmetic systems.
The inventors have also recognized that the bandwidth and storage requirements of a cryptographic system utilizing elliptic curves can be significantly reduced where for any point P = (x, y) on the curve, only the x coordinate and one bit of the y coordinate need be stored and transmitted, since the one bit will indicate which of the two possible solutions is the second coordinate.
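One way to see why a single bit suffices on the nonsupersingular curve is to substitute y = xz, which turns the curve equation into $z^2 + z = x + a + b/x^2$; its two solutions are z and z + 1. The sketch below is an illustration only. It assumes a toy field $F_{2^4}$ in a polynomial basis (where z and z + 1 differ in their lowest bit), hypothetical coefficients `A` and `B`, and an exhaustive search in place of a real quadratic solver. The choice of which bit to keep is this example's assumption, not a rule stated in this description.

```python
M, POLY = 4, 0b10011           # toy field F_{2^4} with x^4 + x + 1 (assumed)
A, B = 0b1000, 0b0001          # hypothetical curve coefficients a, b

def gf_mul(u, v):
    """Multiply in F_{2^M}; bit i of an integer is the coefficient of x^i."""
    r = 0
    while v:
        if v & 1:
            r ^= u
        v >>= 1
        u <<= 1
        if u >> M:
            u ^= POLY
    return r

def gf_inv(u):
    """Inverse of a nonzero element as u^(2^M - 2)."""
    r = 1
    for _ in range(2**M - 2):
        r = gf_mul(r, u)
    return r

def on_curve(x, y):
    x2 = gf_mul(x, x)
    return (gf_mul(y, y) ^ gf_mul(x, y)) == (gf_mul(x2, x) ^ gf_mul(A, x2) ^ B)

def compress(P):
    """Keep x and one bit: the lowest bit of z = y/x (requires x != 0)."""
    x, y = P
    return x, gf_mul(y, gf_inv(x)) & 1

def decompress(x, bit):
    """Recover y from x and the stored bit by solving z^2 + z = x + a + b/x^2."""
    if x == 0:
        raise ValueError("x = 0 has a single y and is not handled here")
    rhs = x ^ A ^ gf_mul(B, gf_inv(gf_mul(x, x)))
    for z in range(2**M):                      # exhaustive search, toy field only
        if (gf_mul(z, z) ^ z) == rhs and (z & 1) == bit:
            return x, gf_mul(x, z)
    raise ValueError("x is not the first coordinate of a curve point")

points = [(x, y) for x in range(1, 2**M) for y in range(2**M) if on_curve(x, y)]
for P in points[:4]:
    print(P, compress(P), decompress(*compress(P)))
```

With 155 bit coordinates, the same idea would transmit 156 bits per point instead of 310.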

The inventors have also recognized when using the ElGamal protocol that messages need not be points on the curve if the protocol is modified such that the encrypted message $C = (c_1, c_2)$ is determined by transforming kdP with M to obtain a ciphertext C. The receiver can then extract the message $M = (m_1, m_2)$ by applying the inverse transformation. Although this requires an inverse operation, it may be performed efficiently with the processor noted above.

To assist in the appreciation of the implementation of the present invention, it is believed that a review of the underlying principles of finite field operations is appropriate. The finite field F(2) is the number system in which the only elements are the binary numbers 0 and 1 and in which the rules of addition and multiplication are the following:

0 + 0 = 1 + 1 = 0
0 + 1 = 1 + 0 = 1
0 x 0 = 1 x 0 = 0 x 1 = 0
1 x 1 = 1

These rules are commonly called modulo-2 arithmetic. All additions specified in logic expressions or by adders in this application are performed modulo-2 as an XOR operation. Furthermore, multiplication is implemented with logical AND gates.
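As a quick check of these rules, the following sketch prints the two tables using the bitwise XOR and AND operators; the function names are illustrative only.

```python
def f2_add(a, b):
    """Addition in F(2): modulo-2 sum, i.e. XOR."""
    return a ^ b

def f2_mul(a, b):
    """Multiplication in F(2): logical AND."""
    return a & b

for a in (0, 1):
    for b in (0, 1):
        print(f"{a} + {b} = {f2_add(a, b)}    {a} x {b} = {f2_mul(a, b)}")
```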

The finite field $F_{2^m}$, where m is an integer greater than 1, is the number system in which there are $2^m$ elements and in which the rules of addition and multiplication correspond to arithmetic modulo an irreducible polynomial of degree m with coefficients in F(2). Although in an abstract sense there is for each m only one field $F_{2^m}$, the complexity of the logic circuitry required to perform operations in $F_{2^m}$ depends strongly on the particular way in which the field elements are represented.

The conventional approach to the design of logic circuitry to perform operations in $F_{2^m}$ is described in such papers as T. Bartee and D. Schneider, "Computation with Finite Fields", Information and Control, Vol. 6, pp. 79-98, 1963. In this conventional approach, one first chooses a polynomial P(X) of degree m which is irreducible over F(2), that is, P(X) has binary coefficients but cannot be factored into a product of polynomials with binary coefficients each of whose degree is less than m. An element A in $F_{2^m}$ is then defined to be a root of P(X), that is, to satisfy P(A) = 0. The fact that P(X) is irreducible guarantees that the m elements $A^0 = 1, A, A^2, \dots, A^{m-1}$ of $F_{2^m}$ are linearly independent over F(2).

For the purposes of illustration, the example of $F_{2^3}$ will be used with the choice of $P(X) = X^3 + X + 1$ for the irreducible polynomial of degree 3. The next step is to define A as an element of $F_{2^3}$ such that $A^3 + A + 1 = 0$. The following assignment of unit vectors is then made:
$A^0 = 1 = [1, 0, 0]$

$A^1 = [0, 1, 0]$
$A^2 = [0, 0, 1]$

An arbitrary element B of $F_{2^3}$ is now represented by the binary vector $[b_2, b_1, b_0]$ with the meaning that $B = [b_2, b_1, b_0] = b_2A^2 + b_1A + b_0$.

If we represent a second element $C = [c_2, c_1, c_0]$, it follows that $B + C = [b_2 \oplus c_2, b_1 \oplus c_1, b_0 \oplus c_0]$.

Thus, in the conventional approach, addition in $F_{2^3}$ is easily performed by logic circuitry that merely forms the modulo-2 sum of the two vectors representing the elements to be summed component-by-component.
Multiplication is, however, considerably more complex to implement.

Continuing the example, from the irreducible polynomial it can be seen that $A^3 = A + 1$ and $A^4 = A^2 + A$, where use has been made of the fact that -1 = +1 in F(2).
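The two reductions above can be reproduced with a short polynomial basis multiplier. This is a sketch with an assumed encoding: an element is held as an integer whose bit i is the coefficient of $A^i$, and the function name is illustrative.

```python
POLY = 0b1011                       # P(X) = X^3 + X + 1

def gf8_mul(u, v):
    """Multiply two elements of F_{2^3}; bit i is the coefficient of A^i."""
    r = 0
    for i in range(3):
        if (v >> i) & 1:
            r ^= u << i             # modulo-2 sum of shifted partial products
    for deg in (4, 3):              # remove A^4 and A^3 using P(A) = 0
        if (r >> deg) & 1:
            r ^= POLY << (deg - 3)
    return r

A = 0b010
A2 = gf8_mul(A, A)
A3 = gf8_mul(A2, A)
A4 = gf8_mul(A3, A)
print(format(A3, "03b"))            # 011, i.e. A + 1
print(format(A4, "03b"))            # 110, i.e. A^2 + A
```

Addition needs only the XOR of the two integers, whereas multiplication requires both the partial products and the reduction step, which is the asymmetry the text describes.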
Multiplication can be simplified by taking advantage of the special feature of a finite field $F_{2^m}$ that there always exists a so-called normal basis for the finite field. That is, one can always find a field element N such that $N, N^2, N^4, \dots, N^{2^{m-1}}$ are a basis for $F_{2^m}$. Every field element B can be uniquely written as $B = b_{m-1}N^{2^{m-1}} + \dots + b_2N^4 + b_1N^2 + b_0N = [b_{m-1}, \dots, b_2, b_1, b_0]$, where $b_0, b_1, b_2, \dots, b_{m-1}$ are binary digits.

For example, in the finite field $F_{2^3}$, if we let N = [1, 1, 0], the field elements have the following normal basis representations:

Field element    Normal basis representation    Normal basis vector
[0, 0, 0]        -                              [0, 0, 0]
[1, 0, 0]        $N + N^2 + N^4$                [1, 1, 1]
[0, 1, 0]        $N^2 + N^4$                    [0, 1, 1]
[0, 0, 1]        $N + N^4$                      [1, 0, 1]
[1, 1, 0]        $N$                            [1, 0, 0]
[1, 0, 1]        $N^2$                          [0, 1, 0]
[0, 1, 1]        $N + N^2$                      [1, 1, 0]
[1, 1, 1]        $N^4$                          [0, 0, 1]

Moreover, in a normal basis representation, squaring in $F_{2^m}$ is a linear operation in the sense that for every pair of elements B and C in $F_{2^m}$, $(B + C)^2 = B^2 + C^2$, and it is the case for every element B of $F_{2^m}$ that $B^{2^m} = B$.
Then, if $B = [b_{m-1}, \dots, b_2, b_1, b_0]$ and $C = [c_{m-1}, \dots, c_2, c_1, c_0]$ are any two elements of $F_{2^m}$ in normal basis representation, then the product $D = B \times C = [d_{m-1}, \dots, d_2, d_1, d_0]$ has the property that the same logic circuitry which, when applied to the components or binary digits of the vectors representing B and C, produces $d_{m-1}$ will sequentially produce the remaining components $d_{m-2}, \dots, d_2, d_1, d_0$ of the product when applied to the components of the successive shifts of the vectors representing B and C.
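The normal basis of the $F_{2^3}$ example can be checked numerically. In this sketch an element is an integer whose bit i is the coefficient of $A^i$, so N = 1 + A is `0b011`; coordinate tuples list the coefficients of $N$, $N^2$, $N^4$ in that order. Both encodings and all names are assumptions of the example.

```python
from itertools import product

POLY = 0b1011                        # P(X) = X^3 + X + 1

def gf8_mul(u, v):
    """Multiply in F_{2^3}; bit i of an integer is the coefficient of A^i."""
    r = 0
    for i in range(3):
        if (v >> i) & 1:
            r ^= u << i
    for deg in (4, 3):
        if (r >> deg) & 1:
            r ^= POLY << (deg - 3)
    return r

N1 = 0b011                           # N = 1 + A
N2 = gf8_mul(N1, N1)                 # N^2
N4 = gf8_mul(N2, N2)                 # N^4
BASIS = (N1, N2, N4)

def from_normal(coords):
    """Element whose coordinates on (N, N^2, N^4) are the given bits."""
    e = 0
    for c, n in zip(coords, BASIS):
        if c:
            e ^= n
    return e

# Map every field element to its normal basis coordinates.
to_normal = {from_normal(c): c for c in product((0, 1), repeat=3)}

def cyclic_shift(coords):
    """Move each coordinate one place along, wrapping the last to the front."""
    return (coords[2], coords[0], coords[1])

for e in range(8):
    squared = gf8_mul(e, e)
    print(to_normal[e], "squared ->", to_normal[squared])
```

Because $N^8 = N$, squaring sends the coefficient of $N$ to $N^2$, of $N^2$ to $N^4$, and of $N^4$ back to $N$, which is exactly the one-place rotation produced by `cyclic_shift`.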

As illustrated in U.S. Patent 4,745,568 for Computational Method and Apparatus for Finite Field Multiplication, multiplication may be implemented by storing bit vectors B and C in respective shift registers and establishing connections to respective accumulating cells such that a grouped term of each of the expressions $d_i$ is generated in respective ones of m accumulating cells.
By rotating the bit vectors B and C in the shift registers and by rotating the contents of the accumulating cells, each grouped term of a respective binary digit $d_i$ is accumulated in successive cells. Thus all of the binary digits of the product vector are generated simultaneously in the accumulating cells after one complete rotation of the bit vectors B and C.

As noted above, the inventors have taken advantage of the efficiency of the mathematical operations in finite fields in the implementation of an elliptic curve encryption scheme. The applicants have developed a method of formulating the elliptic curve calculations so as to make elliptic curve cryptography efficient, practical and viable. The preferred method employs the use of a finite field processor such as the Computational Method and Apparatus for Finite Field Multiplication as disclosed in U.S. Patent 4,745,568. The method couples the attractive cryptographic characteristics of elliptic curves with the strengths of the field processor through its computational abilities in finite fields. The inventive method structures the elliptic curve calculations as finite field multiplication and exponentiation over the field $F_{2^m}$, where $a, b \in F_{2^m}$, which can readily be calculated on a finite field processor.

An embodiment of the invention will now be described by way of example only with reference to the accompanying drawings in which:

Figure 1 is a diagram of the transmission of an encrypted message from one location to another,

Figure 2 is a diagram of an encryption module used with the communication system of Figure 1,

Figure 3 is a diagram of a finite field processor used in the encryption and decryption module of Figure 2.

Figure 4 is a flow chart showing movement of the elements through the processor of Figure 3 in computing an inverse function.

Figure 5 is a flow chart showing movement of elements through the processor of Figure 3 to compute the addition of two points.

An embodiment of the invention will first be described utilising an ElGamal key exchange protocol and a Galois field $F_{2^{155}}$ to explain the underlying principles.
Further refinements will then be described.

System Components

Referring therefore to Figure 1, a message M is to be transferred from a transmitter 10 to a receiver 12 through a communication channel 14. Each of the transmitter 10 and receiver 12 has an encryption/decryption module 16 associated therewith to implement a key exchange protocol and an encryption/decryption algorithm.

The module 16 is shown schematically in Figure 2 and includes an arithmetic unit 20 to perform the computations in the key exchange and generation. A
private key register 22 contains a private key, d, generated as a 155 bit data string from a random number generator 24, and used to generate a public key stored in a public key register 26. A base point register 28 contains the coordinates of a base point P that lies in the elliptic curve selected with each coordinate (x, y), represented as a 155 bit data string. Each of the data strings is a vector of binary digits with each digit being the coefficient of an element of the finite field in the normal basis representation of the coordinate.

The elliptic curve selected will have the general form $y^2 + xy = x^3 + ax^2 + b$ and the parameters of that curve, namely the coefficients a and b, are stored in a parameter register 30. The contents of registers 22, 24, 26, 28, 30 may be transferred to the arithmetic unit 20 under control of a C.P.U. 32 as required.

The contents of the public key register 26 are also available to the communication channel 14 upon a suitable request being received. In the simplest implementation, each encryption module 16 in a common security zone will operate with the same curve and base point so that the contents of registers 28 and 30 need not be accessible.
If further sophistication is required, however, each module 16 may select its own curve and base point in which case the contents of registers 28, 30 have to be accessible to the channel 14.

The module 16 also contains an integer register 34 that receives an integer k, the session seed, from the generator 24 for use in encryption and key exchange. The module 16 has a random access memory (RAM) 36 that is used as a temporary store as required during computations.

The encryption of the message M with an encryption key kdP derived from the public key dP and session seed integer k is performed in an encryption unit 40 which implements a selected encryption algorithm. A simple yet effective algorithm is provided by an XOR function which XOR's the message with the encryption key. Alternative implementations such as the DES encryption algorithm could of course be used.

Key generation, exchange and encryption

In order for the transmitter 10 to send the message M to the receiver 12, the receiver's public key is retrieved by the transmitter 10. The public key is obtained by the receiver 12 computing the product of the secret key d and base point P in the arithmetic unit 20 as will be described more fully below. The product dP represents a point on the selected curve and serves as the public key. The public key dP is stored as two 155 bit data strings in the public key register 26.

Upon retrieval of the public key dP by the transmitter 10, it is stored in the RAM 36. It will be appreciated that even though the base point P is known and publicly available, the attributes of the elliptic curve inhibit the extraction of the secret key d.

The transmitter 10 uses the arithmetic unit 20 to compute the product of the session seed k and the public key dP and stores the result, kdP, in the RAM 36 for use in the encryption algorithm. The result kdP is a further point on the selected curve, again represented by two 155 bit data strings or vectors, and serves as an encryption key.

The transmitter 10 also computes the product of the session seed k with the base point P to provide a new point kP, the session key, which is stored in the RAM 36.

The transmitter 10 has now the public key dP, a session key kP and an encryption key kdP and may use these to send an encrypted message. The transmitter 10 encrypts the message M with the encryption key kdP in the encryption unit 40 by XOR'ing the message M with the encryption key kdP to provide an encrypted message C.

The ciphertext C is transmitted together with the value kP to the encryption module 16 associated with receiver 12.

The receiver 12 utilises the session key kP with its private key d to compute the encryption key kdP in the arithmetic unit 20 and then decrypts the ciphertext C in the encryption unit 40 to retrieve the message M.

During this exchange, the secret key d and the session seed k remain secure. Although P, kP and dP are known, the encryption key kdP cannot be computed due to the difficulty in obtaining either d or k.
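The exchange just described can be traced end to end on a toy curve. The sketch below is illustrative only: it assumes a 4 bit field in a polynomial basis (the embodiment uses 155 bit normal basis strings), hypothetical coefficients `A` and `B`, a base point picked by exhaustive search, products computed by repeated addition, and a message modelled as two 4 bit strings XOR'd with the two coordinates of kdP.

```python
import random

M, POLY = 4, 0b10011          # toy field F_{2^4}; the embodiment uses 155 bits
A, B = 0b1000, 0b0001         # hypothetical curve coefficients a, b
O = None                      # point at infinity

def gf_mul(u, v):
    r = 0
    while v:
        if v & 1:
            r ^= u
        v >>= 1
        u <<= 1
        if u >> M:
            u ^= POLY
    return r

def gf_inv(u):
    r = 1
    for _ in range(2**M - 2):
        r = gf_mul(r, u)
    return r

def on_curve(x, y):
    x2 = gf_mul(x, x)
    return (gf_mul(y, y) ^ gf_mul(x, y)) == (gf_mul(x2, x) ^ gf_mul(A, x2) ^ B)

def add(P, Q):
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y2 == (x1 ^ y1):          # Q = -P
        return O
    if P != Q:
        lam = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2))
        x3 = gf_mul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
        return (x3, gf_mul(lam, x1 ^ x3) ^ x3 ^ y1)
    x1sq = gf_mul(x1, x1)
    x3 = x1sq ^ gf_mul(B, gf_inv(x1sq))
    return (x3, x1sq ^ gf_mul(x1 ^ gf_mul(y1, gf_inv(x1)), x3) ^ x3)

def mult(k, P):
    """kP by repeated addition; adequate only for this toy group."""
    R = O
    for _ in range(k):
        R = add(R, P)
    return R

def order(P):
    n, R = 1, P
    while R is not O:
        R, n = add(R, P), n + 1
    return n

def xor_point(data, key):
    return (data[0] ^ key[0], data[1] ^ key[1])

points = [(x, y) for x in range(1, 2**M) for y in range(2**M) if on_curve(x, y)]
P = max(points, key=order)    # base point P with the largest order
n = order(P)

# Receiver 12: private key d and public key dP.
d = random.randrange(2, n)
dP = mult(d, P)

# Transmitter 10: session seed k, encryption key kdP, session key kP.
while True:
    k = random.randrange(2, n)
    kdP = mult(k, dP)
    if kdP is not O:          # retry if the toy key degenerates to O
        break
kP = mult(k, P)
message = (0b1010, 0b0110)
cipher = xor_point(message, kdP)

# Receiver 12: rebuild kdP from kP and d, then undo the XOR.
recovered = xor_point(cipher, mult(d, kP))
print(recovered == message)
```

Only P, dP, kP and the ciphertext cross the channel; d never leaves the receiver and k never leaves the transmitter.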

The efficacy of the encryption depends upon the efficient computation of the values kP, dP and kdP by the arithmetic unit 20. Each computation requires the repetitive addition of two points on the curve which in turn requires the computation of an inverse of one of the values.

Operation of the Arithmetic Unit

The operation of the arithmetic unit 20 is shown schematically in Figure 3. The unit 20 includes a multiplier 48 having a pair of cyclic shift registers 42, 44 and an accumulating register 46. Each of the registers 42, 44, 46 contains m cells 50a, 50b ... 50m, in this example 155, to receive the m elements of a normal basis representation of one of the coordinates, e.g. x, of P. As fully explained in U.S. Patent No. 4,745,568, the cells 50 of registers 42, 44 are connected to the corresponding cells 50 of accumulating register 46 in such a way that a respective grouped term is generated in each cell of register 46. The registers 42, 44, 46 are also directly interconnected in a bit wise fashion to allow fast transfers of data between the registers.

The movement of data through the registers is controlled by a control register 52 that can execute the instruction set shown in the table below:

INSTRUCTION SET

Operation                          Size                          Clock Cycles
Field Multiplication               155 bit blocks                156
  MULT
Calculation of Inverse             24 multiplications            approx.
  INVERSE
I/O                                5-32 bit transfers per        10
  WRITE (A, B or C)                read/write to registers       (2 clock cycles per transfer)
  READ (A, B or C)
Elementary Register                155 bit parallel operation
  NOP
  Rotate (A, B or C)
  Copy (A-B) (A-C) (A'-B) (B'-C)
  SWAP (A-B)
  CLEAR (A, B or C)
  SET (A, B or C)
  ADD (A⊕B)
  ACCUMULATE

The unit 20 includes an adder 54 to receive data from the registers 42,44,46 and RAM 36. The adder 54 is an XOR function and its output is a data stream that may be stored in RAM 36 or one of the registers 42, 44.
Although shown as a serial device, it will be appreciated that it may be implemented as a parallel device to improve computing time. Similarly the registers 42,44,46 may be parallel loaded. Each of the registers 42,44,46, is a 155 bit register and is addressed by a 32 bit data bus to allow 32 bit data transfer in 2 clock cycles and the entire loading in 5 operations.

The subroutines used in the computation will now be described.

a) Multiplication

The cyclic shift of the elements through the registers 42, 44 m times with a corresponding shift of the accumulating register 46 accumulates successive grouped terms in respective accumulating cells, and a complete rotation of the elements in the registers 42, 44 produces the elements of the product in the accumulating register 46.

b) Squaring

The multiplier 48 may also provide the square of a number by cyclically shifting the elements by one cell along the register 42. After a one cell shift, the elements in the register represent the square of the number. In general, a number may be raised to the power $2^g$ by cyclically shifting g times through a register.

c) Inverse

Computation of the inverse of a number can be performed efficiently with the multiplier 48 by implementing an algorithm in which $X^{-1}$ is represented as $X^{2^m-2}$ or $X^{2(2^{m-1}-1)}$.

If m-1 is considered as the product of two factors g, h then $X^{-1}$ may be written as $X^{2(2^{gh}-1)}$ or $\beta^{(2^{gh}-1)}$ where $\beta = X^2$.

The exponent $2^{gh}-1$ is equivalent to

$$(2^g - 1)\sum_{i=0}^{h-1} 2^{gi}$$

The term $2^g - 1$ may be written as

$$\sum_{j=0}^{g-1} 2^j$$

so that

$$X^{-1} = \left(\beta^{\sum_{j=0}^{g-1} 2^j}\right)^{\sum_{i=0}^{h-1} 2^{gi}}$$

The term $\beta^{1+2+2^2+2^3+\dots+2^{g-1}}$ is denoted $\gamma$. This term may be computed on multiplier 48 as shown in Figure 4 by initially loading register 42 with the value X. This is shifted 1 cell to represent $\beta$ (i.e. $X^2$) and the result loaded into both registers 42, 44.

Register 44 is then shifted to provide $\beta^2$ and the registers 42, 44 multiplied to provide $\beta^{2+1}$ in the accumulating register 46. The multiplication is obtained with one motion, i.e. an m bit cyclic shift, of each of the registers 42, 44, 46.

The accumulated term $\beta^{1+2}$ is transferred to register 44 and register 42, which contains $\beta^2$, is shifted one place to provide $\beta^4$. The registers 42, 44 are multiplied to provide $\beta^{1+2+4}$. This procedure is repeated g-2 times to obtain $\gamma$.
As will be described below, $\gamma$ can be exponentiated in a similar manner to obtain $\gamma^{\sum_{i=0}^{h-1} 2^{gi}}$, i.e. $X^{-1}$.
This term can be expressed as $\gamma^{1+2^g+2^{2g}+2^{3g}+\dots+2^{(h-1)g}}$. As noted above, $\gamma$ can be exponentiated to the $2^g$ by shifting the normal basis representation g times in the register 42 or 44.

Accordingly, the registers 42, 44 are each loaded with the value $\gamma$ and the register 42 shifted g times to provide $\gamma^{2^g}$. The registers 42, 44 are multiplied to provide $\gamma \cdot \gamma^{2^g}$ or $\gamma^{1+2^g}$ in the accumulating register 46. This value is transferred to the register 44 and the register 42 shifted g times to provide $\gamma^{2^{2g}}$.

The multiplication will then provide $\gamma^{1+2^g+2^{2g}}$. Repetition of this procedure (h-1)g-1 times produces the inverse of X in the accumulating register 46.
From the above it will be seen that squaring, multiplying, and inverting can be effectively performed utilising the finite field multiplier 48.
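The two-stage exponentiation used for the inverse can be sketched in software. The example below assumes the small field GF(2^7) defined by x^7 + x + 1, so that m - 1 = 6 = g·h with g = 2 and h = 3. The field, the factorisation and the function names are illustrative assumptions. Squarings are performed here by ordinary multiplication in a polynomial basis; in the normal basis hardware described above each squaring would instead be a one-cell cyclic shift.

```python
M = 7
POLY = 0b10000011  # x^7 + x + 1 (assumed field polynomial)
G, H = 2, 3        # assumed factorisation m - 1 = g * h


def gf_mul(a, b):
    """Multiply two elements of GF(2^M) held as polynomial-basis bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r


def gf_pow2k(a, k):
    """Raise a to the power 2^k by k squarings (k shifts in a normal basis)."""
    for _ in range(k):
        a = gf_mul(a, a)
    return a


def gf_inverse(x):
    """Return x^-1 as gamma^(1 + 2^g + ... + 2^((h-1)g)), gamma = beta^(2^g - 1)."""
    beta = gf_mul(x, x)                    # beta = x^2

    # Stage 1: gamma = beta^(1 + 2 + ... + 2^(g-1))
    gamma, term = beta, beta
    for _ in range(G - 1):
        term = gf_mul(term, term)          # next power beta^(2^j)
        gamma = gf_mul(gamma, term)

    # Stage 2: inverse = gamma^(1 + 2^g + ... + 2^((h-1)g))
    inverse, term = gamma, gamma
    for _ in range(H - 1):
        term = gf_pow2k(term, G)           # next power gamma^(2^(ig))
        inverse = gf_mul(inverse, term)
    return inverse
```

In this sketch the only general multiplications are the g - 1 products of the first stage and the h - 1 products of the second; every other step is a squaring.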

Addition of point P to itself (P + P) using the subroutines

To compute the value of dP for generation of the public key, the arithmetic unit 20 associated with the receiver 12 initially computes the addition of P + P. As noted in the introduction, for a nonsupersingular curve the new point Q has coordinates $(X_3, Y_3)$ where

$$X_3 = X_1^2 \oplus \frac{b}{X_1^2}$$

$$Y_3 = X_1^2 \oplus \left(X_1 \oplus \frac{Y_1}{X_1}\right) X_3 \oplus X_3$$

To compute $X_3$, the following steps may be implemented as shown in Figure 5.

The m bits representing $X_1$ are loaded into register 42 from base point register 28 and shifted one cell to the right to provide $X_1^2$. This value is stored in RAM 36 and the inverse of $X_1^2$ computed as described above.

The value of $X_1^{-2}$ is loaded into register 44 and the parameter b is extracted from the parameter register 30 and loaded into register 42. The product $bX_1^{-2}$ is computed in the accumulating register 46 by rotating the bit vectors, and the resultant value is XOR'd in adder 52 with the value of $X_1^2$ stored in RAM 36 to provide the normal basis representation of $X_3$. The result may be stored in RAM 36.

A similar procedure can be followed to generate $Y_3$ by first inverting $X_1$, multiplying the result by $Y_1$ and XORing with $X_1$ in the adder 52. This is then multiplied by $X_3$ stored in RAM 36 and the result XOR'd with the values of $X_3$ and $X_1^2$ to produce $Y_3$.

The resultant value of $(X_3, Y_3)$ represents the sum of P + P and is a new point Q on the curve. This could then be added to P to produce a new point Q'. This process could be repeated d-2 times to generate dP.

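The addition of P + Q requires the computation of $(X_3, Y_3)$ where

$$X_3 = \left(\frac{Y_1 \oplus Y_2}{X_1 \oplus X_2}\right)^2 \oplus \frac{Y_1 \oplus Y_2}{X_1 \oplus X_2} \oplus X_1 \oplus X_2 \oplus a$$

and

$$Y_3 = \left(\frac{Y_1 \oplus Y_2}{X_1 \oplus X_2}\right)(X_1 \oplus X_3) \oplus X_3 \oplus Y_1.$$

This would be repeated d-2 times with a new value for Q at each iteration to compute dP.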
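The doubling and addition formulas for a nonsupersingular curve $y^2 + xy = x^3 + ax^2 + b$ can be exercised on a toy example. The sketch below assumes the field GF(2^7) defined by x^7 + x + 1 and the curve parameters a = b = 1; these values, the polynomial basis arithmetic and every function name are illustrative assumptions rather than the arrangement of the arithmetic unit 20. The identity is represented by `None`.

```python
M = 7
POLY = 0b10000011          # x^7 + x + 1 (assumed field polynomial)
CURVE_A, CURVE_B = 1, 1    # assumed curve parameters, b != 0


def gf_mul(a, b):
    """Multiply two elements of GF(2^M) held as polynomial-basis bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r


def gf_inv(x):
    """Return x^(2^M - 2), the inverse of a nonzero x."""
    result, power = 1, gf_mul(x, x)
    for _ in range(M - 1):
        result = gf_mul(result, power)
        power = gf_mul(power, power)
    return result


def on_curve(p):
    """Check y^2 + xy = x^3 + a x^2 + b (the identity counts as on the curve)."""
    if p is None:
        return True
    x, y = p
    lhs = gf_mul(y, y) ^ gf_mul(x, y)
    rhs = gf_mul(gf_mul(x, x), x ^ CURVE_A) ^ CURVE_B
    return lhs == rhs


def double(p):
    """P + P: X3 = X1^2 + b/X1^2, Y3 = X1^2 + (X1 + Y1/X1) X3 + X3."""
    if p is None or p[0] == 0:
        return None                        # a point with x = 0 has order 2
    x1, y1 = p
    x1_sq = gf_mul(x1, x1)
    x3 = x1_sq ^ gf_mul(CURVE_B, gf_inv(x1_sq))
    y3 = x1_sq ^ gf_mul(x1 ^ gf_mul(y1, gf_inv(x1)), x3) ^ x3
    return (x3, y3)


def add(p, q):
    """P + Q for arbitrary points, falling back to double() when P = Q."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return double(p) if y1 == y2 else None
    lam = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2))
    x3 = gf_mul(lam, lam) ^ lam ^ x1 ^ x2 ^ CURVE_A
    y3 = gf_mul(lam, x1 ^ x3) ^ x3 ^ y1
    return (x3, y3)


def find_point():
    """Brute-force search for a point with x != 0 (feasible only for toy fields)."""
    for x in range(1, 1 << M):
        for y in range(1 << M):
            if on_curve((x, y)):
                return (x, y)
    return None
```

A simple consistency check is that 4P computed as (2P + P) + P agrees with 4P computed by doubling 2P, and that every intermediate result satisfies the curve equation.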

Whilst in principle this is possible with the arithmetic unit 20, in practice the large numbers used make such a procedure infeasible. A more elegant approach is available using the binary representation of the integer d.

Computation of dP from 2P

To avoid adding dissimilar points P and Q, the binary representation of k is used with a doubling method to reduce the number of additions and the complexity of the additions.

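The integer d can be expressed as

$$d = \sum_{i=0}^{t} \lambda_i 2^i, \quad \lambda_i \in \{0, 1\}$$

and

$$dP = \sum_{i=0}^{t} \lambda_i (2^i P),$$

i.e. $\lambda_t 2^t P + \lambda_{t-1} 2^{t-1} P + \dots + \lambda_3 2^3 P + \lambda_2 2^2 P + \lambda_1 2P + \lambda_0 P$.

The values of $\lambda_i$ are the binary representation of d.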

Having computed 2P, the value obtained may be added to itself, as described above at Figure 5, to obtain $2^2P$, which in turn can be added to itself to provide $2^3P$ etc.

This is repeated until $2^tP$ is obtained.

At each iteration, the value of $2^iP$ is retained in RAM 36 for use in subsequent additions to obtain dP. The arithmetic unit 20 performs a further set of additions for dissimilar points for those terms where $\lambda_i$ is 1 to provide the resultant value of the point $(x_3, y_3)$ representing dP.

If, for example, k = 5, this can be computed as $2^2P + P$ or 2P + 2P + P or Q + Q + P. Therefore the result can be obtained in 3 additions: 2P = Q takes 1 addition, 2P + 2P = Q + Q = R takes 1 and R + P takes 1 addition. At most t doublings and t subsequent additions are required, depending on how many $\lambda_i$ are 1.
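The doubling schedule can be sketched independently of the curve arithmetic. In the example below the group operations are passed in as the hypothetical callables `add` and `double`; to keep the sketch self-contained, the usage note substitutes ordinary integer addition as a stand-in group, which is an assumption made purely for illustration and not an elliptic curve computation.

```python
def scalar_multiply(d, point, add, double):
    """Compute dP from the binary digits of d.

    Returns (dP, number of doublings, number of additions of dissimilar points).
    """
    if d < 1:
        raise ValueError("d must be a positive integer")
    result = None          # running sum of the selected terms 2^i P
    power = point          # current term 2^i P, starting at 2^0 P = P
    doublings = additions = 0
    for i in range(d.bit_length()):
        if i > 0:
            power = double(power)
            doublings += 1
        if (d >> i) & 1:   # binary digit lambda_i of d
            if result is None:
                result = power
            else:
                result = add(result, power)
                additions += 1
    return result, doublings, additions


def integer_demo(d, p):
    """Stand-in group: integers under addition, so that dP is simply d * p."""
    return scalar_multiply(d, p, lambda a, b: a + b, lambda a: 2 * a)
```

With the stand-in group, `integer_demo(5, 7)` returns `(35, 2, 1)`: two doublings and one addition of dissimilar terms, i.e. the three operations counted in the example for the multiplier 5.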

Performance of Arithmetic Units

For computations in a Galois field $GF(2^{155})$ it has been found that computing the inverse takes approximately 3800 clock cycles.

The doubling of a point, i.e. the addition of a point to itself, takes in the order of 4500 clock cycles and, for a practical implementation of a private key, the public key dP may be computed in the order of $1.5 \times 10^5$ clock cycles. With a clock rate typically in the order of 40 MHz, the computation of dP will take in the order of $3 \times 10^{-2}$ seconds. This throughput can be enhanced by bounding the seed key k with a Hamming weight of, for example, 20 and thereby limiting the number of additions of dissimilar points.
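One way to obtain a seed key of bounded Hamming weight is to choose the positions of the one bits at random. The sketch below assumes a 155 bit seed with exactly 20 one bits; the function name and the use of Python's `secrets` module are illustrative choices and are not described in the text.

```python
import secrets


def seed_with_hamming_weight(bits=155, weight=20):
    """Return a random integer below 2^bits with exactly `weight` one bits."""
    if not 0 <= weight <= bits:
        raise ValueError("weight must lie between 0 and bits")
    # Draw `weight` distinct bit positions from an OS-backed random source.
    positions = secrets.SystemRandom().sample(range(bits), weight)
    k = 0
    for pos in positions:
        k |= 1 << pos
    return k
```

A seed produced this way needs at most `weight - 1` additions of dissimilar points in the binary method, while the number of doublings is still governed by the position of the highest one bit.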

Computation of session key kP and encryption key kdP

The session key kP can similarly be computed with the arithmetic unit 20 of transmitter 10 using the base point P from register 28. Likewise, because the public key dP is represented as a point $(x_3, y_3)$, the encryption key kdP can be computed in similar fashion.

Each of these operations will take a similar time and can be completed prior to the transmission.

The recipient 12 is similarly required to compute dkP as he receives the ciphertext C, which again will take in the order of $3 \times 10^{-2}$ seconds, well within the time expected for a practical implementation of an encryption unit.

The public key dP and the session key kP are each represented as a 310 bit data string and as such require a significantly reduced bandwidth for transmission. At the same time, the attributes of elliptic curves provide a secure encryption strategy with a practical implementation due to the efficacy of the arithmetic unit 20.

Curve selection

a) The selection of the field

The above example has utilised the field $F_{2^{155}}$ and a non-supersingular curve. The value 155 was chosen in part because an optimal normal basis exists in $F_{2^{155}}$ over $F_2$. However, a main consideration is the security and efficiency of the encryption system. The value 155 is large enough to be secure but small enough for efficient operation. A consideration of conventional attacks that might be used to break the ciphertext suggests that with elliptic curves over $F_{2^m}$, a value of m of about 130 provides a very secure system. Using one thousand devices in parallel, the time taken to find one logarithm is about $1.5 \times 10^{11}$ seconds or at least 1500 years using the best known method and the field $F_{2^{155}}$. Other techniques produce longer run times.

b) Supersingular v. Nonsupersingular Curves

A comparison of other techniques for attacking encryption suggests that non-supersingular curves are more robust than supersingular curves. For a field $F_{q^k}$, an attack based on the method suggested by Menezes, Okamoto and Vanstone in an article entitled "Reducing elliptic curve logarithms to logarithms in a finite field", published in the Proceedings of the 22nd Annual ACM Symposium on Theory of Computing, 1991, pp. 80-89 (the MOV attack), shows that for small values of k the attack becomes subexponential. Most supersingular curves have small values of k associated with them. In general, however, non-supersingular curves have large values of k, and provided $k > \log^2 q$ the MOV attack becomes less efficient than more conventional general attacks.
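The value of k relevant to the MOV attack is the smallest positive integer for which the order n of the group in which logarithms are taken divides $q^k - 1$. A minimal sketch of that search is given below; the function name, the search limit and the two sample orders used in the usage note are illustrative assumptions.

```python
def embedding_degree(q, n, limit=64):
    """Return the smallest k >= 1 with n dividing q^k - 1, or None up to limit."""
    power = 1
    for k in range(1, limit + 1):
        power = (power * q) % n    # power == q^k mod n
        if power == 1:
            return k
    return None
```

For example, with $q = 2^{155}$ and an assumed group order n = q + 1, the call `embedding_degree(2**155, 2**155 + 1)` returns 2, and with the assumed order n = q + 1 + 2^78 it returns 4. Small values of this kind are what make the reduction to a finite field logarithm attractive to an attacker.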

The use of a supersingular curve is attractive since the doubling of a point (i.e. the case where P = Q) does not require any inversions in the underlying field. For a supersingular curve, the coordinates of 2P are

$$x_3 = \frac{x_1^4 \oplus b^2}{a^2}$$

and

$$y_3 = \left(\frac{x_1^2 \oplus b}{a}\right)(x_1 \oplus x_3) \oplus y_1 \oplus a.$$

Since a is a constant, $a^{-1}$ and $a^{-2}$ are fixed for a given curve and can be precomputed. The values of $x_1^2$ and $x_1^4$ can be computed with a single and double cyclic shift respectively on the multiplier 48. However, the subsequent addition of dissimilar points to provide the value of dP still requires the computation of an inverse as

$$x_3 = \left(\frac{y_1 \oplus y_2}{x_1 \oplus x_2}\right)^2 \oplus x_1 \oplus x_2$$

and

$$y_3 = \left(\frac{y_1 \oplus y_2}{x_1 \oplus x_2}\right)(x_1 \oplus x_3) \oplus y_1 \oplus a.$$

Accordingly, although supersingular curves lead to very efficient implementations, there is a relatively small set of supersingular curves from which to choose, particularly if the encryption is to be robust. For a supersingular curve where m is odd, there are 3 classes of curve that can be considered further, namely

$y^2 + y = x^3$

$y^2 + y = x^3 + x$

$y^2 + y = x^3 + x + 1$

However, a consideration of these curves for the case where m = 155 shows that none provides the necessary robustness from attack.

Enhanced security for supersingular curves can be obtained by employing quadratic extensions of the underlying field. In fact, in $F_q$ where $q = 2^{310}$, i.e. a quadratic extension of $F_{2^{155}}$, amongst the supersingular curves there are four which under the MOV attack require computation of discrete logs in $F_{2^{930}}$. These curves provide the requisite high security and also exhibit a high throughput. Similarly, in other extensions of subfields of $F_{2^{155}}$ (e.g. $F_{2^{31}}$) other curves exist that exhibit the requisite robustness. However, their use increases the digits that define a point and hence the bandwidth when they are transmitted.

By contrast, the number of nonsupersingular curves over $F_q$, $q = 2^{155}$, is $2(2^{155} - 1)$. This large choice of curves permits large numbers of curves over this field to be found for which the order of a curve is divisible by a large prime factor. In general, determining the order of an arbitrary nonsupersingular curve over $F_q$ is not trivial and one approach is explained further in a paper entitled "Counting Points on Elliptic Curves" by Menezes, Vanstone and Zuccherato, Mathematics of Computation 1992.

In general, however, the selection of suitable curves is well known in the art, as exemplified in "Applications of Finite Fields", chapters 7 and 8, by Menezes, Blake et al, Kluwer Academic Publishers (ISBN 0-7923-9282-5). Because of the large numbers of such curves that meet the requirements, the use of nonsupersingular curves is preferred despite the added computations.

An alternative approach that reduces the number of inversions when using nonsupersingular curves is to employ homogeneous coordinates. A point P is defined by the coordinates $(x_1, y_1, z_1)$ and Q by the coordinates $(x_2, y_2, z_2)$. The point (0, 1, 0) represents the identity O in E.

To derive the addition formulas for the elliptic curve with this representation, we take points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$, normalize each to $(x_1/z_1, y_1/z_1, 1)$, $(x_2/z_2, y_2/z_2, 1)$, and apply the previous addition formulas. If $P = (x_1, y_1, z_1)$, $Q = (x_2, y_2, z_2)$, $P, Q \neq O$, and $P \neq -Q$ then $P + Q = (x_3, y_3, z_3)$ where, if $P \neq Q$, then

$x_3 = AD$

$y_3 = CD + A^2 (Bx_1 + Ay_1) z_2$

$z_3 = A^3 z_1 z_2$

where $A = x_2 z_1 + x_1 z_2$, $B = y_2 z_1 + y_1 z_2$, $C = A + B$ and $D = A^2 (A + a z_1 z_2) + z_1 z_2 BC$.

In the case of P = Q, then

$x_3 = AB$

$y_3 = x_1^4 A + B(x_1^2 + y_1 z_1 + A)$

$z_3 = A^3$

where $A = x_1 z_1$ and $B = b z_1^4 + x_1^4$.

It will be noted that the computation of $x_3$, $y_3$ and $z_3$ does not require any inversion. However, to derive the coordinates of the point in a nonhomogeneous representation, it is necessary to normalize the representation so that the coordinates become $x_3/z_3$ and $y_3/z_3$. This operation requires an inversion that utilizes the procedure noted above. However, only one inversion operation is required for the computation of dP.

When computing dP using this version of the double and add method, the computation of P + Q, $P \neq Q$, requires 13 field multiplications, and 2P requires 7 multiplications.
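The homogeneous coordinate formulas for a nonsupersingular curve can be sketched and checked on a toy example. The code below assumes the field GF(2^7) defined by x^7 + x + 1 and curve parameters a = b = 1; these values, the polynomial basis arithmetic and all function names are illustrative assumptions. The addition uses $y_3 = CD + A^2(Bx_1 + Ay_1)z_2$, which is the form obtained by normalizing both points and applying the affine addition formulas. No attempt is made to reach the multiplication counts quoted above.

```python
M = 7
POLY = 0b10000011          # x^7 + x + 1 (assumed field polynomial)
CURVE_A, CURVE_B = 1, 1    # assumed curve y^2 + xy = x^3 + a x^2 + b


def gf_mul(a, b):
    """Multiply two elements of GF(2^M) held as polynomial-basis bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r


def gf_inv(x):
    """Return x^(2^M - 2), the inverse of a nonzero x."""
    result, power = 1, gf_mul(x, x)
    for _ in range(M - 1):
        result = gf_mul(result, power)
        power = gf_mul(power, power)
    return result


def proj_add(p, q):
    """P + Q for P != Q and P != -Q, without any field inversion."""
    (x1, y1, z1), (x2, y2, z2) = p, q
    a = gf_mul(x2, z1) ^ gf_mul(x1, z2)
    b = gf_mul(y2, z1) ^ gf_mul(y1, z2)
    c = a ^ b
    z1z2 = gf_mul(z1, z2)
    a_sq = gf_mul(a, a)
    d = gf_mul(a_sq, a ^ gf_mul(CURVE_A, z1z2)) ^ gf_mul(z1z2, gf_mul(b, c))
    x3 = gf_mul(a, d)
    y3 = gf_mul(c, d) ^ gf_mul(gf_mul(a_sq, gf_mul(b, x1) ^ gf_mul(a, y1)), z2)
    z3 = gf_mul(gf_mul(a_sq, a), z1z2)
    return (x3, y3, z3)


def proj_double(p):
    """2P without any field inversion."""
    x1, y1, z1 = p
    a = gf_mul(x1, z1)
    x1_sq = gf_mul(x1, x1)
    x1_4 = gf_mul(x1_sq, x1_sq)
    z1_sq = gf_mul(z1, z1)
    b = gf_mul(CURVE_B, gf_mul(z1_sq, z1_sq)) ^ x1_4
    x3 = gf_mul(a, b)
    y3 = gf_mul(x1_4, a) ^ gf_mul(b, x1_sq ^ gf_mul(y1, z1) ^ a)
    z3 = gf_mul(gf_mul(a, a), a)
    return (x3, y3, z3)


def normalize(p):
    """Return the affine point (x/z, y/z); this is the single inversion needed."""
    x, y, z = p
    z_inv = gf_inv(z)
    return (gf_mul(x, z_inv), gf_mul(y, z_inv))


def on_curve(x, y):
    """Check the affine equation y^2 + xy = x^3 + a x^2 + b."""
    lhs = gf_mul(y, y) ^ gf_mul(x, y)
    rhs = gf_mul(gf_mul(x, x), x ^ CURVE_A) ^ CURVE_B
    return lhs == rhs


def find_point():
    """Brute-force search for an affine point with x != 0 (toy fields only)."""
    for x in range(1, 1 << M):
        for y in range(1 << M):
            if on_curve(x, y):
                return (x, y)
    return None
```

As a usage example, a point found by `find_point()` can be lifted to `(x, y, 1)`, doubled and added in homogeneous form, and normalized once at the end; the result of (2P + P) + P then agrees with that of doubling 2P.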

Claims

1. In a data encryption system in which the data is combined with an encryption key to produce ciphertext, a method of generating a key comprising the steps of:

a) selecting an elliptic curve of the form $y^2 + xy = x^3 + ax^2 + b$ lying in the finite field $GF(2^m)$, said field being selected to have elements $A^{2^i}$ ($0 \le i \le m$) that constitute a normal basis;

b) representing the coordinates of a point on said curve as a set of vectors, each vector representing a coordinate of said point and having m binary digits, each of which represents the coefficient of $A^{2^i}$ in the normal basis representation of said vector;

c) computing from addition of at least two sets of vectors an additional set of vectors to represent the coordinates of further point on said curve; and

d) utilising said additional set of vectors to derive a key for encrypting data.

2. A method according to claim 1 wherein addition of sets of vectors involves at least one squaring operation.

3. A method according to claim 2 wherein said squaring operation is performed on at least one of said vectors of one of said sets representing a point.

4. A method according to claim 3 wherein said squaring operation is performed on combinations of vectors from a plurality of said sets representing respective points.

5. A method according to claim 3 wherein each of said vectors is represented as m binary digits and squaring thereof is performed by a cyclic shift of said m binary digits.

6. A method according to claim 5 wherein said m binary digits are stored in respective cells of a shift register and squaring thereof is performed by a cyclic shift of said m bits in said register.

7. A method according to claim 1 wherein addition of sets of vectors involves the computation of at least one inverse of a vector.

8. A method according to claim 7 wherein said inversion utilises multiple squaring operations.

9. A method according to claim 8 wherein squaring operations are performed by a cyclic shift of binary digits.

10. A method according to claim 7 wherein computation of said inverse includes an exponentiation of the square of the vector to provide a value $\gamma$ of the form:

$\gamma = \beta^{1+2+2^2+\dots+2^{g-1}}$;

where $\beta$ is the square of the vector and g is a factor of m-1.

11. A method according to claim 10 wherein successive terms of said exponentiation are obtained by successive cyclic shifts of the vector.

12. A method according to claim 11 wherein the value $\gamma$ is accumulated after each cyclic shift by multiplication of the shifted term with the previously accumulated value of $\gamma$.

13. A method according to claim 10 wherein m binary digits representing $\beta$ are stored in each of a pair of shift registers, one of said pair of registers being cyclically shifted and said pair of registers being multiplied to provide an intermediate value of $\gamma$.

14. A method according to claim 13 wherein said one of said pair of registers is further cyclically shifted to provide a further successive term of said expansion and said further successive term multiplied with said intermediate value to provide a further intermediate value of $\gamma$.

15. A method according to claim 14 wherein said cyclic shifting and multiplication is performed g-2 times to complete said exponentiation of $\beta$ and provide a value of $\gamma$.

16. A method according to claim 10 where computation of said inverse includes a further exponentiation of $\gamma$ of the form $\gamma^{1+2^g+2^{2g}+\dots+2^{(h-1)g}}$ where h is a factor of m-1 such that gh = m - 1.

17. A method according to claim 16 wherein successive terms of said further exponentiation are obtained by successive cyclic shifts of the m binary digits representing $\gamma$.

18. A method according to claim 17 wherein the value of said inverse is accumulated after each cyclic shift by multiplication of the shifted term with the previously accumulated value of $\gamma$.

19. A method according to claim 16 wherein m binary digits representing $\gamma$ are stored in each of a pair of shift registers, one of said pair of registers being cyclically shifted and said pair of registers being multiplied together to provide an intermediate value of said inverse.

20. A method according to claim 19 wherein said one of said pair of registers is further cyclically shifted to provide a further successive term of said expansion which is then multiplied with said intermediate value of said inverse to provide a further intermediate value thereof.

21. A method according to claim 20 wherein said cyclic shifting and multiplication is performed (h-1)g-1 times to complete exponentiation of $\gamma$.

22. A method according to claim 11 wherein said further point on said curve is an integer multiple d of said point P and said value dP is computed by successively doubling multiples of P to provide terms $2^i P$ from i = 0 to i = m, and computing:

$dP = \sum_{i=0}^{m} \lambda_i 2^i P$

where $\lambda_i$ is the coefficient of the binary representation of d.

23. A method according to claim 22 wherein doubling of multiples of P is obtained by computing:

$X_3 = X_1^2 \oplus \frac{b}{X_1^2}$, $Y_3 = X_1^2 \oplus \left(X_1 \oplus \frac{Y_1}{X_1}\right) X_3 \oplus X_3$

where $X_1$, $Y_1$ are the coordinates of the point $2^{i-1}P$ and $X_3$, $Y_3$ are the coordinates of the point $2^i P$.

24. A method according to claim 23 wherein computation of the term $X_1^2$ is obtained by a cyclic shift of binary digits representing $X_1$ in a normal basis.

25. A method according to claim 24 wherein computation of the inverse of $X_1^2$ is computed by an exponentiation of $X_1^2$ to provide a value $\gamma$ of the form:

$\beta^{1+2+2^2+\dots+2^{g-1}}$;

where $\beta = X_1^2$; and g is a factor of m-1.

26. A method according to claim 25 wherein successive terms of said exponentiation are obtained by successive cyclic shifts of the binary digits representing $X_1^2$ in a normal basis.

27. A method according to claim 26 wherein computation of the inverse of $X_1^2$ includes a further exponentiation of $\gamma$ of the form:

$\gamma^{1+2^g+2^{2g}+\dots+2^{(h-1)g}}$;

where h is a factor of m-1 such that gh = m - 1.

28. A method according to claim 27 wherein successive terms of said further exponentiation are obtained by successive cyclic shifts of the m binary digits representing $\gamma$.

29. A computer readable medium comprising computer executable instructions that when executed, cause a cryptographic processor to perform the method according to any one of claims 1 to 28.

30. A cryptographic system for generating a key, said system comprising data to be combined with an encryption key to produce ciphertext, said system comprising an encryption/decryption module configured for:

a) selecting an elliptic curve of the form $y^2 + xy = x^3 + ax^2 + b$ lying in the finite field $GF(2^m)$, said field being selected to have elements $A^{2^i}$ ($0 \le i \le m$) that constitute a normal basis;

b) representing the coordinates of a point on said curve as a set of vectors, each vector representing a coordinate of said point and having m binary digits, each of which represents the coefficient of $A^{2^i}$ in the normal basis representation of said vector;

c) computing from addition of at least two sets of vectors an additional set of vectors to represent the coordinates of further point on said curve; and

d) utilising said additional set of vectors to derive a key for encrypting data.

31. The system according to claim 30 wherein addition of sets of vectors involves at least one squaring operation.

32. The system according to claim 31 wherein said squaring operation is performed on at least one of said vectors of one of said sets representing a point.

33. The system according to claim 32 wherein said squaring operation is performed on combinations of vectors from a plurality of said sets representing respective points.

34. The system according to claim 32 wherein each of said vectors is represented as m binary digits and squaring thereof is performed by a cyclic shift of said m binary digits.

35. The system according to claim 34 further comprising a shift register, wherein said m binary digits are stored in respective cells of said shift register and squaring thereof is performed by a cyclic shift of said m bits in said register.

36. The system according to claim 30 wherein addition of sets of vectors involves the computation of at least one inverse of a vector.

37. The system according to claim 36 wherein said inversion utilises multiple squaring operations.

38. The system according to claim 37 wherein squaring operations are performed by a cyclic shift of binary digits.

39. The system according to claim 36 wherein computation of said inverse includes an exponentiation of the square of the vector to provide a value $\gamma$ of the form:

$\gamma = \beta^{1+2+2^2+\dots+2^{g-1}}$;

where $\beta$ is the square of the vector and g is a factor of m-1.

40. The system according to claim 39 wherein successive terms of said exponentiation are obtained by successive cyclic shifts of the vector.

41. The system according to claim 40 wherein the value $\gamma$ is accumulated after each cyclic shift by multiplication of the shifted term with the previously accumulated value of $\gamma$.

42. The system according to claim 39 further comprising a pair of shift registers, wherein m binary digits representing $\beta$ are stored in each of said pair of shift registers, one of said pair of registers being cyclically shifted and said pair of registers being multiplied to provide an intermediate value of $\gamma$.

43. The system according to claim 42 wherein said one of said pair of registers is further cyclically shifted to provide a further successive term of said expansion and said further successive term multiplied with said intermediate value to provide a further intermediate value of $\gamma$.

44. The system according to claim 43 wherein said cyclic shifting and multiplication is performed g-2 times to complete said exponentiation of $\beta$ and provide a value of $\gamma$.

45. The system according to claim 39 where computation of said inverse includes a further exponentiation of $\gamma$ of the form $\gamma^{1+2^g+2^{2g}+\dots+2^{(h-1)g}}$ where h is a factor of m-1 such that gh = m - 1.

46. The system according to claim 45 wherein successive terms of said further exponentiation are obtained by successive cyclic shifts of the m binary digits representing $\gamma$.

47. The system according to claim 46 wherein the value of said inverse is accumulated after each cyclic shift by multiplication of the shifted term with the previously accumulated value of $\gamma$.

48. The system according to claim 45 further comprising a pair of shift registers, wherein m binary digits representing $\gamma$ are stored in each of said pair of shift registers, one of said pair of registers being cyclically shifted and said pair of registers being multiplied together to provide an intermediate value of said inverse.

49. The system according to claim 48 wherein said one of said pair of registers is further cyclically shifted to provide a further successive term of said expansion which is then multiplied with said intermediate value of said inverse to provide a further intermediate value thereof.

50. The system according to claim 49 wherein said cyclic shifting and multiplication is performed (h-1)g-1 times to complete exponentiation of $\gamma$.

51. The system according to claim 40 wherein said further point on said curve is an integer multiple d of said point P and said value dP is computed by successively doubling multiples of P to provide terms $2^i P$ from i = 0 to i = m, and computing:

$dP = \sum_{i=0}^{m} \lambda_i 2^i P$

where $\lambda_i$ is the coefficient of the binary representation of d.

52. A method according to claim 51 wherein doubling of multiples of P is obtained by computing:

$X_3 = X_1^2 \oplus \frac{b}{X_1^2}$, $Y_3 = X_1^2 \oplus \left(X_1 \oplus \frac{Y_1}{X_1}\right) X_3 \oplus X_3$

where $X_1$, $Y_1$ are the coordinates of the point $2^{i-1}P$ and $X_3$, $Y_3$ are the coordinates of the point $2^i P$.

53. The system according to claim 52 wherein computation of the term $X_1^2$ is obtained by a cyclic shift of binary digits representing $X_1$ in a normal basis.

54. The system according to claim 53 wherein computation of the inverse of $X_1^2$ is computed by an exponentiation of $X_1^2$ to provide a value $\gamma$ of the form:

$\beta^{1+2+2^2+\dots+2^{g-1}}$;

where $\beta = X_1^2$ and g is a factor of m-1.

55. The system according to claim 54 wherein successive terms of said exponentiation are obtained by successive cyclic shifts of the binary digits representing $X_1^2$ in a normal basis.

56. The system according to claim 55 wherein computation of the inverse of $X_1^2$ includes a further exponentiation of $\gamma$ of the form:

$\gamma^{1+2^g+2^{2g}+\dots+2^{(h-1)g}}$;

where h is a factor of m-1 such that gh = m - 1.

57. The system according to claim 56 wherein successive terms of said further exponentiation are obtained by successive cyclic shifts of the m binary digits representing $\gamma$.