Multiplication in a finite field
The invention relates to a method of multiplying two elements in a finite field. The invention further relates to an apparatus for multiplying two elements in a finite field.
Arithmetic in finite fields is the basis of many algebraic error correcting codes and cryptographic protocols. Many applications require that the finite field arithmetic is performed on streaming data, e.g. data received via a communication system or read from a storage medium in a streaming manner (typically at a regular rate at which the processed data is subsequently rendered). To support high bit-rate data streams, the finite field arithmetic is usually performed by dedicated or optimized hardware, such as a processor with an extended instruction set. For low-cost systems it is desired to be able to use conventional hardware, such as a standard microcontroller or digital signal processor (DSP). In particular, in many transmission systems, such as mobile communication systems, error correcting codes (e.g. Reed-Solomon codes) are used that rely on arithmetic in the Galois field $GF(2^8)$ with $q = 256$. For such high-volume consumer products cost is a major factor.
The finite field $F_q$ with $q = 2^m$ elements and with $\alpha$ as a generator consists of the elements $F_q = \{0, 1, \alpha, \alpha^2, \ldots, \alpha^{q-2}\}$. The elements of the field can be represented in several ways. In the exponent representation of $F_q$ the non-zero field element $x = \alpha^e$ is represented as the integer exponent $e \in \{0, \ldots, q-2\}$. In addition to an exponent representation, also an index representation of $F_q$ can be used. In such a representation, the elements of $F_q$ are ordered in an arbitrary but fixed way as $F_q = \{x_0, x_1, \ldots, x_{q-1}\}$. The field element $x_j$ is represented by the integer index $j \in \{0, \ldots, q-1\}$. The integer index of an element $x_j$ of the field is indicated as $j = i(x_j)$.
Addition can be performed relatively easily on most processors, for example by choosing an index representation defined by a basis representation of $F_{2^m}$ over $F_2$ such that addition in $F_{2^m}$ can be computed by taking the bitwise exclusive-or of the indices. Multiplication is then typically performed by using logarithm and exponentiation, using respective tables with the outcome of those steps. Mathematically, it is well-known that multiplication can be expressed in terms of exponentiation and logarithm as:
$$x \cdot y = \begin{cases} 0, & \text{if } x = 0 \text{ or } y = 0, \\ \alpha^{(\log_\alpha(x) + \log_\alpha(y)) \bmod (q-1)}, & \text{otherwise.} \end{cases}$$
Computationally, the input to the processing is i(x) and i(y) and it is required to output i(x·y). In principle such a calculation can be performed using a 2-dimensional table, indexed by i(x) and i(y) and providing i(x·y). However, this requires a table with $q^2$ entries, which is too large for many applications. In the exemplary Galois field $GF(2^8)$, storing the entire table would require 65,536 entries, i.e. about 65 KB of memory. Therefore, multiplication is usually computed using a table L for logarithm and a table E for exponentiation, each requiring q-1 entries. Hence, the total tabulated data is 2q-2 entries, thus for the example of q=256 requiring 510 bytes of memory. In pseudo-code, the calculation is then as follows:
    if i(x) = i(0) or i(y) = i(0) then
        return i(0);
    else
        e := L[i(x)] + L[i(y)];
        if e >= q-1 then e := e - (q-1); fi;
        return E[e];
    fi.
Both 'if' statements deal with situations in which a possible value of 'e' would fall outside the range of table E. The first 'if' statement deals with multiplications including at least one zero element. The second 'if' statement performs the modulo q-1 reduction that is part of the logarithm-based multiplication.
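As a rough illustration of this conventional scheme, the pseudo-code can be sketched in Python for a small field. The field $GF(2^4)$ with generator polynomial $x^4 + x + 1$, the bit-vector index representation with i(0) = 0, and the names `build_tables` and `mul_conventional` are assumptions made for this sketch only. For simplicity the sketch allocates q entries for L and leaves the entry for the zero element unused.

```python
Q = 16          # field size q = 2^4 (illustrative choice)
POLY = 0b10011  # x^4 + x + 1 (assumed generator polynomial)

def build_tables():
    """Return (L, E): logarithm table and conventional exponent table."""
    E = []
    L = [None] * Q        # L[0] stays unset: zero is tested before any lookup
    v = 1                 # index of alpha^0
    for e in range(Q - 1):
        E.append(v)       # E[e] = i(alpha^e)
        L[v] = e          # L[i(alpha^e)] = e
        v <<= 1           # multiply by alpha
        if v & Q:         # degree 4 reached: reduce modulo POLY
            v ^= POLY
    return L, E

L, E = build_tables()

def mul_conventional(ix, iy):
    """Multiply two field elements given by their indices."""
    if ix == 0 or iy == 0:     # first 'if': zero factor
        return 0
    e = L[ix] + L[iy]
    if e >= Q - 1:             # second 'if': reduction modulo q-1
        e -= Q - 1
    return E[e]
```

Both conditional statements are executed for every multiplication, which is the cost the invention aims to remove.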
It is an object of the invention to provide a faster way of multiplication in a finite field, in particular for execution in software on a processor using conventional processor instructions.
To meet the object of the invention, a method of performing a multiplication in a finite field $F_q = \{0, 1, \alpha, \alpha^2, \ldots, \alpha^{q-2}\}$, where $\alpha$ is a generator of the field, $q = p^m$, $p$ is a prime and $m \geq 1$; the field in an index representation being $F_q = \{x_0, x_1, \ldots, x_{q-1}\}$ where each element $x_j$ of the field is represented by a respective index $j$ indicated as $j = i(x_j)$; includes multiplying non-zero elements $x, y \in F_q$ by: obtaining L[i(x)] and L[i(y)], where L is a logarithm table including at least q entries with $L[j] = \log_\alpha(x_j)$, for $j = 0, \ldots, q-1$; calculating L[i(x)] + L[i(y)]; and providing as a multiplication outcome an index retrieved from a table entry E[L[i(x)] + L[i(y)]], where E is an exponent table including a first part with respective entries $E[j] = i(\alpha^j)$ for $j = 0, \ldots, q-2$ and a second part immediately subsequent to the first part with respective entries $E[j+q-1] = i(\alpha^j)$ for $j = 0, \ldots, q-3$.
The first part of table E is the conventional exponentiation table. By having a second part as defined in the claim, the conditional modulo q-1 reduction is no longer required. In the exemplary pseudo-code, the second 'if' statement becomes redundant. On most conventional processors, conditional branching requires at least one CPU cycle. By eliminating the branch, the processing is faster, at a limited increase in storage requirements.
According to the measure of the dependent claim 2, the exponentiation table also includes a third and a fourth part, respectively dealing with the cases where one or both factors are the zero element of the field. This eliminates the need for testing for and dealing with such factors. In the exemplary pseudo-code, the first 'if' statement becomes redundant.
According to the measure of the dependent claim 3, by choosing Z slightly larger than minimally required, table L can be incorporated in an unused part of table E, reducing the storage requirements.
According to the measure of the dependent claim 4, by minimally increasing the second and third part of table E, the tables can also be used for performing a quick division operation. According to the measure of the dependent claim 5, by choosing Z again slightly larger than minimally required, table L can be incorporated in an unused part of table E, reducing the storage requirements.
According to the measure of the dependent claim 6, an alternative way for division is to use an additional inversion table. This division method is faster than the method described in claim 5, at the expense of higher storage requirements.
According to the measure of the dependent claim 7, a further alternative way for division is to use an additional log-inversion table. This division method requires one table look-up operation less, and is thus even faster than the method of claim 6.
To meet the object of the invention, an apparatus for performing a multiplication in a finite field $F_q = \{0, 1, \alpha, \alpha^2, \ldots, \alpha^{q-2}\}$, where $\alpha$ is a generator of the field, $q = p^m$, $p$ is a prime and $m \geq 1$; the field in an index representation being $F_q = \{x_0, x_1, \ldots, x_{q-1}\}$ where each element $x_j$ of the field is represented by a respective index $j$ indicated as $j = i(x_j)$; the system including: a memory for storing a logarithm table L and an exponent table E, where L includes at least q entries with $L[j] = \log_\alpha(x_j)$, for $j = 0, \ldots, q-1$ and E includes a first part with respective entries $E[j] = i(\alpha^j)$ for $j = 0, \ldots, q-2$ and a second part immediately subsequent to the first part with respective entries $E[j+q-1] = i(\alpha^j)$ for $j = 0, \ldots, q-3$; and a processor for, under control of a program, multiplying non-zero elements $x, y \in F_q$ by: retrieving L[i(x)] and L[i(y)] from the memory; calculating L[i(x)] + L[i(y)]; and providing as a multiplication outcome an index retrieved from a table entry E[L[i(x)] + L[i(y)]] in the memory.
To meet an object of the invention, a memory for use in the apparatus of claim 9 stores a logarithm table L and an exponent table E, where L includes at least q entries with $L[j] = \log_\alpha(x_j)$, for $j = 0, \ldots, q-1$ and E includes a first part with respective entries $E[j] = i(\alpha^j)$ for $j = 0, \ldots, q-2$ and a second part immediately subsequent to the first part with respective entries $E[j+q-1] = i(\alpha^j)$ for $j = 0, \ldots, q-3$.
These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described hereinafter.
In the drawings:
Fig. 1 shows four representations of the finite field;
Fig.2 shows tables used in a preferred embodiment;
Fig.3 shows the structure of the exponentiation table according to the invention; and
Fig.4 shows a system using the invention.
In the detailed description the embodiments of the invention are first described for an exemplary field, followed by a more general description. The invention has the following aspects:
1. extending the exponentiation table with a 'copy' to eliminate the modular reduction during multiplication
2. extending the exponentiation table with entries to eliminate the treatment of one or two zero factors during multiplication
3. extending the exponentiation table with a 'copy' to eliminate the modular reduction during division
   a. shifting the extended table to the same range as used for multiplication
   b. using an inverse table to increase speed
   c. using a log-inverse table to increase speed even further
4. inserting the log table into the exponentiation table to reduce storage requirements
The aspects 1, 2, and 3 can be applied independently, but are preferably used in combination. Aspects 3a, 3b and 3c are alternatives.
Example
In this part of the description example embodiments will be given for an exemplary field $F_{2^4}$, with a generator element $\alpha \in F_q$ ($q = 2^4 = 16$, $p = 2$, $m = 4$) and generator polynomial $x^4 + x + 1$, of which $\alpha$ is a root. The field is then $F_q = \{0, 1, \alpha, \alpha^2, \ldots, \alpha^{14}\}$. The field $F_{2^4}$ is also usually referred to as the Galois Field $GF(2^4)$. The table of Fig.1 gives four representations of the field. The leftmost column shows the sixteen elements of the field. The second column shows the elements in exponent representation. In this representation, the non-zero field element $x = \alpha^e$ is represented as the integer exponent $e \in \{0, \ldots, q-2\}$. The field element 0 is represented by an integer Z, not being a member of {0,...,q-2}. The exponent of a field element $x \in F_q$ is denoted as $\log_\alpha(x)$ and by definition $\log_\alpha(0) = Z$. The third column shows the polynomial representation. The fourth column shows a vector representation using as a basis $\{1, \alpha, \alpha^2, \alpha^3\}$. The last column shows an example of an index representation. In this example, the index representation is defined by a basis representation of $F_{2^4}$ over $F_2$ such that addition in $F_{2^4}$ can simply be computed by taking the bitwise exclusive-or of the indices.
Thus, $i(x+y) = i(x) \oplus i(y)$. For example, using the polynomial representation, if $x = \alpha^5$ and $y = \alpha^{11}$, then $x + y = \alpha^5 + \alpha^{11} = (\alpha + \alpha^2) + (\alpha + \alpha^2 + \alpha^3) = \alpha^3$. In the index representation, this gives $i(x+y) = i(x) \oplus i(y) = 6 \oplus 14 = \mathrm{0110B} \oplus \mathrm{1110B} = \mathrm{1000B} = 8 = i(\alpha^3)$. This particular index representation is very practical for simple addition in software on a general purpose processor. It can be derived directly from the shown vector representation, by using the vector components as the binary digits, with the first component as the least significant bit. For example, in the exemplary embodiment element $\alpha^7$ has a vector representation of (1,1,0,1), giving the four-bit value 1011B = 11.
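The relation between the exponent and index representations can be checked with a few lines of Python. This is a minimal sketch under the assumptions stated in the text (generator polynomial $x^4 + x + 1$, coefficient of 1 as the least significant bit); the helper names `index_of_power` and `add` are hypothetical.

```python
POLY = 0b10011  # x^4 + x + 1

def index_of_power(e):
    """Return i(alpha^e) in the bit-vector index representation."""
    v = 1                       # i(alpha^0) = 0001B
    for _ in range(e % 15):     # alpha^15 = 1 in GF(2^4)
        v <<= 1                 # multiply by alpha
        if v & 0b10000:         # reduce when degree 4 appears
            v ^= POLY
    return v

def add(ix, iy):
    """Field addition is a bitwise exclusive-or of the indices."""
    return ix ^ iy

x = index_of_power(5)    # 0110B
y = index_of_power(11)   # 1110B
s = add(x, y)            # 1000B, the index of alpha^3
```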
First embodiment
In this first embodiment, the need to perform the following part of the prior art implementation is removed:
    if e >= q-1 then e := e - (q-1); fi;
To this end, an extended table E is used with entries for the entire range of L[i(x)] + L[i(y)], removing the need for performing the above test on whether the range is exceeded, and if so performing a correction in the form of a modulo reduction. Table E includes a first and second part that both give a mapping from the exponent representation to the index representation. The first part of E has the same 15 (q-1) entries as the prior art exponentiation table, given by $E[j] = i(\alpha^j)$ for $j = 0, \ldots, 14$. As an example using the table of Fig.1, $E[3] = i(\alpha^3) = 8$. In fact, the first part of E is in this example simply the last 15 elements of the last column of Fig.1. The second part of table E is a 'copy' of the first part (with the exception of the last element, which in this first embodiment is not included in the second part): $E[j+15] = i(\alpha^j)$ for $j = 0, \ldots, 13$.
Table L is the same as the prior art logarithm table, giving the 'inverse' mapping for the first part of the table E. In this example, L is a table with q=16 entries, defined as $L[j] = \log_\alpha(x_j)$, for $j = 0, \ldots, 15$. Thus, $\alpha^{L[j]} = x_j$. As an example using the table of Fig.1, $i(\alpha^3) = 8$, thus L[8] = 3.
Using the above definitions, the multiplication now becomes:
    if i(x) = i(0) or i(y) = i(0) then
        return i(0);
    else
        return E[L[i(x)] + L[i(y)]];
    fi.
The range of L[i(x)] is from 0 to 14 (q-2). The range of L[i(x)] + L[i(y)] is thus from 0 to 28 (2q-4).
Example 1. Multiplication of the elements $x = \alpha$, represented by i(x)=2, and $y = \alpha^3$, represented by i(y)=8. The multiplication outcome should be: $x \cdot y = \alpha \cdot \alpha^3 = \alpha^4$. Using the tables: E[L[i(x)] + L[i(y)]] = E[L[2] + L[8]] = E[1+3] = E[4] = 3. The resulting index 3 indeed represents element $\alpha^4$.
Example 2. Multiplication of the elements $x = \alpha^7$, represented by i(x)=11, and $y = \alpha^{11}$, represented by i(y)=14. The multiplication outcome should be: $x \cdot y = \alpha^7 \cdot \alpha^{11} = \alpha^{18} = \alpha^3 \cdot \alpha^{15} = \alpha^3 \cdot 1 = \alpha^3$. Using the tables: E[L[i(x)] + L[i(y)]] = E[L[11] + L[14]] = E[7+11] = E[18] = 8. The index 8 indeed represents element $\alpha^3$.
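The two examples can be reproduced with a short Python sketch of the first embodiment for the exemplary field. The bit-vector index representation with i(0) = 0 follows the example above; the function names are chosen here for illustration and are not part of the original description.

```python
Q = 16          # q = 2^4
POLY = 0b10011  # x^4 + x + 1

def build_tables():
    """Return (L, E) with E extended by a copy of its first q-2 entries."""
    first = []
    v = 1
    for _ in range(Q - 1):
        first.append(v)            # first[j] = i(alpha^j)
        v <<= 1
        if v & Q:
            v ^= POLY
    E = first + first[:Q - 2]      # 2q-3 = 29 entries, indices 0..28
    L = [None] * Q                 # L[i(0)] is not needed in this embodiment
    for e, idx in enumerate(first):
        L[idx] = e
    return L, E

L, E = build_tables()

def mul(ix, iy):
    """Multiply without the modulo q-1 reduction."""
    if ix == 0 or iy == 0:
        return 0
    return E[L[ix] + L[iy]]        # sum is at most 2q-4 = 28
```

The only remaining branch is the test for a zero factor.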
Second embodiment
In this second embodiment, the need to perform the following part of the prior art implementation is removed:
    if i(x) = i(0) or i(y) = i(0) then return i(0);
Multiplying elements $x, y \in F_q$ then simply involves retrieving and returning E[L[i(x)] + L[i(y)]], also for zero factors. To this end, the table E is extended with a part for dealing with one factor being the zero element and a part for dealing with both factors being the zero element. In this example, the third part includes 15 (=q-1) entries containing i(0), being the outcome of a multiplication by zero. If one factor is zero, the outcome should be in the table entries E[L[i(0)] + L[i(x_j)]] for the non-zero elements $x_j$. Since the lowest value in L is 0, the third part starts at entry L[i(0)] of table E. L[i(0)] is defined as Z. To avoid an overlap with the second part of table E, Z ≥ 29 (=2q-3). The fourth part deals with the case that both factors are the zero element, also giving a multiplication outcome of zero. This outcome should be in the table entry E[L[i(0)] + L[i(0)]] = E[2Z].
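A minimal Python sketch of this layout for the exemplary field, assuming the smallest permitted value Z = 29 and i(0) = 0, could look as follows. Entries marked `None` stand for the unused locations between the third and fourth part; the names are illustrative only.

```python
Q = 16
POLY = 0b10011   # x^4 + x + 1
Z = 2 * Q - 3    # 29, smallest value that avoids overlap with the second part

def build_tables():
    first = []
    v = 1
    for _ in range(Q - 1):
        first.append(v)
        v <<= 1
        if v & Q:
            v ^= POLY
    E = [None] * (2 * Z + 1)            # entries 0..2Z, None = unused
    E[0:Q - 1] = first                  # first part:  entries 0..14
    E[Q - 1:2 * Q - 3] = first[:Q - 2]  # second part: entries 15..28
    for k in range(Z, Z + Q - 1):       # third part:  entries Z..Z+q-2
        E[k] = 0                        # i(0)
    E[2 * Z] = 0                        # fourth part: entry 2Z
    L = [Z] * Q                         # L[i(0)] = Z
    for e, idx in enumerate(first):
        L[idx] = e
    return L, E

L, E = build_tables()

def mul(ix, iy):
    """Branch-free multiplication, valid for zero factors as well."""
    return E[L[ix] + L[iy]]
```

With this choice of Z there is no unused gap before the third part, and 14 entries remain unused before entry 2Z.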
Third embodiment
Division of element x by a non-zero element y can be performed in the following way:
$$x / y = \begin{cases} 0, & \text{if } x = 0, \\ \alpha^{(\log_\alpha(x) - \log_\alpha(y)) \bmod (q-1)}, & \text{otherwise.} \end{cases}$$
The case where x = 0 can be handled separately, but it is preferred to use the extended table as defined for the second embodiment. For the case wherein both operands are non-zero, the second part of table E should actually have been located before the first part, since the difference of the exponents can be negative. This is still possible, but comes at the expense of a further significant extension to table E. Instead, in a preferred embodiment, the existing first and second parts are used. The division outcome is the index retrieved from table entry E[L[i(x)] + (q-1) - L[i(y)]]. The range of L[i(x)] and L[i(y)] is from 0 to 14 (q-2). The range of L[i(x)] - L[i(y)] is thus from -14 to 14. The range of L[i(x)] + (q-1) - L[i(y)] is from 1 to 29 (2q-3). This implies that the second part is extended with one entry $E[29] = E[2q-3] = i(\alpha^{q-2}) = i(\alpha^{14}) = 9$. This additional entry is already shown in the table of Fig.1. If no multiplication were to be performed according to the first embodiment, entry E[0] can be removed. The extension of the second part has as a consequence that Z has to be increased by one, thus Z ≥ 2q-2 (=30). It also has as a consequence that the third part must be extended with one entry, giving a total of q entries with content i(0). Preferably, Z = 2q (=32) and table L is located in table E in between the third and fourth part, starting at entry 3q (=48).
Example:
Division of the elements $x = \alpha^7$, represented by i(x)=11, by $y = \alpha^4$, represented by i(y)=3. The division outcome should be: $x / y = \alpha^7 / \alpha^4 = \alpha^3$. Using the tables: E[L[i(x)] + (q-1) - L[i(y)]] = E[L[11] + 15 - L[3]] = E[7+15-4] = E[18] = 8. The resulting index 8 indeed represents element $\alpha^3$.
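The division of this third embodiment can be sketched in Python for the exemplary field, here with the preferred value Z = 2q = 32 and with table L kept as a separate list for readability. The names, the use of `None` for unused entries, and the explicit check for a zero divisor are choices made for this sketch.

```python
Q = 16
POLY = 0b10011   # x^4 + x + 1
Z = 2 * Q        # 32

def build_tables():
    first = []
    v = 1
    for _ in range(Q - 1):
        first.append(v)
        v <<= 1
        if v & Q:
            v ^= POLY
    E = [None] * (2 * Z + 1)         # 4q+1 = 65 entries, None = unused
    E[0:2 * Q - 2] = first + first   # first part and full copy: entries 0..29
    for k in range(Z, Z + Q):        # third part: q entries holding i(0)
        E[k] = 0
    E[2 * Z] = 0                     # fourth part: both factors zero
    L = [Z] * Q                      # L[i(0)] = Z
    for e, idx in enumerate(first):
        L[idx] = e
    return L, E

L, E = build_tables()

def mul(ix, iy):
    return E[L[ix] + L[iy]]

def div(ix, iy):
    """Divide x by a non-zero y using the same two tables as mul."""
    if iy == 0:
        raise ZeroDivisionError("division by the zero element")
    return E[L[ix] + (Q - 1) - L[iy]]
```

Adding q-1 before subtracting keeps the table address non-negative, so no reduction step is needed.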
Fourth embodiment
Division of element x by a non-zero element y can alternatively be performed in the following way:
$$x / y = x \cdot y^{-1}.$$
As described for the third embodiment, the case where x = 0 can be handled separately, but it is preferred to use the extended table as defined for the second embodiment. Unlike the third embodiment, no range problem exists; however, fast access to $y^{-1}$ is desired. This is achieved by using an additional inversion table I including at least q entries with $I[j] = i(x_j^{-1})$ for $j = 0, \ldots, q-1$. The inverse of the zero element of the field is represented as i(0). For the exemplary field, table I is shown in the fourth column of Fig.2. The table can be created in the following straightforward manner, using the table of Fig.1. Since the multiplicative group of the field is cyclic, $\alpha^{15} = 1$. The inverse of element $x = \alpha^4$ is thus $y = \alpha^{11}$, since $\alpha^4 \cdot \alpha^{11} = \alpha^{15}$. Since $i(\alpha^4) = 3$, $I[3] = 14 = i(\alpha^{11})$. Dividing an element $x \in F_q$ by a non-zero element $y \in F_q$ is then simply done by retrieving the index from table entry E[L[i(x)] + L[I[i(y)]]].
Example: Division of the elements $x = \alpha^7$, represented by i(x)=11, by $y = \alpha^4$, represented by i(y)=3. The division outcome should be: $x / y = \alpha^7 / \alpha^4 = \alpha^3$. Using the tables: E[L[i(x)] + L[I[i(y)]]] = E[L[11] + L[I[3]]] = E[7 + L[14]] = E[7+11] = E[18] = 8. The resulting index 8 indeed represents element $\alpha^3$.
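A possible Python sketch of the inversion table for the exemplary field is given below. It reuses the extended table of the second embodiment with Z = 29 and i(0) = 0; these choices and all names are assumptions of the sketch.

```python
Q = 16
POLY = 0b10011   # x^4 + x + 1
Z = 2 * Q - 3    # 29

# first[j] = i(alpha^j) for j = 0..q-2
first = []
v = 1
for _ in range(Q - 1):
    first.append(v)
    v <<= 1
    if v & Q:
        v ^= POLY

# Exponent table with zero handling (layout of the second embodiment)
E = [None] * (2 * Z + 1)
E[0:Q - 1] = first
E[Q - 1:2 * Q - 3] = first[:Q - 2]
for k in range(Z, Z + Q - 1):
    E[k] = 0
E[2 * Z] = 0

# Logarithm table with L[i(0)] = Z
L = [Z] * Q
for e, idx in enumerate(first):
    L[idx] = e

# Inversion table: I[i(alpha^e)] = i(alpha^(q-1-e)); I[i(0)] = i(0)
I = [0] * Q
for e, idx in enumerate(first):
    I[idx] = first[(Q - 1 - e) % (Q - 1)]

def div(ix, iy):
    """Divide x by a non-zero y as x * y^-1 (three table look-ups for y)."""
    if iy == 0:
        raise ZeroDivisionError("division by the zero element")
    return E[L[ix] + L[I[iy]]]
```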
Fifth embodiment
Division of element x by a non-zero element y can be performed in the following way, as described also for the fourth embodiment:
$$x / y = x \cdot y^{-1}.$$
In this embodiment, an additional log-inverse table $\tilde{L}$ is used that in one access provides the outcome $\log_\alpha(y^{-1})$, effectively reducing the successive accesses to I and L to one step. Thus $\tilde{L}$ is a log-inverse table including at least q entries with $\tilde{L}[j] = \log_\alpha(x_j^{-1})$ for $j = 0, \ldots, q-1$. The table can simply be created by taking the output of table I for the desired index to access table L and using the outcome as the entry in table $\tilde{L}$. As an example, using the exemplary field, $i(\alpha^4) = 3$, I[3] = 14, L[14] = 11, thus $\tilde{L}[3] = 11$. The exemplary table $\tilde{L}$ is shown in the fifth column of Fig.2. Dividing an element $x \in F_q$ by a non-zero element $y \in F_q$ is then simply done by retrieving the index from table entry $E[L[i(x)] + \tilde{L}[i(y)]]$. The table stores $\log_\alpha(0^{-1})$ as 0.
Example:
Division of the elements $x = \alpha^7$, represented by i(x)=11, by $y = \alpha^4$, represented by i(y)=3. The division outcome should be: $x / y = \alpha^7 / \alpha^4 = \alpha^3$. Using the tables: $E[L[i(x)] + \tilde{L}[i(y)]] = E[L[11] + \tilde{L}[3]] = E[7+11] = E[18] = 8$. The resulting index 8 indeed represents element $\alpha^3$.
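The log-inverse table can be sketched in Python along the same lines. The sketch again assumes the exemplary field, i(0) = 0, and the extended exponent table of the second embodiment with Z = 29; the name `LI` stands in for the log-inverse table and is chosen here only for illustration.

```python
Q = 16
POLY = 0b10011   # x^4 + x + 1
Z = 2 * Q - 3    # 29

# first[j] = i(alpha^j) for j = 0..q-2
first = []
v = 1
for _ in range(Q - 1):
    first.append(v)
    v <<= 1
    if v & Q:
        v ^= POLY

# Exponent table with zero handling (layout of the second embodiment)
E = [None] * (2 * Z + 1)
E[0:Q - 1] = first
E[Q - 1:2 * Q - 3] = first[:Q - 2]
for k in range(Z, Z + Q - 1):
    E[k] = 0
E[2 * Z] = 0

# Logarithm table with L[i(0)] = Z
L = [Z] * Q
for e, idx in enumerate(first):
    L[idx] = e

# Log-inverse table: LI[i(alpha^e)] = (q-1-e) mod (q-1); entry for zero is 0
LI = [0] * Q
for e, idx in enumerate(first):
    LI[idx] = (Q - 1 - e) % (Q - 1)

def div(ix, iy):
    """Divide x by a non-zero y with one look-up per operand plus one in E."""
    if iy == 0:
        raise ZeroDivisionError("division by the zero element")
    return E[L[ix] + LI[iy]]
```

Compared with a separate inversion table followed by a logarithm look-up, one table access per division is saved.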
General description
A finite field $F_q$ is assumed with $q = p^m$ elements, $p$ being a prime, $m \geq 1$, and a generator element $\alpha \in F_q$. The generator polynomial may be chosen freely. The field is then $F_q = \{0, 1, \alpha, \alpha^2, \ldots, \alpha^{q-2}\}$. In the exponent representation of $F_q$ the non-zero field element $x = \alpha^e$ is represented as the integer exponent $e \in \{0, \ldots, q-2\}$. The field element 0 is represented by an integer Z, not being a member of {0,...,q-2}. The exponent of a field element $x \in F_q$ is denoted as $\log_\alpha(x)$ and by definition $\log_\alpha(0) = Z$. In addition to an exponent representation, also an index representation of $F_q$ is used. In the index representation, the elements of $F_q$ are ordered in an arbitrary but fixed way as $F_q = \{x_0, x_1, \ldots, x_{q-1}\}$. The field element $x_j$ is represented by the integer index $j \in \{0, \ldots, q-1\}$. The integer index of an element $x_j$ of the field is indicated as $j = i(x_j)$. Preferably, the index representation is defined by a basis representation of $F_{2^m}$ over $F_2$ such that addition in $F_{2^m}$ can simply be computed by taking the bitwise exclusive-or of the indices. This specific index representation is, however, not required for the invention. It will be appreciated that in general any index representation may be used.
In an embodiment according to the invention, multiplication of non-zero elements is performed by returning E[L[i(x)] + L[i(y)]]. Here L and E are tables for logarithm and exponentiation, respectively. Table L has q entries, each of which is an integer in {0,...,q-2, Z}. The table is defined as:
$$L = [\log_\alpha(x_0), \log_\alpha(x_1), \ldots, \log_\alpha(x_{q-1})].$$
In a minimum embodiment, table E has 2q-3 entries, each of which is an integer in {0,...,q-1}. Fig.3A illustrates a first embodiment of table E. The table includes a first part 310 and a second part 320. The first part includes q-1 entries: $[i(1), i(\alpha), \ldots, i(\alpha^{q-2})]$, corresponding to the classical exponentiation table. The second part includes q-2 entries: $[i(1), i(\alpha), \ldots, i(\alpha^{q-3})]$. The second part is immediately successive to the first part. The second part removes the need for modular reduction when adding exponents. In this description it is assumed that the first entry of table E is entry 0. Then the second part starts at entry q-1. If so desired, the parts may also be located in table E with an offset. Persons skilled in the art can easily adjust the indexing of the table accordingly.
In a further embodiment, table E includes a third part 330 and a fourth part 340 as shown in Fig.3B. The third part 330 removes the need to distinguish the case where one factor in the multiplication is zero. It includes q-1 entries $[i(0), i(0), \ldots, i(0)]$. The fourth part 340 with one entry $[i(0)]$ removes the need to distinguish the case where both factors are zero. The third part starts at entry L[i(0)] of table E. L[i(0)] is defined as Z. To avoid an overlap with the second part of table E, Z ≥ 2q-3. This leaves an unused fifth part 350 in E with Z-2q+3 entries, which may be used for other purposes. The fourth part is located at entry 2Z to provide the outcome E[L[i(0)] + L[i(0)]] = E[2Z]. This leaves a sixth part 360 in E with Z-q+1 entries. For parts 350 and 360, * denotes "unused". The integer Z can be chosen freely, as long as Z ≥ 2q-3. It should be noted that the size of the table grows with Z.
In a further embodiment, the choice for Z is: Z = 2q-1. For this value of Z, the logarithm table L can be stored in the unused sixth part 360 of E. The total tabulated data then is 4q-1 entries, which is only a factor of two storage overhead compared to the state of the art. It should be noted that this compact storage representation requires that all memory locations for entries of E are wide enough to store the value Z, being the largest value in L.
In a preferred embodiment, the tables L and E as defined above are also used to implement faster division. Three alternative embodiments will be described to compute division x/y and inversion 1/y. All methods assume that y ≠ 0, so this must be tested if it cannot be asserted from the context.
The first division method extends the table E by two entries such that division can be carried out without case distinction. So both multiplication and division can make use of the same two tables L and E. The new table E is shown in Fig.3C. The second part 320 has been extended by one entry with $i(\alpha^{q-2})$ and the third part 330 has been extended with one entry i(0). The free parts 350 and 360 have both been shrunk by one entry. In a preferred embodiment, Z is chosen as Z = 2q, so that again L can be stored in the sixth part 360 of E as described above. Then there are 4q+1 entries in E, each of which is an integer in {0,...,2q}. Then for all y ≠ 0: i(x/y) = E[L[i(x)] + (q-1) - L[i(y)]].
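The combined layout with Z = 2q can be sketched in Python as a single list that holds both tables. The sketch uses the exemplary field $GF(2^4)$ with i(0) = 0 and places L at entry 3q, directly after the third part; the start address `L_BASE`, the list name `T` and the helper names are assumptions of this sketch rather than details taken from the figures.

```python
Q = 16
POLY = 0b10011    # x^4 + x + 1
Z = 2 * Q         # 32
L_BASE = 3 * Q    # 48: first entry of the part between third and fourth part

def build_table():
    first = []
    v = 1
    for _ in range(Q - 1):
        first.append(v)
        v <<= 1
        if v & Q:
            v ^= POLY
    T = [None] * (4 * Q + 1)          # 65 entries, None = unused
    T[0:2 * Q - 2] = first + first    # first part and extended second part
    for k in range(Z, Z + Q):         # third part: q entries holding i(0)
        T[k] = 0
    T[2 * Z] = 0                      # fourth part
    T[L_BASE] = Z                     # L[i(0)] = Z
    for e, idx in enumerate(first):
        T[L_BASE + idx] = e           # L[i(alpha^e)] = e
    return T

T = build_table()

def log_idx(ix):
    return T[L_BASE + ix]

def mul(ix, iy):
    return T[log_idx(ix) + log_idx(iy)]

def div(ix, iy):
    if iy == 0:
        raise ZeroDivisionError("division by the zero element")
    return T[log_idx(ix) + (Q - 1) - log_idx(iy)]
```

Every entry must be wide enough to hold Z = 32, which is larger than any field index; only the two entries of the fifth part stay unused.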
The second division method uses an additional inversion table I including at least q entries, defined as:
$$I = [i(x_0^{-1}), i(x_1^{-1}), \ldots, i(x_{q-1}^{-1})].$$
The inverse of the zero element of the field is represented as i(0). Each entry is an integer in {0,...,q-1}. Then for all y ≠ 0: i(x/y) = E[L[i(x)] + L[I[i(y)]]].
The third division method uses an additional log-inverse table $\tilde{L}$ including at least q entries with:
$$\tilde{L} = [\log_\alpha(x_0^{-1}), \log_\alpha(x_1^{-1}), \ldots, \log_\alpha(x_{q-1}^{-1})],$$
storing $\log_\alpha(0^{-1})$ as 0.
Each entry in the table is an integer in {0,...,q-2}. Then for all y ≠ 0: $i(x/y) = E[L[i(x)] + \tilde{L}[i(y)]]$.
Fig.4 shows an apparatus 400 for performing the multiplication and/or division as described in the above embodiments. The apparatus includes a memory 410 for storing the logarithm table L 416 and exponent table E with a first part 412 and second part 414 as described for the first embodiment. Details of the exponent table have been described above, in particular as also shown in Fig.3. Preferably, table L is located in table E as for example shown in Figs.3B and 3C, where L can be stored in part 360 of table E. To support division, table E may be extended as described with reference to Fig.3C. Alternatively, the memory may store table I or $\tilde{L}$ as described above. The apparatus also includes a processor for performing the multiplication and/or division. The processor may be any conventional processor and does not require any specific instruction for field calculations. In particular, the processor is a standard microcontroller or digital signal processor (DSP) used in mobile communication systems, such as mobile phones or personal digital assistants (PDAs) equipped with mobile communication. In such systems, the calculations are preferably used to calculate/verify error correcting codes (in particular Reed-Solomon codes) based on arithmetic in the Galois field $GF(2^8)$. In addition or alternatively, the apparatus is suitable for calculating error correcting codes for storage applications, for example for Reed-Solomon calculations on CD-like storage media. The software for controlling the operation of the processor may be embedded in a non-writeable way (e.g. using ROM). The software may also be rewriteable, preferably stored in a non-volatile way, for example using flash. Rewriteable software may be distributed in any suitable way, for example using a record carrier (e.g. CD-ROM, solid state 'memory stick') or via wired or wireless communication means such as GSM, GPRS, UMTS, WiFi, USB, etc.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. Use of the verb "comprise" and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. The article "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.