CN101572602A

CN101572602A - Finite field inversion method and device based on hardware design

Info

Publication number: CN101572602A
Application number: CNA2008100669157A
Authority: CN
Inventors: 陈婧; 蒋俊洁; 王石; 邓小铁
Original assignee: Individual
Current assignee: Individual
Priority date: 2008-04-28
Filing date: 2008-04-28
Publication date: 2009-11-04

Abstract

The invention discloses a finite field inversion method based on hardware design, applied to an elliptic curve encryption system and comprising the following steps: a polynomial element a of the finite field GF(p^m) is taken as the input element a, and an irreducible polynomial f(x) is defined, where a = a_{m-1}x^{m-1} + a_{m-2}x^{m-2} + ... + a_1 x + a_0, f(x) = x^m + f_{m-1}x^{m-1} + ... + f_1 x + f_0, and a_i and f_i are the coefficients of the input element a and the irreducible polynomial f(x) respectively; the input element a and the irreducible polynomial f(x) are repeatedly multiplied or divided by x^2 and added, and after m iterations the multiplicative inverse a^{-1} of the input element a is output. Correspondingly, the invention also provides a hardware-based inversion device for the finite field GF(2^m). The invention thereby achieves a faster computation speed and greatly improves the efficiency of the inversion operation. In addition, the operating frequency of the inversion device is close to that of the other arithmetic devices in the encryption system, so the hardware resources of the inversion device are more fully utilized.

Description

Method and device for finite field inversion based on hardware design

Technical field

The invention relates to elliptic curve cryptography, and in particular to a hardware-based method and device for finite field inversion used in an elliptic curve encryption system.

Background

Elliptic Curve Cryptography (ECC) was proposed in 1985 by Victor Miller and Neal Koblitz. Its advantage is that it provides the same security as other cryptosystems with a smaller key size, which makes it the public-key scheme with the highest strength per key bit among all known public-key cryptosystems. A smaller key size means lower memory requirements and shorter computation time, which is especially suitable for low-power, high-speed security applications such as smart cards, personal computer memory cards, and any other handheld or portable devices. The public-key cryptography standard P1363 drawn up by the IEEE (Institute of Electrical and Electronics Engineers) covers ECC-based schemes. The cryptography community widely expects ECC to replace RSA as the general-purpose public-key algorithm. It has become a promising research direction, and the efficient implementation of the basic ECC operations is one of the active research topics.

Elliptic curve cryptography mainly studies elliptic curves over two kinds of finite fields, GF(p) and GF(p^m). The basic algebraic operations that a cryptographic processor must perform are therefore addition, subtraction, multiplication and inversion of elements in GF(p) and GF(p^m). Of these, element inversion, that is, computing b = a^{-1} mod f, is the most expensive.

A common element inversion algorithm is based on Fermat's theorem and treats inversion as a combination of element multiplications, as follows:

$a(x)^{-1} = a(x)^{2^{m} - 2} \bmod f(x)$

Inversion then requires 2^{m-1}-1 field squarings (or multiplications) in total, so this algorithm is computationally very heavy and inefficient, and is suitable only for simple applications whose base field is very small.
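The Fermat-based inversion above can be sketched in software for the binary case GF(2^m). This is an illustrative sketch only: the function names and the integer encoding (bit i holds the coefficient of x^i) are assumptions made here, and the exponentiation uses ordinary square-and-multiply, which is not necessarily the scheme whose operation count is quoted in the text.

```python
def gf2m_mul(a: int, b: int, f: int, m: int) -> int:
    """Multiply two GF(2^m) elements modulo f (bit i = coefficient of x^i)."""
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a      # addition in GF(2^m) is bitwise XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if (a >> m) & 1:
            a ^= f           # reduce as soon as the degree reaches m
    return result


def gf2m_inv_fermat(a: int, f: int, m: int) -> int:
    """Return a^(2^m - 2) mod f, which equals a^(-1) for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    result, base = 1, a
    exponent = (1 << m) - 2
    while exponent:
        if exponent & 1:
            result = gf2m_mul(result, base, f, m)
        base = gf2m_mul(base, base, f, m)   # squaring step
        exponent >>= 1
    return result


# Example field chosen for illustration: GF(2^3) with f(x) = x^3 + x + 1.
F, M = 0b1011, 3
inv_x = gf2m_inv_fermat(0b010, F, M)
```

In this example field, `inv_x` is `0b101`, that is, x^2 + 1, since x(x^2 + 1) = x^3 + x reduces to 1 modulo f.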

Another commonly used element inversion method is based on the Extended Euclid algorithm. Its basic idea is to compute the inverse in the finite field by repeated iteration: a and f(x) are each repeatedly multiplied or divided by x and added, while 1 and 0 undergo the same transformations. In this way, when a becomes 1, the 1 becomes the inverse of a. This algorithm needs 2m cycles to complete one inversion in the finite field GF(p^m), which is comparatively efficient. However, as cryptographic applications develop, in some high-speed, low-power electronic devices, such as smart cards, personal computer memory cards, or any other handheld or portable devices, such algorithms still limit the processing capability and power consumption of the whole system.
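The idea of transforming the pair (a, f) and the pair (1, 0) in lockstep can be sketched for GF(2^m) as follows. This is a common software variant of the extended Euclidean algorithm for binary polynomials that uses variable-length shifts; it is not the per-cycle, shift-by-one hardware formulation referred to above, and the function name and integer encoding are assumptions made for illustration.

```python
def gf2m_inv_euclid(a: int, f: int) -> int:
    """Invert a modulo the irreducible binary polynomial f.

    Polynomials are integers; bit i holds the coefficient of x^i.
    Invariants: a*g1 = u (mod f) and a*g2 = v (mod f).
    """
    if a == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    u, v = a, f
    g1, g2 = 1, 0
    while u != 1:
        j = u.bit_length() - v.bit_length()   # degree difference
        if j < 0:
            u, v = v, u                       # keep deg(u) >= deg(v)
            g1, g2 = g2, g1
            j = -j
        u ^= v << j                           # cancel the leading term of u
        g1 ^= g2 << j                         # apply the same step to g1
    return g1


# Example field chosen for illustration: GF(2^3) with f(x) = x^3 + x + 1.
inv_x_plus_1 = gf2m_inv_euclid(0b011, 0b1011)
```

When u reaches 1, the invariant a*g1 = u (mod f) shows that g1 is the inverse, which mirrors the statement that "when a becomes 1, the 1 becomes the inverse of a".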

In summary, the existing finite field inversion schemes used in elliptic curve encryption systems clearly have inconveniences and defects in practical use, so improvement is needed.

Summary of the invention

In view of the above defects, the object of the present invention is to provide a hardware-based method and device for finite field inversion that reduce the number of operation cycles and thereby greatly improve the efficiency of the inversion operation.

To achieve the above object, the present invention provides a hardware-based method for finite field inversion, applied to an elliptic curve encryption system, the method comprising the following steps:

A. Take a polynomial element a of the finite field GF(p^m) as the input element a, and define an irreducible polynomial f(x), where a = a_{m-1}x^{m-1} + a_{m-2}x^{m-2} + ... + a_1 x + a_0, f(x) = x^m + f_{m-1}x^{m-1} + ... + f_1 x + f_0, a_i and f_i are the coefficients of the input element a and the irreducible polynomial f(x) respectively, m is a positive integer, and x is the indeterminate of f(x);

B. The input element a and the irreducible polynomial f(x) are repeatedly multiplied or divided by x^2 and added, and after m iterations the multiplicative inverse a^{-1} of the input element a is output.

According to the finite field inversion method of the present invention, step B further comprises:

Initializing the polynomial variables S, R, U and V to f, a, 1 and 0 respectively, and initializing the variable δ to 0; computing the intermediate variables q and e from the values of the two highest coefficients of R and S, namely r_m, r_{m-1} and s_m, s_{m-1}; then computing the four variables S, R, U and V according to the control signals r_m, r_{m-1}, δ_0, δ_1, e and i. The computation completes within one clock cycle, the values of S, R, U and V are updated in the next clock cycle, and after m iterations the multiplicative inverse a^{-1} of the input element a is output.

B1. Initialize the variables S, R, U, V and δ to f, a, 1, 0 and 0 respectively; initialize the intermediate variables q, e, temp1 and temp2 to 0;

B2. For i ranging from 0 to m-1, perform the following steps:

B3. Compute the intermediate variables q = s_m and e = s_{m-1} - s_m r_{m-1}, where s_m and s_{m-1} are the highest and second-highest coefficients of the variable S, and r_{m-1} is the second-highest coefficient of the variable R;

B4. Compute the intermediate variable T = S - s_m R;

B5. Compute the intermediate variable W = V - s_m U;

B6. If the highest coefficient r_m of the variable R is the element 1 of GF(p), perform the following steps:

B7. If the two lowest bits of the variable δ are both 0 and the value of the variable e is the element 0 of GF(p), perform sub-steps B7a to B7h:

B7a) Assign the value of R to temp1;

B7b) Compute x^2 T and assign the result to R;

B7c) Assign the value of temp1 to S;

B7d) Assign the value of U to temp2;

B7e) Compute xW mod f and assign the result to W;

B7f) Repeat sub-step B7e, then assign the value of W to U;

B7g) Assign the value of temp2 to V;

B7h) Replace δ with δ+2;

B8. If the two lowest bits of the variable δ are both 0 and the value of the variable e is the element 1 of GF(p), perform sub-steps B8a to B8f:

B8a) Compute xR - x^2 eT and assign the result to temp1;

B8b) Compute xT and assign the result to R;

B8c) Assign the value of temp1 to S;

B8d) Compute U - e(xW mod f) and assign the result to temp2;

B8e) Assign the value of W to U;

B8f) Assign the value of temp2 to V;

B9. If the lowest and second-lowest bits of the variable δ are 1 and 0 respectively, perform sub-steps B9a to B9f:

B9a) Assign the value of R to temp1;

B9b) Compute x^2 T - x(eR) and assign the result to R;

B9c) Assign the value of temp1 to S;

B9d) Compute U/x mod f and assign the result to temp2;

B9e) Compute x(W - e - temp2) mod f and assign the result to U;

B9f) Assign the value of temp2 to V;

B10. If the lowest and second-lowest bits of the variable δ are 1 and 1 respectively, perform sub-steps B10a to B10e:

B10a) Compute x^2 T - x(e·R) and assign the result to S;

B10b) Compute U/x mod f and assign the result to temp1;

B10c) Compute W - e·temp1 and assign the result to V;

B10d) Repeat sub-step B10b, then assign the value of temp1 to U;

B10e) Replace δ with δ-2;

B11. If the highest coefficient r_m and the second-highest coefficient r_{m-1} of the variable R are both the element 0 of GF(p), perform sub-steps B11a to B11d:

B11a) Compute x^2 R and assign the result to R;

B11b) Compute xU mod f and assign the result to U;

B11c) Repeat sub-step B11b;

B11d) Replace δ with δ+2;

B12. If the highest coefficient r_m and the second-highest coefficient r_{m-1} of the variable R are the elements 0 and 1 of GF(p) respectively, perform sub-steps B12a to B12d:

B12a) Compute xR and assign the result to R;

B12b) Compute xT and assign the result to S;

B12c) Compute xU mod f and assign the result to temp1;

B12d) Compute V - q·temp1 and assign the result to V;

B13. Increment the loop counter i by one; while i is less than m-1, return to step B2;

B14. After m iterations, that is, when i equals m-1, the inversion in the finite field GF(p^m) ends, and the output is the multiplicative inverse a^{-1} of the input element a.
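Steps B3 to B5 can be sketched for the binary case GF(2^m), where subtraction is XOR and multiplying by a one-bit coefficient is a conditional select. This is a minimal sketch rather than the patented circuit: the function name, the integer encoding (bit i holds the coefficient of x^i) and the sample values are assumptions made here for illustration, and the branch logic of steps B6 to B12 is deliberately left out.

```python
def intermediates_gf2(S: int, R: int, U: int, V: int, m: int):
    """Compute q, e, T and W of steps B3-B5 over GF(2).

    S and R are (m+1)-bit values, U and V are m-bit values.
    """
    s_m = (S >> m) & 1            # highest coefficient of S
    s_m1 = (S >> (m - 1)) & 1     # second-highest coefficient of S
    r_m1 = (R >> (m - 1)) & 1     # second-highest coefficient of R

    q = s_m                       # B3: q = s_m
    e = s_m1 ^ (s_m & r_m1)       # B3: e = s_{m-1} - s_m * r_{m-1}
    T = S ^ R if s_m else S       # B4: T = S - s_m * R
    W = V ^ U if s_m else V       # B5: W = V - s_m * U
    return q, e, T, W


# Illustrative first iteration: f = x^3 + x + 1, a = x^2 + 1, m = 3,
# with S, R, U, V initialized to f, a, 1, 0 as in step B1.
q, e, T, W = intermediates_gf2(S=0b1011, R=0b0101, U=0b001, V=0b000, m=3)
```

Because every quantity here is a single XOR or AND of register bits, all four values can plausibly be produced within one clock cycle, which is consistent with the timing described in the text.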

According to the finite field inversion method of the present invention, the coefficients a_i and f_i of the input element a and the irreducible polynomial f(x), for i ranging from 0 to m-1, belong to the finite field GF(p).

According to the finite field inversion method of the present invention, the method performs element inversion in the finite field GF(2^m) by means of a hardware inversion device, and the operating frequency of the inversion device is close to that of the other arithmetic devices in the elliptic curve encryption system.

According to the finite field inversion method of the present invention, adding or subtracting elements of the finite field GF(2^m) is a bitwise XOR of their vectors; multiplying or dividing an element by x shifts its vector left or right by one bit, filling with zero; reducing an element modulo f(x) XORs it bitwise with f(x), which ensures that the result remains in the finite field GF(2^m).

According to the finite field inversion method of the present invention, each time the variable U is multiplied or divided by x, if the highest bit is 1 it must be reduced by a bitwise XOR with $f(x) = x^{m} + \sum_{i=0}^{m-1} f_{i} x^{i}$, which ensures that the result remains in the finite field GF(2^m).
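The reduction rule for U can be illustrated with two small helpers. This is a hedged sketch with hypothetical function names and an assumed integer encoding (bit i holds the coefficient of x^i). For the multiplication the sketch tests the bit that overflows to position m after the shift. For the division it tests the lowest bit before the shift; that choice is an assumption of this sketch, made because a right shift discards bit 0, and it relies on the fact that an irreducible f(x) of degree greater than 1 has constant term f_0 = 1.

```python
def mul_x_mod(u: int, f: int, m: int) -> int:
    """Return x*u mod f for an m-bit element u of GF(2^m)."""
    u <<= 1                  # multiply by x: shift left, fill with zero
    if (u >> m) & 1:         # degree reached m, so reduce
        u ^= f
    return u


def div_x_mod(u: int, f: int) -> int:
    """Return u/x mod f, assuming the constant term of f is 1."""
    if u & 1:                # u is not divisible by x as a polynomial
        u ^= f               # adding f clears bit 0 without changing u mod f
    return u >> 1            # divide by x: shift right


# Example field chosen for illustration: GF(2^3) with f(x) = x^3 + x + 1.
F, M = 0b1011, 3
one = mul_x_mod(0b101, F, M)     # x * (x^2 + 1)
back = div_x_mod(one, F)         # 1 / x
```

The two helpers are inverses of each other on m-bit inputs, so a multiplication by x followed by a division by x returns the original element.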

The present invention also provides a hardware-based device for finite field inversion, applied to an elliptic curve encryption system, where the finite field is GF(2^m) and the device comprises:

Registers R, S, U, V and δ. The registers R, S, U and V store the polynomial variables R, S, U and V respectively; the δ register records the value of the variable δ, whose changes reflect the shifting of the U register. At initialization, a polynomial element a of the finite field GF(2^m) is loaded into the R register as the input element a, a defined irreducible polynomial f(x) is loaded into the S register, the U register is set to 1, and the V register and the δ register are set to 0;

An RS calculation logic module, for updating the variables R and S;

A UV calculation logic module, for updating the variables U and V;

A control logic module, for computing the intermediate variables q and e from the values of the two highest coefficients of the R register and the S register, namely r_m, r_{m-1} and s_m, s_{m-1}, and then controlling the operation of the RS calculation logic module and the UV calculation logic module according to the control signals r_m, r_{m-1}, δ_0, δ_1, e and i;

The RS calculation logic module and the UV calculation logic module complete their computation within one clock cycle; in the next clock cycle the updated values of the four variables S, R, U and V are written back to the registers S, R, U and V while the loop control counter i is updated, and after m iterations the multiplicative inverse a^{-1} of the input element a is output.

According to the finite field inversion device of the present invention, the RS calculation logic module and the UV calculation logic module each implement the basic operations in hardware:

The RS calculation logic module consists of m+1 identical RS calculation logic units arranged side by side. The RS calculation logic units work in parallel; after completing their operation within one clock cycle they write the updated values to the R register and the S register, and then wait for the control signals of the control logic module before the next round of computation;

The UV calculation logic module consists of m identical UV calculation logic units arranged side by side. The UV calculation logic units work in parallel; after completing their operation within one clock cycle they write the updated values to the U register and the V register, and then wait for the control signals of the control logic module before the next round of computation.

According to the finite field inversion device of the present invention, the R register and the S register each hold m+1 bits, and the U register and the V register each hold m bits; the R and S registers affect each other, the U and V registers affect each other, and the two groups of registers operate synchronously.

The present invention takes a polynomial element a of the finite field GF(p^m) or GF(2^m) as the input element a, defines an irreducible polynomial f(x), and repeatedly multiplies or divides the input element a and the irreducible polynomial f(x) by x^2 and adds them; after m iterations the multiplicative inverse a^{-1} of the input element a is output. The invention thus needs only m clock cycles to complete one inversion, half of the 2m clock cycles required by the existing Extended Euclid algorithm, which gives a faster computation and greatly improves the efficiency of inversion. In addition, the operating frequency of the inversion device of the invention is close to that of the other arithmetic devices in the elliptic curve encryption system, so the hardware resources of the inversion device are more fully utilized and the computing performance of the whole system is improved.

Description of the drawings

Fig. 1 is a block diagram of the application of the finite field inversion device of the present invention in an elliptic curve encryption system;

Fig. 2 is a hardware example of the finite field inversion device of the present invention;

Fig. 3 is a hardware example of an RS calculation logic unit of the finite field inversion device of the present invention;

Fig. 4A is a hardware example of a UV calculation logic unit of the finite field inversion device of the present invention;

Fig. 4B is a hardware example of the XMOD module and the D_XMOD module in the UV calculation logic unit of the present invention;

Fig. 5 is a flowchart of the hardware-based finite field inversion method of the present invention;

Fig. 6 is an example flowchart of a preferred finite field inversion method of the present invention.

Detailed description

To make the object, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.

The basic idea of the present invention is as follows: a polynomial element a of the finite field GF(p^m) or GF(2^m) is taken as the input element a, an irreducible polynomial f(x) is defined, and the input element a and the irreducible polynomial f(x) are repeatedly multiplied or divided by x^2 and added; after m iterations the multiplicative inverse a^{-1} of the input element a is output. The invention thereby reduces the number of operation cycles and greatly improves the efficiency of inversion.

Fig. 1 shows the application of the finite field inversion device of the present invention in an elliptic curve encryption system. Following the requirements of the elliptic curve encryption algorithm, the elliptic curve encryption system 100 has a modular design: each module performs its own function independently, and the modules exchange data with one another under control signals to implement encryption. The elliptic curve encryption system 100 mainly comprises a control interface 10, an arithmetic control unit 20, a dual-port RAM 30, an addition device 40, a multiplication device 50, a squaring device 60 and an inversion device 70, where:

The control interface 10 is the main controller of the encryption system 100. It controls the connection between the inside and the outside and passes instructions to the other components of the encryption system 100.

The arithmetic control unit 20 manages and organizes the operation of the low-level modules, which comprise the addition device 40, the multiplication device 50, the squaring device 60 and the inversion device 70.

The dual-port RAM 30 is responsible for transferring and storing data.

The addition device 40, the multiplication device 50, the squaring device 60 and the inversion device 70 implement the basic operations of addition, multiplication, squaring and inversion in the finite field GF(p^m) respectively.

An elliptic curve encryption system 100 can be implemented in two main ways, in software or in hardware. A software implementation needs less development time, but its encryption speed is relatively slow, which limits the practicality of elliptic curve encryption. A hardware implementation is faster than a software one and is better suited to applications with encryption speed requirements. The present invention is a hardware implementation scheme and is applied to an elliptic curve encryption system 100 implemented in hardware.

The multiplication device 50 in the encryption system 100 currently mostly uses the LSD (Least Significant Digit First) or MSD (Most Significant Digit First) algorithm. The maximum operating frequency of a hardware module implementing these algorithms is lower than that of an inversion device 70 implementing the Extended Euclid algorithm. Therefore, although an inversion device 70 implementing the Extended Euclid algorithm can reach a very high operating frequency on its own, it is still limited by the overall operating frequency of the elliptic curve encryption system 100, and within each clock cycle the inversion device 70 is idle most of the time. The inversion device 70 implementing the algorithm of the present invention operates at a frequency close to that of the other arithmetic hardware modules, so its hardware resources are more fully utilized and the computing performance of the whole encryption system 100 is improved.

Fig. 2 is a hardware example of the finite field inversion device of the present invention. For ease of description, this example is a hardware implementation of element inversion in the finite field GF(2^m). Under the arithmetic rules of GF(2^m), elements are polynomials of degree at most m-1 and can be represented as m-bit binary vectors. Adding or subtracting elements of GF(2^m) is a bitwise XOR of their vectors; multiplying or dividing an element by x shifts its vector left or right by one bit, filling with zero; reducing an element modulo f(x) XORs it bitwise with f(x), which ensures that the result remains in GF(2^m). A hardware implementation of element inversion in GF(p^m) is similar: the operands are replaced by elements of GF(p^m) and the basic arithmetic rules of GF(p) are used. A hardware device for element inversion in GF(p^m) is not described in detail in this application.
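As a small worked illustration of these rules, consider GF(2^3) with f(x) = x^3 + x + 1 (the field and the polynomial are chosen here for the example and are not taken from the patent). With elements written as 3-bit vectors:

$$
\begin{aligned}
(x^{2} + 1) + (x + 1) &= x^{2} + x &&\Leftrightarrow\quad 101 \oplus 011 = 110,\\
x \cdot (x^{2} + 1) &= x^{3} + x &&\Leftrightarrow\quad 101 \ll 1 = 1010,\\
(x^{3} + x) \bmod f(x) &= 1 &&\Leftrightarrow\quad 1010 \oplus 1011 = 0001.
\end{aligned}
$$

The last two lines together show that x^2 + 1 is the multiplicative inverse of x in this field.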

The finite field inversion device 70 of the present invention is used in the elliptic curve encryption system 100 shown in Fig. 1 and mainly comprises an R register 71, an S register 72, a U register 73, a V register 74, a δ register 75, an RS calculation logic module 76, a UV calculation logic module 77 and a control logic module 78, where:

The R register 71 and the S register 72 form a pair of mutually dependent registers R[r_m ... r_0] and S[s_m ... s_0], each holding m+1 bits, which store the polynomial variables R and S respectively. The U register 73 and the V register 74 form a pair of mutually dependent registers U[u_{m-1} ... u_0] and V[v_{m-1} ... v_0], each holding m bits, which store the polynomial variables U and V respectively. The two pairs of registers operate synchronously and load the updated values of the previous iteration on the rising edge of each clock cycle.

The δ register 75 records the value of the variable δ, whose changes reflect the shifting of the U register 73: when U is multiplied by x^2, the δ register 75 counts up by 2; when U is divided by x^2, the δ register 75 counts down by 2.
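The coupling between U and δ can be sketched as follows for GF(2^m). This is an illustrative model under stated assumptions, not the circuit of the figures: the function names and the integer encoding (bit i holds the coefficient of x^i) are hypothetical, a multiplication or division by x^2 is modelled as two reduced single-bit shifts, and the division tests the lowest bit, which assumes the constant term of f(x) is 1.

```python
def mul_x_mod(u: int, f: int, m: int) -> int:
    """Return x*u mod f for an m-bit element u."""
    u <<= 1
    return u ^ f if (u >> m) & 1 else u


def div_x_mod(u: int, f: int) -> int:
    """Return u/x mod f, assuming the constant term of f is 1."""
    return (u ^ f) >> 1 if u & 1 else u >> 1


def shift_u(u: int, delta: int, f: int, m: int, up: bool):
    """Multiply (up=True) or divide (up=False) U by x^2 and adjust delta."""
    for _ in range(2):                      # two reduced one-bit shifts
        u = mul_x_mod(u, f, m) if up else div_x_mod(u, f)
    return u, delta + (2 if up else -2)     # delta tracks the net shift of U


# Example field chosen for illustration: GF(2^3) with f(x) = x^3 + x + 1.
F, M = 0b1011, 3
u1, d1 = shift_u(0b101, 0, F, M, up=True)     # U * x^2, delta counts up
u2, d2 = shift_u(u1, d1, F, M, up=False)      # U / x^2, delta counts down
```

After one multiplication and one division by x^2 the sketch returns to the starting value of U with δ back at 0, which is the bookkeeping role the text assigns to the δ register.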

At initialization, a polynomial element a of the finite field GF(2^m) is loaded into the R register 71 as the input element a, a defined irreducible polynomial f(x) is loaded into the S register 72, the U register 73 is set to 1, and the V register 74 and the δ register 75 are set to 0; the computation then begins.

a = a_{m-1}x^{m-1} + a_{m-2}x^{m-2} + ... + a_1 x + a_0;

f(x) = x^m + f_{m-1}x^{m-1} + ... + f_1 x + f_0;

where a_i and f_i are the coefficients of the input element a and the irreducible polynomial f(x) respectively, m is a positive integer, and x is the indeterminate of f(x).

The RS calculation logic module 76 updates the variables R and S.

The UV calculation logic module 77 updates the variables U and V.

The control logic module 78 computes the two intermediate variables q and e from the values of the two highest coefficients of the R register 71 and the S register 72, namely r_m, r_{m-1} and s_m, s_{m-1}, where q = s_m and e = s_{m-1} - s_m r_{m-1}, and then controls the operation of the RS calculation logic module 76 and the UV calculation logic module 77 according to the six control signals r_m, r_{m-1}, δ_0, δ_1, e and i.

The RS calculation logic module 76 and the UV calculation logic module 77 perform different addition, subtraction and modular reduction operations according to the signals from the control logic module 78. The RS calculation logic module 76 and the UV calculation logic module 77 of the present invention complete their computation within one clock cycle; in the next clock cycle the updated values of the four variables S, R, U and V are written back to the R register 71, the S register 72, the U register 73 and the V register 74 while the loop control counter i is updated. After m iterations, the output is the multiplicative inverse a^{-1} of the input element a in the finite field GF(2^m).

具体而言，所述RS计算逻辑模块76与UV计算逻辑模块77分别采用硬件来实现基本运算：Specifically, the RS calculation logic module 76 and the UV calculation logic module 77 respectively use hardware to implement basic operations:

RS计算逻辑模块76由相同且并列的m+1个RS计算逻辑单元组成，所述各RS计算逻辑单元并行工作，在一个时钟周期内完成运算后将更新值输入R寄存器71和S寄存器72，再等待所述控制逻辑模块78的控制逻辑指令进行下一轮计算。The RS calculation logic module 76 is composed of the same and parallel m+1 RS calculation logic units. The RS calculation logic units work in parallel, and input the updated value into the R register 71 and the S register 72 after completing the operation within one clock cycle. Then wait for the control logic instruction of the control logic module 78 to perform the next round of calculation.

UV计算逻辑模块77由相同且并列的m个UV计算逻辑单元组成，所述各UV计算逻辑单元并行工作，在一个时钟周期内完成运算后将更新值输入U寄存器73和V寄存器74，再等待所述控制逻辑模块78的控制逻辑指令进行下一轮计算。其中UV计算逻辑较复杂：其中变量U每次乘以x或者除以x都必须有规约运算，即如果最高位是1的话，需与f按位顺序异或取模，保证结果依然在有限域GF(2^m)内。The UV calculation logic module 77 is made up of the same and parallel m UV calculation logic units. The UV calculation logic units work in parallel, and after completing the operation in one clock cycle, the update value is input to the U register 73 and the V register 74, and then waits for The control logic instruction of the control logic module 78 performs the next round of calculation. Among them, the UV calculation logic is more complicated: the variable U must have a reduction operation every time it is multiplied by x or divided by x, that is, if the highest bit is 1, it needs to be XORed with f in bit order to ensure that the result is still in the finite field. within GF( ^2m ).
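To illustrate why m identical units can do this in parallel, the sketch below computes x·U mod f one output bit at a time: output bit i is $u_{i-1}$ XOR ($u_{m-1}$ AND $f_i$). This per-bit formulation is a standard way to express the reduction and is an assumption here, not a transcription of the circuit in the figures. The function name and list layout are hypothetical.

```python
def mul_x_mod_bitwise(u_bits, f_bits):
    """Multiply U by x modulo f over GF(2), one output bit per 'unit'.

    u_bits[i] is the coefficient u_i for i in 0..m-1.
    f_bits[i] is the coefficient f_i for i in 0..m-1 (the x^m term is implicit).
    """
    m = len(u_bits)
    top = u_bits[m - 1]  # coefficient that would overflow into x^m
    return [
        (u_bits[i - 1] if i > 0 else 0) ^ (top & f_bits[i])
        for i in range(m)
    ]


# m = 3, f(x) = x^3 + x + 1, U = x^2: x * x^2 = x^3 = x + 1 (mod f)
result = mul_x_mod_bitwise([0, 0, 1], [1, 1, 0])
```

Each list element depends only on two input bits and one coefficient of f, which is why every bit can be produced by its own unit in the same clock cycle.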

FIG. 3 shows a hardware example of the RS calculation logic unit of the finite field inversion device of the present invention; m+1 identical RS calculation logic units arranged in parallel form the RS calculation logic module 76. The RS calculation logic unit updates the coefficients of the polynomials R and S. Its selector control signals are $r_m$, $r_{m-1}$, δ₀, δ₁, e and i. A low level on these signals indicates, respectively, that the highest coefficient $r_m$ of the R register 71 is 0, that the second-highest coefficient $r_{m-1}$ of the R register 71 is 0, that the value of the δ register 75 is 0 rather than 1, that the value of the δ register 75 is less than 2, and that the variable e equals 0. In addition, a low level on i indicates that the R register 71 and the S register 72 are being initialized to $a_i$ and $f_i$, the coefficients of the input element a and the irreducible polynomial f(x). The signals q and e are elements of the finite field GF(2) and are obtained from $q = s_m$ and $e = s_{m-1} - s_m r_{m-1}$. The subscript of a variable in the RS calculation logic unit denotes one bit of that variable, and the outputs $r_o$ and $s_o$ are the i-th bits of the variables R and S that are input to the next round of calculation.

Note that multiplying a variable by x² is implemented in hardware by shifting the variable left by 2 bits and filling with zeros. Consequently, when i is less than 2, some of the subscripts (i-1, i-2) are negative, and the corresponding input is 0. The RS calculation logic units work in parallel, complete the operation within one clock cycle, write the updated values into the corresponding bits of the R register 71 and the S register 72, and then wait for the control logic instruction from the control logic module 78 before the next round of calculation.
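The zero-fill rule for negative subscripts can be sketched per bit as follows. The list layout (index i holds the coefficient of x^i) and the function name are assumptions made for illustration; a fixed register width is assumed, so bits shifted past the top are dropped.

```python
def shift_left_bits(bits, k):
    """Multiply by x^k in a fixed-width register.

    Output bit i takes input bit i-k; when i-k is negative the input is 0.
    """
    return [bits[i - k] if i - k >= 0 else 0 for i in range(len(bits))]


# x + 1 in a 4-bit register, multiplied by x^2, gives x^3 + x^2.
shifted = shift_left_bits([1, 1, 0, 0], 2)
```

Output bits 0 and 1 have no source bit and are therefore forced to 0, which matches the statement about subscripts i-1 and i-2.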

FIG. 4A shows a hardware example of the UV calculation logic unit of the finite field inversion device of the present invention; m identical UV calculation logic units arranged in parallel form the UV calculation logic module 77. The UV calculation logic unit is more complex than the RS calculation logic unit because every time the variable U is multiplied or divided by x (for example xU, xW and the division U/x), if the highest coefficient is 1, the value must be XORed bitwise with the irreducible polynomial $f(x) = x^{m} + \sum_{i=0}^{m-1} f_{i} x^{i}$ to reduce it modulo f, so that the result remains in the finite field $GF(2^m)$.

The module marked XMOD in FIG. 4A multiplies or divides a variable by x and then reduces it modulo f, while the module marked D_XMOD performs that multiply-or-divide-and-reduce operation twice. The control signal is MultU = ($r_m$ = 1) & (δ₁ = 1) & (δ₀ = 1). When MultU is high, these two modules, shown as the two gray units in FIG. 4A, perform the division.
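A behavioral sketch of the two modules follows, using integers whose bit i is the coefficient of x^i. The multiply path follows the reduction rule described above. The divide path (XOR with f when the constant term is 1, then shift right) is a standard technique that the text does not spell out, so it is an assumption, as is the requirement that $f_0 = 1$. The function names are hypothetical.

```python
def xmod(u, f, m, divide):
    """One XMOD step: u*x mod f, or u/x mod f when divide is true.

    f includes the x^m term; the divide path assumes f_0 = 1.
    """
    if divide:
        if u & 1:        # make the constant term zero before shifting right
            u ^= f
        return u >> 1
    u <<= 1
    if (u >> m) & 1:     # overflow into x^m: reduce by XOR with f
        u ^= f
    return u


def d_xmod(u, f, m, divide):
    """D_XMOD: apply the XMOD step twice (multiply or divide by x^2)."""
    return xmod(xmod(u, f, m, divide), f, m, divide)


def mult_u(r_m, delta1, delta0):
    """Control signal MultU = (r_m = 1) & (delta1 = 1) & (delta0 = 1)."""
    return int(r_m == 1 and delta1 == 1 and delta0 == 1)
```

For m = 3 and f(x) = x^3 + x + 1, `d_xmod(0b010, 0b1011, 3, False)` gives x^3 = x + 1, and `d_xmod(0b011, 0b1011, 3, True)` returns x again.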

The implementation of the XMOD and D_XMOD modules of the UV calculation logic unit is shown in FIG. 4B. The selector marked with * in FIG. 4B is controlled jointly by e and δ₁ and selects between the output signal of the XMOD module and $u_i$. When e = 1 and δ₁ = 0, $u_i$ is selected as the output and the signal line marked with * carries the output signal of the XMOD module; conversely, when δ₁ = 1 and e = 0, the output signal of the XMOD module is selected and the signal line marked with * carries $u_i$. The UV calculation logic units work in parallel, complete the operation within one clock cycle, write the updated values into the U register 73 and the V register 74, and then wait for the control logic instruction from the control logic module 78 before the next round of calculation.

Starting from the goal of better hardware resource utilization, the present invention greatly reduces the number of operation cycles: it needs only half the execution time of the commonly used Extended Euclid algorithm, which ensures faster computation.

The present invention also makes development more convenient. It provides an IPC (Intelligence Property Core) that can be applied directly to hardware system design, including the synthesizable hardware RTL code that implements the invention in a system design. Applying the invention only requires creating a matching interface for the specific application, which greatly reduces both the difficulty of system design and the design time. This helps users improve the efficiency and correctness of system design, raising product quality while accelerating the development of system products.

FIG. 5 shows the flow of the hardware-based method for inversion in the finite field $GF(p^m)$ according to the present invention, preferably inversion in the finite field $GF(2^m)$. The method is applied to an elliptic curve encryption system and can be implemented by the inversion device 70 shown in FIG. 1 or FIG. 2. It mainly comprises the following steps:

Step S501: take an element polynomial a in the finite field $GF(p^m)$ as the input element a, and define an irreducible polynomial f(x), where

$a = a_{m-1}x^{m-1} + a_{m-2}x^{m-2} + \dots + a_1 x + a_0$, $f(x) = x^m + f_{m-1}x^{m-1} + \dots + f_1 x + f_0$,

$a_i$ and $f_i$ are the coefficients of the input element a and the irreducible polynomial f(x) respectively, m is a positive integer, and x is the variable of f(x).

Step S502: the input element a and the irreducible polynomial f(x) are repeatedly multiplied or divided by x² and added; after m iterations, the multiplicative inverse $a^{-1}$ of the input element a is output. This step further comprises:

1) Initialize the polynomial variables S, R, U and V to f, a, 1 and 0 respectively, and initialize the variable δ to 0. Specifically, the input element a is loaded into the R register 71, the irreducible polynomial f(x) is loaded into the S register 72, the U register 73 is set to 1, and the V register 74 and the δ register 75 are set to 0; the computation then begins.

2) Calculate the intermediate variables q and e from the two highest coefficients of the variables R and S, $r_m r_{m-1}$ and $s_m s_{m-1}$. Specifically, the control logic module 78 computes the two intermediate variables q and e from the two highest coefficients of the R register 71 and the S register 72, where $q = s_m$ and $e = s_{m-1} - s_m r_{m-1}$.

3) Calculate the four variables S, R, U and V according to the control signals $r_m$, $r_{m-1}$, δ₀, δ₁, e and i; the calculation completes within one clock cycle, and the values of the four variables are updated in the next clock cycle. Specifically, the RS calculation logic module 76 and the UV calculation logic module 77 perform their operations according to the six control signals $r_m$, $r_{m-1}$, δ₀, δ₁, e and i from the control logic module 78 and complete the calculation within one clock cycle. In the next clock cycle the updated values of R, S, U and V are written back into the R register 71, S register 72, U register 73 and V register 74, and the loop counter i is updated at the same time.

4) After m iterations, the output is the multiplicative inverse $a^{-1}$ of the input element a in the finite field $GF(p^m)$.
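The initialization in sub-step 1) can be sketched as a small state object. This is a minimal sketch assuming p = 2 and integer-packed polynomials (bit i is the coefficient of x^i). The class and function names are hypothetical, and the input checks are additions for illustration rather than something the patent specifies.

```python
from dataclasses import dataclass


@dataclass
class InverterState:
    R: int      # m+1 bits, loaded with the input element a
    S: int      # m+1 bits, loaded with the irreducible polynomial f(x)
    U: int      # m bits, starts at 1
    V: int      # m bits, starts at 0
    delta: int  # starts at 0
    i: int      # loop counter, starts at 0


def init_state(a, f, m):
    """Load the registers as described in sub-step 1)."""
    if f.bit_length() != m + 1:
        raise ValueError("f must have degree exactly m")
    if not 0 < a < (1 << m):
        raise ValueError("a must be a nonzero polynomial of degree < m")
    return InverterState(R=a, S=f, U=1, V=0, delta=0, i=0)


state = init_state(0b101, 0b1011, 3)  # a = x^2 + 1, f = x^3 + x + 1
```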

FIG. 6 is an example flowchart of the preferred finite field inversion method of the present invention, applied to the elliptic curve encryption system 100 and implementable by the inversion device 70 shown in FIG. 1 or FIG. 2. The operation, carried out over the finite field $GF(p^m)$, computes the element $a^{-1}$ that satisfies:

$a \cdot a^{-1} \equiv 1 \pmod{f(x)}, \quad a \in GF(p^m)$

The method takes an element polynomial a of the finite field $GF(p^m)$ as the input element a and defines an irreducible polynomial f(x). In the implementation, both are treated as polynomials:

$f(x) = x^m + f_{m-1}x^{m-1} + \dots + f_1 x + f_0$; and

$a = a_{m-1}x^{m-1} + a_{m-2}x^{m-2} + \dots + a_1 x + a_0$.

The coefficients $a_i$ and $f_i$ of the input element a and the irreducible polynomial f(x), for i from 0 to m-1, are elements of the finite field GF(p). The basic operations in the inversion method described below are therefore the basic operations on elements of GF(p).
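The defining property of the multiplicative inverse can be checked in software with an ordinary shift-and-add multiplication in $GF(2^m)$. The sketch below assumes p = 2 and integer-packed polynomials; `gf2m_mul` is a hypothetical helper used only to verify a candidate inverse and is not part of the patented method.

```python
def gf2m_mul(a, b, f, m):
    """Multiply a and b in GF(2^m) modulo f (f includes the x^m term)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2) is XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if (a >> m) & 1:
            a ^= f           # reduce when degree reaches m
    return result


# m = 3, f(x) = x^3 + x + 1: the inverse of x^2 + 1 is x.
product = gf2m_mul(0b101, 0b010, 0b1011, 3)
```

A candidate output of an inverter is correct exactly when this product equals 1.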

The inversion method over the finite field $GF(p^m)$ of the present invention may comprise the following steps:

1) Initialize the variables S, R, U, V and δ to f, a, 1, 0 and 0 respectively; initialize the intermediate variables q, e, temp1 and temp2 to 0;

2) For i from 0 to m-1, perform the following steps; specifically, the RS calculation logic module 76 and the UV calculation logic module 77 are controlled by the control signals $r_m$, $r_{m-1}$, δ₀, δ₁, e and i of the control logic module 78 to perform them:

3) Calculate the intermediate variables $q = s_m$ and $e = s_{m-1} - s_m r_{m-1}$, where $s_m$ and $s_{m-1}$ are the highest and second-highest coefficients of the variable S, and $r_{m-1}$ is the second-highest coefficient of the variable R;

4) Calculate the intermediate variable $T = S - s_m R$;

5) Calculate the intermediate variable $W = V - s_m U$;

6) If the highest coefficient $r_m$ of the variable R is the element 1 of GF(p), perform the following steps:

7) If the lowest two bits of the variable δ are both 0 and the value of the variable e is the element 0 of GF(p), perform the following sub-steps 7a to 7h (step S6):

7a) Replace the value of temp1 with the value of R;

7b) Calculate x²T and replace the value of R with the result;

7c) Replace the value of S with the value of temp1;

7d) Replace the value of temp2 with the value of U;

7e) Calculate xW mod f and replace the value of W with the result;

7f) Repeat sub-step 7e, then replace the value of U with the value of W;

7g) Replace the value of V with the value of temp2;

7h) Replace δ with δ+2;

8) If the lowest two bits of the variable δ are both 0 and the value of the variable e is the element 1 of GF(p), perform the following sub-steps 8a to 8f (step S5):

8a) Calculate xR - x²eT and replace the value of temp1 with the result;

8b) Calculate xT and replace the value of R with the result;

8c) Replace the value of S with the value of temp1;

8d) Calculate U - e(xW mod f) and replace the value of temp2 with the result;

8e) Replace the value of U with the value of W;

8f) Replace the value of V with the value of temp2;

9) If the lowest bit and the second-lowest bit of the variable δ are 1 and 0 respectively, perform the following sub-steps 9a to 9f (step S2):

9a) Replace the value of temp1 with the value of R;

9b) Calculate x²T - x(eR) and replace the value of R with the result;

9c) Replace the value of S with the value of temp1;

9d) Calculate U/x mod f and replace the value of temp2 with the result;

9e) Calculate x(W - e·temp2) mod f and replace the value of U with the result;

9f) Replace the value of V with the value of temp2.

10) If the lowest bit and the second-lowest bit of the variable δ are both 1, perform the following sub-steps 10a to 10e (step S1):

10a) Calculate x²T - x(e·R) and replace the value of S with the result;

10b) Calculate U/x mod f and replace the value of temp1 with the result;

10c) Calculate W - e·temp1 and replace the value of V with the result;

10d) Repeat sub-step 10b, then replace the value of U with the value of temp1;

10e) Replace δ with δ-2.

11) If the highest coefficient $r_m$ and the second-highest coefficient $r_{m-1}$ of the variable R are both the element 0 of GF(p), perform the following sub-steps 11a to 11d (step S4):

11a) Calculate x²R and replace the value of R with the result;

11b) Calculate xU mod f and replace the value of U with the result;

11c) Repeat sub-step 11b;

11d) Replace δ with δ+2.

12) If the highest coefficient $r_m$ and the second-highest coefficient $r_{m-1}$ of the variable R are the elements 0 and 1 of GF(p) respectively, perform the following sub-steps 12a to 12d (step S3):

12a) Calculate xR and replace the value of R with the result;

12b) Calculate xT and replace the value of S with the result;

12c) Calculate xU mod f and replace the value of temp1 with the result;

12d) Calculate V - q·temp1 and replace the value of V with the result;

13) Increment the loop counter i by one; while i is less than m-1, return to step 2;

14) After counting m times, that is, when i equals m-1, the inversion in the finite field $GF(p^m)$ is complete, and the output is the multiplicative inverse $a^{-1}$ of the input element a.
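The pre-computation of steps 4) and 5) and the branch conditions of steps 6) to 12) can be transcribed into a small Python sketch. It assumes p = 2 (so subtraction is XOR) and takes δ₀ and δ₁ as the lowest and second-lowest bits of δ, as the step descriptions word them. The function names are hypothetical, and the combination δ₀ = 0, δ₁ = 1, which the listed steps do not cover, returns `None` rather than guessing a behavior.

```python
def precompute(S, R, U, V, m):
    """Steps 4) and 5) over GF(2): T = S - s_m*R and W = V - s_m*U."""
    s_m = (S >> m) & 1
    T = S ^ R if s_m else S
    W = V ^ U if s_m else V
    return T, W


def select_branch(r_m, r_m1, delta0, delta1, e):
    """Return the flowchart step label (S1..S6) chosen by steps 6) to 12)."""
    if r_m == 1:
        if delta0 == 0 and delta1 == 0:
            return "S6" if e == 0 else "S5"   # steps 7) and 8)
        if delta0 == 1 and delta1 == 0:
            return "S2"                        # step 9)
        if delta0 == 1 and delta1 == 1:
            return "S1"                        # step 10)
        return None                            # not covered by the listed steps
    return "S4" if r_m1 == 0 else "S3"         # steps 11) and 12)
```

Exactly one branch is taken per iteration, which is consistent with the statement that each iteration completes in a single clock cycle.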

The present invention also has the following features:

1) The number of clock cycles required for each operation is constant, that is, it does not vary with the input data, which helps reduce design and control difficulty in large system applications.

2) The required hardware area is proportional to the p and m that define the finite field $GF(p^m)$, that is, it does not change with the input data or with f.

3) It is highly configurable: for example, by adding more m-bit registers it can be applied to elliptic curve encryption systems in which the irreducible polynomial f varies.

In summary, the present invention takes an element polynomial a in the finite field $GF(p^m)$ or $GF(2^m)$ as the input element a, defines an irreducible polynomial f(x), and repeatedly multiplies or divides the input element a and the irreducible polynomial f(x) by x² and adds them; after m iterations, the multiplicative inverse $a^{-1}$ of the input element a is output. The present invention thus completes one inversion in only m clock cycles, half of the 2m clock cycles required by the existing Extended Euclid algorithm, which ensures faster computation and greatly improves inversion efficiency. In addition, the operating frequency of the inversion device of the present invention is close to that of the other computing devices in the elliptic curve encryption system, so that the hardware resources of the inversion device are fully utilized and the computing performance of the whole system is improved.
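For checking the output of an inverter in software, a plain polynomial Extended Euclid inversion over GF(2) can serve as a reference. The sketch below is the classical algorithm, not the method of the present invention, and it makes no attempt to model clock cycles. It assumes p = 2 and integer-packed polynomials; all function names are hypothetical.

```python
def poly_mul(a, b):
    """Carry-less product of two polynomials over GF(2)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def poly_divmod(n, d):
    """Quotient and remainder of n divided by d over GF(2); d must be nonzero."""
    q = 0
    d_len = d.bit_length()
    while n.bit_length() >= d_len:
        shift = n.bit_length() - d_len
        q ^= 1 << shift
        n ^= d << shift
    return q, n


def inverse_euclid(a, f):
    """Return a^-1 modulo f using the extended Euclidean algorithm."""
    r0, r1 = f, a
    t0, t1 = 0, 1
    while r1:
        q, rem = poly_divmod(r0, r1)
        r0, r1 = r1, rem
        t0, t1 = t1, t0 ^ poly_mul(q, t1)
    if r0 != 1:
        raise ValueError("element is not invertible modulo f")
    return t0


inv = inverse_euclid(0b101, 0b1011)  # inverse of x^2 + 1 modulo x^3 + x + 1
```

Comparing a hardware or simulated result against `inverse_euclid` for every nonzero element of a small field is a simple way to test an implementation.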

The present invention may of course have various other embodiments. Without departing from the spirit and essence of the present invention, those skilled in the art may make corresponding changes and modifications based on it, but all such changes and modifications shall fall within the scope of protection of the appended claims.

Claims

1. A method for finite field inversion based on hardware design, applied to an elliptic curve encryption system, characterized in that the method comprises the following steps:

A. Take an element polynomial a in the finite field $GF(p^m)$ as the input element a, and define an irreducible polynomial f(x), where $a = a_{m-1}x^{m-1} + a_{m-2}x^{m-2} + \dots + a_1 x + a_0$, $f(x) = x^m + f_{m-1}x^{m-1} + \dots + f_1 x + f_0$, $a_i$ and $f_i$ are the coefficients of the input element a and the irreducible polynomial f(x) respectively, m is a positive integer, and x is the variable of f(x);

B. The input element a and the irreducible polynomial f(x) are repeatedly multiplied or divided by x² and added, and after m iterations the multiplicative inverse $a^{-1}$ of the input element a is output.

2. The method according to claim 1, wherein said step B further comprises:

Initializing the polynomial variables S, R, U and V to f, a, 1 and 0 respectively, and initializing the variable δ to 0; calculating the intermediate variables q and e from the two highest coefficients of the variables R and S, $r_m r_{m-1}$ and $s_m s_{m-1}$; then calculating the four variables S, R, U and V according to the control signals $r_m$, $r_{m-1}$, δ₀, δ₁, e and i, the calculation being completed within one clock cycle and the values of the four variables S, R, U and V being updated in the next clock cycle; and, after m iterations, outputting the multiplicative inverse $a^{-1}$ of the input element a.

3. The method according to claim 2, wherein said step B further comprises:

B1. Initialize the variables S, R, U, V and δ to f, a, 1, 0 and 0 respectively; initialize the intermediate variables q, e, temp1 and temp2 to 0;

B2. For the range of i from 0 to m-1, perform the following steps:

B3. Calculate the intermediate variables $q = s_m$ and $e = s_{m-1} - s_m r_{m-1}$, where $s_m$ and $s_{m-1}$ are the highest and second-highest coefficients of the variable S, and $r_{m-1}$ is the second-highest coefficient of the variable R;

B4. Calculate the intermediate variable $T = S - s_m R$;

B5. Calculate the intermediate variable $W = V - s_m U$;

B6. If the highest coefficient $r_m$ of the variable R is the element 1 of GF(p), perform the following steps:

B7. If the lowest two bits of the variable δ are both 0, and the value of the variable e is the element 0 of GF(p), perform the following sub-steps B7a to B7h:

B7a) replace the value of temp1 with the value of R;

B7b) calculate x²T and replace the value of R with the result;

B7c) replace the value of S with the value of temp1;

B7d) replace the value of temp2 with the value of U;

B7e) Calculate xW mod f, and replace the value of W with the result;

B7f) repeat substep B7e, replace the value of U with the value of W;

B7g) replace the value of V with the value of temp2;

B7h) replace δ with δ+2;

B8. If the lowest two bits of the variable δ are both 0, and the value of the variable e is the element 1 of GF(p), perform the following sub-steps B8a to B8f:

B8a) calculate xR - x²eT, and replace the value of temp1 with the result;

B8b) Calculate xT, and replace the value of R with the result;

B8c) replace the value of S with the value of temp1;

B8d) calculate U-e(xW mod f), and replace the value of temp2 with the result;

B8e) replace the value of U with the value of W;

B8f) replace the value of V with the value of temp2;

B9. If the lowest bit and second lowest bit of the variable δ are 1 and 0 respectively, then perform the following sub-steps B9a-B9f:

B9a) replace the value of temp1 with the value of R;

B9b) calculate x²T - x(eR), and replace the value of R with the result;

B9c) replace the value of S with the value of temp1;

B9d) calculate U/x mod f, and replace the value of temp2 with the result;

B9e) calculate x(W - e·temp2) mod f, and replace the value of U with the result;

B9f) replace the value of V with the value of temp2;

B10. If the lowest and second lowest bits of the variable δ are 1 and 1 respectively, then perform the following sub-steps B10a-B10e:

B10a) calculate x²T - x(e·R), and replace the value of S with the result;

B10b) calculate U/x mod f, and replace the value of temp1 with the result;

B10c) calculate W - e·temp1, and replace the value of V with the result;

B10d) repeating sub-step B10b, replacing the value of U with the value of temp1;

B10e) replace δ with δ-2;

B11. If both the highest coefficient $r_m$ and the second-highest coefficient $r_{m-1}$ of the variable R are the element 0 of GF(p), perform the following sub-steps B11a to B11d:

B11a) calculate x²R and replace the value of R with the result;

B11b) calculate xU mod f, and replace the value of U with the result;

B11c) Repeat sub-step B11b;

B11d) replace δ by δ+2;

B12. If the highest coefficient $r_m$ and the second-highest coefficient $r_{m-1}$ of the variable R are the elements 0 and 1 of GF(p) respectively, perform the following sub-steps B12a to B12d:

B12a) Calculate xR, and substitute the result for the value of R;

B12b) Calculate xT, and replace the value of S with the result;

B12c) calculate xU mod f, and replace the value of temp1 with the result;

B12d) calculate V-q·temp1, and replace the value of V with the result;

B13. Increment the loop counter i by one; while i is less than m-1, return to step B2;

B14. After counting m times, that is, when i equals m-1, the inversion in the finite field $GF(p^m)$ is complete, and the output is the multiplicative inverse $a^{-1}$ of the input element a.

4. The method according to claim 3, characterized in that the coefficients $a_i$ and $f_i$ of the input element a and the irreducible polynomial f(x), for i from 0 to m-1, belong to the finite field GF(p).

5. The method according to claim 3, characterized in that the method performs element inversion in the finite field $GF(2^m)$ by means of a hardware inversion device, and the operating frequency of the inversion device is close to that of the other computing devices in the elliptic curve encryption system.

6. The method according to claim 5, characterized in that addition and subtraction of elements of the finite field $GF(2^m)$ are the bitwise XOR of their vectors; multiplying or dividing an element by x shifts its vector left or right by one bit and fills with zeros; and reducing an element modulo f(x) means XORing the element with f(x) bit by bit, so that the result remains in the finite field $GF(2^m)$.

7. The method according to claim 6, characterized in that each time the variable U is multiplied or divided by x, if the highest bit is 1, the value must be XORed bitwise with

$f(x) = x^{m} + \sum_{i=0}^{m-1} f_{i} x^{i}$

to reduce it modulo f, so that the result remains in the finite field $GF(2^m)$.

8. A device for implementing the finite field inversion method according to any one of claims 1 to 7, applied to an elliptic curve encryption system, wherein the finite field is $GF(2^m)$, and the device comprises:

Registers R, S, U, V and δ, wherein the registers R, S, U and V store the polynomial variables R, S, U and V respectively, the δ register records the value of the variable δ, and changes in the variable δ reflect the shifting of the U register; at initialization, an element polynomial a in the finite field $GF(2^m)$ is loaded into the R register as the input element a, an irreducible polynomial f(x) is defined and loaded into the S register, the U register is set to 1, and the V register and the δ register are set to 0;

RS calculation logic module for updating variables R and S;

UV calculation logic module for updating variables U and V;

A control logic module for calculating the intermediate variables q and e from the two highest coefficients of the R register and the S register, $r_m r_{m-1}$ and $s_m s_{m-1}$, and then controlling the operation of the RS calculation logic module and the UV calculation logic module according to the control signals $r_m$, $r_{m-1}$, δ₀, δ₁, e and i;

The RS calculation logic module and the UV calculation logic module complete the calculation within one clock cycle and, in the next clock cycle, write the updated values of the four variables S, R, U and V back into the registers S, R, U and V while the loop counter i is updated; after m iterations the multiplicative inverse $a^{-1}$ of the input element a is output.

9. The device according to claim 8, characterized in that the RS calculation logic module and the UV calculation logic module each implement the basic operations in hardware,

The RS calculation logic module consists of m+1 identical RS calculation logic units arranged in parallel; the RS calculation logic units work in parallel, write the updated values into the R register and the S register after completing the operation within one clock cycle, and then wait for the control logic instruction of the control logic module before the next round of calculation;

The UV calculation logic module consists of m identical UV calculation logic units arranged in parallel; the UV calculation logic units work in parallel, write the updated values into the U register and the V register after completing the operation within one clock cycle, and then wait for the control logic instruction of the control logic module before the next round of calculation.

10. The device according to claim 8, wherein the R register and the S register each have a storage capacity of m+1 bits, and the U register and the V register each have a storage capacity of m bits; the R register and the S register affect each other, the U register and the V register affect each other, and the operations of these two sets of registers are synchronized.