CN102122241A - Modular multiplier/divider applicable to prime field and polynomial field

Publication number
CN102122241A
Authority
CN
China
Application number
CN2010100226476A
Other languages
Chinese (zh)
Inventor
曹丹
曾晓洋
韩军
黄伟
Original Assignee
复旦大学
Application filed by 复旦大学
Priority to CN2010100226476A
Publication of CN102122241A

Abstract

The invention relates to a dual-field modular multiplier/divider suited to the ECC (Elliptic Curve Cryptography) algorithm used in high-speed network applications and portable mobile devices. The modular multiplier/divider comprises four PE arithmetic units, five register files (Regfile), a Booth encoding unit, an input register (Load file), a control module (control) and 17 multiplexers. The 17 multiplexers change the connections among the four PE arithmetic units and the locations from which data are read, so that the same hardware performs either modular multiplication or modular division. The design is scalable, supports modular multiplication and division of up to 480 bits, and shares hardware units between multiplication and division to reduce area. In the algorithm, the additions, subtractions and shifts on long operands are carried out in units of bytes, which greatly speeds up the convergence of the algorithm and multiplies the operating speed.

Description

A modular multiplier/divider for prime fields and polynomial fields

Technical Field

[0001] The present invention belongs to the technical field of integrated circuit design, and specifically relates to a dual-field modular multiplier/divider for the Elliptic Curve Cryptography (ECC) algorithm, suited to the needs of high-speed network applications and portable mobile devices.

Background Art

[0002] In the modern era, as informatization deepens, more and more information is exposed on public media. To protect sensitive information, various cryptographic algorithms are applied in wireless network communication. However, the relatively limited processing capability of communication devices, portable devices in particular, can no longer keep up with the growing volume of data.

[0003] Cryptosystems can generally be divided into two types: symmetric cryptosystems and public-key cryptosystems. The greatest advantage of a symmetric cryptosystem is its high efficiency, but its main drawbacks are obvious. One is key distribution: the channel used to distribute keys must be both confidential and authentic. The other is key management: in a network of N entities, each entity must hold the keys of the other N-1 entities. Public-key cryptosystems have neither drawback: they only require the key exchange to be authentic, not confidential, and they provide confidentiality and non-repudiation.

[0004] At present, Elliptic Curve Cryptography (ECC) has become one of the most favored public-key cryptosystems besides RSA, and it provides the same functions as the RSA cryptosystem. Its security rests on the difficulty of the elliptic curve discrete logarithm problem (ECDLP). It is generally accepted that a 160-bit elliptic curve cipher provides security equivalent to 1024-bit RSA. Because the keys are short, encryption and decryption are faster in practice, and power, bandwidth and storage are saved. The core operations of the ECC algorithm are modular multiplication and inversion, whose performance determines the performance of the whole ECC chip. In addition, the ECC algorithm exhibits obvious temporal parallelism. A high-performance, full-featured modular multiplier/inverter that fully exploits this parallelism can reach a high encryption speed at a small hardware cost and meet the needs of current high-speed network applications. Because its hardware consumption and power are low, it also suits portable mobile devices. As computer technology develops, computing speed is rising rapidly, so data lengths must be increased to strengthen the security of the encryption algorithm. The hardware structure of this design is scalable, so the data width can be extended easily.

Summary of the Invention

[0005] The object of the invention is to provide a dual-field modular multiplier/divider for the elliptic curve cryptography (ECC) algorithm, suited to the needs of high-speed network applications and portable mobile devices, that is scalable and significantly reduces hardware cost.

[0006] The technical solution of the invention is a modular multiplier/divider for prime fields and polynomial fields (shown in Figure 1), consisting of 17 multiplexers (1-17), four arithmetic units PE (26-29), an input register (23), five register files (18-22), a Booth encoding unit (24) and a control module (25), wherein:

[0007] a. When the second multiplexer (2) and the eighth multiplexer (8) select state 1 and the 13 multiplexers (3-7, 10-17) select state 0, the first multiplexer (1), the ninth multiplexer (9), the five register files (18-22), the input register (23), the Booth encoding unit (24), the control module (25) and the four PE arithmetic units (26-29) form a modular divider (see Figure 5); modular division uses the Euclidean algorithm; wherein:

[0008] The first multiplexer (1) takes the output of the arithmetic unit PE0 (26) as its input; the stage signal selects whether this output goes to the first register file (18) or the second register file (19);

[0009] The ninth multiplexer (9) takes the output of the arithmetic unit PE2 (28) as its input; the stage signal selects whether this output goes to the third register file (20) or the fourth register file (21);

[0010] The outputs of the first register file (18) and the second register file (19) both go to the input register (23);

[0011] The outputs of the third register file (20) and the fourth register file (21) both go to the PE3 arithmetic unit (29);

[0012] The output of the fifth register file (22) goes to the Booth encoder (24);

[0013] The output of the input register (23) goes to the PE0 arithmetic unit (26);

[0014] The Booth encoding unit (24) outputs to the PE2 arithmetic unit (28);

[0015] The control module (25) outputs to the PE0 and PE1 arithmetic units (26, 27);

[0016] The output of the PE0 arithmetic unit (26) is written to the first register file (18) or the second register file (19), as selected by the stage signal;

[0017] The output of the PE1 arithmetic unit (27) goes to the PE2 arithmetic unit (28);

[0018] The output of the PE2 arithmetic unit (28) is written to the third register file (20) or the fourth register file (21), as selected by the stage signal;

[0019] The output of the PE3 arithmetic unit (29) goes to the PE1 arithmetic unit (27);

[0020] b. When the first multiplexer (1) and the ninth multiplexer (9) are inactive and the 13 multiplexers (3-7, 10-17) select state 1, the second multiplexer (2), the eighth multiplexer (8), the five register files (18-22), the input register (23), the Booth encoding unit (24), the control module (25) and the four PE arithmetic units (26-29) form a modular multiplier (see Figure 6); modular multiplication uses the Montgomery algorithm; wherein:

[0021] The second multiplexer (2) takes as inputs the data stored in the first register file (18) and the fourth register file (21), uses First_round as its select signal, and outputs the correct data to the input register (23);

[0022] The eighth multiplexer (8) takes as inputs the data stored in the second register file (19) and the fifth register file (22), uses First_round as its select signal, and outputs the correct data to the input register (23);

[0023] The third register file (20) outputs to the Booth encoder (24);

[0024] The two outputs of the input register (23) go to the PE0 arithmetic unit (26);

[0025] The outputs of the Booth encoding unit (24) provide inputs to each of the four PE arithmetic units (26-29);

[0026] The output of the PE0 arithmetic unit (26) goes to the PE1 arithmetic unit (27);

[0027] The output of the PE1 arithmetic unit (27) goes to the PE2 arithmetic unit (28);

[0028] The output of the PE2 arithmetic unit (28) goes to the PE3 arithmetic unit (29);

[0029] The output of the PE3 arithmetic unit (29) goes to the fifth register file (22) and the fourth register file (21), respectively.
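As an illustration of the Montgomery multiplication that this configuration carries out, here is a minimal bit-serial sketch in Python. It assumes an odd modulus and radix 2; the improved word-level variant of Figure 4 is not reproduced in this text, and the names `mont_mul` and `to_mont` are illustrative only.

```python
def mont_mul(a, b, p, n):
    """Return a * b * 2^(-n) mod p, for odd p < 2^n and 0 <= a, b < p."""
    u = 0
    for i in range(n):
        u += ((a >> i) & 1) * b   # accumulate one bit of the multiplier
        if u & 1:                 # add the modulus so that u becomes even
            u += p
        u >>= 1                   # exact division by 2
    return u - p if u >= p else u  # u < 2p here, one subtraction suffices


def to_mont(x, p, n):
    """Map x into the Montgomery domain: x * 2^n mod p."""
    return (x << n) % p
```

Multiplying two values in the Montgomery domain keeps the product in that domain; a final `mont_mul(x, 1, p, n)` maps the result back to an ordinary residue.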

[0030] The arithmetic unit PE described above (shown in Figure 2) consists of six multiplexers (30-35), two registers (36, 37), two inverting controllers (38, 39), three carry-save adders (40, 41, 42), a PE internal control module (43) and a PE internal shifter (44), wherein:

[0031] The eighteenth multiplexer (30) takes as inputs mul_h, the multiple of the multiplicand to accumulate during modular multiplication, and div_h, the multiple of the D register value to accumulate during modular division. The operation function signal Func_Sel selects between them, and the resulting multiple-select signal for the W register value goes to the select control input of the nineteenth multiplexer (31);

[0032] The nineteenth multiplexer (31) takes as inputs 0, the value of the W register and twice the value of the W register (2*W). The output of the eighteenth multiplexer (30) serves as its select signal, and the selected operand goes to the first inverting controller (38);

[0033] The twentieth multiplexer (32) takes as inputs double1 and double2, which select the multiple of the modulus P in a word operation. Func_Sel selects between them, and the resulting multiple-select signal goes to the twenty-third multiplexer (35);

[0034] The twenty-first multiplexer (33) takes as inputs zero1 and zero2, which select the multiple of the modulus P in a word operation. Func_Sel selects between them, and the resulting multiple-select signal goes to the twenty-third multiplexer (35);

[0035] The twenty-second multiplexer (34) takes as inputs neg1 and neg2, which select the multiple of the modulus P in a word operation. Func_Sel selects between them, and the resulting multiple-select signal goes to the second inverting controller (39);

[0036] The twenty-third multiplexer (35) takes as inputs 0, the value of the modulus P and twice the modulus (2*P). The outputs of the twentieth and twenty-first multiplexers (32, 33) serve as its select signals, and the selected operand goes to the second inverting controller (39);

[0037] The seventh register (36) holds one word of the modulus P during the operation;

[0038] The eighth register (37) holds one word of W during the operation;

[0039] The first inverting controller (38) takes the output of the nineteenth multiplexer (31) as its input and the output of the eighteenth multiplexer (30) as its control signal; its output is the input value after inversion;

[0040] The second inverting controller (39) takes the output of the twenty-third multiplexer (35) as its input and the output of the twenty-second multiplexer (34) as its control signal; its output is the input value after inversion;

[0041] The first carry-save adder (40) takes as inputs the result and carry (Uc, Us) of the current word of the U register, which holds one operand, the field select signal Field (prime field or polynomial field), the first-word signal First_word, and the output of the first inverting controller (38). Its outputs are the sum Us_1 and the carry Uc_1;

[0042] The second carry-save adder (41) takes as inputs the outputs Uc_1 and Us_1 of the first carry-save adder (40), the field select signal Field, the first-word signal First_word, and the output of the second inverting controller (39). Its outputs are the sum Us_2 and the carry Uc_2;

[0043] The third carry-save adder (42) takes as inputs Uc_3, Us_3 and carry from the PE internal shifter (44). Its outputs are the sum Us_out and the carry Uc_out;

[0044] The PE internal control module (43) takes as inputs the low two bits of the outputs Us_1 and Uc_1 of the first carry-save adder (40), the low two bits of the modulus P, and the right-shift amount shift. Its output is the control signal that determines the multiple of the modulus P to be added subsequently;

[0045] The PE internal shifter (44) takes as inputs the outputs Us_2 and Uc_2 of the second carry-save adder (41) and the right-shift amount shift. Its outputs are the right-shifted values Us_3 and Uc_3 and the carry shifted out.
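The chain of two carry-save adders, a shifter and a third carry-save adder can be modeled behaviorally. The Python sketch below is an assumption-laden illustration, not the circuit of Figure 2: it uses unbounded integers instead of 32-bit words, takes the already selected (non-negative) operands `w_term` and `p_term` as plain arguments, and models the Field signal as a flag that suppresses carries in the polynomial field. The names `csa` and `pe_step` are hypothetical.

```python
def csa(x, y, z, prime_field=True):
    """3:2 carry-save adder: x + y + z == s + c in the prime field.
    In the polynomial field addition is XOR, so no carry is produced."""
    s = x ^ y ^ z
    c = (((x & y) | (x & z) | (y & z)) << 1) if prime_field else 0
    return s, c


def pe_step(us, uc, w_term, p_term, shift, prime_field=True):
    """Model of (U + w_term + p_term) >> shift with U kept as (sum, carry)."""
    s1, c1 = csa(us, uc, w_term, prime_field)   # first CSA: add the W multiple
    s2, c2 = csa(s1, c1, p_term, prime_field)   # second CSA: add the P multiple
    low = (1 << shift) - 1
    carry = ((s2 & low) + (c2 & low)) >> shift  # carry out of the dropped bits
    s3, c3 = s2 >> shift, c2 >> shift           # shifter output
    return csa(s3, c3, carry, prime_field)      # third CSA folds the carry in
```

Keeping U in carry-save form avoids a full carry-propagate addition in every cycle, which is the usual reason for choosing this adder type on a timing-critical path.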

[0046] The dual-field modular multiplier/divider proposed by the invention for the ECC (elliptic curve cryptography) algorithm, suited to high-speed network applications and portable mobile devices, can perform modular division on operands of up to 480 bits. Synthesized in an SMIC 0.18 μm CMOS process, it has a critical path delay of 4.71 ns and a maximum frequency of 212.3 MHz. A 256-bit modular division takes 1.3 μs with an area of 3¾ equivalent gates, and a 256-bit modular multiplication takes ljus with an area of 11.4k equivalent gates.

[0047] ECC is a public-key algorithm. In ECDH and ECDSA, the computation is concentrated in point multiplication; point multiplication is built from point doubling and point addition, and these in turn consist mainly of modular multiplication and modular division over a finite field. Implementing finite-field modular multiplication and division in hardware, and implementing the different public-key algorithms in upper-layer software, is the best trade-off between performance and flexibility. Finite-field modular division is now commonly handled with the Euclidean algorithm (see Figure 3 for the specific algorithm), and finite-field modular multiplication with the Montgomery modular multiplication algorithm (see Figure 4).
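To make the layering concrete, the following Python sketch shows how point multiplication reduces to point doubling and point addition, and how those reduce to modular multiplications and one modular division each. Everything specific here is an assumption chosen for illustration: the toy curve y^2 = x^3 + 2x + 3 over GF(97), affine coordinates, and the helper `field_div`, which stands in for the hardware divider by using Fermat's little theorem.

```python
P, A, B = 97, 2, 3            # toy curve y^2 = x^3 + A*x + B over GF(P)


def field_div(x, y):
    """x / y mod P; software stand-in for the hardware modular divider."""
    return x * pow(y % P, P - 2, P) % P


def point_add(p1, p2):
    """Affine point addition; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = field_div(3 * x1 * x1 + A, 2 * y1)   # doubling slope
    else:
        lam = field_div(y2 - y1, x2 - x1)          # addition slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mul(k, pt):
    """Left-to-right double-and-add point multiplication."""
    acc = None
    for bit in bin(k)[2:]:
        acc = point_add(acc, acc)
        if bit == "1":
            acc = point_add(acc, pt)
    return acc
```

Each doubling or addition in this affine form costs one field division plus a few multiplications, which is why a unit that performs both operations on shared hardware covers the bulk of the work.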

[0048] One advantage of the invention is that it handles both modular multiplication and modular division in the prime field and the binary field for ECC. The main computing components are PE arithmetic units of identical structure; by reconfiguring the connections between the units, each algorithm can be accelerated accordingly. Hardware is also reused as far as possible, which reduces area while meeting the performance requirement.

[0049] A second advantage is the PE arithmetic unit, designed specifically for the modular multiplication and division in ECC. Analysis of the computation in the prime field and the binary field yields one basic operation: U = (U + mul_h*W + kP) >> shift. Every PE unit can perform this basic operation, so modular multiplication and division are accelerated by assigning the operations sensibly among the PE units.

[0050] A third advantage is that the main computing component consists of several PE units of identical structure, and the structure is configurable (Figure 1 illustrates it with four PE units): PE units can be added or removed according to hardware or power requirements.
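A small Python sketch of this basic operation follows. It assumes an odd modulus P and a shift of at most two bits, and it picks the multiple k from the set {-1, 0, 1, 2} so that the bits shifted out are all zero; that selection rule is an assumption consistent with a datapath that offers 0, P and 2*P plus an inverter, not a quotation of the patent's control logic. The name `basic_op` is hypothetical.

```python
def basic_op(u, mul_h, w, p, shift=2):
    """Compute (u + mul_h*w + k*p) >> shift, with k chosen so that the
    shift is exact. Returns the new value and the chosen k."""
    t = u + mul_h * w
    modulus = 1 << shift
    # p is odd, so k*p runs through every residue modulo 2 or 4.
    k = next(k for k in (0, 1, -1, 2) if (t + k * p) % modulus == 0)
    return (t + k * p) >> shift, k
```

Because adding a multiple of P does not change the value modulo P, the result equals (U + mul_h*W) / 2^shift in the field, which is the step both the Montgomery and the Euclidean-style iterations need.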

Brief Description of the Drawings

[0051] Figure 1 shows the top-level structure of the dual-field modular multiplier/divider of the invention;

[0052] Figure 2 shows the structure of the PE arithmetic unit of the dual-field modular multiplier/divider;

[0053] Figure 3 shows the improved Euclidean modular division algorithm of the invention;

[0054] Figure 4 shows the improved Montgomery modular multiplication algorithm of the invention;

[0055] Figure 5 shows the equivalent hardware structure of the dual-field modular multiplier/divider when performing modular division;

[0056] Figure 6 shows the equivalent hardware structure of the dual-field modular multiplier/divider when performing modular multiplication;

[0057] Figure 7a illustrates the algorithm on the data path during modular division;

[0058] Figure 7b is an equivalent explanatory diagram of Figure 5;

[0059] Figure 8a illustrates the algorithm on the data path during modular multiplication;

[0060] Figure 8b is an equivalent explanatory diagram of Figure 6.

[0061] Reference numerals in the figures: 1 is the first multiplexer, 2 the second multiplexer, 3 the third multiplexer, 4 the fourth multiplexer, 5 the fifth multiplexer, 6 the sixth multiplexer, 7 the seventh multiplexer, 8 the eighth multiplexer, 9 the ninth multiplexer, 10 the tenth multiplexer, 11 the eleventh multiplexer, 12 the twelfth multiplexer, 13 the thirteenth multiplexer, 14 the fourteenth multiplexer, 15 the fifteenth multiplexer, 16 the sixteenth multiplexer, 17 the seventeenth multiplexer, 18 the first register file, 19 the second register file, 20 the third register file, 21 the fourth register file, 22 the fifth register file, 23 the input register, 24 the Booth encoding unit, 25 the control module, 26 the PE0 arithmetic unit, 27 the PE1 arithmetic unit, 28 the PE2 arithmetic unit, 29 the PE3 arithmetic unit, 30 the eighteenth multiplexer, 31 the nineteenth multiplexer, 32 the twentieth multiplexer, 33 the twenty-first multiplexer, 34 the twenty-second multiplexer, 35 the twenty-third multiplexer, 36 the seventh register, 37 the eighth register, 38 the first inverting controller, 39 the second inverting controller, 40 the first carry-save adder, 41 the second carry-save adder, 42 the third carry-save adder, 43 the PE internal control module, and 44 the PE internal shifter.

Detailed Description

[0062] The invention is further described below with reference to the drawings.

[0063] Figure 1 shows the top-level structure of the dual-field modular multiplier/divider. The main computing components are the four arithmetic units PE0 (26), PE1 (27), PE2 (28) and PE3 (29), which have identical structure (see Figure 2) and accelerate the basic operation of modular multiplication and division. The unit performs modular multiplication and division in both the prime field and the binary field. During a modular multiplication (division), the multiplicand (dividend), multiplier (divisor), product (quotient) and modulus are all stored in the five register files (18-22). The Booth encoding unit supplies the mul_h value used in modular multiplication, and the Control unit supplies the div_h value used in modular division. The 17 multiplexers (1-17) change the connections among the four PE units and the locations from which data are read, to perform either modular division or modular multiplication; the whole ECC modular multiplier/divider carries out different algorithms by reconfiguring the connection paths between its units.
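The text does not spell out the encoding that produces mul_h. Since each PE can select 0, W or 2*W and optionally invert the choice, a radix-4 Booth recoding with digits in {-2, -1, 0, 1, 2} is a natural fit; the Python sketch below shows that standard recoding under this assumption. The function name `booth_radix4` is illustrative.

```python
def booth_radix4(y, nbits):
    """Recode an unsigned multiplier y < 2**nbits into radix-4 Booth digits.
    Returns digits in {-2, ..., 2}, least significant first, such that
    sum(d * 4**i) == y."""
    digits = []
    prev = 0                               # bit to the right of the current pair
    for i in range(0, nbits + 1, 2):       # the extra pair absorbs the top bit
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits
```

With this recoding, each multiplier digit maps directly onto a choice among 0, W and 2*W followed by an optional negation, so two multiplier bits are consumed per accumulation step.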

[0064] 1. Modular division state

[0065] The modular division state consists of the first and ninth multiplexers (1, 9), the five register files Regfile (18-22), the input register (23), the Booth encoding unit (24), the control module (25) and the four PE arithmetic units (26-29). The data path is shown in Figure 5, the modular division algorithm performed is shown in Figure 7a, and the top-level explanatory diagram for modular division is shown in Figure 7b, wherein:

[0066] The first multiplexer (1) takes the result computed in PE0 as its input; the stage signal selects whether the result is written to Regfile1 or Regfile2.

[0067] The ninth multiplexer (9) takes the result computed in PE2 as its input; the stage signal selects whether the result is written to Regfile3 or Regfile4.

[0068] The first register file (18): when the stage signal is 0, it stores the result and carry (Cc, Cs) of the C register, which is initialized with the divisor; when the stage signal is 1, it stores the result and carry (Dc, Ds) of the D register, which is initialized with the modulus P. Its input is the result computed by PE0, and its output goes to the input register (23).

[0069] The second register file (19): when the stage signal is 0, it stores the result and carry (Dc, Ds) of the D register, which is initialized with the modulus P; when the stage signal is 1, it stores the result and carry (Cc, Cs) of the C register, which is initialized with the divisor. Its input is the result computed by PE0, and its output goes to the input register (23).

[0070] The third register file (20): when the stage signal is 0, it stores the result and carry (Uc, Us) of the U register, which is initialized with the dividend; when the stage signal is 1, it stores the result and carry (Wc, Ws) of the W register, which is initialized with 0. Its input is the result computed by PE2, and its output goes to the PE3 arithmetic unit.

[0071] The fourth register file (21): when the stage signal is 0, it stores the result and carry (Wc, Ws) of the W register, which is initialized with 0; when the stage signal is 1, it stores the result and carry (Uc, Us) of the U register, which is initialized with the dividend. Its input is the result computed by PE2, and its output goes to the PE3 arithmetic unit.

[0072] The fifth register file (22) stores the value of the modulus (0, P) whether the stage signal is 0 or 1; its output goes to the Booth encoder (24).

[0073] The input register (23) takes the outputs of the first and second register files (18, 19) as its input; its output goes to the PE0 arithmetic unit.

[0074] The Booth encoding unit (24): when Func_Sel selects modular multiplication, it acts as a Booth encoder and encodes the multiplier; during modular division it is used only as a register. Its input is the output of the fifth register file (22), and its output goes to the PE2 arithmetic unit.

[0075] The control module (25) supplies the parameter div_h that the PE0 and PE1 arithmetic units need for their computation; its output goes to the PE0 and PE1 arithmetic units.

[0076] The PE0 arithmetic unit (26) performs C = (C + div_h*D) >> shift. Its inputs are the values (C, D) from the input register (23) and div_h from the control module (25). Its output is the carry-save adder result (Cc, Cs), which is written to the first register file (18) or the second register file (19) as selected by the stage signal.

[0077] The PE1 arithmetic unit (27) performs U = U + div_h*W. Its inputs are the values (U, W) from the PE3 arithmetic unit (29) and div_h from the control module (25). Its output is the carry-save adder result (Uc, Us), which goes to the PE2 arithmetic unit (28).

[0078] The PE2 arithmetic unit (28) performs U = (U + kP) >> shift. Its inputs are the output of the PE1 arithmetic unit (27) and the modulus P supplied by the Booth encoding unit. Its output is the carry-save adder result (Uc, Us), which is written to the third register file (20) or the fourth register file (21) as selected by the stage signal.

[0079] The PE3 arithmetic unit (29) only stores the data read from the third register file (20) or the fourth register file (21); its output goes to the PE1 arithmetic unit (27).

[0080] During modular division, the multiplexers connect the four PE units as shown in Figure 5. The PE0 unit (26) performs C = (C + div_h*D) >> shift. PE1 and PE2 cooperate to compute register U: PE1 performs U = U + div_h*W and the PE2 unit (28) performs U = (U + kP) >> shift. PE3 serves as data storage.

[0081] The basic idea of modular division is as follows (see Figure 3): given X, Y and P, find Z = X/Y mod P. Two congruences are used: CX ≡ UY mod P and DX ≡ WY mod P. At initialization C = Y, U = X, D = P and W = 0. The extended Euclidean algorithm then reduces gcd(C, D) to gcd(0, 1) or gcd(0, -1), while W and U undergo the corresponding linear transformations in the prime field, finally giving W = X/Y mod P or W = -X/Y mod P.
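The invariants above can be exercised with a short Python sketch. This is a plain binary extended Euclidean division using the same four variables and the same initialization; it is a simplified, unsigned stand-in for the improved algorithm of Figure 3, which is not reproduced in this text. It assumes P is an odd prime and Y is not a multiple of P; the name `mod_div_euclid` is hypothetical.

```python
def mod_div_euclid(x, y, p):
    """Return x / y mod p while keeping C*X = U*Y and D*X = W*Y (mod p)."""
    c, u = y % p, x % p          # C = Y, U = X
    d, w = p, 0                  # D = P, W = 0
    if c == 0:
        raise ZeroDivisionError("y is not invertible modulo p")
    while c != 0:
        while c % 2 == 0:        # halve C, and halve U in the field
            c >>= 1
            u = (u + p) >> 1 if u & 1 else u >> 1
        if c < d:                # keep the larger value in C
            c, u, d, w = d, w, c, u
        c -= d                   # both odd, so the difference is even
        u = (u - w) % p
    return w                     # now D == 1, so W = X / Y mod p
```

When the loop ends, C is 0 and D is gcd(Y, P) = 1, so the second congruence reads X ≡ WY mod P. This unsigned variant never reaches the gcd(0, -1) case mentioned in the text.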

[0082] The basic operations used in the modular division algorithm are C = (C + div_h*D) >> shift, U = U + div_h*W and U = (U + kP) >> shift. They share a similar structure, so a basic arithmetic unit is designed that computes X = (X + kY) >> shift, which covers every operation modular division requires. To maximize parallelism, the four PE arithmetic units (26-29) carry out these operations, the control module (25) supplies the parameter div_h, and the five register files (18-22) store the operands. For scalability and flexibility, the multi-bit modular operations are decomposed into word-level operations. The resulting hardware for modular division is shown in Figure 5, and the top-level diagram is shown in Figure 7b.

[0083] The PE arithmetic units share the same structure and compute Z = (Z + mX + nY) >> shift, as shown in Figure 1. This form also covers the operations required for division.
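As a rough illustration of how one datapath serves both modes, the PE operation Z = (Z + mX + nY) >> shift can be written as a single function on Python integers. This is a sketch only: the name `pe_step` is hypothetical, and the check that the shifted-out bits are zero is an assumption about how the multiples are chosen, not something the text states for every case.

```python
def pe_step(z, m, x, n, y, shift):
    """One PE update Z = (Z + m*X + n*Y) >> shift on Python integers."""
    total = z + m * x + n * y
    # Assumption: the multiples are chosen so that the discarded bits are zero.
    if total % (1 << shift) != 0:
        raise ValueError("low bits must be zero before the right shift")
    return total >> shift


# Multiplication-style step: U = (U + mul_h*W + k*P) >> 2 with U=6, W=3, P=7.
u_next = pe_step(6, 1, 3, 1, 7, 2)   # (6 + 3 + 7) >> 2
# Division-style step without a shift: U = U + div_h*W (n = 0, shift = 0).
u_acc = pe_step(5, 2, 4, 0, 0, 0)    # 5 + 2*4
```

Setting n = 0 gives the two-operand form X = (X + kY) >> shift used in division, and setting shift = 0 gives the pure accumulation U = U + div_h*W.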

[0084] The PE0 arithmetic unit (26) performs C = (C + div_h*D) >> shift; the operands C, D, div_h and shift are supplied by the external control module and the register files. Each clock cycle completes the operation on one word. To sustain a high clock frequency, the unit uses carry-save adders (CSA), so the result consists of two parts, a sum and a carry, each 32 bits wide. Both parts are stored temporarily in the register file.
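The split of each result into a sum part and a carry part follows from how a carry-save adder works. A minimal sketch of a 3:2 carry-save step on non-negative Python integers is shown below; the function name `carry_save_add` is hypothetical, and the sketch ignores the 32-bit word boundaries that the hardware has to handle.

```python
def carry_save_add(a, b, c):
    """3:2 compressor: return (carry, total) with carry + total == a + b + c."""
    total = a ^ b ^ c                            # bitwise sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1   # majority bits, moved up one place
    return carry, total


uc, us = carry_save_add(0x89ABCDEF, 0x01234567, 0x0F0F0F0F)
```

No carry ripples across bit positions inside the step, which is why the delay does not grow with operand width; the pair (Uc, Us) is only resolved into a single value when a full addition is eventually performed.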

[0085] The PE1 arithmetic unit (27) performs U = U + div_h*W; the operands U, W and div_h are supplied by the external control module and the other arithmetic units. Each clock cycle completes the operation on one word. The result consists of a sum and a carry (Uc, Us) and is passed to the next arithmetic unit, PE2, which completes the subsequent operation, forming a small two-stage pipeline.

[0086] The PE2 arithmetic unit (28) performs U = (U + kP) >> shift; the operands U, P, k and shift come from the other arithmetic units or the external register files. Each clock cycle completes the operation on one word. The result consists of a sum and a carry (Uc, Us), and both parts are stored temporarily in the external register file.

[0087] In this mode the PE3 arithmetic unit (29) only stores data and no longer takes part in the computation. This matches the algorithm and avoids adding extra hardware to store the operand data of the PE1 arithmetic unit.

[0088] The operands required in this mode are C, D, U, W, P, div_h and shift. C, D, U and W must hold their initial values, and the exchanges between C and D and between U and W must also be handled, so the implementation uses five register files. To avoid adding much area, a stage signal distinguishes the roles, and the register files are reused during division.

[0089] The first register file (18) has sixteen 64-bit registers. When the stage signal is 0 it stores (Cc, Cs); when the stage signal is 1 it stores (Dc, Ds).

[0090] The second register file (19) has sixteen 64-bit registers. When the stage signal is 0 it stores (Dc, Ds); when the stage signal is 1 it stores (Cc, Cs).

[0091] The third register file (20) has sixteen 64-bit registers. When the stage signal is 0 it stores (Uc, Us); when the stage signal is 1 it stores (Wc, Ws).

[0092] The fourth register file (21) has sixteen 64-bit registers. When the stage signal is 0 it stores (Wc, Ws); when the stage signal is 1 it stores (Uc, Us).

[0093] The fifth register file (22) has sixteen 64-bit registers. Whether the stage signal is 0 or 1, it stores the modulus as (0, P).
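The effect of the stage signal can be pictured as a lookup rather than a data move: exchanging C with D (or U with W) only flips one bit, and no register contents are copied. The sketch below assumes that the second role listed for each register file applies when stage is 1, mirroring the first register file; the table `DIVISION_ROLES` and the function `regfile_holding` are hypothetical names.

```python
# Operand held by each register file during modular division, indexed by stage.
DIVISION_ROLES = {
    18: ("C", "D"),  # first register file
    19: ("D", "C"),  # second register file
    20: ("U", "W"),  # third register file
    21: ("W", "U"),  # fourth register file
    22: ("P", "P"),  # fifth register file, independent of stage
}


def regfile_holding(name, stage):
    """Return the reference numeral of the register file that holds `name`."""
    for numeral, roles in DIVISION_ROLES.items():
        if roles[stage] == name:
            return numeral
    raise KeyError(name)
```

With this table, `regfile_holding("C", 0)` is 18 and `regfile_holding("C", 1)` is 19, so toggling stage swaps the roles of the two files in a single cycle.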

[0094] In this mode the Booth encoding unit (24) only stores the modulus P. Reusing the Booth encoding unit from modular multiplication avoids adding a dedicated buffer for P.

[0095] 2. Modular multiplication state

[0096] The modular multiplication state consists of the second and eighth multiplexers (2, 8), the five register files (18-22), the input register (23), the Booth encoding unit (24) and the four PE arithmetic units (26-29). The data path is shown in Figure 6, the modular multiplication algorithm as implemented is shown in Figure 8a, and the top-level diagram for modular multiplication is shown in Figure 8b. In detail:

[0097] The second multiplexer (2) takes as inputs the data stored in the first register file (18) and the fourth register file (21). The signal First_round, which indicates whether this is the first round of computation, is the select signal and routes the correct data to the input register (23).

[0098] The eighth multiplexer (8) takes as inputs the data stored in the second register file (19) and the fifth register file (22). The signal First_round is again the select signal and routes the correct data to the input register (23).

[0099] The first register file (18) stores the multiplicand W and the modulus P as (W, P), word by word; its output goes to the second multiplexer (2).

[0100] The second register file (19) stores the running value of U and its carry (Uc, Us), also word by word; its output goes to the eighth multiplexer (8).

[0101] The third register file (20) stores the multiplier as (0, C), also word by word; its output goes to the Booth encoder (24).

[0102] The fourth register file (21) stores the multiplicand and modulus (W, P) that are currently in flight. During modular multiplication it acts as a FIFO, and its data are also stored word by word; its output goes to the second multiplexer (2).

[0103] The fifth register file (22) stores the value of U and its carry (Uc, Us) that are currently in flight. During modular multiplication it also acts as a FIFO, and its data are stored word by word; its output goes to the eighth multiplexer (8).

[0104] The input register (23) has two 64-bit data inputs: the multiplicand and modulus (W, P), and the carry and sum being computed (Uc, Us). Its output goes to the PE0 arithmetic unit (26).

[0105] The Booth encoding unit (24) takes the data of the multiplier C as input and supplies the four PE arithmetic units with the multiples of W, namely mul_h0, mul_h1, mul_h2 and mul_h3.

[0106] The PE0 arithmetic unit takes two 64-bit inputs: the multiplicand and modulus (W, P), and the carry and sum being computed (Uc, Us). It performs U = (U + mul_h*W + kP) >> shift and outputs the updated (Uc, Us) together with (W, P) to the PE1 arithmetic unit.

[0107] The PE1 arithmetic unit takes two 64-bit inputs: the multiplicand and modulus (W, P), and the carry and sum (Uc, Us) computed by PE0. It performs U = (U + mul_h*W + kP) >> shift and outputs the updated (Uc, Us) together with (W, P) to the PE2 arithmetic unit.

[0108] The PE2 arithmetic unit takes two 64-bit inputs: the multiplicand and modulus (W, P), and the carry and sum (Uc, Us) computed by PE1. It performs U = (U + mul_h*W + kP) >> shift and outputs the updated (Uc, Us) together with (W, P) to the PE3 arithmetic unit.

[0109] The PE3 arithmetic unit takes two 64-bit inputs: the multiplicand and modulus (W, P), and the carry and sum (Uc, Us) computed by PE2. It performs U = (U + mul_h*W + kP) >> shift. Of its outputs, the updated (Uc, Us) goes to the fifth register file (22) and (W, P) goes to the fourth register file (21).

[0110] During modular multiplication, the multiplexers connect the four PE units as shown in Figure 8b. The four PE units form a pipeline, and each performs U = (U + mul_h*W + kP) >> shift.

[0111] In the prime field, modular multiplication uses Montgomery multiplication with radix-4 Booth encoding, whose basic operation is U = (U + mul_h*W + kP) >> shift. For higher speed, all four PE arithmetic units (26-29) perform this operation, forming a four-stage pipeline.
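The basic operation of paragraph [0111] can be sketched in software as follows. This is an integer-level illustration under stated assumptions, not the word-serial algorithm of Figure 8a: the shift is fixed at 2 bits per step, the modulus is odd, and the names `booth_radix4_digits` and `montgomery_mul_booth` are hypothetical. As in any Montgomery multiplication, the result carries a factor of 4^(-h), where h is the number of Booth digits.

```python
def booth_radix4_digits(c, nbits):
    """Recode a multiplier 0 <= c < 2**nbits into digits in {-2..2}, LSB first."""
    digits, prev = [], 0
    for i in range(0, nbits + 1, 2):
        b0 = (c >> i) & 1
        b1 = (c >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)  # digit = c[i-1] + c[i] - 2*c[i+1]
        prev = b1
    return digits


def montgomery_mul_booth(w, c, p, nbits):
    """Return (u, h) with u == w * c * 4**(-h) mod p, for odd p."""
    digits = booth_radix4_digits(c, nbits)
    u = 0
    for mul_h in digits:
        u += mul_h * w
        # p*p == 1 (mod 4) for odd p, so this k makes u + k*p divisible by 4.
        k = (-u * p) % 4
        u = (u + k * p) >> 2  # exact division by 4
    return u % p, len(digits)  # final adjustment into [0, p)
```

An 8-bit multiplier produces 5 Booth digits, so the loop runs 5 times instead of the 8 iterations a bit-serial Montgomery loop would need.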

[0112] The parameters for each accumulation step are set according to Field (prime field or polynomial field) and the value of C. The parameter mul_h is the multiple of the multiplicand added in the step. The parameter shift is the number of bits by which U is shifted right after the accumulation. The parameter k is determined as in Algorithm 1, so that the lowest shift bits of U + kP are 0. The parameter h is the number of digits of the multiplier. In the prime field the algorithm adjusts the final result so that it lies in GF(P); in the polynomial field this step is skipped (see Figure 4).
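Why the polynomial field can skip the final adjustment is easiest to see in a bit-serial sketch of Montgomery multiplication over GF(2^m), where addition is XOR and no carries exist. The code below is an illustration under assumptions: polynomials are encoded as Python integers (bit i is the coefficient of x^i), the reduction polynomial has degree m and constant term 1, and the names `gf2_montgomery_mul` and `gf2_mulmod` are hypothetical. `gf2_mulmod` is only a reference multiplier for checking the result.

```python
def gf2_mulmod(a, b, p, m):
    """Reference shift-and-XOR product a*b mod p over GF(2), deg(p) == m."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= p  # reduce as soon as the degree reaches m
    return r


def gf2_montgomery_mul(w, c, p, m):
    """Return w * c * x**(-m) mod p over GF(2); inputs have degree < m."""
    u = 0
    for i in range(m):
        if (c >> i) & 1:
            u ^= w   # accumulate: addition in GF(2) is XOR
        if u & 1:
            u ^= p   # p has constant term 1, so this clears bit 0
        u >>= 1      # exact division by x
    return u         # degree stays below m, so no final adjustment is needed
```

After each step the intermediate value has degree below m, so the result is already reduced, whereas the integer version can leave a value outside [0, P) that needs a final correction.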

[0113] The algorithm thus supports modular multiplication in both fields, and in the prime field Booth encoding halves the number of accumulation steps. Note that, to achieve scalability and to let modular multiplication and modular division share hardware units, the additions, subtractions and shifts on long operands in Algorithm 1 and Algorithm 2 are all carried out word by word. Let the operand length be n bits and the word length be w bits, and let

[0114] $e = \lceil n/w \rceil$. An operand X is then represented word by word as the vector $\{0, X^{(e-1)}, \ldots, X^{(1)}, X^{(0)}\}$. The all-zero word added at the most significant position prevents the result from overflowing during addition. It also serves as a sign word, which simplifies sign extension when the sum is shifted right.
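The word-level representation can be sketched as follows, assuming e = ceil(n/w) data words plus one all-zero word at the most significant position, and assuming w = 32 to match the 32-bit sum and carry parts mentioned elsewhere. The function names are hypothetical, and the lists are stored least significant word first, the reverse of the order in which the vector is written in the text.

```python
def to_words(value, n, w=32):
    """Split a non-negative n-bit value into ceil(n/w) words plus a zero guard word."""
    e = -(-n // w)  # ceil(n / w)
    mask = (1 << w) - 1
    words = [(value >> (w * i)) & mask for i in range(e)]
    words.append(0)  # extra all-zero word at the most significant position
    return words     # least significant word first


def add_words(a_words, b_words, w=32):
    """Word-serial addition; the guard word absorbs the final carry."""
    mask = (1 << w) - 1
    out, carry = [], 0
    for a, b in zip(a_words, b_words):
        t = a + b + carry
        out.append(t & mask)
        carry = t >> w
    return out


def from_words(words, w=32):
    """Reassemble an integer from least-significant-first words."""
    return sum(word << (w * i) for i, word in enumerate(words))
```

With n = 480 and w = 32 this yields 15 data words plus the guard word, 16 words in total. That this equals the 16 entries of each register file is an inference, not something the text states.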

[0115] The PE0 arithmetic unit (26) performs U = (U + mul_h0*W + kP) >> shift; the operands U, W, P, mul_h0 and shift are supplied by the external control module and the register files. Each clock cycle completes the operation on one word. To sustain a high clock frequency, the unit uses carry-save adders (CSA), so the result consists of two parts, a sum and a carry, each 32 bits wide. Both parts are written to the next arithmetic unit, continuing the pipeline.

[0116] The PE1 arithmetic unit (27) performs U = (U + mul_h1*W + kP) >> shift; the operands U, W, P, mul_h1 and shift are supplied by the external control module and the preceding PE arithmetic unit. Each clock cycle completes the operation on one word. The unit uses carry-save adders for the same reason, so the result again consists of a 32-bit sum and a 32-bit carry, and both parts are written to the next arithmetic unit, continuing the pipeline.

[0117] The PE2 arithmetic unit (28) performs U = (U + mul_h2*W + kP) >> shift; the operands U, W, P, mul_h2 and shift are supplied by the external control module and the preceding PE arithmetic unit. Each clock cycle completes the operation on one word. The unit uses carry-save adders for the same reason, so the result again consists of a 32-bit sum and a 32-bit carry, and both parts are written to the next arithmetic unit, continuing the pipeline.

[0118] The PE3 arithmetic unit (29) performs U = (U + mul_h3*W + kP) >> shift; the operands U, W, P, mul_h3 and shift are supplied by the external control module and the preceding PE arithmetic unit. Each clock cycle completes the operation on one word. The unit uses carry-save adders for the same reason, so the result again consists of a 32-bit sum and a 32-bit carry, and both parts are written to the register files, continuing the pipeline.

[0119] These operations need the parameters U, W, P, mul_h and so on. U, W, P and the other operands are therefore stored in the first to fifth register files (18-22), while the four parameters mul_h0, mul_h1, mul_h2 and mul_h3 used by the four PE arithmetic units are supplied by the Booth encoding unit (24). Within one clock cycle the Booth encoding unit supplies all four mul_h values, so the four PE arithmetic units (26-29) form a four-stage pipeline and compute in parallel. In detail:

[0120] The first register file (18) stores the initial multiplicand and modulus (W, P) and outputs them to the PE0 arithmetic unit (26) for computation.

[0121] The second register file (19) stores the carry and sum (Uc, Us) computed by the arithmetic units and outputs them to the PE0 arithmetic unit (26) for computation.

[0122] The third register file (20) stores the multiplier as (0, C) and outputs it to the Booth encoding unit (24), which supplies the four PE arithmetic units (26-29) with their four mul_h parameters at the same time.

[0123] The fourth register file (21) stores the (W, P) values of the current computation as a FIFO and outputs them to the PE0 arithmetic unit (26) for computation.

[0124] The fifth register file (22) stores the (Uc, Us) values of the current computation as a FIFO and outputs them to the PE0 arithmetic unit (26) for computation.

Claims (4)

1. A modular multiplier/divider applicable to the prime field and the polynomial field, characterized in that it consists of 17 multiplexers (1-17), four PE arithmetic units (26-29), an input register (23), five register files (18-22), a Booth encoding unit (24) and a control module (25), wherein: a. when the second multiplexer (2) and the eighth multiplexer (8) select state 1 and the 13 multiplexers (3-7, 10-17) select state 0, the first multiplexer (1), the ninth multiplexer (9), the five register files (18-22), the input register (23), the Booth encoding unit (24), the control module (25) and the four PE arithmetic units (26-29) form a modular divider, and modular division uses the Euclidean algorithm; b. when the first multiplexer (1) and the ninth multiplexer (9) are inactive and the 13 multiplexers (3-7, 10-17) select state 1, the second multiplexer (2), the eighth multiplexer (8), the five register files (18-22), the input register (23), the Booth encoding unit (24), the control module (25) and the four PE arithmetic units (26-29) form a modular multiplier, and modular multiplication uses the Montgomery algorithm.
2. The modular multiplier/divider applicable to the prime field and the polynomial field according to claim 1, characterized in that, in item a:
the first multiplexer (1) takes the output of the PE0 arithmetic unit (26) as input and, under control of the stage signal, outputs to the first register file (18) or the second register file (19);
the ninth multiplexer (9) takes the output of the PE2 arithmetic unit (28) as input and, under control of the stage signal, outputs to the third register file (20) or the fourth register file (21);
the outputs of the first register file (18) and the second register file (19) both go to the input register (23);
the outputs of the third register file (20) and the fourth register file (21) both go to the PE3 arithmetic unit (29);
the output of the fifth register file (22) goes to the Booth encoder (24);
the output of the input register (23) goes to the PE0 arithmetic unit (26);
the Booth encoding unit (24) outputs to the PE2 arithmetic unit (28);
the control module (25) outputs to the PE0 and PE1 arithmetic units (26, 27);
the output of the PE0 arithmetic unit (26) is written to the first register file (18) or the second register file (19) as selected by the stage signal;
the output of the PE1 arithmetic unit (27) goes to the PE2 arithmetic unit (28);
the output of the PE2 arithmetic unit (28) is written to the third register file (20) or the fourth register file (21) as selected by the stage signal;
the output of the PE3 arithmetic unit (29) goes to the PE1 arithmetic unit (27).
3. The modular multiplier/divider applicable to the prime field and the polynomial field according to claim 1, characterized in that, in item b:
the second multiplexer (2) takes as inputs the data stored in the first register file (18) and the fourth register file (21) and, with First_round as the select signal, outputs the correct data to the input register (23);
the eighth multiplexer (8) takes as inputs the data stored in the second register file (19) and the fifth register file (22) and, with First_round as the select signal, outputs the correct data to the input register (23);
the output of the third register file (20) goes to the Booth encoder (24);
the two outputs of the input register (23) go to the PE0 arithmetic unit (26);
the outputs of the Booth encoding unit (24) provide inputs to each of the four PE arithmetic units (26-29);
the output of the PE0 arithmetic unit (26) goes to the PE1 arithmetic unit (27);
the output of the PE1 arithmetic unit (27) goes to the PE2 arithmetic unit (28);
the output of the PE2 arithmetic unit (28) goes to the PE3 arithmetic unit (29);
the outputs of the PE3 arithmetic unit (29) go to the fifth register file (22) and the fourth register file (21) respectively.
4. The modular multiplier/divider according to claim 1, characterized in that the PE arithmetic unit consists of six multiplexers (30-35), two registers (36, 37), two inversion controllers (38, 39), three carry-save adders (40, 41, 42), a PE internal control module (43) and a PE internal shifter (44), wherein:
the eighteenth multiplexer (30) takes as inputs mul_h, the multiple of the multiplicand accumulated in modular multiplication, and div_h, the multiple of the D register value accumulated in modular division; with the operation function Func_Sel as the select signal, it selects the correct multiple-select signal for the W register value and outputs it to the select control input of the nineteenth multiplexer (31);
the nineteenth multiplexer (31) takes as inputs 0, the value of the W register and twice the value of the W register; with the output of the eighteenth multiplexer (30) as the select signal, it passes the correct operand value to the first inversion controller (38);
the twentieth multiplexer (32) takes as inputs double1 and double2, multiples of the modulus P in the word operation; with Func_Sel as the select signal, it outputs the correct multiple-select signal to the twenty-third multiplexer (35);
the twenty-first multiplexer (33) takes as inputs zero1 and zero2, multiples of the modulus P in the word operation; with Func_Sel as the select signal, it outputs the correct multiple-select signal to the twenty-third multiplexer (35);
the twenty-second multiplexer (34) takes as inputs neg1 and neg2, multiples of the modulus P in the word operation; with Func_Sel as the select signal, it outputs the correct multiple-select signal to the second inversion controller (39);
the twenty-third multiplexer (35) takes as inputs 0, the value of the modulus P and twice the value of the modulus P; with the outputs of the twentieth and twenty-first multiplexers (32, 33) as select signals, it passes the correct operand value to the second inversion controller (39);
the seventh register (36) holds one word of the modulus P during the operation;
the eighth register (37) holds one word of W during the operation;
the first inversion controller (38) takes the output of the nineteenth multiplexer (31) as input and the output of the eighteenth multiplexer (30) as control signal, and outputs the inverted input value;
the second inversion controller (39) takes the output of the twenty-third multiplexer (35) as input and the output of the twenty-second multiplexer (34) as control signal, and outputs the inverted input value;
the first carry-save adder (40) takes as inputs the sum and carry (Uc, Us) of the current word of the U register, which holds one operand, the field select signal Field (prime field or polynomial field), the first-word signal First_word and the output of the first inversion controller (38), and outputs the carry-save sum Us_1 and carry Uc_1;
the second carry-save adder (41) takes as inputs the outputs Uc_1 and Us_1 of the first carry-save adder (40), the field select signal Field, the first-word signal First_word and the output of the second inversion controller (39), and outputs the carry-save sum Us_2 and carry Uc_2;
the third carry-save adder (42) takes as inputs Uc_3, Us_3 and carry from the PE internal shifter (44), and outputs the carry-save sum Us_out and carry Uc_out;
the PE internal control module (43) takes as inputs the low two bits of the outputs Us_1 and Uc_1 of the first carry-save adder (40), the low two bits of the modulus P and the right-shift amount shift, and outputs the control signal that determines the multiple of the modulus P to be added subsequently;
the PE internal shifter (44) takes as inputs the outputs Us_2 and Uc_2 of the second carry-save adder (41) and the right-shift amount shift, and outputs the right-shifted Us_3 and Uc_3 together with the carry value shifted out.
CN2010100226476A 2010-01-08 2010-01-08 Analog multiplier/divider applicable to prime field and polynomial field CN102122241A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010100226476A CN102122241A (en) 2010-01-08 2010-01-08 Analog multiplier/divider applicable to prime field and polynomial field

Publications (1)

Publication Number Publication Date
CN102122241A true CN102122241A (en) 2011-07-13

Family

ID=44250803

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010100226476A CN102122241A (en) 2010-01-08 2010-01-08 Analog multiplier/divider applicable to prime field and polynomial field

Country Status (1)

Country Link
CN (1) CN102122241A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014115047A1 (en) * 2013-01-23 2014-07-31 International Business Machines Corporation Vector floating point test data class immediate instruction
CN104065478A (en) * 2014-06-18 2014-09-24 天津大学 Polynomial modular multiplication coprocessor based on lattice-based cryptosystem
CN105094746A (en) * 2014-05-07 2015-11-25 北京万协通信息技术有限公司 Method for achieving point addition/point doubling of elliptic curve cryptography
US9471311B2 (en) 2013-01-23 2016-10-18 International Business Machines Corporation Vector checksum instruction
US9703557B2 (en) 2013-01-23 2017-07-11 International Business Machines Corporation Vector galois field multiply sum and accumulate instruction
US9715385B2 (en) 2013-01-23 2017-07-25 International Business Machines Corporation Vector exception code
US9740482B2 (en) 2013-01-23 2017-08-22 International Business Machines Corporation Vector generate mask instruction
CN107169380A (en) * 2017-05-19 2017-09-15 北京大学 RSA circuit structure and RSA encryption method
US9823926B2 (en) 2013-01-23 2017-11-21 International Business Machines Corporation Vector element rotate and insert under mask instruction

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1731345A (en) * 2005-08-18 2006-02-08 上海微科集成电路有限公司 Extensible high-radix Montgomery's modular multiplication algorithm and circuit structure thereof
CN1738238A (en) * 2005-09-08 2006-02-22 上海微科集成电路有限公司 High-speed collocational RSA encryption algorithm and coprocessor
US20080130870A1 (en) * 2004-12-23 2008-06-05 Oberthur Card Systems Sa Data Processing Method And Related Device
CN101464920A (en) * 2008-12-10 2009-06-24 清华大学 Design method for automatic generation of two element field ECC coprocessor circuit

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
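Cao Dan et al.: "A scalable low-cost dual-field modular multiplier/divider algorithm and its VLSI implementation" (可扩展的低成本双域模乘模除器算法及其VLSI实现), 《小型微型计算机系统》 *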

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014115047A1 (en) * 2013-01-23 2014-07-31 International Business Machines Corporation Vector floating point test data class immediate instruction
US10203956B2 (en) 2013-01-23 2019-02-12 International Business Machines Corporation Vector floating point test data class immediate instruction
CN104956319A (en) * 2013-01-23 2015-09-30 国际商业机器公司 Vector floating point test data class immediate instruction
GB2525356A (en) * 2013-01-23 2015-10-21 Ibm Vector floating point test data class immediate instruction
US10146534B2 (en) 2013-01-23 2018-12-04 International Business Machines Corporation Vector Galois field multiply sum and accumulate instruction
GB2525356B (en) * 2013-01-23 2016-03-23 Ibm Vector floating point test data class immediate instruction
US9436467B2 (en) 2013-01-23 2016-09-06 International Business Machines Corporation Vector floating point test data class immediate instruction
US9471311B2 (en) 2013-01-23 2016-10-18 International Business Machines Corporation Vector checksum instruction
US9471308B2 (en) 2013-01-23 2016-10-18 International Business Machines Corporation Vector floating point test data class immediate instruction
US9513906B2 (en) 2013-01-23 2016-12-06 International Business Machines Corporation Vector checksum instruction
US9703557B2 (en) 2013-01-23 2017-07-11 International Business Machines Corporation Vector galois field multiply sum and accumulate instruction
US10101998B2 (en) 2013-01-23 2018-10-16 International Business Machines Corporation Vector checksum instruction
US9715385B2 (en) 2013-01-23 2017-07-25 International Business Machines Corporation Vector exception code
US9823926B2 (en) 2013-01-23 2017-11-21 International Business Machines Corporation Vector element rotate and insert under mask instruction
US9733938B2 (en) 2013-01-23 2017-08-15 International Business Machines Corporation Vector checksum instruction
US9740482B2 (en) 2013-01-23 2017-08-22 International Business Machines Corporation Vector generate mask instruction
US9740483B2 (en) 2013-01-23 2017-08-22 International Business Machines Corporation Vector checksum instruction
CN104956319B (en) * 2013-01-23 2018-03-27 国际商业机器公司 Vector floating point test data class immediate instruction
US9778932B2 (en) 2013-01-23 2017-10-03 International Business Machines Corporation Vector generate mask instruction
US9804840B2 (en) 2013-01-23 2017-10-31 International Business Machines Corporation Vector Galois Field Multiply Sum and Accumulate instruction
US9727334B2 (en) 2013-01-23 2017-08-08 International Business Machines Corporation Vector exception code
US9823924B2 (en) 2013-01-23 2017-11-21 International Business Machines Corporation Vector element rotate and insert under mask instruction
US10338918B2 (en) 2013-01-23 2019-07-02 International Business Machines Corporation Vector Galois Field Multiply Sum and Accumulate instruction
CN105094746A (en) * 2014-05-07 2015-11-25 北京万协通信息技术有限公司 Method for achieving point addition/point doubling of elliptic curve cryptography
CN104065478B (en) * 2014-06-18 2017-07-14 天津大学 Lattice-based cryptography polynomial modular multiplication coprocessor
CN104065478A (en) * 2014-06-18 2014-09-24 天津大学 Polynomial modular multiplication coprocessor based on lattice-based cryptosystem
CN107169380A (en) * 2017-05-19 2017-09-15 北京大学 RSA circuit structure and RSA encryption method

Similar Documents

Publication Publication Date Title
McIvor et al. Modified Montgomery modular multiplication and RSA exponentiation techniques
EP1293891B1 (en) Arithmetic processor accomodating different finite field size
Okada et al. Implementation of Elliptic Curve Cryptographic Coprocessor over GF(2^m) on an FPGA
Cao et al. A residue-to-binary converter for a new five-moduli set
Blum et al. Montgomery modular exponentiation on reconfigurable hardware
McIvor et al. Hardware Elliptic Curve Cryptographic Processor Over GF(p)
Kumar et al. Are standards compliant elliptic curve cryptosystems feasible on RFID
Harris et al. An improved unified scalable radix-2 Montgomery multiplier
McIvor et al. Fast Montgomery modular multiplication and RSA cryptographic processor architectures
CN101782893B (en) Reconfigurable data processing platform
Bertoni et al. Efficient GF(p^m) arithmetic architectures for cryptographic applications
CN1148643C (en) Modules power operation method
WO1998050851A1 (en) Improved apparatus & method for modular multiplication & exponentiation based on montgomery multiplication
Satoh et al. A scalable dual-field elliptic curve cryptographic processor
CN1205538C (en) Apparatus for multiprecision integer arithmetic
Groszschaedl et al. Instruction set extension for fast elliptic curve cryptography over binary finite fields GF(2^m)
Suzuki How to maximize the potential of FPGA resources for modular exponentiation
Kammler et al. Designing an ASIP for cryptographic pairings over Barreto-Naehrig curves
Daly et al. An FPGA implementation of a GF (p) ALU for encryption processors
Wu et al. RSA cryptosystem design based on the Chinese remainder theorem
US8209369B2 (en) Signal processing apparatus and method for performing modular multiplication in an electronic device, and smart card using the same
JP3709553B2 (en) Calculation circuit and operation method
US7694045B2 (en) Methods and apparatus for pipeline processing of encryption data
JP2004326112A (en) Multiple modulus selector, accumulator, montgomery multiplier, method of generating multiple modulus, method of producing partial product, accumulating method, method of performing montgomery multiplication, modulus selector, and booth recorder
JP2005250481A (en) Extended montgomery modular multiplier supporting multiple precision

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)