CN100435090C - Extensible high-radix Montgomery's modular multiplication algorithm and circuit structure thereof - Google Patents

Extensible high-radix Montgomery's modular multiplication algorithm and circuit structure thereof


Publication number
CN100435090C
Authority
CN
China
Prior art keywords
result
output
modular multiplication
processing unit
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2005100289154A
Other languages
Chinese (zh)
Other versions
CN1731345A (en)
Inventor
曾晓洋
麻永新
范益波
顾叶华
陈俊
郭亚炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI MICROSCIENCE INTEGRAT
Fudan University
Original Assignee
SHANGHAI MICROSCIENCE INTEGRAT
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI MICROSCIENCE INTEGRAT, Fudan University filed Critical SHANGHAI MICROSCIENCE INTEGRAT
Priority to CNB2005100289154A priority Critical patent/CN100435090C/en
Publication of CN1731345A publication Critical patent/CN1731345A/en
Application granted granted Critical
Publication of CN100435090C publication Critical patent/CN100435090C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention belongs to the technical field of integrated circuits and specifically concerns an extensible high-radix Montgomery modular multiplication algorithm and its circuit structure. It improves on the multi-word high-radix Montgomery modular multiplier: at each step the modulus N and the multiplicand B are shifted left, while the intermediate result S is not shifted, so that the delay between pipeline stages of the data path is shortened from two clock cycles to one. The circuit structure comprises three memories, a data path module, a control module and a first-in first-out (FIFO) memory, among others. The three memories store the three operands A, B and N of the modular multiplication; the data path module is a pipeline of processing units from stage 1 to stage P; the control module controls the operation of the whole modular multiplier. The invention greatly increases the speed of modular multiplication and also improves the storage unit for the intermediate result, thereby reducing hardware overhead.

Description

Extensible high-radix Montgomery's modular multiplication algorithm and circuit structure thereof
Technical field
The invention belongs to the technical field of integrated circuits and specifically relates to an improved extensible high-radix Montgomery modular multiplication algorithm and its circuit structure.
Background technology
With the rapid development of electronic communication technology, information security is receiving more and more attention. To secure transmitted data, many cryptographic algorithms and protocols have been proposed. Because software implementations of cryptographic algorithms are both slow and exposed to security risks, hardware implementation of algorithms such as RSA has become a focus of information security research in recent years. Researchers in China and abroad have obtained many results in this area, and many of them are already used in security products.
Multiplication over finite (Galois) fields is widely used in cryptographic algorithms such as RSA and elliptic curve cryptography (ECC). Because key lengths keep growing and a large number of modular multiplications are required, cryptographic operations are becoming increasingly time-consuming, so an efficient hardware modular multiplier is the key to cryptographic processor design. The Montgomery algorithm implements modular multiplication using only additions and shifts and avoids division, which makes it well suited to hardware implementation.
Montgomery modular multiplication computes a modular product using a residue representation of the integers (Residue Number System, RNS): the operands are converted to the RNS, the division required for modular reduction becomes a shift after each scan of the multiplier, and the result is finally converted back from the RNS to an integer. The Montgomery modular multiplication algorithm is introduced below.
Algorithm 1. Montgomery modular multiplication algorithm
Montgomery modular multiplication: MM(A, B, N) = A × B × R^(-1) (mod N), where N has n bits, A and B also have n bits and are less than N, and R = 2^n. The algorithm is as follows:
Input: A, B, N
Output: S
MM Algorithm:
S = 0
for i = 0 to n-1
    q_i = (S + A_i × B) mod 2
    S = (S + q_i × N + A_i × B) / 2
end for
if S ≥ N then S = S - N
In the algorithm, q_i = (S + A_i × B) mod 2 is determined by the least significant bits of S and B and by the bit A_i. It is introduced so that the least significant bit of the accumulated result is 0; the one-bit right shift in S = (S + q_i × N + A_i × B) / 2 then introduces no error. As the algorithm shows, only additions and shifts are needed to obtain the modular product, which makes it very suitable for hardware implementation.
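Algorithm 1 can be checked numerically. The following is a minimal Python sketch, not the circuit described in this patent; the function name `mont_mul_bitserial` and the parameter `nbits` (standing for n) are illustrative assumptions.

```python
def mont_mul_bitserial(a, b, n, nbits):
    """Radix-2 Montgomery product a * b * 2^(-nbits) mod n (Algorithm 1).

    Assumes n is odd, 0 <= a, b < n, and n < 2**nbits.
    """
    s = 0
    for i in range(nbits):
        a_i = (a >> i) & 1                 # bit i of the multiplier A
        q_i = (s + a_i * b) & 1            # chosen so the accumulated sum is even
        s = (s + q_i * n + a_i * b) >> 1   # exact shift: the low bit is 0
    if s >= n:                             # single final correction
        s -= n
    return s
```

The result can be compared against `a * b * pow(2, -nbits, n) % n`, which needs Python 3.8 or later for the negative exponent.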
The operands of modular multiplication in cryptographic algorithms are large. Key lengths currently range from 128 to 256 bits for ECC and from 512 to 2048 bits, or even more, for RSA. Most current hardware modular multipliers are designed for a fixed key length; that is, the operands cannot exceed a fixed number of bits. To let the same circuit perform modular multiplication of any required bit width, a word-based Montgomery modular multiplication algorithm has been proposed in the literature. To increase speed, a high-radix Montgomery modular multiplication algorithm has also been proposed, in which more than one bit of the multiplier A is scanned at a time and Booth encoding is used. The following algorithm is the high-radix Montgomery modular multiplication algorithm with multi-word operation.
Algorithm 2. Multi-word high-radix Montgomery modular multiplication algorithm
Input: A, B, N
Output: S
MWR2^kMM Algorithm:
S = 0
a_{-1} = 0
for j = 0 to n-1 step k
    q_Bj = Booth(a_{j+k-1..j-1})
    (C_a, S^(0)) = S^(0) + (q_Bj × B)^(0)
    q_Nj = S^(0)_{k-1..0} × (2^k - (N^(0))^(-1)_{k-1..0}) mod 2^k
    (C_b, S^(0)) = S^(0) + (q_Nj × N)^(0)
    for i = 1 to e-1
        (C_a, S^(i)) = C_a + S^(i) + (q_Bj × B)^(i)
        (C_b, S^(i)) = C_b + S^(i) + (q_Nj × N)^(i)
        S^(i-1) = (S^(i)_{k-1..0}, S^(i-1)_{w-1..k})
    end for
    C_a = C_a or C_b
    S^(e-1) = sign_ext(C_a, S^(e-1)_{w-1..k})
end for
if S ≥ N then S = S - N
Here the modulus N has n bits, w is the data width of a processing unit, and n = e × w. For details of the multi-word high-radix Montgomery modular multiplication algorithm, see A. F. Tenca, G. Todorov, and C. K. Koc, "High-radix design of a scalable modular multiplier," in Cryptographic Hardware and Embedded Systems - CHES 2001, C. K. Koc and C. Paar, Eds., 2001, Lecture Notes in Computer Science, No. 1717, pp. 189-206, Springer, Berlin, Germany.
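The arithmetic of Algorithm 2 can be illustrated at whole-integer level, ignoring the split into w-bit words. The sketch below is an assumption-laden model rather than the cited design: the names `booth_digit` and `mont_mul_high_radix` are hypothetical, one extra bit of headroom is added so that the top Booth window always sees a zero sign bit, and the final reduction uses `%` instead of a single conditional subtraction.

```python
def booth_digit(a, j, k):
    """Signed radix-2^k Booth digit j of a, taken from bits a[jk+k-1 .. jk-1]."""
    window = ((a << 1) >> (j * k)) & ((1 << (k + 1)) - 1)  # a[-1] is taken as 0
    digit = (window >> 1) + (window & 1)
    if window >> k:                 # top bit set: its weight is -2^(k-1), not +2^(k-1)
        digit -= 1 << k
    return digit


def mont_mul_high_radix(a, b, n, nbits, k=2):
    """Return (a * b * 2^(-k*m) mod n, m), scanning k bits of a per step.

    Assumes n is odd, 0 <= a, b < n, and n < 2**nbits.
    """
    m = -(-(nbits + 1) // k)        # ceil((nbits + 1) / k) Booth digits
    mask = (1 << k) - 1
    neg_n_inv = (-pow(n, -1, 1 << k)) & mask   # 2^k - N^(-1) mod 2^k
    s = 0
    for j in range(m):
        s += booth_digit(a, j, k) * b          # add q_B * B (may be negative)
        q_n = ((s & mask) * neg_n_inv) & mask  # multiple of N that clears the low k bits
        s = (s + q_n * n) >> k                 # exact right shift by k
    return s % n, m
```

Because the number of digits m depends on k, the implied Montgomery factor is 2^(k*m) rather than 2^n; the function returns m so that callers can account for it.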
The problem with the above multi-word high-radix Montgomery modular multiplication algorithm is that, in a pipelined circuit with multiple processing-unit stages, S^(i-1) cannot be passed to the next stage until S^(i)_{k-1..0} has been computed. Passing data from one pipeline stage to the next therefore takes two clock cycles, which reduces the speed of modular multiplication.
Summary of the invention
The object of the invention is to propose an improved extensible multi-word high-radix Montgomery modular multiplication algorithm and its circuit structure in which the delay between pipeline stages is only one clock cycle, thereby increasing the speed of modular multiplication, and in which the storage unit for the intermediate result is improved so that hardware overhead is reduced.
The modular multiplier designed by A. F. Tenca, mentioned above, shifts the intermediate result S right at every step. A result word S^(i-1) of one pipeline stage can therefore be passed to the next stage only after S^(i)_{k-1..0} has been computed, so the delay between pipeline stages in that design is two clock cycles. When the pipeline has many stages, the speed of this structure drops considerably. The extensible high-radix Montgomery modular multiplication algorithm proposed by the invention improves on the multi-word high-radix algorithm above (Algorithm 2): at each step the modulus N and the multiplicand B are shifted left, and the intermediate result S is not shifted. This changes the pipeline organization so that the delay between pipeline stages of the data path is only one clock cycle, which can greatly increase operation speed.
Fig. 1 compares the pipeline of A. F. Tenca's design with that of the invention. As Fig. 1 shows, because the pipeline of the invention shifts B and N left, the S^(i-1) computed by one stage enters the next stage directly without waiting for a shift, so this pipeline organization can greatly increase the speed of modular multiplication.
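The equivalence between shifting S right and shifting B and N left can be illustrated with a small integer-level Python sketch. It is a simplified model under stated assumptions: it uses plain unsigned radix-2^k digits instead of Booth recoding, ignores the w-bit word boundaries of the real data path, and the function names are hypothetical.

```python
def mont_shift_s_right(a, b, n, m, k):
    """Reference form: S is shifted right by k bits after every step."""
    mask = (1 << k) - 1
    neg_n_inv = (-pow(n, -1, 1 << k)) & mask
    s = 0
    for j in range(m):
        s += ((a >> (k * j)) & mask) * b
        q = ((s & mask) * neg_n_inv) & mask
        s = (s + q * n) >> k
    return s


def mont_shift_bn_left(a, b, n, m, k):
    """Variant in the spirit of the patent: B and N move left, S stays in place."""
    mask = (1 << k) - 1
    neg_n_inv = (-pow(n, -1, 1 << k)) & mask
    s, b_j, n_j = 0, b, n
    for j in range(m):
        s += ((a >> (k * j)) & mask) * b_j
        q = (((s >> (k * j)) & mask) * neg_n_inv) & mask  # digit j of S
        s += q * n_j                   # clears bits k*j .. k*j+k-1 of S
        b_j <<= k                      # shift the operands instead of S
        n_j <<= k
    return s >> (k * m)                # one alignment at the very end
```

Both functions return the same integer for the same inputs, because after step j the unshifted accumulator equals the shifted one multiplied by 2^(k*(j+1)).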
As shown in Fig. 2, the circuit structure of the modular multiplier proposed by the invention mainly comprises: three memory (RAM) modules, memory 5, memory 6 and memory 7, which hold the three operands A, N and B of the modular multiplication; a data path (Datapath) module 1 in pipeline form composed of processing units (PE) from stage 1 to stage P; a control module (Control unit) 10; and a first-in first-out memory (FIFO) 9. Memory 6 holds the modulus N, memory 7 holds the multiplicand B, and memory 5 holds the multiplier A. During modular multiplication, N and B enter the data path as w-bit words N^(i) and B^(i), while the multiplier A enters each processing unit k+1 bits at a time. The intermediate result output by the stage-P processing unit 2 enters FIFO 9. The output of FIFO 9 and a 2w-bit zero enter the first-stage processing unit 4 through a selector 8. The control module 10 controls the whole operation of the modular multiplier, including reads and writes of the three memories 5, 6 and 7, the flow of data in the data path, and reads and writes of FIFO 9.
To reduce the critical-path delay, the arithmetic units use carry-save adders (CSA), so intermediate results are held in redundant form: each addition result exists as a sum result (SS) and a carry result (SC). To shrink the FIFO that stores intermediate results, the invention adds the sum result SS and the carry result SC output by the last-stage arithmetic unit in a carry-lookahead adder (CLA) 16 before storing them in FIFO 9. The size of FIFO 9 is thereby halved, which reduces hardware overhead. The FIFO circuit is shown in Fig. 3. It consists of a carry-lookahead adder 16, a register 15, a register file 14, a selector 13, a two-input AND gate 12 and an output register 11.
When the bypass signal is high, the carry result SC is output directly to output register 11 through selector 13, and the sum result SS is output to output register 11 through the two-input AND gate 12. When the bypass signal is low, SS and SC are added in the carry-lookahead adder 16 and pass through register 15 into register file 14; the output of the register file then goes to output register 11 through the two-input selector 13.
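The redundant (SS, SC) form and the effect of merging it before storage can be sketched in Python as follows. This is only a behavioral model on non-negative integers; the function names are hypothetical, and placing the merged word on the SC lane with a zero on the SS lane is one reading of the selector 13 and AND gate 12 description, not a confirmed detail of Fig. 3.

```python
def carry_save_add(x, y, z):
    """3:2 compressor: return (ss, sc) with ss + sc == x + y + z."""
    ss = x ^ y ^ z                                # bitwise sum without carries
    sc = ((x & y) | (x & z) | (y & z)) << 1       # carries, moved one place left
    return ss, sc


def fifo_stage(ss, sc, bypass):
    """Model of the (SS, SC) pair delivered to the output register.

    bypass=True:  the redundant pair is forwarded unchanged.
    bypass=False: SS and SC are merged by a full addition (the CLA's job),
                  so only one word per result needs to be stored.
                  Assumption: the merged word travels on the SC lane and the
                  SS lane is forced to zero.
    """
    if bypass:
        return ss, sc
    return 0, ss + sc
```

In both modes the value represented by the pair is unchanged; only the number of words that must be stored differs.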
Fig. 4 shows a processing unit (PE) for radix 4 as an example, with an operation width of 32 bits. The processing unit mainly comprises an Aj encoder module 31 and an N encoder module 28, two carry-save adders 27 and 34, two inverter modules 21 and 32, and a number of registers, selectors and flip-flops. The multiplicand B and the modulus N are shifted left by two bits using flip-flops 17 and 18 and are output to the next-stage processing unit through register 19. The multiplier Aj passes through the Aj encoder module 31, which produces the signals double, zero and neg; these control the three-input selector 30 and inverter 32 to generate the corresponding multiple of the multiplicand. The inputs SC and SS and the output of inverter 32 are added by carry-save adder (CSA) 34. The low two bits of the SS and SC outputs of CSA 34 are summed by adder 37, and the result, together with N[1], drives the N encoder, which produces its own double, zero and neg signals; these control selector 20 and inverter 21 to generate the corresponding multiple of N. The SS and SC outputs of the preceding CSA and the inverter output are added by CSA 27, and the result is output to the next-stage processing unit through register 24.
As Fig. 4 shows, each processing unit contains four w-bit registers, which store the multiplicand word B^(i), the modulus word N^(i), and the intermediate results SS^(i) and SC^(i). B^(i) and N^(i) are each shifted left by two bits before being passed to the next stage, while SS^(i) and SC^(i) enter the next pipeline stage directly through a register.
Description of drawings
Fig. 1 compares the pipeline organization of the invention with that of the reference design: (a) the pipeline designed by A. F. Tenca; (b) the pipeline of the invention.
Fig. 2 shows the structure of the extensible Montgomery modular multiplier.
Fig. 3 shows the structure of the first-in first-out memory.
Fig. 4 shows the structure of a radix-4 processing unit (PE).
Reference numerals: 1 is the data path module of the modular multiplier; 2 is the stage-P processing unit; 3 is the stage-2 processing unit; 4 is the stage-1 processing unit; 5, 6 and 7 are memories; 8 is a two-input selector; 9 is the first-in first-out memory module (FIFO); 10 is the modular multiplication control module; 11 and 15 are registers; 12 is a two-input AND gate; 13 is a two-input selector; 14 is a register file; 16 is a carry-lookahead adder (CLA); 17, 18, 23, 26, 33, 36, 38 and 39 are D flip-flops; 19 and 24 are registers; 20 and 30 are three-input selectors; 21 and 32 are inverter modules; 22, 25, 29, 35, 40 and 41 are two-input selectors; 28 is the N encoder module; 31 is the Aj encoder module; 27 and 34 are carry-save adders (CSA); 37 is an adder.
Embodiment
The invention is further described below with reference to the drawings.
The extensible high-radix modular multiplier structure of the invention can be applied to any cryptographic strength requirement and can handle finite-field modular multiplication of any operand length. The number of pipeline stages can be adjusted according to the speed and hardware budget of the application, trading modular multiplication speed against area. With the structure of Fig. 1, if very high speed is needed, one can increase the number of pipeline stages, widen the processing-unit bit width w, or use a higher radix (such as 2^3 or 2^4). Conversely, if a small hardware implementation with modest speed requirements is needed, one can reduce the number of pipeline stages, reduce the bit width w, or use a lower radix. For a modular multiplication unit that has already been designed, handling longer operands requires only enlarging the FIFO that stores intermediate results and increasing the number of passes through the pipeline.
In the pipeline organization of Fig. 1, each column represents one pipeline stage (one PE) and each row represents one clock cycle. In a modular multiplication, N^(0), B^(0) and a_0 (k+1 bits) first enter the first pipeline stage, which produces SS^(0) and SC^(0) after one clock cycle. In the second clock cycle, N^(0) and B^(0) are shifted left by k bits and passed, together with SS^(0) and SC^(0), to the second stage, while N^(1) and B^(1) enter the first stage. After passing through the p PE stages in this way, the intermediate result is output from the stage-p PE. Two cases then arise. If e > p, the stage-1 PE is still processing higher words N^(i) and B^(i) when the intermediate result leaves the stage-p PE, so the output is added by the CLA and stored in the FIFO; once N^(e-1) and B^(e-1) have been processed by the stage-1 PE, the intermediate results in the FIFO are read back in order into the stage-1 PE and computation continues. If e ≤ p, the stage-1 PE is idle when the intermediate result leaves the stage-p PE, so the output SS^(0) and SC^(0) can enter the stage-1 PE directly after shift processing.
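The two cases (e > p and e ≤ p) follow from a simple timing model, sketched here in Python. The model is an assumption made for illustration: each stage accepts one word per clock cycle, a word spends one cycle in a stage, and `delay` is the inter-stage latency (1 for the organization described here, 2 for the earlier design). The function names are hypothetical.

```python
def entry_cycle(word, stage, delay=1):
    """Clock cycle (0-based) in which word `word` enters pipeline stage `stage` (1-based)."""
    return word + delay * (stage - 1)


def needs_fifo(e, p, delay=1):
    """True if stage p emits its first word while stage 1 is still busy with the e input words."""
    first_output = entry_cycle(0, p, delay) + 1   # word 0 leaves stage p one cycle after entering
    stage1_free = e                               # stage 1 consumes words 0..e-1 in cycles 0..e-1
    return first_output < stage1_free


def pass_cycles(e, p, delay=1):
    """Cycles until the last of e words has left stage p in one pass through the pipeline."""
    return entry_cycle(e - 1, p, delay) + 1
```

With `delay=1` the FIFO condition reduces to e > p, matching the case split in the text; comparing `pass_cycles(e, p, 1)` with `pass_cycles(e, p, 2)` shows how the inter-stage latency affects one pass under this model.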
Fig. 4 shows the detailed structure of a processing unit PE for radix 4 as an example. Aj[2:0] is Booth-encoded into three signals that select the multiple of B to be added: double = 1 means add 2B, neg = 1 means add the negated value, and zero = 1 means add 0. Likewise, when firstcycle = 1, that is, when N^(0) and B^(0) are being processed, the sum of the low 2 bits of the CSA result and N[1] are encoded into the signals that select the multiple of N to be added: double = 1 means add 2N, neg = 1 means add the negated value, and zero = 1 means add 0. After the operation, SS^(i) and SC^(i) pass through a register to the next-stage PE, while B^(i) and N^(i) are shifted left by 2 bits before entering the next-stage PE.
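The two radix-4 encoders can be modeled behaviorally in Python as below. This is a sketch under assumptions: the function names are hypothetical, the signal values are derived from the selected multiple (so the window 111 gives zero = 1 and neg = 0, which may differ from the gate-level encoding in Fig. 4), and the N encoder is assumed to choose its multiple from {-1, 0, 1, 2}.

```python
def aj_encoder(aj):
    """Radix-4 Booth encoding of the 3-bit window Aj[2:0].

    Returns (digit, double, zero, neg) with digit = -2*Aj[2] + Aj[1] + Aj[0].
    """
    digit = -2 * ((aj >> 2) & 1) + ((aj >> 1) & 1) + (aj & 1)
    return digit, int(abs(digit) == 2), int(digit == 0), int(digit < 0)


def n_encoder(s_low2, n1):
    """Pick q in {-1, 0, 1, 2} so that S + q*N is divisible by 4.

    s_low2 is S mod 4; n1 is bit 1 of the odd modulus N (so N mod 4 is 1 or 3).
    """
    n_mod4 = 2 * n1 + 1
    for q in (0, 1, 2, -1):
        if (s_low2 + q * n_mod4) % 4 == 0:
            return q, int(q == 2), int(q == 0), int(q < 0)


def select_multiple(x, double, zero, neg):
    """Apply the three control signals to operand x (B or N)."""
    value = 0 if zero else (2 * x if double else x)
    return -value if neg else value
```

Since N is odd, exactly one of the four candidate multiples clears the low two bits, so `n_encoder` always returns a result.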
The pipeline organization of the multi-word Montgomery modular multiplier of the invention reduces the delay between pipeline stages to one clock cycle. For e ≤ p this greatly increases modular multiplication speed; for e > p it reduces hardware overhead because inter-stage registers are saved. In both cases the speed-to-area ratio of the modular multiplier improves by nearly a factor of 2.

Claims (4)

1. An extensible high-radix Montgomery modular multiplication algorithm, characterized in that it is based on the multi-word high-radix Montgomery modular multiplication algorithm, in which at each step the multiplicand B and the modulus N of the modular multiplication are shifted left and the intermediate result S is not shifted, so that the delay between pipeline stages of the data path is reduced from 2 clock cycles to one clock cycle; the specific steps are as follows:
First, the modulus word N^(0), the multiplicand word B^(0) and the (k+1)-bit a_0 enter the first pipeline stage, which after one clock cycle produces the sum result SS^(0) and the carry result SC^(0); in the second clock cycle, N^(0) and B^(0) are shifted left by k bits and passed, together with SS^(0) and SC^(0), to the second pipeline stage, while N^(1) and B^(1) enter the first pipeline stage; after passing through the p PE stages in this way, the intermediate result is output from the stage-p PE. Two cases then arise: if e > p, the stage-1 PE is still processing higher words N^(i) and B^(i) when the intermediate result leaves the stage-p PE, so the output intermediate result is added by the CLA and stored in the FIFO, and once N^(e-1) and B^(e-1) have been processed by the stage-1 PE, the intermediate results in the FIFO are read back in order into the stage-1 PE and computation continues; if e ≤ p, the stage-1 PE is idle when the intermediate result leaves the stage-p PE, so the output intermediate results SS^(0) and SC^(0) enter the stage-1 PE directly after shift processing. Here, PE is a processing unit, N is the modulus, B is the multiplicand, SS is the sum result, SC is the carry result, CLA is a carry-lookahead adder, and FIFO is the first-in first-out memory module; e = n/w, where n is the number of bits of the modulus N and w is the data width of a processing unit.
2. An extensible high-radix Montgomery modular multiplier circuit structure, characterized in that it comprises: three memory modules, memories (5, 6, 7), which hold the three operands A, N and B of the modular multiplication; a data path module in pipeline form composed of processing units from stage 1 to stage P; a control module (10); and a first-in first-out memory (9). During modular multiplication, the modulus N and the multiplicand B enter the data path as w-bit words N^(i) and B^(i), while the multiplier A enters each processing unit k+1 bits at a time; the intermediate result output by the stage-P processing unit (2) enters the FIFO (9); the output of the FIFO (9) and a 2w-bit zero enter the first-stage processing unit (4) through a selector (8); the control module (10) controls the whole operation of the modular multiplier, including reads and writes of the three memories (5, 6, 7), the flow of data in the data path, and reads and writes of the FIFO (9).
3. The modular multiplier circuit structure according to claim 2, characterized in that the first-in first-out memory module (9) consists of an output register (11), a register (15), a two-input AND gate (12), a two-input selector (13), a register file (14) and a carry-lookahead adder (16); when the bypass signal is high, the carry result SC is output directly to the output register (11) through the selector (13), and the sum result SS is output to the output register (11) through the two-input AND gate (12); when the bypass signal is low, the sum result SS and the carry result SC are added in the carry-lookahead adder (16) and pass through the register (15) into the register file (14), and the output of the register file then goes to the output register (11) through the two-input selector (13).
4. The modular multiplier circuit structure according to claim 2, characterized in that the processing units in its data path use flip-flops (17, 18) and a register (19) to shift the multiplicand B and the modulus N left by 2 bits before passing them to the next-stage processing unit.
CNB2005100289154A 2005-08-18 2005-08-18 Extensible high-radix Montgomery's modular multiplication algorithm and circuit structure thereof Expired - Fee Related CN100435090C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100289154A CN100435090C (en) 2005-08-18 2005-08-18 Extensible high-radix Montgomery's modular multiplication algorithm and circuit structure thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100289154A CN100435090C (en) 2005-08-18 2005-08-18 Extensible high-radix Montgomery's modular multiplication algorithm and circuit structure thereof

Publications (2)

Publication Number Publication Date
CN1731345A CN1731345A (en) 2006-02-08
CN100435090C true CN100435090C (en) 2008-11-19

Family

ID=35963707

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100289154A Expired - Fee Related CN100435090C (en) 2005-08-18 2005-08-18 Extensible high-radix Montgomery's modular multiplication algorithm and circuit structure thereof

Country Status (1)

Country Link
CN (1) CN100435090C (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102122241A (en) * 2010-01-08 2011-07-13 复旦大学 Analog multiplier/divider applicable to prime field and polynomial field
CN102999313B (en) * 2012-12-24 2016-01-20 飞天诚信科技股份有限公司 A kind of data processing method based on montgomery modulo multiplication
CN105094746A (en) * 2014-05-07 2015-11-25 北京万协通信息技术有限公司 Method for achieving point addition/point doubling of elliptic curve cryptography
CN106681691B (en) * 2015-11-07 2019-01-29 上海复旦微电子集团股份有限公司 Data processing method, modular multiplication method and apparatus based on montgomery modulo multiplication
CN106681690B (en) * 2015-11-07 2019-02-26 上海复旦微电子集团股份有限公司 Data processing method, modular multiplication method and device based on montgomery modulo multiplication
CN108241481B (en) * 2016-12-26 2022-08-23 航天信息股份有限公司 Partial remainder multiplier equipment suitable for RSA algorithm
CN109271137B (en) * 2018-09-11 2020-06-02 网御安全技术(深圳)有限公司 Modular multiplication device based on public key encryption algorithm and coprocessor
CN109284085B (en) * 2018-09-25 2023-03-31 国网湖南省电力有限公司 High-speed modular multiplication and modular exponentiation operation method and device based on FPGA
CN109814838B (en) * 2019-03-28 2024-04-12 贵州华芯半导体技术有限公司 Method, hardware device and system for obtaining intermediate result set in encryption and decryption operation
CN111190571B (en) * 2019-12-30 2022-03-22 华南师范大学 Modular multiplication circuit based on binary domain expansion and control method thereof
CN112099763B (en) * 2020-09-10 2024-03-12 上海交通大学 Fast secure hardware multiplier for SM2 and application thereof
CN112486457B (en) * 2020-11-23 2022-12-20 杭州电子科技大学 Hardware system for realizing improved FIOS modular multiplication algorithm
CN113723035B (en) * 2021-07-23 2024-04-02 西安交通大学 Bit width variable modulo operation method and modulo operation circuit
CN114706557B (en) * 2022-04-01 2023-03-10 华控清交信息科技(北京)有限公司 ASIC chip and implementation method and device of Montgomery modular multiplication

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5742530A (en) * 1992-11-30 1998-04-21 Fortress U&T Ltd. Compact microelectronic device for performing modular multiplication and exponentiation over large numbers

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5742530A (en) * 1992-11-30 1998-04-21 Fortress U&T Ltd. Compact microelectronic device for performing modular multiplication and exponentiation over large numbers

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
一种Montgomery模乘的硬件算法及其实现. 方颖立,高志强.微电子学,第32卷第4期. 2002 *
一种新型硬件可配置公钥制密码协处理器的VLSI实现. 陈超,曾晓洋,章倩苓.通信学报,第26卷第1期. 2005 *

Also Published As

Publication number Publication date
CN1731345A (en) 2006-02-08

Similar Documents

Publication Publication Date Title
CN100435090C (en) Extensible high-radix Montgomery's modular multiplication algorithm and circuit structure thereof
Tenca et al. High-radix design of a scalable modular multiplier
Tenca et al. A scalable architecture for montgomery nultiplication
Schinianakis et al. An RNS implementation of an $ F_ {p} $ elliptic curve point multiplier
JP4955182B2 (en) Integer calculation field range extension
Kuang et al. Energy-efficient high-throughput Montgomery modular multipliers for RSA cryptosystems
US6691143B2 (en) Accelerated montgomery multiplication using plural multipliers
Shieh et al. Word-based Montgomery modular multiplication algorithm for low-latency scalable architectures
US6366936B1 (en) Pipelined fast fourier transform (FFT) processor having convergent block floating point (CBFP) algorithm
CN100470464C (en) Multiplier based on improved Montgomey's algorithm
US20030140077A1 (en) Logic circuits for performing modular multiplication and exponentiation
Zhang et al. Towards efficient hardware implementation of NTT for kyber on FPGAs
US8078661B2 (en) Multiple-word multiplication-accumulation circuit and montgomery modular multiplication-accumulation circuit
CN102231102A (en) Method for processing RSA password based on residue number system and coprocessor
Lin et al. Scalable montgomery modular multiplication architecture with low-latency and low-memory bandwidth requirement
JP2011034566A (en) Low power fir filter in multi-mac architecture
Savas et al. Multiplier architectures for GF (p) and GF (2n)
US7240204B1 (en) Scalable and unified multiplication methods and apparatus
Liu et al. A regular parallel RSA processor
Järvinen et al. A generalization of addition chains and fast inversions in binary fields
JP2008535011A (en) Method and apparatus for performing Montgomery modular multiplication
KR100950117B1 (en) Method and apparatus for processing arbitrary key bit length encryption operations with similar efficiencies
Luo et al. A novel two-stage modular multiplier based on racetrack memory for asymmetric cryptography
KR20080050226A (en) Modular multiplication device and method for designing modular multiplication device
US20040015532A1 (en) Modular multiplication apparatus, modular multiplication method, and modular exponentiation apparatus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20081119