CN1392472A - Montgomery modular multiplication algorithm for VLSI and VLSI structure of smart card modular multiplier - Google Patents


Info

Publication number: CN1392472A
Authority: CN (China)
Prior art keywords: montgomery, vlsi, mod, multiplication, algorithm
Legal status (an assumption, not a legal conclusion): Granted; Expired - Fee Related
Application number: CN 02125399
Other languages: Chinese (zh)
Other versions: CN1230736C (en)
Inventors: 李树国, 周润德, 孙义和
Current Assignee (the listed assignees may be inaccurate): Tsinghua University
Original Assignee: Tsinghua University
Application filed by: Tsinghua University
Priority date (an assumption, not a legal conclusion): 2002-07-31
Filing date: 2002-07-31
Publication date: 2003-01-22
Priority to: CN 02125399 (patent CN1230736C)
Landscapes: Complex Calculations (AREA)

Abstract

The present invention relates to encryption and decryption technology. The algorithm is highly parallel and suitable for VLSI implementation: the three large-number multiplications of the original Montgomery modular multiplication are decomposed into $2s^2+s$ small-integer multiplications. The VLSI structure of the smart card modular multiplier is a high-radix modular multiplier that uses 32-bit multipliers to perform 1024-bit modular multiplication, with a three-stage pipelined data path. Compared with existing structures, the invention reduces both the chip area and the number of clock cycles per modular multiplication, and can implement digital signature and verification with the RSA algorithm in a smart card.

Description

Montgomery modular multiplication algorithm for VLSI and VLSI structure of a smart card modular multiplier
Technical field
The Montgomery modular multiplication algorithm for VLSI and the VLSI structure of the smart card modular multiplier belong to the technical field of smart card encryption and decryption.
Background art
1 Public-key encryption technology
In 1976, M. E. Hellman, W. Diffie and R. Merkle of Stanford University proposed the "public-key cryptosystem", also known as the asymmetric cryptosystem or two-key cryptosystem. In such a system the encryption capability and the decryption capability are separated: encryption and decryption use two different keys, and deriving one key from the other is infeasible. Each user of an asymmetric cryptosystem has a selected key pair. One key is made public and is called the public key; the other is kept secret by the user and is called the private key. A public-key cryptosystem has the following advantages: (1) Key distribution is simple. Because the encryption and decryption keys differ and the decryption key cannot be derived from the encryption key, encryption keys can be distributed like a telephone directory. (2) Fewer keys need to be kept secret. Each user only needs to keep his or her own decryption key; for example, mutual authentication between N smart cards and M hosts requires only (N+M) key pairs. (3) Public keys allow an asymmetric cryptosystem to be used in open environments. (4) Digital signatures become possible. A digital signature is a security measure that lets the receiver prove to a third party the authenticity of a received message and of its sender. It resolves disputes arising from dishonesty on either side, ensuring that neither party can repudiate or forge a message to serve its own interests.
Modern cryptography solves the secrecy problem with a key, denoted K. K can take many values, and the range of possible values of K is called the key space (keyspace). Both encryption and decryption use a key (that is, both operations depend on the key, which is written as a subscript), so the encryption and decryption functions become:
$E_{k1}(M) = C$
$D_{k2}(C) = M$
where $E_{k1}$ is the encryption function that depends on key k1 and M (Message) is the plaintext to be encrypted; $D_{k2}$ is the decryption function that depends on key k2 and C (Crypto) is the ciphertext produced by encryption. The encryption and decryption process has the property shown in Fig. 1.
Many algorithms implement public-key cryptosystems; the most typical are the RSA algorithm and elliptic curve algorithms. The RSA algorithm was proposed in February 1978 by Rivest, Shamir and Adleman, three researchers at the Massachusetts Institute of Technology (MIT), and is named after the initials of their surnames. It can be used both for encryption and for digital signatures. The security of RSA rests on the difficulty of factoring large numbers: its public and private keys are functions of a pair of large primes (100 to 200 digits or more). Many RSA encryption chips have already been produced, and the correctness of the RSA algorithm has been established both in practice and in theory.
In a public-key encryption/decryption system there is a large-number modular exponentiation $P^e \bmod N$, which accounts for the enormous computational load of public-key encryption and decryption. The speed of large-number modular exponentiation therefore determines the practical performance of public-key encryption and decryption. Judging from research at home and abroad, the high security of public-key cryptography has made large-number modular exponentiation very widely used.
2 Decomposition of the large-number modular exponentiation $P^e \bmod N$
Public-key encryption and decryption consist of large-number modular exponentiation, and the speed of the modular exponentiation $P^e \bmod N$ determines how usable public-key encryption is. The modular exponentiation $P^e \bmod N$ can be decomposed into large-number modular multiplications AB mod N as follows:
Begin
  C = 1;                            // first assign the constant 1 to C
  for i = 0 to u-1 do {
    if (e_i = 1) C = X·C (mod N);   // first form of AB mod N
    X = X·X (mod N);                // second form of AB mod N
  }
  return C
end
Here $e=(e_{u-1} e_{u-2} \dots e_i \dots e_0)$ is the binary representation of the exponent. The decomposition of $X^e \bmod N$ shows that there is one basic operation, AB mod N. Computing the product AB is an ordinary multiplication of two numbers, and multiplication algorithms are mature and well studied, so once the product X = AB has been obtained, the modular reduction X mod N becomes the basic operation. Usually, given X, X mod N is obtained by repeatedly subtracting N from X; this is commonly called modular subtraction. In practice X = AB, so the multiplication AB is carried out first and the modular subtraction afterwards; the combined operation is called modular multiplication. Modular multiplication AB mod N is therefore a problem worth studying.
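The decomposition above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `mod_exp` is hypothetical, and Python's built-in `%` stands in for the modular multiplication AB mod N that the patent implements in hardware.

```python
def mod_exp(x, e, n):
    """Right-to-left binary exponentiation: returns x**e mod n."""
    c = 1 % n                  # C = 1 (reduced so that n = 1 also works)
    x %= n
    while e > 0:
        if e & 1:              # current exponent bit e_i is 1
            c = (x * c) % n    # first form of AB mod N
        x = (x * x) % n        # second form of AB mod N
        e >>= 1                # move on to the next exponent bit
    return c
```

Every loop iteration performs one or two modular multiplications, which is why the speed of AB mod N dominates the cost of the whole exponentiation.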
3 The Montgomery modular multiplication algorithm
The RSA algorithm is currently one of the most successful public-key cryptosystems in both theory and practice. Its security is based on the number-theoretic difficulty of factoring a large integer into prime factors. It uses a key pair: the public key or encryption key (e, N) and the private key or decryption key (d, N).
For a plaintext m, the encryption process is $c \equiv E(m) = m^e \bmod N$, where c denotes the ciphertext,
and the decryption process is $m \equiv D(c) = c^d \bmod N$, where m denotes the plaintext. The consistency of encryption and decryption can be proved by Euler's theorem. RSA encryption and decryption are in essence the computation of the modular power $m^e \bmod N$ or $c^d \bmod N$. However, because the operands m, e, c, d and N can be 1024 bits or longer, computing the modular power directly is impractical; it must first be decomposed into basic large-number modular multiplications AB mod N. The Montgomery algorithm was proposed precisely to solve large-number modular multiplication AB mod N.
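As a toy illustration of the encryption and decryption formulas, the following Python sketch runs one round trip. The parameters p, q, e and m are illustrative assumptions that do not come from the patent, and they are far too small for real use. The three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
# Toy RSA round trip with deliberately tiny, hypothetical parameters.
p, q = 61, 53
N = p * q                    # modulus
phi = (p - 1) * (q - 1)      # Euler's totient of N
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: e*d = 1 (mod phi)

m = 65                       # plaintext, must satisfy 0 <= m < N
c = pow(m, e, N)             # encryption: c = m^e mod N
recovered = pow(c, d, N)     # decryption: m = c^d mod N
```

That `recovered` equals `m` is the consistency property the text attributes to Euler's theorem.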
Original Montgomery modular multiplication algorithm
Let N be the modulus with N > 1 and let R be a radix coprime with N; usually $R = 2^u$, where u is the number of bits of N. Let $R^{-1}$ and $N'$ satisfy $0 < R^{-1} < N$, $0 < N' < R$ and $R R^{-1} - N N' = 1$, that is, $R R^{-1} \bmod N = 1$ and $N N' \bmod R = -1$. For a given large integer T with $0 \le T < RN$, the Montgomery algorithm is as follows:
function REDC(T)
m ← (T mod R) N′ mod R
t ← (T + mN) / R
if t ≥ N then return t − N else return t
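A minimal Python sketch of REDC follows, assuming Python 3.8 or later for the modular inverse via `pow`. The helper `montgomery_setup` and all parameter names are hypothetical; only the three REDC steps are taken from the text.

```python
def montgomery_setup(n, u):
    """Return (R, R^-1, N') for an odd modulus n and R = 2**u > n."""
    r = 1 << u
    r_inv = pow(r, -1, n)               # 0 < R^-1 < N
    n_prime = (r * r_inv - 1) // n      # from R*R^-1 - N*N' = 1
    return r, r_inv, n_prime


def redc(t, n, r, n_prime):
    """Montgomery reduction: returns t * R^-1 mod n for 0 <= t < R*n."""
    m = ((t % r) * n_prime) % r         # m <- (T mod R) N' mod R
    t = (t + m * n) // r                # exact division: T + mN is a multiple of R
    return t - n if t >= n else t       # single conditional subtraction
```

For example, with `r, r_inv, n_prime = montgomery_setup(97, 7)` the call `redc(45 * 76, 97, r, n_prime)` yields the Montgomery product of 45 and 76, that is `45 * 76 * r_inv % 97`.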
On the surface the algorithm contains only two large-number multiplications, TN′ and mN. In modular multiplication, however, T = AB with 0 ≤ A < N and 0 ≤ B < N, so the algorithm performs three large-number multiplications in total. When A, B and N are large integers of 1024 bits or more, multiplying such numbers directly is difficult to implement in hardware, so the large numbers must be decomposed. In addition, the algorithm returns the Montgomery product $ABR^{-1} \bmod N$ rather than the modular product AB mod N, so in use the constant factor $R^{-1}$ of the Montgomery product must also be eliminated to obtain the modular product.
At present, many patents on large-number modular multiplication have been filed abroad, but few domestically. There are two relevant domestic patents: "High-speed modular multiplication method and device (96109838.4)" and "Circuit and device for modular multiplication (99808871.4)". Compared with these two patents, the present application is more advanced and is suitable for implementation in very-large-scale integrated circuits (VLSI).
As smart cards become more widespread, data security in smart card transactions is becoming more and more important. Because the public-key cryptosystem RSA (Rivest, Shamir, Adleman) provides digital signature, message verification and authentication, it is increasingly necessary for smart cards to use RSA for data encryption. At present, however, RSA encryption on smart cards faces two main problems: 1) the VLSI (Very Large Scale Integration) implementation of the RSA cryptographic coprocessor occupies too much area; 2) the modular exponentiation speed of the RSA cryptographic coprocessor is low. This application analyzes and improves the Montgomery algorithm for large-number modular multiplication and proposes a new high-radix modular multiplier structure. The structure not only reduces the chip area but also reduces the number of clock cycles of modular exponentiation, and is suitable for smart card applications.
Summary of the invention
The object of the invention is to propose, for the design of a dedicated modular multiplier for smart cards, a Montgomery modular multiplication algorithm for VLSI and the VLSI structure of a smart card modular multiplier. Starting from FIPS (Finely Integrated Product Scanning), the uniprocessor software implementation of Montgomery multiplication proposed by Koc, the invention proposes a highly parallel algorithm for VLSI implementation, also called the improved FIPS algorithm.
The Montgomery modular multiplication algorithm proposed by the invention is characterized in that:
It is a highly parallel algorithm suitable for VLSI implementation. In essence it decomposes the three large-number multiplications of the original algorithm into $2s^2+s$ small-integer multiplications, and it comprises the following steps in order:
Let A and B each be an s-digit radix-r integer,
$A=(a_{s-1} a_{s-2} \dots a_1 a_0)$, $B=(b_{s-1} b_{s-2} \dots b_1 b_0)$;
the modulus N is also an s-digit radix-r integer,
$N=(n_{s-1} n_{s-2} \dots n_1 n_0)$, and $R=r^s$.
Then N < R and $n_0 n_0' \bmod r = -1$; let A < N and B < N.
S := 0, n′[0] := −n[0]^(−1) mod r // compute the modular inverse of n_0
(A) Using s^2 − s multiplications, compute the low s digits of the product; the intermediate results are denoted m[i]:
A.1 for i = 0, ..., s−1
A.2 for j = 0, ..., i−1
A.2.1 S := S + a[j]b[i−j] + m[j]n[i−j]
A.3 S := S + a[i]b[0]
A.4 m[i] := S n′[0] mod r
A.5 S := S + m[i]n[0]
A.6 S := S/r // shift right by one radix-r digit
(B) Using s^2 − s multiplications, compute the high s digits of the product; they are stored in the variable m:
B.1 for i = s, ..., 2s−1
B.2 for j = i−s+1, ..., s−1
B.2.1 S := S + a[j]b[i−j] + m[j]n[i−j]
B.3 m[i−s] := S mod r
B.4 S := S/r // shift right by one radix-r digit
(C) Using s additions, adjust the Montgomery modular product from [0, 2N) to [0, N):
C.1 t0 := S mod r // t0 is one radix-r digit
C.2 carry Cy := 1
C.3 for j = 0, ..., s−1
C.3.1 (Cy, b[j]) := m[j] + not(n[j]) + Cy
// Cy is the carry bit; this is an add with carry
t0 := t0 + not(0) + Cy
C.4 if t0 = 0
then return (b[s−1] b[s−2] ... b[1] b[0])
otherwise return (m[s−1] m[s−2] ... m[1] m[0])
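Steps (A) to (C) can be sketched in Python as follows. This is a software sketch under stated assumptions, not the patent's hardware: the names `to_digits`, `from_digits` and `fips_monpro` are hypothetical, digit lists are stored least significant digit first, the difference in Part C is written to a separate list `d` instead of overwriting b, `not(x)` is modelled as `r - 1 - x`, and the final update of t0 is taken modulo r. Python 3.8 or later is assumed for the modular inverse.

```python
def to_digits(x, s, r):
    """Split integer x into s radix-r digits, least significant first."""
    digits = []
    for _ in range(s):
        digits.append(x % r)
        x //= r
    return digits


def from_digits(digits, r):
    """Inverse of to_digits."""
    value = 0
    for d in reversed(digits):
        value = value * r + d
    return value


def fips_monpro(a, b, n, r):
    """Return the digits of A*B*R^-1 mod N, with R = r**s and N odd."""
    s = len(n)
    n0_prime = (-pow(n[0], -1, r)) % r        # n'[0] = -n[0]^-1 mod r
    m = [0] * s
    S = 0
    # Part A: low s digits of the product; m[i] are the quotient digits.
    for i in range(s):
        for j in range(i):
            S += a[j] * b[i - j] + m[j] * n[i - j]
        S += a[i] * b[0]
        m[i] = (S * n0_prime) % r
        S += m[i] * n[0]                      # S is now a multiple of r
        S //= r                               # shift right by one digit
    # Part B: high s digits; result digits reuse the storage of m.
    for i in range(s, 2 * s):
        for j in range(i - s + 1, s):
            S += a[j] * b[i - j] + m[j] * n[i - j]
        m[i - s] = S % r
        S //= r
    # Part C: compute (t0, m) - N with add-with-carry and pick the result.
    t0 = S % r
    cy = 1
    d = [0] * s
    for j in range(s):
        total = m[j] + (r - 1 - n[j]) + cy
        d[j] = total % r
        cy = total // r
    t0 = (t0 + (r - 1) + cy) % r
    return d if t0 == 0 else m
```

A possible use: with `r = 16` and `s = 3`, `from_digits(fips_monpro(to_digits(A, 3, 16), to_digits(B, 3, 16), to_digits(N, 3, 16), 16), 16)` should equal `A * B * pow(16**3, -1, N) % N` for any odd N below 4096 and any A, B below N.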
The VLSI structure of the smart card modular multiplier proposed by the invention is characterized in that:
It is a high-radix modular multiplier that uses 32-bit multipliers to perform 1024-bit modular multiplication and whose data path adopts a three-stage pipeline. The first stage consists of two 32-bit multipliers, whose inputs are a, b and m, n respectively, and two 64-bit registers whose inputs are connected to the outputs of the two multipliers. The second stage consists of a 64-bit adder, which sums the two 64-bit products and produces a carry Cy, and a 65-bit register connected to the output of this 64-bit adder. The third stage consists of a 76-bit adder, whose input is connected to the output of the 65-bit register and which computes the total accumulated sum, and a 76-bit register that is cross-connected with the 76-bit adder and whose output delivers the product result.
Use has shown that it achieves its intended purpose.
Brief description of the drawings
Fig. 1: the encryption/decryption process using two keys.
Fig. 2: the improved FIPS modular multiplication method for s = 3.
Fig. 3 to Fig. 5: flowcharts of the computer program of the Montgomery modular multiplication algorithm for VLSI proposed by the invention.
Fig. 6: structural diagram of the RSA modular multiplier Monpro.
Fig. 7: flowchart of the computer program for the modular exponentiation $M^e \bmod N$ with $R=r^s=2^{ks}$.
Fig. 8: structural diagram of the RSA encryption processor.
Detailed description of the embodiments
Refer to Fig. 2, which shows an example of the improved FIPS method for s = 3. It is divided into three parts, A, B and C. Part A corresponds to the computation on the right side of the dash-dot line in Fig. 2 and computes the low s words of the product; Part B corresponds to the computation on the left side of the dash-dot line and computes the high s words of the product. To save storage, the variable m is reused to store the high s words, so the final Montgomery product is stored in (m[s-1] m[s-2] ... m[1] m[0]). Because the Montgomery product is only guaranteed to lie in the range [0, 2N), it must still be adjusted into the range [0, N). Part C performs exactly this adjustment.
The computational bottleneck of the algorithm is the number of multiplications. Part A requires $s^2+2s$ multiplications and Part B requires $s^2-s$, for a total of $2s^2+s$. Part C requires s additions to adjust the modular product from [0, 2N) to [0, N).
The essence of the improved FIPS algorithm is to decompose the 3 large-number multiplications of the original Montgomery algorithm into $2s^2+s$ small-integer multiplications, which facilitates VLSI implementation. Fig. 3 to Fig. 5 are the flowcharts of its computer implementation.
The modular multiplier is the core arithmetic unit of the RSA cryptographic coprocessor. The speed of modular multiplication AB mod N depends on the number of clock cycles per modular multiplication, so the design goal is to minimize that number within a given area. In the VLSI implementation algorithm A, B and N are radix-r integers, so r is called the radix, and usually $r=2^k$. If $r=2^k$ and $k \ge 16$, r is called a high radix, and a high-radix modular multiplier is a modular multiplier based on a high radix. In this design the large numbers A, B and N each have u binary digits; for data security, u = 1024 bits is chosen. A, B and N can then each be expressed as a multi-precision number of s = u/k words, $A=(a_{s-1}, a_{s-2}, \dots, a_i, \dots, a_1, a_0)_r$ with $a_i=(a_{k-1}, a_{k-2}, \dots, a_1, a_0)$; that is, each $a_i$ ($0 \le i < s$) represents k binary digits. The larger k is, the larger the VLSI hardware becomes.
In the VLSI implementation algorithm, with s = u/k the total number of multiplications $2s^2+s$ becomes $2(u/k)^2+u/k$. For fixed u, the number of multiplications $2(u/k)^2+u/k$ decreases as k increases, and the computation time decreases accordingly, which is desirable. However, because k is proportional to the scale of the VLSI hardware, too large a k makes the area and delay of the VLSI implementation larger. The value of k should therefore minimize the number of clock cycles under the area constraint.
Choose $k=\sqrt{u}$; then $2(u/k)^2+u/k$ becomes $2u+\sqrt{u}$. The reason for taking the square root of u is that, when $\sqrt{u}$ is neglected (for $u \ge 1024$, $\sqrt{u}$ is small compared with u), the number of multiplications changes from the nonlinear $u^2$ to the linear u, which is very beneficial for reducing the number of clock cycles. With $k=\sqrt{u}$, synthesis based on the TSMC 0.35 μm standard cell library shows that the hardware area of the cryptographic coprocessor is about 38K gates. If k is increased further, synthesis under the same experimental conditions shows that the modular multiplier of the cryptographic coprocessor becomes larger. This design therefore sets $k=\sqrt{u}$.
Since u = 1024 bits and $k=\sqrt{u}=32$ have been fixed, the radix is $r=2^k=2^{32}$, so 1024-bit modular multiplication is implemented with 32-bit multipliers. In the VLSI implementation algorithm, Part A and Part B both contain the common product terms a[j]b[i-j] and m[j]n[i-j]. Because these two product terms have no data dependence on each other, two 32-bit multipliers can compute them in parallel, as shown in Fig. 6, so two multiplications are completed in one clock cycle.
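The trade-off between k and the number of multiplications can be tabulated with a short Python sketch. The function name and the candidate values of k are assumptions chosen for illustration; only the formula $2(u/k)^2+u/k$ comes from the text.

```python
def multiplication_count(u, k):
    """Number of k-bit multiplications, 2*s^2 + s with s = u / k words."""
    s = u // k
    return 2 * s * s + s


u = 1024
counts = {k: multiplication_count(u, k) for k in (8, 16, 32, 64)}
for k, count in counts.items():
    print(f"k = {k:2d}: s = {u // k:3d}, multiplications = {count}")
```

At k = 32 the count equals $2u+\sqrt{u}$, matching the expression in the text. Larger k reduces the count further, but at the cost of wider multipliers, which is the area constraint the text describes.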
In Part A of the VLSI implementation algorithm, a[j]b[i-j] and m[j]n[i-j] can be executed in parallel, so the $s^2-s$ multiplications of these two terms need only $(s^2-s)/2$ clock cycles. The other three product terms, a[i]b[0], S·n′[0] and m[i]n[0], involve two data dependences: S·n′[0] depends on a[i]b[0], and m[i]n[0] depends on S·n′[0]. With the three-stage pipeline of Fig. 6, each dependence costs a wait of 3 clock cycles, so the two dependences cost 6 clock cycles in total. Because a[i]b[0], S·n′[0] and m[i]n[0] are evaluated s times, accumulating these three product terms takes 6s clock cycles. In short, the multiply-accumulate operations of Part A take $6s+(s^2-s)/2$ clock cycles, that is, $(s^2+11s)/2$ clock cycles.
In Part B of the VLSI implementation algorithm, only the product terms a[j]b[i-j] and m[j]n[i-j], which can be executed in parallel, occur, so the $s^2-s$ multiplications need only $(s^2-s)/2$ clock cycles. In Part C, adjusting the modular product into [0, N) takes s additions and therefore s clock cycles. The total number of clock cycles consumed by Parts A, B and C is thus $s^2+6s$, or $u+6\sqrt{u}$. (Substituting $s=u/k$ and $k=\sqrt{u}$ into $s^2+6s$ gives $u+6\sqrt{u}$.)
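The cycle budget of the three parts can be checked with a small Python sketch. The function name is hypothetical; the per-part expressions are the ones stated in the text.

```python
def monpro_cycles(s):
    """Clock cycles for one modular multiplication with s words per operand."""
    part_a = 6 * s + (s * s - s) // 2   # dependent terms plus paired products
    part_b = (s * s - s) // 2           # paired products only
    part_c = s                          # s additions for the final adjustment
    return part_a + part_b + part_c


cycles = monpro_cycles(32)              # s = u / k = 1024 / 32
```

For s = 32 the three parts contribute 688, 496 and 32 cycles, which sum to $s^2+6s$ = 1216.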
In Part A of the VLSI implementation algorithm, the s products S·n′[0] are not accumulated into S, so the accumulated sum covers $2s^2+s-s=2s^2$ products. The accumulator must therefore be wider than $\log_2(2s^2 \cdot 2^{64})$ bits. With $s=u/k=\sqrt{u}=32$, $\log_2(2s^2 \cdot 2^{64})=75$, so the width of the accumulator is chosen as 76 bits. See Fig. 6.
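The accumulator width can be worked out step by step. The derivation below only expands the logarithm quoted in the text for s = 32; it assumes, as the text does, that each of the $2s^2$ accumulated products is bounded by $2^{64}$.

```latex
\begin{aligned}
\log_2\!\left(2s^2 \cdot 2^{64}\right)
  &= \log_2 2 + 2\log_2 s + 64 \\
  &= 1 + 2 \cdot 5 + 64 \qquad (s = 32 = 2^5) \\
  &= 75
\end{aligned}
```

A sum of $2s^2$ products, each below $2^{64}$, is therefore below $2^{75}$, and a 76-bit accumulator holds it with one bit to spare.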
The data path of the modular multiplier adopts a three-stage pipeline to increase its parallelism: mul32 => adder64 => adder76. The first stage is two 32-bit multipliers operating in parallel; the second stage is a 64-bit adder that sums the two 64-bit products and produces a carry Cy; the third stage is a 76-bit adder that computes the total accumulated sum. The control path of the modular multiplier uses a state machine to control the loop iterations and the data exchange between the modular multiplier and the memory. In all, one modular multiplication takes $u+6\sqrt{u}$ clock cycles; for u = 1024 bits, one modular multiplication takes 1216 clock cycles.
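The three stages can be modelled functionally in Python to make the bit widths concrete. This is a behavioural sketch only: it is not cycle-accurate, it ignores the pipeline registers and the state machine, and all function names are hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1
MASK76 = (1 << 76) - 1


def stage1_multiply(a_j, b_ij, m_j, n_ij):
    """Stage 1 (mul32): two 32-bit multipliers, each giving a 64-bit product."""
    assert max(a_j, b_ij, m_j, n_ij) <= MASK32
    return a_j * b_ij, m_j * n_ij


def stage2_add(p_ab, p_mn):
    """Stage 2 (adder64): returns the carry Cy and the 64-bit sum (65 bits total)."""
    total = p_ab + p_mn
    return total >> 64, total & MASK64


def stage3_accumulate(acc, cy, low64):
    """Stage 3 (adder76): add the 65-bit partial sum into the 76-bit accumulator."""
    return (acc + (cy << 64) + low64) & MASK76


def mac_step(acc, a_j, b_ij, m_j, n_ij):
    """One pass of a[j]b[i-j] + m[j]n[i-j] through all three stages."""
    p_ab, p_mn = stage1_multiply(a_j, b_ij, m_j, n_ij)
    cy, low64 = stage2_add(p_ab, p_mn)
    return stage3_accumulate(acc, cy, low64)
```

Calling `mac_step` repeatedly on the same accumulator mirrors the inner-loop update S := S + a[j]b[i-j] + m[j]n[i-j] of Parts A and B.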
Based on the RSA modular multiplier Monpro proposed by the invention, the hardware algorithm for the modular exponentiation $M^e \bmod N$ built on this multiplier is as follows, with $R=r^s=2^{ks}$:
Function MonExp(M, e, N, R) /* N is odd */
Step 1: M := M·R mod N
Step 2: x := 1·R mod N
Step 3: for i = u-1 downto 0
Step 4:   x := MonPro(x, x)
Step 5:   if (e_i = 1) then x := MonPro(M, x)
Step 6: x := MonPro(x, 1)
Step 7: return x
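Steps 1 to 7 can be sketched in Python as follows. The sketch is an assumption-laden illustration: `mon_pro` computes the Montgomery product on whole integers rather than with the word-level hardware algorithm, the names `mon_pro` and `mon_exp` are hypothetical, R is taken as $2^u$, and Python 3.8 or later is assumed for the modular inverse.

```python
def mon_pro(a, b, n, r):
    """Montgomery product a*b*R^-1 mod n for odd n < r and 0 <= a, b < n."""
    n_prime = (-pow(n, -1, r)) % r       # N*N' = -1 (mod R)
    t = a * b
    m = ((t % r) * n_prime) % r
    t = (t + m * n) // r
    return t - n if t >= n else t


def mon_exp(M, e, n, u):
    """Left-to-right modular exponentiation M**e mod n using mon_pro, R = 2**u."""
    r = 1 << u
    M_bar = (M * r) % n                  # Step 1: M := M*R mod N
    x = r % n                            # Step 2: x := 1*R mod N
    for i in range(u - 1, -1, -1):       # Step 3: scan e from bit u-1 down to 0
        x = mon_pro(x, x, n, r)          # Step 4: square
        if (e >> i) & 1:                 # Step 5: multiply when e_i = 1
            x = mon_pro(M_bar, x, n, r)
    return mon_pro(x, 1, n, r)           # Steps 6 and 7: strip the factor R
```

For instance, `mon_exp(65, 17, 3233, 12)` should agree with Python's built-in `pow(65, 17, 3233)`, since R = 4096 exceeds the modulus and the exponent fits in 12 bits.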
The flowchart of the corresponding computer program is shown in Fig. 7, and the structural diagram of the RSA encryption processor is shown in Fig. 8. In Fig. 8, Mux denotes a 2-to-1 multiplexer and Monpro denotes the modular multiplier structure of Fig. 6; (e, N) is the encryption key. The modular exponentiation algorithm scans $e=(e_{u-1} \dots e_i \dots e_0)$ from left to right and calls the RSA modular multiplier MonPro of Fig. 6. Because the Montgomery product is not the modular product, steps 1, 2 and 6 are used to eliminate the factor $R^{-1}$ from the Montgomery product and turn it into the modular product. The VLSI implementation of the modular exponentiation algorithm is the RSA cryptographic coprocessor shown in Fig. 8. The relation between $e_i$ in the modular exponentiation algorithm and $e_i'$ in Fig. 8 is: when $e_i=0$, $e_i'=0$ and only one modular multiplication is performed; when $e_i=1$, $e_i'=01$ and two modular multiplications are performed.
On average, for any i the probabilities of $e_i=1$ and $e_i=0$ are each one half, so on average 1.5 modular multiplications are performed per exponent bit, and the number of clock cycles for one modular exponentiation is $1.5u(s^2+6s)=1.5u^2+9u\sqrt{u}$. In the worst case $e_i=1$ for every i, two modular multiplications are performed per exponent bit, and the number of clock cycles for one modular exponentiation is $2u(s^2+6s)=2u^2+12u\sqrt{u}$ (with $s=u/k$, $k=\sqrt{u}$). With a 5 MHz working clock and u = 1024 bits, the average execution time is $1.5 \times 1024 \times (s^2+6s)/(5 \times 10^6) = 1.5 \times 1024 \times (u+6\sqrt{u})/(5 \times 10^6) = 374$ ms, and the worst-case execution time is $2 \times 1024 \times (s^2+6s)/(5 \times 10^6) = 2 \times 1024 \times (u+6\sqrt{u})/(5 \times 10^6) = 498$ ms.
The 1024-bit RSA cryptographic coprocessor was simulated with the Cadence tool Verilog-XL, which verified the consistency and correctness of encryption and decryption, $M \equiv M^{ed} \bmod N$. Synthesis with the Synopsys tools, based on the 0.35 μm TSMC standard cell library, shows that the RSA cryptographic coprocessor occupies 38K gates and needs 1216 clock cycles for one 1024-bit modular multiplication. Its longest delay is the combinational logic delay of the 32-bit multiplier, 15 ns, so the RSA cryptographic coprocessor can run at up to 65 MHz, which satisfies the 20 MHz operating frequency of smart cards. With an external 5 MHz working clock, the RSA cryptographic coprocessor needs 374 ms on average to encrypt a 1024-bit plaintext.
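The timing figures can be reproduced with a few lines of Python. The function and parameter names are hypothetical; the inputs (u = 1024, s = 32, a 5 MHz clock, 1.5 or 2 modular multiplications per exponent bit) are the values stated in the text.

```python
def exponentiation_time_ms(u, s, clock_hz, modmuls_per_bit):
    """Estimated time of one modular exponentiation in milliseconds."""
    cycles_per_modmul = s * s + 6 * s            # 1216 for s = 32
    total_cycles = modmuls_per_bit * u * cycles_per_modmul
    return 1000.0 * total_cycles / clock_hz


average_ms = exponentiation_time_ms(1024, 32, 5_000_000, 1.5)
worst_ms = exponentiation_time_ms(1024, 32, 5_000_000, 2)
print(f"average: {average_ms:.1f} ms, worst case: {worst_ms:.1f} ms")
```

Rounded to whole milliseconds, the two results agree with the 374 ms and 498 ms quoted in the text.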

Claims (2)

1. A Montgomery modular multiplication algorithm for VLSI, characterized in that: it is a highly parallel algorithm suitable for VLSI implementation which in essence decomposes the three large-number multiplications of the original algorithm into $2s^2+s$ small-integer multiplications, and it comprises the following steps in order:
Let A and B each be an s-digit radix-r integer,
$A=(a_{s-1} a_{s-2} \dots a_1 a_0)$, $B=(b_{s-1} b_{s-2} \dots b_1 b_0)$;
the modulus N is also an s-digit radix-r integer,
$N=(n_{s-1} n_{s-2} \dots n_1 n_0)$, and $R=r^s$.
Then N < R and $n_0 n_0' \bmod r = -1$; let A < N and B < N.
S := 0, n′[0] := −n[0]^(−1) mod r // compute the modular inverse of n_0
(A) Using s^2 − s multiplications, compute the low s digits of the product; the intermediate results are denoted m[i]:
A.1 for i = 0, ..., s−1
A.2 for j = 0, ..., i−1
A.2.1 S := S + a[j]b[i−j] + m[j]n[i−j]
A.3 S := S + a[i]b[0]
A.4 m[i] := S n′[0] mod r
A.5 S := S + m[i]n[0]
A.6 S := S/r // shift right by one radix-r digit
(B) Using s^2 − s multiplications, compute the high s digits of the product; they are stored in the variable m:
B.1 for i = s, ..., 2s−1
B.2 for j = i−s+1, ..., s−1
B.2.1 S := S + a[j]b[i−j] + m[j]n[i−j]
B.3 m[i−s] := S mod r
B.4 S := S/r // shift right by one radix-r digit
(C) Using s additions, adjust the Montgomery modular product from [0, 2N) to [0, N):
C.1 t0 := S mod r // t0 is one radix-r digit
C.2 carry Cy := 1
C.3 for j = 0, ..., s−1
C.3.1 (Cy, b[j]) := m[j] + not(n[j]) + Cy
// Cy is the carry bit; this is an add with carry
t0 := t0 + not(0) + Cy
C.4 if t0 = 0
then return (b[s−1] b[s−2] ... b[1] b[0])
otherwise return (m[s−1] m[s−2] ... m[1] m[0])
2. A smart card modular multiplier structure based on the Montgomery modular multiplication algorithm for VLSI according to claim 1, characterized in that: it is a high-radix modular multiplier that uses 32-bit multipliers to perform 1024-bit modular multiplication and whose data path adopts a three-stage pipeline. The first stage consists of two 32-bit multipliers, whose inputs are a, b and m, n respectively, and two 64-bit registers whose inputs are connected to the outputs of the two multipliers. The second stage consists of a 64-bit adder, which sums the two 64-bit products and produces a carry Cy, and a 65-bit register connected to the output of this 64-bit adder. The third stage consists of a 76-bit adder, whose input is connected to the output of the 65-bit register and which computes the total accumulated sum, and a 76-bit register that is cross-connected with the 76-bit adder and whose output delivers the product result.
CN 02125399, priority date 2002-07-31, filing date 2002-07-31: Montgomery modular multiplication algorithm for VLSI and VLSI structure of smart card modular multiplier. Expired - Fee Related. CN1230736C (en)

Publications (2)

Publication Number    Publication Date
CN1392472A (en)       2003-01-22
CN1230736C (en)       2005-12-07

Family

ID=4745548

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20051207