CN104252332B - Multiplier processing unit and multiplier for an elliptic curve cryptographic device - Google Patents

Multiplier processing unit and multiplier for an elliptic curve cryptographic device

Info

Publication number
CN104252332B
Authority
CN
China
Prior art keywords
unit
multiplier
output end
input terminal
summation circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410414896.8A
Other languages
Chinese (zh)
Other versions
CN104252332A (en)
Inventor
潘正祥
杨春生
李秋莹
闫立军
蔡正富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Airmate Electrical Shenzhen Co Ltd
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Airmate Electrical Shenzhen Co Ltd
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Airmate Electrical Shenzhen Co Ltd and Shenzhen Graduate School Harbin Institute of Technology
Priority to CN201410414896.8A
Publication of CN104252332A
Application granted
Publication of CN104252332B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Complex Calculations (AREA)

Abstract

The present invention relates to a multiplier processing unit (PE) for an elliptic curve cryptographic device. The PE comprises a computing unit, input terminals $B_{in}$, $C_{in}$ and $X_{in}$, and output terminals $B_{out}$, $C_{out}$ and $X_{out}$. The inputs $B_{in}$, $C_{in}$ and $X_{in}$ are each fed into the computing unit, and after processing the results are output at $B_{out}$, $C_{out}$ and $X_{out}$. Inside the computing unit, $B_{in}$ and $X_{in}$ are cyclically shifted left by $d$ positions: $B_{out} = B_{in} \ll d$, $X_{out} = X_{in} \ll d$. The value computed from $B_{in}$ and $X_{in}$ is added to $C_{in}$ cyclically shifted right by $d$ positions: $C_{out} = (C_{in} \gg d) + L(B_{in}, X_{in})$. Here $C_{in}$ is the result of the previous PE and is initially zero for the first PE; $C_{out}$ is the partial product computed by this PE and serves as the input of the next PE; $d$ is the digit length, $k$ is the number of segments, and $L$ denotes the operation. Because each step consists only of shifts and the evaluation of the $J$ function, the processing unit is fast and has low computational complexity, which improves the performance of the cryptographic device.

Description

Multiplier processing unit and multiplier for an elliptic curve cryptographic device
Technical field
The invention belongs to the field of digital coding, and in particular relates to a finite-field multiplier processing unit and multiplier suitable for elliptic curve cryptographic devices.
Background art
In recent years, efficient, high-performance and low-complexity designs for finite field arithmetic, and their applications, have attracted considerable attention. For example, cryptographic algorithms and systems must meet the security requirements set by the National Institute of Standards and Technology (NIST) and the Institute of Electrical and Electronics Engineers (IEEE) in order to reduce potential attacks and ensure hardware security. An important aspect of a cryptographic device is reducing cost while resisting side-channel attacks. In industry, fault detection has received much attention, for example in documents [3-6], and its relevance can be seen from attacks on cryptographic systems based on fault analysis and side-channel analysis. In practice, such protection adds overhead to the original design, so the design must be efficient enough to tolerate and bear this overhead. Recently, elliptic curve cryptography, an effective technique that satisfies the requirements of public-key cryptography, has been implemented in many high-performance and security-constrained applications. For example, it can be used in mobile ad hoc networks (MANETs) to provide credibility and integrity checking efficiently, and this checking does not need to consider whether physical-layer security is at risk. Elliptic curve cryptography is based on the algebraic structure of elliptic curves over finite fields, and its arithmetic operations determine the efficiency of cryptosystems built on it. Much research has therefore focused on efficient, low-complexity, high-performance designs of the arithmetic units used in elliptic curve cryptosystems and in public-key cryptosystems based on the RSA algorithm. Recently, Gaussian normal basis (GNB) multipliers have been widely used to compute point multiplication (also called scalar multiplication) in elliptic curve cryptography. Notably, this operation must not only be efficient; in timing-constrained applications its implementation must also be high-performance.
Over large binary fields, field multiplication can be designed with the systolic array method, which yields high-speed and regular very-large-scale integration (VLSI) implementations. Systolic arrays avoid irregular circuit design; in other words, for different choices of binary field their hardware structures are modular and very similar. Their concurrency, balanced input and output, and simple, regular design make them suitable for high-performance applications. Although systolic architectures are widely used where high-speed structures are needed, this usually presupposes that their area complexity is acceptable. For example, document [16] proposes an optimal normal basis systolic multiplier that is highly regular and can be implemented in a data-serial fashion. A high-performance implementation of such a systolic multiplier on reconfigurable hardware is reported in document [17].
[3] A. Yazdani, H. Sepahvand, M. Crow, and M. Ferdowsi, "Fault Detection and Mitigation in Multilevel Converter STATCOMs," IEEE Trans. Ind. Electron., vol. 58, no. 4, pp. 1307-1315, 2011.
[4] M. A. A. Claudio-Sanchez, D. Theilliol, L. Vela-Valdes, P. Sibaja-Teran, L. Hernandez-Gonzalez, and J. Aguayo-Alquicira, "A Failure-Detection Strategy for IGBT Based on Gate-Voltage Behavior Applied to a Motor Drive System," IEEE Trans. Ind. Electron., vol. 58, no. 5, pp. 1625-1633, 2011.
[5] T. A. Najafabadi, F. R. Salmasi, and P. Jabehdar-Maralani, "Detection and Isolation of Speed-, DC-Link Voltage-, and Current-Sensor Faults Based on an Adaptive Observer in Induction-Motor Drives," IEEE Trans. Ind. Electron., vol. 58, no. 5, pp. 1662-1672, 2011.
[6] S. Cruz, M. Ferreira, A. Mendes, and A. J. M. Cardoso, "Analysis and Diagnosis of Open-Circuit Faults in Matrix Converters," IEEE Trans. Ind. Electron., vol. 58, no. 5, pp. 1648-1661, 2011.
[16] S. Kwon, "A Low Complexity and a Low Latency Bit Parallel Systolic Multiplier over GF(2^m) Using an Optimal Normal Basis of Type II," in Proc. IEEE Symp. Computer Arithmetic (Arith-16), pp. 196-202, 2003.
[17] J. Fan, D. Bailey, L. Batina, T. Guneysu, C. Paar, and I. Verbauwhede, "Breaking Elliptic Curves Cryptosystems using Reconfigurable Hardware," in Proc. of 20th Int'l Conf. on Field Programmable Logic and Applications (FPL 2010), pp. 133-138, 2010.
[18] A. Reyhani-Masoleh, "Efficient Algorithms and Architectures for Field Multiplication Using Gaussian Normal Bases," IEEE Trans. Computers, vol. 55, no. 1, pp. 34-47, Jan. 2006.
[20] R. Azarderakhsh and A. Reyhani-Masoleh, "A Modified Low Complexity Digit-Level Gaussian Normal Basis Multiplier," in Proc. Int'l Workshop Arithmetic of Finite Fields (WAIFI), vol. 6087, pp. 25-40, 2010.
Summary of the invention
The present invention provides a multiplier processing unit for an elliptic curve cryptographic device, intended to solve the problem that existing processing units compute slowly and require long operation times.
The invention is realized as follows. A processing unit for an elliptic curve cryptographic multiplier: the multiplier processing unit PE comprises a computing unit, input terminals $B_{in}$, $C_{in}$ and $X_{in}$, and output terminals $B_{out}$, $C_{out}$ and $X_{out}$. The inputs $B_{in}$, $C_{in}$ and $X_{in}$ are each fed into the computing unit, and after processing the results are output at $B_{out}$, $C_{out}$ and $X_{out}$. Inside the computing unit, $B_{in}$ and $X_{in}$ are cyclically shifted left by $d$ positions: $B_{out} = B_{in} \ll d$, $X_{out} = X_{in} \ll d$. The value computed from $B_{in}$ and $X_{in}$ is added to $C_{in}$ cyclically shifted right by $d$ positions: $C_{out} = (C_{in} \gg d) + L(B_{in}, X_{in})$. Here $C_{in}$ is the result of the previous PE and is initially zero for the first PE; $C_{out}$ is the partial product computed by this PE and serves as the input of the next PE; $d$ is the digit length, $k$ is the number of segments, and $L$ denotes the operation.
Another object of the present invention is to provide a one-dimensional multiplier comprising $k$ multiplier processing units PE as described in claim 1 and one accumulation circuit AC. The $k$ PEs are connected in series and followed by the accumulation circuit AC. The inputs of each PE are the computed outputs of the preceding PE. The three inputs of the first PE are $(B_0, B_1, \dots, B_{n-1})$, $(0, 0, \dots, 0)$ and $(X_0, X_1, \dots, X_{n-1})$, where $X$ is obtained by reversing $A$ and then cyclically shifting it to the right, and $A$ is the multiplicand. The output formula is: [formula not reproduced]
In a further technical solution of the invention, the accumulation circuit AC comprises an addition unit, a temporary storage unit and a shift unit. The output of the shift unit is connected to an input of the addition unit, the output of the addition unit is connected to the input of the temporary storage unit, and the output of the temporary storage unit is connected to the input of the shift unit. The accumulation circuit shifts the result of the previous computation of the $k$ PEs and adds it to the next output of the $k$ PEs.
Another object of the present invention is to provide a two-dimensional multiplier comprising $k$ one-dimensional multipliers as described in claim 2 or 3, $2k-2$ CS modules, $k-1$ accumulation circuits AC1 and one accumulation circuit AC2. The $k$ one-dimensional multipliers are connected in parallel. The output of the first one-dimensional multiplier is connected to the shift unit of the first accumulation circuit AC1; the $k-1$ accumulation circuits AC1 are connected in series, and the $(k-1)$-th AC1 is connected in series with the AC2. The outputs of the second to $(k-1)$-th one-dimensional multipliers are each connected to one accumulation circuit AC1. The $B$ and $X$ inputs of the second to $(k-1)$-th one-dimensional multipliers are each connected to a CS module, while the inputs of the first one-dimensional multiplier are fed directly. The operation formula is: [formula not reproduced]
In a further technical solution of the invention, the accumulation circuit AC1 comprises a shift unit and an addition unit, with the output of the shift unit connected to an input of the addition unit. The AC1 shifts its input and adds it to the output of the connected one-dimensional multiplier; its shift unit performs a cyclic right shift by $kd$ positions.
In a further technical solution of the invention, the accumulation circuit AC2 comprises a shift unit, an addition unit and a temporary storage unit. The output of the shift unit is connected to an input of the addition unit, the output of the addition unit is connected to the temporary storage unit, and the output of the temporary storage unit is connected to an input of the addition unit. The shift unit cyclically shifts the input value right by $k^2 d$ positions.
In a further technical solution of the invention, the CS module cyclically shifts its input value right by $kd$ positions.
The beneficial effects of the invention are as follows. Because each computation consists only of shifts and the evaluation of the $J$ function, the processing unit is fast and has low computational complexity, which improves the performance of the cryptographic device. The multiplier is based on a systolic array architecture, so it is easy to implement in VLSI systems and offers low latency and high performance.
Description of the drawings
Fig. 1 is the DL-PIPO GNB multiplier circuit on which the present invention is based;
Fig. 2 is a structural diagram of the processing unit PE provided by an embodiment of the invention;
Fig. 3 is the one-dimensional multiplier circuit provided by an embodiment of the invention;
Fig. 4 is the two-dimensional multiplier circuit provided by an embodiment of the invention.
Detailed description
Fig. 2 shows the processing unit for an elliptic curve cryptographic multiplier provided by the invention. The multiplier processing unit PE comprises a computing unit, input terminals $B_{in}$, $C_{in}$ and $X_{in}$, and output terminals $B_{out}$, $C_{out}$ and $X_{out}$. The inputs $B_{in}$, $C_{in}$ and $X_{in}$ are each fed into the computing unit, and after processing the results are output at $B_{out}$, $C_{out}$ and $X_{out}$. Inside the computing unit, $B_{in}$ and $X_{in}$ are cyclically shifted left by $d$ positions: $B_{out} = B_{in} \ll d$, $X_{out} = X_{in} \ll d$. The value computed from $B_{in}$ and $X_{in}$ is added to $C_{in}$ cyclically shifted right by $d$ positions: $C_{out} = (C_{in} \gg d) + L(B_{in}, X_{in})$. Here $C_{in}$ is the result of the previous PE and is initially zero for the first PE; $C_{out}$ is the partial product computed by this PE and serves as the input of the next PE; $d$ is the digit length, $k$ is the number of segments, and $L$ denotes the operation. Because each step consists only of shifts and the evaluation of the $J$ function, the processing unit is fast and has low computational complexity, which improves the performance of the cryptographic device.
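The data flow of one PE can be sketched in Python as follows. This is only an illustration under stated assumptions: field elements are packed into `m`-bit integers, the mapping of "left" and "right" onto integer bit positions is a modelling choice, and `and_stub` is a hypothetical stand-in for the operation $L$, whose real definition (built from the $J$ function) is not reproduced here. Addition in GF($2^m$) is bitwise XOR.

```python
def rotl(v: int, s: int, m: int) -> int:
    """Cyclically shift the m-bit word v left by s positions."""
    s %= m
    return ((v << s) | (v >> (m - s))) & ((1 << m) - 1)


def rotr(v: int, s: int, m: int) -> int:
    """Cyclically shift the m-bit word v right by s positions."""
    return rotl(v, m - (s % m), m)


def pe_step(b_in, c_in, x_in, d, m, L):
    """One PE: forward the shifted B and X, and update the partial result C."""
    b_out = rotl(b_in, d, m)                  # B_out = B_in << d
    x_out = rotl(x_in, d, m)                  # X_out = X_in << d
    c_out = rotr(c_in, d, m) ^ L(b_in, x_in)  # C_out = (C_in >> d) + L(B_in, X_in)
    return b_out, c_out, x_out


def and_stub(b, x):
    """Hypothetical placeholder for L; NOT the patent's actual operation."""
    return b & x
```

Chaining `k` calls of `pe_step`, each fed with the outputs of the previous call and with `c_in = 0` for the first, mirrors the series connection of PEs described for the one-dimensional multiplier.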
Fig. 3 shows the one-dimensional multiplier provided as another object of the invention. The one-dimensional multiplier comprises $k$ multiplier processing units PE as described in claim 1 and one accumulation circuit AC. The $k$ PEs are connected in series and followed by the accumulation circuit AC. The inputs of each PE are the computed outputs of the preceding PE. The three inputs of the first PE are $(B_0, B_1, \dots, B_{n-1})$, $(0, 0, \dots, 0)$ and $(X_0, X_1, \dots, X_{n-1})$, where $X$ is obtained from $A$ by shifting. The output formula is: [formula not reproduced]
The accumulation circuit AC comprises an addition unit, a temporary storage unit and a shift unit. The output of the shift unit is connected to an input of the addition unit, the output of the addition unit is connected to the input of the temporary storage unit, and the output of the temporary storage unit is connected to the input of the shift unit. The accumulation circuit shifts the result of the previous computation of the $k$ PEs and adds it to the next output of the $k$ PEs.
Fig. 4 shows the two-dimensional multiplier provided as another object of the invention. The two-dimensional multiplier comprises $k$ one-dimensional multipliers as described in claim 2 or 3, $2k-2$ CS modules, $k-1$ accumulation circuits AC1 and one accumulation circuit AC2. The $k$ one-dimensional multipliers are connected in parallel. The output of the first one-dimensional multiplier is connected to the shift unit of the first accumulation circuit AC1; the $k-1$ accumulation circuits AC1 are connected in series, and the $(k-1)$-th AC1 is connected in series with the AC2. The outputs of the second to $(k-1)$-th one-dimensional multipliers are each connected to one accumulation circuit. The $B$ and $X$ inputs of the second to $(k-1)$-th one-dimensional multipliers are each connected to a CS module, while the inputs of the first one-dimensional multiplier are fed directly. The operation formula is: [formula not reproduced]
The accumulation circuit AC1 comprises a shift unit and an addition unit, with the output of the shift unit connected to an input of the addition unit. The AC1 shifts its input and adds it to the output of the connected one-dimensional multiplier; its shift unit performs a cyclic right shift by $kd$ positions.
The accumulation circuit AC2 comprises a shift unit, an addition unit and a temporary storage unit. The output of the shift unit is connected to an input of the addition unit, the output of the addition unit is connected to the temporary storage unit, and the output of the temporary storage unit is connected to an input of the addition unit. The shift unit cyclically shifts the input value right by $k^2 d$ positions.
The CS module cyclically shifts its input value right by $kd$ positions.
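The shift-and-add blocks described above can be modelled behaviourally as below. This is a literal reading of the wiring in the text, not a verified hardware model: elements are `m`-bit integers, GF($2^m$) addition is XOR, and the shift amount of the 1-D accumulation circuit AC is left as a parameter (`shift`) because the text does not state it. All class and function names are illustrative.

```python
def rotr(v: int, s: int, m: int) -> int:
    """Cyclically shift the m-bit word v right by s positions."""
    s %= m
    return ((v >> s) | (v << (m - s))) & ((1 << m) - 1)


def cs(value, k, d, m):
    """CS module: cyclic right shift by k*d positions."""
    return rotr(value, k * d, m)


def ac1(chain_in, multiplier_out, k, d, m):
    """AC1: shift the incoming value by k*d, then add the 1-D multiplier output."""
    return rotr(chain_in, k * d, m) ^ multiplier_out


class AC:
    """1-D accumulator: register -> shift unit -> adder -> register."""

    def __init__(self, m, shift):
        self.m, self.shift, self.reg = m, shift, 0

    def clock(self, pe_out):
        # Shift the previous result, then add the next PE-chain output.
        self.reg = rotr(self.reg, self.shift, self.m) ^ pe_out
        return self.reg


class AC2:
    """AC2: shift the input by k^2*d, then accumulate it into the register."""

    def __init__(self, m, k, d):
        self.m, self.shift, self.reg = m, k * k * d, 0

    def clock(self, value):
        self.reg ^= rotr(value, self.shift, self.m)
        return self.reg
```

Since a cyclic shift is only a permutation of wires, `cs` and the shift units cost no logic in hardware; only the XOR adders and registers do.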
Two new digit-level GNB multipliers are derived below using a decomposition method.
Let $N = \{\beta, \beta^{2}, \beta^{2^2}, \dots, \beta^{2^{m-1}}\}$ be a normal basis (NB) of GF($2^m$), where $\beta \in$ GF($2^m$). $\beta$ is a normal element of GF($2^m$), and such a set is a normal basis of GF($2^m$). Let $m$ and $T$ be positive integers such that $p = mT + 1$ is prime and $\gcd(mT/k, m) = 1$, where $k$ is the multiplicative order of 2 modulo $p$. Let $\alpha$ be a primitive $(mT+1)$-th root of unity in GF($2^{mT}$). Then, for any primitive $T$-th root of unity $\tau$ in $\mathbb{Z}_p$, the element $\beta = \sum_{i=0}^{T-1} \alpha^{\tau^i}$ generates a normal basis of the binary field GF($2^m$) over GF(2). This basis is called a type-$T$ Gaussian normal basis (GNB). The complexity (in time and space) of GNB multipliers depends on their type $T > 1$. NIST recommends five binary fields, with $m$ = 163, 233, 283, 409 and 571. The types $T$ of these five fields are even: 4, 2, 6, 4 and 10, respectively.
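The existence condition stated above ($p = mT + 1$ prime and $\gcd(mT/k, m) = 1$, with $k$ the multiplicative order of 2 modulo $p$) is easy to check numerically. The following Python sketch does so with plain trial division; the function names are illustrative and not taken from the patent.

```python
from math import gcd


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for the small p used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def order_of_two(p: int) -> int:
    """Multiplicative order of 2 modulo an odd prime p."""
    k, v = 1, 2 % p
    while v != 1:
        v = (v * 2) % p
        k += 1
    return k


def has_gnb(m: int, T: int) -> bool:
    """True if a type-T Gaussian normal basis exists for GF(2^m)."""
    p = m * T + 1
    if p < 3 or not is_prime(p):
        return False
    k = order_of_two(p)  # k divides p - 1 = m*T
    return gcd((m * T) // k, m) == 1


# The five (m, T) pairs listed in the text.
NIST_PAIRS = [(163, 4), (233, 2), (283, 6), (409, 4), (571, 10)]
```

Calling `has_gnb(m, T)` on each pair in `NIST_PAIRS` confirms that the listed types satisfy the condition, while for example `has_gnb(163, 2)` fails because $327 = 3 \cdot 109$ is not prime.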
GNB multiplication is computed using the multiplication matrix $R_{(m-1)\times T}$ of document [18]. Let $A = (a_0, a_1, \dots, a_{m-1})$ and $B = (b_0, b_1, \dots, b_{m-1})$ be two elements of GF($2^m$) represented in a type-$T$ GNB. Their product in GF($2^m$) can be expressed as: [formula not reproduced]
where [formula not reproduced]
Here $(X \ll i)$ denotes a cyclic left shift of $X \in$ GF($2^m$) by $i$ positions, and $X \odot Y = (x_0 y_0, \dots, x_{m-1} y_{m-1})$ and $X \oplus Y = (x_0 \oplus y_0, \dots, x_{m-1} \oplus y_{m-1})$ denote the bitwise AND and bitwise XOR of the coordinates of $X$ and $Y$. Finite field multipliers can be designed as bit-level (space complexity $O(m)$, time complexity $O(m)$), digit-level (space complexity $O(md)$, time complexity $O(m/d)$) and bit-parallel (space complexity $O(m^2)$, time complexity $O(1)$) architectures.
Recently, low-complexity digit-level parallel-in parallel-out (DL-PIPO) GNB multipliers were proposed in documents [18] and [20], of which the one in document [20] is the best. The DL-PIPO architecture is shown in Fig. 1. In this multiplier, the two operands $A$ and $B$ (preloaded into registers <X> and <Y>) must both be retained throughout the computation, and the result is obtained in parallel after $q = \lceil m/d \rceil$ clock cycles, $1 \le d \le m$. Note that for a given field size, the digit width $d$ should be chosen sensibly to reduce both time and space complexity. The time complexity of the digit-level GNB multiplier is [formula not reproduced]
The area complexity is $dm$ AND gates and [formula not reproduced]
logic gates. Using the common-subexpression elimination algorithm proposed in document [20], the area complexity is reduced further, to only [formula not reproduced] logic gates, where [formula not reproduced]
A. 1-D digit-level systolic structure
From the symmetric structure of the matrix $R_{(m-1)\times T}$ in (1), $S(i, B)$ can be rewritten as: [formula not reproduced]
Therefore, in place of the matrix $R_{(m-1)\times T}$, we can define a matrix $u$ as: [formula not reproduced]
where $u_k$ is the $k$-th row of the matrix $u$. Fig. 1 shows the DL-PIPO GNB multiplier architecture. Suppose the input element $A$ (preloaded into register <X>) is re-expressed as [formula not reproduced], where [formula not reproduced]. Then, using the matrix $u$, the product of $A$ and $B$ is given by: [formula not reproduced]
where $J(X, Y) = X \odot P(Y)$ and $P(Y) = (y_1, s'(1, Y), s'(2, Y), \dots, s'(2, Y), s'(1, Y))$, with [formula not reproduced]. For each coordinate, the function $J(X, Y)$ produces its result from appropriately shifted input parameters. These functions are weighted sums of the bits of the input $B$ (preloaded into register <Y>) and are determined by the matrix $u$ and the bit positions of $B$. The matrix $u$ is represented as a $P$ block, which computes the linear combinations of $B$ and is implemented with XOR trees.
Let $q = \lceil m/d \rceil$, $1 \le d \le m$. Then the product in (5) can be written as: [formula not reproduced]
where [formula not reproduced]
Let $n$ and $k$ be two integers satisfying $q = kn$. If $q$ is not divisible by $k$, $X$ and $B$ must be zero-padded in their least significant positions so that $q = kn$ holds. The partial product $C_i$ is defined as: [formula not reproduced]
From the above, for the integer $k$, the index $i$ and the digit width $d$, we have $X_i = X \gg kid$ and $B_i = B \gg kid$. The product $C$ in (6) can then be compressed into $n$ partial products: [formula not reproduced]
where the product $C$ is expressed most-significant-digit first (MSD-first). To compute the partial product $C_i$ in (8),
assume that [formula not reproduced] has been determined beforehand. Each partial product $C_i$ can then be re-expressed as: [formula not reproduced]
Algorithm 1 describes the proposed 1-D systolic GNB multiplication using (9) and (10). Following Algorithm 1, Figs. 2 and 3 depict the proposed 1-D digit-level systolic GNB multiplier, with Fig. 3 showing the complete digit-level systolic multiplier. The proposed structure consists of $k$ processing elements (PEs), namely $PE_0, PE_1, \dots, PE_{k-1}$, and one accumulation circuit (AC). The core circuit of Fig. 1 computes the multiplication in (7), so the PE circuit of Fig. 2 can be built from the circuit of Fig. 1. Each PE implements steps 8 and 9 of Algorithm 1, and the AC circuit implements step 11.
We now explain the multiplication steps shown in Fig. 3. Considering the PE operation of Fig. 2 and Algorithm 1, the output $B$ of $PE_j$ used to compute the partial product $C_i$ is denoted $B_{i,j}$. In the initial step, register <C> is initialized to 0, and $X_i$ and $B_i$, $0 \le i < n-1$, are obtained by cyclic shift operations. In the first clock cycle, the two elements $X_{n-1}$ and $B_{n-1}$ are fed to the proposed 1-D systolic multiplier to compute the partial product $C_{n-1}$. In the next clock cycle, $X_{n-2}$ and $B_{n-2}$ are fed in to compute the partial product $C_{n-2}$, and so on. Each partial result $C_i$ is computed by the $k$ PEs and stored in register <C>, which takes $k+1$ clock cycles in total. Therefore, with the proposed 1-D systolic multiplier, the GNB multiplication $C = AB$ completes after $k+n$ clock cycles. The following proposition gives the number of clock cycles of the proposed 1-D digit-level systolic multiplier.
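The feeding order described above can be tabulated with a few lines of Python. This is a bookkeeping sketch only, assuming `q` digits are split over `k` PEs with zero-padding up to the next multiple of `k`; the function name and return layout are illustrative.

```python
def feed_schedule(q: int, k: int):
    """Return (n, feed, total_cycles) for a 1-D array of k PEs.

    n            : number of partial products after padding q up to k*n
    feed         : list of (clock_cycle, i), meaning X_i and B_i enter
                   the array in that cycle (X_{n-1}, B_{n-1} go first)
    total_cycles : k + n, the cycle count stated in the text
    """
    n = -(-q // k)  # ceiling division, i.e. pad q to a multiple of k
    feed = [(t, n - t) for t in range(1, n + 1)]
    return n, feed, k + n
```

For instance, `feed_schedule(9, 3)` gives three partial products fed in the order $i = 2, 1, 0$ and a total of 6 clock cycles.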
Proposition 1. For a type-$T$ GNB over any field GF($2^m$), the proposed 1-D digit-level parallel-output systolic multiplier requires at most $2\lceil\sqrt{q}\rceil$ clock cycles, where $q = \lceil m/d \rceil$ and $d$ is the chosen digit width. In other words, the latency of the proposed 1-D digit-level systolic multiplier is at most $2\lceil\sqrt{\lceil m/d \rceil}\rceil$.
Proof: The GNB multiplication is divided into $q$ segments. With the digit-level systolic structure of Figs. 2 and 3, suppose we have $k$ PEs and one AC, where $q = kn$; the GNB multiplication can then be divided into $n$ partial results. The partial products $C_i$, together with $X_i$ and $B_i$, are used as the inputs of the PEs of the systolic array multiplier. The whole GNB multiplication requires $k + n$ clock cycles. For a given $q = kn$, if $k$ (the number of PEs) is small, $n$ (the number of elements fed to the PEs) becomes large, and the latency $k + n$ becomes large as well. To obtain minimum latency and good finite field multiplication performance for large $m$, we need to minimize $k + n = k + q/k$. The first derivative with respect to $k$ must therefore equal 0, that is $1 - q/k^2 = 0$, which requires $k = \sqrt{q}$. We therefore choose $k = \lceil\sqrt{q}\rceil$, and the latency of the proposed multiplier becomes $2\lceil\sqrt{q}\rceil$ clock cycles. If $q$ is not a perfect square, the computation takes even fewer clock cycles than this bound. This completes the proof.
Conclusion 1. According to Proposition 1, if $d = 1$ and the number of PEs is $\lceil\sqrt{m}\rceil$, then the latency of the proposed 1-D systolic GNB multiplier is at most $2\lceil\sqrt{m}\rceil$ clock cycles.
To make the above discussion of the proposed 1-D systolic GNB multiplier clearer, the following example illustrates the operation of the PEs in different clock cycles.
Example 1. Let [formula not reproduced] be two type-6 GNB elements in GF($2^{27}$), and assume the chosen digit width is $d = 3$. Then we have [formula not reproduced]. According to (9), the product $C$ can be expressed as [formula not reproduced], where
[formula not reproduced] for $i = 0, 1, 2$. Table 1 lists the operation of each PE in each clock cycle. Note that the proposed 1-D digit-level systolic GNB multiplier needs 6 clock cycles.
Figs. 2 and 3 are realized as a 1-D systolic array; the presented GNB multiplier comprises $k$ PEs and one AC circuit. Each PE circuit is built from the structure in Fig. 1 and contains $dm$ AND gates, [formula not reproduced] and three $m$-bit registers. The critical path delay of each PE is [formula not reproduced]. The AC circuit contains one $m$-bit GF($2^m$) adder and one $m$-bit register. For the structures of Figs. 2 and 3, the latency of the proposed GNB multiplier is at most $2\lceil\sqrt{\lceil m/d \rceil}\rceil$ clock cycles.
B. 2-D digit-level systolic structure
In this section, to obtain a high-performance implementation, we present a 2-D systolic GNB multiplier. Compared with the 1-D systolic structure described above, it achieves higher performance for small digit widths (or large field sizes). Let $k$ and $n$ be two integers satisfying $q = k^2 n$. If $q$ is not divisible by $k^2$, $X$ and $B$ can be zero-padded so that $q = k^2 n$ holds. To derive the 2-D digit-level systolic multiplier, we compress (6) into a sum of $n$ partial results: [formula not reproduced]
where [formula not reproduced] and
$X_{ij} = X \gg (k^2 i d + k j d)$, $B_{ij} = B \gg (k^2 i d + k j d)$.
Each partial result $C_{ij}$ is a sum of $k$ partial products as in (7), and the partial result $C_i$ in (12) is the sum of the $k$ partial results $C_{ij}$. To compute the partial product $C_i$ in a fully pipelined manner, each partial result $C_{ij}$ is realized by a 1-D systolic array structure, as presented in Fig. 3. The proposed 2-D systolic array multiplier for computing $C_i$ is shown in Fig. 4. In Fig. 4, the proposed 2-D systolic multiplier consists of $k$ 1-D systolic arrays, $(k-1)$ cyclic shift circuits, $(k-1)$ AC1 structures and one AC2 structure. Each CS module provides a cyclic right shift by $kd$ positions, which requires only rewiring in a hardware implementation. The 1-D systolic array [i] in the figure computes the sum of the $k$ partial products $C_{ij}$.
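The rotation amounts $k^2 i d + k j d$ that define $X_{ij}$ and $B_{ij}$ can be enumerated with a short Python sketch. The index ranges $0 \le i < n$ and $0 \le j < k$ are an assumption made for illustration, as is the function name.

```python
def rotation_offsets(k: int, n: int, d: int) -> dict:
    """Map (i, j) to the cyclic right-shift amount k^2*i*d + k*j*d."""
    return {(i, j): k * k * i * d + k * j * d
            for i in range(n)
            for j in range(k)}


offsets = rotation_offsets(k=2, n=2, d=1)
# Neighbouring 1-D arrays (j -> j+1) differ by k*d, the CS module shift;
# consecutive partial results (i -> i+1) differ by k^2*d, the AC2 shift.
```

With `k=2, n=2, d=1` the offsets are 0, 2, 4 and 6 for `(0,0)`, `(0,1)`, `(1,0)` and `(1,1)`, which shows how the $kd$ and $k^2 d$ shifts of the CS, AC1 and AC2 blocks tile the full operand.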
Proposition 2. For a field constructed with a type-$T$ GNB with even $T$, the proposed 2-D systolic GNB multiplier has a maximum latency of $3\lceil\sqrt[3]{q}\rceil$ clock cycles, where $q = \lceil m/d \rceil$, and the number of PEs is $k^2 = \lceil\sqrt[3]{q}\rceil^2$.
Proof: Let $k$ and $n$ be two positive integers satisfying $q = \lceil m/d \rceil = k^2 n$, where $d$ is the chosen digit width. The GNB multiplication uses $k^2$ PEs to build the 2-D systolic array structure; computing $C_i$ in (12) with this circuit takes $2k$ clock cycles. Therefore, the GNB multiplication in (11) needs $2k + n$ clock cycles. As in Proposition 1, to minimize $2k + n = 2k + q/k^2$ the first derivative must be zero, that is $2 - 2q/k^3 = 0$, which requires $k = \sqrt[3]{q}$. We therefore choose $k = \lceil\sqrt[3]{q}\rceil$, and the latency of the 2-D systolic multiplier is $3\lceil\sqrt[3]{q}\rceil$ clock cycles. When $q$ is not a perfect cube, the computation needs fewer clock cycles than this bound, which proves the proposition.
Conclusion 2. According to Proposition 2, the latency of the proposed 2-D systolic GNB multiplier is at most $3\lceil\sqrt[3]{\lceil m/d \rceil}\rceil$ clock cycles.
Fig. 4 is realized as a 2-D systolic array. The proposed 2-D systolic multiplier consists of [formula not reproduced] AC1 circuits and one AC2 circuit. With this structure, the minimum latency of the GNB multiplier reaches $3\lceil\sqrt[3]{\lceil m/d \rceil}\rceil$ clock cycles. Compared with the 1-D systolic multiplier, the latency is lower when a small digit width (or a large field size) is chosen. For example, with digit width 1, the latency of the 2-D systolic multiplier over GF($2^{409}$) is 24 clock cycles, the same as that of the 1-D systolic multiplier with digit width 3. This means that the 2-D systolic multiplier performs better under these conditions.
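The latency comparison can be reproduced numerically. The sketch below assumes the bounds $2\lceil\sqrt{q}\rceil$ (1-D) and $3\lceil\sqrt[3]{q}\rceil$ (2-D) with $q = \lceil m/d \rceil$; these closed forms are inferred here from the cycle counts $k + n$ and $2k + n$ and from the worked numbers in the text, so they should be read as an assumption rather than as formulas quoted from the patent. Integer arithmetic is used to avoid floating-point rounding in the roots.

```python
def ceil_div(a: int, b: int) -> int:
    """Ceiling of a / b for positive integers."""
    return -(-a // b)


def ceil_root(q: int, r: int) -> int:
    """Smallest integer k with k**r >= q."""
    k = 1
    while k ** r < q:
        k += 1
    return k


def bound_1d(m: int, d: int) -> int:
    """Assumed 1-D latency bound: 2 * ceil(sqrt(ceil(m / d)))."""
    return 2 * ceil_root(ceil_div(m, d), 2)


def bound_2d(m: int, d: int) -> int:
    """Assumed 2-D latency bound: 3 * ceil(cbrt(ceil(m / d)))."""
    return 3 * ceil_root(ceil_div(m, d), 3)
```

Under these assumptions `bound_2d(409, 1)` and `bound_1d(409, 3)` both evaluate to 24, matching the example above, and `bound_1d(27, 3)` evaluates to 6, matching Example 1. For comparison, `bound_1d(409, 1)` is 42, which illustrates why the 2-D structure is attractive at small digit widths.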
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (6)

1. a kind of multiplier processing unit PE for elliptic curves cryptosystem device, which is characterized in that multiplier processing unit PE includes Computing unit, input terminal Bin, input terminal Cin, input terminal Xin, output end Bout, output end CoutAnd output end Xout, the input Hold Bin, input terminal CinAnd input terminal XinComputing unit is inputted respectively, from the described defeated of the computing unit after calculation processing Outlet Bout, output end CoutAnd output end XoutIt exports, B in the computing unitin、XinRing shift left d is carried out, cycle is left Moving d is:Bout=Bin< < d, Xout=Xin< < d, B in computing unitin、XinOperation values and CinCarry out ring shift right d Position is added, and formula is:Cout=Cin> > d+L (Bin, Xin), wherein CinIt is a upper processing unit PE as a result, for The C of one processing unit PEinIt is initially zero, CoutIt is that processing unit PE calculates output product as a result, single as next processing The input of first PE, d are expressed as numerical digit length, and L identifies for operation,Wherein, J (X, Y) =X ⊙ P (Y), input B are loaded into register Y, and P (Y) is used to calculate the linear combination of B.
2. A one-dimensional multiplier, characterized in that the one-dimensional multiplier comprises k multiplier processing units PE according to claim 1 and one accumulation circuit AC; the k processing units PE are connected in series and followed by the accumulation circuit AC; the input of each PE is obtained from the computed output of the preceding PE; the three input parameters of the first PE are $(B_0, B_1, \ldots, B_{n-1})$, $(0, 0, \ldots, 0)$ and $(X_0, X_1, \ldots, X_{n-1})$, where X is obtained by reversing A and then cyclically shifting it right by one position; the output is computed as: [...]
where A is the operand to be multiplied.
3. The one-dimensional multiplier according to claim 2, characterized in that the accumulation circuit AC comprises an addition unit, a temporary storage unit and a shift unit; the output of the shift unit is connected to the input of the addition unit, the output of the addition unit is connected to the input of the temporary storage unit, and the output of the temporary storage unit is connected to the input of the shift unit; the accumulation circuit shifts the result of one computation pass of the k processing units PE and adds it to the output of the next pass of the k processing units PE.
4. A two-dimensional multiplier, characterized in that the two-dimensional multiplier comprises k one-dimensional multipliers according to claim 2 or 3, 2k-2 CS modules, k-1 accumulation circuits AC1 and one accumulation circuit AC2; the k one-dimensional multipliers are connected in parallel; the output of the first one-dimensional multiplier is connected to the shift unit of the first accumulation circuit AC1; the k-1 accumulation circuits AC1 are connected in series; the (k-1)-th accumulation circuit AC1 is connected to the accumulation circuit AC2; the outputs of the second through (k-1)-th one-dimensional multipliers are each connected to one accumulation circuit AC1; the B and X inputs of the second through (k-1)-th one-dimensional multipliers are each connected to one CS module; the inputs of the first one-dimensional multiplier are fed in directly; the operation is computed as: [...]
The CS modules cyclically shift the input value right by kd bits.
5. The two-dimensional multiplier according to claim 4, characterized in that the accumulation circuit AC1 comprises a shift unit and an addition unit; the output of the shift unit is connected to the input of the addition unit; the accumulation circuit AC1 shifts its input, adds it to the output of the connected one-dimensional multiplier, and outputs the sum; the shift unit performs a cyclic right shift by kd bits.
6. The two-dimensional multiplier according to claim 5, characterized in that the accumulation circuit AC2 comprises a shift unit, an addition unit and a temporary storage unit; the output of the shift unit is connected to the input of the addition unit, the output of the addition unit is connected to the temporary storage unit, and the output of the temporary storage unit is connected to the input of the addition unit; the shift unit cyclically shifts the input value right by $k^2 d$ bits; the accumulation circuit AC1 comprises a shift unit and an addition unit, and the output of the shift unit is connected to the input of the addition unit.
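The data path recited in claims 1, 3, 4 and 5 can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the patented circuit: field elements are modelled as n-bit Python integers, "addition" is taken to be bitwise XOR (addition in a binary field), the bit order of the integer model is arbitrary, and `placeholder_l` is a hypothetical stand-in for the operation L, whose formula is not reproduced in the text above. The function and class names are likewise illustrative and do not come from the patent. The shift amount and direction of the accumulation circuit AC in claim 3 are not stated, so they are a parameter here and a right rotation is assumed.

```python
from typing import Callable, Tuple


def rotl(value: int, shift: int, n: int) -> int:
    """Cyclically shift an n-bit word left by `shift` bits."""
    mask = (1 << n) - 1
    value &= mask
    shift %= n
    return ((value << shift) | (value >> (n - shift))) & mask


def rotr(value: int, shift: int, n: int) -> int:
    """Cyclically shift an n-bit word right by `shift` bits."""
    return rotl(value, (n - shift % n) % n, n)


def placeholder_l(b: int, x: int) -> int:
    """Hypothetical stand-in for L(B_in, X_in); the real formula is not given here."""
    return b & x


def pe_step(b_in: int, c_in: int, x_in: int, d: int, n: int,
            l_op: Callable[[int, int], int] = placeholder_l) -> Tuple[int, int, int]:
    """One processing unit PE: rotate B and X left by d, rotate C right by d and add L."""
    b_out = rotl(b_in, d, n)
    x_out = rotl(x_in, d, n)
    c_out = rotr(c_in, d, n) ^ l_op(b_in, x_in)  # XOR models addition (assumption)
    return b_out, c_out, x_out


def run_pe_chain(b: int, x: int, k: int, d: int, n: int,
                 l_op: Callable[[int, int], int] = placeholder_l) -> Tuple[int, int, int]:
    """k PEs in series; C of the first PE starts at zero."""
    c = 0
    for _ in range(k):
        b, c, x = pe_step(b, c, x, d, n, l_op)
    return b, c, x


class Accumulator:
    """Accumulation circuit AC: register -> shift unit -> adder -> register."""

    def __init__(self, shift: int, n: int) -> None:
        self.shift = shift  # shift amount is an assumption; claim 3 does not state it
        self.n = n
        self.reg = 0

    def step(self, pass_output: int) -> int:
        """Shift the stored result and add the output of the next pass of the PEs."""
        self.reg = rotr(self.reg, self.shift, self.n) ^ pass_output
        return self.reg


def cs_module(value: int, k: int, d: int, n: int) -> int:
    """CS module: cyclic right shift by k*d bits."""
    return rotr(value, k * d, n)


def ac1_step(chained_in: int, multiplier_out: int, k: int, d: int, n: int) -> int:
    """AC1: shift the chained input right by k*d bits, then add the multiplier output."""
    return rotr(chained_in, k * d, n) ^ multiplier_out
```

Keeping the shift width `d` and the word length `n` as explicit parameters mirrors the role of the digit width in the claims; replacing `placeholder_l` with the actual operation L would be required before the sketch computes a real field product.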
CN201410414896.8A 2014-08-20 2014-08-20 A kind of multiplier processing unit and multiplier for elliptic curves cryptosystem device Expired - Fee Related CN104252332B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410414896.8A CN104252332B (en) 2014-08-20 2014-08-20 A kind of multiplier processing unit and multiplier for elliptic curves cryptosystem device

Publications (2)

Publication Number Publication Date
CN104252332A CN104252332A (en) 2014-12-31
CN104252332B true CN104252332B (en) 2018-09-18

Family

ID=52187288

Country Status (1)

Country Link
CN (1) CN104252332B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180918

Termination date: 20190820