CN104252332B - Multiplier processing unit and multiplier for an elliptic curve cryptographic device - Google Patents
Multiplier processing unit and multiplier for an elliptic curve cryptographic device
- Publication number
- CN104252332B (application number CN201410414896.8A)
- Authority
- CN
- China
- Prior art keywords
- unit
- multiplier
- output end
- input terminal
- summation circuit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Complex Calculations (AREA)
Abstract
The present invention relates to a multiplier processing unit (PE) for an elliptic curve cryptographic device. The PE comprises a computing unit, input terminals $B_{in}$, $C_{in}$ and $X_{in}$, and output terminals $B_{out}$, $C_{out}$ and $X_{out}$. The inputs $B_{in}$, $C_{in}$ and $X_{in}$ are each fed to the computing unit, and after processing the results are output at $B_{out}$, $C_{out}$ and $X_{out}$. In the computing unit, $B_{in}$ and $X_{in}$ are cyclically shifted left by d positions: $B_{out} = B_{in} \ll d$, $X_{out} = X_{in} \ll d$. The value computed from $B_{in}$ and $X_{in}$ is added to $C_{in}$ cyclically shifted right by d positions: $C_{out} = (C_{in} \gg d) + L(B_{in}, X_{in})$. Here $C_{in}$ is the result of the preceding PE (for the first PE, $C_{in}$ is initially zero), $C_{out}$ is the product result computed by this PE and serves as the input of the next PE, d is the digit width, k is the number of segments, and L denotes the operation performed by the computing unit. Because the computation consists only of shift operations and the evaluation of the J function, the processing unit is fast and has low computational complexity, which improves the performance of the cryptographic device.
Description
Technical field
The invention belongs to the field of digital coding, and in particular relates to a finite-field multiplier processing unit and a multiplier suitable for an elliptic curve cryptographic device.
Background
In recent years, efficient, high-performance and low-complexity designs for finite field arithmetic, and their applications, have attracted considerable attention. For example, cryptographic algorithms and systems must meet the security requirements set by the National Institute of Standards and Technology (NIST) and the Institute of Electrical and Electronics Engineers (IEEE) in order to reduce potential attacks and ensure hardware security. One important aspect of a cryptographic device is keeping cost down while resisting side-channel attacks. In industry, error detection has received much attention, for example in documents [3-6], as can be seen from attacks on cryptosystems based on fault analysis and side-channel analysis. In practice, adding such protection to an original design usually incurs overhead, so the design must be efficient enough to tolerate and absorb it. Recently, elliptic curve cryptography, an effective technique that meets the requirements of public-key cryptography, has been deployed in many high-performance and security-constrained applications. For example, it can be used in mobile ad hoc networks (MANETs) to provide trustworthiness and integrity checks efficiently, without having to consider whether the physical layer is secure. Elliptic curve cryptography is based on the algebraic structure of elliptic curves over finite fields, and its arithmetic operations determine the efficiency of any cryptosystem built on it. Much research has therefore focused on efficient, low-complexity, high-performance designs of the arithmetic units used in elliptic curve cryptosystems and in the public-key encryption algorithm RSA. Gaussian normal basis (GNB) multipliers have recently been widely used to compute point multiplication (also called scalar multiplication) in elliptic curve cryptography. Notably, this operation must not only be efficient; in timing-constrained applications its implementation must also be high-performance.
In large binary fields, field multiplication can be designed with a systolic array approach to obtain high-speed, regular very-large-scale integration (VLSI) implementations. Systolic arrays avoid irregular circuit design: for different choices of binary field, their hardware structures are modular and very similar. Their concurrency, balanced input and output, and simple, regular design make them suitable for high-performance applications. Although systolic architectures are widely used where high-speed structures are needed, this usually presupposes that their area complexity is acceptable. For example, document [16] proposes an optimal normal basis systolic multiplier that is highly regular and can be implemented in a data-serial manner. A high-performance implementation of such a systolic multiplier on reconfigurable hardware is reported in document [17].
[3] A. Yazdani, H. Sepahvand, M. Crow, and M. Ferdowsi, "Fault Detection and Mitigation in Multilevel Converter STATCOMs," IEEE Trans. Ind. Electron., vol. 58, no. 4, pp. 1307-1315, 2011.
[4] M. A. Rodriguez-Blanco, A. Claudio-Sanchez, D. Theilliol, L. Vela-Valdes, P. Sibaja-Teran, L. Hernandez-Gonzalez, and J. Aguayo-Alquicira, "A Failure-Detection Strategy for IGBT Based on Gate-Voltage Behavior Applied to a Motor Drive System," IEEE Trans. Ind. Electron., vol. 58, no. 5, pp. 1625-1633, 2011.
[5] T. A. Najafabadi, F. R. Salmasi, and P. Jabehdar-Maralani, "Detection and Isolation of Speed-, DC-Link Voltage-, and Current-Sensor Faults Based on an Adaptive Observer in Induction-Motor Drives," IEEE Trans. Ind. Electron., vol. 58, no. 5, pp. 1662-1672, 2011.
[6] S. Cruz, M. Ferreira, A. Mendes, and A. J. M. Cardoso, "Analysis and Diagnosis of Open-Circuit Faults in Matrix Converters," IEEE Trans. Ind. Electron., vol. 58, no. 5, pp. 1648-1661, 2011.
[16] S. Kwon, "A Low Complexity and a Low Latency Bit Parallel Systolic Multiplier over GF(2^m) Using an Optimal Normal Basis of Type II," in Proc. IEEE Symp. Computer Arithmetic (Arith-16), pp. 196-202, 2003.
[17] J. Fan, D. Bailey, L. Batina, T. Guneysu, C. Paar, and I. Verbauwhede, "Breaking Elliptic Curves Cryptosystems using Reconfigurable Hardware," in Proc. of 20th Int'l Conf. on Field Programmable Logic and Applications (FPL 2010), 2010, pp. 133-138.
[18] A. Reyhani-Masoleh, "Efficient Algorithms and Architectures for Field Multiplication Using Gaussian Normal Bases," IEEE Trans. Computers, vol. 55, no. 1, pp. 34-47, Jan. 2006.
[20] R. Azarderakhsh and A. Reyhani-Masoleh, "A Modified Low Complexity Digit-Level Gaussian Normal Basis Multiplier," in Proc. Int'l Workshop Arithmetic of Finite Fields (WAIFI), vol. 6087, pp. 25-40, 2010.
Summary of the invention
The present invention provides a multiplier processing unit for an elliptic curve cryptographic device, aiming to solve the problem that existing processing units compute slowly and take a long time.
The invention is realized as follows. A multiplier processing unit PE for an elliptic curve cryptographic device comprises a computing unit, input terminals $B_{in}$, $C_{in}$ and $X_{in}$, and output terminals $B_{out}$, $C_{out}$ and $X_{out}$. The inputs $B_{in}$, $C_{in}$ and $X_{in}$ are each fed to the computing unit, and after processing the results are output at $B_{out}$, $C_{out}$ and $X_{out}$. In the computing unit, $B_{in}$ and $X_{in}$ are cyclically shifted left by d positions: $B_{out} = B_{in} \ll d$, $X_{out} = X_{in} \ll d$. The value computed from $B_{in}$ and $X_{in}$ is added to $C_{in}$ cyclically shifted right by d positions: $C_{out} = (C_{in} \gg d) + L(B_{in}, X_{in})$. Here $C_{in}$ is the result of the preceding processing unit PE (for the first PE, $C_{in}$ is initially zero), $C_{out}$ is the product result computed by this PE and serves as the input of the next PE, d is the digit width, k is the number of segments, and L denotes the operation performed by the computing unit.
Another object of the present invention is to provide a one-dimensional multiplier comprising k multiplier processing units PE as described in claim 1 and one accumulation circuit AC. The k processing units PE are connected in series and followed by the accumulation circuit AC. The input of each PE is the output computed by the preceding PE, and the three inputs of the first PE are $(B_0, B_1, \ldots, B_{n-1})$, $(0, 0, \ldots, 0)$ and $(X_0, X_1, \ldots, X_{n-1})$, where X is obtained by reversing A and then cyclically shifting it right by one position, and A is the multiplicand. The output calculation formula is: [equation missing in source]
In a further technical solution of the present invention, the accumulation circuit AC comprises an addition unit, a register and a shift unit. The output of the shift unit is connected to the input of the addition unit, the output of the addition unit is connected to the input of the register, and the output of the register is connected to the input of the shift unit. The accumulation circuit shifts the result of the previous computation of the k PEs and adds it to the next output of the k PEs.
Another object of the present invention is to provide a two-dimensional multiplier comprising k one-dimensional multipliers as described in claim 2 or 3, 2k-2 CS modules, k-1 accumulation circuits AC1 and one accumulation circuit AC2. The k one-dimensional multipliers are connected in parallel. The output of the first one-dimensional multiplier is connected to the shift unit of the first accumulation circuit AC1. The k-1 accumulation circuits AC1 are connected in series, and the (k-1)-th accumulation circuit AC1 is connected to the accumulation circuit AC2. The outputs of the second to (k-1)-th one-dimensional multipliers are each connected to one accumulation circuit AC1. The B and X inputs of the second to (k-1)-th one-dimensional multipliers are each connected to a CS module, while the inputs of the first one-dimensional multiplier are applied directly. The operational formula is: [equation missing in source]
In a further technical solution of the present invention, the accumulation circuit AC1 comprises a shift unit and an addition unit, with the output of the shift unit connected to the input of the addition unit. AC1 shifts its input, adds it to the output of the connected one-dimensional multiplier and outputs the sum. Its shift unit performs a cyclic right shift by kd positions.
In a further technical solution of the present invention, the accumulation circuit AC2 comprises a shift unit, an addition unit and a register. The output of the shift unit is connected to the input of the addition unit, the output of the addition unit is connected to the register, and the output of the register is connected to the input of the addition unit. The shift unit cyclically shifts the input value right by $k^2 d$ positions. The accumulation circuit AC1 comprises a shift unit and an addition unit, with the output of the shift unit connected to the input of the addition unit.
In a further technical solution of the present invention, the CS modules cyclically shift their input right by kd positions.
The beneficial effects of the invention are as follows. Because the computation consists only of shift operations and the evaluation of the J function, the processing unit is fast and has low computational complexity, which improves the performance of the cryptographic device. The proposed multiplier is based on a systolic array architecture, so it is easy to implement in VLSI systems and offers low latency and high performance.
Brief description of the drawings
Fig. 1 shows the DL-PIPO GNB multiplier circuit on which the present invention is based;
Fig. 2 is a structural diagram of the processing unit PE provided by an embodiment of the present invention;
Fig. 3 shows the one-dimensional multiplier circuit provided by an embodiment of the present invention;
Fig. 4 shows the two-dimensional multiplier circuit provided by an embodiment of the present invention.
Detailed description
Fig. 2 shows the processing unit for an elliptic curve cryptographic device multiplier provided by the present invention. The multiplier processing unit PE comprises a computing unit, input terminals $B_{in}$, $C_{in}$ and $X_{in}$, and output terminals $B_{out}$, $C_{out}$ and $X_{out}$. The inputs $B_{in}$, $C_{in}$ and $X_{in}$ are each fed to the computing unit, and after processing the results are output at $B_{out}$, $C_{out}$ and $X_{out}$. In the computing unit, $B_{in}$ and $X_{in}$ are cyclically shifted left by d positions: $B_{out} = B_{in} \ll d$, $X_{out} = X_{in} \ll d$. The value computed from $B_{in}$ and $X_{in}$ is added to $C_{in}$ cyclically shifted right by d positions: $C_{out} = (C_{in} \gg d) + L(B_{in}, X_{in})$. Here $C_{in}$ is the result of the preceding processing unit PE (for the first PE, $C_{in}$ is initially zero), $C_{out}$ is the product result computed by this PE and serves as the input of the next PE, d is the digit width, k is the number of segments, and L denotes the operation performed by the computing unit. Because the computation consists only of shift operations and the evaluation of the J function, the processing unit is fast and has low computational complexity, which improves the performance of the cryptographic device.
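The PE data path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: coordinates are stored as a list $(x_0, \ldots, x_{m-1})$, a cyclic left shift by one moves $x_1$ to index 0, and addition in $GF(2^m)$ is coordinate-wise XOR. The operation L is passed in as a parameter because its defining formula is not reproduced in this text; `and_placeholder` is a hypothetical stand-in, not the patent's L, and all function names are illustrative.

```python
from typing import Callable, List, Tuple

Bits = List[int]


def rotl(v: Bits, d: int) -> Bits:
    """Cyclic left shift by d positions (assumed convention: x_d moves to index 0)."""
    d %= len(v)
    return v[d:] + v[:d]


def rotr(v: Bits, d: int) -> Bits:
    """Cyclic right shift by d positions, the inverse of rotl."""
    return rotl(v, -d)


def xor(a: Bits, b: Bits) -> Bits:
    """Addition in GF(2^m): coordinate-wise XOR."""
    return [x ^ y for x, y in zip(a, b)]


def pe_step(b_in: Bits, c_in: Bits, x_in: Bits, d: int,
            L: Callable[[Bits, Bits], Bits]) -> Tuple[Bits, Bits, Bits]:
    """One PE: B_out = B_in << d, X_out = X_in << d, C_out = (C_in >> d) + L(B_in, X_in)."""
    b_out = rotl(b_in, d)
    x_out = rotl(x_in, d)
    c_out = xor(rotr(c_in, d), L(b_in, x_in))
    return b_out, c_out, x_out


def and_placeholder(b: Bits, x: Bits) -> Bits:
    """Hypothetical stand-in for L (coordinate-wise AND only); NOT the patent's L."""
    return [p & q for p, q in zip(b, x)]
```

Chaining k calls to `pe_step`, each taking the previous call's three outputs as its inputs, mirrors the series connection of the PEs; the first call receives an all-zero `c_in`.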
Fig. 3 shows the one-dimensional multiplier, which is another object of the present invention. The one-dimensional multiplier comprises k multiplier processing units PE as described in claim 1 and one accumulation circuit AC. The k processing units PE are connected in series and followed by the accumulation circuit AC. The input of each PE is the output computed by the preceding PE, and the three inputs of the first PE are $(B_0, B_1, \ldots, B_{n-1})$, $(0, 0, \ldots, 0)$ and $(X_0, X_1, \ldots, X_{n-1})$, where X is obtained from A by shifting. The output calculation formula is: [equation missing in source]
The accumulation circuit AC comprises an addition unit, a register and a shift unit. The output of the shift unit is connected to the input of the addition unit, the output of the addition unit is connected to the input of the register, and the output of the register is connected to the input of the shift unit. The accumulation circuit shifts the result of the previous computation of the k PEs and adds it to the next output of the k PEs.
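For the X input of the first PE, claim 2 states that X is obtained by reversing A and then cyclically shifting it right by one position. A minimal sketch of that step follows, assuming A is stored as a list $(a_0, \ldots, a_{m-1})$ and that a cyclic right shift by one moves the last coordinate to index 0; the function name `derive_x` is illustrative.

```python
from typing import List


def derive_x(a: List[int]) -> List[int]:
    """Reverse A, then cyclically shift the result right by one position."""
    reversed_a = a[::-1]
    # Right shift by one: the last coordinate wraps around to the front.
    return reversed_a[-1:] + reversed_a[:-1]
```

Under these assumptions, $(a_0, a_1, a_2, a_3)$ maps to $(a_0, a_3, a_2, a_1)$: the first coordinate stays in place and the remaining ones appear in reverse order.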
Fig. 4 shows the two-dimensional multiplier, which is another object of the present invention. The two-dimensional multiplier comprises k one-dimensional multipliers as described in claim 2 or 3, 2k-2 CS modules, k-1 accumulation circuits AC1 and one accumulation circuit AC2. The k one-dimensional multipliers are connected in parallel. The output of the first one-dimensional multiplier is connected to the shift unit of the first accumulation circuit AC1. The k-1 accumulation circuits AC1 are connected in series, and the (k-1)-th accumulation circuit AC1 is connected to the accumulation circuit AC2. The outputs of the second to (k-1)-th one-dimensional multipliers are each connected to one accumulation circuit. The B and X inputs of the second to (k-1)-th one-dimensional multipliers are each connected to a CS module, while the inputs of the first one-dimensional multiplier are applied directly. The operational formula is: [equation missing in source]
The accumulation circuit AC1 comprises a shift unit and an addition unit, with the output of the shift unit connected to the input of the addition unit. AC1 shifts its input, adds it to the output of the connected one-dimensional multiplier and outputs the sum. Its shift unit performs a cyclic right shift by kd positions.
The accumulation circuit AC2 comprises a shift unit, an addition unit and a register. The output of the shift unit is connected to the input of the addition unit, the output of the addition unit is connected to the register, and the output of the register is connected to the input of the addition unit. The shift unit cyclically shifts the input value right by $k^2 d$ positions.
The CS modules cyclically shift their input right by kd positions.
Two new digit-level GNB multipliers are derived below using a decomposition method.
Let $N = \{\beta, \beta^{2}, \beta^{2^2}, \ldots, \beta^{2^{m-1}}\}$ be a normal basis (NB) of $GF(2^m)$, where $\beta \in GF(2^m)$ is a normal element of $GF(2^m)$. Let m and T be positive integers such that $p = mT + 1$ is a prime and $\gcd(mT/k, m) = 1$, where k is the multiplicative order of 2 modulo p. Let $\alpha$ be a primitive $(mT+1)$-th root of unity in $GF(2^{mT})$. Then, for any primitive T-th root of unity $\tau$ in $\mathbb{Z}_p$, $\beta = \sum_{i=0}^{T-1} \alpha^{\tau^i}$ generates a normal basis of the binary field $GF(2^m)$ over $GF(2)$. This basis is called a type-T Gaussian normal basis (GNB). The complexity (in time and area) of a GNB multiplier depends on its type T > 1. NIST recommends five binary fields, with m = 163, 233, 283, 409 and 571. For these five fields T is even: 4, 2, 6, 4 and 10, respectively.
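The existence condition stated above (p = mT + 1 prime and gcd(mT/k, m) = 1, with k the multiplicative order of 2 modulo p) is easy to check numerically. The following sketch uses plain trial division and a naive order computation, which is adequate for the field sizes mentioned here; the function names are illustrative, and the variable `order` plays the role of k in the text.

```python
from math import gcd


def is_prime(n: int) -> bool:
    """Trial-division primality test, sufficient for small p = mT + 1."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def order_of_two(p: int) -> int:
    """Multiplicative order of 2 modulo an odd prime p."""
    order, value = 1, 2 % p
    while value != 1:
        value = (value * 2) % p
        order += 1
    return order


def has_gnb(m: int, T: int) -> bool:
    """Check the type-T GNB condition for GF(2^m) as stated in the text."""
    p = m * T + 1
    if p == 2 or not is_prime(p):
        return False
    order = order_of_two(p)          # divides p - 1 = mT
    return gcd(m * T // order, m) == 1
```

Calling `has_gnb` on the five (m, T) pairs listed above, or on (27, 6) from Example 1, checks that each satisfies the condition.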
GNB multiplication is based on the multiplication matrix $R_{(m-1) \times T}$ of document [18]. Let $A = (a_0, a_1, \ldots, a_{m-1})$ and $B = (b_0, b_1, \ldots, b_{m-1})$ be two elements of $GF(2^m)$ represented in a type-T GNB. Their product in $GF(2^m)$ can be expressed as: [equation (1) missing in source]
where [definition missing in source].
Here $(X \ll i)$ denotes i cyclic left shifts of $X \in GF(2^m)$, and $X \odot Y = (x_0 y_0, \ldots, x_{m-1} y_{m-1})$ and $X \oplus Y = (x_0 \oplus y_0, \ldots, x_{m-1} \oplus y_{m-1})$ denote the bitwise AND and the bitwise XOR of the coefficients of X and Y. Finite field multiplication can be designed as a bit-level architecture (area complexity O(m), time complexity O(m)), a digit-level architecture (area complexity O(md), time complexity O(m/d)) or a bit-parallel architecture (area complexity $O(m^2)$, time complexity O(1)).
Recently, low-complexity digit-level parallel-in parallel-out (DL-PIPO) GNB multipliers were proposed in documents [18] and [20], of which the one in [20] is optimal. The DL-PIPO architecture is shown in Fig. 1. In this multiplier, the two operands A and B (preloaded into registers <X> and <Y>) must be retained throughout the computation, and the result is obtained in parallel after $q = \lceil m/d \rceil$ clock cycles, $1 \le d \le m$. Note that for a given field size, the digit width d should be chosen carefully to reduce time and area complexity. The time complexity of the digit-level GNB multiplier is [expression missing in source]. Its area complexity is dm AND gates and [expression missing in source] logic gates. With the common-subexpression elimination algorithm proposed in document [20], the area complexity is reduced further to [expression missing in source] logic gates, where [definition missing in source].
A. 1-D digit-level systolic structure
From the symmetric structure of the matrix $R_{(m-1) \times T}$ in (1), S(i, B) can be written as follows: [equation missing in source]
Therefore, in place of the matrix $R_{(m-1) \times T}$, we can define the matrix u as: [equation missing in source]
where $u_k$ is the k-th row of the matrix u. Fig. 1 illustrates the DL-PIPO GNB multiplier architecture. Suppose the input element A (preloaded into register <X>) is re-expressed as X [equation missing in source]. Then, using the matrix u, the product of A and B can be obtained by: [equation missing in source]
where $J(X, Y) = X \odot P(Y)$ and $P(Y) = (y_1, s'(1, Y), s'(2, Y), \ldots, s'(2, Y), s'(1, Y))$. For each coordinate, the J(X, Y) function is computed from appropriately shifted input parameters. These functions are weighted sums of the bits of the input B (preloaded into register <Y>) and are determined by the matrix u and the bit positions of B. The matrix u is represented as a P block, which computes the linear combinations of B and is implemented with an XOR tree.
Let $q = \lceil m/d \rceil$, $1 \le d \le m$. Then the product in (5) can be written as: [equation (6) missing in source]
Suppose n and k are two integers satisfying q = kn. Note that if q is not divisible by k, X and B must be zero-padded in their least significant positions so that q = kn holds. The partial product $C_i$ is defined as: [equation (7) missing in source]
As mentioned above, for the integer k, the index i and the digit width d we have $X_i = X \gg kid$ and $B_i = B \gg kid$. The product C in (6) can then be compressed into n partial products: [equation missing in source]
where the product C is expressed most-significant-digit first (MSD-first). To compute the partial product $C_i$ in (8), assume that [expression missing in source] has been determined beforehand. Each partial product $C_i$ can then be re-expressed as: [equation missing in source]
Algorithm 1 describes the proposed 1-D systolic GNB multiplication using (9) and (10). Following Algorithm 1, Figs. 2 and 3 depict the proposed 1-D digit-level systolic GNB multiplier; Fig. 3 shows the multiplier as a whole. The proposed structure consists of k processing elements (PEs), $PE_0, PE_1, \ldots, PE_{k-1}$, and one accumulation circuit (AC). The core circuit of Fig. 1 computes the multiplication shown in (7), so the PE circuit of Fig. 2 can be built from the circuit of Fig. 1. Each PE implements steps 8 and 9 of Algorithm 1, and the AC circuit implements step 11.
The multiplication steps are explained with reference to Fig. 3. Given the operation of the PE in Fig. 2 and Algorithm 1, the output B of $PE_j$ used to compute the partial product $C_i$ is denoted $B_{i,j}$. In the initial step, register <C> is initialized to 0, and $X_i$ and $B_i$, $0 \le i \le n-1$, are obtained by cyclic shift operations. In the first clock cycle, the two elements $X_{n-1}$ and $B_{n-1}$ are fed into the proposed 1-D systolic multiplier to compute the partial product $C_{n-1}$. In the next clock cycle, $X_{n-2}$ and $B_{n-2}$ are fed in to compute the partial product $C_{n-2}$, and so on. Each partial result $C_i$ is computed by the k PEs and stored in register <C>, which takes k+1 clock cycles in total. The proposed 1-D systolic multiplier therefore completes the GNB multiplication C = AB after k+n clock cycles. The following proposition gives the number of clock cycles required by the proposed 1-D digit-level systolic multiplier.
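The cycle count above can be made concrete with a small scheduling sketch. It assumes, as one reading of the description, that one input pair enters $PE_0$ per clock cycle, that each PE and the AC take exactly one cycle, and that cycles are numbered from 1; the function name and the output layout are illustrative and do not reproduce Table 1.

```python
from typing import Dict, List, Tuple


def systolic_schedule(k: int, n: int) -> Dict[int, List[Tuple[str, int]]]:
    """Map each clock cycle to the (unit, partial-product index i) pairs active in it."""
    schedule: Dict[int, List[Tuple[str, int]]] = {}
    for t in range(n):                      # t-th input pair, fed MSD-first
        i = n - 1 - t                       # index of the partial product C_i
        for j in range(k):                  # PE_j works one cycle after PE_{j-1}
            schedule.setdefault(t + 1 + j, []).append((f"PE{j}", i))
        # The accumulation circuit handles C_i one cycle after the last PE.
        schedule.setdefault(t + 1 + k, []).append(("AC", i))
    return schedule
```

Under these assumptions a single partial product occupies k+1 consecutive cycles, and the last accumulation happens in cycle k+n, which agrees with the counts given in the text.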
Proposition 1. For a type-T GNB over any field $GF(2^m)$, the proposed 1-D digit-level parallel-output systolic multiplier requires at most $2\lceil\sqrt{\lceil m/d \rceil}\rceil$ clock cycles, where d is the chosen digit width. In other words, the latency of the proposed 1-D digit-level systolic multiplier is $2\lceil\sqrt{\lceil m/d \rceil}\rceil$.
Proof: The GNB multiplication is divided into q digit computations. The proposed digit-level systolic structure is given in Figs. 2 and 3. Suppose there are k PEs and one AC; the GNB multiplication can then also be divided into n partial results, where q = kn. The partial products $C_i$, with inputs $X_i$ and $B_i$, are fed to the PEs of the systolic array multiplier. The whole GNB multiplication takes k+n clock cycles. For a given q = kn, if k (the number of PEs) is small, n (the number of inputs fed to the PEs) becomes large, and so does the latency k+n. To obtain the minimum latency, so that finite field multiplication performs well for large m, we minimize $k + q/k$. Its first derivative with respect to k must be zero, $1 - q/k^2 = 0$, which requires $k = \sqrt{q}$. We therefore choose $k = \lceil\sqrt{q}\rceil$, and the latency of the proposed multiplier becomes $2\lceil\sqrt{q}\rceil$ clock cycles. If q is not a perfect square, the computation can take a few clock cycles fewer. This proves the proposition.
Conclusion 1. By Proposition 1, if d = 1 and the number of PEs is $\lceil\sqrt{m}\rceil$, the latency of the proposed 1-D systolic GNB multiplier is at most $2\lceil\sqrt{m}\rceil$ clock cycles.
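The choice of k in Proposition 1 can be checked numerically. The sketch below assumes $q = \lceil m/d \rceil$, $k = \lceil\sqrt{q}\rceil$ and $n = \lceil q/k \rceil$ (zero-padding so that kn covers q), and returns the latency k+n stated in the proof; the function names are illustrative.

```python
from math import isqrt
from typing import Tuple


def ceil_div(a: int, b: int) -> int:
    """Ceiling of a / b for positive integers."""
    return -(-a // b)


def one_d_latency(m: int, d: int) -> Tuple[int, int, int]:
    """Return (k, n, k + n) for the 1-D systolic multiplier over GF(2^m) with digit width d."""
    q = ceil_div(m, d)                 # number of digits
    k = isqrt(q - 1) + 1               # ceil(sqrt(q)) for q >= 1
    n = ceil_div(q, k)                 # partial results after zero-padding
    return k, n, k + n
```

Because $n \le k$ whenever $k = \lceil\sqrt{q}\rceil$, the returned latency never exceeds $2\lceil\sqrt{q}\rceil$, consistent with the bound in the proposition.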
To make the above discussion of the proposed 1-D systolic GNB multiplier concrete, the following example illustrates the operation of the PEs in different clock cycles.
Example 1. Let A and B be two type-6 GNB elements in $GF(2^{27})$, and assume the chosen digit width is d = 3. Then we have $q = \lceil 27/3 \rceil = 9$ and $k = n = 3$. According to (9), the product C can be expressed as a sum of partial products $C_i$ for i = 0, 1, 2 [equations missing in source]. Table 1 lists the operation of each PE in each clock cycle. Note that the proposed 1-D digit-level systolic GNB multiplier needs 6 clock cycles.
Figs. 2 and 3 are implemented using a 1-D systolic array; the presented GNB multiplier comprises k PEs and one AC circuit. Each PE circuit is built from the structure in Fig. 1 and contains dm AND gates, [expression missing in source], and three m-bit registers. The critical path delay of each PE is [expression missing in source]. The AC circuit contains one m-bit $GF(2^m)$ adder and one m-bit register. For the structures of Figs. 2 and 3, the latency of the proposed GNB multiplier is $2\lceil\sqrt{\lceil m/d \rceil}\rceil$ clock cycles.
B. 2-D digit-level systolic structure
In this section, to obtain a high-performance implementation, we present a 2-D systolic GNB multiplier. For small digit widths (or large field sizes) it achieves higher performance than the 1-D systolic structure above. Let k and n be two integers satisfying $q = k^2 n$. Note that if q is not divisible by $k^2$, X and B can be zero-padded so that $q = k^2 n$ holds. To derive the 2-D digit-level systolic multiplier, we compress (6) into a sum of n partial results: [equations (11) and (12) missing in source]
where
$X_{ij} = X \gg (k^2 i d + k j d)$, $B_{ij} = B \gg (k^2 i d + k j d)$.
Each partial result $C_{ij}$ is a sum of k partial products as in (7), and each partial result $C_i$ in (12) is the sum of k partial results $C_{ij}$. To compute the partial product $C_i$ in a fully pipelined manner, each partial result $C_{ij}$ is computed by a 1-D systolic array structure as presented in Fig. 3. The proposed 2-D systolic array multiplier for computing $C_i$ is shown in Fig. 4. It consists of k 1-D systolic arrays, (k-1) cyclic shift (CS) circuits, (k-1) AC1 structures and one AC2 structure. Each CS module provides a cyclic right shift by kd positions, which requires only rewiring in hardware. In the figure, 1-D systolic array [i] computes the sum of the k partial products of $C_{ij}$.
Proposition 2. Let the field be represented by a type-T GNB with T even. The proposed 2-D systolic GNB multiplier has a maximum latency of $3\lceil\sqrt[3]{\lceil m/d \rceil}\rceil$ clock cycles, and the number of PEs is $\lceil\sqrt[3]{\lceil m/d \rceil}\rceil^2$.
Proof: Let k and n be two positive integers satisfying $k^2 n = q = \lceil m/d \rceil$, where d is the chosen digit width. The GNB multiplication uses $k^2$ PEs to build the 2-D systolic array structure. If this circuit computes $C_i$ in (12), it needs 2k clock cycles. The GNB multiplication shown in (11) therefore needs 2k+n clock cycles. As in Proposition 1, we minimize $2k + q/k^2$: the first derivative must be zero, i.e. $2 - 2q/k^3 = 0$, which requires $k = \sqrt[3]{q}$. So we choose $k = \lceil\sqrt[3]{q}\rceil$, and the latency of the 2-D systolic multiplier is $3\lceil\sqrt[3]{q}\rceil$ clock cycles. When q is not a perfect cube, the computation needs fewer clock cycles, which proves the proposition.
Conclusion 2. By Proposition 2, the latency of the proposed 2-D systolic GNB multiplier is at most $3\lceil\sqrt[3]{\lceil m/d \rceil}\rceil$ clock cycles.
Fig. 4 is implemented using a 2-D systolic array: the proposed 2-D systolic multiplier consists of k 1-D systolic arrays, the CS modules, (k-1) AC1 circuits and one AC2 circuit. With this structure, the minimum latency of the GNB multiplier reaches $3\lceil\sqrt[3]{\lceil m/d \rceil}\rceil$ clock cycles. Compared with the 1-D systolic multiplier, the latency is lower when a small digit width (or a large field size) is chosen. For example, with digit width 1, the latency of the 2-D systolic multiplier over $GF(2^{409})$ is 24 clock cycles, the same as that of the 1-D systolic multiplier with digit width 3. This shows that the 2-D systolic multiplier is more effective under these conditions.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
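The comparison above can be reproduced from the two latency bounds, $2\lceil\sqrt{q}\rceil$ for the 1-D structure and $3\lceil\sqrt[3]{q}\rceil$ for the 2-D structure, with $q = \lceil m/d \rceil$. The sketch below evaluates only these bounds, not a cycle-accurate model; the function names and the `dims` parameter are illustrative.

```python
def ceil_div(a: int, b: int) -> int:
    """Ceiling of a / b for positive integers."""
    return -(-a // b)


def ceil_root(q: int, r: int) -> int:
    """Smallest integer k with k**r >= q (exact integer arithmetic)."""
    k = 1
    while k ** r < q:
        k += 1
    return k


def latency_bound(m: int, d: int, dims: int) -> int:
    """Latency bound in clock cycles: dims=1 gives 2*ceil(sqrt(q)), dims=2 gives 3*ceil(cbrt(q))."""
    q = ceil_div(m, d)
    return (dims + 1) * ceil_root(q, dims + 1)
```

For m = 409, `latency_bound(409, 1, 2)` and `latency_bound(409, 3, 1)` both evaluate to 24, while the 1-D bound at digit width 1 is 42, which illustrates why the 2-D structure pays off at small digit widths.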
Claims (6)
1. A multiplier processing unit PE for an elliptic curve cryptographic device, characterized in that the multiplier processing unit PE comprises a computing unit, input terminals $B_{in}$, $C_{in}$ and $X_{in}$, and output terminals $B_{out}$, $C_{out}$ and $X_{out}$; the input terminals $B_{in}$, $C_{in}$ and $X_{in}$ are each fed to the computing unit, and after processing the results are output at the output terminals $B_{out}$, $C_{out}$ and $X_{out}$ of the computing unit; in the computing unit, $B_{in}$ and $X_{in}$ are cyclically shifted left by d positions, namely $B_{out} = B_{in} \ll d$ and $X_{out} = X_{in} \ll d$; the value computed from $B_{in}$ and $X_{in}$ is added to $C_{in}$ cyclically shifted right by d positions, according to $C_{out} = (C_{in} \gg d) + L(B_{in}, X_{in})$, where $C_{in}$ is the result of the preceding processing unit PE and is initially zero for the first processing unit PE, $C_{out}$ is the product result computed by this processing unit PE and serves as the input of the next processing unit PE, d is the digit width, and L denotes the operation [equation missing in source], where $J(X, Y) = X \odot P(Y)$, the input B is loaded into register Y, and P(Y) computes the linear combinations of B.
2. A one-dimensional multiplier, characterized in that the one-dimensional multiplier comprises k multiplier processing units PE according to claim 1 and one accumulation circuit AC; the k processing units PE are connected in series and followed by the accumulation circuit AC; the input of each PE is the output computed by the preceding PE, and the three inputs of the first PE are $(B_0, B_1, \ldots, B_{n-1})$, $(0, 0, \ldots, 0)$ and $(X_0, X_1, \ldots, X_{n-1})$, where X is obtained by reversing A and then cyclically shifting it right by one position; the output calculation formula is: [equation missing in source]
where A is the multiplicand.
3. The one-dimensional multiplier according to claim 2, characterized in that the accumulation circuit AC comprises an addition unit, a register and a shift unit; the output of the shift unit is connected to the input of the addition unit, the output of the addition unit is connected to the input of the register, and the output of the register is connected to the input of the shift unit; the accumulation circuit shifts the result of the previous computation of the k PEs and adds it to the next output of the k PEs.
4. A two-dimensional multiplier, characterized in that the two-dimensional multiplier comprises k one-dimensional multipliers according to claim 2 or 3, 2k-2 CS modules, k-1 summation circuits AC1 and one summation circuit AC2; the k one-dimensional multipliers are connected in parallel; the output end of the first one-dimensional multiplier is connected to the shift unit of the first summation circuit AC1; the k-1 summation circuits AC1 are connected in series; the (k-1)-th summation circuit AC1 is connected to the summation circuit AC2; the output ends of the second to (k-1)-th one-dimensional multipliers are each connected to one summation circuit AC1; the B input end and the X input end of each of the second to (k-1)-th one-dimensional multipliers are each connected to one CS module; the inputs of the first one-dimensional multiplier are fed in directly; the operational formula is:
The CS modules cyclically shift the input value right by kd positions.
5. The two-dimensional multiplier according to claim 4, characterized in that the summation circuit AC1 comprises a shift unit and an addition unit; the output end of the shift unit is connected to the input end of the addition unit; the summation circuit AC1 shifts its input, adds it to the output result of the connected one-dimensional multiplier, and outputs the sum; the shift unit cyclically shifts right by kd positions.
6. The two-dimensional multiplier according to claim 5, characterized in that the summation circuit AC2 comprises a shift unit, an addition unit and a temporary storage unit; the output end of the shift unit is connected to the input end of the addition unit, the output end of the addition unit is connected to the temporary storage unit, and the output end of the temporary storage unit is connected to the input end of the addition unit; the shift unit cyclically shifts the input value right by $k^2 d$ positions; the summation circuit AC1 comprises a shift unit and an addition unit, and the output end of the shift unit is connected to the input end of the addition unit.
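The data flow recited in claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the patented circuit: operands are modeled as `width`-bit integers, the "+" in the claim is assumed to be addition over GF(2) (bitwise XOR), the direction of the shift in the summation circuit AC is assumed to be a right rotation as in claims 5 and 6, and because the defining formula for L is not reproduced in the text, `op` is a hypothetical caller-supplied stand-in for L. All function names are illustrative.

```python
from typing import Callable, Tuple


def rotl(value: int, shift: int, width: int) -> int:
    """Cyclically shift a width-bit word left by `shift` positions."""
    shift %= width
    mask = (1 << width) - 1
    return ((value << shift) | (value >> (width - shift))) & mask


def rotr(value: int, shift: int, width: int) -> int:
    """Cyclically shift a width-bit word right by `shift` positions."""
    return rotl(value, width - (shift % width), width)


def pe_step(b_in: int, c_in: int, x_in: int, d: int, width: int,
            op: Callable[[int, int], int]) -> Tuple[int, int, int]:
    """One processing unit PE as described in claim 1.

    B and X are rotated left by d; C is rotated right by d and combined
    with op(B_in, X_in). XOR as the addition is an assumption (GF(2)).
    """
    b_out = rotl(b_in, d, width)
    x_out = rotl(x_in, d, width)
    c_out = rotr(c_in, d, width) ^ op(b_in, x_in)
    return b_out, c_out, x_out


def pe_chain(b: int, x: int, k: int, d: int, width: int,
             op: Callable[[int, int], int]) -> Tuple[int, int, int]:
    """k PEs in series (claim 2); the first PE receives C_in = 0."""
    c = 0
    for _ in range(k):
        b, c, x = pe_step(b, c, x, d, width, op)
    return b, c, x


def accumulate(acc: int, chain_output: int, shift: int, width: int) -> int:
    """Summation circuit (claim 3): shift the stored value, then add
    the next chain output. Rotation direction is an assumption."""
    return rotr(acc, shift, width) ^ chain_output


def placeholder_op(b: int, x: int) -> int:
    """Hypothetical stand-in for L; the real L is defined via J(X, Y)."""
    return b & x
```

With `width = k * d`, the B and X operands return to their starting values after passing through all k units, which is why the chain can be reused for the next pass feeding the summation circuit.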
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410414896.8A CN104252332B (en) | 2014-08-20 | 2014-08-20 | A kind of multiplier processing unit and multiplier for elliptic curves cryptosystem device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410414896.8A CN104252332B (en) | 2014-08-20 | 2014-08-20 | A kind of multiplier processing unit and multiplier for elliptic curves cryptosystem device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104252332A (en) | 2014-12-31 |
CN104252332B (en) | 2018-09-18 |
Family
ID=52187288
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410414896.8A Expired - Fee Related CN104252332B (en) | 2014-08-20 | 2014-08-20 | A kind of multiplier processing unit and multiplier for elliptic curves cryptosystem device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104252332B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101968732A (en) * | 2010-10-09 | 2011-02-09 | 中国人民解放军信息工程大学 | Bit parallel systolic array shifted polynomial basis multiplier with function of error detection |
CN102929574A (en) * | 2012-10-18 | 2013-02-13 | 复旦大学 | Pulse multiplying unit design method on GF(2^163) domain |
CN103186360A (en) * | 2013-04-03 | 2013-07-03 | 哈尔滨工业大学深圳研究生院 | Fast arithmetic multi-bit serial pulse dual-base binary finite field multiplier |
TW201404108A (en) * | 2012-07-09 | 2014-01-16 | Univ Ching Yun | Semi-systolic Gaussian normal basis multiplier |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7519644B2 (en) * | 2004-05-27 | 2009-04-14 | King Fahd University Of Petroleum And Minerals | Finite field serial-serial multiplication/reduction structure and method |
Non-Patent Citations (4)
Title |
---|
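Low-Latency Digit-Serial Systolic Double Basis Multiplier over GF(2^m) Using Subquadratic Toeplitz Matrix-Vector Product Approach; Jeng-Shyang Pan et al.; IEEE Transactions on Computers; 2014-05-31; Vol. 63, No. 5; pp. 1169-1181 * |
ECC multiplier structure and implementation based on a divide-and-conquer algorithm; Luo Peng et al.; Computer Engineering (《计算机工程》); 2009-07-31; Vol. 35, No. 13; pp. 153-155 * |
Design of the core operation module of the ECC algorithm based on an array structure; Yang Ling et al.; Microelectronics (《微电子学》); 2010-06-30; Vol. 40, No. 3; pp. 387-391 * |
Improved prime-field elliptic curve cryptographic processor; Chen Chuanpeng et al.; Engineering Journal of Wuhan University (《武汉大学学报(工学版)》); 2011-02-28; Vol. 44, No. 1; pp. 124-127, 132 * |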
Also Published As
Publication number | Publication date |
---|---|
CN104252332A (en) | 2014-12-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hossain et al. | High‐performance elliptic curve cryptography processor over NIST prime fields | |
CN103793199B (en) | A kind of fast rsa password coprocessor supporting dual domain | |
Fan et al. | Efficient hardware implementation of Fp-arithmetic for pairing-friendly curves | |
Lee et al. | Efficient design of low-complexity bit-parallel systolic Hankel multipliers to implement multiplication in normal and dual bases of GF(2^m) | |
Jafri et al. | Towards an optimized architecture for unified binary huff curves | |
Liu et al. | High performance modular multiplication for SIDH | |
Rashidi et al. | Efficient and low‐complexity hardware architecture of Gaussian normal basis multiplication over GF(2^m) for elliptic curve cryptosystems | |
Niasar et al. | Optimized architectures for elliptic curve cryptography over Curve448 | |
Hu et al. | The analysis and investigation of multiplicative inverse searching methods in the ring of integers modulo m | |
Damrudi et al. | Parallel RSA encryption based on tree architecture | |
Mishra | Pipelined computation of scalar multiplication in elliptic curve cryptosystems (extended version) | |
McIvor et al. | High-radix systolic modular multiplication on reconfigurable hardware | |
Li et al. | Scalable and parallel optimization of the number theoretic transform based on FPGA | |
CN113467754A (en) | Lattice encryption modular multiplication operation method and framework based on decomposition reduction | |
CN205721742U (en) | Novel architecture and non-interleaved one-dimensional systolic architecture suitable for a modular division algorithm | |
CN104252332B (en) | A kind of multiplier processing unit and multiplier for elliptic curves cryptosystem device | |
Timarchi et al. | A novel high-speed low-power binary signed-digit adder | |
Liu et al. | A high speed VLSI implementation of 256-bit scalar point multiplier for ECC over GF(p) | |
Feng et al. | A high-speed and SPA-resistant implementation of ECC point multiplication over GF(p) | |
Rashidi et al. | Full‐custom hardware implementation of point multiplication on binary edwards curves for application‐specific integrated circuit elliptic curve cryptosystem applications | |
Kadu et al. | Hardware implementation of efficient elliptic curve scalar multiplication using vedic multiplier | |
Lee et al. | Low complexity digit-serial multiplier over GF(2^m) using Karatsuba technology | |
Masoumi et al. | Efficient Hardware Implementation of an Elliptic Curve Cryptographic Processor over GF(2^163) | |
Jeon et al. | Low-power exponent architecture in finite fields | |
CN104267926A (en) | Method and device for acquiring elliptic curve cryptography data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2018-09-18; termination date: 2019-08-20 |