CN103186360A - Fast arithmetic multi-bit serial pulse dual-base binary finite field multiplier - Google Patents


Info

Publication number
CN103186360A
Authority
CN
China
Prior art keywords
module
input
result
frrp
xor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013101154017A
Other languages
Chinese (zh)
Other versions
CN103186360B (en
Inventor
潘正祥
杨春生
白忠海
李秋莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology filed Critical Shenzhen Graduate School Harbin Institute of Technology
Priority to CN201310115401.7A priority Critical patent/CN103186360B/en
Publication of CN103186360A publication Critical patent/CN103186360A/en
Application granted granted Critical
Publication of CN103186360B publication Critical patent/CN103186360B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention relates to a fast multi-bit serial systolic dual-basis binary finite field multiplier comprising an input B, k PE modules, an FRRP module and an R3 module. The k PE modules are connected in series and operate over k cycles. In the first cycle, the first group of segments of A is input, B is input directly, and the calculation result is restored by the FRRP module and stored in a temporary register C. In the second cycle, the next group of segments of A is input, B is input through the R3 module, and the calculation result is likewise restored by the FRRP module, added to the result of the first cycle and stored in register C. Continuing in this way, in the k-th cycle the last group of segments of A is input, B is input after passing through the R3 module (k-1) times, and the calculation result is restored by the FRRP module, added to the accumulated result of the previous (k-1) cycles and stored in register C, which then outputs the result.

Description

Fast operation multi-bit serial pulse double-base binary finite field multiplier
Technical Field
The present invention relates to a binary finite field multiplier, and more particularly, to a fast operation multi-bit serial pulse binary finite field multiplier with dual bases.
Background
In recent years, elliptic curve cryptography (ECC) [1], [2] has attracted wide attention in cryptographic research. With the adoption of elliptic curve cryptography in public-key cryptosystems, a number of hardware implementation issues have been raised for ECC applications. NIST recommends five binary fields: $GF(2^{163})$, $GF(2^{233})$, $GF(2^{283})$, $GF(2^{409})$ and $GF(2^{571})$. In ECC-based cryptographic protocols, field multiplication is an indispensable operation and a major part of the computation cost. The practicality of cryptographic hardware is generally judged by area, power consumption and performance.
For very-large-scale integration (VLSI) implementation, a systolic array architecture is the better choice. Over binary extension fields, a variety of efficient systolic array multipliers have been designed; they can be categorized as bit-parallel and bit-serial structures. Efficient bit-parallel systolic multipliers typically employ either the LSB-first or the MSB-first algorithm. The main advantage of the bit-parallel systolic multiplier is its high computational throughput. However, for polynomial-basis binary fields these structures require $O(m^2)$ XOR gates, $O(m^2)$ AND gates, $O(m^2)$ 1-bit latches and $O(m)$ latency. To reduce time and space complexity, Lee [8], [9], [13] showed that, for special polynomials such as all-one polynomials, pentanomials and trinomials, field multiplication can be carried out as a Toeplitz matrix-vector product (TMVP) to build a fully parallel systolic multiplier. Bit-serial systolic array multipliers require only $O(m)$ space complexity, but they incur longer computation latency.
To trade off time against space complexity, digit-serial systolic multipliers, which lie between bit-parallel and bit-serial multipliers, have been proposed. A digit-serial polynomial-basis multiplier built on a digit-internal, parallel-external architecture is described in [20]. In such a multiplier, the m bits of a field element are subdivided into [formula image] sub-segments of length d. In each clock cycle a string of d bits is processed, until the m-bit multiplication is complete. A scalable systolic multiplier has been developed [15], [16] using an inherently parallel d x d Hankel matrix-vector product; its latency is [formula image] clock cycles. Multi-bit serial systolic multipliers that use different structures internally and externally have also been presented in the literature; the latency of these multipliers is [formula image] clock cycles. As mentioned above, the design of low-complexity systolic finite field multipliers depends on the choice of irreducible polynomial and of representation basis, and these designs need high latency to complete a multiplication.
Disclosure of Invention
The technical problem solved by the invention is to construct a fast multi-bit serial systolic dual-basis binary finite field multiplier that overcomes the high latency conventional multipliers need to complete a multiplication.
The technical scheme of the invention is as follows. A fast multi-bit serial systolic dual-basis binary finite field multiplier is constructed, comprising an input B, k PE modules, an FRRP module and an R3 module. The k PE modules are connected in series and operate over k cycles. In the 1st cycle, the input of A is $A_0, A_1, \ldots, A_{k-1}$, B is input directly, and the calculation result is restored by the FRRP module and stored in a temporary register C. In the 2nd cycle, the input of A is $A_k, A_{k+1}, \ldots, A_{2k-1}$, B is input through the R3 module, and the calculation result is likewise restored by the FRRP module, added to the result of the 1st cycle and stored in register C. Continuing in this way, in the k-th cycle the input of A is $A_{k(k-1)}, A_{k(k-1)+1}, \ldots, A_{k^2-1}$, B is input after passing through the R3 module (k-1) times, and the calculation result is restored by the FRRP module, added to the accumulated result of the previous (k-1) cycles and stored in register C, which then outputs the result.
The R3 module implements $Bx^{kd} \bmod F(x)$. Each PE module comprises an R1 module, a CMP module, a CVP module, a PWM module, [formula image] XOR gates and [formula image] latches. The output of the R3 module is passed to the R1 module and then undergoes coefficient conversion in the CMP module; the A segment is input to the CVP module, which performs the coefficient conversion of the A segment. The results of the CMP and CVP modules are input to the PWM module, which computes the product of $B_{in}$ and the A segment. The product is accumulated by the [formula image] XOR gates, stored in the [formula image] latches, and output from the latches as [formula image].
Here A is represented in the polynomial basis defined by the trinomial $F(x) = 1 + x^n + x^m$ as $A = a_0 + a_1x + \cdots + a_{m-1}x^{m-1}$, with m coefficients $(a_0, a_1, \ldots, a_{m-1})$. Using segmentation, the m-bit A is cut into segments of d bits each, $k^2$ segments in total, giving $A_0, A_1, \ldots, A_{k^2-1}$. B is represented in the dual basis as $B = b_0\beta_0 + b_1\beta_1 + \cdots + b_{m-1}\beta_{m-1}$ and is the other input of the multiplier; C is the output result.
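The segmentation of A described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a polynomial is stored as an integer whose bit i is the coefficient of $x^i$, the helper names `split_segments` and `join_segments` are hypothetical, and when m is smaller than $k^2 d$ the top segment is assumed to be zero-padded (the source does not say how that case is handled).

```python
def split_segments(a, d, k):
    """Cut polynomial a into k*k segments of d bits; segment i holds a_{id}..a_{id+d-1}."""
    mask = (1 << d) - 1
    return [(a >> (i * d)) & mask for i in range(k * k)]


def join_segments(segs, d):
    """Rebuild A = A_0 + A_1 x^d + ... + A_{k^2-1} x^{(k^2-1)d} from its segments."""
    a = 0
    for i, seg in enumerate(segs):
        a ^= seg << (i * d)
    return a


# Example with d = 2 and k = 2 (four 2-bit segments).
segments = split_segments(0b10110110, 2, 2)
print(segments)  # [2, 1, 3, 2]
# Row i of the array (cycle i of the folded design) uses segments[i*k : (i+1)*k].
rows = [segments[i * 2:(i + 1) * 2] for i in range(2)]
```

Grouping the list into k rows of k segments mirrors how the k PE modules receive $A_{ik}, \ldots, A_{ik+k-1}$ in cycle i.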
The further technical scheme of the invention is as follows: the FRRP module comprises an FR module, an R2 module, the R2 module realizes Cmod (x)m+1), the input of the FR module is the calculation result of k series PE modules, and the result is restored and output to the R2 module.
The further technical scheme of the invention is as follows: the CMP module comprises exclusive-OR gates XOR _1 and XOR _2, and the exclusive-OR gates XOR _1 and XOR _2 are connected in parallel.
The further technical scheme of the invention is as follows: the CVP module is an exclusive OR gate XOR _ 3.
The further technical scheme of the invention is as follows: the PWM module comprises three AND gates AND _1, AND _2 AND AND _3 which are connected in parallel. And point-to-point multiplying the results output by the CMP module and the CVP module.
The further technical scheme of the invention is as follows: the FR-module comprises two parallel exclusive or gates XOR _4 and XOR _ 5.
The technical effect of the invention is as follows. A fast multi-bit serial systolic dual-basis binary finite field multiplier is constructed, comprising an input B, k PE modules, an FRRP module and an R3 module. The k PE modules are connected in series and operate over k cycles. In the 1st cycle, the input of A is $(A_0, A_1, \ldots, A_{k-1})$, B is input directly, and the calculation result is restored by the FRRP module and stored in a temporary register C. In the 2nd cycle, the input of A is $(A_k, A_{k+1}, \ldots, A_{2k-1})$, B is input through the R3 module, and the calculation result is likewise restored by the FRRP module, added to the result of the 1st cycle and stored in register C. Continuing in this way, in the k-th cycle the input of A is $(A_{k(k-1)}, A_{k(k-1)+1}, \ldots, A_{k^2-1})$, B is input after passing through the R3 module (k-1) times, and the calculation result is restored by the FRRP module, added to the accumulated result of the previous (k-1) cycles and stored in register C, which then outputs the result. With subquadratic TMVP, some field multiplications can be realized in a bit-parallel structure. Over the binary field $GF(2^m)$, irreducible trinomials and pentanomials are widely used in cryptography, where the bit length is usually large. By applying the subquadratic TMVP formula in a new digit-serial systolic dual-basis multiplier, the proposed architecture reaches a low latency of [formula image] clock cycles once a d x d Toeplitz multiplication is selected.
Drawings
FIG. 1 is a schematic structural diagram of the present invention.
FIG. 2 is a diagram of the multi-bit serial systolic multiplier according to the present invention.
FIG. 3 is a block diagram of the processing element PE according to the present invention.
FIG. 4 is a specific circuit diagram of the PE module according to the present invention.
Detailed Description
The technical solution of the present invention is further illustrated below with reference to specific examples.
As shown in FIG. 2, the embodiment of the present invention is as follows. A fast multi-bit serial systolic dual-basis binary finite field multiplier is constructed, comprising an input B, k PE modules, an FRRP module and an R3 module. The k PE modules are connected in series and operate over k cycles. In the 1st cycle, the input of A is $A_0, A_1, \ldots, A_{k-1}$, B is input directly, and the calculation result is restored by the FRRP module and stored in a temporary register C. In the 2nd cycle, the input of A is $A_k, A_{k+1}, \ldots, A_{2k-1}$, B is input through the R3 module, and the calculation result is likewise restored by the FRRP module, added to the result of the 1st cycle and stored in register C. Continuing in this way, in the k-th cycle the input of A is $A_{k(k-1)}, A_{k(k-1)+1}, \ldots, A_{k^2-1}$, B is input after passing through the R3 module (k-1) times, and the calculation result is restored by the FRRP module, added to the accumulated result of the previous (k-1) cycles and stored in register C, which then outputs the result.
The R3 module implements $Bx^{kd} \bmod F(x)$. Each PE module comprises an R1 module, a CMP module, a CVP module, a PWM module, [formula image] XOR gates and [formula image] latches. The output of the R3 module is passed to the R1 module and then undergoes coefficient conversion in the CMP module; the A segment is input to the CVP module, which performs the coefficient conversion of the A segment. The results of the CMP and CVP modules are input to the PWM module, which computes the product of $B_{in}$ and the A segment. The product is accumulated by the [formula image] XOR gates, stored in the [formula image] latches, and output from the latches as [formula image].
Here A is represented in the polynomial basis defined by the trinomial $F(x) = 1 + x^n + x^m$ as $A = a_0 + a_1x + \cdots + a_{m-1}x^{m-1}$, with m coefficients $(a_0, a_1, \ldots, a_{m-1})$. Using segmentation, the m-bit A is cut into segments of d bits each, $k^2$ segments in total, giving $A_0, A_1, \ldots, A_{k^2-1}$. B is represented in the dual basis as $B = b_0\beta_0 + b_1\beta_1 + \cdots + b_{m-1}\beta_{m-1}$ and is the other input of the multiplier; C is the output result.
A preferred embodiment of the present invention is: the FRRP module comprises an FR module and an R2 module. The R2 module implements $C \bmod (x^m + 1)$; the input of the FR module is the calculation result of the k series-connected PE modules, which is restored and output to the R2 module.
The inputs of the CMP module and the CVP module are $B_{in}$ and [formula image]; their outputs serve as the inputs of the PWM module, and the output of the PWM module passes through [formula image] XOR gates and [formula image] latches to output the result [formula image].
The input of the R1 module is $B_{in}$; its output passes through m latches and is output as $B_{out}$. The input of the CMP module is $Bx^{dk(i+1)+jd}$ and its output is $[B^{(p+q)}, B^{(p+q+1)}, \ldots, B^{(p+q+d-1)}]$. The input of the CVP module is $A_{ik+j}$ and its output is $[a_q, a_{q+1}, \ldots, a_{q+d-1}]^T$. Here [formula image] denotes [formula image] arranged as a matrix with rows and columns indexed by $i, j = 0, 1, \ldots, k-1$, where i denotes the i-th row and j the j-th column of the matrix; $p = dk(i+1) + jd$, $q = (ik+j)d$, and T denotes the transpose of $[a_q, a_{q+1}, \ldots, a_{q+d-1}]$. The output result is accumulated with the result of the previous FRRP module and output to the next FRRP module.
The overall structure of the dual-basis multiplication is shown in FIG. 1, the systolic-array dual-basis multiplier. A, B and C are three elements of $GF(2^m)$ defined by the irreducible trinomial $F(x) = 1 + x^n + x^m$, where $n \le m/2$. Element A is represented in the polynomial basis, while B and C are represented in the dual basis. The whole multiplier implements $C = AB \bmod F(x)$, with A and B as inputs and C as the output result. A is represented in the polynomial basis defined by $F(x) = 1 + x^n + x^m$ as $A = a_0 + a_1x + \cdots + a_{m-1}x^{m-1}$, with m coefficients $(a_0, a_1, \ldots, a_{m-1})$. Using segmentation, the m-bit A is cut into segments of d bits each, $k^2$ segments in total, giving $A_0, A_1, \ldots, A_{k^2-1}$. Each segment $A_i$ can be written as $A_i = a_{id} + a_{id+1}x + \cdots + a_{id+d-1}x^{d-1}$, and the full set of segments $A_0, A_1, \ldots, A_{k^2-1}$ replaces A as the input of the whole multiplier. B is represented in the dual basis as $B = b_0\beta_0 + b_1\beta_1 + \cdots + b_{m-1}\beta_{m-1}$ and is the other input of the multiplier. C is the output result, computed as $C = AB \bmod F(x)$, which is the function implemented by the whole multiplier.
Since A is divided into the segments $A_0, A_1, \ldots, A_{k^2-1}$, A can be written as $A = A_0 + A_1x^d + \cdots + A_{k^2-1}x^{(k^2-1)d}$. Expanding A in $C = AB \bmod F(x)$ then gives:
$$
\begin{aligned}
C &= AB \bmod F(x) \\
  &= B\left(A_0 + A_1x^d + \cdots + A_{k^2-1}x^{(k^2-1)d}\right) \bmod F(x) \\
  &= \Big( B\left(A_0 + A_1x^d + \cdots + A_{k-1}x^{(k-1)d}\right) \\
  &\qquad + Bx^{dk}\left(A_k + A_{k+1}x^d + \cdots + A_{2k-1}x^{(k-1)d}\right) + \cdots \\
  &\qquad + Bx^{dk(k-1)}\left(A_{k(k-1)} + A_{k(k-1)+1}x^d + \cdots + A_{k^2-1}x^{(k-1)d}\right) \Big) \bmod F(x) \\
  &= (C_0 + C_1 + \cdots + C_{k-1}) \bmod F(x)
\end{aligned}
$$
where
$$
\begin{aligned}
C_0 &= B\left(A_0 + A_1x^d + \cdots + A_{k-1}x^{(k-1)d}\right) \\
C_1 &= Bx^{dk}\left(A_k + A_{k+1}x^d + \cdots + A_{2k-1}x^{(k-1)d}\right) \\
&\;\;\vdots \\
C_{k-1} &= Bx^{dk(k-1)}\left(A_{k(k-1)} + A_{k(k-1)+1}x^d + \cdots + A_{k^2-1}x^{(k-1)d}\right)
\end{aligned}
$$
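The row-wise decomposition $C = (C_0 + C_1 + \cdots + C_{k-1}) \bmod F(x)$ can be checked numerically with a small sketch. It rests on a simplifying assumption that is not in the source: B and C are kept in the polynomial basis rather than the dual basis, which preserves the algebraic identity but not the coordinate representation used by the hardware. The helper names (`clmul`, `reduce_mod`, `product_by_rows`) and the toy field $GF(2^7)$ with $F(x) = 1 + x + x^7$ are illustrative choices.

```python
def clmul(a, b):
    """Carry-less product of two GF(2)[x] polynomials stored as ints (bit i = coeff of x^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def reduce_mod(p, f):
    """Reduce polynomial p modulo f over GF(2)."""
    deg_f = f.bit_length() - 1
    while p.bit_length() - 1 >= deg_f:
        p ^= f << (p.bit_length() - 1 - deg_f)
    return p


def product_by_rows(a, b, f, d, k):
    """Compute AB mod F as (C_0 + ... + C_{k-1}) mod F with C_i = B x^{dki} * (row i of A)."""
    mask = (1 << d) - 1
    c = 0
    for i in range(k):
        b_i = reduce_mod(b << (d * k * i), f)          # B x^{dki} mod F(x)
        for j in range(k):
            seg = (a >> ((i * k + j) * d)) & mask      # segment A_{ik+j}
            c ^= clmul(b_i, seg << (j * d))            # B x^{dki} * A_{ik+j} x^{jd}
    return reduce_mod(c, f)


F = 0b10000011  # F(x) = 1 + x + x^7, so m = 7 and n = 1
print(product_by_rows(0b1011011, 0b1100101, F, 2, 2)
      == reduce_mod(clmul(0b1011011, 0b1100101), F))
```

With d = 2 and k = 2 the four segments cover 8 bits, so the 7-bit operand A is implicitly zero-padded in its top segment.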
In the overall multiplier structure of FIG. 1, row 1 computes $C_0 = B(A_0 + A_1x^d + \cdots + A_{k-1}x^{(k-1)d})$. Its 1st processing element $PE_{0,0}$ computes the product $BA_0$, its 2nd processing element $PE_{0,1}$ computes the product $BA_1x^d$, and so on, up to the k-th processing element $PE_{0,k-1}$, which computes the product $BA_{k-1}x^{(k-1)d}$. The results of all k processing elements are accumulated to obtain $C_0$, which is input to the 1st FRRP (Final Reconstruction-Reduction-Polynomial) module. Likewise, row 2 of the overall multiplier structure computes $C_1 = Bx^{dk}(A_k + A_{k+1}x^d + \cdots + A_{2k-1}x^{(k-1)d})$. The added R3 module computes $Bx^{dk} \bmod F(x)$ from its input B. The 1st processing element $PE_{1,0}$ computes the product $Bx^{dk}A_k$, and the remaining steps are similar to row 1; the result $C_1$ is input to the 2nd FRRP module and added to the output of the 1st FRRP module to obtain $(C_0 + C_1) \bmod F(x)$. Every row of the multiplier performs a similar computation. In the k-th row, the output of the R3 module is $Bx^{dk(k-1)} \bmod F(x)$, the input of the k-th FRRP module is $C_{k-1}$, and its output is $(C_0 + C_1 + \cdots + C_{k-1}) \bmod F(x)$, which is the result of the whole multiplier: $C = (C_0 + C_1 + \cdots + C_{k-1}) \bmod F(x)$.
The detailed circuit of each processing element $PE_{i,j}$, shown in FIG. 3, computes the product $Bx^{dk(i+1)+jd}A_{ik+j}$. Its inputs are $A_{in}$, $B_{in}$ and the accumulation input [formula image]; its outputs are $B_{out}$ and the accumulation output [formula image].
For the 1st processing element $PE_{i,0}$ of each row, the $A_{in}$ input is $A_{ik}$, $B_{in}$ is the output of the (i+1)-th R3 module, namely $Bx^{dk(i+1)} \bmod F(x)$, and the accumulation input [formula image] is initialized to 0. $B_{out}$, the output of R1, is also the $B_{in}$ input of the 2nd processing element $PE_{i,1}$; its value is $Bx^{dk(i+1)+d} \bmod F(x)$. The accumulation output [formula image] is the product $Bx^{dk(i+1)}A_{ik}$.
For the 2nd processing element $PE_{i,1}$ of each row, the $A_{in}$ input is $A_{ik+1}$, the $B_{in}$ input is $Bx^{dk(i+1)+d} \bmod F(x)$, and the accumulation input [formula image] is the result of the 1st processing element $PE_{i,0}$, namely $Bx^{dk(i+1)}A_{ik}$. Its $B_{out}$ output is $Bx^{dk(i+1)+2d} \bmod F(x)$, which serves as the $B_{in}$ input of the 3rd processing element $PE_{i,2}$, and its accumulation output [formula image], the product $Bx^{dk(i+1)+d}A_{ik+1}$, serves as the accumulation input of $PE_{i,2}$.
By analogy, the (j+1)-th processing element $PE_{i,j}$ of each row computes the product $Bx^{dk(i+1)+jd}A_{ik+j}$. Its $A_{in}$ input is $A_{ik+j}$, its $B_{in}$ input is $Bx^{dk(i+1)+jd} \bmod F(x)$, its accumulation input [formula image] is the output of the j-th module, namely $Bx^{dk(i+1)+(j-1)d}A_{ik+(j-1)}$, its $B_{out}$ output is $Bx^{dk(i+1)+(j+1)d} \bmod F(x)$, and its accumulation output [formula image] is the product $Bx^{dk(i+1)+jd}A_{ik+j}$.
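Expanding $Bx^{dk(i+1)+jd}$ and $A_{ik+j}$ separately, i.e. $Bx^{dk(i+1)+jd} = (b_0\beta_0 + b_1\beta_1 + \cdots + b_{m-1}\beta_{m-1})x^{dk(i+1)+jd}$ and $A_{ik+j} = a_{(ik+j)d} + a_{(ik+j)d+1}x + \cdots + a_{(ik+j)d+d-1}x^{d-1}$, the dual-basis multiplication rule gives:
$$
\begin{aligned}
Bx^{dk(i+1)+jd}A_{ik+j}
&= (b_0\beta_0 + b_1\beta_1 + \cdots + b_{m-1}\beta_{m-1})\,x^{dk(i+1)+jd}A_{ik+j} \\
&= \left(b_0^{(p)}\beta_0 + b_1^{(p)}\beta_1 + \cdots + b_{m-1}^{(p)}\beta_{m-1}\right)A_{ik+j} \\
&= \left(a_{(ik+j)d} + a_{(ik+j)d+1}x + \cdots + a_{(ik+j)d+d-1}x^{d-1}\right)B^{(p)} \\
&= a_qB^{(p)} + a_{q+1}xB^{(p)} + \cdots + a_{q+d-1}x^{d-1}B^{(p)} \\
&= a_qB^{(p+q)} + a_{q+1}B^{(p+q+1)} + \cdots + a_{q+d-1}B^{(p+q+d-1)} \\
&= \left[B^{(p+q)}, B^{(p+q+1)}, \ldots, B^{(p+q+d-1)}\right]\left[a_q, a_{q+1}, \ldots, a_{q+d-1}\right]^T
\end{aligned}
$$
where $p = dk(i+1) + jd$, $q = (ik+j)d$ and $B^{(p)} = b_0^{(p)}\beta_0 + b_1^{(p)}\beta_1 + \cdots + b_{m-1}^{(p)}\beta_{m-1}$.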
In the detailed circuit of processing element $PE_{i,j}$ in FIG. 3, the input of the CMP module is $Bx^{dk(i+1)+jd}$ and its output is $[B^{(p+q)}, B^{(p+q+1)}, \ldots, B^{(p+q+d-1)}]$; the input of the CVP module is $A_{ik+j}$ and its output is $[a_q, a_{q+1}, \ldots, a_{q+d-1}]^T$. The PWM module computes the product $[B^{(p+q)}, B^{(p+q+1)}, \ldots, B^{(p+q+d-1)}][a_q, a_{q+1}, \ldots, a_{q+d-1}]^T$, which is then added to the accumulation input [formula image]; the sum is stored in register L and output from register L as [formula image]. The input of the R1 module is $B_{in}$; it performs the operation $x^dB_{in} \bmod F(x)$, and the result is stored in register L and output as $B_{out}$.
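The R1 operation $x^dB_{in} \bmod F(x)$ for a trinomial $F(x) = 1 + x^n + x^m$ can be illustrated as d single-bit shifts with feedback. The sketch below is an assumption-laden model, not the patented circuit: it treats $B_{in}$ as a polynomial-basis element stored in an integer, whereas the source keeps B in the dual basis, and the function names `times_x_mod` and `r1_step` are hypothetical.

```python
def times_x_mod(b, m, n):
    """Multiply b by x modulo F(x) = 1 + x^n + x^m (b is an int, bit i = coeff of x^i)."""
    b <<= 1
    if (b >> m) & 1:
        # x^m is congruent to x^n + 1, so clear bit m and fold it back onto bits n and 0.
        b ^= (1 << m) | (1 << n) | 1
    return b


def r1_step(b_in, d, m, n):
    """Model of R1: return x^d * b_in mod F(x) by applying d single shifts."""
    b_out = b_in
    for _ in range(d):
        b_out = times_x_mod(b_out, m, n)
    return b_out


# Toy field GF(2^7) with F(x) = 1 + x + x^7: x^6 * x^2 = x^8 = x^2 + x.
print(bin(r1_step(1 << 6, 2, 7, 1)))  # 0b110
```

Chaining k such steps along a row gives the same value as one application of the R3 operation $Bx^{kd} \bmod F(x)$, which is why the sketch needs only the single-shift primitive.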
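When computing $[B^{(p+q)}, B^{(p+q+1)}, \ldots, B^{(p+q+d-1)}][a_q, a_{q+1}, \ldots, a_{q+d-1}]^T$, the Toeplitz matrix-vector product is split as $\begin{bmatrix} t_1 & t_2 \\ t_0 & t_1 \end{bmatrix}\begin{bmatrix} v_0 \\ v_1 \end{bmatrix}$. Here $\begin{bmatrix} t_1 & t_2 \\ t_0 & t_1 \end{bmatrix}$ represents the Toeplitz matrix $[B^{(p+q)}, B^{(p+q+1)}, \ldots, B^{(p+q+d-1)}]$ divided into four blocks, two of which are identical and equal to $t_1$ while the other two are $t_0$ and $t_2$, and $\begin{bmatrix} v_0 \\ v_1 \end{bmatrix}$ represents the vector $[a_q, a_{q+1}, \ldots, a_{q+d-1}]^T$ divided into two segments, where T denotes the matrix transpose. This gives:
$$
\begin{aligned}
\left[B^{(p+q)}, B^{(p+q+1)}, \ldots, B^{(p+q+d-1)}\right]\left[a_q, a_{q+1}, \ldots, a_{q+d-1}\right]^T
&= \begin{bmatrix} t_1 & t_2 \\ t_0 & t_1 \end{bmatrix}\begin{bmatrix} v_0 \\ v_1 \end{bmatrix} \\
&= \begin{bmatrix} t_1(v_0 + v_1) + v_1(t_2 + t_1) \\ t_1(v_0 + v_1) + v_0(t_0 + t_1) \end{bmatrix} \\
&= \begin{bmatrix} c_0 \\ c_1 \end{bmatrix}
\end{aligned}
$$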
FIG. 4 shows the specific CMP, CVP and PWM circuits of the processing element PE. The CMP module takes the input $(t_0, t_1, t_2)$ and, through the exclusive-OR gates XOR_1 and XOR_2, outputs $(t_0 + t_1, t_1, t_1 + t_2)$. The CVP module takes the input $(v_0, v_1)$ and, through the exclusive-OR gate XOR_3, outputs $(v_0, v_0 + v_1, v_1)$. The PWM module multiplies the results output by the CMP module and the CVP module point-to-point through the three AND gates AND_1, AND_2 and AND_3 and outputs $(v_0(t_0 + t_1), t_1(v_0 + v_1), v_1(t_2 + t_1))$. The FR reduction module uses the two exclusive-OR gates XOR_4 and XOR_5 to compute $c_0 = t_1(v_0 + v_1) + v_1(t_2 + t_1)$ and $c_1 = t_1(v_0 + v_1) + v_0(t_0 + t_1)$, and outputs $(c_0, c_1)$.
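The gate-level data flow of FIG. 4 can be mirrored in a short Python sketch for the smallest case, in which $t_0, t_1, t_2, v_0, v_1$ are single bits (an assumption made here for illustration; in the source they may be sub-blocks). Addition over GF(2) is XOR and multiplication is AND. The function names are hypothetical labels for the four modules.

```python
def cmp_module(t0, t1, t2):
    """CMP: two XOR gates (XOR_1, XOR_2) produce (t0+t1, t1, t1+t2)."""
    return (t0 ^ t1, t1, t1 ^ t2)


def cvp_module(v0, v1):
    """CVP: one XOR gate (XOR_3) produces (v0, v0+v1, v1)."""
    return (v0, v0 ^ v1, v1)


def pwm_module(t_conv, v_conv):
    """PWM: three AND gates multiply the converted components point-to-point."""
    return tuple(t & v for t, v in zip(t_conv, v_conv))


def fr_module(products):
    """FR: two XOR gates (XOR_4, XOR_5) rebuild (c0, c1) from the three products."""
    p0, p1, p2 = products      # v0(t0+t1), t1(v0+v1), v1(t1+t2)
    return (p1 ^ p2, p1 ^ p0)  # c0 = t1*v0 + t2*v1, c1 = t0*v0 + t1*v1


def tmvp_2x2(t0, t1, t2, v0, v1):
    """Product [[t1, t2], [t0, t1]] * [v0, v1]^T using three multiplications instead of four."""
    return fr_module(pwm_module(cmp_module(t0, t1, t2), cvp_module(v0, v1)))


print(tmvp_2x2(1, 0, 1, 1, 1))  # (1, 1)
```

The design choice illustrated here is the usual trade in two-way TMVP splitting: one multiplication is saved at the cost of a few extra XOR gates before and after the AND stage.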
FIG. 2 shows the multi-bit serial systolic multiplier architecture proposed by the present invention, which is obtained by folding the structure shown in FIG. 1. FIG. 1 uses $k^2$ processing elements PE, and every row of k processing elements has the same structure and function, so the k processing elements of the 1st row can stand in for those of the remaining rows, at the cost of k cycles. In the 1st cycle, the input of A is $(A_0, A_1, \ldots, A_{k-1})$, B is input directly, and the calculation result passes through the FRRP reduction module and is stored in the temporary register C. In the 2nd cycle, the input of A is $(A_k, A_{k+1}, \ldots, A_{2k-1})$, B is input through the R3 module, and the calculation result passes through the FRRP reduction module, is added to the result of the 1st cycle and is stored in register C. Continuing in this way, in the k-th cycle the input of A is $(A_{k(k-1)}, A_{k(k-1)+1}, \ldots, A_{k^2-1})$, B is input after passing through the R3 module (k-1) times, and the calculation result passes through the FRRP restoration module, is added to the accumulated result of the previous (k-1) cycles and is stored in register C. Register C then outputs the result, $C = AB \bmod F(x)$.
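The folded schedule can be simulated cycle by cycle to see how one row of k PE modules is reused. The sketch below is a behavioral model under explicit assumptions that go beyond the source: all elements are held in the polynomial basis (the source uses the dual basis for B and C), the FRRP module is modeled as a plain reduction modulo $F(x)$, and the names `folded_multiply`, `clmul` and `reduce_mod` are hypothetical.

```python
def clmul(a, b):
    """Carry-less product of two GF(2)[x] polynomials stored as ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def reduce_mod(p, f):
    """Reduce polynomial p modulo f over GF(2)."""
    deg_f = f.bit_length() - 1
    while p.bit_length() - 1 >= deg_f:
        p ^= f << (p.bit_length() - 1 - deg_f)
    return p


def folded_multiply(a, b, m, n, d, k):
    """Behavioral model of the folded multiplier: k cycles over one row of k PEs."""
    f = (1 << m) | (1 << n) | 1                   # F(x) = 1 + x^n + x^m
    mask = (1 << d) - 1
    c = 0                                         # temporary register C
    b_row = b                                     # B as presented to the row this cycle
    for i in range(k):                            # cycle i reuses the same k PEs
        b_in, acc = b_row, 0
        for j in range(k):                        # PE j of the row
            seg = (a >> ((i * k + j) * d)) & mask     # A_{ik+j}
            acc ^= clmul(b_in, seg)                   # product added to the running sum
            b_in = reduce_mod(b_in << d, f)           # R1: x^d * B_in mod F(x)
        c ^= reduce_mod(acc, f)                   # FRRP stand-in, then accumulate in C
        b_row = reduce_mod(b_row << (k * d), f)   # R3: B x^{kd} mod F(x) for the next cycle
    return c


# Toy check in GF(2^7) with F(x) = 1 + x + x^7, d = 2, k = 2.
print(folded_multiply(0b1011011, 0b1100101, 7, 1, 2, 2)
      == reduce_mod(clmul(0b1011011, 0b1100101), 0b10000011))
```

In this model the k-th cycle sees B after (k-1) applications of R3, matching the schedule described above, and the loop uses k PE evaluations per cycle rather than the $k^2$ processing elements of the unfolded array.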
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (6)

1. A fast-operation multi-bit serial systolic binary finite field multiplier, characterized in that the multiplier comprises an input $B$, $k$ PE modules, an FRRP module and an R3 module; the $k$ PE modules are connected in series and operate over $k$ cycles; in the 1st cycle, the input of $A$ is $A_0, A_1, \ldots, A_{k-1}$, $B$ is input directly, and the calculation result is restored by the FRRP module and stored in a temporary register $C$; in the 2nd cycle, $A$ is input, $B$ is input through the R3 module, and the calculation result is likewise restored by the FRRP module, added to the calculation result of the 1st cycle and stored in the temporary register $C$; continuing in this way, in the $k$-th cycle, the input of $A$ is $A_{k(k-1)}, A_{k(k-1)+1}, \ldots, A_{k^2-1}$, $B$ is input after passing through the R3 module $(k-1)$ times, and the calculation result is restored by the FRRP module, added to the accumulated result of the previous $(k-1)$ cycles and stored in the temporary register $C$, which outputs the result; the R3 module implements $Bx^{kd} \bmod F(x)$; the PE module includes an R1 module, a CMP module, a CVP module, a PWM module, [formula image] exclusive-OR gates and [formula image] latches; the output of the R3 module is passed to the R1 module and then undergoes coefficient conversion by the CMP module; the $A$ segment is input to the CVP module to undergo coefficient conversion of the $A$ segment; the calculation results of the CMP module and the CVP module are input to the PWM module to compute the product of $B_{in}$ and the $A$ segment, which is accumulated by the [formula image] exclusive-OR gates, stored in the [formula image] latches, and output from the [formula image] latches as the result [formula image]; wherein $A$ is represented in the polynomial basis defined by the trinomial $F(x) = 1 + x^n + x^m$ as $A = a_0 + a_1x + \cdots + a_{m-1}x^{m-1}$, with $m$ coefficients in total, namely $(a_0, a_1, \ldots, a_{m-1})$; using segmentation, the $m$-bit $A$ is cut into segments of $d$ bits each, $k^2$ segments in total, giving $A_0, A_1, \ldots, A_{k^2-1}$; $B$ is represented in the dual basis as $B = b_0\beta_0 + b_1\beta_1 + \cdots + b_{m-1}\beta_{m-1}$ and serves as the other input of the multiplier; and $C$ is the output result.
2. The fast-acting multi-bit serial systolic dual-basis binary finite field multiplier of claim 1, in which said FRRP module includes an FR module and an R2 module, said R2 module implementing $C \bmod (x^m + 1)$; the input of the FR module is the calculation result of the k series-connected PE modules, and the result is restored and output to the R2 module.
3. The fast-acting multi-bit serial systolic dual-basis binary finite field multiplier of claim 1, characterized in that said CMP module comprises exclusive-OR gates XOR_1 and XOR_2, said exclusive-OR gates XOR_1 and XOR_2 being connected in parallel.
4. The fast-acting multi-bit serial systolic dual-basis binary finite field multiplier of claim 1, in which said CVP module is an exclusive-OR gate XOR_3.
5. The fast-acting multi-bit serial systolic dual-basis binary finite field multiplier of claim 1, characterized in that said PWM module comprises three AND gates AND_1, AND_2 and AND_3 connected in parallel, which multiply point-to-point the results output by said CMP module and said CVP module.
6. The fast-acting multi-bit serial systolic dual-basis binary finite field multiplier of claim 1, in which said FR module comprises two parallel exclusive-OR gates XOR_4 and XOR_5.
CN201310115401.7A 2013-04-03 2013-04-03 Fast arithmetic multi-bit serial pulse dual-base binary finite field multiplier Expired - Fee Related CN103186360B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310115401.7A CN103186360B (en) 2013-04-03 2013-04-03 Fast arithmetic multi-bit serial pulse dual-base binary finite field multiplier

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310115401.7A CN103186360B (en) 2013-04-03 2013-04-03 Fast arithmetic multi-bit serial pulse dual-base binary finite field multiplier

Publications (2)

Publication Number Publication Date
CN103186360A true CN103186360A (en) 2013-07-03
CN103186360B CN103186360B (en) 2016-08-03

Family

ID=48677539

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310115401.7A Expired - Fee Related CN103186360B (en) 2013-04-03 2013-04-03 Fast arithmetic multi-bit serial pulse dual-base binary finite field multiplier

Country Status (1)

Country Link
CN (1) CN103186360B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW527561B (en) * 2001-11-02 2003-04-11 Chiou-Ying Lee Low-complexity bit-parallel systolic multiplier over GF (2m)
TW200710716A (en) * 2006-11-24 2007-03-16 Univ Lunghwa Sci & Technology Low-complexity finite field GF(2m) bit-parallel systolic array dual-basis multiplier
CN102073477A (en) * 2010-11-29 2011-05-25 北京航空航天大学 Implementation method of finite field multiplying unit with functions of detecting, correcting and locating error
CN102929574A (en) * 2012-10-18 2013-02-13 复旦大学 Pulse multiplying unit design method on GF (Generator Field) (2163) domain

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHIOU-YNG LEE: "《Low-Complexity Bit-Parallel Sysolic Montgomery Multipliers for Special Classes of GF(2/sup m)》", 《IEEE TRANSACTION ON COMPUTERS》, vol. 54, no. 9, 25 July 2005 (2005-07-25), pages 1061 - 1070 *
HAINING FAN ET AL.: "Subquadratic Computational Complexity Schemes for Extended Binary Field Multiplication Using Optimal Normal Bases", 《IEEE TRANSACTION ON COMPUTERS》, vol. 56, no. 10, 25 October 2007 (2007-10-25), pages 1435 - 1437, XP011191962, DOI: doi:10.1109/TC.2007.1076 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104252332A (en) * 2014-08-20 2014-12-31 哈尔滨工业大学深圳研究生院 Multiplier and multiplier processing element for ellipse cipher apparatus
CN104252332B (en) * 2014-08-20 2018-09-18 哈尔滨工业大学深圳研究生院 A kind of multiplier processing unit and multiplier for elliptic curves cryptosystem device

Also Published As

Publication number Publication date
CN103186360B (en) 2016-08-03

Similar Documents

Publication Publication Date Title
Kim et al. A digit-serial multiplier for finite field GF (2/sup m/)
Lee Low complexity bit-parallel systolic multiplier over GF (2m) using irreducible trinomials
Namin et al. A word-level finite field multiplier using normal basis
Lee et al. Efficient design of low-complexity bit-parallel systolic Hankel multipliers to implement multiplication in normal and dual bases of GF (2 m)
Azarderakhsh et al. Systolic Gaussian normal basis multiplier architectures suitable for high-performance applications
Reyhani-Masoleh A new bit-serial architecture for field multiplication using polynomial bases
El-Razouk et al. New Bit-Level Serial GF (2^ m) Multiplication Using Polynomial Basis
CN103186360B (en) Binary system Galois field multiplier at the bottom of rapid computations many bits series connection pulsation double-basis
Hariri et al. Digit-level semi-systolic and systolic structures for the shifted polynomial basis multiplication over binary extension fields
Lee Low complexity systolic montgomery multiplication over finite fields GF (2 m)
Rashidi et al. High-speed hardware implementations of point multiplication for binary Edwards and generalized Hessian curves
Rashmi et al. Optimized reversible montgomery multiplier
CN103942027A (en) Reconfigurable rapid parallel multiplier
Lee Super Digit-Serial Systolic Multiplier over GF (2^ m)
Meher Systolic formulation for low-complexity serial-parallel implementation of unified finite field multiplication over GF (2 m)
Mozhi et al. Efficient bit-parallel systolic multiplier over GF (2 m)
Meher High-throughput hardware-efficient digit-serial architecture for field multiplication over GF (2 m)
Choi et al. Reduced complexity polynomial multiplier architecture for finite fields GF (2m)
Okamoto et al. A graph-based approach to designing parallel multipliers over Galois fields based on normal basis representations
Tujillo-Olaya et al. Hardware architectures for elliptic curve cryptoprocessors using polynomial and Gaussian normal basis over GF (2 233)
Madhuri et al. Analysis of reconfigurable multipliers for integer and Galois field multiplication based on high speed adders
KR20010068349A (en) Standard basis gf multiplier with the generalized basis cell and the fixed basic cell and squarer architecture
TW201616340A (en) Finite field multiplication device with reconfigurable architecture
Bhoite et al. A systolic architecture based GF (2m) multiplier using modified LSD first multiplication algorithm
Öztürk et al. A versatile Montgomery multiplier architecture with characteristic three support

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160803

Termination date: 20180403