CN101349967B - CBSA hardware adder of addition and subtraction non-difference paralleling calculation and design method thereof - Google Patents


Info

Publication number
CN101349967B
Authority
CN
China
Prior art keywords
logical
logical block
output
cbsa
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200810046004
Other languages
Chinese (zh)
Other versions
CN101349967A (en)
Inventor
王金波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Westone Information Industry Inc
Original Assignee
Chengdu Westone Information Industry Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Westone Information Industry Inc
Priority to CN 200810046004
Publication of CN101349967A
Application granted
Publication of CN101349967B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Complex Calculations (AREA)

Abstract

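The invention discloses an adder that computes addition and subtraction in parallel without distinguishing between them, and a design method thereof. The adder is composed of unit adder modules that perform single-bit logic operations in parallel. Each unit adder module comprises registers for $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$ and $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$; a logic unit computing $(\grave{x}_i \wedge \grave{y}_i) \vee (\grave{x}_i \wedge \grave{z}_i) \vee (\grave{y}_i \wedge \grave{z}_i)$; a logic unit computing $\grave{x}_i \oplus \grave{y}_i \oplus \grave{z}_i$; a logic unit computing $\tilde{x}_i \oplus \tilde{y}_i \oplus \tilde{z}_i$; a logic unit computing $(\tilde{x}_i \wedge \tilde{y}_i) \vee (\tilde{x}_i \wedge \tilde{z}_i) \vee (\tilde{y}_i \wedge \tilde{z}_i)$; a logic unit computing $\sim(s1_i \wedge (\sim s0_i))$; four logic AND gates connected to these logic units to obtain the data $\grave{c}_i$, $\grave{s}_i$, $\tilde{s}_i$, $\tilde{c}_i$; and an output bit register connected to each logic AND gate. The adder improves computational efficiency and resistance to physical attacks.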

Description

CBSA hardware adder for indifferent parallel computation of addition and subtraction, and design method thereof
Technical field
The present invention relates to the parallel design of addition and subtraction on very long operands in digital processing systems, and in particular to a hardware adder that computes addition and subtraction in parallel without distinguishing between them, and to its design method.
Background art
Many digital processing systems require addition and subtraction on very long operands (in this document, "digit position" and "bit" have the same meaning). For example, implementations of public-key cryptosystems in information security systems, such as RSA and ECC, involve additions and subtractions on operands of several hundred to several thousand bits or more. The basic processing unit of the CPU in an ordinary computer is only a few tens of bits wide (for example 8, 16, or 32 bits), so using it to add or subtract such large numbers is very slow and clearly cannot meet the fast-response requirements of applications. To raise processing speed, it is therefore necessary to design a hardware adder for addition and subtraction on very long operands and to use it to assist a high-speed implementation of the public-key cryptosystem. Hardware components are commonly used to compute complex algorithms for this reason. In practice, algorithmic computations are ultimately reduced to repeated basic operations such as addition and subtraction. The adder is one of the core components of a computer: the speed at which it performs addition and subtraction determines the computer's arithmetic performance, and the performance of any adder depends on the computation method it uses.
High-speed addition on very long operands in a hardware adder usually relies on bit-parallel computation. The adder currently used for parallel addition is the carry-save adder (Carry Save Adder, hereinafter CSA). Its basic idea is to implement repeated additions of arbitrary unsigned integers in parallel using simple bitwise logic such as XOR, OR, and AND. Each CSA operation outputs two values: C, which holds the carry information of every bit, and S, which holds the XOR information of every bit. Because CSA adds without a carry chain, it is particularly suitable for hardware adders that operate on very long data. If the processor unit normally used is m bits wide, a CSA operation of L-bit length is more than L/m times as fast as an ordinary addition processor.
Let '$\oplus$' denote bitwise XOR, '$\wedge$' bitwise AND, and '$\vee$' bitwise OR. For non-negative integer inputs X, Y, Z, the computation CSA(X, Y, Z) = (C, S) outputs C and S satisfying 2C + S = X + Y + Z, and the CSA formulas are:
$$C = (X \wedge Y) \vee (X \wedge Z) \vee (Y \wedge Z), \qquad S = X \oplus Y \oplus Z.$$
As the formulas show, every operation above can be carried out bitwise (that is, bit by bit) in parallel, so the CSA addition of three non-negative integers of arbitrary bit length completes in one clock beat. Repeated additions can therefore be performed efficiently with CSA. Its drawback is that it only supports parallel addition of unsigned integers; it cannot perform bit-parallel computation on signed numbers, that is, it cannot subtract.
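The CSA formulas can be sketched in a few lines of Python, using arbitrary-precision integers as stand-ins for long registers. The function name `csa` and the operands are illustrative choices, not part of the patent text.

```python
def csa(x, y, z):
    """Carry-save addition of three non-negative integers, bit-parallel."""
    c = (x & y) | (x & z) | (y & z)  # carry word: majority of the three bits in each position
    s = x ^ y ^ z                    # sum word: XOR of the three bits in each position
    return c, s


# Illustrative operands (not from the source text).
x, y, z = 0b1011, 0b0110, 0b1101
c, s = csa(x, y, z)
assert 2 * c + s == x + y + z        # the identity 2C + S = X + Y + Z
```

No bit of `c` or `s` depends on a neighbouring bit position, which is why the hardware version needs no carry chain.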
In addition, to keep a public-key cryptosystem secure, information leakage during its execution should be reduced or avoided (for example, timing and power information leaked during computation can be analyzed to recover the key). Operations such as comparison, carry, and borrow are often exploited by attackers for timing and power analysis. Implementing a public-key cryptosystem involves both addition and subtraction and cannot be done with CSA alone. Existing hardware accelerators for public-key cryptosystems therefore inevitably introduce comparison, carry, and borrow operations in order to support subtraction, which harms security. For this reason, designing and implementing highly parallel adder hardware for very long operands that unifies addition and subtraction, and so effectively avoids information leakage during computation, has become a sought-after goal.
The general method for unifying addition and subtraction is ordinary addition using the complement rule, but it is not a parallel method. A better method that attempts to process addition and subtraction in parallel is the Radix-2 Signed Digit method, hereinafter SD2, which uses radix-2 signed-digit coding to achieve limited parallelism for addition and subtraction. SD2 uses the digit set $\{\bar{1}, 0, 1\}$, where $\bar{1}$ denotes $-1$. An SD2 integer $A = [a_{n-1} \ldots a_1 a_0]$ with $a_i \in \{\bar{1}, 0, 1\}$ has the value $\sum_{i=0}^{n-1} a_i 2^i$. Let $B = [b_{n-1} \ldots b_1 b_0]$ and $S = [s_{n-1} \ldots s_1 s_0]$, and compute $S = A + B$. When any two SD2 numbers A and B are added, each digit $s_i$ of the result must be obtained by looking up a rule table. Using this table, each digit $s_i$ is computed from the two preceding digits $a_{i-2}$, $a_{i-1}$ and $b_{i-2}$, $b_{i-1}$ of the operands. Because of this dependence on preceding digits, the SD2 method is not a bitwise-parallel method.
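As a small illustration of the SD2 value formula only (not of the SD2 addition rule table, which the text does not reproduce), the following Python sketch evaluates a digit string over {-1, 0, 1}. The function name `sd2_value` is a hypothetical helper.

```python
def sd2_value(digits):
    """Value of a radix-2 signed-digit string, most significant digit first."""
    value = 0
    for d in digits:
        if d not in (-1, 0, 1):
            raise ValueError("SD2 digits must be -1, 0 or 1")
        value = 2 * value + d        # Horner evaluation of sum(a_i * 2**i)
    return value


# The same integer can have more than one SD2 digit string.
assert sd2_value([0, 1, 1]) == 3     # 2 + 1
assert sd2_value([1, 0, -1]) == 3    # 4 - 1
```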
The RSD method can also unify addition and subtraction, but it is not a parallel method either. Its idea is to represent an integer X by two positive integers $x^+$ and $x^-$ with $X = x^+ - x^-$. If $X = (x^+, x^-)$ and $Y = (y^+, y^-)$, then $X - Y = (x^+, x^-) - (y^+, y^-) = (x^+, x^-) + (y^-, y^+)$ and $X + Y = (x^+, x^-) + (y^+, y^-)$. The RSD method can compare two integers directly from the most significant bits of $x^+$ and $x^-$; it avoids complementing the subtrahend and does not need to handle carry and borrow during the operation. Its drawback is that the two positive integers $x^+$ and $x^-$ must be tagged as positive and negative, and when additions and subtractions are repeated the data must be compared and converted between the two tags continually. If RSD is implemented with parallel CSA hardware, every tag requires large data transfers between two independent CSA processing modules, and the hardware efficiency is very low.
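The RSD identities above can be checked with a minimal Python sketch. The pair layout `(x_plus, x_minus)` and the function names are assumptions made for illustration, and ordinary integer addition stands in for whatever adder a real implementation would use.

```python
def rsd_value(x):
    """Integer represented by the pair (x_plus, x_minus)."""
    x_plus, x_minus = x
    return x_plus - x_minus


def rsd_add(x, y):
    """X + Y = (x+, x-) + (y+, y-), added component by component."""
    return x[0] + y[0], x[1] + y[1]


def rsd_sub(x, y):
    """X - Y = (x+, x-) + (y-, y+): subtraction becomes addition of the swapped pair."""
    return rsd_add(x, (y[1], y[0]))


x = (9, 2)   # represents 7
y = (1, 5)   # represents -4
assert rsd_value(rsd_add(x, y)) == rsd_value(x) + rsd_value(y)
assert rsd_value(rsd_sub(x, y)) == rsd_value(x) - rsd_value(y)
```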
In summary, the SD2 method has a limited dependence on preceding bits (it requires table lookups and the like), which lowers the efficiency of parallel hardware and makes hardware implementation harder, while the complement method and the RSD method are not parallel at all. The analysis above shows that none of the methods and hardware adders currently used to unify addition and subtraction is a design that truly treats addition and subtraction indifferently and computes strictly in parallel, bit by bit. The technical problem solved by the present invention is how to process addition and subtraction without distinction, compute strictly bit-parallel, and at the same time allow the computation process to be randomized.
Summary of the invention
The object of the invention is to provide an adder for indifferent parallel computation of addition and subtraction, and its design method, so that addition and subtraction are processed without distinction and computed in parallel. The adder performs addition and subtraction of arbitrary integers strictly bit-parallel and allows the computation process to be randomized. It is suited to the high-speed addition and subtraction on very long operands required by public-key cryptosystems in information security systems, and improves both the computational efficiency and the security of the system.
The object of the invention is achieved by the following technical solution:
A CBSA hardware adder for indifferent parallel computation of addition and subtraction, characterized in that it is composed of unit adder modules for single-bit logic computation, with at least 64 bits operating in parallel, wherein the unit adder module for each bit includes the following logic structure:
three unsigned-number registers whose input bits are $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$ respectively,
three redundant-number registers whose input bits are $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$ respectively,
logic unit-1, connected to the three unsigned-number registers $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$, which performs the logic operation $(\grave{x}_i \wedge \grave{y}_i) \vee (\grave{x}_i \wedge \grave{z}_i) \vee (\grave{y}_i \wedge \grave{z}_i)$ and outputs the carry information $c0_i = (\grave{x}_i \wedge \grave{y}_i) \vee (\grave{x}_i \wedge \grave{z}_i) \vee (\grave{y}_i \wedge \grave{z}_i)$ of this bit,
logic unit-2, connected to the three unsigned-number registers $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$, which performs the logic operation $\grave{x}_i \oplus \grave{y}_i \oplus \grave{z}_i$ and outputs the XOR information $s0_i = \grave{x}_i \oplus \grave{y}_i \oplus \grave{z}_i$ of this bit,
logic unit-3, connected to the three redundant-number registers $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$, which performs the logic operation $\tilde{x}_i \oplus \tilde{y}_i \oplus \tilde{z}_i$ and outputs the XOR information $s1_i = \tilde{x}_i \oplus \tilde{y}_i \oplus \tilde{z}_i$ of this bit,
logic unit-4, connected to the three redundant-number registers $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$, which performs the logic operation $(\tilde{x}_i \wedge \tilde{y}_i) \vee (\tilde{x}_i \wedge \tilde{z}_i) \vee (\tilde{y}_i \wedge \tilde{z}_i)$ and outputs the carry information $c1_i = (\tilde{x}_i \wedge \tilde{y}_i) \vee (\tilde{x}_i \wedge \tilde{z}_i) \vee (\tilde{y}_i \wedge \tilde{z}_i)$ of this bit,
logic unit-5, connected to logic unit-2 and logic unit-3, which performs the logic operation $\sim(s1_i \wedge (\sim s0_i))$ on its inputs $s0_i$ and $s1_i$ and outputs the result $t_i = \sim(s1_i \wedge (\sim s0_i))$,
logic AND gate-1, connected to logic unit-1 and logic unit-5, which performs the logic AND of its inputs $c0_i$ and $t_i$ to obtain $\grave{c}_i = c0_i \wedge t_i$,
logic AND gate-2, connected to logic unit-2 and logic unit-5, which performs the logic AND of its inputs $s0_i$ and $t_i$ to obtain $\grave{s}_i = s0_i \wedge t_i$,
logic AND gate-3, connected to logic unit-3 and logic unit-5, which performs the logic AND of its inputs $s1_i$ and $t_i$ to obtain $\tilde{s}_i = s1_i \wedge t_i$,
logic AND gate-4, connected to logic unit-4 and logic unit-5, which performs the logic AND of its inputs $c1_i$ and $t_i$ to obtain $\tilde{c}_i = c1_i \wedge t_i$,
an output bit register for $\grave{c}_i$ connected to logic AND gate-1,
an output bit register for $\grave{s}_i$ connected to logic AND gate-2,
an output bit register for $\tilde{s}_i$ connected to logic AND gate-3,
and an output bit register for $\tilde{c}_i$ connected to logic AND gate-4;
The bits $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$ are the i-th bits of the unsigned numbers $\grave{X} = (\grave{x}_{n-1} \ldots \grave{x}_1 \grave{x}_0)$, $\grave{Y} = (\grave{y}_{n-1} \ldots \grave{y}_1 \grave{y}_0)$, $\grave{Z} = (\grave{z}_{n-1} \ldots \grave{z}_1 \grave{z}_0)$ of arbitrary binary integers $X = (\pm x_{n-1} \ldots \pm x_1 \pm x_0)$, $Y = (\pm y_{n-1} \ldots \pm y_1 \pm y_0)$, $Z = (\pm z_{n-1} \ldots \pm z_1 \pm z_0)$, where $\grave{x}_i, \grave{y}_i, \grave{z}_i \in \{0, 1\}$, $i = 0, 1, \ldots, n-1$, and n is any positive integer greater than 64; the unsigned numbers $\grave{X}$, $\grave{Y}$, $\grave{Z}$ are the representations obtained by removing the signs in front of all digits of the corresponding binary numbers X, Y, Z;
The bits $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$ are the i-th bits of the redundant numbers $\tilde{X} = (\tilde{x}_{n-1} \ldots \tilde{x}_1 \tilde{x}_0)$, $\tilde{Y} = (\tilde{y}_{n-1} \ldots \tilde{y}_1 \tilde{y}_0)$, $\tilde{Z} = (\tilde{z}_{n-1} \ldots \tilde{z}_1 \tilde{z}_0)$ of the same binary integers X, Y, Z, where $\tilde{x}_i, \tilde{y}_i, \tilde{z}_i \in \{0, 1\}$, $i = 0, 1, \ldots, n-1$, and n is any positive integer greater than 64; the redundant numbers $\tilde{X}$, $\tilde{Y}$, $\tilde{Z}$ are the representations obtained by marking each digit of the corresponding binary number X, Y, Z with 1 if the sign in front of the digit is negative and with 0 otherwise;
The operator '$\wedge$' denotes bitwise logical AND, '$\vee$' denotes bitwise logical OR, '$\oplus$' denotes bitwise logical XOR, and '$\sim$' denotes bitwise logical negation (that is, $1 = \sim 0$ and $0 = \sim 1$).
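The unit adder module described above can be modelled as a pure function on single bits. The sketch below is a software model under two stated assumptions that the text does not spell out: the sign bit of a zero digit is 0, and the carry digit produced at position i has weight $2^{i+1}$ (as in CSA, where 2C + S = X + Y + Z). The helper names `cbsa_unit`, `split` and `digit` are hypothetical.

```python
from itertools import product


def cbsa_unit(xm, ym, zm, xs, ys, zs):
    """One single-bit unit module; all arguments and results are 0 or 1."""
    c0 = (xm & ym) | (xm & zm) | (ym & zm)   # logic unit-1: carry of the unsigned bits
    s0 = xm ^ ym ^ zm                        # logic unit-2: XOR of the unsigned bits
    s1 = xs ^ ys ^ zs                        # logic unit-3: XOR of the sign bits
    c1 = (xs & ys) | (xs & zs) | (ys & zs)   # logic unit-4: carry of the sign bits
    t = 1 ^ (s1 & (1 ^ s0))                  # logic unit-5: ~(s1 AND ~s0) on one bit
    return c0 & t, s0 & t, s1 & t, c1 & t    # AND gates 1..4


def split(d):
    """Signed digit -> (unsigned bit, sign bit); the sign bit of 0 is taken as 0 (assumption)."""
    return abs(d), 1 if d < 0 else 0


def digit(mag, sign):
    """(unsigned bit, sign bit) -> signed digit."""
    return -mag if sign else mag


# Exhaustive check over all 27 triples of signed digits.
for dx, dy, dz in product((-1, 0, 1), repeat=3):
    (xm, xs), (ym, ys), (zm, zs) = split(dx), split(dy), split(dz)
    c_mag, s_mag, s_sign, c_sign = cbsa_unit(xm, ym, zm, xs, ys, zs)
    assert 2 * digit(c_mag, c_sign) + digit(s_mag, s_sign) == dx + dy + dz
```

Under these assumptions the gate $t_i$ is 0 only when the three digits are +1, -1 and 0 in some order, in which case they cancel and all four outputs are cleared.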
Other adders constructed from the CBSA hardware adder:
1. A four-input, two-output adder constructed from the CBSA hardware adder, implementing the modular-multiplication computing unit $(T + a_i B + q_i N) = (T1 + T2 + X1 + X2)$, including:
four output registers storing the data T1, T2, X1, X2 respectively,
a first-stage CBSA hardware adder connected to the output registers T1, T2, X1,
a second-stage CBSA hardware adder connected to the two outputs of the first-stage CBSA hardware adder and to the X2 output register,
an output register unit (T1, T2), storing the data T1 and the data T2, connected to the two outputs of the second-stage CBSA hardware adder,
and the system clock signal clk and the adder reset signal rst, which control the output register unit (T1, T2).
2. A five-input, two-output adder constructed from the CBSA hardware adder, implementing the modular-multiplication computing unit $(T1 + T2 + a_i B1 + a_i B2 + q_i N)$, including:
five output registers storing the data T1, T2, $a_i B1$, $a_i B2$, $q_i N$ respectively,
a first-stage CBSA hardware adder connected to the output registers T1, T2, $a_i B1$,
a second-stage CBSA hardware adder connected to the two outputs of the first-stage CBSA hardware adder and to the $a_i B2$ output register,
a third-stage CBSA hardware adder connected to the two outputs of the second-stage CBSA hardware adder and to the $q_i N$ output register,
an output register unit (T1, T2), storing the data T1 and the data T2, connected to the two outputs of the third-stage CBSA hardware adder,
and the system clock signal clk and the adder reset signal rst, which control the output register unit (T1, T2).
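The two cascades can be modelled in software by feeding the two outputs of one CBSA stage into the next. The sketch below makes several assumptions for illustration: each operand is a pair (unsigned word, sign word) of Python integers, the carry word is moved up one bit position inside `cbsa` so that the two outputs sum to the three inputs, and the clock and reset signals are not modelled. `encode`, `value`, `add_4_to_2` and `add_5_to_2` are hypothetical names.

```python
def cbsa(x, y, z):
    """One CBSA stage; x, y, z and both results are (unsigned word, sign word) pairs."""
    (xm, xs), (ym, ys), (zm, zs) = x, y, z
    c0 = (xm & ym) | (xm & zm) | (ym & zm)
    s0 = xm ^ ym ^ zm
    c1 = (xs & ys) | (xs & zs) | (ys & zs)
    s1 = xs ^ ys ^ zs
    t = ~(s1 & ~s0)
    # Assumption: the carry word is shifted up one position so that the next
    # stage sees it at the correct weight.
    return ((c0 & t) << 1, (c1 & t) << 1), (s0 & t, s1 & t)


def encode(v):
    """Hypothetical helper: every non-zero binary digit of v carries the sign of v."""
    return abs(v), (abs(v) if v < 0 else 0)


def value(n):
    """Integer represented by an (unsigned word, sign word) pair."""
    mag, sign = n
    return mag - 2 * (mag & sign)


def add_4_to_2(t1, t2, x1, x2):
    c, s = cbsa(t1, t2, x1)          # first-stage CBSA
    return cbsa(c, s, x2)            # second-stage CBSA, result goes back to (T1, T2)


def add_5_to_2(t1, t2, a_b1, a_b2, q_n):
    c, s = cbsa(t1, t2, a_b1)        # first stage
    c, s = cbsa(c, s, a_b2)          # second stage
    return cbsa(c, s, q_n)           # third stage


operands = [7, -3, 12, -20, 5]       # illustrative values, not from the source
t1, t2 = add_4_to_2(*map(encode, operands[:4]))
assert value(t1) + value(t2) == sum(operands[:4])
t1, t2 = add_5_to_2(*map(encode, operands))
assert value(t1) + value(t2) == sum(operands)
```

Positive and negative operands pass through the same stages with no comparison or branch on the sign, which is the behaviour the text calls indifferent.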
The design method for the CBSA hardware adder for indifferent parallel computation of addition and subtraction of the present invention has the following steps:
Step 1: determine the design goal of the CBSA adder, namely to compute CBSA(X, Y, Z) = (C, S) satisfying C + S = X + Y + Z. To this end:
1. Any integers X, Y, Z are written in binary as $X = (\pm x_{n-1} \ldots \pm x_1 \pm x_0)$, $Y = (\pm y_{n-1} \ldots \pm y_1 \pm y_0)$, $Z = (\pm z_{n-1} \ldots \pm z_1 \pm z_0)$, where $x_i \in \{0, 1\}$, $y_i \in \{0, 1\}$, $z_i \in \{0, 1\}$, and $X = \sum_{i=0}^{n-1} (\pm x_i 2^i)$, $Y = \sum_{i=0}^{n-1} (\pm y_i 2^i)$, $Z = \sum_{i=0}^{n-1} (\pm z_i 2^i)$;
2. For any three binary inputs $X = (\pm x_{n-1} \ldots \pm x_1 \pm x_0)$, $Y = (\pm y_{n-1} \ldots \pm y_1 \pm y_0)$ and $Z = (\pm z_{n-1} \ldots \pm z_1 \pm z_0)$, the output (C, S) of the CBSA computation is likewise a pair of binary numbers $C = (\pm c_{n-1} \ldots \pm c_1 \pm c_0)$ and $S = (\pm s_{n-1} \ldots \pm s_1 \pm s_0)$;
Step 2: define the unsigned representation and the redundant representation of a binary number
1) Unsigned representation of a binary number
For any binary integer $X = (\pm x_{n-1} \ldots \pm x_1 \pm x_0)$, remove the signs in front of all digits to obtain the unsigned representation of X, denoted $\grave{X} = (\grave{x}_{n-1} \ldots \grave{x}_1 \grave{x}_0)$, where $\grave{x}_i \in \{0, 1\}$;
2) Redundant representation of a binary number
For any binary integer $X = (\pm x_{n-1} \ldots \pm x_1 \pm x_0)$, mark each digit $x_i$ with 1 if the sign in front of it is negative and with 0 otherwise; this gives the redundant representation of X, denoted $\tilde{X} = (\tilde{x}_{n-1} \ldots \tilde{x}_1 \tilde{x}_0)$, where $\tilde{x}_i \in \{0, 1\}$;
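Step 2 can be sketched as a small encoder that turns a string of signed binary digits into the two words $\grave{X}$ and $\tilde{X}$. The function name and the digit-list input format (most significant digit first, each digit in {-1, 0, 1}) are assumptions for illustration, and a zero digit is given sign bit 0.

```python
def to_unsigned_and_redundant(digits):
    """Return (unsigned word, redundant word) for signed binary digits, MSB first."""
    unsigned = redundant = 0
    for d in digits:
        if d not in (-1, 0, 1):
            raise ValueError("digit must be -1, 0 or 1")
        unsigned = (unsigned << 1) | (1 if d != 0 else 0)   # drop the sign of the digit
        redundant = (redundant << 1) | (1 if d < 0 else 0)  # 1 marks a negative digit
    return unsigned, redundant


# X = (+1, +0, -1, +1) = 8 - 2 + 1 = 7
unsigned, redundant = to_unsigned_and_redundant([1, 0, -1, 1])
assert unsigned == 0b1011
assert redundant == 0b0010
```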
Step 3: perform the CBSA computation on input integers X, Y and Z of arbitrary bit length
Denote the unsigned and redundant representations of the input binary integers X, Y, Z of arbitrary bit length by $\grave{X}, \grave{Y}, \grave{Z}$ and $\tilde{X}, \tilde{Y}, \tilde{Z}$ respectively.
The computation CBSA(X, Y, Z) = (C, S) proceeds as follows:
(1) First compute in parallel $\mathrm{CSA}(\grave{X}, \grave{Y}, \grave{Z}) = (C0, S0)$ and $\mathrm{CSA}(\tilde{X}, \tilde{Y}, \tilde{Z}) = (C1, S1)$, where CSA is the known carry-save adder operation, obtaining:
$C0 = (c0_{n-1} \ldots c0_1 c0_0)$, $S0 = (s0_{n-1} \ldots s0_1 s0_0)$, $c0_i, s0_i \in \{0, 1\}$;
$C1 = (c1_{n-1} \ldots c1_1 c1_0)$, $S1 = (s1_{n-1} \ldots s1_1 s1_0)$, $c1_i, s1_i \in \{0, 1\}$;
(2) Denote the unsigned and redundant representations of the result CBSA(X, Y, Z) = (C, S) by $\grave{C}$, $\grave{S}$ and $\tilde{C}$, $\tilde{S}$ respectively, where:
$\grave{C} = (\grave{c}_{n-1} \ldots \grave{c}_1 \grave{c}_0)$, $\grave{S} = (\grave{s}_{n-1} \ldots \grave{s}_1 \grave{s}_0)$, $\grave{c}_i, \grave{s}_i \in \{0, 1\}$;
$\tilde{C} = (\tilde{c}_{n-1} \ldots \tilde{c}_1 \tilde{c}_0)$, $\tilde{S} = (\tilde{s}_{n-1} \ldots \tilde{s}_1 \tilde{s}_0)$, $\tilde{c}_i, \tilde{s}_i \in \{0, 1\}$;
(3) Using each bit of the results C0, S0, C1, S1 obtained in (1), compute each bit of the CBSA outputs $\grave{C}$, $\grave{S}$, $\tilde{S}$, $\tilde{C}$ as follows:
$\grave{c}_i = c0_i \wedge (\sim(s1_i \wedge (\sim s0_i)))$,
$\grave{s}_i = s0_i \wedge (\sim(s1_i \wedge (\sim s0_i)))$,
$\tilde{s}_i = s1_i \wedge (\sim(s1_i \wedge (\sim s0_i)))$,
$\tilde{c}_i = c1_i \wedge (\sim(s1_i \wedge (\sim s0_i)))$,
$i = 0, \ldots, n-1$.
Here the operator '$\sim$' denotes bitwise logical negation (that is, $1 = \sim 0$ and $0 = \sim 1$) and '$\wedge$' denotes bitwise logical AND. Considering the high-speed addition and subtraction on very long operands required by public-key cryptosystems in information security systems, the positive integer n is at least 64. As can be seen, step (3) of the CBSA computation is carried out fully in parallel, bit by bit; in one CBSA operation all single-bit logic units complete simultaneously within one beat.
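Steps (1) to (3) can be followed on whole words with Python integers standing in for the registers. This is a sketch under assumptions: sign bits are set only where the corresponding unsigned bit is 1, the carry word is left unshifted (so the check uses 2C + S, as in the CSA identity of the background section), and the function names are illustrative.

```python
def csa(a, b, c):
    """Known carry-save adder: returns (carry word, XOR word)."""
    return (a & b) | (a & c) | (b & c), a ^ b ^ c


def cbsa(xm, ym, zm, xs, ys, zs):
    """xm, ym, zm: unsigned words; xs, ys, zs: redundant (sign) words."""
    c0, s0 = csa(xm, ym, zm)                 # step (1) on the unsigned representations
    c1, s1 = csa(xs, ys, zs)                 # step (1) on the redundant representations
    t = ~(s1 & ~s0)                          # per-bit gate t_i
    return c0 & t, s0 & t, c1 & t, s1 & t    # step (3): unsigned C, unsigned S, redundant C, redundant S


def value(mag, sign):
    """Integer represented by an unsigned word and its sign word."""
    return mag - 2 * (mag & sign)


# X = +13, Y = -6, Z = -9 (illustrative operands)
c_mag, s_mag, c_sign, s_sign = cbsa(13, 6, 9, 0, 6, 9)
assert 2 * value(c_mag, c_sign) + value(s_mag, s_sign) == 13 - 6 - 9
```

The same six bitwise expressions handle the positive operand and the two negative ones; nothing in the computation branches on a sign.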
Step 4: recover the CBSA output data
After the system has performed a finite number of CBSA computations, it uses the final result $\grave{C}$, $\grave{S}$, $\tilde{C}$, $\tilde{S}$ to recover $C = (\pm c_{n-1} \ldots \pm c_1 \pm c_0)$ and $S = (\pm s_{n-1} \ldots \pm s_1 \pm s_0)$ as follows:
(1) $c_i = \grave{c}_i$, $s_i = \grave{s}_i$, $i = 0, \ldots, n-1$.
(2) Determine the signs: if $\tilde{c}_i = 0$, the sign in front of $c_i$ is '+', otherwise it is '-'; if $\tilde{s}_i = 0$, the sign in front of $s_i$ is '+', otherwise it is '-'.
Steps (1) and (2) of step 4 give the outputs C and S of CBSA(X, Y, Z);
(3) The final result W = C + S is then obtained by a conventional method.
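The recovery in step 4 amounts to reading the unsigned word and the redundant word side by side. In the sketch below the register contents are made-up values, the helper names are hypothetical, and the final sum takes the carry word at twice the weight of the sum word, which is one reading of "W = C + S" consistent with the CSA identity 2C + S = X + Y + Z.

```python
def recover_digits(mag, sign, n):
    """Signed digits (most significant first) from an unsigned word and a redundant word."""
    digits = []
    for i in reversed(range(n)):
        bit = (mag >> i) & 1                 # step (1): digit magnitude
        negative = (sign >> i) & 1           # step (2): 0 means '+', 1 means '-'
        digits.append(-bit if negative else bit)
    return digits


def digits_value(digits):
    """Ordinary integer value of a signed-digit string, most significant first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


# Hypothetical final register contents for n = 4.
c_digits = recover_digits(0b0101, 0b0100, 4)   # (+0, -1, +0, +1)
s_digits = recover_digits(0b0011, 0b0001, 4)   # (+0, +0, +1, -1)
w = 2 * digits_value(c_digits) + digits_value(s_digits)   # carry word weighted by 2 (assumption)
assert w == -5
```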
The steps above show that the key to computing CBSA(X, Y, Z) is to carry out the following simple logic operations by the method of step 3:
$c0_i = (\grave{x}_i \wedge \grave{y}_i) \vee (\grave{x}_i \wedge \grave{z}_i) \vee (\grave{y}_i \wedge \grave{z}_i)$
$s0_i = \grave{x}_i \oplus \grave{y}_i \oplus \grave{z}_i$
$c1_i = (\tilde{x}_i \wedge \tilde{y}_i) \vee (\tilde{x}_i \wedge \tilde{z}_i) \vee (\tilde{y}_i \wedge \tilde{z}_i)$
$s1_i = \tilde{x}_i \oplus \tilde{y}_i \oplus \tilde{z}_i$
$t_i = \sim(s1_i \wedge (\sim s0_i))$
$\grave{c}_i = c0_i \wedge t_i$, $i = 0, \ldots, n-1$.
$\grave{s}_i = s0_i \wedge t_i$, $i = 0, \ldots, n-1$.
$\tilde{s}_i = s1_i \wedge t_i$, $i = 0, \ldots, n-1$.
$\tilde{c}_i = c1_i \wedge t_i$, $i = 0, \ldots, n-1$.
The first five operations are the simple logic operations defined in step 3 on the input bits $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$, $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$ of $\grave{X}, \grave{Y}, \grave{Z}, \tilde{X}, \tilde{Y}, \tilde{Z}$ and on the output bits $c0_i$, $s0_i$, $c1_i$, $s1_i$ of step 3 (1). Based on these nine logic operations, the CBSA hardware adder can be designed as follows:
1. Design the registers for the input data $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$, $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$ of the computation;
the three registers storing $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$ respectively are called unsigned-number registers, and the three registers storing $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$ respectively are called redundant-number registers;
2. For the input data $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$, the logic operation $(\grave{x}_i \wedge \grave{y}_i) \vee (\grave{x}_i \wedge \grave{z}_i) \vee (\grave{y}_i \wedge \grave{z}_i)$, and the output data $c0_i$, design a simple logic circuit called logic unit-1 and connect the $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$ registers to it, giving a circuit structure in which logic unit-1 is connected to each of the three registers $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$;
3. For the input data $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$, the logic operation $\grave{x}_i \oplus \grave{y}_i \oplus \grave{z}_i$, and the output data $s0_i$, design a simple logic circuit called logic unit-2 and connect the $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$ registers to it, giving a circuit structure in which logic unit-2 is connected to each of the three registers $\grave{x}_i$, $\grave{y}_i$, $\grave{z}_i$;
4. For the input data $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$, the logic operation $\tilde{x}_i \oplus \tilde{y}_i \oplus \tilde{z}_i$, and the output data $s1_i$, design a simple logic circuit called logic unit-3 and connect the $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$ registers to it, giving a circuit structure in which logic unit-3 is connected to each of the three registers $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$;
5. For the input data $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$, the logic operation $(\tilde{x}_i \wedge \tilde{y}_i) \vee (\tilde{x}_i \wedge \tilde{z}_i) \vee (\tilde{y}_i \wedge \tilde{z}_i)$, and the output data $c1_i$, design a simple logic circuit called logic unit-4 and connect the $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$ registers to it, giving a circuit structure in which logic unit-4 is connected to each of the three registers $\tilde{x}_i$, $\tilde{y}_i$, $\tilde{z}_i$;
6. For the input data $s0_i$ and $s1_i$, the logic operation $\sim(s1_i \wedge (\sim s0_i))$, and the output data $t_i$, design a simple logic circuit called logic unit-5 and connect it to logic unit-2 and logic unit-3, giving a circuit structure in which logic unit-5 is connected to logic unit-2 and logic unit-3;
7. For the input data $c0_i$ and $t_i$, the logic operation $c0_i \wedge (\sim(s1_i \wedge (\sim s0_i)))$, and the output data $\grave{c}_i$, design a simple logic circuit called logic AND gate-1 and connect it to logic unit-1 and logic unit-5, giving a circuit structure in which logic AND gate-1 is connected to logic unit-1 and logic unit-5;
8. For the input data $s0_i$ and $t_i$, the logic operation $s0_i \wedge (\sim(s1_i \wedge (\sim s0_i)))$, and the output data $\grave{s}_i$, design a simple logic circuit called logic AND gate-2 and connect it to logic unit-2 and logic unit-5, giving a circuit structure in which logic AND gate-2 is connected to logic unit-2 and logic unit-5;
9. For the input data $s1_i$ and $t_i$, the logic operation $s1_i \wedge (\sim(s1_i \wedge (\sim s0_i)))$, and the output data $\tilde{s}_i$, design a simple logic circuit called logic AND gate-3 and connect it to logic unit-3 and logic unit-5, giving a circuit structure in which logic AND gate-3 is connected to logic unit-3 and logic unit-5;
10. For the input data $c1_i$ and $t_i$, the logic operation $c1_i \wedge (\sim(s1_i \wedge (\sim s0_i)))$, and the output data $\tilde{c}_i$, design a simple logic circuit called logic AND gate-4 and connect it to logic unit-4 and logic unit-5, giving a circuit structure in which logic AND gate-4 is connected to logic unit-4 and logic unit-5;
For the output data $\grave{c}_i$ of logic AND gate-1, $\grave{s}_i$ of logic AND gate-2, $\tilde{s}_i$ of logic AND gate-3, and $\tilde{c}_i$ of logic AND gate-4, provide output bit registers that store these data respectively;
Completing these steps yields the CBSA hardware adder for indifferent parallel computation of addition and subtraction.
This adder structure shows that performing the indifferent parallel CBSA computation on integers X, Y and Z of arbitrary bit length amounts to bitwise logic operations on the binary input data, giving n parallel unit adder modules for single-bit logic computation. The hardware circuit of each unit adder module is shown in Fig. 1, and Fig. 2 shows the structure of the adder module for n-bit parallel computation.
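The arrangement of n independent unit modules can be mimicked in software by applying the same single-bit function at every bit position. The sketch below uses bit lists indexed from the least significant bit; the names `cbsa_unit`, `cbsa_array` and `total`, the operands, and the convention that the carry digit at position i has weight $2^{i+1}$ are assumptions for illustration.

```python
def cbsa_unit(xm, ym, zm, xs, ys, zs):
    """One single-bit unit module (all values are 0 or 1)."""
    c0 = (xm & ym) | (xm & zm) | (ym & zm)
    s0 = xm ^ ym ^ zm
    s1 = xs ^ ys ^ zs
    c1 = (xs & ys) | (xs & zs) | (ys & zs)
    t = 1 ^ (s1 & (1 ^ s0))
    return c0 & t, s0 & t, s1 & t, c1 & t


def cbsa_array(xm, ym, zm, xs, ys, zs):
    """n unit modules side by side; arguments are equal-length bit lists, index i = bit i."""
    return [cbsa_unit(*bits) for bits in zip(xm, ym, zm, xs, ys, zs)]


def total(outputs):
    """Integer represented by the per-bit outputs, carry digit taken at weight 2**(i+1)."""
    result = 0
    for i, (c_mag, s_mag, s_sign, c_sign) in enumerate(outputs):
        c = -c_mag if c_sign else c_mag
        s = -s_mag if s_sign else s_mag
        result += (2 * c + s) << i
    return result


# X = +5, Y = -3, Z = -6 with n = 4, least significant bit first.
out = cbsa_array([1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0],
                 [0, 0, 0, 0], [1, 1, 0, 0], [0, 1, 1, 0])
assert total(out) == 5 - 3 - 6
```

Each list element is computed from the six input bits of its own position only, so the n evaluations could run simultaneously.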
The outstanding advantages of the present invention are:
Addition and subtraction are processed without distinction and computed in parallel. The characteristics can be summarized as follows: addition and subtraction of arbitrary integers are performed strictly bit-parallel, and the computation process can be randomized. The invention is particularly suitable for digital processing systems that add and subtract very long operands; for example, using the adder of the invention to assist a secure, high-speed implementation of a public-key cryptosystem can greatly improve the efficiency and security of that implementation.
Specifically, the main advantages of the hardware adder for indifferent parallel computation of addition and subtraction, and of its design method, are:
(1) Addition and subtraction of very long integers are carried out without distinction; any integers are accepted as input, and addition and subtraction are computed in parallel at high speed with simple logic. In an arithmetic processing system that performs many additions and subtractions, parallel computation with CSA requires every subtraction encountered to be handled separately, which lowers parallel efficiency considerably. The CBSA of the invention computes addition and subtraction uniformly in parallel and avoids conversion between them, which improves computational efficiency significantly.
(2) For the unsigned and redundant representations of data given in the invention, different unsigned numbers and redundant numbers can be chosen at random. Different choices yield different combinations of the two CBSA output values, which prevents an attacker from accurately probing or obtaining information about the algorithm during computation. The CBSA adder of the invention can therefore randomize the computation process.
(3) Because the CBSA adder of the invention computes addition and subtraction without distinction and supports randomized parallel computation, it eliminates the comparison, carry, borrow, and conditional-control operations needed between ordinary addition and subtraction, and can significantly improve the resistance of the hardware adder to physical attacks.
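One possible reading of advantage (2) is that the same integer has several signed-digit representations, and that feeding different ones to the adder changes the intermediate outputs but not the recovered sum. The sketch below illustrates only that reading; the operands, the function names, and the convention that the carry word has twice the weight of the sum word are assumptions, and nothing here measures actual resistance to attacks.

```python
def cbsa(xm, ym, zm, xs, ys, zs):
    """Word-level CBSA on unsigned words (xm, ym, zm) and sign words (xs, ys, zs)."""
    c0 = (xm & ym) | (xm & zm) | (ym & zm)
    s0 = xm ^ ym ^ zm
    c1 = (xs & ys) | (xs & zs) | (ys & zs)
    s1 = xs ^ ys ^ zs
    t = ~(s1 & ~s0)
    return c0 & t, s0 & t, c1 & t, s1 & t    # unsigned C, unsigned S, redundant C, redundant S


def value(mag, sign):
    return mag - 2 * (mag & sign)


def result(out):
    c_mag, s_mag, c_sign, s_sign = out
    return 2 * value(c_mag, c_sign) + value(s_mag, s_sign)


y = (0b010, 0b000)      # +2
z = (0b001, 0b001)      # -1
x_a = (0b011, 0b000)    # 3 written as 2 + 1
x_b = (0b101, 0b001)    # 3 written as 4 - 1

out_a = cbsa(x_a[0], y[0], z[0], x_a[1], y[1], z[1])
out_b = cbsa(x_b[0], y[0], z[0], x_b[1], y[1], z[1])
assert out_a != out_b                        # different intermediate words
assert result(out_a) == result(out_b) == 4   # same recovered sum 3 + 2 - 1
```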
Description of drawings
Fig. 1 is the logic structure diagram of the unit adder for single-bit logic computation of the present invention.
Fig. 2 is the composition diagram of the adder for n-bit parallel computation of the present invention.
Fig. 3 is the design block diagram of the five-input, two-output adder for the computing unit $(T + a_i B + q_i N) = (T1 + T2 + a_i B1 + a_i B2 + q_i N)$, which is invoked repeatedly in the Montgomery algorithm on very long operands.
Fig. 4 is the design block diagram of the four-input, two-output adder used to compute $(T + a_i B + q_i N) = (T1 + T2 + X1 + X2)$.
Embodiment
The CBSA hardware adder for parallel computation without distinction between addition and subtraction consists of at least 64 parallel unit adder modules for single-bit logic computation; each bit unit adder module has the following logical structure:
three unsigned number registers whose input bits are x`_i, y`_i, z`_i respectively,
three redundant digit registers whose input bits are x~_i, y~_i, z~_i respectively,
logical block-1, which is connected to the three unsigned number registers x`_i, y`_i, z`_i, performs the carry logic operation and outputs the carry information c0_i = (x`_i ∧ y`_i) ∨ (x`_i ∧ z`_i) ∨ (y`_i ∧ z`_i),
logical block-2, which is connected to the three unsigned number registers x`_i, y`_i, z`_i, performs the XOR logic operation and outputs the XOR information s0_i = x`_i ⊕ y`_i ⊕ z`_i,
logical block-3, which is connected to the three redundant digit registers x~_i, y~_i, z~_i, performs the XOR logic operation and outputs the XOR information s1_i = x~_i ⊕ y~_i ⊕ z~_i,
logical block-4, which is connected to the three redundant digit registers x~_i, y~_i, z~_i, performs the carry logic operation and outputs the carry information c1_i = (x~_i ∧ y~_i) ∨ (x~_i ∧ z~_i) ∨ (y~_i ∧ z~_i),
logical block-5, which is connected to logical block-2 and logical block-3, performs the logic operation (~(s1_i ∧ (~s0_i))) on the input data s0_i and s1_i, and outputs the result data t_i = (~(s1_i ∧ (~s0_i))),
logical AND gate-1, which is connected to logical block-1 and logical block-5 and ANDs the input data c0_i and t_i to obtain the result data c`_i = c0_i ∧ t_i,
logical AND gate-2, which is connected to logical block-2 and logical block-5 and ANDs the input data s0_i and t_i to obtain the result data s`_i = s0_i ∧ t_i,
logical AND gate-3, which is connected to logical block-3 and logical block-5 and ANDs the input data s1_i and t_i to obtain the result data s~_i = s1_i ∧ t_i,
logical AND gate-4, which is connected to logical block-4 and logical block-5 and ANDs the input data c1_i and t_i to obtain the result data c~_i = c1_i ∧ t_i,
the output bit c`_i register connected to logical AND gate-1,
the output bit s`_i register connected to logical AND gate-2,
the output bit s~_i register connected to logical AND gate-3,
and the output bit c~_i register connected to logical AND gate-4;
Here x`_i, y`_i, z`_i are the i-th bits of the unsigned numbers X` = (x`_{n-1} ... x`_1 x`_0), Y` = (y`_{n-1} ... y`_1 y`_0), Z` = (z`_{n-1} ... z`_1 z`_0) of arbitrary binary integers X = (±x_{n-1} ... ±x_1 ±x_0), Y = (±y_{n-1} ... ±y_1 ±y_0), Z = (±z_{n-1} ... ±z_1 ±z_0), where x`_i ∈ {0,1}, y`_i ∈ {0,1}, z`_i ∈ {0,1}; i = 0, 1, ..., n-1, and n is any positive integer greater than 64. The unsigned numbers X`, Y`, Z` are the representations obtained by removing the signs in front of every digit of the corresponding binary numbers X, Y, Z;
Here x~_i, y~_i, z~_i are the i-th bits of the redundant digits X~ = (x~_{n-1} ... x~_1 x~_0), Y~ = (y~_{n-1} ... y~_1 y~_0), Z~ = (z~_{n-1} ... z~_1 z~_0) of the same binary integers X, Y, Z, where x~_i ∈ {0,1}, y~_i ∈ {0,1}, z~_i ∈ {0,1}; i = 0, 1, ..., n-1, and n is any positive integer greater than 64. The redundant digits X~, Y~, Z~ are the representations obtained by marking each digit of the corresponding binary numbers X, Y, Z whose sign is negative as 1 and every other digit as 0;
The operator '∧' denotes the bitwise logical AND operation, '∨' the bitwise logical OR operation, '⊕' the bitwise logical XOR operation, and '~' the bitwise logical NOT operation (i.e. 1 = ~0, 0 = ~1).
The design method of the adder for parallel computation without distinction between addition and subtraction of the present invention has the following steps:
The first step: determine the design objective of the CBSA adder, which is to compute CBSA(X, Y, Z) = (C, S) such that C + S = X + Y + Z. To this end:
1. Arbitrary integers X, Y, Z are expressed in binary as X = (±x_{n-1} ... ±x_1 ±x_0), Y = (±y_{n-1} ... ±y_1 ±y_0), Z = (±z_{n-1} ... ±z_1 ±z_0), where x_i ∈ {0,1}, y_i ∈ {0,1}, z_i ∈ {0,1}, so that X = Σ_{i=0,...,n-1} (±x_i·2^i), Y = Σ_{i=0,...,n-1} (±y_i·2^i), Z = Σ_{i=0,...,n-1} (±z_i·2^i);
2. For any three input binary numbers X = (±x_{n-1} ... ±x_1 ±x_0), Y = (±y_{n-1} ... ±y_1 ±y_0) and Z = (±z_{n-1} ... ±z_1 ±z_0), the output after the CBSA computation is likewise a pair of binary numbers C = (±c_{n-1} ... ±c_1 ±c_0) and S = (±s_{n-1} ... ±s_1 ±s_0);
The second step: define the unsigned number representation and the redundant digit representation of a binary number.
1) Unsigned number representation of a binary number
For any binary integer X = (±x_{n-1} ... ±x_1 ±x_0), remove the signs in front of every digit to obtain the unsigned number representation of X, denoted X` = (x`_{n-1} ... x`_1 x`_0), where x`_i ∈ {0,1};
2) Redundant digit representation of a binary number
For any binary integer X = (±x_{n-1} ... ±x_1 ±x_0), mark each digit x_i whose sign is negative as 1 and every other digit as 0 to obtain the redundant digit representation of X, denoted X~ = (x~_{n-1} ... x~_1 x~_0), where x~_i ∈ {0,1};
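The two representations of the second step can be sketched in Python as a pair of bit masks. The function name `encode_signed_digits`, the least-significant-digit-first list layout and the use of Python integers as bit vectors are assumptions made for illustration only; they do not come from the patent.

```python
def encode_signed_digits(digits):
    """Split signed binary digits into (unsigned, redundant) bit masks.

    digits[i] is the digit of weight 2**i and must be -1, 0 or +1.
    Bit i of `unsigned` is the digit with its sign removed;
    bit i of `redundant` is 1 only when the digit is negative.
    """
    unsigned = 0
    redundant = 0
    for i, d in enumerate(digits):
        if d not in (-1, 0, 1):
            raise ValueError("each digit must be -1, 0 or +1")
        if d != 0:
            unsigned |= 1 << i
        if d < 0:
            redundant |= 1 << i
    return unsigned, redundant


# X = (+1 0 -1 +1), most significant digit first, i.e. 8 - 2 + 1 = 7
x_unsigned, x_redundant = encode_signed_digits([1, -1, 0, 1])
```

For this example the unsigned mask is 0b1011 and the redundant mask is 0b0010. A zero digit is never marked negative here, which keeps the pair unambiguous.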
The third step: perform the CBSA computation on input integers X, Y and Z of arbitrary bit length.
Denote the unsigned number representations and redundant digit representations of the input integers X, Y, Z of arbitrary bit length by X`, Y`, Z` and X~, Y~, Z~ respectively. The computation CBSA(X, Y, Z) = (C, S) proceeds as follows:
(1) First compute in parallel (C0, S0) = CSA(X`, Y`, Z`) and (C1, S1) = CSA(X~, Y~, Z~), where CSA is the operation of the well-known carry-save adder, obtaining:
C0 = (c0_{n-1} ... c0_1 c0_0), S0 = (s0_{n-1} ... s0_1 s0_0), c0_i, s0_i ∈ {0,1};
C1 = (c1_{n-1} ... c1_1 c1_0), S1 = (s1_{n-1} ... s1_1 s1_0), c1_i, s1_i ∈ {0,1};
(2) Denote the unsigned number representations and redundant digit representations of the result of CBSA(X, Y, Z) = (C, S) by C` and S`, and C~ and S~ respectively, where:
C` = (c`_{n-1} ... c`_1 c`_0), S` = (s`_{n-1} ... s`_1 s`_0), c`_i, s`_i ∈ {0,1}
C~ = (c~_{n-1} ... c~_1 c~_0), S~ = (s~_{n-1} ... s~_1 s~_0), c~_i, s~_i ∈ {0,1}
(3) Using each digit bit of the results C0, S0, C1, S1 obtained in computation (1), compute each digit bit of the CBSA outputs C`, S`, C~, S~ as follows:
c`_i = c0_i ∧ (~(s1_i ∧ (~s0_i)))
s`_i = s0_i ∧ (~(s1_i ∧ (~s0_i)))
s~_i = s1_i ∧ (~(s1_i ∧ (~s0_i)))
c~_i = c1_i ∧ (~(s1_i ∧ (~s0_i)))
Here the operator '~' denotes bitwise NOT (i.e. 1 = ~0, 0 = ~1) and '∧' denotes bitwise AND. In view of the high-speed addition and subtraction on very long operands required by public-key cryptosystems in information security systems, n is at least 64. As can be seen, in the CBSA computation the operations of the third step are carried out entirely bit by bit in parallel, and one CBSA operation can be completed in a single clock cycle.
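The third step can be modelled at word level with Python integers as bit vectors. This is only a sketch under stated assumptions: the names `csa`, `cbsa` and `to_int` are hypothetical, the CSA carry is taken to be the usual three-input majority, the carry word is given the weight of the next higher bit position (as in an ordinary carry-save adder, which the text does not spell out), and a zero digit is assumed never to be marked negative.

```python
def csa(x, y, z):
    """Bitwise carry-save step on three bit vectors: (carry, sum)."""
    carry = (x & y) | (x & z) | (y & z)   # majority of the three bits
    total = x ^ y ^ z                     # parity of the three bits
    return carry, total


def cbsa(X, Y, Z):
    """CBSA on (unsigned, redundant) pairs; returns the pairs (C, S)."""
    c0, s0 = csa(X[0], Y[0], Z[0])        # CSA on the unsigned numbers
    c1, s1 = csa(X[1], Y[1], Z[1])        # CSA on the redundant digits
    cancel = s1 & ~s0                     # bit positions where t_i = 0
    # Assumption: each carry bit counts at the next higher position.
    c_pair = ((c0 & ~cancel) << 1, (c1 & ~cancel) << 1)
    s_pair = (s0 & ~cancel, s1 & ~cancel)
    return c_pair, s_pair


def to_int(pair):
    """Integer value of an (unsigned, redundant) pair."""
    unsigned, redundant = pair
    return (unsigned & ~redundant) - (unsigned & redundant)


X = (0b1011, 0b0010)   # +8 - 2 + 1 = 7
Y = (0b0110, 0b0110)   # -4 - 2     = -6
Z = (0b0101, 0b0000)   # +4 + 1     = 5
C, S = cbsa(X, Y, Z)   # to_int(C) + to_int(S) equals 7 - 6 + 5 = 6
```

No carry propagates between bit positions inside `cbsa`, which is the property that lets the hardware finish one CBSA operation in a single clock cycle.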
The fourth step: recovery of the CBSA output data.
In practice, after performing a finite number of CBSA computations, the system uses its final computation results C`, S`, C~, S~ to recover C = (±c_{n-1} ... ±c_1 ±c_0) and S = (±s_{n-1} ... ±s_1 ±s_0) as follows:
(1) Set the digits: c_i = c`_i, s_i = s`_i;
(2) Determine the signs: if c~_i = 0, the sign in front of c_i is '+', otherwise the sign in front of c_i is '-'; if s~_i = 0, the sign in front of s_i is '+', otherwise the sign in front of s_i is '-'.
This yields the outputs C and S of CBSA(X, Y, Z);
(3) The final result W = C + S is then obtained by the usual method.
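The recovery of the fourth step can be illustrated as follows. The helper names `recover` and `digits_to_int`, the digit-list layout (least significant digit first) and the sample register contents are assumptions made for this sketch, and both words are treated as plain digit strings of equal weight.

```python
def recover(unsigned, redundant, n):
    """Return the n signed digits (least significant first) of one output."""
    digits = []
    for i in range(n):
        magnitude = (unsigned >> i) & 1   # step (1): the digit without its sign
        negative = (redundant >> i) & 1   # step (2): 1 means the sign is '-'
        digits.append(-magnitude if negative else magnitude)
    return digits


def digits_to_int(digits):
    """Ordinary integer value of a signed-digit list."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


# Illustrative final register contents (made-up 4-digit values).
C = recover(0b0110, 0b0100, 4)            # digits of C
S = recover(0b1000, 0b0000, 4)            # digits of S
W = digits_to_int(C) + digits_to_int(S)   # step (3): ordinary addition
```

Only this last step needs a carry-propagating addition; every preceding CBSA call works on the two bit masks alone.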
As is evident from the steps above, the key to realizing the CBSA(X, Y, Z) computation is to carry out the following simple logic operations by the method of the third step:
c0_i = (x`_i ∧ y`_i) ∨ (x`_i ∧ z`_i) ∨ (y`_i ∧ z`_i)
s0_i = x`_i ⊕ y`_i ⊕ z`_i
c1_i = (x~_i ∧ y~_i) ∨ (x~_i ∧ z~_i) ∨ (y~_i ∧ z~_i)
s1_i = x~_i ⊕ y~_i ⊕ z~_i
t_i = (~(s1_i ∧ (~s0_i)))
c`_i = c0_i ∧ t_i
s`_i = s0_i ∧ t_i
s~_i = s1_i ∧ t_i
c~_i = c1_i ∧ t_i
Based on these logic operations, the CBSA hardware adder can be designed as follows:
1. Design the registers for the computation input data x`_i, y`_i, z`_i, x~_i, y~_i, z~_i; the three registers that store the data x`_i, y`_i, z`_i respectively are called unsigned number registers, and the three registers that store the data x~_i, y~_i, z~_i respectively are called redundant digit registers;
2. For the input data x`_i, y`_i, z`_i, the carry logic operation (x`_i ∧ y`_i) ∨ (x`_i ∧ z`_i) ∨ (y`_i ∧ z`_i) and the output data c0_i, design a simple logic circuit called logical block-1 and connect the x`_i, y`_i, z`_i registers to it, obtaining a circuit structure in which logical block-1 is connected to the three registers x`_i, y`_i, z`_i;
3. For the input data x`_i, y`_i, z`_i, the logic operation x`_i ⊕ y`_i ⊕ z`_i and the output data s0_i, design a simple logic circuit called logical block-2 and connect the x`_i, y`_i, z`_i registers to it, obtaining a circuit structure in which logical block-2 is connected to the three registers x`_i, y`_i, z`_i;
4. For the input data x~_i, y~_i, z~_i, the logic operation x~_i ⊕ y~_i ⊕ z~_i and the output data s1_i, design a simple logic circuit called logical block-3 and connect the x~_i, y~_i, z~_i registers to it, obtaining a circuit structure in which logical block-3 is connected to the three registers x~_i, y~_i, z~_i;
5. For the input data x~_i, y~_i, z~_i, the carry logic operation (x~_i ∧ y~_i) ∨ (x~_i ∧ z~_i) ∨ (y~_i ∧ z~_i) and the output data c1_i, design a simple logic circuit called logical block-4 and connect the x~_i, y~_i, z~_i registers to it, obtaining a circuit structure in which logical block-4 is connected to the three registers x~_i, y~_i, z~_i;
6. For the input data s0_i and s1_i, the logic operation (~(s1_i ∧ (~s0_i))) and the output data t_i, design a simple logic circuit called logical block-5 and connect it to logical block-2 and logical block-3, obtaining a circuit structure in which logical block-5 is connected to logical block-2 and logical block-3;
7. For the input data c0_i and t_i, the logic operation c0_i ∧ (~(s1_i ∧ (~s0_i))) and the output data c`_i, design a simple logic circuit called logical AND gate-1 and connect it to logical block-1 and logical block-5, obtaining a circuit structure in which logical AND gate-1 is connected to logical block-1 and logical block-5;
8. For the input data s0_i and t_i, the logic operation s0_i ∧ (~(s1_i ∧ (~s0_i))) and the output data s`_i, design a simple logic circuit called logical AND gate-2 and connect it to logical block-2 and logical block-5, obtaining a circuit structure in which logical AND gate-2 is connected to logical block-2 and logical block-5;
9. For the input data s1_i and t_i, the logic operation s1_i ∧ (~(s1_i ∧ (~s0_i))) and the output data s~_i, design a simple logic circuit called logical AND gate-3 and connect it to logical block-3 and logical block-5, obtaining a circuit structure in which logical AND gate-3 is connected to logical block-3 and logical block-5;
10. For the input data c1_i and t_i, the logic operation c1_i ∧ (~(s1_i ∧ (~s0_i))) and the output data c~_i, design a simple logic circuit called logical AND gate-4 and connect it to logical block-4 and logical block-5, obtaining a circuit structure in which logical AND gate-4 is connected to logical block-4 and logical block-5;
For the output data c`_i of logical AND gate-1, the output data s`_i of logical AND gate-2, the output data s~_i of logical AND gate-3 and the output data c~_i of logical AND gate-4, provide output bit registers that store these data respectively;
Completing these steps yields the CBSA hardware adder that realizes parallel computation without distinction between addition and subtraction.
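The single-bit module obtained from these design steps can be checked exhaustively with a small gate-level model. This is a sketch under assumptions rather than the patented circuit itself: `unit_adder` and `check_all_digit_triples` are hypothetical names, both carry blocks are modelled as three-input majority gates, the carry output is assumed to weigh twice the sum output, and a zero digit is assumed never to carry a negative mark.

```python
from itertools import product


def unit_adder(xu, yu, zu, xr, yr, zr):
    """One single-bit module: six input bits in, four output bits out."""
    c0 = (xu & yu) | (xu & zu) | (yu & zu)   # logical block-1 (assumed majority carry)
    s0 = xu ^ yu ^ zu                        # logical block-2
    s1 = xr ^ yr ^ zr                        # logical block-3
    c1 = (xr & yr) | (xr & zr) | (yr & zr)   # logical block-4 (assumed majority carry)
    t = 1 ^ (s1 & (1 ^ s0))                  # logical block-5: ~(s1 & ~s0) on one bit
    return c0 & t, s0 & t, s1 & t, c1 & t    # logical AND gate-1 .. gate-4


def check_all_digit_triples():
    """Compare the module with ordinary addition of three signed digits."""
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        unsigned_bits = [abs(d) for d in (dx, dy, dz)]
        redundant_bits = [int(d < 0) for d in (dx, dy, dz)]
        cu, su, sr, cr = unit_adder(*unsigned_bits, *redundant_bits)
        carry = -cu if cr else cu
        total = -su if sr else su
        if 2 * carry + total != dx + dy + dz:   # carry assumed to weigh twice the sum
            return False
    return True
```

Under these assumptions `check_all_digit_triples()` returns `True` for all 27 digit combinations. The case s1 = 1, s0 = 0 (one positive and one negative digit) is the one in which logical block-5 forces every output to 0.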
The CBSA hardware adder of the present invention is described below with reference to Fig. 1 and Fig. 2.
Fig. 1 shows the single-bit unit adder of the present invention. Reference numerals 100~105 are the six input bit registers x`_i, y`_i, z`_i, x~_i, y~_i, z~_i of the single-bit computational logic of this adder, of which the x`_i, y`_i, z`_i registers are the unsigned number registers and the x~_i, y~_i, z~_i registers are the redundant digit registers; 106 is logical block-1, which performs the carry logic operation (x`_i ∧ y`_i) ∨ (x`_i ∧ z`_i) ∨ (y`_i ∧ z`_i) on the unsigned numbers x`_i, y`_i, z`_i output by the x`_i, y`_i, z`_i registers and outputs the data c0_i; 107 is logical block-2, which performs the logic operation x`_i ⊕ y`_i ⊕ z`_i on the unsigned numbers x`_i, y`_i, z`_i output by the x`_i, y`_i, z`_i registers and outputs the data s0_i; 108 is logical block-3, which performs the logic operation x~_i ⊕ y~_i ⊕ z~_i on the redundant digits x~_i, y~_i, z~_i output by the redundant digit registers and outputs the data s1_i; 109 is logical block-4, which performs the carry logic operation (x~_i ∧ y~_i) ∨ (x~_i ∧ z~_i) ∨ (y~_i ∧ z~_i) on the redundant digits x~_i, y~_i, z~_i output by the redundant digit registers and outputs the data c1_i; 111 is logical block-5, which performs the logic operation (~(s1_i ∧ (~s0_i))) on the s0_i and s1_i output by logical block-2 and logical block-3 and outputs the data t_i = (~(s1_i ∧ (~s0_i))); 112~115 are logical AND gates, of which 112 is logical AND gate-1, which ANDs the c0_i and t_i output by logical block-1 and logical block-5 to obtain the data c`_i; 113 is logical AND gate-2, which ANDs the s0_i and t_i output by logical block-2 and logical block-5 to obtain the data s`_i; 114 is logical AND gate-3, which ANDs the s1_i and t_i output by logical block-3 and logical block-5 to obtain the data s~_i; 115 is logical AND gate-4, which ANDs the c1_i and t_i output by logical block-4 and logical block-5 to obtain the data c~_i; 116~119 are the four output bit registers c`_i, s`_i, s~_i, c~_i of the single-bit computational logic of this adder. Here the operator '∧' denotes the logical AND operation, '∨' the logical OR operation, '⊕' the logical XOR operation, and '~' the logical NOT operation.
Fig. 2 is the composition diagram of the n-bit parallel adder of the present invention, which is particularly suited to parallel computation without distinction between addition and subtraction on very long operands. Reference numerals in the figure: 200~203 are the n parallel input bit units of the three input data X, Y, Z of the CBSA hardware adder; 204~207 are the unsigned number and redundant digit bit registers of the n parallel input bit units; 208~211 are the n parallel CBSA single-bit computational logic units, whose inputs are provided by the register units 204~207 and whose outputs are delivered to the register units 212~215; 212~215 are the outputs of the n parallel CBSA single-bit computational logic units; 216~219 are the n parallel output bit units of the two output data C, S of the CBSA hardware adder.
As shown in Fig. 2, the CBSA hardware adder for parallel computation without distinction between addition and subtraction of the present invention consists of n parallel single-bit unit adder modules of the kind shown in Fig. 1: 200, 204, 208, 212, 216 form the first single-bit unit adder module, 201, 205, 209, 213, 217 form the second, and so on, until 203, 207, 211, 215, 219 form the n-th. Because the operation of each module is completely independent, the CBSA hardware adder realizes addition and subtraction of arbitrary integers strictly in a bit-parallel manner. Because the input data of the n CBSA single-bit computational logic units 208~211 are unsigned numbers and redundant digits, which can be combined arbitrarily, the CBSA hardware adder can provide randomness in the computation process.
Below, we further describe combined designs and applications of the CBSA hardware adder of the present invention.
The core operation unit of a public-key cryptosystem is the large-number modular multiplication module operating on very long data, with operand lengths of at least several hundred bits. For example, the data operand length of ECC algorithms is at least 200 bits, and that of the RSA algorithm is at least 1024 bits.
Modular multiplication is carried out with the well-known Montgomery algorithm. Let the fixed modulus be N, the input data of the modular multiplication be A and B, and the output data be T, where A = (a_{n-1} ... a_1 a_0). Then, depending on each bit value a_i and the parameter q_i, the modular multiplication process needs to call the computing unit (T + a_i·B + q_i·N)/2 repeatedly n times, where a_i, q_i ∈ {-1, 0, 1}, q_i is the value of the lowest bit of T, and T is the output unit with initial value T = 0. If the original input data B and T are each randomly split into two parts, B = B1 + B2 and T = T1 + T2, a 5-input, 2-output adder combining three CBSA adders can be designed (as shown in Fig. 3) to compute (T + a_i·B + q_i·N) = (T1 + T2 + a_i·B1 + a_i·B2 + q_i·N), where T1 and T2 serve as the output register unit of the 5-input, 2-output adder.
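For orientation, the repeated call of (T + a_i·B + q_i·N)/2 can be sketched with the textbook bit-serial Montgomery multiplication on ordinary non-negative integers. This is a simplified illustration, not the signed-digit datapath of the patent: the function name `montgomery_multiply` is hypothetical, N is assumed odd with 0 <= A, B < N < 2**n, q_i is chosen as the lowest bit of T + a_i·B so that the sum is even, and a final conditional subtraction is added.

```python
def montgomery_multiply(A, B, N, n):
    """Return A * B * 2**(-n) mod N for odd N and 0 <= A, B < N < 2**n."""
    T = 0
    for i in range(n):
        a_i = (A >> i) & 1
        q_i = (T + a_i * B) & 1            # makes T + a_i*B + q_i*N even
        T = (T + a_i * B + q_i * N) >> 1   # the computing unit, one call per bit
    if T >= N:                             # T stays below 2*N, so one subtraction suffices
        T -= N
    return T


result = montgomery_multiply(7, 5, 13, 4)  # 7 * 5 * 2**-4 mod 13 = 3
```

Each loop iteration contains one three-operand addition, which is the part that the 5-input, 2-output and 4-input, 2-output adders are designed to speed up.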
Reference numerals in Fig. 3: 300~304 are the five input data units T1, T2, a_i·B1, a_i·B2, q_i·N of the 5-input, 2-output adder; 305~307 are three identical CBSA adder units; 308 is the output register unit (T1, T2) of the 5-input, 2-output adder; clk is the system clock signal, and rst is the reset signal of the 5-input, 2-output adder. With the 5-input, 2-output adder design, one (T + a_i·B + q_i·N)/2 operation can be realized in a single clock cycle. When the system calls the 5-input, 2-output adder repeatedly, the data T1 and T2 of the register unit (T1, T2) are fed back to the input section of the 5-input, 2-output adder; initially T1 = 0 and T2 = 0.
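The cascade of three CBSA units can be modelled in software as below. All names (`csa`, `cbsa`, `from_int`, `to_int`, `five_to_two`) are hypothetical, Python integers stand in for the registers, the CSA carry is assumed to be the usual majority function, and the carry word is assumed to enter the next stage shifted up by one bit position. The sketch covers only the five-operand sum, not the division by 2 or the clocked feedback of (T1, T2).

```python
def csa(x, y, z):
    """Bitwise carry-save step: (majority, parity) of three bit vectors."""
    return (x & y) | (x & z) | (y & z), x ^ y ^ z


def cbsa(X, Y, Z):
    """One CBSA unit on (unsigned, redundant) pairs; returns (C, S)."""
    c0, s0 = csa(X[0], Y[0], Z[0])
    c1, s1 = csa(X[1], Y[1], Z[1])
    cancel = s1 & ~s0                      # positions where t_i = 0
    # Assumption: the carry word is shifted up one position between stages.
    return ((c0 & ~cancel) << 1, (c1 & ~cancel) << 1), (s0 & ~cancel, s1 & ~cancel)


def from_int(v):
    """Encode an integer; every non-zero digit takes the sign of v."""
    m = abs(v)
    return (m, m if v < 0 else 0)


def to_int(pair):
    """Integer value of an (unsigned, redundant) pair."""
    unsigned, redundant = pair
    return (unsigned & ~redundant) - (unsigned & redundant)


def five_to_two(T1, T2, aB1, aB2, qN):
    """Three cascaded CBSA stages reducing five operands to two."""
    C, S = cbsa(T1, T2, aB1)               # first stage
    C, S = cbsa(C, S, aB2)                 # second stage
    return cbsa(C, S, qN)                  # third stage


operands = [from_int(v) for v in (9, -4, 23, -12, 13)]
T1_next, T2_next = five_to_two(*operands)  # two words whose values sum to 29
```

Negative operands pass through the same three stages as positive ones, which is what the text means by treating addition and subtraction alike.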
To compute (T + a_i·B + q_i·N), if (X1, X2) = CBSA(a_i·B1, a_i·B2, q_i·N) is precomputed, a 4-input, 2-output adder combining two CBSA adders can be designed (as shown in Fig. 4) to compute (T + a_i·B + q_i·N) = (T1 + T2 + X1 + X2), where T1 and T2 serve as the output register unit of the 4-input, 2-output adder. When the system calls the 4-input, 2-output adder repeatedly, the data T1 and T2 of the register unit (T1, T2) are fed back to the input section of the 4-input, 2-output adder; initially T1 = 0 and T2 = 0.
Reference numerals in Fig. 4: 400~403 are the four input data units T1, T2, X1, X2 of the 4-input, 2-output adder; 404~405 are two identical CBSA adder units; 406 is the output register unit (T1, T2) of the 4-input, 2-output adder; clk is the system clock signal, and rst is the reset signal of the 4-input, 2-output adder. With the 4-input, 2-output adder design and the precomputed values X1 and X2 of CBSA(a_i·B1, a_i·B2, q_i·N), one (T + a_i·B + q_i·N)/2 operation can be realized in a single clock cycle.
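The precomputation variant can be sketched in the same style. As before, the helper names are hypothetical, the CSA carry is assumed to be the majority function, and the carry word is assumed to be shifted up by one bit position; the point shown is only that one CBSA call is moved out of the repeated path.

```python
def csa(x, y, z):
    """Bitwise carry-save step: (majority, parity) of three bit vectors."""
    return (x & y) | (x & z) | (y & z), x ^ y ^ z


def cbsa(X, Y, Z):
    """One CBSA unit on (unsigned, redundant) pairs; returns (C, S)."""
    c0, s0 = csa(X[0], Y[0], Z[0])
    c1, s1 = csa(X[1], Y[1], Z[1])
    cancel = s1 & ~s0                      # positions where t_i = 0
    # Assumption: the carry word is shifted up one position between stages.
    return ((c0 & ~cancel) << 1, (c1 & ~cancel) << 1), (s0 & ~cancel, s1 & ~cancel)


def from_int(v):
    """Encode an integer; every non-zero digit takes the sign of v."""
    m = abs(v)
    return (m, m if v < 0 else 0)


def to_int(pair):
    """Integer value of an (unsigned, redundant) pair."""
    unsigned, redundant = pair
    return (unsigned & ~redundant) - (unsigned & redundant)


def four_to_two(T1, T2, X1, X2):
    """Two cascaded CBSA stages reducing four operands to two."""
    C, S = cbsa(T1, T2, X1)                # first stage
    return cbsa(C, S, X2)                  # second stage


# Precompute (X1, X2) = CBSA(a_i*B1, a_i*B2, q_i*N) outside the repeated path.
X1, X2 = cbsa(from_int(23), from_int(-12), from_int(13))
T1_next, T2_next = four_to_two(from_int(9), from_int(-4), X1, X2)
```

With the same sample operands as a five-operand sum, the two output words again carry the total 9 - 4 + 23 - 12 + 13 = 29, but only two CBSA stages sit between the input and output registers.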
Compared with the 5-input, 2-output adder, the 4-input, 2-output adder uses one fewer CBSA adder logic unit, so each operation takes less time.
Previously, all parameters in the computation of (T + a_i·B + q_i·N)/2 could only take non-negative integer values. With the 5-input, 2-output adder or the 4-input, 2-output adder provided by the present invention, all of these parameters can be arbitrary integers, which enhances the adaptability and security of the operation. The 4-input, 2-output adder or the 5-input, 2-output adder can complete two to three CBSA additions in one clock cycle, saving at least half the time compared with simply calling a CBSA adder two to three times in succession. When hardware resources are relatively abundant, use of the 4-input, 2-output adder or the 5-input, 2-output adder can be considered.

Claims (4)

1. A CBSA hardware adder for parallel computation without distinction between addition and subtraction, characterized in that it consists of at least 64 parallel unit adder modules for single-bit logic computation, wherein each bit unit adder module includes the following circuit structure:
three unsigned number registers whose input bits are x`_i, y`_i, z`_i respectively,
three redundant digit registers whose input bits are x~_i, y~_i, z~_i respectively,
logical block-1, which is connected to the three unsigned number registers x`_i, y`_i, z`_i, performs the carry logic operation, and whose output information is c0_i = (x`_i ∧ y`_i) ∨ (x`_i ∧ z`_i) ∨ (y`_i ∧ z`_i),
logical block-2, which is connected to the three unsigned number registers x`_i, y`_i, z`_i, performs the XOR logic operation, and whose output information is s0_i = x`_i ⊕ y`_i ⊕ z`_i,
logical block-3, which is connected to the three redundant digit registers x~_i, y~_i, z~_i, performs the XOR logic operation, and whose output information is s1_i = x~_i ⊕ y~_i ⊕ z~_i,
logical block-4, which is connected to the three redundant digit registers x~_i, y~_i, z~_i, performs the carry logic operation, and whose output information is c1_i = (x~_i ∧ y~_i) ∨ (x~_i ∧ z~_i) ∨ (y~_i ∧ z~_i),
logical block-5, which is connected to logical block-2 and logical block-3, performs the logic operation (~(s1_i ∧ (~s0_i))) on the inputs s0_i and s1_i, and whose output information is t_i = (~(s1_i ∧ (~s0_i))),
logical AND gate-1, which is connected to logical block-1 and logical block-5 and ANDs the inputs c0_i and t_i to obtain the information c`_i = c0_i ∧ t_i,
logical AND gate-2, which is connected to logical block-2 and logical block-5 and ANDs the inputs s0_i and t_i to obtain the information s`_i = s0_i ∧ t_i,
logical AND gate-3, which is connected to logical block-3 and logical block-5 and ANDs the inputs s1_i and t_i to obtain the information s~_i = s1_i ∧ t_i,
logical AND gate-4, which is connected to logical block-4 and logical block-5 and ANDs the inputs c1_i and t_i to obtain the information c~_i = c1_i ∧ t_i,
a register, connected to logical AND gate-1, whose output bit is c`_i,
a register, connected to logical AND gate-2, whose output bit is s`_i,
a register, connected to logical AND gate-3, whose output bit is s~_i,
a register, connected to logical AND gate-4, whose output bit is c~_i;
said x`_i, y`_i, z`_i being the i-th bits of the unsigned numbers X` = (x`_{n-1} ... x`_1 x`_0), Y` = (y`_{n-1} ... y`_1 y`_0), Z` = (z`_{n-1} ... z`_1 z`_0) of arbitrary binary integers X = (±x_{n-1} ... ±x_1 ±x_0), Y = (±y_{n-1} ... ±y_1 ±y_0), Z = (±z_{n-1} ... ±z_1 ±z_0), where x`_i ∈ {0,1}, y`_i ∈ {0,1}, z`_i ∈ {0,1};
said x~_i, y~_i, z~_i being the i-th bits of the redundant digits X~ = (x~_{n-1} ... x~_1 x~_0), Y~ = (y~_{n-1} ... y~_1 y~_0), Z~ = (z~_{n-1} ... z~_1 z~_0) of the arbitrary binary integers X = (±x_{n-1} ... ±x_1 ±x_0), Y = (±y_{n-1} ... ±y_1 ±y_0), Z = (±z_{n-1} ... ±z_1 ±z_0), where x~_i ∈ {0,1}, y~_i ∈ {0,1}, z~_i ∈ {0,1};
n being any positive integer greater than 64;
said operator '∧' denoting the bitwise logical AND operation, the operator '∨' the bitwise logical OR operation, the operator '⊕' the bitwise logical XOR operation, and the operator '~' the bitwise logical NOT operation.
2. A four-input, two-output adder, constituted using the CBSA hardware adder of claim 1, that realizes the modular multiplication computing unit (T + a_i·B + q_i·N) = (T1 + T2 + X1 + X2), comprising:
four output registers storing the data T1, T2, X1, X2 respectively,
a first-stage CBSA hardware adder connected to the output registers T1, T2, X1 respectively,
a second-stage CBSA hardware adder connected to the two output terminals of the first-stage CBSA hardware adder and to the X2 output register respectively,
an output register unit (T1, T2), storing the data T1 and the data T2, connected to the two output terminals of the second-stage CBSA hardware adder,
and a clk system clock signal and an rst adder reset signal controlling the output register unit (T1, T2),
where (T + a_i·B + q_i·N) is the computing unit that the modular multiplication process needs to call repeatedly, T is the output data of the modular multiplication, B is the input data of the modular multiplication, N is the fixed modulus of the modular multiplication, a_i is a bit value of the modular multiplication input data A, and q_i is the least significant bit value of the modular multiplication output data T.
3. A five-input, two-output adder, constituted using the CBSA hardware adder of claim 1, that realizes the modular multiplication computing unit (T1 + T2 + a_i·B1 + a_i·B2 + q_i·N), comprising:
five output registers storing the data T1, T2, a_i·B1, a_i·B2, q_i·N respectively,
a first-stage CBSA hardware adder connected to the output registers T1, T2, a_i·B1 respectively,
a second-stage CBSA hardware adder connected to the two output terminals of the first-stage CBSA hardware adder and to the a_i·B2 output register respectively,
a third-stage CBSA hardware adder connected to the two output terminals of the second-stage CBSA hardware adder and to the q_i·N output register respectively,
an output register unit (T1, T2), storing the data T1 and the data T2, connected to the two output terminals of the third-stage CBSA hardware adder,
and a clk system clock signal and an rst adder reset signal controlling the output register unit (T1, T2).
4. A design method for realizing the CBSA hardware adder of claim 1, having the following steps:
The first step: determine the design objective of the CBSA adder, which is to compute CBSA(X, Y, Z) = (C, S) such that C + S = X + Y + Z. To this end:
1. Arbitrary integers X, Y, Z are expressed in binary as X = (±x_{n-1} ... ±x_1 ±x_0), Y = (±y_{n-1} ... ±y_1 ±y_0), Z = (±z_{n-1} ... ±z_1 ±z_0), where x_i ∈ {0,1}, y_i ∈ {0,1}, z_i ∈ {0,1}, so that X = Σ_{i=0,...,n-1} (±x_i·2^i), Y = Σ_{i=0,...,n-1} (±y_i·2^i), Z = Σ_{i=0,...,n-1} (±z_i·2^i);
2. For any three input binary numbers X = (±x_{n-1} ... ±x_1 ±x_0), Y = (±y_{n-1} ... ±y_1 ±y_0) and Z = (±z_{n-1} ... ±z_1 ±z_0), the output (C, S) after the CBSA computation is likewise a pair of binary numbers C = (±c_{n-1} ... ±c_1 ±c_0) and S = (±s_{n-1} ... ±s_1 ±s_0);
The second step: define the unsigned number representation and the redundant digit representation of a binary number.
1) Unsigned number representation of a binary number
For any binary integer X = (±x_{n-1} ... ±x_1 ±x_0), remove the signs in front of every digit to obtain the unsigned number representation of X, denoted X` = (x`_{n-1} ... x`_1 x`_0), where x`_i ∈ {0,1};
2) Redundant digit representation of a binary number
For any binary integer X = (±x_{n-1} ... ±x_1 ±x_0), mark each digit x_i whose sign is negative as 1 and every other digit as 0 to obtain the redundant digit representation of X, denoted X~ = (x~_{n-1} ... x~_1 x~_0), where x~_i ∈ {0,1};
The third step: perform the CBSA computation on input integers X, Y and Z of arbitrary bit length.
Denote the unsigned number representations and redundant digit representations of the input binary integers X, Y, Z of arbitrary bit length by X`, Y`, Z` and X~, Y~, Z~ respectively. The computation CBSA(X, Y, Z) = (C, S) proceeds as follows:
(1) First compute in parallel (C0, S0) = CSA(X`, Y`, Z`) and (C1, S1) = CSA(X~, Y~, Z~), where CSA is the operation of the well-known carry-save adder, obtaining:
C0 = (c0_{n-1} ... c0_1 c0_0), S0 = (s0_{n-1} ... s0_1 s0_0), c0_i, s0_i ∈ {0,1};
C1 = (c1_{n-1} ... c1_1 c1_0), S1 = (s1_{n-1} ... s1_1 s1_0), c1_i, s1_i ∈ {0,1};
(2) Denote the unsigned number representations and redundant digit representations of the result of CBSA(X, Y, Z) = (C, S) by C` and S`, and C~ and S~ respectively, where:
C` = (c`_{n-1} ... c`_1 c`_0), S` = (s`_{n-1} ... s`_1 s`_0), c`_i, s`_i ∈ {0,1}
C~ = (c~_{n-1} ... c~_1 c~_0), S~ = (s~_{n-1} ... s~_1 s~_0), c~_i, s~_i ∈ {0,1}
(3) Using each digit bit of the results C0, S0, C1, S1 obtained in computation (1), compute each digit bit of the CBSA outputs C`, S`, C~, S~ as follows:
c`_i = c0_i ∧ (~(s1_i ∧ (~s0_i)))
s`_i = s0_i ∧ (~(s1_i ∧ (~s0_i)))
s~_i = s1_i ∧ (~(s1_i ∧ (~s0_i)))
c~_i = c1_i ∧ (~(s1_i ∧ (~s0_i)))
where the operator '~' denotes bitwise logical NOT, i.e. 1 = ~0, 0 = ~1; '∧' denotes bitwise logical AND; n is any positive integer greater than 64;
The fourth step: from the input bits x`_i, y`_i, z`_i, x~_i, y~_i, z~_i of X`, Y`, Z`, X~, Y~, Z~ in the third step and the output bits c0_i, s0_i, c1_i, s1_i of the third step (1), give the following simple logic operations:
c0_i = (x`_i ∧ y`_i) ∨ (x`_i ∧ z`_i) ∨ (y`_i ∧ z`_i)
s0_i = x`_i ⊕ y`_i ⊕ z`_i
c1_i = (x~_i ∧ y~_i) ∨ (x~_i ∧ z~_i) ∨ (y~_i ∧ z~_i)
s1_i = x~_i ⊕ y~_i ⊕ z~_i
t_i = (~(s1_i ∧ (~s0_i)))
Here '∨' denotes bitwise logical OR and '⊕' denotes bitwise logical XOR. Combined with the four logic operations given in the third step (3):
c`_i = c0_i ∧ t_i
s`_i = s0_i ∧ t_i
s~_i = s1_i ∧ t_i
c~_i = c1_i ∧ t_i
the CBSA hardware adder is designed as follows:
1. according to the input data of calculating
Figure FA20177880200810046004801C00051
The design relevant register;
2. according to the input data Carry out
Figure FA20177880200810046004801C00053
Logical operation, output data c0 i, design is called the simple logic circuit structure of logical block-1, and sets up
Figure FA20177880200810046004801C00054
The annexation of register and logical block-1;
3. according to the input data
Figure FA20177880200810046004801C00055
Carry out
Figure FA20177880200810046004801C00056
Logical operation, output data s0 i, design is called the simple logic circuit structure of logical block-2, and sets up
Figure FA20177880200810046004801C00057
The annexation of register and logical block-2;
4. according to the input data
Figure FA20177880200810046004801C00058
Carry out Logical operation, output data s1 i, design is called the simple logic circuit structure of logical block-3, and sets up
Figure FA20177880200810046004801C000510
The annexation of register and logical block-3;
5. according to the input data
Figure FA20177880200810046004801C000511
Carry out
Figure FA20177880200810046004801C000512
Logical operation, output data c1 i, design is called the simple logic circuit structure of logical block-4, and sets up
Figure FA20177880200810046004801C000513
The annexation of register and logical block-4;
6. according to input data s0 iAnd s1 i, carry out (~(s1 i∧ (~s0 i))) logical operation, output data t i, design is called the simple logic circuit structure of logical block-5, and set up logical block-5 respectively with the annexation of logical block-2 and logical block-3;
7. according to input data c0 iAnd t i, carry out c0 i∧ (~(s1 i∧ (~s0 i))) logical operation, output data
Figure FA20177880200810046004801C000514
Design is called the simple logic circuit structure of logical AND gate-1, and set up logical AND gate-1 respectively with the annexation of logical block-1 and logical block-5;
8. according to input data s0 iAnd t i, carry out s0 i∧ (~(s1 i∧ (~s0 i))) logical operation, output data
Figure FA20177880200810046004801C000515
Design is called the simple logic circuit structure of logical AND gate-2, and set up logical AND gate-2 respectively with the annexation of logical block-2 and logical block-5;
9. According to the input data s1_i and t_i, perform the logical operation s1_i ∧ (~(s1_i ∧ (~s0_i))), outputting data [Figure FA20177880200810046004801C000516]; design a simple logic circuit structure, called logical AND gate-3, and establish its connections to logical block-3 and logical block-5;
10. According to the input data c1_i and t_i, perform the logical operation c1_i ∧ (~(s1_i ∧ (~s0_i))), outputting data [Figure FA20177880200810046004801C000517]; design a simple logic circuit structure, called logical AND gate-4, and establish its connections to logical block-4 and logical block-5;
11. According to the output data [Figure FA20177880200810046004801C00061] of logical AND gate-1, the output data [Figure FA20177880200810046004801C00062] of logical AND gate-2, the output data [Figure FA20177880200810046004801C00063] of logical AND gate-3, and the output data [Figure FA20177880200810046004801C00064] of logical AND gate-4, set up output bit registers that store these data respectively. Completing these steps yields the CBSA hardware adder that realizes addition and subtraction non-difference parallel computation.
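The gating described in steps 6 to 10 can be sketched in software for a single bit position i. This is only an illustrative model, not the patented circuit: the function names `mask_outputs` and `truth_table` are hypothetical, each signal is assumed to be a single bit held in a Python integer (0 or 1), and the four inputs c0_i, s0_i, s1_i, c1_i are taken as given because the formulas that produce them in steps 2 to 5 are not reproduced in the text.

```python
from itertools import product


def mask_outputs(c0, s0, s1, c1):
    """Apply steps 6 to 10 to one bit position i (all arguments are 0 or 1)."""
    # Step 6 (logical block-5): t = ~(s1 AND (~s0)), kept to a single bit.
    t = 1 - (s1 & (1 - s0))
    # Steps 7 to 10 (logical AND gate-1 to gate-4): AND each signal with t.
    return (c0 & t, s0 & t, s1 & t, c1 & t)


def truth_table():
    """Enumerate all 16 input combinations with their gated outputs."""
    return [(bits, mask_outputs(*bits)) for bits in product((0, 1), repeat=4)]


if __name__ == "__main__":
    for bits, outs in truth_table():
        print(bits, "->", outs)
```

Under these assumptions, t_i is 0 only when s1_i = 1 and s0_i = 0, and in that case all four gated outputs are forced to 0. By Boolean absorption, the expression in step 8 reduces to s0_i and the expression in step 9 reduces to s0_i ∧ s1_i.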

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810046004 CN101349967B (en) 2008-09-08 2008-09-08 CBSA hardware adder of addition and subtraction non-difference paralleling calculation and design method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200810046004 CN101349967B (en) 2008-09-08 2008-09-08 CBSA hardware adder of addition and subtraction non-difference paralleling calculation and design method thereof

Publications (2)

Publication Number Publication Date
CN101349967A (en) 2009-01-21
CN101349967B (en) 2010-06-02

Family

ID=40268775

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810046004 Expired - Fee Related CN101349967B (en) 2008-09-08 2008-09-08 CBSA hardware adder of addition and subtraction non-difference paralleling calculation and design method thereof

Country Status (1)

Country Link
CN (1) CN101349967B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102866875B (en) * 2012-10-05 2016-03-02 刘杰 Multioperand adder
CN113010144B (en) * 2021-03-05 2022-02-11 唐山恒鼎科技有限公司 1bit plus-minus device
CN113010145B (en) * 2021-03-22 2024-02-20 香港中文大学(深圳) Digital operation component, digital calculator and electronic equipment

Also Published As

Publication number Publication date
CN101349967A (en) 2009-01-21

Similar Documents

Publication Publication Date Title
Morain et al. Speeding up the computations on an elliptic curve using addition-subtraction chains
Chow et al. A Karatsuba-based Montgomery multiplier
Shieh et al. Word-based Montgomery modular multiplication algorithm for low-latency scalable architectures
CN106951211B (en) A kind of restructural fixed and floating general purpose multipliers
CN103761068B (en) Optimized Montgomery modular multiplication hardware
CN105335127A (en) Scalar operation unit structure supporting floating-point division method in GPDSP
Thapliyal et al. Design and analysis of a novel parallel square and cube architecture based on ancient Indian Vedic mathematics
Zhang et al. High-radix design of a scalable montgomery modular multiplier with low latency
CN101349967B (en) CBSA hardware adder of addition and subtraction non-difference paralleling calculation and design method thereof
US7958180B2 (en) Multiplier engine
Basha et al. Design and Implementation of Radix-4 Based High Speed Multiplier for ALU's Using Minimal Partial Products
Li et al. Research in fast modular exponentiation algorithm based on FPGA
Rashidi et al. High-speed hardware implementations of point multiplication for binary Edwards and generalized Hessian curves
Rashidi et al. Full‐custom hardware implementation of point multiplication on binary edwards curves for application‐specific integrated circuit elliptic curve cryptosystem applications
Mohan et al. Evaluation of Mixed-Radix Digit Computation Techniques for the Three Moduli RNS {2^n − 1, 2^n, 2^(n+1) − 1}
Elango et al. High-Performance Multi-RNS-Assisted Concurrent RSA Cryptosystem Architectures
Kadu et al. Hardware implementation of efficient elliptic curve scalar multiplication using vedic multiplier
Putra et al. Optimized hardware algorithm for integer cube root calculation and its efficient architecture
Patil et al. Performance analysis of multiplication operation based on vedic mathematics
Singh et al. Energy Efficient Vedic Multiplier
Harika et al. Analysis of different multiplication algorithms & FPGA implementation
Harikrishna et al. A LOW POWER BINARY SQUARE ROOTER USING REVERSIBLE LOGIC
Arunachalamani et al. High Radix Design for Montgomery Multiplier in FPGA platform
Bedoui et al. An improvement of both security and reliability for elliptic curve scalar multiplication Montgomery algorithm
Sakthivel et al. Performance Comparison of Radix-2 and Radix-4 by Booth Multiplier

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100602

Termination date: 20170908