Embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
Dual-path parallel addition and subtraction: two's-complement conversion and shifting are mutually exclusive, and of the mantissa alignment shift and the normalization shift only one needs a full-width shifter. A near path (Close path) and a far path (Far path) are therefore provided. The Close path handles subtraction in which the exponent difference is less than or equal to 1, and the Far path handles addition as well as subtraction in which the exponent difference is greater than 1.
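The path-selection rule just stated can be sketched as a small behavioral model. This is an illustration only: the function name `select_path` and the flag `is_subtraction` are hypothetical, and the sketch assumes the flag already reflects whether a subtraction is performed on the mantissas.

```python
def select_path(is_subtraction: bool, exp_a: int, exp_b: int) -> str:
    """Return "close" or "far" following the dual-path rule in the text.

    is_subtraction is assumed (not stated in the source) to describe the
    operation actually applied to the mantissas.
    """
    exp_diff = abs(exp_a - exp_b)
    # Close path: subtraction with exponent difference <= 1.
    if is_subtraction and exp_diff <= 1:
        return "close"
    # Far path: every addition, and subtraction with exponent difference > 1.
    return "far"
```

Because each path handles only one kind of large shift, a hardware implementation can run both paths side by side and pick the valid result at the end.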
Leading-one detection counts the leading zeros of its input; this count is exactly the number of bit positions of the normalization shift. It comprises three parts: precoding, leading-one detection and error correction. Precoding encodes the mantissas of the two operands to be calculated into a string of 0s and 1s whose leading 1 is at the same position as the leading 1 of the mantissa result (an error of one position is possible). Leading-one detection binary-encodes the position of the leading 1 in the precoded string to obtain the normalization shift amount. Error correction removes the one-position error that precoding may introduce.
The rounding operation makes only a small adjustment to the computed result. A compound adder computes all possible results simultaneously, so the rounding step reduces to selecting one of them, which substantially increases speed. Parallel rounding moves the rounding operation, normally performed last, forward so that it runs in parallel with the mantissa addition or subtraction. This increases parallelism and shortens the add/subtract path.
Pipelining is widely used and has been shown to deliver high speed and high throughput. In addition, operations related to floating-point addition and subtraction, such as floating-point/fixed-point conversion and floating-point comparison, can largely reuse the floating-point add/subtract data path, which keeps the circuit structure simple and highly reusable and reduces area. The focus of the present invention is therefore the overall optimized design of the data path for floating-point addition, subtraction and their related operations, using a combination of these techniques and a reusable configuration.
In view of this, the present invention proposes a high-speed floating-point arithmetic unit with a multi-stage pipeline structure, which uses the parallel operation of the pipeline stages to perform addition and subtraction of two floating-point operands and the related operations. The related operations comprise float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, and comparison operations (equal, not equal, greater than, greater than or equal, less than, less than or equal). The number of pipeline stages can be chosen according to the needs of the application.
Fig. 1 is a structural diagram of a high-speed floating-point arithmetic unit 100 with a multi-stage (N-stage) pipeline structure according to an embodiment of the present invention. The unit comprises an input D-type flip-flop (DFF) 101, an operand information extraction and flag determination unit 115, N stages of floating-point operation structure, N stages of DFF, and a data selection unit 116, where N is a natural number greater than 1. The input DFF 101 is connected to the first-stage floating-point operation structure through the operand information extraction and flag determination unit 115. The first-stage floating-point operation structure is connected to the first-stage DFF, which is connected to the second-stage floating-point operation structure; the second-stage floating-point operation structure is connected to the second-stage DFF, which is connected to the third-stage floating-point operation structure; and so on, until the (N-1)-th stage floating-point operation structure is connected to the (N-1)-th stage DFF, which is connected to the N-th stage floating-point operation structure. Finally, the N-th stage floating-point operation structure is connected to the N-th stage DFF through the data selection unit 116.
Each of the N stages of floating-point operation structure comprises an exponent processing unit, a Close path mantissa processing unit and a Far path mantissa processing unit for that stage; the Close path mantissa processing unit and the Far path mantissa processing unit are both connected to the exponent processing unit of the same stage.
The input DFF 101 receives and registers the operation type, the rounding type, the first operand and the second operand supplied from outside. The operation type indicates which operation is to be performed: floating-point addition, subtraction, float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, or a comparison (equal, not equal, greater than, greater than or equal, less than, less than or equal). The rounding type indicates truncation rounding or round-to-nearest. The input DFF 101 outputs the operation type to the first-stage exponent processing unit 106 and the first-stage DFF 102, outputs the rounding type to the first-stage Close path mantissa processing unit 109 and the first-stage DFF 102, and passes the first and second operands to the operand information extraction and flag determination unit 115.
The operand information extraction and flag determination unit 115 extracts the sign bit, exponent bits and mantissa bits of the first and second operands, and computes flag data for the two operands such as the zero flag, the infinity flag and the not-a-number (NaN) flag. It outputs the exponent bits of the two operands to the first-stage exponent processing unit 106, outputs the mantissa bits of the two operands to the first-stage Close path mantissa processing unit 109 and the first-stage Far path mantissa processing unit 112, and outputs the sign bits and flag data to the first-stage DFF 102 to be registered for use by the next pipeline stage.
The first-stage exponent processing unit 106 performs the exponent calculation for floating-point addition, subtraction and the related operations according to the exponent bits of the first and second operands, the operation type, and the data provided by the first-stage Close path mantissa processing unit 109 and the first-stage Far path mantissa processing unit 112, and registers the exponent result in the first-stage DFF 102.
The first-stage Close path mantissa processing unit 109 performs the Close path mantissa calculation according to the mantissa bits of the first and second operands and the data provided by the first-stage exponent processing unit 106, and registers the mantissa result in the first-stage DFF 102.
The first-stage Far path mantissa processing unit 112 performs the Far path mantissa calculation according to the mantissa bits of the first and second operands and the data provided by the first-stage exponent processing unit 106, and registers the mantissa result in the first-stage DFF 102.
The first-stage DFF 102 registers the intermediate results of the first-stage exponent processing unit 106, the first-stage Close path mantissa processing unit 109 and the first-stage Far path mantissa processing unit 112, together with the sign bits and flag data provided by the operand information extraction and flag determination unit 115 and the operation type and rounding type held by the input DFF 101, for use by the exponent processing unit, Close path mantissa processing unit and Far path mantissa processing unit of the next stage.
Like the first-stage exponent processing unit 106, the second-stage exponent processing unit 107 performs the exponent calculation for floating-point addition, subtraction and the related operations according to the exponent bits of the first and second operands, the operation type, and the data provided by the second-stage Close path mantissa processing unit 110 and the second-stage Far path mantissa processing unit 113, and registers the exponent result in the second-stage DFF 103. The N-th stage exponent processing unit 108 performs the same kind of exponent calculation according to the exponent bits of the first and second operands, the operation type, and the data provided by the N-th stage Close path mantissa processing unit and the N-th stage Far path mantissa processing unit, and outputs the exponent result to the data selection unit 116. The exponent calculation for floating-point addition, subtraction and the related operations specifically comprises: calculating the difference between the two operand exponents, determining exponent overflow and underflow, and generating the Close path and Far path exponents. It also supplies the data required by the Close path and Far path mantissa processing of each stage, such as the exponent difference.
Like the first-stage Close path mantissa processing unit 109, the second-stage Close path mantissa processing unit 110 performs the Close path mantissa calculation according to the mantissa bits of the first and second operands and the data provided by the second-stage exponent processing unit 107, and registers the Close path mantissa result in the second-stage DFF 103. The N-th stage Close path mantissa processing unit 111 performs the Close path mantissa calculation according to the mantissa bits of the first and second operands and the data provided by the N-th stage exponent processing unit 108, and outputs the Close path mantissa result to the data selection unit 116. The Close path mantissa calculation specifically comprises: mantissa processing for subtraction with an exponent difference less than or equal to 1, leading-one detection on the result, and parallel rounding of the Close path result. It also supplies the data required by the exponent processing of each stage, such as the shift amount determined by leading-one detection, which is used to adjust the Close path exponent result.
Like the first-stage Far path mantissa processing unit 112, the second-stage Far path mantissa processing unit 113 performs the Far path mantissa calculation according to the mantissa bits of the first and second operands and the data provided by the second-stage exponent processing unit 107, and registers the Far path mantissa result in the second-stage DFF 103. The N-th stage Far path mantissa processing unit 114 performs the Far path mantissa calculation according to the mantissa bits of the first and second operands and the data provided by the N-th stage exponent processing unit 108, and outputs the Far path mantissa result to the data selection unit 116. The Far path mantissa calculation specifically comprises: mantissa processing for subtraction with an exponent difference greater than 1, mantissa processing for addition, parallel rounding of the Far path result, and the additional processing structures required by the mantissa calculations of the operations related to addition and subtraction. It also supplies the data required by the exponent processing of each stage, such as the shift amount of the result after parallel rounding, which is used to adjust the Far path exponent result.
The data selection unit 116 obtains the final result from the various flag data provided by the operand information extraction and flag determination unit 115 and passed down the pipeline, the Close path mantissa result provided by the N-th stage Close path mantissa processing unit 111, and the Far path mantissa result provided by the N-th stage Far path mantissa processing unit 114, and passes the final result to the N-th stage DFF to be registered.
The first-stage DFF 102, the second-stage DFF 103 and the (N-1)-th stage DFF 104 register the intermediate results of the exponent processing, Close path mantissa processing and Far path mantissa processing of their respective stages, for use by the exponent processing, Close path mantissa processing and Far path mantissa processing of the next stage.
In addition, the ellipsis between the second-stage DFF 103 and the (N-1)-th stage DFF 104 indicates that the pipeline can be partitioned and designed with as many stages as required.
Fig. 2 is a structural diagram of a high-speed floating-point arithmetic unit 200 with a two-stage pipeline structure according to an embodiment of the present invention. In this embodiment, the unit 200 uses the parallel operation of the two pipeline stages to perform addition and subtraction of two floating-point operands, float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root and comparison operations (equal, not equal, greater than, greater than or equal, less than, less than or equal). As shown in Fig. 2, the unit 200 comprises an input DFF 201, an operand information extraction and flag determination unit 202, a first-stage exponent processing unit 203, a first-stage Close path mantissa processing unit 204, a first-stage Far path mantissa processing unit 205, a first-stage DFF 206, a second-stage exponent processing unit 236, a second-stage Close path mantissa processing unit 237, a second-stage Far path mantissa processing unit 238, a data selection 234 and a second-stage DFF 235.
The input DFF 201 receives and registers the operation type, the rounding type, the first operand and the second operand supplied from outside. The operation type indicates which operation is to be performed: floating-point addition, subtraction, float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, or a comparison (equal, not equal, greater than, greater than or equal, less than, less than or equal). The rounding type indicates truncation rounding or round-to-nearest. The first operand and the second operand are each 32-bit data. The input DFF 201 outputs the operation type to the first-stage exponent processing unit 203 and the first-stage DFF 206, outputs the rounding type to the first-stage Close path mantissa processing unit 204 and the first-stage DFF 206, and passes the first and second operands to the operand information extraction and flag determination unit 202.
The operand information extraction and flag determination unit 202 extracts the sign bit, exponent and mantissa data of the first and second operands, and computes the zero flag, infinity flag and not-a-number (NaN) flag of the two operands. It outputs the exponent data of the two operands to the first-stage exponent processing unit 203, outputs the mantissa data of the two operands to the first-stage Close path mantissa processing unit 204 and the first-stage Far path mantissa processing unit 205, and outputs the sign bit information and flag data to the first-stage DFF 206 to be registered for use by the next pipeline stage.
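The field extraction and flag computation described for unit 202 can be illustrated with a short behavioral sketch. It assumes the 32-bit operands use the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits); the function names, the dictionary keys, and the choice to attach a hidden leading 1 only to normal numbers are assumptions made for illustration, not details taken from the text.

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the 32-bit single-precision encoding of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def extract_fields(bits: int) -> dict:
    """Split a 32-bit word into sign/exponent/mantissa and compute flags."""
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    # Assumption: normal numbers get the hidden leading 1, giving the
    # 24 significant mantissa bits mentioned for single precision.
    if 0 < exponent < 0xFF:
        significand = fraction | (1 << 23)
    else:
        significand = fraction
    return {
        "sign": sign,
        "exponent": exponent,
        "significand": significand,
        "is_zero": exponent == 0 and fraction == 0,
        "is_inf": exponent == 0xFF and fraction == 0,
        "is_nan": exponent == 0xFF and fraction != 0,
    }
```

For example, `extract_fields(float_to_bits(1.5))` yields exponent 127 and significand `0xC00000`.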
The first-stage DFF 206 registers the intermediate results of the first-stage exponent processing unit 203, the first-stage Close path mantissa processing unit 204 and the first-stage Far path mantissa processing unit 205, together with the sign bits and flag data provided by the operand information extraction and flag determination unit 202 and the operation type and rounding type held by the input DFF 201, and outputs them to the second pipeline stage.
The data selection unit 234 uses the operation type, sign bits and flag data provided by the first-stage DFF to analyze the results from the first-stage exponent processing unit 203, the first-stage Close path mantissa processing unit 204 and the first-stage Far path mantissa processing unit 205. It obtains the results, under both normal and exceptional conditions, of floating-point addition, subtraction, float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, and the equal, not-equal, greater-than, greater-than-or-equal, less-than and less-than-or-equal comparisons, and outputs them to the second-stage DFF 235. The second-stage DFF 235 registers and outputs the correct result provided by the data selection unit 234.
The internal composition and data interactions of the first-stage exponent processing unit 203, the first-stage Close path mantissa processing unit 204, the first-stage Far path mantissa processing unit 205, the second-stage exponent processing unit 236, the second-stage Close path mantissa processing unit 237 and the second-stage Far path mantissa processing unit 238 are described in detail below.
In the first-stage exponent processing unit 203, the 8-bit compound exponent adder 207 receives the exponent data of the two operands provided by the operand information extraction and flag determination unit 202, denoted A and B, and simultaneously computes A+B and A+B+1 together with their respective most-significant-bit carry signals. The adder is reused for the operations related to floating-point addition and subtraction: for fixed-to-float and float-to-fixed conversion the inputs are 158 and the bitwise inverse of the first operand; for floating-point reciprocal the inputs are 254 and the bitwise inverse of the first operand; and for floating-point reciprocal square root the inputs are 190 and the bitwise inverse of the first operand. The results are output to the exponent generation / float-to-fixed exponent overflow determination 208. Using the operation type provided by the input DFF and the compound adder results provided by the 8-bit compound exponent adder 207, unit 208 generates the exponent difference, the larger exponent, the float-to-fixed overflow and underflow signals, and the exponents for the reciprocal and reciprocal square root operations, and outputs them to the CpFp / add-subtract decision 209 and to the first-stage DFF 206. Using the operation type provided by the input DFF 201 and the exponent difference provided by unit 208, the CpFp / add-subtract decision 209 generates the selection signal between the Close path and Far path mantissa results and the selection signal between the addition and subtraction operations.
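As an illustration of how a compound adder that yields A+B and A+B+1 can serve exponent comparison, the sketch below feeds it one exponent and the bitwise inverse of the other. The source does not spell out how unit 208 derives the exponent difference and the larger exponent from the adder outputs, so the function `exponent_compare` is an assumed, commonly used arrangement, and all names are hypothetical.

```python
def compound_add(a: int, b: int, width: int = 8):
    """Return (sum0, cout0, sum1, cout1) for A+B and A+B+1 at the given width."""
    mask = (1 << width) - 1
    s0 = a + b
    s1 = a + b + 1
    return s0 & mask, s0 >> width, s1 & mask, s1 >> width


def exponent_compare(exp_a: int, exp_b: int):
    """Return (absolute exponent difference, larger exponent) for 8-bit inputs."""
    mask = 0xFF
    # A + ~B + 1 equals A - B modulo 256; its carry-out is 1 exactly when A >= B.
    sum0, _cout0, sum1, cout1 = compound_add(exp_a, (~exp_b) & mask)
    if cout1:
        return sum1, exp_a
    # A + ~B equals 255 - (B - A) when A < B, so inverting it gives B - A.
    return (~sum0) & mask, exp_b
```

With the constant 254 and an inverted exponent e as inputs, the same adder produces 253 - e and 254 - e modulo 256, which is plain arithmetic on the inputs named in the text.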
In the first-stage Close path mantissa processing unit 204, the Close path mantissa selection 210 selects between the two mantissas provided by the operand information extraction and flag determination 202 according to the exponent comparison result provided by the exponent generation / float-to-fixed exponent overflow determination 208. The mantissa of the operand with the larger exponent is output to the 24-bit compound mantissa adder 213 as the augend. The mantissa of the operand with the smaller exponent is output to the right shift by 0 or 1 bit 211, which decides from the exponent difference provided by unit 208 whether to shift right by one bit: if the exponent difference is 1 it shifts right by one bit, otherwise it does not shift. The right shift by 0 or 1 bit 211 passes the processed data to the 24-bit compound adder 213 as the addend, to the leading-one predictor 212, and to the Close path parallel rounding 214. The 24-bit compound mantissa adder 213 receives the augend from the Close path mantissa selection 210 and the addend from the right shift by 0 or 1 bit 211, inverts the addend and performs the compound addition, obtaining A+B, A+B+1, their respective most-significant-bit carry flags and a negative flag signal, which are passed to the first-stage DFF 206 to be registered. The negative flag signal is asserted if the result is less than 0. For single-precision floating-point operation the mantissa has 24 significant bits, so the compound adder is 24 bits wide. The Close path parallel rounding 214 performs the parallel rounding of the Close path mantissa according to the most-significant-bit carry, most significant bit and least significant bit provided by the 24-bit compound mantissa adder 213, the guard bit provided by the right shift by 0 or 1 bit 211, and the rounding mode provided by the input DFF 201. It obtains the selection signal for the final rounded result and the value shifted in during post-rounding normalization, and passes them to the first-stage DFF 206 to be registered.
In round-to-nearest mode, the Close path parallel rounding 214 satisfies the following formulas:
Sel=sp1Cout &(~g|(MSBSp0 & g & LSBSp0));
ShiftIn=sp1Cout &~MSBSp0 & g;
Here, Sel is the generated rounding selection signal: A+B+1 is selected when Sel is 1 and A+B is selected when Sel is 0. ShiftIn is the data value shifted in during normalization, sp1Cout is the most-significant-bit carry of A+B+1, g is the guard bit produced by the right shift by 0 or 1 bit after inversion, MSBSp0 is the most significant bit of A+B, and LSBSp0 is the least significant bit of A+B.
In truncation rounding mode, the Close path parallel rounding 214 satisfies the following formulas:
Sel=sp1Cout&~g;
ShiftIn=sp1Cout & g &~MSBSp0;
Here, the symbols in the formulas have the same meanings as in round-to-nearest mode.
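For readers who want to experiment with the Close path rounding signals, the formulas above can be transliterated into a small software model. This is a direct, assumed one-to-one translation of the two formula sets, with every signal represented as an integer 0 or 1; the function name and the `nearest` flag are hypothetical.

```python
def close_path_round(sp1_cout: int, msb_sp0: int, lsb_sp0: int, g: int,
                     nearest: bool):
    """Return (Sel, ShiftIn) for the Close path, all signals being 0 or 1.

    nearest=True uses the round-to-nearest formulas,
    nearest=False uses the truncation formulas.
    """
    def inv(x: int) -> int:
        return x ^ 1  # single-bit NOT, the "~" of the formulas

    if nearest:
        sel = sp1_cout & (inv(g) | (msb_sp0 & g & lsb_sp0))
        shift_in = sp1_cout & inv(msb_sp0) & g
    else:
        sel = sp1_cout & inv(g)
        shift_in = sp1_cout & g & inv(msb_sp0)
    return sel, shift_in
```

A Sel of 1 picks the A+B+1 output of the compound adder and a Sel of 0 picks A+B, as stated in the symbol definitions.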
The leading-one predictor 212 generates the precoding from the mantissa of the floating-point number with the larger exponent, provided by the Close path mantissa selection 210, and the aligned mantissa of the floating-point number with the smaller exponent, provided by the right shift by 0 or 1 bit 211, and passes it to the first-stage DFF 206 to be registered. The precoding satisfies the following formula:
f_i = e_{i-1} & ((g_i & ~s_{i+1}) | (s_i & ~g_{i+1})) | ~e_{i-1} & ((s_i & ~s_{i+1}) | (g_i & ~g_{i+1}));
Here, the two mantissas are defined as A = a_0 a_1 ... a_{m-2} a_{m-1} and B = b_0 b_1 ... b_{m-2} b_{m-1}, and W = A - B is formed without carry propagation, where w_i = a_i - b_i and w_i ∈ {-1, 0, 1}. F is defined as the string of 0s and 1s produced by precoding; the position of its leading 1 is the same as the position of the leading 1 of the mantissa result (an error of one position is possible). F = f_0 f_1 ... f_{m-2} f_{m-1}, with f_i ∈ {0, 1}, and the value of the F string is generated by combining the values of the W string in the different cases. For the three possible values of w_i ∈ {-1, 0, 1}, e_i, g_i and s_i are defined as follows: e_i = 1 when w_i = 0; g_i = 1 when w_i = 1; s_i = 1 when w_i = -1.
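The precoding can be sketched in software to make the one-position property concrete. The sketch assumes that index 0 is the most significant bit, that the digit before position 0 is treated as zero (e_{-1} = 1), and that the digits after position m-1 are zero (g_m = s_m = 0); these boundary conventions are not given in the text and are assumptions of this illustration, as is the function name `precode`.

```python
def precode(a: int, b: int, m: int) -> list[int]:
    """Return the precoded string F (MSB first) for two m-bit mantissas."""
    # Digit-wise difference W = A - B without carry propagation, MSB first.
    w = [((a >> (m - 1 - i)) & 1) - ((b >> (m - 1 - i)) & 1) for i in range(m)]
    e = [int(d == 0) for d in w]
    g = [int(d == 1) for d in w]
    s = [int(d == -1) for d in w]
    f = []
    for i in range(m):
        e_prev = e[i - 1] if i > 0 else 1      # assumed boundary: e_{-1} = 1
        g_next = g[i + 1] if i + 1 < m else 0  # assumed boundary: g_m = 0
        s_next = s[i + 1] if i + 1 < m else 0  # assumed boundary: s_m = 0
        if e_prev:
            fi = (g[i] & (1 - s_next)) | (s[i] & (1 - g_next))
        else:
            fi = (s[i] & (1 - s_next)) | (g[i] & (1 - g_next))
        f.append(fi)
    return f
```

Under these assumptions, for any two different m-bit inputs the leading 1 of F sits either at the position of the leading 1 of |A - B| or one position above it, which is the one-position error that the later error compensation step corrects. For instance, `precode(0b1000, 0b0111, 4)` gives `[0, 0, 0, 1]`, matching the leading 1 of the difference 1.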
In the first-stage Far path mantissa processing unit 205, the Far path mantissa selection 215 selects between the two mantissas provided by the operand information extraction and flag determination 202 according to the exponent comparison result provided by the exponent generation / float-to-fixed exponent overflow determination 208. The mantissa of the operand with the larger exponent is output as the augend to the first-stage DFF 206 to be registered. The mantissa of the operand with the smaller exponent is output to the 32-bit right shifter 216, which performs the right alignment shift according to the exponent difference provided by unit 208 and outputs the processed data to the first-stage DFF 206 to be registered. The right shifter is 32 bits wide so that it can also perform the shift required by float-to-fixed conversion, which achieves reuse. The 32-bit fixed-point left shift 217 is used for fixed-to-float conversion: it shifts according to the first-operand information provided by the operand information extraction and flag determination 202, obtains the shifted result, the shift amount and a flag indicating whether the operand is zero, and passes them to the first-stage DFF 206 to be registered. The RECIP/RSQRT mantissa processing 218 is used for the reciprocal and reciprocal square root operations: it processes the mantissa of the first operand provided by the operand information extraction and flag determination 202, performing a separate table lookup for each of the two operations with the mantissa data as the lookup index, obtains the mantissa result for the operation in question, and passes it to the first-stage DFF 206 to be registered.
In the second-stage Close path mantissa processing unit 237, the multiplexer 219 selects one of the two results output by the 24-bit compound mantissa adder 213, according to the Sel signal provided by the Close path parallel rounding 214 and held in the first-stage DFF 206, and passes it to the 24-bit left shifter 221. The leading-one detection 220 binary-encodes the precoding provided by the leading-one predictor 212 and held in the first-stage DFF 206 to obtain the position of the leading 1 in the F string, using a binary search to locate the leading 1 during encoding. The encoded result is output to the Close path exponent compound adder 230 in the second pipeline stage of the exponent path 203, to the 24-bit left shifter 221 and to the error compensation 222. The 24-bit left shifter 221 shifts the result provided by the multiplexer 219 to the left according to the leading-1 position provided by the leading-one detection 220. The bit shifted in on the first left shift is the ShiftIn value provided by the Close path parallel rounding 214; on all other shifts the bit shifted in is 0. Because leading-one prediction may be off by one position, the error compensation 222 performs error compensation according to the shifted result of the 24-bit left shifter 221, the output of the leading-one detection 220 and the ShiftIn value provided by the Close path parallel rounding 214: if the shift was one position short, one further single-bit shift is performed; otherwise no further shift is performed. This yields the final mantissa result of the Close path and a flag indicating whether an error occurred, which are output to the data selection 234.
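The binary search for the leading 1 and the one-position error compensation described above can be modeled as follows. This is a simplified sketch: it treats the input as an integer of a given width, returns the number of left shifts needed to normalize, and leaves out the ShiftIn value (all bits shifted in are taken as 0). The function names and the return value chosen for an all-zero input are assumptions.

```python
def leading_one_shift(f: int, width: int = 24) -> int:
    """Binary-search the leading 1 of f; return the left-shift count that
    moves it to the most significant bit of a width-bit word."""
    if f == 0:
        return width  # assumption: all-zero input reports a full-width shift
    lo, hi = 0, width - 1  # the leading 1 lies at a bit index in [lo, hi]
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if f >> mid:       # some 1 exists at or above bit index mid
            lo = mid
        else:
            hi = mid - 1
    return width - 1 - lo


def normalize(mantissa: int, predicted_shift: int, width: int = 24):
    """Shift left by the predicted amount, then shift once more if the
    most significant bit is still 0. Returns (mantissa, error_flag)."""
    mask = (1 << width) - 1
    shifted = (mantissa << predicted_shift) & mask
    error = (shifted >> (width - 1)) == 0
    if error:
        shifted = (shifted << 1) & mask
    return shifted, error
```

The error flag returned by `normalize` plays the role of the flag that the text sends on to select the matching exponent result.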
In the second-stage Far path mantissa processing unit 238, the OP2 format selection 223 uses the selection signal between the Close path and Far path mantissa results and the add/subtract selection signal, both provided by the CpFp / add-subtract decision 209, to select between the shifted data provided by the 32-bit right shifter 216 and by the 32-bit fixed-point left shift 217, and extends the selected data to 32 bits. The result is output to the 32-bit compound mantissa adder 224 as the addend. At the same time, the OP2 format selection 223 computes the values of the guard bit, round bit and sticky bit for use by the Far path parallel rounding 225. The 32-bit compound mantissa adder 224 extends the augend provided by the Far path mantissa selection 215 to 32 bits and performs the compound addition with the addend provided by the OP2 format selection, obtaining A+B, A+B+1 and their respective most-significant-bit carry flags. For fixed-to-float or float-to-fixed conversion the augend is 0. The Far path parallel rounding 225 performs the parallel rounding of the Far path mantissa according to the most-significant-bit carry, most significant bit, least significant bit and second-least-significant bit provided by the 32-bit compound mantissa adder 224, the guard bit, round bit and sticky bit provided by the OP2 format selection 223, and the rounding type provided by the input DFF 201. It obtains the selection signal for the final rounded result, a flag indicating whether a right shift is needed after rounding, the value shifted in on a left shift, and, for the parallel rounding of float-to-fixed conversion, the corresponding selection signal. These signals are output to the multiplexer 226 and to the multiplexer 233, which gates the Far path exponent calculation in the second pipeline stage of the exponent processing 203. The multiplexer 226 selects among the results provided by the 32-bit compound mantissa adder 224 according to the rounding selection signals Sel and SelFloat2Fix provided by the Far path parallel rounding 225, and outputs the selection to the left shift / right shift by 1 bit / no shift 227. According to the add/subtract selection signal provided by the CpFp / add-subtract decision 209 and the add/subtract selection signal used for float-to-fixed conversion, unit 227 computes from the data provided by the multiplexer 226 the final mantissa result of the Far path and a flag indicating whether a left shift occurred, and passes them respectively to the data selection 234 and to the multiplexer 233, which gates the Far path exponent calculation in the second pipeline stage of the exponent processing 203.
In round-to-nearest mode, the Far path parallel rounding 225 satisfies the following formulas:
Sel=(OpType==`ADDER)?((~sp0Cout & g &(LSBSp0|r|s))|(sp0Cout & LSBSp0 &(LSB1Sp0|g|r|s))):(sp1Cout &((~g &~r &~s)|(g& r)|(MSBSp0 & g &(LSBSp0|s))));
ShiftIn=(OpType==`ADDER)?1'b0:(sp1Cout &~MSBSp0 & MSB1Sp0 &((~g & r & s)|(~r & g)));
RightMove=(OpType==`ADDER)?(sp0Cout|(~sp0Cout & sp1Cout & g &(r|s|LSBSp0))):1'b0;
SelFloat2Fix=(Float2FixOpType==`ADDER)?(gFloat2Fix &(LSBSp0|rFloat2Fix|sFloat2Fix)):((~gFloat2Fix &~rFloat2Fix &~sFloat2Fix)|(gFloat2Fix &(LSBSp0|sFloat2Fix|rFloat2Fix)));
Here, Sel is the generated rounding selection signal: A+B+1 is selected when Sel is 1 and A+B is selected when Sel is 0. ShiftIn is the data value shifted in on a left shift. RightMove is the flag indicating whether the result is shifted right: the result is shifted right when it is 1 and not shifted otherwise. SelFloat2Fix is the rounding selection signal for float-to-fixed conversion: A+B+1 is selected when it is 1 and A+B is selected when it is 0. OpType is the add/subtract selection signal provided by the CpFp / add-subtract decision 209, and Float2FixOpType is the add/subtract selection signal for float-to-fixed conversion. sp1Cout is the most-significant-bit carry of A+B+1, sp0Cout is the most-significant-bit carry of A+B, MSBSp0 is the most significant bit of A+B, MSB1Sp0 is the second most significant bit of A+B, LSBSp0 is the least significant bit of A+B, and LSB1Sp0 is the second least significant bit of A+B. g, r, s, gFloat2Fix, rFloat2Fix and sFloat2Fix are computed from OpType, Float2FixOpType and the guard bit, round bit and sticky bit provided by the OP2 format selection 223, and satisfy the following formulas.
g = (OpType == `ADDER) ? gb : (gb ^ (rb | sb));
r = (OpType == `ADDER) ? rb : (rb ^ sb);
s = (OpType == `ADDER) ? sb : (sb);
gFloat2Fix = (Float2FixOpType == `ADDER) ? gb : (gb ^ (rb | sb));
rFloat2Fix = (Float2FixOpType == `ADDER) ? rb : (rb ^ sb);
sFloat2Fix = (Float2FixOpType == `ADDER) ? sb : (sb);
Here, gb, rb and sb are the guard bit, round bit and sticky bit provided by the "OP2 format selection" 128.
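The round-to-nearest formulas above can be checked in software. The following is a minimal behavioral sketch in Python, not the hardware description itself: it transcribes the expressions bit for bit, assuming every signal is a single bit (0 or 1). The function names and the boolean parameter is_add (standing for OpType == `ADDER or Float2FixOpType == `ADDER) are hypothetical and introduced only for illustration.

```python
def inv(x):
    """1-bit NOT (models the Verilog '~' on a single-bit signal)."""
    return x ^ 1


def adjust_grs(is_add, gb, rb, sb):
    """Guard, round and sticky bits after the addition/subtraction adjustment."""
    if is_add:
        return gb, rb, sb
    return gb ^ (rb | sb), rb ^ sb, sb


def round_nearest_far(is_add, sp0_cout, sp1_cout, msb_sp0, msb1_sp0,
                      lsb_sp0, lsb1_sp0, g, r, s):
    """Return (Sel, ShiftIn, RightMove) for round-to-nearest mode."""
    if is_add:
        sel = ((inv(sp0_cout) & g & (lsb_sp0 | r | s))
               | (sp0_cout & lsb_sp0 & (lsb1_sp0 | g | r | s)))
        shift_in = 0
        right_move = sp0_cout | (inv(sp0_cout) & sp1_cout & g
                                 & (r | s | lsb_sp0))
    else:
        sel = sp1_cout & ((inv(g) & inv(r) & inv(s)) | (g & r)
                          | (msb_sp0 & g & (lsb_sp0 | s)))
        shift_in = (sp1_cout & inv(msb_sp0) & msb1_sp0
                    & ((inv(g) & r & s) | (inv(r) & g)))
        right_move = 0
    return sel, shift_in, right_move


def sel_float2fix_nearest(is_add, lsb_sp0, g, r, s):
    """SelFloat2Fix; g, r, s here are gFloat2Fix, rFloat2Fix, sFloat2Fix."""
    if is_add:
        return g & (lsb_sp0 | r | s)
    return (inv(g) & inv(r) & inv(s)) | (g & (lsb_sp0 | s | r))
```

For an addition without carry out, a guard bit of 1 with round and sticky bits of 0 is an exact tie: the sketch then yields Sel = 0 when LSBSp0 is 0 and Sel = 1 when LSBSp0 is 1, so the sum is incremented only when its least significant bit is odd.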
In truncation rounding mode, the far-path parallel rounding block 225 satisfies the following formulas:
Sel = (OpType == `ADDER) ? 1'b0 : (sp1Cout & (~g & ~r & ~s));
ShiftIn = (OpType == `ADDER) ? 1'b0 : (sp1Cout & ~MSBSp0 & MSB1Sp0 & g);
RightMove = (OpType == `ADDER) ? sp0Cout : 1'b0;
SelFloat2Fix = (Float2FixOpType == `ADDER) ? 1'b0 : 1'b1;
Here, the signals in the formulas have the same meaning as in round-to-nearest mode.
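The truncation-mode formulas are simpler and can be modeled the same way. The sketch below is a hedged Python transcription that assumes single-bit signals; the function names and the boolean parameter is_add (standing for OpType == `ADDER or Float2FixOpType == `ADDER) are hypothetical.

```python
def inv(x):
    """1-bit NOT (models the Verilog '~' on a single-bit signal)."""
    return x ^ 1


def round_truncate_far(is_add, sp0_cout, sp1_cout, msb_sp0, msb1_sp0,
                       g, r, s):
    """Return (Sel, ShiftIn, RightMove) for truncation rounding mode."""
    if is_add:
        # Addition never selects A+B+1; only the carry out of A+B
        # decides whether the result is shifted right.
        return 0, 0, sp0_cout
    sel = sp1_cout & inv(g) & inv(r) & inv(s)
    shift_in = sp1_cout & inv(msb_sp0) & msb1_sp0 & g
    return sel, shift_in, 0


def sel_float2fix_truncate(is_add):
    """SelFloat2Fix in truncation mode depends only on the operation type."""
    return 0 if is_add else 1
```

In this mode an addition always keeps A+B, and a subtraction selects A+B+1 only when A+B+1 produces a carry and the guard, round and sticky bits are all 0.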
In the second-stage exponent processing unit 236, the 8-bit compound exponent adder 230 receives the inverted shift count provided by the leading-one detection 220 and the larger exponent value stored in the first-stage DFF 206, performs a compound addition to obtain A+B and A+B+1, and outputs both results to multiplexer 232. Multiplexer 232 selects between the compound adder results according to the flag, provided by error compensation 222, that indicates whether an error occurred; this yields the final near-path exponent result, which is output to data selection 234. OP1 228 sets the value of the first operand of the compound adder according to the operation type information stored in the first-stage DFF 206 and passes it to the 8-bit compound exponent adder 231: for a fixed-to-float conversion the value is 158, otherwise it is the larger exponent value stored in the first-stage DFF 206. OP2 229 sets the value of the second operand of the compound adder according to the operation type information stored in the first-stage DFF 206 and the addition/subtraction select signal provided by the CpFp/addition-subtraction decision block 209, and passes it to the 8-bit compound exponent adder 231. For a fixed-to-float conversion, the value is the inverted shift count of the 32-bit fixed-point left shift 217 when the result of that shift is not 0, and 0 when the result of that shift is 0. For the other operations supported by the present invention, the value is -2 if the select signal provided by the CpFp/addition-subtraction decision block 209 selects subtraction, and 0 otherwise. The 8-bit compound exponent adder 231 receives the data provided by OP1 228 and OP2 229, performs a compound addition to obtain A+B and A+B+1, and passes both results to multiplexer 233. Multiplexer 233 selects between the sums provided by the 8-bit compound exponent adder 231 according to the left-shift flag provided by shift block 227 and the right-shift flag provided by the far-path parallel rounding block 225, obtaining the final far-path exponent result.
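The exponent arithmetic described above can be illustrated with a small model. The following Python sketch is only an illustration under stated assumptions: the compound adder is modeled as returning both A+B and A+B+1 truncated to 8 bits, and the second operand for fixed-to-float conversion is assumed to be the bitwise inverse of the left-shift count. All function and parameter names are hypothetical, and the multiplexer selection rules are not modeled.

```python
MASK8 = 0xFF  # the exponent adders are 8 bits wide


def compound_add8(a, b):
    """Return (A+B, A+B+1), each truncated to 8 bits."""
    total = (a + b) & MASK8
    return total, (total + 1) & MASK8


def far_exponent_operands(is_fix2float, larger_exp, is_sub,
                          shift_count, shifted_nonzero):
    """Model OP1 228 and OP2 229 for the far-path exponent adder 231."""
    if is_fix2float:
        op1 = 158
        # Assumption: OP2 is the inverted left-shift count when the
        # shifted fixed-point value is non-zero, and 0 otherwise.
        op2 = (~shift_count) & MASK8 if shifted_nonzero else 0
    else:
        op1 = larger_exp
        op2 = (-2) & MASK8 if is_sub else 0  # -2 as an 8-bit value
    return op1, op2


# Fixed-point 1 needs a left shift of 31 to normalize, so the
# candidate exponents are 158 - 31 - 1 and 158 - 31.
op1, op2 = far_exponent_operands(True, 0, False, 31, True)
fix2float_candidates = compound_add8(op1, op2)
```

Under these assumptions the A+B+1 candidate for the fixed-point value 1 equals 127, the biased single-precision exponent of 1.0, which is consistent with the constant 158 = 127 + 31. For a subtraction the two candidates are the larger exponent minus 2 and minus 1; for an addition they are the larger exponent and the larger exponent plus 1.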
The above describes the two-stage pipelined structure shown in Fig. 2, which implements single-precision floating-point addition and subtraction and the related operations.
Fig. 3 is a flowchart of the method implemented by the high-speed floating-point arithmetic unit of Fig. 2, which uses a two-stage pipeline structure. All operations of step 301 are performed in the first pipeline stage, and all operations of step 302 are performed in the second pipeline stage. In step 303, method 300 receives the operation type, the rounding type, the first operand and the second operand, and extracts the corresponding sign, exponent and mantissa data. In step 304, the exponent difference is computed and the exponents are compared. In step 305, according to the exponent difference information, the near path performs mantissa subtraction using a compound adder. In step 306, the near path performs leading-one detection in parallel; its result must be available before step 310. In step 307, the far path performs the alignment shift according to the exponent difference. In step 308, the near path performs two's-complement conversion and parallel rounding on the result, according to the rounding type provided by step 303. In step 309, the far path performs mantissa addition or subtraction using a compound adder. In step 310, the near path applies the normalization shift to the output of step 308 according to the leading-one detection result, obtaining the final near-path result. In step 311, the far path performs parallel rounding on its result according to the rounding type provided by step 303, obtaining the final far-path result. In step 312, according to the exponent difference information provided by step 304 and the operation type provided by step 303, the output is selected from the near-path result, the far-path result, or the exception handling result.
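Steps 303, 304 and 312 can be sketched in software as follows. This is a minimal Python illustration, not the claimed implementation: it assumes the standard IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits) and the path rule stated earlier in this description, namely that the near path handles subtraction with an exponent difference of at most 1 and the far path handles everything else. The function names are hypothetical, and exception handling is omitted.

```python
import struct


def unpack_fields(value):
    """Step 303: split a single-precision value into (sign, exponent, fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction


def select_path(is_sub, exp_a, exp_b):
    """Steps 304 and 312: choose the path from the exponent difference.

    is_sub stands for the addition/subtraction select signal;
    exception cases are not modeled here.
    """
    diff = abs(exp_a - exp_b)
    return "near" if is_sub and diff <= 1 else "far"


# 1.0 and 1.5 share the same exponent, so their subtraction takes
# the near path; 1.0 and 8.0 differ by 3, so the far path is used.
_, e_one, _ = unpack_fields(1.0)
_, e_mid, _ = unpack_fields(1.5)
_, e_big, _ = unpack_fields(8.0)
near_case = select_path(True, e_one, e_mid)
far_case = select_path(True, e_one, e_big)
```

Keeping the decision a function of only the operation type and the exponent difference reflects why the two paths can run in parallel: both are computed, and the choice is made at the output.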
As the embodiment above shows, the high-speed floating-point arithmetic unit provided by the present invention uses a dual path, leading-one detection and parallel rounding to support floating-point addition and subtraction and the related operations. Apart from the 32-bit fixed-point left shift introduced for fixed-to-float conversion, the mantissa shift-in processing introduced for the reciprocal and inverse-square-root operations, the exponent overflow check introduced for float-to-fixed conversion, and the operation-type decision logic added at the data selection output, the operations related to floating-point addition and subtraction all reuse the data path of floating-point addition and subtraction, so data-path reuse is high and the area is small. The pipelined design gives a high operating speed and a large throughput.
The specific embodiments described above further explain the object, technical solutions and beneficial effects of the present invention. It should be understood that the foregoing are only specific embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.