Detailed Description of the Embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
In the dual-path parallel algorithm for addition and subtraction, two's-complement conversion and shifting are mutually exclusive, and of the mantissa alignment shift and the normalization shift only one ever needs a full-width shifter. A near path (Close path) and a far path (Far path) are therefore provided: the near path handles subtraction in which the exponent difference is less than or equal to 1, and the far path handles addition as well as subtraction in which the exponent difference is greater than 1.
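By way of a non-limiting illustration, the path-selection rule just described can be sketched as follows. The function name `select_path` and its parameters are hypothetical, and `is_subtraction` is assumed to mean the effective operation after the operand signs have been taken into account; the source does not spell this out.

```python
def select_path(is_subtraction: bool, exp_a: int, exp_b: int) -> str:
    """Pick the datapath for one add/subtract operation.

    Assumption: is_subtraction describes the effective operation
    (after operand signs are considered); exp_a and exp_b are the
    biased exponents of the two operands.
    """
    exp_diff = abs(exp_a - exp_b)
    # Near (Close) path: subtraction with exponent difference <= 1.
    if is_subtraction and exp_diff <= 1:
        return "close"
    # Far path: every addition, and subtraction with difference > 1.
    return "far"
```

Only the near path can cancel many leading bits and therefore need a long normalization shift, while only the far path needs a long alignment shift, which is why each path gets by with a single full-width shifter.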
Leading-one detection computes the number of leading zeros of its input, and this count is exactly the normalization shift amount. It consists of three parts: pre-encoding, leading-one detection, and error correction. Pre-encoding derives a string of 0s and 1s from the mantissas of the two operands to be computed; the position of the leading 1 in this string matches the position of the leading 1 in the mantissa result, to within a 1-bit error. Leading-one detection binary-encodes the position of the leading 1 in the pre-encoded string to obtain the normalization shift amount. Error correction corrects the 1-bit error that may arise during pre-encoding.
Rounding makes a small adjustment to the computed result. A compound adder computes all possible results simultaneously, so the rounding step reduces to selecting one of them, which markedly improves speed. Parallel rounding moves the rounding operation, which would otherwise be performed last, forward so that it is computed in parallel with the mantissa addition or subtraction, increasing parallelism and shortening the add/subtract path.
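As an illustrative sketch only, a compound adder and selection-based rounding can be modelled in software as follows. The names `compound_add` and `round_by_selection` are hypothetical; the selection signal `sel` is assumed to be produced elsewhere by the parallel-rounding logic.

```python
def compound_add(a: int, b: int, width: int):
    """Return ((A+B, carry0), (A+B+1, carry1)) for width-bit operands."""
    mask = (1 << width) - 1
    s0 = a + b        # candidate result without increment
    s1 = a + b + 1    # candidate result with increment
    return (s0 & mask, s0 >> width), (s1 & mask, s1 >> width)


def round_by_selection(a: int, b: int, width: int, sel: int) -> int:
    """Rounding reduced to a choice between the two precomputed sums."""
    (sum0, _), (sum1, _) = compound_add(a, b, width)
    return sum1 if sel else sum0
```

Because both candidates are available at the same time, no second carry-propagating addition is needed after the rounding decision is known.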
Pipelining is widely used and has been shown to provide high speed and high throughput. In addition, operations related to floating-point addition and subtraction, such as floating-point/fixed-point conversion and floating-point comparison, can to a large extent reuse the floating-point add/subtract data path, which keeps the integrated circuit structurally simple and highly reusable and reduces area. The focus of the present invention is therefore to carry out floating-point addition, subtraction, and related operations through overall optimization of the data path and a reusable configuration that combines several techniques.
In view of this, the present invention proposes a high-speed floating-point arithmetic unit with a multi-stage pipeline structure. The pipeline stages operate in parallel to perform addition, subtraction, and related operations on two floating-point operands. The related operations comprise floating-point to fixed-point conversion, fixed-point to floating-point conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, and comparison operations (equal, not equal, greater than, greater than or equal, less than, less than or equal). The number of pipeline stages used can be chosen according to application requirements.
Fig. 1 is a schematic structural diagram of a high-speed floating-point arithmetic unit 100 with a multi-stage (N-stage) pipeline structure according to an embodiment of the invention. The high-speed floating-point arithmetic unit comprises an input D flip-flop (DFF) 101, an operand information extraction and flag determination unit 115, N stages of floating-point operation structures, N stages of DFFs, and a data selection unit 116, where N is a natural number greater than 1. The input DFF 101 is connected to the first-stage floating-point operation structure through the operand information extraction and flag determination unit 115. The first-stage floating-point operation structure is connected to the first-stage DFF, the first-stage DFF is connected to the second-stage floating-point operation structure, the second-stage floating-point operation structure is connected to the second-stage DFF, the second-stage DFF is connected to the third-stage floating-point operation structure, and so on. The (N-1)-th stage floating-point operation structure is connected to the (N-1)-th stage DFF, the (N-1)-th stage DFF is connected to the N-th stage floating-point operation structure, and finally the N-th stage floating-point operation structure is connected to the N-th stage DFF through the data selection unit 116.
The N stages of floating-point operation structures comprise N stages of exponent processing units, N stages of Close-path mantissa processing units, and N stages of Far-path mantissa processing units. In each stage, the Close-path mantissa processing unit and the Far-path mantissa processing unit are connected to the exponent processing unit of that stage.
The input DFF 101 receives and registers the operation type, the rounding type, the first operand, and the second operand supplied from outside. The operation type indicates which operation is to be performed: floating-point addition, subtraction, floating-point to fixed-point conversion, fixed-point to floating-point conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, or a comparison (equal, not equal, greater than, greater than or equal, less than, less than or equal). The rounding type indicates truncation or round-to-nearest. The input DFF 101 outputs the operation type to the first-stage exponent processing unit 106 and the first-stage DFF 102, outputs the rounding type to the first-stage Close-path mantissa processing unit 109 and the first-stage DFF 102, and passes the first operand and the second operand to the operand information extraction and flag determination unit 115.
The operand information extraction and flag determination unit 115 extracts the sign bit, exponent bits, and mantissa bits of the first operand and the second operand, and computes flag data for the two operands, such as the zero flag, the infinity flag, and the not-a-number (NaN) flag. It outputs the exponent data of the two operands to the first-stage exponent processing unit 106, outputs the mantissa data of the two operands to the first-stage Close-path mantissa processing unit 109 and the first-stage Far-path mantissa processing unit 112, and outputs the sign-bit data and flag data to the first-stage DFF 102 for registering, so that they are available to the next pipeline stage.
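For illustration only, the extraction and flag determination step can be sketched in software as below. The sketch assumes IEEE 754 single-precision (32-bit) operands, which is the width used in the two-stage embodiment; the function name `extract_fields` and the dictionary keys are hypothetical, and the treatment of denormal inputs is left open because the source does not describe it.

```python
import struct


def extract_fields(x: float) -> dict:
    """Split a single-precision value into sign, exponent, fraction, flags.

    Assumption: IEEE 754 binary32 layout (1 sign bit, 8 exponent bits,
    23 fraction bits). Denormals are not flagged specially here.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return {
        "sign": sign,
        "exponent": exponent,
        "fraction": fraction,
        "is_zero": exponent == 0 and fraction == 0,
        "is_inf": exponent == 0xFF and fraction == 0,
        "is_nan": exponent == 0xFF and fraction != 0,
    }
```

For example, `extract_fields(1.0)` yields a biased exponent of 127 and a zero fraction, while `extract_fields(float("inf"))` sets only the infinity flag.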
The first-stage exponent processing unit 106 performs the exponent calculation for floating-point addition, subtraction, and the related operations according to the exponent data of the first and second operands, the operation type, and the data provided by the first-stage Close-path mantissa processing unit 109 and the first-stage Far-path mantissa processing unit 112, and stores the exponent result in the first-stage DFF 102.
The first-stage Close-path mantissa processing unit 109 performs the Close-path mantissa calculation according to the mantissa data of the first and second operands and the data provided by the first-stage exponent processing unit 106, and stores the mantissa result in the first-stage DFF 102.
The first-stage Far-path mantissa processing unit 112 performs the Far-path mantissa calculation according to the mantissa data of the first and second operands and the data provided by the first-stage exponent processing unit 106, and stores the mantissa result in the first-stage DFF 102.
The first-stage DFF 102 registers the intermediate results of the first-stage exponent processing unit 106, the first-stage Close-path mantissa processing unit 109, and the first-stage Far-path mantissa processing unit 112, together with the sign-bit data and flag data provided by the operand information extraction and flag determination unit 115 and the operation type and rounding type held in the input DFF 101, for use by the exponent processing unit, Close-path mantissa processing unit, and Far-path mantissa processing unit of the next stage.
Similarly to the first-stage exponent processing unit 106, the second-stage exponent processing unit 107 performs the exponent calculation for floating-point addition, subtraction, and the related operations according to the exponent data of the first and second operands, the operation type, and the data provided by the second-stage Close-path mantissa processing unit 110 and the second-stage Far-path mantissa processing unit 113, and stores the exponent result in the second-stage DFF 103. The N-th stage exponent processing unit 108 performs the exponent calculation according to the exponent data of the first and second operands, the operation type, and the data provided by the N-th stage Close-path mantissa processing unit and the N-th stage Far-path mantissa processing unit, and outputs the exponent result to the data selection unit 116. The exponent calculation for floating-point addition, subtraction, and the related operations specifically comprises: computing the exponent difference of the two operands; determining exponent overflow and underflow; generating the Close-path and Far-path exponents; and supplying the data required by the Close-path and Far-path mantissa processing of each stage, such as the exponent difference.
Similarly to the first-stage Close-path mantissa processing unit 109, the second-stage Close-path mantissa processing unit 110 performs the Close-path mantissa calculation according to the mantissa data of the first and second operands and the data provided by the second-stage exponent processing unit 107, and stores the Close-path mantissa result in the second-stage DFF 103. The N-th stage Close-path mantissa processing unit 111 performs the Close-path mantissa calculation according to the mantissa data of the first and second operands and the data provided by the N-th stage exponent processing unit 108, and outputs the Close-path mantissa result to the data selection unit 116. The Close-path mantissa calculation specifically comprises: mantissa processing for subtraction with an exponent difference less than or equal to 1; leading-one detection on the computed result; parallel rounding of the Close-path result; and supplying the data required by the exponent processing of each stage, such as the shift amount determined by leading-one detection, which is used to adjust the Close-path exponent result.
Similarly to the first-stage Far-path mantissa processing unit 112, the second-stage Far-path mantissa processing unit 113 performs the Far-path mantissa calculation according to the mantissa data of the first and second operands and the data provided by the second-stage exponent processing unit 107, and stores the Far-path mantissa result in the second-stage DFF 103. The N-th stage Far-path mantissa processing unit 114 performs the Far-path mantissa calculation according to the mantissa data of the first and second operands and the data provided by the N-th stage exponent processing unit 108, and outputs the Far-path mantissa result to the data selection unit 116. The Far-path mantissa calculation specifically comprises: mantissa processing for subtraction with an exponent difference greater than 1; mantissa processing for addition; parallel rounding of the Far-path result; the additional processing structures required by the mantissa calculations of the operations related to addition and subtraction; and supplying the data required by the exponent processing of each stage, such as the shift amount of the result after parallel rounding, which is used to adjust the Far-path exponent result.
The data selection unit 116 obtains the final result from the various flag data passed along from the operand information extraction and flag determination unit 115, the Close-path mantissa result provided by the N-th stage Close-path mantissa processing unit 111, and the Far-path mantissa result provided by the N-th stage Far-path mantissa processing unit 114, and passes it to the N-th stage DFF for registering.
The first-stage DFF 102, the second-stage DFF 103, and the (N-1)-th stage DFF 104 register the intermediate results of the exponent processing, Close-path mantissa processing, and Far-path mantissa processing of their respective stages, for use by the exponent processing, Close-path mantissa processing, and Far-path mantissa processing of the next stage.
In addition, the ellipsis between the second-stage DFF 103 and the (N-1)-th stage DFF 104 indicates that the multi-stage pipeline can be partitioned and designed as required.
Fig. 2 is a schematic structural diagram of a high-speed floating-point arithmetic unit 200 with a two-stage pipeline structure according to an embodiment of the invention. In this embodiment, the high-speed floating-point arithmetic unit 200 uses the two pipeline stages in parallel to perform, on two floating-point operands, addition, subtraction, floating-point to fixed-point conversion, fixed-point to floating-point conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, and comparison operations (equal, not equal, greater than, greater than or equal, less than, less than or equal). As shown in Fig. 2, the high-speed floating-point arithmetic unit 200 comprises an input DFF 201, an operand information extraction and flag determination unit 202, a first-stage exponent processing unit 203, a first-stage Close-path mantissa processing unit 204, a first-stage Far-path mantissa processing unit 205, a first-stage DFF 206, a second-stage exponent processing unit 236, a second-stage Close-path mantissa processing unit 237, a second-stage Far-path mantissa processing unit 238, a data selection unit 234, and a second-stage DFF 235.
The input DFF 201 receives and registers the operation type, the rounding type, the first operand, and the second operand supplied from outside. The operation type indicates which operation is to be performed: floating-point addition, subtraction, floating-point to fixed-point conversion, fixed-point to floating-point conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, or a comparison (equal, not equal, greater than, greater than or equal, less than, less than or equal). The rounding type indicates truncation or round-to-nearest. The first operand and the second operand are each 32-bit data. The input DFF 201 outputs the operation type to the first-stage exponent processing unit 203 and the first-stage DFF 206, outputs the rounding type to the first-stage Close-path mantissa processing unit 204 and the first-stage DFF 206, and passes the first operand and the second operand to the operand information extraction and flag determination unit 202.
The operand information extraction and flag determination unit 202 extracts the sign bit, exponent, and mantissa data of the first operand and the second operand, and computes the zero flag, infinity flag, and not-a-number (NaN) flag of the two operands. It outputs the exponent data of the two operands to the first-stage exponent processing unit 203, outputs the mantissa data of the two operands to the first-stage Close-path mantissa processing unit 204 and the first-stage Far-path mantissa processing unit 205, and outputs the sign-bit information and flag data to the first-stage DFF 206 for registering, so that they are available to the next pipeline stage.
The first-stage DFF 206 registers the intermediate results of the first-stage exponent processing unit 203, the first-stage Close-path mantissa processing unit 204, and the first-stage Far-path mantissa processing unit 205, together with the sign-bit and flag data provided by the operand information extraction and flag determination unit 202 and the operation type and rounding type held in the input DFF 201, and outputs them for use by the second pipeline stage.
According to the operation type, sign bits, and flag data provided by the first-stage DFF, the data selection unit 234 analyzes the results from the first-stage exponent processing unit 203, the first-stage Close-path mantissa processing unit 204, and the first-stage Far-path mantissa processing unit 205, obtains the results of floating-point addition, subtraction, floating-point to fixed-point conversion, fixed-point to floating-point conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, and the equal, not-equal, greater-than, greater-than-or-equal, less-than, and less-than-or-equal comparisons under both normal and exceptional conditions, and outputs them to the second-stage DFF 235. The second-stage DFF 235 registers and outputs the correct result provided by the data selection unit 234.
The internal structure and data interactions of the first-stage exponent processing unit 203, the first-stage Close-path mantissa processing unit 204, the first-stage Far-path mantissa processing unit 205, the second-stage exponent processing unit 236, the second-stage Close-path mantissa processing unit 237, and the second-stage Far-path mantissa processing unit 238 are described in detail below.
In the first-stage exponent processing unit 203, the 8-bit compound exponent adder 207 receives the exponent data of the two operands provided by the operand information extraction and flag determination unit 202, denoted A and B, and simultaneously computes A+B and A+B+1 together with their respective most-significant-bit carry signals. For the operations related to floating-point addition and subtraction, the adder inputs are chosen so that the arithmetic hardware is reused: for fixed-point to floating-point and floating-point to fixed-point conversion, the inputs are 158 and the bitwise-inverted value of the first operand; for the floating-point reciprocal, the inputs are 254 and the bitwise-inverted value of the first operand; and for the floating-point reciprocal square root, the inputs are 190 and the bitwise-inverted value of the first operand. The results are output to the exponent generation / float-to-fixed exponent overflow determination 208. According to the operation type provided by the input DFF and the compound addition results provided by the 8-bit compound exponent adder 207, the exponent generation / float-to-fixed exponent overflow determination 208 generates the exponent difference, the larger exponent, the overflow and underflow signals for floating-point to fixed-point conversion, and the exponents for the floating-point reciprocal and reciprocal square root operations, and outputs them to the CpFp / add-subtract determination 209 and to the first-stage DFF 206 for registering. According to the operation type provided by the input DFF 201 and the exponent difference provided by the exponent generation / float-to-fixed exponent overflow determination 208, the CpFp / add-subtract determination 209 generates the signal that selects between the near-path and far-path mantissa results and the signal that selects between addition and subtraction.
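As a hedged illustration of why one input of the compound exponent adder is bitwise-inverted, the following sketch shows that, in 8-bit arithmetic, the A+B+1 output then equals the difference of the two values, and its carry-out indicates which value is larger. The function names are hypothetical. Reading 158 as 127 + 31, that is, the biased exponent of 2^31 for a 32-bit fixed-point format, is an interpretation and is not stated in the source.

```python
def compound_exp_add(a: int, b: int, width: int = 8):
    """Return ((A+B, carry0), (A+B+1, carry1)) truncated to width bits."""
    mask = (1 << width) - 1
    s0, s1 = a + b, a + b + 1
    return (s0 & mask, s0 >> width), (s1 & mask, s1 >> width)


def exp_subtract(x: int, y: int, width: int = 8):
    """Compute x - y (mod 2**width) by adding the bitwise inverse of y.

    Returns (difference, x_ge_y). The carry-out of the A+B+1 sum is 1
    exactly when x >= y, so the adder doubles as a comparator.
    """
    inv_y = ~y & ((1 << width) - 1)
    (_, _), (diff, carry1) = compound_exp_add(x, inv_y, width)
    return diff, bool(carry1)
```

For instance, `exp_subtract(158, 127)` returns a difference of 31, which under the interpretation above would be the right-shift distance for converting the value 1.0 to a 32-bit fixed-point number.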
In the first-stage Close-path mantissa processing unit 204, the Close-path mantissa selection 210 selects between the two mantissas provided by the operand information extraction and flag determination 202 according to the exponent comparison result provided by the exponent generation / float-to-fixed exponent overflow determination 208. The mantissa of the operand with the larger exponent is output to the 24-bit compound mantissa adder 213 as the augend, and the mantissa of the operand with the smaller exponent is output to the right shift by 0 or 1 bit 211. According to the exponent difference provided by the exponent generation / float-to-fixed exponent overflow determination 208, the right shift by 0 or 1 bit 211 decides whether to shift right by one bit: if the exponent difference is 1 it shifts right by one bit, otherwise it does not shift. It passes the processed data, as the addend, to the 24-bit compound mantissa adder 213, to the leading-one prediction 212, and to the Close-path parallel rounding 214. The 24-bit compound mantissa adder 213 receives the augend from the Close-path mantissa selection 210 and the addend from the right shift by 0 or 1 bit 211, inverts the addend, and performs a compound addition, obtaining A+B and A+B+1 together with their respective most-significant-bit carry flags and a negative-result flag, which are passed to the first-stage DFF 206 for registering. If the computed result is less than 0, the negative-result flag is asserted. For single-precision floating-point operations the mantissa has 24 significant bits, so the compound adder is 24 bits wide. The Close-path parallel rounding 214 performs parallel rounding of the Close-path mantissa according to the most-significant-bit carry, most significant bit, and least significant bit provided by the 24-bit compound mantissa adder 213, the guard bit provided by the right shift by 0 or 1 bit 211, and the rounding mode provided by the input DFF 201. It obtains the selection signal for the final rounded result and the value to be shifted in during the normalization shift after rounding, and passes them to the first-stage DFF 206 for registering.
Under round-to-nearest mode, the Close-path parallel rounding 214 satisfies the following formulas:
Sel = sp1Cout & (~g | (MSBSp0 & g & LSBSp0));
ShiftIn = sp1Cout & ~MSBSp0 & g;
Here, Sel is the rounding selection signal that is produced: when Sel is 1, A+B+1 is selected, and when Sel is 0, A+B is selected. ShiftIn is the value shifted in during normalization, sp1Cout is the most-significant-bit carry of A+B+1, g is the inverted guard bit produced by the right shift by 0 or 1 bit, MSBSp0 is the most significant bit of A+B, and LSBSp0 is the least significant bit of A+B.
Under truncation mode, the Close-path parallel rounding 214 satisfies the following formulas:
Sel = sp1Cout & ~g;
ShiftIn = sp1Cout & g & ~MSBSp0;
Here, the symbols in the formulas have the same meanings as under round-to-nearest mode.
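For illustration, the Close-path rounding formulas for the two rounding modes can be transcribed directly into single-bit Boolean functions. The function names are hypothetical; every argument is assumed to be 0 or 1, and bitwise NOT is written as `x ^ 1`.

```python
def close_round_nearest(sp1_cout: int, msb_sp0: int, lsb_sp0: int, g: int):
    """Round-to-nearest: Sel and ShiftIn for the Close path (all args 0/1)."""
    sel = sp1_cout & ((g ^ 1) | (msb_sp0 & g & lsb_sp0))
    shift_in = sp1_cout & (msb_sp0 ^ 1) & g
    return sel, shift_in


def close_round_truncate(sp1_cout: int, msb_sp0: int, g: int):
    """Truncation: Sel and ShiftIn for the Close path (all args 0/1)."""
    sel = sp1_cout & (g ^ 1)
    shift_in = sp1_cout & g & (msb_sp0 ^ 1)
    return sel, shift_in
```

In both modes the two outputs are gated by sp1Cout, so they can only be 1 when the A+B+1 sum produces a carry out of the most significant bit.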
The leading-one prediction 212 generates the pre-encoding from the mantissa of the floating-point number with the larger exponent, provided by the Close-path mantissa selection 210, and the shifted and aligned mantissa of the floating-point number with the smaller exponent, provided by the right shift by 0 or 1 bit 211, and passes it to the first-stage DFF 206 for registering. The pre-encoding satisfies the following formula:
f_i = e_{i-1} & ((g_i & ~s_{i+1}) | (s_i & ~g_{i+1})) | ~e_{i-1} & ((s_i & ~s_{i+1}) | (g_i & ~g_{i+1}));
Here, the two mantissas are defined as A = a_0 a_1 ... a_{M-2} a_{M-1} and B = b_0 b_1 ... b_{M-2} b_{M-1}. Let W = A - B (computed without borrow propagation), where w_i = a_i - b_i and w_i ∈ {-1, 0, 1}. F is defined as the string of 0s and 1s produced by the pre-encoding; the position of its leading 1 matches the position of the leading 1 of the mantissa result, to within a 1-bit error. F = f_0 f_1 ... f_{M-2} f_{M-1}, with f_i ∈ {0, 1}, and the value of the F string is determined by the combinations of values in the W string in the different cases. For the three possible values w_i ∈ {-1, 0, 1}, define e_i, g_i, and s_i as follows: when w_i = 0, e_i = 1; when w_i = 1, g_i = 1; when w_i = -1, s_i = 1.
In the first-stage Far-path mantissa processing unit 205, the Far-path mantissa selection 215 selects between the two mantissas provided by the operand information extraction and flag determination 202 according to the exponent comparison result provided by the exponent generation / float-to-fixed exponent overflow determination 208. The mantissa of the operand with the larger exponent is output, as the augend, to the first-stage DFF 206 for registering, and the mantissa of the operand with the smaller exponent is output to the 32-bit right shifter 216. The 32-bit right shifter 216 performs the alignment right shift according to the exponent difference provided by the exponent generation / float-to-fixed exponent overflow determination 208 and outputs the processed data to the first-stage DFF 206 for registering. The right shifter is 32 bits wide so that it can also perform the shift required by floating-point to fixed-point conversion, thereby achieving reuse. The 32-bit fixed-point left shift 217 is used for fixed-point to floating-point conversion. It shifts according to the first operand provided by the operand information extraction and flag determination 202, obtains the shifted result, the shift amount, and a flag indicating whether the operand is zero, and passes them to the first-stage DFF 206 for registering. The RECIP/RSQRT mantissa processing 218 is used for the reciprocal and reciprocal square root operations. It processes the mantissa of the first operand provided by the operand information extraction and flag determination 202, performing a table lookup for each of the two operations with the mantissa data as the lookup input, obtains the mantissa result for the respective operation, and passes it to the first-stage DFF 206 for registering.
In the second-stage Close-path mantissa processing unit 237, the multiplexer 219 selects one of the two results output by the 24-bit compound mantissa adder 213, according to the Sel signal provided by the Close-path parallel rounding 214 and held in the first-stage DFF 206, and passes it to the 24-bit left shifter 221. The leading-one detection 220 binary-encodes the pre-encoding provided by the leading-one prediction 212 and held in the first-stage DFF 206, obtaining the position of the leading 1 in the F string; a binary search algorithm is used to locate the leading 1 during encoding. The encoded result is output to the Close-path compound exponent adder 230 in the second pipeline stage of exponent processing 203, to the 24-bit left shifter 221, and to the error compensation 222. The 24-bit left shifter 221 shifts the result provided by the multiplexer 219 to the left according to the leading-one position provided by the leading-one detection 220. The bit shifted in on the first one-bit shift is the ShiftIn value provided by the Close-path parallel rounding 214; all other bits shifted in are 0. Because leading-one prediction may be off by one bit, the error compensation 222 performs error compensation according to the shifted result from the 24-bit left shifter 221, the output of the leading-one detection 220, and the ShiftIn value provided by the Close-path parallel rounding 214: if the shift was one bit short, one further single-bit shift is performed; otherwise no further shift is performed. This yields the final mantissa result of the near path and a flag indicating whether an error occurred, which are output to the data selection 234.
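The leading-one search and the one-bit error compensation described above can be sketched as follows. This is an assumption-laden software model, not the circuit itself: the function names are hypothetical, the binary search halves the remaining bit range at each step, and, as a simplification, zeros are shifted in rather than the ShiftIn value.

```python
def leading_one_position(value: int, width: int) -> int:
    """Index (0 = MSB) of the leading 1 in a nonzero width-bit value,
    found by repeatedly testing the upper part of the remaining range."""
    assert value != 0
    pos, span = 0, width
    while span > 1:
        half = span // 2              # size of the lower part
        upper = value >> half
        if upper != 0:                # leading 1 lies in the upper part
            value, span = upper, span - half
        else:                         # skip the all-zero upper part
            pos += span - half
            value, span = value & ((1 << half) - 1), half
    return pos


def normalize_with_compensation(mantissa: int, predicted_pos: int, width: int):
    """Shift left by the predicted amount; shift once more if the MSB is
    still 0. Returns (normalized mantissa, error_flag)."""
    mask = (1 << width) - 1
    shifted = (mantissa << predicted_pos) & mask
    error = ((shifted >> (width - 1)) & 1) == 0
    if error:
        shifted = (shifted << 1) & mask
    return shifted, error
```

The error flag returned here plays the role of the signal that later selects between the two candidate Close-path exponents.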
In the second-stage Far-path mantissa processing unit 238, the OP2 format selection 223 selects and extends data from the shifted outputs of the 32-bit right shifter 216 and the 32-bit fixed-point left shift 217, according to the near-path/far-path mantissa result selection signal and the add/subtract selection signal provided by the CpFp / add-subtract determination 209. It obtains 32-bit-wide data, which is output as the addend to the 32-bit compound mantissa adder 224, and at the same time computes the values of the guard bit, round bit, and sticky bit for use by the Far-path parallel rounding 225. The 32-bit compound mantissa adder 224 extends the augend provided by the Far-path mantissa selection 215 to 32 bits and performs a compound addition with the addend provided by the OP2 format selection, obtaining A+B and A+B+1 together with their respective most-significant-bit carry flags; for fixed-point to floating-point or floating-point to fixed-point conversion, the augend is 0. The Far-path parallel rounding 225 performs parallel rounding of the far-path mantissa according to the most-significant-bit carry, most significant bit, least significant bit, and second least significant bit provided by the 32-bit compound mantissa adder 224, the guard bit, round bit, and sticky bit provided by the OP2 format selection 223, and the rounding type provided by the input DFF 201. It obtains the selection signal for the final rounded result, a flag indicating whether a right shift is needed after rounding, the value to be shifted in on a left shift, and similar information, and also provides the selection signal for parallel rounding during floating-point to fixed-point conversion. These signals are output to the multiplexer 226 and to the multiplexer 233, which gates the Far-path exponent result in the second pipeline stage of exponent processing 203. The multiplexer 226 selects among the results provided by the 32-bit compound mantissa adder 224 according to the rounding selection signals Sel and SelFloat2Fix provided by the Far-path parallel rounding 225, and outputs the selected result to the shift left or right by 1 bit or no shift 227. According to the add/subtract selection signal provided by the CpFp / add-subtract determination 209 and the add/subtract selection signal for floating-point to fixed-point conversion, the shift left or right by 1 bit or no shift 227 processes the data provided by the multiplexer 226 to obtain the final mantissa result of the far path and a flag indicating whether a left shift was performed, and passes them to the data selection 234 and to the multiplexer 233 that gates the Far-path exponent result in the second pipeline stage of exponent processing 203.
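By way of illustration, the guard, round, and sticky bits that accompany an alignment right shift can be modelled as below. The definitions used are the conventional ones and are an assumption here, since the source does not define them: the guard bit is the first bit shifted out below the least significant bit, the round bit is the next one, and the sticky bit is the OR of all remaining shifted-out bits. The function name `align_right` is hypothetical.

```python
def align_right(mantissa: int, shift: int):
    """Right-shift a mantissa and return (kept, guard, round, sticky).

    Assumed conventional definitions: guard = first bit below the LSB,
    round = second bit below the LSB, sticky = OR of everything lower.
    """
    extended = mantissa << 2                  # make room for guard and round
    shifted = extended >> shift
    lost = extended & ((1 << shift) - 1)      # bits below the round position
    kept = shifted >> 2
    guard = (shifted >> 1) & 1
    round_bit = shifted & 1
    sticky = int(lost != 0)
    return kept, guard, round_bit, sticky
```

For example, shifting 1011 (binary) right by three places gives 1.011, so the kept part is 1 with guard 0, round 1, and sticky 1.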
Under round-to-nearest mode, the Far-path parallel rounding 225 satisfies the following formulas:
Sel = (OpType == `ADDER) ? ((~sp0Cout & g & (LSBSp0 | r | s)) | (sp0Cout & LSBSp0 & (LSB1Sp0 | g | r | s))) : (sp1Cout & ((~g & ~r & ~s) | (g & r) | (MSBSp0 & g & (LSBSp0 | s))));
ShiftIn = (OpType == `ADDER) ? 1'b0 : (sp1Cout & ~MSBSp0 & MSB1Sp0 & ((~g & r & s) | (~r & g)));
RightMove = (OpType == `ADDER) ? (sp0Cout | (~sp0Cout & sp1Cout & g & (r | s | LSBSp0))) : 1'b0;
SelFloat2Fix = (Float2FixOpType == `ADDER) ? (gFloat2Fix & (LSBSp0 | rFloat2Fix | sFloat2Fix)) : ((~gFloat2Fix & ~rFloat2Fix & ~sFloat2Fix) | (gFloat2Fix & (LSBSp0 | sFloat2Fix | rFloat2Fix)));
Here, Sel is the rounding selection signal that is produced: when Sel is 1, A+B+1 is selected, and when Sel is 0, A+B is selected. ShiftIn is the value shifted in on a left shift. RightMove is the flag indicating whether the result is shifted right: 1 means shift right, otherwise no right shift. SelFloat2Fix is the rounding selection signal for floating-point to fixed-point conversion: when it is 1, A+B+1 is selected, and when it is 0, A+B is selected. OpType is the add/subtract selection signal provided by the CpFp / add-subtract determination 209, and Float2FixOpType is the add/subtract selection signal for floating-point to fixed-point conversion. sp1Cout is the most-significant-bit carry of A+B+1, and sp0Cout is the most-significant-bit carry of A+B. MSBSp0 is the most significant bit of A+B, MSB1Sp0 is the second most significant bit of A+B, LSBSp0 is the least significant bit of A+B, and LSB1Sp0 is the second least significant bit of A+B. The values g, r, s, gFloat2Fix, rFloat2Fix, and sFloat2Fix are computed from OpType, Float2FixOpType, and the guard bit, round bit, and sticky bit provided by the OP2 format selection 223, and satisfy the following formulas.
g = (OpType == `ADDER) ? gb : (gb ^ (rb | sb));
r = (OpType == `ADDER) ? rb : (rb ^ sb);
s = (OpType == `ADDER) ? sb : (sb);
gFloat2Fix = (Float2FixOpType == `ADDER) ? gb : (gb ^ (rb | sb));
rFloat2Fix = (Float2FixOpType == `ADDER) ? rb : (rb ^ sb);
sFloat2Fix = (Float2FixOpType == `ADDER) ? sb : (sb);
Here, gb, rb, and sb are the guard bit, round bit, and sticky bit provided by "OP2 format selection" 128.
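As an illustration only, the round-to-nearest formulas for Sel, ShiftIn, and RightMove can be transliterated into a small software model. This is a minimal sketch, not the hardware described here: the function names `adjust_grs` and `round_nearest`, the boolean argument `is_add` (standing in for OpType == `ADDER), and the use of Python integers 0 and 1 for single-bit signals are all assumptions made for the example.

```python
def _not(bit):
    """One-bit NOT for values that are 0 or 1."""
    return bit ^ 1


def adjust_grs(is_add, gb, rb, sb):
    """Compute g, r, s from gb, rb, sb: unchanged for addition, recoded for subtraction."""
    if is_add:
        return gb, rb, sb
    return gb ^ (rb | sb), rb ^ sb, sb


def round_nearest(is_add, sp0_cout, sp1_cout, msb, msb1, lsb, lsb1, gb, rb, sb):
    """Return (Sel, ShiftIn, RightMove) following the round-to-nearest formulas.

    msb, msb1, lsb, lsb1 model MSBSp0, MSB1Sp0, LSBSp0, LSB1Sp0 of A+B.
    All arguments except is_add are single bits (0 or 1).
    """
    g, r, s = adjust_grs(is_add, gb, rb, sb)
    if is_add:
        sel = (_not(sp0_cout) & g & (lsb | r | s)) | (sp0_cout & lsb & (lsb1 | g | r | s))
        shift_in = 0
        right_move = sp0_cout | (_not(sp0_cout) & sp1_cout & g & (r | s | lsb))
    else:
        sel = sp1_cout & ((_not(g) & _not(r) & _not(s)) | (g & r) | (msb & g & (lsb | s)))
        shift_in = sp1_cout & _not(msb) & msb1 & ((_not(g) & r & s) | (_not(r) & g))
        right_move = 0
    return sel, shift_in, right_move
```

For example, for an addition with no carry out, gb = 1, rb = sb = 0, and an odd A+B (LSBSp0 = 1), the model returns Sel = 1, so A+B+1 is selected; with an even A+B it returns Sel = 0, which matches the usual tie-to-even behavior.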
Under the truncation rounding mode, Far path parallel rounding 225 satisfies the following formulas:
Sel = (OpType == `ADDER) ? 1'b0 : (sp1Cout & (~g & ~r & ~s));
ShiftIn = (OpType == `ADDER) ? 1'b0 : (sp1Cout & ~MSBSp0 & MSB1Sp0 & g);
RightMove = (OpType == `ADDER) ? sp0Cout : 1'b0;
SelFloat2Fix = (Float2FixOpType == `ADDER) ? 1'b0 : 1'b1;
Here, the parameters in the formulas have the same meanings as under the round-to-nearest mode.
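The truncation-mode formulas can be modeled in the same way. The sketch below is illustrative only; `round_truncate` and `sel_float2fix_truncate` are hypothetical names, `is_add` stands in for the comparison with `ADDER, and g, r, s are assumed to be the already-adjusted bits (after the subtraction recoding), each 0 or 1.

```python
def round_truncate(is_add, sp0_cout, sp1_cout, msb, msb1, g, r, s):
    """Return (Sel, ShiftIn, RightMove) following the truncation-mode formulas."""
    if is_add:
        # Addition never selects A+B+1; a carry out of A+B requests a right shift.
        return 0, 0, sp0_cout
    sel = sp1_cout & (g ^ 1) & (r ^ 1) & (s ^ 1)
    shift_in = sp1_cout & (msb ^ 1) & msb1 & g
    return sel, shift_in, 0


def sel_float2fix_truncate(is_add):
    """SelFloat2Fix under truncation: 0 for addition, 1 for subtraction."""
    return 0 if is_add else 1
```

In this model an addition under truncation always keeps A+B, and only a subtraction with all of g, r, s equal to 0 and a carry out of A+B+1 selects A+B+1.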
In the second-stage exponent processing unit 236, the 8-bit compound exponent adder 230 receives the bitwise inverse of the shift amount provided by leading-one detection 220 and the larger exponent value stored in first-stage DFF 206, performs a compound addition on them to obtain A+B and A+B+1, and outputs both results to multiplexer 232. Multiplexer 232 selects between the compound addition results according to the error-occurred flag provided by error compensation 222, obtains the final exponent result of the near path, and outputs it to data select 234. OP1 228 sets the first operand of the compound addition according to the operation type information stored in first-stage DFF 206 and passes it to the 8-bit compound exponent adder 231: for a fixed-point-to-floating-point conversion the operand is 158; otherwise it is the larger exponent value stored in first-stage DFF 206. OP2 229 sets the second operand of the compound addition according to the operation type information stored in first-stage DFF 206 and the addition/subtraction select signal provided by CpFp/addition-subtraction decision 209, and passes it to the 8-bit compound exponent adder 231. For a fixed-point-to-floating-point conversion, when the result of the 32-bit fixed-point left shift 217 is not 0, the operand is the inverted shift value from the 32-bit fixed-point left shift 217; when the result of the 32-bit fixed-point left shift 217 equals 0, the operand is 0. For the other operations supported by the invention, the operand is -2 if the addition/subtraction select signal provided by CpFp/addition-subtraction decision 209 selects subtraction, and 0 otherwise. The 8-bit compound exponent adder 231 receives the data provided by OP1 228 and OP2 229, performs a compound addition to obtain A+B and A+B+1, and passes both results to multiplexer 233. Multiplexer 233 selects among the addition results provided by the 8-bit compound exponent adder 231 according to the left-shift flag provided by the shift-left, shift-right-by-1, or no-shift unit 227 and the right-shift flag provided by Far path parallel rounding 225, and obtains the final exponent result of the far path.
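The arithmetic behind the compound exponent adders can be checked with a short numerical sketch. It is an illustration under stated assumptions, not the circuit itself: the helper names are hypothetical, exponents are modeled as 8-bit values with wraparound, and the second operand for fixed-point-to-floating-point conversion is assumed to be the bitwise inverse of the left-shift amount. Which of the two candidate sums the multiplexers finally select is not modeled.

```python
MASK8 = 0xFF  # 8-bit exponent datapath (assumed two's-complement wraparound)


def compound_add8(a, b):
    """Return both outputs of a compound adder, (A+B, A+B+1), modulo 2**8."""
    s0 = (a + b) & MASK8
    return s0, (s0 + 1) & MASK8


def near_path_exponent_candidates(larger_exp, shift_amount):
    """Adder 230: larger exponent plus the inverted normalization shift amount."""
    return compound_add8(larger_exp, ~shift_amount & MASK8)


def far_path_exponent_candidates(larger_exp, is_sub):
    """Adder 231 for add/sub: the second operand is -2 for subtraction, else 0."""
    op2 = (-2 & MASK8) if is_sub else 0
    return compound_add8(larger_exp, op2)


def fix2float_exponent_candidates(shift_amount, shifted_nonzero=True):
    """Adder 231 for fixed-to-float: first operand 158, second operand ~shift or 0."""
    op2 = (~shift_amount & MASK8) if shifted_nonzero else 0
    return compound_add8(158, op2)
```

Because E + ~n equals E - n - 1 in two's complement, the pair returned for a shift amount n is (E - n - 1, E - n). For instance, `fix2float_exponent_candidates(31)` gives (126, 127), and 127 is the biased single-precision exponent of the integer 1, which needs a left shift of 31 in a 32-bit word.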
The above describes the structure shown in Fig. 2, which implements single-precision floating-point addition and subtraction and their related operations in a two-stage pipeline.
Fig. 3 is a flow chart of the method implemented by the high-speed floating-point arithmetic unit with the two-stage pipeline structure shown in Fig. 2. All operations of step 301 are performed in the first pipeline stage, and all operations of step 302 are performed in the second pipeline stage. In step 303, method 300 receives the operation type, the rounding type, the first operand, and the second operand, and extracts the corresponding sign bit, exponent, and mantissa data. In step 304, the exponent difference is calculated and the exponents are compared. In step 305, the near path uses a compound adder to perform mantissa subtraction according to the exponent difference information. In step 306, the near path performs leading-one detection in parallel; its result must be available before step 310. In step 307, the far path performs the alignment shift according to the exponent difference. In step 308, the near path performs complement conversion and parallel rounding on the operation result according to the rounding type provided by step 303. In step 309, the far path uses a compound adder to perform mantissa addition or subtraction. In step 310, the near path performs a normalization shift on the output of step 308 according to the result of leading-one detection, obtaining the final result of the near path. In step 311, the far path performs parallel rounding on the calculation result according to the rounding type provided by step 303, obtaining the final result of the far path. In step 312, the output is selected from the near path, the far path, or the exception handling result according to the exponent difference information provided by step 304 and the operation type provided by step 303.
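The final selection in step 312 can be sketched as a simple decision function. This is a hedged illustration: the function name `select_result_path`, the boolean `effective_sub`, and the flag `has_exception` are hypothetical, and the rule that the near path handles subtraction with an exponent difference of at most 1 while the far path handles addition and subtraction with a larger difference is taken from the dual-path description earlier in this document.

```python
def select_result_path(exp_a, exp_b, effective_sub, has_exception=False):
    """Pick which result is output: 'exception', 'near', or 'far'.

    exp_a, exp_b  -- biased exponents of the two operands
    effective_sub -- True when the operation is a mantissa subtraction
    has_exception -- hypothetical flag standing in for the exception handling result
    """
    if has_exception:
        return "exception"
    exp_diff = abs(exp_a - exp_b)
    if effective_sub and exp_diff <= 1:
        return "near"  # massive cancellation possible, needs normalization shift
    return "far"       # addition, or subtraction with exponent difference > 1
```

For example, `select_result_path(127, 128, effective_sub=True)` returns "near", while the same exponents with `effective_sub=False` return "far".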
As can be seen from the embodiment above, the high-speed floating-point arithmetic unit provided by the invention uses dual paths, leading-one detection, and parallel rounding to support floating-point addition and subtraction and their related operations. The only additions are the 32-bit fixed-point left shift introduced for fixed-point-to-floating-point conversion, the reciprocal or inverse square root operation moved into mantissa processing, the exponent overflow decision introduced for floating-point-to-fixed-point conversion, and the extra operation-type decisions in the logic that selects and outputs data. Apart from these, the operations related to floating-point addition and subtraction all reuse the floating-point addition/subtraction data path, so reuse is high and area is small. The pipelined design also gives high operating speed and high throughput.
The specific embodiments described above further explain the objects, technical solutions, and beneficial effects of the invention in detail. It should be understood that the above are merely specific embodiments of the invention and do not limit the invention; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.