CN102566967B - High-speed floating-point arithmetic unit with a multi-stage pipeline structure - Google Patents

High-speed floating-point arithmetic unit with a multi-stage pipeline structure

Info

Publication number
CN102566967B
CN102566967B (granted from application CN201110418897.6A / CN201110418897A)
Authority
CN
China
Prior art keywords
mantissa
level
floating
processing unit
operand
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110418897.6A
Other languages
Chinese (zh)
Other versions
CN102566967A (en)
Inventor
王东琳
张志伟
王惠娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Silang Technology Co ltd
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN201110418897.6A priority Critical patent/CN102566967B/en
Publication of CN102566967A publication Critical patent/CN102566967A/en
Application granted granted Critical
Publication of CN102566967B publication Critical patent/CN102566967B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention discloses a high-speed floating-point arithmetic unit built on an N-stage pipeline, N being a natural number greater than 1, comprising: an input DFF, an operand-information extraction and flag judgement unit, N stages of floating-point computation structures, N stages of DFFs, and a data selection unit. The input DFF is connected through the operand-information extraction and flag judgement unit to the first-stage floating-point computation structure; the first-stage computation structure is connected to the first-stage DFF, which is connected to the second-stage computation structure; the second-stage computation structure is connected to the second-stage DFF, which is connected to the third-stage computation structure; and so on, until the (N-1)-th-stage DFF is connected to the N-th-stage computation structure. Finally, the N-th-stage computation structure is connected to the N-th-stage DFF through the data selection unit.

Description

High-speed floating-point arithmetic unit with a multi-stage pipeline structure
Technical field
The present invention relates to the field of floating-point arithmetic in microprocessors, and more particularly to a high-speed floating-point arithmetic unit with a multi-stage pipeline structure.
Background art
The arithmetic unit occupies a critical position in a microprocessor architecture and is a key factor in determining processing speed. Addition, subtraction and their related operations account for a very high proportion of all computation, so improving the arithmetic unit for these operations with effective techniques has become a primary direction of current research on execution units.
Floating-point numbers normally follow the IEEE 754 standard. A single-precision number in this standard is 32 bits wide and consists of 1 sign bit (s), 8 exponent bits (e) and 23 fraction bits (f). The exponent is a signed value stored in biased form (bias 127), and the fraction together with an implicit leading 1 forms the actual mantissa (significand). The value represented by s-e-f is therefore (-1)^s × 1.f × 2^(e-127).
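As an illustration of the field layout described above (not part of the patented design; the type and function names are ours), a C sketch that splits a single-precision value into its sign, biased exponent and fraction fields could look as follows.

#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Sketch of the IEEE 754 single-precision layout: 1 sign bit s, 8 exponent bits e
 * (bias 127), 23 fraction bits f; value = (-1)^s * 1.f * 2^(e-127) for normalized numbers. */
typedef struct { uint32_t sign, exponent, fraction; } fp32_fields;

static fp32_fields decompose(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);        /* reinterpret the 32-bit pattern */
    fp32_fields f;
    f.sign     = bits >> 31;               /* bit 31 */
    f.exponent = (bits >> 23) & 0xFF;      /* bits 30..23, biased by 127 */
    f.fraction = bits & 0x7FFFFF;          /* bits 22..0; implied leading 1 is not stored */
    return f;
}

int main(void) {
    fp32_fields f = decompose(6.5f);       /* 6.5 = 1.625 * 2^2 */
    printf("s=%u e=%u (unbiased %d) f=0x%06X\n",
           f.sign, f.exponent, (int)f.exponent - 127, f.fraction);
    return 0;
}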
A basic floating-point addition or subtraction takes seven steps. Step 1, exponent comparison: subtract the exponents of the two operands; the absolute value of the difference is d. Step 2, alignment shift: shift the mantissa of the smaller operand right by d bits. Step 3, mantissa addition/subtraction: add or subtract the mantissas according to the operation code and operands. Step 4, complement conversion: if the mantissa result is negative, convert it back from its complement form. Step 5, leading-1 detection: for a subtraction, determine how far the result must be shifted left; for an addition, determine whether it must be shifted right by one bit. Step 6, normalization shift: shift the mantissa result so that its most significant bit is 1. Step 7, rounding and output: round the result according to the rounding mode and output it.
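A minimal C sketch of these seven steps is given below for orientation only. It assumes normalized operands held as sign/exponent/24-bit significand, keeps three guard bits, ignores special values and overflow, and simply truncates in step 7 instead of applying full IEEE rounding; the sfloat type and fadd_steps name are ours, not the patent's.

#include <stdint.h>

typedef struct { int sign; int exp; uint32_t mant; } sfloat;  /* mant: implied 1 in bit 23 */

static sfloat fadd_steps(sfloat a, sfloat b, int subtract) {
    if (subtract) b.sign ^= 1;                 /* subtraction = addition of the negated operand */
    /* Step 1: exponent comparison; make a the operand of larger magnitude */
    if (a.exp < b.exp || (a.exp == b.exp && a.mant < b.mant)) { sfloat t = a; a = b; b = t; }
    int d = a.exp - b.exp;
    /* Step 2: alignment shift of the smaller mantissa (3 guard bits kept) */
    int64_t ma = (int64_t)a.mant << 3;
    int64_t mb = (d > 26) ? 0 : (((int64_t)b.mant << 3) >> d);
    /* Step 3: mantissa add/subtract according to the effective operation */
    int64_t m = (a.sign == b.sign) ? ma + mb : ma - mb;
    sfloat r = { a.sign, a.exp, 0 };
    if (m == 0) { r.exp = 0; return r; }       /* exact cancellation */
    /* Step 4 is unnecessary here: m is already non-negative because |a| >= |b| */
    /* Steps 5-6: leading-1 detection and normalization shift */
    int msb = 63; while (!((m >> msb) & 1)) msb--;
    if (msb > 26) { m >>= (msb - 26); r.exp += msb - 26; }
    else          { m <<= (26 - msb); r.exp -= 26 - msb; }
    /* Step 7: rounding; truncation of the guard bits for brevity */
    r.mant = (uint32_t)(m >> 3);
    return r;
}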
Because the basic algorithm has many steps and a long latency, much research has been devoted to improving it in order to raise the speed of floating-point addition and subtraction, with varying degrees of success.
Chinese patent application 200910152505.9, "Floating-point adder based on two's-complement rounding", published on 17 February 2010, inventor Yan Xiaolang, discloses a floating-point adder that rounds in the complement domain and supports both floating-point addition and subtraction. The adder comprises: an exponent adder; a mantissa shifter and a mantissa-operand preparation logic unit, which process the mantissa operands according to the sign bits of the two operands and the exponent difference; a mantissa adder; a rounding decision logic unit, which makes a unified rounding decision on the mantissa sum by determining the sign of the sum from the most significant bit output by the mantissa adder, determining the rounding set bit from the high four bits of that output, and unifying the sign-magnitude round-and-add-1 logic with the complement round-and-add-0 logic; and a rounding adder, which rounds the floating-point mantissa sum and completes the complement operation on it.
Chinese patent application 200910218505.4, "Self-error-correcting leading 0/1 prediction unit for a floating-point adder", published on 28 April 2010, inventors Shao Zhibiao et al., discloses a self-correcting leading 0/1 prediction method for floating-point adders. By combining multi-input logic gates with parallel computation, the prediction output is already the final correct result and does not need to be corrected against the adder result, and because the computation is parallel the critical-path depth does not grow as the operand width increases. While the floating-point addition is being computed, the normalization shift count and the exponent adjustment needed by the result are predicted concurrently; the prediction does not depend on the adder output, is produced by the prediction unit alone, needs no further correction, and its critical path does not lengthen when the operand width grows.
Chinese patent application 200580034798.0, "Method for a high-speed floating-point arithmetic logic unit (ALU)", published on 19 September 2007, discloses an improved technique for handling the near-path exponent difference in a microprocessor arithmetic unit. In one embodiment, a device with separate logic for near-path subtraction and far-path subtraction produces the exponent-difference signal by performing the exponent difference using only the two least significant bits of the exponents of the two floating-point operands.
The three patents above respectively optimize the rounding, leading-1 detection and exponent-comparison portions of the basic seven-step algorithm to improve performance, but they offer no improvement or design scheme for the overall floating-point add/subtract data path and therefore provide little reference or guidance for its global design. The present invention takes the overall data path of high-speed floating-point addition/subtraction and its related operations as its starting point and provides such a design.
Chinese patent application 102243577A, "Circuit for fast floating-point addition", published on 16 November 2011, inventor Wang Yongliu, discloses a circuit that performs floating-point addition quickly by building a single addition unit and splitting the whole operation into two parts, processing the equal-exponent and unequal-exponent cases with parallel logic. Careful analysis shows, however, that opposite-sign additions with an exponent difference of 1 and additions with an exponent difference greater than 1 are all handled in the unequal-exponent path, so that path must contain both the mantissa alignment shift of the basic algorithm and the normalization shift of the result; the computation speed is therefore not actually improved.
Chinese patent application 201010594926.X, "High-speed, low-latency floating-point accumulator based on an FPGA and its implementation", published on 17 December 2010, inventors Chen Yaowu et al., discloses a high-speed, low-latency FPGA-based accumulator. The accumulator contains one floating-point adder unit and the whole design is pipelined, but no detailed design is given for the floating-point adder itself, and the whole design targets an FPGA, whose basic cells differ completely from those used in chip design; such a structure offers little reference for a general-purpose processing-element design that is not FPGA-based.
The analysis above shows that many optimized designs for floating-point addition/subtraction have been proposed, but most of the research concentrates on improving the implementation of local logic and does not give an effective, practical embodiment at the level of the overall floating-point add/subtract data path. Some work proposes pipelined floating-point add/subtract designs, but without detailed design and optimization schemes, and with limited applicability.
Summary of the invention
(1) Technical problem to be solved
In view of the above, the main purpose of the present invention is to provide a high-speed floating-point arithmetic unit with a multi-stage pipeline structure that implements floating-point addition, subtraction and their related operations.
(2) Technical scheme
To achieve the above purpose, the invention provides a high-speed floating-point arithmetic unit with an N-stage pipeline structure, N being a natural number greater than 1, comprising: an input DFF 101, an operand-information extraction and flag judgement unit 115, N stages of floating-point computation structures, N stages of DFFs, and a data selection unit 116. The input DFF 101 is connected through the operand-information extraction and flag judgement unit 115 to the first-stage floating-point computation structure; the first-stage computation structure is connected to the first-stage DFF 102, which is connected to the second-stage computation structure; the second-stage computation structure is connected to the second-stage DFF 103, which is connected to the third-stage computation structure; and so on, until the (N-1)-th-stage computation structure is connected to the (N-1)-th-stage DFF 104, which is connected to the N-th-stage computation structure. Finally, the N-th-stage computation structure is connected to the N-th-stage DFF 105 through the data selection unit 116.
In the above scheme, each stage of the N-stage floating-point computation structure contains an exponent processing unit, a Close-path mantissa processing unit and a Far-path mantissa processing unit, and the Close-path and Far-path mantissa processing units of a stage are both connected to the exponent processing unit of that stage.
In the above scheme, the input DFF 101 receives and stores the operation type, rounding type, first operand and second operand supplied from outside; it sends the operation type to the first-stage exponent processing unit 106 and the first-stage DFF 102, sends the rounding type to the first-stage Close-path mantissa processing unit 109 and the first-stage DFF 102, and passes the first and second operands to the operand-information extraction and flag judgement unit 115.
In the above scheme, the operand-information extraction and flag judgement unit 115 extracts the sign, exponent and mantissa bits of the first and second operands and computes their flag data; it sends the exponent bits of the two operands to the first-stage exponent processing unit 106, sends their mantissa bits to the first-stage Close-path mantissa processing unit 109 and the first-stage Far-path mantissa processing unit 112, and stores the sign bits and flag data in the first-stage DFF 102.
In the above scheme, the first-stage exponent processing unit 106 performs the exponent computation for floating-point addition/subtraction and related operations from the exponent bits of the two operands, the operation type, and data supplied by the first-stage Close-path mantissa processing unit 109 and the first-stage Far-path mantissa processing unit 112, and stores the exponent result in the first-stage DFF 102.
In the above scheme, the first-stage Close-path mantissa processing unit 109 performs the Close-path mantissa computation from the mantissa bits of the two operands and data supplied by the first-stage exponent processing unit 106, and stores the result in the first-stage DFF 102.
In the above scheme, the first-stage Far-path mantissa processing unit 112 performs the Far-path mantissa computation from the mantissa bits of the two operands and data supplied by the first-stage exponent processing unit 106, and stores the result in the first-stage DFF 102.
In the above scheme, the data selection unit 116 derives the final result from the flag data passed down from the operand-information extraction and flag judgement unit 115, the Close-path mantissa result supplied by the N-th-stage Close-path mantissa processing unit 111 and the Far-path mantissa result supplied by the N-th-stage Far-path mantissa processing unit 114, and passes it to the N-th-stage DFF for storage.
In the above scheme, the first-stage DFF 102 stores the intermediate results of the first-stage exponent processing unit 106, the first-stage Close-path mantissa processing unit 109 and the first-stage Far-path mantissa processing unit 112, together with the sign and flag data supplied by the operand-information extraction and flag judgement unit 115 and the operation-type and rounding-type data held in the input DFF 101, for use by the next stage's exponent, Close-path mantissa and Far-path mantissa processing units.
(3) Beneficial effects
As can be seen from the technical scheme above, the present invention has the following beneficial effects:
1. The high-speed floating-point arithmetic unit of the invention supports floating-point addition/subtraction and related operations using dual paths, leading-1 detection and parallel rounding. Apart from the 32-bit fixed-point left shift introduced for fixed-to-floating conversion, the processing moved into the mantissa path for the reciprocal and reciprocal-square-root operations, the exponent overflow check introduced for floating-to-fixed conversion, and the operation-type logic added at data selection and output, all related operations reuse the floating-point add/subtract data path, so the data path is highly multiplexed and the area is small. The pipelined design gives high operating speed and large throughput.
2. The unit supports floating-point addition/subtraction and its multiple related operations with an efficiently reused data path; the multi-stage pipeline effectively shortens the data path and increases throughput, meeting high-speed requirements. A specific embodiment provides both the general multi-stage pipeline structure for floating-point addition/subtraction and related operations and a two-stage pipelined implementation of that structure.
Brief description of the drawings
Fig. 1 is a schematic diagram of the high-speed floating-point arithmetic unit with a multi-stage pipeline structure according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a high-speed floating-point arithmetic unit with a two-stage pipeline structure according to an embodiment of the present invention.
Fig. 3 is a flow chart of the method implemented by the two-stage-pipeline arithmetic unit shown in Fig. 2.
Detailed description of the embodiments
To make the purpose, technical scheme and advantages of the present invention clearer, the invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
In the dual-path scheme, additions and subtractions are handled in parallel paths: complement conversion and the shifting operations are mutually exclusive, and only one full-width shifter is ever needed between the alignment shift and the normalization shift. A near path (Close path) and a far path (Far path) are therefore provided: the near path handles subtractions whose exponent difference is at most 1, and the far path handles additions and subtractions whose exponent difference is greater than 1.
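A minimal sketch of this path split, assuming the rule stated above (effective subtraction with exponent difference 0 or 1 goes to the near path, everything else to the far path); the enum and function names are ours.

/* Dual-path split: the near (Close) path handles effective subtraction with exponent
 * difference 0 or 1, where massive cancellation and a long normalization left shift may
 * occur; all additions, and subtractions with a larger difference, go to the far path. */
typedef enum { PATH_CLOSE, PATH_FAR } fp_path;

static fp_path select_path(int exp_a, int exp_b, int effective_subtract) {
    int d = exp_a - exp_b;
    if (d < 0) d = -d;
    return (effective_subtract && d <= 1) ? PATH_CLOSE : PATH_FAR;
}

The point of the split is that the near path needs at most a one-bit alignment shift but possibly a long normalization shift, while the far path needs a long alignment shift but at most a one-bit normalization shift, so neither path carries two full-width shifters.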
Leading-1 detection computes the number of leading zeros in the input, which is exactly the normalization shift count. It comprises three parts: precoding, leading-1 detection and error correction. Precoding produces a 0/1 string from the mantissa parts of the two operands to be computed, whose leading 1 lies at the same position as the leading 1 of the mantissa result (possibly with a 1-bit error); leading-1 detection binary-encodes the position of the leading 1 of the precoded string to obtain the normalization shift count; and error correction removes the 1-bit error that the precoding may introduce.
Parallel rounding makes only a small change to the result: a compound adder computes all possible outcomes at the same time, so the rounding step reduces to a selection, which significantly increases speed. In other words, the rounding operation that would normally be performed last is moved forward and carried out in parallel with the mantissa addition/subtraction, increasing parallelism and shortening the add/subtract path.
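The compound-adder idea can be sketched in C as follows; in hardware the two sums would share carry logic rather than be computed twice, so this is a behavioural illustration only, with names of our choosing.

#include <stdint.h>

/* Sum and sum+1 are produced together, so the rounding decision merely selects one of
 * them instead of adding 1 after the fact on the critical path. */
typedef struct { uint32_t sum; uint32_t sum_plus_1; } compound_result;

static compound_result compound_add24(uint32_t a, uint32_t b) {
    compound_result r = { (a + b) & 0xFFFFFF, (a + b + 1) & 0xFFFFFF };
    return r;
}

/* Rounding reduces to a 2:1 selection driven by a precomputed signal. */
static uint32_t round_select(compound_result r, int sel_plus_1) {
    return sel_plus_1 ? r.sum_plus_1 : r.sum;
}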
Pipelining is widely used and has proven to provide high speed and high throughput. In addition, the operations related to floating-point addition/subtraction, such as floating/fixed-point conversion and the comparison-class operations, can largely reuse the add/subtract data path, giving a simple, highly reusable circuit structure and a small area. The emphasis of the present invention is therefore an overall, multiplexed optimization of the data path for floating-point addition/subtraction and its related operations by combining these techniques.
In view of this, the present invention proposes a high-speed floating-point arithmetic unit with a multi-stage pipeline structure. The concurrent operation of the pipeline stages performs the addition and subtraction of two floating-point operands and their related operations: floating-to-fixed conversion, fixed-to-floating conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root, and the comparison-class operations (equal, not equal, greater than, greater than or equal, less than, less than or equal). The number of pipeline stages can be chosen according to the application.
Fig. 1 is a schematic diagram of the high-speed floating-point arithmetic unit 100 with a multi-stage (N-stage) pipeline according to an embodiment of the present invention. The unit comprises an input D-type flip-flop (DFF) 101, an operand-information extraction and flag judgement unit 115, N stages of floating-point computation structures, N stages of DFFs and a data selection unit 116, N being a natural number greater than 1. The input DFF 101 is connected through the operand-information extraction and flag judgement unit 115 to the first-stage floating-point computation structure; the first-stage computation structure is connected to the first-stage DFF, which is connected to the second-stage computation structure; the second-stage computation structure is connected to the second-stage DFF, which is connected to the third-stage computation structure; and so on, until the (N-1)-th-stage computation structure is connected to the (N-1)-th-stage DFF, which is connected to the N-th-stage computation structure. Finally, the N-th-stage computation structure is connected to the N-th-stage DFF through the data selection unit 116.
Each stage of the floating-point computation structure contains an exponent processing unit, a Close-path mantissa processing unit and a Far-path mantissa processing unit, the latter two being connected to the exponent processing unit of the same stage.
The input DFF 101 receives and stores the operation type, rounding type, first operand and second operand supplied from outside. The operation type indicates which of floating-point addition, subtraction, floating-to-fixed conversion, fixed-to-floating conversion, absolute value, reciprocal, reciprocal square root or the comparison-class operations (equal, not equal, greater than, greater than or equal, less than, less than or equal) is to be performed; the rounding type indicates truncation or round-to-nearest. The operation type is sent to the first-stage exponent processing unit 106 and the first-stage DFF 102, the rounding type is sent to the first-stage Close-path mantissa processing unit 109 and the first-stage DFF 102, and the two operands are passed to the operand-information extraction and flag judgement unit 115.
The operand-information extraction and flag judgement unit 115 extracts the sign, exponent and mantissa bits of the two operands and computes their flag data, such as the zero, infinity and NaN flags. It sends the exponent bits of the two operands to the first-stage exponent processing unit 106, sends their mantissa bits to the first-stage Close-path mantissa processing unit 109 and the first-stage Far-path mantissa processing unit 112, and stores the sign bits and flag data in the first-stage DFF 102 for use by the next pipeline stage.
The first-stage exponent processing unit 106 performs the exponent computation for floating-point addition/subtraction and related operations from the exponent bits of the two operands, the operation type, and data supplied by the first-stage Close-path mantissa processing unit 109 and the first-stage Far-path mantissa processing unit 112, and stores the exponent result in the first-stage DFF 102.
The first-stage Close-path mantissa processing unit 109 performs the Close-path mantissa computation from the mantissa bits of the two operands and data supplied by the first-stage exponent processing unit 106, and stores the result in the first-stage DFF 102.
The first-stage Far-path mantissa processing unit 112 performs the Far-path mantissa computation from the mantissa bits of the two operands and data supplied by the first-stage exponent processing unit 106, and stores the result in the first-stage DFF 102.
The first-stage DFF 102 stores the intermediate results of the first-stage exponent processing unit 106, the first-stage Close-path mantissa processing unit 109 and the first-stage Far-path mantissa processing unit 112, together with the sign and flag data supplied by the operand-information extraction and flag judgement unit 115 and the operation-type and rounding-type data held in the input DFF 101, for use by the next stage's exponent, Close-path mantissa and Far-path mantissa processing units.
Like the first-stage exponent processing unit 106, the second-stage exponent processing unit 107 performs the exponent computation for floating-point addition/subtraction and related operations from the exponent bits of the two operands, the operation type, and data supplied by the second-stage Close-path mantissa processing unit 110 and the second-stage Far-path mantissa processing unit 113, and stores the result in the second-stage DFF 103. The N-th-stage exponent processing unit 108 performs the exponent computation from the exponent bits of the two operands, the operation type, and data supplied by the N-th-stage Close-path and Far-path mantissa processing units, and sends the result to the data selection unit 116. The exponent computation specifically includes: computing the exponent difference of the two operands, exponent overflow and underflow checking, and generating the Close-path and Far-path exponents, while providing the mantissa paths of every stage with the data they need, such as the exponent difference.
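A software analogue of this exponent bookkeeping is sketched below under our own naming; it only illustrates that the exponent difference steers mantissa alignment, the larger exponent seeds the result exponent, and the biased result is checked against the 8-bit range.

typedef struct {
    int diff;        /* e_a - e_b, sent to the mantissa paths for alignment */
    int larger;      /* max(e_a, e_b), provisional result exponent          */
    int a_is_larger; /* selects which mantissa is treated as the augend     */
} exp_info;

static exp_info exponent_stage(int e_a, int e_b) {
    exp_info x;
    x.diff = e_a - e_b;
    x.a_is_larger = (x.diff >= 0);
    x.larger = x.a_is_larger ? e_a : e_b;
    return x;
}

static int biased_exp_overflows(int e)  { return e > 254; }  /* valid biased range is 1..254 */
static int biased_exp_underflows(int e) { return e < 1;   }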
Like the first-stage Close-path mantissa processing unit 109, the second-stage Close-path mantissa processing unit 110 performs the Close-path mantissa computation from the mantissa bits of the two operands and data supplied by the second-stage exponent processing unit 107, and stores the result in the second-stage DFF 103. The N-th-stage Close-path mantissa processing unit 111 performs the Close-path mantissa computation from the mantissa bits of the two operands and data supplied by the N-th-stage exponent processing unit 108, and sends the result to the data selection unit 116. The Close-path mantissa computation specifically includes: subtraction mantissa processing for exponent differences of at most 1, leading-1 detection on the result, and parallel rounding of the Close-path result, while providing the exponent units of every stage with the data they need, such as the shift count determined by the leading-1 detection, which is used to adjust the Close-path exponent.
Like the first-stage Far-path mantissa processing unit 112, the second-stage Far-path mantissa processing unit 113 performs the Far-path mantissa computation from the mantissa bits of the two operands and data supplied by the second-stage exponent processing unit 107, and stores the result in the second-stage DFF 103. The N-th-stage Far-path mantissa processing unit 114 performs the Far-path mantissa computation from the mantissa bits of the two operands and data supplied by the N-th-stage exponent processing unit 108, and sends the result to the data selection unit 116. The Far-path mantissa computation specifically includes: subtraction mantissa processing for exponent differences greater than 1, addition mantissa processing, parallel rounding of the Far-path result, and the additional processing structures required by the mantissa computation of the related operations, while providing the exponent units of every stage with the data they need, such as the post-rounding shift of the result, which is used to adjust the Far-path exponent.
The data selection unit 116 derives the final result from the flag data passed down from the operand-information extraction and flag judgement unit 115, the Close-path mantissa result supplied by the N-th-stage Close-path mantissa processing unit 111 and the Far-path mantissa result supplied by the N-th-stage Far-path mantissa processing unit 114, and passes it to the N-th-stage DFF for storage.
The first-stage DFF 102, the second-stage DFF 103 and the (N-1)-th-stage DFF 104 store the intermediate results of the exponent, Close-path mantissa and Far-path mantissa processing of their stage for use by the next stage's exponent, Close-path mantissa and Far-path mantissa processing.
The ellipsis between the second-stage DFF 103 and the (N-1)-th-stage DFF 104 indicates that the pipeline can be divided into, and designed with, as many stages as required.
Fig. 2 is a schematic diagram of a high-speed floating-point arithmetic unit 200 with a two-stage pipeline according to an embodiment of the present invention. This embodiment uses the concurrent operation of a two-stage pipeline to perform the addition and subtraction of two floating-point operands as well as floating-to-fixed conversion, fixed-to-floating conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root and the comparison-class operations (equal, not equal, greater than, greater than or equal, less than, less than or equal). As shown in Fig. 2, the unit 200 comprises an input DFF 201, an operand-information extraction and flag judgement unit 202, a first-stage exponent processing unit 203, a first-stage Close-path mantissa processing unit 204, a first-stage Far-path mantissa processing unit 205, a first-stage DFF 206, a second-stage exponent processing unit 236, a second-stage Close-path mantissa processing unit 237, a second-stage Far-path mantissa processing unit 238, a data selection unit 234 and a second-stage DFF 235.
The input DFF 201 receives and stores the operation type, rounding type, first operand and second operand supplied from outside. The operation type indicates which of the operations listed above is to be performed, the rounding type indicates truncation or round-to-nearest, and the first and second operands are each 32-bit data. The operation type is sent to the first-stage exponent processing unit 203 and the first-stage DFF 206, the rounding type is sent to the first-stage Close-path mantissa processing unit 204 and the first-stage DFF 206, and the two operands are passed to the operand-information extraction and flag judgement unit 202.
The operand-information extraction and flag judgement unit 202 extracts the sign, exponent and mantissa data of the two operands and computes their zero, infinity and NaN flags. It sends the exponents of the two operands to the first-stage exponent processing unit 203, sends their mantissas to the first-stage Close-path mantissa processing unit 204 and the first-stage Far-path mantissa processing unit 205, and stores the sign and flag data in the first-stage DFF 206 for use by the next pipeline stage.
The first-stage DFF 206 stores the intermediate results of the first-stage exponent processing unit 203, the first-stage Close-path mantissa processing unit 204 and the first-stage Far-path mantissa processing unit 205, together with the sign and flag data supplied by the operand-information extraction and flag judgement unit 202 and the operation-type and rounding-type data held in the input DFF 201, and supplies them to the second pipeline stage.
The data selection unit 234 analyzes the results originating from the first-stage exponent processing unit 203, the first-stage Close-path mantissa processing unit 204 and the first-stage Far-path mantissa processing unit 205 according to the operation type, sign bits and flag data supplied by the first-stage DFF, obtains the result of the addition, subtraction, floating-to-fixed, fixed-to-floating, absolute-value, reciprocal, reciprocal-square-root or comparison-class operation under both normal and exceptional conditions, and sends it to the second-stage DFF 235. The second-stage DFF 235 stores and outputs the correct result supplied by the data selection unit 234.
The internal composition and data interaction of the first-stage exponent processing unit 203, the first-stage Close-path mantissa processing unit 204, the first-stage Far-path mantissa processing unit 205, the second-stage exponent processing unit 236, the second-stage Close-path mantissa processing unit 237 and the second-stage Far-path mantissa processing unit 238 are described below.
In the first-stage exponent processing unit 203, the 8-bit compound exponent adder 207 receives the exponent data of the two operands, say A and B, supplied by the operand-information extraction and flag judgement unit 202 and simultaneously computes A+B and A+B+1 together with their respective carry signals and most significant bits. For the related operations of floating-point addition/subtraction the inputs are chosen as follows: for fixed-to-floating and floating-to-fixed conversion they are 158 and the negated value of the first operand; for the reciprocal operation they are 254 and the negated value of the first operand; for the reciprocal-square-root operation they are 190 and the negated value of the first operand, so that the adder is reused across operations. The results are sent to the exponent-generation / floating-to-fixed exponent overflow judgement unit 208. Unit 208 uses the operation-type information from the input DFF and the compound-adder results from 207 to generate the exponent difference, the larger exponent, the floating-to-fixed overflow and underflow signals, and the exponents of the reciprocal and reciprocal-square-root operations, and sends them to the CpFp/addition-subtraction judgement unit 209 and the first-stage DFF 206. Unit 209 uses the operation type supplied by the input DFF 201 and the exponent difference supplied by unit 208 to produce the selection signals for the near and far mantissa paths and their results and the selection signal for addition versus subtraction.
In the first-stage Close-path mantissa processing unit 204, the Close-path mantissa selection unit 210 chooses between the two mantissas supplied by the operand-information extraction and flag judgement unit 202 according to the exponent comparison result supplied by unit 208: the mantissa of the operand with the larger exponent is sent to the 24-bit compound mantissa adder 213 as the augend, and the mantissa of the operand with the smaller exponent is sent to the right-shift-by-0-or-1 unit 211. Unit 211 decides from the exponent-difference information supplied by unit 208 whether to shift right by one bit: if the exponent difference is 1 it shifts, otherwise it does not; the processed data is passed as the addend to the 24-bit compound adder 213, to the leading-one predictor 212 and to the Close-path parallel rounding unit 214. The 24-bit compound mantissa adder 213 takes the augend supplied by 210 and the negated addend supplied by 211, performs a compound addition to obtain A+B and A+B+1 together with their respective most-significant-bit carry flags and negative flags, and passes them to the first-stage DFF 206; the negative flag is asserted when the result is less than 0. For single-precision operands the significand has 24 significant bits, so the compound adder is 24 bits wide. The Close-path parallel rounding unit 214 performs the parallel rounding of the Close-path mantissa from the most-significant-bit carry, most significant bit and least significant bit supplied by 213, the guard bit supplied by 211 and the rounding-mode information supplied by the input DFF 201, and stores in the first-stage DFF 206 the selection signal for the final rounded result and the bit value shifted in during post-rounding normalization.
In round-to-nearest mode, the Close-path parallel rounding unit 214 satisfies the following equations:
Sel = sp1Cout & (~g | (MSBSp0 & g & LSBSp0));
ShiftIn = sp1Cout & ~MSBSp0 & g;
where Sel is the rounding selection signal (1 selects A+B+1, 0 selects A+B), ShiftIn is the bit value shifted in during normalization, sp1Cout is the most-significant-bit carry of A+B+1, g is the guard bit produced after the right-shift-by-0-or-1 result is negated, MSBSp0 is the most significant bit of A+B, and LSBSp0 is the least significant bit of A+B.
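The two equations above can be restated as a small C helper with the same signal names; this is a behavioural sketch of the logic only, with every signal taken as a 0/1 value.

typedef struct { int sel; int shift_in; } close_round;

static close_round close_round_nearest(int sp1Cout, int g, int MSBSp0, int LSBSp0) {
    close_round r;
    r.sel      = sp1Cout & ((!g) | (MSBSp0 & g & LSBSp0));  /* 1 -> take A+B+1, 0 -> take A+B */
    r.shift_in = sp1Cout & (!MSBSp0) & g;                   /* bit shifted in on normalization */
    return r;
}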
In truncation mode, the Close-path parallel rounding unit 214 satisfies:
Sel = sp1Cout & ~g;
ShiftIn = sp1Cout & g & ~MSBSp0;
where the parameters are as defined for round-to-nearest mode.
The leading-one predictor 212 produces a precode from the mantissa of the floating-point number with the larger exponent, supplied by the Close-path mantissa selection unit 210, and the mantissa of the number with the smaller exponent after the alignment shift in unit 211, and stores it in the first-stage DFF 206. The precode satisfies:
f_i = e_{i-1} & ((g_i & ~s_{i+1}) | (s_i & ~g_{i+1})) | ~e_{i-1} & ((s_i & ~s_{i+1}) | (g_i & ~g_{i+1}));
where the two mantissas are defined as A = a_0 a_1 ... a_{m-2} a_{m-1} and B = b_0 b_1 ... b_{m-2} b_{m-1}, W = A - B computed without carry propagation, w_i = a_i - b_i with w_i ∈ {-1, 0, 1}, and F = f_0 f_1 ... f_{m-2} f_{m-1} is the 0/1 string produced by the precoding, whose leading 1 lies at the same position as the leading 1 of the mantissa result (possibly with a 1-bit error). Each f_i ∈ {0, 1}, and the F string is formed by combining the values of the W string in the different cases. For the three possibilities w_i ∈ {-1, 0, 1}, e_i, g_i and s_i are defined so that e_i = 1 when w_i = 0, g_i = 1 when w_i = 1, and s_i = 1 when w_i = -1.
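A behavioural C sketch of this precoding follows. The arrays hold the mantissa bits a_0..a_{m-1} and b_0..b_{m-1} MSB first, as in the definition above; the boundary values e_{-1} = 1 and g_m = s_m = 0 are our assumptions, since the patent does not spell out the edge treatment, and the function name is ours.

/* Leading-one precoding: classify w_i = a_i - b_i into e/g/s and form f_i per the
 * equation above; the leading 1 of F marks the leading 1 of A-B to within one position. */
static void lop_precode(const int *a, const int *b, int m, int *f) {
    for (int i = 0; i < m; i++) {
        int e_im1 = (i == 0) ? 1 : (a[i-1] == b[i-1]);          /* e_{i-1}; assumed 1 off the left edge */
        int g_i   =  a[i] & !b[i];                               /* w_i = +1 */
        int s_i   = !a[i] &  b[i];                               /* w_i = -1 */
        int g_ip1 = (i + 1 < m) ?  (a[i+1] & !b[i+1]) : 0;       /* assumed 0 past the right edge */
        int s_ip1 = (i + 1 < m) ? (!a[i+1] &  b[i+1]) : 0;
        f[i] =  (e_im1 & ((g_i & !s_ip1) | (s_i & !g_ip1)))
             | (!e_im1 & ((s_i & !s_ip1) | (g_i & !g_ip1)));
    }
}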
In the first-stage Far-path mantissa processing unit 205, the Far-path mantissa selection unit 215 chooses between the two mantissas supplied by the operand-information extraction and flag judgement unit 202 according to the exponent comparison result supplied by unit 208: the mantissa of the operand with the larger exponent is stored in the first-stage DFF 206 as the augend, and the mantissa of the operand with the smaller exponent is sent to the 32-bit right shifter 216, which performs the alignment right shift according to the exponent-difference information supplied by unit 208 and stores the result in the first-stage DFF 206. The right shifter is made 32 bits wide so that it can also perform the shift needed by floating-to-fixed conversion, again for reuse. The 32-bit fixed-point left shifter 217 is used for fixed-to-floating conversion: it shifts the first operand according to the information supplied by the operand-information extraction and flag judgement unit 202 and stores the shifted result, the shift count and the flag indicating whether the operand is zero in the first-stage DFF 206. The RECIP/RSQRT mantissa processing unit 218 is used for the reciprocal and reciprocal-square-root operations: it processes the mantissa of the first operand supplied by unit 202, performs two lookup operations using the mantissa data as the index, obtains the mantissa results for the two operations, and stores them in the first-stage DFF 206.
In the second-stage Close-path mantissa processing unit 237, the multiplexer 219 selects one of the two results of the 24-bit compound mantissa adder 213 stored in the first-stage DFF 206 according to the Sel signal supplied by the Close-path parallel rounding unit 214 and passes it to the 24-bit left shifter 221. The leading-1 judgement unit 220 binary-encodes the precode supplied by the leading-one predictor 212 and stored in the first-stage DFF 206 to obtain the position of the leading 1 in the F string, using a binary-search algorithm for the encoding; the encoded result is sent to the Close-path exponent compound adder 230 in the second pipeline stage of the exponent path, to the 24-bit left shifter 221 and to the error compensation unit 222. The 24-bit left shifter 221 shifts the result supplied by the multiplexer 219 left by the leading-1 position supplied by unit 220; the bit shifted in on the first left shift is the ShiftIn value supplied by the rounding unit 214, and 0 thereafter. Because the leading-one prediction may be off by one bit, the error compensation unit 222 performs error compensation using the shifted result from 221, the output of 220 and the ShiftIn value from 214: if the shift was one bit short it shifts by one more bit, otherwise it does not. The final Close-path mantissa result and a flag indicating whether an error occurred are thus obtained and sent to the data selection unit 234.
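The patent only states that unit 220 encodes the leading-1 position by binary search; a software analogue of that idea is sketched below under our own naming, operating on a 24-bit precode word held in the low bits of a 32-bit integer.

#include <stdint.h>

/* Binary-search count of leading zeros in a 24-bit precode (f_0 in bit 23):
 * each test halves the remaining window, mirroring a logarithmic-depth encoder. */
static int leading_zeros24(uint32_t f) {
    uint32_t x = f << 8;                  /* align bit 23 to bit 31 so the halving steps are powers of two */
    int n = 0;
    if (x == 0) return 24;
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1;            }
    return n;                             /* leading-1 position = normalization left-shift count */
}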
In the second-stage Far-path mantissa processing unit 238, the OP2 format selection unit 223 selects between the data shifted by the 32-bit right shifter 216 and by the 32-bit fixed-point left shifter 217 according to the near/far-path result selection signal and the addition/subtraction selection signal supplied by the CpFp/addition-subtraction judgement unit 209, extends it to 32 bits and supplies it as the addend to the 32-bit compound mantissa adder 224, while also computing the guard, round and sticky bits for use by the Far-path parallel rounding unit 225. The 32-bit compound mantissa adder 224 extends the augend supplied by the Far-path mantissa selection unit 215 to 32 bits and performs a compound addition with the addend selected by the OP2 format selection unit, producing A+B and A+B+1 with their respective most-significant-bit carry flags; for fixed-to-floating or floating-to-fixed conversion the augend is 0. The Far-path parallel rounding unit 225 performs the parallel rounding of the far-path mantissa from the most-significant-bit carries, most significant bit, least significant bit and second-least-significant bit supplied by 224, the guard, round and sticky bits supplied by the OP2 format selection unit 223 and the rounding type supplied by the input DFF 201; it produces the selection signal for the final rounded result, the flag indicating whether a right shift is needed after rounding, the bit value shifted in when shifting left, and the selection signal used when the parallel rounding of floating-to-fixed conversion is performed, and sends these signals to the multiplexer 226 and to the multiplexer 233 that gates the Far-path exponent computation in the second pipeline stage of the exponent processing. The multiplexer 226 selects between the results supplied by the 32-bit compound mantissa adder 224 according to the rounding selection signals Sel and SelFloat2Fix supplied by 225 and passes the selection to the shift-left/right-by-1-or-hold unit 227. Unit 227 uses the addition/subtraction selection signal supplied by unit 209 and the addition/subtraction selection signal of floating-to-fixed conversion to compute, from the data supplied by the multiplexer 226, the final Far-path mantissa result and the flag indicating whether a left shift occurred, and passes them respectively to the data selection unit 234 and to the multiplexer 233 gating the Far-path exponent computation in the second pipeline stage of the exponent processing.
In round-to-nearest mode, the Far-path parallel rounding unit 225 satisfies the following equations:
Sel = (OpType == `ADDER) ? ((~sp0Cout & g & (LSBSp0 | r | s)) | (sp0Cout & LSBSp0 & (LSB1Sp0 | g | r | s))) : (sp1Cout & ((~g & ~r & ~s) | (g & r) | (MSBSp0 & g & (LSBSp0 | s))));
ShiftIn = (OpType == `ADDER) ? 1'b0 : (sp1Cout & ~MSBSp0 & MSB1Sp0 & ((~g & r & s) | (~r & g)));
RightMove = (OpType == `ADDER) ? (sp0Cout | (~sp0Cout & sp1Cout & g & (r | s | LSBSp0))) : 1'b0;
SelFloat2Fix = (Float2FixOpType == `ADDER) ? (gFloat2Fix & (LSBSp0 | rFloat2Fix | sFloat2Fix)) : ((~gFloat2Fix & ~rFloat2Fix & ~sFloat2Fix) | (gFloat2Fix & (LSBSp0 | sFloat2Fix | rFloat2Fix)));
where Sel is the rounding selection signal (1 selects A+B+1, 0 selects A+B), ShiftIn is the bit value shifted in when shifting left, RightMove indicates whether the result must be shifted right (1 means shift right, 0 means no shift), and SelFloat2Fix is the rounding selection signal during floating-to-fixed conversion (1 selects A+B+1, 0 selects A+B). OpType is the addition/subtraction selection signal supplied by the CpFp/addition-subtraction judgement unit 209, Float2FixOpType is the addition/subtraction selection signal during floating-to-fixed conversion, sp1Cout is the most-significant-bit carry of A+B+1, sp0Cout is the most-significant-bit carry of A+B, MSBSp0 is the most significant bit of A+B, MSB1Sp0 is the second most significant bit of A+B, LSBSp0 is the least significant bit of A+B, and LSB1Sp0 is the second least significant bit of A+B. The bits g, r, s, gFloat2Fix, rFloat2Fix and sFloat2Fix are computed from OpType, Float2FixOpType and the guard, round and sticky bits supplied by the OP2 format selection unit, and satisfy the following equations.
g = (OpType == `ADDER) ? gb : (gb ^ (rb | sb));
r = (OpType == `ADDER) ? rb : (rb ^ sb);
s = (OpType == `ADDER) ? sb : sb;
gFloat2Fix = (Float2FixOpType == `ADDER) ? gb : (gb ^ (rb | sb));
rFloat2Fix = (Float2FixOpType == `ADDER) ? rb : (rb ^ sb);
sFloat2Fix = (Float2FixOpType == `ADDER) ? sb : sb;
where gb, rb and sb are the guard, round and sticky bits supplied by the OP2 format selection unit 223.
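Restated behaviourally in C, with our own helper name and struct: for an effective addition the raw guard/round/sticky bits pass through, while for an effective subtraction they are corrected for the borrow exactly as in the equations above (each input and output is a 0/1 value).

typedef struct { int g, r, s; } grs_bits;

static grs_bits adjust_grs(int is_addition, int gb, int rb, int sb) {
    grs_bits out;
    if (is_addition) { out.g = gb; out.r = rb; out.s = sb; }
    else {
        out.g = gb ^ (rb | sb);   /* borrow may flip the guard bit */
        out.r = rb ^ sb;          /* and the round bit             */
        out.s = sb;               /* sticky is unchanged           */
    }
    return out;
}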
In truncation mode, the Far-path parallel rounding unit 225 satisfies:
Sel = (OpType == `ADDER) ? 1'b0 : (sp1Cout & (~g & ~r & ~s));
ShiftIn = (OpType == `ADDER) ? 1'b0 : (sp1Cout & ~MSBSp0 & MSB1Sp0 & g);
RightMove = (OpType == `ADDER) ? sp0Cout : 1'b0;
SelFloat2Fix = (Float2FixOpType == `ADDER) ? 1'b0 : 1'b1;
where the parameters are as defined for round-to-nearest mode.
In the second-stage exponent processing unit 236, the 8-bit compound exponent adder 230 receives the negated shift count supplied by the leading-1 judgement unit 220 and the larger exponent stored in the first-stage DFF 206, performs a compound addition to obtain A+B and A+B+1, and sends the results to the multiplexer 232. The multiplexer 232 selects between the compound-adder results according to the error flag supplied by the error compensation unit 222, obtains the final Close-path (near-path) exponent result, and sends it to the data selection unit 234. The OP1 unit 228 sets the first operand of the compound adder according to the operation-type information stored in the first-stage DFF 206 and passes it to the 8-bit compound exponent adder 231: for fixed-to-floating conversion it is 158, otherwise it is the larger exponent stored in the first-stage DFF 206. The OP2 unit 229 sets the second operand of the compound adder according to the operation-type information stored in the first-stage DFF 206 and the addition/subtraction selection signal supplied by the CpFp/addition-subtraction judgement unit 209 and passes it to adder 231: for fixed-to-floating conversion it is the negated shift count of the 32-bit fixed-point left shifter 217 when the result of 217 is not 0, and 0 when the result of 217 is 0; for the other operations supported by the invention it is -2 when the addition/subtraction selection signal from 209 selects subtraction, and 0 otherwise. The 8-bit compound exponent adder 231 receives the data supplied by OP1 228 and OP2 229, performs a compound addition to obtain A+B and A+B+1, and passes the results to the multiplexer 233. The multiplexer 233 selects among the results supplied by adder 231 according to the left-shift flag supplied by unit 227 and the right-shift flag supplied by the Far-path parallel rounding unit 225, obtaining the final Far-path exponent result.
The above describes the two-stage-pipeline structure of Fig. 2 that implements single-precision floating-point addition/subtraction and its related operations.
Fig. 3 is a flow chart of the method implemented by the two-stage-pipeline arithmetic unit shown in Fig. 2. The operations of block 301 are performed in the first pipeline stage and those of block 302 in the second pipeline stage. In step 303, the method 300 receives the operation type, rounding type, first operand and second operand and extracts the corresponding sign, exponent and mantissa data. In step 304, the exponent difference is computed and compared. In step 305, the near path performs the mantissa subtraction with a compound adder according to the exponent-difference information. In step 306, the near path performs the leading-1 prediction in parallel; its result must be available before step 310. In step 307, the far path performs the alignment shift according to the exponent difference. In step 308, the near path performs the complement conversion and the parallel rounding of the result according to the rounding type received in step 303. In step 309, the far path performs the mantissa addition/subtraction with a compound adder. In step 310, the near path applies the normalization shift to the output of step 308 according to the leading-1 result, giving the final near-path result. In step 311, the far path performs parallel rounding of its result according to the rounding type received in step 303, giving the final far-path result. In step 312, the result is output from the near path, the far path or the exception handling according to the exponent-difference information from step 304 and the operation type from step 303.
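The two-stage organisation can be mimicked in software as shown below; this is only a structural sketch under our own naming, with the arithmetic of steps 303-312 elided, to show that one operation can enter per cycle and each result appears one cycle later.

#include <string.h>

typedef struct {                 /* data latched in the inter-stage register (role of DFF 206) */
    int op_type, round_type;
    int exp_diff, larger_exp;    /* plus near-path sums, precode, far-path aligned operands, flags */
} stage_regs;

typedef struct { int valid; float value; } fp_result;

/* Stage 1 (steps 303-309): extraction, exponent difference, near-path subtraction and
 * leading-one prediction, far-path alignment and addition.  Body elided in this sketch. */
static stage_regs stage1(void) { stage_regs r; memset(&r, 0, sizeof r); return r; }

/* Stage 2 (steps 310-312): normalization, parallel rounding, near/far/exception selection. */
static fp_result stage2(stage_regs in) { fp_result r = { 1, 0.0f }; (void)in; return r; }

static fp_result run(int n_ops) {
    stage_regs dff;              /* one operation in flight per stage */
    fp_result last = { 0, 0.0f };
    memset(&dff, 0, sizeof dff);
    for (int cycle = 0; cycle <= n_ops; cycle++) {
        if (cycle > 0)     last = stage2(dff);   /* consume what stage 1 latched last cycle */
        if (cycle < n_ops) dff  = stage1();      /* produce the latch for the next cycle */
    }
    return last;
}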
As the embodiments above show, the high-speed floating-point arithmetic unit of the invention supports floating-point addition/subtraction and related operations using dual paths, leading-1 detection and parallel rounding. Apart from the 32-bit fixed-point left shift introduced for fixed-to-floating conversion, the processing moved into the mantissa path for the reciprocal and reciprocal-square-root operations, the exponent overflow check introduced for floating-to-fixed conversion, and the operation-type logic added at data selection and output, all related operations reuse the floating-point add/subtract data path, so the data path is highly multiplexed and the area is small. The pipelined design gives high operating speed and large throughput.
The specific embodiments described above further illustrate the objects, technical solutions and beneficial effects of the present invention. It should be understood that the foregoing is merely specific embodiments of the present invention and is not intended to limit the present invention; any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (16)

1. A high-speed floating-point arithmetic unit, characterized in that the high-speed floating-point arithmetic unit adopts an N-stage pipeline structure, N being a natural number greater than 1, and comprises: an input DFF (101), an operand-information extraction and flag-bit decision unit (115), N stages of floating-point operation structures, N stages of DFFs, and a data selection unit (116), wherein:
the input DFF (101) is connected, through the operand-information extraction and flag-bit decision unit (115), to the first-stage floating-point operation structure of the N stages of floating-point operation structures; the first-stage floating-point operation structure is connected to the first-stage DFF (102) of the N stages of DFFs; the first-stage DFF (102) is connected to the second-stage floating-point operation structure of the N stages of floating-point operation structures; the second-stage floating-point operation structure is connected to the second-stage DFF (103) of the N stages of DFFs; the second-stage DFF (103) is connected to the third-stage floating-point operation structure of the N stages of floating-point operation structures, and so on; the (N-1)th-stage floating-point operation structure of the N stages of floating-point operation structures is connected to the (N-1)th-stage DFF (104) of the N stages of DFFs; the (N-1)th-stage DFF (104) is connected to the Nth-stage floating-point operation structure of the N stages of floating-point operation structures; and finally, the Nth-stage floating-point operation structure is connected, through the data selection unit (116), to the Nth-stage DFF (105) of the N stages of DFFs;
each stage of the N stages of floating-point operation structures comprises an exponent processing unit, a Close-path mantissa processing unit and a Far-path mantissa processing unit of the corresponding stage, and the Close-path mantissa processing unit and the Far-path mantissa processing unit of a given stage are connected to the exponent processing unit of the same stage.
2. The high-speed floating-point arithmetic unit according to claim 1, characterized in that the input DFF (101) receives and stores the operation type, rounding type, first operand and second operand provided from outside, outputs the operation type to the first-stage exponent processing unit (106) and the first-stage DFF (102), outputs the rounding type to the first-stage Close-path mantissa processing unit (109) and the first-stage DFF (102), and passes the first operand and the second operand to the operand-information extraction and flag-bit decision unit (115).
3. The high-speed floating-point arithmetic unit according to claim 2, characterized in that the operation type indicates which one of floating-point addition, floating-point subtraction, floating-to-fixed-point conversion, fixed-to-floating-point conversion, floating-point absolute value, floating-point reciprocal, floating-point reciprocal square root and comparison-class operations is performed, and the rounding type indicates truncation rounding or round-to-nearest.
4. The high-speed floating-point arithmetic unit according to claim 1, characterized in that the operand-information extraction and flag-bit decision unit (115) extracts the sign-bit, exponent-bit and mantissa-bit data of the first operand and the second operand, computes the flag-bit data of the first operand and the second operand, outputs the obtained exponent-bit data of the first operand and the second operand to the first-stage exponent processing unit (106), outputs the mantissa-bit data of the first operand and the second operand to the first-stage Close-path mantissa processing unit (109) and the first-stage Far-path mantissa processing unit (112), and outputs the sign-bit data and the flag-bit data to the first-stage DFF (102) for storage.
5. The high-speed floating-point arithmetic unit according to claim 4, characterized in that the flag-bit data at least comprise a zero flag, an infinity flag and a not-a-number flag.
6. The high-speed floating-point arithmetic unit according to claim 1, characterized in that the first-stage exponent processing unit (106) performs the exponent calculation of floating-point addition/subtraction and its associated operations according to the exponent-bit data of the first operand and the second operand, the operation type, and the data provided by the first-stage Close-path mantissa processing unit (109) and the first-stage Far-path mantissa processing unit (112), and stores the exponent calculation result in the first-stage DFF (102).
7. The high-speed floating-point arithmetic unit according to claim 6, characterized in that:
the second-stage exponent processing unit (107) performs the exponent calculation of floating-point addition/subtraction and its associated operations according to the exponent-bit data of the first operand and the second operand, the operation type, and the data provided by the second-stage Close-path mantissa processing unit (110) and the second-stage Far-path mantissa processing unit (113), and stores the exponent calculation result in the second-stage DFF (103);
the Nth-stage exponent processing unit (108) performs the exponent calculation of floating-point addition/subtraction and its associated operations according to the exponent-bit data of the first operand and the second operand, the operation type, and the data provided by the Nth-stage Close-path mantissa processing unit and the Nth-stage Far-path mantissa processing unit, and outputs the exponent calculation result to the data selection unit (116).
8. The high-speed floating-point arithmetic unit according to claim 6 or 7, characterized in that the exponent calculation of floating-point addition/subtraction and its associated operations specifically comprises: calculating the exponent difference of the two operands, exponent overflow and underflow judgment, and Close-path and Far-path exponent generation, and providing the required data for the Close-path mantissa processing and the Far-path mantissa processing of each stage.
9. The high-speed floating-point arithmetic unit according to claim 1, characterized in that the first-stage Close-path mantissa processing unit (109) performs the Close-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the first-stage exponent processing unit (106), and stores the mantissa calculation result in the first-stage DFF (102).
10. The high-speed floating-point arithmetic unit according to claim 9, characterized in that:
the second-stage Close-path mantissa processing unit (110) performs the Close-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the second-stage exponent processing unit (107), and stores the Close-path mantissa calculation result in the second-stage DFF (103);
the Nth-stage Close-path mantissa processing unit (111) performs the Close-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the Nth-stage exponent processing unit (108), and outputs the Close-path mantissa calculation result to the data selection unit (116).
11. The high-speed floating-point arithmetic unit according to claim 9 or 10, characterized in that the Close-path mantissa calculation specifically comprises: the subtraction mantissa processing for an exponent difference less than or equal to 1, leading-one detection on the calculation result, and the parallel rounding calculation of the Close-path result, and providing the required data for the exponent processing of each stage.
12. The high-speed floating-point arithmetic unit according to claim 1, characterized in that the first-stage Far-path mantissa processing unit (112) performs the Far-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the first-stage exponent processing unit (106), and stores the mantissa calculation result in the first-stage DFF (102).
13. The high-speed floating-point arithmetic unit according to claim 12, characterized in that:
the second-stage Far-path mantissa processing unit (113) performs the Far-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the second-stage exponent processing unit (107), and stores the Far-path mantissa calculation result in the second-stage DFF (103);
the Nth-stage Far-path mantissa processing unit (114) performs the Far-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the Nth-stage exponent processing unit (108), and outputs the Far-path mantissa calculation result to the data selection unit (116).
14. The high-speed floating-point arithmetic unit according to claim 12 or 13, characterized in that the Far-path mantissa calculation specifically comprises: the subtraction mantissa processing for an exponent difference greater than 1, the addition mantissa processing, the parallel rounding calculation of the Far-path result, and the additional calculation required by the mantissa processing of the operations associated with addition/subtraction, and providing the required data for the exponent processing of each stage.
15. The high-speed floating-point arithmetic unit according to claim 1, characterized in that the data selection unit (116) obtains the final calculation result according to the various flag-bit data passed over from the operand-information extraction and flag-bit decision unit (115), the Close-path mantissa calculation result provided by the Nth-stage Close-path mantissa processing unit (111), and the Far-path mantissa calculation result provided by the Nth-stage Far-path mantissa processing unit (114), and passes it to the Nth-stage DFF for storage.
16. The high-speed floating-point arithmetic unit according to claim 1, characterized in that the first-stage DFF (102) stores the intermediate calculation results of the first-stage exponent processing unit (106), the first-stage Close-path mantissa processing unit (109) and the first-stage Far-path mantissa processing unit (112), the sign-bit data and flag-bit data provided by the operand-information extraction and flag-bit decision unit (115), and the operation type and rounding type stored by the input DFF (101), for use by the next-stage exponent processing unit, the next-stage Close-path mantissa processing unit and the next-stage Far-path mantissa processing unit.
CN201110418897.6A 2011-12-15 2011-12-15 A kind of high-speed floating point arithmetical unit adopting multi-stage pipeline arrangement Active CN102566967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110418897.6A CN102566967B (en) 2011-12-15 2011-12-15 A kind of high-speed floating point arithmetical unit adopting multi-stage pipeline arrangement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110418897.6A CN102566967B (en) 2011-12-15 2011-12-15 A kind of high-speed floating point arithmetical unit adopting multi-stage pipeline arrangement

Publications (2)

Publication Number Publication Date
CN102566967A CN102566967A (en) 2012-07-11
CN102566967B true CN102566967B (en) 2015-08-19

Family

ID=46412487

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110418897.6A Active CN102566967B (en) 2011-12-15 2011-12-15 A kind of high-speed floating point arithmetical unit adopting multi-stage pipeline arrangement

Country Status (1)

Country Link
CN (1) CN102566967B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2528497B (en) * 2014-07-24 2021-06-16 Advanced Risc Mach Ltd Apparatus And Method For Performing Floating-Point Square Root Operation
US9696992B2 (en) * 2014-12-23 2017-07-04 Intel Corporation Apparatus and method for performing a check to optimize instruction flow
CN104636114B (en) * 2015-02-12 2018-05-15 北京思朗科技有限责任公司 A kind of rounding method and device of floating number multiplication
US10216479B2 (en) * 2016-12-06 2019-02-26 Arm Limited Apparatus and method for performing arithmetic operations to accumulate floating-point numbers
CN107957976B (en) * 2017-12-15 2020-12-18 安徽寒武纪信息科技有限公司 Calculation method and related product
CN111382390B (en) * 2018-12-28 2022-08-12 上海寒武纪信息科技有限公司 Operation method, device and related product
CN111260044B (en) * 2018-11-30 2023-06-20 上海寒武纪信息科技有限公司 Data comparator, data processing method, chip and electronic equipment
CN109960533B (en) * 2019-03-29 2023-04-25 合芯科技(苏州)有限公司 Floating point operation method, device, equipment and storage medium
CN111596887B (en) * 2020-05-22 2023-07-21 威高国科质谱医疗科技(天津)有限公司 Inner product calculation method based on reconfigurable calculation structure
CN116643718B (en) * 2023-06-16 2024-02-23 合芯科技有限公司 Floating point fusion multiply-add device and method of pipeline structure and processor
CN117251132B (en) * 2023-09-19 2024-05-14 上海合芯数字科技有限公司 Fixed-floating point SIMD multiply-add instruction fusion processing device and method and processor

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1410885A (en) * 2001-09-27 2003-04-16 中国科学院计算技术研究所 Command pipeline system based on operation queue duplicating use and method thereof
CN101174200A (en) * 2007-05-18 2008-05-07 清华大学 5-grade stream line structure of floating point multiplier adder integrated unit
CN101692202A (en) * 2009-09-27 2010-04-07 北京龙芯中科技术服务中心有限公司 64-bit floating-point multiply accumulator and method for processing flowing meter of floating-point operation thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070074008A1 (en) * 2005-09-28 2007-03-29 Donofrio David D Mixed mode floating-point pipeline with extended functions

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1410885A (en) * 2001-09-27 2003-04-16 中国科学院计算技术研究所 Command pipeline system based on operation queue duplicating use and method thereof
CN101174200A (en) * 2007-05-18 2008-05-07 清华大学 5-grade stream line structure of floating point multiplier adder integrated unit
CN101692202A (en) * 2009-09-27 2010-04-07 北京龙芯中科技术服务中心有限公司 64-bit floating-point multiply accumulator and method for processing flowing meter of floating-point operation thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Design of a Fully Pipelined Double-Precision Floating-Point Multiply-Add Unit; Cai Min et al.; Microelectronics & Computer; 2010-01-31; Vol. 27, No. 1; pp. 53-56, 60 *
Design of a Pipelined Floating-Point Adder IP Core; Xia Jie et al.; Microcomputer Information; 2008-12-31; Vol. 24, No. 9-3; pp. 192-193 *

Also Published As

Publication number Publication date
CN102566967A (en) 2012-07-11

Similar Documents

Publication Publication Date Title
CN102566967B (en) A kind of high-speed floating point arithmetical unit adopting multi-stage pipeline arrangement
JP6207574B2 (en) Calculation control indicator cache
CN101174200B (en) 5-grade stream line structure of floating point multiplier adder integrated unit
CN101847087B (en) Reconfigurable transverse summing network structure for supporting fixed and floating points
CN106970775A (en) A kind of general adder of restructural fixed and floating
CN101692202B (en) 64-bit floating-point multiply accumulator and method for processing flowing meter of floating-point operation thereof
CN107305485B (en) Device and method for performing addition of multiple floating point numbers
CN104991757A (en) Floating point processing method and floating point processor
CN104778026A (en) High-speed data format conversion part with SIMD and conversion method
CN101650643B (en) Rounding method for indivisible floating point division radication
Daoud et al. A survey on design and implementation of floating point adder in FPGA
JP5966768B2 (en) Arithmetic circuit, arithmetic processing device, and control method of arithmetic processing device
CN101699390B (en) Self-correction precursor 0/1 predicting unit for floating-point adder
Giri et al. Pipelined floating-point arithmetic unit (fpu) for advanced computing systems using fpga
KR101753162B1 (en) Method for calculating of leading zero, apparatus thereof
Reddy et al. A novel low power error detection logic for inexact leading zero anticipator in floating point units
CN106802783B (en) A kind of addition of decimal result rounding method and apparatus
Deshmukh A novel FPGA based leading one anticipation algorithm for floating point arithmetic units
NAGENDRA et al. Design and Implementation of Fused Floating Point Three-Term Adder
CN103455305A (en) Rounding prediction method for floating point adder

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20171129

Address after: 102412 Beijing City, Fangshan District Yan Village Yan Fu Road No. 1 No. 11 building 4 layer 402

Patentee after: Beijing Si Lang science and Technology Co.,Ltd.

Address before: 100190 Zhongguancun East Road, Beijing, No. 95, No.

Patentee before: Institute of Automation, Chinese Academy of Sciences

CP03 Change of name, title or address
CP03 Change of name, title or address

Address after: 201306 building C, No. 888, Huanhu West 2nd Road, Lingang New District, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai

Patentee after: Shanghai Silang Technology Co.,Ltd.

Address before: 102412 room 402, 4th floor, building 11, No. 1, Yanfu Road, Yancun Town, Fangshan District, Beijing

Patentee before: Beijing Si Lang science and Technology Co.,Ltd.