CN102566967A - High-speed floating point unit in multilevel pipeline organization - Google Patents

High-speed floating point unit in multilevel pipeline organization

Info

Publication number
CN102566967A
CN102566967A (application CN201110418897.6A)
Authority
CN
China
Prior art keywords
mantissa
level
processing unit
floating
operand
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011104188976A
Other languages
Chinese (zh)
Other versions
CN102566967B (en)
Inventor
王东琳
张志伟
王惠娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Silang Technology Co ltd
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201110418897.6A
Publication of CN102566967A
Application granted
Publication of CN102566967B
Legal status: Active
Anticipated expiration

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention discloses a high-speed floating point unit organized as an N-stage pipeline, where N is a natural number larger than 1. The high-speed floating point unit comprises an input DFF (D-type flip-flop), an operand information extraction and flag judgment unit, N levels of floating point arithmetic structures, N levels of DFFs and a data selection unit. The input DFF is connected through the operand information extraction and flag judgment unit to the first-level floating point arithmetic structure among the N levels of floating point arithmetic structures. The first-level floating point arithmetic structure is connected to the first-level DFF among the N levels of DFFs, the first-level DFF is connected to the second-level floating point arithmetic structure, the second-level floating point arithmetic structure is connected to the second-level DFF, the second-level DFF is connected to the third-level floating point arithmetic structure, and so on. The (N-1)th-level DFF is connected to the Nth-level floating point arithmetic structure, and the Nth-level floating point arithmetic structure is connected through the data selection unit to the Nth-level DFF.

Description

High-speed floating-point unit with a multi-stage pipeline organization
Technical field
The present invention relates to the field of floating-point arithmetic in microprocessors, and in particular to a high-speed floating-point unit that adopts a multi-stage pipeline organization.
Background art
The arithmetic unit occupies a critical position in a microprocessor architecture and is a key factor in determining processing speed. Addition, subtraction and their related operations account for a very high proportion of all computation, so improving the arithmetic unit used for addition, subtraction and related operations with effective methods has become a primary direction of current arithmetic-unit research.
Floating-point numbers are normally represented according to the IEEE 754 standard. A single-precision floating-point number in this standard is 32 bits wide and comprises a 1-bit sign (s), 8 exponent bits (e) and 23 fraction bits (f). The exponent is a signed value stored in biased form with a bias of 127; the fraction bits together with an implicit leading 1 form the actual mantissa. The value represented by s-e-f is therefore: (-1)^s × 1.f × 2^(e-127).
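As an illustration of this format (not part of the patent text), the following C sketch extracts the three fields of a single-precision value and reconstructs (-1)^s × 1.f × 2^(e-127) for a normalized input:

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <math.h>

/* Decode a normalized IEEE 754 single-precision value into its fields. */
int main(void) {
    float x = -6.25f;
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);          /* reinterpret the 32 stored bits */

    uint32_t s = bits >> 31;                 /* 1-bit sign */
    uint32_t e = (bits >> 23) & 0xFF;        /* 8 biased exponent bits */
    uint32_t f = bits & 0x7FFFFF;            /* 23 fraction bits */

    /* Mantissa 1.f and value (-1)^s * 1.f * 2^(e-127), valid for normalized numbers. */
    double mant = 1.0 + (double)f / (1 << 23);
    double val  = (s ? -1.0 : 1.0) * mant * pow(2.0, (int)e - 127);

    printf("s=%u e=%u f=0x%06X  value=%g\n", s, e, f, val);
    return 0;
}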
A basic floating-point addition/subtraction requires seven steps. Step 1, exponent comparison: subtract the exponents of the two operands, giving a difference of absolute value d. Step 2, alignment shift: shift the mantissa of the operand with the smaller exponent right by d bits. Step 3, mantissa addition/subtraction: add or subtract the mantissas according to the opcode and the operand signs. Step 4, complement correction: if the mantissa result is negative, convert it back from two's complement. Step 5, leading-one detection: if step 3 was a subtraction, determine how many bit positions the result must be shifted left; for an addition, determine whether it must be shifted right by one bit. Step 6, normalization shift: shift the mantissa result so that its most significant bit is 1. Step 7, rounding and output: round the result as required by the rounding mode and output it.
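The seven steps can be illustrated by the following simplified C sketch; it assumes normalized, finite inputs, uses a plain round-half-up instead of a full IEEE rounding mode, and omits overflow, underflow and special-value handling:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Simplified single-precision addition following the seven basic steps. */
static float fp_add(float fa, float fb) {
    uint32_t a, b;
    memcpy(&a, &fa, 4);
    memcpy(&b, &fb, 4);

    uint32_t sa = a >> 31, sb = b >> 31;
    int32_t  ea = (a >> 23) & 0xFF, eb = (b >> 23) & 0xFF;
    /* mantissa with implicit leading 1, plus 3 extra low bits for rounding */
    uint64_t ma = ((uint64_t)((a & 0x7FFFFF) | 0x800000)) << 3;
    uint64_t mb = ((uint64_t)((b & 0x7FFFFF) | 0x800000)) << 3;

    /* Step 1: exponent comparison (keep the larger operand in sa/ea/ma) */
    if (ea < eb || (ea == eb && ma < mb)) {
        uint32_t ts = sa; sa = sb; sb = ts;
        int32_t  te = ea; ea = eb; eb = te;
        uint64_t tm = ma; ma = mb; mb = tm;
    }
    int32_t d = ea - eb;

    /* Step 2: alignment shift; shifted-out bits collapse into a sticky bit */
    if (d > 27) mb = (mb != 0);
    else        mb = (mb >> d) | ((mb & (((uint64_t)1 << d) - 1)) != 0);

    /* Step 3: mantissa addition/subtraction according to the operand signs */
    uint64_t m = (sa == sb) ? ma + mb : ma - mb;
    uint32_t s = sa;
    if (m == 0) return 0.0f;

    /* Step 4 is implicit (the larger mantissa was kept first, so m >= 0).
     * Steps 5-6: leading-one detection and normalization shift. */
    int32_t e = ea;
    if (m >= ((uint64_t)1 << 27)) { m = (m >> 1) | (m & 1); e++; } /* addition carry-out */
    while (m < ((uint64_t)1 << 26)) { m <<= 1; e--; }              /* after subtraction */

    /* Step 7: rounding (round half up on the 3 extra bits) and repacking */
    m = (m + 4) >> 3;
    if (m >= ((uint64_t)1 << 24)) { m >>= 1; e++; }

    uint32_t r = (s << 31) | ((uint32_t)e << 23) | ((uint32_t)m & 0x7FFFFF);
    float out;
    memcpy(&out, &r, 4);
    return out;
}

int main(void) {
    printf("%g %g\n", fp_add(1.5f, 2.25f), fp_add(5.0f, -3.0f)); /* 3.75 2 */
    return 0;
}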
The basic floating-point addition/subtraction involves many steps and therefore has a long delay. To raise its speed, much research has been devoted to improving the basic algorithm, producing a variety of optimizations.
Chinese patent application 200910152505.9, "Floating-point addition device based on two's-complement rounding", published on February 17, 2010 (inventors: Yan Xiaolang et al.), discloses a floating-point addition device based on two's-complement rounding that supports floating-point addition and subtraction. The device comprises: an exponent adder, a mantissa shifter and a mantissa-operand preparation logic unit, which prepare the mantissa operands for the mantissa adder according to the sign bits and the exponent difference of the first and second floating-point operands; a rounding decision logic unit, which makes a unified rounding decision on the mantissa adder result, judging the sign of the mantissa sum from the most significant bit of the mantissa adder output, determining the bits used for the rounding decision from the four high-order bits of the adder output, and unifying the sign-magnitude "round by adding 1" decision logic with the two's-complement "round by adding 0" decision logic; and a rounding adder, which rounds the mantissa adder result and completes the complement operation on the mantissa sum.
Chinese patent application 200910218505.4, "Self-error-correcting leading 0/1 prediction unit for a floating-point adder", published on April 28, 2010 (inventors: Shao Zhibiao et al.), discloses a self-error-correcting leading 0/1 prediction method for a floating-point adder. By combining multi-input logic gates with parallel computation, the prediction output is already the final correct result and does not need to be corrected from the adder result; because the computation is parallel, increasing the operand bit width does not affect the critical path depth. While the floating-point addition is being computed, the shift count needed to normalize the result and the corresponding exponent adjustment are predicted synchronously; the prediction does not depend on the adder output but is produced by the prediction unit alone, requires no further correction, and the critical path of the prediction unit does not lengthen with the operand bit width.
Chinese patent application 200580034798.0, "Method for a high-speed floating-point arithmetic unit (ALU)", published on September 19, 2007, discloses techniques for improving the near-path exponent-difference computation in an arithmetic unit for a microprocessor. In one embodiment, separate logic is provided for near-path subtraction and far-path subtraction, and the exponent-difference signal is produced by subtracting only the two least significant exponent bits of the two floating-point operands.
The three patents above optimize, respectively, the rounding, the leading-one detection and the exponent comparison among the seven basic floating-point addition/subtraction steps in order to improve performance, but none of them provides an improved design for the overall floating-point add/subtract datapath, so they offer little reference or guidance for the overall design. The present invention instead takes the overall datapath of high-speed floating-point addition/subtraction and its related operations as its starting point and provides a design scheme for it.
Chinese patent application CN102243577A, "Circuit for fast floating-point addition", published on November 16, 2011 (inventor: Wang Yongliu), discloses a circuit for fast floating-point addition that adds an extra adder unit and splits the whole floating-point addition into two parts executed in parallel, handling the case where the exponent difference is zero in parallel with the case where it is non-zero. Careful analysis shows, however, that opposite-sign additions with an exponent difference of 1 and additions with an exponent difference greater than 1 must all go through the unequal-exponent path, and that path must contain both the mantissa alignment shift of the basic addition steps and the normalization shift of the result, so the scheme does not actually raise the computation speed.
Chinese patent application 201010594926.X, "FPGA-based high-speed low-latency floating-point accumulator and implementation method", published on December 17, 2010 (inventors: Chen Yaowu et al.), discloses a high-speed, low-latency FPGA-based floating-point accumulator and its implementation. The accumulator contains a floating-point adder unit and the overall design uses a pipeline structure, but no detailed design of the floating-point adder itself is given; moreover, the whole design is based on an FPGA, whose basic cell structure is completely different from that used in chip design, so it offers little reference for the design of general-purpose, non-FPGA processing elements.
The analysis above shows that many optimized designs for floating-point addition/subtraction have been proposed, but most of the work concentrates on improving local logic and does not give an effective and practical overall design for the floating-point add/subtract datapath. Some work proposes pipelined floating-point add/subtract designs, but without detailed design and optimization schemes, and with limited applicability.
Summary of the invention
(1) Technical problem to be solved
In view of this, the main object of the present invention is to provide a high-speed floating-point unit with a multi-stage pipeline organization that implements floating-point addition, subtraction and their related operations.
(2) Technical solution
To achieve the above object, the invention provides a high-speed floating-point unit organized as an N-stage pipeline, N being a natural number greater than 1, comprising: an input DFF 101, an operand-information extraction and flag judgment unit 115, N levels of floating-point operation structures, N levels of DFFs and a data selection unit 116. The input DFF 101 is connected through the operand-information extraction and flag judgment unit 115 to the first-level floating-point operation structure among the N levels of floating-point operation structures; the first-level floating-point operation structure is connected to the first-level DFF 102 among the N levels of DFFs; the first-level DFF 102 is connected to the second-level floating-point operation structure; the second-level floating-point operation structure is connected to the second-level DFF 103; the second-level DFF 103 is connected to the third-level floating-point operation structure, and so on; the (N-1)th-level floating-point operation structure is connected to the (N-1)th-level DFF 104, which is connected to the Nth-level floating-point operation structure; finally, the Nth-level floating-point operation structure is connected through the data selection unit 116 to the Nth-level DFF 105.
In the above scheme, each level of the floating-point operation structure comprises an exponent processing unit, a Close-path mantissa processing unit and a Far-path mantissa processing unit for that level, with the Close-path and Far-path mantissa processing units both connected to the exponent processing unit of the same level.
In the above scheme, the input DFF 101 receives and stores the operation type, rounding type, first operand and second operand supplied from outside, outputs the operation type to the first-level exponent processing unit 106 and the first-level DFF 102, outputs the rounding type to the first-level Close-path mantissa processing unit 109 and the first-level DFF 102, and passes the first and second operands to the operand-information extraction and flag judgment unit 115.
In the above scheme, the operand-information extraction and flag judgment unit 115 extracts the sign, exponent and mantissa bits of the first and second operands and computes their flag bits; it outputs the exponent bits of both operands to the first-level exponent processing unit 106, outputs the mantissa bits of both operands to the first-level Close-path mantissa processing unit 109 and the first-level Far-path mantissa processing unit 112, and outputs the sign bits and flag bits to the first-level DFF 102 for storage.
In the above scheme, the first-level exponent processing unit 106 performs the exponent computation of floating-point addition/subtraction and its related operations according to the exponent bits of the first and second operands, the operation type, and the data supplied by the first-level Close-path mantissa processing unit 109 and the first-level Far-path mantissa processing unit 112, and stores the exponent result in the first-level DFF 102.
In the above scheme, the first-level Close-path mantissa processing unit 109 performs the Close-path mantissa computation according to the mantissa bits of the first and second operands and the data supplied by the first-level exponent processing unit 106, and stores the mantissa result in the first-level DFF 102.
In the above scheme, the first-level Far-path mantissa processing unit 112 performs the Far-path mantissa computation according to the mantissa bits of the first and second operands and the data supplied by the first-level exponent processing unit 106, and stores the mantissa result in the first-level DFF 102.
In the above scheme, the data selection unit 116 obtains the final result according to the flag bits passed down from the operand-information extraction and flag judgment unit 115, the Close-path mantissa result supplied by the Nth-level Close-path mantissa processing unit 111 and the Far-path mantissa result supplied by the Nth-level Far-path mantissa processing unit 114, and passes it to the Nth-level DFF for storage.
In the above scheme, the first-level DFF 102 stores the intermediate results of the first-level exponent processing unit 106, the first-level Close-path mantissa processing unit 109 and the first-level Far-path mantissa processing unit 112, together with the sign and flag bits supplied by the operand-information extraction and flag judgment unit 115 and the operation type and rounding type stored in the input DFF 101, for use by the next-level exponent processing unit, Close-path mantissa processing unit and Far-path mantissa processing unit.
(3) Beneficial effects
The above technical scheme shows that the present invention has the following beneficial effects:
1. The high-speed floating-point unit provided by the invention supports floating-point addition/subtraction and its related operations using a dual-path structure, leading-one prediction and parallel rounding. Apart from the 32-bit fixed-point left shift introduced for fixed-to-float conversion, the reciprocal and inverse-square-root operations folded into the mantissa processing, the exponent-overflow check introduced for float-to-fixed conversion, and the extra operation-type decision logic added at the data-selection and output stage, all the related operations reuse the floating-point add/subtract datapath, giving high datapath reuse and small area. The pipelined design gives high operating speed and large throughput.
2. The high-speed floating-point unit provided by the invention supports floating-point addition/subtraction and its multiple related operations, reuses the datapath efficiently, and the multi-stage pipeline effectively shortens the datapath, increases throughput and meets the high-speed requirement. The embodiments give both the general multi-stage pipelined structure capable of floating-point addition/subtraction and its related operations and a two-stage pipelined implementation of that structure.
Description of drawings
Fig. 1 is a structural diagram of the high-speed floating-point unit with multi-stage pipeline organization according to an embodiment of the invention.
Fig. 2 is a structural diagram of the high-speed floating-point unit with two-stage pipeline organization according to an embodiment of the invention.
Fig. 3 is a flow chart of the method implemented by the two-stage pipelined high-speed floating-point unit shown in Fig. 2.
Embodiment
To make the object, technical scheme and advantages of the invention clearer, the invention is further explained below with reference to specific embodiments and the accompanying drawings.
In the dual-path parallel approach to addition or subtraction, the complement conversion and the shift operation are mutually exclusive, and only one of the alignment shift and the normalization shift needs a full-width shifter. A near path (Close path) and a far path (Far path) are therefore provided: the near path handles subtractions whose exponent difference is at most 1, while the far path handles additions and subtractions whose exponent difference is greater than 1.
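As a small illustration of this path split (names are illustrative, not the patent's signals):

/* Illustrative selection rule for the dual-path scheme just described. */
typedef enum { CLOSE_PATH, FAR_PATH } fp_path;

static fp_path select_path(int effective_subtract, int exp_diff_abs) {
    /* Near (Close) path: effective subtraction with exponent difference 0 or 1;
     * Far path: every addition, and subtractions with exponent difference > 1. */
    return (effective_subtract && exp_diff_abs <= 1) ? CLOSE_PATH : FAR_PATH;
}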
Leading-one prediction computes the number of leading zeros of the input, which is exactly the normalization shift count. It consists of three parts: precoding, leading-one detection and error correction. The precoding produces a 0/1 string from the mantissas of the two operands whose leading 1 lies in the same position as the leading 1 of the mantissa result (possibly with a 1-bit error); the leading-one detection encodes the position of the leading 1 of that string in binary, giving the normalization shift count; and the error correction fixes the 1-bit error that may arise during precoding.
Rounding makes a small adjustment to the result. Using a compound adder, all possible results are computed simultaneously, so that the final rounding step reduces to a selection, which noticeably raises the operating speed. Parallel rounding moves the rounding, normally the last operation, forward so that it is computed in parallel with the mantissa addition/subtraction, increasing parallelism and shortening the add/subtract path.
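A minimal sketch of this "compute both, then select" rounding, assuming a conventional guard/round/sticky round-to-nearest-even decision (names and the exact decision are illustrative, not taken from the patent):

#include <stdint.h>

/* A compound adder produces sum and sum+1; rounding only picks one of them. */
typedef struct { uint32_t sum; uint32_t sum_plus_1; } compound_result;

static compound_result compound_add(uint32_t a, uint32_t b) {
    compound_result r = { a + b, a + b + 1u };  /* produced simultaneously in hardware */
    return r;
}

static uint32_t round_nearest_even(compound_result cr, int lsb, int g, int r, int s) {
    int increment = g && (r || s || lsb);       /* round up on g&(r|s), or on a tie toward even */
    return increment ? cr.sum_plus_1 : cr.sum;
}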
Pipelining is widely used and is well known for its high speed and high throughput. In addition, the operations related to floating-point addition/subtraction, such as float/fixed conversion and floating-point comparison, can largely reuse the add/subtract datapath, keeping the circuit simple, raising reuse and reducing area. The overall optimization of the floating-point add/subtract datapath and its reuse structure, combining the techniques above, is therefore the focus of the present invention.
In view of this, the invention proposes a high-speed floating-point unit with a multi-stage pipeline organization that uses the multi-stage pipelined structure to perform, in parallel, the addition, subtraction and related operations of two floating-point operands. The related operations include float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point inverse square root and the comparison operations (equal, not-equal, greater-than, greater-or-equal, less-than, less-or-equal). The number of pipeline stages can be chosen according to the needs of the application.
Fig. 1 shows the structure of the high-speed floating-point unit 100 with multi-stage (N-level) pipeline organization according to an embodiment of the invention. The unit comprises an input D-type flip-flop (DFF) 101, an operand-information extraction and flag judgment unit 115, N levels of floating-point operation structures, N levels of DFFs and a data selection unit 116, N being a natural number greater than 1. The input DFF 101 is connected through the operand-information extraction and flag judgment unit 115 to the first-level floating-point operation structure; the first-level floating-point operation structure is connected to the first-level DFF, which is connected to the second-level floating-point operation structure; the second-level floating-point operation structure is connected to the second-level DFF, which is connected to the third-level floating-point operation structure, and so on; the (N-1)th-level floating-point operation structure is connected to the (N-1)th-level DFF, which is connected to the Nth-level floating-point operation structure; finally, the Nth-level floating-point operation structure is connected through the data selection unit 116 to the Nth-level DFF.
Each level of the N levels of floating-point operation structures comprises an exponent processing unit, a Close-path mantissa processing unit and a Far-path mantissa processing unit for that level, with the Close-path and Far-path mantissa processing units connected to the exponent processing unit of the same level.
The input DFF 101 receives and stores the operation type, rounding type, first operand and second operand supplied from outside. The operation type indicates which operation is requested: floating-point addition, subtraction, float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point inverse square root or one of the comparison operations (equal, not-equal, greater-than, greater-or-equal, less-than, less-or-equal); the rounding type indicates truncation or round-to-nearest. The operation type is output to the first-level exponent processing unit 106 and the first-level DFF 102, the rounding type to the first-level Close-path mantissa processing unit 109 and the first-level DFF 102, and the first and second operands are passed to the operand-information extraction and flag judgment unit 115.
The operand-information extraction and flag judgment unit 115 extracts the sign, exponent and mantissa bits of the first and second operands and computes flag bits such as the zero, infinity and NaN flags of the two operands. It outputs the exponent bits of both operands to the first-level exponent processing unit 106, the mantissa bits to the first-level Close-path mantissa processing unit 109 and the first-level Far-path mantissa processing unit 112, and stores the sign and flag bits in the first-level DFF 102 for use by the next pipeline stage.
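For illustration, the zero, infinity and NaN flags of one single-precision operand can be derived as in the following sketch (structure and names are assumptions of this example, not the patent's exact signals):

#include <stdint.h>

typedef struct { int zero; int inf; int nan; } fp_flags;

static fp_flags classify_operand(uint32_t bits) {
    uint32_t e = (bits >> 23) & 0xFF;
    uint32_t f = bits & 0x7FFFFF;
    fp_flags fl;
    fl.zero = (e == 0)    && (f == 0);   /* +0 or -0 */
    fl.inf  = (e == 0xFF) && (f == 0);   /* +inf or -inf */
    fl.nan  = (e == 0xFF) && (f != 0);   /* quiet or signaling NaN */
    return fl;
}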
The first-level exponent processing unit 106 performs the exponent computation of floating-point addition/subtraction and its related operations according to the exponent bits of the first and second operands, the operation type, and the data supplied by the first-level Close-path mantissa processing unit 109 and the first-level Far-path mantissa processing unit 112, and stores the exponent result in the first-level DFF 102.
The first-level Close-path mantissa processing unit 109 performs the Close-path mantissa computation according to the mantissa bits of the first and second operands and the data supplied by the first-level exponent processing unit 106, and stores the mantissa result in the first-level DFF 102.
The first-level Far-path mantissa processing unit 112 performs the Far-path mantissa computation according to the mantissa bits of the first and second operands and the data supplied by the first-level exponent processing unit 106, and stores the mantissa result in the first-level DFF 102.
The first-level DFF 102 stores the intermediate results of the first-level exponent processing unit 106, the first-level Close-path mantissa processing unit 109 and the first-level Far-path mantissa processing unit 112, together with the sign and flag bits supplied by the operand-information extraction and flag judgment unit 115 and the operation type and rounding type stored in the input DFF 101, for use by the next-level exponent processing unit, Close-path mantissa processing unit and Far-path mantissa processing unit.
Similarly to the first-level exponent processing unit 106, the second-level exponent processing unit 107 performs the exponent computation of floating-point addition/subtraction and its related operations according to the exponent bits of the two operands, the operation type and the data supplied by the second-level Close-path mantissa processing unit 110 and the second-level Far-path mantissa processing unit 113, and stores the result in the second-level DFF 103. The Nth-level exponent processing unit 108 does the same with the data supplied by the Nth-level Close-path and Far-path mantissa processing units and outputs its exponent result to the data selection unit 116. The exponent computation specifically comprises: computing the exponent difference of the two operands, checking for exponent overflow and underflow, generating the Close-path and Far-path exponents, and supplying the Close-path and Far-path mantissa processing at each level with the data they need, such as the exponent difference.
Similarly to the first-level Close-path mantissa processing unit 109, the second-level Close-path mantissa processing unit 110 performs the Close-path mantissa computation according to the mantissa bits of the two operands and the data supplied by the second-level exponent processing unit 107, and stores the Close-path mantissa result in the second-level DFF 103. The Nth-level Close-path mantissa processing unit 111 performs the Close-path mantissa computation according to the mantissa bits of the two operands and the data supplied by the Nth-level exponent processing unit 108 and outputs its Close-path mantissa result to the data selection unit 116. The Close-path mantissa computation specifically comprises: the mantissa subtraction for exponent differences of at most 1, the leading-one detection on the result together with the parallel rounding of the Close-path result, and supplying the exponent processing at each level with the data it needs, such as the shift count determined by the leading-one detection, used to adjust the Close-path exponent result.
Similarly to the first-level Far-path mantissa processing unit 112, the second-level Far-path mantissa processing unit 113 performs the Far-path mantissa computation according to the mantissa bits of the two operands and the data supplied by the second-level exponent processing unit 107, and stores the Far-path mantissa result in the second-level DFF 103. The Nth-level Far-path mantissa processing unit 114 performs the Far-path mantissa computation according to the mantissa bits of the two operands and the data supplied by the Nth-level exponent processing unit 108 and outputs its Far-path mantissa result to the data selection unit 116. The Far-path mantissa computation specifically comprises: the mantissa subtraction for exponent differences greater than 1, the mantissa addition, the parallel rounding of the Far-path result, the additional processing structures needed for the mantissa computation of the related add/subtract operations, and supplying the exponent processing at each level with the data it needs, such as the shift count of the result after parallel rounding, used to adjust the Far-path exponent result.
The data selection unit 116 obtains the final result according to the flag bits passed down from the operand-information extraction and flag judgment unit 115, the Close-path mantissa result supplied by the Nth-level Close-path mantissa processing unit 111 and the Far-path mantissa result supplied by the Nth-level Far-path mantissa processing unit 114, and passes it to the Nth-level DFF for storage.
The first-level DFF 102, the second-level DFF 103, ..., and the (N-1)th-level DFF 104 store the intermediate results of the exponent processing, the Close-path mantissa processing and the Far-path mantissa processing of their level, for use by the exponent processing, Close-path mantissa processing and Far-path mantissa processing of the next level.
In addition, the ellipsis between the second-level DFF 103 and the (N-1)th-level DFF 104 indicates that the pipeline can be divided into as many stages as required by the application.
Fig. 2 shows the structure of the high-speed floating-point unit 200 with a two-stage pipeline according to an embodiment of the invention. This embodiment uses the two-stage pipelined structure to perform, in parallel, the addition, subtraction, float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point inverse square root and comparison operations (equal, not-equal, greater-than, greater-or-equal, less-than, less-or-equal) of two floating-point operands. As shown in Fig. 2, the unit 200 comprises an input DFF 201, an operand-information extraction and flag judgment unit 202, a first-level exponent processing unit 203, a first-level Close-path mantissa processing unit 204, a first-level Far-path mantissa processing unit 205, a first-level DFF 206, a second-level exponent processing unit 236, a second-level Close-path mantissa processing unit 237, a second-level Far-path mantissa processing unit 238, a data selection unit 234 and a second-level DFF 235.
The input DFF 201 receives and stores the operation type, rounding type, first operand and second operand supplied from outside. The operation type indicates which of the supported operations is requested (floating-point addition, subtraction, float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point inverse square root or one of the comparison operations); the rounding type indicates truncation or round-to-nearest; the first and second operands are each 32-bit data. The operation type is output to the first-level exponent processing unit 203 and the first-level DFF 206, the rounding type to the first-level Close-path mantissa processing unit 204 and the first-level DFF 206, and the two operands are passed to the operand-information extraction and flag judgment unit 202.
The operand-information extraction and flag judgment unit 202 extracts the sign, exponent and mantissa of the first and second operands and computes their zero, infinity and NaN flags. It outputs the exponents of both operands to the first-level exponent processing unit 203, outputs the mantissas of both operands to the first-level Close-path mantissa processing unit 204 and the first-level Far-path mantissa processing unit 205, and stores the sign and flag bits in the first-level DFF 206 for use by the second pipeline stage.
The first-level DFF 206 stores the intermediate results of the first-level exponent processing unit 203, the first-level Close-path mantissa processing unit 204 and the first-level Far-path mantissa processing unit 205, together with the sign and flag bits supplied by the operand-information extraction and flag judgment unit 202 and the operation type and rounding type stored in the input DFF 201, and supplies them to the second pipeline stage.
The data selection unit 234 analyzes the results coming from the first-level exponent processing unit 203, the first-level Close-path mantissa processing unit 204 and the first-level Far-path mantissa processing unit 205 according to the operation type, sign bits and flag bits provided by the first-level DFF, obtains the results of floating-point addition, subtraction, float-to-fixed conversion, fixed-to-float conversion, floating-point absolute value, floating-point reciprocal, floating-point inverse square root, and the equal, not-equal, greater-than, greater-or-equal, less-than and less-or-equal comparisons under both normal and exceptional conditions, and outputs them to the second-level DFF 235. The second-level DFF 235 stores and outputs the correct result provided by the data selection unit 234.
The internal structure and data interaction of the first-level exponent processing unit 203, the first-level Close-path mantissa processing unit 204, the first-level Far-path mantissa processing unit 205, the second-level exponent processing unit 236, the second-level Close-path mantissa processing unit 237 and the second-level Far-path mantissa processing unit 238 are described in detail below.
In the first-level exponent processing unit 203, the 8-bit compound exponent adder 207 receives the exponents of the two operands, say A and B, provided by the operand-information extraction and flag judgment unit 202 and simultaneously computes A+B and A+B+1 together with the carry out of the most significant bit of each. For the related operations of floating-point addition/subtraction, namely fixed-to-float conversion and float-to-fixed conversion, the inputs are 158 and the bitwise inverse of the first operand's exponent; for the floating-point reciprocal the inputs are 254 and that inverse; for the floating-point inverse square root the inputs are 190 and that inverse; this allows the adder to be reused across operations. The results are output to the exponent generation / float-to-fixed overflow judgment block 208. Block 208 uses the operation type provided by the input DFF and the compound-adder results to generate the exponent difference, the larger exponent, the float-to-fixed overflow and underflow signals, and the exponents of the reciprocal and inverse-square-root operations, and outputs them to the CpFp/addition-subtraction judgment 209 and to the first-level DFF 206 for storage. The CpFp/addition-subtraction judgment 209 uses the operation type provided by the input DFF 201 and the exponent difference provided by block 208 to produce the selection signal between the near-path and far-path mantissa results and the addition/subtraction selection signal.
In the first-level Close-path mantissa processing unit 204, the Close-path mantissa selection 210 selects between the two mantissas provided by the operand-information extraction and flag judgment unit 202 according to the exponent comparison result provided by block 208: the mantissa of the operand with the larger exponent is sent to the 24-bit compound mantissa adder 213 as the augend, and the mantissa of the operand with the smaller exponent is sent to the 0-or-1 right shifter 211, which, according to the exponent-difference information provided by block 208, shifts right by one bit if the exponent difference is 1 and otherwise does not shift; the shifted data is passed as the addend to the 24-bit compound adder 213, to the leading-one prediction 212 and to the Close-path parallel rounding 214. The 24-bit compound mantissa adder 213 receives the augend from the Close-path mantissa selection 210 and the addend from the 0-or-1 right shifter 211, inverts the addend and performs the compound addition, producing A+B and A+B+1 together with the carry out of the most significant bit and a negative flag for each, and stores them in the first-level DFF 206; if a result is less than 0 its negative flag is asserted. For single-precision operation the mantissa has 24 significant bits, so the compound adder is 24 bits wide. The Close-path parallel rounding 214 performs the Close-path parallel rounding according to the most-significant-bit carry, the most significant bit and the least significant bit provided by the 24-bit compound mantissa adder 213, the guard bit provided by the 0-or-1 right shifter 211 and the rounding-mode information provided by the input DFF 201, producing the selection signal for the final rounded result and the value to shift in during the post-rounding normalization shift, and stores them in the first-level DFF 206.
Under round-to-nearest mode, the Close-path parallel rounding 214 satisfies the following equations:
Sel = sp1Cout & (~g | (MSBSp0 & g & LSBSp0));
ShiftIn = sp1Cout & ~MSBSp0 & g;
where Sel is the generated rounding selection signal (Sel = 1 selects A+B+1, Sel = 0 selects A+B), ShiftIn is the data value shifted in during normalization, sp1Cout is the carry out of the most significant bit of A+B+1, g is the guard bit produced by negating the bit shifted out by the 0-or-1 right shift, MSBSp0 is the most significant bit of A+B, and LSBSp0 is the least significant bit of A+B.
Under truncation mode, the Close-path parallel rounding 214 satisfies:
Sel = sp1Cout & ~g;
ShiftIn = sp1Cout & g & ~MSBSp0;
where the parameters have the same meaning as in round-to-nearest mode.
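For readability, the two pairs of equations above can be read together as the following small C helper (an illustrative restatement of the patent formulas, with the bit names taken from the text; it adds no new logic):

typedef struct { int sel; int shift_in; } close_round;

static close_round close_path_round(int nearest_mode,
                                    int sp1Cout, int MSBSp0, int LSBSp0, int g) {
    close_round o;
    if (nearest_mode) {                 /* round to nearest */
        o.sel      = sp1Cout & (!g | (MSBSp0 & g & LSBSp0));
        o.shift_in = sp1Cout & !MSBSp0 & g;
    } else {                            /* truncation */
        o.sel      = sp1Cout & !g;
        o.shift_in = sp1Cout & g & !MSBSp0;
    }
    return o;
}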
The leading-one prediction 212 produces a precoding from the mantissa of the floating-point number with the larger exponent, supplied by the Close-path mantissa selection 210, and the aligned mantissa of the floating-point number with the smaller exponent, supplied by the 0-or-1 right shifter 211, and stores it in the first-level DFF 206. The precoding satisfies:
f_i = e_{i-1} & ((g_i & ~s_{i+1}) | (s_i & ~g_{i+1})) | ~e_{i-1} & ((s_i & ~s_{i+1}) | (g_i & ~g_{i+1}));
where the two mantissas are defined as A = a_0 a_1 ... a_{m-2} a_{m-1} and B = b_0 b_1 ... b_{m-2} b_{m-1}. Let W = A - B be formed without borrow propagation, with w_i = a_i - b_i and w_i ∈ {-1, 0, 1}. F = f_0 f_1 ... f_{m-2} f_{m-1}, with f_i ∈ {0, 1}, is the 0/1 string produced by the precoding; the position of its leading 1 is the same as the position of the leading 1 of the mantissa result (with at most a 1-bit error). The value of the F string is generated from the values of the W string. For the three possible values w_i ∈ {-1, 0, 1}, the signals e_i, g_i and s_i are defined so that e_i = 1 when w_i = 0, g_i = 1 when w_i = 1, and s_i = 1 when w_i = -1.
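As an illustration only, the precoding and the e/g/s definitions above can be sketched in C as follows for m = 24 mantissa bits (a[0] being the most significant bit); the boundary handling, treating e before the first bit as 1 and g/s beyond the last bit as 0, is an assumption of this sketch rather than something stated in the patent:

#define LZA_M 24

static void lza_precode(const int a[LZA_M], const int b[LZA_M], int f[LZA_M]) {
    int e[LZA_M], g[LZA_M], s[LZA_M];
    for (int i = 0; i < LZA_M; i++) {
        e[i] = (a[i] == b[i]);             /* w_i =  0 */
        g[i] = (a[i] == 1 && b[i] == 0);   /* w_i =  1 */
        s[i] = (a[i] == 0 && b[i] == 1);   /* w_i = -1 */
    }
    for (int i = 0; i < LZA_M; i++) {
        int e_prev = (i == 0) ? 1 : e[i - 1];            /* assumed boundary value */
        int g_next = (i == LZA_M - 1) ? 0 : g[i + 1];
        int s_next = (i == LZA_M - 1) ? 0 : s[i + 1];
        f[i] = (e_prev  & ((g[i] & !s_next) | (s[i] & !g_next))) |
               (!e_prev & ((s[i] & !s_next) | (g[i] & !g_next)));
    }
}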
In the first-level Far-path mantissa processing unit 205, the Far-path mantissa selection 215 selects between the two mantissas provided by the operand-information extraction and flag judgment unit 202 according to the exponent comparison result provided by block 208: the mantissa of the operand with the larger exponent is stored in the first-level DFF 206 as the augend, and the mantissa of the operand with the smaller exponent is sent to the 32-bit right shifter 216, which aligns it by shifting right according to the exponent-difference information provided by block 208 and stores the shifted data in the first-level DFF 206; the right shifter is 32 bits wide so that it can also perform the shift needed for float-to-fixed conversion, allowing reuse. The 32-bit fixed-point left shifter 217 is used for fixed-to-float conversion: it shifts according to the information of the first operand provided by the operand-information extraction and flag judgment unit 202 and passes the shifted result, the shift count and the operand-is-zero flag to the first-level DFF 206. The RECIP/RSQRT mantissa processing 218 is used for the reciprocal and inverse-square-root operations: it processes the mantissa of the first operand provided by the operand-information extraction and flag judgment unit 202 by performing two table lookups with the mantissa data as the index, obtains the mantissa results for the two operations, and passes them to the first-level DFF 206.
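The guard, round and sticky bits that the far path later feeds to the parallel rounding can be pictured with the following alignment-shift sketch (illustrative only, using a 24-bit mantissa view; the patent's hardware uses a 32-bit shifter shared with the float-to-fixed conversion and a separate "OP2 format selection" block):

#include <stdint.h>

typedef struct { uint32_t aligned; int guard, round, sticky; } align_out;

static align_out align_right(uint32_t mant24, int shift) {
    align_out o = {0, 0, 0, 0};
    if (shift > 26) {                       /* everything ends up below the round bit */
        o.sticky = (mant24 != 0);
        return o;
    }
    uint64_t wide = (uint64_t)mant24 << 26; /* 24 kept bits + guard + round + sticky field */
    wide >>= shift;
    o.aligned = (uint32_t)(wide >> 26);     /* bits that stay within the adder width */
    o.guard   = (int)((wide >> 25) & 1);    /* first bit shifted out */
    o.round   = (int)((wide >> 24) & 1);    /* second bit shifted out */
    o.sticky  = (wide & 0xFFFFFF) != 0;     /* OR of all lower bits */
    return o;
}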
In the second-level Close-path mantissa processing unit 237, the multiplexer 219 uses the Sel signal produced by the Close-path parallel rounding 214 and stored in the first-level DFF 206 to select one of the two results output by the 24-bit compound mantissa adder 213 and passes it to the 24-bit left shifter 221. The leading-one detection 220 encodes in binary the precoding produced by the leading-one prediction 212 and stored in the first-level DFF 206, obtaining the position of the leading 1 in the F string; a binary-search algorithm is used for the encoding, and the encoded result is supplied to the Close-path compound exponent adder 230 in the second-stage exponent processing, to the 24-bit left shifter 221 and to the error compensation 222. The 24-bit left shifter 221 shifts the result selected by the multiplexer 219 left according to the leading-one position provided by the leading-one detection 220; the bit shifted in on the first left shift is the ShiftIn value provided by the Close-path parallel rounding 214, and 0 is shifted in thereafter. Because the leading-one prediction may be off by one bit, the error compensation 222 checks the result of the 24-bit left shifter 221 against the output of the leading-one detection 220 and the ShiftIn value from the Close-path parallel rounding 214, shifts left by one more bit if the shift fell short by one and otherwise does not shift, thereby obtaining the final Close-path mantissa result and a flag indicating whether an error occurred, which are output to the data selection 234.
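The binary-search ("binary chop") encoding of the leading-one position mentioned above can be illustrated by the following C sketch for a 24-bit precoded string; it is an example of the search strategy, not the patent's exact circuit:

#include <stdint.h>

/* Leading-one detection on a 24-bit value f (bit 23 is the MSB, f assumed nonzero).
 * The returned number of leading zeros is the normalization shift count. */
static int leading_one_position(uint32_t f) {
    int pos = 0;
    if ((f & 0xFFF000) == 0) { pos += 12; f <<= 12; }  /* leading 1 not in the top 12 bits */
    if ((f & 0xFC0000) == 0) { pos += 6;  f <<= 6;  }  /* ... nor in the top 6 bits */
    if ((f & 0xE00000) == 0) { pos += 3;  f <<= 3;  }  /* ... nor in the top 3 bits */
    if ((f & 0xC00000) == 0) { pos += 2; }             /* leading 1 is the third of the top 3 bits */
    else if ((f & 0x800000) == 0) { pos += 1; }        /* leading 1 is the second of the top 3 bits */
    return pos;
}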
In the second-level Far-path mantissa processing unit 238, the OP2 format selection 223 uses the near-path/far-path result selection signal and the addition/subtraction selection signal provided by the CpFp/addition-subtraction judgment 209 to select and extend the data coming from the 32-bit right shifter 216 and the 32-bit fixed-point left shifter 217, producing a 32-bit addend that is output to the 32-bit compound mantissa adder 224; at the same time it computes the guard, round and sticky bits for use by the Far-path parallel rounding 225. The 32-bit compound mantissa adder 224 extends the augend provided by the Far-path mantissa selection 215 to 32 bits and performs the compound addition with the addend selected by the OP2 format selection, producing A+B and A+B+1 together with the carry out of the most significant bit of each; for fixed-to-float and float-to-fixed conversion the augend is 0. The Far-path parallel rounding 225 performs the far-path parallel mantissa rounding according to the most-significant-bit carry, the most significant bit, the least significant bit and the second-lowest bit provided by the 32-bit compound mantissa adder 224, the guard, round and sticky bits provided by the OP2 format selection 223 and the rounding type provided by the input DFF 201, producing the selection signal for the final rounded result, a flag indicating whether a right shift is needed after rounding, the value to shift in on a left shift, and, when rounding for float-to-fixed conversion, the corresponding selection signal; these signals are output to the multiplexer 226 and to the multiplexer 233 that gates the Far-path exponent computation in the second-stage exponent processing. The multiplexer 226 selects among the results provided by the 32-bit compound mantissa adder 224 according to the rounding selection signals Sel and SelFloat2Fix provided by the Far-path parallel rounding 225 and outputs the selection to the "shift left, shift right by 1, or no shift" block 227. Block 227 uses the addition/subtraction selection signal provided by the CpFp/addition-subtraction judgment 209 and the float-to-fixed addition/subtraction selection signal to process the data provided by the multiplexer 226, producing the final Far-path mantissa result and a flag indicating whether a left shift was performed, which are passed respectively to the data selection 234 and to the multiplexer 233 in the second-stage exponent processing.
Under round-to-nearest mode, the Far-path parallel rounding 225 satisfies the following equations:
Sel = (OpType == `ADDER) ? ((~sp0Cout & g & (LSBSp0 | r | s)) | (sp0Cout & LSBSp0 & (LSB1Sp0 | g | r | s))) : (sp1Cout & ((~g & ~r & ~s) | (g & r) | (MSBSp0 & g & (LSBSp0 | s))));
ShiftIn = (OpType == `ADDER) ? 1'b0 : (sp1Cout & ~MSBSp0 & MSB1Sp0 & ((~g & r & s) | (~r & g)));
RightMove = (OpType == `ADDER) ? (sp0Cout | (~sp0Cout & sp1Cout & g & (r | s | LSBSp0))) : 1'b0;
SelFloat2Fix = (Float2FixOpType == `ADDER) ? (gFloat2Fix & (LSBSp0 | rFloat2Fix | sFloat2Fix)) : ((~gFloat2Fix & ~rFloat2Fix & ~sFloat2Fix) | (gFloat2Fix & (LSBSp0 | sFloat2Fix | rFloat2Fix)));
where Sel is the generated rounding selection signal (Sel = 1 selects A+B+1, Sel = 0 selects A+B), ShiftIn is the data value shifted in on a left shift, RightMove is the flag indicating whether the result must be shifted right (1 means shift right, 0 means do not shift), and SelFloat2Fix is the rounding selection signal for float-to-fixed conversion (1 selects A+B+1, 0 selects A+B). OpType is the addition/subtraction selection signal provided by the CpFp/addition-subtraction judgment 209, and Float2FixOpType is the addition/subtraction selection signal for float-to-fixed conversion. sp1Cout is the carry out of the most significant bit of A+B+1, sp0Cout is the carry out of the most significant bit of A+B, MSBSp0 is the most significant bit of A+B, MSB1Sp0 is the second most significant bit of A+B, LSBSp0 is the least significant bit of A+B, and LSB1Sp0 is the second least significant bit of A+B. The signals g, r, s, gFloat2Fix, rFloat2Fix and sFloat2Fix are computed from the guard, round and sticky bits provided by the OP2 format selection 223 according to OpType and Float2FixOpType, and satisfy the following equations:
g = (OpType == `ADDER) ? gb : (gb ^ (rb | sb));
r = (OpType == `ADDER) ? rb : (rb ^ sb);
s = (OpType == `ADDER) ? sb : sb;
gFloat2Fix = (Float2FixOpType == `ADDER) ? gb : (gb ^ (rb | sb));
rFloat2Fix = (Float2FixOpType == `ADDER) ? rb : (rb ^ sb);
sFloat2Fix = (Float2FixOpType == `ADDER) ? sb : sb;
where gb, rb and sb are the guard, round and sticky bits provided by the OP2 format selection 223.
Under truncation mode, the Far-path parallel rounding 225 satisfies the following equations:
Sel = (OpType == `ADDER) ? 1'b0 : (sp1Cout & (~g & ~r & ~s));
ShiftIn = (OpType == `ADDER) ? 1'b0 : (sp1Cout & ~MSBSp0 & MSB1Sp0 & g);
RightMove = (OpType == `ADDER) ? sp0Cout : 1'b0;
SelFloat2Fix = (Float2FixOpType == `ADDER) ? 1'b0 : 1'b1;
where the parameters have the same meaning as in round-to-nearest mode.
In the second-level exponent processing unit 236, the 8-bit compound exponent adder 230 receives the negated shift count provided by the leading-one detection 220 and the larger exponent stored in the first-level DFF 206, performs the compound addition to obtain A+B and A+B+1, and outputs them to the multiplexer 232, which selects between the two according to the error flag from the error compensation 222, giving the final Close-path exponent result, which is output to the data selection 234. OP1 228 sets the first operand of the compound addition according to the operation type stored in the first-level DFF 206 and passes it to the 8-bit compound exponent adder 231: for fixed-to-float conversion it is 158, otherwise it is the larger exponent stored in the first-level DFF 206. OP2 229 sets the second operand of the compound addition according to the operation type stored in the first-level DFF 206 and the addition/subtraction selection signal provided by the CpFp/addition-subtraction judgment 209 and passes it to the 8-bit compound exponent adder 231: for fixed-to-float conversion it is the negated shift count of the 32-bit fixed-point left shifter 217 when that shifter's result is nonzero, and 0 when that result equals zero; for the other operations supported by the invention it is -2 if the addition/subtraction selection signal from the CpFp/addition-subtraction judgment 209 selects subtraction, and 0 otherwise. The 8-bit compound exponent adder 231 receives the data provided by OP1 228 and OP2 229, performs the compound addition to obtain A+B and A+B+1, and passes them to the multiplexer 233. The multiplexer 233 selects among the addition results provided by the 8-bit compound exponent adder 231 according to the left-shift flag provided by block 227 and the right-shift flag provided by the Far-path parallel rounding 225, giving the final Far-path exponent result.
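As a rough functional reading of the Close-path exponent logic just described (an interpretation of this sketch, not the exact netlist): the compound adder 230 effectively offers big_exp - shift and big_exp - shift - 1, and the multiplexer 232 picks between them with the error-compensation flag:

/* Final Close-path exponent: larger input exponent minus the predicted
 * normalization shift, decremented once more if error compensation had to
 * perform one extra left shift. */
static int close_path_exponent(int big_exp, int shift_count, int extra_shift) {
    return big_exp - shift_count - (extra_shift ? 1 : 0);
}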
The above is a description of the two-stage pipelined structure of Fig. 2 that implements single-precision floating-point addition/subtraction and its related operations.
Fig. 3 is a flow chart of the method implemented by the two-stage pipelined high-speed floating-point unit shown in Fig. 2. Step 301 covers all operations performed in the first pipeline stage, and step 302 covers all operations performed in the second pipeline stage. In step 303, the method 300 receives the operation type, the rounding type and the first and second operands, and extracts the corresponding sign, exponent and mantissa data. In step 304, the exponent difference is computed and compared. In step 305, according to the exponent-difference information, the near path performs the mantissa subtraction with the compound adder. In step 306, the near path performs the leading-one detection in parallel; its result must be available before step 310. In step 307, the far path performs the alignment shift according to the exponent difference. In step 308, the near path performs, respectively, the complement conversion and the parallel rounding on its result according to the rounding type provided in step 303. In step 309, the far path performs the mantissa addition/subtraction with the compound adder. In step 310, the near path performs the normalization shift on the output of step 308 according to the result of the leading-one detection, giving the final near-path result. In step 311, the far path rounds its result in parallel according to the rounding type provided in step 303, giving the final far-path result. In step 312, the output is taken from the near path, the far path or the exception handling according to the exponent-difference information provided by step 304 and the operation type provided by step 303.
As can be seen from the embodiments given above, the high-speed floating-point unit provided by the present invention supports floating-point addition/subtraction and its associated operations using a dual-path structure, leading-1 judgement and parallel rounding. Apart from the 32-bit fixed-point left shift introduced by fixed-point-to-floating-point conversion, the handling of the reciprocal and inverse-square-root operations moved into the mantissa processing, the exponent overflow judgement introduced by floating-point-to-fixed-point conversion, and the operation type added to the data-selection and judgement logic at the output, the operations associated with floating-point addition/subtraction all reuse the data path of the floating-point adder/subtracter, so data reuse is high and the area is small. The pipelined design provides a high operating speed and a large throughput.
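As one concrete illustration of how an associated operation folds into the existing path, the C sketch below (a hypothetical helper, truncating instead of round-to-nearest) converts a 32-bit fixed-point value to single precision with the same leading-1 search and left shift that the Close path already provides, folding the shift count into the exponent through the biased constant 158 = 127 + 31 mentioned in the description.

#include <stdint.h>
#include <string.h>

/* Rough sketch, not the patented circuit: normalize a 32-bit fixed-point
 * magnitude with a leading-1 search and a left shift, then derive the
 * exponent as 158 minus the shift count. Low bits are simply truncated. */
float fixed_to_float_sketch(int32_t x) {
    if (x == 0) return 0.0f;
    uint32_t sign = (x < 0) ? 1u : 0u;
    uint32_t mag  = (x < 0) ? (uint32_t)(-(int64_t)x) : (uint32_t)x;
    int shift = 0;
    while (!(mag & 0x80000000u)) { mag <<= 1; shift++; }   /* leading-1 search  */
    uint32_t exponent = 158u - (uint32_t)shift;            /* 158 - shift count */
    uint32_t mantissa = (mag >> 8) & 0x7FFFFFu;            /* keep 23 bits      */
    uint32_t bits = (sign << 31) | (exponent << 23) | mantissa;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

For example, fixed_to_float_sketch(12) normalizes 12 with a shift of 28, giving a biased exponent of 158 - 28 = 130 and the value 12.0f, which is why the conversion can reuse the Close path's leading-1 and shift machinery.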
The specific embodiments described above further explain the objects, technical solutions and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.

Claims (17)

1. A high-speed floating-point unit, characterized in that the high-speed floating-point unit adopts an N-stage pipeline structure, N being a natural number greater than 1, and comprises: an input DFF (101), an operand-information extraction and flag-bit judgement unit (115), N stages of floating-point operation structures, N stages of DFFs, and a data selection unit (116), wherein:
the input DFF (101) is connected, through the operand-information extraction and flag-bit judgement unit (115), to the first-stage floating-point operation structure among the N stages of floating-point operation structures; the first-stage floating-point operation structure is connected to the first-stage DFF (102) among the N stages of DFFs; the first-stage DFF (102) is connected to the second-stage floating-point operation structure among the N stages of floating-point operation structures; the second-stage floating-point operation structure is connected to the second-stage DFF (103) among the N stages of DFFs; the second-stage DFF (103) is connected to the third-stage floating-point operation structure among the N stages of floating-point operation structures, and so on; the (N-1)-th-stage floating-point operation structure among the N stages of floating-point operation structures is connected to the (N-1)-th-stage DFF (104) among the N stages of DFFs; the (N-1)-th-stage DFF (104) is connected to the N-th-stage floating-point operation structure among the N stages of floating-point operation structures; and finally, the N-th-stage floating-point operation structure is connected, through the data selection unit (116), to the N-th-stage DFF (105) among the N stages of DFFs.
2. The high-speed floating-point unit according to claim 1, characterized in that the N stages of floating-point operation structures comprise N stages of exponent processing units, N stages of Close-path mantissa processing units and N stages of Far-path mantissa processing units, and the Close-path mantissa processing unit and the Far-path mantissa processing unit of each stage are connected to the exponent processing unit of that stage.
3. The high-speed floating-point unit according to claim 2, characterized in that the input DFF (101) receives and stores the operation type, the rounding type, the first operand and the second operand provided externally, outputs the operation type to the first-stage exponent processing unit (106) and the first-stage DFF (102), outputs the rounding type to the first-stage Close-path mantissa processing unit (109) and the first-stage DFF (102), and passes the first operand and the second operand to the operand-information extraction and flag-bit judgement unit (115).
4. The high-speed floating-point unit according to claim 3, characterized in that the operation type indicates which one of floating-point addition, subtraction, floating-point-to-fixed-point conversion, fixed-point-to-floating-point conversion, floating-point absolute value, floating-point reciprocal, floating-point inverse square root and comparison-class operations is to be performed, and the rounding type indicates truncation rounding or round-to-nearest.
5. The high-speed floating-point unit according to claim 2, characterized in that the operand-information extraction and flag-bit judgement unit (115) extracts the sign-bit, exponent-bit and mantissa-bit data of the first operand and the second operand, calculates the flag-bit data of the first operand and the second operand, outputs the obtained exponent-bit data of the first operand and the second operand to the first-stage exponent processing unit (106), outputs the mantissa-bit data of the first operand and the second operand to the first-stage Close-path mantissa processing unit (109) and the first-stage Far-path mantissa processing unit (112), and outputs the sign-bit data and the flag-bit data to the first-stage DFF (102) for storage.
6. The high-speed floating-point unit according to claim 5, characterized in that the flag-bit data comprise at least a zero flag, an infinity flag and a non-number flag.
7. The high-speed floating-point unit according to claim 2, characterized in that the first-stage exponent processing unit (106) performs the exponent calculation of floating-point addition/subtraction and its associated operations according to the exponent-bit data of the first operand and the second operand, the operation type, and the data provided by the first-stage Close-path mantissa processing unit (109) and the first-stage Far-path mantissa processing unit (112), and stores the exponent calculation result in the first-stage DFF (102).
8. The high-speed floating-point unit according to claim 7, characterized in that:
the second-stage exponent processing unit (107) performs the exponent calculation of floating-point addition/subtraction and its associated operations according to the exponent-bit data of the first operand and the second operand, the operation type, and the data provided by the second-stage Close-path mantissa processing unit (110) and the second-stage Far-path mantissa processing unit (113), and stores the exponent calculation result in the second-stage DFF (103);
the N-th-stage exponent processing unit (108) performs the exponent calculation of floating-point addition/subtraction and its associated operations according to the exponent-bit data of the first operand and the second operand, the operation type, and the data provided by the N-th-stage Close-path mantissa processing unit and the N-th-stage Far-path mantissa processing unit, and outputs the exponent calculation result to the data selection unit (116).
9. The high-speed floating-point unit according to claim 7 or 8, characterized in that the exponent calculation of floating-point addition/subtraction and its associated operations specifically comprises: calculating the exponent difference of the two operands, performing exponent underflow judgement, generating the Close-path and Far-path exponents, and providing the data required by the Close-path mantissa processing and the Far-path mantissa processing of each stage.
10. The high-speed floating-point unit according to claim 2, characterized in that the first-stage Close-path mantissa processing unit (109) performs the Close-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the first-stage exponent processing unit (106), and stores the mantissa calculation result in the first-stage DFF (102).
11. The high-speed floating-point unit according to claim 10, characterized in that:
the second-stage Close-path mantissa processing unit (110) performs the Close-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the second-stage exponent processing unit (107), and stores the Close-path mantissa calculation result in the second-stage DFF (103);
the N-th-stage Close-path mantissa processing unit (111) performs the Close-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the N-th-stage exponent processing unit (108), and outputs the Close-path mantissa calculation result to the data selection unit (116).
12. The high-speed floating-point unit according to claim 10 or 11, characterized in that the Close-path mantissa calculation specifically comprises: the mantissa processing of subtraction with an exponent difference less than or equal to 1, the leading-1 judgement of the calculation result, the parallel rounding calculation of the Close-path result, and providing the data required by the exponent processing of each stage.
13. The high-speed floating-point unit according to claim 2, characterized in that the first-stage Far-path mantissa processing unit (112) performs the Far-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the first-stage exponent processing unit (106), and stores the mantissa calculation result in the first-stage DFF (102).
14. The high-speed floating-point unit according to claim 3, characterized in that:
the second-stage Far-path mantissa processing unit (113) performs the Far-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the second-stage exponent processing unit (107), and stores the Far-path mantissa calculation result in the second-stage DFF (103);
the N-th-stage Far-path mantissa processing unit (114) performs the Far-path mantissa calculation according to the mantissa-bit data of the first operand and the second operand and the data provided by the N-th-stage exponent processing unit (108), and outputs the Far-path mantissa calculation result to the data selection unit (116).
15. The high-speed floating-point unit according to claim 13 or 14, characterized in that the Far-path mantissa calculation specifically comprises: the mantissa processing of subtraction with an exponent difference greater than 1, the mantissa processing of addition, the parallel rounding calculation of the Far-path result, the additional processing structures required by the mantissa calculation of the operations associated with addition/subtraction, and providing the data required by the exponent processing of each stage.
16. The high-speed floating-point unit according to claim 2, characterized in that the data selection unit (116) obtains the final calculation result according to the various flag-bit data provided by the operand-information extraction and flag-bit judgement unit (115), the Close-path mantissa calculation result provided by the N-th-stage Close-path mantissa processing unit (111) and the Far-path mantissa calculation result provided by the N-th-stage Far-path mantissa processing unit (114), and passes it to the N-th-stage DFF for storage.
17. The high-speed floating-point unit according to claim 2, characterized in that the first-stage DFF (102) is used to store the intermediate calculation results of the first-stage exponent processing unit (106), the first-stage Close-path mantissa processing unit (109) and the first-stage Far-path mantissa processing unit (112), as well as the sign-bit data and flag-bit data provided by the operand-information extraction and flag-bit judgement unit (115) and the operation type and rounding type stored by the input DFF (101), for use by the next-stage exponent processing unit, the next-stage Close-path mantissa processing unit and the next-stage Far-path mantissa processing unit.
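For readers who want a concrete picture of the operand-information extraction and flag-bit judgement recited in claims 5 and 6, the following C sketch (hypothetical names, software only, not the claimed hardware unit) splits an IEEE-754 single-precision word into its sign, exponent and mantissa fields and derives the zero, infinity and non-number flags.

#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Illustrative software sketch of operand-information extraction and
 * flag-bit judgement (names are hypothetical, not taken from the patent). */
typedef struct {
    uint32_t sign, exponent, mantissa;
    int is_zero, is_inf, is_nan;
} operand_info_t;

static operand_info_t extract_operand(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    operand_info_t o;
    o.sign     = u >> 31;
    o.exponent = (u >> 23) & 0xFF;
    o.mantissa = u & 0x7FFFFF;
    o.is_zero  = (o.exponent == 0x00) && (o.mantissa == 0);   /* zero flag       */
    o.is_inf   = (o.exponent == 0xFF) && (o.mantissa == 0);   /* infinity flag   */
    o.is_nan   = (o.exponent == 0xFF) && (o.mantissa != 0);   /* non-number flag */
    return o;
}

int main(void) {
    operand_info_t o = extract_operand(-6.5f);
    printf("sign=%u exp=%u mantissa=0x%06X zero=%d inf=%d nan=%d\n",
           o.sign, o.exponent, o.mantissa, o.is_zero, o.is_inf, o.is_nan);
    return 0;
}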
CN201110418897.6A 2011-12-15 2011-12-15 A kind of high-speed floating point arithmetical unit adopting multi-stage pipeline arrangement Active CN102566967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110418897.6A CN102566967B (en) 2011-12-15 2011-12-15 A kind of high-speed floating point arithmetical unit adopting multi-stage pipeline arrangement

Publications (2)

Publication Number Publication Date
CN102566967A true CN102566967A (en) 2012-07-11
CN102566967B CN102566967B (en) 2015-08-19

Family

ID=46412487

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110418897.6A Active CN102566967B (en) 2011-12-15 2011-12-15 A kind of high-speed floating point arithmetical unit adopting multi-stage pipeline arrangement

Country Status (1)

Country Link
CN (1) CN102566967B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1410885A (en) * 2001-09-27 2003-04-16 中国科学院计算技术研究所 Command pipeline system based on operation queue duplicating use and method thereof
US20070074008A1 (en) * 2005-09-28 2007-03-29 Donofrio David D Mixed mode floating-point pipeline with extended functions
CN101174200A (en) * 2007-05-18 2008-05-07 清华大学 5-grade stream line structure of floating point multiplier adder integrated unit
CN101692202A (en) * 2009-09-27 2010-04-07 北京龙芯中科技术服务中心有限公司 64-bit floating-point multiply accumulator and method for processing flowing meter of floating-point operation thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Xia Jie et al., "Design of a floating-point adder IP core based on a pipeline structure", Microcomputer Information *
Cai Min et al., "Design of a double-precision floating-point multiply-add unit with a fully pipelined structure", Microelectronics & Computer *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105302519A (en) * 2014-07-24 2016-02-03 Arm有限公司 Apparatus and method for performing floating-point square root operation
CN105302519B (en) * 2014-07-24 2019-07-12 Arm 有限公司 Device and method for executing floating-point square root operation
CN107003840A (en) * 2014-12-23 2017-08-01 英特尔公司 Checked for performing to optimize the apparatus and method of instruction stream
CN107003840B (en) * 2014-12-23 2021-05-25 英特尔公司 Apparatus and method for performing checks to optimize instruction flow
CN104636114A (en) * 2015-02-12 2015-05-20 中国科学院自动化研究所 Floating point number multiplication rounding method and device
CN104636114B (en) * 2015-02-12 2018-05-15 北京思朗科技有限责任公司 A kind of rounding method and device of floating number multiplication
CN110036368B (en) * 2016-12-06 2023-02-28 Arm有限公司 Apparatus and method for performing arithmetic operations to accumulate floating point numbers
CN110036368A (en) * 2016-12-06 2019-07-19 Arm有限公司 For executing arithmetical operation with the device and method for the floating number that adds up
CN112463115A (en) * 2017-12-15 2021-03-09 安徽寒武纪信息科技有限公司 Calculation method and related product
CN111260044B (en) * 2018-11-30 2023-06-20 上海寒武纪信息科技有限公司 Data comparator, data processing method, chip and electronic equipment
CN111382390A (en) * 2018-12-28 2020-07-07 上海寒武纪信息科技有限公司 Operation method, device and related product
CN109960533A (en) * 2019-03-29 2019-07-02 苏州中晟宏芯信息科技有限公司 Floating-point operation method, apparatus, equipment and storage medium
CN109960533B (en) * 2019-03-29 2023-04-25 合芯科技(苏州)有限公司 Floating point operation method, device, equipment and storage medium
CN111596887A (en) * 2020-05-22 2020-08-28 天津国科医工科技发展有限公司 Inner product calculation method based on reconfigurable calculation structure
CN111596887B (en) * 2020-05-22 2023-07-21 威高国科质谱医疗科技(天津)有限公司 Inner product calculation method based on reconfigurable calculation structure
CN112214196A (en) * 2020-10-19 2021-01-12 上海兆芯集成电路有限公司 Floating point exception handling method and device
CN116643718A (en) * 2023-06-16 2023-08-25 合芯科技有限公司 Floating point fusion multiply-add device and method of pipeline structure and processor
CN116643718B (en) * 2023-06-16 2024-02-23 合芯科技有限公司 Floating point fusion multiply-add device and method of pipeline structure and processor
CN117251132A (en) * 2023-09-19 2023-12-19 上海合芯数字科技有限公司 Fixed-floating point SIMD multiply-add instruction fusion processing device and method and processor
CN117251132B (en) * 2023-09-19 2024-05-14 上海合芯数字科技有限公司 Fixed-floating point SIMD multiply-add instruction fusion processing device and method and processor

Also Published As

Publication number Publication date
CN102566967B (en) 2015-08-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20171129

Address after: Room 402, 4th floor, Building 11, No. 1 Yanfu Road, Yancun Town, Fangshan District, Beijing 102412

Patentee after: Beijing Si Lang science and Technology Co.,Ltd.

Address before: No. 95 Zhongguancun East Road, Beijing 100190

Patentee before: Institute of Automation, Chinese Academy of Sciences

TR01 Transfer of patent right
CP03 Change of name, title or address

Address after: 201306 building C, No. 888, Huanhu West 2nd Road, Lingang New District, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai

Patentee after: Shanghai Silang Technology Co.,Ltd.

Address before: 102412 room 402, 4th floor, building 11, No. 1, Yanfu Road, Yancun Town, Fangshan District, Beijing

Patentee before: Beijing Si Lang science and Technology Co.,Ltd.

CP03 Change of name, title or address