CN1740962A - Fast pipeline type divider - Google Patents


Info

Publication number
CN1740962A
Authority
CN
China
Prior art keywords
unit
output
input
multiplier
quotient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200510029858
Other languages
Chinese (zh)
Other versions
CN100367191C (en)
Inventor
陈柏钦
侯钢
王国中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
INESA Electron Co., Ltd.
Original Assignee
Central Academy of SVA Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central Academy of SVA Group Co Ltd
Priority to CNB2005100298581A
Publication of CN1740962A
Application granted
Publication of CN100367191C
Legal status: Active
Anticipated expiration

Landscapes

  • Complex Calculations (AREA)

Abstract

The present invention relates to a divider. It comprises a normalization unit, a first-stage multiplier unit, a second-stage multiplier unit and a quotient truncation unit connected in sequence, with a lookup unit additionally connected between the normalization unit and the second-stage multiplier unit. The invention also provides the connection scheme of these units and the working principle of the divider.

Description

A fast pipelined divider
Technical field
The present invention relates to a fast pipelined divider, and specifically to a fast pipelined divider that can be used in integrated circuits and algorithm programs in fields such as digital image processing, digital signal processing and communications.
Background art
Division is frequently needed in digital signal processing, communications and image processing. In most programs that perform division, the target processor has no division unit, so the compiler must call a pre-designed algorithm library; the drawback is a long execution time, which makes this approach unsuitable for demanding settings such as real-time systems. In the algorithm design of integrated circuits and programmable logic devices (PLDs), existing chips provide no division function, hardware description languages have no synthesizable division statement, and EDA vendors such as Synopsys and Cadence provide no corresponding design library. PLD manufacturers such as Xilinx do provide divider IP, but it is hard to port, consumes excessive resources and takes too long to compute.
Chinese patent application No. 95107302 discloses a binary division method and apparatus that needs no dedicated hardware. It uses binary shift-and-add techniques to simplify division, but it relies on a microprocessor as the computing platform and performs the division through instructions. Chinese patent application No. 89106625.X builds a division array from redundant-code adder units that mix redundant codes formed by two-stage operations with local redundant codes, and uses 2-to-1 selectors to build an array conversion circuit that converts the redundant-code quotient directly into binary form; its circuit structure is complex. US Patent 5485414 includes a multiplication unit in its circuit, and its circuit structure is also complex. The common drawbacks of these patents are a complex structure, a large number of components and a low computing speed, so they cannot satisfy certain specific applications in digital circuits.
Summary of the invention
The object of the present invention is to provide a fast pipelined divider that can be used in integrated circuits (and related programmable logic devices) and in algorithm programs to handle the division operations in mathematical computation. Its structure is simple, it uses few components, and it computes quickly.
To achieve the above object, the divider provided by the invention comprises:
A normalization unit, a first-stage multiplier unit, a second-stage multiplier unit and a quotient truncation unit connected in sequence; a lookup unit is additionally connected between the normalization unit and the second-stage multiplier unit; the normalization unit also has an output connected to an input of the quotient truncation unit;
Based on the dividend (X) and divisor (Y) flagged by the input valid signal, the normalization unit outputs an enable signal together with the multiplicand and multiplier to the first-stage multiplier unit, outputs a shift reference signal to the quotient truncation unit, and outputs the corresponding lookup address to the lookup unit;
Normalization is performed separately on the dividend and the divisor, as follows: when the input operation is valid, the dividend and the divisor are each shifted left until the most significant bit is 1, and the final results (x, y) and the shift counts (Ex, Ey) are recorded. The next step then operates on x, y, Ex and Ey: x is output as the multiplicand for the first-stage multiplier unit; the high h bits of y, minus the remaining low-order bits, form a new value that is output as the multiplier for the first-stage multiplier unit; the high h bits of y are sent to the lookup unit as the lookup address; and the shift reference signal sent to the quotient truncation unit is the difference between Ex and (Ey-2);
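The normalization step can be sketched in Python as follows. This is a minimal illustration, not the patented circuit: the function name `normalize` and the `int_bits` parameter are hypothetical. It also assumes one particular reading of Ex and Ey, namely the number of integer bits left after the shift; that reading is chosen because it reproduces the values Ex = 4 and Ey = 5 quoted in the embodiment later in the text.

```python
def normalize(value: int, width: int, int_bits: int) -> tuple[int, int]:
    """Shift `value` left until bit (width - 1) is 1.

    Returns (mantissa, exponent). The mantissa is read as the pure fraction
    mantissa / 2**width in [0.5, 1); the original magnitude equals that
    fraction times 2**exponent. `int_bits` is the number of integer bits in
    the caller's fixed-point format (an assumption of this sketch).
    """
    if value <= 0 or value >= 1 << width:
        raise ValueError("value must be a non-zero unsigned number of `width` bits")
    shift = width - value.bit_length()      # number of leading zeros
    return value << shift, int_bits - shift


# Operands of the embodiment: X = 9.EF45 (read as 4.16 format),
# Y = 1F.38 (read as 5.5 format, 10 bits).
x, ex = normalize(0x9EF45, width=20, int_bits=4)
y, ey = normalize(0x3E7, width=10, int_bits=5)
print(hex(x), ex, hex(y), ey)   # 0x9ef45 4 0x3e7 5
```

Only the difference Ex - Ey reaches the quotient truncation unit, so the result does not depend on where the binary point is placed as long as both operands are described consistently.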
The above operations complete within one clock cycle, so the enable signal for the first-stage multiplier unit is output on the clock after the input valid signal is asserted, enabling the first-stage multiplier unit. In pipelined operation, new data can be input for division on the very next clock after the input valid signal is asserted.
According to the enable signal output by the normalization unit, the first-stage multiplier unit multiplies the multiplicand and multiplier output by the normalization unit and inputs the result to the second-stage multiplier unit; it also sends an enable signal to the second-stage multiplier unit;
The above operations complete within one clock cycle, so the enable signal for the second-stage multiplier unit is output on the clock after the input valid signal is asserted, enabling the second-stage multiplier unit. In pipelined operation, the next data can be processed on the very next clock after the input valid signal is asserted.
Most processors integrate a multiplication unit, so multiplication in an algorithm program needs no library call and can be computed directly. For integrated circuit (and related programmable logic device) design, EDA vendors also provide convenient, portable multiplier IP. The design of the multiplier unit is therefore outside the scope of this design.
According to the lookup address output by the normalization unit, the lookup unit reads out the data stored in it in advance and outputs them to the second-stage multiplier unit, so that the third-stage operation can be completed.
The lookup unit stores a table needed during the computation, whose data are computed in advance. Because the precision of the table output strongly affects the quotient, the bit width of the table output must be calculated beforehand from the required quotient precision; this in turn determines the hardware resources the table needs. The table can be implemented as a ROM or by pure combinational logic, as required.
The table data are an approximation obtained from an optimized Taylor series expansion of the reciprocal of the divisor, and the table output is treated as a pure fraction.
The table lookup completes within one clock cycle.
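The text does not list the table contents or spell out the optimized expansion. One formula that is consistent with the rest of the description is sketched below, purely as an assumption: since the first stage multiplies x by (yh - yl), the identity x/y = x(yh - yl)/(yh² - yl²) suggests dropping the small yl² term and storing 1/(4·yh²), where the factor 1/4 keeps the entry a pure fraction and would account for the "+2" in the shift reference. The function name `build_table`, the floor rounding and the saturation of the first entry are all hypothetical choices. With the 6-bit yh and 12-bit output of the embodiment further below, this formula does reproduce the quoted entry 0.443 for yh = 3E.

```python
def build_table(addr_bits: int = 5, out_bits: int = 12) -> list[int]:
    """Assumed table: floor(2**out_bits / (4 * yh**2)) as a pure fraction.

    yh is the top (addr_bits + 1) bits of the normalized divisor. Its MSB is
    always 1, so only the low addr_bits select an entry.
    """
    h = addr_bits + 1
    table = []
    for addr in range(1 << addr_bits):
        yh = (1 << addr_bits) | addr                      # restore the implicit leading 1
        entry = (1 << (out_bits + 2 * h - 2)) // (yh * yh)
        table.append(min(entry, (1 << out_bits) - 1))     # yh = 0.5 would give 1.0: saturate
    return table


table = build_table()
print(len(table), len(table) * 12, hex(table[0x3E & 0x1F]))   # 32 384 0x443
```

A ROM initialised from such a list and a block of combinational logic synthesised from it would be interchangeable, which matches the two implementation options the text mentions.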
According to the enable signal sent by the first-stage multiplier unit, the second-stage multiplier unit takes the data output by the first-stage multiplier unit as the multiplicand and the data obtained from the lookup as the multiplier, and multiplies them to produce the final output.
The above operations complete within one clock cycle, so the enable signal for the quotient truncation unit is output on the clock after the input valid signal is asserted, enabling the truncation unit. In pipelined operation, the next data can be processed on the very next clock after the input valid signal is asserted.
The design of the second-stage multiplier unit is essentially the same as that of the first-stage multiplier unit.
According to the shift reference signal output by the normalization unit, the quotient truncation unit formats the output of the second-stage multiplier unit to obtain the integer part and the fractional part of the quotient.
The behaviour of the quotient truncation unit is determined entirely by the input shift reference signal. Because its complexity is low, in practice it can be merged with other logic circuits, allowing the divider to complete a computation in 3 clock cycles.
The above operations complete within one clock cycle, so the quotient valid signal is output on the clock after the input valid signal is asserted.
From raw data input to output, at least 4 clock cycles are required.
The invention is characterized in that a single division completes in as few as 4 clock cycles. More importantly, the design supports pipelined operation: when dividends and divisors are input continuously, one pair per clock, the internal mechanism forms a pipeline, and N operations complete in N+3 clocks, an average of (N+3)/N clocks per operation.
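The throughput claim follows from simple pipeline arithmetic, illustrated here with hypothetical helper names: with a 4-clock latency and one operand pair accepted per clock, the last of N results appears N + 3 clocks after the first input.

```python
from fractions import Fraction

LATENCY = 4   # clocks from input to output, as stated in the text


def total_clocks(n_ops: int) -> int:
    """Clocks needed for n_ops back-to-back divisions, one issued per clock."""
    return n_ops + LATENCY - 1


def average_clocks(n_ops: int) -> Fraction:
    """Average clocks per division, i.e. (N + 3) / N."""
    return Fraction(total_clocks(n_ops), n_ops)


for n in (1, 4, 100):
    print(n, total_clocks(n), float(average_clocks(n)))
```

For a single division the average is 4 clocks; for long input streams it approaches 1 clock per division.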
When the divisor is 0, the invention detects this and outputs the largest representable quotient. Because the divider is fully controllable, the specific output can also be adjusted as required.
Description of drawings
Fig. 1 is a flow chart of an implementation of the present invention;
Fig. 2 is a structure diagram of an implementation of the present invention;
Embodiment
Below, taking an embodiment in which the dividend (X) is at most 20 bits and the divisor (Y) is at most 10 bits, the invention is further described with reference to the flow chart of Fig. 1 and the structure diagram of Fig. 2. In the figures, X denotes the input dividend, Y the input divisor, and Q the output quotient. Fig. 2 is organized by clock: each row represents one clock, and a row may contain two units. The first row contains the two normalization units, and the second row contains the first-stage multiplier and the lookup unit, indicating that these two units can operate in parallel.
According to the invention, a divider for a 20-bit dividend and a 10-bit divisor was implemented on an FPGA and simulated. X is a 20-bit operand, Y is a 10-bit operand, and the quotient is 10 bits; that is, the absolute value of the quotient lies in [0, 256.75], and the error is one unit in the last place. When the quotient is greater than 1 and 8 bits are taken for the integer part, the error is 1/4; when the quotient is less than 1, 10 bits after the binary point can be taken and the error is 1/1024; and so on.
In this embodiment X is 9.EF45 (hexadecimal), i.e. 1001.1110 1111 0100 0101 (binary), and Y is 1F.38 (hexadecimal), i.e. 1 1111.0011 100 (binary). (Unless otherwise noted, the data in parentheses below are hexadecimal.)
Clock 1: the two normalization units shown in Fig. 2 take the X and Y flagged by the input valid signal and normalize them into the pure fractions x (0.9EF45) and y (0.F9C); the corresponding shift reference values are Ex (4) and Ey (5). The formulas are given in Fig. 1. Because the operands may be positive or negative and the invention does not process the sign, the formulas are written in terms of absolute values.
On receiving data, the divider first checks whether Y is zero. If so, it does nothing during the next 3 clocks and, on the 4th clock, outputs FFF and raises the alarm signal for one clock cycle. As shown in Fig. 1, when Y = 0, Q = MAX is output. This check can be done while Y is being normalized, so it is not shown separately in Fig. 2; it is included in the normalization unit for Y.
If Y is not zero, x is output as the multiplicand for the first-stage multiplier unit; the high 6 bits of y (yh = 0.3E) minus the remaining low 4 bits (yl = 0.01C) give a new value (y' = 0.F64), which is output as the multiplier for the first-stage multiplier unit; the high 6 bits of y are sent to the lookup unit as the lookup address; and the shift reference signal sent to the quotient truncation unit is the difference between Ex and (Ey-2) (E = Ex - Ey + 2 = 1).
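The split of the normalized divisor can be checked with a few lines of Python. The function name `split_divisor` and the constants are hypothetical; the bit widths (10-bit y, 6 high bits) and the operand values come from the embodiment. The normalized divisor 0.F9C is the 10-bit integer 3E7 written left-aligned in three hex digits.

```python
Y_BITS = 10   # width of the normalized divisor y
H_BITS = 6    # number of high bits kept as yh


def split_divisor(y: int) -> tuple[int, int, int]:
    """Return (yh, yl, y_prime), where y_prime is the first-stage multiplier yh - yl."""
    low_bits = Y_BITS - H_BITS
    yh = y >> low_bits                      # high 6 bits
    yl = y & ((1 << low_bits) - 1)          # remaining low 4 bits
    y_prime = (yh << low_bits) - yl         # high part minus low part, same scale as y
    return yh, yl, y_prime


yh, yl, y_prime = split_divisor(0x3E7)
ex, ey = 4, 5
e = ex - (ey - 2)                           # shift reference signal
# y_prime is printed left-aligned to 12 bits so that it reads like 0.F64 in the text.
print(hex(yh), hex(yl), hex(y_prime << 2), e)   # 0x3e 0x7 0xf64 1
```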
The enable signal for the first-stage multiplier unit is output on the clock after the input valid signal is asserted, enabling the first-stage multiplier unit. In pipelined operation, new data can be input for division on the very next clock after the input valid signal is asserted.
Figs. 1 and 2 are intended to explain the algorithm of the invention; circuit signals such as the enable and valid signals are not shown in the figures. The same applies below.
Clock 2: two processes run in parallel, the first-stage multiplication and the table lookup, as shown in the structure of Fig. 2:
According to the enable signal output by the normalization unit, the first-stage multiplier unit multiplies x by (yh-yl) and inputs the high 20 bits of the product (0.98E68) to the second-stage multiplier unit; it also sends an enable signal to the second-stage multiplier unit.
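The quoted product can be reproduced with integer arithmetic. In this small sketch the function name `first_stage` is hypothetical; it assumes the 20-bit x and the 10-bit value yh - yl are multiplied as unsigned integers and the top 20 bits of the 30-bit product are kept by truncation.

```python
def first_stage(x: int, y_prime: int) -> int:
    """Multiply the 20-bit x by the 10-bit (yh - yl) and keep the top 20 of 30 product bits."""
    return (x * y_prime) >> 10


m1 = first_stage(0x9EF45, 0x3D9)   # x = 0.9EF45, yh - yl = 0.F64 (3D9 as a 10-bit value)
print(hex(m1))                     # 0x98e68
```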
The lookup unit is addressed by yh, and the pre-stored data (0.443) are read out and output to the second-stage multiplier unit. Because the most significant bit of yh is always 1, only the low 5 bits of yh are needed for addressing; the table has 32 entries of 12 bits each, 384 bits in total.
Clock 3: the second-stage multiplier unit takes the 20-bit data output by the first-stage multiplier unit as the multiplicand and the 12-bit data obtained from the lookup as the multiplier, and outputs the high 18 bits of the product (0.28B9E) to the quotient truncation unit.
Clock 4: according to the shift reference signal (E = 1) output by the normalization unit, the quotient truncation unit shifts the whole output of the second-stage multiplier unit left by 1 bit; if E is negative, it shifts right instead. The high 8 bits of the result are then the integer part and the low 10 bits the fractional part, giving in this embodiment the integer part (0) and the fractional part (0.514) of the quotient. The resulting quotient Q is (0.514), which in binary is 0.0101 0001 01.
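Clocks 3 and 4 can be chained in a short sketch that starts from the first-stage result 0.98E68 and the table value 0.443. The function names are hypothetical, and the output format is an interpretation: the 18-bit product is read as a pure fraction, scaled by 2^E, and stored as an unsigned value with 8 integer bits and 10 fractional bits, which is equivalent to a net shift of E - 8 bit positions.

```python
def second_stage(m1: int, table_value: int) -> int:
    """20-bit x 12-bit multiply; keep the top 18 of the 32 product bits."""
    return (m1 * table_value) >> 14


def truncate_quotient(m2: int, e: int) -> tuple[int, int]:
    """Format the 18-bit fraction m2, scaled by 2**e, as (integer part, 10-bit fraction)."""
    shift = e - 8                        # 18 fraction bits -> 10 fraction bits, then 2**e
    q = m2 << shift if shift >= 0 else m2 >> -shift
    q &= (1 << 18) - 1                   # 8 integer bits + 10 fractional bits
    return q >> 10, q & 0x3FF


m2 = second_stage(0x98E68, 0x443)
integer_part, fraction = truncate_quotient(m2, e=1)
# The 10-bit fraction is printed in binary and left-aligned to 12 bits in hex.
print(integer_part, format(fraction, "010b"), hex(fraction << 2))   # 0 0101000101 0x514
```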
In Fig. 1, the second-stage multiplication and the quotient truncation are expressed in a single formula, namely the calculation of M2.
Computing X/Y directly on a computer and taking 12 bits of the quotient gives (0.517), i.e. 0.0101 0001 0111 in binary. In this embodiment the result of the invention agrees with the computer's result in the first 10 bits.
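The reference value can be reproduced with exact rational arithmetic. The sketch assumes the operand formats read off the binary expansions given earlier: X = 9.EF45 with 16 fractional bits and Y = 1F.38 with 5 fractional bits in its 10-bit form.

```python
from fractions import Fraction
from math import floor

X = Fraction(0x9EF45, 1 << 16)      # 9.EF45 hexadecimal
Y = Fraction(0x3E7, 1 << 5)         # 1F.38 hexadecimal, 10 significant bits

exact = X / Y
q12 = floor(exact * (1 << 12))      # quotient truncated to 12 fractional bits
q10 = q12 >> 2                      # first 10 fractional bits

print(hex(q12), format(q10, "010b"), float(exact))
```

Here `q12` is 0x517 and `q10` is 0101000101 in binary, the same ten fractional bits as the quotient 0.514 produced by the divider.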
In another embodiment, X is E916 and Y is EA. The quotient obtained by the divider of the invention is FE.C, while direct computation on a computer gives FF.0; the error is 1/4, i.e. one unit in the last place of the quotient.
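The four clocks can be combined into a small bit-level software model. This is a sketch under stated assumptions, not the patented circuit: the table formula floor(2^22 / yh²) saturated to 12 bits, the 18-bit all-ones code returned for a zero divisor, the reading of Ex and Ey as integer-bit counts after normalization, and all names are hypothetical. Under these assumptions the model reproduces both quotients quoted in the text.

```python
# Assumed table contents: 1 / (4 * yh**2) as 12-bit pure fractions, yh = 32..63.
TABLE = [min((1 << 22) // (yh * yh), 0xFFF) for yh in range(32, 64)]
Q_MAX = (1 << 18) - 1   # assumed "maximum" output code for a zero divisor


def normalize(value: int, width: int, int_bits: int) -> tuple[int, int]:
    """Left-align `value` in `width` bits; return (mantissa, exponent)."""
    shift = width - value.bit_length() if value else 0
    return value << shift, int_bits - shift


def divide(X: int, Y: int, x_int_bits: int, y_int_bits: int) -> tuple[int, bool]:
    """Return (quotient as unsigned 8.10 fixed point, alarm flag)."""
    if Y == 0:
        return Q_MAX, True
    x, ex = normalize(X, 20, x_int_bits)          # clock 1: normalize both operands
    y, ey = normalize(Y, 10, y_int_bits)
    yh, yl = y >> 4, y & 0xF
    m1 = (x * ((yh << 4) - yl)) >> 10             # clock 2: first-stage multiply
    t = TABLE[yh & 0x1F]                          # clock 2: lookup, in parallel
    m2 = (m1 * t) >> 14                           # clock 3: second-stage multiply
    shift = (ex - ey + 2) - 8                     # clock 4: quotient truncation
    q = m2 << shift if shift >= 0 else m2 >> -shift
    return min(q, Q_MAX), False


q1, _ = divide(0x9EF45, 0x3E7, x_int_bits=4, y_int_bits=5)     # 9.EF45 / 1F.38
q2, _ = divide(0xE916, 0xEA, x_int_bits=20, y_int_bits=10)     # operands read as integers
print(q1 / 1024, (q2 >> 8) / 4)   # 0.3173828125 254.75
```

The second value, 254.75, is FE.C in hexadecimal when two fractional bits are kept, matching the quotient reported for the second embodiment; the exact quotient E916 / EA is 255.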
The invention can be applied in integrated circuits (and related programmable logic devices) and in the corresponding algorithm programs.

Claims (9)

1. A divider, comprising:
A normalization unit, a first-stage multiplier unit, a second-stage multiplier unit and a quotient truncation unit connected in sequence; a lookup unit is additionally connected between the normalization unit and the second-stage multiplier unit; the normalization unit also has an output connected to an input of the quotient truncation unit;
Based on the dividend (X) and divisor (Y) flagged by the input valid signal, the normalization unit outputs an enable signal together with the multiplicand and multiplier to the first-stage multiplier unit, outputs a shift reference signal to the quotient truncation unit, and outputs the corresponding lookup address to the lookup unit;
According to the enable signal output by the normalization unit, the first-stage multiplier unit multiplies the multiplicand and multiplier output by the normalization unit and then inputs the result to the second-stage multiplier unit; it also sends an enable signal to the second-stage multiplier unit;
According to the lookup address output by the normalization unit, the lookup unit reads out the data stored in it in advance and outputs them to the second-stage multiplier unit, so that the third-stage operation can be completed;
According to the enable signal sent by the first-stage multiplier unit, the second-stage multiplier unit takes the data output by the first-stage multiplier unit as the multiplicand and the data obtained from the lookup as the multiplier, and multiplies them to produce the final output;
According to the shift reference signal output by the normalization unit, the quotient truncation unit formats the output of the second-stage multiplier unit to obtain the integer part and the fractional part of the quotient.
2. The divider as claimed in claim 1, characterized in that the normalization unit processes the dividend and the divisor separately:
When the input operation is valid, the dividend and the divisor are each shifted left until the most significant bit is 1, and the final results (x, y) and the shift counts (Ex, Ey) are recorded;
The next step then operates on x, y, Ex and Ey: x is output as the multiplicand for the first-stage multiplier unit; the high 6 bits of y minus the remaining low-order bits are output as the multiplier for the first-stage multiplier unit; the high 6 bits of y are sent to the lookup unit as the lookup address; and the shift reference signal sent to the quotient truncation unit is the difference between Ex and Ey.
3. The divider as claimed in claim 2, characterized in that the operation of the normalization unit completes within one clock cycle; that is, the enable signal output by this unit is the currently input valid signal delayed by one clock cycle, which yields the enable signal input to the first-stage multiplier unit.
4. The divider as claimed in claim 1 or 2, characterized in that the operation of the first-stage multiplier unit completes within one clock cycle, so the enable signal for the second-stage multiplier unit is output on the clock after the input valid signal is asserted, enabling the second-stage multiplier unit.
5. The divider as claimed in claim 4, characterized in that the lookup unit stores a table needed during the computation, consisting of 32 entries of 12 bits each, 384 bits in total, whose data are obtained by computation; the table can be implemented as a ROM or by pure combinational logic, as required.
6. The divider as claimed in claim 5, characterized in that the table data in the lookup unit are an approximation obtained from an optimized Taylor series expansion of the reciprocal of the divisor, and the table output is treated as a 12-bit pure fraction; the operation completes within one clock cycle, and the lookup unit uses registered output.
7. The divider as claimed in claim 5, characterized in that the operation of the second-stage multiplier unit completes within one clock cycle, so the enable signal for the quotient truncation unit is output on the clock after the input valid signal is asserted, enabling the truncation unit.
8. The divider as claimed in claim 7, characterized in that the behaviour of the quotient truncation unit is determined by the input shift reference signal; the complexity of the quotient truncation unit is low, so in practice it can be merged with other logic circuits, allowing the divider to complete a computation in 3 clock cycles.
9. The divider as claimed in claim 8, characterized in that the operation of the quotient truncation unit completes within one clock cycle, and the quotient valid signal is output on the clock after the input valid signal; in pipelined operation, the quotient valid signal output by this unit is the currently input valid signal delayed by one clock cycle accordingly.
CNB2005100298581A 2005-09-22 2005-09-22 Fast pipeline type divider Active CN100367191C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100298581A CN100367191C (en) 2005-09-22 2005-09-22 Fast pipeline type divider

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100298581A CN100367191C (en) 2005-09-22 2005-09-22 Fast pipeline type divider

Publications (2)

Publication Number Publication Date
CN1740962A true CN1740962A (en) 2006-03-01
CN100367191C CN100367191C (en) 2008-02-06

Family

ID=36093367

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100298581A Active CN100367191C (en) 2005-09-22 2005-09-22 Fast pipeline type divider

Country Status (1)

Country Link
CN (1) CN100367191C (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101685384B (en) * 2008-09-23 2011-10-26 华晶科技股份有限公司 Integer division operational circuit tolerating errors
CN102231101A (en) * 2011-07-29 2011-11-02 电子科技大学 Divider and division processing method
CN101295237B (en) * 2007-04-25 2012-03-21 四川虹微技术有限公司 High-speed divider for quotient and balance
CN102460424A (en) * 2009-06-10 2012-05-16 新思科技有限公司 Multiplicative division circuit with reduced area
CN108595147A (en) * 2018-01-02 2018-09-28 上海兆芯集成电路有限公司 Microprocessor with series operation execution circuit
CN115407965A (en) * 2022-11-01 2022-11-29 南京航空航天大学 High-performance approximate divider based on Taylor expansion and error compensation method

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104615404A (en) * 2015-02-15 2015-05-13 浪潮电子信息产业股份有限公司 High-speed floating-point division unit based on table look-up

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5563818A (en) * 1994-12-12 1996-10-08 International Business Machines Corporation Method and system for performing floating-point division using selected approximation values
US6782405B1 (en) * 2001-06-07 2004-08-24 Southern Methodist University Method and apparatus for performing division and square root functions using a multiplier and a multipartite table
FI20011610A0 (en) * 2001-08-07 2001-08-07 Nokia Corp Method and apparatus for performing a division calculation
JP2003303096A (en) * 2002-04-11 2003-10-24 Olympus Optical Co Ltd Division circuit

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101295237B (en) * 2007-04-25 2012-03-21 四川虹微技术有限公司 High-speed divider for quotient and balance
CN101685384B (en) * 2008-09-23 2011-10-26 华晶科技股份有限公司 Integer division operational circuit tolerating errors
CN102460424A (en) * 2009-06-10 2012-05-16 新思科技有限公司 Multiplicative division circuit with reduced area
CN102460424B (en) * 2009-06-10 2016-03-23 新思科技有限公司 There is the multiplicative division circuit reducing area
CN102231101A (en) * 2011-07-29 2011-11-02 电子科技大学 Divider and division processing method
CN108595147A (en) * 2018-01-02 2018-09-28 上海兆芯集成电路有限公司 Microprocessor with series operation execution circuit
CN108595147B (en) * 2018-01-02 2021-03-23 上海兆芯集成电路有限公司 Microprocessor with series operation execution circuit
CN115407965A (en) * 2022-11-01 2022-11-29 南京航空航天大学 High-performance approximate divider based on Taylor expansion and error compensation method

Also Published As

Publication number Publication date
CN100367191C (en) 2008-02-06

Similar Documents

Publication Publication Date Title
CN109828744B (en) Configurable floating point vector multiplication IP core based on FPGA
CN102629189B (en) Water floating point multiply-accumulate method based on FPGA
US8429217B2 (en) Executing fixed point divide operations using a floating point multiply-add pipeline
JP2662196B2 (en) Calculation result normalization method and apparatus
CN1740962A (en) Fast pipeline type divider
CN101650642B (en) Floating point addition device based on complement rounding
CN106325811A (en) Method in microprocessor
CN101630243B (en) Transcendental function device and method for realizing transcendental function utilizing same
CN103984522B (en) Fixed point and the implementation method of floating-point mixing division in GPDSP
KR101085810B1 (en) Multi-stage floating-point accumulator
CN105335127A (en) Scalar operation unit structure supporting floating-point division method in GPDSP
CN101692202A (en) 64-bit floating-point multiply accumulator and method for processing flowing meter of floating-point operation thereof
US6351760B1 (en) Division unit in a processor using a piece-wise quadratic approximation technique
US8019805B1 (en) Apparatus and method for multiple pass extended precision floating point multiplication
CN102253822B (en) Modular (2<n>-3) multiplier
CN101650643A (en) Rounding method for indivisible floating point division radication
Ide et al. A 320 MFLOPS CMOS floating-point processing unit for superscalar processors
CN2847379Y (en) Quick divider
CN109298848A (en) The subduplicate circuit of double mode floating-point division
CN1508672A (en) Micro controller IP nucleus
Pazhani et al. High-Speed and Area-Efficient Modified Binary Divider
Bruintjes Design of a fused multiply-add floating-point and integer datapath
Dieter et al. Low-cost microarchitectural support for improved floating-point accuracy
Yang et al. A high performance and full utilization hardware implementation of floating point arithmetic units
Hsiao et al. Design of a low-cost floating-point programmable vertex processor for mobile graphics applications based on hybrid number system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: GUANGDIAN ELECTRONIC CO., LTD., SHANGHAI

Free format text: FORMER OWNER: CENTRAL RESEARCH ACADEMY OF SVA GROUP

Effective date: 20120615

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20120615

Address after: 200233 No. 168, Shanghai, Tianlin Road

Patentee after: Guangdian Electronic Co., Ltd., Shanghai

Address before: 200233, No. 2, building 757, Yishan Road, Shanghai

Patentee before: Central Institute of Shanghai Video and Audio (Group) Co., Ltd.

C56 Change in the name or address of the patentee

Owner name: INESA ELECTRON CO., LTD.

Free format text: FORMER NAME: SVA ELECTRON CO., LTD.

CP03 Change of name, title or address

Address after: 200233 Building 1, building 200, Zhang Heng Road, Zhangjiang hi tech park, Shanghai, Pudong New Area, 2

Patentee after: INESA Electron Co., Ltd.

Address before: 200233 No. 168, Shanghai, Tianlin Road

Patentee before: Guangdian Electronic Co., Ltd., Shanghai