CN104615404A - High-speed floating-point division unit based on table look-up - Google Patents
High-speed floating-point division unit based on table look-up
- Publication number: CN104615404A
- Application number: CN201510081089.3A
- Authority: CN (China)
- Prior art keywords: look, floating, inverse, point, square root
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention provides a high-speed floating-point division unit based on table look-up and relates to the technical field of computing. The unit comprises a 12-bit multiplexed fixed-point adder (1), a 12-bit fixed-point adder (2), a reciprocal look-up table (3), a reciprocal square root look-up table (4), exponent generation logic (5) and mantissa generation logic (6). Building on this independent logic design, SIMD (single instruction, multiple data) operations are added, and the computing speed of fixed-point reciprocal solving (division) and reciprocal square root solving in signal processing is greatly increased.
Description
Technical field
The present invention relates to the field of computing technology, and in particular to a high-speed floating-point division unit based on table look-up.
Background art
With the emergence of new application algorithms in areas such as high-speed communication and multimedia processing, floating-point division has become a basic floating-point operation, but it is also the most complex of the floating-point arithmetic operations. Floating-point division can be implemented in three ways: look-up table methods, functional iteration methods and digit iteration methods. Although most modern general-purpose microprocessors implement floating-point division, division remains a performance bottleneck in these processors.
Summary of the invention
Floating-point division is algorithmically complex, and implementing a dedicated divide instruction module incurs a large hardware resource overhead. To improve the performance of floating-point division in applications while reducing hardware resource cost and power consumption, the present invention implements the floating-point reciprocal and reciprocal square root operations by table look-up and adds SIMD operation. The device supports reciprocal and reciprocal square root operations on both double-precision and single-precision floating-point data.
The present invention comprises: (1) a 12-bit multiplexed fixed-point adder; (2) a 12-bit fixed-point adder; (3) a reciprocal look-up table; (4) a reciprocal square root look-up table; (5) exponent generation logic; (6) mantissa generation logic. Wherein:
(1) 12-bit multiplexed fixed-point adder: computes the exponent of the result floating-point number for the SIMD high-order data;
(2) 12-bit fixed-point adder: computes the exponent of the result floating-point number for the SIMD low-order data;
(3) reciprocal look-up table: looks up and generates the mantissa part of the reciprocal of the floating-point data;
(4) reciprocal square root look-up table: looks up and generates the mantissa part of the reciprocal square root of the floating-point data;
(5) exponent generation logic: generates the exponent part of the result floating-point number, corrected according to the earlier computation results;
(6) mantissa generation logic: generates the mantissa of the result floating-point number.
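As a rough illustration of why two exponent adders are listed, the following Python sketch assumes a 64-bit operand word that holds either one double-precision value or two packed single-precision values. That packing, and the helper names `pack_two_singles`, `lane_exponents` and `double_exponent`, are assumptions made for illustration only; the text does not specify the word layout.

```python
import struct

def pack_two_singles(high, low):
    """Pack two IEEE-754 singles into one 64-bit word (assumed layout: high lane in bits 63..32)."""
    hi = struct.unpack(">I", struct.pack(">f", high))[0]
    lo = struct.unpack(">I", struct.pack(">f", low))[0]
    return (hi << 32) | lo

def lane_exponents(word):
    """Biased 8-bit exponent fields of the high and low single-precision lanes."""
    high_exp = (word >> 55) & 0xFF  # bits 62..55 of the word
    low_exp = (word >> 23) & 0xFF   # bits 30..23 of the word
    return high_exp, low_exp

def double_exponent(word):
    """Biased 11-bit exponent field when the word is read as one double."""
    return (word >> 52) & 0x7FF
```

Under this assumed layout, an 11-bit double-precision exponent plus one extra bit fits a 12-bit adder, and the same adder could be reused for the high lane in SIMD mode while a second adder handles the low lane.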
The present invention supports reciprocal and reciprocal square root operations on both double-precision and single-precision floating-point data.
The table look-up computation of the reciprocal and the reciprocal square root consists mainly of two steps:
First step: compute the exponent of the result. For the reciprocal, the result exponent is the negation of the true (unbiased) exponent of the source operand; for the reciprocal square root, the result exponent is $-\frac{1}{2}$ times the true exponent of the source operand;
Second step: using the upper 8 bits of the source operand's mantissa, look up the table to obtain the upper 8 bits of the result mantissa.
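The exponent step can be sketched in Python for single-precision normal numbers. The helper names are hypothetical, and two details are assumptions not stated in the text: the result exponent is lowered by one whenever the result significand would fall below 1 and needs renormalising, and an odd exponent in the reciprocal square root case is handled by folding one factor of 2 into the significand.

```python
import struct

def split_float32(x):
    """Return (sign, biased exponent, 23-bit fraction) of x encoded as an IEEE-754 single."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def reciprocal_exponent(biased_exp, fraction):
    """Biased exponent of 1/x for a normal single-precision x."""
    true_exp = biased_exp - 127
    result = -true_exp              # negate the true exponent
    if fraction != 0:
        result -= 1                 # 1/m is below 1 when m > 1, so renormalise
    return result + 127

def rsqrt_exponent(biased_exp, fraction):
    """Biased exponent of 1/sqrt(x) for a normal single-precision x."""
    true_exp = biased_exp - 127
    odd = true_exp & 1
    even_exp = true_exp - odd       # fold an odd exponent into the significand
    result = -(even_exp // 2)       # -1/2 times the (even) true exponent
    if odd or fraction != 0:
        result -= 1                 # 1/sqrt(m) is below 1 when m > 1, so renormalise
    return result + 127
```

For example, 8.0 has true exponent 3, so its reciprocal 0.125 has true exponent -3, which is biased exponent 124.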
In practice, when the required precision is low, the table look-up result can be used directly as the computation result; for application algorithms that require higher precision, functional iteration is needed to improve the precision of the result. In that case the look-up table supplies the multiplicative iteration with an approximate value of the divisor's reciprocal or reciprocal square root.
In the present invention, the upper 8 bits of the floating-point mantissa are used to obtain, by table look-up, a result with a precision of […]. This result is then refined with the Newton-Raphson method, which finds a root of a function using the first few terms of its Taylor series. The iterative formula that increases the precision of the floating-point reciprocal is given as formula 1, and that of the reciprocal square root as formula 2, where $V$ is the source operand and $x_i$ is the approximation after $i$ iterations.
$$x_{i+1} = x_i\,(2 - V x_i) \qquad \text{(formula 1)}$$
$$x_{i+1} = \tfrac{1}{2}\,x_i\,(3 - V x_i^2) \qquad \text{(formula 2)}$$
Here, the initial approximation of the reciprocal or reciprocal square root of V is obtained by table look-up. With the above iterative formulas, each iteration doubles the precision of the result. Based on the above algorithm, floating-point operations of the forms […] and […] can be implemented.
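The refinement step can be sketched in Python using the standard Newton-Raphson recurrences for the reciprocal, x ← x(2 − Vx), and for the reciprocal square root, x ← x(3 − Vx²)/2. These are the textbook forms and are assumed here to correspond to formulas 1 and 2. The function names and the seed values are illustrative; in the device the seed would come from the look-up table.

```python
def refine_reciprocal(v, x, iterations):
    """Refine x toward 1/v with Newton-Raphson steps x <- x * (2 - v * x)."""
    for _ in range(iterations):
        x = x * (2.0 - v * x)
    return x

def refine_rsqrt(v, x, iterations):
    """Refine x toward 1/sqrt(v) with Newton-Raphson steps x <- x * (3 - v * x * x) / 2."""
    for _ in range(iterations):
        x = 0.5 * x * (3.0 - v * x * x)
    return x

if __name__ == "__main__":
    v = 1.7
    for n in range(4):
        # The error shrinks roughly quadratically with each iteration.
        print(n, abs(refine_reciprocal(v, 0.59, n) - 1.0 / v))
```

Each step uses only multiplications and a subtraction, which is why the iteration maps well onto a multiplier or multiply-accumulate unit.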
The reciprocal look-up table determines the performance, hardware resource cost and computational precision of the floating-point reciprocal operation, and its capacity grows exponentially with the mantissa width and the precision. The key to a look-up-table implementation of the floating-point reciprocal is therefore the construction of the table, which must balance precision against hardware resource cost.
At present there are three main methods for constructing floating-point reciprocal look-up tables: the direct approximation method, the linear approximation method and the partial product array (PPA) method. The look-up table in the present invention uses direct approximation, which is also the most commonly used look-up method in division based on an initial reciprocal value. Constructing the look-up table amounts to producing a sequence of reciprocal values with a given precision.
For a given single-precision floating-point number X conforming to the IEEE-754 standard, the exponent of the reciprocal or reciprocal square root can be obtained quickly from the exponent of the source operand, while the mantissa bits are obtained from the look-up table. For example, for the floating-point reciprocal instruction, the present invention uses the upper 8 bits of the mantissa to index the reciprocal look-up table, and the reciprocal approximation read from the table is also 8 bits wide. The capacity of the reciprocal look-up table is therefore $2^8 \times 8 = 2048$ bits. The reciprocal approximation for table entry i is given by formula 3:
formula 3
where m is the number of index bits and n is the number of bits of the output approximation.
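A direct-approximation reciprocal table with m = 8 index bits and n = 8 output bits can be sketched in Python as follows. Formula 3 is not reproduced in this text, so the entry rule used here (the reciprocal of the midpoint of each significand interval, renormalised and rounded to n bits) is an assumption chosen for illustration, and `build_reciprocal_table` and `lookup_reciprocal` are hypothetical names.

```python
M_BITS = 8  # index width m: upper mantissa bits used as the table index
N_BITS = 8  # output width n: bits stored per table entry

def build_reciprocal_table(m=M_BITS, n=N_BITS):
    """Build 2**m entries of n bits each approximating 1/s for s in [1, 2)."""
    table = []
    for i in range(1 << m):
        mid = 1.0 + (i + 0.5) / (1 << m)        # midpoint of the i-th significand interval
        scaled = 2.0 / mid                      # reciprocal renormalised into (1, 2)
        entry = round((scaled - 1.0) * (1 << n))
        table.append(min(entry, (1 << n) - 1))  # keep the entry within n bits
    return table

def lookup_reciprocal(significand, table, m=M_BITS, n=N_BITS):
    """Approximate 1/significand for a significand in [1, 2)."""
    index = int((significand - 1.0) * (1 << m))  # upper m fraction bits
    return (1.0 + table[index] / (1 << n)) / 2.0

if __name__ == "__main__":
    table = build_reciprocal_table()
    print(len(table) * N_BITS)  # capacity in bits: 256 entries * 8 bits = 2048
```

Only the fraction bits of the renormalised reciprocal are stored, since its leading bit is always 1; this is what lets 256 entries of 8 bits cover the whole significand range.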
The look-up table entries of the reciprocal for all values of the upper 8 mantissa bits can be obtained by the above calculation. The look-up table entries for the floating-point reciprocal square root can be obtained in the same way. For the reciprocal square root, the present invention uses the upper 9 bits of the mantissa as the index, and the approximation read from the table is again 8 bits wide, so the capacity of the reciprocal square root look-up table is $2^9 \times 8 = 4096$ bits. A look-up table constructed in this way yields a result with a precision of […]. Look-up tables for the double-precision reciprocal and reciprocal square root can be constructed by a similar method.
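The capacities quoted above follow from entries × bits per entry, and a reciprocal square root table with a 9-bit index can be sketched the same way. The midpoint entry rule and the function names below are assumptions for illustration. The sketch covers significands in [1, 2) only; the extra factor needed for an odd exponent is left out.

```python
import math

def table_capacity_bits(index_bits, output_bits):
    """Capacity of a look-up table: 2**index_bits entries of output_bits each."""
    return (1 << index_bits) * output_bits

def build_rsqrt_table(m=9, n=8):
    """Build 2**m entries of n bits each approximating 1/sqrt(s) for s in [1, 2)."""
    size = 1 << m
    top = (1 << n) - 1
    table = []
    for i in range(size):
        mid = 1.0 + (i + 0.5) / size          # midpoint of the i-th significand interval
        scaled = 2.0 / math.sqrt(mid)         # result renormalised into (1, 2)
        table.append(min(round((scaled - 1.0) * (1 << n)), top))
    return table

if __name__ == "__main__":
    print(table_capacity_bits(8, 8))  # reciprocal table: 2048 bits
    print(table_capacity_bits(9, 8))  # reciprocal square root table: 4096 bits
```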
Brief description of the drawings
Fig. 1 is a schematic diagram of the structure of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawing.
Based on the above, the structure of the high-speed floating-point division unit based on table look-up is shown in Fig. 1. The advantages of this design are high computation speed and a simple structure; no complex computation process is needed, which greatly reduces design difficulty.
The present invention implements the floating-point reciprocal and reciprocal square root operations by table look-up and adds SIMD operation. This design supports reciprocal and reciprocal square root operations on both double-precision and single-precision floating-point data.
The present invention comprises: (1) a 12-bit multiplexed fixed-point adder; (2) a 12-bit fixed-point adder; (3) a reciprocal look-up table; (4) a reciprocal square root look-up table; (5) exponent generation logic; (6) mantissa generation logic. Wherein:
(1) 12-bit multiplexed fixed-point adder: computes the exponent of the result floating-point number for the SIMD high-order data;
(2) 12-bit fixed-point adder: computes the exponent of the result floating-point number for the SIMD low-order data;
(3) reciprocal look-up table: looks up and generates the mantissa part of the reciprocal of the floating-point data;
(4) reciprocal square root look-up table: looks up and generates the mantissa part of the reciprocal square root of the floating-point data;
(5) exponent generation logic: generates the exponent part of the result floating-point number, corrected according to the earlier computation results;
(6) mantissa generation logic: generates the mantissa of the result floating-point number;
The exponent generation logic (5) and the mantissa generation logic (6) are connected to the result selection and output interface.
However, the look-up method alone is suitable only when the application algorithm's precision requirement is low. To guarantee precision while reducing the number of iterations, the precision of the initial approximation must be raised, which makes the look-up table capacity grow exponentially and greatly increases area overhead. In the present invention, the look-up table resources are shared between single-precision and double-precision operations, which greatly reduces area overhead. Precision is then improved with the Newton-Raphson algorithm, using the multiplier of the MAC unit to perform multiple iterations.
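The trade-off between table size and iteration count can be made concrete with a small Python sketch. It assumes an idealised model in which every Newton-Raphson iteration exactly doubles the number of correct bits and rounding effects are ignored; the function names are hypothetical.

```python
def table_bits(index_bits, output_bits):
    """Table capacity in bits: 2**index_bits entries of output_bits each."""
    return (1 << index_bits) * output_bits

def iterations_needed(initial_bits, target_bits):
    """Iterations required if each one doubles the number of correct bits."""
    count, bits = 0, initial_bits
    while bits < target_bits:
        bits *= 2
        count += 1
    return count

if __name__ == "__main__":
    # 8-bit seed: small table, more iterations.
    print(table_bits(8, 8), iterations_needed(8, 24), iterations_needed(8, 53))
    # 16-bit seed: one iteration fewer for single precision, but a far larger table.
    print(table_bits(16, 16), iterations_needed(16, 24))
```

Under this model an 8-bit seed needs two iterations to cover the 24-bit single-precision significand and three to cover the 53-bit double-precision significand, while a 16-bit seed saves one single-precision iteration at the cost of a table that is 512 times larger.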
Claims (3)
1. A high-speed floating-point division unit based on table look-up, characterized in that
it comprises: (1) a 12-bit multiplexed fixed-point adder, (2) a 12-bit fixed-point adder, (3) a reciprocal look-up table, (4) a reciprocal square root look-up table, (5) exponent generation logic, (6) mantissa generation logic; wherein:
(1) 12-bit multiplexed fixed-point adder: computes the exponent of the result floating-point number for the SIMD high-order data;
(2) 12-bit fixed-point adder: computes the exponent of the result floating-point number for the SIMD low-order data;
(3) reciprocal look-up table: looks up and generates the mantissa part of the reciprocal of the floating-point data;
(4) reciprocal square root look-up table: looks up and generates the mantissa part of the reciprocal square root of the floating-point data;
(5) exponent generation logic: generates the exponent part of the result floating-point number, corrected according to the earlier computation results;
(6) mantissa generation logic: generates the mantissa of the result floating-point number.
2. The device according to claim 1, characterized in that the upper 8 bits of the mantissa are used to index the reciprocal look-up table, the reciprocal approximation obtained from the table is also 8 bits wide, and the capacity of the reciprocal look-up table is 2048 bits.
3. The device according to claim 1, characterized in that, for the floating-point reciprocal square root, the upper 9 bits of the mantissa are used as the index, the approximation obtained from the table is also 8 bits wide, and the capacity of the reciprocal square root look-up table is therefore 4096 bits.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510081089.3A CN104615404A (en) | 2015-02-15 | 2015-02-15 | High-speed floating-point division unit based on table look-up |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104615404A (en) | 2015-05-13 |
Family
ID=53149870
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510081089.3A Pending CN104615404A (en) | 2015-02-15 | 2015-02-15 | High-speed floating-point division unit based on table look-up |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104615404A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20150513 |