CN104615404A - High-speed floating-point division unit based on table look-up - Google Patents


Info

Publication number
CN104615404A
Authority
CN
China
Prior art keywords
look
floating
inverse
point
square root
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510081089.3A
Other languages
Chinese (zh)
Inventor
邹晓峰
童元满
李仁刚
李拓
刘金广
李国川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN201510081089.3A priority Critical patent/CN104615404A/en
Publication of CN104615404A publication Critical patent/CN104615404A/en
Pending legal-status Critical Current


Abstract

The invention provides a high-speed floating-point division unit based on table look-up and relates to the technical field of computing. The high-speed floating-point division unit comprises a 12-bit multiplexing fixed-point adder (1), a 12-bit fixed-point adder (2), a reciprocal look-up table (3), a reciprocal square root look-up table (4), an exponent generating logic (5) and a mantissa generating logic (6). Based on the independent logic design, SIMD (single instruction multiple data) operations are added, and the computing speed of fixed-point reciprocal solving (division) and reciprocal square root solving in signal processing is greatly increased.

Description

A high-speed floating-point division unit based on table look-up
Technical field
The present invention relates to the field of computing technology, and in particular to a high-speed floating-point division unit based on table look-up.
Background technology
As new application algorithms such as high-speed communication and multimedia processing emerge, floating-point division has become a basic floating-point operation, yet it is also the most complex of the floating-point arithmetic operations. Floating-point division can be implemented in three ways: the look-up table method, the functional iteration method and the numerical iteration method. Although most modern general-purpose microprocessors implement floating-point division, division remains a performance bottleneck in these processors.
Summary of the invention
The floating-point division algorithm is inherently complex, and implementing a dedicated division instruction module incurs a large hardware resource overhead. To improve the performance of floating-point division in applications while reducing hardware resource cost and power consumption, the present invention implements the floating-point reciprocal and reciprocal square root operations by table look-up and adds SIMD operation. The device can compute the reciprocal and the reciprocal square root of double-precision and single-precision floating-point data.
The present invention comprises: (1) a 12-bit multiplexing fixed-point adder; (2) a 12-bit fixed-point adder; (3) a reciprocal look-up table; (4) a reciprocal square root look-up table; (5) exponent generation logic; (6) mantissa generation logic. Wherein:
(1) 12-bit multiplexing fixed-point adder: calculates the exponent of the result floating-point number for the SIMD high-order data;
(2) 12-bit fixed-point adder: calculates the exponent of the result floating-point number for the SIMD low-order data;
(3) reciprocal look-up table: looks up and generates the mantissa part of the reciprocal of the floating-point data;
(4) reciprocal square root look-up table: looks up and generates the mantissa part of the reciprocal square root of the floating-point data;
(5) exponent generation logic: generates the exponent part of the result floating-point number, corrected according to the preceding calculation results;
(6) mantissa generation logic: generates the mantissa of the result floating-point number.
The present invention can compute the reciprocal and the reciprocal square root of double-precision and single-precision floating-point data.
Computing the reciprocal or the reciprocal square root by table look-up consists of two main steps:
Step 1: calculate the exponent of the result. For the reciprocal, the result exponent is the negation of the true (unbiased) exponent of the source operand; for the reciprocal square root, the result exponent is -1/2 times the true exponent of the source operand.
Step 2: using the upper 8 bits of the mantissa of the source operand, look up the table to obtain the upper 8 bits of the result mantissa.
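The exponent step can be sketched in a few lines of Python. This is an illustrative model only, not the patented circuit: the helper names `split`, `reciprocal_exponent` and `rsqrt_exponent` are hypothetical, and the handling of odd exponents (folding one factor of two into the mantissa so that the halving is exact) is an assumption, since the text does not spell it out.

```python
import math

def split(x):
    """Return (m, e) with x = m * 2**e and 1 <= m < 2, for finite x > 0."""
    f, e = math.frexp(x)          # x = f * 2**e with 0.5 <= f < 1
    return f * 2.0, e - 1

def reciprocal_exponent(e):
    """1 / (m * 2**e) = (1/m) * 2**(-e): the exponent is negated."""
    return -e

def rsqrt_exponent(e):
    """Return (result exponent, mantissa multiplier) for 1/sqrt(m * 2**e).

    Even e: 1/sqrt(m * 2**e) = 1/sqrt(m)  * 2**(-e/2).
    Odd e:  1/sqrt(m * 2**e) = 1/sqrt(2m) * 2**(-(e-1)/2)  (assumed handling).
    """
    if e % 2 == 0:
        return (-e) // 2, 1.0
    return (1 - e) // 2, 2.0

if __name__ == "__main__":
    m, e = split(12.0)                       # 12 = 1.5 * 2**3
    re, scale = rsqrt_exponent(e)
    print(m, e, reciprocal_exponent(e), re, scale)
```

The mantissa factors `1/m` and `1/sqrt(scale * m)` are what the look-up tables would supply; because they are generally below 1, a final normalization of the exponent is still required, which matches the role the text gives to the exponent generation logic.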
In practice, when the accuracy requirement is low, the look-up result can be used directly as the result of the calculation; for application algorithms with higher accuracy requirements, functional iteration is needed to improve the precision of the result. In that case the look-up table supplies the multiplicative iteration with an approximate reciprocal of the divisor, or an approximate reciprocal square root, as its starting value.
In the present invention, an initial result of limited precision is obtained by looking up the table with the upper 8 bits of the mantissa of the floating-point number. This result is then refined with the Newton-Raphson method, that is, the method of finding the root of a function using the first few terms of its Taylor series. The iterative formula for increasing the precision of the floating-point reciprocal is given as Formula 1, and that for the reciprocal square root as Formula 2.
formula 1
formula 2
Here, the initial approximation of the reciprocal or reciprocal square root of V is obtained by table look-up. With the above iterative formulas, each iteration doubles the precision of the result. Floating-point operations of the corresponding forms can be implemented on the basis of this algorithm.
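As an illustration of the refinement step, the sketch below uses the standard Newton-Raphson iterations for the reciprocal, x ← x(2 − Vx), and for the reciprocal square root, x ← x(3 − Vx²)/2. These are assumed to correspond to the two formulas; the function names and the simulated 8-bit seed are hypothetical.

```python
import math

def refine_reciprocal(v, x, steps=1):
    """Newton-Raphson steps for 1/v, starting from the approximation x."""
    for _ in range(steps):
        x = x * (2.0 - v * x)
    return x

def refine_rsqrt(v, x, steps=1):
    """Newton-Raphson steps for 1/sqrt(v), starting from the approximation x."""
    for _ in range(steps):
        x = 0.5 * x * (3.0 - v * x * x)
    return x

if __name__ == "__main__":
    v = 1.37
    seed = (1.0 / v) * (1.0 + 2.0 ** -8)   # stand-in for an 8-bit table result
    for steps in range(4):
        x = refine_reciprocal(v, seed, steps)
        print(steps, abs(x * v - 1.0))     # relative error after each step
```

Each step uses only multiplications and a subtraction, which is why the iteration maps well onto a multiply-accumulate unit. The relative error is roughly squared per step, so the number of correct bits about doubles.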
The reciprocal look-up table determines the performance, hardware resource cost and accuracy of the floating-point reciprocal operation, and its capacity grows exponentially with the mantissa width and the precision. For a table-based floating-point reciprocal, the key is therefore the construction of the look-up table, which must balance precision against hardware resource cost.
At present there are three main methods of constructing floating-point reciprocal look-up tables: direct approximation, linear approximation and the partial product array (PPA) method. The look-up table in the present invention uses direct approximation, which is also the look-up method most commonly used in division based on an initial reciprocal value. Constructing the look-up table amounts to generating a sequence of reciprocal values of a certain precision.
For a given single-precision floating-point number X conforming to the IEEE-754 standard, when computing the reciprocal or the reciprocal square root, the exponent of the result can be obtained quickly from the exponent of the source operand, while the mantissa bits are obtained from the look-up table. For example, for the floating-point reciprocal instruction, the present invention indexes the reciprocal look-up table with the upper 8 bits of the mantissa, and the reciprocal approximation read from the table is also 8 bits wide. The capacity of the reciprocal look-up table is therefore 2^8 × 8 = 2048 bits. The reciprocal approximation for table entry i is given by Formula 3:
formula 3
Here, m is the number of bits of the table index and n is the number of bits of the output approximation.
The reciprocal look-up table entries for all values of the upper 8 mantissa bits can be obtained with the above calculation. The entries of the floating-point reciprocal square root look-up table can be obtained in the same way. For the floating-point reciprocal square root, the present invention indexes with the upper 9 bits of the mantissa, and the approximation read from the table is again 8 bits wide, so the capacity of the floating-point reciprocal square root look-up table is 2^9 × 8 = 4096 bits. The look-up tables constructed in this way yield a result of limited precision. Look-up tables for the double-precision reciprocal and reciprocal square root can be constructed by a similar method.
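The capacity figures and the idea of a direct-approximation table can be checked with a short Python sketch. The entry rule used here (the reciprocal of the midpoint of each index interval, rounded to n fraction bits) is an assumption chosen for illustration and is not necessarily Formula 3; all function names are hypothetical.

```python
def capacity_bits(m, n):
    """Table size in bits for an m-bit index and n-bit entries."""
    return (1 << m) * n

def build_reciprocal_table(m=8, n=8):
    """Direct-approximation table for 1/x, x in [1, 2), indexed by the top m mantissa bits.

    Assumed rule: each entry holds the n fraction bits of the normalized
    value 2/midpoint, where midpoint is the centre of the index interval.
    """
    table = []
    for i in range(1 << m):
        midpoint = 1.0 + (i + 0.5) / (1 << m)
        frac = 2.0 / midpoint - 1.0            # fraction of the 1.f-normalized result
        table.append(min(round(frac * (1 << n)), (1 << n) - 1))
    return table

def lookup_reciprocal(x, table, m=8, n=8):
    """Approximate 1/x for 1 <= x < 2 from the table alone."""
    i = int((x - 1.0) * (1 << m))              # top m fraction bits of x
    return 0.5 * (1.0 + table[i] / (1 << n))

if __name__ == "__main__":
    table = build_reciprocal_table()
    print(capacity_bits(8, 8), capacity_bits(9, 8))
    print(lookup_reciprocal(1.5, table), 1.0 / 1.5)
```

With an 8-bit index and 8-bit entries the table holds 256 entries, and with a 9-bit index 512 entries, which reproduces the 2048-bit and 4096-bit capacities stated in the text.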
Brief description of the drawings
Fig. 1 is a schematic diagram of the structure of the present invention.
Embodiment
To make the object, technical solution and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawing.
Based on the above analysis, the structure of the high-speed floating-point division unit based on table look-up is shown in Fig. 1. The advantages of this design are high computing speed and a simple structure; no complex computation process is needed, which greatly reduces the design difficulty.
The present invention implements the floating-point reciprocal and reciprocal square root operations by table look-up and adds SIMD operation. This design can compute the reciprocal and the reciprocal square root of double-precision and single-precision floating-point data.
The present invention comprises: (1) a 12-bit multiplexing fixed-point adder; (2) a 12-bit fixed-point adder; (3) a reciprocal look-up table; (4) a reciprocal square root look-up table; (5) exponent generation logic; (6) mantissa generation logic. Wherein:
(1) 12-bit multiplexing fixed-point adder: calculates the exponent of the result floating-point number for the SIMD high-order data;
(2) 12-bit fixed-point adder: calculates the exponent of the result floating-point number for the SIMD low-order data;
(3) reciprocal look-up table: looks up and generates the mantissa part of the reciprocal of the floating-point data;
(4) reciprocal square root look-up table: looks up and generates the mantissa part of the reciprocal square root of the floating-point data;
(5) exponent generation logic: generates the exponent part of the result floating-point number, corrected according to the preceding calculation results;
(6) mantissa generation logic: generates the mantissa of the result floating-point number.
The exponent generation logic (5) and the mantissa generation logic (6) are connected to the result selection and output interface.
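The role of the two exponent adders in SIMD mode can be illustrated with a bit-level Python sketch for the reciprocal. It assumes, purely for illustration, that two single-precision values are packed into one 64-bit word with one value in the upper 32 bits and one in the lower 32 bits; the packing, the function names and the correction rule are assumptions, and only normal (non-zero, non-denormal, finite) operands are handled.

```python
import struct

BIAS = 127  # IEEE-754 single-precision exponent bias

def f32_bits(x):
    """Bit pattern of x rounded to IEEE-754 single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def pack_pair(hi, lo):
    """Pack two single-precision values into one 64-bit word (assumed layout)."""
    return (f32_bits(hi) << 32) | f32_bits(lo)

def recip_exponent_field(bits):
    """Biased exponent of 1/x for a normal single-precision operand.

    1/(1.f * 2**e) lies in (0.5, 1] * 2**(-e), so the exponent is -e when
    f == 0 and -e - 1 otherwise (the correction applied after the addition).
    """
    e = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    return 2 * BIAS - e - (1 if frac else 0)

def simd_recip_exponents(word64):
    """Compute the result exponents of both lanes independently."""
    hi = (word64 >> 32) & 0xFFFFFFFF
    lo = word64 & 0xFFFFFFFF
    return recip_exponent_field(hi), recip_exponent_field(lo)

if __name__ == "__main__":
    print(simd_recip_exponents(pack_pair(3.0, 0.25)))
```

A double-precision exponent field is 11 bits wide, so an adder of 12 bits is wide enough to serve either one double-precision operand or one single-precision lane.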
However, table look-up alone is only suitable when the accuracy requirement of the application algorithm is low. Guaranteeing the precision while reducing the number of iterations requires a more precise initial approximation, which makes the table capacity grow exponentially and greatly increases the area overhead. In the present invention, the look-up tables are shared between single-precision and double-precision operations, which greatly reduces the area overhead. As for precision, the Newton-Raphson algorithm can be applied, using the multiplier of the MAC unit to perform several iterations and thereby improve the precision.

Claims (3)

1. A high-speed floating-point division unit based on table look-up, characterized in that
it comprises: (1) a 12-bit multiplexing fixed-point adder, (2) a 12-bit fixed-point adder, (3) a reciprocal look-up table, (4) a reciprocal square root look-up table, (5) exponent generation logic, (6) mantissa generation logic; wherein:
(1) 12-bit multiplexing fixed-point adder: calculates the exponent of the result floating-point number for the SIMD high-order data;
(2) 12-bit fixed-point adder: calculates the exponent of the result floating-point number for the SIMD low-order data;
(3) reciprocal look-up table: looks up and generates the mantissa part of the reciprocal of the floating-point data;
(4) reciprocal square root look-up table: looks up and generates the mantissa part of the reciprocal square root of the floating-point data;
(5) exponent generation logic: generates the exponent part of the result floating-point number, corrected according to the preceding calculation results;
(6) mantissa generation logic: generates the mantissa of the result floating-point number.
2. The device according to claim 1, characterized in that the reciprocal look-up table is indexed with the upper 8 bits of the mantissa, the reciprocal approximation read from the table is also 8 bits wide, and the capacity of the reciprocal look-up table is 2048 bits.
3. The device according to claim 1, characterized in that, for the floating-point reciprocal square root, the table is indexed with the upper 9 bits of the mantissa, the approximation read from the table is also 8 bits wide, and the capacity of the floating-point reciprocal square root look-up table is therefore 4096 bits.
CN201510081089.3A 2015-02-15 2015-02-15 High-speed floating-point division unit based on table look-up Pending CN104615404A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510081089.3A CN104615404A (en) 2015-02-15 2015-02-15 High-speed floating-point division unit based on table look-up

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510081089.3A CN104615404A (en) 2015-02-15 2015-02-15 High-speed floating-point division unit based on table look-up

Publications (1)

Publication Number Publication Date
CN104615404A true CN104615404A (en) 2015-05-13

Family

ID=53149870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510081089.3A Pending CN104615404A (en) 2015-02-15 2015-02-15 High-speed floating-point division unit based on table look-up

Country Status (1)

Country Link
CN (1) CN104615404A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108123907A (en) * 2017-11-24 2018-06-05 浙江天则通信技术有限公司 A kind of low complex degree equalization method for single carrier frequency domain equalization channel
CN111814107A (en) * 2020-07-10 2020-10-23 上海擎昆信息科技有限公司 Computing system and computing method for realizing reciprocal of square root with high precision
CN113296732A (en) * 2020-06-16 2021-08-24 阿里巴巴集团控股有限公司 Data processing method and device, processor and data searching method and device

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5563818A (en) * 1994-12-12 1996-10-08 International Business Machines Corporation Method and system for performing floating-point division using selected approximation values
US20020143840A1 (en) * 2000-12-20 2002-10-03 Alexei Krouglov Method and apparatus for calculating a reciprocal
CN100367191C (en) * 2005-09-22 2008-02-06 上海广电(集团)有限公司中央研究院 Fast pipeline type divider
CN101216753A (en) * 2008-01-04 2008-07-09 清华大学 Preliminary treatment circuit structure for floating point division and quadratic root algorithm
CN101493760A (en) * 2008-12-24 2009-07-29 京信通信系统(中国)有限公司 High speed divider and method thereof for implementing high speed division arithmetic
CN201359721Y (en) * 2008-12-24 2009-12-09 京信通信系统(中国)有限公司 High-speed divider
CN103180820A (en) * 2010-09-03 2013-06-26 超威半导体公司 Method and apparatus for performing floating-point division
CN101986264B (en) * 2010-11-25 2013-07-31 中国人民解放军国防科学技术大学 Multifunctional floating-point multiply and add calculation device for single instruction multiple data (SIMD) vector microprocessor
CN103809930A (en) * 2014-01-24 2014-05-21 天津大学 Design method of double-precision floating-point divider and divider

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5563818A (en) * 1994-12-12 1996-10-08 International Business Machines Corporation Method and system for performing floating-point division using selected approximation values
US20020143840A1 (en) * 2000-12-20 2002-10-03 Alexei Krouglov Method and apparatus for calculating a reciprocal
CN100367191C (en) * 2005-09-22 2008-02-06 上海广电(集团)有限公司中央研究院 Fast pipeline type divider
CN101216753A (en) * 2008-01-04 2008-07-09 清华大学 Preliminary treatment circuit structure for floating point division and quadratic root algorithm
CN100583024C (en) * 2008-01-04 2010-01-20 清华大学 Preliminary treatment circuit structure for floating point division and quadratic root algorithm
CN101493760A (en) * 2008-12-24 2009-07-29 京信通信系统(中国)有限公司 High speed divider and method thereof for implementing high speed division arithmetic
CN201359721Y (en) * 2008-12-24 2009-12-09 京信通信系统(中国)有限公司 High-speed divider
CN103180820A (en) * 2010-09-03 2013-06-26 超威半导体公司 Method and apparatus for performing floating-point division
CN101986264B (en) * 2010-11-25 2013-07-31 中国人民解放军国防科学技术大学 Multifunctional floating-point multiply and add calculation device for single instruction multiple data (SIMD) vector microprocessor
CN103809930A (en) * 2014-01-24 2014-05-21 天津大学 Design method of double-precision floating-point divider and divider

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
王一帮,等: "一种快速除法算法的FDGA实现", 《舰船防化》 *
邓子椰,邓: "一种基于SRT-8算法的SIMD浮点除法器的设计与实现", 《计算机工程与科学》 *
邹晓峰,等: "高性能浮点与定点转换部件的设计与实现", 《第十七届计算机工程与工艺年会暨第三届微处理器技术论坛论文集(下册)》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108123907A (en) * 2017-11-24 2018-06-05 浙江天则通信技术有限公司 A kind of low complex degree equalization method for single carrier frequency domain equalization channel
CN108123907B (en) * 2017-11-24 2020-08-25 浙江天则通信技术有限公司 Low-complexity equalization method for single carrier frequency domain equalization channel
CN113296732A (en) * 2020-06-16 2021-08-24 阿里巴巴集团控股有限公司 Data processing method and device, processor and data searching method and device
CN113296732B (en) * 2020-06-16 2024-03-01 阿里巴巴集团控股有限公司 Data processing method and device, processor and data searching method and device
CN111814107A (en) * 2020-07-10 2020-10-23 上海擎昆信息科技有限公司 Computing system and computing method for realizing reciprocal of square root with high precision

Similar Documents

Publication Publication Date Title
CN102722352B (en) Booth multiplier
CN103809930B (en) Design method of double-precision floating-point divider and divider
US10303438B2 (en) Fused-multiply-add floating-point operations on 128 bit wide operands
EP3447634B1 (en) Non-linear function computing device and method
CN106155627B (en) Low overhead iteration trigonometric device based on T_CORDIC algorithm
JPH02196328A (en) Floating point computing apparatus
KR102581403B1 (en) Shared hardware logic unit and method for reducing die area
CN104133656A (en) Floating point number divider adopting shift and subtraction operation by tail codes and floating point number division operation method adopting shift and subtraction operation by tail codes
CN112051980A (en) Non-linear activation function computing device based on Newton iteration method
CN103135960A (en) Design method of integrated floating point unit based on FPGA (field programmable gate array)
CN104615404A (en) High-speed floating-point division unit based on table look-up
Kobel et al. Fast approximate polynomial multipoint evaluation and applications
CN111984226B (en) Cube root solving device and solving method based on hyperbolic CORDIC
Singh et al. Design and synthesis of goldschmidt algorithm based floating point divider on FPGA
CN103176948A (en) Single precision elementary function operation accelerator low in cost
Bruguera et al. Design of a pipelined radix 4 CORDIC processor
Wang et al. $(M, p, k) $-Friendly Points: A Table-Based Method to Evaluate Trigonometric Function
US9720648B2 (en) Optimized structure for hexadecimal and binary multiplier array
Shuang-yan et al. Design and implementation of a 64/32-bit floating-point division, reciprocal, square root, and inverse square root unit
Bokade et al. CLA based 32-bit signed pipelined multiplier
Ercegovac et al. Design of a complex divider
Xia et al. Research and optimization on methods for reciprocal approximation
Ravi et al. Analysis and study of different multipliers to design floating point MAC units for digital signal processing applications
RU2449354C1 (en) Vector normalising apparatus
Iyer et al. Generalised Algorithm for Multiplying Binary Numbers Via Vedic Mathematics

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150513