CN104615404A

CN104615404A - High-speed floating-point division unit based on table look-up

Info

Publication number: CN104615404A
Application number: CN201510081089.3A
Authority: CN
Inventors: 邹晓峰; 童元满; 李仁刚; 李拓; 刘金广; 李国川
Original assignee: Inspur Electronic Information Industry Co Ltd
Current assignee: Inspur Electronic Information Industry Co Ltd
Priority date: 2015-02-15
Filing date: 2015-02-15
Publication date: 2015-05-13

Abstract

The invention provides a high-speed floating-point division unit based on table look-up and relates to the technical field of computing. The unit comprises a 12-bit multiplexed fixed-point adder (1), a 12-bit fixed-point adder (2), a reciprocal look-up table (3), a reciprocal square root look-up table (4), exponent generation logic (5), and mantissa generation logic (6). Built on an independent logic design with added SIMD (single instruction, multiple data) operation, the unit greatly increases the speed of computing fixed-point reciprocals (division) and reciprocal square roots in signal processing.

Description

A high-speed floating-point division unit based on table look-up

Technical field

The present invention relates to the field of computing technology, and in particular to a high-speed floating-point division unit based on table look-up.

Background art

With the emergence of new application algorithms such as high-speed communication and multimedia processing, floating-point division has become a basic floating-point operation, yet it is also the most complex of the floating-point arithmetic operations. Floating-point division can be implemented in three ways: table look-up, functional iteration, and numerical iteration. Although most modern general-purpose microprocessors implement floating-point division, division remains a performance bottleneck in these processors.

Summary of the invention

Floating-point division algorithms are inherently complex, and implementing a dedicated division instruction module costs considerable hardware resources. To improve the performance of floating-point division in applications while reducing hardware resource cost and power consumption, the present invention implements the floating-point reciprocal and reciprocal square root operations by table look-up and adds SIMD operation. The device supports the reciprocal and the reciprocal square root of double-precision and single-precision floating-point data.

The present invention comprises: (1) a 12-bit multiplexed fixed-point adder; (2) a 12-bit fixed-point adder; (3) a reciprocal look-up table; (4) a reciprocal square root look-up table; (5) exponent generation logic; (6) mantissa generation logic. Wherein:

(1) 12-bit multiplexed fixed-point adder: computes the exponent of the result floating-point number for the SIMD high-order data;

(2) 12-bit fixed-point adder: computes the exponent of the result floating-point number for the SIMD low-order data;

(3) reciprocal look-up table: looks up and generates the mantissa part of the reciprocal of the floating-point data;

(4) reciprocal square root look-up table: looks up and generates the mantissa part of the reciprocal square root of the floating-point data;

(5) exponent generation logic: generates the exponent part of the result floating-point number by correcting the preliminary calculation result;

(6) mantissa generation logic: generates the mantissa of the result floating-point number.
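The division of labour between the two exponent adders can be illustrated with a small software sketch. It assumes, purely for illustration, that a 64-bit word carries two IEEE-754 single-precision values (a high lane and a low lane) and that the reciprocal exponent is corrected by one whenever the mantissa fraction is non-zero; the lane layout and all function names (`split_lanes`, `reciprocal_biased_exponent`, and so on) are hypothetical and are not taken from the patent.

```python
import struct

def float_bits(x):
    """Bit pattern of x as an IEEE-754 single-precision value."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def pack(high, low):
    """Pack two single-precision values into one 64-bit word (assumed layout)."""
    return (float_bits(high) << 32) | float_bits(low)

def split_lanes(word64):
    """Split a 64-bit word into its (high, low) 32-bit lanes."""
    return (word64 >> 32) & 0xFFFFFFFF, word64 & 0xFFFFFFFF

def biased_exponent(bits32):
    """Extract the 8-bit biased exponent field."""
    return (bits32 >> 23) & 0xFF

def reciprocal_biased_exponent(bits32):
    """Biased exponent of 1/x for a normal single-precision x.

    1 / (1.f * 2^e) equals 2^(-e) when f == 0, and
    (2 / 1.f) * 2^(-e-1) otherwise, so a non-zero fraction
    lowers the result exponent by one (the correction step).
    """
    e = biased_exponent(bits32)
    fraction = bits32 & 0x7FFFFF
    return 2 * 127 - e - (1 if fraction else 0)

high, low = split_lanes(pack(4.0, 6.0))
print(reciprocal_biased_exponent(high))  # exponent field of 1/4.0
print(reciprocal_biased_exponent(low))   # exponent field of 1/6.0
```

In this sketch each lane needs only one small integer addition, which is consistent with using one narrow adder per SIMD lane; special values (zero, infinity, NaN, subnormals) are deliberately ignored.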

The present invention supports the reciprocal and the reciprocal square root of double-precision and single-precision floating-point data.

Computing the reciprocal or the reciprocal square root by table look-up consists mainly of two steps, as follows:

First step: compute the exponent of the result. For the reciprocal, the result exponent is the negation of the true exponent of the source operand; for the reciprocal square root, the result exponent is -1/2 times the true exponent of the source operand;

Second step: using the upper 8 bits of the mantissa of the source operand, look up the upper 8 bits of the result mantissa.
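The two steps can be sketched in software as follows. This is a minimal illustration for IEEE-754 single precision, not the hardware datapath: the function names are hypothetical, the normalisation correction of the exponent is left out, and an odd exponent in the reciprocal square root case yields a half-integer that real logic would have to fold into the mantissa.

```python
import struct

def split_float32(x):
    """Return (sign, biased exponent, 23-bit fraction) of x as a single."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def result_true_exponent(true_exp, op):
    """Step 1: result exponent before any normalisation correction."""
    if op == "recip":
        return -true_exp       # 1 / 2^e = 2^(-e)
    if op == "rsqrt":
        return -true_exp / 2   # 1 / sqrt(2^e) = 2^(-e/2)
    raise ValueError(op)

def table_index(fraction, bits=8):
    """Step 2: the upper `bits` bits of the 23-bit fraction select the entry."""
    return fraction >> (23 - bits)

sign, biased, fraction = split_float32(6.0)  # 6.0 = 1.5 * 2^2
true_exp = biased - 127                      # remove the single-precision bias
print(result_true_exponent(true_exp, "recip"))
print(result_true_exponent(true_exp, "rsqrt"))
print(table_index(fraction))
```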

In practical applications, when the accuracy requirement is low, the table look-up result can be used directly as the computed result; for application algorithms with higher accuracy requirements, functional iteration is needed to improve the precision of the result. The approach is to use the look-up table described above to supply the multiplicative iteration with an approximate reciprocal or reciprocal square root of the divisor.

In the present invention, an approximate result is obtained by table look-up from the upper 8 bits of the mantissa of the floating-point number. This result is then refined with the Newton-Raphson method, that is, a method that uses the first few terms of the Taylor series to find the root of a function. The iteration formula for increasing the precision of the floating-point reciprocal is given as formula 1, and that for the reciprocal square root as formula 2.

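$$x_{i+1} = x_i \, (2 - V x_i) \qquad \text{(formula 1)}$$

$$x_{i+1} = \tfrac{1}{2} \, x_i \, (3 - V x_i^{2}) \qquad \text{(formula 2)}$$

where $V$ is the source operand and $x_i$ is the $i$-th approximation of the result.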

Here, the initial approximation of the reciprocal or reciprocal square root of V is obtained by table look-up. With the above iteration formulas, the precision of the result doubles with each iteration. Floating-point operations of the corresponding forms can be implemented on the basis of this algorithm.
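The refinement can be tried out numerically. The sketch below assumes the standard Newton-Raphson updates for the reciprocal, x ← x(2 − Vx), and for the reciprocal square root, x ← x(3 − Vx²)/2; the starting values are simply the exact results rounded to 8 fractional bits, a stand-in for a table output rather than the patent's actual table contents.

```python
import math

def refine_reciprocal(v, x, iterations):
    """Newton-Raphson refinement of x toward 1/v."""
    for _ in range(iterations):
        x = x * (2.0 - v * x)
    return x

def refine_rsqrt(v, x, iterations):
    """Newton-Raphson refinement of x toward 1/sqrt(v)."""
    for _ in range(iterations):
        x = 0.5 * x * (3.0 - v * x * x)
    return x

v = 1.7
x0_recip = round(256 / v) / 256             # stand-in for an 8-bit table output
x0_rsqrt = round(256 / math.sqrt(v)) / 256  # likewise for the reciprocal square root

for k in range(4):
    err_recip = abs(refine_reciprocal(v, x0_recip, k) * v - 1.0)
    err_rsqrt = abs(refine_rsqrt(v, x0_rsqrt, k) * math.sqrt(v) - 1.0)
    print(k, err_recip, err_rsqrt)          # relative errors shrink quadratically
```

Each pass uses only multiplications and one subtraction, which is why the iteration maps naturally onto a multiply-accumulate unit.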

The reciprocal look-up table determines the performance, hardware resource cost, and computational accuracy of the floating-point reciprocal operation, and its capacity grows exponentially with the mantissa width and the precision. Therefore, for a look-up-table implementation of the floating-point reciprocal, the key is the construction of the look-up table, which must strike a balance between precision and hardware resource cost.

At present, there are three main methods for constructing floating-point reciprocal look-up tables: direct approximation, linear approximation, and the partial product array (PPA) method. The look-up table in the present invention uses direct approximation, which is also the most commonly used look-up method in division based on an initial reciprocal value. Constructing the look-up table amounts to producing a sequence of reciprocal values of a given precision.

For a given single-precision floating-point number X conforming to the IEEE-754 standard, the exponent of the reciprocal or reciprocal square root result can be obtained quickly from the exponent of the source operand, whereas the mantissa bits are obtained from the look-up table. For example, for the floating-point reciprocal instruction, the present invention indexes the reciprocal look-up table with the upper 8 bits of the mantissa, and the reciprocal approximation read from the table is also 8 bits wide. The capacity of the reciprocal look-up table of the present invention is therefore 2^8 × 8 = 2048 bits. The approximate reciprocal for look-up entry i is given by formula 3:

formula 3

Here, m is the number of bits of the look-up entry index and n is the number of bits of the output approximation.

The reciprocal look-up table entries for all values of the upper 8 mantissa bits can be obtained by the above computation. The entries of the floating-point reciprocal square root look-up table can be obtained in the same way. For the floating-point reciprocal square root, the present invention indexes the table with the upper 9 bits of the mantissa, and the approximation read from the table is again 8 bits wide; the capacity of the reciprocal square root look-up table is therefore 2^9 × 8 = 4096 bits. A look-up table constructed by the above method yields the initial approximate result. The look-up tables for the double-precision reciprocal and reciprocal square root can be constructed in a similar way.
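One plausible way to build such a direct-approximation table is sketched below. It is an assumption, not the patent's formula 3: each entry stores the reciprocal of the midpoint of the mantissa interval it covers, rounded to n fractional bits. The names `build_reciprocal_table`, `lookup_reciprocal`, and `capacity_bits` are hypothetical.

```python
def capacity_bits(m, n):
    """Table size in bits for an m-bit index and n-bit entries."""
    return (1 << m) * n

def build_reciprocal_table(m=8, n=8):
    """Direct-approximation reciprocal table (assumed midpoint construction)."""
    table = []
    for i in range(1 << m):
        mid = 1.0 + (i + 0.5) / (1 << m)  # midpoint of the i-th mantissa interval
        frac = 2.0 / mid - 1.0            # 1/mid = 2^-1 * (1 + frac), frac in (0, 1)
        table.append(min(round(frac * (1 << n)), (1 << n) - 1))
    return table

def lookup_reciprocal(table, mantissa, m=8, n=8):
    """Approximate 1/mantissa for a mantissa in [1, 2)."""
    i = int((mantissa - 1.0) * (1 << m))  # upper m fraction bits
    return 0.5 * (1.0 + table[i] / (1 << n))

table = build_reciprocal_table()
print(capacity_bits(8, 8), capacity_bits(9, 8))  # the two capacities quoted above
print(lookup_reciprocal(table, 1.5), 1 / 1.5)
```

With this construction the looked-up value agrees with the true reciprocal to roughly 8 bits, which is the kind of seed a subsequent iteration would start from.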

Brief description of the drawings

Fig. 1 is a schematic diagram of the structure of the present invention.

Detailed description

To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawing.

Based on the above research, the structure of the implemented high-speed floating-point division unit based on table look-up is shown in Fig. 1. The advantages of this design are high computation speed and a simple structure; no complex computation is required, which greatly reduces design difficulty.

The present invention implements the floating-point reciprocal and reciprocal square root operations by table look-up and adds SIMD operation. This design supports the reciprocal and the reciprocal square root of double-precision and single-precision floating-point data.

(6) mantissa generation logic: generates the mantissa of the result floating-point number;

The (5) exponent generation logic and the (6) mantissa generation logic are connected to the result selection and output interface.

However, the table look-up method on its own is suitable only when the accuracy requirement of the application algorithm is low. To guarantee precision while reducing the number of iterations, the precision of the initial approximation must be improved, which makes the capacity of the look-up table grow exponentially and greatly increases area overhead. In the present invention, single-precision and double-precision operations share the same look-up tables, and this resource multiplexing greatly reduces area overhead. As for precision, the Newton-Raphson algorithm can be applied, using the multiplier of the MAC unit to perform multiple iterations and thereby improve precision.
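The trade-off between table size and iteration count can be made concrete with a little arithmetic. The sketch below assumes that the table seed is good to about 8 bits and that every Newton-Raphson step exactly doubles the number of correct bits; both are idealisations, and the function names are hypothetical.

```python
import math

def table_capacity_bits(index_bits, output_bits):
    """Capacity of a direct look-up table in bits."""
    return (1 << index_bits) * output_bits

def iterations_needed(initial_bits, target_bits):
    """Steps required if each step doubles the number of correct bits."""
    return max(0, math.ceil(math.log2(target_bits / initial_bits)))

# 24-bit significand for single precision, 53-bit for double precision.
print(iterations_needed(8, 24), iterations_needed(8, 53))
# Doubling the seed precision saves one step but inflates the table.
print(table_capacity_bits(8, 8), table_capacity_bits(16, 16))
```

Under these assumptions an 8-bit seed needs two iterations for single precision and three for double precision, whereas a 16-bit seed would save one iteration at the cost of a table 512 times larger.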

Claims

1. A high-speed floating-point division unit based on table look-up, characterized in that

it comprises: (1) a 12-bit multiplexed fixed-point adder, (2) a 12-bit fixed-point adder, (3) a reciprocal look-up table, (4) a reciprocal square root look-up table, (5) exponent generation logic, and (6) mantissa generation logic; wherein:

2. The device according to claim 1, characterized in that the reciprocal look-up table is indexed by the upper 8 bits of the mantissa, the reciprocal approximation read from the table is also 8 bits wide, and the capacity of the reciprocal look-up table is 2^8 × 8 = 2048 bits.

3. The device according to claim 1, characterized in that, for the floating-point reciprocal square root, the table is indexed by the upper 9 bits of the mantissa, the approximation read from the table is also 8 bits wide, and the capacity of the floating-point reciprocal square root look-up table is therefore 2^9 × 8 = 4096 bits.