CN111061992A - Function fitting method and device based on parabola - Google Patents

Publication number
CN111061992A
Authority
CN
China
Prior art keywords
interval
error
function
coefficient
fitting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911194243.2A
Other languages
Chinese (zh)
Inventor
潘红兵
吕航
安梦瑜
罗元勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201911194243.2A
Publication of CN111061992A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/11: Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Operations Research (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a parabola-based function fitting method and device. The method comprises the following specific steps: iterating continuously over a designated interval using bisection, solving for the corresponding coefficients from three-point coordinates, calculating the errors, and finally performing piecewise fitting of various curve functions within a given error range, which yields the number of segments together with the parabolic coefficients of each segment. The device comprises a data input module, a comparison module, a coefficient selection module, a calculation unit and a data output module. The method obtains the smallest number of segments among current approximate function fitting methods and ensures that the error of each segment is minimized, thereby achieving both high precision and low complexity.

Description

Function fitting method and device based on parabola
Technical Field
The invention relates to the field of integrated circuit algorithms and hardware implementation, and in particular to a parabola-based function fitting method and a device implementing it.
Background
Approximate computing is a trade-off between computational quality and the resources consumed. With ever-increasing performance demands and growing resource budgets, approximate computing approaches are becoming attractive and increasingly necessary.
Newton's iteration method approximately solves equations over the real and complex domains, and is commonly used to implement reciprocal, division, reciprocal square root and square root calculations in the design of complex computational units in VLSI. As a traditional VLSI design method for division and square root calculation, Newton's iteration converges quadratically, but this advantage is diminished by its dependence on the initial guess. Its hardware overhead is also too large. Taking the square root as an example, a fully unrolled implementation of 4 iterations requires 17 clock cycles and 13 multipliers, so both delay and area overhead are large. In theory, Newton's iteration can also be used to compute cube roots or even higher-order roots, but the hardware cost and latency are prohibitive.
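The dependence on the initial guess can be illustrated with a short software sketch. It is not taken from the patent: the function name `newton_sqrt`, the default guess of 1.0 and the choice of 4 iterations in floating point are assumptions made only for illustration, and the sketch says nothing about the cycle or multiplier counts quoted above.

```python
def newton_sqrt(value, iterations=4, guess=1.0):
    """Approximate sqrt(value) with a fixed number of Newton steps.

    Illustrative only: a hardware design would use fixed-point arithmetic
    and a table-based initial guess rather than a constant.
    """
    x = guess
    for _ in range(iterations):
        # Newton update for g(x) = x*x - value: x <- x - g(x) / g'(x)
        x = 0.5 * (x + value / x)
    return x


# A guess close to the root converges within 4 steps; a distant one does not.
near = abs(newton_sqrt(2.0) - 2.0 ** 0.5)
far = abs(newton_sqrt(100.0) - 10.0)
```

With the same four iterations, the result for 2 is accurate to many digits while the result for 100 is still visibly off, which is the drawback attributed above to the initial guess.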
The COordinate Rotation DIgital Computer (CORDIC) is an approximation method for computing trigonometric functions, multiplication and division. CORDIC has three trajectories (circular, linear and hyperbolic), and each trajectory has two convergence modes: rotation and vectoring. The greatest advantage of CORDIC is its simple hardware implementation, which can be either folded (time-division multiplexed) or fully unrolled. The folded mode reduces hardware cost at the expense of sampling rate, while the fully unrolled pipelined mode accepts one input per clock cycle; because its critical path is only a shift and an addition, it can run at a very high frequency and thus achieve a very high sampling rate. Circular and hyperbolic CORDIC consume 6 adders per iteration, while linear CORDIC consumes 4 adders per iteration. However, the approximation precision of CORDIC is limited, and the delay and hardware overhead incurred by improving the precision are large.
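As a rough illustration of the circular trajectory in rotation mode, the sketch below computes sine and cosine with shift-and-add style updates. It is a floating-point model written for this explanation, not the architecture discussed in the patent: the name `cordic_sin_cos`, the 16 iterations and the input range of roughly [-π/2, π/2] are assumptions, and the shifts are modelled as multiplications by powers of two.

```python
import math


def cordic_sin_cos(angle, iterations=16):
    """Return (sin, cos) of `angle` using circular rotation-mode CORDIC."""
    # Each micro-rotation stretches the vector; start from the inverse gain.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0       # rotate toward a zero residual angle
        shift = 2.0 ** (-i)                 # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)           # the arctangents come from a small table
    return y, x


s, c = cordic_sin_cos(0.5)
```

Each additional iteration contributes roughly one more bit of precision, which is why raising the precision increases delay or hardware in proportion.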
Disclosure of Invention
In view of the technical defects of existing methods, the invention provides a parabola-based function fitting method and a hardware device implementing it, in order to fit all unary (single-variable) functions more accurately and completely.
The technical scheme adopted by the method is as follows:
A parabola-based function fitting method specifically comprises the following steps:
(1) using a bisection iteration method, segmenting the whole interval of the function into a plurality of element intervals, wherein the length of each element interval is $2^{-od}$ and od is the number of significant fractional digits; calculating the coefficients of the parabolic function from the true coordinates of three points, namely the two endpoints of the division and the midpoint;
(2) substituting the coefficients obtained in step (1) back into the parabolic function and calculating the error at the endpoint of each element interval: the endpoint of each element interval is substituted into the quadratic expression to obtain the fitted value at that point, and the fitted value is subtracted from the true function value to obtain the error; comparing the calculated error with the set error;
(3) if the calculated error is greater than the set error, bisecting the current interval and repeating steps (1) to (2) on the first half; if the calculated error is less than or equal to the set error, bisecting the second half of the previous bisection, appending its first half to the current first half, and repeating steps (1) to (2); continuing until the calculated error is less than or equal to the set error, at which point each resulting segment is as long as possible, i.e., the total number of segments is minimized;
(4) performing quadratic fitting on each segmented interval using the coefficients obtained in step (1) to complete the fitting of the whole interval of the function.
The invention relates to a function fitting device based on a parabola, which comprises a data input module, a comparison module, a coefficient selection module, a calculation unit and a data output module; the comparison module is used for comparing the input data with the segmented interval to determine the interval of the data, so as to determine the coefficient of the quadratic function; the coefficient selection module is used for selecting the parabolic coefficient of the interval according to the comparison result of the comparison module; and the calculating unit is used for calculating a quadratic function value according to the input data and the parabolic coefficient.
The fitting process does not depend on any specific function expression. The curve to be fitted is segmented, the parabolic coefficients in each interval are determined from three-point coordinates, the step length of the variable is set by the given error, and bisection is applied repeatedly to find the longest interval whose error is below the set error, until the whole target interval is covered. The intervals obtained in this way are as long as possible, so the number of intervals, i.e., the number of segments, is minimized, which effectively reduces storage and delay in hardware calculation. In addition, two different hardware implementations are provided according to the calculation and quantization results: the direct expansion method achieves low area and low resource occupation, while the CSA and parallel processing method achieves low delay. The fitting method and device can be applied wherever approximate function calculation is required, such as deep learning and big data computing, and offer high calculation accuracy, simple implementation, low hardware cost and low delay.
Drawings
FIG. 1 is an overall flow diagram of the process of the present invention;
FIG. 2 is a flow chart of an implementation of the method of the present invention;
FIG. 3 shows MATLAB quadratic piecewise approximation plots of several common functions in an embodiment of the present invention: (a) the function $f = e^x$; (b) the function $f = \sin(x)$; (c) the function $f = 1/(1 + e^{-x})$; (d) the function $f = \tanh(x)$;
FIG. 4 is a logic diagram of a hardware compute unit;
FIG. 5 is a schematic diagram of a quadratic trinomial computing unit structure using CSA;
FIG. 6 is a schematic diagram of a simplified quadratic trinomial computing unit.
Detailed Description
Suppose a point $(x_0, f(x_0))$ lies on a general function $f$. When $x$ approaches the base point $x_0$, the tangent to the function at $x_0$ can be used as an approximation of the function. The function
$$f(x) \approx f(x_0) + f'(x_0)(x - x_0)$$
is called the linear approximation of $f$ at the point $x_0$.
When $x \approx x_0$,
$$f(x) \approx f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2.$$
Geometrically, the second-order approximation is the parabola closest to the original function, and it is more accurate than the linear approximation.
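The gain from the second-order term can be checked numerically. The sketch below compares the first-order and second-order expansions of $e^x$ about a base point; the function name `taylor_errors` and the choice of $e^x$ (whose derivatives all equal the function itself) are assumptions made for illustration only.

```python
import math


def taylor_errors(x0, x):
    """Return (linear error, quadratic error) for e^x expanded about x0."""
    f0 = math.exp(x0)                # for e^x: f(x0) = f'(x0) = f''(x0)
    dx = x - x0
    linear = f0 + f0 * dx
    quadratic = linear + 0.5 * f0 * dx * dx
    exact = math.exp(x)
    return abs(exact - linear), abs(exact - quadratic)


lin_err, quad_err = taylor_errors(0.0, 0.1)
```

At a distance of 0.1 from the base point, the quadratic error is more than an order of magnitude smaller than the linear error, which is why a parabola can cover a longer interval than a straight line for the same error budget.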
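Given
$$y = ax^2 + bx + c \quad (a \neq 0),$$
and given that the parabola passes through the three points $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$, then
$$a = \frac{\dfrac{y_1 - y_2}{x_1 - x_2} - \dfrac{y_2 - y_3}{x_2 - x_3}}{x_1 - x_3},$$
$$b = \frac{y_1 - y_2}{x_1 - x_2} - a(x_1 + x_2),$$
$$c = y_1 - a x_1^2 - b x_1.$$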
That is, the analytic expression of the parabola can be obtained from the known coordinates of three points. The present embodiment substitutes the interval endpoints $(x_s, f(x_s))$ and $(x_e, f(x_e))$ together with the midpoint
$$\left(\frac{x_s + x_e}{2},\ f\!\left(\frac{x_s + x_e}{2}\right)\right)$$
to solve for the coefficients.
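A minimal sketch of the three-point solve is given below. The patent's own closed-form expressions are figures that do not survive in this text, so the divided-difference form used here is one algebraically equivalent way of obtaining the unique parabola through three points; the function name `parabola_through` is hypothetical.

```python
def parabola_through(p1, p2, p3):
    """Return (a, b, c) such that y = a*x**2 + b*x + c passes through p1, p2, p3.

    The three x coordinates must be distinct.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d12 = (y1 - y2) / (x1 - x2)      # slope between points 1 and 2
    d23 = (y2 - y3) / (x2 - x3)      # slope between points 2 and 3
    a = (d12 - d23) / (x1 - x3)      # second divided difference
    b = d12 - a * (x1 + x2)
    c = y1 - a * x1 * x1 - b * x1
    return a, b, c


# Points taken from y = 2x^2 - 3x + 1 at x = 0, 1, 2.
coeffs = parabola_through((0.0, 1.0), (1.0, 0.0), (2.0, 3.0))
```

For the fitting method, the three points would be the two interval endpoints and the midpoint described above.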
As shown in fig. 1, the fitting method of this embodiment includes the following specific steps:
(1) Set the precision od and divide the target interval into element intervals of length $2^{-od}$.
(2) Calculate the function values at the endpoints and the midpoint of the target interval using the expression of the function to be fitted.
(3) Calculate the coefficients a, b and c from the aforementioned formulas and the known three-point coordinates. Substitute the endpoint of each element interval into the quadratic expression to obtain the fitted value at that point, and subtract the fitted value from the true function value to obtain the error.
(4) If the error is greater than the set error, bisect the current interval and repeat steps (1) to (3) on the first half.
(5) Otherwise, bisect the second half left over from the previous bisection, append its first half to the first half from step (4) (giving 3/4 of the length of the previous interval), and repeat steps (1) to (3) until the error does not exceed the set error. The current interval is then fitted quadratically with the currently calculated coefficients a, b and c, and its length is the longest interval that the current coefficients can fit.
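The steps above can be sketched in software as follows. This is a simplified reading rather than the patent's exact flow: it runs a standard binary search over element-interval indices to find the longest segment whose grid-point error stays within tolerance, instead of first trying the whole remaining interval and then adding halves. The names `fit_parabola`, `grid_error` and `segment` are hypothetical, the right end of the fitting interval is snapped to the element grid, and the search assumes that the error generally grows with segment length.

```python
import math


def fit_parabola(f, xs, xe):
    """Coefficients (a, b, c) of the parabola through both endpoints and the midpoint."""
    xm = 0.5 * (xs + xe)
    ys, ym, ye = f(xs), f(xm), f(xe)
    d1 = (ys - ym) / (xs - xm)
    d2 = (ym - ye) / (xm - xe)
    a = (d1 - d2) / (xs - xe)
    b = d1 - a * (xs + xm)
    c = ys - a * xs * xs - b * xs
    return a, b, c


def grid_error(f, coeffs, x0, h, first, last):
    """Largest |f(x) - parabola(x)| over element-interval endpoints first..last."""
    a, b, c = coeffs
    worst = 0.0
    for k in range(first, last + 1):
        x = x0 + k * h
        worst = max(worst, abs(f(x) - ((a * x + b) * x + c)))
    return worst


def segment(f, x0, x1, od, tol):
    """Return a list of (start, end, (a, b, c)) segments covering [x0, x1]."""
    h = 2.0 ** -od
    n = int(round((x1 - x0) / h))        # assumption: x1 is snapped to the grid
    segments, start = [], 0
    while start < n:
        # One element interval passes whenever tol exceeds rounding noise,
        # because the parabola interpolates both of its endpoints.
        lo, hi = start + 1, n
        while lo < hi:
            mid = (lo + hi + 1) // 2
            coeffs = fit_parabola(f, x0 + start * h, x0 + mid * h)
            if grid_error(f, coeffs, x0, h, start, mid) <= tol:
                lo = mid                 # fits: try a longer segment
            else:
                hi = mid - 1             # too long: shrink
        xs, xe = x0 + start * h, x0 + lo * h
        segments.append((xs, xe, fit_parabola(f, xs, xe)))
        start = lo
    return segments


segs = segment(math.sin, 0.0, math.pi, 8, 1e-4)
```

The example uses a coarse grid (od = 8) so that it runs quickly; the segment counts it produces are not comparable to those reported for a bit width of 14.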
The device of the embodiment comprises: the data input module receives input data; the data comparison module is used for comparing the input data with the segmented interval to determine a specific interval; the coefficient selection module is used for selecting the parabolic coefficient of the corresponding interval according to the comparison result; a calculation unit for calculating a quadratic function value for the input data and the coefficient; and the data output module outputs a calculation result.
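A software analogue of this data path is sketched below, with the comparison module modelled as a binary search over the segment endpoints. It only illustrates the data flow: the function name `evaluate`, the list-based storage and the clamping of out-of-range inputs are assumptions, not details given in the patent.

```python
from bisect import bisect_right


def evaluate(x, breakpoints, coeffs):
    """Evaluate the piecewise parabola at x.

    breakpoints: sorted segment endpoints [x_0, x_1, ..., x_n]
    coeffs:      coeffs[i] = (a, b, c) for the segment [x_i, x_(i+1)]
    """
    # Comparison module: locate the segment containing x.
    idx = bisect_right(breakpoints, x) - 1
    idx = min(max(idx, 0), len(coeffs) - 1)   # clamp inputs at or beyond the ends
    # Coefficient selection module: fetch that segment's coefficients.
    a, b, c = coeffs[idx]
    # Calculation unit: evaluate the quadratic.
    return (a * x + b) * x + c


# Two segments: y = x^2 on [0, 1] and y = 2x - 1 on [1, 2].
table_breaks = [0.0, 1.0, 2.0]
table_coeffs = [(1.0, 0.0, 0.0), (0.0, 2.0, -1.0)]
y = evaluate(0.5, table_breaks, table_coeffs)
```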
The bit width, the expression of the curve function to be fitted and the fitting interval are input in advance on a server, and the maximum error is set. The data segmentation module divides the interval according to the bit width into a plurality of element intervals of length $2^{-od}$. Taking the input interval as the initial interval, the coefficients are calculated with the method above, the error at each point is calculated by back-substitution, and the calculated error is compared with the set error. If the condition is met, the interval length is increased until it is maximal, at which point the loop exits and the next interval is fitted. If not, the midpoint is taken as the new right endpoint and the steps are repeated. The implementation flow is shown in FIG. 2; the number of segments and the maximum error are calculated at the end. The coefficients of each segment are stored in variables for later retrieval, so the run time is short and the precision is high. The MATLAB results and elapsed times for several functions on a Ryzen 5 2600X CPU with 16 GB of memory are shown in Table 1.
TABLE 1 Calculation results
| Function | Interval | Bit width setting | Number of segments | Total time (s) | Time used (s) | Maximum error |
| --- | --- | --- | --- | --- | --- | --- |
| f = 1/(1 + e^(-x)) | [-π, π] | 14 | 14 | 0.079 | 0.069 | 3.051605e-05 |
| f = sin(x) | [0, π] | 14 | 17 | 0.073 | 0.054 | 3.051750e-05 |
| f = e^x | [0, π] | 14 | 36 | 0.068 | 0.059 | 3.051651e-05 |
| f = tanh(x) | [0, π] | 14 | 15 | 0.047 | 0.034 | 3.051645e-05 |
The corresponding fitted image is shown in fig. 3.
A flow chart for calculating the segmentation and parameters is shown in fig. 2.
The result is then quantized for hardware, with the number of quantization bits of the variable set according to the bit width required by the hardware. The endpoints of the element intervals are truncated directly to fit the bit width. For the coefficients, guard bits are used: starting from the specified bit width, the calculation error is evaluated repeatedly, and if the error is smaller than the set hardware error $2^{-od}$, the current bit width is taken as the number of quantization bits of the parabolic coefficients. If not, the number of bits is increased until the output precision is consistent with the input precision. The quantized interval endpoints are then stored in the hardware data selection module, and the quantized parabolic coefficients are stored in a ROM in the hardware coefficient selection module for hardware calculation.
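The guard-bit search for the coefficients can be sketched as follows. Several details are assumptions made for illustration: coefficients are quantized by truncation toward negative infinity, all three coefficients share one fractional bit width, the error is checked at element-interval endpoints only, and the names `quantize` and `coefficient_bits` as well as the `max_bits` cap are hypothetical.

```python
import math


def quantize(value, frac_bits):
    """Truncate value to frac_bits fractional bits."""
    scale = 1 << frac_bits
    return math.floor(value * scale) / scale


def coefficient_bits(f, seg, od, max_bits=40):
    """Smallest fractional bit width >= od keeping the error below 2**-od."""
    xs, xe, (a, b, c) = seg
    h = 2.0 ** -od
    limit = 2.0 ** -od
    steps = int(round((xe - xs) / h))
    for bits in range(od, max_bits + 1):
        qa, qb, qc = (quantize(v, bits) for v in (a, b, c))
        err = 0.0
        for k in range(steps + 1):
            x = xs + k * h
            err = max(err, abs(f(x) - ((qa * x + qb) * x + qc)))
        if err < limit:
            return bits, (qa, qb, qc)
    raise ValueError("no bit width up to max_bits meets the error limit")


def target(x):
    # Illustrative segment whose fit is exact before quantization.
    return 0.3 * x * x + 0.7 * x + 0.1


bits, qcoeffs = coefficient_bits(target, (0.0, 1.0, (0.3, 0.7, 0.1)), 8)
```

In this example, 8 fractional bits are not enough because the three truncation errors add up at the right end of the segment, so the search settles on one guard bit.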
The hardware implementation block diagram of the computing unit is shown in FIG. 4. Input data x enters the selection module through the data input module; x may be a feature map of a neural network, an input angle for an FFT or another calculation requiring trigonometric functions, or a quantity in a physical equation obeying an exponential law, such as the time signal of a zero-input response or zero-state response in a circuit. The input is then compared with the quantized interval endpoints to determine the interval in which the variable lies, and the quantized coefficients a, b and c are read according to that interval and sent to the computing unit to calculate
$$y = ax^2 + bx + c.$$
The calculation of $x^2$ is carried out in parallel with the lookup of the coefficients a, b and c, and $ax^2$ and $bx$ are then obtained by multiplication. With a CSA structure, $ax^2$, $bx$ and c can be added together. A CSA (carry-save adder) is a digital adder used in computer microarchitecture to compute the sum of three or more n-bit binary numbers. It differs from other digital adders in that it outputs two numbers of the same dimensions as the inputs: one is a sequence of partial sum bits and the other is a sequence of carry bits. A carry-save unit consists of n full adders, each of which computes a single sum bit and carry bit based solely on the corresponding bits of the three input numbers. Given three n-bit numbers a, b and c, it produces a partial sum ps and a shift-carry sc:
$$ps_i = a_i \oplus b_i \oplus c_i$$
$$sc_i = (a_i \wedge b_i) \vee (a_i \wedge c_i) \vee (b_i \wedge c_i)$$
The entire sum is then computed as follows:
the carry sequence sc is shifted one position to the left, a 0 is appended to the front (most significant bit) of the partial sum sequence ps, and the two are added with a ripple-carry adder to produce the resulting (n + 1)-bit value.
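The carry-save step can be modelled bitwise on unsigned integers, as in the sketch below. The function names `carry_save_add` and `add3` are hypothetical, and Python's arbitrary-precision integers stand in for the final ripple-carry adder, so output width is not modelled.

```python
def carry_save_add(a, b, c, n):
    """Reduce three n-bit unsigned numbers to a partial sum and a carry word."""
    mask = (1 << n) - 1
    a, b, c = a & mask, b & mask, c & mask
    ps = a ^ b ^ c                       # per-bit sum without carry propagation
    sc = (a & b) | (a & c) | (b & c)     # per-bit majority: the carry out of each bit
    return ps, sc


def add3(a, b, c, n):
    """Full three-operand sum: shift the carries left by one and add once."""
    ps, sc = carry_save_add(a, b, c, n)
    return ps + (sc << 1)


total = add3(5, 6, 7, 4)
```

Only the last addition propagates carries, so the three-operand sum costs a single carry-propagate delay plus one full-adder level.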
The critical path of the whole process is about 2M + A u.t., where M and A are the delays of multiplication and addition, respectively, and u.t. is the unit of time, i.e., the clock period of the hardware platform. This approach has a lower delay; its structure is shown in FIG. 5.
Since
$$y = ax^2 + bx + c = (ax + b)x + c,$$
two multiply-add units can be cascaded for the calculation: the first multiply-add unit computes ax + b, and its result is fed into the second, cascaded multiply-add unit. Compared with the CSA structure, this method has a smaller area and a larger delay, with a critical path of 2(M + A) u.t. Its hardware implementation is shown in FIG. 6.
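The equivalence of the two evaluation orders can be checked with a small sketch. The helper names `mac`, `quadratic_cascaded` and `quadratic_direct` are hypothetical, and integer operands are used so that both forms agree exactly; in fixed-point hardware the two orders may round differently.

```python
def mac(m1, m2, addend):
    """One multiply-add unit: m1 * m2 + addend."""
    return m1 * m2 + addend


def quadratic_cascaded(a, b, c, x):
    """Two cascaded multiply-add units: (a*x + b)*x + c."""
    stage1 = mac(a, x, b)        # first unit: a*x + b
    return mac(stage1, x, c)     # second unit: stage1*x + c


def quadratic_direct(a, b, c, x):
    """Direct form: x*x and the products feed a three-operand adder."""
    return a * (x * x) + b * x + c


same = quadratic_cascaded(2, -3, 1, 4) == quadratic_direct(2, -3, 1, 4)
```

The cascaded form needs two multipliers in series with an adder after each, which matches the 2(M + A) critical path stated above.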
The method can be applied in a wide range of scenarios because any unary (single-variable) curve function can be fitted. For example, the sigmoid activation function in neural networks, $f = 1/(1 + e^{-x})$, and the tanh function can both be calculated with this approximation. The exponential function $e^x$ can be used in calculating FFTs, Gaussian distributions and so on. Trigonometric functions can be used to calculate periodic signals, periodic motion and so on. In almost any project where a curve function needs to be evaluated, the method can provide an approximate calculation with high precision and low delay.

Claims (5)

1. A function fitting method based on a parabola is characterized by comprising the following specific steps:
(1) dividing the whole interval of the function into a plurality of element intervals by adopting a binary iteration method, and calculating the coefficient of the parabolic function by using the real coordinates of three points, namely two divided end points and a middle point;
(2) substituting the coefficient obtained in the step (1) back into a parabolic function, calculating an error at an end point of each element interval, and comparing the calculated error with a set error;
(3) dividing the current element interval into two parts, and if the calculation error is greater than the set error, repeating the steps (1) to (2) on the first half section after dividing into two parts; if the calculation error is less than or equal to the set error, performing halving on the second half section after halving, adding the first half section after halving to the first half section, and repeating the steps (1) to (2) until the calculation error is less than or equal to the set error, wherein the length of each obtained section interval is longest, namely the number of integral sections is least;
(4) performing quadratic fitting on the segmented interval by using the coefficient obtained in the step (1) to complete the fitting of the whole interval of the function.
2. The method according to claim 1, wherein in step (1), the interval length of the element interval is $2^{-od}$, wherein od is the number of significant fractional digits.
3. The method of claim 1, wherein in step (2), the error is calculated by: substituting the endpoint of each element interval into the quadratic expression, calculating the fitted value of the quadratic function at the corresponding point, and subtracting the fitted value from the true function value to obtain the error.
4. A function fitting device based on parabola is characterized by comprising a data input module, a comparison module, a coefficient selection module, a calculation unit and a data output module; the comparison module is used for comparing the input data with the segmented interval to determine the interval of the data, so as to determine the coefficient of the quadratic function; the coefficient selection module is used for selecting the parabolic coefficient of the interval according to the comparison result of the comparison module; and the calculating unit is used for calculating a quadratic function value according to the input data and the parabolic coefficient.
5. The apparatus of claim 4, wherein the computing unit employs two cascaded multiply-add units or a carry-save adder.
CN201911194243.2A 2019-11-28 2019-11-28 Function fitting method and device based on parabola Pending CN111061992A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911194243.2A CN111061992A (en) 2019-11-28 2019-11-28 Function fitting method and device based on parabola

Publications (1)

Publication Number Publication Date
CN111061992A true CN111061992A (en) 2020-04-24

Family

ID=70299079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911194243.2A Pending CN111061992A (en) 2019-11-28 2019-11-28 Function fitting method and device based on parabola

Country Status (1)

Country Link
CN (1) CN111061992A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200424