CN107247992B - A sigmoid function fitting hardware circuit based on the Remez approximation algorithm - Google Patents

Publication number
CN107247992B
Authority
CN
China
Prior art keywords
fitting
point
polynomial
coefficient
section
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710416069.6A
Other languages
Chinese (zh)
Other versions
CN107247992A (en)
Inventor
宋宇鲲
王浩
张多利
杜高明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huangshan Development Investment Group Co.,Ltd.
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN201710416069.6A priority Critical patent/CN107247992B/en
Publication of CN107247992A publication Critical patent/CN107247992A/en
Application granted granted Critical
Publication of CN107247992B publication Critical patent/CN107247992B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a sigmoid function fitting hardware circuit based on the Remez approximation algorithm, characterized in that it is obtained by the following steps: 1. determine the order of the fitting polynomial; 2. obtain the fit interval of the sigmoid function; 3. obtain the piecewise intervals; 4. obtain the fitting polynomials; 5. design the coefficient memory module; 6. design the polynomial operation module; 7. design the judgment module; 8. obtain the fitting hardware circuit; 9. determine the fitting execution interval in which the operand lies; 10. read the fitting polynomial coefficients; 11. perform the fitting calculation in the polynomial operation module. The invention improves operational precision, increases operation speed, and makes the operation structure more flexible while reducing hardware resource consumption.

Description

A sigmoid function fitting hardware circuit based on the Remez approximation algorithm
This application is a divisional application of the application filed on December 30, 2014, with application No. 2014108504707, titled "A sigmoid function fitting hardware circuit based on the Remez approximation algorithm".
Technical field
The present invention relates to the field of artificial neural networks, and specifically to a sigmoid function fitting hardware circuit based on the Remez approximation algorithm.
Background art
Neural network is short for artificial neural network, and the research and application of neural networks is one of the current research hotspots. The advantages of neural networks lie mainly in two aspects: parallelism, and powerful nonlinear information processing and learning ability. The theoretical basis and working principles of many neural network models are now mature, so their application in related fields such as signal processing, control systems, and speech recognition has become a focus of further research. Compared with software simulation, hardware-implemented neural networks offer high processing speed and a high degree of parallelism, and more easily meet the real-time operation requirements of neural networks.
When a neural network is implemented on an FPGA, there are two difficulties: one is the data representation, and the other is the method used to approximate the network's activation function. These two points determine how efficiently hardware resources are used and how precise the approximation is. Neural network activation functions take many forms. The sigmoid function is the most widely used activation function in neural networks and also the hardest to implement, which makes it a key link in FPGA implementations of neural networks.
Current FPGA implementation methods for the sigmoid function include direct lookup tables, piecewise linear approximation, polynomial approximation, the CORDIC algorithm, and genetic algorithms. The direct lookup table method (Zhiliang Nie, 2012; Alexander Gomperts, 2010) stores sigmoid results in a memory module and reads the result directly according to the input operand; it consumes a large amount of storage and its hardware precision is not high. The piecewise linear approximation method (Manish Panicker, 2012) uses a 3-segment piecewise linear approximation over the range (-5, 5) with a 32-bit fixed-point number format; it needs few arithmetic and storage resources, but its precision is low, with a maximum mean square error of 0.00187. The CORDIC method (Xi Chen, 2006) combines the CORDIC algorithm with a lookup table, using a custom 16-bit floating-point input format and a custom 32-bit floating-point output format; it consumes substantial arithmetic resources and its precision is very low. The genetic algorithm method (Bharat Kishore Bharkhada, 2004) fits integer-coefficient piecewise cubic polynomials over the range [0, 8] using a genetic algorithm and a 16-bit fixed-point number format; arithmetic resource use is modest and storage use is low, but precision is not high, with an absolute error of 2.4376 × 10^-3. Polynomial approximation is the most common approach. The traditional Taylor series expansion method consumes a large amount of arithmetic resources and has very low precision. The more classical piecewise polynomial approximation algorithm (Joao O. P. Pinto, 2006) uses piecewise 5th-order polynomials, with low storage use, modest arithmetic resource use, and higher precision; its maximum error is 8 × 10^-5. This is the best fitting precision attainable in the prior art, but it still cannot meet the requirements of high-precision computation.
As for the choice of data format, most of the above methods use a custom floating-point format to raise operational precision, whereas in real-time high-speed processing the data format is usually the 32-bit single-precision floating-point format of the IEEE 754 standard. A custom data format must be converted when communicating with other processing modules, so the communication cost is high. As for reducing resource consumption, the lookup table method is used to lower arithmetic resource consumption; it does produce the result and greatly reduces arithmetic resource use, but it significantly increases storage use. As for operational precision, because of the limitations of the algorithms themselves and because of resource considerations, the precision of hardware implementations in the current state of the art is generally not high and falls far short of the requirements of real-time high-precision processing. These are all bottlenecks in urgent need of a solution.
Summary of the invention
To avoid the above shortcomings of the prior art, the present invention proposes a sigmoid function fitting hardware circuit based on the Remez approximation algorithm, so as to improve operational precision, increase operation speed, and make the operation structure more flexible while reducing hardware resource consumption.
The present invention adopts the following technical solution to solve the technical problem:
The sigmoid function fitting hardware circuit based on the Remez approximation algorithm of the present invention is characterized in that it is obtained by the following steps:
Step 1: according to the given fitting precision u, the arithmetic resources, and the storage resources, determine the order n of the fitting polynomial;
Step 2: according to the fitting precision u, use formula (1) to obtain the fit interval [a, b] of the sigmoid function f(x);
Step 3: using the symmetry shown in formula (2), with the origin 0 as the center of symmetry, divide the fit interval [a, b] into 2m+2 small intervals [a, q_1], (q_1, q_2], …, (q_m, 0], (0, q_{m+1}], …, (q_{2m}, b], where a, q_1, q_2, …, q_m, 0, q_{m+1}, …, q_{2m}, b are the endpoint values of the 2m+2 small intervals and q_1, q_2, …, q_m, q_{m+1}, …, q_{2m} are the scaling endpoint values of the 2m small intervals. The scaling endpoint values of the 2m small intervals form, in order, the endpoint set Q = {Q_0, Q_1, …, Q_t, …, Q_{2m-1}}, where Q_t is the endpoint value of the t-th small interval among the scaling endpoint values. This gives the piecewise intervals [Q_0, Q_1], [Q_1, Q_2], …, [Q_t, Q_{t+1}], …, [Q_{2m-1}, Q_{2m}], t = 0, 1, …, 2m-1;
f(-x) = 1 - f(x)    (2)
Step 4: combine the order n with each of the m piecewise intervals on the interval (0, b] to form m vector groups [n, Q_m, Q_{m+1}], [n, Q_{m+1}, Q_{m+2}], …, [n, Q_ε, Q_{ε+1}], …, [n, Q_{2m-1}, Q_{2m}], ε = m, m+1, …, 2m-1, where [n, Q_ε, Q_{ε+1}] denotes the ε-th vector group. Substitute the m vector groups in turn into the Remez algorithm to obtain the approximation precision u_m″, u_{m+1}″, …, u_t″, …, u_{2m-1}″ corresponding to each piecewise interval;
Step 4.1: use formula (5) to obtain the (n+2)-point Chebyshev polynomial alternating point set corresponding to the ε-th vector group [n, Q_ε, Q_{ε+1}], and take this ε-th alternating point set as the ε-th initial point set, thereby obtaining the initial point set corresponding to each of the m vector groups;
In formula (5), λ = 0, 1, …, n+1;
Step 4.2: use the ε-th initial point set to solve the system of linear equations shown in formula (6), and from the solution obtain the ε-th initial approximating polynomial;
Step 4.3: within the ε-th piecewise interval [Q_ε, Q_{ε+1}], find the value of the independent variable at which |f(x) - p_ε′(x)| attains its maximum;
IfAndThen useInstead of
IfAndThen useInstead of
IfAndThen useInstead of β=1,2 ..., n;To obtain the ε initial point setsUpdate point set;
Step 4.4: use the updated point set of the ε-th initial point set to solve the system of linear equations shown in formula (6), and from the updated solution obtain the ε-th updated approximating polynomial;
Judge whether |u_ε″ - u_ε′| ≤ eps holds. If it does, take u_ε″ as the approximation precision of the ε-th piecewise interval [Q_ε, Q_{ε+1}]; otherwise repeat steps 4.3 to 4.4 until |u_ε″ - u_ε′| ≤ eps holds, where eps is the convergence control precision of the approximation error;
Step 5: judge in turn whether each approximation precision u_m″, u_{m+1}″, …, u_t″, …, u_{2m-1}″ satisfies the fitting precision u. If it does, the piecewise interval corresponding to that approximation precision becomes a fitting execution interval, and the coefficients of the corresponding approximating polynomial become the fitting polynomial coefficients of that fitting execution interval. If it does not, scale the scaling endpoint values of the piecewise interval whose approximation precision is not satisfied and return to step 4, until m fitting execution intervals satisfying the fitting precision u and m groups of fitting polynomial coefficients are obtained;
Step 6: if the independent variable x of the sigmoid function f(x) lies in the interval (b, +∞), the interval (b, +∞) is taken as a fitting execution interval, and the constant-term coefficient of its fitting polynomial is 1 while all other coefficients are 0. This gives m+1 fitting polynomials of order n and completes the fitting of the sigmoid function;
Step 7: fix the coefficients of the m+1 fitting polynomials of order n in ROM to form the coefficient memory module;
Step 8: according to the order-n fitting polynomial, design the polynomial operation module using n floating-point adders, 2n-1 floating-point multipliers, and (n-2) × k register units, and place a floating-point subtractor at the output of the polynomial operation module; k is the number of pipeline stages of the floating-point adders, floating-point multipliers, and floating-point subtractor;
Step 9: design the judgment module according to the 2m+2 fitting execution intervals. The polynomial operation module, coefficient memory module, floating-point subtractor, and judgment module together constitute the fitting hardware circuit;
Step 10: input an operand ω as the input value of the fitting hardware circuit, and use the judgment module to determine the fitting execution interval in which the operand ω lies;
Step 11: if ω ∈ (0, +∞), read from the coefficient memory module the coefficients of the fitting polynomial corresponding to the fitting execution interval in which ω lies;
if ω ∈ (-∞, 0], read from the coefficient memory module the coefficients of the fitting polynomial corresponding to the interval symmetric to the fitting execution interval in which ω lies;
Step 12: feed the operand ω and its corresponding fitting polynomial coefficients into the polynomial operation module to perform the fitting calculation. If ω ∈ (0, +∞), the fitting result obtained is the output value of the fitting hardware circuit; if ω ∈ (-∞, 0], feed the fitting result and 1 into the floating-point subtractor, and the result it computes is the output value of the fitting hardware circuit.
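The operand flow of steps 10 to 12 can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `BREAKS`, `COEFF_ROM`, `placeholder_row` and `fit_sigmoid` are hypothetical, the breakpoints are borrowed from the positive half of the embodiment's execution intervals, the ROM is filled with placeholder straight-line coefficients rather than Remez coefficients, and evaluating the polynomial at |ω| for ω ≤ 0 is an assumption about how the symmetric interval is used.

```python
import bisect
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Upper endpoints of the positive-half execution intervals (0,1], (1,2], ..., (11,13.816].
BREAKS = [1.0, 2.0, 3.0, 5.0, 7.0, 11.0, 13.816]

def placeholder_row(lo, hi):
    # Placeholder only: a chord through the interval endpoints, padded to order 5.
    slope = (sigmoid(hi) - sigmoid(lo)) / (hi - lo)
    return [sigmoid(lo) - slope * lo, slope, 0.0, 0.0, 0.0, 0.0]

# Coefficient "ROM": one row per interval (constant term first), plus the
# saturation row for (b, +inf) whose constant term is 1 and all others 0.
COEFF_ROM = [placeholder_row(lo, hi) for lo, hi in zip([0.0] + BREAKS[:-1], BREAKS)]
COEFF_ROM.append([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])

def fit_sigmoid(w):
    x = abs(w)                                        # assumed use of the symmetric interval
    row = COEFF_ROM[bisect.bisect_left(BREAKS, x)]    # judgment module + ROM read
    p = 0.0
    for c in reversed(row):                           # polynomial operation module
        p = p * x + c
    return p if w > 0 else 1.0 - p                    # subtractor path for w <= 0
```

Only m+1 coefficient rows are stored here, which mirrors the halved storage that the symmetry f(-x) = 1 - f(x) makes possible.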
Compared with the prior art, the beneficial effects of the invention are as follows:
1. The Remez approximation algorithm used by the invention can satisfy different design requirements. If the design calls for very low arithmetic resource consumption together with high operational precision, m can be increased appropriately without changing the fitting precision u, which increases the number of small intervals and lowers the order n of the fitting polynomial. If the design calls for low storage consumption together with high operational precision, m can be decreased appropriately without changing u, which reduces the number of small intervals and therefore the coefficient storage. This overcomes the low fitting precision and high resource consumption of the prior art and gives the polynomial fitting hardware circuit considerable flexibility.
2. The invention uses a polynomial coefficient memory module, which makes the hardware design highly extensible: for a different fitting scheme, only the coefficients stored in the memory module need to be fixed again.
3. The invention uses n floating-point adders and 2n-1 floating-point multipliers, and uses (n-2) × k register units to hold the operand and the intermediate results of the corresponding stages, so that the circuit can perform pipelined computation on single-precision floating-point numbers. This raises operation speed and lets the design meet high-speed real-time requirements.
4. The invention uses a judgment module, thereby combining the lookup table method with piecewise nonlinear approximation and extending the execution interval of the fitting function, so that a result can be obtained for any operand over the entire real line.
5. Exploiting the symmetry of the sigmoid function, scheme two only needs to fit the interval (0, b] with the Remez algorithm, so that, without affecting operational precision, the resource consumption of the coefficient memory module is halved and the number of fitting polynomial coefficients to be solved is halved.
6. Exploiting the symmetry of the sigmoid function, scheme two adds a subtractor outside the polynomial operation module and performs a subtraction on the fitting result for operands in the interval (-∞, a], so that the final result is obtained quickly and accurately without affecting operational precision.
7. The invention can use different data formats. For IEEE 754 single-precision floating-point data, the fitting precision achieved is not less than 10^-6. For other custom floating-point formats, under the same resource consumption, the circuit of the invention achieves higher fitting precision than other circuits.
Brief description of the drawings
Fig. 1 is the hardware circuit schematic of scheme one of the invention;
Fig. 2 is the operation flow diagram of scheme one of the invention;
Fig. 3 is an example implementation diagram of the polynomial operation circuit structure of scheme one of the invention;
Fig. 4 is the hardware circuit schematic of scheme two of the invention;
Fig. 5 is the operation flow diagram of scheme two of the invention;
Fig. 6 is an example implementation diagram of the polynomial operation circuit structure of scheme two of the invention.
Detailed description of the embodiments
In this embodiment, a sigmoid function fitting hardware circuit based on the Remez approximation algorithm is obtained by the following steps:
Step 1: according to the given fitting precision u, the arithmetic resources, and the storage resources, determine the order n of the fitting polynomial;
Step 2: according to the fitting precision u, use formula (1) to obtain the fit interval [a, b] of the sigmoid function f(x). For example, in a specific implementation, the fitting precision is given as u = 10^-6 and the order of the fitting polynomial as n = 5, which gives the fit interval [a, b] = [-13.816, 13.816];
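Formula (1) is not reproduced in this text. As a hedged sketch, one bound that is consistent with the embodiment's numbers is to choose b so that 1 - f(b) = u, that is b = ln((1 - u)/u), with a = -b by symmetry. The function name `fit_interval` and this closed form are assumptions for illustration, not the patent's formula.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_interval(u):
    # Assumed bound: outside [a, b] the sigmoid is within u of 0 or 1.
    b = math.log((1.0 - u) / u)
    return -b, b

a, b = fit_interval(1e-6)
```

For u = 10^-6 this gives b ≈ 13.8155, which rounds to the 13.816 quoted in the embodiment.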
Step 3: using the symmetry shown in formula (2), with the origin 0 as the center of symmetry, divide the fit interval [a, b] into 2m+2 small intervals [a, q_1], (q_1, q_2], …, (q_m, 0], (0, q_{m+1}], …, (q_{2m}, b], where a, q_1, q_2, …, q_m, 0, q_{m+1}, …, q_{2m}, b are the endpoint values of the 2m+2 small intervals and q_1, q_2, …, q_m, q_{m+1}, …, q_{2m} are the scaling endpoint values of the 2m small intervals. The scaling endpoint values of the 2m small intervals form, in order, the endpoint set Q = {Q_0, Q_1, …, Q_t, …, Q_{2m-1}}, where Q_t is the endpoint value of the t-th small interval among the scaling endpoint values. This gives the piecewise intervals [Q_0, Q_1], [Q_1, Q_2], …, [Q_t, Q_{t+1}], …, [Q_{2m-1}, Q_{2m}], t = 0, 1, …, 2m-1;
In this embodiment, m = 7 is taken, and the fit interval [-13.816, 13.816] is divided into 14 small intervals, which are also the 14 piecewise intervals, in order: [-13.816, -10], (-10, -8], (-8, -6], (-6, -4], (-4, -2], (-2, -1], (-1, 0], (0, 1], (1, 2], (2, 4], (4, 6], (6, 8], (8, 10], (10, 13.816];
f(-x) = 1 - f(x)    (2)
From the symmetry shown in formula (2), the sigmoid function f(x) can either be fitted over the entire fit interval to obtain the fitting result, or be fitted only on the interval x ∈ (0, +∞), with the fitting result for x ∈ (-∞, 0] obtained from formula (2) and the fitting result of the symmetric interval. There are therefore two schemes for fitting the sigmoid function. Scheme one is as follows:
Step 4: combine the order n with each of the 2m piecewise intervals to form 2m vector groups [n, Q_0, Q_1], [n, Q_1, Q_2], …, [n, Q_t, Q_{t+1}], …, [n, Q_{2m}, Q_{2m+1}], where [n, Q_t, Q_{t+1}] denotes the t-th vector group. In this embodiment, the 14 vector groups are, in order, [5, -13.816, -10], [5, -10, -8], [5, -8, -6], [5, -6, -4], [5, -4, -2], [5, -2, -1], [5, -1, 0], [5, 0, 1], [5, 1, 2], [5, 2, 4], [5, 4, 6], [5, 6, 8], [5, 8, 10], [5, 10, 13.816]. Substitute the 14 vector groups in turn into the Remez algorithm to obtain the approximation precision u_0″, u_1″, …, u_t″, …, u_{2m+1}″ corresponding to each piecewise interval;
Step 4.1: use formula (3) to obtain the (n+2)-point Chebyshev polynomial alternating point set corresponding to the t-th vector group [n, Q_t, Q_{t+1}], and take this t-th alternating point set as the t-th initial point set, thereby obtaining the initial point set corresponding to each of the 2m vector groups;
In formula (3), k = 0, 1, …, n+1;
Step 4.2: use the t-th initial point set to solve the system of linear equations shown in formula (4), and from the solution obtain the t-th initial approximating polynomial;
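Formulas (3) and (4) are not reproduced in this text. In the textbook Remez algorithm the corresponding expressions usually take the following form, given here only as an assumption about what is intended; the symbols x_k^{(t)}, a_j^{(t)} and u_t' are chosen for illustration.

```latex
% Assumed form of formula (3): Chebyshev alternation points mapped to [Q_t, Q_{t+1}]
x_k^{(t)} = \frac{Q_t + Q_{t+1}}{2}
          + \frac{Q_{t+1} - Q_t}{2}\cos\!\left(\frac{(n+1-k)\pi}{n+1}\right),
\quad k = 0, 1, \dots, n+1

% Assumed form of formula (4): n+2 linear equations in a_0^{(t)}, \dots, a_n^{(t)} and u_t'
\sum_{j=0}^{n} a_j^{(t)} \left(x_k^{(t)}\right)^{j} + (-1)^{k}\, u_t' = f\!\left(x_k^{(t)}\right),
\quad k = 0, 1, \dots, n+1
```

With this indexing, k = 0 gives the left endpoint Q_t and k = n+1 gives the right endpoint Q_{t+1}, so the n+2 points are in ascending order.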
Step 4.3: within the t-th piecewise interval [Q_t, Q_{t+1}], find the value of the independent variable at which |f(x) - p_t′(x)| attains its maximum;
IfAndThen useInstead of
IfAndThen useInstead of
IfAndThen useInstead ofTo obtain t-th of initial point setUpdate point set;
Step 4.4: use the updated point set of the t-th initial point set to solve the system of linear equations shown in formula (4), and from the updated solution obtain the t-th updated approximating polynomial;
Judge whether |u_t″ - u_t′| ≤ eps holds. If it does, take u_t″ as the approximation precision of the t-th piecewise interval [Q_t, Q_{t+1}]; otherwise repeat steps 4.3 to 4.4 until |u_t″ - u_t′| ≤ eps holds, where eps is the convergence control precision of the approximation error;
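Steps 4.1 to 4.4 follow the pattern of the classical single-point-exchange Remez iteration. The sketch below is a minimal pure-Python version under stated assumptions: the initial points use the standard Chebyshev alternation formula, the linear system is the textbook levelled-error system, the exchange rule is the usual same-sign replacement (the patent's own three exchange conditions are not legible in this text), and the maximum error is located on a uniform grid. The names `remez`, `solve`, `horner`, `grid` and `max_iter` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= factor * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def horner(coef, x):
    acc = 0.0
    for c in reversed(coef):          # coef[0] is the constant term
        acc = acc * x + c
    return acc

def remez(f, n, lo, hi, eps=1e-12, grid=2000, max_iter=50):
    mid, half = (lo + hi) / 2, (hi - lo) / 2
    # Step 4.1 (assumed form): n+2 Chebyshev alternation points on [lo, hi].
    pts = [mid - half * math.cos(k * math.pi / (n + 1)) for k in range(n + 2)]
    xs = [lo + (hi - lo) * i / grid for i in range(grid + 1)]
    prev = None
    for _ in range(max_iter):
        # Steps 4.2 / 4.4: solve for the coefficients and the levelled error.
        A = [[x ** j for j in range(n + 1)] + [(-1.0) ** k] for k, x in enumerate(pts)]
        sol = solve(A, [f(x) for x in pts])
        coef, level = sol[:-1], abs(sol[-1])
        err = lambda x, c=coef: f(x) - horner(c, x)
        if prev is not None and abs(level - prev) <= eps:
            break                     # |u'' - u'| <= eps
        prev = level
        # Step 4.3: find the largest error on the grid, exchange one point.
        xmax = max(xs, key=lambda x: abs(err(x)))
        s = err(xmax) >= 0
        if xmax < pts[0]:
            pts = [xmax] + (pts[1:] if (err(pts[0]) >= 0) == s else pts[:-1])
        elif xmax > pts[-1]:
            pts = (pts[:-1] if (err(pts[-1]) >= 0) == s else pts[1:]) + [xmax]
        else:
            i = max(k for k, p in enumerate(pts) if p <= xmax)
            if (err(pts[i]) >= 0) == s or i == n + 1:
                pts[i] = xmax
            else:
                pts[i + 1] = xmax
    max_err = max(abs(err(x)) for x in xs + pts)
    return coef, level, max_err

coef, level, max_err = remez(sigmoid, 5, 0.0, 1.0)
```

The monomial basis and grid search keep the sketch short; a production fit over wide intervals would likely need a better-conditioned basis and a finer extremum search.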
Step 5: judge in turn whether each approximation precision u_0″, u_1″, …, u_t″, …, u_{2m-1}″ satisfies the fitting precision u. If it does, the piecewise interval corresponding to that approximation precision becomes a fitting execution interval, and the coefficients of the corresponding approximating polynomial become the fitting polynomial coefficients of that fitting execution interval. If it does not, scale the scaling endpoint values of the piecewise interval whose approximation precision is not satisfied and return to step 4, until 2m+1 fitting execution intervals satisfying the fitting precision u and 2m+1 groups of fitting polynomial coefficients are obtained;
Step 6: if the independent variable x of the sigmoid function f(x) lies in the interval (b, +∞), the interval (b, +∞) is taken as a fitting execution interval, and the constant-term coefficient of its fitting polynomial is 1 while all other coefficients are 0. If the independent variable x lies in the interval (-∞, a), the interval (-∞, a) is taken as a fitting execution interval, and every coefficient of its fitting polynomial is 0. This gives 2m+2 fitting polynomials of order n and completes the fitting of the sigmoid function;
In this embodiment, the constant-term coefficient of the 5th-order fitting polynomial for the interval (13.816, +∞) is 1 and all of its other coefficients are 0; every coefficient of the 5th-order fitting polynomial for the interval (-∞, -13.816) is 0;
The 16 fitting execution intervals obtained in this embodiment after steps 5 and 6 are: (-∞, -13.816), [-13.816, -11], (-11, -7], (-7, -5], (-5, -3], (-3, -2], (-2, -1], (-1, 0], (0, 1], (1, 2], (2, 3], (3, 5], (5, 7], (7, 11], (11, 13.816], (13.816, +∞). This completes the fitting of the sigmoid function.
Step 7: fix the coefficients of the 2m+2 fitting polynomials of order n in ROM to form the coefficient memory module. In this embodiment, the polynomial coefficients of the 16 fitting execution intervals are fixed in ROM, and the address read rule is written according to the storage rule, forming a coefficient lookup table.
Step 8: according to the order-n fitting polynomial, design the polynomial operation module using n floating-point adders, 2n-1 floating-point multipliers, and (n-2) × k register units, where k is the number of pipeline stages of a floating-point adder or floating-point multiplier. In this embodiment, the polynomial operation module is designed with 5 floating-point adders, 9 floating-point multipliers, and 6 reg register units, and each floating-point operator has 2 pipeline stages.
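The resource counts in step 8 are simple functions of n and k. As a quick check of the embodiment's figures, a small Python helper (the name `polynomial_module_resources` is hypothetical) evaluates them:

```python
def polynomial_module_resources(n, k):
    # n adders, 2n-1 multipliers, (n-2)*k register units, as stated in step 8.
    return {"adders": n, "multipliers": 2 * n - 1, "registers": (n - 2) * k}

resources = polynomial_module_resources(5, 2)
```

For n = 5 and k = 2 this reproduces the 5 adders, 9 multipliers, and 6 register units quoted in the embodiment.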
Step 9: design the judgment module according to the 2m+2 fitting execution intervals. The polynomial operation module, coefficient memory module, and judgment module together constitute the fitting hardware circuit shown in Fig. 1. In Fig. 1, data_i is the input source operand and data_o is the output operation result.
Step 10: as shown in Fig. 2, input an operand ω as the input value of the fitting hardware circuit, and use the judgment module to determine the fitting execution interval in which the operand ω lies;
Step 11: read from the coefficient memory module the coefficients of the fitting polynomial corresponding to the fitting execution interval in which the operand ω lies;
Step 12: feed the operand ω and its corresponding fitting polynomial coefficients into the polynomial operation module to perform the fitting calculation; the fitting result obtained is the output value of the fitting hardware circuit.
The designed polynomial operation module is shown in Fig. 3, which is the implementation structure diagram of the 5th-order polynomial fitting hardware circuit of scheme one in this embodiment, using the IEEE 754 standard single-precision floating-point data format with operational precision not less than 10^-6. It includes 9 multipliers, 5 adders, and 6 reg register units. The polynomial realized is p(x) = Ax^5 + Bx^4 + Cx^3 + Dx^2 + Ex + F, and Result is the final output of the operation. The specific operation process is as follows:
Step a: the source operand x enters the polynomial operation module and coefficient E is read. x enters multiplier Multi_1 to complete the E*x operation, which is output to the next stage; x enters multiplier Multi_2 to complete the x^2 operation, which is output to the next stage; x enters reg_1 and is held for two pipeline levels to await the next stage. The 2 multipliers of the first stage operate in parallel, and each multiplier has 2 pipeline stages;
Step b: coefficient F and E*x are read into adder Add_1 to complete the (E*x + F) operation, and the result is output to the next stage; coefficient D and x^2 are read into multiplier Multi_3 to complete the D*x^2 operation, which is output to the next stage; x^2 and x enter multiplier Multi_4 to complete the x^3 operation, which is output to the next stage; the x registered in the previous stage enters reg_2 and is held for a further two pipeline levels to await the next stage. The 3 floating-point operators of the second stage operate in parallel, each with 2 pipeline stages;
Step c: (E*x + F) and D*x^2 are read into adder Add_2 to complete the (D*x^2 + E*x + F) operation, which is output to the next stage; coefficient C and x^3 are read into multiplier Multi_5 to complete the C*x^3 operation, which is output to the next stage; x^3 and the x registered in the previous stage enter Multi_6 to complete the x^4 operation, which is output to the next stage; the x registered in the previous stage enters reg_3 and is held for a further two pipeline levels to await the next stage. The 3 floating-point operators of the third stage operate in parallel, each with 2 pipeline stages;
Step d: (D*x^2 + E*x + F) and C*x^3 are read into adder Add_3 to complete the (C*x^3 + D*x^2 + E*x + F) operation, which is output to the next stage; coefficient B and x^4 are read into multiplier Multi_7 to complete the B*x^4 operation, which is output to the next stage; x^4 and the x registered in the previous stage enter multiplier Multi_8 to complete the x^5 operation, which is output to the next stage. The 3 floating-point operators of the fourth stage operate in parallel, each with 2 pipeline stages;
Step e: (C*x^3 + D*x^2 + E*x + F) and B*x^4 are read into adder Add_4 to complete the (B*x^4 + C*x^3 + D*x^2 + E*x + F) operation, which is output to the next stage; coefficient A and x^5 are read into multiplier Multi_9 to complete the A*x^5 operation, which is output to the next stage. The 2 floating-point operators of the fifth stage operate in parallel, each with 2 pipeline stages;
Step f: adder Add_5 completes the (A*x^5 + B*x^4 + C*x^3 + D*x^2 + E*x + F) operation and outputs it; the adder has 2 pipeline stages. The operation result is the final result and is output directly;
Once the above steps are complete, the sigmoid function fitting of the invention is complete. Counting the clock cycles of each step in this example: each operation stage has 2 pipeline levels and there are 6 stages in total, so the fitting operation for a single source operand takes 13 clock cycles. The fitting precision is not less than 10^-6 and the maximum mean square error does not exceed 8.74 × 10^-14. This fitting precision is far higher than the best fitting precision in the prior art, resource consumption is lower, and the data format is IEEE 754 single-precision floating point, so the circuit is better suited to high-precision, high-speed real-time computation.
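The stage-by-stage dataflow of steps a to f can be modelled in software to check that it really evaluates p(x) with 9 multiplications and 5 additions. The sketch below is a behavioural model only: it ignores pipelining, registers, and single-precision rounding, and the function name `poly5_staged` is hypothetical.

```python
def poly5_staged(x, A, B, C, D, E, F):
    # Stage a: Multi_1, Multi_2
    ex = E * x
    x2 = x * x
    # Stage b: Add_1, Multi_3, Multi_4
    s1 = ex + F
    dx2 = D * x2
    x3 = x2 * x
    # Stage c: Add_2, Multi_5, Multi_6
    s2 = s1 + dx2
    cx3 = C * x3
    x4 = x3 * x
    # Stage d: Add_3, Multi_7, Multi_8
    s3 = s2 + cx3
    bx4 = B * x4
    x5 = x4 * x
    # Stage e: Add_4, Multi_9
    s4 = s3 + bx4
    ax5 = A * x5
    # Stage f: Add_5
    return s4 + ax5

result = poly5_staged(2.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
```

Unlike a Horner evaluation, this arrangement computes the powers of x and the partial sums side by side, which is what lets every stage run its operators in parallel.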
Scheme one uses less floating-point operation resource and less floating-point operation series, thus arithmetic speed is faster, but Coefficient memory module will store more fitted polynomial coefficients, increase storage resource.In addition, though entire sigmoid letter Several fitting precisions is all very high, but due to using different polynomial fittings at left and right sides of origin, about origin symmetry The corresponding fitting precision of two fit intervals will be different.
Scheme two: step 4- step 12 can also carry out as follows:
Step 4, by order n respectively with section (0, b] on m sectored cells between form m Vector Groups [n, Qm,Qm+1], [n,Qm+1,Qm+2],…,[n,Qε,Qε+1],…,[n,Q2m-1,Q2m];ε=m, m+1 ..., 2m-1, [n, Qε,Qε+1] indicate ε Vector Groups;M Vector Groups are successively substituted into Remes algorithm, thus successively obtain piecewise interval respectively corresponding to approximation accuracy um”,um+1”,…,ut”,…u2m-1";
Step 4.1: using formula (5), obtain the n+2 Chebyshev polynomial alternation points corresponding to the ε-th vector group [n, Q_ε, Q_{ε+1}], and take this ε-th alternation point set as the ε-th initial point set, thereby obtaining the initial point set corresponding to each of the m vector groups;
In formula (3), λ = 0, 1, …, n+1;
Step 4.2: using the ε-th initial point set, solve the system of linear equations shown in formula (6), and from the solution obtain the ε-th initial approximating polynomial p_ε'(x);
Step 4.3: within the ε-th piecewise interval [Q_ε, Q_{ε+1}], find the argument at which |f(x) - p_ε'(x)| attains its maximum value. Depending on which of three conditions holds, this argument replaces one point of the ε-th initial point set (β = 1, 2, …, n), which yields the update point set of the ε-th initial point set;
Step 4.4: using the update point set of the ε-th initial point set, solve the system of linear equations shown in formula (6) to obtain the updated solution, and from the updated solution obtain the ε-th updated approximating polynomial;
Judge whether |u_ε'' - u_ε'| ≤ eps holds. If it does, take u_ε'' as the approximation accuracy corresponding to the ε-th piecewise interval [Q_ε, Q_{ε+1}]; otherwise, repeat steps 4.3 to 4.4 until |u_ε'' - u_ε'| ≤ eps holds. Here eps denotes the convergence control precision of the approximation error.
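The iteration in steps 4.1 to 4.4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the circuit designer's actual procedure: formulas (5) and (6) are not reproduced in this text, so the sketch assumes the usual Chebyshev extremum points as the initial point set, the usual Remez linear system with an alternating error term, and a textbook single-point exchange. The names `remez`, `solve_reference`, and the dense search grid are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def solve_reference(f, pts, n):
    """Solve sum_j a_j*x_k^j + (-1)^k * E = f(x_k) on the n+2 reference points."""
    k = np.arange(n + 2)
    system = np.hstack([np.vander(pts, n + 1, increasing=True),
                        ((-1.0) ** k)[:, None]])
    sol = np.linalg.solve(system, f(pts))
    return sol[:-1], sol[-1]   # coefficients a_0..a_n and levelled error E

def remez(f, lo, hi, n, eps=1e-12, max_iter=100, grid=4001):
    k = np.arange(n + 2)
    # Assumed initial point set: Chebyshev extrema mapped to [lo, hi], ascending.
    pts = 0.5 * (lo + hi) - 0.5 * (hi - lo) * np.cos(np.pi * k / (n + 1))
    xs = np.linspace(lo, hi, grid)
    coef, level = solve_reference(f, pts, n)
    for _ in range(max_iter):
        err = f(xs) - np.polyval(coef[::-1], xs)
        i = int(np.argmax(np.abs(err)))
        x_new, s_new = xs[i], np.sign(err[i])
        ref_sign = np.sign(f(pts) - np.polyval(coef[::-1], pts))
        j = int(np.searchsorted(pts, x_new))
        if j == 0:                      # maximum lies at or left of the first point
            if ref_sign[0] == s_new:
                pts[0] = x_new
            else:
                pts = np.concatenate(([x_new], pts[:-1]))
        elif j == n + 2:                # maximum lies right of the last point
            if ref_sign[-1] == s_new:
                pts[-1] = x_new
            else:
                pts = np.concatenate((pts[1:], [x_new]))
        elif ref_sign[j - 1] == s_new:  # maximum lies between pts[j-1] and pts[j]
            pts[j - 1] = x_new
        else:
            pts[j] = x_new
        new_coef, new_level = solve_reference(f, pts, n)
        done = abs(abs(new_level) - abs(level)) <= eps   # |u'' - u'| <= eps
        coef, level = new_coef, new_level
        if done:
            break
    max_err = float(np.max(np.abs(f(xs) - np.polyval(coef[::-1], xs))))
    return coef, abs(level), max_err
```

For example, `remez(sigmoid, 0.0, 1.0, 5)` returns six coefficients ordered from the constant term upward, together with the levelled error and the maximum error measured on the search grid.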
Step 5: judge in turn whether the approximation accuracies u_m'', u_{m+1}'', …, u_t'', …, u_{2m-1}'' satisfy the fitting precision u. If an approximation accuracy satisfies it, the corresponding piecewise interval becomes a fitting execution interval, and the coefficients of the corresponding approximating polynomial become the fitting polynomial coefficients of that fitting execution interval. If it does not, the scaling endpoint values of the piecewise interval that fails are adjusted and execution returns to step 4, until m fitting execution intervals and m groups of fitting polynomial coefficients that satisfy the fitting precision u are obtained;
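The check-and-adjust loop of step 5 can be illustrated with a simplified sketch. The assumptions are significant: instead of the Remez fit, the sketch uses NumPy's Chebyshev interpolation as a stand-in error estimate, and instead of adjusting a given set of m endpoints it grows intervals greedily from left to right, halving a candidate interval until the precision u is met. The helper names `fit_error` and `segment` and the starting width are hypothetical.

```python
import numpy as np
from numpy.polynomial import Chebyshev

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_error(lo, hi, n, samples=2001):
    """Maximum sampled error of a degree-n Chebyshev interpolant on [lo, hi]."""
    p = Chebyshev.interpolate(sigmoid, n, domain=[lo, hi])
    xs = np.linspace(lo, hi, samples)
    return float(np.max(np.abs(sigmoid(xs) - p(xs))))

def segment(b, n, u, first_width=4.0):
    """Greedy stand-in for step 5: shrink each interval until its error <= u."""
    edges = [0.0]
    while edges[-1] < b:
        lo = edges[-1]
        hi = min(lo + first_width, b)
        while fit_error(lo, hi, n) > u:
            hi = lo + (hi - lo) / 2.0   # scale the right endpoint inward
        edges.append(hi)
    return edges

edges = segment(13.816, 5, 1e-6)
```

The resulting `edges` need not match the interval endpoints of this embodiment, since both the fitting method and the adjustment rule differ from the ones described above.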
Step 6: if the independent variable x of the sigmoid function f(x) lies in the interval (b, +∞), the interval (b, +∞) is taken as a fitting execution interval; the constant term coefficient of the fitting polynomial corresponding to (b, +∞) is 1, and all of its other coefficients are 0. In this way m+1 fitting polynomials of order n are obtained, which completes the fitting of the sigmoid function;
In the present embodiment, the constant term coefficient of the 5th-order fitting polynomial corresponding to the interval (13.816, +∞) is 1, and all other coefficients of that polynomial are 0;
Steps 5 and 6 give the 8 fitting execution intervals of this example: (0, 1], (1, 2], (2, 3], (3, 5], (5, 7], (7, 11], (11, 13.816], (13.816, +∞). This completes the fitting of the sigmoid function.
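A short numerical check shows why a constant polynomial equal to 1 is adequate on (13.816, +∞) at a fitting precision of 10^-6. The relation b ≈ ln(1/u - 1) used below is an inference that happens to reproduce the value 13.816; formula (1) is not reproduced in this text, so it should not be read as the source's definition.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

u = 1e-6                       # fitting precision of this embodiment
b = 13.816                     # right endpoint of the fit interval
tail_error = 1.0 - sigmoid(b)  # largest error when 1 replaces f(x) on (b, +inf)

# Assumed relation (not taken from the source): 1 - f(b) = u gives b = ln(1/u - 1).
b_inferred = round(math.log(1.0 / u - 1.0), 3)
```

Because f(x) is increasing, the error 1 - f(x) only shrinks further for x > b, so `tail_error` bounds the error over the whole interval.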
Step 7: the coefficients of the m+1 fitting polynomials of order n are hard-coded in ROM to form the coefficient storage module. In the present embodiment, the polynomial coefficients corresponding to the 8 fitting execution intervals are hard-coded in ROM, and addresses are written and read according to the storage rule and the read rule, forming a coefficient lookup table.
Step 8: according to the fitting polynomial of order n, design the polynomial operation module using n floating-point adders, 2n-1 floating-point multipliers and (n-2) × k register units, and design one floating-point subtractor at the output of the polynomial operation module; k is the pipeline depth of the floating-point adders, floating-point multipliers and floating-point subtractor. In the present embodiment, the polynomial operation module is designed using 5 floating-point adders, 9 floating-point multipliers and 6 reg register units, and the pipeline depth of each floating-point operator is 2 stages.
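The resource counts stated in step 8 can be checked against the embodiment with a few lines of Python. The function name `resources` is hypothetical; the three expressions are those given in step 8.

```python
def resources(n, k):
    """Operator counts for an order-n polynomial module with pipeline depth k."""
    return {
        "adders": n,
        "multipliers": 2 * n - 1,
        "registers": (n - 2) * k,
    }

embodiment = resources(n=5, k=2)
# -> {'adders': 5, 'multipliers': 9, 'registers': 6}
```

With n = 5 and k = 2 this reproduces the 5 adders, 9 multipliers and 6 register units of the present embodiment.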
Step 9: design the judgment module according to the 2m+2 fitting execution intervals; the polynomial operation module, the coefficient storage module, the floating-point subtractor and the judgment module together constitute the fitting hardware circuit shown in Fig. 4. In Fig. 4, data_i is the input source operand and data_o is the output operation result.
Step 10: as shown in Fig. 5, an operand ω is input as the input value of the fitting hardware circuit, and the judgment module determines the fitting execution interval in which the operand ω lies;
If ω ∈ (0, +∞), the coefficients of the fitting polynomial corresponding to the fitting execution interval in which ω lies are read from the coefficient storage module; if ω ∈ (-∞, 0], the coefficients of the fitting polynomial corresponding to the symmetric interval of the fitting execution interval in which ω lies are read from the coefficient storage module;
Step 12: the operand ω and the coefficients of its corresponding fitting polynomial are read into the polynomial operation module for the fitting calculation. If ω ∈ (0, +∞), the fitting result obtained is the output value of the fitting hardware circuit; if ω ∈ (-∞, 0], the fitting result obtained and the constant 1 are read into the floating-point subtractor, and the result of the subtraction is the output value of the fitting hardware circuit.
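The data path of steps 7 to 12 can be modelled in software as a sketch. Several points are assumptions and not taken from the source: the coefficients are produced by interpolation at Chebyshev nodes as a stand-in for the Remez coefficients, the ROM is a plain list with one row per interval, the polynomial is evaluated at |ω| for a non-positive operand (the text only says that the symmetric interval's coefficients are used), and Horner evaluation is used for brevity in place of the pipelined structure. The names `ROM`, `judge` and `fit_sigmoid` are hypothetical.

```python
import numpy as np

EDGES = [0.0, 1.0, 2.0, 3.0, 5.0, 7.0, 11.0, 13.816]  # endpoints of this embodiment
N = 5                                                  # polynomial order

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def interp_coeffs(lo, hi, n=N):
    """Stand-in coefficients: interpolate f at n+1 Chebyshev nodes of [lo, hi]."""
    k = np.arange(n + 1)
    nodes = 0.5 * (lo + hi) + 0.5 * (hi - lo) * np.cos((2 * k + 1) * np.pi / (2 * n + 2))
    return np.linalg.solve(np.vander(nodes, n + 1, increasing=True), sigmoid(nodes))

# Coefficient storage module: one row [F, E, D, C, B, A] per fitting execution interval.
ROM = [interp_coeffs(lo, hi).astype(np.float32) for lo, hi in zip(EDGES[:-1], EDGES[1:])]
ROM.append(np.array([1, 0, 0, 0, 0, 0], dtype=np.float32))  # (b, +inf): constant 1

def judge(a):
    """Judgment module: index of the interval containing the magnitude a."""
    for idx, hi in enumerate(EDGES[1:]):
        if a <= hi:
            return idx
    return len(EDGES) - 1

def fit_sigmoid(w):
    a = np.float32(abs(w))              # assumed: evaluate on the symmetric interval
    coef = ROM[judge(a)]
    acc = np.float32(0.0)
    for c in coef[::-1]:                # Horner evaluation in single precision
        acc = acc * a + c
    # Floating-point subtractor applies f(-x) = 1 - f(x) for non-positive operands.
    return float(acc) if w > 0 else float(np.float32(1.0) - acc)
```

Calling `fit_sigmoid(-2.5)` selects the row for (2, 3], evaluates the polynomial at 2.5, and returns 1 minus that value.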
The designed polynomial operation module is shown in Fig. 6, which is the hardware implementation structure of the 5th-order fitting polynomial used in scheme two of this embodiment. It uses the IEEE 754 standard single-precision floating-point data format, has an operational precision no worse than 10^-6, and comprises 9 multipliers, 5 adders and 6 reg register units. The polynomial implemented is p(x) = A*x^5 + B*x^4 + C*x^3 + D*x^2 + E*x + F, and Result is the final output of the operation. The operation proceeds as follows:
Step a: the source operand x enters the polynomial operation module and coefficient E is read. x enters multiplier Multi_1, which computes E*x and outputs it to the next level; x enters multiplier Multi_2, which computes x^2 and outputs it to the next level; x enters reg_1 and is held for two stages to wait for the next level of operation. The 2 multipliers of the first level operate in parallel, and the pipeline depth of each multiplier is set to 2 stages;
Step b: coefficient F and E*x are read into adder Add_1, which computes (E*x + F) and outputs the result to the next level; coefficient D and x^2 are read into multiplier Multi_3, which computes D*x^2 and outputs it to the next level; x^2 and x enter multiplier Multi_4, which computes x^3 and outputs it to the next level; the x held from the previous level enters reg_2 and is held for another two stages to wait for the next level of operation. The 3 floating-point operators of the second level operate in parallel, and the pipeline depth of each is set to 2 stages;
Step c: (E*x + F) and D*x^2 are read into adder Add_2, which computes (D*x^2 + E*x + F) and outputs it to the next level; coefficient C and x^3 are read into multiplier Multi_5, which computes C*x^3 and outputs it to the next level; x^3 and the x held from the previous level enter multiplier Multi_6, which computes x^4 and outputs it to the next level; the x held from the previous level enters reg_3 and is held for another two stages to wait for the next level of operation. The 3 floating-point operators of the third level operate in parallel, and the pipeline depth of each is set to 2 stages;
Step d: (D*x^2 + E*x + F) and C*x^3 are read into adder Add_3, which computes (C*x^3 + D*x^2 + E*x + F) and outputs it to the next level; coefficient B and x^4 are read into multiplier Multi_7, which computes B*x^4 and outputs it to the next level; x^4 and the x held from the previous level enter multiplier Multi_8, which computes x^5 and outputs it to the next level. The 3 floating-point operators of the fourth level operate in parallel, and the pipeline depth of each is set to 2 stages;
Step e: (C*x^3 + D*x^2 + E*x + F) and B*x^4 are read into adder Add_4, which computes (B*x^4 + C*x^3 + D*x^2 + E*x + F) and outputs it to the next level; coefficient A and x^5 are read into multiplier Multi_9, which computes A*x^5 and outputs it to the next level. The 2 floating-point operators of the fifth level operate in parallel, and the pipeline depth of each is set to 2 stages;
Step f: adder Add_5 computes (A*x^5 + B*x^4 + C*x^3 + D*x^2 + E*x + F) and outputs it; the pipeline depth of the adder is set to 2 stages;
Step g: if the source operand lies in the interval (0, +∞), the result of the previous level is the final result and is output directly; if the source operand lies in the interval (-∞, 0], subtractor Add_6 subtracts the result of the previous level from 1, and that result is the final result and is output directly. The pipeline depth of the subtractor is set to 2 stages.
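Steps a to g can be traced level by level in a short software model. This is a sketch of the data flow only: it ignores pipeline timing and the reg_1 to reg_3 delay registers, the function name `poly_pipeline` and the `negative` flag are hypothetical, and the comments name the operator that the text assigns to each operation.

```python
import numpy as np

f32 = np.float32

def poly_pipeline(x, A, B, C, D, E, F, negative=False):
    """Evaluate A*x^5 + ... + F in single precision, level by level."""
    x, A, B, C, D, E, F = (f32(v) for v in (x, A, B, C, D, E, F))
    # Level 1 (step a)
    ex = E * x            # Multi_1
    x2 = x * x            # Multi_2
    # Level 2 (step b)
    s1 = ex + F           # Add_1: E*x + F
    dx2 = D * x2          # Multi_3
    x3 = x2 * x           # Multi_4
    # Level 3 (step c)
    s2 = s1 + dx2         # Add_2: D*x^2 + E*x + F
    cx3 = C * x3          # Multi_5
    x4 = x3 * x           # Multi_6
    # Level 4 (step d)
    s3 = s2 + cx3         # Add_3
    bx4 = B * x4          # Multi_7
    x5 = x4 * x           # Multi_8
    # Level 5 (step e)
    s4 = s3 + bx4         # Add_4
    ax5 = A * x5          # Multi_9
    # Level 6 (step f)
    s5 = s4 + ax5         # Add_5: full polynomial
    # Level 7 (step g): subtractor Add_6 is used only for non-positive operands
    return float(f32(1.0) - s5) if negative else float(s5)
```

The model uses exactly 9 multiplications and 5 additions, matching the operator count of the module, and shows that each level depends only on results from the level before it, which is what allows the levels to be pipelined.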
Once the above steps are complete, the sigmoid function fitting processing of the present invention is complete. Counting the clock cycles of each step in this example: each operation level has a pipeline depth of 2 and there are 7 levels in total, so the fitting operation for a single source operand takes 15 clock cycles. The fitting precision is no worse than 10^-6, and the maximum mean square error does not exceed 8.74 × 10^-14. This fitting precision is much higher than the best fitting precision in the current prior art, the resource consumption is lower, and the data format is the IEEE 754 single-precision floating-point format, so the circuit is better suited to high-precision, high-speed real-time operation.
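The cycle counts quoted for the two schemes can be related to the level counts as follows. The text gives only the totals (13 and 15 cycles); the single extra cycle beyond levels × depth is an assumption made here so that the totals match, for example a cycle for operand input and coefficient read, and the function name `latency_cycles` is hypothetical.

```python
def latency_cycles(levels, depth, extra=1):
    """Clock cycles for one operand: pipelined levels plus an assumed extra cycle."""
    return levels * depth + extra

scheme_one = latency_cycles(levels=6, depth=2)   # 13
scheme_two = latency_cycles(levels=7, depth=2)   # 15
```

Under this reading, the subtractor level of scheme two accounts for the 2 additional cycles relative to scheme one.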
In scheme two, the coefficient storage module stores fewer fitting polynomial coefficients, which reduces storage resource consumption and also reduces the work of computing the fitting polynomials. Because the same fitting polynomial is used on the left and right sides of the origin, two fitting intervals that are symmetric about the origin have the same fitting precision, which makes error analysis more convenient. The operation speed of the whole sigmoid function fitting still meets the requirements of real-time high-speed operation, but the added subtractor and operation level increase the consumption of operation resources and reduce the operation speed.
In summary, the present invention uses the Remez approximation algorithm to complete sigmoid function operations quickly and effectively and to achieve a fitting operation of high precision. For single-precision floating-point operation under the IEEE 754 standard, the maximum error under the requirements of high-precision hardware implementation does not exceed 10^-6; for data not in the IEEE 754 standard format, this structure can likewise achieve better fitting precision than the current prior art under equivalent technical specifications. The circuit structure of this method is simple and its scale is limited: the operation can be completed with a relatively small number of adders and multipliers, which greatly reduces the consumption of operation resources, and flexibility is high. While guaranteeing the speed and parallelism requirements of the operation, it effectively improves the precision and performance of the sigmoid function fitting operation and resolves the bottleneck faced by the current prior art.

Claims (1)

1. A sigmoid function fitting hardware circuit based on the Remez approximation algorithm, characterized in that it is obtained as follows:
Step 1: according to a given fitting precision u, the operation resources and the storage resources, determine the order n of the fitting polynomial;
Step 2: according to the fitting precision u, obtain the fitting interval [a, b] of the sigmoid function f(x) using formula (1);
Step 3: using the symmetry shown in formula (2), with the origin 0 as the center of symmetry, divide the fitting interval [a, b] into 2m+2 subintervals [a, q_1], (q_1, q_2], …, (q_m, 0], (0, q_{m+1}], …, (q_{2m}, b]; a, q_1, q_2, …, q_m, 0, q_{m+1}, …, q_{2m}, b denote the endpoint values of the 2m+2 subintervals; q_1, q_2, …, q_m, q_{m+1}, …, q_{2m} denote the scaling endpoint values of the 2m subintervals; the scaling endpoint values of the 2m subintervals form, in order, the endpoint set Q = {Q_0, Q_1, …, Q_t, …, Q_{2m-1}}; Q_t denotes the endpoint value of the t-th subinterval among the scaling endpoint values of the 2m subintervals; the piecewise intervals [Q_0, Q_1], [Q_1, Q_2], …, [Q_t, Q_{t+1}], …, [Q_{2m-1}, Q_{2m}] are thereby obtained; t = 0, 1, …, 2m-1;
f(-x) = 1 - f(x)    (2)
Step 4: the order n is combined with each of the m piecewise intervals on the interval (0, b] to form m vector groups [n, Q_m, Q_{m+1}], [n, Q_{m+1}, Q_{m+2}], …, [n, Q_ε, Q_{ε+1}], …, [n, Q_{2m-1}, Q_{2m}]; ε = m, m+1, …, 2m-1, and [n, Q_ε, Q_{ε+1}] denotes the ε-th vector group; the m vector groups are substituted in turn into the Remez algorithm, which yields in turn the approximation accuracies u_m'', u_{m+1}'', …, u_t'', …, u_{2m-1}'' corresponding to the respective piecewise intervals;
Step 4.1: using formula (5), obtain the n+2 Chebyshev polynomial alternation points corresponding to the ε-th vector group [n, Q_ε, Q_{ε+1}], and take this ε-th alternation point set as the ε-th initial point set, thereby obtaining the initial point set corresponding to each of the m vector groups;
In formula (3), λ = 0, 1, …, n+1;
Step 4.2: using the ε-th initial point set, solve the system of linear equations shown in formula (6), and from the solution obtain the ε-th initial approximating polynomial p_ε'(x);
Step 4.3: within the ε-th piecewise interval [Q_ε, Q_{ε+1}], find the argument at which |f(x) - p_ε'(x)| attains its maximum value; depending on which of three conditions holds, the argument replaces one point of the ε-th initial point set (β = 1, 2, …, n), which yields the update point set of the ε-th initial point set;
Step 4.4: using the update point set of the ε-th initial point set, solve the system of linear equations shown in formula (6) to obtain the updated solution, and from the updated solution obtain the ε-th updated approximating polynomial;
Judge whether |u_ε'' - u_ε'| ≤ eps holds; if it does, take u_ε'' as the approximation accuracy corresponding to the ε-th piecewise interval [Q_ε, Q_{ε+1}]; otherwise, repeat steps 4.3 to 4.4 until |u_ε'' - u_ε'| ≤ eps holds; eps denotes the convergence control precision of the approximation error;
Step 5: judge in turn whether the approximation accuracies u_m'', u_{m+1}'', …, u_t'', …, u_{2m-1}'' satisfy the fitting precision u; if an approximation accuracy satisfies it, the corresponding piecewise interval becomes a fitting execution interval, and the coefficients of the corresponding approximating polynomial become the fitting polynomial coefficients of that fitting execution interval; if it does not, the scaling endpoint values of the piecewise interval that fails are adjusted and execution returns to step 4, until m fitting execution intervals and m groups of fitting polynomial coefficients that satisfy the fitting precision u are obtained;
Step 6: if the independent variable x of the sigmoid function f(x) lies in the interval (b, +∞), the interval (b, +∞) is taken as a fitting execution interval; the constant term coefficient of the fitting polynomial corresponding to (b, +∞) is 1, and all of its other coefficients are 0; m+1 fitting polynomials of order n are thereby obtained, which completes the fitting of the sigmoid function;
Step 7: the coefficients of the m+1 fitting polynomials of order n are hard-coded in ROM to form the coefficient storage module;
Step 8: according to the fitting polynomial of order n, design the polynomial operation module using n floating-point adders, 2n-1 floating-point multipliers and (n-2) × k register units, and design one floating-point subtractor at the output of the polynomial operation module; k is the pipeline depth of the floating-point adders, the floating-point multipliers and the floating-point subtractor;
Step 9: design the judgment module according to the 2m+2 fitting execution intervals; the polynomial operation module, the coefficient storage module, the floating-point subtractor and the judgment module constitute the fitting hardware circuit;
Step 10: an operand ω is input as the input value of the fitting hardware circuit, and the judgment module determines the fitting execution interval in which the operand ω lies;
If ω ∈ (0, +∞), the coefficients of the fitting polynomial corresponding to the fitting execution interval in which the operand ω lies are read from the coefficient storage module;
If ω ∈ (-∞, 0], the coefficients of the fitting polynomial corresponding to the symmetric interval of the fitting execution interval in which the operand ω lies are read from the coefficient storage module;
Step 12: the operand ω and the coefficients of its corresponding fitting polynomial are read into the polynomial operation module for the fitting calculation; if ω ∈ (0, +∞), the fitting result obtained is the output value of the fitting hardware circuit; if ω ∈ (-∞, 0], the fitting result obtained and the constant 1 are read into the floating-point subtractor, and the result of the subtraction is the output value of the fitting hardware circuit.
CN201710416069.6A 2014-12-30 2014-12-30 A kind of sigmoid Function Fitting hardware circuit based on column maze approximate algorithm Active CN107247992B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710416069.6A CN107247992B (en) 2014-12-30 2014-12-30 A kind of sigmoid Function Fitting hardware circuit based on column maze approximate algorithm

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410850470.7A CN104484703B (en) 2014-12-30 2014-12-30 A kind of sigmoid Function Fitting hardware circuits based on row maze approximate algorithm
CN201710416069.6A CN107247992B (en) 2014-12-30 2014-12-30 A kind of sigmoid Function Fitting hardware circuit based on column maze approximate algorithm

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201410850470.7A Division CN104484703B (en) 2014-12-30 2014-12-30 A kind of sigmoid Function Fitting hardware circuits based on row maze approximate algorithm

Publications (2)

Publication Number Publication Date
CN107247992A CN107247992A (en) 2017-10-13
CN107247992B true CN107247992B (en) 2019-08-30

Family

ID=52759244

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710416069.6A Active CN107247992B (en) 2014-12-30 2014-12-30 A kind of sigmoid Function Fitting hardware circuit based on column maze approximate algorithm
CN201410850470.7A Active CN104484703B (en) 2014-12-30 2014-12-30 A kind of sigmoid Function Fitting hardware circuits based on row maze approximate algorithm

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201410850470.7A Active CN104484703B (en) 2014-12-30 2014-12-30 A kind of sigmoid Function Fitting hardware circuits based on row maze approximate algorithm

Country Status (1)

Country Link
CN (2) CN107247992B (en)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102359265B1 (en) * 2015-09-18 2022-02-07 삼성전자주식회사 Processing apparatus and method for performing operation thereof
CN105893159B (en) * 2016-06-21 2018-06-19 北京百度网讯科技有限公司 Data processing method and device
US10552732B2 (en) * 2016-08-22 2020-02-04 Kneron Inc. Multi-layer neural network
CN106682732B (en) * 2016-12-14 2019-03-29 浙江大学 A kind of Gauss error function circuit applied to neural network
CN108205518A (en) * 2016-12-19 2018-06-26 上海寒武纪信息科技有限公司 Obtain device, method and the neural network device of functional value
US10997492B2 (en) * 2017-01-20 2021-05-04 Nvidia Corporation Automated methods for conversions to a lower precision data format
CN107480771B (en) * 2017-08-07 2020-06-02 北京中星微人工智能芯片技术有限公司 Deep learning-based activation function realization method and device
CN107704422A (en) * 2017-10-13 2018-02-16 武汉精测电子集团股份有限公司 A kind of parallel calculating method and device based on PLD
CN108154224A (en) * 2018-01-17 2018-06-12 北京中星微电子有限公司 For the method, apparatus and non-transitory computer-readable medium of data processing
US10977854B2 (en) 2018-02-27 2021-04-13 Stmicroelectronics International N.V. Data volume sculptor for deep learning acceleration
US11687762B2 (en) * 2018-02-27 2023-06-27 Stmicroelectronics S.R.L. Acceleration unit for a deep learning engine
US11586907B2 (en) 2018-02-27 2023-02-21 Stmicroelectronics S.R.L. Arithmetic unit for deep learning acceleration
CN108537332A (en) * 2018-04-12 2018-09-14 合肥工业大学 A kind of Sigmoid function hardware-efficient rate implementation methods based on Remez algorithms
CN109934336B (en) * 2019-03-08 2023-05-16 江南大学 Neural network dynamic acceleration platform design method based on optimal structure search and neural network dynamic acceleration platform
CN110070170A (en) * 2019-05-23 2019-07-30 福州大学 PSO-BP neural network sensor calibrating system and method based on MCU
CN110647718B (en) * 2019-09-26 2023-07-25 中昊芯英(杭州)科技有限公司 Data processing method, device, equipment and computer readable storage medium
CN110837885B (en) * 2019-10-11 2021-03-02 西安电子科技大学 Sigmoid function fitting method based on probability distribution
CN110796247B (en) * 2020-01-02 2020-05-19 深圳芯英科技有限公司 Data processing method, device, processor and computer readable storage medium
CN111191766B (en) * 2020-01-02 2023-05-16 中昊芯英(杭州)科技有限公司 Data processing method, device, processor and computer readable storage medium
CN111191779B (en) * 2020-01-02 2023-05-30 中昊芯英(杭州)科技有限公司 Data processing method, device, processor and computer readable storage medium
US11507831B2 (en) 2020-02-24 2022-11-22 Stmicroelectronics International N.V. Pooling unit for deep learning acceleration
CN111680782B (en) * 2020-05-20 2022-09-13 河海大学常州校区 FPGA-based RBF neural network activation function implementation method
CN112528211B (en) * 2020-12-17 2022-12-20 中电科思仪科技(安徽)有限公司 Method for fitting solar cell IV curve
CN112859086B (en) * 2021-01-25 2024-02-27 聚融医疗科技(杭州)有限公司 Self-adaptive rapid arctangent system, method and ultrasonic imaging device
CN114567396A (en) * 2022-02-28 2022-05-31 哲库科技(北京)有限公司 Wireless communication method, fitting method of nonlinear function, terminal and equipment
CN114900257B (en) * 2022-05-26 2024-05-14 Oppo广东移动通信有限公司 Baseband chip, channel estimation method, data processing method and equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1741394A (en) * 2005-09-16 2006-03-01 北京中星微电子有限公司 Method for computing nonlinear function in inverse quantization formula
CN102708381A (en) * 2012-05-09 2012-10-03 江南大学 Improved extreme learning machine combining learning thought of least square vector machine
CN103729688A (en) * 2013-12-18 2014-04-16 北京交通大学 Section traffic neural network prediction method based on EMD
CN103809930A (en) * 2014-01-24 2014-05-21 天津大学 Design method of double-precision floating-point divider and divider

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101527010B (en) * 2008-03-06 2011-12-07 上海理工大学 Hardware realization method and system for artificial neural network algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于FPGA的神经网络硬件实现的研究与设计";刘培龙;《中国优秀硕士学位论文全文数据库 信息科技辑》;20130715(第7期);I135-626页

Also Published As

Publication number Publication date
CN104484703B (en) 2017-06-30
CN104484703A (en) 2015-04-01
CN107247992A (en) 2017-10-13

Similar Documents

Publication Publication Date Title
CN107247992B (en) A kind of sigmoid Function Fitting hardware circuit based on column maze approximate algorithm
Gokhale et al. Snowflake: An efficient hardware accelerator for convolutional neural networks
CN116894145A (en) Block floating point for neural network implementation
CN110276450A (en) Deep neural network structural sparse system and method based on more granularities
CN107609641A (en) Sparse neural network framework and its implementation
CN109146067B (en) Policy convolution neural network accelerator based on FPGA
CN106951211B (en) A kind of restructural fixed and floating general purpose multipliers
CN108537332A (en) A kind of Sigmoid function hardware-efficient rate implementation methods based on Remez algorithms
CN106155627B (en) Low overhead iteration trigonometric device based on T_CORDIC algorithm
CN107633298A (en) A kind of hardware structure of the recurrent neural network accelerator based on model compression
CN102103479A (en) Floating point calculator and processing method for floating point calculation
CN102184161B (en) Matrix inversion device and method based on residue number system
CN109325590B (en) Device for realizing neural network processor with variable calculation precision
CN103176948A (en) Single precision elementary function operation accelerator low in cost
CN112540946A (en) Reconfigurable processor and method for calculating activation functions of various neural networks on reconfigurable processor
CN103902762A (en) Circuit structure for conducting least square equation solving according to positive definite symmetric matrices
CN212569855U (en) Hardware implementation device for activating function
CN109298848A (en) The subduplicate circuit of double mode floating-point division
Kang et al. Design of convolution operation accelerator based on FPGA
CN113191494A (en) Efficient LSTM accelerator based on FPGA
CN111860792A (en) Hardware implementation device and method for activating function
Yang et al. A Parallel Processing CNN Accelerator on Embedded Devices Based on Optimized MobileNet
Karthickkeyan et al. Booth Multiplier-Based Robust Model of FIR Filters for VLSI Applications
CN103699729A (en) Modulus multiplier
Yunfu et al. Design and implementation of R4-MSD square root algorithm in ternary optical computer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20201231

Address after: 245000 No. 50, Meilin Avenue, Huangshan Economic Development Zone, Anhui Province

Patentee after: Huangshan Development Investment Group Co.,Ltd.

Address before: No. 193 Tunxi Road, Baohe District, Hefei, Anhui Province, 230009

Patentee before: Hefei University of Technology