WO2017185334A1 - Apparatus and method for performing multiple transcendental function operations - Google Patents

Apparatus and method for performing multiple transcendental function operations

Info

Publication number
WO2017185334A1
Authority
WO
WIPO (PCT)
Prior art keywords
processing unit
post
mode
function
core unit
Prior art date
Application number
PCT/CN2016/080690
Other languages
English (en)
French (fr)
Inventor
张士锦
李尚应
陈天石
陈云霁
Original Assignee
北京中科寒武纪科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京中科寒武纪科技有限公司
Priority to US16/097,603 priority Critical patent/US20190138570A1/en
Priority to PCT/CN2016/080690 priority patent/WO2017185334A1/zh
Priority to EP16899846.6A priority patent/EP3451152B1/en
Publication of WO2017185334A1 publication Critical patent/WO2017185334A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/548 Trigonometric functions; Co-ordinate transformations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5446 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation using crossaddition algorithms, e.g. CORDIC
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/17 Function evaluation by approximation methods, e.g. inter- or extrapolation, smoothing, least mean square method
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/556 Logarithmic or exponential functions

Definitions

  • The present invention relates to the field of transcendental function computation, and in particular to an apparatus and method for performing multiple transcendental function operations, especially trigonometric, hyperbolic, exponential, or logarithmic function operations.
  • Transcendental functions such as trigonometric, hyperbolic, exponential, and logarithmic functions are not only common in scientific computing but are also often used as activation functions in multi-layer artificial neural networks.
  • Multi-layer artificial neural networks are widely used in pattern recognition, image processing, function approximation, and optimization. In recent years they have attracted growing attention from academia and industry because of their high recognition accuracy and good parallelism.
  • One known method of supporting the above various transcendental function calculations is to use a general purpose processor.
  • This method approximates the various transcendental functions by executing general-purpose instructions using general-purpose register files and general-purpose functional units.
  • One disadvantage of this method is that it cannot be integrated with a dedicated device for multi-layer artificial neural networks, so the other steps cannot benefit from the performance improvement of such devices.
  • In addition, the general-purpose processor has to decode a transcendental function computation into a long sequence of arithmetic and memory-access instructions, and the front-end decoding of the processor incurs a large power overhead.
  • Another method of computing transcendental functions in a multi-layer artificial neural network is linear approximation.
  • This method approximates the activation function (many activation functions are transcendental functions) by dividing the domain into segments and storing the coefficients of the linear approximation for each segment.
  • The disadvantage of this method is that the number of segments a piecewise linear approximation can use is limited, so its precision cannot meet the needs of developing artificial neural networks, let alone applications with higher precision requirements such as scientific computing and image processing.
  • The main object of the present invention is to provide an apparatus and method for performing multiple transcendental function operations, so as to solve the problems of excessive overhead in the general-purpose processor approach and insufficient precision in the pure linear approximation approach, and to improve support for various transcendental function operations.
  • To this end, the present invention provides an apparatus for performing multiple transcendental function operations, the apparatus comprising a pre-processing unit group, a core unit, and a post-processing unit group, wherein:
  • the pre-processing unit group transforms the externally input argument a into coordinates x, y, an angle z, and remaining information k, and determines the operation mode (mode) used by the core unit;
  • the core unit performs a trigonometric or hyperbolic transform on the coordinates x, y and the angle z to obtain the transformed coordinates x′, y′ and angle z′, and outputs them to the post-processing unit group;
  • the post-processing unit group transforms the coordinates x′, y′ and the angle z′ received from the core unit, according to the remaining information k and the function f received from the pre-processing unit group, to obtain the output result c.
  • The pre-processing unit group includes a selector 1 and a processor 2, and the post-processing unit group includes a first post-processing unit 4, a second post-processing unit 5, and a third post-processing unit 6. The selector 1 receives the externally input argument a and function f and determines which of four different operations should be taken, including the following:
  • In operation II, the selector 1 derives from the argument a, according to the function f, the coordinates x, y, the angle z, and the mode (mode) used by the core unit, and outputs x, y, z, and mode to the core unit 3. The core unit 3 performs a trigonometric or hyperbolic transform on x, y, z based on mode, obtains the transformed coordinates x′, y′ and angle z′, and outputs them to the second post-processing unit 5 in the post-processing unit group. The second post-processing unit 5 obtains the output result c from the coordinates x′, y′ and angle z′ output by the core unit and the function f;
  • In operation III, the selector 1 hands the argument a and the function f to the processor 2 for pre-processing. The processor 2 decomposes the argument a according to the function f to obtain the coordinates x, y, the angle z, the mode (mode) used by the core unit 3, and the remaining information k, where x, y, z, and mode are the same as in operation II.
  • The coordinates x, y, the angle z, and the mode are output to the core unit, and the remaining information k and the function f are output directly to the third post-processing unit 6 in the post-processing unit group;
  • the core unit 3 performs a trigonometric or hyperbolic transform on x, y, z based on mode and outputs the resulting x′, y′, z′ to the third post-processing unit 6 in the post-processing unit group;
  • the third post-processing unit 6 obtains the output result c from the x′, y′, z′ output by the core unit 3 and the k and function f given by the processor 2;
  • The present invention also provides a method for performing multiple transcendental function operations, the method comprising:
  • Step 1: The selector receives the input argument a and function f, and determines which of four different operations (I, II, III, or IV) should be taken;
  • Step 2: When operation III is taken, the processor applies a multiplication or shift transform to the input argument a and function f so that the core unit can accept it, and records the transform information k and sign for use by the third post-processing unit, where sign is valid only for some functions;
  • Step 3: When operation II or III is taken, the core unit implements four trigonometric or hyperbolic transforms (a default mode and a vector mode for each) using additions, subtractions, and shifts on three numbers: the abscissa x, the ordinate y, and the angle z. In the transform formulas, A and B are constants that depend on the number of iterations, and a shift is a multiplication by a power of 2;
  • Step 4a: When operation I is taken, the first post-processing unit computes a linear or quadratic approximation according to the input function f and outputs it;
  • Step 4b: When operation II or III is taken, the second post-processing unit applies addition, subtraction, multiplication by a constant, division, and shift operations to the output of the core unit, according to the input function f and the information provided by the processor of the pre-processing unit group, to obtain the output result c; the information provided by the processor is valid only for operation III.
  • The apparatus and method provided by the present invention convert the evaluation of a transcendental function into the evaluation of a trigonometric or hyperbolic rotation transform. The iteration keeps the absolute value of the rotation angle at each step fixed, and rotates in the reverse direction when the rotation overshoots.
  • FIG. 1 shows a block diagram of an apparatus for performing a plurality of transcendental function operations in accordance with an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of how the core unit of FIG. 1 iteratively approximates the desired trigonometric function relationship in trigonometric mode.
  • FIG. 3 is a schematic diagram of how the core unit of FIG. 1 iteratively approximates the desired hyperbolic function relationship in hyperbolic mode.
  • FIG. 4 shows a flow chart of a method for performing multiple transcendental function operations in accordance with an embodiment of the present invention.
  • Table 1 shows the specific operations of each unit for each input function f and argument a in accordance with an embodiment of the invention (for 16-bit floating-point numbers). These operations can be implemented mainly with addition, subtraction, multiplication by constants, and shifts (multiplication or division by a power of 2), with only a small number of divisions and, for the quadratic approximation, multiplications with low precision requirements. If the precision required of the input and output differs, some of the ranges in Table 1 should be adjusted accordingly.
  • FIG. 1 shows a block diagram of an apparatus for performing a plurality of transcendental function operations in accordance with an embodiment of the present invention.
  • The apparatus includes a pre-processing unit group (1, 2), a core unit 3, and a post-processing unit group (4, 5, 6), wherein:
  • the pre-processing unit group transforms the externally input argument a into coordinates x, y, an angle z, and remaining information k, and determines the operation mode (mode) used by the core unit;
  • the core unit 3 performs a trigonometric or hyperbolic transform on the coordinates x, y and the angle z to obtain the transformed coordinates x′, y′ and angle z′, and outputs them to the post-processing unit group;
  • the post-processing unit group transforms the coordinates x′, y′ and the angle z′ received from the core unit, according to the remaining information k and the function f received from the pre-processing unit group, to obtain the output result c.
  • The pre-processing unit group includes a selector 1 and a processor 2.
  • The post-processing unit group includes a first post-processing unit 4, a second post-processing unit 5, and a third post-processing unit 6. All of these units can be implemented as hardware integrated circuits (for example, an application-specific integrated circuit, ASIC).
  • The selector 1 receives the externally input argument a and function f and determines which of four different operations should be taken, including the following:
  • In operation II, the selector 1 derives from the argument a, according to the function f, the coordinates x, y, the angle z, and the mode (mode) used by the core unit 3, and outputs x, y, z, and mode to the core unit 3.
  • The core unit 3 performs a trigonometric or hyperbolic transform on x, y, z based on mode, obtains the transformed coordinates x′, y′ and angle z′, and outputs them to the second post-processing unit 5 in the post-processing unit group.
  • The second post-processing unit 5 obtains the output result c from the coordinates x′, y′ and angle z′ output by the core unit and the function f;
  • In operation III, the selector 1 hands the argument a and the function f to the processor 2 for pre-processing, and the processor 2 decomposes the argument a according to the function f.
  • The third post-processing unit 6 obtains the output result c from the x′, y′, z′ output by the core unit 3 and the k and function f given by the processor 2;
  • FIG. 4 shows a flow chart of a method for performing multiple transcendental function operations according to an embodiment of the present invention, including the following steps:
  • Step 1: The selector receives the input argument a and function f, and determines which of four different operations (I, II, III, or IV) should be taken;
  • In operation I, the argument a and the function f are output directly to the first post-processing unit in the post-processing unit group; the first post-processing unit derives a linear approximation for the argument a according to the function f, and performs addition and multiplication on the argument a to obtain the output result c. See Table 1 for details;
  • In operation II, the selector derives from the argument a, according to the function f, the coordinates x, y, the angle z, and the mode (mode) used by the core unit, and outputs x, y, z, and mode to the core unit, which performs a trigonometric or hyperbolic transform on x, y, z based on mode.
  • In operation III, the selector hands the argument a and the function f to the processor for pre-processing, and the processor decomposes the argument a according to the function f to obtain the coordinates x, y, the angle z, the mode (mode) used by the core unit, and the remaining information k.
  • The core unit performs a trigonometric or hyperbolic transform on x, y, z based on mode to obtain x′, y′, z′.
  • The third post-processing unit obtains the output result c from the x′, y′, z′ output by the core unit and the k and function f given by the processor;
  • Step 2: When operation III is taken, the processor applies a multiplication or shift transform to the input argument a and function f so that the core unit can accept it, and records the transform information k and sign for use by the third post-processing unit, where sign is valid only for some functions. See Table 1 for specific examples of the processor's operation for each input function.
  • Step 3: When operation II or III is taken, the core unit implements four trigonometric or hyperbolic transforms (a default mode and a vector mode for each) using additions, subtractions, and shifts on three numbers: the abscissa x, the ordinate y, and the angle z. In the transform formulas, A and B are constants that depend on the number of iterations, and a shift is a multiplication by a power of 2. The transform is carried out by iteratively approximating the angle through which to rotate.
  • FIG. 2 illustrates the principle of approximating the desired trigonometric transform with a series of fixed-angle trigonometric transforms (for convenience, the magnification of the coordinates by 1/cos z_i at each step is not shown).
  • FIG. 3 illustrates the principle of approximating the desired hyperbolic transform with a series of fixed-angle hyperbolic transforms (for convenience, the magnification of the coordinates by 1/cosh z_i at each step is not shown).
  • Each iteration is equivalent to rotating by z_i in the forward or reverse direction and magnifying the coordinates by a factor of 1/cos z_i (1/cosh z_i in hyperbolic mode).
  • The specific number of iterations, that is, the maximum value of i, is chosen flexibly according to the precision of the floating-point numbers to be processed. Once the maximum number of iterations is chosen, the aforementioned constants can be computed.
  • The selection among the above four modes for each input function is shown in Table 1.
  • Step 4a: When operation I is taken, the first post-processing unit computes a linear or quadratic approximation according to the input function f and outputs it. The specific operation of the first post-processing unit for each input function is shown in Table 1.
  • Step 4b: When operation II or III is taken, the second post-processing unit applies addition, subtraction, multiplication by a constant, division, and shift operations to the output of the core unit, according to the input function f and the information provided by the processor of the pre-processing unit group, to obtain the output result c; the information provided by the processor is valid only for operation III.
  • The specific operation of the second post-processing unit for each input function is shown in Table 1.
  • The apparatus and method for performing multiple transcendental function operations convert the evaluation of a transcendental function into the evaluation of a trigonometric or hyperbolic rotation transform, which is carried out iteratively.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)

Abstract

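An apparatus and method for performing multiple transcendental function operations. The apparatus comprises a pre-processing unit group, a core unit, and a post-processing unit group. The pre-processing unit group transforms an externally input argument a into coordinates x, y, an angle z, and remaining information k, and determines the operation mode (mode) used by the core unit. The core unit performs a trigonometric or hyperbolic transform on the coordinates x, y and the angle z to obtain the transformed coordinates x', y' and angle z', and outputs them to the post-processing unit group. The post-processing unit group transforms the coordinates x', y' and angle z' received from the core unit, according to the remaining information k and the function f received from the pre-processing unit group, to obtain an output result c. The apparatus and method solve the problems of excessive overhead in the general-purpose processor approach and insufficient precision in the pure linear approximation approach, and effectively improve support for various transcendental function operations.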

Description

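Apparatus and method for performing multiple transcendental function operations
Technical Field
The present invention relates to the technical field of transcendental function computation, and specifically to an apparatus and method for performing multiple transcendental function operations, in particular trigonometric, hyperbolic, exponential, or logarithmic function operations.
Background
Transcendental functions such as trigonometric, hyperbolic, exponential, and logarithmic functions are not only common in all kinds of scientific computing but are also often used as activation functions in multi-layer artificial neural networks. Multi-layer artificial neural networks are widely used in pattern recognition, image processing, function approximation, and optimization, and in recent years they have attracted growing attention from academia and industry because of their high recognition accuracy and good parallelism.
One known method of supporting the computation of these transcendental functions is to use a general-purpose processor. This method approximates the various transcendental functions by executing general-purpose instructions using general-purpose register files and general-purpose functional units. One disadvantage of this method is that it cannot be integrated with a dedicated device for multi-layer artificial neural networks, so the other steps cannot benefit from the performance improvement of such devices. In addition, the general-purpose processor has to decode a transcendental function computation into a long sequence of arithmetic and memory-access instructions, and the front-end decoding of the processor incurs a large power overhead.
Another method of computing transcendental functions in a multi-layer artificial neural network is linear approximation. This method approximates the activation function (many activation functions are transcendental functions) by dividing the domain into segments and storing the coefficients of the linear approximation for each segment. The disadvantage of this method is that the number of segments a piecewise linear approximation can use is limited, so its precision cannot meet the needs of developing artificial neural networks, let alone applications with higher precision requirements such as scientific computing and image processing.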
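Summary of the Invention
In view of this, the main object of the present invention is to provide an apparatus and method for performing multiple transcendental function operations, so as to solve the problems of excessive overhead in the general-purpose processor approach and insufficient precision in the pure linear approximation approach, and to improve support for various transcendental function operations.
To achieve the above object, the present invention provides an apparatus for performing multiple transcendental function operations, the apparatus comprising a pre-processing unit group, a core unit, and a post-processing unit group, wherein:
the pre-processing unit group transforms the externally input argument a into coordinates x, y, an angle z, and remaining information k, and determines the operation mode (mode) used by the core unit;
the core unit performs a trigonometric or hyperbolic transform on the coordinates x, y and the angle z to obtain the transformed coordinates x′, y′ and angle z′, and outputs them to the post-processing unit group;
the post-processing unit group transforms the coordinates x′, y′ and the angle z′ received from the core unit, according to the remaining information k and the function f received from the pre-processing unit group, to obtain the output result c.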
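In the above scheme, the pre-processing unit group includes a selector 1 and a processor 2, and the post-processing unit group includes a first post-processing unit 4, a second post-processing unit 5, and a third post-processing unit 6. The selector 1 receives the externally input argument a and function f and determines which of four different operations should be taken, as follows:
I. If the argument a is so small that, under the format used for input or output, the result of a linear or quadratic approximation and the true value, each represented as a floating-point number, differ by no more than the last bit of the mantissa, the selector 1 outputs the argument a and the function f directly to the first post-processing unit 4 in the post-processing unit group; the first post-processing unit 4 derives the linear approximation for the argument a according to the function f, and performs addition and multiplication on the argument a to obtain the output result c;
II. If the argument a does not exceed the convergence domain of the core unit, so that the angle z = 0 (in default mode) or the ordinate y = 0 (in vector mode) can be reached within a finite number of steps and the argument a can be accepted directly by the corresponding mode of the core unit 3, the selector 1 derives from the argument a, according to the function f, the coordinates x, y, the angle z, and the mode (mode) used by the core unit, and outputs x, y, z, and mode to the core unit 3; the core unit 3 performs a trigonometric or hyperbolic transform on x, y, z based on mode, obtains the transformed coordinates x′, y′ and angle z′, and outputs them to the second post-processing unit 5 in the post-processing unit group; the second post-processing unit 5 obtains the output result c from the coordinates x′, y′ and angle z′ output by the core unit and the function f;
III. If the argument a cannot be accepted directly by the corresponding mode of the core unit 3, the selector 1 hands the argument a and the function f to the processor 2 for pre-processing; the processor 2 decomposes the argument a according to the function f to obtain the coordinates x, y, the angle z, the mode (mode) used by the core unit 3, and the remaining information k, where x, y, z, and mode are the same as in case II; the coordinates x, y, the angle z, and the mode are output to the core unit, and the remaining information k and the function f are output directly to the third post-processing unit 6 in the post-processing unit group; the core unit 3 performs a trigonometric or hyperbolic transform on x, y, z based on mode and outputs the resulting x′, y′, z′ to the third post-processing unit 6 in the post-processing unit group; the third post-processing unit 6 obtains the output result c from the x′, y′, z′ output by the core unit 3 and the k and function f given by the processor 2;
IV. If, under the format used for input or output, the true value for the argument a exceeds the maximum range representable in floating point, the selector 1 outputs the argument a and the function f directly.
In the above scheme, the maximum representable floating-point range referred to in case IV is, for IEEE 754 half-precision floating-point numbers, a maximum absolute value of (1024+1023)/1024 × 2^(30-15) = 65504.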
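The half-precision bound quoted above can be checked with a few lines of arithmetic. The sketch below assumes the garbled expression is meant to read (1024+1023)/1024 × 2^(30-15), that is, the largest significand times the largest finite exponent; the cross-check against the bit pattern 0x7BFF is an illustration and is not taken from the patent.

```python
import struct

# Largest finite binary16 value under the assumed reading:
# significand (1024 + 1023)/1024, unbiased exponent 30 - 15.
max_half = (1024 + 1023) / 1024 * 2 ** (30 - 15)

# Cross-check: 0x7BFF is sign 0, biased exponent 30, mantissa all ones.
decoded = struct.unpack("<e", (0x7BFF).to_bytes(2, "little"))[0]

print(max_half, decoded)
```

Both values should agree, which supports reading the bound as the usual largest finite half-precision number.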
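To achieve the above object, the present invention also provides a method for performing multiple transcendental function operations, the method comprising:
Step 1: The selector receives the input argument a and function f, and determines which of four different operations (I, II, III, or IV) should be taken;
Step 2: When operation III is taken, the processor applies a multiplication or shift transform to the input argument a and function f so that the core unit can accept it, and records the transform information k and sign for use by the third post-processing unit, where sign is valid only for some functions;
Step 3: When operation II or III is taken, the core unit implements the following four trigonometric or hyperbolic transforms using additions, subtractions, and shifts on three numbers, the abscissa x, the ordinate y, and the angle z:
Trigonometric default: (x, y, z) → (A(x cos z - y sin z), A(y cos z + x sin z), 0)
Trigonometric vector:
Figure PCTCN2016080690-appb-000001
Hyperbolic default: (x, y, z) → (B(x cosh z + y sinh z), B(y cosh z + x sinh z), 0)
Hyperbolic vector:
Figure PCTCN2016080690-appb-000002
In the above formulas, A and B are constants that depend on the number of iterations, and a shift is a multiplication by a power of 2;
Step 4a: When operation I is taken, the first post-processing unit computes a linear or quadratic approximation according to the input function f and outputs it;
Step 4b: When operation II or III is taken, the second post-processing unit applies addition, subtraction, multiplication by a constant, division, and shift operations to the output of the core unit, according to the input function f and the information provided by the processor of the pre-processing unit group, to obtain the output result c; the information provided by the processor is valid only for operation III.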
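As can be seen from the above technical solution, the apparatus and method provided by the present invention convert the evaluation of a transcendental function into the evaluation of a trigonometric or hyperbolic rotation transform. The iteration keeps the absolute value of the rotation angle at each step fixed, rotating in the reverse direction on overshoot and in the forward direction on undershoot, so that only a fixed series of coefficients needs to be stored. The angle sequence z_i is chosen so that tan z_i (tanh z_i in the hyperbolic case) is a power of 2, so the multiplications with the coordinates x, y can be implemented as simpler shifts. This reduces the time and power wasted on multiplications between variables while still reaching the required precision. The approach can compute all kinds of trigonometric, hyperbolic, exponential, and logarithmic functions, solves the problems of excessive overhead in the general-purpose processor approach and insufficient precision in the pure linear approximation approach, and effectively improves support for various transcendental function operations.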
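Brief Description of the Drawings
For a more complete understanding of the present invention and its advantages, reference is now made to the following description taken in conjunction with the accompanying drawings, in which:
FIG. 1 shows a block diagram of an apparatus for performing multiple transcendental function operations according to an embodiment of the present invention.
FIG. 2 shows how the core unit of FIG. 1 iteratively approximates the desired trigonometric function relationship in trigonometric mode.
FIG. 3 shows how the core unit of FIG. 1 iteratively approximates the desired hyperbolic function relationship in hyperbolic mode.
FIG. 4 shows a flow chart of a method for performing multiple transcendental function operations according to an embodiment of the present invention.
Table 1 shows the specific operations of each unit for each input function f and argument a according to an embodiment of the present invention (for 16-bit floating-point numbers). These operations can be implemented mainly with addition, subtraction, multiplication by constants, and shifts (multiplication or division by a power of 2), with only a small number of divisions and, for the quadratic approximation, multiplications with low precision requirements. If the precision required of the input and output differs, some of the ranges in Table 1 should be adjusted accordingly.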
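Detailed Description
Other aspects, advantages, and salient features of the present invention will become apparent to those skilled in the art from the following detailed description of exemplary embodiments of the invention taken in conjunction with the accompanying drawings.
In the present invention, the terms “include” and “comprise” and their derivatives mean inclusion without limitation; the term “or” is inclusive, meaning and/or.
In this specification, the various embodiments described below to explain the principles of the present invention are illustrative only and should not be construed in any way as limiting the scope of the invention. The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of exemplary embodiments of the invention as defined by the claims and their equivalents. The description includes various specific details to assist in that understanding, but these details are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the invention. In addition, descriptions of well-known functions and structures are omitted for clarity and conciseness. Throughout the drawings, the same reference numerals are used for similar functions and operations.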
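FIG. 1 shows a block diagram of an apparatus for performing multiple transcendental function operations according to an embodiment of the present invention. As shown in FIG. 1, the apparatus includes a pre-processing unit group (1, 2), a core unit 3, and a post-processing unit group (4, 5, 6), wherein:
the pre-processing unit group transforms the externally input argument a into coordinates x, y, an angle z, and remaining information k, and determines the operation mode (mode) used by the core unit;
the core unit 3 performs a trigonometric or hyperbolic transform on the coordinates x, y and the angle z to obtain the transformed coordinates x′, y′ and angle z′, and outputs them to the post-processing unit group;
the post-processing unit group transforms the coordinates x′, y′ and the angle z′ received from the core unit, according to the remaining information k and the function f received from the pre-processing unit group, to obtain the output result c.
The pre-processing unit group includes a selector 1 and a processor 2, and the post-processing unit group includes a first post-processing unit 4, a second post-processing unit 5, and a third post-processing unit 6; all of these can be implemented as hardware integrated circuits (for example, an application-specific integrated circuit, ASIC). The selector 1 receives the externally input argument a and function f and determines which of four different operations should be taken, as follows: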
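I. If the argument a is so small that, under the format used for input or output, the result of a linear or quadratic approximation and the true value, each represented as a floating-point number, differ by no more than the last bit of the mantissa, the selector 1 outputs the argument a and the function f directly to the first post-processing unit 4 in the post-processing unit group; the first post-processing unit 4 derives the linear approximation for the argument a according to the function f, and performs addition and multiplication on the argument a to obtain the output result c; see Table 1 for details;
II. If the argument a does not exceed the convergence domain of the core unit, so that the angle z = 0 (in default mode) or the ordinate y = 0 (in vector mode) can be reached within a finite number of steps and the argument a can be accepted directly by the corresponding mode of the core unit, the selector 1 derives from the argument a, according to the function f, the coordinates x, y, the angle z, and the mode (mode) used by the core unit 3, and outputs x, y, z, and mode to the core unit 3; the core unit 3 performs a trigonometric or hyperbolic transform on x, y, z based on mode, obtains the transformed coordinates x′, y′ and angle z′, and outputs them to the second post-processing unit 5 in the post-processing unit group; the second post-processing unit 5 obtains the output result c from the coordinates x′, y′ and angle z′ output by the core unit and the function f;
III. If the argument a cannot be accepted directly by the corresponding mode of the core unit, the selector 1 hands the argument a and the function f to the processor 2 for pre-processing; the processor 2 decomposes the argument a according to the function f to obtain the coordinates x, y, the angle z, the mode (mode) used by the core unit 3, and the remaining information k, where x, y, z, and mode are the same as in case II; the coordinates x, y, the angle z, and the mode are output to the core unit 3, and the remaining information k and the function f are output directly to the third post-processing unit 6 in the post-processing unit group; the core unit 3 performs a trigonometric or hyperbolic transform on x, y, z based on mode and outputs the resulting x′, y′, z′ to the third post-processing unit 6 in the post-processing unit group; the third post-processing unit 6 obtains the output result c from the x′, y′, z′ output by the core unit 3 and the k and function f given by the processor 2;
IV. If, under the format used for input or output, the true value for the argument a exceeds the maximum range representable in floating point (for example, for IEEE 754 half-precision floating-point numbers the maximum absolute value is (1024+1023)/1024 × 2^(30-15) = 65504), the selector 1 outputs the argument a and the function f directly (NaN). For IEEE 754 half-precision (binary16) floating-point numbers, the specific ranges that decide cases I, II, III, and IV for each input function, and the operations in the four cases, are given in Table 1.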
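The four-way decision made by the selector can be sketched in software for a single function. The example below takes f = sin as an illustration only; the actual decision ranges are those of Table 1, which is not reproduced in this text, so the small-argument threshold `small` is a hypothetical value chosen so that sin(a) and a agree to roughly half-precision accuracy. The convergence bound is the sum of the rotation angles arctan 2^(-i), which is about 1.743 rad.

```python
import math

HALF_MAX = 65504.0  # largest finite binary16 value
# Convergence domain of the circular iteration: sum of all rotation angles.
CONVERGENCE = sum(math.atan(2.0 ** -i) for i in range(16))

def select_case_sin(a, small=2.0 ** -5):
    """Return 'I', 'II', 'III' or 'IV' for f = sin.

    The threshold `small` is hypothetical; Table 1 holds the real ranges.
    """
    if math.isnan(a) or abs(a) > HALF_MAX:
        return "IV"   # outside the representable range: passed through
    if abs(a) < small:
        return "I"    # sin(a) is close enough to a: linear approximation
    if abs(a) <= CONVERGENCE:
        return "II"   # the core unit can accept the angle directly
    return "III"      # the processor must first reduce the argument
```

For instance, `select_case_sin(1.0)` falls in case II because 1.0 rad lies inside the convergence domain, while `select_case_sin(10.0)` falls in case III.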
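An embodiment of the present invention also provides a method for performing multiple transcendental function operations. FIG. 4 shows a flow chart of this method according to an embodiment of the present invention, which includes the following steps:
Step 1: The selector receives the input argument a and function f, and determines which of four different operations (I, II, III, or IV) should be taken;
I. If the argument a is so small that, under the format used for input or output, the result of a linear or quadratic approximation and the true value, each represented as a floating-point number, differ by no more than the last bit of the mantissa, the selector outputs the argument a and the function f directly to the first post-processing unit in the post-processing unit group; the first post-processing unit derives the linear approximation for the argument a according to the function f, and performs addition and multiplication on the argument a to obtain the output result c; see Table 1 for details;
II. If the argument a does not exceed the convergence domain of the core unit, so that the angle z = 0 (in default mode) or the ordinate y = 0 (in vector mode) can be reached within a finite number of steps and the argument a can be accepted directly by the corresponding mode of the core unit, the selector derives from the argument a, according to the function f, the coordinates x, y, the angle z, and the mode (mode) used by the core unit, and outputs x, y, z, and mode to the core unit; the core unit performs a trigonometric or hyperbolic transform on x, y, z based on mode, obtains the transformed coordinates x′, y′ and angle z′, and outputs them to the second post-processing unit in the post-processing unit group; the second post-processing unit obtains the output result c from the coordinates x′, y′ and angle z′ output by the core unit and the function f;
III. If the argument a cannot be accepted directly by the corresponding mode of the core unit, the selector hands the argument a and the function f to the processor for pre-processing; the processor decomposes the argument a according to the function f to obtain the coordinates x, y, the angle z, the mode (mode) used by the core unit, and the remaining information k, where x, y, z, and mode are the same as in case II; the coordinates x, y, the angle z, and the mode are output to the core unit, and the remaining information k and the function f are output directly to the third post-processing unit in the post-processing unit group; the core unit performs a trigonometric or hyperbolic transform on x, y, z based on mode and outputs the resulting x′, y′, z′ to the third post-processing unit in the post-processing unit group; the third post-processing unit obtains the output result c from the x′, y′, z′ output by the core unit and the k and function f given by the processor;
IV. If, under the format used for input or output, the true value for the argument a exceeds the maximum range representable in floating point (for example, for IEEE 754 half-precision floating-point numbers the maximum absolute value is (1024+1023)/1024 × 2^(30-15) = 65504), the selector outputs the argument a and the function f directly (NaN). For IEEE 754 half-precision (binary16) floating-point numbers, the specific ranges that decide cases I, II, III, and IV for each input function, and the operations in the four cases, are given in Table 1.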
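Step 2: When operation III is taken, the processor applies a multiplication or shift transform to the input argument a and function f so that the core unit can accept it, and records the transform information k and sign for use by the third post-processing unit, where sign is valid only for some functions; see Table 1 for specific examples of the processor's operation for each input function.
Step 3: When operation II or III is taken, the core unit implements the following four trigonometric or hyperbolic transforms using additions, subtractions, and shifts on three numbers, the abscissa x, the ordinate y, and the angle z:
Trigonometric default: (x, y, z) → (A(x cos z - y sin z), A(y cos z + x sin z), 0)
Trigonometric vector:
Figure PCTCN2016080690-appb-000003
Hyperbolic default: (x, y, z) → (B(x cosh z + y sinh z), B(y cosh z + x sinh z), 0)
Hyperbolic vector:
Figure PCTCN2016080690-appb-000004
In the above formulas, A and B are constants that depend on the number of iterations, and a shift is a multiplication by a power of 2; the transform is carried out by iteratively approximating the angle through which to rotate: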
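At step i the rotation angle is z_i, and the direction (forward or reverse) is decided as follows: in default mode the target is z = 0, so the rotation is forward when z > 0 and reverse when z < 0; in vector mode the target is y = 0, so the rotation is reverse when y > 0 and forward when y < 0;
FIG. 2 illustrates the principle of approximating the desired trigonometric transform with a series of fixed-angle trigonometric transforms (for convenience, the magnification of the coordinates by 1/cos z_i at each step is not shown). FIG. 3 illustrates the principle of approximating the desired hyperbolic transform with a series of fixed-angle hyperbolic transforms (for convenience, the magnification of the coordinates by 1/cosh z_i at each step is not shown).
Each iteration is equivalent to rotating by z_i in the forward or reverse direction and magnifying the coordinates by a factor of 1/cos z_i (1/cosh z_i in hyperbolic mode):
Trigonometric forward: (x, y, z) → ((x - y tan z_i), (y + x tan z_i), z - z_i)
Trigonometric reverse: (x, y, z) → ((x + y tan z_i), (y - x tan z_i), z + z_i)
Hyperbolic forward: (x, y, z) → ((x + y tanh z_i), (y + x tanh z_i), z - z_i)
Hyperbolic reverse: (x, y, z) → ((x - y tanh z_i), (y - x tanh z_i), z + z_i)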
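The four per-step formulas and the direction rule can be collected into one small function. This is a minimal floating-point sketch for illustration: the function name `step` and its parameters are hypothetical, and the multiplication by `t` stands in for the shift that the hardware would perform, since tan z_i (or tanh z_i) is a power of 2.

```python
import math

def step(x, y, z, i, mode="default", hyperbolic=False):
    """One micro-rotation by z_i, where tan z_i (or tanh z_i) = 2**-i.

    For the hyperbolic case i must be at least 1, since atanh(1) is undefined.
    """
    t = 2.0 ** -i                                   # a shift in hardware
    zi = math.atanh(t) if hyperbolic else math.atan(t)
    # Default mode drives z to 0; vector mode drives y to 0.
    forward = (z > 0) if mode == "default" else (y < 0)
    d = 1.0 if forward else -1.0
    s = 1.0 if hyperbolic else -1.0                 # sign of the cross term in x
    return x + s * d * y * t, y + d * x * t, z - d * zi
```

For example, `step(1.0, 0.0, 0.5, 1)` rotates forward by arctan(1/2) and returns coordinates (1.0, 0.5), which is the rotated unit vector magnified by 1/cos z_1, together with the remaining angle 0.5 - arctan(1/2).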
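So that each iteration can be implemented with additions, subtractions, and shifts only, and so that the iteration converges, z_i should take the following sequences:
Trigonometric: z_i = arctan 2^(-i), i = 0, 1, 2, ...
Hyperbolic: z_i = arctanh 2^(-j), j = i - k, when (3^(k+1) - 1)/2 + k ≤ i ≤ (3^(k+2) - 1)/2 + k + 1, i = 1, 2, 3, ...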
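The two angle sequences can be generated as follows. The hyperbolic index condition above is read here as the usual hyperbolic schedule in which the shifts j = 4, 13, 40, ... are each used twice (so that j = i - k after k repeats); that reading is an assumption, as are the function names.

```python
import math

def circular_angles(n):
    """z_i = arctan(2**-i) for i = 0 .. n-1."""
    return [math.atan(2.0 ** -i) for i in range(n)]

def hyperbolic_shifts(n):
    """Shift exponents j for steps i = 1 .. n, repeating j = 4, 13, 40, ..."""
    shifts, j, repeat = [], 1, 4
    while len(shifts) < n:
        shifts.append(j)
        if j == repeat and len(shifts) < n:
            shifts.append(j)          # the repeated step keeps the iteration convergent
            repeat = 3 * repeat + 1   # 4 -> 13 -> 40 -> ...
        j += 1
    return shifts

def hyperbolic_angles(n):
    """z_i = arctanh(2**-j) with j taken from hyperbolic_shifts."""
    return [math.atanh(2.0 ** -j) for j in hyperbolic_shifts(n)]
```

With this schedule, step i = 5 reuses shift j = 4 and step i = 15 reuses shift j = 13, which matches the lower bounds (3^(k+1) - 1)/2 + k = 5 and 15 for k = 1 and k = 2.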
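The specific number of iterations, that is, the maximum value of i, is chosen flexibly according to the precision of the floating-point numbers being processed; once the maximum number of iterations is chosen, the aforementioned constants can be computed:
Figure PCTCN2016080690-appb-000005
The selection among the above four modes for each input function is shown in Table 1.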
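The formula for the constants is available here only as an image, so the sketch below assumes that A is the accumulated magnification, that is, the product of the per-step factors 1/cos z_i described above. With that assumption, feeding x = 1/A, y = 0 into the trigonometric default mode yields cos z and sin z directly. The iteration count of 16 and the function names are illustrative choices.

```python
import math

N = 16  # number of iterations (an assumed choice)
ANGLES = [math.atan(2.0 ** -i) for i in range(N)]
# Assumed definition: A is the product of the per-step magnifications 1/cos z_i.
A = math.prod(1.0 / math.cos(z) for z in ANGLES)

def rotate_default(x, y, z):
    """Trigonometric default mode: drive z to 0 with shift-and-add steps."""
    for i, zi in enumerate(ANGLES):
        t = 2.0 ** -i
        if z > 0:   # forward rotation
            x, y, z = x - y * t, y + x * t, z - zi
        else:       # reverse rotation
            x, y, z = x + y * t, y - x * t, z + zi
    return x, y, z

def cos_sin(a):
    """Return (cos a, sin a) for |a| inside the convergence domain (about 1.74 rad)."""
    x, y, _ = rotate_default(1.0 / A, 0.0, a)  # pre-scaling by 1/A cancels the gain
    return x, y
```

Under these assumptions A comes out near 1.6468, and the residual angle after 16 steps is bounded by arctan 2^(-15), so the results agree with the true sine and cosine to roughly four decimal places.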
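Step 4a: When operation I is taken, the first post-processing unit computes a linear or quadratic approximation according to the input function f and outputs it; the specific operation of the first post-processing unit for each input function is shown in Table 1.
Step 4b: When operation II or III is taken, the second post-processing unit applies addition, subtraction, multiplication by a constant, division, and shift operations to the output of the core unit, according to the input function f and the information provided by the processor of the pre-processing unit group, to obtain the output result c; the information provided by the processor is valid only for operation III. The specific operation of the second post-processing unit for each input function is shown in Table 1.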
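The interplay of pre-processing, core unit, and post-processing in operation III can be illustrated with the exponential function. Table 1 is not reproduced in this text, so the decomposition used below (a = k ln 2 + r, followed by exp(a) = 2^k (cosh r + sinh r)) is an assumed example of a shift-friendly reduction rather than the patent's documented procedure; B is likewise assumed to be the product of the per-step factors 1/cosh z_i, and all function names are hypothetical.

```python
import math

def hyperbolic_shifts(n):
    """Shift exponents for n steps, repeating j = 4, 13, 40, ... (assumed schedule)."""
    shifts, j, repeat = [], 1, 4
    while len(shifts) < n:
        shifts.append(j)
        if j == repeat and len(shifts) < n:
            shifts.append(j)
            repeat = 3 * repeat + 1
        j += 1
    return shifts

SHIFTS = hyperbolic_shifts(16)
# Assumed definition: B is the product of the per-step factors 1/cosh z_i.
B = math.prod(1.0 / math.cosh(math.atanh(2.0 ** -j)) for j in SHIFTS)

def cosh_sinh(r):
    """Hyperbolic default mode with x = 1/B, y = 0: returns (cosh r, sinh r)."""
    x, y, z = 1.0 / B, 0.0, r
    for j in SHIFTS:
        t = 2.0 ** -j
        zi = math.atanh(t)
        if z > 0:   # forward rotation
            x, y, z = x + y * t, y + x * t, z - zi
        else:       # reverse rotation
            x, y, z = x - y * t, y - x * t, z + zi
    return x, y

def exp_case3(a):
    """exp(a) via pre-processing (record k), core unit, and post-processing (shift)."""
    k = round(a / math.log(2.0))    # processor: remaining information k
    r = a - k * math.log(2.0)       # reduced argument, |r| <= ln(2)/2
    c, s = cosh_sinh(r)             # core unit, hyperbolic default mode
    return math.ldexp(c + s, k)     # post-processing: add, then shift by k
```

The reduced argument r always lies well inside the hyperbolic convergence domain (about 1.118), so the core iteration converges for any a whose result is representable.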
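As the above description shows, the apparatus and method provided by the present invention convert the evaluation of a transcendental function into the evaluation of a trigonometric or hyperbolic rotation transform. The iteration keeps the absolute value of the rotation angle at each step fixed, rotating in the reverse direction on overshoot and in the forward direction on undershoot, so that only a fixed series of coefficients needs to be stored. The angle sequence z_i is chosen so that tan z_i (tanh z_i in the hyperbolic case) is a power of 2, so the multiplications with the coordinates x, y can be implemented as simpler shifts. This reduces the time and power wasted on multiplications between variables while still reaching the required precision. The approach can compute all kinds of trigonometric, hyperbolic, exponential, and logarithmic functions, solves the problems of excessive overhead in the general-purpose processor approach and insufficient precision in the pure linear approximation approach, and effectively improves support for various transcendental function operations.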
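The vector mode, whose closed-form result appears in this text only as a formula image, can be sketched in the same way. The code assumes the standard behaviour of this kind of iteration: driving y to 0 leaves the magnified length A·sqrt(x² + y²) in x and adds arctan(y/x) to z. That closed form, the definition of A, and the function names are assumptions for illustration.

```python
import math

N = 16  # number of iterations (an assumed choice)
ANGLES = [math.atan(2.0 ** -i) for i in range(N)]
A = math.prod(1.0 / math.cos(z) for z in ANGLES)  # assumed accumulated magnification

def rotate_vector(x, y, z=0.0):
    """Trigonometric vector mode: drive y to 0, accumulating the angle in z."""
    for i, zi in enumerate(ANGLES):
        t = 2.0 ** -i
        if y < 0:   # forward rotation
            x, y, z = x - y * t, y + x * t, z - zi
        else:       # reverse rotation
            x, y, z = x + y * t, y - x * t, z + zi
    return x, y, z

def magnitude_and_angle(x, y):
    """Return (sqrt(x*x + y*y), arctan(y/x)) for x > 0."""
    xr, _, zr = rotate_vector(x, y)
    return xr / A, zr   # dividing by A removes the accumulated magnification
```

Calling `magnitude_and_angle(3.0, 4.0)` should give a length close to 5 and an angle close to arctan(4/3), which is the kind of result a post-processing unit could then turn into an inverse trigonometric function value.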
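The processes or methods depicted in the preceding figures may be performed by processing logic comprising hardware (for example, circuitry or dedicated logic), firmware, software (for example, software embodied on a non-transitory computer-readable medium), or a combination of these. Although the processes or methods are described above in terms of certain sequential operations, it should be understood that some of the operations described may be performed in a different order. Moreover, some operations may be performed in parallel rather than sequentially.
In the foregoing specification, embodiments of the present invention have been described with reference to specific exemplary embodiments. It will be evident that various modifications may be made to the embodiments without departing from the broader spirit and scope of the invention as set forth in the appended claims. Accordingly, the specification and drawings are to be regarded as illustrative rather than restrictive.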

Claims (9)

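  1. An apparatus for performing multiple transcendental function operations, characterized in that the apparatus comprises a pre-processing unit group, a core unit, and a post-processing unit group, wherein:
    the pre-processing unit group is configured to transform the externally input argument a into coordinates x, y, an angle z, and remaining information k, and to determine the operation mode (mode) used by the core unit;
    the core unit is configured to perform a trigonometric or hyperbolic transform on the coordinates x, y and the angle z to obtain the transformed coordinates x′, y′ and angle z′, and to output them to the post-processing unit group;
    the post-processing unit group is configured to transform the coordinates x′, y′ and the angle z′ received from the core unit, according to the remaining information k and the function f received from the pre-processing unit group, to obtain the output result c.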
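  2. The apparatus for performing multiple transcendental function operations according to claim 1, characterized in that the pre-processing unit group comprises a selector (1) and a processor (2), and the post-processing unit group comprises a first post-processing unit (4), a second post-processing unit (5), and a third post-processing unit (6), wherein the selector (1) receives the externally input argument a and function f and determines which of four different operations should be taken, as follows:
    I. if the argument a is so small that, under the format used for input or output, the result of a linear or quadratic approximation and the true value, each represented as a floating-point number, differ by no more than the last bit of the mantissa, the selector (1) outputs the argument a and the function f directly to the first post-processing unit (4) in the post-processing unit group; the first post-processing unit (4) derives the linear approximation for the argument a according to the function f, and performs addition and multiplication on the argument a to obtain the output result c;
    II. if the argument a does not exceed the convergence domain of the core unit, so that the angle z = 0 (in default mode) or the ordinate y = 0 (in vector mode) can be reached within a finite number of steps and the argument a can be accepted directly by the corresponding mode of the core unit, the selector (1) derives from the argument a, according to the function f, the coordinates x, y, the angle z, and the mode (mode) used by the core unit, and outputs x, y, z, and mode to the core unit; the core unit (3) performs a trigonometric or hyperbolic transform on x, y, z based on mode, obtains the transformed coordinates x′, y′ and angle z′, and outputs them to the second post-processing unit (5) in the post-processing unit group; the second post-processing unit (5) obtains the output result c from the coordinates x′, y′ and angle z′ output by the core unit and the function f;
    III. if the argument a cannot be accepted directly by the corresponding mode of the core unit (3), the selector (1) hands the argument a and the function f to the processor (2) for pre-processing; the processor (2) decomposes the argument a according to the function f to obtain the coordinates x, y, the angle z, the mode (mode) used by the core unit (3), and the remaining information k, where x, y, z, and mode are the same as in case II; the coordinates x, y, the angle z, and the mode are output to the core unit (3), and the remaining information k and the function f are output directly to the third post-processing unit (6) in the post-processing unit group; the core unit (3) performs a trigonometric or hyperbolic transform on x, y, z based on mode and outputs the resulting x′, y′, z′ to the third post-processing unit (6) in the post-processing unit group; the third post-processing unit (6) obtains the output result c from the x′, y′, z′ output by the core unit and the k and function f given by the processor (2);
    IV. if, under the format used for input or output, the true value for the argument a exceeds the maximum range representable in floating point, the selector (1) outputs the argument a and the function f directly.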
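  3. The apparatus for performing multiple transcendental function operations according to claim 2, characterized in that the maximum representable floating-point range referred to in case IV is, for IEEE 754 half-precision floating-point numbers, a maximum absolute value of (1024+1023)/1024 × 2^(30-15) = 65504.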
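  4. A method for performing multiple transcendental function operations, applied to the apparatus according to any one of claims 1 to 3, characterized in that the method comprises:
    Step 1: the selector receives the input argument a and function f, and determines which of the four different operations I, II, III, or IV should be taken;
    Step 2: when operation III is taken, the processor applies a multiplication or shift transformation to the input argument a and function f so that the argument can be accepted by the core unit, and records the transformation information k and sign for use by the third post-processing unit, where sign is valid only for some functions;
    Step 3: when operation II or III is taken, the core unit implements the following four trigonometric or hyperbolic transformations on the three numbers abscissa x, ordinate y, and angle z through addition/subtraction and shift operations:
    Trigonometric default: (x, y, z) → (A(x cos z - y sin z), A(y cos z + x sin z), 0)
    Trigonometric vector:
    Figure PCTCN2016080690-appb-100001
    Hyperbolic default: (x, y, z) → (B(x cosh z + y sinh z), B(y cosh z + x sinh z), 0)
    Hyperbolic vector:
    Figure PCTCN2016080690-appb-100002
    In the above formulas, A and B are constants related to the number of iterations chosen, and a shift operation is a multiplication by a power of 2;
    Step 4a: when operation I is taken, the first post-processing unit computes a linear or quadratic approximation according to the input function f and outputs it;
    Step 4b: when operation II or III is taken, the second post-processing unit performs addition/subtraction, multiplication by a constant, division, and shift operations on the output of the core unit, according to the input function f and the information provided by the processor of the pre-processing unit group, to obtain the output result c, where the information provided by the processor of the pre-processing unit group is valid only in operation III.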
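The two vector-mode mappings in step 3 appear only as figure references in this text. For orientation, the textbook CORDIC vectoring results are written out below; they are an assumption drawn from the standard algorithm, using the same gains A and B and a sign convention in which the angle accumulates as y is driven to 0, and they may differ in notation from the referenced figures.

```latex
\begin{aligned}
\text{Trigonometric vector:}\quad (x, y, z) &\to \left(A\sqrt{x^{2}+y^{2}},\; 0,\; z+\arctan\frac{y}{x}\right)\\
\text{Hyperbolic vector:}\quad (x, y, z) &\to \left(B\sqrt{x^{2}-y^{2}},\; 0,\; z+\operatorname{arctanh}\frac{y}{x}\right)
\end{aligned}
```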
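  5. The method for performing multiple transcendental function operations according to claim 4, characterized in that operation I in step 1 is:
    I. if, under the format adopted for input or output, the argument a is so small that the result of a linear or quadratic approximation and the true value, each represented as a floating-point number, differ by no more than the last bit of the mantissa, the selector outputs the argument a and the function f directly to the first post-processing unit in the post-processing unit group; the first post-processing unit derives the linear approximation for the argument a according to the function f and performs addition and multiplication on the argument a to obtain the output result c.
  6. The method for performing multiple transcendental function operations according to claim 4, characterized in that operation II in step 1 is:
    II. if the argument a does not exceed the convergence domain of the core unit, so that angle z = 0 in default mode or ordinate y = 0 in vector mode can be reached within a finite number of steps, and the argument a can be accepted directly by the corresponding mode of the core unit, the selector derives from the argument a, according to the function f, the coordinates x, y, the angle z, and the mode mode adopted by the core unit, and outputs x, y, z, mode to the core unit; the core unit performs a trigonometric or hyperbolic transformation on x, y, z based on the mode mode, obtains the transformed coordinates x′, y′ and angle z′, and outputs them to the second post-processing unit in the post-processing unit group; the second post-processing unit obtains the output result c according to the coordinates x′, y′ and angle z′ output by the core unit and the function f.
  7. The method for performing multiple transcendental function operations according to claim 4, characterized in that operation III in step 1 is:
    III. if the argument a cannot be accepted directly by the corresponding mode of the core unit, the selector passes the argument a and the function f to the processor for pre-processing; the processor decomposes the argument a according to the function f to obtain the coordinates x, y, the angle z, the mode mode adopted by the core unit, and the other information k, where the coordinates x, y, the angle z, and the mode mode are as in II; the coordinates x, y, the angle z, and the mode mode are output to the core unit, and the other information k and the function f are output directly to the third post-processing unit in the post-processing unit group; the core unit performs a trigonometric or hyperbolic transformation on x, y, z based on the mode mode, and outputs the resulting x′, y′, z′ to the third post-processing unit in the post-processing unit group; the third post-processing unit obtains the output result c according to x′, y′, z′ output by the core unit, the information k given by the processor, and the function f.
  8. The method for performing multiple transcendental function operations according to claim 4, characterized in that operation IV in step 1 is:
    IV. if, under the format adopted for input or output, the true value for the argument a exceeds the maximum range representable by a floating-point number, the selector directly outputs the argument a and the function f.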
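  9. The method for performing multiple transcendental function operations according to claim 4, characterized in that, in step 3, the core unit implements the four trigonometric or hyperbolic transformations on the three numbers abscissa x, ordinate y, and angle z through addition/subtraction and shift operations, the transformation being completed by iteratively approaching the angle to be rotated:
    at step i the rotation angle is z_i, and the direction, forward or backward, is determined as follows: in default mode the target is z = 0, with a forward rotation when z > 0 and a backward rotation when z < 0; in vector mode the target is y = 0, with a backward rotation when y > 0 and a forward rotation when y < 0;
    each iteration is equivalent to rotating forward or backward by z_i and scaling the abscissa and ordinate by a factor of 1/cos z_i (1/cosh z_i in hyperbolic mode):
    Trigonometric forward: (x, y, z) → (x - y tan z_i, y + x tan z_i, z - z_i)
    Trigonometric backward: (x, y, z) → (x + y tan z_i, y - x tan z_i, z + z_i)
    Hyperbolic forward: (x, y, z) → (x + y tanh z_i, y + x tanh z_i, z - z_i)
    Hyperbolic backward: (x, y, z) → (x - y tanh z_i, y - x tanh z_i, z + z_i)
    so that each iteration can be implemented using only additions/subtractions and shifts and still converge, z_i should take the following sequences:
    Trigonometric: z_i = arctan 2^(-i), i = 0, 1, 2, ...
    Hyperbolic: z_i = arctanh 2^(-j), j = i - k, when (3^(k+1) - 1)/2 + k ≤ i ≤ (3^(k+2) - 1)/2 + k + 1, i = 1, 2, 3, ...
    the specific number of iterations, i.e. the maximum value of i, is chosen flexibly according to the precision of the floating-point numbers being processed; once the maximum number of iterations has been chosen, the aforementioned constants can be computed: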
    Figure PCTCN2016080690-appb-100003
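The hyperbolic angle sequence and the gains can be explored with a small Python sketch. It rests on assumptions beyond the claim text: the index rule j = i - k is read so that each i maps to exactly one k, which yields the conventional hyperbolic CORDIC schedule in which the shifts j = 4, 13, 40, ... are used twice; the gains are taken as A = ∏ 1/cos z_i and B = ∏ 1/cosh z_i, following the per-step scaling stated in the claim; and the function names are hypothetical. Floating-point multiplication by 2^(-j) stands in for the hardware shift.

```python
import math


def hyperbolic_indices(n):
    """Shift exponents j for the first n hyperbolic iterations (4, 13, 40, ... repeated)."""
    out, j, repeat = [], 1, 4
    while len(out) < n:
        out.append(j)
        if j == repeat and len(out) < n:
            out.append(j)              # repeated iteration needed for convergence
            repeat = 3 * repeat + 1    # next repeated index: 13, 40, ...
        j += 1
    return out


def gains(n):
    """Return (A, B) for n iterations, assuming A = prod 1/cos z_i, B = prod 1/cosh z_i."""
    a_gain = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(n))
    b_gain = math.prod(math.sqrt(1.0 - 4.0 ** -j) for j in hyperbolic_indices(n))
    return a_gain, b_gain


def cordic_exp(a, n=24):
    """Approximate e^a for |a| up to about 1.11 with hyperbolic default-mode iterations."""
    _, b_gain = gains(n)
    x, y, z = 1.0 / b_gain, 0.0, a     # pre-scale so the result is (cosh a, sinh a)
    for j in hyperbolic_indices(n):
        t = 2.0 ** -j                  # tanh z_i; a right shift by j in hardware
        zi = math.atanh(t)
        if z >= 0:                     # hyperbolic forward step
            x, y, z = x + y * t, y + x * t, z - zi
        else:                          # hyperbolic backward step
            x, y, z = x - y * t, y - x * t, z + zi
    return x + y                       # cosh a + sinh a = e^a
```

With this schedule, `hyperbolic_indices(6)` gives `[1, 2, 3, 4, 4, 5]`, and arguments outside the convergence range would first need the decomposition described for operation III.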
PCT/CN2016/080690 2016-04-29 2016-04-29 Apparatus and method for performing multiple transcendental function operations WO2017185334A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US16/097,603 US20190138570A1 (en) 2016-04-29 2016-04-29 Apparatus and Methods for Performing Multiple Transcendental Function Operations
PCT/CN2016/080690 WO2017185334A1 (zh) 2016-04-29 2016-04-29 Apparatus and method for performing multiple transcendental function operations
EP16899846.6A EP3451152B1 (en) 2016-04-29 2016-04-29 Device and method for performing multiple transcendental function operations

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/080690 WO2017185334A1 (zh) 2016-04-29 2016-04-29 Apparatus and method for performing multiple transcendental function operations

Publications (1)

Publication Number Publication Date
WO2017185334A1 (zh) 2017-11-02

Family

ID=60161749

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/080690 WO2017185334A1 (zh) 2016-04-29 2016-04-29 Apparatus and method for performing multiple transcendental function operations

Country Status (3)

Country Link
US (1) US20190138570A1 (zh)
EP (1) EP3451152B1 (zh)
WO (1) WO2017185334A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10725742B2 (en) 2018-06-05 2020-07-28 Texas Instruments Incorporated Transcendental function evaluation
CN118092854A (zh) * 2024-04-26 2024-05-28 中科亿海微电子科技(苏州)有限公司 Area-optimized serial floating-point transcendental function computing device and processor

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040015882A1 (en) * 2001-06-05 2004-01-22 Ping Tak Peter Tang Branch-free software methodology for transcendental functions
US20040215676A1 (en) * 2003-04-28 2004-10-28 Tang Ping T. Methods and apparatus for compiling a transcendental floating-point operation
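CN101630243A (zh) * 2009-08-14 2010-01-20 西北工业大学 Transcendental function device and method for implementing transcendental functions using the device
CN102722469A (zh) * 2012-05-28 2012-10-10 西安交通大学 Basic transcendental function operation method based on a floating-point arithmetic unit, and coprocessor thereof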

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3451152A4 *

Also Published As

Publication number Publication date
EP3451152A4 (en) 2019-12-04
US20190138570A1 (en) 2019-05-09
EP3451152A1 (en) 2019-03-06
EP3451152B1 (en) 2020-09-30

Similar Documents

Publication Publication Date Title
EP3144805B1 (en) Method and processing apparatus for performing arithmetic operation
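CN107305484B (zh) Nonlinear function operation device and method
JP5731937B2 (ja) Vector floating-point argument reduction
WO2017185414A1 (zh) Device and method for neural network operations supporting floating-point numbers with fewer bits
CN107305485B (zh) Device and method for performing addition of multiple floating-point numbers
CN107329732B (zh) Apparatus and method for performing multiple transcendental function operations
US8346831B1 (en) Systems and methods for computing mathematical functions
KR100465371B1 (ko) Floating-point ALU arithmetic device that performs addition and rounding operations simultaneously
US9151842B2 (en) Method and apparatus for time of flight sensor 2-dimensional and 3-dimensional map generation
WO2017185334A1 (zh) Apparatus and method for performing multiple transcendental function operations
CN107423026B (zh) Method and device for implementing sine and cosine function calculation
WO2022001722A1 (zh) Implementation method and device for computing a sine or cosine function
JP5733379B2 (ja) Processor and arithmetic method
CN108228135B (zh) Device for computing multiple transcendental functions
CN111984226A (zh) Cube root solving device and solving method based on hyperbolic CORDIC
KR102559930B1 (ko) Systems and methods for computing mathematical functions
US20170308357A1 (en) Logarithm and power (exponentiation) computations using modern computer architectures
KR20160120249A (ko) System and method for computing mathematical functions
CN115237372A (zh) Multiplication circuit, machine learning operation circuit, chip, and data processing method
CN113419779B (zh) Scalable multi-precision data pipeline system and method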
Ismail et al. Hybrid logarithmic number system arithmetic unit: A review
Zyuzina et al. Monotone approximation of a scalar conservation law based on the CABARET scheme in the case of a sign-changing characteristic field
US9875084B2 (en) Calculating trigonometric functions using a four input dot product circuit
Nguyen et al. A parallel pipeline CORDIC based on adaptive angle selection
Hsiao et al. Design of a low-cost floating-point programmable vertex processor for mobile graphics applications based on hybrid number system

Legal Events

Date Code Title Description
NENP Non-entry into the national phase (Ref country code: DE)
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16899846; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2016899846; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2016899846; Country of ref document: EP; Effective date: 20181129)