WO2022001722A1 - Implementation method and apparatus for calculating a sine or cosine function

Implementation method and apparatus for calculating a sine or cosine function

Info

Publication number
WO2022001722A1
Authority
WO
WIPO (PCT)
Prior art keywords
result
sine
constant
mapped
range
Prior art date
Application number
PCT/CN2021/101216
Other languages
English (en)
French (fr)
Inventor
万江华
龙科莅
陈虎
Original Assignee
湖南毂梁微电子有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 湖南毂梁微电子有限公司
Publication of WO2022001722A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/548 Trigonometric functions; Co-ordinate transformations

Definitions

  • The present invention relates to an implementation method and apparatus for calculating a sine or cosine function.
  • Sine and cosine functions are important building blocks in science, technology, and engineering applications. Compared with the basic elementary functions, they are relatively complex to implement, have long computation latency, and can be implemented in many different ways. Obtaining high-precision single-precision floating-point results that satisfy the IEEE-754 standard costs even more.
  • In conventional technology, the main implementation methods include the coordinate rotation method (CORDIC: Coordinate Rotation Digital Computer), the look-up table method, and the polynomial approximation method.
  • For polynomial approximation, estimation points are generally set in advance to predict and estimate the result. The closer the input value is to an estimation point, the closer the estimate at that point is to the ideal exact result, and the fewer iterations are needed to obtain a high-precision result.
  • The estimated values are typically stored in the corresponding circuit or device in the form of a coefficient table.
  • The present invention provides an implementation method and apparatus for calculating a sine or cosine function that is simple in principle, high in precision, modest in hardware resource consumption, low in computation latency, and relatively small in coefficient storage.
  • An implementation method for calculating a sine or cosine function, comprising: Step S1: map the input number into the range [0, π/4], and obtain the function type of the internal operation and the sign of the result; Step S2: obtain the constant result and the nearest estimation point from the number mapped into [0, π/4] in step S1; Step S3: from the function type of the internal operation and the nearest estimation point, obtain the estimated value of the corresponding sine or cosine function, that is, the coefficients required for the polynomial calculation; Step S4: obtain the distance from the number mapped into [0, π/4] in step S1 to the nearest estimation point; Step S5: complete the polynomial operation using the estimated value and that distance; Step S6: select among the number mapped into [0, π/4], the constant result, and the result of the polynomial operation, normalize and round the selected data, and output the final result.
  • Step S201: obtain the nearest estimation point from some of the bits of the number mapped into [0, π/4].
  • Step S202: obtain the constant result from the number mapped into [0, π/4]. If the mapped number equals one of several specific constants, the bit that marks the constant result as valid is set to 1, indicating that a valid constant result is output. The valid constant result mainly serves to eliminate the fluctuation of result values caused by the abrupt change of coefficient values within the very narrow range of numbers exactly midway between two adjacent estimation points.
  • In step S3, the coefficient lookup is performed using some of the bits of the nearest estimation point, together with the function-type identifier of the internal operation obtained in step S1, as the index.
  • In step S4, an addition operation is performed on the number mapped into [0, π/4] in step S1 and the nearest estimation point to obtain the distance between the two. The two operands in step S4 have the same exponent, so no exponent alignment is needed: the mantissas are added directly, and the exponent is then adjusted according to the number of leading zeros of the mantissa.
  • In step S5, the polynomial operation is completed from the input variable and the relevant coefficients, where the polynomial has the form:
  • Y = C0 + (C1 + C2 × D) × D (2), where C0, C1, and C2 denote coefficients and D is the distance obtained in step S4 from the number mapped into [0, π/4] to the nearest estimation point. D may be positive or negative, as determined by the relative position on the number line of the two numbers from which D is computed. The coefficients C0, C1, and C2 may be identical, differ only in the sign bit, or differ only by a constant multiple, which is computed by shifting.
  • Step S6 comprises: Step S601: if a constant result is detected at the input, select the constant result as the selected number; that is, if the valid identifier of the constant result is 1, the constant result is the selected number; Step S602: otherwise, when the number mapped into [0, π/4] in step S1 is less than or equal to the set threshold and the function type of the internal operation is sine, select the number mapped into [0, π/4] as the selected number; Step S603: otherwise, select the result of the polynomial calculation as the selected number; Step S604: use the sign of the result obtained in step S1 to give the selected number the correct sign bit; Step S605: round and normalize the selected number to obtain a single-precision floating-point output that conforms to the IEEE-754 standard.
  • The present invention further provides an apparatus for calculating a sine or cosine function, comprising: a preprocessing module, including a compression mapping circuit unit that maps input numbers into [0, π/4], an addition unit, a decoding unit, and a constant selection circuit; a coefficient look-up table, using a non-volatile storage device, for storing the estimated values corresponding to the estimation points, that is, the coefficient values used for the polynomial calculation; an operation module, including two multiply-add units and an associated left-shift unit, for completing the polynomial operation; a result selection module, including an addition unit, a constant-result valid-flag detection unit, and a basic selection circuit, in which the constant result is selected for output directly when the detection unit finds the valid flag set to 1, the addition unit compares the number compression-mapped into [0, π/4] with the set threshold and that number is output when it is below the threshold and the function type of the internal operation is sine, and the polynomial result is output otherwise; a normalization module, including a decoding unit and a left-shift unit; and a rounding module, including an adder and three comparison units.
  • In the preprocessing module, two constant multiplication units assist in completing the compression mapping of the input number into [0, π/4]; the addition unit computes the distance from the compression-mapped output value to the nearest estimation point; the decoding unit is a two-level decoder that decodes the exponent and some of the mantissa bits of the compression-mapped output value into the coefficient index value and obtains the nearest estimation point; the constant selection circuit includes a multiplexer that selects the constant result for output according to the compression-mapped value.
  • The implementation method and apparatus of the present invention map data into [0, π/4] for operation, output numbers close to 0 in [0, π/4] directly as the result, without calculation, when the internal operation is sine, and reuse coefficients between the internal sine and cosine polynomial operations at the same estimation point. Together these effectively reduce the size of the coefficient table while preserving the accuracy of the result.
  • By flexibly setting the distance between adjacent estimation points, and by outputting numbers close to 0 in [0, π/4] directly as the result when the internal operation is sine, the method and apparatus ensure that, within the input range where the error of the constant π is negligible, the output reaches the ideal precision of an IEEE-754 single-precision floating-point number (that is, a maximum error less than or equal to one unit in the last place of the single-precision mantissa).
  • By fine-tuning the bit widths of the input and output data of the multiplication and addition units in the operation module, fine-tuning the bit widths of some coefficients, and directly assigning a constant result to inputs that equal certain constants after mapping into [0, π/4], the method and apparatus keep the monotonicity of the output consistent with the original function without affecting the accuracy of the output.
  • While preserving the accuracy of the result, the method and apparatus include on the critical path 1 compression mapping operation, 1 two-level decoding operation, 1 coefficient lookup operation, 2 multiplication operations, 2 addition operations, 1 result selection operation, and 1 normalization operation, which meets low-latency application requirements.
  • FIG. 1 is a schematic flow chart of the method of the present invention.
  • FIG. 2 is a schematic diagram of the structure principle of the device of the present invention in a specific application example.
  • Detailed description of the preferred embodiments: the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be emphasized that the calculation of the sine or cosine function discussed here concerns a computer or chip, that is, how to compute a high-precision sine or cosine function in hardware while also optimizing hardware performance, reducing hardware overhead, and improving accuracy on the hardware platform.
  • Step S1: map the input number into the range [0, π/4], and obtain the function type of the internal operation and the sign of the result.
  • Step S2: obtain the constant result and the nearest estimation point from the number mapped into [0, π/4] in step S1.
  • Step S3: from the function type of the internal operation and the nearest estimation point, obtain the estimated value of the corresponding sine or cosine function, that is, the coefficients required for the polynomial calculation.
  • Step S4: obtain the distance from the number mapped into [0, π/4] in step S1 to the nearest estimation point.
  • Step S5: complete the polynomial operation using the estimated value and the distance from the number mapped into [0, π/4] to the nearest estimation point.
  • Step S6: select among the number mapped into [0, π/4], the constant result, and the result of the polynomial operation, normalize and round the selected data, and output the final result.
  • In step S103, the number is further mapped into the range [0, π/4], and the function-type identifier of the internal operation is obtained, with 0 denoting the sine operation and 1 the cosine operation.
  • Step S2 is implemented as follows: Step S201: obtain the nearest estimation point from some of the bits of the number mapped into [0, π/4]; Step S202: obtain the constant result from the number mapped into [0, π/4]. If the mapped number equals one of several specific constants, the bit that marks the constant result as valid is set to 1, indicating that a valid constant result is output. The valid constant result mainly serves to eliminate the fluctuation of result values caused by the abrupt change of coefficient values within the very narrow range of numbers exactly midway between two adjacent estimation points. If the result mapped into [0, π/4] is exactly equal to π/4, the constant result is given directly.
  • The distance between two adjacent estimation points has a minimum value, so the number of estimated values within a finite range is finite, and therefore so is the size of the corresponding coefficient table.
  • Step S3 includes: performing the coefficient lookup using some of the bits of the nearest estimation point, together with the function-type identifier of the internal operation obtained in step S1, as the index. This embodiment further defines the coefficients as follows:
  • The coefficients stored in the coefficient table are the values of the sine or cosine function at the estimation point, or constant multiples of those values.
  • The coefficients in the coefficient table can be reused as needed to further shrink the table; that is, some coefficients can serve both the internal sine operation and the internal cosine operation.
  • Step S4 is specifically: an addition operation is performed on the number mapped into [0, π/4] in step S1 and the nearest estimation point to obtain the distance between the two.
  • The two operands in step S4 have the same exponent, so no exponent alignment is needed: the mantissas are added directly, and the exponent is then adjusted according to the number of leading zeros of the mantissa.
  • Step S5 is specifically: completing the polynomial operation from the input variable and the relevant coefficients, where the polynomial has the form Y = C0 + (C1 + C2 × D) × D.
  • D is the distance obtained in step S4 from the number mapped into [0, π/4] to the nearest estimation point.
  • D may be positive or negative, as determined by the relative position on the number line of the two numbers from which D is computed.
  • The coefficients may be identical, differ only in the sign bit, or differ only by a constant multiple, and the constant multiple can be computed by shifting.
  • The coefficients can be reused across different function operations.
  • For example, a coefficient used for the sine operation can take part in the cosine operation as a coefficient; the two may differ only in the sign bit or by a constant multiple, and the constant multiple can be handled by shifting.
  • As another example, the coefficients C0 and C2 differ only by a constant multiple, which can be computed by a shift-and-add operation.
  • The polynomial operation shown in the above formula (1) is further explained as follows: because the signs of the coefficients in the polynomial operation may alternate, making the result values unstable, the precision of the results of some multiplications and additions needs to be adjusted while still reaching the required accuracy.
  • For example, one input bit width of multiplication unit #2 is set equal to one input bit width of multiplication unit #1, and the other input bit width of multiplication unit #2 is made 12 bits wider than the other input bit width of multiplication unit #1.
  • At the same time, the corresponding input and output signal bit widths of the adder and the corresponding output signal bit width of multiplier #1 are adjusted.
  • Step S6 comprises: Step S601: if a constant result is detected at the input, select the constant result as the selected number; that is, if the valid identifier of the constant result is 1, the constant result is the selected number; Step S602: otherwise, when the number mapped into [0, π/4] in step S1 is less than or equal to the set threshold and the function type of the internal operation is sine, select the number mapped into [0, π/4] as the selected number; Step S603: otherwise, select the result of the polynomial calculation as the selected number; Step S604: use the sign of the result obtained in step S1 to give the selected number the correct sign bit; Step S605: round and normalize the selected number to obtain a single-precision floating-point output conforming to the IEEE-754 standard.
  • This step rounds the mantissa of the result using round-to-even; when the rounding produces a carry out of the most significant bit, 1 is added to the exponent.
  • Some of the steps of the method can be executed in parallel or in a different order as required; for example, step (4) can be executed in parallel with steps (2) and (3), or before step (2).
  • As shown in FIG. 2, to carry out the above method, the present invention further provides an apparatus for calculating a sine or cosine function, which includes a preprocessing module comprising a compression mapping circuit unit that maps input numbers into [0, π/4], an addition unit, a decoding unit, and a constant selection circuit.
  • The coefficient look-up table uses a non-volatile storage device and stores the estimated values corresponding to the estimation points, that is, the coefficient values used for the polynomial calculation.
  • The operation module includes two multiply-add units and an associated left-shift unit for completing the polynomial operation; that is, it includes multiplication unit #1, multiplication unit #2, addition unit #1, and addition unit #2.
  • The result selection module includes an addition unit, a constant-result valid-flag detection unit, and a basic selection circuit.
  • The normalization module includes a decoding unit and a left-shift unit. The rounding module includes an adder and three comparison units.
  • The outputs of the preprocessing module are: the coefficient index value, the distance from the compression-mapped output value to the nearest estimation point, the compression-mapped output value itself, the sign of the result, the constant value obtained from the compression-mapped output value, and the function-type identifier of the internal operation.
  • The inputs of the preprocessing module are: the function-type identifier of the sine or cosine function to be calculated, and a single-precision floating-point number conforming to the IEEE-754 standard.
  • Two constant multiplication units assist in completing the compression mapping of the input number into [0, π/4].
  • The addition unit computes the distance from the compression-mapped output value to the nearest estimation point.
  • The decoding unit is a two-level decoder that decodes the exponent and some of the mantissa bits of the compression-mapped output value into the coefficient index value and obtains the nearest estimation point.
  • The constant selection circuit includes a multiplexer that selects the constant result for output according to the compression-mapped value.
  • The coefficient look-up table is a non-volatile storage device that stores the estimated values corresponding to the estimation points, that is, the coefficient values used for the polynomial calculation.
  • The table space required by the coefficient look-up table is no larger than 256 × 64 bits.
  • The operation module includes two multiply-add units for completing the polynomial operation.
  • The polynomial has the form Y = C0 + (C1 + C2 × D) × D.
  • While preserving the accuracy of the result, the operation module needs the signal bit widths of multiplier #2 to be adjusted: for example, one input signal bit width of multiplication unit #2 is set equal to the input bit width of multiplication unit #1, and the other input signal of multiplier #2 is made 12 bits wider than the other input signal of multiplication unit #1.
  • The input and output signal bit widths of adder #1 are adjusted correspondingly, so that, with the accuracy of the result preserved, the fluctuation of result values caused by the alternating signs of the coefficients during the polynomial operation is eliminated.
  • In the polynomial form shown, the operation inside the parentheses is performed by multiplier #1 and adder #1.
  • The operation outside the parentheses is performed by multiplier #2 and adder #2.
  • To eliminate the fluctuation of result values without affecting the accuracy of the result, one input bit width of multiplication unit #2 can be set equal to one input bit width of multiplication unit #1 and the other input bit width of multiplication unit #2 made 12 bits wider than the other input bit width of multiplication unit #1, with the corresponding adder input and output bit widths and the multiplier #1 output bit width adjusted at the same time.
  • In the result selection module, when the constant-result valid-flag detection unit detects that the valid flag is 1, the constant result is selected for output directly.
  • The addition unit compares the number compression-mapped into [0, π/4] with the set threshold; when that number is less than the set threshold and the function type of the internal operation is identified as sine, the compression-mapped number output by the preprocessing module is output.
  • The apparatus includes on the critical path 1 compression mapping operation, 1 two-level decoding operation, 1 coefficient lookup operation, 2 multiplication operations, 2 addition operations, 1 result selection operation, and 1 normalization operation, which meets low-latency application requirements.
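The selection priority described for the result selection module (constant result first, then the near-zero sine shortcut, then the polynomial result) can be sketched in Python as follows. This is a minimal illustration, not the patented circuit: the names `select_result`, `to_f32`, and the value of `THRESHOLD` are assumptions, since the text only speaks of a "set threshold", and Python's `struct` rounding stands in for the normalization and rounding modules.

```python
import struct

SIN, COS = 0, 1          # function-type identifiers: 0 = sine, 1 = cosine
THRESHOLD = 2.0 ** -12   # hypothetical value; the source only says "set threshold"


def to_f32(v):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", v))[0]


def select_result(r, func, negate, const_valid, const_value, poly_value):
    """Pick the output among constant, mapped input, and polynomial result."""
    if const_valid:                          # S601: a valid constant wins
        chosen = const_value
    elif func == SIN and r <= THRESHOLD:     # S602: near-zero sine shortcut
        chosen = r
    else:                                    # S603: polynomial result
        chosen = poly_value
    if negate:                               # S604: apply the sign from step S1
        chosen = -chosen
    return to_f32(chosen)                    # S605: round to single precision
```

The comparison uses "less than or equal to" as in step S602; the apparatus description words the same test as "less than".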

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

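An implementation method and apparatus for calculating a sine or cosine function. The method comprises: Step S1: mapping the input number into the range [0, π/4], and obtaining the function type of the internal operation and the sign of the result; Step S2: obtaining the constant result and the nearest estimation point from the number mapped into [0, π/4] in step S1; Step S3: obtaining the estimated value of the corresponding sine or cosine function, that is, the coefficients required for the polynomial calculation; Step S4: obtaining the distance from the number mapped into [0, π/4] in step S1 to the nearest estimation point; Step S5: completing the polynomial operation using the estimated value and the distance from the number mapped into [0, π/4] to the nearest estimation point; Step S6: selecting among the number mapped into [0, π/4], the constant result, and the result of the polynomial operation, and outputting the selection after normalization and rounding. The apparatus is used to implement the method. The method has the advantages of a simple principle, high precision, modest hardware resource consumption, low computation latency, and relatively small coefficient storage.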
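To make the six steps concrete, here is a minimal end-to-end sketch in Python. It is an illustration under stated assumptions, not the patented design: estimation points are taken to be the input with its float32 mantissa rounded to `K` bits, the coefficients are assumed to be second-order Taylor coefficients computed with `math.sin`/`math.cos` in place of the stored table, only π/4 is treated as a constant-result input, and `K`, `THRESHOLD`, and all function names are hypothetical.

```python
import math
import struct

SIN, COS = 0, 1          # function-type identifiers: 0 = sine, 1 = cosine
K = 7                    # hypothetical: mantissa bits kept for an estimation point
THRESHOLD = 2.0 ** -12   # hypothetical near-zero threshold for the sine shortcut


def to_f32(v):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", v))[0]


PI_4 = to_f32(math.pi / 4)


def reduce_argument(x, func):
    """S1: map x into [0, pi/4]; return (r, internal function type, negate)."""
    negate = func == SIN and x < 0          # sine is odd, cosine is even
    t = abs(x)
    if t > math.pi / 4:
        t = math.fmod(t, 2 * math.pi)       # periodicity
        if t > math.pi:                     # f(t) = -f(t - pi) for sin and cos
            t -= math.pi
            negate = not negate
        if t > math.pi / 2:                 # reflect about pi/2
            t = math.pi - t
            if func == COS:
                negate = not negate
        if t > math.pi / 4:                 # sin x = cos(pi/2 - x) and vice versa
            t = math.pi / 2 - t
            func = COS if func == SIN else SIN
    return t, func, negate


def nearest_point(r):
    """S2: round the float32 mantissa of r to K bits (ties go to the right)."""
    bits = struct.unpack("<I", struct.pack("<f", r))[0]
    drop = 23 - K
    bits = ((bits + (1 << (drop - 1))) >> drop) << drop
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def sincos_f32(x, func):
    r, f, negate = reduce_argument(to_f32(x), func)
    r = to_f32(r)
    p = nearest_point(r)
    s, c = math.sin(p), math.cos(p)         # S3: stand-in for the coefficient table
    c0, c1, c2 = (s, c, -s / 2) if f == SIN else (c, -s, -c / 2)
    d = r - p                               # S4: signed distance
    y = (c2 * d + c1) * d + c0              # S5: Y = C0 + (C1 + C2*D)*D
    if r == PI_4:                           # S6: constant result takes priority
        y = math.sqrt(0.5)
    elif f == SIN and r <= THRESHOLD:       # S6: near-zero sine shortcut
        y = r
    return to_f32(-y if negate else y)      # S6: sign and rounding
```

For example, `sincos_f32(2.0, SIN)` reduces 2.0 to about 0.4292 with the internal function switched to cosine. The sketch reduces the argument in double precision and makes no attempt to reproduce the bit-width tuning or the table size stated in the source.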

Description

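Implementation method and apparatus for calculating a sine or cosine function

Cross-reference to related applications: this application is based on, and claims priority to, the Chinese patent application filed on 2020-6-29 with application number 202010607527.6 and titled "一种用于计算正弦或余弦函数的实现方法及装置" (Implementation method and apparatus for calculating a sine or cosine function). The full text of that Chinese patent application is incorporated herein by reference as part of this application.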
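[Technical Field] The present invention relates in particular to an implementation method and apparatus for calculating a sine or cosine function.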
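[Background Art] Sine and cosine functions are important building blocks in science, technology, and engineering applications. Compared with the basic elementary functions, they are relatively complex to implement, have long computation latency, and can be implemented in many different ways. Obtaining high-precision single-precision floating-point results that satisfy the IEEE-754 standard costs even more. In conventional technology, the main implementation methods include the coordinate rotation method (CORDIC: Coordinate Rotation Digital Computer), the look-up table method, and the polynomial approximation method. Generally speaking, the coordinate rotation method converges slowly, so reaching high precision takes many iterations, which makes it slow and gives it high computation latency. The look-up table method is fast, but the table space it needs grows geometrically as the required precision increases. The polynomial approximation method converges quickly and can output high-precision results with low computation latency.

For polynomial approximation, in order to reduce the number of iterations while increasing the precision of the result, estimation points are generally set in advance to predict and estimate the result. The closer the input value is to an estimation point, the closer the estimate at that point is to the ideal exact result, and the fewer iterations are needed to obtain a higher-precision result. The estimated values are generally stored in the corresponding circuit or device in the form of a coefficient table. The main problems with this kind of method are: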
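(1) Within the input range where the effect of the error of the constant π on the result is negligible, when an ordinary polynomial approximation method is used to compute the sine or cosine function, it is difficult for output represented as an IEEE-754 single-precision floating-point number to reach the ideal precision (that is, a maximum error less than or equal to one unit in the last place of the single-precision mantissa).
(2) Within the input range where the effect of the error of the constant π on the result is negligible, guaranteeing that the output reaches the ideal precision with an ordinary polynomial approximation method requires an extremely large coefficient table.
(3) During the computation of the sine or cosine function, the output values fluctuate, both because the signs of the terms of the computed polynomial alternate and because two adjacent input numbers lying midway between two adjacent estimation points use different estimates in the polynomial operation. This makes it difficult to guarantee that the monotonicity of the output matches that of the original function.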
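[Summary of the Invention] The technical problem to be solved by the present invention is as follows: in view of the technical problems in the prior art, the present invention provides an implementation method and apparatus for calculating a sine or cosine function that is simple in principle, high in precision, modest in hardware resource consumption, low in computation latency, and relatively small in coefficient storage.

To solve the above technical problem, the present invention adopts the following technical solution. An implementation method for calculating a sine or cosine function comprises:
Step S1: mapping the input number into the range [0, π/4], and obtaining the function type of the internal operation and the sign of the result;
Step S2: obtaining the constant result and the nearest estimation point from the number mapped into [0, π/4] in step S1;
Step S3: from the function type of the internal operation and the nearest estimation point, obtaining the estimated value of the corresponding sine or cosine function, that is, the coefficients required for the polynomial calculation;
Step S4: obtaining the distance from the number mapped into [0, π/4] in step S1 to the nearest estimation point;
Step S5: completing the polynomial operation using the estimated value and the distance from the number mapped into [0, π/4] to the nearest estimation point;
Step S6: selecting among the number mapped into [0, π/4], the constant result, and the result of the polynomial operation, normalizing and rounding the selected data, and then outputting the final result.

As a further improvement of the method, step S1 is implemented as follows:
Step S101: if the absolute value of the input number lies within [0, π/4], the input function-type identifier is taken as the function type of the internal operation, and the sign of the final result is obtained from the sign of the input number and the function-type identifier;
Step S102: otherwise, the input data is mapped into [0, π/2] according to the periodicity and symmetry of the sine or cosine function, and the sign of the result is obtained;
Step S103: according to the transformation rules of the trigonometric sine and cosine functions and the type of function requested, the data is further mapped into [0, π/4] and the function-type identifier of the internal operation is obtained, using the mapping equation sin x = cos(π/2 - x) or cos x = sin(π/2 - x); the identifier is 0 for the sine operation and 1 for the cosine operation.

As a further improvement of the method, step S2 is implemented as follows:
Step S201: obtaining the nearest estimation point from some of the bits of the number mapped into [0, π/4];
Step S202: obtaining the constant result from the number mapped into [0, π/4]. If the mapped number equals one of several specific constants, the bit that marks the constant result as valid is set to 1, indicating that a valid constant result is output. The valid constant result mainly serves to eliminate the fluctuation of result values caused by the abrupt change of coefficient values within the very narrow range of numbers exactly midway between two adjacent estimation points. If the result mapped into [0, π/4] is exactly equal to π/4, the constant result is given directly.

As a further improvement of the method, in step S3 the coefficient lookup is performed using some of the bits of the nearest estimation point, together with the function-type identifier of the internal operation obtained in step S1, as the index.

As a further improvement of the method, in step S4 an addition operation is performed on the number mapped into [0, π/4] in step S1 and the nearest estimation point to obtain the distance between the two. The two operands in step S4 have the same exponent, so no exponent alignment is needed: the mantissas are added directly, and the exponent is then adjusted according to the number of leading zeros of the mantissa.

As a further improvement of the method, in step S5 the polynomial operation is completed from the input variable and the relevant coefficients, where the polynomial has the form: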
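Y = C0 + (C1 + C2 × D) × D (2)
where C0, C1, and C2 denote coefficients and D is the distance obtained in step S4 from the number mapped into [0, π/4] to the nearest estimation point. D may be positive or negative, as determined by the relative position on the number line of the two numbers from which D is computed. The coefficients C0, C1, and C2 may be identical, differ only in the sign bit, or differ only by a constant multiple, which is computed by shifting.

As a further improvement of the method, step S6 comprises:
Step S601: if a constant result is detected at the input, the constant result is selected as the selected number; that is, if the valid identifier of the constant result is 1, the constant result is the selected number;
Step S602: otherwise, when the number mapped into [0, π/4] in step S1 is less than or equal to the set threshold and the function type of the internal operation is sine, the number mapped into [0, π/4] is selected as the selected number;
Step S603: otherwise, the result of the polynomial calculation is selected as the selected number;
Step S604: the sign of the result obtained in step S1 is used to give the selected number the correct sign bit;
Step S605: the selected number is rounded and normalized to obtain a single-precision floating-point output conforming to the IEEE-754 standard.

As a further improvement of the method, rounding is applied to the mantissa of the result using round-to-even; when the rounding produces a carry out of the most significant bit, 1 is added to the exponent.

The present invention further provides an apparatus for calculating a sine or cosine function, comprising:
a preprocessing module, including a compression mapping circuit unit that maps input numbers into [0, π/4], an addition unit, a decoding unit, and a constant selection circuit;
a coefficient look-up table, using a non-volatile storage device, for storing the estimated values corresponding to the estimation points, that is, the coefficient values used for the polynomial calculation;
an operation module, including two multiply-add units and an associated left-shift unit, for completing the polynomial operation;
a result selection module, including an addition unit, a constant-result valid-flag detection unit, and a basic selection circuit. When the detection unit finds the valid flag set to 1, the constant result is selected for output directly. The addition unit compares the number compression-mapped into [0, π/4] with the set threshold; when that number is less than the set threshold and the function type of the internal operation is identified as sine, the compression-mapped number output by the preprocessing module is output. When neither the constant result nor the compression-mapped number is selected, the result of the polynomial calculation is selected for output;
a normalization module, including a decoding unit and a left-shift unit;
a rounding module, including an adder and three comparison units.

As a further improvement of the apparatus, in the preprocessing module two constant multiplication units assist in completing the compression mapping of the input number into [0, π/4]; the addition unit computes the distance from the compression-mapped output value to the nearest estimation point; the decoding unit is a two-level decoder that decodes the exponent and some of the mantissa bits of the compression-mapped output value into the coefficient index value and obtains the nearest estimation point; the constant selection circuit includes a multiplexer that selects the constant result for output according to the compression-mapped value.

Compared with the prior art, the advantages of the present invention are: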
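1. The method and apparatus map data into [0, π/4] for operation, output numbers close to 0 in [0, π/4] directly as the result, without calculation, when the internal operation is sine, and reuse coefficients between the internal sine and cosine polynomial operations at the same estimation point. Together these effectively reduce the size of the coefficient table while preserving the accuracy of the result.
2. By flexibly setting the distance between adjacent estimation points, and by outputting numbers close to 0 in [0, π/4] directly as the result when the internal operation is sine, the method and apparatus ensure that, within the input range where the error of the constant π is negligible, the output reaches the ideal precision of an IEEE-754 single-precision floating-point number (that is, a maximum error less than or equal to one unit in the last place of the single-precision mantissa).
3. By fine-tuning the bit widths of the input and output data of the multiplication and addition units in the operation module, fine-tuning the bit widths of some coefficients, and directly assigning a constant result to inputs that equal certain constants after mapping into [0, π/4], the method and apparatus keep the monotonicity of the output consistent with the original function without affecting the accuracy of the output.
4. While preserving the accuracy of the result, the method and apparatus include on the critical path 1 compression mapping operation, 1 two-level decoding operation, 1 coefficient lookup operation, 2 multiplication operations, 2 addition operations, 1 result selection operation, and 1 normalization operation, which meets low-latency application requirements.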
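The near-zero shortcut named in advantages 1 and 2 rests on the fact that, for small enough x, sin(x) and x round to the same single-precision value, because x - sin(x) is about x³/6 and falls below half a unit in the last place. The Python sketch below checks this numerically; the helper names are hypothetical, and the range tested (2⁻¹³ up to 2⁻¹²) is an illustrative choice, not the threshold used in the patent, which the text does not state.

```python
import math
import struct


def to_f32(v):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", v))[0]


def passthrough_is_exact(x):
    """True if sin(x), rounded to float32, equals x rounded to float32."""
    x32 = to_f32(x)
    return to_f32(math.sin(x32)) == x32


# 64 sample inputs spread over [2**-13, 2**-12)
samples = [2.0 ** -13 * (1 + i / 64) for i in range(64)]
all_small_inputs_pass = all(passthrough_is_exact(x) for x in samples)
```

For a larger input such as 0.5 the shortcut no longer holds, which is why the polynomial path is needed above the threshold.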
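[Brief Description of the Drawings] FIG. 1 is a schematic flow chart of the method of the present invention. FIG. 2 is a schematic structural diagram of the apparatus of the present invention in a specific application example.

[Detailed Description] The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be emphasized that the calculation of the sine or cosine function discussed here concerns a computer or chip, that is, how to compute a high-precision sine or cosine function in hardware while also optimizing hardware performance, reducing hardware overhead, and improving accuracy on the hardware platform. In other words, the calculation discussed here is not calculation in the abstract but calculation on a hardware platform. This is evident from the technical problems the invention sets out to solve: they do not arise from the mathematics alone, only when the sine or cosine function is computed on a hardware platform.

As shown in FIG. 1, the implementation method for calculating a sine or cosine function comprises:
Step S1: mapping the input number into the range [0, π/4], and obtaining the function type of the internal operation and the sign of the result;
Step S2: obtaining the constant result and the nearest estimation point from the number mapped into [0, π/4] in step S1;
Step S3: from the function type of the internal operation and the nearest estimation point, obtaining the estimated value of the corresponding sine or cosine function, that is, the coefficients required for the polynomial calculation;
Step S4: obtaining the distance from the number mapped into [0, π/4] in step S1 to the nearest estimation point;
Step S5: completing the polynomial operation using the estimated value and the distance from the number mapped into [0, π/4] to the nearest estimation point;
Step S6: selecting among the number mapped into [0, π/4], the constant result, and the result of the polynomial operation, normalizing and rounding the selected data, and then outputting the final result.

In a specific application example, step S1 is implemented as follows:
Step S101: if the absolute value of the input number lies within [0, π/4], the input function-type identifier is taken as the function type of the internal operation, and the sign of the final result is obtained from the sign of the input number and the function-type identifier;
Step S102: otherwise, the input data is mapped into [0, π/2] according to the periodicity and symmetry of the sine or cosine function, and the sign of the result is obtained;
Step S103: according to the transformation rules of the trigonometric sine and cosine functions and the type of function requested, the data is further mapped into [0, π/4] and the function-type identifier of the internal operation is obtained, using the mapping equation sin x = cos(π/2 - x) or cos x = sin(π/2 - x). The identifier is 0 for the sine operation and 1 for the cosine operation.

In a specific application example, step S2 is implemented as follows:
Step S201: obtaining the nearest estimation point from some of the bits of the number mapped into [0, π/4];
Step S202: obtaining the constant result from the number mapped into [0, π/4]. If the mapped number equals one of several specific constants, the bit that marks the constant result as valid is set to 1, indicating that a valid constant result is output. The valid constant result mainly serves to eliminate the fluctuation of result values caused by the abrupt change of coefficient values within the very narrow range of numbers exactly midway between two adjacent estimation points. If the result mapped into [0, π/4] is exactly equal to π/4, the constant result is given directly.

This embodiment further defines the estimation points as follows: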
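(a) All estimation points lie within [0, π/4];
(b) The estimation points are not uniformly distributed;
(c) The closer an estimation point is to 0, the more the distance between adjacent estimation points is reduced, according to the exponent of the single-precision floating-point output of the corresponding sine function, so that the result meets the precision requirement;
(d) The cosine operation uses the same estimation points as the sine operation;
(e) The distance between two adjacent estimation points has a minimum value, so the number of estimated values within a finite range is finite, and therefore so is the size of the corresponding coefficient table;
(f) The minimum distance between adjacent estimation points may vary up or down to some extent according to the actual design;
(g) When the number mapped into [0, π/4] is equally distant from two adjacent estimation points, the estimation point on the right side of the number line is taken as the nearest estimation point.
In a specific application example, the constant result in this embodiment is obtained as follows: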
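(a) For numbers lying in a very narrow region exactly midway between two adjacent estimation points, the choice of a different nearest estimation point causes the result values to fluctuate;
(b) Inputs that cause such fluctuation and that, after mapping into [0, π/4], equal certain constants are directly assigned a constant result as output.
In a specific application example, step S3 includes: performing the coefficient lookup using some of the bits of the nearest estimation point, together with the function-type identifier of the internal operation obtained in step S1, as the index. This embodiment further defines the coefficients as follows: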
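(a) The coefficients stored in the coefficient table are the values of the sine or cosine function at the estimation point, or constant multiples of those values;
(b) For the sine and cosine calculations, the coefficients in the coefficient table can be reused as needed to further shrink the table; that is, some coefficients can serve both the internal sine operation and the internal cosine operation.
In a specific application example, step S4 is specifically: an addition operation is performed on the number mapped into [0, π/4] in step S1 and the nearest estimation point to obtain the distance between the two. The two operands in step S4 have the same exponent, so no exponent alignment is needed: the mantissas are added directly, and the exponent is then adjusted according to the number of leading zeros of the mantissa.
In a specific application example, step S5 is specifically: completing the polynomial operation from the input variable and the relevant coefficients, where the polynomial has the form: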
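Y = C0 + (C1 + C2 × D) × D (3)
where C0, C1, and C2 denote coefficients and D is the distance obtained in step S4 from the number mapped into [0, π/4] to the nearest estimation point. D may be positive or negative, as determined by the relative position on the number line of the two numbers from which D is computed. The coefficients may be identical, differ only in the sign bit, or differ only by a constant multiple, and this constant multiple can be computed by shifting. The coefficients can be reused across different function operations. For example, a coefficient used for the sine operation can take part in the cosine operation as a coefficient; the two may differ only in the sign bit or by a constant multiple, and the constant multiple can be handled by shifting. As another example, the coefficients C0 and C2 differ only by a constant multiple, which can be computed by a shift-and-add operation.

The polynomial operation shown in the above formula (1) is further explained as follows. Because the signs of the coefficients in the polynomial operation may alternate, making the result values unstable, the precision of the results of some multiplications and additions needs to be adjusted while still reaching the required precision. To remove the fluctuation of result values caused by the alternating signs of the polynomial coefficients without affecting precision, the input and output bit widths of some multiplication and addition units are fine-tuned, which guarantees the monotonicity of the result. For example, without affecting the precision of the result, one input bit width of multiplication unit #2 is set equal to one input bit width of multiplication unit #1, and the other input bit width of multiplication unit #2 is made 12 bits wider than the other input bit width of multiplication unit #1; at the same time, the corresponding input and output signal bit widths of the adder and the corresponding output signal bit width of multiplier #1 are adjusted.

In a specific application example, step S6 comprises:
Step S601: if a constant result is detected at the input, the constant result is selected as the selected number; that is, if the valid identifier of the constant result is 1, the constant result is the selected number;
Step S602: otherwise, when the number mapped into [0, π/4] in step S1 is less than or equal to the set threshold and the function type of the internal operation is sine, the number mapped into [0, π/4] is selected as the selected number;
Step S603: otherwise, the result of the polynomial calculation is selected as the selected number;
Step S604: the sign of the result obtained in step S1 is used to give the selected number the correct sign bit;
Step S605: the selected number is rounded and normalized to obtain a single-precision floating-point output conforming to the IEEE-754 standard.
When rounding, this step rounds the mantissa of the result using round-to-even; when the rounding produces a carry out of the most significant bit, 1 is added to the exponent.

In a specific application example, some of the steps of the method can be executed in parallel or in a different order as required; for example, step (4) can be executed in parallel with steps (2) and (3), or before step (2).

As shown in FIG. 2, to carry out the above method, the present invention further provides an apparatus for calculating a sine or cosine function, comprising:
a preprocessing module, including a compression mapping circuit unit that maps input numbers into [0, π/4], an addition unit, a decoding unit, and a constant selection circuit;
a coefficient look-up table, using a non-volatile storage device, for storing the estimated values corresponding to the estimation points, that is, the coefficient values used for the polynomial calculation;
an operation module, including two multiply-add units and an associated left-shift unit, for completing the polynomial operation; that is, multiplication unit #1, multiplication unit #2, addition unit #1, and addition unit #2;
a result selection module, including an addition unit, a constant-result valid-flag detection unit, and a basic selection circuit;
a normalization module, including a decoding unit and a left-shift unit;
a rounding module, including an adder and three comparison units.

In a specific application example, the outputs of the preprocessing module are: the coefficient index value, the distance from the compression-mapped output value to the nearest estimation point, the compression-mapped output value itself, the sign of the result, the constant value obtained from the compression-mapped output value, and the function-type identifier of the internal operation. The inputs of the preprocessing module are: the function-type identifier of the sine or cosine function to be calculated, and a single-precision floating-point number conforming to the IEEE-754 standard. In the preprocessing module, two constant multiplication units assist in completing the compression mapping of the input number into [0, π/4]. The addition unit computes the distance from the compression-mapped output value to the nearest estimation point. The decoding unit is a two-level decoder that decodes the exponent and some of the mantissa bits of the compression-mapped output value into the coefficient index value and obtains the nearest estimation point. The constant selection circuit includes a multiplexer that selects the constant result for output according to the compression-mapped value.

In a specific application example, the coefficient look-up table is a non-volatile storage device that stores the estimated values corresponding to the estimation points, that is, the coefficient values used for the polynomial calculation. The table space required by the coefficient look-up table is no larger than 256 × 64 bits.

In a specific application example, the operation module includes two multiply-add units for completing the polynomial operation. The polynomial has the form: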
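$$Y = C_0 + (C_1 + C_2 \times D) \times D$$

where $C_0$, $C_1$ and $C_2$ denote the coefficients, and $D$ is the distance, obtained in step (4), from the number mapped into [0, π/4] to the nearest estimate point.

In the operation module, while the accuracy of the result is guaranteed, the bit widths of the input and output signals of multiplier #2 need to be adjusted. For example, one input signal of multiplication unit #2 is given the same bit width as an input of multiplication unit #1, the other input signal of multiplier #2 is made 12 bits wider than the other input signal of multiplication unit #1, and the input and output signal widths of adder #1 are adjusted correspondingly. With result accuracy preserved, this eliminates the fluctuation in result values caused by the alternating signs of the coefficients during the polynomial operation.

For example, in the polynomial form shown, the operation inside the brackets is performed by multiplier #1 and adder #1, and the operation outside the brackets by multiplier #2 and adder #2. To eliminate the fluctuation in result values without affecting result accuracy, one input of multiplication unit #2 can be given the same bit width as one input of multiplication unit #1, the other input of multiplication unit #2 made 12 bits wider than the other input of multiplication unit #1, and at the same time the corresponding input and output signal widths of the adder and the corresponding output signal width of multiplier #1 adjusted.

In a specific application example, in the result selection module, when the constant-result valid-flag detection unit detects that the valid flag is 1, the constant result is selected for output directly. The addition unit compares the input number compression-mapped into [0, π/4] with the set threshold; when that number is smaller than the set threshold and the internal function type identifier indicates sine, the compression-mapped number output by the preprocessing module is output. When neither the constant result nor the number compression-mapped into [0, π/4] is selected, the result of the polynomial calculation is selected for output.

From the above, the critical path of the apparatus contains 1 compression-mapping operation, 1 two-level decoding operation, 1 coefficient lookup operation, 2 multiplication operations, 2 addition operations, 1 result selection operation and 1 normalization operation, which meets the requirements of low-latency applications.

The above are only preferred embodiments of the invention, and the scope of protection of the invention is not limited to the embodiments described; all technical solutions under the concept of the invention fall within its scope of protection. It should be noted that, for a person of ordinary skill in the art, improvements and refinements made without departing from the principle of the invention shall also be regarded as within the scope of protection of the invention.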
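The distance computation and the polynomial Y = C0 + (C1 + C2 × D) × D described above can be illustrated with a short floating-point sketch. It is not the patented circuit: the uniform grid spacing `STEP`, the helper names, and the use of second-order Taylor coefficients are all assumptions made for illustration, since the text does not say how the table entries are derived. Taylor coefficients are chosen only because they show the kind of reuse the description mentions (the sine C0 equals the cosine C1 up to sign, and C2 is C0 scaled by a constant).

```python
import math

STEP = 2.0 ** -7  # hypothetical spacing between estimate points (assumption)


def nearest_point(x):
    """Closest estimate point to x on a uniform grid (assumed layout)."""
    return round(x / STEP) * STEP


def coefficients(point, is_cos):
    """Second-order Taylor coefficients at `point` (an assumption, not the patent's table).

    sine:   C0 = sin p,  C1 =  cos p,  C2 = -sin(p) / 2
    cosine: C0 = cos p,  C1 = -sin p,  C2 = -cos(p) / 2
    """
    s, c = math.sin(point), math.cos(point)
    if is_cos:
        return c, -s, -0.5 * c
    return s, c, -0.5 * s


def poly_eval(x, is_cos):
    """Evaluate Y = C0 + (C1 + C2 * D) * D around the nearest estimate point."""
    p = nearest_point(x)
    d = x - p  # signed distance D; negative when x lies below the point
    c0, c1, c2 = coefficients(p, is_cos)
    # Inner multiply-add (C1 + C2*D), then outer multiply-add (C0 + ... * D).
    return c0 + (c1 + c2 * d) * d
```

With this assumed spacing, |D| is at most 2^-8, so the neglected cubic term is bounded by |D|³/6, roughly 1e-8. A hardware table would hold fixed-width coefficients rather than calling `math.sin`.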

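The selection order of steps S601 to S604 and the round-to-nearest-even rule with exponent carry can be sketched as follows. This is a behavioural model under stated assumptions: the function names, the default `threshold` of 2^-12, the integer mantissa with `guard_bits` extra low bits, and the 24-bit significand width are hypothetical and are not taken from the text. Passing the mapped number through for small sine inputs presumably relies on sin x ≈ x, which is an interpretation rather than something the description states.

```python
def select_result(mapped, is_cos, negative, const_valid, const_value,
                  poly_value, threshold=2.0 ** -12):
    """Pick the output following S601 to S604 (threshold value is an assumption)."""
    if const_valid:                           # S601: valid flag set, take the constant
        chosen = const_value
    elif mapped <= threshold and not is_cos:  # S602: tiny argument, internal sine
        chosen = mapped
    else:                                     # S603: polynomial result
        chosen = poly_value
    return -chosen if negative else chosen    # S604: apply the sign from step S1


def round_to_nearest_even(mantissa, guard_bits, exponent, width=24):
    """Round an integer mantissa carrying `guard_bits` extra low bits.

    Returns (rounded_mantissa, exponent). A carry out of the top bit
    shifts the mantissa right by one and increments the exponent.
    """
    kept = mantissa >> guard_bits
    rem = mantissa & ((1 << guard_bits) - 1)
    half = 1 << (guard_bits - 1)
    # Round up above the halfway point, or at the halfway point when kept is odd.
    if rem > half or (rem == half and (kept & 1)):
        kept += 1
    if kept >> width:  # carry out of the most significant bit
        kept >>= 1
        exponent += 1
    return kept, exponent
```

For instance, a mantissa of all ones followed by guard bits `100` is an exact tie with an odd kept value, so it rounds up, overflows the 24-bit width, and the exponent is incremented.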
Claims

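1. An implementation method for calculating a sine or cosine function, characterized in that the steps comprise:
Step S1: map the input number into the range [0, π/4], and obtain the internal function type and the sign of the result;
Step S2: obtain a constant result and the nearest estimate point from the number mapped into [0, π/4] in step S1;
Step S3: obtain, from the internal function type and the nearest estimate point, the corresponding estimated value of the sine or cosine function, that is, the coefficients required for the polynomial calculation;
Step S4: obtain the distance from the number mapped into [0, π/4] in step S1 to the nearest estimate point;
Step S5: complete the polynomial operation using the estimated value and the distance from the number mapped into [0, π/4] to the nearest estimate point;
Step S6: select among the number mapped into [0, π/4], the constant result and the result of the polynomial operation, normalize and round the selected data, and then output the final result.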
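2. The implementation method for calculating a sine or cosine function according to claim 1, characterized in that step S1 is carried out as follows:
Step S101: if the absolute value of the input number lies in [0, π/4], take the input function type identifier as the internal function type, and obtain the sign of the final result from the sign of the input number and the function type identifier;
Step S102: otherwise, map the input data into [0, π/2] according to the periodicity and symmetry of the sine or cosine function, and obtain the sign of the result;
Step S103: according to the transformation principle of the sine and cosine functions and the type of the sine or cosine function sought, further map the data into [0, π/4] and obtain the internal function type identifier, the mapping equation being sin x = cos(π/2 − x) or cos x = sin(π/2 − x); the number is thus further mapped into the range [0, π/4] and the internal function type identifier is obtained, with 0 identifying the sine operation and 1 the cosine operation.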
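3. The implementation method for calculating a sine or cosine function according to claim 1, characterized in that step S2 is carried out as follows:
Step S201: obtain the nearest estimate point from some of the bits of the number mapped into [0, π/4];
Step S202: obtain the constant result from the number mapped into [0, π/4]: if the mapped number equals one of several specific constants, the bit that marks the constant result as valid is set to 1, indicating that a valid constant result is output; the valid constant result mainly serves to eliminate the fluctuation in result values caused by the abrupt change of coefficient values within the extremely narrow range exactly midway between two adjacent estimate points; for the result mapped into [0, π/4] in step S201, if it is exactly equal to π/4, the constant result is given directly.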
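4. The implementation method for calculating a sine or cosine function according to claim 1, characterized in that, in step S3, some of the bits of the nearest estimate point and the internal function type identifier obtained in step S1 are used as the index to look up the coefficients.
5. The implementation method for calculating a sine or cosine function according to claim 1, characterized in that, in step S4, the number mapped into [0, π/4] in step S1 and the nearest estimate point are combined in an addition operation to obtain the distance between them; the two operands in step S4 have the same exponent, so no exponent alignment is needed, the mantissas are added directly, and the exponent is then adjusted according to the number of leading zeros in the mantissa.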
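6. The implementation method for calculating a sine or cosine function according to claim 1, characterized in that, in step S5, the polynomial operation is completed from the input variable and the related coefficients, the polynomial having the following form:

$$Y = C_0 + (C_1 + C_2 \times D) \times D \tag{1}$$

where $C_0$, $C_1$ and $C_2$ denote the coefficients, and $D$ is the distance, obtained in step S4, from the number mapped into [0, π/4] to the nearest estimate point; $D$ may be positive or negative, depending on the relative position on the number line of the two numbers from which it is computed; the coefficients $C_0$, $C_1$ and $C_2$ may be identical, differ only in the sign bit, or differ only by a constant multiple, the constant multiple being computed by shifting.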
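7. The implementation method for calculating a sine or cosine function according to claim 1, characterized in that step S6 comprises:
Step S601: if a constant result is detected at the input, select the constant result as the selected number; that is, if the valid flag of the constant result is 1, the constant result is the selected number;
Step S602: otherwise, when the number mapped into [0, π/4] in step S1 is less than or equal to a set threshold and the internal function type is sine, select the input number mapped into [0, π/4] as the selected number;
Step S603: otherwise, select the result of the polynomial calculation as the selected number;
Step S604: use the sign of the result obtained in step S1 to give the selected number the correct sign bit;
Step S605: round and normalize the selected number to obtain a single-precision floating-point output conforming to the IEEE-754 standard.
8. The implementation method for calculating a sine or cosine function according to claim 7, characterized in that, when rounding, the mantissa of the result is rounded using round-to-nearest-even; when a carry out of the most significant bit occurs, the exponent is incremented by 1.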
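9. An apparatus for calculating a sine or cosine function, characterized by comprising:
a preprocessing module, comprising a compression-mapping circuit unit that maps the input number into [0, π/4], an addition unit, a decoding unit, and a constant selection circuit;
a coefficient lookup table, implemented as a non-volatile storage device, for storing the estimated values corresponding to the estimate points, that is, the coefficient values used in the polynomial calculation;
an operation module, comprising two multiply-add units and the associated left-shift unit, for completing the polynomial operation;
a result selection module, comprising an addition unit, a constant-result valid-flag detection unit, and a basic selection circuit; when the constant-result valid-flag detection unit detects that the valid flag is 1, the constant result is selected for output directly; the addition unit compares the input number compression-mapped into [0, π/4] with a set threshold, and when that number is smaller than the set threshold and the internal function type identifier indicates sine, the compression-mapped number output by the preprocessing module is output; in the result selection module, when neither the constant result nor the number compression-mapped into [0, π/4] is selected, the result of the polynomial calculation is selected for output;
a normalization module, comprising a decoding unit and a left-shift unit;
a rounding module, comprising an adder and three comparison units.
10. The apparatus for calculating a sine or cosine function according to claim 9, characterized in that, in the preprocessing module, two constant multiplication units assist in the compression mapping of the input number into [0, π/4]; the addition unit computes the distance from the compression-mapped output value to the nearest estimate point; the decoding unit is a two-level decoder that decodes the exponent and some of the mantissa bits of the compression-mapped output value into the coefficient index value and obtains the nearest estimate point; the constant selection circuit includes a multiplexer that selects the constant result to output according to the compression-mapped value.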
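The mapping recited in claim 2 (steps S101 to S103) can be sketched in software as follows. This is a minimal floating-point illustration, not the claimed compression-mapping circuit: the function name `reduce_to_octant`, the use of `math.fmod` for the period, and the boolean flags are assumptions. The flag `is_cos` plays the role of the internal function type identifier (False for sine, True for cosine, matching the 0/1 convention in the claim).

```python
import math


def reduce_to_octant(x, want_cos):
    """Map x into [0, pi/4]; return (mapped, is_cos, negative).

    `is_cos` is the internal function type after mapping and
    `negative` is the sign to apply to the final result.
    """
    # Sine is odd, cosine is even: only sine inherits the input sign.
    negative = (x < 0) and not want_cos
    x = math.fmod(abs(x), 2.0 * math.pi)  # periodicity

    if x > math.pi:                       # f(x) = -f(x - pi) for both sin and cos
        x -= math.pi
        negative = not negative
    if x > math.pi / 2.0:                 # sin(pi - x) = sin x, cos(pi - x) = -cos x
        x = math.pi - x
        if want_cos:
            negative = not negative

    is_cos = want_cos
    if x > math.pi / 4.0:                 # sin x = cos(pi/2 - x), cos x = sin(pi/2 - x)
        x = math.pi / 2.0 - x
        is_cos = not is_cos
    return x, is_cos, negative
```

For example, calling `reduce_to_octant(2.0, False)` maps sin(2) to a cosine evaluation at roughly 0.429 with a positive sign. For very large inputs a double-precision `fmod` against a rounded 2π loses accuracy, which is one reason a hardware design would use a dedicated reduction step.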
PCT/CN2021/101216 2020-06-29 2021-06-21 Implementation method and apparatus for calculating a sine or cosine function WO2022001722A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010607527.6A CN111831257A (zh) 2020-06-29 2020-06-29 Implementation method and apparatus for calculating a sine or cosine function
CN202010607527.6 2020-06-29

Publications (1)

Publication Number Publication Date
WO2022001722A1 true WO2022001722A1 (zh) 2022-01-06

Family

ID=72899627

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/101216 WO2022001722A1 (zh) 2020-06-29 2021-06-21 Implementation method and apparatus for calculating a sine or cosine function

Country Status (2)

Country Link
CN (1) CN111831257A (zh)
WO (1) WO2022001722A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116720554A (zh) * 2023-08-11 2023-09-08 南京师范大学 Neuron circuit implementation method using multi-segment linear fitting based on FPGA technology

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111831257A (zh) * 2020-06-29 2020-10-27 湖南毂梁微电子有限公司 Implementation method and apparatus for calculating a sine or cosine function

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
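JP2001209525A (ja) * 2000-01-28 2001-08-03 Matsushita Electric Ind Co Ltd Trigonometric function generation device
CN102741706A (zh) * 2009-12-16 2012-10-17 泰勒斯公司 Method for georeferencing an image region
CN104536720A (zh) * 2014-12-22 2015-04-22 浙江中控研究院有限公司 FPGA-based method and system for measuring and calculating the trigonometric function values of an angle under test
US20180217814A1 (en) * 2017-02-02 2018-08-02 Vivante Corporation Systems And Methods For Computing Mathematical Functions
CN111831257A (zh) * 2020-06-29 2020-10-27 湖南毂梁微电子有限公司 Implementation method and apparatus for calculating a sine or cosine function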

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
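JP2001209525A (ja) * 2000-01-28 2001-08-03 Matsushita Electric Ind Co Ltd Trigonometric function generation device
CN102741706A (zh) * 2009-12-16 2012-10-17 泰勒斯公司 Method for georeferencing an image region
CN104536720A (zh) * 2014-12-22 2015-04-22 浙江中控研究院有限公司 FPGA-based method and system for measuring and calculating the trigonometric function values of an angle under test
US20180217814A1 (en) * 2017-02-02 2018-08-02 Vivante Corporation Systems And Methods For Computing Mathematical Functions
CN111831257A (zh) * 2020-06-29 2020-10-27 湖南毂梁微电子有限公司 Implementation method and apparatus for calculating a sine or cosine function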

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
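CN116720554A (zh) * 2023-08-11 2023-09-08 南京师范大学 Neuron circuit implementation method using multi-segment linear fitting based on FPGA technology
CN116720554B (zh) * 2023-08-11 2023-11-14 南京师范大学 Neuron circuit implementation method using multi-segment linear fitting based on FPGA technology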

Also Published As

Publication number Publication date
CN111831257A (zh) 2020-10-27

Similar Documents

Publication Publication Date Title
US5404324A (en) Methods and apparatus for performing division and square root computations in a computer
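WO2022001722A1 (zh) Implementation method and apparatus for calculating a sine or cosine function
CN107305485B (zh) Device and method for performing addition of multiple floating-point numbers
CN106202890B (zh) Fully pipelined floating-point trigonometric function device combining the CORDIC and Taylor algorithms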
US20160313976A1 (en) High performance division and root computation unit
WO2022052625A1 (zh) Fixed-point and floating-point converter, processor, method and storage medium
US8788561B2 (en) Arithmetic circuit, arithmetic processing apparatus and method of controlling arithmetic circuit
US20120011185A1 (en) Rounding unit for decimal floating-point division
CN108228136B (zh) Method and device for logarithmic function calculation based on an optimized lookup-table method
US20090172069A1 (en) Method and apparatus for integer division
US20080281890A1 (en) Fast correctly-rounding floating-point conversion
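CN107423026B (zh) Implementation method and device for sine and cosine function calculation
CN102566965B (zh) Floating-point logarithm operation device with flat error
CN107015783B (zh) Implementation method and device for floating-point angle compression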
Murillo et al. A suite of division algorithms for posit arithmetic
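CN117032625A (zh) Low-latency hardware implementation method for a floating-point square root function
CN114691082A (zh) Multiplier circuit, chip, electronic device and computer-readable storage medium
KR101922462B1 (ko) Data processing apparatus and method for performing a shift function on a binary number
CN1936830A (zh) Digital implementation of fractional exponentiation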
US20040254973A1 (en) Rounding mode insensitive method and apparatus for integer rounding
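CN113721885B (zh) Divider based on the CORDIC algorithm
JPH086766A (ja) Sine and cosine arithmetic device
WO2023124235A1 (zh) Multi-input floating-point number processing method, apparatus, processor and computer device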
US20070143389A1 (en) Efficient error-check and exact-check for newton-raphson divide and square-root operations
Villalba et al. Double-residue modular range reduction for floating-point hardware implementations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21831551

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21831551

Country of ref document: EP

Kind code of ref document: A1