WO2022047873A1 - Division operation method, apparatus, electronic device and medium - Google Patents

Division operation method, apparatus, electronic device and medium

Info

Publication number
WO2022047873A1
Authority
WO
WIPO (PCT)
Prior art keywords
error
iteration
division
iteration value
rounding operation
Prior art date
Application number
PCT/CN2020/117978
Other languages
English (en)
French (fr)
Inventor
吴锋
张立勇
钟万勰
Original Assignee
大连理工大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 大连理工大学
Publication of WO2022047873A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F7/487 Multiplying; Dividing
    • G06F7/4873 Dividing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/499 Denomination or exception handling, e.g. rounding or overflow
    • G06F7/49942 Significance control
    • G06F7/49947 Rounding

Definitions

  • the present application relates to the field of computing technology, and in particular, to a division operation method, apparatus, electronic device and medium.
  • the division operation is one of the basic operations in processors (including DSPs and embedded chips) and is a very important functional component.
  • high-precision division (beyond the precision provided by computer hardware) is often required, in which case floating-point division has to be emulated in software.
  • division is more complicated and takes longer than floating-point addition, subtraction and multiplication, so the design of a high-performance division scheme is very important.
  • there are five main ways to implement division: function iteration algorithms, digit recurrence algorithms, very high radix algorithms, look-up table methods and variable latency methods.
  • digit recurrence and function iteration are the two most commonly used of these.
  • digit recurrence algorithms converge linearly, whereas function iteration algorithms converge quadratically.
  • the present application provides a division operation method, apparatus, electronic device and medium.
  • a division operation method including:
  • the second iteration value is used as the first iteration value to repeatedly perform the above steps until the second iteration value meets the accuracy requirement, and the second iteration value is output.
  • the performing error analysis processing according to the first iteration value to determine the rounding operation to be performed on the division parameter in this iteration includes:
  • a rounding operation to be performed on the division parameter in this iteration is determined.
  • the determining, according to the first error and the second error, the rounding operation that needs to be performed on the division parameter in this iteration including:
  • the determining, according to the first error and the second error, the rounding operation that needs to be performed on the division parameter in this iteration including:
  • a rounding operation to be performed on the first iteration value and the divisor is determined.
  • the first error is:
  • x_n is the first iteration value
  • a is the divisor
  • N_n is a positive integer
  • E_n represents a constant
  • determining, according to the exponent of the first error, the rounding operation to be performed on the first iteration value and the divisor includes:
  • determining that the rounding operation to be performed on x_n is to discard the digits after the (N_n + 1)-th digit;
  • determining that the rounding operation to be performed on a is to discard the digits after the (N_{n+1} + 1)-th digit.
  • the continuing iterative processing according to the parameters after the rounding operation, and obtaining the second iterative value includes:
  • the first error is:
  • x_n is the first iteration value
  • a is the divisor
  • N_n is a positive integer
  • E_n represents a constant
  • the method also includes:
  • the continuing iterative processing according to the parameters after the rounding operation to obtain a second iterative value includes:
  • a division operation device comprising:
  • an acquisition module configured to acquire the division parameters of the division to be performed, and to acquire the first iteration value;
  • a processing module configured to perform error analysis according to the first iteration value and determine the rounding operation to be performed on the division parameters in this iteration;
  • the processing module is further configured to continue the iteration with the rounded parameters to obtain a second iteration value;
  • the processing module is further configured to, if the second iteration value does not meet the precision requirement, repeat the above steps with the second iteration value as the first iteration value;
  • an output module configured to output the second iteration value once the second iteration value meets the precision requirement.
  • an electronic device comprising a memory and a processor, where the memory stores a computer program and, when the computer program is executed by the processor, the processor is caused to perform the steps of the first aspect and any possible implementation thereof.
  • a computer storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by a processor to perform the steps of the first aspect and any possible implementation thereof.
  • the present application acquires the division parameters of the division to be performed, acquires the first iteration value, performs error analysis according to the first iteration value to determine the rounding operation to be performed on the division parameters in this iteration, and continues the iteration with the rounded parameters to obtain the second iteration value. If the second iteration value does not meet the precision requirement, the second iteration value is used as the first iteration value and the above steps are repeated until the second iteration value meets the precision requirement, and the second iteration value is then output.
  • FIG. 1 is a schematic flowchart of a division operation method provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of another division operation method provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of numerical values after a rounding operation in a Newton-Raphson algorithm processing process provided by an embodiment of the present application;
  • FIG. 4 is a schematic structural diagram of a division device provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a division operation method provided by an embodiment of the present application. The method may include:
  • the execution body of this embodiment of the present application may be a division operation device, which may be an electronic device.
  • the above-mentioned electronic device may be a terminal, also called a terminal device, including but not limited to portable devices such as mobile phones, laptops or tablet computers with touch-sensitive surfaces (for example, touch screen displays and/or touch pads).
  • in some embodiments, the above-described device is not a portable communication device but a desktop computer.
  • the above-mentioned division operation method may be executed by a corresponding hardware module such as a chip or a divider, or optionally, may be executed based on a software program.
  • the above-mentioned division operation is implemented by a function iteration algorithm, and the main theoretical basis is the error analysis theory of the function iteration algorithm.
  • the above-mentioned division parameters may come from any application scenario that requires division, and may include one or more of the dividend, the divisor, each iteration value and the error, depending on the algorithm used; there is no restriction here.
  • the division operation method in the embodiment of the present application can be executed when specific division parameters are obtained.
  • the dividend is b and the divisor is a
  • the function iteration method is used to calculate the quotient.
  • the iterative format (1) is:
  • x n represents the approximate quotient after the nth iteration, which is also called the iteration value in each iteration.
  • the above-mentioned first iteration value may be any iteration value in the iteration process, including the initial iteration value x 0 .
  • in each iteration step, the error of the iterative approximation can be calculated, and then, according to the error analysis theory and provided the precision is preserved, the invalid digits beyond the error are rounded off.
  • the specific iterative format has a specific error analysis theory, which is not limited here.
  • step 102 includes:
  • the above rounding operation can be understood as an operation of retaining a few decimal places in any form.
  • each step of the quotient is approximate, and its accuracy can be given according to the error analysis theory.
  • the approximate quotient has 15 digits after the decimal point.
  • the above-mentioned first error is any value that measures the accuracy of the first iteration value, such as the absolute difference between the first iteration value and the real value, relative error, etc., which is not limited here.
  • the above-mentioned second error is any value that measures the accuracy of the second iteration value, such as the absolute difference between the second iteration value and the real value, relative error, etc., which is not limited here.
  • because 0.12 has the same precision as 0.123456789012345, using it does not affect the precision of the next iteration value x_{n+1}, but computing with 0.12 is obviously easier than computing with 0.123456789012345.
  • the above-mentioned division parameters may include one or more of dividend b, divisor a, each iteration value, and error, which may be determined according to the algorithm used, which is not limited here.
  • the division operation method in this embodiment of the present application can be executed under the condition that specific division parameters are obtained.
  • the following steps may be performed:
  • the above-mentioned rounding operation that needs to be performed for the above-mentioned division parameter in this iteration is determined, including:
  • the first iteration value x_n can be set to the same precision as the first error ε_n
  • a, b, and the other iteration parameters involved in the iteration format can be set to the same precision as the second error ε_{n+1} before the iterative operation is performed.
  • the iterative algorithm in FIG. 2 includes the following steps: 21. substitute the initial value into the iterative rule; 22. iterate to obtain an approximate value; 23. compute the error, ending if the error is judged acceptable and otherwise continuing with step 24; 24. round off the invalid digits in the relevant values according to the error; 25. substitute the rounded values into the iterative rule. Multiple iterations can be performed through these steps.
  • in each iteration, the division operation method in the embodiment of the present application calculates the error of the iterative approximation based on the error analysis theory of the adopted iterative format and then, provided the precision is preserved, rounds off the invalid digits beyond the error, so that values with a short word length and equal precision are substituted into the iterative rule, which simplifies the calculation and improves efficiency.
  • the rounding operation that needs to be performed on the first iteration value and the divisor may be determined according to the index of the first error.
  • the above division operation method can be applied to different iterative algorithms. Due to different iterative rules, the calculation of the error may also be different, and the rounding operation of the related data in the calculation may also be different.
  • the above-mentioned first error is:
  • x_n is the first iteration value
  • a is the divisor
  • N_n is a positive integer
  • E_n represents a constant
  • N_{n+1} is the exponent corresponding to the error at the (n+1)-th iteration, and determining, according to the exponent of the first error, the rounding operation to be performed on the first iteration value and the divisor includes:
  • determining that the rounding operation to be performed on x_n is to discard the digits after the (N_n + 1)-th digit;
  • the Newton-Raphson algorithm is a function iteration algorithm. An initial value x_0 must be provided, and the calculation then proceeds in the following iterative format:
  • x_{n+1} = x_n × (2 - a × x_n);
  • x_{n+1} = x_n × (1 + ε_n);
  • the error ε_{n+1} of x_{n+1} can be expressed as:
  • the Newton-Raphson algorithm converges quadratically. If the initial iteration value x_0 is well chosen, the initial error ε_0 is small and few iterations are needed. Most current improvements to the Newton-Raphson algorithm therefore target the selection of the initial iteration value x_0.
  • the present application instead uses the error analysis results of the Newton-Raphson algorithm to improve it, which reduces the amount of calculation and improves the efficiency of division.
  • N_0 is the exponent of ε_0 and N_0 is a positive integer.
  • Roundoff() represents the rounding operation. After rounding, the error is still of the same order of magnitude, so the accuracy of the next iteration is not affected, but the word length of the rounded value is much smaller than that of x_n, which simplifies data processing.
  • the rounding method for the above-mentioned rounding operation may be one of the rounding modes provided in the IEEE 754 standard, a "quasi-rounding to 1" method, a "constant 1-setting" method, or the like; this is not limited in this embodiment of the present application.
  • the division operation method in the embodiment of the present application is suitable for the Newton-Raphson algorithm, and the rounding operation combined with the characteristics of its own algorithm can reduce the amount of data calculation and improve the processing efficiency without affecting the accuracy.
  • the Markstein algorithm can also be used as an example to further illustrate this application.
  • the above-mentioned first error is:
  • the above method also includes:
  • the second iteration value x n+1 is calculated according to the first error after the above-mentioned rounding operation and the first iteration value after the above-mentioned rounding operation.
  • N_0 is the exponent of ε_0 and N_0 is a positive integer. According to N_0, the digits of x_0 after the (N_0 + 1)-th digit are rounded off, and this step is expressed as:
  • the average error of the Newton-Raphson algorithm of the embodiment of the present application is 0.52 × 10^-16, with a standard deviation of 0.72 × 10^-16; the average error of the Markstein algorithm of the embodiment of the present application is 0.37 × 10^-16, with a standard deviation of 0.56 × 10^-16. The errors are of the same order of magnitude and differ very little, and for the same number of iterations the error of the Markstein algorithm of the embodiment is even smaller.
  • the word length of the values used differs from one iteration step to the next: the mantissa of the initial value (x_0) has only 2 bits, that of the second step (x_1) has 4 bits, and so on, increasing until the last step, which has 32 bits.
  • in the current function iteration method, by contrast, all iteration steps are 32-bit calculations. Compared with that method, the division method in the embodiment of the present application therefore avoids unnecessary calculations and greatly improves efficiency.
  • the embodiments of the present application are based on the theory of error analysis: provided the precision is preserved, the invalid digits beyond the error can be rounded off, so that values with a short word length and equal precision are substituted into the iteration format. This simplifies the calculation, especially the multiplications in the iterative process, and improves efficiency. The approach can be applied to any function iteration method for which an error analysis is available.
  • the iterative method in the foregoing two specific embodiments is only a partial example, and does not limit the scope of the embodiments of the present application.
  • the division operation method in the embodiment of the present application can be used to round the relevant values in the iterative process according to the error and the precision requirement, so that values with a short word length and equal precision are substituted into the iterative format, which greatly simplifies the calculation and improves efficiency; this is not repeated in this embodiment of the present application.
  • the methods provided by the embodiments of the present application can be used not only in the division unit of a microprocessor but also in algorithm designs that require high precision and emulate division in software, and they can be applied to various scenarios involving iterative algorithms. Applied to floating-point division, for example, they greatly reduce the computational complexity and the waste of computing resources and improve efficiency.
  • the embodiments of the present application do not limit the application scenarios of the method.
  • the division operation device 400 includes:
  • an obtaining module 410 configured to obtain a division parameter to be subjected to a division operation, and obtain a first iteration value
  • the processing module 420 is configured to perform error analysis and processing according to the above-mentioned first iteration value, and determine the rounding operation that needs to be performed for the above-mentioned division parameter in this iteration;
  • the above-mentioned processing module 420 is further configured to continue iterative processing according to the parameters after the rounding operation to obtain a second iterative value;
  • the above-mentioned processing module 420 is further configured to, if the above-mentioned second iteration value does not meet the accuracy requirement, repeat the above steps with the above-mentioned second iteration value as the first iteration value;
  • the output module 430 is configured to output the second iteration value until the second iteration value meets the precision requirement.
  • each step involved in the method of the embodiment shown in FIG. 1 may be performed by each module in the division operation apparatus 400 shown in FIG. 4 , and details are not repeated here.
  • the division operation device 400 in the embodiment of the present application may acquire the division parameters of the division to be performed, acquire the first iteration value, perform error analysis according to the first iteration value, determine the rounding operation to be performed on the division parameters in this iteration, and continue the iteration with the rounded parameters to obtain the second iteration value. If the second iteration value does not meet the precision requirement, the second iteration value is used as the first iteration value and the above steps are repeated until the second iteration value meets the precision requirement, and the second iteration value is then output.
  • based on the error analysis theory, and provided the precision is preserved, the invalid digits beyond the error are rounded off, so that values with a short word length and equal precision are substituted into the iterative rule, which can greatly reduce the waste of computing resources and improve the efficiency of the function iteration algorithm.
  • the embodiments of the present application further provide an electronic device.
  • the electronic device 500 at least includes a processor 501 , an input device 502 , an output device 503 and a computer storage medium 504 .
  • the processor 501, the input device 502, the output device 503 and the computer storage medium 504 in the terminal may be connected through a bus or other means.
  • the computer storage medium 504 may be stored in the memory of the terminal.
  • the computer storage medium 504 is used for storing computer programs, and the computer program includes program instructions.
  • the processor 501 is used for executing the program instructions stored in the computer storage medium 504 .
  • the processor 501 (also called the CPU, central processing unit) is the computing core and control core of the terminal. It is suitable for implementing one or more instructions, and specifically for loading and executing one or more instructions to carry out the corresponding method flow or function; in one embodiment, the processor 501 may be used to perform a series of processing steps, including the method in the embodiment shown in FIG. 1.
  • Embodiments of the present application further provide a computer storage medium (Memory), where the computer storage medium is a memory device in a terminal, used to store programs and data.
  • the computer storage medium here may include both a built-in storage medium in the terminal, and certainly also an extended storage medium supported by the terminal.
  • the computer storage medium provides storage space, and the storage space stores the operating system of the terminal.
  • one or more instructions suitable for being loaded and executed by the processor 501 are also stored in the storage space, and these instructions may be one or more computer programs (including program codes).
  • the computer storage medium here can be a high-speed RAM or a non-volatile memory, such as at least one disk memory; optionally, it can also be at least one computer storage medium located away from the aforementioned processor.
  • one or more instructions stored in the computer storage medium can be loaded and executed by the processor 501 to implement the corresponding steps in the foregoing embodiment; in a specific implementation, the instructions can be loaded by the processor 501 to execute any step of the method in FIG. 1, which is not repeated here.
  • the embodiments of the present application can simplify the complexity of the function iterative algorithm for processing the division problem, especially the computational complexity of the multiplication in the iterative process, thereby improving the computational efficiency.
  • the present application adds an adaptive rounding operation step, which is easy to implement.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the division of the module is only for one logical function division.
  • multiple modules or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling, or direct coupling, or communication connection may be through some interfaces, indirect coupling or communication connection of devices or modules, and may be in electrical, mechanical or other forms.
  • Modules described as separate components may or may not be physically separated, and components shown as modules may or may not be physical modules, that is, they may be located in one place, or may be distributed to multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
  • when implemented in software, they can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer program instructions When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in or transmitted over a computer-readable storage medium.
  • the computer instructions can be sent from one website, computer, server, or data center to another by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave).
  • the computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media.
  • the available media may be read-only memory (ROM), random access memory (RAM), magnetic media such as floppy disks, hard disks and magnetic tapes, optical media such as digital versatile discs (DVD), or semiconductor media such as solid state disks (SSD).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Nonlinear Science (AREA)
  • Complex Calculations (AREA)

Abstract

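A division operation method, apparatus (400), electronic device (500) and medium. The division operation method comprises: acquiring the division parameters of the division to be performed, and acquiring a first iteration value (101); performing error analysis according to the first iteration value to determine the rounding operation to be applied to the division parameters in the current iteration (102); continuing the iteration with the rounded division parameters to obtain a second iteration value (103); and, if the second iteration value does not meet the precision requirement, taking the second iteration value as the first iteration value and repeating the above steps until the second iteration value meets the precision requirement, then outputting the second iteration value (104). The method can improve the efficiency of function iteration algorithms.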

Description

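Division operation method, apparatus, electronic device and medium
Technical Field
The present application relates to the field of computing technology, and in particular to a division operation method, apparatus, electronic device and medium.
Background
Division is one of the basic operations in processors (including DSPs and embedded chips) and is a very important functional component. Computer graphics, computer vision, visualization, geographic information systems, mesh generation and scientific computing often require very high-precision division (beyond the precision provided by computer hardware), in which case floating-point division has to be emulated in software. Compared with floating-point addition, subtraction and multiplication, division is more complicated and takes longer, so the design of a high-performance division scheme is very important. There are currently five main ways to implement division: function iteration algorithms, digit recurrence algorithms, very high radix algorithms, look-up table methods and variable latency methods. Of these, digit recurrence and function iteration are the two most commonly used. Digit recurrence algorithms converge linearly, whereas function iteration algorithms converge quadratically.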
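The quadratic convergence of function iteration can be checked with exact rational arithmetic. The sketch below is only an illustration: it assumes the Newton-Raphson reciprocal update x_{n+1} = x_n × (2 - a × x_n) as the function iteration, and the divisor 7 and starting value 1/10 are arbitrary choices, not values from the application.

```python
from fractions import Fraction


def newton_reciprocal_errors(a: Fraction, x0: Fraction, steps: int) -> list:
    """Return the relative errors 1 - a*x_n for n = 0..steps (exact arithmetic)."""
    x = x0
    errors = [1 - a * x]
    for _ in range(steps):
        x = x * (2 - a * x)          # assumed function iteration for 1/a
        errors.append(1 - a * x)
    return errors


# Illustrative inputs (assumptions): divisor 7, starting guess 0.1.
errors = newton_reciprocal_errors(Fraction(7), Fraction(1, 10), 4)
for n, err in enumerate(errors):
    print(n, float(err))
```

Each printed error is the square of the previous one, so the number of correct digits roughly doubles per step, whereas a linearly convergent method gains a fixed number of digits per step.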
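Function iteration algorithms are now widely used in the divider designs of contemporary general-purpose processors. Once the division parameters have been acquired, an initial approximation of the quotient usually has to be provided, and an iteration format is then used to approach the exact quotient step by step. A mature function iteration algorithm generally takes a given initial approximation and iteration format and iterates until the result meets the precision requirement. However, when large amounts of data are processed, existing iterative algorithms waste considerable computing resources, and their efficiency urgently needs to be improved.
Summary
The present application provides a division operation method, apparatus, electronic device and medium.
In a first aspect, a division operation method is provided, comprising:
acquiring the division parameters of the division to be performed, and acquiring a first iteration value;
performing error analysis according to the first iteration value, and determining the rounding operation to be applied to the division parameters in the current iteration;
continuing the iteration with the rounded division parameters to obtain a second iteration value;
if the second iteration value does not meet the precision requirement, taking the second iteration value as the first iteration value and repeating the above steps until the second iteration value meets the precision requirement, and outputting the second iteration value.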
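The control flow of these four steps can be sketched as a small driver loop. This is a minimal sketch under stated assumptions: the function name and its callback parameters are hypothetical, only the iterate is passed through the rounding hook, and the usage example plugs in a Newton-Raphson reciprocal update with a placeholder rounding policy that changes nothing.

```python
from fractions import Fraction
from typing import Callable


def iterate_until_precise(
    first_value: Fraction,
    round_parameters: Callable[[Fraction], Fraction],
    iterate: Callable[[Fraction], Fraction],
    meets_precision: Callable[[Fraction], bool],
    max_iterations: int = 64,
) -> Fraction:
    """Repeat: round, iterate, check precision; return the accepted iterate."""
    x = first_value                          # first iteration value
    for _ in range(max_iterations):
        x_rounded = round_parameters(x)      # rounding chosen by error analysis
        x_next = iterate(x_rounded)          # second iteration value
        if meets_precision(x_next):          # precision requirement met: output
            return x_next
        x = x_next                           # otherwise reuse as first value
    raise RuntimeError("precision requirement not met")


# Hypothetical usage: approximate 1/3 starting from 0.3.
a = Fraction(3)
result = iterate_until_precise(
    first_value=Fraction(3, 10),
    round_parameters=lambda x: x,            # placeholder: no rounding applied
    iterate=lambda x: x * (2 - a * x),       # assumed Newton-Raphson update
    meets_precision=lambda x: abs(1 - a * x) < Fraction(1, 10**12),
)
print(float(result))
```

Keeping the rounding policy, the iteration format and the stopping test as separate callbacks mirrors the way the method leaves the concrete iterative algorithm open.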
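In an optional implementation, performing error analysis according to the first iteration value and determining the rounding operation to be applied to the division parameters in the current iteration comprises:
acquiring a first error of the first iteration value and a second error of the second iteration value;
determining, according to the first error and the second error, the rounding operation to be applied to the division parameters in the current iteration.
In an optional implementation, determining, according to the first error and the second error, the rounding operation to be applied to the division parameters in the current iteration comprises:
determining the precision of the rounding operation on the first iteration value according to the precision of the first error; and determining the precision of the rounding operation on the divisor and/or the dividend of the division according to the precision of the second error.
In an optional implementation, determining, according to the first error and the second error, the rounding operation to be applied to the division parameters in the current iteration comprises:
determining, according to the exponent of the first error, the rounding operation to be applied to the first iteration value and the divisor.
In an optional implementation, the first error is:
Figure PCTCN2020117978-appb-000001
where x_n is the first iteration value, a is the divisor, N_n is a positive integer, and E_n denotes a constant;
determining, according to the exponent of the first error, the rounding operation to be applied to the first iteration value and the divisor comprises:
determining that the rounding operation to be applied to x_n is to discard the digits after the (N_n + 1)-th digit;
determining that the rounding operation to be applied to a is to discard the digits after the (N_{n+1} + 1)-th digit.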
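Discarding the digits after a given position can be sketched with Python's `decimal` module. The helper name, the reading of "digits" as significant decimal digits, and the sample values (N_n = 2, N_{n+1} = 4, the iterate and the divisor) are assumptions made for illustration only.

```python
from decimal import Context, Decimal, ROUND_DOWN


def keep_leading_digits(value: Decimal, digits: int, rounding=ROUND_DOWN) -> Decimal:
    """Keep the first `digits` significant digits of `value`, discarding the rest."""
    return Context(prec=digits, rounding=rounding).create_decimal(value)


# Illustrative exponents of the first and second error (assumed values).
n_n, n_next = 2, 4
x_n = Decimal("0.142857142857")   # hypothetical iterate
a = Decimal("7.123456789")        # hypothetical divisor

x_rounded = keep_leading_digits(x_n, n_n + 1)      # keep N_n + 1 digits
a_rounded = keep_leading_digits(a, n_next + 1)     # keep N_{n+1} + 1 digits
print(x_rounded, a_rounded)
```

The `rounding` argument defaults to truncation here; another mode from the `decimal` module, such as `ROUND_HALF_EVEN`, can be passed instead.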
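In an optional implementation, the method further comprises:
acquiring intermediate data
Figure PCTCN2020117978-appb-000002
where
Figure PCTCN2020117978-appb-000003
is obtained by applying the rounding operation to x_n, and
Figure PCTCN2020117978-appb-000004
is obtained by applying the rounding operation to a;
discarding the digits of e_n after the (N_{n+1} + 1)-th digit to obtain the rounded intermediate data
Figure PCTCN2020117978-appb-000005
continuing the iteration with the rounded parameters to obtain the second iteration value comprises:
according to the rounded intermediate data
Figure PCTCN2020117978-appb-000006
and
Figure PCTCN2020117978-appb-000007
calculating the second iteration value x_{n+1}.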
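The implementation above can be sketched end to end for the reciprocal 1/a. The formulas for the intermediate data are given only as images in this text, so the sketch assumes e_n = 1 - a × x_n computed from the rounded values and the Newton-Raphson update x_{n+1} = x_n × (1 + e_n). It further assumes that "digits" means significant decimal digits, that rounding is round-half-even, and that the error exponent follows a guarded doubling N_{n+1} = min(2 × N_n - 1, target); the guard digit is a conservative choice of this sketch, not a rule stated in the application. All function and parameter names are hypothetical.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN


def round_sig(value: Decimal, digits: int) -> Decimal:
    """Round `value` to `digits` significant decimal digits (half-even)."""
    return Context(prec=digits, rounding=ROUND_HALF_EVEN).create_decimal(value)


def reciprocal_adaptive(a: Decimal, x0: Decimal, n0: int, target: int):
    """Approximate 1/a so that |1 - a*x| <= 10**-target.

    n0 is the assumed error exponent of x0: |1 - a*x0| <= 10**-n0, n0 >= 2.
    Returns the final iterate and a history of (N_n, digits kept in x_n).
    """
    wide = Context(prec=4 * target + 20)     # wide enough for exact products
    eps0 = abs(wide.subtract(Decimal(1), wide.multiply(a, x0)))
    if n0 < 2 or eps0 > Decimal(10) ** -n0:
        raise ValueError("need |1 - a*x0| <= 10**-n0 with n0 >= 2")
    x, n, history = x0, n0, []
    while n < target:
        n_next = min(2 * n - 1, target)      # guarded doubling (assumption)
        x_r = round_sig(x, n + 1)            # x_n keeps N_n + 1 digits
        a_r = round_sig(a, n_next + 1)       # a keeps N_{n+1} + 1 digits
        e = wide.subtract(Decimal(1), wide.multiply(a_r, x_r))
        e_r = round_sig(e, n_next + 1)       # e_n keeps N_{n+1} + 1 digits
        x = wide.add(x_r, wide.multiply(x_r, e_r))   # x_{n+1} = x_n*(1 + e_n)
        history.append((n, len(x_r.as_tuple().digits)))
        n = n_next
    return x, history


# Hypothetical usage: 1/7 from the 3-digit guess 0.143, to 30 digits.
x, history = reciprocal_adaptive(Decimal(7), Decimal("0.143"), n0=2, target=30)
print(x)
print(history)
```

The history shows the operand length growing with the accuracy of the iterate instead of staying at the final word length from the first step, which is the saving the method aims at.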
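In an optional implementation, the first error is:
Figure PCTCN2020117978-appb-000008
where x_n is the first iteration value, a is the divisor, N_n is a positive integer, and E_n denotes a constant;
the method further comprises:
determining, according to the exponent of the first error, the rounding operation to be applied to the first error;
continuing the iteration with the rounded parameters to obtain the second iteration value comprises:
calculating the second iteration value x_{n+1} from the rounded first error, the rounded first iteration value, the rounded divisor, and the other rounded intermediate parameters.
In a second aspect, a division operation apparatus is provided, comprising:
an acquisition module, configured to acquire the division parameters of the division to be performed and to acquire a first iteration value;
a processing module, configured to perform error analysis according to the first iteration value and determine the rounding operation to be applied to the division parameters in the current iteration;
the processing module being further configured to continue the iteration with the rounded parameters to obtain a second iteration value;
the processing module being further configured to, if the second iteration value does not meet the precision requirement, take the second iteration value as the first iteration value and repeat the above steps;
an output module, configured to output the second iteration value once the second iteration value meets the precision requirement.
In a third aspect, an electronic device is provided, comprising a memory and a processor, where the memory stores a computer program and, when the computer program is executed by the processor, the processor is caused to perform the steps of the first aspect and any possible implementation thereof.
In a fourth aspect, a computer storage medium is provided, where the computer storage medium stores one or more instructions, and the one or more instructions are adapted to be loaded by a processor to perform the steps of the first aspect and any possible implementation thereof.
The present application acquires the division parameters of the division to be performed, acquires a first iteration value, performs error analysis according to the first iteration value, determines the rounding operation to be applied to the division parameters in the current iteration, and continues the iteration with the rounded parameters to obtain a second iteration value; if the second iteration value does not meet the precision requirement, the second iteration value is taken as the first iteration value and the above steps are repeated until the second iteration value meets the precision requirement, and the second iteration value is output. Based on the error analysis theory, and provided the precision is preserved, the invalid digits beyond the error are rounded off, so that values with a short word length and equal precision are substituted into the iterative rule, which can greatly reduce the waste of computing resources and improve the efficiency of function iteration algorithms.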
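Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the background art more clearly, the drawings used in the embodiments or the background art are described below.
Figure 1 is a schematic flowchart of a division operation method provided by an embodiment of the present application;
Figure 2 is a schematic flowchart of another division operation method provided by an embodiment of the present application;
Figure 3 is a schematic diagram of the values after the rounding operation during processing by a Newton-Raphson algorithm provided by an embodiment of the present application;
Figure 4 is a schematic structural diagram of a division operation apparatus provided by an embodiment of the present application;
Figure 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.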
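Detailed Description
To enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present application without creative effort fall within the scope of protection of the present application.
The terms "first", "second" and the like in the description, the claims and the above drawings of the present application are used to distinguish different objects, not to describe a particular order. In addition, the terms "comprise" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to such a process, method, product or device.
A reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. Occurrences of this phrase at various places in the description do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
The embodiments of the present application are described below with reference to the drawings in the embodiments.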
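Refer to Figure 1, which is a schematic flowchart of a division operation method provided by an embodiment of the present application. The method may include:
101. Obtain the division parameters of the division operation to be performed, and obtain a first iteration value.
The method of the embodiments may be executed by a division operation apparatus, which may be an electronic device. In a specific implementation, the electronic device is a terminal, also called a terminal device, including but not limited to portable devices such as mobile phones, laptop computers or tablet computers with touch-sensitive surfaces (for example, touch-screen displays and/or touch pads). It should also be understood that in some embodiments the device is not a portable communication device but a desktop computer. In an optional implementation, the division operation method may be run by a corresponding hardware module such as a chip or a divider; optionally, it may also be run as a software program.
In the embodiments of the present application, the division operation is implemented with a functional-iteration-type algorithm, whose main theoretical basis is the error analysis theory of functional iteration algorithms. The division parameters of the division operation to be performed may come from any application scenario that requires division, and may include one or more of the dividend, the divisor, the individual iteration values and the errors, depending on the algorithm used; no limitation is imposed here. Once specific division parameters have been obtained, the division operation method of the embodiments can be executed. In the embodiments, the dividend is taken to be b and the divisor a, and their quotient is computed by functional iteration with iteration scheme (1):
x_{n+1} = f(a, b, x_n);
where x_n denotes the approximate quotient after the n-th iteration, also called the iteration value in each iteration.
The first iteration value may be any iteration value in the iteration process, including the initial iteration value x_0.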
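102. Perform error analysis based on the first iteration value to determine the rounding operation to be applied to the division parameters in the current iteration.
Specifically, in each iteration step the error of the iterative approximation can be computed from the error analysis theory of the iteration scheme (1) in use; then, according to that theory and with precision preserved, the invalid digits lying beyond the error are rounded away. Each specific iteration scheme has its own error analysis theory; no limitation is imposed here.
In an optional implementation, step 102 includes:
obtaining a first error of the first iteration value and a second error of the second iteration value;
determining, from the first error and the second error, the rounding operation to be applied to the division parameters in the current iteration.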
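The rounding operation can be understood as any operation that keeps a certain number of digits after the radix point. When a functional iteration algorithm computes a quotient, the quotient at every step is approximate, and its precision can be given by error analysis theory. Take the approximate quotient x_n = 0.123456789012345 as an example; it has 15 digits after the decimal point. In a functional iteration algorithm the error of the approximate quotient at each iteration can be computed. Suppose that, according to error analysis theory, the error of the approximate quotient 0.123456789012345 is ε_n = 0.01. Then all digits from the third decimal place onward are in fact useless; they are invalid digits, and letting all of them take part in the computation is a waste of computing resources. Discarding the invalid digits gives the approximate quotient
Figure PCTCN2020117978-appb-000009
and
Figure PCTCN2020117978-appb-000010
can be substituted as the approximate quotient into the iteration scheme (1) of the algorithm.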
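Here the first error is any value that measures the precision of the first iteration value, such as the absolute difference between the first iteration value and the true value, or the relative error; no limitation is imposed here. The second error is any value that measures the precision of the second iteration value, such as the absolute difference between the second iteration value and the true value, or the relative error; no limitation is imposed here.
Because 0.12 and 0.123456789012345 have the same precision, the precision of the next iteration value x_{n+1} is not affected. Computing with 0.12 is, however, clearly simpler than computing with 0.123456789012345. Moreover, according to error analysis theory, the precision of x_{n+1} is known before x_{n+1} is computed (for ease of exposition, assume the error of x_{n+1} is ε_{n+1} = 0.0001). From ε_{n+1} = 0.0001 it follows that the other values required by the iteration scheme (such as a and b) can also be rounded in the actual computation without loss of precision. Because
Figure PCTCN2020117978-appb-000011
Figure PCTCN2020117978-appb-000012
have shorter word lengths than a, b and x_n, the computation is simpler and faster.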
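The decimal example above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `drop_invalid_digits` is hypothetical, and truncation is assumed as the rounding rule because the text keeps 0.12 for an error of 0.01.

```python
from decimal import Decimal, ROUND_DOWN

def drop_invalid_digits(value: Decimal, error: Decimal) -> Decimal:
    """Keep decimal places only down to the position of the error's leading digit."""
    # error.adjusted() is the decimal exponent of the leading digit: 0.01 -> -2.
    quantum = Decimal(1).scaleb(error.adjusted())
    # Truncate (ROUND_DOWN) so that digits below the error level are simply dropped.
    return value.quantize(quantum, rounding=ROUND_DOWN)

x_n = Decimal("0.123456789012345")
print(drop_invalid_digits(x_n, Decimal("0.01")))     # 0.12
print(drop_invalid_digits(x_n, Decimal("0.0001")))   # 0.1234
```

The second call shows the same value cut to the assumed next-step error of 0.0001, which is the level to which operands such as a and b would be shortened in that step.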
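The division parameters may include one or more of the dividend b, the divisor a, the individual iteration values and the errors, depending on the algorithm used; no limitation is imposed here. Once specific division parameters have been obtained, the division operation method of the embodiments can be executed.
In one implementation, an initial iteration value x_0 is set and the following steps may be performed:
1) from the error analysis theory of the functional iteration algorithm in use, obtain the error ε_n of x_n and the error ε_{n+1} of x_{n+1}, n = 0, 1, ...;
2) according to the errors, and with precision preserved, apply the rounding operation to the iteration parameters involved in the iteration scheme (such as x_n, a and b) to obtain values of short word length (the corresponding
Figure PCTCN2020117978-appb-000013
Figure PCTCN2020117978-appb-000014
).
Optionally, determining, from the first error and the second error, the rounding operation to be applied to the division parameters in the current iteration includes:
determining the precision of the rounding operation on the first iteration value from the precision of the first error; and determining, from the precision of the second error, the precision of the rounding operation on the divisor and the dividend of the division operation and on the other iteration parameters involved in the iteration scheme.
Specifically, the first iteration value x_n can be set to the same precision as the first error ε_n, and a, b and the other iteration parameters involved in the iteration scheme can be set to the same precision as the second error ε_{n+1}, before the iteration is carried out.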
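103. Continue the iteration with the parameters after the rounding operation to obtain a second iteration value.
According to error analysis theory, and with precision preserved, the invalid digits lying beyond the error are rounded away; the rounded parameters obtained in this way are then substituted into the iteration rule, which reduces the complexity of the computation and thus improves efficiency.
104. If the second iteration value does not meet the precision requirement, take the second iteration value as the first iteration value and repeat the above steps until the second iteration value meets the precision requirement, then output the second iteration value.
In the embodiments of the present application, for functional-iteration-type algorithms that admit an error analysis, the iteration is repeated according to iteration scheme (1), and each iteration yields a more accurate approximate quotient. The error and precision of the result of each iteration step can be judged from the error analysis, and hence whether the iteration has finished. The basic iterative algorithm is not described further here.
See the schematic flowchart of a division operation method shown in Figure 2. The steps of the iterative algorithm in Figure 2 include: 21, substitute the initial value into the iteration rule; 22, iterate to obtain an approximate value; 23, compute the error; if the error is judged acceptable, finish; if the error is judged unacceptable, perform 24, discard the invalid digits of the relevant values according to the error; 25, substitute the rounded values into the iteration rule. These steps can be executed for multiple iterations.
In each iteration step, the division operation method of the embodiments computes the error of the iterative approximation from the error analysis theory of the iteration scheme in use and then, according to that theory and with precision preserved, rounds away the invalid digits lying beyond the error, so that values of short word length and equal precision are substituted into the iteration rule. This reduces the complexity of the computation and thus improves efficiency.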
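The flow of Figure 2 can be sketched as a small generic loop. This is an illustrative sketch rather than the apparatus itself: the function names, the callable parameters and the choice of two guard bits in `truncate_to_error` are assumptions made for the example, and exact rational arithmetic (`fractions.Fraction`) stands in for a variable-word-length number format.

```python
from fractions import Fraction
from math import floor

def iterate_with_rounding(x0, step, error_of, round_off, tolerance, max_iter=60):
    """Figure 2 flow: iterate, estimate the error, stop or discard invalid digits, repeat."""
    x = step(x0)                          # 21-22: substitute the initial value and iterate
    for _ in range(max_iter):
        err = error_of(x)                 # 23: compute the error
        if abs(err) <= tolerance:         # error accepted: finish
            return x
        x = step(round_off(x, err))       # 24-25: discard invalid digits, iterate again
    raise ArithmeticError("no convergence within max_iter iterations")

# Hypothetical usage: reciprocal of a = 3/2 with the rule x <- x * (2 - a * x).
a = Fraction(3, 2)

def truncate_to_error(x, err):
    # Keep fractional bits down to a quarter of the error size (assumed guard-bit choice).
    bits = 0
    while Fraction(1, 1 << bits) > abs(err) / 4:
        bits += 1
    return Fraction(floor(x * (1 << bits)), 1 << bits)

x = iterate_with_rounding(Fraction(1, 2), lambda v: v * (2 - a * v),
                          lambda v: 1 - a * v, truncate_to_error, Fraction(1, 10**12))
print(float(x))
```

Passing the iteration rule, the error estimate and the rounding rule as callables mirrors the statement that each iteration scheme brings its own error analysis.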
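To further describe how the division operation method of the embodiments is applied, specific descriptions are given below for different iterative algorithms (schemes).
In one implementation, the rounding operations to be applied to the first iteration value and the divisor can be determined from the exponent of the first error.
The division operation method can be applied to different iterative algorithms. Because the iteration rules differ, the computation of the error may differ as well, and so may the rounding operations applied to the relevant data in the computation.
In an optional implementation, the first error is:
Figure PCTCN2020117978-appb-000015
where x_n is the first iteration value, a is the divisor, N_n is a positive integer and E_n denotes a constant;
N_{n+1} is the exponent corresponding to the error at the (n+1)-th iteration, and determining, from the exponent of the first error, the rounding operations to be applied to the first iteration value and the divisor includes:
determining that the rounding operation on x_n discards the digits after position N_n+1;
determining that the rounding operation on a discards the digits after position N_{n+1}+1.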
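Specifically, the present application can be further explained using the Newton-Raphson algorithm as an example. Let a be the divisor, b the dividend and Q the quotient. The Newton-Raphson algorithm is used to compute x = 1 ÷ a, after which one further multiplication gives the quotient Q = b × x. The Newton-Raphson algorithm is a functional iteration algorithm; an initial value x_0 must be provided, and the computation then uses the following iteration scheme:
x_{n+1} = x_n × (2 - a × x_n);
Defining the error of x_n as ε_n = 1 - a × x_n, the above can be written as:
x_{n+1} = x_n × (1 + ε_n);
The error ε_{n+1} of x_{n+1} can be expressed as:
Figure PCTCN2020117978-appb-000016
As the above shows, the Newton-Raphson algorithm converges quadratically. If the initial iteration value x_0 is well chosen, the initial error ε_0 is small and fewer iterations are needed. Most current improvements to the Newton-Raphson algorithm target the choice of the initial iteration value x_0. The present application instead uses the results of its error analysis to improve the Newton-Raphson algorithm, which can reduce the amount of computation and improve the efficiency of division.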
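The quadratic convergence can be checked directly from the two definitions given in the text, ε_n = 1 - a × x_n and x_{n+1} = x_n × (1 + ε_n). The short derivation below uses only those definitions and the identity a × x_n = 1 - ε_n:

```latex
\begin{aligned}
\varepsilon_{n+1} &= 1 - a\,x_{n+1} = 1 - a\,x_n\,(1+\varepsilon_n) \\
                  &= 1 - (1-\varepsilon_n)(1+\varepsilon_n) = \varepsilon_n^{2}.
\end{aligned}
```

In exact arithmetic each iteration therefore squares the error, which doubles the number of correct digits per step.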
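When division is carried out with the Newton-Raphson algorithm, the error produced at each iteration is different, and the precision improves as the number of iterations grows. In the first few iteration steps the precision of x_n is poor, so x_n does not need to keep all of its digits; it only needs to keep as many digits as match its precision. As a result, when the product a × x_n is computed, not all digits need to be computed either: the invalid digits lying beyond the error can be discarded, which reduces the amount of computation. Details follow. Suppose the reciprocal x = 1 ÷ a of a is to be found, and write a in binary scientific notation as a = A × 2^M, where 1 < |A| < 2 and M is an integer. If |a| ≤ 1, then a = A. Because the reciprocal of a is in fact 1 ÷ a = A^{-1} × 2^{-M}, the discussion can be restricted to 1 < a < 2 without loss of generality. An initial approximation 0 < x_0 < 1 can then be provided, and the specific computation steps are as follows:
1) From the initial iteration value x_0, compute the initial error ε_0 and express it in scientific notation as:
Figure PCTCN2020117978-appb-000017
where -N_0 is the exponent of ε_0 and N_0 is a positive integer.
2) Iterate for n = 0, 1, .... The error of x_n is ε_n, whose exponent is -N_n, so the digits of x_n after position N_n+1 are discarded. This step is written as:
Figure PCTCN2020117978-appb-000018
where Roundoff() denotes the rounding operation. After rounding, the error of
Figure PCTCN2020117978-appb-000019
is still of the order of
Figure PCTCN2020117978-appb-000020
so the precision of the next iteration step is not affected, but the word length of
Figure PCTCN2020117978-appb-000021
is much shorter than that of x_n, which reduces the amount of data to be processed. The rounding method used for the rounding operation may be one of those provided in the IEEE 754 standard, or the "discard-0, carry-1" method, the "always set to one" method and so on; the embodiments of the present application impose no limitation on this.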
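Further optionally, the method also includes:
obtaining intermediate data
Figure PCTCN2020117978-appb-000022
where the rounded iteration value is obtained by applying the rounding operation to x_n, and
Figure PCTCN2020117978-appb-000023
is obtained by applying the rounding operation to a;
discarding the digits of e_n after position N_{n+1}+1 to obtain the intermediate data after the rounding operation
Figure PCTCN2020117978-appb-000024
Continuing the iteration with the parameters after the rounding operation to obtain the second iteration value includes:
computing, from the intermediate data after the rounding operation
Figure PCTCN2020117978-appb-000025
and
Figure PCTCN2020117978-appb-000026
the second iteration value x_{n+1}.
In the Newton-Raphson algorithm, the second iteration value x_{n+1} can be obtained by first computing intermediate data. The intermediate data are computed from the divisor and the previous iteration value, both after the rounding operation described above, that is, the divisor
Figure PCTCN2020117978-appb-000027
and the iteration value
Figure PCTCN2020117978-appb-000028
The corresponding rounding operation can still be used when computing the intermediate data. Continuing the example, see the following steps: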
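3) According to the error analysis theory of the Newton-Raphson algorithm, the error decreases quadratically at each iteration, so the error exponent of x_{n+1} must be twice the previous error exponent, that is, N_{n+1} = 2N_n. Computing x_{n+1}, however, requires
Figure PCTCN2020117978-appb-000029
in which
Figure PCTCN2020117978-appb-000030
has a very short word length, whereas a keeps all of its exact digits and has a very long word length. Because x_{n+1} is itself still inexact, there is no need to use the exact a. The rounding operation can be applied to a, discarding the digits of a after position N_{n+1}+1, which gives
Figure PCTCN2020117978-appb-000031
4) Use
Figure PCTCN2020117978-appb-000032
Figure PCTCN2020117978-appb-000033
to compute
Figure PCTCN2020117978-appb-000034
From
Figure PCTCN2020117978-appb-000035
and e_n,
Figure PCTCN2020117978-appb-000036
can be computed. However, e_n is obtained by multiplication. In general, subtraction has little effect on the word length of the result, whereas multiplication often increases the word length significantly. Because the precision of x_{n+1} is of the order of
Figure PCTCN2020117978-appb-000037
the rounding operation can be applied to e_n while preserving precision, which gives:
Figure PCTCN2020117978-appb-000038
5) Finally compute x_{n+1}:
Figure PCTCN2020117978-appb-000039
If x_{n+1} has not yet reached the precision requirement, return to 2) and continue, now with n = n+1.
The division operation method of the embodiments is thus applicable to the Newton-Raphson algorithm; rounding operations tailored to the characteristics of the algorithm reduce the amount of data to be computed without affecting accuracy and improve processing efficiency.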
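Steps 1) to 5) can be sketched in Python with exact rationals standing in for variable-word-length numbers. Several points are assumptions of this sketch, because the exact formulas sit in figures that are not reproduced here: the intermediate value is taken to be e_n = 2 - a × x_n on the rounded operands, "position k" is read as the k-th binary digit after the radix point, Roundoff() is implemented as truncation, and all function names are hypothetical.

```python
from fractions import Fraction
from math import floor

def roundoff(value: Fraction, bits: int) -> Fraction:
    """Discard everything after `bits` binary fractional digits (truncation)."""
    scale = 1 << bits
    return Fraction(floor(value * scale), scale)

def error_exponent(eps: Fraction) -> int:
    """Return N with 2**-N <= |eps| < 2**-(N-1); eps must be non-zero."""
    n = 0
    while Fraction(1, 1 << n) > abs(eps):
        n += 1
    return n

def reciprocal_nr(a: Fraction, x0: Fraction, target_bits: int) -> Fraction:
    """Approximate 1/a for 1 < a < 2, starting from 0 < x0 < 1 with |1 - a*x0| < 1."""
    x = x0
    eps0 = 1 - a * x
    if eps0 == 0:
        return x
    n_bits = error_exponent(eps0)                   # step 1: N_0
    if n_bits < 1:
        raise ValueError("initial error must satisfy |1 - a*x0| < 1")
    while n_bits < target_bits:
        x_r = roundoff(x, n_bits + 1)               # step 2: shorten x_n
        n_next = 2 * n_bits                         # step 3: N_{n+1} = 2 N_n
        a_r = roundoff(a, n_next + 1)               #         shorten a
        e_r = roundoff(2 - a_r * x_r, n_next + 1)   # step 4: assumed form of e_n, then shorten
        x = x_r * e_r                               # step 5: x_{n+1}
        n_bits = n_next
    return x

print(reciprocal_nr(Fraction(3, 2), Fraction(1, 2), 16))   # 21845/32768
```

For a = 3/2 every truncation happens to be exact, so the printed value is the plain Newton-Raphson iterate after three steps, with 1 - a × x = 2^-16. For general a the predicted exponents N_{n+1} = 2N_n are only an estimate of the true error, so a production version would need guard digits or a final check.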
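In an optional implementation, the present application can also be further explained using the Markstein algorithm as an example.
Optionally, the first error is:
Figure PCTCN2020117978-appb-000040
where x_n is the first iteration value, a is the divisor, N_n is a positive integer and E_n denotes a constant;
the method also includes:
determining, from the exponent of the first error, the rounding operation to be applied to the first error;
continuing the iteration with the parameters after the rounding operation to obtain the second iteration value includes:
computing the second iteration value x_{n+1} from the first error after the rounding operation and the first iteration value after the rounding operation.
As with the application to the Newton-Raphson algorithm, the above steps can correspond to the Markstein algorithm, whose iteration scheme is:
x_{n+1} = x_n + x_n × ε_n, ε_{n+1} = 1 - a × x_{n+1}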
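Applying the above method to the Markstein algorithm, the specific computation steps are as follows:
1) From the initial iteration value x_0, compute the initial error ε_0 and express it in scientific notation as:
Figure PCTCN2020117978-appb-000041
where -N_0 is the exponent of ε_0 and N_0 is a positive integer. According to N_0, discard the digits of x_0 after position N_0+1. This step is written as:
Figure PCTCN2020117978-appb-000042
2) Iterate for n = 0, 1, .... First apply the rounding operation to ε_n. The Markstein algorithm converges quadratically, so the error of x_{n+1} is of the order of
Figure PCTCN2020117978-appb-000043
To ensure that rounding ε_n does not affect the precision of the computed x_{n+1}, ε_n keeps 2N_n significant digits. The rounding operation is written as:
Figure PCTCN2020117978-appb-000044
3) Compute
Figure PCTCN2020117978-appb-000045
and apply the rounding operation to e_n. Likewise, to ensure that the error of x_{n+1} is of the order of
Figure PCTCN2020117978-appb-000046
the rounding operation on e_n keeps 2N_n significant digits. The rounding operation is written as
Figure PCTCN2020117978-appb-000047
4) Compute
Figure PCTCN2020117978-appb-000048
and apply the rounding operation to x_{n+1}
Figure PCTCN2020117978-appb-000049
5) Next apply the rounding operation to a. The purpose of rounding a is to compute x_{n+2}; by the quadratic convergence of the algorithm, the error of x_{n+2} is of the order of
Figure PCTCN2020117978-appb-000050
Therefore, to ensure that rounding a does not affect the precision of the computed x_{n+2}, a keeps 4N_n significant digits. The rounding operation is written as
Figure PCTCN2020117978-appb-000051
6) Compute the error of
Figure PCTCN2020117978-appb-000052
namely
Figure PCTCN2020117978-appb-000053
where -N_{n+1} is the exponent of ε_{n+1}. If
Figure PCTCN2020117978-appb-000054
has not yet reached the precision requirement, return to 2) and continue, now with n = n+1.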
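The six Markstein steps can be sketched in the same style. The formulas in the figures are not reproduced in this text, so the sketch makes the following assumptions and labels them in the comments: the intermediate value is e_n = x_n × ε_n on the rounded operands, x_{n+1} is kept to 2N_n+1 significant binary digits, rounding is truncation toward zero, and the error of x_0 is re-evaluated after x_0 has been shortened. All function names are hypothetical.

```python
from fractions import Fraction

def floor_log2(value: Fraction) -> int:
    """Return k with 2**k <= |value| < 2**(k+1); value must be non-zero."""
    k = abs(value.numerator).bit_length() - value.denominator.bit_length()
    return k if abs(value) >= Fraction(2) ** k else k - 1

def keep_significant(value: Fraction, sig_bits: int) -> Fraction:
    """Truncate value (toward zero) to sig_bits significant binary digits."""
    if value == 0:
        return value
    unit = Fraction(2) ** (floor_log2(value) - sig_bits + 1)   # weight of the last kept bit
    return int(value / unit) * unit

def reciprocal_markstein(a: Fraction, x0: Fraction, target_bits: int, max_iter: int = 30) -> Fraction:
    """Approximate 1/a for 1 < a < 2, starting from 0 < x0 < 1 with |1 - a*x0| < 1."""
    eps = 1 - a * x0
    if eps == 0:
        return x0
    x = keep_significant(x0, -floor_log2(eps) + 1)     # step 1: keep N_0 + 1 digits of x_0
    eps = 1 - a * x                                    # error of the shortened x_0 (assumption)
    for _ in range(max_iter):
        if eps == 0:
            return x
        n = -floor_log2(eps)                           # N_n, read off the measured error
        if n >= target_bits:
            return x
        eps_r = keep_significant(eps, 2 * n)           # step 2: 2 N_n significant digits
        e_r = keep_significant(x * eps_r, 2 * n)       # step 3: assumed e_n = x_n * eps_n
        x = keep_significant(x + e_r, 2 * n + 1)       # step 4: assumed digit count
        a_r = keep_significant(a, 4 * n)               # step 5: 4 N_n significant digits
        eps = 1 - a_r * x                              # step 6: error of the new iterate
    raise ArithmeticError("no convergence within max_iter iterations")

print(reciprocal_markstein(Fraction(3, 2), Fraction(1, 2), 16))   # 21845/32768
```

Unlike a fixed doubling schedule, this variant reads N_n off the measured error at every pass, which follows step 6 and keeps the loop stable when rounding makes the exponent grow slightly more slowly than a factor of two.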
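The two iterative implementations above were tested experimentally. In the experiment, 100000 points a were chosen arbitrarily in the interval from 1 to 2 and their reciprocals were computed, with 0.5 as the initial value and 6 iterations per point. The existing Newton-Raphson and Markstein algorithms were run for comparison. The results show that for the existing Newton-Raphson algorithm over the 100000 points, the mean error after 6 iterations is 0.45×10^-16 with a standard deviation of 0.66×10^-16. For the existing Markstein algorithm over the 100000 points, the mean error after 6 iterations is 0.45×10^-16 with a standard deviation of 0.65×10^-16. With the Newton-Raphson algorithm of the embodiments, the mean error is 0.52×10^-16 and the standard deviation 0.72×10^-16; with the Markstein algorithm of the embodiments, the mean error is 0.37×10^-16 and the standard deviation 0.56×10^-16. The errors are of the same order of magnitude and differ only slightly; the Markstein algorithm of the embodiments even has a smaller error for the same number of iterations.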
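A baseline for this kind of experiment is easy to set up for the conventional Newton-Raphson iteration in double precision. The sketch below is not the authors' test harness: the error measure |1 - a × x|, the uniform sampling, the seed and the function names are assumptions, and no particular output value is claimed.

```python
import random

def newton_reciprocal(a: float, x0: float = 0.5, iterations: int = 6) -> float:
    """Conventional Newton-Raphson reciprocal: x <- x * (2 - a * x), all in double precision."""
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - a * x)
    return x

def mean_residual(samples: int = 100_000, seed: int = 0) -> float:
    """Mean of |1 - a * x| over random a between 1 and 2 (assumed error measure)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        a = rng.uniform(1.0, 2.0)
        total += abs(1.0 - a * newton_reciprocal(a))
    return total / samples

print(mean_residual())
```

With x_0 = 0.5 and a between 1 and 2, the initial error is below 0.5, so six squarings take it far below double-precision round-off; the printed mean is then dominated by floating-point rounding.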
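Again taking a = 1.477134351152927 as an example, see Figure 3, a schematic diagram of the values after the rounding operation during Newton-Raphson processing. It shows the rounded values (namely
Figure PCTCN2020117978-appb-000055
Figure PCTCN2020117978-appb-000056
) during the iteration when 1 ÷ 1.477134351152927 is computed with the error-theory-based Newton-Raphson functional iteration algorithm. For the expressions of
Figure PCTCN2020117978-appb-000057
Figure PCTCN2020117978-appb-000058
see the corresponding formulas in the foregoing embodiments. All values are shown in 32-bit binary: the first row gives the name of the value, the second row its exponent, and the remaining rows its mantissa. Invalid digits in the mantissa that have been rounded away are marked with 0. Taking column 2 as an example, column 2 gives the 32-bit binary representation of
Figure PCTCN2020117978-appb-000059
where row 2 is -1, the exponent of
Figure PCTCN2020117978-appb-000060
and rows 3 to 32 give the mantissa of
Figure PCTCN2020117978-appb-000061
of which only rows 3 and 4 hold digits, while rows 5 to 30 are all 0.
Figure PCTCN2020117978-appb-000062
is the value obtained by applying the rounding operation to x_0; its mantissa keeps only two digits.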
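Figure 3 shows that when 1 ÷ a is computed with the method of the embodiments, the word length of the values used differs from one iteration step to the next. Taking the rounded iteration value
Figure PCTCN2020117978-appb-000063
as an example, the mantissa of the initial value (x_0) has only 2 digits, that of the second step (x_1) has 4, and so on, reaching 32 digits only at the last step. The same holds for
Figure PCTCN2020117978-appb-000064
Figure PCTCN2020117978-appb-000065
in the computation. An ordinary iterative algorithm, by contrast, computes with 32 digits in every iteration step. Compared with current functional-iteration computation methods, the division operation method of the embodiments therefore avoids unnecessary computation and greatly improves efficiency.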
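The growth of the word length described above (2, 4, ... up to 32 digits) can be tabulated with a short sketch. The cost model is an assumption made purely for illustration: a multiplication of two k-digit operands is charged k² digit operations, as in schoolbook multiplication, and the function names are hypothetical.

```python
def mantissa_schedule(initial_bits: int, full_bits: int) -> list:
    """Word lengths used per step when the number of valid digits doubles each iteration."""
    bits, schedule = initial_bits, []
    while bits < full_bits:
        schedule.append(bits)
        bits *= 2
    schedule.append(full_bits)
    return schedule

def relative_cost(schedule: list, full_bits: int) -> float:
    """Assumed schoolbook cost (k*k per multiplication) relative to full width at every step."""
    return sum(k * k for k in schedule) / (len(schedule) * full_bits * full_bits)

steps = mantissa_schedule(2, 32)
print(steps)                      # [2, 4, 8, 16, 32]
print(relative_cost(steps, 32))   # 0.26640625
```

Under this rough model the shortened operands need about a quarter of the digit operations of a fixed 32-digit iteration; the actual saving depends on the hardware or big-number library in use.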
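Based on error analysis theory, the embodiments of the present application can, in each step of an iterative algorithm and with precision preserved, round away the invalid digits lying beyond the error, so that values of short word length and equal precision are substituted into the iteration scheme. This reduces the complexity of the computation, in particular of the multiplications in the iteration, and improves efficiency. The approach is applicable to any functional iteration method for which an error analysis can be given; the iteration methods in the two specific implementations above are only examples and do not limit the scope of the embodiments. That is, for any functional iteration method that admits an error analysis, the division operation method of the embodiments can round the relevant values in the iteration according to the error and the precision requirement, so that values of short word length and equal precision are substituted into the iteration scheme, greatly reducing the complexity of the computation and improving efficiency; this is not described further here.
The method provided by the embodiments can be used not only in the division unit of a microprocessor but also in the design of algorithms that require high precision and emulate division in software. It can be applied in all kinds of scenarios involving iterative algorithms, for example floating-point division, greatly reducing computational complexity and the waste of computing resources and improving efficiency. The embodiments impose no limitation on the application scenarios of the method.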
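Based on the description of the division operation method embodiments above, the embodiments of the present application also disclose a division operation apparatus. Referring to Figure 4, the division operation apparatus 400 includes:
an obtaining module 410, configured to obtain the division parameters of the division operation to be performed and to obtain a first iteration value;
a processing module 420, configured to perform error analysis based on the first iteration value to determine the rounding operation to be applied to the division parameters in the current iteration;
the processing module 420 being further configured to continue the iteration with the parameters after the rounding operation to obtain a second iteration value;
the processing module 420 being further configured to, if the second iteration value does not meet the precision requirement, take the second iteration value as the first iteration value and repeat the above steps;
an output module 430, configured to output the second iteration value once the second iteration value meets the precision requirement.
According to an embodiment of the present application, each step of the method of the embodiment shown in Figure 1 can be performed by the modules of the division operation apparatus 400 shown in Figure 4; details are not repeated here.
The division operation apparatus 400 of the embodiments can obtain the division parameters of the division operation to be performed and a first iteration value; perform error analysis based on the first iteration value to determine the rounding operation to be applied to the division parameters in the current iteration; continue the iteration with the parameters after the rounding operation to obtain a second iteration value; and, if the second iteration value does not meet the precision requirement, take the second iteration value as the first iteration value and repeat the above steps until the second iteration value meets the precision requirement, then output the second iteration value. Based on error analysis theory, and with precision preserved, the invalid digits lying beyond the error are rounded away, so that values of short word length and equal precision are substituted into the iteration rule. This can greatly reduce the waste of computing resources and improve the efficiency of functional iteration algorithms.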
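Based on the description of the method embodiments and the apparatus embodiments above, the embodiments of the present application also provide an electronic device. Referring to Figure 5, the electronic device 500 includes at least a processor 501, an input device 502, an output device 503 and a computer storage medium 504. The processor 501, the input device 502, the output device 503 and the computer storage medium 504 in the terminal may be connected by a bus or in other ways.
The computer storage medium 504 may be stored in the memory of the terminal. The computer storage medium 504 is used to store a computer program that includes program instructions, and the processor 501 is used to execute the program instructions stored in the computer storage medium 504. The processor 501 (or CPU, Central Processing Unit) is the computing core and control core of the terminal. It is adapted to implement one or more instructions, and specifically to load and execute one or more instructions so as to implement the corresponding method flow or function. In one embodiment, the processor 501 of the embodiments can be used to perform a series of processing operations, including the method of the embodiment shown in Figure 1.
The embodiments of the present application also provide a computer storage medium (memory), which is the memory device in the terminal and is used to store programs and data. It will be understood that the computer storage medium here may include both the built-in storage medium of the terminal and any extended storage medium supported by the terminal. The computer storage medium provides storage space in which the operating system of the terminal is stored. The storage space also holds one or more instructions adapted to be loaded and executed by the processor 501; these instructions may be one or more computer programs (including program code). Note that the computer storage medium here may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory; optionally, it may also be at least one computer storage medium located remotely from the processor.
In one embodiment, the processor 501 can load and execute one or more instructions stored in the computer storage medium to implement the corresponding steps of the embodiments above. In a specific implementation, one or more instructions in the computer storage medium can be loaded by the processor 501 to execute any step of the method in Figure 1; details are not repeated here.
The embodiments of the present application can reduce the complexity with which functional iteration algorithms handle division, in particular the complexity of the multiplications in the iteration, and thus improve efficiency. Compared with an ordinary functional iteration algorithm, the present application adds only one adaptive rounding step, which is easy to implement. The present application can be of great practical value both in the division unit of the processor 501 and in software modules for computer graphics, computer vision, visualization, geographic information systems, mesh generation and scientific computing that require division at very high precision (beyond the precision provided by the computer hardware); details are not repeated here.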
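Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus and modules described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the division into modules is only a division by logical function; other divisions are possible in an actual implementation. For instance, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. The mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses or modules, and may be electrical, mechanical or of other forms.
Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above embodiments may be implemented wholly or partly in software, hardware, firmware or any combination of them. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (for example coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (for example infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic medium such as a floppy disk, hard disk, magnetic tape or magnetic disc, an optical medium such as a digital versatile disc (DVD), or a semiconductor medium such as a solid state disk (SSD).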

Claims (10)

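  1. A division operation method, characterized by comprising:
    obtaining the division parameters of the division operation to be performed, and obtaining a first iteration value;
    performing error analysis based on the first iteration value to determine the rounding operation to be applied to the division parameters in the current iteration;
    continuing the iteration with the division parameters after the rounding operation to obtain a second iteration value;
    if the second iteration value does not meet the precision requirement, taking the second iteration value as the first iteration value and repeating the above steps until the second iteration value meets the precision requirement, and outputting the second iteration value.
  2. The division operation method according to claim 1, characterized in that performing error analysis based on the first iteration value to determine the rounding operation to be applied to the division parameters in the current iteration comprises:
    obtaining a first error of the first iteration value and a second error of the second iteration value;
    determining, from the first error and the second error, the rounding operation to be applied to the division parameters in the current iteration.
  3. The division operation method according to claim 2, characterized in that determining, from the first error and the second error, the rounding operation to be applied to the division parameters in the current iteration comprises:
    determining the precision of the rounding operation on the first iteration value from the precision of the first error; and determining the precision of the rounding operation on the divisor and/or the dividend of the division operation from the precision of the second error.
  4. The division operation method according to claim 2, characterized in that determining, from the first error and the second error, the rounding operation to be applied to the division parameters in the current iteration comprises:
    determining, from the exponent of the first error, the rounding operations to be applied to the first iteration value and the divisor.
  5. The division operation method according to claim 4, characterized in that the first error is:
    Figure PCTCN2020117978-appb-100001
    where x_n is the first iteration value, a is the divisor, N_n is a positive integer and E_n denotes a constant;
    determining, from the exponent of the first error, the rounding operations to be applied to the first iteration value and the divisor comprises:
    determining that the rounding operation on x_n discards the digits after position N_n+1;
    determining that the rounding operation on a discards the digits after position N_{n+1}+1.
  6. The division operation method according to claim 5, characterized in that the method further comprises:
    obtaining intermediate data
    Figure PCTCN2020117978-appb-100002
    where
    Figure PCTCN2020117978-appb-100003
    is obtained by applying the rounding operation to x_n, and
    Figure PCTCN2020117978-appb-100004
    is obtained by applying the rounding operation to a;
    discarding the digits of e_n after position N_{n+1}+1 to obtain the intermediate data after the rounding operation
    Figure PCTCN2020117978-appb-100005
    continuing the iteration with the parameters after the rounding operation to obtain the second iteration value comprises:
    computing, from the intermediate data after the rounding operation
    Figure PCTCN2020117978-appb-100006
    and
    Figure PCTCN2020117978-appb-100007
    the second iteration value x_{n+1}.
  7. The division operation method according to claim 4, characterized in that the first error is:
    Figure PCTCN2020117978-appb-100008
    where x_n is the first iteration value, a is the divisor, N_n is a positive integer and E_n denotes a constant;
    the method further comprises:
    determining, from the exponent of the first error, the rounding operation to be applied to the first error;
    continuing the iteration with the parameters after the rounding operation to obtain the second iteration value comprises:
    computing the second iteration value x_{n+1} from the first error after the rounding operation, the first iteration value after the rounding operation, the divisor after the rounding operation, and the other intermediate parameters after the rounding operation.
  8. A division operation apparatus, characterized by comprising:
    an obtaining module, configured to obtain the division parameters of the division operation to be performed and to obtain a first iteration value;
    a processing module, configured to perform error analysis based on the first iteration value to determine the rounding operation to be applied to the division parameters in the current iteration;
    the processing module being further configured to continue the iteration with the parameters after the rounding operation to obtain a second iteration value;
    the processing module being further configured to, if the second iteration value does not meet the precision requirement, take the second iteration value as the first iteration value and repeat the above steps;
    an output module, configured to output the second iteration value once the second iteration value meets the precision requirement.
  9. An electronic device, characterized by comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the division operation method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized by storing a computer program which, when executed by a processor, causes the processor to perform the steps of the division operation method according to any one of claims 1 to 7.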
PCT/CN2020/117978 2020-09-02 2020-09-27 Division operation method, apparatus, electronic device and medium WO2022047873A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010908236.0 2020-09-02
CN202010908236.0A CN112181357A (zh) 2020-09-02 2020-09-02 Division operation method, apparatus, electronic device and medium

Publications (1)

Publication Number Publication Date
WO2022047873A1 true WO2022047873A1 (zh) 2022-03-10

Family

ID=73925562

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/117978 WO2022047873A1 (zh) 2020-09-02 2020-09-27 Division operation method, apparatus, electronic device and medium

Country Status (2)

Country Link
CN (1) CN112181357A (zh)
WO (1) WO2022047873A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105955706A (zh) * 2016-06-16 2016-09-21 武汉芯泰科技有限公司 Divider and division operation method
US20170010862A1 (en) * 2015-07-10 2017-01-12 Arm Limited Apparatus and method for performing division
CN109298848A (zh) * 2018-08-29 2019-02-01 中科亿海微电子科技(苏州)有限公司 Dual-mode floating-point division and square root circuit
CN111104092A (zh) * 2019-12-06 2020-05-05 北京多思安全芯片科技有限公司 Fast divider and division operation method
CN111399803A (zh) * 2019-01-03 2020-07-10 北京小米松果电子有限公司 Division operation method, apparatus, storage medium and electronic device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8751555B2 (en) * 2010-07-06 2014-06-10 Silminds, Llc, Egypt Rounding unit for decimal floating-point division
GB2539265B (en) * 2015-06-12 2020-07-29 Advanced Risc Mach Ltd Apparatus and method for controlling rounding when performing a floating point operation
CN105389157A (zh) * 2015-10-29 2016-03-09 中国人民解放军国防科学技术大学 Floating-point divider based on the Goldschmidt algorithm

Also Published As

Publication number Publication date
CN112181357A (zh) 2021-01-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20952115

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20952115

Country of ref document: EP

Kind code of ref document: A1