CN116578338A

CN116578338A - Low-delay trigonometric function hardware acceleration algorithm

Info

Publication number: CN116578338A
Application number: CN202211579750.XA
Authority: CN
Inventors: 周柯; 奉斌; 金庆忍; 俞小勇; 王晓明; 卢柏桦
Original assignee: Electric Power Research Institute of Guangxi Power Grid Co Ltd
Current assignee: Electric Power Research Institute of Guangxi Power Grid Co Ltd
Priority date: 2022-12-06
Filing date: 2022-12-06
Publication date: 2023-08-11

Abstract

The invention discloses a low-delay trigonometric function hardware acceleration algorithm, which comprises the following steps: converting the input angle θ₀ into the first quadrant using a front-end module; determining the initial value (x₀, y₀, z₀) of the calculation, which for sine and cosine calculations is (1/K_n, 0, θ); obtaining the iteration value (x_N, y_N) with an iteration module that merges two iterations into one step; and obtaining the value of the trigonometric function using a post-processing module. By merging two iterations into one step during the iterative operation, the invention reduces the hardware consumption of the hardware module and improves the calculation speed. By combining table lookup with approximate substitution when computing the arctangent, it improves the calculation speed and reduces the consumption of ROM resources.

Description

Low-delay trigonometric function hardware acceleration algorithm

Technical Field

The invention relates to the technical field of hardware acceleration algorithms, in particular to a low-delay trigonometric function hardware acceleration algorithm.

Background

The new-type power distribution network, built mainly on new energy sources and characterized by digitization and intelligence, requires large amounts of information acquisition and computational analysis to ensure a safe, stable and reliable power supply. Power quality is an important measure of the power supply level and a key object of acquisition and analysis in the new-type distribution network. At present, however, the terminal side of the distribution network has serious shortcomings in power quality acquisition and analysis, which places heavy computational pressure on the edge side of the grid. It is therefore necessary to provide a power chip with power quality processing capability to relieve edge-side computational pressure and promote grid-edge integration.

Power quality processing involves a large number of trigonometric function operations. Current power chips mostly compute trigonometric functions by table lookup or by Taylor expansion. Table lookup occupies a large amount of memory and has limited precision. Taylor expansion requires a large amount of computation, runs slowly, occupies substantial CPU resources and consumes considerable power. There is therefore a need for a method that computes trigonometric functions in on-chip hardware, to improve the computing capability of the power chip and reduce its power consumption.

Disclosure of Invention

The invention provides a low-delay trigonometric function hardware acceleration algorithm that solves the problems described in the Background: the slow operation, high memory usage and high hardware consumption that result from the large amount of computation in existing trigonometric function methods.

In order to achieve the above purpose, the present invention provides the following technical solutions: a low-latency trigonometric function hardware acceleration algorithm, comprising the steps of:

S1: using a front-end module, convert the input angle θ₀ into the first quadrant;

S2: determine the initial value (x₀, y₀, z₀) of the calculation; for sine and cosine calculations, the given initial value is (1/K_n, 0, θ);

S3: using an iteration module, obtain the iteration value (x_N, y_N) by a method that merges two iterations into one step;

S4: obtain the value of the trigonometric function using a post-processing module.

Preferably, the step S3 includes the steps of:

S301: input the initial value (x₀, y₀, z₀) of the iterative calculation;

S302: determine whether the iteration count i has reached the maximum iteration number N; if so, output the iteration result; if not, continue with steps S303 to S307;

S303: calculate the arctangent tan⁻¹(2⁻ⁱ) by combining table lookup with approximate substitution;

S304: calculate the iteration directions d_i and d_{i+1} and the iteration values z_i and z_{i+1} of the i-th and (i+1)-th steps according to the following formula;

S305: calculate the (i+1)-th iteration values x_{i+1} and y_{i+1} according to the following formula;

S306: update the iteration count i;

S307: return to step S302.

Preferably, the front-end module in step S1 performs the following operations: when 0 ≤ θ₀ ≤ π/2, the converted angle θ is θ₀; when π/2 < θ₀ ≤ π, the converted angle θ is θ₀ - π/2; when π < θ₀ ≤ 3π/2, the converted angle θ is θ₀ - π; when 3π/2 < θ₀ ≤ 2π, the converted angle θ is θ₀ - 3π/2.

Preferably, the operation formula of the iteration module in the step S3 is as follows:

where d_i = sign(z_i) is the iteration direction.

Preferably, the post-processing module in step S4 includes the following algorithm:

Preferably, tan⁻¹(2⁻ⁱ) is obtained by the following steps:

S01: pre-store the values of tan⁻¹(2⁻ⁱ) (i = 1, 2, 3, …, m) in ROM;

S02: compare i with m; when i is less than m, execute step S03; when i is greater than or equal to m, execute step S04;

S03: obtain the value of tan⁻¹(2⁻ⁱ) by looking it up in the table in ROM;

S04: take the value of tan⁻¹(2⁻ⁱ) to be 2⁻ⁱ.

Preferably, the value of m is as follows:

wherein N is the set maximum iteration number.

Preferably, steps S1 to S4 are implemented by a hardwired (fixed-function) circuit.

Preferably, the maximum iteration number N has a value of 16.

Preferably, the maximum iteration number N has a value of 32.

The beneficial effects of the invention are as follows:

By merging two iterations into one step during the iterative operation, the invention reduces the hardware consumption of the hardware module and improves the calculation speed. By combining table lookup with approximate substitution when computing the arctangent, it improves the calculation speed and reduces the consumption of ROM resources.

Drawings

FIG. 1 is a block diagram of a trigonometric function low-delay hardware acceleration algorithm of the present invention;

FIG. 2 is a flow chart of an iterative algorithm module of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

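With reference to FIG. 1, the invention provides a low-delay trigonometric function hardware acceleration algorithm comprising the following steps:

S1: using a front-end module, convert the input angle θ₀ into the first quadrant;

S2: determine the initial value (x₀, y₀, z₀) of the calculation; for sine and cosine calculations, the given initial value is (1/K_n, 0, θ);

S3: using an iteration module, obtain the iteration value (x_N, y_N) by a method that merges two iterations into one step;

S4: obtain the value of the trigonometric function using a post-processing module.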

The process and principle of a low-delay trigonometric function hardware acceleration algorithm of the invention are described below in conjunction with the four steps described above.

Step S1: using a front-end module, convert the input angle θ₀ into the first quadrant;

The front-end module derives the converted angle from the value of the input θ₀, so that the converted angle θ always lies in the first quadrant, as follows: when 0 ≤ θ₀ ≤ π/2, the converted angle θ is θ₀; when π/2 < θ₀ ≤ π, the converted angle θ is θ₀ - π/2; when π < θ₀ ≤ 3π/2, the converted angle θ is θ₀ - π; when 3π/2 < θ₀ ≤ 2π, the converted angle θ is θ₀ - 3π/2.
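The case analysis of step S1 can be sketched in Python as follows. This is a minimal illustration, not the patent's circuit: the function name `to_first_quadrant` and the returned quadrant index are assumptions introduced here (the index is kept only so that a later stage could undo the conversion), while the thresholds follow the text.

```python
import math

def to_first_quadrant(theta0):
    """Map theta0 in [0, 2*pi] to (theta, quadrant) with theta in [0, pi/2].

    `quadrant` is a hypothetical helper value k such that theta0 = theta + k*pi/2.
    """
    half_pi = math.pi / 2
    if not 0 <= theta0 <= 2 * math.pi:
        raise ValueError("theta0 must lie in [0, 2*pi]")
    if theta0 <= half_pi:          # 0 <= theta0 <= pi/2
        return theta0, 0
    if theta0 <= math.pi:          # pi/2 < theta0 <= pi
        return theta0 - half_pi, 1
    if theta0 <= 3 * half_pi:      # pi < theta0 <= 3*pi/2
        return theta0 - math.pi, 2
    return theta0 - 3 * half_pi, 3  # 3*pi/2 < theta0 <= 2*pi

theta, quadrant = to_first_quadrant(4.0)
print(theta, quadrant)
```

In a fixed-point hardware realization the same selection would typically reduce to comparing against three constants and one subtraction; the floating-point form here is only for readability.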

Step S2: determine the initial value (x₀, y₀, z₀) of the calculation; for sine and cosine calculations, the given initial value is (1/K_n, 0, θ);

Here K_n is treated as a fixed value of 0.607: once the number of iterations n is sufficiently large, K_n is approximately 0.607, so the fixed value 0.607 is used.
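The convergence to 0.607 can be checked numerically. The sketch below assumes that K_n denotes the product of cos(tan⁻¹(2⁻ⁱ)) = 1/sqrt(1 + 2⁻²ⁱ) over iterations i = 0 to n - 1, which is the usual CORDIC scale constant; the text does not give the defining formula, so this definition and the function name `cordic_scale` are assumptions.

```python
import math

def cordic_scale(n):
    """Product of 1/sqrt(1 + 2**(-2*i)) for i = 0..n-1 (assumed definition of K_n)."""
    k = 1.0
    for i in range(n):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k

# Print the constant for a few iteration counts to see how quickly it settles.
for n in (4, 8, 16, 32):
    print(n, cordic_scale(n))
```

Under this assumption the product changes very little beyond a handful of iterations, which is why a single stored constant can serve for both N = 16 and N = 32.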

Step S3: using an iteration module, obtain the iteration value (x_N, y_N) by a method that merges two iterations into one step;

Two iterations are merged into one step. A conventional iterative method performs a single iteration per step, so merging two iterations into each step makes the calculation faster. The algorithm of the iteration module can be summarized by the following formula:

where d_i = sign(z_i) is the iteration direction.
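To illustrate what merging two iterations can look like, the sketch below composes two consecutive micro-rotations of a standard rotation-mode CORDIC into one update. It is an assumption-laden illustration, not the patent's formula: the standard recurrences x' = x - d·y·2⁻ⁱ, y' = y + d·x·2⁻ⁱ, z' = z - d·tan⁻¹(2⁻ⁱ) are assumed, the iteration index is assumed to start at 0, floating point replaces the shifts and adds of real hardware, and x starts at the gain-compensation constant (about 0.607) so that the outputs approximate cos θ and sin θ directly. The names `sign` and `cordic_merged` are hypothetical.

```python
import math

def sign(v):
    """Iteration direction: +1 for non-negative v, -1 otherwise."""
    return 1 if v >= 0 else -1

def cordic_merged(theta, n=16):
    """Rotation-mode CORDIC for theta in [0, pi/2]; n must be even.

    Each loop pass applies micro-rotations i and i+1 in a single update.
    Returns approximations of (cos(theta), sin(theta)).
    """
    # Gain compensation (assumed): start x at the reciprocal of the CORDIC gain.
    x = 1.0
    for i in range(n):
        x /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    y, z = 0.0, theta
    for i in range(0, n, 2):
        d0 = sign(z)                               # direction of step i
        z1 = z - d0 * math.atan(2.0 ** -i)         # residual angle after step i
        d1 = sign(z1)                              # direction of step i+1
        z = z1 - d1 * math.atan(2.0 ** -(i + 1))   # residual angle after step i+1
        # Composition of the two 2x2 micro-rotation matrices.
        a = 1.0 - d0 * d1 * 2.0 ** -(2 * i + 1)
        b = d0 * 2.0 ** -i + d1 * 2.0 ** -(i + 1)
        x, y = a * x - b * y, a * y + b * x
    return x, y

c, s = cordic_merged(0.5)
print(c, s, math.cos(0.5), math.sin(0.5))
```

The merged update needs both directions d_i and d_{i+1} before x and y are touched, which matches the order of steps S304 and S305: the angle path is resolved first, then a single combined update is applied to the vector.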

Fig. 2 shows the operation steps of the iteration module, which can be expressed as the following steps:

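S301: input the initial value (x₀, y₀, z₀) of the iterative calculation;

S302: determine whether the iteration count i has reached the maximum iteration number N; if so, output the iteration result; if not, continue with steps S303 to S307;

The maximum iteration number N is 16 or 32. These are the values commonly adopted in industry, chosen so that an accurate output is obtained with as few iterations as possible.

S303: calculate the arctangent tan⁻¹(2⁻ⁱ) by combining table lookup with approximate substitution;

S304: calculate the iteration directions d_i and d_{i+1} and the iteration values z_i and z_{i+1} of the i-th and (i+1)-th steps;

S305: calculate the (i+1)-th iteration values x_{i+1} and y_{i+1};

S306: update the iteration count i;

S307: return to step S302.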

In step S303, the arctangent tan⁻¹(2⁻ⁱ) is obtained by combining stored values with approximate substitution, for the following reasons. A pure table-lookup method must compute all the arctangent values in advance and store them in ROM, which consumes a large amount of memory. A pure approximate-substitution method replaces the arctangent value with the angle value 2⁻ⁱ itself, but in the early stage of the iterative algorithm the angle value is large, so the approximation error is large. The combined method uses table lookup in the early iterations and approximate substitution in the later iterations, which reduces memory consumption while preserving calculation accuracy.
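The size of the substitution error can be made concrete with the Taylor series of the arctangent. This is a standard identity offered as supporting arithmetic, not a formula taken from the patent:

```latex
\tan^{-1}\!\left(2^{-i}\right)
  = 2^{-i} - \frac{2^{-3i}}{3} + \frac{2^{-5i}}{5} - \cdots
\quad\Longrightarrow\quad
\left|\tan^{-1}\!\left(2^{-i}\right) - 2^{-i}\right| < \frac{2^{-3i}}{3}
```

The bound follows because the series alternates with decreasing terms. For i = 1 the bound is about 4.2 × 10⁻², which is far too coarse, while for i = 8 it is about 2.0 × 10⁻⁸, so the substitution becomes harmless once i is moderately large.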

The steps for obtaining the arctangent tan⁻¹(2⁻ⁱ) by combining stored values with approximate substitution are as follows:

S01: pre-store the values of tan⁻¹(2⁻ⁱ) (i = 1, 2, 3, …, m) in ROM;

S02: compare i with m; when i is less than m, execute step S03; when i is greater than or equal to m, execute step S04;

S03: obtain the value of tan⁻¹(2⁻ⁱ) by looking it up in the table in ROM;

S04: take the value of tan⁻¹(2⁻ⁱ) to be 2⁻ⁱ.
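Steps S01 to S04 can be sketched in Python as follows. The threshold m is left as a free parameter because its defining formula is not available in this text; the function names `make_atan_rom` and `atan_pow2`, and the use of a dictionary to stand in for the ROM, are assumptions made for illustration.

```python
import math

def make_atan_rom(m):
    """S01: table of arctan(2**-i) for i = 1..m, standing in for the ROM."""
    return {i: math.atan(2.0 ** -i) for i in range(1, m + 1)}

def atan_pow2(i, rom, m):
    """S02-S04: look up when i < m, otherwise approximate arctan(2**-i) by 2**-i."""
    if i < m:
        return rom[i]        # S03: table lookup
    return 2.0 ** -i         # S04: approximate substitution

m = 8                         # hypothetical threshold, chosen only for this demo
rom = make_atan_rom(m)
for i in (1, 4, 8, 12):
    exact = math.atan(2.0 ** -i)
    print(i, atan_pow2(i, rom, m), abs(atan_pow2(i, rom, m) - exact))
```

The table only has to cover the first few indices, so its size is set by m rather than by the maximum iteration number N, which is where the ROM saving comes from.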

Step S4: and obtaining the value of the trigonometric function by using a post-processing module.

The post-processing module in step S4 includes the following algorithm:
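One way to undo the quadrant conversion of step S1 is sketched below. It is offered under stated assumptions rather than as the patent's own formula: it assumes x_N ≈ cos θ and y_N ≈ sin θ after the iteration, and it uses a hypothetical quadrant index k with θ₀ = θ + k·π/2, together with the standard identities for shifting an angle by multiples of π/2. The function name `post_process` is illustrative.

```python
import math

def post_process(x_n, y_n, quadrant):
    """Return (sin(theta0), cos(theta0)) from x_n ~ cos(theta), y_n ~ sin(theta).

    `quadrant` is the assumed index k with theta0 = theta + k*pi/2.
    """
    if quadrant == 0:      # theta0 = theta
        return y_n, x_n
    if quadrant == 1:      # theta0 = theta + pi/2
        return x_n, -y_n
    if quadrant == 2:      # theta0 = theta + pi
        return -y_n, -x_n
    if quadrant == 3:      # theta0 = theta + 3*pi/2
        return -x_n, y_n
    raise ValueError("quadrant must be 0, 1, 2 or 3")

# Demo with exact first-quadrant values in place of the iteration output.
theta0 = 4.0
theta = theta0 - math.pi          # third case of step S1
s, c = post_process(math.cos(theta), math.sin(theta), 2)
print(s, math.sin(theta0), c, math.cos(theta0))
```

Each case only swaps the two outputs or flips a sign, so in hardware this stage would cost a multiplexer and a negation rather than any further arithmetic.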

In summary, the invention uses a hardware circuit to compute trigonometric functions, which greatly increases the operation speed of the algorithm and reduces the computational pressure and power consumption of the CPU. Compared with other calculation methods, the algorithm merges two iterations into one step during the iterative operation, which reduces the hardware consumption of the hardware module and improves the calculation speed. When computing the arctangent, it combines table lookup with approximate substitution, which improves the calculation speed and reduces the consumption of ROM resources.

First, merging two iterations into one step increases the calculation speed and, because the number of iteration steps is reduced, lowers hardware consumption. A conventional iterative method performs a single iteration per step, whereas the merged method performs two. Compared with the prior art, the invention therefore offers higher calculation speed and lower hardware consumption.

Second, the invention computes the arctangent by combining table lookup with approximate substitution. A pure table-lookup method must compute all the arctangent values in advance and store them in ROM, which consumes a large amount of memory. A pure approximate-substitution method replaces the arctangent value with the angle value itself, but in the early stage of the iterative algorithm the angle value is large, so the approximation error is large. Neither method is satisfactory on its own. The invention therefore uses table lookup in the early iterations and approximate substitution in the later iterations, which reduces memory consumption and improves calculation accuracy.

The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention, and are intended to be included within the scope of the appended claims and description.

Claims

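1. A low-latency trigonometric function hardware acceleration algorithm, comprising the steps of:

S1: using a front-end module, convert the input angle θ₀ into the first quadrant;

S2: determine the initial value (x₀, y₀, z₀) of the calculation; for sine and cosine calculations, the given initial value is (1/K_n, 0, θ);

S3: using an iteration module, obtain the iteration value (x_N, y_N) by a method that merges two iterations into one step;

S4: obtain the value of the trigonometric function using a post-processing module.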

2. The low-latency trigonometric function hardware acceleration algorithm according to claim 1, characterized in that said step S3 comprises the steps of:

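S301: input the initial value (x₀, y₀, z₀) of the iterative calculation;

S302: determine whether the iteration count i has reached the maximum iteration number N; if so, output the iteration result; if not, continue with steps S303 to S307;

S303: calculate the arctangent tan⁻¹(2⁻ⁱ) by combining table lookup with approximate substitution;

S304: calculate the iteration directions d_i and d_{i+1} and the iteration values z_i and z_{i+1} of the i-th and (i+1)-th steps;

S305: calculate the (i+1)-th iteration values x_{i+1} and y_{i+1};

S306: update the iteration count i;

S307: return to step S302.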

3. The low-latency trigonometric function hardware acceleration algorithm according to claim 1, wherein the front-end module in step S1 performs the following operations: when 0 ≤ θ₀ ≤ π/2, the converted angle θ is θ₀; when π/2 < θ₀ ≤ π, the converted angle θ is θ₀ - π/2; when π < θ₀ ≤ 3π/2, the converted angle θ is θ₀ - π; when 3π/2 < θ₀ ≤ 2π, the converted angle θ is θ₀ - 3π/2.

4. The low-latency trigonometric function hardware acceleration algorithm according to claim 1, wherein the operation formula of the iterative module in step S3 is as follows:

where d_i = sign(z_i) is the iteration direction.

5. A low-latency trigonometric function hardware acceleration algorithm according to claim 3, wherein the post-processing module of step S4 includes the following algorithm:

6. A low-latency trigonometric function hardware acceleration algorithm according to claim 2 or 4, characterized in that tan⁻¹(2⁻ⁱ) is obtained by the following steps:

S01: pre-store the values of tan⁻¹(2⁻ⁱ) (i = 1, 2, 3, …, m) in ROM;

S02: compare i with m; when i is less than m, execute step S03; when i is greater than or equal to m, execute step S04;

S03: obtain the value of tan⁻¹(2⁻ⁱ) by looking it up in the table in ROM;

S04: take the value of tan⁻¹(2⁻ⁱ) to be 2⁻ⁱ.

7. The low-latency trigonometric function hardware acceleration algorithm according to claim 6, wherein the value of m is as follows:

wherein N is the set maximum iteration number.

8. The low-latency trigonometric function hardware acceleration algorithm according to claim 1, wherein steps S1 to S4 are implemented by a hardwired (fixed-function) circuit.

9. A low-latency trigonometric function hardware acceleration algorithm according to claim 2, characterized in that: the maximum iteration number N takes a value of 16.

10. A low-latency trigonometric function hardware acceleration algorithm according to claim 2, characterized in that: the maximum iteration number N takes a value of 32.