CN116185126A

CN116185126A - Complex function output method and device based on lookup table

Info

Publication number: CN116185126A
Application number: CN202211094356.7A
Authority: CN
Inventors: 曹二帅; 冯若飞; 张莉莉
Original assignee: Chongqing Bitmap Information Technology Co ltd
Current assignee: Chongqing Bitmap Information Technology Co ltd
Priority date: 2022-09-08
Filing date: 2022-09-08
Publication date: 2023-05-30

Abstract

The invention discloses a complex function output method based on a lookup table, comprising the following steps: establishing a lookup table of the complex function, wherein the lookup table allocates more storage space to the significantly changing region of the complex function than to the gently changing region; and receiving input data of the complex function and obtaining output data of the complex function according to the lookup table and an interpolation algorithm. The invention also discloses a complex function output device based on the lookup table. According to the invention, the significantly changing region and the gently changing region are stored in memory with different step sizes, and within the limited memory resources more memory space is allocated to the region where the function changes significantly, so that the LUT achieves higher precision with the same memory space.

Description

Complex function output method and device based on lookup table

Technical Field

The invention belongs to the technical field of digital circuits, and particularly relates to a complex function output method and device based on a lookup table.

Background

In fields such as digital signal processing, image processing, artificial intelligence and radar, relatively complex mathematical functions such as exponential, logarithmic and trigonometric functions are often required. Common methods for implementing these complex functions include the CORDIC algorithm, Taylor expansion and the LUT (Look-Up Table). Since a LUT is essentially a set of RAMs (Random Access Memory), the complex functions described above can be fitted by configuring the RAM of the LUT offline. A LUT is logically simple to implement, and because it is essentially a RAM, different complex functions can be fitted on the same RAM in the same hardware circuit simply by loading different configurations. Fitting complex functions with a LUT has therefore become a mainstream method, and LUT implementation methods have become an important topic in research on implementing complex functions.

Each LUT uses a RAM of fixed depth, and the depth is typically increased to achieve higher fitting accuracy. However, because resources are limited, the RAM depth cannot be increased arbitrarily, so a scheme that achieves higher-precision function fitting with a limited RAM depth is needed.

In the prior art, when a complex function is implemented by using a LUT, an output value is found in a lookup table according to the position of input data in an x coordinate, and is used as the output of the complex function. For the LUT, if the coverage of the LUT in the x-axis direction is MIN to MAX and the RAM depth of the LUT is N, the step in the x-axis direction and the data stored at each address in the RAM of the LUT are as shown in the following formula (1):

In formula (1), LUT(0) denotes the data stored at address 0 of the LUT's RAM, LUT(1) denotes the data stored at address 1, and so on up to LUT(N-1), the data stored at address N-1. Here f(x) is the complex function to be fitted, x is its independent variable, and the fitting range is MIN ≤ x ≤ MAX. The function value f(0×step+MIN) is stored in LUT(0), f(1×step+MIN) is stored in LUT(1), and so on, until f((N-1)×step+MIN) is stored in LUT(N-1). The quantity step in formula (1) is the spacing between the actual independent-variable values that correspond to adjacent LUT RAM addresses.

Specifically, when the input real data x satisfies 0×step+MIN ≤ x < 1×step+MIN, the data of LUT(0) is selected as the result of the complex function; when x satisfies 1×step+MIN ≤ x < 2×step+MIN, the data of LUT(1) is selected; and so on, until the data of LUT(N-1) is selected when x satisfies (N-1)×step+MIN ≤ x ≤ MAX.

When the input real data x is smaller than MIN or larger than MAX, the result of the complex function at x is obtained by extrapolation, according to the following formula (2), where k1 is the extrapolation coefficient for x < MIN and k2 is the extrapolation coefficient for x > MAX.

In the prior art, the complex function is stored in a single memory, and every part of the function is stored with the same step size. As a result, the output accuracy is less than ideal in regions where the function changes significantly, while accuracy is wasted in regions where it changes gently.

Moreover, in the prior art, when the independent variable x satisfies M×step+MIN ≤ x < (M+1)×step+MIN, where 0 ≤ M ≤ N-1, the data of LUT(M) is selected as the result of the complex function. Every x in that range therefore yields the same value LUT(M), which introduces a certain error.
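The prior-art scheme described above can be sketched as follows. This is only an illustrative Python sketch: the function names build_uniform_lut and lookup_uniform and the example values (f(x) = 2^x, MIN = -16, step = 4, depth 8) are assumptions chosen for illustration, the step is passed in explicitly rather than derived, and inputs outside the table are not modeled.

```python
import math


def build_uniform_lut(f, x_min, step, depth):
    """Store f(i*step + x_min) at address i, for i = 0 .. depth-1."""
    return [f(i * step + x_min) for i in range(depth)]


def lookup_uniform(lut, x_min, step, x):
    """Return LUT(M) for M*step + x_min <= x < (M+1)*step + x_min."""
    m = math.floor((x - x_min) / step)
    if not 0 <= m < len(lut):
        # Out-of-range handling is deliberately left out of this sketch.
        raise ValueError("x is outside the range covered by the table")
    return lut[m]


# Illustrative values only (assumed for this sketch).
lut = build_uniform_lut(lambda x: 2.0 ** x, x_min=-16, step=4, depth=8)

# Every x that falls into the same bin returns the same stored value.
same_bin = [lookup_uniform(lut, -16, 4, x) for x in (0, 1, 2, 3)]
print(same_bin)
```

All four inputs land on the same address, which is the source of the error noted above: the stored value is exact only at the left edge of the bin.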

Disclosure of Invention

Aiming at the defects existing in the prior art, the invention provides a lookup table-based complex function output method and a lookup table-based complex function output device which can effectively improve the complex function output precision.

In a first aspect, a method for outputting a complex function based on a lookup table includes the steps of:

establishing a lookup table of the complex function, wherein the lookup table allocates more storage space to the significantly changing region of the complex function than to the gently changing region;

and receiving input data of the complex function, and obtaining output data of the complex function according to the lookup table and the interpolation algorithm.

Preferably, a lookup table address and operation parameters of the complex function are obtained according to the input data of the complex function;

outputting corresponding lookup table data according to the lookup table address; wherein the look-up table allocates more memory in the significantly varying region of the complex function than in the smoothly varying region;

and calculating according to the operation parameters and the lookup table data to obtain output data of the complex function.

Preferably, the output data of the complex function is obtained by linear interpolation:

f(x) = (N-x)/(N-M)*RAM(m) + (x-M)/(N-M)*RAM(m+1), where M ≤ x ≤ N.

Preferably, the complex function is f(x) = 2^x;

the value of N-M is 2 to the power of p, that is, N-M = 2^p.

In a second aspect, a complex function output device based on a lookup table includes an address range selection logic module, a lookup table module, and an operation processing module;

the address range selection logic module is used for obtaining the address of the lookup table and the operation parameters of the complex function according to the input data of the complex function and outputting the address and the operation parameters to the lookup table module and the operation processing module respectively;

the lookup table module is used for storing the lookup table of the complex function and outputting corresponding lookup table data to the operation processing module according to the received lookup table address; wherein the look-up table allocates more memory in the significantly varying region of the complex function than in the smoothly varying region;

the operation processing module is used for receiving the operation parameters and the lookup table data, calculating and outputting the operation result of the complex function.

As a preferable scheme, the operation processing module adopts a linear interpolation method to calculate:

Preferably, the operation processing module includes:

a first inverter for receiving the operation parameter M, negating it and outputting the result;

a first adder for adding the input data of the complex function to the output data of the first inverter and outputting the sum;

a first multiplier for multiplying the RAM(m+1) data in the lookup table data by the output data of the first adder and outputting the product;

a second inverter for receiving the input data of the complex function, negating it and outputting the result;

a second adder for adding the operation parameter N to the output data of the second inverter and outputting the sum;

a second multiplier for multiplying the RAM(m) data in the lookup table data by the output data of the second adder and outputting the product;

a third adder for adding the output data of the first multiplier and the second multiplier and outputting the sum;

and a divider for dividing the output data of the third adder by N-M and outputting the operation result.

Preferably, the divider comprises a right shift module;

the right shift module is configured to receive the operation parameter p and to right-shift the output data of the third adder by p bits, where N-M = 2^p.

Compared with the prior art, the invention has the following beneficial effects:

1. When some regions of the complex function change significantly and others change gently, the invention stores the significantly changing regions and the gently changing regions in memory with different step sizes, and within the limited memory resources allocates more memory space to the regions where the function changes more significantly, so that the LUT achieves higher precision with the same memory space.

2. The invention adopts linear interpolation, so that each x within an interval obtains its own function value instead of every x in the interval obtaining the same value, which further improves the precision of the LUT.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a complex function output method based on a lookup table according to the present invention;

FIG. 2 is a schematic diagram of a complex function output device based on a lookup table according to the present invention;

FIG. 3 is a schematic diagram of a complex function output device based on a lookup table according to an embodiment of the present invention;

FIG. 4 is a schematic flow chart of a complex function output method based on a lookup table according to the present invention.

Reference numerals: 1, address range selection logic module;

2, lookup table module;

3, operation processing module; 31, first inverter; 32, first adder; 33, second inverter; 34, second adder; 35, first multiplier; 36, second multiplier; 37, third adder; 38, right shift module.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

In a first aspect, a method for outputting a complex function based on a lookup table, as shown in fig. 1, includes the following steps:

S101: establishing a lookup table of the complex function, wherein the lookup table allocates more storage space to the significantly changing region of the complex function than to the gently changing region.

In this step, the significant change region refers to a region in which the function value of the complex function changes significantly with the change of the argument; the gentle change region is a region in which the function value of the complex function changes less significantly with the change of the independent variable. The regions of significant change and regions of insignificant change are different according to the specific complex function, and can be specifically set and selected according to actual needs by those skilled in the art.

Taking the complex function f(x) = 2^x as an example, Table 1 lists the independent variable and the function value at each integer point for -16 ≤ x ≤ 15.

x	f(x)	x	f(x)	x	f(x)	x	f(x)
-16	0.00001526	-8	0.00390625	0	1	8	256
-15	0.00003051	-7	0.0078125	1	2	9	512
-14	0.00006103	-6	0.015625	2	4	10	1024
-13	0.00012207	-5	0.03125	3	8	11	2048
-12	0.00024414	-4	0.0625	4	16	12	4096
-11	0.00048828	-3	0.125	5	32	13	8192
-10	0.00097656	-2	0.25	6	64	14	16384
-9	0.00195312	-1	0.5	7	128	15	32768

TABLE 1

In the prior art, given that memory resources are limited, assume the LUT hardware has only a RAM of depth 8. For the function values in Table 1, a RAM address space of depth 4 is allocated to all data with -16 ≤ x ≤ -1, and a RAM address space of depth 4 is allocated to all data with 0 ≤ x ≤ 15. The data stored at the 8 addresses of the LUT's RAM in the prior art is shown in Table 2 below.

addr	data	addr	data
0	0.00001526	4	1
1	0.00024414	5	16
2	0.00390625	6	256
3	0.0625	7	4096

TABLE 2

In this step, for the function values in Table 1, the limited address space can be divided so that less memory is allocated to the gently changing region and more memory to the significantly changing region. For example, a RAM address space of depth 2 may be allocated to all data with -16 ≤ x ≤ -1, and a RAM address space of depth 6 to all data with 0 ≤ x ≤ 15. The data stored at the 8 addresses of the LUT's RAM may then be as shown in Table 3 below.

addr	data	addr	data
0	0.00001526	4	64
1	0.00390625	5	512
2	1	6	4096
3	8	7	32768

TABLE 3

The output of the complex function is then obtained as follows:

for all data with x < -16, by extrapolation from RAM(0) in Table 3;

for all data with -16 ≤ x ≤ -9, from RAM(0) and RAM(1) in Table 3;

for all data with -8 ≤ x ≤ -1, from RAM(1) and RAM(2) in Table 3;

for all data with 0 ≤ x ≤ 2, from RAM(2) and RAM(3) in Table 3;

for all data with 3 ≤ x ≤ 5, from RAM(3) and RAM(4) in Table 3;

for all data with 6 ≤ x ≤ 8, from RAM(4) and RAM(5) in Table 3;

for all data with 9 ≤ x ≤ 11, from RAM(5) and RAM(6) in Table 3;

for all data with 12 ≤ x ≤ 14, from RAM(6) and RAM(7) in Table 3;

for all data with x ≥ 15, by extrapolation from RAM(7) in Table 3.
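The mapping from an input x to the RAM addresses of Table 3 can be sketched as follows. This is a minimal Python sketch under stated assumptions: the names BREAKPOINTS, TABLE_3 and select_addresses are hypothetical, the breakpoints are read off Table 3 (the x values whose function values are stored at addresses 0 to 7), and only the address selection is modeled, not the computation of the output value.

```python
from bisect import bisect_right

# x values whose function values are stored at addresses 0..7 of Table 3.
BREAKPOINTS = [-16, -8, 0, 3, 6, 9, 12, 15]
TABLE_3 = [2.0 ** x for x in BREAKPOINTS]


def select_addresses(x):
    """Return which RAM address(es) are used for input x.

    Inputs below the first breakpoint, or at or beyond the last one,
    use a single boundary entry; all other inputs use the two entries
    that bracket x.
    """
    if x < BREAKPOINTS[0]:
        return ("extrapolate", 0)
    if x >= BREAKPOINTS[-1]:
        return ("extrapolate", len(BREAKPOINTS) - 1)
    m = bisect_right(BREAKPOINTS, x) - 1  # largest m with BREAKPOINTS[m] <= x
    return ("interpolate", m, m + 1)


for x in (-17, -9, -8, 2, 14, 15):
    print(x, select_addresses(x))
```

Six of the eight entries sit at x ≥ 0, where 2^x grows quickly, while only two cover the sixteen integer points below zero.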

S102: and receiving input data of the complex function, and obtaining output data of the complex function according to the lookup table and the interpolation algorithm.

In this step, in order to obtain the function value of f (x), the present invention may be implemented by an interpolation method. Specifically, the function value of f (x) can be obtained by the following linear interpolation formula (3):

f(x) = (N-x)/(N-M)*RAM(m) + (x-M)/(N-M)*RAM(m+1)  (3)

In the region M ≤ x ≤ N, the value stored in the LUT for M is RAM(m), and the value stored for N is RAM(m+1). Table 1 can be partitioned as follows: in the region -16 ≤ x ≤ -8, the function value f(-16) at x = -16 is stored in RAM(0) of the LUT and the function value f(-8) at x = -8 is stored in RAM(1). Here M = -16, N = -8, m = 0 and N-M = 8, and the value at any point x with -16 ≤ x ≤ -8 can be obtained from formula (3). For ease of calculation, the x-axis may be partitioned so that N-M equals 2 to the power of p, in which case the division in formula (3) reduces to a right shift by p bits.

In the same manner, the x-axis can be further divided into the regions -8 ≤ x ≤ 0, 0 ≤ x ≤ 4, 4 ≤ x ≤ 8, 8 ≤ x ≤ 10, 10 ≤ x ≤ 12 and 12 ≤ x ≤ 14, as shown in Table 4 below. When x lies within these regions, i.e. -16 ≤ x ≤ 14, the function value is obtained by interpolation. When x lies outside -16 ≤ x ≤ 14, the function value may be obtained by extrapolation using formula (2).

addr	data	addr	data
0	0.00001526	4	256
1	0.00390625	5	1024
2	1	6	4096
3	16	7	16384

TABLE 4

For example, to calculate f(2): since 0 ≤ 2 ≤ 4, where 0 corresponds to RAM(2) of the LUT and 4 corresponds to RAM(3), the function value f(2) is given by formula (4) below, with M = 0, N = 4 and m = 2 in formula (3):

f(2) = (4-2)/(4-0)*RAM(2) + (2-0)/(4-0)*RAM(3) = 0.5*1 + 0.5*16 = 8.5  (4)
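The worked example can be reproduced with a short Python sketch of the linear interpolation formula applied to Table 4. The names BREAKPOINTS, RAM and interpolate are assumptions made for this sketch, floating-point arithmetic is used for readability, and inputs outside the table are rejected rather than extrapolated.

```python
from bisect import bisect_right

# Partition of the x-axis that corresponds to Table 4.
BREAKPOINTS = [-16, -8, 0, 4, 8, 10, 12, 14]
RAM = [2.0 ** x for x in BREAKPOINTS]  # RAM(0) .. RAM(7)


def interpolate(x):
    """Linear interpolation between RAM(m) and RAM(m+1) for M <= x <= N."""
    if not BREAKPOINTS[0] <= x <= BREAKPOINTS[-1]:
        raise ValueError("x is outside the interpolation range of the table")
    # Clamp so that x equal to the last breakpoint uses the last segment.
    m = min(bisect_right(BREAKPOINTS, x) - 1, len(BREAKPOINTS) - 2)
    M, N = BREAKPOINTS[m], BREAKPOINTS[m + 1]
    return (N - x) / (N - M) * RAM[m] + (x - M) / (N - M) * RAM[m + 1]


print(interpolate(2))  # 8.5
```

At a breakpoint the interpolation returns the stored value itself, for example interpolate(4) gives RAM(3) = 16.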

It can be appreciated that the lookup-table-based complex function output method of this embodiment uses the function f(x) = 2^x only for convenience of description. The method is not limited to exponential functions and can also be applied to logarithmic functions, trigonometric functions or other complex functions (in particular, functions in which significantly changing regions and gently changing regions can be clearly distinguished).

Compared with the prior art, this embodiment has the following advantages. First, when some regions of the complex function change significantly and others change gently, and the total LUT memory space is limited, the embodiment allocates more memory space to the significantly changing regions, as shown in Table 3, which improves the precision in those regions. Less memory space is allocated to the gently changing regions, and because the function changes slowly there, representing the data with less memory does not significantly affect the precision. Second, as can be seen from Table 2, the prior art maps every x in a range to a single function value instead of using the interpolation algorithm of this embodiment, so f(0), f(1), f(2) and f(3) all correspond to RAM(4), i.e. they all evaluate to 1, which is a large error. The scheme of this embodiment can therefore effectively improve the output precision of the complex function compared with the prior art.
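As a rough numerical check of this comparison, the following Python sketch evaluates both schemes for f(x) = 2^x on the integer inputs covered by both tables. It is an illustration under assumptions, not a measurement from the source: the helper names are hypothetical, the prior-art table is the uniform layout of Table 2, the embodiment uses the Table 4 partition with linear interpolation, and the metric chosen here is the worst-case absolute error.

```python
from bisect import bisect_right


def f(x):
    return 2.0 ** x


# Prior art (Table 2): uniform step of 4 starting at -16, one value per bin.
UNIFORM = [f(-16 + 4 * i) for i in range(8)]


def prior_art(x):
    return UNIFORM[(x + 16) // 4]


# Embodiment (Table 4): non-uniform breakpoints plus linear interpolation.
BREAKPOINTS = [-16, -8, 0, 4, 8, 10, 12, 14]
RAM = [f(b) for b in BREAKPOINTS]


def embodiment(x):
    m = min(bisect_right(BREAKPOINTS, x) - 1, len(BREAKPOINTS) - 2)
    M, N = BREAKPOINTS[m], BREAKPOINTS[m + 1]
    return ((N - x) * RAM[m] + (x - M) * RAM[m + 1]) / (N - M)


xs = range(-16, 15)  # integer inputs -16 .. 14, covered by both tables
max_err_prior = max(abs(f(x) - prior_art(x)) for x in xs)
max_err_embodiment = max(abs(f(x) - embodiment(x)) for x in xs)
print(max_err_prior, max_err_embodiment)  # 12288.0 2048.0
```

With this metric and input set, both schemes use 8 entries and the worst-case error drops from 12288 (at x = 14) to 2048 (at x = 13). Other metrics or individual points can rank the two schemes differently, so the figures should be read only as an illustration of the trend.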

According to the embodiment, different step sizes are adopted for the significant change area and the smooth change area to store in the memory, so that more memory space is allocated to the area with significant change in the function in the limited memory resource, and the LUT can obtain higher precision under the same memory space.

In a second aspect, as shown in fig. 2 and 3, a complex function output device based on a lookup table includes an address range selection logic module 1, a lookup table module 2, and an operation processing module 3, where:

the address range selection logic module 1 is configured to obtain a lookup table address and an operation parameter of the complex function according to input data of the complex function, and output the lookup table address and the operation parameter to the lookup table module and the operation processing module respectively;

the lookup table module 2 is configured to store a lookup table of the complex function, and output corresponding lookup table data to the operation processing module according to the received lookup table address; wherein the look-up table allocates more memory in the significantly varying region of the complex function than in the smoothly varying region;

the operation processing module 3 is configured to receive the operation parameters and the lookup table data, perform calculation, and output an operation result of the complex function.

In this embodiment, taking the complex function f(x) = 2^x as an example and following the lookup-table-based complex function output method of the first aspect, the lookup table data of the complex function is first stored in advance in the memory of the lookup table module. The independent variable x is then input to the address range selection logic module shown in FIG. 3, which generates the operation parameters. These may include N, M and m in formula (3), and p (when the divider is implemented as a right shift module), where m is the lookup table address and N-M = 2^p.

As shown in fig. 3, the operation processing module may include two inverters, three adders, two multipliers and a divider. The structure of the operation processing module is related to a specific calculation formula to be realized, and the formula (3) is selected for explanation.

Wherein, M is input to an a port of the first adder 32 through the first inverter 31, the independent variable x is input to a b port of the first adder 32, and the first adder 32 outputs x-M.

Meanwhile, the independent variable x is input to the a port of the second adder 34 through the second inverter 33, the operation parameter N is input to the b port of the second adder 34, and the second adder 34 outputs N-x.

The RAM is a true dual-port RAM, which can read data from two addresses at the same time. RAM(m+1) and RAM(m) used in formula (3) are read from the RAM according to the address m obtained by the address range selection logic module 1.

The obtained RAM(m+1) is input to the a port of the first multiplier 35, the output x-M of the first adder 32 is input to the b port of the first multiplier 35, and the first multiplier 35 outputs (x-M)*RAM(m+1).

The obtained RAM(m) is input to the a port of the second multiplier 36, the output N-x of the second adder 34 is input to the b port of the second multiplier 36, and the second multiplier 36 outputs (N-x)*RAM(m).

Next, the output (x-M)*RAM(m+1) of the first multiplier 35 is input to port a of the third adder 37, the output (N-x)*RAM(m) of the second multiplier 36 is input to port b of the third adder 37, and the third adder 37 outputs (N-x)*RAM(m) + (x-M)*RAM(m+1).

In a specific hardware implementation, to simplify the hardware design, N-M may be chosen as 2 to the power of p, so that the divider can be implemented by the right shift module 38. The output (N-x)*RAM(m) + (x-M)*RAM(m+1) of the third adder 37 is input to the right shift module 38 with a shift amount of p, and the output of the right shift module 38 is the final LUT output given by formula (3).
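The data path described above can be modeled in software as a rough behavioral sketch. The following Python code is not the circuit itself: the function names, the 16-bit fractional fixed-point format (FRAC_BITS), the restriction to integer inputs inside the table range and the use of the Table 4 partition are all assumptions made for illustration.

```python
from bisect import bisect_right

FRAC_BITS = 16  # assumed fixed-point format: stored word = value * 2**16
BREAKPOINTS = [-16, -8, 0, 4, 8, 10, 12, 14]  # Table 4 partition
RAM = [round((2.0 ** b) * (1 << FRAC_BITS)) for b in BREAKPOINTS]


def address_range_select(x):
    """Module 1: derive m, M, N and p (with N - M == 2**p) from x."""
    m = min(bisect_right(BREAKPOINTS, x) - 1, len(BREAKPOINTS) - 2)
    M, N = BREAKPOINTS[m], BREAKPOINTS[m + 1]
    p = (N - M).bit_length() - 1
    assert N - M == 1 << p, "segment width must be a power of two"
    return m, M, N, p


def lookup_table(m):
    """Module 2: dual-port read of RAM(m) and RAM(m+1)."""
    return RAM[m], RAM[m + 1]


def operation_processing(x, M, N, p, ram_m, ram_m1):
    """Module 3: the inverter, adder, multiplier and shifter chain."""
    neg_M = -M                  # first inverter 31
    x_minus_M = x + neg_M       # first adder 32
    prod1 = ram_m1 * x_minus_M  # first multiplier 35
    neg_x = -x                  # second inverter 33
    N_minus_x = N + neg_x       # second adder 34
    prod2 = ram_m * N_minus_x   # second multiplier 36
    total = prod1 + prod2       # third adder 37
    return total >> p           # right shift module 38 replaces the divider


def lut_output(x):
    """Integer x inside the table range only; extrapolation is not modeled."""
    if not BREAKPOINTS[0] <= x <= BREAKPOINTS[-1]:
        raise ValueError("x is outside the range covered by the table")
    m, M, N, p = address_range_select(x)
    ram_m, ram_m1 = lookup_table(m)
    return operation_processing(x, M, N, p, ram_m, ram_m1)


result = lut_output(2)
print(result / (1 << FRAC_BITS))  # 8.5
```

Because every segment width in Table 4 is a power of two (8, 8, 4, 4, 2, 2, 2), the shift amount p is well defined for each segment, and no general-purpose divider is needed in this model.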

In addition, a complex function output method based on a lookup table corresponding to the present device, as shown in fig. 4, may include the following steps:

S201: obtaining a lookup table address and operation parameters of the complex function according to the input data of the complex function;

S202: outputting corresponding lookup table data according to the lookup table address, wherein the lookup table allocates more memory in the significantly changing region of the complex function than in the gently changing region;

S203: calculating according to the operation parameters and the lookup table data, and outputting the operation result of the complex function.

The above examples merely represent a few embodiments of the present invention, which are described in more detail and are not to be construed as limiting the scope of the present invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention.

Claims

1. The complex function output method based on the lookup table is characterized by comprising the following steps:

2. The complex function output method based on the lookup table is characterized by comprising the following steps:

obtaining the address of a lookup table and operation parameters of the complex function according to the input data of the complex function;

3. A complex function output method based on a lookup table as claimed in any one of claims 1 or 2, comprising the steps of:

the output data of the complex function is obtained through a linear interpolation method:

4. A complex function output method based on a lookup table as claimed in claim 3, wherein:

the complex function is f(x) = 2^x;

the value of N-M is 2 to the power of p, that is, N-M = 2^p.

5. A complex function output device based on a lookup table, comprising an address range selection logic module, a lookup table module and an operation processing module, characterized in that:

6. A look-up table based complex function output device as defined in claim 5, wherein:

the operation processing module adopts a linear interpolation method to calculate:

7. A look-up table based complex function output device as defined in claim 5, wherein:

the operation processing module comprises:

8. A look-up table based complex function output device as defined in claim 7, wherein:

the divider includes a right shift module;