CN111984056B - GPU (graphics processing Unit) texture cache and accumulated error compensation based numerically-controlled oscillator and implementation method - Google Patents
- Publication number
- CN111984056B (application CN202010662304.XA)
- Authority
- CN
- China
- Prior art keywords
- lookup table
- phase
- nco
- data
- index
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/02—Digital function generators
- G06F1/03—Digital function generators working, at least partly, by table look-up
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03D—DEMODULATION OR TRANSFERENCE OF MODULATION FROM ONE CARRIER TO ANOTHER
- H03D7/00—Transference of modulation from one carrier to another, e.g. frequency-changing
- H03D7/16—Multiple-frequency-changing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/60—Memory management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/28—Indexing scheme for image data processing or generation, in general involving image processing hardware
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Power Engineering (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Stabilization Of Oscillater, Synchronisation, Frequency Synthesizers (AREA)
Abstract
The invention discloses a numerically controlled oscillator based on GPU texture caching and accumulated error compensation, and an implementation method thereof, and belongs to the technical field of communication. The invention exploits the high flexibility and efficient parallel data processing capability of a graphics processing unit (GPU) to design and realize an efficient, high-precision numerically controlled oscillator (NCO). The scheme specifically comprises the following steps: a sine lookup table (LUT) is constructed and stored in the texture memory of the GPU. The GPU receives input data and processes it in segments. The initial NCO phase corresponding to the i-th segment of input data and the NCO phase corresponding to the data with index iL are accumulated to obtain the accumulated phase $\varphi_{out}(iL)$. The accumulated phase is used to construct a lookup table index value index(iL), and the sine lookup table LUT in the texture cache is searched with index(iL) to obtain the two orthogonal sine and cosine NCO signals.
Description
Technical Field
The invention relates to the technical field of communication, in particular to a numerical control oscillator based on GPU texture caching and accumulated error compensation and an implementation method.
Background
A Digital Down Converter (DDC) system is an important subsystem of a modern aerospace measurement and control system. In a typical DDC system, a digital local oscillator is its most complex core device.
In a conventional DDC system, the digital local oscillator is mainly implemented as a numerically controlled oscillator (NCO). A typical NCO consists of an N-bit phase accumulator, a phase register, and an M-bit sine lookup table. Increasing N and M improves the frequency and phase resolution, but the phase resolution of the lookup table is limited by the on-chip storage of the FPGA and cannot be raised effectively. In recent years, as on-chip storage has grown, the lookup-table method has been widely used because it consumes few computing resources and is fast, but the underlying problem has not been solved. The CORDIC (Coordinate Rotation Digital Computer) algorithm, proposed by Jack Volder in 1959, offers an alternative. It approximates the target function through shifts, additions, subtractions, and iteration instead of a table lookup, which saves scarce on-chip storage but consumes additional resources for the iterations. To date, achieving high-precision phase resolution still requires a compromise between computing resources and storage space.
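The phase-accumulator structure described above can be sketched in a few lines. This is only an illustration of the conventional fixed-point NCO, not the method of the invention; the widths `N_BITS = 32` and `M_BITS = 10` and the helper names `tuning_word` and `nco` are hypothetical choices made for the example.

```python
import math

N_BITS = 32   # phase accumulator width (hypothetical choice)
M_BITS = 10   # LUT address width, i.e. a 1024-entry table (hypothetical choice)

# One full sine cycle sampled at 2**M_BITS phase points.
LUT = [math.sin(2.0 * math.pi * k / (1 << M_BITS)) for k in range(1 << M_BITS)]

def tuning_word(f_lo, f_s):
    """Phase increment per sample, as an N_BITS-wide unsigned integer."""
    return round(f_lo / f_s * (1 << N_BITS)) % (1 << N_BITS)

def nco(f_lo, f_s, count):
    """Generate `count` sine samples with an integer phase accumulator."""
    step = tuning_word(f_lo, f_s)
    mask = (1 << N_BITS) - 1
    acc = 0
    out = []
    for _ in range(count):
        # The top M_BITS of the accumulator address the table.
        out.append(LUT[acc >> (N_BITS - M_BITS)])
        acc = (acc + step) & mask   # wraps naturally at one full cycle
    return out

samples = nco(9e6, 64e6, 64)
```

With these widths the frequency resolution is f_s / 2^32, while each extra bit of `M_BITS` doubles the table storage, which is the storage trade-off the paragraph describes.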
The GPU offers an effective way to address these problems: with its efficient floating-point arithmetic and multi-level memory hierarchy, CUDA can provide an efficient, high-precision sine lookup table for implementing the digital local oscillator. In 2016, a team from Sichuan University designed a digital local oscillator using the lookup-table method and achieved a 4x speedup over direct calculation; however, the frequency precision of that method is limited by the number of threads in a block and is difficult to improve. Scott C. Kim et al. [2] used texture-memory nearest-neighbor and linear interpolation to realize output of arbitrary bandwidth. Their results show that the mean square error (MSE) between texture interpolation and traditional resampling is about 4.11e-4, that the MSE of nearest-neighbor and linear interpolation is about 1e-5, and that linear interpolation is slightly better than nearest-neighbor interpolation. However, the method does not address the phase accumulation error, and its precision is low.
Although GPU-based NCO implementations are flexible and efficient, the GPU's floating-point arithmetic accumulates rounding errors during phase accumulation, caused by the exponent alignment performed when floating-point numbers are added. When the number of accumulated points is large, the accumulated error becomes significant and must be eliminated with a purpose-designed algorithm.
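The effect of exponent alignment on a single addition can be made concrete with a small sketch. Single precision is emulated here by rounding Python floats through the `struct` module; the helper `f32` and the values 1000.0 rad and 0.1 rad are hypothetical and chosen only to make the lost bits visible.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

accumulated = f32(1000.0)   # hypothetical accumulated phase, in radians
increment = f32(0.1)        # hypothetical per-sample phase increment, in radians

# Near 1000 the spacing of single-precision numbers is 2**-14, so the
# low-order bits of the increment are shifted out when exponents are aligned.
total = f32(accumulated + increment)
recovered = total - accumulated   # what was actually added
lost = increment - recovered      # what the addition dropped
```

Here the addition contributes 1638/16384 instead of 0.1, so about 2.4e-5 rad is dropped in one step; repeated over many samples, such losses add up to the accumulated phase error discussed above.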
How to design and realize a numerically controlled oscillator NCO with high efficiency and high precision by utilizing the high flexibility and the high-efficiency parallel data processing capability of a Graphic Processing Unit (GPU) is a problem to be solved urgently at present.
Disclosure of Invention
In view of this, the present invention provides a numerically controlled oscillator based on GPU texture caching and accumulated error compensation and an implementation method thereof, which can design and implement a high-efficiency and high-precision numerically controlled oscillator NCO by using the high flexibility and high-efficiency parallel data processing capability of a Graphics Processing Unit (GPU).
In order to achieve the purpose, the technical scheme of the invention is as follows: a numerical control oscillator NCO realization method based on GPU texture caching and accumulated error compensation comprises the following steps:
step one, constructing a sine lookup table LUT, and storing the sine lookup table LUT by using a texture memory of a GPU (graphics processing unit).
And step two, the GPU receives input data and carries out segmented processing on the data.
The total number of points of the input data is nLT. When the data currently being processed is the i-th segment of input data, the final NCO phase corresponding to the previous, (i-1)-th, segment is the initial NCO phase of the i-th segment.
Step three, for the i-th segment of data, of length nL, the phase corresponding to the data with index iL is as follows:
wherein $f_{LO}$ is the local oscillation frequency of the numerically controlled oscillator NCO; $f_s$ is the sampling rate of the input data; and iL is the data index, taking values from 0 to nL-1.
Step four, the initial NCO phase of the i-th segment of input data and the phase corresponding to the data with index iL are accumulated in the following manner to obtain the accumulated phase $\varphi_{out}(iL)$:
The initial phase is taken as the first input value a, and the phase corresponding to the data with index iL as the second input value b. The summation deviation of a is $da = a' - [b' - (\varphi_{out}(iL) - a')]$, and the summation deviation of b is $db = b' - (\varphi_{out}(iL) - a')$, where $a' = a + da$ is the corrected value of a and $b' = b + db$ is the corrected value of b.
Step five, the accumulated phase $\varphi_{out}(iL)$ is used to construct a lookup table index value index(iL), and the sine lookup table LUT in the texture memory is searched with index(iL) to obtain the two orthogonal sine and cosine NCO signals.
Further, in step five, the lookup table index value index(iL) is constructed from the accumulated phase $\varphi_{out}(iL)$, specifically:
Further, in step five, obtaining the two orthogonal sine and cosine NCO outputs from the sine lookup table LUT in the texture memory with index(iL) specifically comprises:
searching the sine lookup table LUT in the texture memory with the lookup table index value index(iL) to obtain the sine NCO output; and
shifting the lookup table index value index(iL) by 1/4 cycle and searching the sine lookup table LUT in the texture memory to obtain the cosine NCO output.
Further, the sine NCO output is $NCO_I(iL)$:
$NCO_I(iL) = \mathrm{lookup}(LUT, index)$,
where lookup is the lookup function of the lookup table.
The cosine NCO output is $NCO_Q(iL)$:
$NCO_Q(iL) = \mathrm{lookup}(LUT, index + nLT/4)$.
Further, in step one, only the first 1/4 cycle of the sine lookup table is stored in the texture memory of the GPU; in step five, before the sine lookup table LUT in the texture memory is searched, the lookup table index value index(iL) is converted to the corresponding angle within the first 1/4 cycle.
Another embodiment of the present invention further provides a GPU texture caching and accumulated error compensation based numerically controlled oscillator, as shown in fig. 2, comprising a phase calculation module, a phase accumulation module, and a lookup table module built on a GPU of a graphics processor.
The phase calculation module receives input data, the total number of points of which is nLT. The phase calculation module processes the data in segments; when the data currently being processed is the i-th segment of input data, the final NCO phase corresponding to the previous, (i-1)-th, segment is the initial NCO phase of the i-th segment. For the i-th segment of data, of length nL, the phase calculation module calculates the phase corresponding to the data with index iL, wherein $f_{LO}$ is the local oscillation frequency of the numerically controlled oscillator NCO; $f_s$ is the sampling rate of the input data; and iL is the data index, taking values from 0 to nL-1.
The phase accumulation module accumulates the initial NCO phase of the i-th segment of input data and the phase corresponding to the data with index iL to obtain the accumulated phase $\varphi_{out}(iL)$, wherein the initial phase is taken as the first input value a and the phase corresponding to the data with index iL as the second input value b; the summation deviation of a is $da = a' - [b' - (\varphi_{out}(iL) - a')]$, and the summation deviation of b is $db = b' - (\varphi_{out}(iL) - a')$; the corrected value of a is $a' = a + da$, and the corrected value of b is $b' = b + db$.
The lookup table module constructs a sine lookup table LUT and stores it in the texture memory of the GPU; the lookup table module uses the accumulated phase $\varphi_{out}(iL)$ to construct a lookup table index value index(iL), and searches the sine lookup table LUT in the texture memory with index(iL) to obtain the two orthogonal sine and cosine NCO outputs.
Furthermore, the phase accumulation module is implemented with seven adders, namely a first adder to a seventh adder.
The first adder takes a and da as inputs and adds them to output the corrected value a' of a.
The second adder takes b and db as inputs and adds them to output the corrected value b' of b.
The fourth adder takes the accumulated phase $\varphi_{out}(iL)$ and -a' as inputs and adds them to output a first intermediate quantity b1, $b1 = \varphi_{out}(iL) - a'$;
the fifth adder takes b' and -b1 as inputs and adds them to output a second intermediate quantity a1, $a1 = b' - (\varphi_{out}(iL) - a')$;
the sixth adder takes a' and -a1 as inputs and adds them to output the summation deviation da of a;
the seventh adder takes b' and -b1 as inputs and adds them to output the summation deviation db of b.
Has the advantages that:
1. the numerical control oscillator based on the GPU texture cache and the accumulated error compensation and the implementation method thereof provided by the embodiment of the invention realize the correct output of the NCO by utilizing a lookup table method based on the GPU. In the process of realizing the NCO by the GPU texture cache lookup table, the high-efficiency and high-precision output of the NCO is realized by fully utilizing the linear interpolation advantage and the cache advantage of the texture cache. And finally, aiming at the common problem of large accumulated errors of floating-point number phase accumulation operation in NCO calculation, an error-free transformation technology is utilized, a floating-point number phase accumulated error comprehensive compensation algorithm based on a GPU is designed, the floating-point number phase accumulated error is controlled to be 1e-5 orders of magnitude by utilizing the algorithm, and the phase calculation precision is effectively improved.
2. The invention provides an optimized design for the sine lookup table: only 1/4 cycle of data is stored in the table that originally held a whole cycle, which effectively increases the phase resolution and improves the spurious suppression of the output signal.
Drawings
Fig. 1 is a flowchart of a method for implementing a numerically controlled oscillator based on GPU texture caching and accumulated error compensation according to an embodiment of the present invention;
fig. 2 is a block diagram of a numerically controlled oscillator based on GPU texture caching and accumulated error compensation according to an embodiment of the present invention;
FIG. 3 is a block diagram of a phase accumulation module in a numerically controlled oscillator based on GPU texture caching and accumulated error compensation according to an embodiment of the present invention
FIG. 4 is a waveform diagram of a complete cycle phase sine lookup table in an embodiment of the present invention;
FIG. 5 is an exemplary diagram of an output NCO spectrum corresponding to a complete cycle phase sine lookup table in an embodiment of the present invention;
FIG. 6 is a waveform diagram of an optimized lookup table in an embodiment of the present invention;
FIG. 7 is an exemplary diagram of an output NCO spectrum corresponding to an optimized lookup table in an embodiment of the present invention;
FIG. 8 is a schematic diagram illustrating absolute error comparison between data based on the 2Sum algorithm and the float algorithm in the embodiment of the present invention.
Detailed Description
The invention is described in detail below by way of example with reference to the accompanying drawings.
The invention provides a numerical control oscillator NCO realization method based on GPU texture caching and accumulated error compensation, as shown in figure 1, comprising the following steps:
step one, constructing a sine lookup table LUT, and storing the sine lookup table LUT by using a texture memory of a GPU (graphics processing unit).
The texture memory of the GPU offers linear interpolation and, for specific read patterns, caching advantages.
And step two, the GPU receives input data and carries out segmented processing on the data.
The total number of points of the input data is nLT. When the data currently being processed is the i-th segment of input data, the final NCO phase corresponding to the previous, (i-1)-th, segment is the initial NCO phase of the i-th segment.
Step three, for the i-th segment of data, of length nL, the phase corresponding to the data with index iL is as follows:
wherein $f_{LO}$ is the local oscillation frequency of the numerically controlled oscillator NCO; $f_s$ is the sampling rate of the input data; and iL is the data index, taking values from 0 to nL-1.
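The segment-wise phase calculation of steps two and three can be sketched as follows. The phase formula used here, 2*pi*f_LO*iL/f_s in radians, is an assumption (a common form for an NCO phase), as is the convention that the final phase handed to the next segment is the phase one step past the last sample, wrapped to one cycle. The function names are hypothetical; the numeric settings are those of the simulation reported later in the description.

```python
import math

F_LO = 9e6    # local oscillation frequency used in the simulation
F_S = 64e6    # sampling rate used in the simulation
N_L = 1024    # segment length used in the simulation
TWO_PI = 2.0 * math.pi

def segment_phases(phi0, n_l=N_L, f_lo=F_LO, f_s=F_S):
    """Phases of one segment, starting from the initial phase phi0 (assumed formula)."""
    step = TWO_PI * f_lo / f_s
    phases = [phi0 + step * i_l for i_l in range(n_l)]
    # Final phase of this segment, wrapped to one cycle, becomes the
    # initial phase of the next segment (assumed convention).
    tail = math.fmod(phi0 + step * n_l, TWO_PI)
    return phases, tail

def nco_phase_stream(n_segments):
    """Concatenate the phases of consecutive segments, carrying the phase over."""
    phi0 = 0.0
    out = []
    for _ in range(n_segments):
        phases, phi0 = segment_phases(phi0)
        out.extend(phases)
    return out

stream = nco_phase_stream(32)   # 32 segments of 1024 points, 2**15 points in total
```

Wrapping the carried-over phase keeps the values inside each segment small, so that the per-sample phase does not grow with the total data length.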
Step four, the initial NCO phase of the i-th segment of input data and the phase corresponding to the data with index iL are accumulated in the following manner to obtain the accumulated phase $\varphi_{out}(iL)$:
Phase accumulation is an indispensable core step of NCO calculation, but floating-point accumulation with round-to-nearest inevitably accumulates phase error. As the number of accumulated points increases, the error grows and may eventually make the output data completely wrong. Double-precision floating-point arithmetic can mitigate the accumulated error, but it reduces the operating efficiency of the system. The invention therefore provides a comprehensive compensation algorithm for floating-point phase accumulation error based on error-free transformation, which effectively reduces the accumulated phase error while preserving operating efficiency.
The initial phase is taken as the first input value a, and the phase corresponding to the data with index iL as the second input value b. The summation deviation of a is $da = a' - [b' - (\varphi_{out}(iL) - a')]$, and the summation deviation of b is $db = b' - (\varphi_{out}(iL) - a')$, where $a' = a + da$ is the corrected value of a and $b' = b + db$ is the corrected value of b.
With this accumulation scheme, the floating-point deviation incurred by each of the two operands in one accumulation step is compensated in the next accumulation step. This completes the comprehensive compensation of the accumulated phase.
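The idea of feeding each step's rounding deviation into the next accumulation can be illustrated with the textbook 2Sum error-free transformation (Knuth's formulation). This sketch is not the exact adder arrangement of the invention: it emulates single precision by rounding through `struct`, the helper names are hypothetical, and the phase is deliberately accumulated without per-segment wrapping so that the uncompensated drift is clearly visible.

```python
import math
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Textbook 2Sum in single precision: s is the rounded sum, e its rounding error."""
    s = f32(a + b)
    b_virtual = f32(s - a)
    a_virtual = f32(s - b_virtual)
    da = f32(a - a_virtual)   # part of a that did not reach s
    db = f32(b - b_virtual)   # part of b that did not reach s
    return s, f32(da + db)

def accumulate_naive(step, count):
    total = 0.0
    for _ in range(count):
        total = f32(total + step)
    return total

def accumulate_compensated(step, count):
    total, err = 0.0, 0.0
    for _ in range(count):
        # The deviation of the previous step is added back before the next sum.
        total, err = two_sum(total, f32(step + err))
    return total, err

STEP = f32(2.0 * math.pi * 9e6 / 64e6)   # assumed per-sample phase step, in radians
COUNT = 2 ** 15
exact = STEP * COUNT                      # exact in double precision
naive_error = abs(accumulate_naive(STEP, COUNT) - exact)
total, err = accumulate_compensated(STEP, COUNT)
compensated_error = abs((total + err) - exact)
```

In this setting the uncompensated single-precision sum drifts by several radians, while the compensated sum together with its pending deviation stays at the exact value, using only single-precision additions.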
Step five, the accumulated phase $\varphi_{out}(iL)$ is used to construct a lookup table index value index(iL), and the sine lookup table LUT in the texture memory is searched with index(iL) to obtain the two orthogonal sine and cosine NCO signals.
The lookup table index value index(iL) constructed in this step is specifically:
The sine lookup table LUT in the texture memory is searched with the lookup table index value index(iL) to obtain the sine NCO output;
the lookup table index value index(iL) is shifted by 1/4 cycle, and the sine lookup table LUT in the texture memory is searched to obtain the cosine NCO output.
The sine NCO output is $NCO_I(iL)$: $NCO_I(iL) = \mathrm{lookup}(LUT, index)$,
where lookup is the lookup function of the lookup table.
The cosine NCO output is $NCO_Q(iL)$: $NCO_Q(iL) = \mathrm{lookup}(LUT, index + nLT/4)$.
And carrying out orthogonal frequency mixing processing on input data according to the sine and cosine orthogonal NCO signals to obtain the output of the required intermediate frequency signal.
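The table lookup of step five can be sketched on the CPU as follows. Several points are assumptions: the phase-to-index mapping (phase divided by 2*pi, scaled by the table length), the table length `N_LUT = 1024`, and the reading of the quarter-cycle offset as a quarter of the table length. The `lookup` function emulates linear interpolation between neighbouring entries in plain floating point; it does not reproduce the coordinate conventions or fixed-point weights of a GPU texture unit.

```python
import math

N_LUT = 1024   # table length over one full cycle (assumed)
LUT = [math.sin(2.0 * math.pi * k / N_LUT) for k in range(N_LUT)]

def lookup(lut, index):
    """Linearly interpolate between neighbouring entries, wrapping at the table end."""
    n = len(lut)
    index = index % n
    k = int(math.floor(index))
    frac = index - k
    return (1.0 - frac) * lut[k % n] + frac * lut[(k + 1) % n]

def nco_iq(phase):
    """Sine and cosine outputs for a phase in radians (assumed index mapping)."""
    index = phase / (2.0 * math.pi) * N_LUT
    nco_i = lookup(LUT, index)               # sine branch
    nco_q = lookup(LUT, index + N_LUT / 4)   # same table, shifted by 1/4 cycle
    return nco_i, nco_q
```

For a 1024-point table, linear interpolation keeps the error of both outputs below about 5e-6, and a single sine table serves both branches because the cosine is the sine advanced by a quarter cycle.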
In the above embodiment, NCO signal output based on the sine lookup table is implemented successfully in the GPU texture cache. However, that embodiment stores a full period of sine data directly and does not exploit the symmetry of the sine signal. In addition, the accumulated phase value grows continuously as the data index increases during phase calculation, so data overflow may eventually occur. Based on this analysis, in an embodiment of the invention the storage form of the sine lookup table in the texture memory of the GPU is optimized in step one: only the first 1/4 cycle of the sine lookup table is stored, that is, 1/4 cycle of data occupies the table that originally held a whole cycle. In step five, before the sine lookup table LUT in the texture memory is searched, the lookup table index value index(iL) is converted to the corresponding angle within the first 1/4 cycle.
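The quarter-cycle storage can be sketched with integer indices as follows. The folding rule uses the standard symmetries of the sine function; the table layout (`Q + 1` entries including the endpoint at pi/2) and the function names are assumptions made for the example, since the text does not specify the conversion in detail.

```python
import math

Q = 1024   # number of table steps per quarter cycle (assumed layout)
# Only the first quarter cycle is stored; the endpoint sin(pi/2) is included
# so that mirrored indices never leave the table.
QUARTER = [math.sin(0.5 * math.pi * k / Q) for k in range(Q + 1)]

def sine_from_quarter(j):
    """Sine at phase index j, where one full cycle spans 4 * Q indices."""
    quadrant, r = divmod(j % (4 * Q), Q)
    if quadrant == 0:
        return QUARTER[r]          # sin(x)
    if quadrant == 1:
        return QUARTER[Q - r]      # sin(pi/2 + x) = sin(pi/2 - x)
    if quadrant == 2:
        return -QUARTER[r]         # sin(pi + x) = -sin(x)
    return -QUARTER[Q - r]         # sin(3*pi/2 + x) = -sin(pi/2 - x)

def cosine_from_quarter(j):
    """Cosine as the sine shifted by a quarter cycle."""
    return sine_from_quarter(j + Q)
```

With the same number of stored points, one full cycle is resolved into four times as many phase steps, which is the increase in phase resolution the embodiment relies on.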
In the embodiment of the invention, the NCO spectrum obtained with a full-cycle phase lookup table stored in the texture memory is compared with that obtained with the optimized 1/4-cycle phase lookup table. The full-cycle phase lookup table, shown in fig. 4, is a 1024-point single-precision floating-point sine lookup table. With the sampling frequency set to 64 MHz and the local oscillation frequency set to 9 MHz, the resulting NCO output spectrum is shown in fig. 5. The 1/4-cycle phase lookup table, shown in fig. 6, is a 1024-point single-precision floating-point sine lookup table covering a range of pi/2, designed according to the optimization above. With the same sampling frequency of 64 MHz and local oscillation frequency of 9 MHz, the resulting NCO output spectrum is shown in fig. 7. The optimized scheme of storing a 1/4-cycle phase lookup table in the texture memory effectively improves the spurious suppression of the output signal because of the increased phase resolution.
Another embodiment of the present invention provides a GPU texture caching and accumulated error compensation based digitally controlled oscillator, which is composed as shown in fig. 2, and is characterized by including a phase calculation module, a phase accumulation module, and a lookup table module, which are built on a GPU of a graphics processor.
The phase calculation module receives input data, the total number of points of which is nLT. The phase calculation module processes the data in segments; when the data currently being processed is the i-th segment of input data, the final NCO phase corresponding to the previous, (i-1)-th, segment is the initial NCO phase of the i-th segment. For the i-th segment of data, of length nL, the phase calculation module calculates the phase corresponding to the data with index iL, wherein $f_{LO}$ is the local oscillation frequency of the numerically controlled oscillator NCO; $f_s$ is the sampling rate of the input data; and iL is the data index, taking values from 0 to nL-1.
The phase accumulation module accumulates the initial NCO phase of the i-th segment of input data and the phase corresponding to the data with index iL to obtain the accumulated phase $\varphi_{out}(iL)$, wherein the initial phase is taken as the first input value a and the phase corresponding to the data with index iL as the second input value b; the summation deviation of a is $da = a' - [b' - (\varphi_{out}(iL) - a')]$, and the summation deviation of b is $db = b' - (\varphi_{out}(iL) - a')$; the corrected value of a is $a' = a + da$, and the corrected value of b is $b' = b + db$.
The lookup table module constructs a sine lookup table LUT and stores it in the texture memory of the GPU; the lookup table module uses the accumulated phase $\varphi_{out}(iL)$ to construct a lookup table index value index(iL), and searches the sine lookup table LUT in the texture memory with index(iL) to obtain the two orthogonal sine and cosine NCO outputs.
And finally, an output module can be added, and the output module is used for carrying out orthogonal processing on input data according to sine and cosine two-path orthogonal NCO output to obtain the output of the numerically controlled oscillator NCO.
In the embodiment of the invention, the phase accumulation module is realized by adopting seven adders; the specific connection relationship is shown in fig. 3, wherein the seven adders are respectively the first to seventh adders.
The first adder takes a and da as inputs and adds them to output the corrected value a' of a.
The second adder takes b and db as inputs and adds them to output the corrected value b' of b.
The fourth adder takes the accumulated phase $\varphi_{out}(iL)$ and -a' as inputs and adds them to output a first intermediate quantity b1, $b1 = \varphi_{out}(iL) - a'$;
the fifth adder takes b' and -b1 as inputs and adds them to output a second intermediate quantity a1, $a1 = b' - (\varphi_{out}(iL) - a')$;
the sixth adder takes a' and -a1 as inputs and adds them to output the summation deviation da of a;
the seventh adder takes b' and -b1 as inputs and adds them to output the summation deviation db of b.
The output phase of the NCO scheme provided by the invention was analyzed by simulation. The simulation parameters were an NCO frequency of 9 MHz and a sampling rate Fs of 64 MHz; the analyzed data length was 2^15 points, divided into 32 segments of 1024 points each. NCO phases computed in single precision, compensated single precision, and double precision were compared. The simulation calculates the phase by accumulation, resets the phase every 1024 points to remove the whole-cycle part, and uses the 2Sum algorithm in the accumulation to eliminate the floating-point calculation deviation. The deviations from the theoretical value of the double-precision, single-precision, and 2Sum-compensated single-precision results are shown in fig. 8. As the figure shows, adding the 2Sum algorithm to single-precision arithmetic greatly reduces the calculation deviation and removes the phase drift; the phase deviation is better than 1e-5 rad, a large improvement in precision.
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (7)
1. A numerical control oscillator NCO realization method based on GPU texture caching and accumulated error compensation is characterized by comprising the following steps:
step one, constructing a sine lookup table LUT, and storing the sine lookup table LUT by using a texture memory of a Graphics Processing Unit (GPU);
step two, the GPU receives input data and carries out segmented processing on the data;
the total number of points of the input data is nLT; when the data currently being processed is the i-th segment of input data, the final NCO phase corresponding to the previous, (i-1)-th, segment is the initial NCO phase of the i-th segment;
step three, for the i-th segment of data, of length nL, the phase corresponding to the data with index iL is as follows:
wherein $f_{LO}$ is the local oscillation frequency of the numerically controlled oscillator NCO; $f_s$ is the sampling rate of the input data; iL is the data index, taking values from 0 to nL-1;
step four, the initial NCO phase of the i-th segment of input data and the phase corresponding to the data with index iL are accumulated in the following manner to obtain the accumulated phase $\varphi_{out}(iL)$:
the initial phase is taken as the first input value a, and the phase corresponding to the data with index iL as the second input value b; the summation deviation of a is $da = a' - [b' - (\varphi_{out}(iL) - a')]$; the summation deviation of b is $db = b' - (\varphi_{out}(iL) - a')$;
3. the method as claimed in claim 2, wherein in the fifth step, the lookup table index (il) is used to lookup from the sine lookup table LUT in the texture cache to obtain sine and cosine two-way orthogonal NCO outputs, specifically:
searching from a sine lookup table LUT in the texture cache by using the lookup table index value index (iL) to obtain a sine NCO output;
and shifting the index value index (il) of the lookup table by 1/4 cycles, and searching from a sine lookup table LUT in the texture cache to obtain a cosine NCO output.
4. The method of claim 3, wherein the sine NCO output is $NCO_I(iL)$:
$NCO_I(iL) = \mathrm{lookup}(LUT, index)$;
wherein lookup is the lookup function of the lookup table;
the cosine NCO output is $NCO_Q(iL)$:
$NCO_Q(iL) = \mathrm{lookup}(LUT, index + nLT/4)$.
5. The method according to any one of claims 1 to 4, wherein in the first step, only the data of the previous 1/4 cycles in the sine lookup table is stored in the texture memory of the GPU;
in the fifth step, before the lookup table index (il) is used to lookup from the sine lookup table LUT in the texture cache, the lookup table index (il) is converted to the angle corresponding to the previous 1/4 cycles.
6. The numerical control oscillator based on GPU texture memory and accumulated error compensation is characterized by comprising a phase calculation module, a phase accumulation module and a lookup table module, wherein the phase calculation module, the phase accumulation module and the lookup table module are built on a GPU of a graphic processor;
the phase calculation module receives input data, the total number of points of which is nLT; the phase calculation module processes the data in segments; when the data currently being processed is the i-th segment of input data, the final NCO phase corresponding to the previous, (i-1)-th, segment is the initial NCO phase of the i-th segment; for the i-th segment of data, of length nL, the phase calculation module calculates the phase corresponding to the data with index iL, wherein $f_{LO}$ is the local oscillation frequency of the numerically controlled oscillator NCO; $f_s$ is the sampling rate of the input data; iL is the data index, taking values from 0 to nL-1;
the phase accumulation module accumulates the initial NCO phase of the i-th segment of input data and the phase corresponding to the data with index iL to obtain the accumulated phase $\varphi_{out}(iL)$, wherein the initial phase is taken as the first input value a and the phase corresponding to the data with index iL as the second input value b; the summation deviation of a is $da = a' - [b' - (\varphi_{out}(iL) - a')]$; the summation deviation of b is $db = b' - (\varphi_{out}(iL) - a')$; the corrected value of a is $a' = a + da$; the corrected value of b is $b' = b + db$;
the lookup table module constructs a sine lookup table LUT and stores it in the texture memory of the GPU; the lookup table module uses the accumulated phase $\varphi_{out}(iL)$ to construct a lookup table index value index(iL), and searches the sine lookup table LUT in the texture memory with index(iL) to obtain the two orthogonal sine and cosine NCO outputs.
7. The digitally-controlled oscillator of claim 6, wherein the phase accumulation module is implemented with seven adders, namely a first adder to a seventh adder;
the first adder takes a and da as inputs, performs addition, and outputs the corrected value a';
the second adder takes b and db as inputs, performs addition, and outputs the corrected value b';
the fourth adder takes the accumulated phase $\varphi_{out}(iL)$ and -a' as inputs and performs addition to output a first intermediate quantity b1, where b1 = $\varphi_{out}(iL)$ - a';
the fifth adder takes b' and -b1 as inputs and performs addition to output a second intermediate quantity a1, where a1 = b' - ($\varphi_{out}(iL)$ - a');
the sixth adder takes a' and -a1 as inputs and performs addition to output the summation deviation da of a;
the seventh adder takes b' and -b1 as inputs and performs addition to output the summation deviation db of b.
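The adder chain in claim 7 resembles the classical TwoSum error-free transformation, which recovers the rounding error of a floating-point addition using only additions and subtractions. The sketch below shows TwoSum in Python double precision as an illustration; it is not the claimed circuit. In particular, forming a1 as s - b1 and the function name `two_sum` are assumptions taken from the textbook algorithm rather than from the claim wording.

```python
from fractions import Fraction


def two_sum(a, b):
    """Return (s, da, db): the rounded sum and the deviation of each input.

    Textbook TwoSum; an assumption about what the adder chain is meant to
    compute, not a transcription of the claim.
    """
    s = a + b      # rounded sum (plays the role of the accumulated phase)
    b1 = s - a     # part of s attributed to b
    a1 = s - b1    # part of s attributed to a (assumed form, see lead-in)
    da = a - a1    # deviation of a lost in the rounding
    db = b - b1    # deviation of b lost in the rounding
    return s, da, db


# A tiny addend vanishes in the rounded sum but survives in the deviation.
s, da, db = two_sum(1.0, 1e-17)

# Exact check with rational arithmetic: s + (da + db) equals a + b.
exact = Fraction(1.0) + Fraction(1e-17)
recovered = Fraction(s) + Fraction(da + db)
```

For a = 1.0 and b = 1e-17 the rounded sum is 1.0 and the lost contribution reappears as db = 1e-17, which is the kind of deviation a compensation step can feed back into a later addition.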
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010662304.XA CN111984056B (en) | 2020-07-10 | 2020-07-10 | GPU (graphics processing Unit) texture cache and accumulated error compensation based numerically-controlled oscillator and implementation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111984056A (en) | 2020-11-24 |
CN111984056B (en) | 2021-04-27 |
Family ID: 73439100
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010662304.XA Active CN111984056B (en) | 2020-07-10 | 2020-07-10 | GPU (graphics processing Unit) texture cache and accumulated error compensation based numerically-controlled oscillator and implementation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111984056B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114510268B (en) * | 2021-12-24 | 2022-09-20 | Space Engineering University, PLA Strategic Support Force | GPU-based method for realizing single-precision floating point number accumulated error control in down-conversion |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107565958A (en) * | 2016-07-01 | 2018-01-09 | Intel IP Corporation | Gain calibration of a digitally controlled oscillator in a fast-lock phase-locked loop |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1835389B (en) * | 2005-03-14 | 2010-07-14 | Huawei Technologies Co., Ltd. | Method able to eliminate frequency error of digital controlled oscillator and phase accumulator |
CN101345886B (en) * | 2008-09-03 | 2011-11-02 | Huawei Technologies Co., Ltd. | Method and device for phase error correction |
CN101854172B (en) * | 2009-04-01 | 2013-01-09 | Beijing Institute of Technology | Numerical control oscillator parallel design method based on two-dimensional sine table |
JP5662040B2 (en) * | 2010-03-16 | 2015-01-28 | MegaChips Corporation | Numerically controlled oscillator |
CN106803818B (en) * | 2016-12-08 | 2020-07-28 | Huazhong University of Science and Technology | Method and device for receiving TD-AltBOC signal |
Non-Patent Citations (1)
Title |
---|
Research on a CUDA-based GPS software receiver; Wu Xinbo (武新波); China Master's Theses Full-text Database (Electronic Journal), Basic Sciences; 20161130; A008-26 * |
Also Published As
Publication number | Publication date |
---|---|
CN111984056A (en) | 2020-11-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107305484B (en) | Nonlinear function operation device and method | |
CN111984056B (en) | GPU (graphics processing Unit) texture cache and accumulated error compensation based numerically-controlled oscillator and implementation method | |
CN103488245B (en) | Phase amplitude conversion method in DDS and device | |
US8751555B2 (en) | Rounding unit for decimal floating-point division | |
JPH05241794A (en) | Device for approximating transcendental function and its method | |
CN104133656A (en) | Floating point number divider adopting shift and subtraction operation by tail codes and floating point number division operation method adopting shift and subtraction operation by tail codes | |
CN111984057B (en) | GPU-based digital NCO high-precision parallel implementation method | |
CN111813371A (en) | Floating-point division operation method, system and readable medium for digital signal processing | |
CN102566965B (en) | Floating-point number logarithmic operation device with flat errors | |
CN107102841A (en) | A kind of coordinate transform parallel calculating method and device | |
CN107423026A (en) | The implementation method and device that a kind of sin cos functionses calculate | |
CN113126954B (en) | Method, device and arithmetic logic unit for floating point number multiplication calculation | |
CN107015783B (en) | Floating point angle compression implementation method and device | |
CN111831257A (en) | Implementation method and device for calculating sine or cosine function | |
Chekushkin et al. | Improving polynomial methods of reconstruction of functional dependences in information-measuring systems | |
CN105302520A (en) | Reciprocal operation solving method and system | |
CN114510268B (en) | GPU-based method for realizing single-precision floating point number accumulated error control in down-conversion | |
CN107315447A (en) | A kind of power Direct Digital Frequency Synthesis and circuit of the conversion of high compression ratio phase amplitude | |
CN115001485A (en) | Direct digital frequency synthesizer based on Taylor polynomial approximation | |
KR100403374B1 (en) | Table Lookup Based Phase Calculator with Normalization of Input Operands for High-Speed Communication | |
CN103365826A (en) | Small-area radical-3 FFT (Fast Fourier Transform) butterfly-shaped unit | |
CN109687870B (en) | Charge redistribution type SARADC capacitance mismatch correction method and system | |
Maharatna et al. | A CORDIC like processor for computation of arctangent and absolute magnitude of a vector | |
He et al. | High‐Performance FP Divider with Sharing Multipliers Based on Goldschmidt Algorithm | |
CN113721885B (en) | Divider based on cordic algorithm |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |