CN114020240A - Time domain convolution computing device and method for realizing clock domain crossing based on FPGA - Google Patents

Publication number
CN114020240A
Authority
CN
China
Prior art keywords
signal
fifo
conv
coef
multiplication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111304060.9A
Other languages
Chinese (zh)
Inventor
游斌相
廖育富
刘泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Jiuzhou ATC Technology Co Ltd
Original Assignee
Sichuan Jiuzhou ATC Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Jiuzhou ATC Technology Co Ltd filed Critical Sichuan Jiuzhou ATC Technology Co Ltd
Priority to CN202111304060.9A priority Critical patent/CN114020240A/en
Publication of CN114020240A publication Critical patent/CN114020240A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/491 Computations with decimal numbers radix 12 or 20

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a device and a method for cross-clock-domain time domain convolution calculation based on an FPGA (field programmable gate array). The device comprises RAM_coef, FIFO_coef and FIFO_conv, all implemented on the FPGA. RAM_coef and the (M-1) FIFO_coef form a pipeline structure through which the coefficients are passed in sequence according to the signal timing; RAM_coef and the (M-1) FIFO_coef correspond one-to-one with the M multipliers and supply the corresponding convolution coefficients for the time sequence signals of the M multipliers. The M multipliers and their corresponding adders work in parallel, (M-1) intermediate multiply-add results are generated before the next signal arrives, and M FIFO_conv store the intermediate multiply-add results of the M signal paths respectively. The invention meets application requirements for high real-time performance and low output delay while consuming few resources.

Description

Time domain convolution computing device and method for realizing clock domain crossing based on FPGA
Technical Field
The invention belongs to the technical field of programmable device application, and particularly relates to a device and a method for realizing time domain convolution calculation across clock domains based on an FPGA (field programmable gate array).
Background
Convolution is widely used in engineering and mathematics. Statistically, the weighted moving average is a convolution. In probability theory, the probability density function of the sum of two statistically independent variables X and Y is the convolution of the probability density functions of X and Y. In electronic engineering and signal processing, the output of any linear system can be obtained by convolving the input signal with a system function (the impulse response of the system).
The convolution operation can be regarded as a multiply-add operation. If the convolution coefficient sequence is long, performing the multiply-add operation in the time domain occupies many multiplier and adder resources on the FPGA, which affects the resource allocation and optimization of the whole system. Time domain convolution on an FPGA is also likely to encounter a clock domain crossing problem: the clock domain of the input signal does not match the clock domain of the signal processing (convolution calculation). In that case the two clock domains must first be unified into one clock domain before the convolution calculation can be carried out.
At present, the mainstream method for realizing convolution calculation on an FPGA is frequency domain convolution calculation: an FFT is performed on the signal and on the convolution coefficients, the results are multiplied in the frequency domain, and the product is returned to the time domain through an IFFT. The advantage of this approach is that it is not affected by the clock domain and saves multiplier resources. However, it must convert back and forth between the time domain and the frequency domain, and these conversions consume a large amount of time, so the method suffers from high delay and low real-time performance. In applications with extremely high real-time requirements in particular, convolution realized by the frequency domain method can hardly meet practical needs.
In addition, conventional time domain convolution calculation on an FPGA has the problem of occupying many multiplier and adder resources.
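The frequency domain approach described above can be illustrated with a minimal sketch. It is not taken from the patent: the function names `dft` and `frequency_domain_convolution` are hypothetical, and a naive O(n²) DFT stands in for the FFT/IFFT so that the example needs only the Python standard library. It shows that the whole zero-padded block must be transformed before any output sample is available, which is the source of the delay discussed above.

```python
import cmath

def dft(seq, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT/IFFT core)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def frequency_domain_convolution(h, x):
    """Linear convolution of h and x via transform, multiply, inverse transform."""
    size = len(h) + len(x) - 1            # zero-pad so circular == linear convolution
    h_pad = list(h) + [0] * (size - len(h))
    x_pad = list(x) + [0] * (size - len(x))
    spectrum = [a * b for a, b in zip(dft(h_pad), dft(x_pad))]
    # The imaginary parts are rounding noise for real-valued inputs.
    return [round(v.real, 9) for v in dft(spectrum, inverse=True)]

result = frequency_domain_convolution([1, 2, 3], [1, 0, 2])
```

For these inputs the result agrees, up to rounding, with the direct time domain convolution of the same sequences.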
Disclosure of Invention
In order to solve the problems of high delay and low real-time performance of the conventional method for realizing convolution calculation on the FPGA, the invention provides a time domain convolution calculation device for realizing clock domain crossing based on the FPGA. The invention can meet the application requirements of high real-time performance and less output delay and has the characteristic of less resource consumption.
The invention is realized by the following technical scheme:
The device for cross-clock-domain time domain convolution calculation based on an FPGA comprises RAM_coef, FIFO_coef and FIFO_conv implemented on the FPGA;
RAM_coef is a RAM in which the convolution coefficients are stored in advance;
FIFO_coef is (M-1) buffer FIFOs for storing convolution coefficients;
FIFO_conv is a buffer FIFO for storing the intermediate results of the multiply-add operation, and there are M FIFO_conv;
RAM_coef and the (M-1) FIFO_coef form a pipeline structure through which the coefficients are passed in sequence according to the signal timing, and RAM_coef and the (M-1) FIFO_coef correspond one-to-one with the M multipliers and supply the corresponding convolution coefficients for the time sequence signals of the M multipliers;
the M adders correspond one-to-one with the M multipliers, the M multipliers and their corresponding adders work in parallel, (M-1) intermediate multiply-add results are generated before the next signal arrives, and M FIFO_conv store the intermediate multiply-add results of the M signal paths respectively.
Preferably, the device of the present invention further comprises an input signal interface Sig_din;
the input signal interface Sig_din is used for receiving the time sequence signals and distributing them to the corresponding multipliers.
Preferably, when the Xth signal is input and 1 ≤ X ≤ M, the Xth signal is input into the Xth multiplier for operation;
when the Xth signal is input and M < X ≤ 2M, all multiplication operations of the (X-M)th signal have just finished, so the (X-M)th multiplier is idle and can be used for the multiplication of the Xth signal;
this cycle repeats, realizing the multiplexing of the M multipliers.
Preferably, the multiplication result output by the Xth multiplier is added to the intermediate multiply-add result stored in the (X-1)th FIFO_conv to obtain the intermediate multiply-add result of the Xth adder; the first signal of this intermediate result is output as the convolution calculation result, and the remaining signals are stored in the Xth FIFO_conv(X), where 1 ≤ X ≤ M.
Preferably, when the Xth signal is input and 1 ≤ X ≤ M, the Xth multiplier and the Xth adder perform the multiply-add operation on the Xth signal to obtain its multiply-add result; the first signal of this result is output as the convolution calculation result, and the remaining signals are stored in the Xth FIFO_conv(X);
when the Xth signal is input and M < X ≤ 2M, the multiply-add operation of the (X-M)th signal has just finished and the data in the (X-M)th FIFO_conv(X-M) has been read and used, so FIFO_conv(X-M) is used to store the multiply-add result of the Xth signal;
this cycle repeats, realizing the multiplexing of the M adders and FIFO_conv.
Preferably, the length of each of the (M-1) FIFO_coef of the present invention is greater than N;
N is the ratio of the signal processing clock frequency to the input signal clock frequency.
Preferably, the method of the present invention comprises:
when a signal is input, the convolution coefficients are read cyclically in sequence from RAM_coef, the read coefficients are multiplied by signals No. 1, M+1, 2M+1, ... to obtain the intermediate result mult1, and the coefficients are written into FIFO_coef(1);
after N clocks, signals No. 2, M+2, 2M+2, ... are input; at this time the coefficients are read out from FIFO_coef(1), multiplied by signals No. 2, M+2, 2M+2, ... to obtain the intermediate result mult2, and written into FIFO_coef(2);
and so on in turn, until the coefficients are read out from the last stage FIFO_coef(M-1) and multiplied by signals No. M, 2M, 3M, ... to obtain the intermediate result multM.
Preferably, the method of the present invention further comprises:
adding the multiplication result of the 1st signal to the output of the Mth FIFO_conv(M) to obtain the intermediate multiply-add result conv_tmp(1) of the 1st signal, outputting the first signal in conv_tmp(1) as the convolution calculation result, and storing the remaining signals in the 1st FIFO_conv(1);
adding the multiplication result of the 2nd signal to the output of the 1st FIFO_conv(1) to obtain the intermediate multiply-add result conv_tmp(2) of the 2nd signal, outputting the first signal in conv_tmp(2) as the convolution calculation result, and storing the remaining signals in the 2nd FIFO_conv(2);
and so on in turn, adding the multiplication result of the Mth signal to the output of the (M-1)th FIFO_conv(M-1) to obtain the intermediate multiply-add result conv_tmp(M) of the Mth signal, outputting the first signal in conv_tmp(M) as the convolution calculation result, and storing the remaining signals in the Mth FIFO_conv(M);
when the (M+1)th signal is input, M × N clocks have elapsed, the multiply-add operation of the 1st signal has just finished and the data in FIFO_conv(1) has been read and used; the addition for this signal multiplexes the adder used by the 1st signal, and the intermediate result is stored in the 1st FIFO_conv(1);
this cycle repeats, realizing the multiplexing of the M adders and the M FIFO_conv.
In a third aspect, the invention provides a real-time data processing system that uses the above FPGA-based cross-clock-domain time domain convolution calculation device to perform convolution calculation on time sequence signals.
In a fourth aspect, the invention provides a radar data processing system that uses the above FPGA-based cross-clock-domain time domain convolution calculation device to perform convolution calculation on radar signals.
The invention has the following advantages and beneficial effects:
1. Compared with frequency domain convolution calculation, the invention reduces time delay and improves real-time performance.
2. Compared with conventional time domain convolution calculation, the invention reduces the consumption of resources such as multipliers and adders.
3. The invention can be widely popularized and used in application occasions with high real-time processing requirements and limited resources, and is particularly suitable for scenes such as data processing of a radar system.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
FIG. 1 is a schematic block diagram of the device of the present invention.
FIG. 2 is a timing diagram showing the relationship between the input signals, the convolution coefficients and the clock according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples and accompanying drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not meant to limit the present invention.
Examples
The embodiment provides a time domain convolution calculating device for realizing clock domain crossing based on an FPGA.
The convolution calculation principle is as follows:
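$$y[n] = \sum_{k=0}^{N_{coef}-1} h[k]\,x[n-k]$$
In the formula, y[n] is the convolution output, h[n] is the convolution coefficient sequence, x[n] is the signal to be convolved, and N_coef is the number of convolution coefficients. As can be seen from the formula: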
1) the convolution calculation process is mainly multiplication and addition calculation.
2) Each signal to be convolved is multiplied by all coefficients.
3) The current output value is related to the historical input value.
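The three observations above can be made concrete with a minimal Python sketch of direct time domain convolution. It is an illustration only and not part of the patent; the name `streaming_convolution` and the use of a `deque` as the input history are assumptions made for this example.

```python
from collections import deque

def streaming_convolution(h, samples):
    """Return one output per input sample: y[n] = sum_k h[k] * x[n-k]."""
    # history[k] holds x[n-k]; inputs before the first sample are taken as zero.
    history = deque([0] * len(h), maxlen=len(h))
    outputs = []
    for sample in samples:
        history.appendleft(sample)  # the oldest input drops off the right end
        # Every output is a multiply-add over all coefficients and past inputs.
        outputs.append(sum(coef * past for coef, past in zip(h, history)))
    return outputs

y = streaming_convolution([1, 2, 3], [1, 0, 2])  # [1, 2, 5]
```

Each call to `sum` performs one multiplication per coefficient, which is why a long coefficient sequence is costly when every product has its own multiplier.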
To multiplex the multipliers and adders as many times as possible, the multiplication and addition for one signal must be completed as quickly as possible, so that a multiplier and adder that have finished one signal can then be used for other signals. Meanwhile, to minimize the accumulation of intermediate results, this embodiment organizes the whole calculation as a pipeline, so that the intermediate results of the multipliers and adders are always kept in an "optimal state": the current input signal needs only one multiplication and one addition with the latest intermediate result to obtain the current convolution result, and the multiply-add results of historical inputs never need to be recalculated.
In this embodiment, the clock frequency of the signal processing (convolution calculation) is N times the clock frequency of the input signal (N is a positive integer), and the number of convolution coefficients is M × N (M is a positive integer); if there are fewer coefficients, zeros are padded at the tail. Because the signal processing clock is N times faster than the input signal clock, N coefficients can be read out and multiplied and added during each input signal period. This clock difference can be used to multiplex the multipliers and adders, reducing the multiplier and adder resources by a factor of N.
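As a small worked example of the sizing rule above, the following sketch derives M from the coefficient count and the clock ratio N and pads the coefficients to M × N. The helper name `plan_convolution` is hypothetical, and the comparison with a one-multiplier-per-coefficient design is an assumption used only to illustrate the factor-of-N saving.

```python
def plan_convolution(coefs, n):
    """Return (M, zero-padded coefficients) for a processing/input clock ratio n."""
    m = -(-len(coefs) // n)                           # ceiling division
    padded = list(coefs) + [0] * (m * n - len(coefs))  # zero padding at the tail
    return m, padded

m, padded = plan_convolution([1, 2, 3, 4, 5], n=2)
# m == 3 multipliers serve len(padded) == 6 coefficients, whereas a design with
# one multiplier per coefficient would need 6.
```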
As shown in FIG. 1, the device of this embodiment includes RAM_coef, FIFO_coef and FIFO_conv implemented on the FPGA.
RAM_coef is a RAM storing the convolution coefficients.
FIFO_coef is a buffer FIFO storing convolution coefficients; there are (M-1) of them, namely FIFO_coef(1), FIFO_coef(2), ..., FIFO_coef(M-1) shown in FIG. 1.
FIFO_conv is a buffer FIFO storing multiply-add results; there are M of them, namely FIFO_conv(1), FIFO_conv(2), ..., FIFO_conv(M) shown in FIG. 1.
In FIG. 1, Sig_din is the input signal interface for receiving the time sequence signals; 1, 2, ..., M, M+1, M+2, ... are input signal serial numbers; mult1 to multM are multiplication results; and conv_tmp is an intermediate multiply-add result.
RAM_coef and the (M-1) FIFO_coef form a structure (pipeline structure) through which the coefficients are passed in sequence according to the signal timing, and RAM_coef and the (M-1) FIFO_coef correspond one-to-one with the M multipliers and supply the corresponding convolution coefficients for the time sequence signals of the M multipliers.
The M paths of time sequence signals correspond one-to-one with the M multipliers. The M multipliers and their corresponding adders work in parallel; (M-1) intermediate results are generated before the next signal arrives, and M FIFO_conv store the intermediate multiply-add results of the M signal paths respectively.
The convolution coefficients are stored in one RAM, and (M-1) FIFOs buffer the coefficients so that M signals can be multiplied simultaneously. When the Xth signal (M < X ≤ 2M) is input, all multiplications of the (X-M)th signal have just finished, so the (X-M)th multiplier is idle and can be used for the multiplications of the Xth signal. Repeating this cycle keeps the convolution calculation continuous and maximizes the multiplexing of the multipliers.
As shown in FIG. 2, 1, 2, ..., M, M+1, M+2, ... are input signal serial numbers, and each signal requires M × N multiply-add operations. As can be seen from FIG. 2, when the (M+1)th signal is input, M × N clocks have elapsed and the multiply-add operations of the 1st signal have just finished. The multiplier and adder used by the 1st signal are then idle and can be multiplexed by the (M+1)th signal. By analogy, the M multipliers and adders are multiplexed, reducing the multiplier and adder resources by a factor of N.
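The timing relationship described for FIG. 2 can be checked with a short sketch. The function name `schedule` and the convention that the 1st signal arrives at processing clock 0 are assumptions for illustration; the patent itself only states that signals are N processing clocks apart and that each occupies its multiplier and adder for M × N clocks.

```python
def schedule(signal_index, m, n):
    """Return (lane, start_clock, free_clock) for a 1-based signal index.

    lane        -- which of the M multiplier/adder pairs handles the signal
    start_clock -- processing clock at which the signal arrives
    free_clock  -- first processing clock at which the lane is idle again
    """
    lane = (signal_index - 1) % m + 1
    start_clock = (signal_index - 1) * n
    free_clock = start_clock + m * n
    return lane, start_clock, free_clock

# With M = 3 and N = 2, signal 1 frees lane 1 at clock 6,
# which is exactly when signal M+1 = 4 arrives for the same lane.
first = schedule(1, m=3, n=2)   # (1, 0, 6)
fourth = schedule(4, m=3, n=2)  # (1, 6, 12)
```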
In this embodiment, the convolution calculation is performed based on the device architecture shown in fig. 1, and the specific process includes:
Before the signal arrives, the convolution coefficients are stored in RAM_coef, and (M-1) FIFO_coef, each with a length greater than N, are defined to buffer the convolution coefficients.
When a signal is input, the convolution coefficients stored in advance are read cyclically in sequence from RAM_coef, the read coefficients are multiplied by signals No. 1, M+1, 2M+1, ... to obtain the intermediate result mult1, and the coefficients are written into FIFO_coef(1).
After N clocks, signals No. 2, M+2, 2M+2, ... are input; at this time the coefficients are read out from FIFO_coef(1), multiplied by signals No. 2, M+2, 2M+2, ... to obtain the intermediate result mult2, and written into FIFO_coef(2).
And so on in turn, until the coefficients are read out from the last stage FIFO_coef(M-1) and multiplied by signals No. M, 2M, 3M, ... to obtain the intermediate result multM.
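The coefficient pipeline described above can be modelled clock by clock with a minimal behavioral sketch. This is a software illustration under stated assumptions, not the patent's hardware design: the name `coefficient_streams` is hypothetical, each FIFO is written before it is read within a clock, and FIFO_coef(s) starts being read s × N clocks after the first signal arrives.

```python
from collections import deque

def coefficient_streams(coefs, m, n, clocks):
    """Simulate RAM_coef feeding FIFO_coef(1)..FIFO_coef(M-1) for `clocks` cycles.

    Returns (seen, max_depth): seen[p] lists the coefficients presented to
    multiplier p+1, and max_depth is the largest FIFO occupancy observed.
    """
    total = m * n
    assert len(coefs) == total, "coefficients must be zero-padded to M*N"
    fifos = [deque() for _ in range(m - 1)]
    seen = [[] for _ in range(m)]
    max_depth = 0
    for t in range(clocks):
        carry = coefs[t % total]          # RAM_coef is read cyclically
        seen[0].append(carry)             # multiplier 1 uses it directly
        for stage in range(m - 1):
            fifos[stage].append(carry)    # write into FIFO_coef(stage+1)
            max_depth = max(max_depth, len(fifos[stage]))
            if t < (stage + 1) * n:       # the next signal has not arrived yet
                break
            carry = fifos[stage].popleft()
            seen[stage + 1].append(carry) # multiplier stage+2 uses it
    return seen, max_depth

seen, max_depth = coefficient_streams(list(range(6)), m=3, n=2, clocks=12)
```

In this model every multiplier sees the same cyclic coefficient sequence, delayed by N clocks per stage, and the FIFO occupancy peaks at N + 1 entries, which is consistent with the requirement that each FIFO_coef be longer than N.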
In the convolution calculation process, the output result of the current signal is the multiplication result of the current signal added to the multiplication results of the previous N_coef-1 signals (N_coef is the number of convolution coefficients). Since the signal processing clock is faster than the signal input clock and M multipliers are working, (M-1) intermediate results are generated before the next signal arrives, and therefore M FIFOs are defined for storing the intermediate results of the multiply-add operation.
The multiplication result multX of the Xth signal is added to the multiply-add result of the (X-1)th signal to obtain the intermediate result conv_tmp(X); the first signal in conv_tmp(X) is output as the convolution calculation result, and the remaining signals are stored in FIFO_conv(X).
The method specifically comprises the following steps:
When X = 1, the multiplication result mult1 of the 1st signal is added to the output of FIFO_conv(M) (all zeros) to obtain the intermediate result conv_tmp(1); the first signal in conv_tmp(1) is output as the convolution calculation result, and the remaining signals are stored in FIFO_conv(1).
When X = 2, the multiplication result mult2 of the 2nd signal is added to the multiply-add result of the 1st signal (i.e. the output of FIFO_conv(1)) to obtain the intermediate result conv_tmp(2); the first signal in conv_tmp(2) is output as the convolution calculation result, and the remaining signals are written into FIFO_conv(2).
And so on in turn: when X = M, the multiplication result multM of the Mth signal is added to the multiply-add result of the (M-1)th signal (i.e. the output of FIFO_conv(M-1)) to obtain the intermediate result conv_tmp(M); the first signal in conv_tmp(M) is output as the convolution calculation result, and the remaining signals are written into FIFO_conv(M).
When X = M+1, M × N clocks have elapsed and the multiply-add operation of the 1st signal has just finished, so the data in FIFO_conv(1) (the intermediate result of the 1st signal) has essentially all been read and used. The addition for this signal can therefore reuse the adder used by the 1st signal, and its result can be stored in FIFO_conv(1). Repeating this cycle multiplexes not only the adders but also the FIFOs, which greatly improves the utilization of the FPGA's internal resources.
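The adder and FIFO_conv chain described above can be checked against direct convolution with a minimal behavioral sketch. It is a simplification made for illustration and not the patent's implementation: the function names are hypothetical, and each signal's M × N multiply-add results are computed in one step per signal rather than clock by clock, so only the data flow between FIFO_conv stages is modelled, not the timing.

```python
from collections import deque

def direct_convolution(h, x):
    """Reference result: y[j] = sum_k h[k] * x[j-k], with x[i] = 0 for i < 0."""
    return [sum(h[k] * x[j - k] for k in range(len(h)) if j - k >= 0)
            for j in range(len(x))]

def pipelined_convolution(h, x, m, n):
    """Sample-level model of the M multiplier/adder lanes and M FIFO_conv."""
    taps = m * n
    coefs = list(h) + [0] * (taps - len(h))     # zero padding at the tail
    fifo_conv = [deque() for _ in range(m)]
    outputs = []
    for j, sample in enumerate(x):
        lane = j % m                            # multiplier/adder reused every M signals
        prev = (lane - 1) % m                   # FIFO_conv written by the previous signal
        history = list(fifo_conv[prev])
        fifo_conv[prev].clear()
        history += [0] * (taps - len(history))  # an empty FIFO contributes zeros
        conv_tmp = [coefs[k] * sample + history[k] for k in range(taps)]
        outputs.append(conv_tmp[0])             # first value is the finished output
        fifo_conv[lane].extend(conv_tmp[1:])    # the rest waits for the next signal
    return outputs

h, x = [1, 2, 3], [1, 0, 0, 2, 1]
assert pipelined_convolution(h, x, m=2, n=2) == direct_convolution(h, x)
```

The first element of conv_tmp is complete because all earlier signals have already contributed to it through the FIFO chain, which is why it can be output immediately when the current signal arrives.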
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A device for cross-clock-domain time domain convolution calculation based on an FPGA, characterized by comprising RAM_coef, FIFO_coef and FIFO_conv implemented on the FPGA;
RAM_coef is a RAM in which the convolution coefficients are stored in advance;
FIFO_coef is (M-1) buffer FIFOs for storing convolution coefficients;
RAM_coef and the (M-1) FIFO_coef form a pipeline structure through which the coefficients are passed in sequence according to the signal timing, and RAM_coef and the (M-1) FIFO_coef correspond one-to-one with the M multipliers and supply the corresponding convolution coefficients for the time sequence signals of the M multipliers;
the M adders correspond one-to-one with the M multipliers, the M multipliers and their corresponding adders work in parallel, (M-1) intermediate multiply-add results are generated before the next signal arrives, and M FIFO_conv store the intermediate multiply-add results of the M signal paths respectively.
2. The device of claim 1, further comprising an input signal interface Sig_din;
the input signal interface Sig_din is used for receiving the time sequence signals and distributing them to the corresponding multipliers.
3. The device for cross-clock-domain time domain convolution calculation based on an FPGA according to claim 1, wherein when the Xth signal is input and 1 ≤ X ≤ M, the Xth signal is input into the Xth multiplier for operation;
when the Xth signal is input and M < X ≤ 2M, all multiplication operations of the (X-M)th signal have just finished, so the (X-M)th multiplier is idle and can be used for the multiplication of the Xth signal;
this cycle repeats, realizing the multiplexing of the M multipliers.
4. The device according to claim 1, wherein the multiplication result output by the Xth multiplier is added to the intermediate multiply-add result stored in the (X-1)th FIFO_conv to obtain the intermediate multiply-add result of the Xth adder, the first signal of this intermediate result is output as the convolution calculation result, and the remaining signals are stored in the Xth FIFO_conv(X), where 1 ≤ X ≤ M.
5. The device according to claim 1, wherein when the Xth signal is input and 1 ≤ X ≤ M, the Xth multiplier and the Xth adder perform the multiply-add operation on the Xth signal to obtain its multiply-add result, the first signal of this result is output as the convolution calculation result, and the remaining signals are stored in the Xth FIFO_conv(X);
when the Xth signal is input and M < X ≤ 2M, the multiply-add operation of the (X-M)th signal has just finished and the data in the (X-M)th FIFO_conv(X-M) has been read and used, so FIFO_conv(X-M) is used to store the multiply-add result of the Xth signal;
this cycle repeats, realizing the multiplexing of the M adders and FIFO_conv.
6. The device according to claim 1, wherein the length of each of the (M-1) FIFO_coef is greater than N;
N is the ratio of the signal processing clock frequency to the input signal clock frequency.
7. A method using the device for cross-clock-domain time domain convolution calculation based on an FPGA according to any one of claims 1 to 6, comprising:
when a signal is input, reading the convolution coefficients cyclically in sequence from RAM_coef, multiplying the read coefficients by signals No. 1, M+1, 2M+1, ... to obtain the intermediate result mult1, and writing the coefficients into FIFO_coef(1);
after N clocks, signals No. 2, M+2, 2M+2, ... are input; at this time the coefficients are read out from FIFO_coef(1), multiplied by signals No. 2, M+2, 2M+2, ... to obtain the intermediate result mult2, and written into FIFO_coef(2);
and so on in turn, until the coefficients are read out from the last stage FIFO_coef(M-1) and multiplied by signals No. M, 2M, 3M, ... to obtain the intermediate result multM.
8. The method of claim 7, further comprising:
adding the multiplication result of the 1st signal to the output of the Mth FIFO_conv(M) to obtain the intermediate multiply-add result conv_tmp(1) of the 1st signal, outputting the first signal in conv_tmp(1) as the convolution calculation result, and storing the remaining signals in the 1st FIFO_conv(1);
adding the multiplication result of the 2nd signal to the output of the 1st FIFO_conv(1) to obtain the intermediate multiply-add result conv_tmp(2) of the 2nd signal, outputting the first signal in conv_tmp(2) as the convolution calculation result, and storing the remaining signals in the 2nd FIFO_conv(2);
and so on in turn, adding the multiplication result of the Mth signal to the output of the (M-1)th FIFO_conv(M-1) to obtain the intermediate multiply-add result conv_tmp(M) of the Mth signal, outputting the first signal in conv_tmp(M) as the convolution calculation result, and storing the remaining signals in the Mth FIFO_conv(M);
when the (M+1)th signal is input, M × N clocks have elapsed, the multiply-add operation of the 1st signal has just finished and the data in FIFO_conv(1) has been read and used; the addition for this signal multiplexes the adder used by the 1st signal, and the intermediate result is stored in the 1st FIFO_conv(1);
this cycle repeats, realizing the multiplexing of the M adders and the M FIFO_conv.
9. A real-time data processing system, characterized in that convolution calculation is performed on time-series signals using the FPGA-based time domain convolution computing device for realizing clock domain crossing according to any one of claims 1 to 6.
10. A radar data processing system, characterized in that convolution calculation is performed on radar signals using the FPGA-based time domain convolution computing device for realizing clock domain crossing according to any one of claims 1 to 6.
CN202111304060.9A 2021-11-05 2021-11-05 Time domain convolution computing device and method for realizing clock domain crossing based on FPGA Pending CN114020240A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111304060.9A CN114020240A (en) 2021-11-05 2021-11-05 Time domain convolution computing device and method for realizing clock domain crossing based on FPGA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111304060.9A CN114020240A (en) 2021-11-05 2021-11-05 Time domain convolution computing device and method for realizing clock domain crossing based on FPGA

Publications (1)

Publication Number Publication Date
CN114020240A true CN114020240A (en) 2022-02-08

Family

ID=80061104

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111304060.9A Pending CN114020240A (en) 2021-11-05 2021-11-05 Time domain convolution computing device and method for realizing clock domain crossing based on FPGA

Country Status (1)

Country Link
CN (1) CN114020240A (en)

Similar Documents

Publication Publication Date Title
Vun et al. A new RNS based DA approach for inner product computation
CN111506294B (en) FPGA (field programmable Gate array) implementation device and method based on FBLMS (fiber bulk mean Square) algorithm of block floating point
CN101741348B (en) Multiphase filter, digital signal processing system and filtering method
CN106803750B (en) Multichannel running water FIR filter
CN110058201B (en) Method for realizing multi-waveform multi-rate time domain pulse pressure based on FPGA resource multiplexing
CN103544111B (en) A kind of hybrid base FFT method based on real-time process
CN101207372B (en) Apparatus and method for implementation of fixed decimal sampling frequency conversion
CN110677138B (en) FIR filter based on error-free probability calculation
CN114020240A (en) Time domain convolution computing device and method for realizing clock domain crossing based on FPGA
CN110620566B (en) FIR filtering system based on combination of random calculation and remainder system
US9268744B2 (en) Parallel bit reversal devices and methods
CN114185014B (en) Parallel convolution method and device applied to radar signal processing
CN115033293A (en) Zero-knowledge proof hardware accelerator, generating method, electronic device and storage medium
Hong et al. Implementation of FIR filter on FPGA using DAOBC algorithm
CN111221496B (en) Method for realizing floating point data accumulation by using FPGA
CN115982527B (en) FPGA-based time-frequency domain transformation algorithm implementation method
CN110808935B (en) Accurate and efficient implementation method and device for autocorrelation operation of linear frequency modulation signal
US20230179315A1 (en) Method for Disseminating Scaling Information and Application Thereof in VLSI Implementation of Fixed-Point FFT
CN112260980B (en) Hardware system for realizing phase noise compensation based on advance prediction and realization method thereof
CN109445748B (en) Method and system for rapidly solving median
CN101355701B (en) Device and method for inverse transformation of integer of DCT
CN115189675A (en) High-order FIR digital filter based on protocol calculation
CN117670644A (en) Two-dimensional FFT hardware accelerator based on SRAM memory calculation
CN1758694A (en) Device for generation confortable noise
Zhou et al. A low power FIR filter structure based on a modified distributed arithmetic algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination