CN108509382B

CN108509382B - Method for realizing quick convolution operation of super-long sequence based on FPGA

Info

Publication number: CN108509382B
Application number: CN201810276062.3A
Authority: CN
Inventors: 孙桂玲; 王鹏霄; 郑祥雨; 陈雨歌; 辛港涛; 陈江韬
Original assignee: Nankai University
Current assignee: Nankai University
Priority date: 2018-03-27
Filing date: 2018-03-27
Publication date: 2022-06-07
Anticipated expiration: 2038-03-27
Also published as: CN108509382A

Abstract

The invention discloses a method for fast convolution of ultra-long sequences on an FPGA (field programmable gate array). Two signals are first sampled by an AD sampling module, and the two resulting data streams are stored sequentially in two static random access memories (SRAMs) external to the FPGA. The data are then read out of the two SRAMs in bursts, in mutually reversed order, and fed to the fast convolution operation, while the large volume of results is stored in a DDR2 memory external to the FPGA. In this way the fast convolution of the two signals is realized on the FPGA.

Description

Method for realizing quick convolution operation of super-long sequence based on FPGA

[ technical field ]

The invention belongs to the field of high-speed real-time digital signal processing for spatial localization. It performs high-speed, large-depth sampling and convolution on two signals in order to resolve a tiny phase difference between them. The AD sampling module is the dual-channel 12-bit AD acquisition module AN926, and the FPGA is from the Altera Cyclone IV series, model EP4CE115F29C7N. The two SRAMs that store the data are read in burst mode, with the burst length increasing from one burst to the next, which greatly improves the operation speed and allows the fast convolution of the two signals to be completed in a short time.

[ background of the invention ]

In spatial localization problems, the phase difference between two signals is an important parameter: accurate spatial localization can be achieved from it, and convolution is an important method for finding it.

When the phase difference between two signals is extremely small, as with optical signals, it can be resolved only by convolving the signals after sampling them at a high rate and to a large depth. At present, convolution of two discrete digital signals is implemented mainly in software. Software executes ultra-long sequence convolution slowly and cannot meet the real-time requirements of high-speed signal processing. An FPGA, by contrast, offers abundant hardware resources and parallel operation, which can be exploited to greatly improve the operation speed and make real-time spatial localization possible.

[ summary of the invention ]

The method uses an FPGA to perform fast convolution of two signals. The technical scheme comprises the following aspects:

1. AD module data acquisition

The two analog signals are sampled by the dual-channel 12-bit AD acquisition module AN926 at a sampling frequency of 50 MHz for 5.243 ms, yielding two groups of 262144 samples each.
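As a quick consistency check on these figures, the sketch below recomputes the capture time from the sample count and sampling frequency stated above. The variable names are illustrative only and do not come from the patent.

```python
# Figures taken from the text; names are illustrative.
SAMPLE_RATE_HZ = 50_000_000   # 50 MHz sampling frequency
NUM_SAMPLES = 262_144         # samples per channel (2**18)

# Time needed to capture NUM_SAMPLES samples at SAMPLE_RATE_HZ, in milliseconds.
capture_time_ms = NUM_SAMPLES / SAMPLE_RATE_HZ * 1e3

print(f"{capture_time_ms:.3f} ms per channel")
```

The result agrees with the stated sampling time of 5.243 ms, and 262144 samples fit exactly into a 256K-word memory.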

2. Writing data to the SRAM outside the FPGA

The external static random access memory consists of two SRAM chips, model IS61LV25616AL, connected to the FPGA through GPIO ports. Each SRAM provides 256K × 16 bit of storage, which meets the storage requirement.

Because the SRAM data bus is 16 bits wide while the data collected by the AD module are only 12 bits wide, lines D12, D13, D14 and D15 of the SRAM are tied directly to ground in the external hardware design.

Writing data to the external SRAM is done through a tri-state bus controlled by logic inside the FPGA. When data acquired by the AD module need to be written to the SRAM, the FPGA gives the AD module control of the bus; when the write is finished, the FPGA makes the AD module release the bus, and control of the bus passes to the fast convolution module.

3. Reading data from SRAM

How data are read from the SRAM is a critical part of the method, because the read order must match the fast convolution scheme. The external SRAMs are read in burst mode: the burst length starts at 256 data words and grows by 256 data words with every burst. The two SRAMs are read in exactly opposite order. In the first burst, SRAM_A is read in the order 0, 1, 2, 3 … 255 and SRAM_B in the order 255, 254, 253 … 2, 1, 0. In the second burst, SRAM_A is read in the order 0, 1, 2, 3 … 511 and SRAM_B in the order 511, 510, 509 … 2, 1, 0.
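The read order described above can be sketched as a small address generator. This is an illustrative software model, not the patent's control logic; the function name `burst_addresses` and the constant `BURST_STEP` are hypothetical.

```python
BURST_STEP = 256  # burst length grows by this many words per burst (from the text)

def burst_addresses(k, step=BURST_STEP):
    """Return (addrs_a, addrs_b) for the k-th burst, with k = 1, 2, ...

    SRAM_A is read forward and SRAM_B backward over the same address range.
    """
    length = k * step
    addrs_a = list(range(length))              # 0, 1, ..., length-1
    addrs_b = list(range(length - 1, -1, -1))  # length-1, ..., 1, 0
    return addrs_a, addrs_b

a1, b1 = burst_addresses(1)
a2, b2 = burst_addresses(2)
print(a1[:4], a1[-1], b1[:3], b1[-1])
print(len(a2), a2[-1], b2[0])
```

At every step of a burst, the two addresses sum to the burst length minus one, which is the index pairing that one term of a convolution sum requires.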

4. Fast convolution operation

The data read from the SRAMs are sent to the fast convolution module. In the first burst the burst length is 256: the data are fed sequentially, in parallel, to 256 multipliers, and the multiplication results are latched, with each latch counting independently in sequence. When all latches have reached a count of 256, that is, when the first burst ends, the results are output simultaneously, giving the 256 convolution results computed by the first burst. The second burst, of length 512, follows immediately; the read data are again sent to the fast convolution module, and the results are output when all latch counts reach 512. Proceeding in this way, the complete convolution result is obtained after a total of 1024 bursts.
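The following is a minimal behavioral sketch of one way the burst scheme can produce a block of convolution outputs per burst. The text does not say how operands are distributed to the multipliers, so the shift register on the SRAM_A stream (`delay`) is an assumption made for illustration, as are the function names and the small `lanes` value used in place of 256. The model is checked against a direct multiply-accumulate reference.

```python
def direct_convolution(a, b):
    """Reference: first len(a) outputs of the linear convolution of a and b."""
    n = len(a)
    return [sum(a[i] * b[m - i] for i in range(m + 1)) for m in range(n)]

def burst_convolution(a, b, lanes):
    """Model the burst scheme with `lanes` parallel multiply-accumulators.

    Assumption (not stated in the source): lane j sees the SRAM_A stream
    delayed by j samples, so it accumulates output index (length - 1 - j).
    """
    n = len(a)
    assert len(b) == n and n % lanes == 0
    out = [0] * n
    for k in range(1, n // lanes + 1):
        length = k * lanes               # burst length grows by `lanes` per burst
        acc = [0] * lanes                # one accumulator per multiplier
        delay = [0] * lanes              # delay[j] holds a[t - j], zero before start
        for t in range(length):
            delay = [a[t]] + delay[:-1]  # SRAM_A read forward: 0 .. length-1
            b_val = b[length - 1 - t]    # SRAM_B read backward: length-1 .. 0
            for j in range(lanes):
                acc[j] += delay[j] * b_val
        for j in range(lanes):           # burst ends: output `lanes` results at once
            out[length - 1 - j] = acc[j]
    return out

a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [2, -1, 0, 3, 1, 4, -2, 5]
result = burst_convolution(a, b, lanes=4)
print(result == direct_convolution(a, b))
```

With `lanes=256` and sequences of length 262144, this model runs 1024 bursts and yields 256 outputs per burst, matching the counts given in the text.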

5. Storing the operation results in the DDR2 outside the FPGA

DDR2 SDRAM offers large capacity and high read/write speed, and it is read and written in burst mode. After the fast convolution module finishes one burst read operation and outputs its results, the DDR2 enters a write burst of length 256, and the results are written to the DDR2 sequentially in latch order. The DDR2 completes the write and releases the bus before the next read burst begins.

[ advantages and advantageous effects of the invention ]

The invention exploits the abundant hardware resources and parallel computation of the FPGA to implement fast convolution of ultra-long sequences in a hardware circuit, ensuring real-time high-speed signal processing while meeting the measurement precision. If a conventional software convolution method were used, reading data from memory one element at a time and multiplying and accumulating, convolving the two sequences of length 262144 would take 1 + 2 + 3 + … + 262143 + 262144 ≈ 3.436 × 10¹⁰ clock cycles, which at a 100 MHz clock is about 5 min 43 s. The present method needs only 256 + 512 + 768 + … + 261888 + 262144 ≈ 1.3435 × 10⁸ clock cycles, which at a 100 MHz clock is about 1.34 s, so the operation is about 255 times faster.
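The cycle counts above can be reproduced with the arithmetic-series formula. The sketch below assumes, as the text does, one clock cycle per memory read; the variable names are illustrative.

```python
N = 262_144              # sequence length (from the text)
LANES = 256              # parallel multipliers (from the text)
CLOCK_HZ = 100_000_000   # 100 MHz clock (from the text)

# Serial method: 1 + 2 + ... + N cycles.
serial_cycles = N * (N + 1) // 2

# Burst method: 256 + 512 + ... + N cycles, i.e. 256 * (1 + 2 + ... + 1024).
bursts = N // LANES
burst_cycles = LANES * bursts * (bursts + 1) // 2

serial_seconds = serial_cycles / CLOCK_HZ
burst_seconds = burst_cycles / CLOCK_HZ
speedup = serial_cycles / burst_cycles

print(serial_cycles, burst_cycles)
print(f"{serial_seconds:.1f} s vs {burst_seconds:.2f} s, speedup {speedup:.2f}")
```

The ratio simplifies to (N + 1) / (bursts + 1) = 262145 / 1025, so the speedup is set essentially by the number of parallel multipliers.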

[ description of the drawings ]

FIG. 1 is a schematic block diagram of a system;

FIG. 2 is a functional block diagram of a fast convolution module;

FIG. 3 shows the first sinusoidal signal;

FIG. 4 shows the second sinusoidal signal;

FIG. 5 is a diagram showing a small phase difference between two signals;

FIG. 6 is the result of a fast convolution operation;

FIG. 7 is a flow chart of the fast convolution method.

[ detailed description of embodiments ]

In order to more clearly illustrate the embodiments of the present invention, the present invention will be further described below with reference to the following drawings.

As shown in FIGS. 3-5, the convolution is performed on two sinusoidal signals with a small phase difference. After acquisition by the AD module, each signal has a length of 262144 samples, and the two signals are stored sequentially in the SRAMs outside the FPGA. Next, 1024 burst read operations are performed in sequence, with the read length starting at 256 and increasing by 256 each time, and the data read in opposite order from the two SRAMs are sent to the fast convolution module, as shown in FIG. 2. At the end of each burst, 256 convolution results become available simultaneously and are stored sequentially in the DDR2 outside the FPGA. After 1024 bursts the fast convolution operation is complete. The convolution result is shown in FIG. 6: although the phase difference between the two signals is extremely small, the peak can still be resolved slightly to the left of center.

Claims

1. A method for realizing ultra-long sequence fast convolution operation based on an FPGA, characterized in that: two paths of data acquired by an AD module are stored through a tri-state bus in two independent SRAMs outside the FPGA; the two SRAMs are then each read in reverse order within a burst length, the read data are sent to a fast convolution module, and the operation results are simultaneously stored in a DDR2 outside the FPGA; wherein reading the two SRAMs in reverse order within a burst length specifically means that the two SRAMs are read in exactly opposite order, the burst length starting at 256 data words and the read length increasing by 256 data words with each read burst; and wherein sending the read data to the fast convolution module specifically means that, in the first burst, the burst length is set to 256, the data are input sequentially and in parallel to 256 multipliers, the multiplication results are latched by latches, each latch counts independently in sequence, and when all latches have reached a count of 256, that is, when the first burst ends, the results are output simultaneously, giving the 256 convolution results computed by the first burst; the second burst, with a burst length of 512, then follows immediately, and in this way all convolution results are obtained after 1024 bursts.