CN108509382B - Method for realizing quick convolution operation of super-long sequence based on FPGA - Google Patents

Publication number
CN108509382B
Authority
CN
China
Prior art keywords
burst
fpga
data
convolution operation
fast convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810276062.3A
Other languages
Chinese (zh)
Other versions
CN108509382A (en)
Inventor
孙桂玲
王鹏霄
郑祥雨
陈雨歌
辛港涛
陈江韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nankai University
Original Assignee
Nankai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nankai University
Priority to CN201810276062.3A
Publication of CN108509382A
Application granted
Publication of CN108509382B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/15 Correlation function computation including computation of convolution operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7803 System on board, i.e. computer system on one or more PCB, e.g. motherboards, daughterboards or blades
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an FPGA (field-programmable gate array) based method for fast convolution of ultra-long sequences. Two signals are first acquired by an AD sampling module, and the two channels of sampled data are stored in order in two static random access memories (SRAMs) external to the FPGA. The data are then read out of the two SRAMs in bursts, in opposite address order, and fed to the fast convolution operation, while the large volume of results is stored in a DDR2 memory external to the FPGA, thereby realizing FPGA-based fast convolution of the two signals.

Description

Method for realizing quick convolution operation of super-long sequence based on FPGA
[ technical field ]
The invention belongs to the field of high-speed real-time digital signal processing for spatial positioning. It performs high-speed, large-depth sampling and convolution on two signals in order to resolve a very small phase difference between them. The AD sampling module is the dual-channel 12-bit AD acquisition module AN926, and the FPGA is an Altera Cyclone IV device, model EP4CE115F29C7N. The two SRAMs that store the data are read in burst mode with a burst length that grows from one burst to the next, which greatly increases the operation speed and allows the fast convolution of the two signals to be completed in a short time.
[ background of the invention ]
In spatial positioning, the phase difference between two signals is an important parameter: accurate spatial positioning can be achieved from it, and convolution is an important method for finding it.
When the phase difference between the two signals is extremely small, as with optical signals, it can be resolved only by sampling at a high rate and large depth and then convolving. At present, convolution of two discrete digital signals is mainly implemented in software, which is slow for ultra-long sequences and cannot meet the real-time requirements of high-speed signal processing. An FPGA, with its rich hardware resources and parallel operation, can greatly increase the operation speed and so make real-time spatial positioning possible.
[ summary of the invention ]
The method uses an FPGA to realize fast convolution of two signals. The technical scheme comprises the following aspects:
1. AD module data acquisition
The two analog signals are sampled by the dual-channel 12-bit AD acquisition module AN926 at a sampling frequency of 50 MHz for 5.243 ms, yielding two groups of 262144 data points when sampling completes.
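The sampling figures can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration; the constant and function names are made up for this example and do not come from the patent.

```python
# Cross-check of the sampling figures quoted in the text.
SAMPLE_RATE_HZ = 50_000_000   # 50 MHz sampling frequency (from the text)
SAMPLES_PER_CHANNEL = 262_144  # data points per channel (from the text)


def capture_time_ms(samples: int, rate_hz: int) -> float:
    """Time needed to capture `samples` points at `rate_hz`, in milliseconds."""
    return samples / rate_hz * 1e3


if __name__ == "__main__":
    # Roughly 5.243 ms, matching the stated sampling time.
    print(f"capture time: {capture_time_ms(SAMPLES_PER_CHANNEL, SAMPLE_RATE_HZ):.3f} ms")
    # 262144 is 2**18, i.e. exactly 256K words, the depth of one SRAM.
    print("is 2**18:", SAMPLES_PER_CHANNEL == 2 ** 18)
```

The capture depth therefore fills one 256K-word memory exactly, which is presumably why this depth was chosen.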
2. Writing data to the SRAM outside the FPGA
The external static random access memory consists of two SRAM chips, model IS61LV25616AL, connected to the FPGA through GPIO ports. Each SRAM provides 256K × 16 bit of storage, which meets the storage requirement.
Because the SRAM data bus is 16 bits wide while each AD sample to be stored is only 12 bits, data lines D12, D13, D14 and D15 of the SRAM are tied directly to ground in the external hardware design.
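As a software analogy for this wiring, a 12-bit sample occupies bits D0 to D11 of a 16-bit word and the upper four bits always read as zero. The helper names below are hypothetical, and the sketch assumes the AD output is handled as a raw 12-bit code; the patent does not say whether it is unsigned or two's complement.

```python
# Software analogy for storing a 12-bit sample in a 16-bit SRAM word
# whose lines D12..D15 are tied to ground (always read as 0).
SAMPLE_BITS = 12
SAMPLE_MASK = (1 << SAMPLE_BITS) - 1  # 0x0FFF


def to_sram_word(sample: int) -> int:
    """Place a raw 12-bit code in bits D0..D11; D12..D15 stay 0."""
    if not 0 <= sample <= SAMPLE_MASK:
        raise ValueError("sample does not fit in 12 bits")
    return sample & SAMPLE_MASK


def from_sram_word(word: int) -> int:
    """Recover the 12-bit code, ignoring the grounded upper bits."""
    return word & SAMPLE_MASK


if __name__ == "__main__":
    word = to_sram_word(0xABC)
    print(hex(word), hex(from_sram_word(word)))
```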
Data is written to the external SRAM through a tri-state bus controlled by logic inside the FPGA. When data acquired by the AD module needs to be written to SRAM, the FPGA grants the AD module use of the bus; when writing finishes, the FPGA has the AD module release the bus and hands bus control to the fast convolution module.
3. Reading data from SRAM
The way data is read from the SRAM is a critical part of the method, because it must match the fast convolution scheme. The external SRAM is read in burst mode: the burst length starts at 256 data words and increases by 256 with every read burst. The two SRAMs are read in exactly opposite order. In the first burst, SRAM_A is read in the order 0, 1, 2, 3 … 255 and SRAM_B in the order 255, 254, 253 … 2, 1, 0. In the second burst, SRAM_A is read in the order 0, 1, 2, 3 … 511 and SRAM_B in the order 511, 510, 509 … 2, 1, 0.
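The read pattern can be written down as a small address generator. This is a sketch of the addressing only, with a hypothetical function name and 1-based burst numbering chosen for the example; it does not model SRAM timing.

```python
# Address sequences for one read burst: SRAM_A ascending, SRAM_B descending.
BLOCK = 256  # burst length increment (from the text)


def burst_addresses(burst_index: int, block: int = BLOCK):
    """Return (addresses for SRAM_A, addresses for SRAM_B) for a 1-based burst."""
    if burst_index < 1:
        raise ValueError("burst_index is 1-based")
    length = burst_index * block          # 256, 512, 768, ...
    addr_a = range(0, length)             # 0, 1, 2, ..., length-1
    addr_b = range(length - 1, -1, -1)    # length-1, ..., 2, 1, 0
    return addr_a, addr_b


if __name__ == "__main__":
    for k in (1, 2):
        a, b = burst_addresses(k)
        print(k, list(a[:4]), "...", a[-1], "|", list(b[:3]), "...", b[-1])
```

At every clock of a burst the two addresses sum to the burst length minus one, which is the index pairing a convolution sum needs.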
4. Fast convolution operation
The data read from the SRAM is sent to the fast convolution module. In the first burst, the burst length is 256; the data are fed in sequence to 256 parallel multipliers, and the multiplication results are held in latches, each of which counts independently in sequence. When every latch has counted to 256, that is, when the first burst ends, the results are output simultaneously, giving the 256 convolution results computed by the first burst. The second burst, of length 512, begins immediately; the read data is again sent to the fast convolution module, and the results are output when every latch count reaches 512. Proceeding in this way, the complete convolution result is obtained after a total of 1024 bursts.
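A behavioural model may help make the burst scheme concrete. The sketch below rests on an interpretation rather than on details given in the patent: it assumes that burst k produces outputs n = 256(k-1) … 256k-1 of the convolution y[n] = Σ a[i]·b[n-i] for i = 0 … n, with one accumulator per output. The function names and the small block size in the demo are illustrative, and the inner loop runs serially where the hardware would use parallel multipliers.

```python
# Behavioural model of burst-wise convolution (interpretation, not the patent's RTL).

def burst_convolve(a, b, block=256):
    """First len(a) outputs of the linear convolution of a and b, burst by burst."""
    n_total = len(a)
    if len(b) != n_total or n_total % block != 0:
        raise ValueError("inputs must have equal length, a multiple of block")
    result = []
    for k in range(1, n_total // block + 1):
        length = k * block
        a_stream = a[:length]            # SRAM_A read in ascending order
        b_stream = b[:length][::-1]      # SRAM_B read in descending order
        acc = [0] * block                # one accumulator ("latch") per output
        for j in range(block):
            n = (k - 1) * block + j      # output index owned by accumulator j
            for i in range(n + 1):
                # b[n - i] sits at position length-1-(n-i) of the reversed stream
                acc[j] += a_stream[i] * b_stream[length - 1 - (n - i)]
        result.extend(acc)               # 'block' results leave at end of burst
    return result


def direct_convolve_head(a, b):
    """Reference: the first len(a) outputs of the textbook convolution sum."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(len(a))]


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5, 6, 7, 8]
    b = [8, 7, 6, 5, 4, 3, 2, 1]
    print(burst_convolve(a, b, block=4) == direct_convolve_head(a, b))
```

Under this reading each burst re-reads all earlier samples, which is why the burst length must grow by one block every time.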
5. Storing the operation results in the DDR2 outside the FPGA
DDR2 SDRAM offers large capacity and high read/write speed, and is read and written in burst mode. After the fast convolution module finishes one burst read operation and outputs its results, the DDR2 enters a write burst of length 256, and the results are written to the DDR2 in latch order. The DDR2 completes the write and releases the bus before the next read burst begins.
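The alternation of read bursts and write bursts can be listed as a simple schedule. The generator below is a hypothetical illustration of the ordering described in the text (a growing SRAM read burst, then a fixed 256-result DDR2 write); the phase labels are invented for the example and no bus timing is modelled.

```python
# Ordering of SRAM read bursts and DDR2 write bursts, as described in the text.
BLOCK = 256
TOTAL_BURSTS = 1024


def burst_schedule(total_bursts: int = TOTAL_BURSTS, block: int = BLOCK):
    """Yield (phase, burst_index, length) tuples in the order they occur."""
    for k in range(1, total_bursts + 1):
        yield ("sram_read", k, k * block)   # read burst grows by one block each time
        yield ("ddr2_write", k, block)      # results written before the next read


if __name__ == "__main__":
    events = list(burst_schedule())
    print(events[:4])
    written = sum(n for phase, _, n in events if phase == "ddr2_write")
    print("results written:", written)
```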
[ advantages and advantageous effects of the invention ]
The invention exploits the rich hardware resources and parallel computation of the FPGA to implement fast convolution of ultra-long sequences in a hardware circuit, ensuring real-time high-speed signal processing while meeting the required measurement precision. If a conventional software convolution method were used, reading data from memory one item at a time and multiplying and accumulating, convolving two sequences of length 262144 would take 1 + 2 + 3 + 4 + … + 262143 + 262144 = 3.436 × 10^10 clock cycles, or about 5 min 43 s at a 100 MHz clock. The present method needs only 256 + 512 + 768 + … + 261888 + 262144 = 1.3435 × 10^8 clock cycles, or about 1.34 s at a 100 MHz clock, an improvement in operation speed of about 255 times.
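The cycle counts quoted here follow from two arithmetic series and can be reproduced directly. The variable names in this sketch are illustrative; the figures themselves (sequence length, block size, clock rate) are taken from the text.

```python
# Reproduce the clock-cycle comparison from the text.
N = 262_144             # sequence length
BLOCK = 256             # burst length increment
CLOCK_HZ = 100_000_000  # 100 MHz clock

# Serial multiply-accumulate: 1 + 2 + ... + N cycles.
software_cycles = N * (N + 1) // 2

# Burst method: 256 + 512 + ... + N cycles, i.e. BLOCK * (1 + 2 + ... + N/BLOCK).
bursts = N // BLOCK
burst_cycles = BLOCK * bursts * (bursts + 1) // 2

software_seconds = software_cycles / CLOCK_HZ
burst_seconds = burst_cycles / CLOCK_HZ
speedup = software_cycles / burst_cycles

if __name__ == "__main__":
    print(f"software: {software_cycles:.4e} cycles, {software_seconds:.1f} s")
    print(f"burst:    {burst_cycles:.4e} cycles, {burst_seconds:.2f} s")
    print(f"speedup:  {speedup:.2f}x")
```

The ratio simplifies to (N + 1) / (N/BLOCK + 1), so the gain is bounded by the number of parallel multipliers.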
[ description of the drawings ]
FIG. 1 is a schematic block diagram of a system;
FIG. 2 is a functional block diagram of a fast convolution module;
FIG. 3 is the first sinusoidal signal;
FIG. 4 is the second sinusoidal signal;
FIG. 5 is a diagram showing the small phase difference between the two signals;
FIG. 6 is the result of the fast convolution operation;
FIG. 7 is a flow chart of the fast convolution method.
[ detailed description ]
To illustrate the embodiments of the invention more clearly, the invention is further described below with reference to the drawings.
As shown in FIGS. 3-5, convolution is performed on two sinusoidal signals with a small phase difference. After acquisition by the AD module, each signal has a length of 262144 samples, and the two are stored in order in the SRAMs outside the FPGA. Next, 1024 burst read operations are performed in sequence, with the read length starting at 256 and increasing by 256 each time; the data obtained by reading the two SRAMs in opposite order is sent to the fast convolution module, as shown in FIG. 2. At the end of each burst, 256 convolution results are obtained simultaneously and stored in order in the DDR2 outside the FPGA. After 1024 bursts the fast convolution operation is complete. The convolution result is shown in FIG. 6: although the phase difference between the two signals is extremely small, the peak can still be resolved slightly to the left of center.

Claims (1)

1. A method for realizing ultra-long sequence fast convolution operation based on FPGA is characterized in that: storing two paths of data acquired by the AD module in two independent SRAMs outside the FPGA through a three-state bus, then respectively reading the two SRAMs in reverse order within a burst length, sending the read data into a fast convolution module, and simultaneously storing an operation result in a DDR2 outside the FPGA; the method comprises the following steps of respectively reading two pieces of SRAM in a reverse order within a burst length, specifically, reading the two pieces of SRAM in a completely reverse order, wherein the burst length starts from 256 bits, and the read length is increased by 256 bits each time of reading burst; the read data are sent to a fast convolution module, specifically, during the first burst, the burst length is set to be 256, the data are sequentially input into 256 multipliers in parallel, multiplication results are latched by latches, each latch counts independently in sequence, when all the latches are full of 256 counts, namely the first burst is ended, the results are output simultaneously, the 256 convolution operation results calculated by the first burst can be obtained, then the data immediately enter the second burst, the burst length is 512, and in this way, all the convolution results can be obtained after 1024 bursts.
CN201810276062.3A 2018-03-27 2018-03-27 Method for realizing quick convolution operation of super-long sequence based on FPGA Active CN108509382B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810276062.3A CN108509382B (en) 2018-03-27 2018-03-27 Method for realizing quick convolution operation of super-long sequence based on FPGA

Publications (2)

Publication Number Publication Date
CN108509382A (en) 2018-09-07
CN108509382B (en) 2022-06-07

Family

ID=63377951

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810276062.3A Active CN108509382B (en) 2018-03-27 2018-03-27 Method for realizing quick convolution operation of super-long sequence based on FPGA

Country Status (1)

Country Link
CN (1) CN108509382B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101652757A (en) * 2007-04-13 2010-02-17 Rgb网络有限公司 Sdram convolutional interleaver with two paths
US8799750B1 (en) * 2011-05-09 2014-08-05 Xilinx, Inc. Convolutional interleaver for bursty memory access
CN104461934A (en) * 2014-11-07 2015-03-25 北京海尔集成电路设计有限公司 Time-domain deconvolution interweaving device and method suitable for DDR memorizer
CN105282083A (en) * 2015-11-03 2016-01-27 西安烽火电子科技有限责任公司 Burst-mode broadband data processing device and method based on FPGA chip
CN107403221A (en) * 2016-05-03 2017-11-28 想象技术有限公司 The hardware of convolutional neural networks is realized

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1089440A1 (en) * 1999-09-28 2001-04-04 TELEFONAKTIEBOLAGET L M ERICSSON (publ) Interleaver and method for interleaving a data bit sequence
CN1148881C (en) * 2002-01-30 2004-05-05 信息产业部电信传输研究所 Realising method for parallel cascade convolution code hardware decoder
US10572824B2 (en) * 2003-05-23 2020-02-25 Ip Reservoir, Llc System and method for low latency multi-functional pipeline with correlation logic and selectively activated/deactivated pipelined data processing engines
US20060236045A1 (en) * 2005-04-13 2006-10-19 Analog Devices, Inc. Apparatus for deinterleaving interleaved data using direct memory access
US7743287B2 (en) * 2006-10-18 2010-06-22 Trellisware Technologies, Inc. Using SAM in error correcting code encoder and decoder implementations
CN101237240B (en) * 2008-02-26 2011-07-20 北京海尔集成电路设计有限公司 A method and device for realizing cirrocumulus interweaving/de-interweaving based on external memory
US11074492B2 (en) * 2015-10-07 2021-07-27 Altera Corporation Method and apparatus for performing different types of convolution operations with the same processing elements

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
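Application of FPGA in real-time SAR imaging systems; 赵博 (Zhao Bo); China Master's Theses Full-text Database, Information Science and Technology; 20101105 (No. 11); abstract, main text pp. 37-38, Section 4.2 *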

Also Published As

Publication number Publication date
CN108509382A (en) 2018-09-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant