CN113641625B - Four-way parallel data processing transposition system based on FPGA - Google Patents

Four-way parallel data processing transposition system based on FPGA

Publication number
CN113641625B
Authority
CN
China
Prior art keywords
data
group
serial
ram
parallel conversion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110952866.2A
Other languages
Chinese (zh)
Other versions
CN113641625A (en)
Inventor
李晋
闵锐
黄太
徐浩典
余雷
曹宗杰
崔宗勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Electronic Information Industry Technology Research Institute Co ltd
University of Electronic Science and Technology of China
Original Assignee
Sichuan Electronic Information Industry Technology Research Institute Co ltd
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Electronic Information Industry Technology Research Institute Co ltd, University of Electronic Science and Technology of China
Priority to CN202110952866.2A
Publication of CN113641625A
Application granted
Publication of CN113641625B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7839 Architectures of general purpose stored program computers comprising a single central processing unit with memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/382 Information transfer, e.g. on bus using universal interface adapter
    • G06F13/385 Information transfer, e.g. on bus using universal interface adapter for adaptation of a particular data processing system to different peripheral devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline, look ahead using instruction pipelines
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention belongs to the field of radar imaging signal processing, and particularly relates to a system for quickly transposing data in radar imaging signal processing. The invention comprises a DDR group 1, a DDR group 1 address control module, an input data asynchronous FIFO, a RAM group input data channel selector, a RAM group 1, a RAM group 2, a RAM output data channel selector, a 4-core FFT group, a serial-parallel conversion input data channel selector, a serial-parallel conversion register group 1, a serial-parallel conversion register group 2, a serial-parallel conversion output data channel selector, an output data asynchronous FIFO, a DDR group 2 and a DDR group 2 address control module. The invention can ensure that the four-core FFT input data is uninterrupted, and simultaneously writes the processed data into the DDR group 2 without loss, thereby realizing the streamlined high-speed operation of the whole data processing flow.

Description

Four-way parallel data processing transposition system based on FPGA
Technical Field
The invention belongs to the field of radar imaging signal processing, and particularly relates to a system for quickly transposing data in radar imaging signal processing.
Background
Synthetic aperture radar (SAR) operates day and night in all weather conditions and delivers high-resolution imagery, and it plays an important role in remote-sensing mapping of cloudy and foggy areas, military reconnaissance, national economic construction, and similar applications. In recent years, with the rapid development of hardware manufacturing, the design of real-time SAR imaging systems has received increasing research attention. SAR imaging signal processing also involves transferring and storing large volumes of data, and data access during imaging must switch between the range dimension and the azimuth dimension, so transposition efficiency directly affects the SAR imaging signal processing speed.
At present, DDR SDRAM (double data rate synchronous dynamic random access memory) is increasingly used in SAR imaging signal processing because of its large storage capacity, high speed, low power consumption, and low cost. For SAR imaging processing systems based on DDR SDRAM, earlier research proposed two-page and three-page transposition methods, which implement matrix transposition by cyclically accessing two or three SDRAMs. A paper published by Wuqin institute of electronics technology of Nanjing, "A high-efficiency matrix transposition method based on FPGA and DDR", splits the data along the row dimension and rearranges each original row of data into a new small matrix so as to balance read and write efficiency. The paper "CTM algorithm and implementation based on DDR SDRAM", published by Liuchen et al of the institute of electrical and electronics Engineers, proposes the fastest column reading matrix transposition algorithm: it first receives two azimuth lines of data and writes them into the DDR SDRAM interleaved with each other, which arranges the column data sequentially so that the data can be read out in order. The transposition methods in the above documents are all designed for single-channel data processing systems, in which data are input to the corresponding sub-modules by rows or by columns. However, some imaging algorithm flows require data transposition many times, and FPGA-based implementations mostly adopt a multi-channel parallel data processing design, with the result that data can no longer be input to the corresponding sub-modules by rows or by columns during processing.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a four-way parallel data processing transposition system based on FPGA. The system caches several rows of data in the RAM resources of the FPGA, reads the cached data out by columns for four-core FFT processing, and then writes the processed data into the DDR SDRAM using the burst transfer mechanism of the DDR SDRAM.
The technical scheme of the invention is as follows:
a four-way parallel data processing transposition system based on FPGA is divided into a data processing module and a DDR storage module; the data processing module comprises a RAM group input data channel selector, a RAM write data control module, a RAM read data control module, a RAM group 1, a RAM group 2, a RAM group output data channel selector, a 4-core FFT group and a serial-parallel conversion module, wherein the serial-parallel conversion module comprises a serial-parallel conversion input data channel selector, a serial-parallel conversion register group 1, a serial-parallel conversion register group 2 and a serial-parallel conversion output data channel selector; the DDR storage module comprises a DDR group 1, a DDR group 2, a DDR address control module, an input data asynchronous FIFO and an output data asynchronous FIFO which are responsible for data buffering;
the RAM group 1 and the RAM group 2 are respectively provided with four independent RAM blocks, the data input bit widths and the depths of all the RAM blocks are the same, and each RAM group is provided with an independent write counter, a write enable signal input port, a write data input bus, a writable zone bit, a read counter, a read enable signal input port, a read data output bus and a readable zone bit;
the DDR group 1 is connected with the data input port of the input data asynchronous FIFO;
the almost non-empty flag signal output port of the input data asynchronous FIFO is connected with the data valid signal input end of the RAM write data control module, and the data output port of the input data asynchronous FIFO is connected with the data input end of the RAM group input data channel selector;
the output end of the RAM group input data channel selector is connected with the write data input buses of the RAM group 1 and the RAM group 2;
the readable flag signal output ends of the RAM group 1 and the RAM group 2 are connected with the RAM group readable signal input end of the RAM read data control module, the writable flag signal output ends of the RAM group 1 and the RAM group 2 are connected with the RAM group writable signal input end of the RAM write data control module, and the read data output buses of the RAM group 1 and the RAM group 2 are connected with the data input end of the RAM group output data channel selector;
the write enable signal of the RAM write data control module is connected with the write enable signal input ports of the RAM group 1 and the RAM group 2, the read enable signal of the RAM read data control module is connected with the read enable signal input ports of the RAM group 1 and the RAM group 2, the RAM group channel signal output end of the RAM read data control module is connected with the channel information input end of the RAM group output data channel selector, and the data valid signal output end of the RAM read data control module is connected with the input data valid signal input end of the four-core FFT group;
the data output bus of the RAM group output data channel selector is connected with the input data bus of the four-core FFT group;
the output data valid signal output end of the four-core FFT group is connected with the input data valid signal input end of the serial-parallel conversion module, and the data output bus of the four-core FFT group is connected with the data input bus of the serial-parallel conversion module;
the RAM group input data channel selector realizes data channel selection between data input to the RAM group; the RAM group output data channel selector realizes the data channel selection from the RAM group to the four-core FFT;
the serial-parallel conversion module consists of a serial-parallel conversion input data channel selector, a serial-parallel conversion register group 1, a serial-parallel conversion register group 2 and a serial-parallel conversion output data channel selector; each serial-parallel conversion register group consists of four serial-parallel conversion register blocks, and each serial-parallel conversion register block is provided with an independent data input port, an independent data input effective signal port, an independent data output port and an independent data output effective signal port; the data output port of the serial-parallel conversion input data channel selector is connected with the data input ports of the serial-parallel conversion register group 1 and the serial-parallel conversion register group 2; the output end of the effective data signal output by the serial-parallel conversion input data channel selector is connected with the input end of the effective data signal input by the serial-parallel conversion register group 1 and the serial-parallel conversion register group 2; the data output port of each serial-parallel conversion register block is connected with the data input port of the serial-parallel conversion output data channel selector; the data output valid signal output port of each serial-parallel conversion register block is connected with the data input valid signal port of the serial-parallel conversion output data channel selector; the data output port of the serial-parallel conversion output data channel selector is connected with the data input port of the output data asynchronous FIFO, and the data output effective signal port of the serial-parallel conversion output data channel selector is connected with the data read enabling signal port of the output data asynchronous FIFO;
the serial-parallel conversion input data channel selector realizes the data channel selection between the FFT group data input to the serial-parallel conversion register group; the serial-parallel conversion output data channel selector realizes the data channel selection from the serial-parallel conversion register group to the output data asynchronous FIFO;
the DDR address control module 1 changes the read-write address of the DDR group 1 according to a specified sequence; when the writable flag signal of a RAM group is pulled high, the RAM write data control module sets the write enable signal of that RAM group to 1, so that 32-bit original data are written into the specified RAM group; when the readable flag signal of a RAM group is pulled high, the RAM read data control module sets the read enable signal of that RAM group to 1, so that 32-bit original data are read out in a preset sequence; each time one datum is written into each RAM block, the write counter and the write address are each incremented by 1; when the write counter reaches the write upper limit, it is cleared to wait for the next write operation, the readable flag signal is set to 1, and the writable flag signal is set to 0. Each time one datum is read, the read counter is incremented by 1 and the read address changes once according to the specified sequence; when the read counter reaches the read upper limit, it is cleared to wait for the next read operation, the readable flag signal is set to 0, and the writable flag signal is set to 1;
the DDR address control module 2 changes the read-write address of the DDR group 2 according to the designated sequence; when the DDR initialization completion signal is pulled high and the empty flag signal of the output data asynchronous FIFO is pulled low, the DDR address control module 2 sets the write enable signal of the output data asynchronous FIFO to 1, data are written into the designated DDR address, and the address jump control counter of the DDR address control module 2 is incremented by 1; when the DDR read-data valid signal is pulled high, the DDR address control module 2 sets the write enable signal of the output data asynchronous FIFO to 1, data at the designated DDR address are read, and the address jump control counter of the DDR address control module 2 is incremented by 1. When the address jump counter reaches the upper limit of the DDR address, it is reset and the DDR address returns to the initial position 0.
The input data asynchronous FIFO completes data transmission across clock domains, the bit width of a data input end of the input data asynchronous FIFO is equal to 2 times of the sum of the bit widths of data outputs of all RAM blocks in a single RAM group, and the bit width of a data output end of the input data asynchronous FIFO is equal to the sum of the bit widths of the data outputs of all the RAM blocks in the single RAM group.
When the output data of the FFT group are valid, the serial-parallel conversion module sets the input-valid flag signal of the serial-parallel conversion input data channel selector to 1, divides the 128-bit data output in parallel by the FFT group into four 32-bit channels, and writes the four channels into the four register blocks of serial-parallel conversion register group 1 or serial-parallel conversion register group 2, respectively. Each time FFT output data are successfully written into the four register blocks, the data write counter is incremented by 1. When 16 writes have been completed, the data write counter is reset, the output-data-valid flag signal of that serial-parallel conversion register group is set to 1, and the 16 32-bit original data held in each of the four register blocks of that group are output in parallel, as 512-bit words, to the buffer registers of the serial-parallel conversion output data channel selector. At the same time, the input-data-valid flag signal of that group is set to 0 and that of the other group is set to 1, so that subsequent FFT group output data are written into the other serial-parallel conversion register group. Following this rule, the FFT group output data are written alternately into serial-parallel conversion register group 1 and serial-parallel conversion register group 2. When the full flag signal of the output data asynchronous FIFO is 0, the output-data-valid flag signal of the serial-parallel conversion output data channel selector is set to 1, and the data in the data buffer registers are output to the output data asynchronous FIFO in time order.
The bit width of the data input end of the output data asynchronous FIFO is 16 times the original data bit width, namely 512 bits, and the bit width of its data output end is likewise 16 times the original data bit width. Serial-parallel conversion register group 1 and serial-parallel conversion register group 2 each consist of 4 register blocks; each register block contains 16 registers, and each register is 32 bits wide and holds exactly one original datum. Because the DDR bit width is 2 times the original data bit width, one DDR address stores two original data. The address control module of DDR group 2 therefore sets the address increment to 8: each time the data of one register block are written from the output data asynchronous FIFO into the DDR, the DDR address counter is incremented by 8.
Further, the number of the RAM blocks in the RAM group 1 and the RAM group 2 is 4, the data input bit width of each RAM block is equal to the bit width of 32bits of the original data, and the depth is equal to the length 32768 of four rows of the original data.
Further, the upper limit value of the RAM group writing is equal to the depth of the RAM block, and the upper limit value of the RAM group reading is equal to the depth of the RAM block.
Further, the serial-parallel conversion register group 1 and the serial-parallel conversion register group 2 are each composed of 4 register blocks, each register block includes 16 registers, the data input bit width of each register is equal to the original data bit width of 32bits, and the output data bit width is 16 times of the original data bit width, that is, 512bits.
Furthermore, through the serial-parallel conversion input data channel selector and the serial-parallel conversion output data channel selector, the ping-pong operation of serially writing the output data of the FFT group into the serial-parallel conversion register group and parallelly outputting the data to the asynchronous FIFO of the output data is realized.
Furthermore, the DDR storage module clock is 200 MHz, and the data processing module clock is 300 MHz.
The invention can ensure that the four-core FFT input data is uninterrupted, and simultaneously writes the processed data into the DDR group 2 without loss, thereby realizing the streamlined high-speed operation of the whole data processing flow.
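The widths and depths quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check, not part of the disclosed system; the variable names are illustrative, and the row length of 8192 is taken from the embodiment described later in the text.

```python
# Cross-check of the widths and depths quoted in the text.
# All names are illustrative; the numbers come from the description.
DATA_BITS = 32                   # width of one original datum
RAM_BLOCKS = 4                   # RAM blocks per RAM group
ROW_LEN = 8192                   # row length of the 8192 x 8192 embodiment
REGS_PER_BLOCK = 16              # registers per serial-parallel register block
DDR_WORD_BITS = 2 * DATA_BITS    # DDR width is 2x the original data width

ram_depth = 4 * ROW_LEN                       # four rows per RAM block
fifo_out_bits = RAM_BLOCKS * DATA_BITS        # input FIFO, output side
fifo_in_bits = 2 * fifo_out_bits              # input FIFO, input side
block_out_bits = REGS_PER_BLOCK * DATA_BITS   # one register block in parallel
addr_step = block_out_bits // DDR_WORD_BITS   # DDR addresses per register block

print(ram_depth, fifo_out_bits, fifo_in_bits, block_out_bits, addr_step)
# prints: 32768 128 256 512 8
```

The final value reproduces the address increment of 8 stated for the DDR group 2 address control module: one 512-bit register block fills eight 64-bit DDR addresses.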
Drawings
FIG. 1 is a block diagram of an implementation of a four-way parallel data processing transpose system based on FPGA;
FIG. 2 shows the matrix data arrangement and data reading sequence to be transposed in the present invention;
FIG. 3 is a diagram showing the data arrangement of the matrix data to be transposed in the RAM according to the present invention;
FIG. 4 is a reading sequence of data in the RAM in the present invention;
FIG. 5 is a block diagram of an implementation of a serial-to-parallel conversion module of the present invention;
FIG. 6 shows the arrangement of data written to DDR after passing through the serial-to-parallel conversion module.
Detailed Description
The technical scheme of the invention is explained in detail in the following with the accompanying drawings.
As shown in fig. 1, the four-way parallel data processing transpose system based on FPGA of the present invention is divided into a data processing module and a DDR storage module; the data processing module comprises a RAM group input data channel selector, a RAM write data control module, a RAM read data control module, a RAM group 1, a RAM group 2, a RAM group output data channel selector, a 4-core FFT group and a serial-parallel conversion module, wherein the serial-parallel conversion module comprises a serial-parallel conversion input data channel selector, a serial-parallel conversion register group 1, a serial-parallel conversion register group 2 and a serial-parallel conversion output data channel selector; the DDR storage module comprises a DDR group 1, a DDR group 2, a DDR address control module, an input data asynchronous FIFO and an output data asynchronous FIFO, wherein the input data asynchronous FIFO and the output data asynchronous FIFO are responsible for data buffering;
Except for the DDR SDRAM, all modules are implemented on an FPGA; the FPGA is the xc7vx690tffg1761-3 device from Xilinx. The DDR SDRAM is model MT8JTF12864HZ-1G6, a DDR3 SDRAM part, and the burst length is set to 8. The development environment is Xilinx Vivado 2018.3, and the DDR SDRAM read/write interface uses the MIG core provided by Vivado.
In this specific embodiment, the size of the matrix to be transposed is 8192 × 8192, two sets of RAMs are provided, each set has 4 RAM blocks, and the size of the addressing space of each RAM block is 32768.
The function and the working principle of each module of the invention are as follows:
The raw data are fed in row order, through the RAM group input data channel selector, to the input data buses of all RAM blocks of one of the RAM groups.
When the writable flag signal of RAM group 1 is pulled high, the RAM write data control module sets the write enable signal of RAM group 1 to 1, so that input data are written into RAM group 1; when the writable flag of RAM group 1 is pulled low and the writable flag of RAM group 2 is pulled high, the write enable signal of RAM group 1 is set to 0 and the write enable signal of RAM group 2 is set to 1, so that data are written into RAM group 2. When the readable flag signal of RAM group 1 is pulled high, the RAM read data control module sets the read enable signal of RAM group 1 to 1, so that input data are read from RAM group 1 in a preset sequence; when the readable flag of RAM group 1 is pulled low and the readable flag of RAM group 2 is pulled high, the read enable signal of RAM group 1 is set to 0 and the read enable signal of RAM group 2 is set to 1, and input data are read from RAM group 2 in a preset sequence. Each time one datum is written into each RAM block, the write counter and the write address are each incremented by 1; when the write counter reaches the upper limit 32768, it is cleared to wait for the next write operation, the readable flag signal is set to 1, and the writable flag signal is set to 0. Each time one datum is read, the read counter is incremented by 1 and the read address changes once according to the specified sequence; when the read counter reaches 32768, it is cleared to wait for the next read operation, the readable flag signal is set to 0, and the writable flag signal is set to 1.
Data are written into the 4 RAM blocks of RAM group 1 in sequence. Each time one datum is written into each of the 4 RAM blocks, the write counter is incremented by one. When the write counter equals 32768, the readable flag signal of RAM group 1 is set to 1 and the writable flag signal is set to 0.
The write RAM control unit turns off the write function of the RAM group 1, turns on the write function of the RAM group 2, and sequentially writes data into the RAM group 2. Data is written into the RAM group 2 according to the method of writing the RAM group 1 until the RAM group 2 is full, and at the moment, the readable mark of the RAM group 2 is set to be 1.
The read RAM control unit sets the read enable of the RAM group 1 to 1 after the readable flag signal of the RAM group 1 is set to 1, and starts reading data from the RAM group 1.
After the read enable signal of the RAM group 1 is set to 1, the RAM block in the RAM group 1 reads out data to the RAM group output data channel selector according to the specified address sequence.
First, the 4 data with sequence number 1 in the 4 RAM blocks of RAM group 1 are read out and written into the corresponding FFT channels in the order shown in FIG. 4. Then the 4 data with sequence number 2 in the 4 RAM blocks of RAM group 1 are read out and written into the FFT in sequence.
Every time data is read out, the reading counter is added with 1, and the reading address signals are converted once in sequence.
After the reading of all the data in the RAM group 1 is completed, the reading counter reaches the set upper reading limit value 32768, the RAM group 1 sets the writable signal mark to be 1, and the readable signal mark is set to be 0.
After the read enable signal of RAM group 2 is set to 1, the RAM blocks in RAM group 2 read data out to the FFT in the specified address order.
The method of reading data from RAM group 2 and writing them to the FFT is the same as the method of reading data from RAM group 1 and writing them to the FFT.
The data writing of each group of the RAMs is completed by one writing cycle, the data reading of each group of the RAMs is completed by one reading cycle, and the reading and writing cycles are circularly performed in sequence.
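The flag and counter behaviour of the two RAM groups described above can be sketched as a small behavioural model. This is a simplified illustration under stated assumptions, not the patented RTL: one write and one read are modelled per cycle, the input is assumed never to stall, the read address sequence of FIG. 4 is not modelled, and the names `RamGroup` and `simulate` are hypothetical.

```python
class RamGroup:
    """Counters and flags of one RAM group (addresses are not modelled)."""

    def __init__(self, limit):
        self.limit = limit        # upper limit of both counters (32768 in the text)
        self.write_count = 0
        self.read_count = 0
        self.writable = True
        self.readable = False

    def write_one(self):
        self.write_count += 1
        if self.write_count == self.limit:    # group is full
            self.write_count = 0
            self.writable, self.readable = False, True

    def read_one(self):
        self.read_count += 1
        if self.read_count == self.limit:     # group is drained
            self.read_count = 0
            self.readable, self.writable = False, True


def simulate(limit, cycles):
    """Return, per cycle, (index of group written, index of group read)."""
    groups = [RamGroup(limit), RamGroup(limit)]
    log = []
    for _ in range(cycles):
        # Both controllers sample the flags at the start of the cycle.
        w = next((i for i, g in enumerate(groups) if g.writable), None)
        r = next((i for i, g in enumerate(groups) if g.readable), None)
        if w is not None:
            groups[w].write_one()
        if r is not None:
            groups[r].read_one()
        log.append((w, r))
    return log


print(simulate(limit=2, cycles=6))
# prints: [(0, None), (0, None), (1, 0), (1, 0), (0, 1), (0, 1)]
```

In this model, once the first group has been filled, every subsequent cycle contains a read, which corresponds to the uninterrupted FFT input that the ping-pong arrangement is intended to provide.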
As shown in fig. 5, the serial-parallel conversion module of the present invention includes a serial-parallel conversion input data channel selector, a serial-parallel conversion register group 1, a serial-parallel conversion register group 2, and a serial-parallel conversion output data channel selector.
In this embodiment, the serial-to-parallel conversion register group 1 and the serial-to-parallel conversion register group 2 are each composed of 4 register blocks, each register block includes 16 registers, the data input bit width of each register is equal to the original data bit width of 32bits, and the output data bit width is 16 times of the original data bit width, that is, 512bits.
The function and the working principle of the serial-parallel conversion module are as follows:
The FFT output data are sent, in time order, through the serial-parallel conversion input data channel selector into a serial-parallel conversion register group to perform serial-parallel conversion; the data in the serial-parallel conversion register group are then output, in time order, through the serial-parallel conversion output data channel selector to the output data asynchronous FIFO.
The serial-to-parallel conversion input data channel selector is provided with an input signal valid flag signal, an input data port, an output data valid flag signal, an output data port, and a data write counter.
When the output data of the FFT group are valid, the input-valid flag signal of the serial-parallel conversion input data channel selector is set to 1, the 128-bit data output in parallel by the FFT group are divided into four 32-bit channels, and the four channels are written into the four register blocks of serial-parallel conversion register group 1 or serial-parallel conversion register group 2, respectively. Each time FFT output data are successfully written into the four register blocks, the data write counter is incremented by 1. When 16 writes have been completed, the data write counter is reset, the output-data-valid flag signal of that serial-parallel conversion register group is set to 1, and the 16 32-bit original data held in each of the four register blocks of that group are output in parallel, as 512-bit words, to the buffer registers of the serial-parallel conversion output data channel selector. At the same time, the state controller changes state, sets the input-data-valid flag signal of that group to 0 and that of the other group to 1, and directs the input data channel selector to route the FFT output data to the data input ports of the other, idle serial-parallel conversion register group. Following this rule, the FFT group output data are written alternately into serial-parallel conversion register group 1 and serial-parallel conversion register group 2, which realizes the ping-pong operation of writing the FFT group output data into the serial-parallel conversion register groups.
Each serial-parallel conversion register of the serial-parallel conversion register group is provided with an input signal valid flag signal, an input data port, an output data valid flag signal and an output data port.
Taking serial-parallel conversion register group 1 as an example, it consists of 4 register blocks, each register block contains 16 registers, and the data input bit width of each register equals the original data bit width of 32 bits. When the output data of the serial-parallel conversion input data channel selector are valid, that selector's output-data-valid flag signal is 1, the input-valid flag signal of the serial-parallel conversion register is set to 1, its output-data-valid flag signal is set to 0, and 32-bit data are written in serially. When the output data of the serial-parallel conversion input data channel selector are invalid, that selector's output-data-valid flag signal is 0, the state controller changes state, the input-valid flag signal of the serial-parallel conversion register is set to 0, its output-data-valid flag signal is set to 1, and the serial-parallel conversion register outputs the 16 written 32-bit data in parallel as one 512-bit word.
The serial-to-parallel conversion output data channel selector comprises eight data buffer registers and is provided with an input signal valid flag signal, an input data port, a FIFO full signal flag signal, an output data valid flag signal and an output data port.
When the output data of a serial-parallel conversion register is valid, the register's output-data valid flag is 1; the input-signal valid flag of the output data channel selector is set to 1, and the register's output data is written into the corresponding data buffer register. Meanwhile, when the full flag of the output data asynchronous FIFO is 0, the output-data valid flag is set to 1 and the data in the data buffer registers is output to the output data asynchronous FIFO sequentially in time order. Once the data of both serial-parallel conversion register groups has been output, one ping-pong operation is complete: the serial-parallel conversion input data channel selector has written the FFT output data into the serial-parallel conversion registers, and the serial-parallel conversion output data channel selector has output the register data to the output data asynchronous FIFO.
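The back-pressure behaviour of the output data channel selector can be sketched as follows. The class and method names (`OutputSelector`, `load`, `step`) are hypothetical, and the reading that the eight buffer registers correspond to two register groups of four blocks each is an assumption; the text only states that there are eight buffer registers and that data is forwarded while the FIFO full flag is 0.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class OutputSelector:
    """Behavioural model of the serial-parallel conversion output selector."""

    BUFFERS = 8  # eight data buffer registers (assumed: 2 groups x 4 blocks)

    def __init__(self) -> None:
        self.buf: Deque[int] = deque()

    def load(self, words512: List[int]) -> None:
        """Latch the 512-bit words of one register group into free buffers."""
        if len(self.buf) + len(words512) > self.BUFFERS:
            raise OverflowError("buffer registers are still occupied")
        self.buf.extend(words512)

    def step(self, fifo_full: bool) -> Tuple[int, Optional[int]]:
        """Model one clock cycle and return (output_valid, data).

        A word is forwarded only while the FIFO full flag is 0; otherwise
        the output valid flag stays 0 and the buffered data is held.
        """
        if self.buf and not fifo_full:
            return 1, self.buf.popleft()
        return 0, None
```

With one group loaded, `step(True)` holds the data and returns `(0, None)`, while successive `step(False)` calls drain the buffered words in the order they were latched.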
The DDR group 2 address control module is provided with a write enable flag signal output port and an address control counter. When the empty flag of the output data asynchronous FIFO is 0 (the FIFO holds data), DDR initialization is complete, and command reception and data reception are ready, the write enable flag signal is pulled high, the data in the output data asynchronous FIFO is written to the designated DDR address, and the address control counter jumps according to the designated rule.
As shown in FIG. 4, the FFT group outputs four azimuth lines of data at a time (a_{1,1}…a_{8192,1}, a_{1,2}…a_{8192,2}, a_{1,3}…a_{8192,3}, a_{1,4}…a_{8192,4}). After passing through the serial-parallel conversion module, each azimuth line is written into the output data asynchronous FIFO in 512-bit form. When the DDR read valid signal is set to 1, the read enable signal of the output data asynchronous FIFO is set to 1 and data is read from the output data asynchronous FIFO. As shown in FIG. 6, after one burst of the first azimuth line (a_{1,1}…a_{16,1} in FIG. 6) has been written into the DDR, the address control counter is incremented by 128, i.e. it jumps to the address of the first burst of the next azimuth line (a_{1,2}…a_{16,2} in FIG. 6). Following this rule, each time four bursts (one per azimuth line) have been written, the address control counter is decreased by 376, i.e. it jumps to the address of the second burst of the first azimuth line (a_{17,1}…a_{32,1} in FIG. 6). After sixteen bursts have been written for each azimuth line, 1048072 is added to the address control counter, i.e. it jumps from the address of the sixteenth burst of the fourth azimuth line (a_{241,4}…a_{256,4} in FIG. 6) to the address of the seventeenth burst of the first azimuth line (a_{257,1}…a_{272,1} in FIG. 6), which corresponds to the first address of the second row in the DDR. After the last burst of the fourth azimuth line (a_{8177,4}…a_{8192,4} in FIG. 6) has been written, the address control counter is decreased by 32505848, i.e. it jumps to the address of the first burst of the fifth azimuth line (a_{1,5}…a_{16,5} in FIG. 6). After the last burst of the 8192nd azimuth line (a_{8177,8192}…a_{8192,8192} in FIG. 6) has been written, the address control counter is reset, and the writing of one frame of data of size 8192 × 8192 is complete.
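The four jump values quoted above (+128, −376, +1048072, −32505848) are consistent with a simple closed-form address layout, which the following sketch reproduces. The layout is an inference from those numbers, not something the text states: it assumes each 512-bit burst occupies 8 DDR addresses (64-bit DDR word, as given in claim 1), each azimuth line holds 16 bursts per DDR row block (stride 128), and one row block spans 8192 lines (stride 1048576). The names `burst_address` and `write_order` are hypothetical.

```python
from itertools import islice
from typing import Iterator

N = 8192                  # frame is N x N samples
SAMPLES_PER_BURST = 16    # one 512-bit word of 32-bit samples
ADDR_PER_BURST = 8        # 512 bits / 64-bit DDR word
BURSTS_PER_ROW = 16       # bursts per azimuth line within one row block
LINE_STRIDE = BURSTS_PER_ROW * ADDR_PER_BURST   # 128 addresses
ROW_STRIDE = N * LINE_STRIDE                    # 1048576 addresses
BURSTS_PER_LINE = N // SAMPLES_PER_BURST        # 512 bursts


def burst_address(line: int, burst: int) -> int:
    """DDR address of burst `burst` (0-based) of azimuth line `line` (0-based)."""
    row, col = divmod(burst, BURSTS_PER_ROW)
    return row * ROW_STRIDE + line * LINE_STRIDE + col * ADDR_PER_BURST


def write_order() -> Iterator[int]:
    """Addresses in the order the four-way FFT output is written."""
    for group in range(0, N, 4):              # four azimuth lines at a time
        for burst in range(BURSTS_PER_LINE):  # bursts along each line
            for lane in range(4):             # one burst per line, round robin
                yield burst_address(group + lane, burst)


# Writes of the first four-line group plus the first write of the next group.
addrs = list(islice(write_order(), 4 * BURSTS_PER_LINE + 1))
jumps = {b - a for a, b in zip(addrs, addrs[1:])}
```

Under these assumptions, `jumps` contains exactly the four counter steps given in the description, and `addrs[64]` (the seventeenth burst of the first line) equals 1048576, the assumed start of the second row block.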

Claims (6)

1. A four-way parallel data processing transposition system based on FPGA comprises a data processing module and a DDR storage module; the data processing module comprises a RAM group input data channel selector, a RAM write data control module, a RAM read data control module, a first RAM group, a second RAM group, a RAM group output data channel selector, a 4-core FFT group and a serial-parallel conversion module, wherein the serial-parallel conversion module comprises a serial-parallel conversion input data channel selector, a first serial-parallel conversion register group, a second serial-parallel conversion register group and a serial-parallel conversion output data channel selector; the DDR storage module comprises a first DDR group, a second DDR group, a DDR address control module, an input data asynchronous FIFO and an output data asynchronous FIFO;
the first RAM group and the second RAM group are respectively provided with four independent RAM blocks, the data input bit widths and the depths of all the RAM blocks are the same, and each RAM group is provided with an independent write counter, a write enable signal input port, a write data input bus, a writable zone bit, a read counter, a read enable signal input port, a read data output bus and a readable zone bit;
the first DDR group is connected with an input data asynchronous FIFO data input port; the non-empty mark signal output port of the input data asynchronous FIFO is connected with the effective signal input end of the RAM data writing control module, and the data output port of the input data asynchronous FIFO is connected with the data input end of the RAM group input data channel selector; the output end of the RAM group input data channel selector is connected with the write data input buses of the first RAM group and the second RAM group; the readable mark signal output ends of the first RAM group and the second RAM group are connected with the readable signal input end of the RAM group of the RAM data reading control module, the writable mark signal output ends of the first RAM group and the second RAM group are connected with the writable signal input end of the RAM group of the RAM data writing control module, and the read data output buses of the first RAM group and the second RAM group are connected with the data input end of the RAM group output data channel selector; the write enable signal of the RAM write data control module is connected with the write enable signal input ports of the first RAM group and the second RAM group, the read enable signal of the RAM read data control module is connected with the read enable signal input ports of the first RAM group and the second RAM group, the RAM group channel signal output end of the RAM read data control module is connected with the channel information input end of the RAM group output data channel selector, and the data effective signal output end of the RAM read data control module is connected with the input data effective signal input end of the four-core FFT group; the data output bus of the RAM group output data channel selector is connected with the input data bus of the four-core FFT group; the output data effective signal output end of the four-core FFT group is connected with the input data effective signal input end of the serial-parallel conversion module, and the data output bus of the four-core FFT group is connected with the data input bus of the serial-parallel conversion module;
the RAM group input data channel selector realizes data channel selection between data input to the RAM group; the RAM group output data channel selector realizes the data channel selection from the RAM group to the four-core FFT;
the serial-parallel conversion module consists of a serial-parallel conversion input data channel selector, a first serial-parallel conversion register group, a second serial-parallel conversion register group and a serial-parallel conversion output data channel selector; each serial-parallel conversion register group consists of four serial-parallel conversion register blocks, and each serial-parallel conversion register block is provided with an independent data input port, an independent data input effective signal port, an independent data output port and an independent data output effective signal port; the data output port of the serial-parallel conversion input data channel selector is connected with the first serial-parallel conversion register group and the second serial-parallel conversion register group data input port; the output end of the valid data signal output by the serial-parallel conversion input data channel selector is connected with the input end of the valid data signal input by the first serial-parallel conversion register group and the second serial-parallel conversion register group; the data output port of each serial-parallel conversion register block is connected with the data input port of the serial-parallel conversion output data channel selector; the data output valid signal output port of each serial-parallel conversion register block is connected with the data input valid signal port of the serial-parallel conversion output data channel selector; the data output port of the serial-parallel conversion output data channel selector is connected with the data input port of the output data asynchronous FIFO, and the data output valid signal port of the serial-parallel conversion output data channel selector is connected with the data output enable signal port of the output data asynchronous FIFO;
the serial-parallel conversion input data channel selector realizes the data channel selection between the FFT group data input to the serial-parallel conversion register group; the serial-parallel conversion output data channel selector realizes the data channel selection from the serial-parallel conversion register group to the output data asynchronous FIFO;
the DDR address control module changes the read-write address of the first DDR group according to a set sequence, and the RAM write data control module sets a write enable signal of the corresponding RAM group to be 1 when the writable flag signal of the RAM group is pulled up, so that 32-bit original data are written into the designated RAM group; the RAM reading data control module sets a reading enabling signal corresponding to the RAM group to be 1 when the readable mark signal of the RAM group is pulled up, so that 32-bit original data are read out according to a preset sequence; when each block of RAM is written with one data, the write counter is added with 1, the write address is added with 1, when the write counter reaches the upper write limit, the write counter is cleared to wait for the next write operation, when the write counter reaches the upper write limit, the readable flag signal is set to 1, and the writable flag signal is set to 0; when reading a data, the reading counter is added with 1, the reading address is changed once according to the appointed sequence, when the reading counter reaches the reading upper limit, the reading counter is reset to wait for the next reading operation, when the reading counter reaches the reading upper limit, the readable mark signal is set to be 0, and the writable mark signal is set to be 1;
the DDR address control module 2 changes the read-write address of the second DDR group according to the designated sequence, the DDR address control module 2 sets an output data asynchronous FIFO write enable signal to be 1 when a DDR initialization completion signal is pulled high and an output data asynchronous FIFO empty mark signal is pulled low, data are written into the DDR designated address, and an address jump control counter of the DDR address control module 2 is added with 1; the DDR address control module 2 sets an output data asynchronous FIFO write enable signal to be 1 when a DDR read data effective signal is pulled high, data in a DDR designated address is read, and an address jump control counter of the DDR address control module 2 is added with 1; when the address jump counter counts to the upper limit of the DDR address, the address jump counter resets, and the DDR address returns to the initial position 0;
the input data asynchronous FIFO completes data transmission across clock domains, the bit width of a data input end of the input data asynchronous FIFO is equal to 2 times of the sum of the bit widths of data outputs of all RAM blocks in a single RAM group, and the bit width of a data output end of the input data asynchronous FIFO is equal to the sum of the bit widths of the data outputs of all the RAM blocks in the single RAM group;
when the output data of the FFT group is valid, the serial-parallel conversion module sets the input signal valid flag signal of the serial-parallel conversion input data channel selector to 1, divides the 128-bit-wide data output in parallel by the FFT group into four paths of 32-bit-wide data, and writes the four paths of 32-bit-wide data into the four register blocks of the first serial-parallel conversion register group or the second serial-parallel conversion register group respectively; each time the FFT output data is successfully written into the four register blocks, a data write counter is incremented by 1; when 16 data writes have been completed, the data write counter is reset, the 16 32-bit original data items of each of the four register blocks in that serial-parallel conversion register group are output in parallel, in 512-bit form, to the buffer registers of the serial-parallel conversion output data channel selector, the input data valid flag signal of that group is simultaneously set to 0 and the input data valid flag signal of the other group is set to 1, and the subsequent FFT output data is written into the other serial-parallel conversion register group, so that the FFT output data is written alternately into the first serial-parallel conversion register group and the second serial-parallel conversion register group according to this rule; when the output data asynchronous FIFO full flag signal is 0, the output data valid flag signal of the serial-parallel conversion output data channel selector is set to 1, and the data in the data buffer registers is output to the output data asynchronous FIFO sequentially in time order;
the bit width of the data input end of the output data asynchronous FIFO is equal to 16 times the original data bit width, namely 512 bits, and the bit width of the data output end is also equal to 16 times the original data bit width; the first serial-parallel conversion register group and the second serial-parallel conversion register group are each composed of 4 register blocks, each register block comprises 16 registers, and each register is 32 bits wide and stores one piece of original data; the DDR bit width is 2 times the original data bit width, so that one DDR address can store two pieces of original data; and the address control module of the second DDR group sets the address increment to 8, that is, the DDR address counter is increased by 8 each time the data of one register block is read from the output data asynchronous FIFO and written into the DDR.
2. The FPGA-based four-way parallel data processing transpose system of claim 1, wherein in the first RAM group and the second RAM group, the data input bit width of each RAM block is equal to the original data bit width of 32 bits, and the depth is equal to the length of four lines of original data, namely 32768.
3. The FPGA-based four-way parallel data processing transpose system of claim 1, wherein the upper RAM bank write limit value is equal to the depth of the RAM block, and the upper RAM bank read limit value is equal to the depth of the RAM block.
4. The system according to claim 1, wherein the first serial-parallel conversion register group and the second serial-parallel conversion register group are each composed of 4 register blocks, each register block comprises 16 registers, the data input bit width of each register is equal to the original data bit width of 32 bits, and the output data bit width is 16 times the original data bit width, that is, 512 bits.
5. The four-way parallel data processing transpose system based on FPGA of claim 1, wherein ping pong operation of serially writing FFT group output data into serial-to-parallel conversion register group and parallel outputting data to output data asynchronous FIFO is achieved through serial-to-parallel conversion input data channel selector and serial-to-parallel conversion output data channel selector.
6. The FPGA-based four-way parallel data processing transpose system of claim 1, wherein a DDR memory module clock is 200MHz, and a data processing module clock is 300MHz.
CN202110952866.2A 2021-08-19 2021-08-19 Four-way parallel data processing transposition system based on FPGA Active CN113641625B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110952866.2A CN113641625B (en) 2021-08-19 2021-08-19 Four-way parallel data processing transposition system based on FPGA

Publications (2)

Publication Number Publication Date
CN113641625A CN113641625A (en) 2021-11-12
CN113641625B true CN113641625B (en) 2023-03-14

Family

ID=78422820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110952866.2A Active CN113641625B (en) 2021-08-19 2021-08-19 Four-way parallel data processing transposition system based on FPGA

Country Status (1)

Country Link
CN (1) CN113641625B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114626005B (en) * 2022-03-21 2023-05-26 University of Electronic Science and Technology of China FPGA implementation method of the CS (chirp scaling) algorithm in video SAR (synthetic aperture radar) real-time imaging

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1877360A (en) * 2005-06-10 2006-12-13 中国科学院电子学研究所 Real-time image processing transpose memory in synthetic aperture radar
EP2120063A1 (en) * 2008-05-15 2009-11-18 The European Community, represented by the European Commission Radar-imaging of a scene in the far-field of a one-or two-dimensional radar array
CN101441271B (en) * 2008-12-05 2011-07-20 航天恒星科技有限公司 SAR real time imaging processing device based on GPU
CN104239232B (en) * 2014-09-10 2017-05-10 北京空间机电研究所 Ping-Pong cache operation structure based on DPRAM (Dual Port Random Access Memory) in FPGA (Field Programmable Gate Array)
CN108614266A (en) * 2018-03-13 2018-10-02 南京航空航天大学 An FPGA implementation method of video SAR high-speed processing technology
CN111257874A (en) * 2020-01-15 2020-06-09 中国电子科技集团公司第十四研究所 PFA FPGA parallel implementation method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
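Wang Yiming (王一鸣). Design of an FPGA-based SAR transpose memory. 2013, (No. 4), pp. I136-862. *
Hu Xiaochen (胡晓琛). FPGA-based real-time imaging technology for miniature SAR. 2019, (No. 2), pp. I136-1418. *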

Also Published As

Publication number Publication date
CN113641625A (en) 2021-11-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant