CN110781447B - DDR-based high-efficiency matrix transposition processing method - Google Patents


Info

Publication number
CN110781447B
Authority
CN
China
Prior art keywords
data
ddr
address
read
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910996540.2A
Other languages
Chinese (zh)
Other versions
CN110781447A (en)
Inventor
张为
李欣桐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201910996540.2A
Publication of CN110781447A
Application granted
Publication of CN110781447B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88 Radar or analogous systems specially adapted for specific applications
    • G01S13/89 Radar or analogous systems specially adapted for specific applications for mapping or imaging
    • G01S13/90 Radar or analogous systems specially adapted for specific applications for mapping or imaging using synthetic aperture techniques, e.g. synthetic aperture radar [SAR] techniques
    • G01S13/9004 SAR image acquisition techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Data Mining & Analysis (AREA)
  • Electromagnetism (AREA)
  • Mathematical Optimization (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Error Detection And Correction (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a DDR-based high-efficiency matrix transposition processing method in which a write RAM and a read RAM are paired with the IP core of a DDR3 SDRAM. The matrix to be transposed is a 128 × 128 matrix whose elements are 64 bits each; it is divided into 128 small matrices of 8 × 16, and the data of each row forms one small matrix. During the write operation, 16 data are written per active signal; during the read operation, 8 × 16 data are read per active signal. In the DDR, data is read out by cyclically skipping between positions within the same row, so that row active signals occur as rarely as possible. The invention addresses the drop in overall system processing speed caused by skip accesses to the DDR SDRAM when large-order matrices are transposed. Through matrix blocking, the read rate is greatly improved at the cost of a small reduction in the write rate; the read and write rates of the DDR are balanced during transposition, and the average read-write efficiency of the DDR is thereby improved.

Description

DDR-based high-efficiency matrix transposition processing method
Technical Field
The invention belongs to the field of data transposition in radar imaging technology, and relates to a processing method for rapidly transposing large volumes of data during radar imaging.
Background
Synthetic Aperture Radar (SAR) is a radar with high-resolution imaging capability. Compared with traditional radar, SAR has the advantages of operating around the clock, in all weather, and of penetrating obstructions. It plays an important role in fields such as environmental detection, terrain reconnaissance and target identification. SAR is mainly carried on three kinds of platform: airborne, satellite-borne and missile-borne. A focused SAR image is obtained by applying processing algorithms to the echo signal. Conventional SAR imaging systems are not only structurally complex and costly to manufacture, but also heavy and bulky. With the rapid development abroad of unmanned aerial vehicles and small, light aircraft, imaging processing for miniature SAR has attracted wide attention. Because SAR echo signals involve a large amount of data, fast data transmission and processing are required. In the SAR echo data processing flow, the fast Fourier transform (FFT) and transposition are two basic and frequently used operations.
Miniature SAR imaging processing is typically implemented on either an FPGA or an ASIC. In speed and efficiency an FPGA can be relatively weaker than an ASIC, whose performance comes from its specialization. However, a dedicated ASIC has a long development cycle and can only be replaced as a whole when an update is needed. The advantages of the FPGA are its flexibility in programming and development and the ease with which algorithms can be upgraded in a timely manner.
In general, conventional transposition is very simple to implement in software, and it is also easy to optimize. Implemented in hardware without special techniques, however, transposition becomes slow and occupies a large share of resources, which is unacceptable for SAR real-time imaging systems that must process large volumes of data. Domestic researchers have therefore studied this problem and achieved certain results. High-capacity, high-speed data transfer is realized with a DDR3 SDRAM core on an FPGA platform, and transposition is completed during the transfer. Transposition methods that exploit the double-data-rate read and write characteristics of the DDR family have been studied, and they greatly improve the overall processing efficiency of SAR. The main algorithms for matrix transposition in SAR real-time imaging systems are the run-out method (as shown in FIG. 1), the column-out method, the pipeline balancing method, the blocking method, and so on. With the conventional run-out or column-out method, the rate drops sharply during skip accesses because of the row activation time inside the DDR.
In previous research, Zhou et al. (2013) adopted an address mapping algorithm combined with the idea of dividing a large matrix into sub-blocks. Exploiting the page-based way in which the DDR stores data, they stored the matrix in two ways, namely the same distance direction and the same direction, and completed the transposition during reading by computing offset addresses from an expression. In 2017, Wu et al. proposed a basic model of the block transposition method: a standard 128 × 128 square matrix is divided into 16 × 8 small matrices, and data is written in groups and read out by cyclic skipping. This improves read efficiency while preserving write efficiency and balances the read and write efficiency of the DDR3 to a certain extent, so that overall transposition efficiency is improved.
Disclosure of Invention
The invention provides a DDR-based high-efficiency matrix transposition processing method. It aims to solve the low read efficiency caused by DDR row-activation (active) signals when a conventional transposition method is used to transpose large amounts of data in an SAR real-time imaging system running on an FPGA. In addition, 64-bit check bits are introduced to verify the correctness of the result after matrix transposition.
To solve the above technical problem, the method uses the IP core of a DDR3 SDRAM, pairs a write RAM and a read RAM with the IP core, and comprises:
step one, the matrix to be transposed is a 128 × 128 matrix of 64-bit elements, and it is divided into 128 small matrices of 8 × 16, that is, the data of each row forms one small matrix;
step two, during the write operation, 16 data are written per active signal;
step three, during the read operation, 8 × 16 data are read per active signal; during reading, the addresses are controlled by the controller, and data is read out of the DDR by cyclically skipping within the same row, so that row active signals occur as rarely as possible.
Further, in step two of the matrix transposition processing method, data first enters the write RAM; once the write RAM has been completely written, writing to the DDR begins. The DDR address increases by the burst length on each transfer, so the address that actually changes is the read address of the write RAM. In the write RAM, each time 16 data have been read, reading switches to the next row and continues with the corresponding 16 data. Following this logic, reading 16 data from each of 8 rows forms one small cycle. Reading then returns to the first of the 8 rows and starts on the next group of 16 data. This continues until the current 8 rows have been read completely, at which point reading of the next 8 rows begins; each completed set of 8 groups forms one large cycle. With these large and small cycles, 16 large groups are read in total, each containing 8 small groups.
In step three of the matrix transposition processing method, each skip pass over a DDR row reads 8 data, whose addresses in the read RAM increase sequentially, and each DDR row is read in 16 such passes. Reading proceeds as follows. Within a row, after each datum is read, the address jumps to the corresponding position in the next small block, skipping the 15 data in between. After the 8th datum of a pass has been read, the DDR address pointer moves to the first unread datum in the first small block of the row and the single-row pass is repeated, while the read-RAM address pointer moves to the corresponding position in its next row. This is repeated until the 16th pass is complete, after which the DDR address switches to the start of the next DDR row and the read-RAM pointer moves to the corresponding position in its next row. All of the above is repeated until 8 rows have been read from the DDR, whereupon the read-RAM pointer cycles back to the corresponding position in the first row of the matrix. The cycle is repeated until all data has been read from the DDR.
After all data has been written into the read RAM, the read state is finished and the controller can verify the data. The check relies on the fact that the diagonal does not move under transposition: a difference is taken on the upper 64 bits between a diagonal element and the two data to be verified. If the difference between the check bits of the diagonal element and the check bits of one of the two data equals the check bits of the other, the two data are in the correct positions. The check result is written to a register, 1 for correct and 0 for error, so that it can be inspected easily in a later report.
Compared with the prior art, the invention has the following beneficial effects:
When a Field Programmable Gate Array (FPGA) is used as the main processing chip for radar signal processing, transposing a large-order matrix reduces the overall processing speed of the system because of row-skipping accesses to the Double Data Rate synchronous dynamic random access memory (DDR SDRAM). The invention solves this problem. Through matrix blocking, the read rate is greatly improved at the cost of a small reduction in the write rate; the read and write rates of the DDR are balanced during transposition, and the average read-write efficiency of the DDR is thereby improved.
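The effect of blocking on row activations can be illustrated with a small counting model. The following Python sketch is not part of the patented design. It assumes a single bank with one open row at a time and 128 data words per DDR row, and all function names are hypothetical. It counts how often consecutive reads fall in a different DDR row, first for a conventional column-wise read of a row-major matrix and then for a blocked layout read in cyclic-skip order.

```python
N = 128                  # matrix order
WORDS_PER_DDR_ROW = 128  # assumption: one DDR row holds 128 data words


def row_switches(word_indices):
    """Count how often consecutive accesses land in a different DDR row."""
    switches, current = 0, None
    for idx in word_indices:
        row = idx // WORDS_PER_DDR_ROW
        if row != current:
            switches += 1
            current = row
    return switches


def conventional_column_read():
    """Word indices when a row-major matrix is read column by column."""
    return (N * r + c for c in range(N) for r in range(N))


def blocked_read():
    """Word indices when the blocked layout is read in cyclic-skip order.

    p: large group, m: DDR row inside the group,
    n: pass inside the row, k: small block visited in the pass.
    """
    return (1024 * p + 128 * m + 16 * k + n
            for p in range(16) for m in range(8)
            for n in range(16) for k in range(8))


if __name__ == "__main__":
    print("conventional:", row_switches(conventional_column_read()))
    print("blocked:     ", row_switches(blocked_read()))
```

Under these assumptions the conventional column read changes row on every one of the 16384 accesses, while the blocked read changes row 128 times, once per small matrix.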
Drawings
FIG. 1 is a conventional matrix transposed DDR3 read-write sequence;
FIG. 2 is a flow chart of the block-wise transpose of an embodiment of the present invention;
FIG. 3 is a block-wise DDR3 matrix transpose write sequence of an embodiment of the present invention;
FIG. 4 is a block-wise DDR3 matrix transpose read sequence of the present invention.
Detailed Description
The invention will be further described with reference to the following figures and specific examples, which are not intended to limit the invention in any way.
The invention provides a DDR-based high-efficiency matrix transposition processing method, which uses the IP core of a DDR3 SDRAM and pairs a write RAM and a read RAM with the IP core. The matrix to be transposed is a 128 × 128 matrix of 64-bit elements, and it is divided into 128 small matrices of 8 × 16, that is, the data of each row forms one small matrix. During the write operation, 16 data are written per active signal; during the read operation, 8 × 16 data are read per active signal. During reading, the addresses are controlled by the controller, and data is read out of the DDR by cyclically skipping within the same row, so that row active signals occur as rarely as possible.
The workflow of the matrix transposition processing method of the present invention is shown in FIG. 2 and is described in detail below with reference to FIG. 2.
(1) The original data is first spliced. A data counting module numbers the incoming data, producing a 64-bit sequence number for each datum. Each sequence number is concatenated with its datum, so that data and sequence number together form a 128-bit word to be transmitted. The 64-bit sequence number is added because, once the matrix transposition is complete, the large amount of data makes it inconvenient to verify the transposition directly. A logical check on the 64-bit check bits verifies the correctness of the whole module quickly and conveniently. Once the module has passed this verification the first time, the verification step can be omitted, which makes subsequent use simpler and more efficient.
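The splicing step can be sketched in Python as follows. This is a minimal illustration rather than the hardware implementation. The function names are hypothetical, and starting the counter at 0 is an assumption, since the description states only that the count increases by 1 for each completed splice.

```python
MASK64 = (1 << 64) - 1  # keeps a value within 64 bits


def splice(sample, seq):
    """Place the 64-bit sequence number above the 64-bit sample (128-bit word)."""
    return ((seq & MASK64) << 64) | (sample & MASK64)


def unsplice(word):
    """Split a 128-bit word back into (sample, sequence number)."""
    return word & MASK64, (word >> 64) & MASK64


def splice_stream(samples):
    """Tag every sample with a running count that advances once per splice.

    Assumption: the count starts at 0.
    """
    return [splice(sample, count) for count, sample in enumerate(samples)]


if __name__ == "__main__":
    for word in splice_stream([0xDEADBEEF, 7]):
        print(hex(word), unsplice(word))
```

Removing the check bits after verification corresponds to keeping only the first value returned by `unsplice`.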
(2) After the check bits have been spliced on, the data enters the write RAM; once the write RAM has been completely written, writing to the DDR can begin. While data is transferred from the write RAM to the DDR, the DDR address only needs to increase continuously by the burst length, so the address that mainly changes is the read address of the write RAM. In the write RAM, each time 16 data have been read, reading switches to the next row and continues with the corresponding 16 data. Each time 8 rows have been handled in this way, reading loops back to the first of the 8 rows and starts on the next group of 16 data. This logic is executed in sequence; once the current 8 rows have been read completely, reading of the next 8 rows begins, and the cycle continues. With these large and small cycles there are finally 16 large groups, each containing 8 small groups, which corresponds to the blocking strategy of dividing the 128 × 128 matrix into 16 × 8 small matrices.
(3) Once all data has been written, the write state is complete, a write-completion signal is issued, and the DDR controller switches the DDR to the read state. In the read state, the aim is to reduce the occurrence of row active signals as far as possible, so frequent row changes are avoided while reading the DDR. Each skip pass over a DDR row reads 8 data, whose addresses in the read RAM increase sequentially, and each DDR row is read in 16 such passes. Reading proceeds as follows. Within a row, after each datum is read, the address jumps to the corresponding position in the next small block, skipping the 15 data in between. After the 8th datum of a pass has been read, the DDR address pointer moves to the first unread datum in the first small block of the row and the single-row pass is repeated, while the read-RAM address pointer moves to the corresponding position in its next row. This is repeated until the 16th pass is complete, after which the DDR address switches to the start of the next DDR row and the read-RAM pointer moves to the corresponding position in its next row. All of the above is repeated until 8 rows have been read from the DDR, whereupon the read-RAM pointer cycles back to the corresponding position in the first row of the matrix. The cycle is repeated until all data has been read from the DDR.
(4) After all data has been written into the read RAM, the read state is finished and the controller verifies the data. The check relies on the fact that the diagonal does not move under transposition: a 64-bit difference is taken between a diagonal element and the two data to be verified. If the difference between the check bits of the diagonal element and the check bits of one of the two data equals the check bits of the other, the two data are in the correct positions. The check result is written to a register, 1 for correct and 0 for error, so that erroneous positions can be inspected easily in a later report.
Study materials:
A Xilinx Kintex-7 series development board carries the IP core of the DDR3 SDRAM, and a write RAM and a read RAM are paired with the IP core.
(1) The matrix to be transposed is to be divided into 128 small matrices of 8 × 16, with the data of each row forming one small matrix. First, the original data passes through a splicing module, which splices a 64-bit counting prefix onto the high-order side of each 64-bit original datum. The splicing module contains a simple counter that is incremented by 1 each time a splice is completed. The purpose is to number the data so that the correctness of the transposition can be checked conveniently after the transposition is completed.
(2) The data then enters the write RAM, whose read address is controlled by the controller module. Taking a DDR burst length of 8 as an example, the DDR address simply increases by 8 on each transfer in this step. With reference to FIG. 3, the read address of the write RAM changes according to the following rules:
a. Each time the address has been incremented 16 times, that is, each time 16 data have been output, the address is reset to 0 and 128 × n is added. In terms of the matrix, this points the address pointer at the start of the next row. Here n denotes the current cycle count, and the whole writing process comprises 16 large cycles.
b. After the corresponding 16 data in the 8th row have been read, the address is reset to 0 again and 16 × n is added. In terms of the matrix, this pulls the address pointer back from the 8th row to the 1st row, starting from the first datum of the second group of 16 in the current large group.
c. Steps a and b are repeated until one large group has been written. Then n is incremented by 1, the address is reset to 0, and 128 × 8 × (n - 1) = 1024 × (n - 1) is added, so that steps a and b resume from the first row of the next large group. Steps a, b and c are then repeated until the whole matrix has been written from the write RAM to the DDR.
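The write-RAM read order described by rules a to c can be sketched as nested loops. This Python sketch is one reading of the rules, assuming the write RAM is addressed in units of one data word with 128 words per matrix row. The loop variable names are hypothetical and do not correspond one-to-one to the n used in the rules.

```python
N = 128          # matrix order; also words per matrix row in the write RAM
BLOCK_ROWS = 8   # rows handled together in one large group
BLOCK_COLS = 16  # consecutive words taken from one row


def write_ram_read_addresses():
    """Yield write-RAM read addresses in the order data is sent to the DDR."""
    for group in range(N // BLOCK_ROWS):          # 16 large groups (rule c)
        for chunk in range(N // BLOCK_COLS):      # 8 small groups per large group (rule b)
            for row in range(BLOCK_ROWS):         # 8 rows, 16 words from each (rule a)
                base = N * BLOCK_ROWS * group + N * row + BLOCK_COLS * chunk
                for elem in range(BLOCK_COLS):
                    yield base + elem


if __name__ == "__main__":
    addrs = list(write_ram_read_addresses())
    print(addrs[:3], addrs[16], addrs[128], addrs[1024])
```

Because the DDR address increases linearly during the write, the position of an address in this sequence is also the position of the word in the DDR, so each run of 128 consecutive DDR words holds one 8 × 16 small matrix.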
(3) After the data write is completed, reading from the DDR follows. During this read, with reference to FIG. 4, the address in the DDR and the write address in the read RAM change as follows:
a. Each time one datum is read from the DDR, 128 is added to the DDR address and 1 is added to the read-RAM address.
b. Each time step a has been executed 7 times, the DDR address is reset to 0 and 8 × n is added, where n represents the number of times step a has been completed; at the same time, the read-RAM address is reset to 0 and 128 × n is added, switching to the next row. Finally, n is reset to 0, and steps a and b are repeated.
c. Each time step b has been executed 15 times, the DDR address is reset to 0 and 1024 × m is added, where m indicates that the DDR has completed reading m rows, that is, m is incremented by 1 each time step b has been completed 15 times; in the read RAM, the address is reset to 0 and 2048 × m is added. Steps a, b and c are then repeated.
d. Each time step c has been executed 7 times, the DDR address is reset to 0 and 8192 × p is added, where p indicates that the DDR has completed reading p large groups, that is, p is incremented by 1 each time step c has been completed 7 times; in the read RAM, the address is reset to 0 and 8 × p is added. Steps a, b and c are then repeated until all data has been read.
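The read rules a to d can be checked with a short software model. The following Python sketch is an interpretation, not the controller logic itself. It assumes each data word occupies 8 DDR address units (burst length 8), that one DDR row spans 1024 address units, and that the large-group stride is 8 × 1024 = 8192 address units. The function names and loop variables are hypothetical. The write order is repeated here so that the example is self-contained.

```python
N = 128
BURST = 8  # assumption: DDR address units occupied by one data word


def write_order():
    """Matrix (row, col) positions in the order they are written to the DDR."""
    for g in range(16):              # large group
        for c in range(8):           # small group (16-column chunk)
            for r in range(8):       # row inside the large group
                for e in range(16):  # word inside the chunk
                    yield 8 * g + r, 16 * c + e


def read_addresses():
    """Yield (ddr_address, read_ram_address) pairs following rules a to d."""
    for p in range(16):              # rule d: DDR +8192, read RAM +8
        for m in range(8):           # rule c: DDR +1024, read RAM +2048
            for n in range(16):      # rule b: DDR +8, read RAM +128
                for k in range(8):   # rule a: DDR +128, read RAM +1
                    yield (8192 * p + 1024 * m + 8 * n + 128 * k,
                           8 * p + 2048 * m + 128 * n + k)


def transpose_via_ddr(matrix):
    """Write the matrix to a modelled DDR, then read it back into the read RAM."""
    ddr = {}
    for i, (r, c) in enumerate(write_order()):
        ddr[i * BURST] = matrix[r][c]          # DDR address grows linearly on write
    ram = [None] * (N * N)
    for ddr_addr, ram_addr in read_addresses():
        ram[ram_addr] = ddr[ddr_addr]
    return [ram[i * N:(i + 1) * N] for i in range(N)]


if __name__ == "__main__":
    src = [[N * r + c for c in range(N)] for r in range(N)]
    out = transpose_via_ddr(src)
    print(out == [list(col) for col in zip(*src)])
```

In this model all 128 reads of one DDR row are issued before the row changes, which is the property the read rules are designed to achieve.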
(4) After the transposition is completed, the correctness of the data arrangement is checked. The core idea is as follows. Let i > j, where i denotes a row and j a column of the matrix to be transposed. It is judged whether (i, i) - (j, i) equals the upper 64 bits of (i, j), and the comparison result is stored in a 15-bit register in which the lower 14 bits hold the address and the 15th bit indicates whether the values are equal: it is set to 1 if they are equal, meaning the transposition is correct, and to 0 otherwise. After the judgment is finished, the spliced upper 64 check bits are removed. The main purpose of this step is to reveal any defect in the transposition function during design, and so to ensure that the logic is correct.
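The 15-bit result register can be illustrated with a simplified software check. The following Python sketch does not implement the diagonal-difference rule described above. It substitutes a direct comparison of each sequence number against the value expected after transposition, and it checks every position rather than only pairs with i > j. It assumes that sequence numbers start at 0 and count the source matrix row by row. The function names are hypothetical.

```python
N = 128


def check_entry(address, ok):
    """Pack a 15-bit result: bits 0 to 13 hold the address, bit 14 the pass flag."""
    return (int(ok) << 14) | (address & 0x3FFF)


def verify_transpose(seq_out):
    """Check the sequence numbers found in the transposed matrix.

    seq_out[r][c] is the sequence number (upper 64 bits) read at row r,
    column c of the result. Assumption: the source was numbered row by row
    from 0, so the word at result (r, c) must carry N * c + r.
    """
    results = []
    for r in range(N):
        for c in range(N):
            expected = N * c + r
            results.append(check_entry(N * r + c, seq_out[r][c] == expected))
    return results


if __name__ == "__main__":
    good = [[N * c + r for c in range(N)] for r in range(N)]
    flags = [entry >> 14 for entry in verify_transpose(good)]
    print(all(flags))
```

Fourteen address bits are enough for this matrix because 128 × 128 = 16384 = 2^14.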
Although the present invention has been described in connection with the accompanying drawings, the present invention is not limited to the above-described embodiments, which are intended to be illustrative rather than restrictive, and many modifications may be made by those skilled in the art without departing from the spirit of the present invention as disclosed in the appended claims.

Claims (6)

1. A DDR-based high-efficiency matrix transposition processing method using an IP core of a DDR3 SDRAM, characterized in that a write RAM and a read RAM are paired with the IP core, the method comprising the following steps:
step one, the matrix to be transposed is a 128 × 128 matrix of 64-bit elements, and it is divided into 128 small matrices of 8 × 16, that is, the data of each row forms one small matrix;
step two, during the write operation, 16 data are written per active signal:
the data enters the write RAM, and after the write RAM has been completely written, writing of data to the DDR begins;
while data is read from the write RAM to the DDR, the DDR address increases by the burst length, and the address that changes is the read address of the write RAM; in the write RAM, each time 16 data have been read, reading switches to the next row and continues with the corresponding 16 data; following this logic, reading 16 data from each of 8 rows forms one small cycle; reading then returns to the first of the 8 rows and starts on the next group of 16 data, and this logic is executed in sequence until the current 8 rows have been read completely, at which point reading of the next 8 rows begins, each completed set of 8 groups forming one large cycle; with these large and small cycles, 16 large groups are read in total, each containing 8 small groups;
taking the DDR burst length as 8, the process of writing data to the DDR is as follows:
A) in the write RAM, each time the address has been incremented 16 times, that is, each time 16 data have been output, the address is reset to 0 and 128 × n is added; in terms of the matrix, this points the address pointer at the start of the next row; n denotes the current cycle count, and the whole writing process comprises 16 cycles;
B) after the corresponding 16 data in the 8th row have been read, the address is reset to 0 again and 16 × n is added; in terms of the matrix, this pulls the address pointer back from the 8th row to the 1st row, starting from the first datum of the second group of 16 in the current large group;
C) steps A) to B) are repeated until one large group has been written;
D) n is incremented by 1, the address is reset to 0, 128 × 8 × (n - 1) = 1024 × (n - 1) is added, and steps A) to B) are repeated from the first row of the next large group, until the whole matrix has been written from the write RAM to the DDR;
step three, during the read operation, 8 × 16 data are read per active signal; during reading, the addresses are controlled by the controller, and data is read out of the DDR by cyclically skipping within the same row, so that row active signals occur as rarely as possible.
2. The method as claimed in claim 1, wherein in step three, each skip pass over a DDR row reads 8 data, whose addresses in the read RAM increase sequentially, and each DDR row is read in 16 such passes; reading proceeds as follows: within a row, after each datum is read, the address jumps to the corresponding position in the next small block, skipping the 15 data in between; after the 8th datum of a pass has been read, the DDR address pointer moves to the first unread datum in the first small block of the row and the single-row pass is repeated, while the read-RAM address pointer moves to the corresponding position in its next row; this is repeated until the 16th pass is complete, after which the DDR address switches to the start of the next DDR row and the read-RAM pointer moves to the corresponding position in its next row; all of the above is repeated until 8 rows have been read from the DDR, whereupon the read-RAM pointer cycles back to the corresponding position in the first row of the matrix; the cycle is repeated until all data has been read from the DDR.
3. The DDR-based high efficiency matrix transpose processing method of claim 2, wherein, taking the DDR burst length as 8, the detailed procedure for reading out from the DDR is as follows:
A) each time one datum is read from the DDR, 128 is added to the DDR address and 1 is added to the read-RAM address;
B) each time step A) has been executed 7 times, the DDR address is reset to 0 and 8 × n is added, where n represents the number of times step A) has been completed; at the same time, the read-RAM address is reset to 0 and 128 × n is added, switching to the next row; then n is reset to 0, and steps A) to B) are repeated;
C) each time step B) has been executed 15 times, the DDR address is reset to 0 and 1024 × m is added, where m indicates that the DDR has completed reading m rows, that is, m is incremented by 1 each time step B) has been completed 15 times;
in the read RAM, the address is reset to 0 and 2048 × m is added; thereafter, steps A) to B) are repeated;
D) each time step C) has been executed 7 times, the DDR address is reset to 0 and 8192 × p is added, where p indicates that the DDR has completed reading p large groups, that is, p is incremented by 1 each time step C) has been completed 7 times; in the read RAM, the address is reset to 0 and 8 × p is added; thereafter, steps A), B) and C) are repeated until all data has been read.
4. The DDR-based high efficiency matrix transpose processing method of claim 1, wherein after all data is written to the read RAM, the read state is complete and the controller performs data verification.
5. The DDR-based high efficiency matrix transpose processing method of claim 4, wherein the controller performs data verification by:
using the property that diagonal elements do not change position under transposition, taking a 64-bit difference involving the two data to be checked, wherein if the difference between the check bits of the diagonal datum and the check bits of one of the two data equals the check bits of the other datum, the two data are in the correct positions; and outputting the check result to a register, where 1 indicates correct and 0 indicates error.
6. The DDR-based high efficiency matrix transpose processing method of claim 5, wherein the DDR is addressed with a burst length of 8; let i > j, where i represents a row of the matrix to be transposed and j represents a column of the matrix to be transposed; it is determined whether (i, i) - (j, i) is equal to the upper 64 bits of (i, j), and the comparison result is stored in a 15-bit register, in which the lower 14 bits represent the address and the 15th bit indicates whether the values are equal: the bit is set to 1 if they are equal, indicating that the transposition is correct, and to 0 otherwise.
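The check of claims 5 and 6 can be illustrated with a short Python sketch. Several details are assumptions not stated in the claims: each matrix element is taken to be a 128-bit word whose upper 64 bits are the check bits, the subtraction is taken modulo 2^64, and the test data are presumed to have been generated so that the relation holds for correctly placed elements. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def upper64(word):
    """Upper 64 bits of an assumed 128-bit word (the assumed check bits)."""
    return (word >> 64) & MASK64


def check_pair(diag, elem_ji, elem_ij, addr):
    """Pack the comparison result into a 15-bit value.

    Bits 0..13 hold the address; bit 14 (the 15th bit) is 1 when
    check(diag) - check(elem_ji) equals check(elem_ij), else 0.
    """
    diff = (upper64(diag) - upper64(elem_ji)) & MASK64
    ok = diff == upper64(elem_ij)
    return (int(ok) << 14) | (addr & 0x3FFF)


def word(check):
    """Build an assumed 128-bit word carrying `check` in its upper half."""
    return check << 64


print(check_pair(word(10), word(3), word(7), addr=5))  # flag set
print(check_pair(word(10), word(3), word(8), addr=5))  # flag clear
```

Keeping the address in the low 14 bits lets a single register read identify both the location that was checked and whether it passed.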
CN201910996540.2A 2019-10-19 2019-10-19 DDR-based high-efficiency matrix transposition processing method Active CN110781447B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910996540.2A CN110781447B (en) 2019-10-19 2019-10-19 DDR-based high-efficiency matrix transposition processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910996540.2A CN110781447B (en) 2019-10-19 2019-10-19 DDR-based high-efficiency matrix transposition processing method

Publications (2)

Publication Number Publication Date
CN110781447A CN110781447A (en) 2020-02-11
CN110781447B CN110781447B (en) 2023-04-07

Family

ID=69386020

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910996540.2A Active CN110781447B (en) 2019-10-19 2019-10-19 DDR-based high-efficiency matrix transposition processing method

Country Status (1)

Country Link
CN (1) CN110781447B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111736130A (en) * 2020-07-01 2020-10-02 成都汇蓉国科微系统技术有限公司 Configurable block type matrix transposition system and method based on FPGA
CN111915477B (en) * 2020-08-08 2022-09-06 湖南非雀医疗科技有限公司 Address rotation method for color ultrasonic Doppler transposition storage
CN111984563B (en) * 2020-09-18 2022-08-02 西安电子科技大学 DDR3 read-write controller based on FPGA and matrix transposition implementation method
CN116150055B (en) * 2022-12-09 2023-12-29 中国科学院空天信息创新研究院 Data access method and device based on-chip cache and transposition method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105866774A (en) * 2016-03-23 2016-08-17 南京航空航天大学 FPGA implementation method for polar coordinate format imaging algorithm of chirp signal
CN110109115A (en) * 2019-05-09 2019-08-09 西安电子科技大学 SAR fast imaging device and method based on FPGA and DDR3

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
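吴沁文. High-efficiency matrix transposition method based on FPGA and DDR. 《现代雷达》 (Modern Radar). 2017, Vol. 39, pp. 1-8. *
胡晓琛; 李威; 朱岱寅; 崔爱欣. FPGA-based miniature SAR imaging signal processing technology. 雷达科学与技术 (Radar Science and Technology). 2018, (02), full text. *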

Also Published As

Publication number Publication date
CN110781447A (en) 2020-02-11

Similar Documents

Publication Publication Date Title
CN110781447B (en) DDR-based high-efficiency matrix transposition processing method
US10141039B2 (en) Burst length defined page size
CN108053855B (en) Matrix transposition method based on SDRAM chip
CN101916227B (en) RLDRAM SIO storage access control method and device
US11392488B2 (en) Optimizing storage of application data in memory
CN110109115B (en) SAR rapid imaging device and method based on FPGA and DDR3
CN110825312A (en) Data processing device, artificial intelligence chip and electronic equipment
CN108169716B (en) SAR imaging system matrix transposition device based on SDRAM chip and pattern interleaving method
US20150380091A1 (en) Methods of programming memories
JP2019508808A (en) Dynamic random access memory (DRAM) and self refresh method
CN109858622B (en) Data handling circuit and method for deep learning neural network
US10209895B2 (en) Memory system
CN113641625B (en) Four-way parallel data processing transposition system based on FPGA
JP4947395B2 (en) Semiconductor test equipment
CN114416612A (en) Memory access method and device, electronic equipment and storage medium
CN109446478A (en) A kind of complex covariance matrix computing system based on iteration and restructural mode
CN116737473A (en) Memory reading and writing method for chip verification and related device thereof
CN115185859B (en) Radar signal processing system and low-delay matrix transposition processing device and method
CN108920097B (en) Three-dimensional data processing method based on interleaving storage
CN105373497A (en) Digital signal processor (DSP) chip based matrix transposition device
CN114218136A (en) Area-friendly storage address mapping method facing systolic array
CN111814675B (en) Convolutional neural network feature map assembly system supporting dynamic resolution based on FPGA
CN111368250B (en) Data processing system, method and equipment based on Fourier transformation/inverse transformation
CN112905954A (en) CNN model convolution operation accelerated calculation method using FPGA BRAM
CN113673691A (en) Storage and computation combination-based multi-channel convolution FPGA (field programmable Gate array) framework and working method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant