CN103760525B - Completion type in-place matrix transposition method - Google Patents
- Publication number
- CN103760525B (application number CN201410005244.9A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- transposition
- data
- read
- ram
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/02—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S13/00—Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
- G01S13/88—Radar or analogous systems specially adapted for specific applications
- G01S13/89—Radar or analogous systems specially adapted for specific applications for mapping or imaging
- G01S13/90—Radar or analogous systems specially adapted for specific applications for mapping or imaging using synthetic aperture techniques, e.g. synthetic aperture radar [SAR] techniques
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S13/00—Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
- G01S13/88—Radar or analogous systems specially adapted for specific applications
- G01S13/89—Radar or analogous systems specially adapted for specific applications for mapping or imaging
- G01S13/90—Radar or analogous systems specially adapted for specific applications for mapping or imaging using synthetic aperture techniques, e.g. synthetic aperture radar [SAR] techniques
- G01S13/9004—SAR image acquisition techniques
Landscapes
- Engineering & Computer Science (AREA)
- Remote Sensing (AREA)
- Radar, Positioning & Navigation (AREA)
- Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- General Physics & Mathematics (AREA)
- Electromagnetism (AREA)
- Finish Polishing, Edge Sharpening, And Grinding By Specific Grinding Devices (AREA)
Abstract
The invention discloses a completion-type in-place matrix transposition method. A two-dimensional matrix A is divided into several square matrices whose side length is the smaller of the row count and the column count; the remainder that is too small to form a square matrix is padded out to a full square matrix, and the padded part contains no data. Each square matrix is further partitioned into K*K blocks. All K*K blocks on the diagonal and off the diagonal are read into a RAM module for transposition, and the two-dimensional matrix obtained by the transposition is then read out of the SDRAM in sequence to give the final transposed matrix. The method is suitable for transposing matrices in which the smaller of the row count and the column count is a positive integer multiple of a power of two. It effectively improves the utilization efficiency of SDRAM-type memories and the processing performance of digital signal processing applications that transpose large matrices.
Description
Technical field
The invention belongs to the technical field of digital signal processing and relates to a completion-type in-place matrix transposition method.
Background technology
Transposition of large matrices is very common in data-intensive applications. Take a synthetic aperture radar (SAR) imaging system as an example: in the imaging processor, the data are arranged sequentially in the range direction before azimuth compression, whereas azimuth compression operates along the direction perpendicular to range, so a matrix transposition must be performed between the two processing stages. For digital signal processing applications with very large data volumes, random access memory (RAM) and static random access memory (SRAM) have drawbacks such as limited capacity and high cost when used as the corner-turn (transposition) memory. Therefore, large-capacity second-generation double data rate synchronous dynamic random access memory (DDR2 SDRAM) or third-generation double data rate synchronous dynamic random access memory (DDR3 SDRAM) is commonly used as the corner-turn memory.
The invention with application No. 201010174342.7 discloses "A method of matrix transposition". Its principle is to divide the matrix into small blocks and transpose them using a suitable number of registers. Its shortcoming is that, although it executes the transposition faster than conventional register-based approaches, it needs additional storage space, which lowers the utilization of storage resources when the matrix is large.

The invention with application No. 200910236075.9 discloses "Matrix transposition automatic control circuit system and matrix transposition method". Its shortcoming is that it likewise needs additional matrix storage space and cannot transpose in place. In addition, the configuration information of the device comes from a processor core or DMA and is loaded over a configuration bus, which incurs considerable hardware overhead.

The patent with application No. 2012105538360.9, "Matrix transposition method and transposition device for a synthetic aperture radar imaging system", and the master's thesis "Design and implementation of the acquisition and transposition module of a real-time SAR processor" propose a block-mapped transposition algorithm. Its principle is to divide the DDR2/DDR3 memory into multiple storage blocks of equal size, each of which stores the image data of one range line, so that the data of different range lines are stored in the address spaces of different segments. With this allocation, every column of azimuth data occupies the same position in each block, and reading azimuth data only requires reading the data at the same position in every block. One row-activation operation of the block-mapped corner-turn algorithm can read several data items, which gives fast row-write, column-read operation. Its shortcoming is inherent in the algorithm: only one of the data items fetched by the burst transfer in each clock cycle is valid, so the data efficiency is low.

The patent with application No. 201110122834.6, "SAR imaging signal processing data transposition method based on FPGA", proposes a block-matrix transposition algorithm that divides the matrix data into symmetric-mode matrices, symmetric off-diagonal-mode matrices and asymmetric off-diagonal-mode matrices, and processes the three modes separately as block matrices. It has two shortcomings. First, although it can transpose the square data within a matrix in place, it gives no workable solution for large matrices whose row and column counts differ greatly and that can be divided into several square matrices. Second, it transposes the blocks of the square data one after another, one block at a time, so the transposition efficiency can still be improved.
Summary of the invention
The object of the invention is to provide a completion-type in-place matrix transposition method that solves the problems of the prior art: low utilization of matrix storage space, transposition efficiency that can still be improved, and, in particular, the lack of a mature solution for transposing large matrices whose row and column counts differ greatly.
The technical solution adopted by the invention comprises the following steps:
Step 1: The data to be transposed are stored contiguously in a synchronous dynamic random access memory (SDRAM) and mapped to a two-dimensional matrix A (L*W), where L is the number of rows and W is the number of columns; the smaller of the two is a positive integer multiple of a power of two;
Step 2: Using the smaller of the row count and the column count of A as the side length, A is divided into several square matrices; the remainder that is too small to form a square matrix is padded out to a full square matrix, and the padded part contains no data. Each square matrix is further divided into K*K blocks, where K is a divisor of the smaller of L and W;
Step 3: The K*K blocks fall into two kinds, those without a padded part and those containing a padded part:
a. All K*K blocks on the diagonal are read into the random access memory (RAM) module and transposed, and each transposed block is written back to the SDRAM over the original block;
b. All K*K blocks off the diagonal are read into the RAM module and transposed, and each transposed block is written back to the SDRAM over the block symmetric to it about the diagonal;
Step 3 is carried out on every square matrix;
Step 4: The two-dimensional matrix obtained in step 3 is read out of the SDRAM, which completes the whole transposition.
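Steps 1 to 4 can be modelled in software as a minimal sketch. The function name `transpose_by_completion`, the list-of-lists storage and the use of `None` for padded cells are assumptions made for illustration; they are not part of the patent, which targets SDRAM and a hardware RAM buffer. The sketch also buffers one block and its mirror at a time, and it passes padding-only blocks through the buffer instead of skipping them.

```python
def transpose_by_completion(a, k):
    """Model of steps 1-4 on a list-of-lists matrix `a` with block size `k`."""
    rows, cols = len(a), len(a[0])
    side = min(rows, cols)                    # side length of every square
    if side % k:
        raise ValueError("k must divide min(rows, cols)")
    n_squares = -(-max(rows, cols) // side)   # ceiling division; last square may be padded
    tall = rows >= cols                       # squares are stacked along the rows

    # Step 2: cut A into squares; padded cells hold no data (None here).
    squares = []
    for q in range(n_squares):
        sq = [[None] * side for _ in range(side)]
        for i in range(side):
            for j in range(side):
                r, c = (q * side + i, j) if tall else (i, q * side + j)
                if r < rows and c < cols:
                    sq[i][j] = a[r][c]
        squares.append(sq)

    # Step 3: transpose each square in place, one block pair at a time.
    nb = side // k
    for sq in squares:
        for bi in range(nb):
            for bj in range(bi, nb):
                # "RAM" buffer: block (bi, bj) and its mirror (bj, bi).
                blk = [[sq[bi * k + i][bj * k + j] for j in range(k)] for i in range(k)]
                mir = [[sq[bj * k + i][bi * k + j] for j in range(k)] for i in range(k)]
                for i in range(k):
                    for j in range(k):
                        sq[bj * k + j][bi * k + i] = blk[i][j]  # transposed block -> mirror slot
                        sq[bi * k + j][bj * k + i] = mir[i][j]  # transposed mirror -> block slot

    # Step 4: read out in order, skipping padded cells.
    out = []
    if tall:
        for i in range(side):
            out.append([squares[q][i][j] for q in range(n_squares)
                        for j in range(side) if q * side + j < rows])
    else:
        for q in range(n_squares):
            for i in range(side):
                if q * side + i < cols:
                    out.append(list(squares[q][i]))
    return out


a = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]   # 5 rows, 2 columns: three squares, the last one padded
print(transpose_by_completion(a, 2))
```

For a diagonal block the block and its mirror coincide, so both writes store the same values and the block is transposed onto itself, which corresponds to case a of step 3.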
The invention is further characterized as follows.

In step 3, the K*K blocks are transposed in the RAM module two at a time. If neither of the two K*K blocks contains a padded part, the two blocks are first read from the SDRAM and stored in the RAM. The first K*K data items in the RAM are then read out with address jumps in the order 0, K, 2K, ..., K(K-1), 1, K+1, 2K+1, ..., K(K-1)+1, ..., K-1, 2K-1, ..., K^2-1, after which the last K*K data items in the RAM are read out with address jumps in the order K^2, K^2+K, K^2+2K, ..., K^2+K(K-1), K^2+1, K^2+(K+1), K^2+(2K+1), ..., K^2+K(K-1)+1, ..., K^2+(K-1), K^2+(2K-1), ..., K^2+K^2-1.

If one of the two K*K blocks contains both a padded part and a non-padded part, let the non-padded parts of the two blocks be C(M, N) and D(R, S), where M, N, R and S are positive integers less than or equal to K. C(M, N) and D(R, S) are first read from the SDRAM and stored in the RAM. The first M*N data items are then read out with address jumps in the order 0, N, 2N, ..., N(M-1), 1, N+1, 2N+1, ..., N(M-1)+1, ..., N-1, 2N-1, ..., N*M-1, after which the last R*S data items in the RAM are read out with address jumps in the order M*N, M*N+S, M*N+2S, ..., M*N+S(R-1), M*N+1, M*N+(S+1), M*N+(2S+1), ..., M*N+S(R-1)+1, ..., M*N+(S-1), M*N+(2S-1), ..., M*N+(S*R-1).

If either of the two K*K blocks consists only of padding, that block needs no transposition.

In step 4, the transposed two-dimensional matrix A is read from the SDRAM as follows. If the column count of the original matrix A is smaller than its row count, the first row of the first square matrix is read first, then the first row of the second square matrix, and so on up to the first row of the padded square matrix, of which only the positions holding data are read; then the second row of the first square matrix is read, then the second row of the second square matrix, and so on, until all the data of the matrix have been read. If the column count W of the original matrix A is greater than its row count L, the first row of the first square matrix is read first as the first row of the transposed matrix, then the second row of the first square matrix as the second row of the transposed matrix, and so on up to the last row of the first square matrix; every following square matrix is read in the same way, and for the padded square matrix only the rows holding valid data are read.
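The jump-read address orders listed above follow one rule: a block of `rows` by `cols` items stored row by row is read column by column. The short sketch below generates them; the names `jump_read_order` and `pair_read_order` are illustrative and do not appear in the patent.

```python
def jump_read_order(rows, cols, base=0):
    """RAM addresses, in read order, for a rows*cols block stored row by row at `base`.

    Reading column by column (stride `cols`) yields the block's transpose.
    """
    return [base + r * cols + c for c in range(cols) for r in range(rows)]


def pair_read_order(shape_c, shape_d):
    """Read order for two blocks stored back to back: C(M, N) first, then D(R, S)."""
    (m, n), (r, s) = shape_c, shape_d
    return jump_read_order(m, n) + jump_read_order(r, s, base=m * n)


# Two full K*K blocks with K = 2.
print(pair_read_order((2, 2), (2, 2)))   # [0, 2, 1, 3, 4, 6, 5, 7]
# Partial blocks C(2, 3) and D(1, 2) next to the padding.
print(pair_read_order((2, 3), (1, 2)))   # [0, 3, 1, 4, 2, 5, 6, 7]
```

With M = N = R = S = K the general sequence reduces to the full-block sequence 0, K, 2K, ..., K^2-1 followed by K^2, K^2+K, ..., 2K^2-1.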
A further object of the invention is to provide a device for the completion-type in-place matrix transposition method. The device comprises a control logic module, which is connected to a read-SDRAM address generation module, a write-SDRAM address generation module and a transposition RAM module; the transposition RAM module is connected to a data input FIFO module and a data output FIFO module.

The read-SDRAM address generation module comprises a read-address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square-matrix counter, a stored-data count counter, an address-stride counter, a read-SDRAM row selector, a read-SDRAM column selector, and row/column concatenation that yields the SDRAM read address. Once the transposition operation can proceed, the read-address state machine starts, and the in-block row counter and in-block column counter generate the row and column addresses within the current block. Depending on the state of the read-address state machine, the module counts the row and column coordinates of the current block and the number of square matrices processed, and records the number of data items to be stored in this pass and the address stride. The SDRAM read address is computed from the current square-matrix number, block row coordinate, block column coordinate, in-block row coordinate and in-block column coordinate.

The write-SDRAM address generation module comprises a write-address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square-matrix counter, a write-SDRAM row selector, a write-SDRAM column selector, and row/column concatenation that yields the SDRAM write address. When the write process starts, the write-address state machine starts, and the in-block row counter and in-block column counter generate the row and column addresses within the current block. Depending on the state of the write-address state machine, the module counts the row and column coordinates of the current block and the number of square matrices processed. The SDRAM write address is computed from the current square-matrix number, block row coordinate, block column coordinate, in-block row coordinate and in-block column coordinate.

The transposition RAM module comprises an address generation state machine, a counter of the jump reads made since the base address was last set, a counter of the base address to be reset each time, and transposition-RAM read/write address generation. The data of each small block are first stored in the transposition RAM sequentially. After storage is complete, the read address of the RAM is determined from the address generation state machine, the current base-address count and the number of jump reads made since the base address was set.
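The address computation described above can be pictured with the following sketch. The patent states only that the address is derived from the square-matrix number, the block coordinates and the in-block coordinates and then formed by concatenating row and column; the concrete layout used here (row index in the high bits, column index in the low `col_bits` bits, squares stacked along the longer dimension) and the names `sdram_address` and `mirror_address` are assumptions for illustration.

```python
def sdram_address(square, blk_row, blk_col, in_row, in_col, *, k, side, col_bits, tall):
    """Concatenate a row index and a column index into one address (assumed layout).

    square            index of the square matrix being processed
    blk_row, blk_col  coordinates of the K*K block inside the square
    in_row, in_col    coordinates of the item inside the block
    tall              True if the squares are stacked along the rows of A
    """
    row = blk_row * k + in_row
    col = blk_col * k + in_col
    if tall:
        row += square * side
    else:
        col += square * side
    if col >= 1 << col_bits:
        raise ValueError("col_bits is too small for this column index")
    return (row << col_bits) | col


def mirror_address(square, blk_row, blk_col, in_row, in_col, **layout):
    """Address in the symmetric block: the block row and column coordinates are exchanged."""
    return sdram_address(square, blk_col, blk_row, in_row, in_col, **layout)


layout = dict(k=2, side=4, col_bits=2, tall=True)
print(sdram_address(1, 0, 1, 1, 0, **layout))    # row 5, column 2
print(mirror_address(1, 0, 1, 1, 0, **layout))   # row 7, column 0
```

Exchanging the block coordinates for the second block of a pair mirrors the behaviour the source attributes to the row and column selectors.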
The beneficial effects of the invention are high transposition efficiency and a design based on SDRAM memory: the method is suitable for transposing matrices of arbitrary size, effectively improves the utilization efficiency of SDRAM-type memories, and raises the processing performance of digital signal processing applications that transpose large matrices.
Brief description of the drawings
Fig. 1 shows the partitioning of the matrix before transposition;
Fig. 2 shows the matrix after transposition;
Fig. 3 shows the hardware implementation of the SDRAM-based data-block transposition method;
Fig. 4 shows the hardware implementation of the read-SDRAM address generation module;
Fig. 5 shows the read-SDRAM address generation state machine;
Fig. 6 shows the hardware implementation of the transposition RAM module;
Fig. 7 shows the RAM address generation state machine;
Fig. 8 shows the hardware implementation of the write-SDRAM address generation module.
Detailed description of the invention
The invention is described in detail below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, the method proceeds as follows.
Step 1: The batch of data to be transposed is stored contiguously in the synchronous dynamic random access memory (SDRAM) and mapped to a two-dimensional matrix A (L*W), where L is the number of rows and W is the number of columns; the smaller of the two is a positive integer multiple of a power of two;
a. The row count and the column count of the two-dimensional matrix are compared. Using the smaller of the two as the side length, the matrix is divided into several square matrices, and the remainder that is too small to form a square matrix is padded out to a square matrix;
b. The square matrices from the previous step are partitioned again into K*K blocks, which are classified into diagonal-mode blocks and off-diagonal-mode blocks. K is a positive integer and must be a divisor of the smaller of L and W;
Step 2: For each square matrix, all diagonal-mode K*K blocks without a padded part are transposed first, using address jumps to read the data in the random access memory (RAM) at every K-th address. Reading the RAM at every K-th address means reading the data at address 0 first, then at address K, then at address 2K, and so on up to address K*(K-1); the read then jumps to address 1, then to address K+1, and so on, cycling until the data at address K*K-1 have been read. The data read out are stored back, row by row, into the block they came from, which completes the transposition of this K*K unit block. Next, the off-diagonal-mode blocks without a padded part are transposed. Each time, two blocks symmetric about the diagonal are read; as above, the data of each block are read from the RAM at every K-th address using address jumps, and the data read out are stored row by row into the symmetric block, which achieves the transposition;
Step 3: The K*K blocks fall into two kinds, those without a padded part and those containing a padded part:
a. All K*K blocks on the diagonal are read into the random access memory (RAM) module and transposed, and each transposed block is written back to the SDRAM over the original block;
b. All K*K blocks off the diagonal are read into the RAM module and transposed, and each transposed block is written back to the SDRAM over the block symmetric to it about the diagonal;
Step 3 is carried out on every square matrix;
Step 4: The two-dimensional matrix obtained in step 3 is read out of the SDRAM, which completes the whole transposition.

In step 3, the K*K blocks are transposed in the RAM module two at a time. If neither of the two K*K blocks contains a padded part, the two blocks are first read from the SDRAM and stored in the RAM. The first K*K data items in the RAM are then read out with address jumps in the order 0, K, 2K, ..., K(K-1), 1, K+1, 2K+1, ..., K(K-1)+1, ..., K-1, 2K-1, ..., K^2-1, after which the last K*K data items in the RAM are read out with address jumps in the order K^2, K^2+K, K^2+2K, ..., K^2+K(K-1), K^2+1, K^2+(K+1), K^2+(2K+1), ..., K^2+K(K-1)+1, ..., K^2+(K-1), K^2+(2K-1), ..., K^2+K^2-1.

If one of the two K*K blocks contains both a padded part and a non-padded part, let the non-padded parts of the two blocks be C(M, N) and D(R, S), where M, N, R and S are positive integers less than or equal to K. C(M, N) and D(R, S) are first read from the SDRAM and stored in the RAM. The first M*N data items are then read out with address jumps in the order 0, N, 2N, ..., N(M-1), 1, N+1, 2N+1, ..., N(M-1)+1, ..., N-1, 2N-1, ..., N*M-1, after which the last R*S data items in the RAM are read out with address jumps in the order M*N, M*N+S, M*N+2S, ..., M*N+S(R-1), M*N+1, M*N+(S+1), M*N+(2S+1), ..., M*N+S(R-1)+1, ..., M*N+(S-1), M*N+(2S-1), ..., M*N+(S*R-1).

If either of the two K*K blocks consists only of padding, that block needs no transposition.

In step 4, the transposed two-dimensional matrix A is read from the SDRAM as follows. If the column count of the original matrix A is smaller than its row count, the first row of the first square matrix is read first, then the first row of the second square matrix, and so on up to the first row of the padded square matrix, of which only the positions holding data are read; then the second row of the first square matrix is read, then the second row of the second square matrix, and so on, until all the data of the matrix have been read. If the column count W of the original matrix A is greater than its row count L, the first row of the first square matrix is read first as the first row of the transposed matrix, then the second row of the first square matrix as the second row of the transposed matrix, and so on up to the last row of the first square matrix; every following square matrix is read in the same way, and for the padded square matrix only the rows holding valid data are read.
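The two read-out orders of step 4 can be written as a schedule of row reads. In this sketch each entry is a tuple (square index, row index within the square, number of valid items to read); the function name `readout_schedule` and the tuple format are illustrative choices, not taken from the patent.

```python
def readout_schedule(rows, cols):
    """Order of row reads in step 4 for an original matrix of rows*cols items."""
    side = min(rows, cols)
    n_squares = -(-max(rows, cols) // side)   # ceiling division
    schedule = []
    if cols <= rows:
        # Fewer columns than rows: row i of every square in turn forms output row i.
        for i in range(side):
            for q in range(n_squares):
                valid = min(side, rows - q * side)   # padded square: data positions only
                schedule.append((q, i, valid))
    else:
        # More columns than rows: each square is read completely before the next.
        for q in range(n_squares):
            valid_rows = min(side, cols - q * side)  # padded square: valid rows only
            for i in range(valid_rows):
                schedule.append((q, i, side))
    return schedule


print(readout_schedule(5, 2))
print(readout_schedule(2, 5))
```

In the first case the padded square contributes shortened rows; in the second case it contributes fewer rows of full length.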
As shown in Fig. 3, the transposition device provided by the invention comprises a control logic module 1, which is connected to a read-SDRAM address generation module 2, a write-SDRAM address generation module 3 and a transposition RAM module 4; the transposition RAM module 4 is connected to a data input FIFO module 5 and a data output FIFO module 6. The data path is as follows: the external SDRAM outputs data to the data input FIFO module 5, which feeds the transposition RAM module 4; the data are written into the RAM sequentially and read out with address jumps; the transposition RAM module 4 outputs the data to the data output FIFO module 6, from which they are written to the SDRAM. The control path of the control logic module 1 covers the read-SDRAM address generation module 2, the read-SDRAM port signals, the read/write flags of the input FIFO, the read/write address control of the transposition RAM, the read/write flags of the transposition RAM, the read/write flags of the output FIFO, the write-SDRAM address generation module 3 and the write-SDRAM port signals.
As shown in Fig. 4, the read-SDRAM address generation module 2 comprises a read-address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square-matrix counter, a stored-data count counter, an address-stride counter, a read-SDRAM row selector, a read-SDRAM column selector, and row/column concatenation that yields the SDRAM read address. The read-address generation state machine controls the progress of the whole transposition flow. The in-block row counter holds the row position within the block being read, and the in-block column counter holds the column position within the block being read. The block row-coordinate counter holds the row position of the block being read, and the block column-coordinate counter holds its column position. The square-matrix counter holds the number of the square matrix currently being transposed. The stored-data count counter holds the number of valid data items in the two diagonal-mode blocks or the two off-diagonal-mode blocks. The address-stride counter holds the number of columns of valid data in the current block, which is the count the transposition RAM module 4 uses to read across addresses. The read-SDRAM row selector selects the SDRAM row to read: for the first of the two blocks, the row is computed from the row and column coordinates; for the second of the two blocks, the row and column coordinates are exchanged and the row of the second block is computed. The read-SDRAM column selector works on the same principle as the read-SDRAM row selector. Finally, the SDRAM read address is obtained by concatenating row and column.

The read-SDRAM address generation state machine, shown in Fig. 5, has six states: the initial state; the state of reading the first block in each two-block transposition of diagonal-mode blocks (diagonal first-block state); the state of reading the second block in each two-block transposition of diagonal-mode blocks (diagonal second-block state); the state of reading the first block in each symmetric-pair transposition of off-diagonal-mode blocks (off-diagonal first-block state); the state of reading the symmetric block in each symmetric-pair transposition of off-diagonal-mode blocks (off-diagonal second-block state); and the wait-for-block-transposition-complete state.
Reset puts the machine in the initial state. When condition 1 holds, that is, initialization is complete and the data have been fully stored, the next state is the diagonal first-block state; otherwise the machine stays in the initial state.
In the diagonal first-block state: when condition 2 holds, that is, the current diagonal-mode block lies at the padding and its last valid data item has been read, the next state is the wait-for-block-transposition-complete state; when condition 3 holds, that is, the data of the first block have been read, the next state is the diagonal second-block state; if neither condition 2 nor condition 3 holds, the machine stays in the diagonal first-block state.
In the diagonal second-block state: when condition 4 holds, that is, the current diagonal-mode block lies at the padding and its last valid data item has been read, or the data of the second block have been read, the next state is the wait-for-block-transposition-complete state; if condition 4 does not hold, the machine stays in the diagonal second-block state.
In the wait-for-block-transposition-complete state: when condition 5 holds, that is, all blocks have been transposed, the next state is the initial state; when condition 6 holds, that is, the previous block transposition has finished and the diagonal-mode blocks of the current square matrix have all been transposed, the next state is the off-diagonal first-block state; when condition 7 holds, that is, the previous block transposition has finished and the diagonal-mode blocks of the current square matrix have not yet all been transposed, the next state is the diagonal first-block state; if none of conditions 5, 6 and 7 holds, the machine stays in the wait-for-block-transposition-complete state.
In the off-diagonal first-block state: when condition 8 holds, that is, the current block lies at the padding and its last valid data item has been read, or the block symmetric to the current block is a padded block and so holds no data, the next state is the wait-for-block-transposition-complete state; when condition 9 holds, that is, the data of the first block have been read, the next state is the off-diagonal second-block state; if neither condition 8 nor condition 9 holds, the machine stays in the off-diagonal first-block state.
In the off-diagonal second-block state: when condition 10 holds, that is, the current block lies at the padding and its last valid data item has been read, or the data of the second block have been read, the next state is the wait-for-block-transposition-complete state; if condition 10 does not hold, the machine stays in the off-diagonal second-block state.
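The read-address state machine described above can be summarised as a transition table. The sketch below encodes the six states and conditions 1 to 10 as stated in the text; the enum names, the representation of the conditions as a set of condition numbers, and the priority among conditions that hold at the same time (taken here as the order in which the text lists them) are assumptions.

```python
from enum import Enum, auto


class ReadState(Enum):
    INIT = auto()
    DIAG_FIRST = auto()
    DIAG_SECOND = auto()
    OFF_FIRST = auto()
    OFF_SECOND = auto()
    WAIT_DONE = auto()


# state -> ordered (condition number, next state) pairs; the first true condition wins.
TRANSITIONS = {
    ReadState.INIT: [(1, ReadState.DIAG_FIRST)],
    ReadState.DIAG_FIRST: [(2, ReadState.WAIT_DONE), (3, ReadState.DIAG_SECOND)],
    ReadState.DIAG_SECOND: [(4, ReadState.WAIT_DONE)],
    ReadState.WAIT_DONE: [(5, ReadState.INIT), (6, ReadState.OFF_FIRST), (7, ReadState.DIAG_FIRST)],
    ReadState.OFF_FIRST: [(8, ReadState.WAIT_DONE), (9, ReadState.OFF_SECOND)],
    ReadState.OFF_SECOND: [(10, ReadState.WAIT_DONE)],
}


def next_state(state, true_conditions):
    """Return the next state given the set of condition numbers that currently hold."""
    for condition, target in TRANSITIONS[state]:
        if condition in true_conditions:
            return target
    return state   # no condition holds: stay in the current state


state = ReadState.INIT
for conditions in ({1}, {3}, {4}, {7}, {2}, {6}, {9}, {10}, {5}):
    state = next_state(state, conditions)
    print(state.name)
```

The loop walks through one diagonal pair, a padded diagonal block, one off-diagonal pair and the final return to the initial state.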
As shown in Figure 6, transposition RAM module 4 includes that address produces state machine, puts base every time
The enumerator of number of skipping behind address, the enumerator of base address that every time need to reset, transposition RAM
The generation of read/write address.Address produces state machine and controls whole transposition RAM read-write flow process;Often
Secondary put base address after skip the enumerator of number, deposit the number read of having skipped;Often
The counter of the secondary base address to be reset stores the base address that needs resetting; the transposition RAM read/write address generator locates the address in the transposition RAM according to the results of the three units above. The transposition RAM module 4 writes data sequentially and reads it back with address jumps. It contains a RAM address generation state machine which, as shown in Fig. 7, has five states: the initial state, the sequential-write state, the address-zeroing state, jump-read state 1, and jump-read state 2.
The state transitions are shown in Fig. 7. Reset puts the machine in the initial state. When condition 1 holds (the read signal of the read FIFO is valid), the next state is the sequential-write state; otherwise the machine stays in the initial state. In the sequential-write state, when condition 2 holds (the address equals the amount of data this small block needs to store), the next state is the address-zeroing state; otherwise the machine stays in the sequential-write state. From the address-zeroing state the machine moves to jump-read state 1 on the next clock cycle. In jump-read state 1, when condition 3 holds (more than one small block of data has to be stored and the address has reached the end of one small block), the next state is jump-read state 2; when condition 4 holds (no more than one small block of data is stored and the address has reached the last datum), the next state is the initial state; when neither condition 3 nor condition 4 holds, the machine stays in jump-read state 1. In jump-read state 2, when condition 5 holds (more than one small block of data has been stored), the next state is the initial state; otherwise the machine stays in jump-read state 2.
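The five-state machine described above can be sketched in software as a pure transition function. This is only an illustrative model, not the hardware of Fig. 7: the flag names (`fifo_valid`, `write_done`, `two_blocks`, `block_end`, `last_datum`) are hypothetical, and the sketch assumes that condition 5 also waits for the last datum of the second block, which the text does not state explicitly.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()         # initial state
    SEQ_WRITE = auto()    # sequential-write state
    ADDR_ZERO = auto()    # address-zeroing state
    JUMP_READ_1 = auto()  # jump-read state 1
    JUMP_READ_2 = auto()  # jump-read state 2


def next_state(state, fifo_valid=False, write_done=False,
               two_blocks=False, block_end=False, last_datum=False):
    """Return the state for the next clock cycle (flag names are hypothetical)."""
    if state is State.INIT:
        # Condition 1: the read signal of the read FIFO is valid.
        return State.SEQ_WRITE if fifo_valid else State.INIT
    if state is State.SEQ_WRITE:
        # Condition 2: the address equals the amount of data to store.
        return State.ADDR_ZERO if write_done else State.SEQ_WRITE
    if state is State.ADDR_ZERO:
        # Unconditional move on the next clock cycle.
        return State.JUMP_READ_1
    if state is State.JUMP_READ_1:
        if two_blocks and block_end:       # condition 3
            return State.JUMP_READ_2
        if not two_blocks and last_datum:  # condition 4
            return State.INIT
        return State.JUMP_READ_1
    # JUMP_READ_2. Assumption: condition 5 also requires the last datum.
    return State.INIT if (two_blocks and last_datum) else State.JUMP_READ_2
```

Keeping the function free of side effects makes each numbered condition easy to check against the figure one transition at a time.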
As shown in Fig. 8, the write SDRAM address generation module 3 corresponds to the read SDRAM address generation module 2, and each of its sub-modules has the same function as the matching sub-module of the read SDRAM address generation module 2 described above. Its state machine has seven states: the initial state; the state for the first matrix of each two-matrix transposition operation on diagonal blocks (the diagonal first-matrix state); the state for the second matrix of each two-matrix transposition operation on diagonal blocks (the diagonal second-matrix state); the state for the first matrix of each symmetric-pair transposition operation on off-diagonal blocks (the off-diagonal first-matrix state); the state for the symmetric matrix of each symmetric-pair transposition operation on off-diagonal blocks (the off-diagonal second-matrix state); wait-for-stage-start state 1; and wait-for-stage-start state 2.
Reset puts the machine in the initial state. When condition 1 holds (the start signal of this stage is true), the next state is the diagonal first-matrix state; otherwise the machine stays in the initial state.
In the diagonal first-matrix state, when condition 2 holds (this diagonal small block lies in the padded region and its last valid datum has been written), the next state is wait-for-stage-start state 1; when condition 3 holds (the data of the first small block has been written), the next state is the diagonal second-matrix state; when neither condition 2 nor condition 3 holds, the machine stays in the diagonal first-matrix state.
In the diagonal second-matrix state, when condition 4 holds (this diagonal small block lies in the padded region and its last valid datum has been written, or the data of the second small block has been written), the next state is wait-for-stage-start state 1; otherwise the machine stays in the diagonal second-matrix state.
From wait-for-stage-start state 1 the machine moves to wait-for-stage-start state 2 on the next clock cycle.
In wait-for-stage-start state 2, when condition 5 holds (the transposition of all small blocks is complete), the next state is the initial state; when condition 6 holds (this stage has started and the diagonal blocks of the current square matrix have all been transposed), the next state is the off-diagonal first-matrix state; when condition 7 holds (this stage has started and the diagonal blocks of the current square matrix have not all been transposed), the next state is the diagonal first-matrix state; when none of conditions 5, 6 and 7 holds, the machine stays in wait-for-stage-start state 2.
In the off-diagonal first-matrix state, when condition 8 holds (this small block lies in the padded region and its last valid datum has been written, or the small block symmetric to it consists entirely of padding, i.e. holds no data), the next state is wait-for-stage-start state 1; when condition 9 holds (the data of the first small block has been written), the next state is the off-diagonal second-matrix state; when neither condition 8 nor condition 9 holds, the machine stays in the off-diagonal first-matrix state.
In the off-diagonal second-matrix state, when condition 10 holds (this small block lies in the padded region and its last valid datum has been written, or the data of the second small block has been written), the next state is wait-for-stage-start state 1; otherwise the machine stays in the off-diagonal second-matrix state.
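The seven-state machine above is easiest to audit as a transition table. The sketch below is a software model under stated assumptions, not the circuit of Fig. 8: the state names are hypothetical labels, the conditions are passed in by number, and when two conditions of one state are true at once the earlier-listed one is assumed to win, a priority the text does not specify.

```python
# state -> ordered list of (condition number, next state); names are hypothetical.
TRANSITIONS = {
    "INIT":        [(1, "DIAG_FIRST")],
    "DIAG_FIRST":  [(2, "WAIT_1"), (3, "DIAG_SECOND")],
    "DIAG_SECOND": [(4, "WAIT_1")],
    "WAIT_2":      [(5, "INIT"), (6, "OFF_FIRST"), (7, "DIAG_FIRST")],
    "OFF_FIRST":   [(8, "WAIT_1"), (9, "OFF_SECOND")],
    "OFF_SECOND":  [(10, "WAIT_1")],
}


def step(state, true_conditions=()):
    """Return the next state given the numbers of the conditions that hold."""
    if state == "WAIT_1":
        # Wait state 1 always advances to wait state 2 on the next clock.
        return "WAIT_2"
    for condition, target in TRANSITIONS[state]:
        if condition in true_conditions:
            return target
    return state  # no listed condition holds: stay in the current state


# Walk one full-block diagonal pair: start, first block, second block, wait.
trace = ["INIT"]
for conds in [(1,), (3,), (4,), (), (7,)]:
    trace.append(step(trace[-1], conds))
print(trace)
```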
The present invention supports in-place transposition and suits matrices whose row and column counts differ greatly. It proposes a completion type (padding-based) in-place transposition method and implements it in hardware, so that the transposition device uses memory more efficiently and meets the requirement in digital signal processing that memory resources be fully utilized.
The beneficial effects of the present invention are analyzed quantitatively as follows.
Storage efficiency analysis:
For a matrix of size L*W, assume the smaller dimension is W. The storage space that the present invention must allocate for the matrix, including the space occupied by the original matrix, is ([L/W]+1)*W*W storage locations, where [] is the integer-part (rounding-down) function.
Let L be 160, 544, 1312 and 3232 in turn, with W = 64. Table 1 compares the transposition space required by a non-in-place transposition algorithm with that required by the algorithm of the present invention.
Table 1
As Table 1 shows, for the same matrix size the present invention needs less matrix storage space than the non-in-place transposition algorithm, and as the matrix size grows the space saved approaches 50%.
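The storage formula can be evaluated directly for the four matrix sizes named above. The sketch below is only an illustration: `in_place_words` follows the stated expression ([L/W]+1)*W*W, while `out_of_place_words` assumes that a non-in-place algorithm holds the original matrix plus a separate destination buffer of the same size (2*L*W). That baseline is an assumption of this sketch and is not taken from Table 1.

```python
def in_place_words(L, W):
    """Storage locations needed by the padded in-place scheme: ([L/W]+1)*W*W."""
    if W > L:
        raise ValueError("W is expected to be the smaller dimension")
    return (L // W + 1) * W * W


def out_of_place_words(L, W):
    """Assumed baseline: source matrix plus an equally sized destination."""
    return 2 * L * W


def saving(L, W):
    """Fraction of the assumed baseline storage that the in-place scheme saves."""
    return 1 - in_place_words(L, W) / out_of_place_words(L, W)


for L in (160, 544, 1312, 3232):
    print(L, in_place_words(L, 64), out_of_place_words(L, 64),
          f"{saving(L, 64):.1%}")
```

Under this assumed baseline the saving grows with L and stays below 50%, which agrees with the trend the text describes.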
Efficiency analysis of the transposition algorithm:
For the efficiency analysis, let the matrix size be L*W with W the smaller dimension, let each cross-row read cost m additional clock cycles (the cross-row activation time differs slightly between SDRAM types), and let the block size be K. The number of clock cycles the present invention spends to transpose matrix A(L, W) is then:
L*m + (L*W)/4
+ (W/K)*L*m + (L*W)/4
+ (W/K)*([L/W]*W)*m + W*([(L-[L/W]*W)/K]+1)*m + (L*W)/4
+ ([L/W]+1)*W*m + (L*W)/4.
For a matrix of size 2720*256 (with K = 64, and writing k for 1024), the theoretical number of clock cycles consumed by the present invention is:
T1 = (2.65625m+170)k + (10.625m+170)k + (10.75m+170)k + (2.75m+170)k
= (26.78125m+680)k.
The block-mapped storage transposition algorithm proposed in the existing work "Design and Implementation of the Acquisition and Transposition Module of a SAR Real-time Processor" consumes in total (assuming a block size of 16*16):
T2 = (42.5m+170)k + (42.5m+680)k
= (85m+850)k.
Comparing the two gives T1-T2 = -(58.21875m+170)k. The present invention therefore spends far fewer clock cycles than the method proposed in that work and has better transposition efficiency.
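The arithmetic behind T1 and the comparison with T2 can be checked by evaluating the clock-cycle expression term by term. In the sketch below, `cycle_terms` is a hypothetical helper that returns the coefficient of m and the constant part separately; k = 1024 is inferred from the figures in the text (2720/1024 = 2.65625), and the T2 coefficients are simply taken as given rather than derived.

```python
def cycle_terms(L, W, K):
    """Return (coefficient of m, constant term) of the clock-cycle expression."""
    q = L // W                              # [L/W], number of full squares
    wait = (L                               # L*m
            + (W // K) * L                  # (W/K)*L*m
            + (W // K) * (q * W)            # (W/K)*([L/W]*W)*m
            + W * ((L - q * W) // K + 1)    # W*([(L-[L/W]*W)/K]+1)*m
            + (q + 1) * W)                  # ([L/W]+1)*W*m
    data = L * W                            # four terms of (L*W)/4
    return wait, data


k = 1024
wait, data = cycle_terms(2720, 256, 64)
T1 = (wait / k, data / k)               # (coefficient of m, constant), in units of k
T2 = (85.0, 850.0)                      # taken from the text, not derived here
DIFF = (T1[0] - T2[0], T1[1] - T2[1])   # T1 - T2, in units of k
print(T1, DIFF)
```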
A specific embodiment is described below.
Specific embodiment 1:
Let the original matrix have L=2720 rows and W=256 columns, and let the sub-matrices into which it is divided have K=64 rows and columns. The transposition method is as follows:
Step 1: store the batch of data to be transposed contiguously in the SDRAM, first mapping the data to a 2720*256 two-dimensional matrix;
Step 2: compare the number of rows and the number of columns of the matrix, pad the mapped two-dimensional matrix so that it can be divided into a whole number of square matrices, and then partition it into blocks. With reference to Fig. 1, this step is implemented as follows:
(2a) taking 256*256 as the square-matrix size, pad the matrix by appending a 96*256 rectangle at the bottom, so that the whole matrix can be divided into 11 square matrices;
(2b) split each square matrix into 64*64 unit matrices, which are classified into diagonal blocks and off-diagonal blocks;
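The figures in step 2 (11 square matrices, 96 padded rows) follow from simple integer arithmetic, sketched below. `partition_plan` is a hypothetical helper, and it uses a ceiling division for the number of squares; for this embodiment that matches the 11 squares stated in the text.

```python
def partition_plan(L, W, K):
    """Padding and block counts for an L*W matrix cut into K*K unit blocks."""
    side, long_side = min(L, W), max(L, W)
    if side % K:
        raise ValueError("K must divide the smaller dimension")
    squares = -(-long_side // side)      # ceiling division
    pad = squares * side - long_side     # padded rows (or columns)
    per_side = side // K                 # unit blocks along one side of a square
    return {
        "side": side,
        "squares": squares,
        "pad": pad,
        "diagonal_blocks": per_side,
        "off_diagonal_pairs": per_side * (per_side - 1) // 2,
    }


print(partition_plan(2720, 256, 64))
```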
Step 3: use a RAM unit to achieve the transposition by means of address jumps. The RAM used in this step has size 2*64*64, i.e. it can transpose two 64*64 matrices at a time. The diagonal blocks are transposed first, i.e. the blocks on the diagonal in Fig. 1 (A00, A11, A22, A33) are processed. The two diagonal blocks A00 and A11 are read into the RAM, and the RAM is then read back with address jumps. Transposition of A00: read the datum at address 0, then at address 64, then at address 128, and so on until address 4032 has been read; then jump to address 1, then read address 65, and so on cyclically until the datum at address 4095 has been read; the data read out are stored back to A00 row by row. Transposition of A11: first read the datum at the first address of A11, i.e. address 4096, then the datum at address 4160, and so on until address 8128 has been read; then jump to address 4097, and so on cyclically until the datum at address 8191, the last datum of A11, has been read; the data read out are again stored back row by row to the position of A11. Finally, the two operations above are applied in turn to every pair of small blocks. This involves the following address jumps:
the row-change address jump within a unit matrix;
the jump from the end address of the first block to the start address of the second block within a two-block combination;
the jump from the end address of the second block to the start address of the first block within a two-block combination;
the jump between different combinations, i.e. from the last address of the current two-block combination to the start address of the next two-block combination.
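The jump-read order of step 3 is a column-major walk over a block that was written row by row. A minimal sketch, assuming the block occupies K*K consecutive RAM addresses starting at `base` (both function names are hypothetical):

```python
def jump_read_order(K, base=0):
    """Addresses in jump-read order for a K*K block written sequentially at base."""
    return [base + row * K + col for col in range(K) for row in range(K)]


def transpose_via_ram(block_words, K):
    """Read a row-major K*K block in jump order; the result is its transpose,
    again in row-major order."""
    return [block_words[address] for address in jump_read_order(K)]


first = jump_read_order(64)              # block A00 at addresses 0..4095
second = jump_read_order(64, base=4096)  # block A11 at addresses 4096..8191
print(first[:3], first[63:66], first[-1])
print(second[:2], second[63:65], second[-1])
```

Reading down one column before moving to the next is what turns the sequential write into a transposed read, so no second buffer is needed for the block.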
Step 4: transpose the off-diagonal blocks, i.e. transpose the pairs of blocks in the figure such as A01 and A10, A02 and A20, and so on. Using the RAM unit, store the two unit-matrix blocks of a square matrix that are symmetric about the diagonal into the RAM, read the data from the RAM with address jumps, and write each block to the position symmetric to its own, which completes the exchange of the two blocks. The procedure is illustrated for one off-diagonal pair: first, the data of the symmetric blocks A01 and A10 are stored into the RAM unit row by row; then, using the address-jump read described in step 3, the first transposed 64*64 unit block is stored at the position of A10 and the second transposed 64*64 unit block is stored at the position of A01, which completes the transposition of the two symmetric off-diagonal blocks. The same operation is then applied to all symmetric block pairs, which completes the transposition of all off-diagonal blocks.
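Steps 3 and 4 together transpose one square matrix in place, block by block. The sketch below models that on a list of lists rather than on SDRAM and RAM addresses; `transpose_square_in_place` and its inner helpers are hypothetical names, and the two blocks held in local variables stand in for the two-block RAM.

```python
def transpose_square_in_place(a, K):
    """Transpose square matrix a (list of lists) in place using K*K blocks."""
    n = len(a)

    def read_block(bi, bj):
        # Copy block (bi, bj) out of the matrix, as the RAM would hold it.
        return [a[bi * K + r][bj * K:(bj + 1) * K] for r in range(K)]

    def write_block(bi, bj, block):
        for r in range(K):
            a[bi * K + r][bj * K:(bj + 1) * K] = block[r]

    def transposed(block):
        return [list(column) for column in zip(*block)]

    blocks = n // K
    # Step 3: each diagonal block is transposed and written back to its place.
    for i in range(blocks):
        write_block(i, i, transposed(read_block(i, i)))
    # Step 4: each symmetric off-diagonal pair is transposed and exchanged.
    for i in range(blocks):
        for j in range(i + 1, blocks):
            first, second = read_block(i, j), read_block(j, i)
            write_block(j, i, transposed(first))
            write_block(i, j, transposed(second))


matrix = [[r * 8 + c for c in range(8)] for r in range(8)]
expected = [list(column) for column in zip(*matrix)]
transpose_square_in_place(matrix, 4)
print(matrix == expected)
```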
Step 5: complete the transposition of all square matrices other than the padded one; after transposition, each small matrix is stored at the position symmetric to its original position about the diagonal of its square matrix.
Step 6: transpose the padded square matrix.
First, transpose the diagonal blocks one by one. When a diagonal block is full of data, such as B00 and B11 in Fig. 1, it is transposed in the same way as in step 3. When a diagonal block is not full, such as B22 in the figure, the transposition is achieved with address jumps over a half-full RAM: first store the data of this block into the RAM, then read the datum at address 0, then at address 64, then at address 128, and so on until address 1984 has been read; then jump to address 1, then read address 65, and so on cyclically until the datum at address 2047 has been read; the data are stored back to the position of this block. When a diagonal block is empty, such as B33 in the figure, the transposition operation is skipped.
Second, transpose the off-diagonal blocks of the square matrix. When both blocks of an off-diagonal pair are full, such as the pair B01 and B10 in the figure, they are transposed in the same way as in step 4. When one block of a pair is full and the other is not, such as the pair B02 and B20 in the figure, the transposition is achieved with address jumps over a RAM that is more than half full but not full: store the data into the RAM, read the datum at address 0, then at address 64, then at address 128, and so on until address 4032 has been read; then jump to address 1, then read address 65, and so on cyclically until the datum at address 4095 has been read, and store the data at the position of B20, which completes the transposition of the full 64*64 unit block; then read the datum at address 4096, then at address 4160, and so on until address 6080 has been read; then jump to address 4097, then read address 4161, and so on cyclically until the datum at address 6143 has been read, and store the data at the position of B02. This completes the transposition of an off-diagonal pair in which one block is full and the other is not. When one block of a pair is not full and the other is empty, such as the pair B23 and B32 in the figure, the transposition is achieved with address jumps over a half-full RAM: store the data into the RAM, read the datum at address 0, then at address 64, then at address 128, and so on until address 1984 has been read; then jump to address 1, then read address 65, and so on cyclically until the datum at address 2047 has been read, and store the data at the position of B32, at which point reading of the RAM ends. This completes the transposition of an off-diagonal pair in which one block is not full and the other is empty. When both blocks of an off-diagonal pair are empty, the transposition operation is skipped.
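The addresses quoted in step 6 follow from the same jump-read rule applied to a block that holds fewer than 64 valid rows. The sketch below assumes that a partially filled block is written into the RAM compactly, valid rows only, with 64 words per row; both helper names are hypothetical.

```python
def valid_rows_per_block(valid_rows, side, K):
    """Number of valid (non-padded) rows in each block row of the padded square."""
    return [max(0, min(K, valid_rows - top)) for top in range(0, side, K)]


def partial_jump_read_order(rows, cols, base=0):
    """Jump-read addresses for a rows*cols block stored row by row at base."""
    return [base + r * cols + c for c in range(cols) for r in range(rows)]


# Padded square of the embodiment: 2720 - 10*256 = 160 valid rows out of 256.
print(valid_rows_per_block(160, 256, 64))

half = partial_jump_read_order(32, 64)               # e.g. B22 alone in the RAM
after_full = partial_jump_read_order(32, 64, 4096)   # e.g. B20 stored after B02
print(half[31:34], half[-1])
print(after_full[:2], after_full[31:34], after_full[-1])
```

Block rows with 64, 32 and 0 valid rows correspond to the full, not-full and empty cases that the step distinguishes.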
Step 7: determine the addresses from the positions after transposition, after which the data in the SDRAM can be output. First read the first row of the first square matrix and continue, square matrix by square matrix, up to the first row of the padded square matrix (reading only the positions that hold data); then read the second row of the first square matrix and continue up to the second row of the padded square matrix (again reading only the positions that hold data); and so on until all data have been read.
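The whole flow of steps 1 to 7 can be checked end to end on a small matrix. The sketch below is a simplified model under stated assumptions: it handles only the case L >= W, transposes each square element by element instead of through the K*K block RAM, and uses `None` as the padding value. `completion_transpose` is a hypothetical name.

```python
def completion_transpose(a):
    """Pad an L*W matrix (L >= W) to whole W*W squares, transpose each square
    in place, then read the squares out row by row, skipping padding."""
    L, W = len(a), len(a[0])
    if L < W:
        raise ValueError("this sketch only covers L >= W")
    squares = -(-L // W)                                  # ceiling division
    store = [row[:] for row in a]
    store += [[None] * W for _ in range(squares * W - L)]  # padded rows
    # Transpose every W*W square in place.
    for s in range(squares):
        top = s * W
        for i in range(W):
            for j in range(i + 1, W):
                store[top + i][j], store[top + j][i] = \
                    store[top + j][i], store[top + i][j]
    # Read-out: row i of square 0, then row i of square 1, and so on; in the
    # padded square only the leading positions that hold data are read.
    out = []
    for i in range(W):
        row = []
        for s in range(squares):
            valid = min(W, L - s * W)
            row.extend(store[s * W + i][:valid])
        out.append(row)
    return out


a = [[r * 4 + c for c in range(4)] for r in range(10)]   # L=10, W=4
print(completion_transpose(a) == [list(col) for col in zip(*a)])
```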
Claims (2)
1. A completion type in-place matrix transposition method, characterized by comprising the following steps:
Step 1: store the SAR data to be transposed contiguously in a synchronous dynamic random access memory (SDRAM) and map it to a two-dimensional matrix A(L*W), where L is the number of rows of the matrix and W is the number of columns of the matrix, each being a positive integer multiple of a power of 2;
Step 2: taking the smaller of the number of rows and the number of columns of the two-dimensional matrix A as the side length, divide the two-dimensional matrix A into several square matrices, and pad the part that does not fill a square matrix so that it becomes a complete square matrix, the padded part containing no data; each square matrix is further divided into K*K matrices, where K is a divisor of the smaller of L and W;
Step 3: the K*K matrices are divided into two kinds, matrices without a padded part and matrices containing a padded part:
a. all K*K matrices on the diagonal are read, two at a time, into the random access memory module RAM for transposition, and the transposed matrices are output to the SDRAM, overwriting the original matrices;
b. all K*K matrices off the diagonal are read into the random access memory module RAM, each time as two matrices symmetric about the diagonal, for transposition, and each transposed matrix is output so as to overwrite the matrix symmetric to it about the diagonal in the SDRAM;
the operations of step 3 above are carried out on all square matrices;
Step 4: read the two-dimensional matrix obtained in step 3 out of the SDRAM, completing the whole transposition process;
in said step 3, when the K*K matrices are read into the random access memory module RAM for transposition, if neither of the two K*K matrices contains a padded part, the two matrices are first read from the SDRAM and stored in the RAM; the first K*K data of the RAM are then jump-read in the order 0, K, 2K ... K(K-1), 1, (K+1), (2K+1) ... (K(K-1)+1) ... (K-1), (2K-1) ... (K^2-1), and the last K*K data of the RAM are then jump-read in the order K^2, (K^2+K), (K^2+2K) ... (K^2+K(K-1)), (K^2+1), (K^2+(K+1)), (K^2+2K+1) ... (K^2+K(K-1)+1) ... (K^2+K-1), (K^2+2K-1) ... (K^2+K^2-1);
in said step 3, when the K*K matrices are read into the random access memory module RAM for transposition, if only one of the two K*K matrices contains both a padded part and a non-padded part, let the non-padded parts of the two matrices be C(M, N) and D(R, S) respectively, where M, N, R and S are positive integers less than or equal to K; C(M, N) and D(R, S) are first read from the SDRAM and stored in the RAM; the first M*N data are then jump-read in the order 0, N, 2N ... N(M-1), 1, (N+1), (2N+1) ... (N(M-1)+1) ... (N-1), (2N-1) ... (N*M-1), and the last R*S data of the RAM are then jump-read in the order M*N, (M*N+S), (M*N+2S) ... (M*N+S(R-1)), (M*N+1), (M*N+(S+1)), (M*N+(2S+1)) ... (M*N+(S(R-1)+1)) ... (M*N+(S-1)), (M*N+(2S-1)) ... (M*N+(S*R-1));
in said step 3, when the K*K matrices are read into the random access memory module RAM for transposition, if either of the two K*K matrices contains only a padded part, that matrix containing only a padded part is not transposed.
2. The completion type in-place matrix transposition method according to claim 1, characterized in that: in said step 4, when the transposed two-dimensional matrix A is read from the SDRAM, if the number of columns of the original two-dimensional matrix A is less than its number of rows, the transposed data are read as follows: first read the first row of the first square matrix, then the first row of the second square matrix, and so on up to the first row of the padded square matrix, of which only the positions holding data are read; then read the second row of the first square matrix, then the second row of the second square matrix, and so on in turn until the data of the whole matrix have been read; if the number of columns W of the original matrix A is greater than its number of rows L, the transposed data are read by first reading the first row of the first square matrix as the first row of the transposed matrix, then the second row of the first square matrix as the second row of the transposed matrix, and so on until the last row of the first square matrix has been read; the data of each square matrix are then read in the same way, and when the data of the padded square matrix are read, only the rows holding valid data are read.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410005244.9A CN103760525B (en) | 2014-01-06 | 2014-01-06 | Completion type in-place matrix transposition method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410005244.9A CN103760525B (en) | 2014-01-06 | 2014-01-06 | Completion type in-place matrix transposition method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103760525A CN103760525A (en) | 2014-04-30 |
CN103760525B true CN103760525B (en) | 2017-01-11 |
Family
ID=50527792
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410005244.9A Active CN103760525B (en) | 2014-01-06 | 2014-01-06 | Completion type in-place matrix transposition method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103760525B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9952831B1 (en) * | 2017-02-16 | 2018-04-24 | Google Llc | Transposing in a matrix-vector processor |
CN108169716B (en) * | 2017-11-29 | 2020-05-19 | 北京时代民芯科技有限公司 | SAR imaging system matrix transposition device based on SDRAM chip and pattern interleaving method |
CN108872990B (en) * | 2018-09-11 | 2021-03-26 | 中国科学院电子学研究所 | Real-time imaging transposition processing method for synthetic aperture radar |
CN111124300A (en) * | 2019-12-17 | 2020-05-08 | 深圳忆联信息系统有限公司 | Method and device for improving access efficiency of SSD DDR4, computer equipment and storage medium |
CN111984563B (en) * | 2020-09-18 | 2022-08-02 | 西安电子科技大学 | DDR3 read-write controller based on FPGA and matrix transposition implementation method |
CN113555051B (en) * | 2021-07-23 | 2023-04-07 | 电子科技大学 | SAR imaging data transposition processing system based on DDR SDRAM |
CN114328315A (en) * | 2021-11-22 | 2022-04-12 | 北京智芯微电子科技有限公司 | DMA-based data preprocessing method, DMA component and chip structure |
CN115248664B (en) * | 2022-09-22 | 2023-01-10 | 北京东远润兴科技有限公司 | Data reading and writing method, device, equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4370732A (en) * | 1980-09-15 | 1983-01-25 | Ibm Corporation | Skewed matrix address generator |
CN101706760A (en) * | 2009-10-20 | 2010-05-12 | 北京龙芯中科技术服务中心有限公司 | Matrix transposition automatic control circuit system and matrix transposition method |
CN102253925A (en) * | 2010-05-18 | 2011-11-23 | 江苏芯动神州科技有限公司 | Matrix transposition method |
CN103048644A (en) * | 2012-12-19 | 2013-04-17 | 电子科技大学 | Matrix transposing method of SAR (synthetic aperture radar) imaging system and transposing device |
-
2014
- 2014-01-06 CN CN201410005244.9A patent/CN103760525B/en active Active
Non-Patent Citations (1)
Title |
---|
基于SDRAM的星载SAR星上实时成像转置存储器;李早社等;《信号处理》;20070630;第23卷(第03期);433-436 * |
Also Published As
Publication number | Publication date |
---|---|
CN103760525A (en) | 2014-04-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103760525B (en) | Completion type in-place matrix transposition method | |
CN107657581B (en) | Convolutional neural network CNN hardware accelerator and acceleration method | |
EP3757901A1 (en) | Schedule-aware tensor distribution module | |
CN101751344B (en) | A compression status bit cache and backing store | |
US6636223B1 (en) | Graphics processing system with logic enhanced memory and method therefore | |
CN106683158B (en) | Modeling system of GPU texture mapping non-blocking storage Cache | |
CN103136721B (en) | In-line image rotates | |
Rupnow et al. | High level synthesis of stereo matching: Productivity, performance, and software constraints | |
CN208766715U (en) | The accelerating circuit of 3*3 convolution algorithm | |
CN104699631A (en) | Storage device and fetching method for multilayered cooperation and sharing in GPDSP (General-Purpose Digital Signal Processor) | |
US11030095B2 (en) | Virtual space memory bandwidth reduction | |
US20210295607A1 (en) | Data reading/writing method and system in 3d image processing, storage medium and terminal | |
CN103793876A (en) | Distributed tiled caching | |
CN101958112B (en) | Method for realizing rotation of handheld device screen pictures by 90 degrees and 270 degrees simultaneously | |
US10552307B2 (en) | Storing arrays of data in data processing systems | |
CN105892955A (en) | Method and equipment for managing storage system | |
CN105431831A (en) | Data access methods and data access devices utilizing the same | |
CN106846255A (en) | Image rotation implementation method and device | |
CN108139989B (en) | Computer device equipped with processing in memory and narrow access port | |
CN106530209A (en) | FPGA-based image rotation method and apparatus | |
CN104808950B (en) | Modal dependence access to in-line memory element | |
US9196014B2 (en) | Buffer clearing apparatus and method for computer graphics | |
KR20050085056A (en) | Sdram address mapping optimized for two-dimensional access | |
US11194490B1 (en) | Data formatter for convolution | |
US6900812B1 (en) | Logic enhanced memory and method therefore |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20201229 Address after: 245000 No.50, Meilin Avenue, Huangshan Economic Development Zone, Huangshan City, Anhui Province Patentee after: Huangshan Development Investment Group Co.,Ltd. Address before: No. 193, Tunxi Road, Hefei, Anhui Patentee before: Hefei University of Technology |
TR01 | Transfer of patent right |