CN103760525A

CN103760525A - Completion type in-place matrix transposition method

Info

Publication number: CN103760525A
Application number: CN201410005244.9A
Authority: CN
Inventors: 杜高明; 张多利; 宋宇鲲; 王莉莉; 尹勇生; 王晓蕾; 贾靖华
Original assignee: Hefei University of Technology
Current assignee: Huangshan Development Investment Group Co.,Ltd.
Priority date: 2014-01-06
Filing date: 2014-01-06
Publication date: 2014-04-30
Anticipated expiration: 2034-01-06
Also published as: CN103760525B

Abstract

The invention discloses a completion type in-place matrix transposition method. A two-dimensional matrix A is divided into multiple square matrices whose side length is the smaller of the row count and the column count; any remainder that is too small to form a square matrix is completed into a full square matrix, and the completion part contains no data. Each square matrix is partitioned again into K*K matrices; all K*K matrices on the diagonal and off the diagonal are read into a RAM module for transposition; the two-dimensional matrix obtained by the transposition is then read from the SDRAM and output in sequence to obtain the final transposed matrix. The method is suitable for matrices in which the smaller of the row count and the column count is a positive integer multiple of a power of two. It can effectively improve the utilization efficiency of SDRAM-type memories and the processing performance of digital signal processing applications that transpose large matrices.

Description

A completion type in-place matrix transposition method

Technical field

The invention belongs to the technical field of digital signal processing and relates to a completion type in-place matrix transposition method.

Background art

Large matrix transposition is very common in data-intensive applications. Taking a synthetic aperture radar (SAR) imaging system as an example, the data in the imaging processor are arranged in range order before azimuth compression, whereas azimuth compression is carried out in the direction perpendicular to range compression, so a matrix transposition must be performed between the two processing stages. For digital signal processing applications with very large data volumes, using random access memory (RAM) or static random access memory (SRAM) as the transposition memory has drawbacks such as limited capacity and high cost. Therefore, large-capacity second-generation double data rate synchronous dynamic random access memory (DDR2 SDRAM) or third-generation double data rate synchronous dynamic random access memory (DDR3 SDRAM) is often used as the transposition memory.

The invention with application number 201010174342.7 discloses "A matrix transposition method", whose principle is to divide the matrix into small blocks and transpose them using a suitable number of registers. Its shortcoming is that, although it improves the execution speed of matrix transposition compared with conventional register-based approaches, it requires additional storage space, which reduces the utilization of storage resources when the matrix is large.

The invention with application number 200910236075.9 discloses "Matrix transposition automatic control circuit system and matrix transposition method". Its shortcoming is that the method likewise requires an extra matrix storage space and cannot transpose in place. In addition, the configuration information of the device comes from a processing core or DMA and is set over a configuration bus, which adds hardware overhead.

The invention patent with application number 2012105538360.9 discloses "Matrix transposition method and transposition device for a synthetic aperture radar imaging system". Like the block-mapping transposition algorithm proposed in the master's thesis "Design and implementation of the acquisition and transposition module of a SAR real-time imaging processor", its principle is to divide the DDR2/DDR3 memory into multiple storage blocks of identical size. Each storage block stores the image data of one range line, and the data of different range lines are stored in the address spaces of different segments. This allocation places each column of azimuth data at the same position in every block, so reading one azimuth line only requires reading the data at the same position of each block. A single row-activation operation of the block-mapping transposition storage algorithm can read multiple data, which enables fast row-write, column-read operation. Its shortcoming is that, of the data fetched by the burst transfer in each clock cycle, only one datum is valid, so the data efficiency is low.

The invention patent with application number 201110122834.6 discloses "FPGA-based data transposition method for SAR imaging signal processing", which proposes a block matrix transposition algorithm: the matrix data are divided into symmetric-type matrices, symmetric off-diagonal-type matrices, and asymmetric off-diagonal-type matrices, and block transposition is performed on each of the three types. Its shortcomings are as follows. First, although it achieves in-place transposition of the square data within a matrix, it gives no practicable technical scheme for large matrices whose row and column counts differ greatly and which can be divided into multiple square matrices. Second, the square data within the matrix are transposed block by block, one block at a time, so the transposition efficiency still has room for improvement.

Summary of the invention

The object of the present invention is to provide a completion type in-place matrix transposition method that addresses the following problems of the prior art: low utilization of matrix storage space, transposition efficiency that needs improvement, and, in particular, inadequate support for transposing large matrices whose row and column counts differ greatly.

The technical solution adopted by the present invention is carried out according to the following steps:

Step 1: the data to be transposed are stored contiguously in a synchronous dynamic random access memory (SDRAM) and mapped to a two-dimensional matrix A (L*W), where L is the number of rows of the matrix and W is the number of columns, and the smaller of the two is a positive integer multiple of a power of two;

Step 2: the two-dimensional matrix A is divided into several square matrices whose side length is the smaller of the row count and the column count of A; the remainder that is too small to form a square matrix is completed into a full square matrix, and the completion part contains no data; each square matrix is then divided into K*K matrices, where K is a divisor of the smaller of L and W;

Step 3: the K*K matrices fall into two kinds, those that contain no completion part and those that contain a completion part:

a. all K*K matrices on the diagonal are read into the RAM module and transposed, and the transposed matrices are written back over the original matrices in the SDRAM;

b. all K*K matrices off the diagonal are read into the RAM module and transposed, and each transposed matrix is written over the matrix in the SDRAM that is symmetric to it about the diagonal;

The operations of step 3 above are performed on all square matrices;

Step 4: the two-dimensional matrix obtained in step 3 is read from the SDRAM and output, completing the whole transposition process.
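Steps 1 to 4 can be illustrated in software. The following Python sketch is only a model under stated assumptions, not the hardware scheme of the invention: the SDRAM is modelled as a list of completed squares, the names `completion_transpose` and `_PAD` are hypothetical, the number of squares is taken as the ceiling of the longer side divided by the shorter side, and completion cells are swapped along with data rather than skipped.

```python
_PAD = object()  # hypothetical marker for completion cells that hold no data


def completion_transpose(a, k):
    """Transpose matrix `a` (a list of rows) via completed squares and K*K blocks."""
    rows, cols = len(a), len(a[0])
    side = min(rows, cols)
    if side % k != 0:
        raise ValueError("K must divide the smaller of L and W")
    tall = rows >= cols
    n_squares = -(-max(rows, cols) // side)  # ceiling division (assumption)

    # Steps 1-2: lay the data out as squares and complete the last one.
    squares = []
    for q in range(n_squares):
        sq = [[_PAD] * side for _ in range(side)]
        for i in range(side):
            for j in range(side):
                r, c = (q * side + i, j) if tall else (i, q * side + j)
                if r < rows and c < cols:
                    sq[i][j] = a[r][c]
        squares.append(sq)

    # Step 3: transpose every square in place, one K*K block (pair) at a time.
    nb = side // k
    for sq in squares:
        for bi in range(nb):
            for bj in range(bi, nb):
                for i in range(k):
                    # A diagonal block only swaps cells above its own diagonal.
                    start = i + 1 if bi == bj else 0
                    for j in range(start, k):
                        r1, c1 = bi * k + i, bj * k + j
                        sq[r1][c1], sq[c1][r1] = sq[c1][r1], sq[r1][c1]

    # Step 4: read out, skipping completion cells.
    if tall:
        return [[x for sq in squares for x in sq[i] if x is not _PAD]
                for i in range(side)]
    return [row for sq in squares for row in sq if row[0] is not _PAD]
```

For a 5*2 input the sketch builds three 2*2 squares, the last of which has one completion row, and the row-interleaved readout yields the 2*5 transpose.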

The present invention is further characterized as follows.

In step 3, when neither of the two K*K matrices read into the RAM module for transposition contains a completion part, the two matrices are first read from the SDRAM and stored in the RAM. The first K*K data in the RAM are then jump-read in the order 0, K, 2K, ..., K(K-1), 1, (K+1), (2K+1), ..., (K(K-1)+1), ..., (K-1), (2K-1), ..., (K²-1). The last K*K data in the RAM are then jump-read in the order K², (K²+K), (K²+2K), ..., (K²+K(K-1)), (K²+1), (K²+(K+1)), (K²+2K+1), ..., (K²+K(K-1)+1), ..., (K²+K-1), (K²+2K-1), ..., (K²+K²-1).

In step 3, when a K*K matrix of the pair contains both a completion part and a non-completion part, let the non-completion parts of the two matrices be C(M, N) and D(R, S), where M, N, R and S are positive integers less than or equal to K. C(M, N) and D(R, S) are first read from the SDRAM and stored in the RAM. The first M*N data are then jump-read in the order 0, N, 2N, ..., N(M-1), 1, (N+1), (2N+1), ..., (N(M-1)+1), ..., (N-1), (2N-1), ..., (N*M-1). The last R*S data in the RAM are then jump-read in the order M*N, (M*N+S), (M*N+2S), ..., (M*N+S(R-1)), (M*N+1), (M*N+(S+1)), (M*N+(2S+1)), ..., (M*N+(S(R-1)+1)), ..., (M*N+(S-1)), (M*N+(2S-1)), ..., (M*N+(S*R-1)).

If either of the two K*K matrices contains only the completion part, that matrix is not transposed.

In step 4, the transposed two-dimensional matrix A is read from the SDRAM as follows. If the column count of the original matrix A is less than its row count, the first row of the first square matrix is read first, then the first row of the second square matrix, and so on up to the first row of the completed square matrix, of which only the positions that hold data are read; then the second row of the first square matrix is read, then the second row of the second square matrix, and so on in turn until the data of the whole matrix have been read. If the column count W of the original matrix A is greater than its row count L, the first row of the first square matrix is read as the first row of the transposed matrix, the second row of the first square matrix is read as the second row of the transposed matrix, and so on up to the last row of the first square matrix; the data of each subsequent square matrix are read in the same way, and when the completed square matrix is read, only the rows that hold valid data are read.
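The jump-read orders above amount to reading a row-major block column by column. A minimal Python sketch follows; the function names `jump_read_order` and `transpose_pair_in_ram` are hypothetical, and the RAM is modelled as a flat list in which block C is stored first and block D directly after it.

```python
def jump_read_order(m, n, base=0):
    """Addresses that read an m-row, n-column row-major block column by column.

    With m = n = K and base = 0 this is 0, K, 2K, ..., K(K-1), 1, K+1, ..., K*K-1.
    """
    return [base + col + n * row for col in range(n) for row in range(m)]


def transpose_pair_in_ram(ram, m, n, r, s):
    """Jump-read block C (m*n values) and then block D (r*s values) from `ram`."""
    order = jump_read_order(m, n) + jump_read_order(r, s, base=m * n)
    return [ram[addr] for addr in order]
```

Writing the values back row by row in the order they are read gives the transposed block, which is why no data movement inside the RAM is needed.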

Another object of the present invention is to provide a device for the completion type in-place matrix transposition method. The device comprises a control logic module, which is connected to a read-SDRAM address generation module, a write-SDRAM address generation module and a transposition RAM module; the transposition RAM module is connected to a data input FIFO module and a data output FIFO module.

The read-SDRAM address generation module comprises a read address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square matrix counter, a stored-data count counter, a cross-address count counter, an SDRAM read-row selector, an SDRAM read-column selector, and row/column concatenation that yields the SDRAM read address. Once the transposition operation can proceed, the read address state machine starts, and the in-block row counter and the in-block column counter produce the row and column addresses within the current block. According to the state of the read address state machine, the module counts the row coordinate and column coordinate of the current block and the number of square matrices already processed, and records the number of data to be stored this time and the cross-address count. The SDRAM read address is calculated from the index of the current square matrix, the block row coordinate, the block column coordinate, the in-block row coordinate and the in-block column coordinate.

The write-SDRAM address generation module comprises a write address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square matrix counter, an SDRAM write-row selector, an SDRAM write-column selector, and row/column concatenation that yields the SDRAM write address. After this process starts, the write address state machine starts, and the in-block row counter and the in-block column counter produce the row and column addresses within the current block. According to the state of the write address state machine, the module counts the row coordinate and column coordinate of the current block and the number of square matrices already processed. The SDRAM write address is calculated from the index of the current square matrix, the block row coordinate, the block column coordinate, the in-block row coordinate and the in-block column coordinate.

The transposition RAM module comprises an address generation state machine, a counter of the number of jump reads performed after each base address is set, a counter of the base address that must be reset each time, and transposition RAM read/write address generation. The data of each small block are first stored sequentially in the transposition RAM; after storage is complete, the read address of the RAM is determined from the address generation state machine, the current base address count, and the count of jump reads performed after the base address was set.

The beneficial effects of the invention are as follows: the transposition efficiency is high; the design is based on SDRAM memory; the method is suitable for transposing matrices of arbitrary scale; the utilization efficiency of SDRAM-type memories can be effectively improved; and the processing performance of digital signal processing applications that transpose large matrices is improved.

Brief description of the drawings

Fig. 1 is a diagram of the matrix partitioning before transposition;

Fig. 2 is a diagram of the matrix after transposition;

Fig. 3 is a hardware implementation diagram of the SDRAM-based data block transposition method;

Fig. 4 is a hardware implementation diagram of the read-SDRAM address generation module;

Fig. 5 shows the read-SDRAM address generation state machine;

Fig. 6 is a hardware implementation diagram of the transposition RAM module;

Fig. 7 shows the RAM address generation state machine;

Fig. 8 is a hardware implementation diagram of the write-SDRAM address generation module.

Detailed description of the embodiments

The present invention is described in detail below with reference to the drawings and specific embodiments.

As shown in Fig. 1, the method is carried out according to the following steps. Step 1: the data to be transposed are stored contiguously in the synchronous dynamic random access memory (SDRAM) in batches and mapped to a two-dimensional matrix A (L*W), where L is the number of rows of the matrix and W is the number of columns, and the smaller of the two is a positive integer multiple of a power of two; a. the row count and the column count of the two-dimensional matrix are compared, the matrix is divided into several square matrices whose side length is the smaller of the two, and the remainder that is too small to form a square matrix is completed into a square matrix; b. the square matrices of the previous step are partitioned again with K*K matrices, which are classified as diagonal-type matrices and off-diagonal-type matrices; K is a positive integer and must be a divisor of the smaller of L and W;

Step 2: for each square matrix, all diagonal-type K*K matrices that contain no completion part are transposed first, using address jumps to read the data in the random access memory (RAM) every K addresses. Reading the RAM cell data every K addresses means: the data at address 0 are read first, then the data at address K, then the data at address 2K, and so on up to address K*(K-1); the read then jumps to address 1, then to address K+1, and continues cyclically until the data at address K*K-1 have been read. The data read out are written back row by row into the matrix they came from, which completes the transposition of this K*K unit block. Next, the off-diagonal-type matrices that contain no completion part are transposed. Each time, two matrices that are symmetric about the diagonal are read; as above, each matrix is read from the RAM using address jumps every K addresses, and the data read out are written row by row into the symmetric matrix, which achieves the transposition;

The operations of step 3 above are performed on all square matrices;

Step 4: the two-dimensional matrix obtained in step 3 is read from the SDRAM and output, completing the whole transposition process.

In step 3, when neither of the two K*K matrices read into the RAM module for transposition contains a completion part, the two matrices are first read from the SDRAM and stored in the RAM. The first K*K data in the RAM are then jump-read in the order 0, K, 2K, ..., K(K-1), 1, (K+1), (2K+1), ..., (K(K-1)+1), ..., (K-1), (2K-1), ..., (K²-1). The last K*K data in the RAM are then jump-read in the order K², (K²+K), (K²+2K), ..., (K²+K(K-1)), (K²+1), (K²+(K+1)), (K²+2K+1), ..., (K²+K(K-1)+1), ..., (K²+K-1), (K²+2K-1), ..., (K²+K²-1).

In step 3, when a K*K matrix of the pair contains both a completion part and a non-completion part, let the non-completion parts of the two matrices be C(M, N) and D(R, S), where M, N, R and S are positive integers less than or equal to K. C(M, N) and D(R, S) are first read from the SDRAM and stored in the RAM. The first M*N data are then jump-read in the order 0, N, 2N, ..., N(M-1), 1, (N+1), (2N+1), ..., (N(M-1)+1), ..., (N-1), (2N-1), ..., (N*M-1). The last R*S data in the RAM are then jump-read in the order M*N, (M*N+S), (M*N+2S), ..., (M*N+S(R-1)), (M*N+1), (M*N+(S+1)), (M*N+(2S+1)), ..., (M*N+(S(R-1)+1)), ..., (M*N+(S-1)), (M*N+(2S-1)), ..., (M*N+(S*R-1)).

In step 3, if either of the two K*K matrices contains only the completion part, that matrix is not transposed.

In step 4, the transposed two-dimensional matrix A is read from the SDRAM as follows. If the column count of the original matrix A is less than its row count, the first row of the first square matrix is read first, then the first row of the second square matrix, and so on up to the first row of the completed square matrix, of which only the positions that hold data are read; then the second row of the first square matrix is read, then the second row of the second square matrix, and so on in turn until the data of the whole matrix have been read. If the column count W of the original matrix A is greater than its row count L, the first row of the first square matrix is read as the first row of the transposed matrix, the second row of the first square matrix is read as the second row of the transposed matrix, and so on up to the last row of the first square matrix; the data of each subsequent square matrix are read in the same way, and when the completed square matrix is read, only the rows that hold valid data are read.

As shown in Fig. 3, the transposition device provided by the invention comprises a control logic module 1, which is connected to a read-SDRAM address generation module 2, a write-SDRAM address generation module 3 and a transposition RAM module 4; the transposition RAM module 4 is connected to a data input FIFO module 5 and a data output FIFO module 6. The data path is as follows: the external SDRAM outputs data to the data input FIFO module 5, from which they enter the transposition RAM module 4; the data are written sequentially on input and jump-read on output; the data output by the transposition RAM module 4 pass to the data output FIFO module 6 and are written into the SDRAM. The paths of the control logic module 1 comprise the read-SDRAM address generation module 2, the read SDRAM port signals, the read/write flags of the input FIFO, the read/write address control of the transposition RAM, the read/write flags of the transposition RAM, the read/write flags of the output FIFO, the write-SDRAM address generation module 3, and the write SDRAM port signals.

As shown in Fig. 4, the read-SDRAM address generation module 2 comprises a read address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square matrix counter, a stored-data count counter, a cross-address count counter, an SDRAM read-row selector, an SDRAM read-column selector, and row/column concatenation that yields the SDRAM read address. The read address generation state machine controls the progress of the whole transposition flow. The in-block row counter holds the row position within the block being read, and the in-block column counter holds the column position within the block being read. The block row-coordinate counter holds the row position of the block being read, and the block column-coordinate counter holds its column position. The square matrix counter holds the index of the square matrix currently being transposed. The stored-data count counter holds the number of valid data in the two diagonal-type or the two off-diagonal-type matrices. The cross-address count counter holds the number of columns of valid data in the current block, which is the stride count that the transposition RAM module 4 uses for jump reads. The SDRAM read-row selector selects the SDRAM row to read: for the first matrix of a pair, the row to read is calculated from the row and column coordinates; for the second matrix of a pair, the row and column coordinates are exchanged and the row to read is calculated for the second matrix. The SDRAM read-column selector works on the same principle as the SDRAM read-row selector. Finally, the SDRAM read address is obtained by concatenating the row and the column.

The read-SDRAM address generation state machine is shown in Fig. 5 and comprises six states: the initial state; the state of reading the first matrix in each two-matrix transposition of diagonal-type matrices (diagonal first-matrix state); the state of reading the second matrix in each two-matrix transposition of diagonal-type matrices (diagonal second-matrix state); the state of reading the first matrix in each two-symmetric-matrix transposition of off-diagonal-type matrices (off-diagonal first-matrix state); the state of reading the symmetric matrix in each two-symmetric-matrix transposition of off-diagonal-type matrices (off-diagonal second-matrix state); and the state of waiting for the matrix block transposition to complete (waiting state).

The state machine is first reset to the initial state. When condition 1 holds, namely initialization is complete and the data have been fully stored, the next state is the diagonal first-matrix state; otherwise the machine stays in the initial state.

When the current state is the diagonal first-matrix state: if condition 2 holds, namely the current diagonal-type small block is a block that contains completion and its last valid datum has been read, the next state is the waiting state; if condition 3 holds, namely the data of the first small block have been read, the next state is the diagonal second-matrix state; if neither condition 2 nor condition 3 holds, the next state remains the diagonal first-matrix state.

When the current state is the diagonal second-matrix state: if condition 4 holds, namely the current diagonal-type small block is a block that contains completion and its last valid datum has been read, or the data of the second small block have been read, the next state is the waiting state; otherwise the next state remains the diagonal second-matrix state.

When the current state is the waiting state: if condition 5 holds, namely all small blocks have been transposed, the next state is the initial state; if condition 6 holds, namely the previous small-block transposition has completed and the diagonal-type matrices of the current square matrix have all been transposed, the next state is the off-diagonal first-matrix state; if condition 7 holds, namely the previous small-block transposition has completed and the diagonal-type matrices of the current square matrix have not yet all been transposed, the next state is the diagonal first-matrix state; if none of conditions 5, 6 and 7 holds, the next state remains the waiting state.

When the current state is the off-diagonal first-matrix state: if condition 8 holds, namely the current small block is a block that contains completion and its last valid datum has been read, or the small block symmetric to the current block lies in the completion part and therefore holds no data, the next state is the waiting state; if condition 9 holds, namely the data of the first small block have been read, the next state is the off-diagonal second-matrix state; if neither condition 8 nor condition 9 holds, the next state remains the off-diagonal first-matrix state.

When the current state is the off-diagonal second-matrix state: if condition 10 holds, namely the current small block is a block that contains completion and its last valid datum has been read, or the data of the second small block have been read, the next state is the waiting state; otherwise the next state remains the off-diagonal second-matrix state.
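The address calculation performed by the read-row selector, the read-column selector and the row/column concatenation can be sketched as follows. This is only an illustration under assumptions that the text does not state: the square matrices are stacked along the row direction (the case where W is the smaller side), the column field is `col_bits` wide, and the function and parameter names are hypothetical.

```python
def sdram_read_address(square, block_row, block_col, in_row, in_col,
                       side, k, col_bits, second_of_pair=False):
    """Concatenate an SDRAM row and column computed from the counters.

    Assumed layout: square `square` occupies SDRAM rows
    square*side .. square*side + side - 1, and each row holds `side` columns.
    """
    if second_of_pair:
        # The second matrix of a pair uses exchanged block coordinates.
        block_row, block_col = block_col, block_row
    row = square * side + block_row * k + in_row
    col = block_col * k + in_col
    return (row << col_bits) | col  # row/column concatenation
```

For example, with `side=8`, `k=4` and `col_bits=3`, the block at block coordinates (0, 1) of square 1 and its symmetric partner at (1, 0) are addressed by the same call with `second_of_pair` toggled.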

As shown in Fig. 6, the transposition RAM module 4 comprises an address generation state machine, a counter of the number of jump reads performed after each base address is set, a counter of the base address that must be reset each time, and transposition RAM read/write address generation. The address generation state machine controls the whole read/write flow of the transposition RAM. The counter of the number of jump reads after each base address is set holds the number of jump reads performed so far. The counter of the base address that must be reset each time holds the base address to be reset. The transposition RAM read/write address generation determines the address in the transposition RAM from the results of the three units above.

The transposition RAM module 4 implements sequential writing and jump reading of data. It contains a RAM address generation state machine, shown in Fig. 7, with five states: the initial state, the sequential write state, the address zeroing state, jump-read state 1 and jump-read state 2. The state transitions, also shown in Fig. 7, are as follows. The state machine is reset to the initial state. When condition 1 holds, namely the read of the FIFO is valid, the next state is the sequential write state; otherwise the next state remains the initial state. When the current state is the sequential write state: if condition 2 holds, namely the address equals the number of data that this small block needs to store, the next state is the address zeroing state; otherwise the next state remains the sequential write state. When the current state is the address zeroing state, the state changes to jump-read state 1 in the next clock cycle. When the current state is jump-read state 1: if condition 3 holds, namely more than one small block of data was stored and the address has reached the data of one small block, the next state is jump-read state 2; if condition 4 holds, namely no more than one small block of data was stored and the address is at the last datum, the next state is the initial state; if neither condition 3 nor condition 4 holds, the next state remains jump-read state 1. When the current state is jump-read state 2: if condition 5 holds, namely the data beyond the first small block have all been processed, the next state is the initial state; otherwise the next state remains jump-read state 2.
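The five-state machine of the transposition RAM module can be summarized as a next-state function. The sketch below is a behavioural Python model, not register-transfer code; the state names and the boolean inputs (`fifo_valid`, `write_done`, `first_block_done`, `two_blocks`, `all_read`) are hypothetical stand-ins for conditions 1 to 5.

```python
from enum import Enum, auto


class RamState(Enum):
    INITIAL = auto()
    SEQ_WRITE = auto()
    ADDR_ZERO = auto()
    JUMP_READ_1 = auto()
    JUMP_READ_2 = auto()


def next_ram_state(state, fifo_valid=False, write_done=False,
                   first_block_done=False, two_blocks=False, all_read=False):
    """Return the next state of the transposition RAM address state machine."""
    if state is RamState.INITIAL:
        # Condition 1: the FIFO read is valid.
        return RamState.SEQ_WRITE if fifo_valid else state
    if state is RamState.SEQ_WRITE:
        # Condition 2: all data of this transfer have been written.
        return RamState.ADDR_ZERO if write_done else state
    if state is RamState.ADDR_ZERO:
        # Unconditional: takes exactly one clock cycle.
        return RamState.JUMP_READ_1
    if state is RamState.JUMP_READ_1:
        if first_block_done:
            # Condition 3 (two blocks stored) or condition 4 (one block stored).
            return RamState.JUMP_READ_2 if two_blocks else RamState.INITIAL
        return state
    if state is RamState.JUMP_READ_2:
        # Condition 5: the second block has been handled completely.
        return RamState.INITIAL if all_read else state
    raise ValueError(f"unknown state: {state}")
```

Merging conditions 3 and 4 behind a single `first_block_done` input is a simplification made for this sketch; a hardware implementation would derive both conditions from the address and data counters.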

As shown in Fig. 8, the write-SDRAM address generation module 3 corresponds to the read-SDRAM address generation module 2, and the function of each of its sub-modules is the same as that of the corresponding sub-module in the read-SDRAM address generation module 2 described above. Its state machine comprises seven states: the initial state; the state of writing the first matrix in each two-matrix transposition of diagonal-type matrices (diagonal first-matrix state); the state of the second matrix in each two-matrix transposition of diagonal-type matrices (diagonal second-matrix state); the state of the first matrix in each two-symmetric-matrix transposition of off-diagonal-type matrices (off-diagonal first-matrix state); the state of the symmetric matrix in each two-symmetric-matrix transposition of off-diagonal-type matrices (off-diagonal second-matrix state); waiting-for-stage-start state 1; and waiting-for-stage-start state 2.

The state machine is first reset to the initial state. When condition 1 holds, namely the start signal of this stage is true, the next state is the diagonal first-matrix state; otherwise the machine stays in the initial state.

When the current state is the diagonal first-matrix state: if condition 2 holds, namely the current diagonal-type small block is a block that contains completion and its last valid datum has been written, the next state is waiting-for-stage-start state 1; if condition 3 holds, namely the data of the first small block have been written, the next state is the diagonal second-matrix state; if neither condition 2 nor condition 3 holds, the next state remains the diagonal first-matrix state.

When the current state is the diagonal second-matrix state: if condition 4 holds, namely the current diagonal-type small block is a block that contains completion and its last valid datum has been written, or the data of the second small block have been written, the next state is waiting-for-stage-start state 1; otherwise the next state remains the diagonal second-matrix state.

When the current state is waiting-for-stage-start state 1, the state changes to waiting-for-stage-start state 2 in the next clock cycle.

When the current state is waiting-for-stage-start state 2: if condition 5 holds, namely all small blocks have been transposed, the next state is the initial state; if condition 6 holds, namely this stage has started and the diagonal-type matrices of the current square matrix have all been transposed, the next state is the off-diagonal first-matrix state; if condition 7 holds, namely this stage has started and the diagonal-type matrices of the current square matrix have not yet all been transposed, the next state is the diagonal first-matrix state; if none of conditions 5, 6 and 7 holds, the next state remains waiting-for-stage-start state 2.

When the current state is the off-diagonal first-matrix state: if condition 8 holds, namely the current small block is a block that contains completion and its last valid datum has been written, or the small block symmetric to the current block lies in the completion part and therefore holds no data, the next state is waiting-for-stage-start state 1; if condition 9 holds, namely the data of the first small block have been written, the next state is the off-diagonal second-matrix state; if neither condition 8 nor condition 9 holds, the next state remains the off-diagonal first-matrix state.

When the current state is the off-diagonal second-matrix state: if condition 10 holds, namely the current small block is a block that contains completion and its last valid datum has been written, or the data of the second small block have been written, the next state is waiting-for-stage-start state 1; otherwise the next state remains the off-diagonal second-matrix state.

The present invention supports in-place transposition and is suited to matrices whose numbers of rows and columns differ greatly. It proposes a completion-type (padded) in-place transposition method and implements the method in hardware, so that the transposition device uses memory more efficiently and thereby meets the requirement for efficient use of memory resources in digital signal processing.

The beneficial effects of the present invention are analyzed quantitatively as follows:

Storage efficiency analysis:

For a matrix of size L*W in which W is the smaller dimension, the storage space that the present invention must reserve in matrix memory, including the space occupied by the original matrix, is ([L/W]+1)*W*W, where [] denotes taking the integer part.

Taking L as 160, 544, 1312 and 3232 in turn, with W = 64, gives the comparison of transposition storage space between a non-in-place transposition algorithm and the algorithm of the present invention shown in Table 1.

Table 1

As the table shows, for the same matrix size the matrix storage space required by the present invention is smaller than that of the non-in-place transposition algorithm, and as the matrix size increases, the space saved approaches 50%.
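The storage formula can be evaluated directly for the four matrix sizes mentioned. The sketch below is illustrative only: the function names are hypothetical, and the cost of the non-in-place algorithm is assumed to be 2*L*W elements (a source buffer plus a destination buffer), which the text does not state. The printed figures come from the formula, not from Table 1.

```python
def inplace_storage(L, W):
    """Elements reserved by the padded in-place method: ([L/W] + 1) * W * W.

    W is the smaller dimension and [] is taken as integer division.
    """
    return (L // W + 1) * W * W


def out_of_place_storage(L, W):
    """ASSUMPTION: a non-in-place transpose keeps source and destination copies."""
    return 2 * L * W


W = 64
for L in (160, 544, 1312, 3232):
    a = inplace_storage(L, W)
    b = out_of_place_storage(L, W)
    # Fraction of storage saved relative to the assumed out-of-place cost.
    print(f"L={L:5d} W={W}: in-place={a:7d} out-of-place={b:7d} saved={1 - a / b:.1%}")
```

Under that assumption the saving grows with L while staying below 50%, which is consistent with the trend described above.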

Efficiency analysis of the transposition algorithm:

For the efficiency analysis, let the matrix size be L*W with W the smaller dimension, let each read that crosses into a new row cost an additional m clock cycles (the row-activation time differs slightly between SDRAM types), and let the block size be K. The number of clock cycles the present invention spends to transpose matrix A(L, W) is:

T1 = L*m + (L*W)/4
   + (W/K)*L*m + (L*W)/4
   + (W/K)*([L/W]*W)*m + W*([(L-[L/W]*W)/K]+1)*m + (L*W)/4
   + ([L/W]+1)*W*m + (L*W)/4,

If the matrix size is 2720*256, the number of clock cycles the present invention consumes in theory is (the coefficients correspond to K = 64, with k denoting 1024):

T1 = (2.65625m+170)k + (10.625m+170)k + (10.75m+170)k + (2.75m+170)k = (26.78125m+680)k,

With the block-mapped storage transposition algorithm proposed in the existing work "Design and Implementation of the Acquisition and Transposition Module of a SAR Real-Time Imaging Processor", the total number of clock cycles consumed is (assuming a block size of 16*16):

T2 = (42.5m+170)k + (42.5m+680)k = (85m+850)k;

Comparing the two: T1 - T2 = -(58.21875m+170)k. The number of clock cycles spent by the present invention is therefore far smaller than that of the method proposed in "Design and Implementation of the Acquisition and Transposition Module of a SAR Real-Time Imaging Processor", so the invention has good transposition efficiency.
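The arithmetic of this comparison can be checked with a few lines of code. This is a sketch under stated assumptions: the function name `t1_cycles` is hypothetical, K is assumed to divide W, [] is taken as integer division, k is taken as 1024, and T2 is used only in the numeric form given in the text, since its general formula is not stated.

```python
def t1_cycles(L, W, K, m):
    """Clock cycles from the four-term expression for the padded in-place method."""
    q = L // W                      # [L/W]: number of complete W*W square matrices
    transfer = L * W / 4            # the (L*W)/4 term that appears in every line
    terms = [
        L * m + transfer,
        (W / K) * L * m + transfer,
        (W / K) * (q * W) * m + W * ((L - q * W) // K + 1) * m + transfer,
        (q + 1) * W * m + transfer,
    ]
    return sum(terms)


k = 1024
L, W, K = 2720, 256, 64

constant = t1_cycles(L, W, K, 0)              # cycles that do not depend on m
slope = t1_cycles(L, W, K, 1) - constant      # cycles added per unit of m
print(slope / k, constant / k)                # coefficients of m and 1, in units of k

m = 8                                         # illustrative value only
t2 = (85 * m + 850) * k                       # numeric T2 as given in the text
print((t1_cycles(L, W, K, m) - t2) / k)       # negative: T1 is the smaller count
```

The value m = 8 is chosen only to exercise the formula; the actual row-crossing penalty depends on the SDRAM device.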

A specific embodiment is described below:

Specific embodiment 1:

Let the original matrix have L = 2720 rows and W = 256 columns, and let the sub-blocks into which it is divided have K = 64 rows and columns. The transposition method is as follows:

Step 1: store the data to be transposed contiguously in SDRAM in batches, first mapping the data to a 2720*256 two-dimensional matrix;

Step 2: compare the number of rows and the number of columns of the matrix, pad the mapped two-dimensional matrix so that it can be divided into several square matrices, and then partition it into blocks. With reference to Fig. 1, this step is implemented as follows:

(2a) Taking 256*256 as the square-matrix size, pad the matrix by appending a 96*256 rectangle below it, so that the whole matrix can be divided into 11 square matrices;

(2b) Taking 64*64 as the unit block, partition each square matrix into diagonal blocks and off-diagonal blocks;
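The padding arithmetic of Step 2 can be reproduced as follows. This is a minimal sketch with a hypothetical function name, assuming W is the smaller dimension and K divides W; it follows the ([L/W]+1) square count used in the storage analysis.

```python
def plan_padding(L, W, K):
    """Return (number of squares, padded rows, valid rows per block-row of the last square)."""
    full_squares, rem = divmod(L, W)   # rem = valid rows left for the padded square
    n_squares = full_squares + 1       # ([L/W] + 1) squares of size W*W
    pad_rows = n_squares * W - L       # empty rows appended below the matrix
    # For each row of K*K blocks in the padded square, count the rows holding data:
    # K means full, 0 means empty, anything in between means partly filled.
    valid_rows = [min(max(rem - i * K, 0), K) for i in range(W // K)]
    return n_squares, pad_rows, valid_rows


def is_diagonal_block(i, j):
    """A K*K block at block coordinates (i, j) of a square is diagonal when i == j."""
    return i == j


print(plan_padding(2720, 256, 64))
```

For the embodiment this reproduces the 11 square matrices and the 96 padded rows, and shows that in the padded square the first two block-rows are full, the third is half filled and the fourth is empty.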

Step 3: use a RAM unit to perform the transposition by means of address jumps. The RAM has size 2*64*64, so two 64*64 blocks can be transposed at a time. The diagonal blocks (A00, A11, A22, A33 in Fig. 1) are transposed first. The two diagonal blocks A00 and A11 are read into the RAM, and the RAM is then read back with address jumps. To transpose A00: read the datum at address 0, then the datum at address 64, then the datum at address 128, and so on up to address 4032; then jump to address 1, then read address 65, and so on, cycling until the datum at address 4095 has been read; the data read out are written back to A00 row by row. To transpose A11: first read the datum at the first address of A11, i.e. address 4096, then the datum at address 4160, and so on up to address 8128; then jump to address 4097, and cycle in the same way until the datum at address 8191, the last datum of block A11, has been read; the data read out are written back to the position of A11 row by row. Finally, the preceding two operations are applied to every two blocks in turn, which involves the following address jumps:

The address jump at a line change inside a unit block;

The jump from the end address of the first block to the start address of the second block within a two-block combination;

The jump from the end address of the second block to the start address of the first block within a two-block combination;

The address jump between different combinations, i.e. from the last address of one two-block combination to the start address of the next two-block combination.
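The jump-read order used in Step 3 can be generated programmatically. The sketch below is a software model only, with a hypothetical function name; it assumes each block is stored row by row in the RAM starting at a base address.

```python
def jump_read_order(base, rows, cols):
    """Addresses to read, in order, so that a rows*cols block stored row by row
    starting at `base` comes out in transposed (column-major) order."""
    return [base + r * cols + c for c in range(cols) for r in range(rows)]


K = 64
first_block = jump_read_order(0, K, K)        # block held in the first half of the RAM
second_block = jump_read_order(K * K, K, K)   # block held in the second half

print(first_block[:3], first_block[K - 1], first_block[K])
print(second_block[:2], second_block[K - 1], second_block[K])

# Small demonstration on a 2*2 block [[a, b], [c, d]] stored as [a, b, c, d].
ram = ["a", "b", "c", "d"]
print([ram[addr] for addr in jump_read_order(0, 2, 2)])
```

Writing the values back row by row in the order they were read yields the transposed block, which is why no second buffer of block size is needed.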

Step 4: transpose the off-diagonal blocks, i.e. transpose the block pairs A01 and A10, A02 and A20, and so on in the figure. Using the RAM unit, store in the RAM two unit blocks that are symmetric about the diagonal of the square matrix, read the data in the RAM with address jumps, and write them to the symmetric positions, thereby exchanging the data of the two blocks. Taking the off-diagonal pair A01, A10 as an example: first, store the data of the symmetric blocks A01 and A10 in the RAM unit row by row; then, following the address-jump pattern of Step 3, write the first transposed 64*64 unit block to the position of A10 and write the second transposed 64*64 unit block to the position of A01, which completes the transposition of the two symmetric off-diagonal blocks. The same operation is then applied to all symmetric block pairs, completing the transposition of all off-diagonal blocks.
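The exchange of a symmetric block pair through a two-block RAM can be modelled in software. The following is a simplified sketch rather than the hardware design: the square matrix is a list of lists, the names are hypothetical, and diagonal blocks are handled one at a time (by treating them as a pair with themselves) instead of two at a time as in Step 3.

```python
def transpose_pair(sq, K, i, j):
    """Transpose blocks (i, j) and (j, i) of square `sq` through a 2*K*K buffer."""
    def load(bi, bj):
        return [sq[bi * K + r][bj * K + c] for r in range(K) for c in range(K)]

    def store(bi, bj, data):
        for r in range(K):
            for c in range(K):
                sq[bi * K + r][bj * K + c] = data[r * K + c]

    def jump_order(base):
        return [base + r * K + c for c in range(K) for r in range(K)]

    ram = load(i, j) + load(j, i)   # first block at address 0, symmetric block at K*K
    store(j, i, [ram[a] for a in jump_order(0)])      # transposed first block -> symmetric position
    store(i, j, [ram[a] for a in jump_order(K * K)])  # transposed second block -> first position


def transpose_square(sq, K):
    """Transpose a whole square in place, block pair by block pair."""
    n = len(sq) // K
    for i in range(n):
        for j in range(i, n):       # j == i covers the diagonal blocks
            transpose_pair(sq, K, i, j)


square = [[r * 4 + c for c in range(4)] for r in range(4)]
transpose_square(square, 2)
print(square)
```

Only 2*K*K elements of working storage are used at any time, regardless of the size of the square matrix.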

Step 5: complete the transposition of all square matrices other than the padded one; after each block is transposed, it is stored at the position that is symmetric, about the diagonal of the square matrix, to the position the original block occupied.

Step 6: transpose the padded square matrix:

First, transpose the diagonal blocks one by one:

When a diagonal block is full of data, such as B00 and B11 in Fig. 1, it is transposed in the same way as in Step 3. When a diagonal block is only partly filled, such as B22 in the figure, the transposition uses address jumps over a partly filled (half-full) RAM: the data of the block are first stored in the RAM; the datum at address 0 is read, then the datum at address 64, then the datum at address 128, and so on up to address 1984; the read then jumps to address 1, then address 65, and so on, cycling until the datum at address 2047 has been read; the data are then written back to the position of this block. When a diagonal block is empty, such as B33 in the figure, the transpose operation is skipped.

Second, transpose the off-diagonal blocks. When both blocks of an off-diagonal pair are full, such as the pair B01, B10 in the figure, they are transposed in the same way as in Step 4. When one block of the pair is full and the other is partly filled, such as the pair B02, B20, the transposition uses address jumps over a RAM that is more than half full but not full: the data are stored in the RAM; the datum at address 0 is read, then address 64, then address 128, and so on up to address 4032; the read then jumps to address 1, then address 65, and so on, cycling until the datum at address 4095 has been read, and the data are written to the position of B20, which completes the transposition of the full 64*64 unit block; next, the datum at address 4096 is read, then address 4160, and so on up to address 6080; the read then jumps to address 4097, then address 4161, and so on, cycling until the datum at address 6143 has been read, and the data are written to the position of B02. This completes the transposition of an off-diagonal pair in which one block is full and the other is partly filled. When one block of the pair is partly filled and the other is empty, such as the pair B23, B32, the transposition uses address jumps over a half-full RAM: the data are stored in the RAM; the datum at address 0 is read, then address 64, then address 128, and so on up to address 1984; the read then jumps to address 1, then address 65, and so on, cycling until the datum at address 2047 has been read, and the data are written to the position of B32; reading of the RAM then ends. This completes the transposition of an off-diagonal pair in which one block is partly filled and the other is empty. When both blocks of an off-diagonal pair are empty, the transpose operation is skipped.
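The address sequences for partly filled blocks follow the same rule as for full blocks, with the number of valid rows replacing K. The sketch below uses a hypothetical function name and assumes, as the addresses in the text imply, that a partly filled block in this embodiment holds 32 valid rows of 64 columns stored row by row.

```python
def partial_jump_order(base, valid_rows, cols):
    """Jump-read order for a block holding valid_rows*cols data stored row by row at `base`."""
    return [base + r * cols + c for c in range(cols) for r in range(valid_rows)]


K = 64

# A partly filled block alone in the RAM (32 valid rows), as for B22 or the pair B23, B32.
half = partial_jump_order(0, 32, K)
print(len(half), half[:3], half[31], half[32], half[-1])

# A full block followed by a partly filled block, as for the pair B02, B20.
full_part = partial_jump_order(0, K, K)
tail_part = partial_jump_order(K * K, 32, K)
print(tail_part[0], tail_part[31], tail_part[32], tail_part[-1])

# Small demonstration: a 2*3 block [[1, 2, 3], [4, 5, 6]] read out as its 3*2 transpose.
ram = [1, 2, 3, 4, 5, 6]
print([ram[a] for a in partial_jump_order(0, 2, 3)])
```

An empty block contributes no addresses at all, which corresponds to skipping the transpose operation.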

Step 7: determine the addresses from the positions after transposition, after which the data in SDRAM can be output. First read the first row of the first square matrix, and so on through the first row of the padded square matrix (reading only the positions that hold data); then read the second row of the first square matrix, and so on through the second row of the padded square matrix (reading only the positions that hold data); continue in this way until all data have been read.
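The read-out order of Step 7 can be checked on a small matrix. This is a software sketch under stated assumptions: the function names are hypothetical, W is the smaller dimension, empty padding is represented by `None`, and each square is transposed here with a plain `zip` as a stand-in for Steps 3 to 6.

```python
def padded_squares(A, W):
    """Split an L*W matrix (L > W) into ([L/W] + 1) squares of size W*W, padding with None."""
    L = len(A)
    n = L // W + 1
    rows = [list(row) for row in A] + [[None] * W for _ in range(n * W - L)]
    return [rows[s * W:(s + 1) * W] for s in range(n)]


def read_out(transposed_squares, L, W):
    """Row r of the result is row r of every square in turn; the padded (last)
    square contributes only the positions that hold data."""
    last = len(transposed_squares) - 1
    valid_in_last = L - last * W
    result = []
    for r in range(W):
        row = []
        for s, sq in enumerate(transposed_squares):
            width = valid_in_last if s == last else W
            row.extend(sq[r][:width])
        result.append(row)
    return result


A = [[r * 2 + c for c in range(2)] for r in range(5)]          # a 5*2 example matrix
squares = padded_squares(A, 2)
squares_t = [[list(col) for col in zip(*sq)] for sq in squares]  # stand-in for Steps 3 to 6
print(read_out(squares_t, 5, 2))
```

The result equals the ordinary transpose of the example matrix, with the padding never appearing in the output.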

Claims

1. A completion-type in-place matrix transposition method, characterized by comprising the following steps:

A. Read all K*K blocks on the diagonal into the RAM module, two at a time, for transposition; each transposed block is output to overwrite the original block in SDRAM;

B. Read all K*K blocks off the diagonal into the RAM module for transposition, two at a time, each pair consisting of two blocks symmetric about the diagonal; each transposed block is output to overwrite, in SDRAM, the block symmetric to it about the diagonal. The operations of step 3 above are performed on all square matrices;

2. The completion-type in-place matrix transposition method according to claim 1, characterized in that: in said step 3, when neither of the two K*K blocks contains padding, the process of reading the K*K blocks into the RAM module for transposition is as follows: the two blocks are first read from SDRAM and stored in the RAM; the first K*K data in the RAM are then jump-read in the order 0, K, 2K, ..., K(K-1), 1, (K+1), (2K+1), ..., (K(K-1)+1), ..., (K-1), (2K-1), ..., (K²-1); the last K*K data in the RAM are then jump-read in the order K², (K²+K), (K²+2K), ..., (K²+K(K-1)), (K²+1), (K²+(K+1)), (K²+2K+1), ..., (K²+K(K-1)+1), ..., (K²+K-1), (K²+2K-1), ..., (K²+K²-1).

3. The completion-type in-place matrix transposition method according to claim 1, characterized in that: in said step 3, when only one of the two K*K blocks contains both a padded part and a non-padded part, the process of reading the K*K blocks into the RAM module for transposition is as follows: let the non-padded parts of the two blocks be C(M, N) and D(R, S) respectively, where M, N, R and S are positive integers less than or equal to K; C(M, N) and D(R, S) are first read from SDRAM and stored in the RAM; the first M*N data are then jump-read in the order 0, N, 2N, ..., N(M-1), 1, (N+1), (2N+1), ..., (N(M-1)+1), ..., (N-1), (2N-1), ..., (N*M-1); the following R*S data in the RAM are then jump-read in the order M*N, (M*N+S), (M*N+2S), ..., (M*N+S(R-1)), (M*N+1), (M*N+(S+1)), (M*N+(2S+1)), ..., (M*N+(S(R-1)+1)), ..., (M*N+(S-1)), (M*N+(2S-1)), ..., (M*N+(S*R-1)).

4. The completion-type in-place matrix transposition method according to claim 1, characterized in that: in said step 3, in the process of reading the K*K blocks into the RAM module for transposition, when either of the two K*K blocks contains only padding, that block is not transposed.

5. The completion-type in-place matrix transposition method according to claim 1, characterized in that: in said step 4, the process of reading the transposed two-dimensional matrix A from SDRAM is as follows: if the number of columns of the original two-dimensional matrix A is less than its number of rows, the transposed data are read by first reading the first row of the first square matrix, then the first row of the second square matrix, and so on up to the first row of the padded square matrix, of which only the positions holding data are read; then the second row of the first square matrix is read, then the second row of the second square matrix, and so on in turn until the data of the whole matrix have been read; if the number of columns W of the original matrix A is greater than its number of rows L, the transposed data are read by first reading the first row of the first square matrix as the first row of the transposed matrix, then the second row of the first square matrix as the second row of the transposed matrix, and so on up to the last row of the first square matrix; the data of each subsequent square matrix are then read in the same way, and when the data of the padded square matrix are read, only the rows holding valid data are read.

6. A device for the completion-type in-place matrix transposition method according to claim 1, characterized by comprising a control logic module (1), the control logic module (1) being connected respectively to a read-SDRAM address generation module (2), a write-SDRAM address generation module (3) and a transposition RAM module (4), and the transposition RAM module (4) being connected respectively to a data input FIFO module (5) and a data output FIFO module (6).

7. The device according to claim 6, characterized in that: said read-SDRAM address generation module (2) comprises a read-address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square-matrix counter, a counter of the number of data stored, a counter of the number of addresses crossed, a read-SDRAM row-number selector, a read-SDRAM column-number selector, and row/column concatenation that forms the SDRAM read address; wherein, once the transpose operation can proceed, the read-address state machine starts and, at the same time, the in-block row counter and in-block column counter generate the row and column addresses within a specific block; according to the state of the read-address state machine, the row coordinate and column coordinate of the current block and the number of square matrices already processed are counted, and the number of data stored this time and the number of addresses crossed are recorded; the SDRAM read address is calculated from the index of the current square matrix, the block row coordinate, the block column coordinate, the in-block row coordinate and the in-block column coordinate.

8. The device according to claim 6, characterized in that: said write-SDRAM address generation module (3) comprises a write-address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square-matrix counter, a write-SDRAM row-number selector, a write-SDRAM column-number selector, and row/column concatenation that forms the SDRAM write address; wherein, after this process starts, the write-address state machine starts and, at the same time, the in-block row counter and in-block column counter generate the row and column addresses within a specific block; according to the state of the write-address state machine, the row coordinate and column coordinate of the current block and the number of square matrices already processed are counted; the SDRAM write address is calculated from the index of the current square matrix, the block row coordinate, the block column coordinate, the in-block row coordinate and the in-block column coordinate.

9. The device according to claim 6, characterized in that: said transposition RAM module comprises an address generation state machine, a counter of the number of jumps made after each base address is set, a counter of the base address that must be reset each time, and transposition-RAM read/write address generation; wherein the data of each block are first stored into the transposition RAM in sequence, and after storage is complete, the read address of the RAM is determined from the address generation state machine, the current base address, and the count of jumps made after the base address was set.