CN103760525B

CN103760525B - Completion type in-place matrix transposition method

Info

Publication number: CN103760525B
Application number: CN201410005244.9A
Authority: CN
Inventors: 杜高明; 张多利; 宋宇鲲; 王莉莉; 尹勇生; 王晓蕾; 贾靖华
Original assignee: Hefei University of Technology
Current assignee: Huangshan Development Investment Group Co.,Ltd.
Priority date: 2014-01-06
Filing date: 2014-01-06
Publication date: 2017-01-11
Anticipated expiration: 2034-01-06
Also published as: CN103760525A

Abstract

The invention discloses a completion-type in-place matrix transposition method. A two-dimensional matrix A is divided into several square matrices whose side length is the smaller of its row count and column count; the remainder that is too small to form a square is padded (completed) into a full square, and the padding contains no data. Each square is then subdivided into K*K matrices. The K*K matrices on the diagonal and off the diagonal are read into a RAM module for transposition, and the resulting two-dimensional matrix is read out of the SDRAM in order to give the final transposed matrix. The method suits matrices in which the smaller of the row count and column count is a positive integer multiple of a power of two; it effectively improves the utilization of SDRAM-type memories and raises the processing performance of digital signal processing applications that transpose large matrices.

Description

A completion-type in-place matrix transposition method

Technical field

The invention belongs to the technical field of digital signal processing and relates to a completion-type in-place matrix transposition method.

Background art

Large-matrix transposition is very common in data-intensive applications. Take a synthetic aperture radar (SAR) imaging system as an example: in the imaging processor, the data are arranged in range-direction order before azimuth compression, whereas azimuth compression operates along the direction perpendicular to the range direction, so a matrix transposition must be performed between the two processing stages. For digital signal processing applications with very large data volumes, using random access memory (RAM) or static random access memory (SRAM) as the corner-turn memory suffers from limited capacity and high cost. Large-capacity second-generation double data rate synchronous dynamic random access memory (DDR2 SDRAM) or third-generation double data rate synchronous dynamic random access memory (DDR3 SDRAM) is therefore commonly used as the corner-turn memory.

The invention with Application No. 201010174342.7 discloses a "method of matrix transposition". Its principle is to divide the matrix into small blocks and transpose them using a suitable number of registers. Its shortcoming is that, although it executes matrix transposition faster than the conventional register approach, it must allocate additional storage space, which lowers the utilization of storage resources when the matrix is large.

The invention with Application No. 200910236075.9 discloses a "matrix transposition automatic control circuit system and matrix transposition method". Its shortcoming is that it likewise must allocate extra matrix storage space and cannot transpose in place; in addition, the configuration information of the device comes from a processor core or DMA and is set through a configuration bus, which adds hardware overhead.

The invention patent with Application No. 2012105538360.9, "matrix transposition method and transposition device for a synthetic aperture radar imaging system", and the master's thesis "Design and implementation of the acquisition and transposition module of a SAR real-time processor" propose a block-mapping transposition algorithm. Its principle is to divide the DDR2/DDR3 memory into multiple memory blocks of equal size, each of which stores the image data of one range line, so that the data of different range lines are stored in the address spaces of different segments. With this allocation, every column of azimuth data occupies the same position in each block, and reading one line of azimuth data only requires reading the data at the same position of each block. A single row-activation operation of the block-mapping corner-turn algorithm can read multiple data, giving fast row-write, column-read operation. Its shortcoming is inherent in the algorithm: only one of the data taken out by burst transfer in each clock cycle is valid, so the proportion of valid data is low.

The invention patent with Application No. 201110122834.6 discloses an "FPGA-based data transposition method for SAR imaging signal processing". It proposes a block-matrix transposition algorithm that classifies the matrix data into symmetric-mode matrices, symmetric off-diagonal-mode matrices, and asymmetric off-diagonal-mode matrices, and processes the block matrices of the three modes separately. It has two shortcomings. First, although it can transpose the square part of a matrix in place, it gives no practicable technical solution for large matrices whose row and column counts differ greatly and which can be divided into multiple squares. Second, it transposes the square data block by block, one block at a time, so the transposition efficiency can still be improved.

Summary of the invention

The object of the invention is to provide a completion-type in-place matrix transposition method that solves the problems of the prior art: low utilization of matrix storage space, transposition efficiency that leaves room for improvement, and in particular the lack of a complete solution for transposing large matrices whose row and column counts differ greatly.

The technical solution adopted by the invention comprises the following steps:

Step 1: the data to be transposed are stored contiguously in a synchronous dynamic random access memory (SDRAM) and mapped to a two-dimensional matrix A (L*W), where L is the number of rows and W is the number of columns of the matrix; the smaller of L and W is a positive integer multiple of a power of two;

Step 2: taking the smaller of the row count and column count of matrix A as the side length, matrix A is divided into several squares; the part that is too small to form a square is padded into a complete square, and the padding contains no data; each square is further divided into K*K matrices, where K is a divisor of the smaller of L and W;

Step 3: the K*K matrices are of two kinds, those without padding and those containing padding:

a. all K*K matrices on the diagonal are read into the random access memory (RAM) module for transposition, and each transposed matrix is written back to the SDRAM over the original matrix;

b. all K*K matrices off the diagonal are read into the RAM module for transposition, and each transposed matrix is written back to the SDRAM over the matrix that is symmetric to it about the diagonal;

The operations of step 3 are performed on all squares;

Step 4: the two-dimensional matrix obtained in step 3 is read out of the SDRAM, which completes the whole transposition process.
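Steps 1 and 2 can be illustrated with a short Python sketch that works out how many squares a matrix is cut into and how much data each K*K block holds. This is a minimal sketch under stated assumptions, not the hardware of the invention: the function name `partition`, the dictionary layout, and the use of ceiling division for the square count are choices made here for illustration.

```python
def partition(rows, cols, k):
    """Describe the completion-type partition of a rows x cols matrix.

    Returns (side, squares). Each entry of squares maps a block coordinate
    (block_row, block_col) to the (valid_rows, valid_cols) that block holds;
    (0, 0) marks a block that consists of padding only.
    """
    side = min(rows, cols)
    if side % k:
        raise ValueError("K must divide the shorter side")
    longer = max(rows, cols)
    # Assumption: ceiling division. It equals [L/W] + 1 whenever the longer
    # side is not an exact multiple of the shorter one.
    count = -(-longer // side)
    squares = []
    for s in range(count):
        valid = min(side, longer - s * side)  # data extent along the long axis
        # A tall matrix stacks squares vertically, a wide one horizontally.
        v_rows, v_cols = (valid, side) if rows >= cols else (side, valid)
        blocks = {}
        for br in range(side // k):
            for bc in range(side // k):
                r = max(0, min(k, v_rows - br * k))
                c = max(0, min(k, v_cols - bc * k))
                blocks[(br, bc)] = (r, c) if r and c else (0, 0)
        squares.append(blocks)
    return side, squares


side, squares = partition(11, 4, 2)
print(side, len(squares), squares[2][(1, 0)])
```

For an 11*4 matrix with K = 2 the third square holds only three rows of data, so each block in its lower block row keeps a single valid row; such partially padded blocks are the ones step 3 treats separately.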

The invention is further characterized as follows. In step 3, when the K*K matrices are read into the RAM module for transposition and neither of the two K*K matrices of a pair contains padding, the two matrices are first read from the SDRAM and stored in the RAM. The first K*K data in the RAM are then jump-read in the address order 0, K, 2K, ..., K(K-1), 1, (K+1), (2K+1), ..., (K(K-1)+1), ..., (K-1), (2K-1), ..., (K²-1), and the last K*K data are then jump-read in the address order K², (K²+K), (K²+2K), ..., (K²+K(K-1)), (K²+1), (K²+(K+1)), (K²+2K+1), ..., (K²+K(K-1)+1), ..., (K²+K-1), (K²+2K-1), ..., (K²+K²-1).

In step 3, when a K*K matrix of the pair contains both padding and non-padding parts, let the non-padding parts of the two matrices be C (M, N) and D (R, S), where M, N, R, and S are positive integers not greater than K. C (M, N) and D (R, S) are first read from the SDRAM and stored in the RAM. The first M*N data are then jump-read in the address order 0, N, 2N, ..., N(M-1), 1, (N+1), (2N+1), ..., (N(M-1)+1), ..., (N-1), (2N-1), ..., (N*M-1), and the last R*S data in the RAM are then jump-read in the address order M*N, (M*N+S), (M*N+2S), ..., (M*N+S(R-1)), (M*N+1), (M*N+(S+1)), (M*N+(2S+1)), ..., (M*N+(S(R-1)+1)), ..., (M*N+(S-1)), (M*N+(2S-1)), ..., (M*N+(S*R-1)).

If either of the two K*K matrices consists only of padding, that padding-only matrix needs no transposition.

In step 4, the transposed two-dimensional matrix A is read from the SDRAM as follows. If the number of columns of the original matrix A is smaller than its number of rows, the first row of the first square is read, then the first row of the second square, and so on up to the first row of the padded square, where only the positions holding data are read; then the second row of the first square is read, then the second row of the second square, and so on, until the data of the whole matrix have been read. If the number of columns W of the original matrix A is greater than the number of rows L, the first row of the first square is read as the first row of the transposed matrix, then the second row of the first square as the second row of the transposed matrix, and so on to the last row of the first square; the data of each subsequent square are read in the same way, and for the padded square only the rows holding valid data are read.
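The jump-read address orders described above amount to reading a row-major block column by column. The Python sketch below generates them for a pair of blocks stored back to back in the RAM; the function names `jump_read_addresses` and `pair_read_addresses` are hypothetical and chosen only for this illustration.

```python
def jump_read_addresses(m, n, base=0):
    """Addresses that read an m-row, n-column block, stored row by row
    starting at `base`, in column-by-column (transposed) order."""
    return [base + row * n + col for col in range(n) for row in range(m)]


def pair_read_addresses(m, n, r, s):
    """Read order for block C (m x n) followed by block D (r x s),
    where D is stored immediately after C in the RAM."""
    first = jump_read_addresses(m, n)
    second = jump_read_addresses(r, s, base=m * n)
    return first + second


# Two full K*K blocks with K = 2: 0, K, 1, K+1, then K*K, K*K+K, ...
print(pair_read_addresses(2, 2, 2, 2))

# A partially padded block C(2, 3): stride N = 3, giving 0, 3, 1, 4, 2, 5.
ram = ["a", "b", "c", "d", "e", "f"]  # rows "abc" and "def"
print([ram[addr] for addr in jump_read_addresses(2, 3)])
```

Setting M = N = R = S = K reproduces the sequence for two unpadded blocks, so the padded and unpadded cases differ only in the stride and the block lengths.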

Another object of the invention is to provide a device for the completion-type in-place matrix transposition method. The device comprises a control logic module, which is connected to a read-SDRAM address generation module, a write-SDRAM address generation module, and a transposition RAM module; the transposition RAM module is connected to a data input FIFO module and a data output FIFO module.

The read-SDRAM address generation module comprises a read-address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square counter, a stored-data-count counter, an address-stride counter, a read-SDRAM row selector, a read-SDRAM column selector, and row/column concatenation that yields the SDRAM read address. Once the transposition operation is allowed to proceed, the read-address state machine starts, and the in-block row counter and in-block column counter produce the row and column addresses within the current block. According to the state of the read-address state machine, the module counts the row and column coordinates of the current block and the number of squares processed, and registers the number of data to be stored this time and the address stride. The SDRAM read address is computed from the current square number, the block row coordinate, the block column coordinate, the in-block row coordinate, and the in-block column coordinate.

The write-SDRAM address generation module comprises a write-address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square counter, a write-SDRAM row selector, a write-SDRAM column selector, and row/column concatenation that yields the SDRAM write address. After this stage starts, the write-address state machine starts, and the in-block row counter and in-block column counter produce the row and column addresses within the current block. According to the state of the write-address state machine, the module counts the row and column coordinates of the current block and the number of squares processed. The SDRAM write address is computed from the current square number, the block row coordinate, the block column coordinate, the in-block row coordinate, and the in-block column coordinate.

The transposition RAM module comprises an address generation state machine, a counter of the number of jump reads made since the base address was last set, a counter of the base address to be reset each time, and transposition-RAM read/write address generation. The data of each small block are first stored sequentially in the transposition RAM; once storing is complete, the RAM read address is determined from the address generation state machine, the current base-address count, and the number of jump reads made since the base address was set.

The beneficial effects of the invention are high transposition efficiency and a design based on SDRAM memory that suits matrices of arbitrary scale; it effectively improves the utilization of SDRAM-type memories and raises the processing performance of digital signal processing applications that transpose large matrices.

Brief description of the drawings

Fig. 1 is a diagram of the matrix partition before transposition;

Fig. 2 is a diagram of the matrix after transposition;

Fig. 3 is a hardware implementation diagram of the SDRAM-based data-block transposition method;

Fig. 4 is a hardware implementation diagram of the read-SDRAM address generation module;

Fig. 5 shows the read-SDRAM address generation state machine;

Fig. 6 is a hardware implementation diagram of the transposition RAM module;

Fig. 7 shows the RAM address generation state machine;

Fig. 8 is a hardware implementation diagram of the write-SDRAM address generation module.

Detailed description of the invention

The invention is described in detail below with reference to the drawings and specific embodiments.

As shown in Fig. 1, the following steps are carried out. Step 1: the batch of data to be transposed is stored contiguously in the synchronous dynamic random access memory (SDRAM) and mapped to a two-dimensional matrix A (L*W), where L is the number of rows and W is the number of columns of the matrix, and the smaller of the two is a positive integer multiple of a power of two; a. the row count and column count of the matrix are compared, the matrix is divided into several squares using the smaller of the two as the side length, and the part that is too small to form a square is padded into a square; b. the squares of the previous step are subdivided into K*K matrices, which are classified as diagonal-mode matrices and off-diagonal-mode matrices; K is a positive integer and must be a divisor of the smaller of L and W;

Step 2: for each square, all diagonal-mode K*K matrices that contain no padding are transposed first. Address jumps are used to read the data in the random access memory (RAM) at every K-th address: the datum at address zero is read first, then the datum at address K, then the datum at address 2K, and so on up to address K*(K-1); reading then jumps to address 1, then to address K+1, and so on, cycling in this way until the datum at address K*K-1 has been read. The data read out are stored row by row back into the matrix they came from, which completes the transposition of this K*K unit block. Next, the off-diagonal-mode matrices that contain no padding are transposed. Each time, two matrices that are symmetric about the diagonal are read; as above, the data of each matrix are read from the RAM with address jumps of K, and the data read are stored row by row into the symmetric matrix, which achieves the transposition;
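The block-wise procedure of this step can be sketched in Python for a single square without padding. The sketch makes several simplifying assumptions that are not in the source: the square is modeled as a flat row-major list standing in for the SDRAM, diagonal blocks are buffered one at a time rather than in pairs, and the helper names `read_block`, `write_block`, `jump_read`, and `transpose_square` are hypothetical.

```python
def read_block(mem, n, k, br, bc):
    """Copy one K x K block out of a row-major n x n square, row by row."""
    return [mem[(br * k + i) * n + bc * k + j] for i in range(k) for j in range(k)]


def write_block(mem, n, k, br, bc, data):
    """Store K*K values row by row into block (br, bc) of the square."""
    for idx, value in enumerate(data):
        i, j = divmod(idx, k)
        mem[(br * k + i) * n + bc * k + j] = value


def jump_read(ram, k, base=0):
    """Read a buffered block at addresses base, base+K, base+2K, ...,
    then base+1, base+K+1, ..., which yields the transposed block."""
    return [ram[base + row * k + col] for col in range(k) for row in range(k)]


def transpose_square(mem, n, k):
    blocks = n // k
    for d in range(blocks):
        # Diagonal block: the transposed data overwrite the block itself.
        ram = read_block(mem, n, k, d, d)
        write_block(mem, n, k, d, d, jump_read(ram, k))
    for br in range(blocks):
        for bc in range(br + 1, blocks):
            # Symmetric pair: buffer both, then write each to the other's place.
            ram = read_block(mem, n, k, br, bc) + read_block(mem, n, k, bc, br)
            write_block(mem, n, k, bc, br, jump_read(ram, k))
            write_block(mem, n, k, br, bc, jump_read(ram, k, base=k * k))


square = list(range(16))  # a 4 x 4 square stored row by row
transpose_square(square, 4, 2)
print(square)
```

Because each pair of symmetric blocks is buffered before either is overwritten, no storage beyond the 2*K*K-entry buffer is needed, which is what makes the transposition in place.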

The operations of step 3 above are performed on all squares;

Step 4: the two-dimensional matrix obtained in step 3 is read out of the SDRAM, which completes the whole transposition process.

In step 3, when the K*K matrices are read into the RAM module for transposition and neither of the two K*K matrices of a pair contains padding, the two matrices are first read from the SDRAM and stored in the RAM. The first K*K data in the RAM are then jump-read in the address order 0, K, 2K, ..., K(K-1), 1, (K+1), (2K+1), ..., (K(K-1)+1), ..., (K-1), (2K-1), ..., (K²-1), and the last K*K data in the RAM are then jump-read in the address order K², (K²+K), (K²+2K), ..., (K²+K(K-1)), (K²+1), (K²+(K+1)), (K²+2K+1), ..., (K²+K(K-1)+1), ..., (K²+K-1), (K²+2K-1), ..., (K²+K²-1).

In step 3, when a K*K matrix of the pair contains both padding and non-padding parts, let the non-padding parts of the two matrices be C (M, N) and D (R, S), where M, N, R, and S are positive integers not greater than K. C (M, N) and D (R, S) are first read from the SDRAM and stored in the RAM. The first M*N data are then jump-read in the address order 0, N, 2N, ..., N(M-1), 1, (N+1), (2N+1), ..., (N(M-1)+1), ..., (N-1), (2N-1), ..., (N*M-1), and the last R*S data in the RAM are then jump-read in the address order M*N, (M*N+S), (M*N+2S), ..., (M*N+S(R-1)), (M*N+1), (M*N+(S+1)), (M*N+(2S+1)), ..., (M*N+(S(R-1)+1)), ..., (M*N+(S-1)), (M*N+(2S-1)), ..., (M*N+(S*R-1)).

In step 3, if either of the two K*K matrices consists only of padding, that padding-only matrix needs no transposition.

In step 4, the transposed two-dimensional matrix A is read from the SDRAM as follows. If the number of columns of the original matrix A is smaller than its number of rows, the first row of the first square is read, then the first row of the second square, and so on up to the first row of the padded square, where only the positions holding data are read; then the second row of the first square is read, then the second row of the second square, and so on, until the data of the whole matrix have been read. If the number of columns W of the original matrix A is greater than the number of rows L, the first row of the first square is read as the first row of the transposed matrix, then the second row of the first square as the second row of the transposed matrix, and so on to the last row of the first square; the data of each subsequent square are read in the same way, and for the padded square only the rows holding valid data are read.
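The two read-out orders of step 4 can be sketched in Python as follows. The representation is an assumption made for illustration: `squares` is a list of already transposed squares (lists of rows, with `None` standing in for padding), `extents` gives the number of valid positions of each square along the long axis of the original matrix, and the function name `read_out` is hypothetical.

```python
def read_out(squares, extents, tall):
    """Assemble the transposed matrix from transposed squares.

    tall=True  : the original had fewer columns than rows, so row r of the
                 result is row r of every square in turn, skipping padding.
    tall=False : the original had more columns than rows, so the result is
                 the valid rows of each square, one square after another.
    """
    side = len(squares[0])
    if tall:
        return [
            [v for sq, ext in zip(squares, extents) for v in sq[r][:ext]]
            for r in range(side)
        ]
    return [list(row) for sq, ext in zip(squares, extents) for row in sq[:ext]]


# Original 3 x 2 matrix [[1, 2], [3, 4], [5, 6]], cut into two 2 x 2 squares.
tall_squares = [[[1, 3], [2, 4]], [[5, None], [6, None]]]
print(read_out(tall_squares, [2, 1], tall=True))

# Original 2 x 3 matrix [[1, 2, 3], [4, 5, 6]], cut into two 2 x 2 squares.
wide_squares = [[[1, 4], [2, 5]], [[3, 6], [None, None]]]
print(read_out(wide_squares, [2, 1], tall=False))
```

In both cases the padding is never read, so the output has exactly W*L entries even though the squares occupy more storage.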

As shown in Fig. 3, the transposition device provided by the invention comprises a control logic module 1, which is connected to a read-SDRAM address generation module 2, a write-SDRAM address generation module 3, and a transposition RAM module 4; the transposition RAM module 4 is connected to a data input FIFO module 5 and a data output FIFO module 6. The data path is as follows: the external SDRAM memory outputs data to the data input FIFO module 5, from which they enter the transposition RAM module 4, where they are written sequentially on input and jump-read on output; the transposition RAM module 4 outputs the data to the data output FIFO module 6, from which they are written to the SDRAM. The control path of control logic module 1 comprises the read-SDRAM address generation module 2, the read-SDRAM port signals, the read/write flags of the input FIFO, the read/write address control of the transposition RAM, the read/write flags of the transposition RAM, the read/write flags of the output FIFO, the write-SDRAM address generation module 3, and the write-SDRAM port signals;

As shown in Fig. 4, the read-SDRAM address generation module 2 comprises a read-address generation state machine, an in-block row counter, an in-block column counter, a block row-coordinate counter, a block column-coordinate counter, a square counter, a stored-data-count counter, an address-stride counter, a read-SDRAM row selector, a read-SDRAM column selector, and row/column concatenation that yields the SDRAM read address. The read-address generation state machine controls the progress of the whole transposition flow. The in-block row counter holds the row position within the block being read, and the in-block column counter holds the column position within that block. The block row-coordinate counter holds the row position of the block being read, and the block column-coordinate counter holds its column position. The square counter holds the number of the square currently being transposed. The stored-data-count counter holds the number of valid data in the two diagonal-mode matrices or the two off-diagonal-mode matrices. The address-stride counter holds the number of columns of valid data in the current block, which is the count by which the transposition RAM module 4 jumps across addresses when reading. The read-SDRAM row selector selects the SDRAM row to read: for the first of the two matrices, the row is computed from the row and column coordinates; for the second of the two matrices, the row and column coordinates are exchanged before the row is computed. The read-SDRAM column selector works on the same principle as the read-SDRAM row selector. Finally, the SDRAM read address is obtained by concatenating row and column.

The read-SDRAM address generation state machine is shown in Fig. 5 and has six states: the initial state; the state of reading the first matrix in each two-matrix transposition operation of diagonal-mode matrices (diagonal first-matrix state); the state of reading the second matrix in each two-matrix transposition operation of diagonal-mode matrices (diagonal second-matrix state); the state of reading the first matrix in each symmetric-pair transposition operation of off-diagonal-mode matrices (off-diagonal first-matrix state); the state of reading the symmetric matrix in each symmetric-pair transposition operation of off-diagonal-mode matrices (off-diagonal second-matrix state); and the state of waiting for the matrix block transposition to complete (wait state).

On reset the machine enters the initial state. When condition 1 holds, namely initialization is complete and the data have been fully stored, the next state is the diagonal first-matrix state; otherwise the machine stays in the initial state.

In the diagonal first-matrix state, when condition 2 holds, namely this diagonal-mode block is a block located at the padding and its last valid datum has been read, the next state is the wait state; when condition 3 holds, namely the data of the first block have been read, the next state is the diagonal second-matrix state; if neither condition 2 nor condition 3 holds, the machine stays in the diagonal first-matrix state.

In the diagonal second-matrix state, when condition 4 holds, namely this diagonal-mode block is a block located at the padding and its last valid datum has been read, or the data of the second block have been read, the next state is the wait state; otherwise the machine stays in the diagonal second-matrix state.

In the wait state, when condition 5 holds, namely all blocks have been transposed, the next state is the initial state; when condition 6 holds, namely the previous block transposition operation is complete and the diagonal-mode matrices of the current square have all been transposed, the next state is the off-diagonal first-matrix state; when condition 7 holds, namely the previous block transposition operation is complete and the diagonal-mode matrices of the current square are not yet complete, the next state is the diagonal first-matrix state; if none of conditions 5, 6, and 7 holds, the machine stays in the wait state.

In the off-diagonal first-matrix state, when condition 8 holds, namely the current block is a block located at the padding and its last valid datum has been read, or the block symmetric to this block is a padding matrix, that is, its data are empty, the next state is the wait state; when condition 9 holds, namely the data of the first block have been read, the next state is the off-diagonal second-matrix state; if neither condition 8 nor condition 9 holds, the machine stays in the off-diagonal first-matrix state.

In the off-diagonal second-matrix state, when condition 10 holds, namely the current block is a block located at the padding and its last valid datum has been read, or the data of the second block have been read, the next state is the wait state; otherwise the machine stays in the off-diagonal second-matrix state.
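The six-state read-address state machine of Fig. 5 can be summarized as a transition table. The Python sketch below encodes the ten numbered conditions as integers in a set of currently true conditions; the state names, the `next_state` function, and the rule that conditions are tested in the order they are listed are assumptions made for illustration, since the text does not state a priority between conditions.

```python
from enum import Enum, auto


class ReadState(Enum):
    INIT = auto()
    DIAG_FIRST = auto()   # reading the first matrix of a diagonal pair
    DIAG_SECOND = auto()  # reading the second matrix of a diagonal pair
    OFF_FIRST = auto()    # reading the first matrix of a symmetric pair
    OFF_SECOND = auto()   # reading the symmetric matrix of that pair
    WAIT = auto()         # waiting for the block transposition to complete


# state -> list of (condition number, next state), tested in listed order
TRANSITIONS = {
    ReadState.INIT: [(1, ReadState.DIAG_FIRST)],
    ReadState.DIAG_FIRST: [(2, ReadState.WAIT), (3, ReadState.DIAG_SECOND)],
    ReadState.DIAG_SECOND: [(4, ReadState.WAIT)],
    ReadState.WAIT: [
        (5, ReadState.INIT),
        (6, ReadState.OFF_FIRST),
        (7, ReadState.DIAG_FIRST),
    ],
    ReadState.OFF_FIRST: [(8, ReadState.WAIT), (9, ReadState.OFF_SECOND)],
    ReadState.OFF_SECOND: [(10, ReadState.WAIT)],
}


def next_state(state, true_conditions):
    """Return the next state; stay in place if no listed condition holds."""
    for condition, target in TRANSITIONS[state]:
        if condition in true_conditions:
            return target
    return state


state = ReadState.INIT
for conditions in [{1}, {3}, {4}, {6}, {9}, {10}, {5}]:
    state = next_state(state, conditions)
    print(state.name)
```

The example walk reads one diagonal pair, then one off-diagonal pair, and returns to the initial state once condition 5 reports that every block has been transposed.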

As shown in Fig. 6, the transposition RAM module 4 comprises an address generation state machine, a counter of the number of jump reads made since the base address was last set, a counter of the base address to be reset each time, and transposition-RAM read/write address generation. The address generation state machine controls the whole read/write flow of the transposition RAM; the jump counter holds the number of jump reads already made; the base-address counter holds the base address that needs to be set next; and the read/write address generation locates the transposition-RAM address from the results of the above three units. The transposition RAM module 4 implements sequential writing and jump reading of data. It contains a RAM address generation state machine, shown in Fig. 7, with five states: the initial state, the sequential-write state, the address-zeroing state, jump-read state 1, and jump-read state 2.

The transitions between the states are as shown in Fig. 7. On reset the machine enters the initial state. When condition 1 holds, namely the read of the read FIFO is valid, the next state is the sequential-write state; otherwise the machine stays in the initial state. In the sequential-write state, when condition 2 holds, namely the address equals the number of data that need to be stored for this block, the next state is the address-zeroing state; otherwise the machine stays in the sequential-write state. In the address-zeroing state, the machine moves to jump-read state 1 on the next clock cycle. In jump-read state 1, when condition 3 holds, namely more than one block of data needs to be stored and the address has reached the data of one block, the next state is jump-read state 2; when condition 4 holds, namely no more than one block of data is stored and the address is at the last datum, the next state is the initial state; if neither condition 3 nor condition 4 holds, the machine stays in jump-read state 1. In jump-read state 2, when condition 5 holds, namely more than one block of data is stored, the next state is the initial state; otherwise the machine stays in jump-read state 2.
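The five-state address generation state machine of Fig. 7 can be traced with a small Python sketch. The state names, the `step` and `run` helpers, and the encoding of the numbered conditions as integers in a set are hypothetical; the unconditional move out of the address-zeroing state is marked with `None`.

```python
# state -> list of (condition number or None, next state)
TRANSITIONS = {
    "INIT": [(1, "SEQ_WRITE")],
    "SEQ_WRITE": [(2, "ADDR_ZERO")],
    "ADDR_ZERO": [(None, "JUMP_READ_1")],  # always leaves after one cycle
    "JUMP_READ_1": [(3, "JUMP_READ_2"), (4, "INIT")],
    "JUMP_READ_2": [(5, "INIT")],
}


def step(state, true_conditions):
    """Advance one clock cycle; stay in place if no condition holds."""
    for condition, target in TRANSITIONS[state]:
        if condition is None or condition in true_conditions:
            return target
    return state


def run(events, state="INIT"):
    """Return the list of states visited for a sequence of condition sets."""
    trace = [state]
    for true_conditions in events:
        state = step(state, true_conditions)
        trace.append(state)
    return trace


# Two blocks buffered: write, zero the address, jump-read both halves.
print(run([{1}, set(), {2}, set(), {3}, {5}]))
# One block buffered: jump-read state 2 is skipped.
print(run([{1}, {2}, set(), {4}]))
```

The two traces correspond to the two cases of the text: when a pair of blocks is buffered the machine passes through both jump-read states, and when only one block is buffered it returns to the initial state from jump-read state 1.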

As shown in Fig. 8, the write-SDRAM address generation module 3 corresponds to the read-SDRAM address generation module 2, and each of its submodules has the same function as the corresponding submodule of the read-SDRAM address generation module 2 described above. Its state machine has seven states: the initial state; the first-matrix state of each two-matrix transposition operation of diagonal-mode matrices (diagonal first-matrix state); the second-matrix state of each two-matrix transposition operation of diagonal-mode matrices (diagonal second-matrix state); the first-matrix state of each symmetric-pair transposition operation of off-diagonal-mode matrices (off-diagonal first-matrix state); the symmetric-matrix state of each symmetric-pair transposition operation of off-diagonal-mode matrices (off-diagonal second-matrix state); wait-for-stage-start state 1; and wait-for-stage-start state 2.

On reset the machine enters the initial state. When condition 1 holds, namely the start signal of this stage is true, the next state is the diagonal first-matrix state; otherwise the machine stays in the initial state.

In the diagonal first-matrix state, when condition 2 holds, namely this diagonal-mode block is a block located at the padding and its last valid datum has been written, the next state is wait-for-stage-start state 1; when condition 3 holds, namely the data of the first block have been written, the next state is the diagonal second-matrix state; if neither condition 2 nor condition 3 holds, the machine stays in the diagonal first-matrix state.

In the diagonal second-matrix state, when condition 4 holds, namely this diagonal-mode block is a block located at the padding and its last valid datum has been written, or the data of the second block have been written, the next state is wait-for-stage-start state 1; otherwise the machine stays in the diagonal second-matrix state.

In wait-for-stage-start state 1, the machine moves to wait-for-stage-start state 2 on the next clock cycle.

In wait-for-stage-start state 2, when condition 5 holds, namely all blocks have been transposed, the next state is the initial state; when condition 6 holds, namely this stage has started and the diagonal-mode matrices of the current square have all been transposed, the next state is the off-diagonal first-matrix state; when condition 7 holds, namely this stage has started and the diagonal-mode matrices of the current square are not yet complete, the next state is the diagonal first-matrix state; if none of conditions 5, 6, and 7 holds, the machine stays in wait-for-stage-start state 2.

In the off-diagonal first-matrix state, when condition 8 holds, namely the current block is a block located at the padding and its last valid datum has been written, or the block symmetric to this block is a padding matrix, that is, its data are empty, the next state is wait-for-stage-start state 1; when condition 9 holds, namely the data of the first block have been written, the next state is the off-diagonal second-matrix state; if neither condition 8 nor condition 9 holds, the machine stays in the off-diagonal first-matrix state.

In the off-diagonal second-matrix state, when condition 10 holds, namely the current block is a block located at the padding and its last valid datum has been written, or the data of the second block have been written, the next state is wait-for-stage-start state 1; otherwise the machine stays in the off-diagonal second-matrix state.

The present invention supports in-place transposition and is suited to matrices whose row and column counts differ greatly. It proposes a completion type (padding-based) in-place transposition method and implements the method in hardware, so that the transposer uses its memory more efficiently and thereby meets the demand for efficient use of memory resources in digital signal processing.

The beneficial effects of the present invention are analyzed quantitatively as follows:

Storage efficiency analysis:

When the matrix size is L*W and the smaller of the two numbers is W, the storage space that the present invention must allocate for the matrix, including the original matrix, is ([L/W]+1)*W*W, where [ ] denotes the integer-part (rounding-down) function.

Let L be 160, 544, 1312 and 3232 in turn, with W = 64. Table 1 compares the storage space required by a non-in-place transposition algorithm with that required by the algorithm of the present invention.

Table 1

As can be seen from the table above, for the same matrix size the present invention needs less matrix storage space than the non-in-place transposition algorithm, and as the matrix size grows the space saved approaches 50%.
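The storage figures can be recomputed from the formula above. The sketch below is illustrative only: the function names are hypothetical, and the non-in-place cost is assumed to be 2*L*W (a source buffer plus a destination buffer), which is an assumption chosen because it is consistent with the stated saving approaching 50%.

```python
def in_place_storage(L, W):
    """Storage of the completion type method: ([L/W] + 1) * W * W elements."""
    return (L // W + 1) * W * W


def out_of_place_storage(L, W):
    """Assumed cost of a non-in-place transposition: source plus destination."""
    return 2 * L * W


def saving(L, W):
    """Fraction of storage saved relative to the assumed non-in-place cost."""
    return 1 - in_place_storage(L, W) / out_of_place_storage(L, W)


if __name__ == "__main__":
    for L in (160, 544, 1312, 3232):
        print(L, in_place_storage(L, 64), out_of_place_storage(L, 64),
              round(saving(L, 64), 4))
```

Under this assumption the saving grows with L for the four sizes considered and stays below 50%.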

Efficiency analysis of the transposition algorithm:

For the efficiency analysis, let the matrix size be L*W with W the smaller of the two numbers, let each cross-row read cost m additional clock cycles (the cross-row activation time differs slightly between SDRAM types), and let the block size be K. The number of clock cycles that the present invention spends to transpose matrix A (L, W) is then:

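$$
\begin{aligned}
&\Big(L\,m + \tfrac{L W}{4}\Big) + \Big(\tfrac{W}{K}\,L\,m + \tfrac{L W}{4}\Big) \\
&+ \Big(\tfrac{W}{K}\,[L/W]\,W\,m + W\,\big([(L-[L/W]\,W)/K]+1\big)\,m + \tfrac{L W}{4}\Big) \\
&+ \Big(([L/W]+1)\,W\,m + \tfrac{L W}{4}\Big),
\end{aligned}
$$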

For a matrix of size 2720*256 (with K = 64, and k denoting 1024), the number of clock cycles consumed by the present invention is in theory:

T1 = (2.65625m+170)k + (10.625m+170)k + (10.75m+170)k + (2.75m+170)k = (26.78125m+680)k.
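The T1 figures can be checked against the general cycle-count expression. The following Python sketch evaluates the four terms of that expression; the function names are hypothetical, [x] is taken as integer division, and L*W is assumed to be divisible by 4.

```python
def cycle_terms(L, W, K):
    """Return (coefficients of m for the four terms, constant part of each term)."""
    q = L // W                         # [L/W]
    tail = (L - q * W) // K + 1        # [(L - [L/W]*W) / K] + 1
    m_coeffs = [
        L,
        (W // K) * L,
        (W // K) * (q * W) + W * tail,
        (q + 1) * W,
    ]
    return m_coeffs, (L * W) // 4      # each term also carries (L*W)/4


def total_cycles(L, W, K, m):
    """Total clock cycles for a given cross-row penalty m."""
    m_coeffs, const = cycle_terms(L, W, K)
    return sum(m_coeffs) * m + 4 * const


if __name__ == "__main__":
    coeffs, const = cycle_terms(2720, 256, 64)
    # Expressed in units of k = 1024, as in the text.
    print([c / 1024 for c in coeffs], const / 1024)
```

For L = 2720, W = 256 and K = 64 the coefficients of m are 2720, 10880, 11008 and 2816, i.e. 2.65625k, 10.625k, 10.75k and 2.75k, and each constant part is 174080 = 170k.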

The block-mapped storage transposition algorithm proposed in the existing paper "Design and Implementation of the Acquisition and Transposition Module of a SAR Real-Time Processor" consumes, assuming a block size of 16*16, a total number of clock cycles of:

T2 = (42.5m+170)k + (42.5m+680)k = (85m+850)k;

Comparing the two: T1 - T2 = -(58.21875m+170)k. The present invention therefore spends considerably fewer clock cycles than the method proposed in "Design and Implementation of the Acquisition and Transposition Module of a SAR Real-Time Processor", and has better transposition efficiency.

A specific embodiment is described below by way of illustration:

Specific embodiment 1:

Let the original matrix have L = 2720 rows and W = 256 columns, and let the sub-matrices into which it is divided have K = 64 rows and columns. The transposition method is as follows:

Step 1: the batch of data to be transposed is stored contiguously in the SDRAM, the data first being mapped to a 2720*256 two-dimensional matrix;

Step 2: the row count and column count of the matrix are compared, and the mapped two-dimensional matrix is padded so that it can be divided into several squares, after which it is partitioned into blocks. With reference to Fig. 1, this step is implemented as follows:

(2a) taking 256*256 as the square size, the matrix is padded by appending a 96*256 rectangle below it, so that the whole matrix can be divided into 11 squares;

(2b) each square is split into 64*64 unit matrices, which are classified as diagonal-mode matrices and non-diagonal-mode matrices;
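The padding and partitioning of step 2 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name is hypothetical, W is assumed to be the smaller dimension and a multiple of K, and the number of squares is taken as the ceiling of L/W, which is an assumption for the case where L is an exact multiple of W.

```python
def plan_partition(L, W, K):
    """Padding and block layout for an L x W matrix with W <= L and K dividing W."""
    n_squares = -(-L // W)              # ceil(L / W) squares of size W x W
    pad_rows = n_squares * W - L        # rows appended below, holding no data
    b = W // K                          # unit matrices per side of a square
    diagonal = [(i, i) for i in range(b)]
    # Each non-diagonal-mode block is paired with its mirror about the diagonal.
    off_pairs = [((i, j), (j, i)) for i in range(b) for j in range(i + 1, b)]
    return n_squares, pad_rows, diagonal, off_pairs


if __name__ == "__main__":
    n, pad, diag, pairs = plan_partition(2720, 256, 64)
    print(n, pad, diag, len(pairs))
```

For the 2720*256 example this gives 11 squares, 96 padded rows, four diagonal-mode blocks per square and six symmetric pairs of non-diagonal-mode blocks per square.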

Step 3: a RAM unit is used to perform the transposition, which is achieved by address jumping. The RAM has a size of 2*64*64, i.e. it can transpose two 64*64 matrices at a time. The diagonal-mode matrices, i.e. the matrices marked with oblique lines (A00, A11, A22, A33) in Fig. 1, are transposed first. Two diagonal-mode blocks, A00 and A11, are read into the RAM, and the RAM is then read out in address-jump order. For the transposition of A00: the datum at address 0 is read first, then the datum at address 64, then the datum at address 128, and so on up to address 4032; the read address then jumps to address 1, followed by address 65, and so on, cycling in this way until the datum at address 4095 has been read. The data read out are written back to A00 row by row. For the transposition of A11: the datum at the first address of A11, i.e. address 4096, is read first, then the datum at address 4160, and so on up to address 8128; the read address then jumps to address 4097, and the cycle continues until the datum at address 8191, which is the last datum of the A11 matrix, has been read. The data read out are again written back row by row to the position of A11. Finally, the above two operations are carried out on every pair of blocks in turn, which involves the following address jumps:

the line-feed address jump within a unit matrix;

between the two blocks of a pair, the jump from the end address of the first block to the start address of the second block;

between the two blocks of a pair, the jump from the end address of the second block to the start address of the first block;

between different pairs, i.e. the jump from the last address of the current pair of blocks to the start address of the next pair of blocks.
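The address-jump read order of step 3 amounts to reading a row-major K*K block in column order. The following Python sketch generates that address sequence; the function name and the `base` parameter are illustrative assumptions, and the list stands in for the RAM unit.

```python
def jump_read_order(K, base=0):
    """Addresses of a K x K block stored row by row from `base`, read column by column."""
    return [base + r * K + c for c in range(K) for r in range(K)]


if __name__ == "__main__":
    a00 = jump_read_order(64)            # first block in RAM: addresses 0..4095
    a11 = jump_read_order(64, 64 * 64)   # second block in RAM: addresses 4096..8191
    print(a00[:3], a00[63], a00[64], a00[-1])
    print(a11[:2], a11[63], a11[64], a11[-1])

    # Writing the values back row by row in this order yields the transpose.
    ram = list(range(16))                # a 4 x 4 block holding 0..15
    print([ram[a] for a in jump_read_order(4)])
```

With K = 64 the first sequence runs 0, 64, 128, ..., 4032, 1, 65, ..., 4095, and the second runs 4096, 4160, ..., 8128, 4097, ..., 8191.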

Step 4: the non-diagonal-mode matrices are transposed, i.e. the pairwise combinations A01 and A10, A02 and A20, etc. in the figure. Using the RAM unit, the two unit matrix blocks of a square that are symmetric about the diagonal are stored in the RAM; the data in the RAM are then read out by address jumping and written to the symmetric position, which completes the data exchange between the two matrix blocks. Taking one pair of non-diagonal-mode matrices as an example: first, the data of the symmetric blocks A01 and A10 are stored row by row in the RAM unit; then, using the address-jump order of step 3, the data of the first transposed 64*64 unit block are stored in the position of A10, and the data of the second transposed 64*64 unit block are stored in the position of A01, which completes the transposition of two symmetric non-diagonal-mode matrices. The data exchange of all symmetric blocks is completed by the same operation, which completes the transposition of all non-diagonal-mode matrices.

Step 5: the transposition of all squares other than the padded square is completed in this way; after transposition, each small matrix is stored at the position symmetric, about the diagonal of its square, to the position it originally occupied.
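Steps 3 to 5 together transpose one full square in place, using a buffer of two K*K blocks. The following Python sketch illustrates this under stated assumptions: the square is a list of lists, a Python list stands in for the RAM unit, all helper names are hypothetical, and padded blocks are not handled here.

```python
def load_block(sq, bi, bj, K):
    """Copy block (bi, bj) of the square into a flat, row-major list."""
    return [sq[bi * K + r][bj * K + c] for r in range(K) for c in range(K)]


def store_transposed(sq, bi, bj, K, ram, base):
    """Read the block at `base` in address-jump order, write it row by row to (bi, bj)."""
    order = [base + r * K + c for c in range(K) for r in range(K)]
    for n, addr in enumerate(order):
        sq[bi * K + n // K][bj * K + n % K] = ram[addr]


def transpose_square_in_place(sq, K):
    b = len(sq) // K
    # Diagonal-mode blocks, two per RAM load; each returns to its own position.
    for i in range(0, b, 2):
        ids = [i] + ([i + 1] if i + 1 < b else [])
        ram = []
        for d in ids:
            ram += load_block(sq, d, d, K)
        for n, d in enumerate(ids):
            store_transposed(sq, d, d, K, ram, n * K * K)
    # Non-diagonal-mode blocks: each symmetric pair exchanges positions.
    for i in range(b):
        for j in range(i + 1, b):
            ram = load_block(sq, i, j, K) + load_block(sq, j, i, K)
            store_transposed(sq, j, i, K, ram, 0)        # first block -> mirror position
            store_transposed(sq, i, j, K, ram, K * K)    # second block -> mirror position
```

Calling `transpose_square_in_place(sq, 2)` on an 8*8 list of lists leaves `sq` equal to its transpose while only ever buffering two 2*2 blocks.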

Step 6: the padded square is transposed:

First, the diagonal-mode matrices are transposed one by one:

When a diagonal-mode matrix is full of data, such as B00 and B11 in Fig. 1, it is transposed in the same way as in step 3. When a diagonal-mode matrix is not full, such as B22 in the figure, the transposition uses the address-jump order for a half-full RAM block: the data of the block are first stored in the RAM block; the datum at address 0 is read first, then the datum at address 64, then the datum at address 128, and so on up to address 1984; the read address then jumps to address 1, followed by address 65, and so on, cycling until the datum at address 2047 has been read; the data are then written back to the position of this block. When a diagonal-mode matrix is empty, such as B33 in the figure, the transposition is skipped.

Second, the non-diagonal-mode matrices are transposed. When both matrices of a non-diagonal-mode pair are full, such as the pair B01 and B10 in the figure, they are transposed in the same way as in step 4. When one matrix of the pair is full and the other is not, such as the pair B02 and B20 in the figure, the transposition uses the address-jump order for a RAM that is more than half full but not full: the data are stored in the RAM block; the datum at address 0 is read first, then the datum at address 64, then the datum at address 128, and so on up to address 4032; the read address then jumps to address 1, followed by address 65, and so on, cycling until the datum at address 4095 has been read; these data are stored in the position of B20, which completes the transposition of the full 64*64 unit block. The datum at address 4096 is then read, followed by the datum at address 4160, and so on up to address 6080; the read address then jumps to address 4097, followed by address 4161, and so on, cycling until the datum at address 6143 has been read; these data are stored in the position of B02, which completes the transposition of a non-diagonal-mode pair in which one matrix is full and the other is not. When one matrix of the pair is not full and the other is empty, such as the pair B23 and B32 in the figure, the transposition uses the address-jump order for a half-full RAM block: the data are stored in the RAM block; the datum at address 0 is read first, then the datum at address 64, then the datum at address 128, and so on up to address 1984; the read address then jumps to address 1, followed by address 65, and so on, cycling until the datum at address 2047 has been read; these data are stored in the position of B32 and the reading of the RAM ends, which completes the transposition of a non-diagonal-mode pair in which one matrix is not full and the other is empty. When both matrices of a non-diagonal-mode pair are empty, the transposition is skipped.
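The address sequences for partly filled blocks follow the same pattern as for full blocks, with the number of valid rows reduced. The Python sketch below generates the sequence for a block whose valid part has M rows and N columns; the function name and parameters are illustrative assumptions, and M = 32 is inferred from the end address 2047 given in the text.

```python
def partial_read_order(M, N, base=0):
    """Jump-read addresses for the M x N valid part of a block stored row by row at `base`."""
    return [base + r * N + c for c in range(N) for r in range(M)]


if __name__ == "__main__":
    # Half-full block alone in RAM (e.g. B22): 32 valid rows of 64 columns.
    half = partial_read_order(32, 64)
    print(half[:3], half[31], half[32], half[-1])

    # Full block followed by a half-full block (e.g. B02 then B20).
    full = partial_read_order(64, 64)
    tail = partial_read_order(32, 64, base=64 * 64)
    print(full[-1], tail[:2], tail[31], tail[32], tail[-1])
```

The half-full sequence runs 0, 64, ..., 1984, 1, 65, ..., 2047, and the sequence for a half-full block stored after a full block runs 4096, 4160, ..., 6080, 4097, ..., 6143.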

Step 7: the addresses are determined from the positions after transposition, and the data in the SDRAM can then be output. The first row of the first square is read first, followed in turn by the first row of each subsequent square up to the padded square (of which only the positions holding data are read); then the second row of the first square is read, followed in the same way up to the second row of the padded square (only the positions holding data); and so on until all data have been read.
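The read-out order of step 7 can be checked end to end on a small matrix. The Python sketch below is an illustration under stated assumptions: the helper names are hypothetical, `None` marks padded positions (so the data must not contain `None`), and each square is transposed with a plain Python transpose as a stand-in for the block-wise RAM procedure of steps 3 to 6.

```python
def pad_to_squares(A, W):
    """Split an L x W matrix into W x W squares, padding the last one with None."""
    L = len(A)
    n = -(-L // W)                                   # ceil(L / W)
    rows = [row[:] for row in A] + [[None] * W for _ in range(n * W - L)]
    return [rows[s * W:(s + 1) * W] for s in range(n)]


def transpose_square(sq):
    """Stand-in for the in-place block-wise transposition of one square."""
    return [list(col) for col in zip(*sq)]


def read_out(squares, W):
    """Read row i of every square in turn, skipping padded positions."""
    out = []
    for i in range(W):
        row = []
        for sq in squares:
            row.extend(v for v in sq[i] if v is not None)
        out.append(row)
    return out


if __name__ == "__main__":
    A = [[r * 4 + c for c in range(4)] for r in range(10)]   # L = 10, W = 4
    squares = [transpose_square(sq) for sq in pad_to_squares(A, 4)]
    print(read_out(squares, 4) == [list(col) for col in zip(*A)])
```

Row i of the output is formed from row i of each transposed square, so the padded positions of the last square never appear in the result.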

Claims

1. A completion type in-place matrix transposition method, characterized by comprising the following steps:

Step 1: the SAR data to be transposed are stored contiguously in a synchronous dynamic random access memory (SDRAM) and mapped to a two-dimensional matrix A (L*W), where L is the number of rows of the matrix and W is the number of columns of the matrix, the smaller of the two being a positive integer multiple of a power of two;

a. all K*K matrices along the diagonal are read, two at a time, into the random access memory module RAM for transposition, and the transposed matrices are output to the SDRAM, overwriting the original matrices;

b. all K*K matrices not along the diagonal are read into the RAM module for transposition, two at a time, the two being matrices symmetric about the diagonal, and the transposed matrices are output to the SDRAM, each overwriting the matrix symmetric to it about the diagonal; the above operations of step 3 are carried out on all squares;

Step 4: the two-dimensional matrix obtained in step 3 is read out of the SDRAM and output, completing the whole transposition process;

in step 3, the transposition of the K*K matrices read into the RAM module proceeds as follows: if neither of the two K*K matrices contains a padded part, the two matrices are first read from the SDRAM and stored in the RAM; the first K*K data in the RAM are then jump-read in the order 0, K, 2K, ..., K(K-1), 1, (K+1), (2K+1), ..., (K(K-1)+1), ..., (K-1), (2K-1), ..., (K²-1), and the last K*K data in the RAM are then jump-read in the order K², (K²+K), (K²+2K), ..., (K²+K(K-1)), (K²+1), (K²+(K+1)), (K²+2K+1), ..., (K²+K(K-1)+1), ..., (K²+K-1), (K²+2K-1), ..., (K²+K²-1);

in step 3, the transposition of the K*K matrices read into the RAM module proceeds as follows: if one of the two K*K matrices contains both a padded part and a non-padded part, let the non-padded parts of the two matrices be C (M, N) and D (R, S) respectively, where M, N, R and S are positive integers less than or equal to K; C (M, N) and D (R, S) are first read from the SDRAM and stored in the RAM; the first M*N data are then jump-read in the order 0, N, 2N, ..., N(M-1), 1, (N+1), (2N+1), ..., (N(M-1)+1), ..., (N-1), (2N-1), ..., (N*M-1), and the last R*S data in the RAM are then jump-read in the order M*N, (M*N+S), (M*N+2S), ..., (M*N+S(R-1)), (M*N+1), (M*N+(S+1)), (M*N+(2S+1)), ..., (M*N+(S(R-1)+1)), ..., (M*N+(S-1)), (M*N+(2S-1)), ..., (M*N+(S*R-1));

in step 3, the transposition of the K*K matrices read into the RAM module proceeds as follows: if either of the two K*K matrices contains only a padded part, the matrix containing only the padded part is not transposed.

2. The completion type in-place matrix transposition method according to claim 1, characterized in that, in step 4, the transposed two-dimensional matrix A is read from the SDRAM as follows: if the number of columns of the original two-dimensional matrix A is less than the number of rows, the transposed data are read by first reading the first row of the first square, then the first row of the second square, and so on up to the first row of the padded square, of which only the positions holding data are read; then the second row of the first square is read, then the second row of the second square, and so on in turn until the data of the whole matrix have been read; if the number of columns W of the original matrix A is greater than the number of rows L, the transposed data are read by first reading the first row of the first square as the first row of the transposed matrix, then the second row of the first square as the second row of the transposed matrix, and so on up to the last row of the first square; the data of each square are then read in the same way, and when the data of the padded square are read, only the rows holding valid data are read.