CN102624653A - Extensible QR decomposition method based on pipeline working mode - Google Patents

Info

Publication number
CN102624653A
Authority
CN
China
Prior art keywords
layer
matrix
row
module
rotation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100103456A
Other languages
Chinese (zh)
Other versions
CN102624653B (en
Inventor
陈翔
蔡世杰
周春晖
周世东
许希斌
张秀军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201210010345.6A priority Critical patent/CN102624653B/en
Publication of CN102624653A publication Critical patent/CN102624653A/en
Application granted granted Critical
Publication of CN102624653B publication Critical patent/CN102624653B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides an extensible QR decomposition method based on a pipeline working mode. A layered, cascaded structure is adopted, with pipelined operation between layers. Each layer applies Givens rotations to two rows of matrix data; the Givens rotations are carried out with the coordinate rotation digital computer (CORDIC) algorithm and implemented with addition and shift operations. The number of layers is determined by the number of rows or the number of columns of the matrix. The whole QR decomposition module consists of a controller and a data processing module. Each layer of the data processing module is a row rotation module, and the output port of each layer's row rotation module is connected to the input port of the next layer's row rotation module. The matrix data to be decomposed are input at the input port of the first-layer row rotation module and flow through all row rotation modules in sequence, and all row rotation modules work in parallel. The method saves hardware resources: because the CORDIC iteration is split up inside each layer of the data processing module to form a pipeline, the rotations of multiple vector groups are processed in a time-shared manner, throughput is increased, and the design is flexible to extend.

Description

Extensible QR decomposition method based on pipeline working mode
Technical field
The invention belongs to the technical field of mobile wireless data transmission and relates to an extensible QR decomposition method, structure, and application based on a pipeline working mode.
Background art
QR decomposition is a basic form of matrix decomposition and is widely used in many areas of signal processing. In MIMO detection, for example, applying QR decomposition to the channel matrix both preserves the orthogonality of the data and simplifies the detection process. QR decomposition converts the channel matrix into an orthogonal matrix Q and an upper triangular matrix R, which simplifies the mutual interference among the received data. Because QR decomposition greatly reduces algorithmic complexity, introducing it does not enlarge the overall algorithm.
There are currently three mainstream algorithms for QR decomposition: Gram-Schmidt orthogonalization, the Givens rotation method, and the Householder method. Because the Givens rotation method can be implemented with the CORDIC algorithm, which greatly reduces the number of multiplications and divisions required in hardware, the Givens method is now the main approach studied in academia. The method proposed by the present invention is a QR decomposition based on Givens rotations.
A Givens rotation turns a vector in a two-dimensional plane through a given angle while keeping its length unchanged; the rotation is equivalent to left-multiplying by an orthogonal matrix. By choosing a specific rotation angle, the vector can be rotated onto a coordinate axis so that one of its coordinates becomes zero. QR decomposition by the Givens rotation method left-multiplies the matrix H to be decomposed by a series of rotation matrices so that H becomes upper triangular, that is, $G_M G_{M-1} \cdots G_2 G_1 H = R$, and therefore $Q = G_1^H G_2^H \cdots G_{M-1}^H G_M^H$. MIMO nonlinear detection requires the product $Q^H y$ of the received vector $y$ left-multiplied by $Q^H$. Note that if, during the QR decomposition, every rotation applied to H is applied to the received vector $y$ at the same time, then $Q^H y$ is available as soon as the decomposition finishes, and neither Q nor the subsequent product $Q^H y$ has to be computed separately. If several detections share the same channel matrix, the received vectors can be appended in order to the right of the channel matrix, forming a matrix with more columns than rows. During QR decomposition, the Givens rotation method is applied only to the square matrix on the left, and the received vectors on the right undergo the same rotations; when the decomposition finishes, all the rotated received vectors are available. The size of the matrix to be decomposed therefore varies with the size of the channel matrix and with the number of detections that share one channel matrix.
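The idea of rotating the received vectors together with the channel matrix can be illustrated with a minimal floating-point sketch. It assumes real-valued data and exact Givens rotations (no CORDIC), and the function name `givens_qr_augmented` is hypothetical; it is not the hardware structure described later.

```python
import math

def givens_qr_augmented(a, n_cols_h):
    """Triangularize the left n_cols_h columns of `a` with Givens rotations.

    Each rotation acts on two whole adjacent rows, so extra columns on the
    right (for example received vectors) are rotated along as a side effect.
    """
    a = [list(map(float, row)) for row in a]
    n_rows, n_cols = len(a), len(a[0])
    for j in range(min(n_cols_h, n_rows - 1)):
        # Clear column j from the bottom row upward, pairing adjacent rows.
        for i in range(n_rows - 1, j, -1):
            x, y = a[i - 1][j], a[i][j]
            r = math.hypot(x, y)
            if r == 0.0:
                continue  # nothing to rotate
            c, s = x / r, y / r
            for k in range(n_cols):
                u, v = a[i - 1][k], a[i][k]
                a[i - 1][k] = c * u + s * v
                a[i][k] = c * v - s * u
    return a

# Channel matrix H (3 x 3) with one received vector y appended on the right.
h = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
y = [1.0, 2.0, 3.0]
result = givens_qr_augmented([row + [yi] for row, yi in zip(h, y)], 3)
```

After the call, the left 3 x 3 block of `result` holds R and the last column holds the rotated received vector, whose length equals that of `y` because every rotation is orthogonal.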
Implementing the Givens rotation with the CORDIC algorithm greatly reduces the multiplications and divisions that consume large amounts of hardware resources; the main operations are shifts, additions, and subtractions, which are simple to implement in hardware. The CORDIC algorithm has two modes: Vectoring mode and Rotation mode. Vectoring mode answers the following question: given a pair (X, Y) regarded as a vector, what are the length of the vector and its angle to the X axis? Equivalently, given a pair (X, Y) regarded as a vector, through what angle must it be rotated to land on the X axis, and what is its length? Rotation mode answers the following question: given a pair (X, Y) and an angle θ, with (X, Y) regarded as a vector, what is the vector after rotation through θ? In both modes the basic idea is to split the angle $\alpha$ into the form $\alpha = d_1\alpha_1 + d_2\alpha_2 + \dots + d_n\alpha_n$ and then process each term in turn.
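The two CORDIC modes can be sketched in floating point as follows. This is only an illustration under stated assumptions: the elementary angles are taken to be atan(2^-k), the iteration count `N_ITER` is arbitrary, the input to `vectoring` is assumed to have x > 0, and multiplications by 2^-k stand in for hardware shifts.

```python
import math

N_ITER = 16
ALPHAS = [math.atan(2.0 ** -k) for k in range(N_ITER)]  # elementary angles
GAIN = 1.0
for k in range(N_ITER):
    GAIN *= math.sqrt(1.0 + 4.0 ** -k)  # accumulated CORDIC scale factor

def micro_rotate(x, y, d, k):
    """One shift-and-add step: rotate by d * atan(2^-k), scaled by sqrt(1 + 4^-k)."""
    return x - d * y * 2.0 ** -k, y + d * x * 2.0 ** -k

def vectoring(x, y):
    """Drive y to zero; return (length, angle to the X axis). Assumes x > 0."""
    angle = 0.0
    for k in range(N_ITER):
        d = -1 if y > 0 else 1
        x, y = micro_rotate(x, y, d, k)
        angle -= d * ALPHAS[k]
    return x / GAIN, angle

def rotation(x, y, theta):
    """Rotate (x, y) through theta by driving the residual angle to zero."""
    residual = theta
    for k in range(N_ITER):
        d = 1 if residual >= 0 else -1
        x, y = micro_rotate(x, y, d, k)
        residual -= d * ALPHAS[k]
    return x / GAIN, y / GAIN

length, angle = vectoring(3.0, 4.0)
rx, ry = rotation(1.0, 0.0, math.pi / 6)
```

With 16 iterations the results agree with `math.hypot`, `math.atan2`, `math.cos`, and `math.sin` to roughly four decimal places.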
During QR decomposition, setting $h_{i+1,j}$ to zero requires operating on the following part of the matrix H:
$$\begin{bmatrix} h_{i,j} & h_{i,j+1} & h_{i,j+2} & \cdots & h_{i,m-1} & h_{i,m} \\ h_{i+1,j} & h_{i+1,j+1} & h_{i+1,j+2} & \cdots & h_{i+1,m-1} & h_{i+1,m} \end{bmatrix}$$
That is, CORDIC Vectoring mode is first applied to the first column of this submatrix to compute the angle $\theta_{i+1,j}$ through which the pair $(h_{i,j}, h_{i+1,j})$ must turn to reach the X axis, and Rotation mode is then used to left-multiply the remaining columns of the submatrix by the rotation matrix $G(i+1, j, \theta_{i+1,j})$.
This actually wastes a great deal of computation. The Vectoring mode module first computes $d_i$, $i = 1, 2, \dots, n$, and then uses the $d_i$ to compute $\alpha = d_1\alpha_1 + d_2\alpha_2 + \dots + d_n\alpha_n$. It passes the computed $\alpha$ to the Rotation mode module, which uses the received $\alpha$ to compute the $d_i$ again and then performs the individual rotations with each $d_i$. Note that the Vectoring mode module and the Rotation mode module only need to rotate through the same angle; the actual value of $\alpha$ never has to be computed. It is enough to ensure that the $d_i$ in the two modules are identical. The Vectoring mode module therefore only needs to pass its computed $d_i$ to the Rotation mode module, which rotates directly with the received $d_i$.
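A small sketch of this shortcut: the head vector is processed in Vectoring mode while its direction bits are recorded, and a second vector is rotated by replaying those bits, with no angle ever computed. The function names and the 16-step iteration count are assumptions for illustration, and the head vector is assumed to have x > 0.

```python
import math

def vectoring_record(x, y, n_iter=16):
    """Vectoring mode on the head vector; also return the recorded d_i."""
    directions = []
    for k in range(n_iter):
        d = -1 if y > 0 else 1  # rotate toward y == 0
        directions.append(d)
        x, y = x - d * y * 2.0 ** -k, y + d * x * 2.0 ** -k
    return (x, y), directions

def rotation_replay(x, y, directions):
    """Rotation mode that reuses recorded d_i instead of an angle."""
    for k, d in enumerate(directions):
        x, y = x - d * y * 2.0 ** -k, y + d * x * 2.0 ** -k
    return x, y

gain = 1.0
for k in range(16):
    gain *= math.sqrt(1.0 + 4.0 ** -k)  # scale factor of 16 micro-rotations

(head_x, head_y), dirs = vectoring_record(3.0, 4.0)
fol_x, fol_y = rotation_replay(4.0, -3.0, dirs)
```

Here (4, -3) is perpendicular to the head vector (3, 4), so after the shared rotation it should lie along the negative Y axis with the same scaled length, which is what the replay produces up to CORDIC precision.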
Summary of the invention
To overcome the deficiencies of the prior art described above, the object of the present invention is to provide an extensible QR decomposition method, structure, and application based on a pipeline working mode that occupy few hardware resources, have low system latency and high throughput, and can be flexibly extended on demand to matrices of various sizes.
To achieve this object, the technical solution adopted by the present invention is as follows:
An extensible QR decomposition method based on a pipeline working mode adopts a layered, cascaded structure with pipelined operation between layers. For a matrix H to be decomposed with $N_R$ rows and $N_T$ columns, the number of layers is $N_R - 1$ when $N_T \ge N_R$ and $N_T$ when $N_T < N_R$. Each layer applies Givens rotations to two rows of matrix data; the Givens rotations are carried out with the CORDIC algorithm and implemented with addition and shift operations.
When $N_T > N_R$, QR decomposition is applied to the $N_R \times N_R$ submatrix on the left, and the $N_R \times (N_T - N_R)$ submatrix on the right undergoes the same rotations. The decomposition proceeds in the following steps:
Step (1): row $N_R$ of the matrix is input to layer 1;
Step (2): row $N_R - 1$ of the matrix is input to layer 1; layer 1 applies a Givens rotation to rows $N_R$ and $N_R - 1$, setting matrix element $h_{N_R,1}$ to zero;
Step (3): row $N_R - 2$ of the matrix is input to layer 1, and row $N_R$ output by layer 1 is input to layer 2; layer 1 applies a Givens rotation to rows $N_R - 1$ and $N_R - 2$, setting matrix element $h_{N_R-1,1}$ to zero;
Step (4): row $N_R - 3$ of the matrix is input to layer 1, and row $N_R - 1$ output by layer 1 is input to layer 2; layer 1 applies a Givens rotation to rows $N_R - 2$ and $N_R - 3$, setting matrix element $h_{N_R-2,1}$ to zero; layer 2 applies a Givens rotation to rows $N_R$ and $N_R - 1$, setting matrix element $h_{N_R,2}$ to zero;
……
Step ($N_R$): row 1 of the matrix is input to layer 1; row $2i+1$ output by layer $i$ [formula image: range of $i$] is input to layer $i+1$; layer $j$ [formula image: range of $j$] applies a Givens rotation to rows $2j$ and $2j-1$, setting matrix element $h_{2j,j}$ to zero; layer 1 outputs row 1 of the R matrix;
Step ($N_R + 1$): row $2i$ output by layer $i$ [formula image: range of $i$] is input to layer $i+1$; layer $j$ [formula image: range of $j$] applies a Givens rotation to rows $2j-1$ and $2j-2$, setting matrix element $h_{2j-1,j}$ to zero; layer 2 outputs row 2 of the R matrix;
……
Step ($2N_R - 2$): row $N_R - 1$ output by layer $N_R - 2$ is input to layer $N_R - 1$; layer $N_R - 1$ applies a Givens rotation to rows $N_R$ and $N_R - 1$, setting matrix element $h_{N_R,N_R-1}$ to zero; layer $N_R - 1$ outputs rows $N_R$ and $N_R - 1$ of the R matrix at the same time.
When $N_T < N_R$, the QR decomposition proceeds in the following steps:
Step (1): row $N_R$ of the matrix is input to layer 1;
Step (2): row $N_R - 1$ of the matrix is input to layer 1; layer 1 applies a Givens rotation to rows $N_R$ and $N_R - 1$, setting matrix element $h_{N_R,1}$ to zero;
Step (3): row $N_R - 2$ of the matrix is input to layer 1, and row $N_R$ output by layer 1 is input to layer 2; layer 1 applies a Givens rotation to rows $N_R - 1$ and $N_R - 2$, setting matrix element $h_{N_R-1,1}$ to zero;
Step (4): row $N_R - 3$ of the matrix is input to layer 1, and row $N_R - 1$ output by layer 1 is input to layer 2; layer 1 applies a Givens rotation to rows $N_R - 2$ and $N_R - 3$, setting matrix element $h_{N_R-2,1}$ to zero; layer 2 applies a Givens rotation to rows $N_R$ and $N_R - 1$, setting matrix element $h_{N_R,2}$ to zero;
……
Step ($N_R$): row 1 of the matrix is input to layer 1; row $2i+1$ output by layer $i$ [formula image: range of $i$] is input to layer $i+1$; layer $j$ [formula image: range of $j$] applies a Givens rotation to rows $2j$ and $2j-1$, setting matrix element $h_{2j,j}$ to zero; layer 1 outputs row 1 of the R matrix;
Step ($N_R + 1$): row $2i$ output by layer $i$ [formula image: range of $i$] is input to layer $i+1$; layer $j$ [formula image: range of $j$] applies a Givens rotation to rows $2j-1$ and $2j-2$, setting matrix element $h_{2j-1,j}$ to zero; layer 2 outputs row 2 of the R matrix;
……
Step ($N_R + N_T - 2$): row $N_T - 1$ output by layer $N_T - 2$ is input to layer $N_T - 1$; layer $N_T - 1$ applies a Givens rotation to rows $N_T$ and $N_T - 1$, setting matrix element $h_{N_T,N_T-1}$ to zero; layer $N_T - 1$ outputs row $N_T - 1$ of the R matrix;
Step ($N_R + N_T - 1$): row $N_T$ output by layer $N_T - 1$ is input to layer $N_T$; layer $N_T$ applies a Givens rotation to rows $N_T + 1$ and $N_T$, setting matrix element $h_{N_T+1,N_T}$ to zero; layer $N_T$ outputs row $N_T$ of the R matrix.
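Read together, the two step lists suggest a regular schedule: layer j appears to rotate from step 2j through step N_R + j - 1, and at a given step it pairs row N_R - step + 2j with the row above it. This closed form is inferred from the listed steps rather than stated in the text, and the function names below are hypothetical. A short sketch that enumerates the schedule:

```python
def num_layers(n_r, n_t):
    """Layer count from the text: N_R - 1 if N_T >= N_R, otherwise N_T."""
    return n_r - 1 if n_t >= n_r else n_t

def rotation_schedule(n_r, n_t):
    """Map step -> list of (layer, lower_row, upper_row), all 1-indexed.

    The rotation in `layer` zeroes element h[lower_row, layer].
    """
    schedule = {}
    for j in range(1, num_layers(n_r, n_t) + 1):
        for step in range(2 * j, n_r + j):  # assumed active window of layer j
            lower = n_r - step + 2 * j
            schedule.setdefault(step, []).append((j, lower, lower - 1))
    return schedule

square = rotation_schedule(4, 4)  # last step should be 2 * N_R - 2 = 6
tall = rotation_schedule(8, 4)    # last step should be N_R + N_T - 1 = 11
```

For the 4 x 4 case, step 4 has layer 1 rotating rows 2 and 1 while layer 2 rotates rows 4 and 3, matching Step ($N_R$) above; over the whole schedule every element below the diagonal is zeroed exactly once.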
The Givens rotation is equivalent to left-multiplying by an orthogonal matrix: it rotates a vector in a two-dimensional plane through any given angle while keeping its length unchanged, and rotating the vector onto a coordinate axis sets one of its coordinates to zero.
The QR decomposition is implemented by a controller and a data processing module. Each layer of the data processing module is a row rotation module. The output port of each layer's row rotation module is connected to the input port of the next layer's row rotation module. The matrix data to be decomposed are input at the input port of the first-layer row rotation module and flow through the row rotation modules in sequence, and all row rotation modules work in parallel. The controller likewise has a layered structure and consists of two kinds of controller module, control one and control two. Each layer's row rotation module is controlled by one control two; the control two modules of the layers are cascaded, and one control one is cascaded in front of the first layer's control two.
Based on the CORDIC algorithm, the row rotation module splits up the CORDIC iteration to form a pipeline and processes the group of vectors to be rotated, which enter the module sequentially, in pipelined fashion:
The first valid vector in the group of vectors to be rotated is called the head rotation vector. It is rotated in Vectoring mode onto a coordinate axis so that one of its components becomes zero; the component set to zero is called Y and the other component X. While the head rotation vector is rotated, the rotation directions are recorded inside the CORDIC module; the subsequent rotation vectors are rotated in turn in Rotation mode according to the recorded directions and thus undergo the same rotation as the head rotation vector;
In this way, when the row rotation module performs the CORDIC iteration, the different rotation vectors begin rotating one after another, finish rotating one after another, and are output one after another;
The row rotation module has three modes. During one QR decomposition, the first time the row rotation module is started it uses mode 0: the input row of channel matrix data is stored in a shift register group and no CORDIC iteration is performed. The second time it is started it uses mode 1: the input data are connected to the X input port of the iteration unit, the data stored in mode 0 are connected to the Y input port, and the CORDIC iteration is performed. From the third start onward it uses mode 2: the input data are connected to the X input port of the iteration unit, the X component of the module's output data is connected to the Y input port, and the CORDIC iteration is performed.
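The data flow through the three modes can be modelled functionally. The sketch below is a behavioural illustration only: it uses exact real-valued Givens rotations instead of CORDIC, ignores clock-level timing, and drops the zeroed leading element before passing a row to the next layer (a simplification; the text suggests the hardware keeps full rows and marks the head vector instead). The class and function names are hypothetical.

```python
import math

class RowRotationLayer:
    """Behavioural model of one layer: mode 0 stores, modes 1 and 2 rotate."""

    def __init__(self):
        self.period = 0
        self.held = None  # mode 0: stored row; afterwards: fed-back X output

    def feed(self, row):
        self.period += 1
        if self.period == 1:  # mode 0: store only, no rotation
            self.held = list(row)
            return None
        # Mode 1 (Y = stored row) or mode 2 (Y = previous X output).
        x_in, y_in = list(row), self.held
        r = math.hypot(x_in[0], y_in[0])
        c, s = (x_in[0] / r, y_in[0] / r) if r else (1.0, 0.0)
        x_out = [c * x + s * y for x, y in zip(x_in, y_in)]
        y_out = [c * y - s * x for x, y in zip(x_in, y_in)]
        self.held = x_out  # X output is fed back to the Y port
        return y_out[1:]   # Y output, minus its zeroed head, goes downstream

def cascaded_qr(h):
    """Feed rows bottom-up through the cascade and collect the rows of R."""
    n_r, n_t = len(h), len(h[0])
    layers = [RowRotationLayer() for _ in range(n_r - 1 if n_t >= n_r else n_t)]
    leftovers = []
    for row in reversed(h):
        data = [float(v) for v in row]
        for layer in layers:
            data = layer.feed(data)
            if data is None:
                break
        else:
            leftovers.append(data)  # rows emerging from the last layer
    rows = [[0.0] * j + layer.held for j, layer in enumerate(layers)]
    rows += [[0.0] * len(layers) + d for d in leftovers]
    return rows
```

Calling `cascaded_qr` on a small matrix returns an upper triangular R whose Gram matrix matches that of the input, which is the property an orthogonal triangularization must preserve.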
The start and end signals of a control two controller module are generated and supplied by the control two module of the layer above, and they start and end the work of the current layer's row rotation module. The control one controller module cascaded in front of the first-layer control two module generates and supplies the start and end signals of the first-layer control two module, which yields a layered, cascaded controller. Through two signals, start and mode select, each control two module controls the work of its layer's row rotation module and, together with the row rotation modules of the other layers, completes the QR decomposition of the whole module.
The recursive structure of the layered, cascaded controller follows this data-operation timing:
The iteration period in which a layer reads in data for the first time is taken as that layer's first iteration period;
The first-layer row rotation module runs for ROW iteration periods, where ROW is the number of rows of the matrix; after the ROW-th iteration period ends, the first-layer row rotation module finishes work;
After the second iteration period of each layer ends, that layer's control module starts the next layer;
One iteration period after each layer finishes work, that layer's control module ends the work of the next layer.
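The four timing rules above amount to a simple recurrence: each layer starts two iteration periods after the layer above it and ends one period after it. A minimal sketch, with the hypothetical helper `layer_windows` counting iteration periods from 1:

```python
def layer_windows(row, layers):
    """Return (first_period, last_period) for each layer, top layer first."""
    start, end = 1, row  # layer 1 works for ROW iteration periods
    windows = []
    for _ in range(layers):
        windows.append((start, end))
        # The next layer starts when this layer's third period begins
        # and ends one period after this layer ends.
        start, end = start + 2, end + 1
    return windows

windows_8x4 = layer_windows(8, 4)
```

For an 8 x 4 matrix with four layers this gives the windows (1, 8), (3, 9), (5, 10), and (7, 11), so the whole decomposition spans 11 iteration periods.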
The CORDIC algorithm is a method for computing rotations of two-dimensional vectors; it implements the rotation with addition and shift operations, thereby reducing multiplications and divisions.
Compared with existing methods, the present invention makes full use of the data dependencies that arise when QR decomposition is implemented with the Givens rotation method, and achieves low latency. Because the Givens rotation is implemented with the CORDIC algorithm, few multipliers and dividers are needed. The layered, cascaded structure with pipelining between layers not only saves hardware resources and raises throughput but can also be flexibly extended into QR decomposition modules for matrices of various sizes. When the invention is applied in a MIMO nonlinear detector, the way in which the signals from different transmit antennas superimpose at the receive antennas can be simplified at the cost of a small amount of extra resources and latency, which simplifies the data detection process and greatly reduces algorithmic complexity. In addition, the flexible extensibility of the QR decomposition meets the extension needs of MIMO nonlinear detectors in the case where multiple detections share the same channel matrix.
Description of the drawings
Fig. 1 shows the mode 0 data path of the serial row rotation module proposed by the invention.
Fig. 2 shows the mode 1 data path of the serial row rotation module proposed by the invention.
Fig. 3 shows the mode 2 data path of the serial row rotation module proposed by the invention.
Fig. 4 is a hardware schematic of the serial row rotation module proposed by the invention.
Fig. 5 is the space-time diagram of QR decomposition with the layered pipeline structure proposed by the invention.
Fig. 6 shows the control one module proposed by the invention.
Fig. 7 shows the control two module proposed by the invention.
Fig. 8 shows the hardware implementation architecture of QR decomposition with the layered pipeline structure proposed by the invention.
Embodiment
The present invention is described in further detail below with reference to the drawings and embodiments.
The method of the invention is implemented on an FPGA; the specifics are as follows:
(1) Function, principle, and implementation of the row rotation module.
The function of the row rotation module is to carry out the rotations of the Vectoring mode module and the Rotation mode module together within one CORDIC iteration period, so that the following transformation is achieved within one CORDIC iteration period:
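$$\begin{bmatrix} \otimes & \otimes & \otimes & \otimes & \otimes & \otimes \\ \otimes & \otimes & \otimes & \otimes & \otimes & \otimes \end{bmatrix} \rightarrow \begin{bmatrix} \times & \times & \times & \times & \times & \times \\ 0 & \times & \times & \times & \times & \times \end{bmatrix}$$
where $\otimes$ denotes an element that is transformed by the Givens rotation.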
Based on the CORDIC algorithm, the row rotation module splits up the hardware structure of one CORDIC module to form a pipeline. For two adjacent rows of channel matrix data, the two elements in each column form one vector to be rotated; together these form the group of vectors to be rotated, which are input to the module sequentially and processed in pipelined fashion. The head rotation vector is rotated in Vectoring mode and the rotation directions are recorded inside the CORDIC module; the subsequent rotation vectors are rotated in turn in Rotation mode according to the recorded directions and thus undergo the same rotation as the head rotation vector. The different rotation vectors thus begin rotating one after another, finish rotating one after another, and are output one after another. If the row were rotated in parallel, a row rotation module would contain one Vectoring mode module and several Rotation mode modules, with the Rotation mode modules following the Vectoring mode module and performing the same rotation simultaneously: all CORDIC modules would receive their input at the same time, rotate identically in parallel, finish at the same time, and output at the same time. Compared with such a parallel row rotation module, the serial row rotation module of the invention greatly reduces the hardware resources consumed.
The serial row rotation module has three modes in total. They differ mainly in the path taken by the input data at the start of a new CORDIC iteration period and in whether the subsequent iteration is performed. The clock cycle in which the start signal enables the module is taken as the first clock of the iteration period. The first data item is input at the same time; it is paired with another data item that is already prepared to form the first vector to be rotated, which enters the module pipeline. The clock in which the module outputs the rotation result of this first vector is the last clock of the current iteration period, so the length of an iteration period equals the number of pipeline stages of the row rotation module. After the start signal enables the module, one data item is input per clock and is paired with another prepared data item to form a vector to be rotated that enters the pipeline. Over the whole iteration period, one full row of matrix data is input to the module in order from left to right. As a special case, in the mode in which the serial row rotation module does not iterate, the full row of matrix data input during the iteration period is only stored in the shift register in order and does not enter the pipeline; it waits until the next row of matrix data is input, forms vectors to be rotated with it, and then enters the pipeline in order.
The iteration period in which a layer reads in data for the first time performs no iteration; at the start of that period the layer begins reading in the last row of the matrix. Although no iteration is done in this period, the layer's hardware is in the working state from this period onward, so the period in which a layer first reads in data is taken as that layer's first iteration period.
1. Mode 0
The first iteration period of the current row rotation module uses mode 0. In the first iteration period, the input row of channel matrix data is stored in the shift register group. Because only one row of channel matrix data has been input and iteration cannot begin until the next row is input, mode 0 performs no CORDIC iteration; that is, the subsequent CORDIC iteration units are not started.
During this period, the mode 0 data path of the serial row rotation module is as shown in Fig. 1.
As Fig. 1 shows, the data of the current row of the channel matrix, input one item per clock cycle, are stored in order in the shift register data_in_temp and wait for the CORDIC iteration that begins with mode 1. The CORDIC iteration units are not started during the mode 0 period.
2. Mode 1
The second iteration period of the current row rotation module uses mode 1. Because the row of channel matrix data input in the first iteration period is ready, the CORDIC iteration can be performed. The input data are connected to the X input port of the iteration unit, the data stored during the mode 0 period are connected to the Y input port, and the iteration is performed.
During this period, the mode 1 data path of the serial row rotation module is as shown in Fig. 2.
As Fig. 2 shows, the current row of channel matrix data is input one item per clock cycle to the X input port of the CORDIC iteration unit, and the channel matrix data stored in the register data_in_temp are fed to the Y input port at one item per clock cycle. In each clock cycle, therefore, one vector to be rotated enters the CORDIC iteration unit and is rotated. Note that in mode 1 the shift register data_in_temp must still shift once per clock so that it supplies data to the CORDIC iteration unit at one item per clock.
The CORDIC iteration module that follows is divided into many CORDIC iteration units. The operation performed by the i-th CORDIC iteration unit, CORDIC unit i, is:
$$x_{i+1} = x_i - y_i \, d_i \, 2^{-(i-2)}, \qquad y_{i+1} = y_i + x_i \, d_i \, 2^{-(i-2)}$$
In particular, the operation performed by CORDIC unit 1 is:
$$x_2 = -d_1 y_1, \qquad y_2 = d_1 x_1$$
Here the rotation direction $d$ is produced differently depending on whether the current rotation vector is the head rotation vector or a subsequent rotation vector.
head[i] records whether the current rotation vector is the head rotation vector, and p_or_n[i] records the rotation direction of the current CORDIC rotation unit. head[i] == 1 means that the current rotation vector is the head rotation vector. If y < 0, p_or_n[i] is set to 0, meaning that the current CORDIC rotation unit rotates in the positive direction, that is, $d_i$ is 1.
If the head of the previous rotation unit is 1, the rotation vector that the current rotation unit is about to receive is the head rotation vector. The head of the current rotation unit is set to 1; the rotation direction of the current unit is determined from the rotation vector received from the previous unit; the rotated result is stored in the current unit, and the rotation direction is stored in the current unit's p_or_n.
If the head of the previous rotation unit is 0, the rotation vector that the current rotation unit is about to receive is not the head rotation vector. The head of the current rotation unit is set to 0; the rotation direction is read from the current unit's p_or_n and used to rotate the vector received from the previous unit; the rotated result is stored in the current unit, and the current unit's p_or_n remains unchanged.
As explained later, the head rotation vector is not necessarily the first vector input; the marking of the head rotation vector is done in CORDIC unit 0.
With the mechanism above, the head rotation vector undergoes the Vectoring mode operation and the subsequent rotation vectors undergo the Rotation mode operation with the same rotation directions as the head rotation vector, which realizes the row rotation. In the serial row rotation module the rotation vectors do not complete their CORDIC rotations simultaneously but one after another, and the output port delivers one rotated vector per clock. In other words, the serial row rotation module takes in one vector to be rotated per clock and outputs one rotated result per clock.
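The head flag and per-unit direction register can be simulated clock by clock. The sketch below is a floating-point model under several assumptions: the per-unit update follows the formulas given for CORDIC unit i and CORDIC unit 1, the direction for y >= 0 is taken to be the opposite of the stated y < 0 case, the head vector is simply the first vector supplied, and the multiplication and clipping stages are omitted. The function names are hypothetical.

```python
import math

def cordic_unit(i, x, y, d):
    """Unit 1 turns by d * 90 degrees; unit i >= 2 uses the shift 2^-(i-2)."""
    if i == 1:
        return -d * y, d * x
    t = 2.0 ** -(i - 2)
    return x - y * d * t, y + x * d * t

def serial_row_rotation(vectors, n_units=16):
    """Push one (x, y) vector per clock through a chain of CORDIC units."""
    regs = [None] * n_units  # pipeline register after each unit
    p_or_n = [0] * n_units   # stored direction: 0 means d = +1, 1 means d = -1
    outputs = []
    for clock in range(len(vectors) + n_units):
        if regs[-1] is not None:
            outputs.append(regs[-1][:2])  # one rotated vector leaves per clock
        # Update the last unit first so each unit reads its predecessor's old value.
        for i in range(n_units - 1, -1, -1):
            if i > 0:
                src = regs[i - 1]
            elif clock < len(vectors):
                src = tuple(vectors[clock]) + (clock == 0,)  # mark head vector
            else:
                src = None
            if src is None:
                regs[i] = None
                continue
            x, y, head = src
            if head:  # Vectoring mode: decide and record the direction
                p_or_n[i] = 0 if y < 0 else 1
            d = 1 if p_or_n[i] == 0 else -1  # followers reuse the record
            regs[i] = cordic_unit(i + 1, x, y, d) + (head,)
    return outputs

gain = 1.0
for k in range(15):  # units 2..16 each scale by sqrt(1 + 4^-k); unit 1 does not
    gain *= math.sqrt(1.0 + 4.0 ** -k)

out = serial_row_rotation([(3.0, 4.0), (4.0, -3.0), (-4.0, 3.0)])
```

The head vector (3, 4) comes out on the X axis with scaled length 5, and the two followers, which are perpendicular to it, come out on the negative and positive Y axis respectively, one result per clock once the pipeline has filled.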
After rotation, the vector passes through multiplication and clipping operations, performed by the two submodules on the right of Fig. 2.
3. Mode 2
The third and later iteration periods of the current module use mode 2. Mode 2 performs the CORDIC iteration. The input data are connected to the X input port of the iteration unit, the module output x_out is connected to the Y input port of the iteration unit, and the iteration is performed.
During this period, the mode 2 data path of the serial row rotation module is as shown in Fig. 3.
As Fig. 3 shows, the current row of channel matrix data is input one item per clock cycle to the X input port of the CORDIC iteration unit. As described above, the module's data output port x_out delivers one item per clock, which is connected to the Y input port of the CORDIC iteration unit. In each clock cycle, therefore, one vector to be rotated enters the CORDIC iteration unit and is rotated.
The CORDIC iteration units and the multiplication and clipping modules that follow work in the same way as in mode 1 and are not described again here.
The clock and reset signals of the QR decomposition module are fed directly to the serial row rotation module as its clock and reset signals. The module's output-valid signal becomes valid when the first rotation result, that is, the rotation result of the head vector, is output. In each of the following clock cycles one rotation result is output.
The hardware schematic of the serial row rotation module of the invention is shown in Fig. 4.
(2) function of controller, principle and realization.
The QR that the channel matrix of different dimensions need be used different scales decomposes.Be extended to other scales conveniently for QR decomposes hardware module, specially designed the controller of layering.
Described level pipeline organization QR decomposes hardware configuration for invention, can obtain following sequential relationship.
1. In the iteration cycle in which the current layer reads in data for the first time, no iteration is performed; the layer only reads in the last row of the matrix, starting at the beginning of the iteration cycle. Although no iteration work is done in this cycle, the hardware of the current layer is in the working state from this cycle onward, so the cycle in which the current layer first reads in data is counted as the first iteration cycle of the current layer.
2. To complete the QR decomposition of one matrix, the first-layer row rotation module must run for ROW iteration cycles (including the first iteration cycle, in which no iteration work is done), where ROW is the number of rows of the matrix. After ROW iteration cycles, the first-layer row rotation module has finished its work.
3. After the second iteration cycle of each layer ends, the current layer outputs its result to the next layer, so the third iteration cycle of the current layer and the first iteration cycle of the next layer start at the same time. Therefore, after the second iteration cycle of each layer ends, the control module of the current layer starts the next layer.
4. After the current layer finishes its work, it passes data down to the next layer one last time, so the next layer finishes one iteration cycle later than the layer above it. Therefore, one iteration cycle after each layer finishes, the control module of the current layer ends the work of the next layer.
Items 3 and 4 show a recursive relation between the control signals of successive layers. This property makes it possible to implement the controller as a layered cascade.
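The four timing relations can be turned into a small schedule calculation. The sketch below is an illustration under stated assumptions: the function name `layer_schedule` is hypothetical, time is counted in whole iteration cycles starting at 1, and the number of layers is passed in by the caller (for an 8 × 4 matrix the claims give 4 layers).

```python
def layer_schedule(row_count, layer_count):
    """Return {layer: (first_cycle, last_cycle)} in iteration cycles.

    Layer 1 runs for row_count cycles (relation 2). Each following layer
    starts two cycles after the layer above it (relation 3) and finishes
    one cycle after the layer above it (relation 4).
    """
    schedule = {}
    first, last = 1, row_count
    for layer in range(1, layer_count + 1):
        schedule[layer] = (first, last)
        first += 2
        last += 1
    return schedule


if __name__ == "__main__":
    for layer, (first, last) in layer_schedule(8, 4).items():
        print(f"layer {layer}: iteration cycles {first}..{last}")
```

Under these assumptions each layer is active for one cycle fewer than the layer above it, and the last layer of the 8 × 4 example finishes in iteration cycle 11.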
Taking an 8 × 4 matrix as an example, the space-time diagram of the layered pipelined QR decomposition of the invention with the pipeline in steady state is shown in Figure 5. In the figure, every two rows represent one layer of the QR decomposition hardware structure, and the numbers indicate which row of the matrix the current row is. Red indicates that the layer is working in the current time slot.
The multi-layer controller consists of two kinds of controller module, control one and control two. The row rotation module of each layer is controlled by one control two; the control two modules of the layers are cascaded, and one control one is cascaded before the control two of the first layer. The two kinds of controller module are described in detail below.
① control one
The control one module of the invention is shown in Figure 6.
The control one module generates the end signal for the first-layer row rotation module according to the number of rows of the matrix to be decomposed. The reset signal of the QR decomposition module is fed directly to the control one module as its reset signal. When a new matrix to be decomposed arrives, the start signal received by the QR decomposition module is used to start the control one module. oe_first_level is connected to the iteration-done signal of the first-layer row rotation module; each pulse received indicates that the first-layer row rotation module has completed one iteration. When the first-layer row rotation module has completed ROW iterations (ROW being the number of rows of the matrix to be decomposed), the first layer has finished its part of the decomposition of the current matrix and can stop; finish_next of control one is then asserted, ending the work of the first-layer row rotation module.
② control two
The control two module of the invention is shown in Figure 7.
The clk and reset signals of the QR decomposition module are fed directly to the control two module as its clk and reset signals. The two input signals start and finish respectively begin and end the work of the current layer. Control two controls the work of the current layer through two signals, style (mode select) and start_n. The output signal ready indicates that the current layer has finished its work, and also serves as the valid signal for the current layer's result output. The start_next and finish_next signals output by the current layer begin and end the work of the next layer. The operating principle of the control two module is briefly described below.
The start signal starts iteration in the current layer. After start is asserted, the control two module uses the style and start_n signals to start the row rotation module on its first, empty iteration, that is, mode 0.
The oe_before signal is connected to the iteration-done signal of the previous layer and is used to start a new iteration in the current layer. Each time the current layer receives an oe_before pulse, it uses the style and start_n signals to make the current-layer row rotation module carry out one new iteration. Note that the current layer responds to oe_before only after it has been started by the start signal and is in the started state; otherwise oe_before has no effect on the current layer.
The oe_current_level signal is connected to the iteration-done signal of the current layer. After the control two module has been started by the start signal, it asserts start_next once it has received two oe_current_level pulses, starting the work of the next layer.
The oe_next_level signal is connected to the iteration-done signal of the next layer. After the finish signal has ended the row rotation of the current layer, the control two module asserts finish_next on receiving one more oe_next_level pulse, ending the work of the next layer.
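The signal behaviour of control two described above can be modelled at the level of pulses. The class below is a hypothetical behavioural sketch, not the patented circuit: the class and attribute names are invented for illustration, clocking and reset are ignored, and the mode returned for each launched iteration (0 for the first, 1 for the second, 2 afterwards) follows the three modes of the row rotation module described in this document.

```python
class ControlTwo:
    """Pulse-level model of one control two module (not cycle-accurate)."""

    def __init__(self):
        self.active = False          # set by start, cleared by finish
        self.iterations = 0          # iterations launched since start
        self.current_done = 0        # oe_current_level pulses since start
        self.awaiting_next = False   # finish seen, waiting for oe_next_level

    def _launch(self):
        # Mode 0 = empty first iteration, 1 = second, 2 = every later one.
        mode = min(self.iterations, 2)
        self.iterations += 1
        return mode

    def start(self):
        """start pulse: begin work with the empty iteration (mode 0)."""
        self.active, self.iterations, self.current_done = True, 0, 0
        self.awaiting_next = False
        return self._launch()

    def oe_before(self):
        """Previous layer finished an iteration; ignored unless started."""
        return self._launch() if self.active else None

    def oe_current_level(self):
        """Return True on the pulse that should assert start_next."""
        if not self.active:
            return False
        self.current_done += 1
        return self.current_done == 2

    def finish(self):
        """finish pulse: stop the current layer's row rotation."""
        self.active = False
        self.awaiting_next = True

    def oe_next_level(self):
        """Return True on the pulse that should assert finish_next."""
        fire, self.awaiting_next = self.awaiting_next, False
        return fire
```

Cascading such objects, with each layer's start_next and finish_next driving the start and finish of the layer below, gives the recursive structure that the timing relations describe.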
(3) QR decomposition is implemented with a layered pipeline structure in which each layer is one row rotation module.
The structure of the invention is shown in Figure 8.
As can be seen from Figure 8, not only is each row rotation module pipelined internally, but the row rotation modules are also pipelined with respect to one another. In Figure 8, the cascade on the left is the controller, and on the right is the data processing module formed by cascading the row rotation modules.
For each serial row rotation module, the first rotated vector is output when oe is asserted, and one rotated vector is output per clock cycle in the clock cycles that follow. When start is asserted the first vector to be rotated is input, and one vector to be rotated is input per clock cycle in the clock cycles that follow. With this design, the Y components of the rotated vectors output in sequence by one layer can be fed in sequence directly into the input port of the next layer without buffering. In the data passed serially from layer i to layer i+1, the first i elements are residual values close to zero; the (i+1)-th element, together with the other element that is ready, forms the head vector. The row rotation module of layer i+1 therefore marks the (i+1)-th input vector as the head rotating vector, and the remaining vectors input afterward undergo the same rotation as the head rotating vector.
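The data flow between layers can be checked against a plain software Givens QR that processes rows in the same order. The following is a minimal sequential sketch under stated assumptions: it uses floating-point arithmetic and `math.hypot` in place of CORDIC, it does not model the pipelining, and the function name `givens_qr_r` is hypothetical. Each pass of the outer loop plays the role of one layer: it works upward from the bottom row, the head vector is the pair of elements in the layer's column, and the rest of the two rows follow with the same rotation.

```python
import math

def givens_qr_r(matrix):
    """Return the triangularised matrix R using adjacent-row Givens rotations."""
    a = [[float(v) for v in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    # One "layer" per column: rows-1 layers if cols >= rows, else cols layers.
    for col in range(min(rows - 1, cols)):
        for low in range(rows - 1, col, -1):     # bottom row first
            up = low - 1
            x, y = a[up][col], a[low][col]       # head vector (X, Y)
            r = math.hypot(x, y)
            if r == 0.0:
                continue                         # nothing to rotate
            c, s = x / r, y / r
            for k in range(col, cols):           # same rotation for the row
                u, v = a[up][k], a[low][k]
                a[up][k] = c * u + s * v         # X output, reused as next Y
                a[low][k] = -s * u + c * v       # Y output, goes to next layer
    return a
```

In this sketch the lower row of each rotated pair has its leading element set to zero and is left for the next column, which mirrors the Y components being passed to the next layer without buffering.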
In the figure, the oe_before input of the first-layer control two module is connected to the oe of the first layer, so this QR decomposition structure requires the input data to be continuous. That is, each time the first layer finishes iterating on a head vector, the first element of the next row is assumed to be ready at the input, and the first-layer row rotation module automatically reads the next row of matrix data in sequence from the input and carries out the next iteration cycle.
The elements of the channel matrix are input to the QR decomposition module one at a time. n received vectors are placed to the right of the channel matrix, forming together with it a (2*Rx) x (2*Tx+n) matrix (Tx and Rx are the numbers of transmit and receive antennas, respectively). In the first cycle in which the start signal is asserted, the first element of the matrix is input; in each cycle after that, the next element is input in turn. The input order is: begin with row 2*Rx and end with row 1. Within each row the leftmost element is input first, followed by the elements to its right in order, and the rightmost element is input last.
Each output row of the decomposed matrix has its own valid signal. In the cycle in which the valid signal of the current row is asserted, the first element of the current row of the decomposed matrix is output, and one element of the current row is output in each cycle after that. The output order is from left to right: the leftmost element of the current row is output first and the rightmost last. Note that for row i the first i-1 output elements are invalid and the later output data are valid; the valid part is likewise output from left to right.
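The input ordering and the output validity rule can be written down in a few lines. This is an illustrative sketch only: the function names `input_stream` and `output_valid_mask` and the list-of-rows data layout are assumptions, with `received` holding, for each matrix row, the entries of the n received vectors that sit to the right of the channel matrix.

```python
def input_stream(channel, received):
    """Yield matrix elements in the order described for the input port.

    The augmented matrix is [channel | received]; the last row is sent
    first and the first row last, each row from left to right.
    """
    rows = [list(h) + list(r) for h, r in zip(channel, received)]
    for row in reversed(rows):
        for element in row:
            yield element


def output_valid_mask(row_index, width):
    """Validity of the elements output for row i (1-based): first i-1 invalid."""
    invalid = min(row_index - 1, width)
    return [False] * invalid + [True] * (width - invalid)
```

For a two-row example with `channel = [[1, 2], [3, 4]]` and `received = [[9], [8]]`, the stream is 3, 4, 8, 1, 2, 9.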
Embodiment: to illustrate the function and performance of the QR decomposition hardware module of the invention, we take as an example a 4*4 complex MIMO system in which 6 MIMO detections share the same channel matrix.
The 4*4 complex MIMO system is converted into an 8*8 real MIMO system, and the channel matrix is the same for the 6 MIMO detections. That is, each QR decomposition carries 6 detections that use the same channel matrix.
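The conversion from a complex system to a real one of twice the size can be illustrated as follows. The text does not state which arrangement of real and imaginary parts is used, so the block layout below, [[Re H, -Im H], [Im H, Re H]], is an assumption (it is the common real-valued decomposition), and the function name `complex_to_real` is hypothetical.

```python
def complex_to_real(h):
    """Map an m x n complex matrix to a 2m x 2n real matrix.

    With y = H x, stacking real parts over imaginary parts gives
    [Re y; Im y] = [[Re H, -Im H], [Im H, Re H]] [Re x; Im x].
    """
    top = [[z.real for z in row] + [-z.imag for z in row] for row in h]
    bottom = [[z.imag for z in row] + [z.real for z in row] for row in h]
    return top + bottom
```

Applied to a 4*4 complex channel matrix this yields an 8*8 real matrix, which is the size used in the embodiment.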
The number of CORDIC iterations is taken as 10 and the multiplier pipeline depth as 3. A vector therefore needs 10+3+1=14 clock cycles from entering the CORDIC module to the output of the rotated vector, where 1 clock is for the output limiting operation. Let the number of matrix columns be 14, so that each row has 14 elements. Then, when the 14th clock cycle ends, the first rotating vector of the current row outputs its rotation result just as the input of the 14 elements of the current row finishes. On the next clock, the first element of the new matrix row is input and, together with the X component of the rotated vector being output, forms a vector to be rotated that is fed into the module to begin a new round of iteration. In this way the structure can decompose an 8*14 real matrix, which is exactly the case described above: a 4*4 complex MIMO system in which 6 MIMO detections share the same channel matrix.
The input/output specification of the QR decomposition hardware implementation of the invention is as follows.
Input:
Clk: input clock.
Reset: reset signal.
Start: start signal.
H_in: input channel matrix data. In the first cycle in which the start signal is asserted, the first element of the matrix is input; in each cycle after that, the next element of the channel matrix is input in turn. The input order is: begin with row 2*Rx and end with row 1. Within each row the leftmost element is input first, followed by the elements to its right in order, and the rightmost element is input last.
Output:
Output port and output content are as shown in table 1;
Table 1:
[Table 1 is provided as an image in the original publication and is not reproduced here.]
X in the table denotes invalid data. Each output row of the decomposed matrix has its own valid signal. In the cycle in which the valid signal of the current row is asserted, the first element of the current row of the decomposed matrix is output, and one element of the current row is output in each cycle after that. The output order is from left to right: the leftmost element of the current row is output first and the rightmost last. Note that for row i the first i-1 output elements are invalid and the later output data are valid; the valid part is likewise output from left to right.
With the parameters set as above, the QR decomposition hardware implementation of the invention needs 1+14*8=113 clocks in total from the start signal, when the first channel element is input, to the output of the first decomposed channel element. Of these, 1 clock is the input latch clock, and the following 14*8 clocks are the clocks that the row rotation modules need to compute the current decomposition.
With the parameters set as above and the pipeline full, the number of clocks needed to complete one QR decomposition is:
(CORDIC_ITERATION_TIME+mul_pl+1)*Tx*2=(10+3+1)*4*2=112.
Here CORDIC_ITERATION_TIME is the number of CORDIC iterations and mul_pl is the pipeline depth of the multiplier. The remaining 1 clock is for the output limiting operation.
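The clock counts quoted for the embodiment can be reproduced with a few helper functions. This is a sketch for checking the arithmetic only: the function and parameter names are hypothetical, and reading the factor 8 in 1+14*8 as the number of rows of the real matrix is an assumption.

```python
def rotation_latency(cordic_iterations, multiplier_stages):
    """Clocks from a vector entering the CORDIC module to its rotated output.

    The extra 1 clock is the output limiting operation.
    """
    return cordic_iterations + multiplier_stages + 1


def clocks_per_decomposition(cordic_iterations, multiplier_stages, tx):
    """Clocks per QR decomposition with the pipeline full."""
    return rotation_latency(cordic_iterations, multiplier_stages) * tx * 2


def first_output_latency(cordic_iterations, multiplier_stages, rows):
    """Clocks from start to the first decomposed element (1 input latch clock)."""
    return 1 + rotation_latency(cordic_iterations, multiplier_stages) * rows


if __name__ == "__main__":
    print(rotation_latency(10, 3))             # 14
    print(clocks_per_decomposition(10, 3, 4))  # 112
    print(first_output_latency(10, 3, 8))      # 113
```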
With the parameters set as above, the QR decomposition of the invention was implemented on a Stratix:EP2S180F102014 device. The input and output data width is 12 bits, and the width is increased for intermediate variables that might overflow. The clock rate of the QR decomposition hardware implementation can reach 224.77 MHz. The resource usage is shown in Table 2:
Table 2
The resource that the QR decomposition of the invention uses most is DSP resources, which are used for the multiplication after the CORDIC rotation.
The extensible QR decomposition implementation method based on pipeline working mode of the invention makes full use of the data dependencies that arise when QR decomposition is realized by Givens rotations, and has a short latency. The Givens rotations are realized with the CORDIC algorithm, so few multipliers and dividers are needed. A layered cascade structure with pipelining between layers is adopted, which not only saves hardware resources and raises throughput but can also be extended flexibly into QR decomposition modules for matrices of various sizes.
The above embodiment is only intended to illustrate the extensible QR decomposition implementation method based on pipeline working mode of the invention. The specific data in it are set arbitrarily for the purpose of explanation and shall not be used to limit the scope of protection of the invention; that is, as long as an implementation follows the features described in the claims, any change in the data shall fall within the scope of protection of the invention.

Claims (9)

1. An extensible QR decomposition method based on pipeline working mode, characterized in that a layered cascade structure is adopted with pipelined operation between the layers; for a matrix H to be decomposed with N_R rows and N_T columns, when N_T >= N_R the number of layers is N_R-1, and when N_T < N_R the number of layers is N_T; in each layer, two rows of matrix data undergo a Givens rotation; the Givens rotation is carried out with the CORDIC algorithm and is realized with addition and shift operations.
2. The extensible QR decomposition method based on pipeline working mode according to claim 1, characterized in that, when N_T > N_R, QR decomposition is performed on the N_R × N_R submatrix on the left and the N_R × (N_T-N_R) submatrix on the right follows with the same rotations; the QR decomposition is carried out in sequence by the following steps:
Step (1): row N_R of the matrix is input to layer 1;
Step (2): row N_R-1 of the matrix is input to layer 1; layer 1 applies a Givens rotation to rows N_R and N_R-1 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero;
Step (3): row N_R-2 of the matrix is input to layer 1, and row N_R of the matrix output by layer 1 is input to layer 2; layer 1 applies a Givens rotation to rows N_R-1 and N_R-2 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero;
Step (4): row N_R-3 of the matrix is input to layer 1, and row N_R-1 of the matrix output by layer 1 is input to layer 2; layer 1 applies a Givens rotation to rows N_R-2 and N_R-3 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero; layer 2 applies a Givens rotation to rows N_R and N_R-1 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero;
……
Step (N_R): row 1 of the matrix is input to layer 1; row 2i+1 of the matrix output by layer i ([index range shown as an image in the original]) is input to layer i+1; layer j ([index range shown as an image in the original]) applies a Givens rotation to rows 2j and 2j-1 of the matrix, and the matrix element h_{2j,j} is set to zero; layer 1 outputs the computed row 1 of the R matrix;
Step (N_R+1): row 2i of the matrix output by layer i ([index range shown as an image in the original]) is input to layer i+1; layer j ([index range shown as an image in the original]) applies a Givens rotation to rows 2j-1 and 2j-2 of the matrix, and the matrix element h_{2j-1,j} is set to zero; layer 2 outputs the computed row 2 of the R matrix;
……
Step (2N_R-2): row N_R-1 of the matrix output by layer N_R-2 is input to layer N_R-1; layer N_R-1 applies a Givens rotation to rows N_R and N_R-1 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero; layer N_R-1 simultaneously outputs the computed rows N_R and N_R-1 of the R matrix.
3. The extensible QR decomposition method based on pipeline working mode according to claim 1, characterized in that, when N_T < N_R, the QR decomposition is carried out in sequence by the following steps:
Step (1): row N_R of the matrix is input to layer 1;
Step (2): row N_R-1 of the matrix is input to layer 1; layer 1 applies a Givens rotation to rows N_R and N_R-1 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero;
Step (3): row N_R-2 of the matrix is input to layer 1, and row N_R of the matrix output by layer 1 is input to layer 2; layer 1 applies a Givens rotation to rows N_R-1 and N_R-2 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero;
Step (4): row N_R-3 of the matrix is input to layer 1, and row N_R-1 of the matrix output by layer 1 is input to layer 2; layer 1 applies a Givens rotation to rows N_R-2 and N_R-3 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero; layer 2 applies a Givens rotation to rows N_R and N_R-1 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero;
……
Step (N_R): row 1 of the matrix is input to layer 1; row 2i+1 of the matrix output by layer i ([index range shown as an image in the original]) is input to layer i+1; layer j ([index range shown as an image in the original]) applies a Givens rotation to rows 2j and 2j-1 of the matrix, and the matrix element h_{2j,j} is set to zero; layer 1 outputs the computed row 1 of the R matrix;
Step (N_R+1): row 2i of the matrix output by layer i ([index range shown as an image in the original]) is input to layer i+1; layer j ([index range shown as an image in the original]) applies a Givens rotation to rows 2j-1 and 2j-2 of the matrix, and the matrix element h_{2j-1,j} is set to zero; layer 2 outputs the computed row 2 of the R matrix;
……
Step (N_R+N_T-2): row N_T-1 of the matrix output by layer N_T-2 is input to layer N_T-1; layer N_T-1 applies a Givens rotation to rows N_T and N_T-1 of the matrix, and the matrix element [formula shown as an image in the original] is set to zero; layer N_T-1 outputs the computed row N_T-1 of the R matrix;
Step (N_R+N_T-1): row N_T of the matrix output by layer N_T-1 is input to layer N_T; layer N_T applies a Givens rotation to rows N_T+1 and N_T of the matrix, and the matrix element [formula shown as an image in the original] is set to zero; layer N_T outputs the computed row N_T of the R matrix.
4. The extensible QR decomposition implementation method based on pipeline working mode according to claim 1, 2 or 3, characterized in that the Givens rotation operation is equivalent to left-multiplying by an orthogonal matrix, whereby a vector in the two-dimensional plane is rotated by any given angle while its length is kept unchanged; by rotating the vector onto a coordinate axis, one coordinate of the vector can be set to zero.
5. The extensible QR decomposition implementation method based on pipeline working mode according to claim 1, 2 or 3, characterized in that the QR decomposition is realized by a controller and a data processing module; each layer of the data processing module is one row rotation module, and the output port of the row rotation module of one layer is connected to the input port of the row rotation module of the next layer; the matrix data to be decomposed are input at the input port of the first-layer row rotation module and flow through the row rotation modules in turn, with all row rotation modules working in parallel; the controller likewise has a layered structure and consists of two kinds of controller module, control one and control two; the row rotation module of each layer is controlled by one control two, the control two modules of the layers are cascaded, and one control one is cascaded before the control two of the first layer.
6. The extensible QR decomposition implementation method based on pipeline working mode according to claim 5, characterized in that the row rotation module is based on the CORDIC algorithm; the iterative process of the CORDIC algorithm is split up to form a pipeline structure, and the groups of vectors to be rotated that are input to the module in sequence are processed in pipelined fashion:
The first valid vector in a group of vectors to be rotated is called the head rotating vector; it is rotated in Vectoring mode, so that it is rotated onto a coordinate axis and one of its components is set to zero; the component set to zero is called Y and the other component is called X; while the head rotating vector is rotated, the rotation directions are recorded inside the CORDIC module, and the subsequent rotating vectors are rotated in Rotation mode in turn according to the recorded directions, so that they undergo the same rotation as the head rotating vector;
In this way, when the row rotation module performs the CORDIC iterations, the different rotating vectors are rotated one after another, and finish rotating and are output one after another;
The row rotation module has three modes. During QR decomposition, the first time the row rotation module is started it uses mode 0: the input row of channel matrix data is stored in a shift register group and no CORDIC iteration is performed. The second time it is started it uses mode 1: the input data are connected to the X input port of the iteration unit, the data stored in mode 0 are connected to the Y input port of the iteration unit, and CORDIC iteration is performed. From the third start onward it uses mode 2: the input data are connected to the X input port of the iteration unit, the X component of the module's output data is connected to the Y input port of the iteration unit, and CORDIC iteration is performed.
7. The extensible QR decomposition implementation method based on pipeline working mode according to claim 5, characterized in that the start and end signals of the controller module control two are generated and supplied by the control two module of the layer above, and begin and end the work of the current-layer row rotation module; the controller module control one cascaded before the first-layer control two module generates and supplies the end signal of the first-layer control two module, so that a layered cascade controller is formed; the control two module controls the work of the current-layer row rotation module through two signals, start and mode select, and cooperates with the row rotation modules of the other layers to complete the QR decomposition of the whole module.
8. The extensible QR decomposition implementation method based on pipeline working mode according to claim 7, characterized in that the recursive structure of the layered cascade controller is realized according to the following data operation timing:
The iteration cycle in which the current layer reads in data for the first time is taken as the first iteration cycle of the current layer;
The first-layer row rotation module carries out ROW iterations, where ROW is the number of rows of the matrix; after ROW iteration cycles, the first-layer row rotation module finishes its work;
After the second iteration cycle of each layer ends, the control module of the current layer starts the work of the next layer;
One iteration cycle after each layer finishes its work, the control module of the current layer ends the work of the next layer.
9. The extensible QR decomposition implementation method based on pipeline working mode according to claim 1 or 6, characterized in that the CORDIC algorithm is a method of computing two-dimensional vector rotations that realizes the rotation with addition and shift operations, thereby reducing multiplication and division operations.
CN201210010345.6A 2012-01-13 2012-01-13 Extensible QR decomposition method based on pipeline working mode Expired - Fee Related CN102624653B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210010345.6A CN102624653B (en) 2012-01-13 2012-01-13 Extensible QR decomposition method based on pipeline working mode

Publications (2)

Publication Number Publication Date
CN102624653A true CN102624653A (en) 2012-08-01
CN102624653B CN102624653B (en) 2014-08-20

Family

ID=46564342

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210010345.6A Expired - Fee Related CN102624653B (en) 2012-01-13 2012-01-13 Extensible QR decomposition method based on pipeline working mode

Country Status (1)

Country Link
CN (1) CN102624653B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1564125A (en) * 2004-04-09 2005-01-12 哈尔滨工业大学 Array type reconstructural DSP engine chip structure based on CORDIC unit
US20050015418A1 (en) * 2003-06-24 2005-01-20 Chein-I Chang Real-time implementation of field programmable gate arrays (FPGA) design in hyperspectral imaging
US7616695B1 (en) * 2004-06-17 2009-11-10 Marvell International Ltd. MIMO equalizer design: an algorithmic perspective
CN102111350A (en) * 2009-12-25 2011-06-29 中国电子科技集团公司第五十研究所 FPGA device for matrix QR decomposition
US20110264721A1 (en) * 2009-05-22 2011-10-27 Maxlinear, Inc. Signal processing block for a receiver in wireless communication

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
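DIMPESH PATEL et al.: "A Low-Complexity High-Speed QR Decomposition Implementation for MIMO receivers", Circuits and Systems, 2009, ISCAS 2009, IEEE International Symposium on *
YANG Shenyuan et al.: "High-speed implementation of an adaptive array signal processing algorithm", Journal of Harbin Engineering University *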

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102981797A (en) * 2012-11-02 2013-03-20 中国航天科技集团公司第九研究院第七七一研究所 Trigonometric function arithmetic device based on combination of feedback of coordinated rotation digital computer (CORDIC) algorithm and pipeline organization
CN102981797B (en) * 2012-11-02 2015-06-17 中国航天科技集团公司第九研究院第七七一研究所 Trigonometric function arithmetic device based on combination of feedback of coordinated rotation digital computer (CORDIC) algorithm and pipeline organization
CN103294649A (en) * 2013-05-23 2013-09-11 东南大学 Bilateral CORDIC arithmetic unit, and parallel Jacobian Hermite matrix characteristic decomposition method and implementation circuit based on bilateral CORDIC arithmetic unit.
CN103294649B (en) * 2013-05-23 2016-08-10 东南大学 Parallel bilateral CORIDC arithmetic element, the Hermite battle array feature decomposition of parallel Jacobi based on this arithmetic element computing realize circuit and implementation method
CN105893334A (en) * 2016-03-28 2016-08-24 广州海格通信集团股份有限公司 Complex signal anti-interference matrix upper triangularization method and signal anti-interference processing device
CN105893334B (en) * 2016-03-28 2019-01-22 广州海格通信集团股份有限公司 Triangulation Algorithm and signal anti-interference process device on the anti-interference matrix of complex signal
CN111901071A (en) * 2020-06-24 2020-11-06 上海擎昆信息科技有限公司 Method and device for realizing QR decomposition of matrix with low complexity
CN111901071B (en) * 2020-06-24 2021-04-06 上海擎昆信息科技有限公司 Method and device for realizing QR decomposition of matrix with low complexity

Also Published As

Publication number Publication date
CN102624653B (en) 2014-08-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140820

Termination date: 20210113