CN104035915B - Inverse-based multifrontal block ILU preconditioning method - Google Patents


Info

Publication number
CN104035915B
CN104035915B CN201410245950.0A CN201410245950A CN104035915B CN 104035915 B CN104035915 B CN 104035915B CN 201410245950 A CN201410245950 A CN 201410245950A CN 104035915 B CN104035915 B CN 104035915B
Authority
CN
China
Prior art keywords
matrix
block
supernode
ilu
inverse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410245950.0A
Other languages
Chinese (zh)
Other versions
CN104035915A (en)
Inventor
王浩
徐立
李斌
李建清
杨中海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201410245950.0A
Publication of CN104035915A
Application granted
Publication of CN104035915B
Legal status: Active


Abstract

The invention discloses an inverse-based multifrontal block ILU preconditioning method. It belongs to the field of numerical solution methods and addresses the low efficiency of traditional ILU preconditioning when solving large-scale sparse linear systems. The method mainly adopts three techniques: an inverse-based dropping strategy, which greatly reduces the numerical instability caused by dropped elements and so improves numerical stability; a multifrontal method that does not store update matrices, which avoids the temporary storage of update matrices and greatly reduces memory overhead; and a super-block/adaptive-block incomplete factorization method, which substantially improves both computational and memory performance. The proposed method can therefore substantially improve the efficiency of solving large-scale sparse unsymmetric linear systems, obtaining their solutions more accurately and quickly at a small memory cost.

Description

Inverse-based multifrontal block ILU preconditioning method
Technical field
The invention belongs to the field of numerical solution methods and specifically relates to an inverse-based multifrontal block ILU preconditioning method for solving the large-scale sparse linear systems that arise in scientific and engineering computing.
Background
In many fields of science and engineering, such as nuclear physics, the numerical solution of higher-order differential equations, finite element analysis of structural and non-structural problems, computational fluid dynamics, petroleum seismic data processing, the optimal design of power systems, and numerical weather prediction, the large-scale and very-large-scale problems involved are extremely complex. Traditional experimental and trial-and-error methods are costly and time-consuming, and because the functional relationships between the physical quantities in these problems are hard to state, exact solutions cannot be obtained analytically. Unlike analytical methods, numerical methods do not solve for an explicit mathematical function; they approximate the solution by a set of concrete values distributed over a number of points. Although a numerical method is only an approximation, with enough points it can approach the analytical solution to sufficient accuracy. More importantly, it can solve complex partial differential equations, and so can handle problems that analytical methods cannot while still meeting the required accuracy. With the development of computer technology in particular, numerical simulation methods such as the finite difference method, the finite integration method, the finite element method and the boundary element method are widely used in science and engineering and have become some of the most common and effective approaches. Solving with these methods usually reduces to solving a large-scale sparse linear system, which is also the most time- and memory-consuming step of such methods.
There are two main classes of methods for solving sparse linear systems: direct methods and iterative methods. A direct method obtains the solution by applying a sequence of operations directly to the system, whereas an iterative method solves the system by successive approximation, iterating until a preset accuracy is reached. For large-scale sparse linear systems, direct methods have very high storage and computational costs; if an iterative method is applied directly, its convergence and convergence rate cannot be effectively guaranteed. A more effective approach is therefore to precondition the system first, using techniques such as matrix-splitting preconditioning, incomplete factorization preconditioning or sparse approximate inverse preconditioning, and then solve it iteratively. Among these, incomplete factorization preconditioning is widely used because the matrix it generates approximates the coefficient matrix well, can greatly reduce the condition number of the matrix and the number of iterations to convergence, and is simple, effective and broadly applicable. Traditional incomplete factorization preconditioners include incomplete Cholesky factorization for symmetric systems and incomplete LU (ILU) factorization for unsymmetric systems.
Traditional ILU preconditioners such as ILUT (see Saad, "Iterative Methods for Sparse Linear Systems", Society for Industrial and Applied Mathematics, 2003) perform well on small sparse linear systems of order below 100,000, but their performance is unsatisfactory for large-scale sparse linear systems whose order reaches the hundreds of thousands or millions.
Summary of the invention
The purpose of the invention is to overcome the low efficiency of traditional ILU preconditioning when solving large-scale sparse linear systems. It proposes an inverse-based multifrontal block ILU preconditioning method with which the solution of a large-scale sparse linear system can be obtained more accurately and efficiently at a lower memory cost.
To achieve this purpose, the technical solution of the invention is an inverse-based multifrontal block ILU preconditioning method comprising the following steps (unless otherwise stated, L and U denote the lower and upper triangular factors of the LU factorization, and P denotes a permutation matrix produced by an ordering):
Step one: reorder the original matrix A and perform symbolic factorization, reorganizing the matrix into a series of dense matrices so as to reduce the number of fill-ins produced during sparse factorization and to increase the share of dense operations.
i). perform a fill-reducing reordering;
ii). perform symbolic factorization and build the elimination tree;
iii). reorder according to a postorder traversal of the elimination tree;
iv). perform symbolic factorization again and rebuild the elimination tree.
This finally yields the reordered matrix $P_0 A P_0^T$ (where $P_0$ is the permutation matrix obtained from the reordering and T denotes transposition).
Step two: compute the diagonal scaling matrices $D_r$ and $D_c$ to obtain the matrix $\tilde{A} = D_r P_0 A P_0^T D_c$, which strengthens numerical stability.
Step three: divide the matrix into a number of supernodes so that dense matrix operations can be used to improve computational performance.
Step four: divide each supernode into a number of block matrices.
Step five: perform a block ILU factorization of each supernode, using the super-block/adaptive-block incomplete factorization method together with the multifrontal method that does not store update matrices and uses the inverse-based dropping strategy (the i-th supernode $F_i$ is written in block form as $F_i = \begin{pmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{pmatrix}$).
i). assemble the numerical values of the fully summed part;
ii). assemble the updates to the fully summed part from descendant supernodes;
iii). perform the ILU factorization of the fully summed block $F_{11}$, in which one permutation matrix is generated by partial pivoting and another by delayed pivoting;
iv). according to the new pivot order, adjust the already factored parts $L_0, L_1, \dots, L_{i-1}$ and $U_0, U_1, \dots, U_{i-1}$ (the subscripts $0, 1, \dots, i-1$ denote all supernodes factored before the i-th supernode);
v). assemble the numerical values of the partially summed part of the supernode according to the new pivot order, obtaining the partially summed blocks $F_{21}$ and $F_{12}$ adjusted to that order;
vi). compute and assemble the updates to the partially summed part from descendant nodes;
vii). solve the two matrix equations of the partially summed part separately.
This yields the block ILU factorization of $F_i$. After all supernodes have been factored, the ILU factorization
$$P_r P_c \tilde{A} P_c^T = P_r P_c D_r P_0 A P_0^T D_c P_c^T \approx LU$$
is obtained.
Step six: perform forward and backward substitution to solve the matrix equation $Ax = b$.
i). perform forward substitution: solve $(P_r P_c D_r P_0)^{-1} L y = b$ to obtain $y = L^{-1} P_r P_c D_r P_0 b$;
ii). perform backward substitution: solve $Uz = y$ to obtain $z = U^{-1} y$;
iii). obtain the solution vector $x = P_0^T D_c P_c^T z$.
Beneficial effects of the invention: the proposed inverse-based multifrontal block ILU preconditioning method substantially improves the efficiency of solving large-scale sparse unsymmetric linear systems, obtaining their solutions more accurately and quickly at a small memory cost. Compared with traditional ILU preconditioning, for matrices of dimension above 100,000 the invention achieves a speed-up of 10 to 18 times with a memory advantage of 2% to 25%. This is because the proposed method has the following features: 1) it uses scaling, which avoids certain numerical difficulties, strengthens numerical stability and improves accuracy; 2) it uses an inverse-based dropping strategy, which greatly reduces the numerical instability caused by dropped elements and improves numerical stability; 3) it uses partial pivoting, delayed pivoting and diagonal perturbation, which further improve robustness and stability and make the preconditioner more reliable; 4) it uses a multifrontal method that does not store update matrices, which avoids the temporary storage of update matrices and greatly reduces memory overhead; 5) it uses the super-block/adaptive-block incomplete factorization method, which substantially improves computational and memory performance; 6) all block computations are implemented with dense matrix operations from BLAS3 (Basic Linear Algebra Subprograms, Level 3) or LAPACK (Linear Algebra PACKage), which greatly improves computational performance.
Brief description of the drawings
Fig. 1 is the main flow chart of the invention;
Fig. 2 shows the supernode partition of a matrix;
Fig. 3 shows the internal block partition of a supernode;
Fig. 4 is the flow chart of the frontal matrix factorization;
Fig. 5 is the flow chart of the ILU factorization.
Detailed description of the invention
The invention is further described below with reference to the drawings and a specific embodiment.
As shown in Fig. 1, the inverse-based multifrontal block ILU preconditioning method comprises the following steps:
Step one: reorder the matrix and perform symbolic factorization, reorganizing the matrix into a series of dense matrices so as to reduce the number of fill-ins produced during sparse factorization and to increase the share of dense operations.
Step one comprises:
i). perform a fill-reducing reordering
The original sparse matrix A is reordered with the sparse matrix ordering package METIS, which is based on multilevel nested dissection and minimum degree ordering techniques. This reduces the number of fill-ins produced during sparse factorization and hence the storage and computation required by the factorization.
ii). perform symbolic factorization and build the elimination tree
Symbolic factorization is performed to determine the positions of the fill-ins in the reordered matrix, and the elimination tree that guides the numerical factorization is built.
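As an illustration of this step, the following sketch builds the elimination tree of a symmetric nonzero pattern with the well-known ancestor/path-compression approach. The input layout (`rows[i]` lists the columns `k < i` holding a nonzero in row `i`) and the function name are assumptions made for this example; the patent does not specify an implementation.

```python
def elimination_tree(n, rows):
    """Return parent[], with -1 marking a root, for a symmetric pattern.

    rows[i] holds the column indices k < i with a nonzero entry a_ik.
    """
    parent = [-1] * n
    ancestor = [-1] * n          # compressed paths towards the current root
    for i in range(n):
        for k in rows[i]:
            # Climb from k to the root of its current subtree and attach it to i.
            while k != -1 and k < i:
                nxt = ancestor[k]
                ancestor[k] = i
                if nxt == -1:
                    parent[k] = i
                k = nxt
    return parent


# Strictly lower-triangular nonzeros at (2,0), (3,1), (4,2), (4,3).
two_branches = elimination_tree(5, [[], [], [0], [1], [2, 3]])
# Nonzeros at (1,0), (2,0), (3,1): fill-in chains the nodes together.
chain = elimination_tree(4, [[], [0], [0], [1]])
print(two_branches, chain)
```

In the second pattern, eliminating column 0 creates a fill-in at (2,1), which is why node 1 becomes a child of node 2 even though the original matrix has no entry there.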
iii). reorder according to a postorder traversal of the elimination tree
The elimination tree is first traversed in postorder, and the matrix is then reordered according to that traversal. This reorganizes the matrix into a series of dense matrices, yielding larger supernodes, which increases the granularity of the algorithm and the share of dense operations.
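A minimal sketch of the postorder step, assuming the elimination tree is given as a `parent` array with -1 for roots (the representation and the names `postorder` and `relabel` are choices made for this example):

```python
def postorder(parent):
    """Return the vertices of a forest in postorder (children before parents)."""
    n = len(parent)
    children = [[] for _ in range(n)]
    roots = []
    for v, p in enumerate(parent):
        (roots if p == -1 else children[p]).append(v)
    order = []
    stack = [(r, False) for r in reversed(roots)]
    while stack:
        v, expanded = stack.pop()
        if expanded:
            order.append(v)
        else:
            stack.append((v, True))
            stack.extend((c, False) for c in reversed(children[v]))
    return order


def relabel(parent, order):
    """Parent array of the same tree after renumbering vertices by `order`."""
    pos = {v: i for i, v in enumerate(order)}
    new_parent = [-1] * len(parent)
    for v, p in enumerate(parent):
        new_parent[pos[v]] = -1 if p == -1 else pos[p]
    return new_parent


parent = [2, 3, 4, 4, -1]
order = postorder(parent)
new_parent = relabel(parent, order)
print(order, new_parent)
```

After relabelling, every subtree occupies a contiguous index range, so columns that belong together end up adjacent and can be grouped into supernodes.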
iv). perform symbolic factorization again and rebuild the elimination tree
Symbolic factorization is performed again to determine the positions of the fill-ins in the reordered matrix, and the elimination tree is rebuilt.
After the two reorderings the original matrix A can be written as $P_0 A P_0^T$, where $P_0$ is the reordering (permutation) matrix and T is the transpose operator.
Step two: compute the diagonal scaling matrices $D_r$ and $D_c$ to strengthen numerical stability
The diagonal scaling matrices $D_r$ and $D_c$ are computed so that the norm of every row and every column of the scaled matrix $\tilde{A} = D_r P_0 A P_0^T D_c$ is approximately 1. This avoids numerical difficulties such as dividing a large number by a small one or subtracting nearly equal numbers, which strengthens numerical stability, speeds up the factorization and improves accuracy.
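The patent does not state which norm or which scaling algorithm is used. As one possible illustration, the sketch below applies a simple iterative equilibration in the infinity norm (each sweep divides rows and columns by the square roots of their largest magnitudes). The function name, the sweep count and the assumption that no row or column is entirely zero are all choices made for this example.

```python
import math


def equilibrate(a, sweeps=20):
    """Return (dr, dc, s) with s[i][j] = dr[i] * a[i][j] * dc[j].

    After enough sweeps the largest magnitude in every row and column of s
    is close to 1. Assumes every row and column has a nonzero entry.
    """
    n, m = len(a), len(a[0])
    dr, dc = [1.0] * n, [1.0] * m
    s = [[float(x) for x in row] for row in a]
    for _ in range(sweeps):
        r = [1.0 / math.sqrt(max(abs(x) for x in row)) for row in s]
        c = [1.0 / math.sqrt(max(abs(s[i][j]) for i in range(n))) for j in range(m)]
        for i in range(n):
            for j in range(m):
                s[i][j] *= r[i] * c[j]
        dr = [d * ri for d, ri in zip(dr, r)]
        dc = [d * cj for d, cj in zip(dc, c)]
    return dr, dc, s


a = [[1e4, 2.0], [3.0, 1e-3]]
dr, dc, s = equilibrate(a)
row_norms = [max(abs(x) for x in row) for row in s]
col_norms = [max(abs(s[i][j]) for i in range(2)) for j in range(2)]
print(row_norms, col_norms)
```

The entries of the badly scaled input span seven orders of magnitude, while the scaled matrix has all row and column maxima at about 1.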
Step three: divide the matrix into a number of supernodes so that dense matrix operations can be used to improve computational performance
The supernode method is used to divide the matrix into a series of supernodes. It treats consecutive columns or rows with the same nonzero structure as one supernode. This is a common refinement of multifrontal methods: it allows dense matrix operations to be used, reduces unproductive indirect addressing and increases data reuse, and can substantially improve the computational performance of the algorithm. Because the coefficient matrix has a symmetric nonzero structure, the resulting supernodes are structurally symmetric as well. Fig. 2 shows one block partition of the coefficient matrix, which is divided into four supernodes $S_1, S_2, S_3, S_4$. Each supernode consists of a diagonal block matrix (the fully summed part) and two structurally symmetric off-diagonal block matrices (the partially summed part); the last supernode $S_4$ contains only a diagonal block matrix.
To obtain larger supernodes and further improve computational performance, this step uses the following two relaxed supernode techniques:
i). relaxed supernodes: the requirement that the nonzero structures of a supernode be "identical" is loosened so that the structures may differ by a certain proportion, and larger supernodes are obtained by adding explicit zero entries;
ii). supernode amalgamation: small consecutive leaf nodes are merged into one supernode, which bounds the dimension of the smallest supernode.
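The supernode partition and its relaxed variant can be sketched as follows. The greedy merging rule, the `max_zero_ratio` parameter and the storage model (each column stores the rows of the merged structure at or below its diagonal) are assumptions made for this illustration; the patent does not give the exact criterion.

```python
def relaxed_supernodes(col_patterns, max_zero_ratio=0.0):
    """Group consecutive columns into supernodes, returned as (first, last) pairs.

    col_patterns[j] is the set of row indices >= j with a nonzero in column j.
    A column joins the current supernode if the explicit zeros needed to give
    all its columns a common structure stay within max_zero_ratio of the
    stored entries. A ratio of 0 gives strict (identical-structure) supernodes.
    """
    n = len(col_patterns)
    groups = []
    start, union, nnz = 0, set(col_patterns[0]), len(col_patterns[0])
    for j in range(1, n):
        cand = union | set(col_patterns[j])
        stored = sum(sum(1 for r in cand if r >= c) for c in range(start, j + 1))
        cand_nnz = nnz + len(col_patterns[j])
        if stored - cand_nnz <= max_zero_ratio * stored:
            union, nnz = cand, cand_nnz
        else:
            groups.append((start, j - 1))
            start, union, nnz = j, set(col_patterns[j]), len(col_patterns[j])
    groups.append((start, n - 1))
    return groups


patterns = [{0, 1, 2, 5}, {1, 2, 5}, {2, 5}, {3, 4}, {4, 5}, {5}]
strict = relaxed_supernodes(patterns, 0.0)
relaxed = relaxed_supernodes(patterns, 0.25)
print(strict, relaxed)
```

With the relaxed criterion, column 3 is merged with columns 4 and 5 at the price of one explicit zero, giving one larger supernode instead of two small ones.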
Step four: divide each supernode into a number of block matrices
Because the naturally occurring block structure of the coefficient matrix is destroyed by the reordering, an artificial blocking is needed: each supernode is divided into square blocks of a fixed dimension, and the dimensions of the blocks at the edges of the supernode are adjusted to fit its structure. The blocking inside a supernode is shown in Fig. 3, where a is the chosen square block dimension, and b and c are the two edge lengths produced by adjusting the edges of the fully summed part and the partially summed part of the supernode, respectively. The supernode in the figure is made up of blocks of eight different sizes.
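One way to picture the blocking is the sketch below. The rule used for the edge blocks (a remainder smaller than half a block is absorbed into the last block, otherwise it becomes its own block) is purely an assumption for illustration, as are the function names; the patent only says that edge blocks are adjusted to the supernode structure.

```python
def tile_sizes(length, a):
    """Split `length` rows or columns into tiles of size a, adjusting the edge."""
    full, rest = divmod(length, a)
    sizes = [a] * full
    if rest:
        if full and rest < a / 2:
            sizes[-1] += rest      # small remainder: enlarge the last tile
        else:
            sizes.append(rest)     # otherwise keep it as a separate edge tile
    return sizes


def supernode_tiles(width, below, a):
    """Tile shapes (rows, cols) of the diagonal block and the block below it."""
    cols = tile_sizes(width, a)
    rows = cols + tile_sizes(below, a)
    return [[(r, c) for c in cols] for r in rows]


tiles = supernode_tiles(width=9, below=6, a=4)
for row in tiles:
    print(row)
```

Here a supernode with 9 columns and 6 additional rows is covered by tiles of edge length 4, 5 and 2, so only the edge tiles deviate from the chosen dimension.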
Step five: perform a block ILU factorization of each supernode, using the super-block/adaptive-block incomplete factorization method together with the multifrontal method that does not store update matrices and uses the inverse-based dropping strategy
In the multifrontal method that does not store update matrices, a frontal matrix is obtained from a supernode and the update matrices of its descendant supernodes by the matrix extend-add operation. Let $U_1^{(i)}, U_2^{(i)}, \dots, U_k^{(i)}$ be the update matrices contributed to the i-th frontal matrix $F_i$ by its k descendant nodes, and let $A_i$ be the i-th supernode. The frontal matrix $F_i$ then satisfies
$$F_i = A_i \oplus U_1^{(i)} \oplus U_2^{(i)} \oplus \cdots \oplus U_k^{(i)} \qquad (1)$$
where the operator $\oplus$ denotes the matrix extend-add operation. Because a supernode and its descendants have different nonzero structures, the frontal matrix has to be formed with the extend-add operation. In the multifrontal method, the factorization of a frontal matrix can be written as
$$F = \begin{pmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{pmatrix} = \begin{pmatrix} F_{11} & F_{12} \\ F_{21} & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & F_{22} \end{pmatrix} = \begin{pmatrix} L_{11} & 0 \\ L_{21} & 0 \end{pmatrix} \begin{pmatrix} U_{11} & U_{12} \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & L_{21} U_{12} \end{pmatrix} \qquad (2)$$
where $L_{11}$, $L_{21}$, $U_{11}$ and $U_{12}$ denote the factors obtained from the LU factorization of the corresponding blocks of the frontal matrix. The last term isolates the product $L_{21} U_{12}$, that is, the update that the current front contributes to its ancestors (in a standard LU elimination this product is subtracted from the trailing block). After factoring a frontal matrix, a traditional multifrontal method computes the update matrix that the current front contributes to its parent and keeps it until the parent's front is formed, at which point it is used and released. Many such matrices therefore have to be stored temporarily during the numerical factorization, which consumes a large amount of storage. In the multifrontal method that does not store update matrices, the factorization of a frontal matrix consists of two steps: i) factor the frontal matrix itself: first compute the block ILU factorization of the fully summed part, then solve the matrix equations of the partially summed part (the permutation matrices involved are generated by partial pivoting and by delayed pivoting, respectively); ii) compute the update $L_{21} U_{12}$ of the current front to its ancestor nodes. This update is deferred: it is computed and assembled only when the frontal matrix of the corresponding ancestor is factored, which avoids the temporary storage of update matrices and greatly reduces the memory requirement of the multifrontal method. It also makes it unnecessary in step one to optimize the postorder traversal of the elimination tree in order to reduce update-matrix storage, as traditional multifrontal methods do, which speeds up step one.
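The extend-add operation used to form a frontal matrix can be sketched as follows. Each matrix carries the list of global indices of its rows and columns, and an update matrix is scattered into the front at the matching positions. The dense list-of-lists layout and the function name are assumptions made for this example.

```python
def extend_add(front, front_idx, update, update_idx):
    """Add `update` into `front` in place, matching rows/columns by global index.

    front_idx and update_idx list the global indices of the rows (and columns)
    of each square matrix; every index of the update must occur in the front.
    """
    pos = {g: p for p, g in enumerate(front_idx)}
    for a, gi in enumerate(update_idx):
        for b, gj in enumerate(update_idx):
            front[pos[gi]][pos[gj]] += update[a][b]
    return front


# Front over global indices 2, 4, 5; a descendant contributes to indices 4 and 5.
front = [[4.0, 1.0, 1.0],
         [1.0, 5.0, 0.0],
         [1.0, 0.0, 6.0]]
update = [[-1.0, -2.0],
          [-3.0, -4.0]]
extend_add(front, [2, 4, 5], update, [4, 5])
print(front)
```

The update only touches the rows and columns it shares with the front; the entries belonging to global index 2 are left unchanged.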
Because the pivot order of the fully summed part may change during factorization, the remaining parts must be adjusted accordingly. To minimize the reordering caused by pivot-order changes, the multifrontal method that does not store update matrices assembles each piece of information only after the corresponding pivot order has been determined; for example, the numerical values of the partially summed part are assembled, in pivot order, only after the factorization of the fully summed part is complete. In addition, equation (2) shows that the computations on the two block matrices of the partially summed part are independent and can be carried out separately, so the factorization of the partially summed part is further split into an upper-triangular and a lower-triangular part. The factorization of the frontal matrix thus becomes the factorization of relatively small submatrices (the fully summed block matrix and the partially summed block matrices), which further reduces the temporary storage needed for the current frontal matrix.
During incomplete factorization, dropping elements may make diagonal elements very small or even zero, causing numerical instability. Using the inverse-based dropping strategy greatly reduces instabilities from this cause. For an LU factorization A = LU, the preconditioned matrix can be written as
$$M^{-1} A = (\tilde{L}\tilde{U})^{-1} A = \tilde{U}^{-1} \tilde{L}^{-1} A = (U^{-1} + Y)(L^{-1} + X) L U = I + YU + U^{-1} X A + Y X A \qquad (3)$$
where $\tilde{L}^{-1}$ and $\tilde{U}^{-1}$ are the inverse factors of the corresponding ILU factorization, with $\tilde{L}^{-1} = L^{-1} + X$ and $\tilde{U}^{-1} = U^{-1} + Y$. The error matrices X and Y of the inverse factors determine how well the preconditioner approximates the original matrix. The inverse-based dropping strategy therefore builds on the threshold dropping strategy by including the norms of the inverse factors in the dropping condition, giving the new condition
$$|l_{jk}| \, \| e_k^T \tilde{L}^{-1} \|_\infty \le \tau, \qquad |u_{km}| \, \| \tilde{U}^{-1} e_k \|_\infty \le \tau \qquad (4)$$
where $j > k$ and $m > k$; $l_{jk}$ and $u_{km}$ are elements of the k-th column of the factor L and of the k-th row of U, respectively; the unit vector $e_k$ selects the k-th row or k-th column of a matrix; $\tau$ is the drop tolerance; T is the transpose operator; and $\| e_k^T \tilde{L}^{-1} \|_\infty$ and $\| \tilde{U}^{-1} e_k \|_\infty$ are the norm of the k-th row of the inverse factor $\tilde{L}^{-1}$ and the norm of the k-th column of $\tilde{U}^{-1}$, respectively.
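The first dropping condition in (4) can be illustrated on a small unit lower triangular factor. This sketch makes several assumptions: it inverts the given factor exactly instead of estimating the norms during the factorization, it interprets the norm of the row $e_k^T \tilde{L}^{-1}$ as the induced infinity norm (the sum of absolute values in that row), and the function names are hypothetical.

```python
def unit_lower_inverse(L):
    """Exact inverse of a unit lower triangular matrix by forward substitution."""
    n = len(L)
    inv = [[0.0] * n for _ in range(n)]
    for c in range(n):
        inv[c][c] = 1.0
        for r in range(c + 1, n):
            inv[r][c] = -sum(L[r][k] * inv[k][c] for k in range(c, r))
    return inv


def inverse_based_drop(L, tau):
    """Zero every l_jk (j > k) with |l_jk| * ||e_k^T L^{-1}|| <= tau."""
    n = len(L)
    inv = unit_lower_inverse(L)
    row_norm = [sum(abs(x) for x in row) for row in inv]
    kept = [row[:] for row in L]
    dropped = []
    for k in range(n):
        for j in range(k + 1, n):
            if kept[j][k] != 0.0 and abs(kept[j][k]) * row_norm[k] <= tau:
                kept[j][k] = 0.0
                dropped.append((j, k))
    return kept, dropped


L = [[1.0, 0.0, 0.0, 0.0],
     [0.5, 1.0, 0.0, 0.0],
     [0.01, 4.0, 1.0, 0.0],
     [0.0, 0.0, 0.04, 1.0]]
kept, dropped = inverse_based_drop(L, tau=0.05)
print(dropped, kept[3][2])
```

The entry 0.04 in position (3,2) is below the tolerance and would be removed by plain threshold dropping, but it is kept here because row 2 of the inverse factor is large, so an error in that column would be strongly amplified.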
As shown in Fig. 4, step five comprises:
i). assemble the numerical values of the fully summed part
The numerical values of the fully summed part of the supernode are assembled into the frontal matrix.
ii). assemble the updates to the fully summed part from descendant supernodes
Using the multifrontal method that does not store update matrices, the updates to the fully summed part from descendant nodes are added to the current frontal matrix with the extend-add operation.
iii). perform the block ILU factorization of the fully summed part
The norms of the inverse factors are estimated first, and the dropping condition is then determined.
The block ILU factorization is performed with the super-block/adaptive-block incomplete factorization method, which has four aspects: i) dropping: the basic unit of dropping is the block; every element of a block that meets the dropping condition is dropped, and if all elements of a block are dropped the whole block is discarded; ii) storage: an adaptive, memory-pool-based storage strategy compares the cost of sparse and dense storage for each block and automatically selects the cheaper format, and all block storage and small memory allocations are managed through the memory pool; iii) computation: with the super-block method, several contiguously stored blocks that take part in a computation with the same block or vector are copied into temporary storage to form one super block, and the computation is then carried out with dense matrix operations; iv) all block computations are implemented with dense matrix operations from BLAS3, which greatly improves the computational performance of the preconditioner.
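The block-wise dropping and the adaptive choice between sparse and dense storage can be sketched as follows. For brevity the example uses a plain magnitude threshold in place of condition (4), and the cost model (8-byte values, two 4-byte indices per stored entry in the sparse format) and the function name are assumptions for illustration only.

```python
def compress_block(block, tau, value_bytes=8, index_bytes=4):
    """Drop small entries of a block and pick the cheaper storage format.

    Returns None if every entry is dropped (the whole block is discarded),
    otherwise ("sparse", [(i, j, value), ...]) or ("dense", matrix).
    """
    kept = [(i, j, x) for i, row in enumerate(block)
            for j, x in enumerate(row) if abs(x) > tau]
    if not kept:
        return None
    rows, cols = len(block), len(block[0])
    dense_cost = rows * cols * value_bytes
    sparse_cost = len(kept) * (value_bytes + 2 * index_bytes)
    if sparse_cost < dense_cost:
        return ("sparse", kept)
    dense = [[x if abs(x) > tau else 0.0 for x in row] for row in block]
    return ("dense", dense)


mostly_small = compress_block([[0.5, 1e-9], [1e-9, 1e-9]], tau=1e-6)
full = compress_block([[1.0, 2.0], [3.0, 4.0]], tau=1e-6)
negligible = compress_block([[1e-9, 1e-9], [1e-9, 1e-9]], tau=1e-6)
print(mostly_small, full, negligible)
```

A block that keeps a single entry is stored sparsely, a full block stays dense so that dense kernels can be applied to it, and a block whose entries are all dropped disappears entirely.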
As shown in Fig. 5, the ILU factorization first factors the fully summed part with partial pivoting. When a zero pivot is encountered, delayed pivoting is used to postpone the elimination of the current pivot, and this is repeated until the factorization succeeds. If, after all pivot orders have been tried, a diagonal element is still too small or zero, or the factorization fails, diagonal perturbation is applied: a perturbation is added to the diagonal to ensure that the factorization completes.
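A simplified sketch of the pivoting safeguards on a small dense block is given below. It covers partial pivoting and diagonal perturbation only; delayed pivoting is omitted because it involves passing the column on to an ancestor front. The thresholds `tiny` and `perturb` and the function name are hypothetical values and names chosen for this example.

```python
def lu_partial_pivot_perturbed(a, tiny=1e-12, perturb=1e-8):
    """In-place style LU with row pivoting; perturb the diagonal if no pivot exists.

    Returns (lu, perm, perturbed): lu holds L below the diagonal (unit diagonal
    implied) and U on and above it, perm is the row order, and perturbed lists
    the columns whose pivot had to be replaced.
    """
    n = len(a)
    lu = [[float(x) for x in row] for row in a]
    perm = list(range(n))
    perturbed = []
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(lu[r][k]))
        if abs(lu[p][k]) < tiny:
            # No usable pivot in this column: add a perturbation on the diagonal.
            lu[k][k] = perturb if lu[k][k] >= 0 else -perturb
            perturbed.append(k)
        elif p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for r in range(k + 1, n):
            lu[r][k] /= lu[k][k]
            for c in range(k + 1, n):
                lu[r][c] -= lu[r][k] * lu[k][c]
    return lu, perm, perturbed


lu1, perm1, pert1 = lu_partial_pivot_perturbed([[0.0, 2.0], [1.0, 3.0]])
lu2, perm2, pert2 = lu_partial_pivot_perturbed([[1.0, 2.0], [2.0, 4.0]])
print(perm1, pert1, perm2, pert2)
```

The first matrix has a zero in the leading position and is handled by a row exchange alone; the second is singular, so the last pivot is replaced by the perturbation and the factorization still completes.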
iv). adjust the already factored parts $L_0, L_1, \dots, L_{i-1}$ and $U_0, U_1, \dots, U_{i-1}$ according to the pivot order
Because rows and columns of the fully summed part were exchanged during its factorization, the already factored parts must be adjusted to match the new pivot order.
v). assemble the numerical values of the partially summed part of the supernode according to the pivot order
The numerical values of the partially summed part are assembled according to the new pivot order, giving the partially summed blocks $F_{21}$ and $F_{12}$ adjusted to that order.
vi). compute and assemble the updates to the partially summed part from descendant nodes
The updates to the partially summed part from descendant nodes are added to the current frontal matrix with the extend-add operation.
vii). solve the two matrix equations of the partially summed part separately; in the block notation of equation (2), with $F_{21}$ and $F_{12}$ the adjusted blocks, these are $L_{21} U_{11} = F_{21}$ and $L_{11} U_{12} = F_{12}$
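The sequence "factor the fully summed block, then solve for the two partially summed blocks" can be sketched on a tiny dense front. The example performs a complete (not incomplete) LU without pivoting or dropping, purely to show the two triangular solves and the resulting update product $L_{21} U_{12}$; all function names are hypothetical.

```python
def lu_nopivot(a):
    """Dense LU without pivoting: returns unit lower L and upper U with a = L U."""
    n = len(a)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [[float(x) for x in row] for row in a]
    for k in range(n):
        for r in range(k + 1, n):
            L[r][k] = U[r][k] / U[k][k]
            for c in range(k, n):
                U[r][c] -= L[r][k] * U[k][c]
    return L, U


def solve_u12(L11, F12):
    """Solve L11 U12 = F12 by forward substitution (L11 has a unit diagonal)."""
    n, m = len(L11), len(F12[0])
    U12 = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            U12[i][j] = F12[i][j] - sum(L11[i][k] * U12[k][j] for k in range(i))
    return U12


def solve_l21(U11, F21):
    """Solve L21 U11 = F21 row by row."""
    m, n = len(F21), len(U11)
    L21 = [[0.0] * n for _ in range(m)]
    for r in range(m):
        for j in range(n):
            acc = sum(L21[r][k] * U11[k][j] for k in range(j))
            L21[r][j] = (F21[r][j] - acc) / U11[j][j]
    return L21


F11 = [[4.0, 2.0], [2.0, 3.0]]
F12 = [[2.0], [1.0]]
F21 = [[8.0, 6.0]]
L11, U11 = lu_nopivot(F11)
U12 = solve_u12(L11, F12)
L21 = solve_l21(U11, F21)
update = [[sum(L21[i][k] * U12[k][j] for k in range(2)) for j in range(1)] for i in range(1)]
print(U12, L21, update)
```

The two solves do not depend on each other, which is why they can be carried out separately; the product computed at the end is the contribution that an ancestor front would assemble later.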
At this point the ILU factorization of $F_i$ has been obtained. After all supernodes have been factored, the numerical factorization phase ends and the ILU factorization is obtained:
$$P_r P_c \tilde{A} P_c^T = P_r P_c D_r P_0 A P_0^T D_c P_c^T \approx LU \qquad (5)$$
Step six: perform forward and backward substitution to solve the matrix equation $Ax = b$
Expressed in terms of the factors, the original coefficient matrix is
$$A \approx (P_r P_c D_r P_0)^{-1} L U (P_0^T D_c P_c^T)^{-1} \qquad (6)$$
so the linear system can be written as
$$(P_r P_c D_r P_0)^{-1} L U (P_0^T D_c P_c^T)^{-1} x = b \qquad (7)$$
i). perform forward substitution: solve $(P_r P_c D_r P_0)^{-1} L y = b$ to obtain $y = L^{-1} P_r P_c D_r P_0 b$
ii). perform backward substitution: solve $Uz = y$ to obtain $z = U^{-1} y$
iii). obtain the solution vector $x = P_0^T D_c P_c^T z$
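The solve phase can be checked on a small example. The sketch below represents each permutation matrix P as an index list p with (P v)[i] = v[p[i]] and each diagonal scaling as a vector. It uses an exact LU of the permuted and scaled matrix, so the recovered x is exact; with an incomplete factorization the same steps would instead be applied as a preconditioner inside an iterative method. The test matrix, permutations, scalings and helper names are all made up for this illustration.

```python
def perm_apply(p, v):
    """P v with (P v)[i] = v[p[i]]."""
    return [v[i] for i in p]


def perm_apply_t(p, v):
    """P^T v, the inverse of perm_apply."""
    out = [0.0] * len(v)
    for i, src in enumerate(p):
        out[src] = v[i]
    return out


def lu_nopivot(a):
    n = len(a)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [[float(x) for x in row] for row in a]
    for k in range(n):
        for r in range(k + 1, n):
            L[r][k] = U[r][k] / U[k][k]
            for c in range(k, n):
                U[r][c] -= L[r][k] * U[k][c]
    return L, U


def solve(L, U, p0, pc, pr, dr, dc, b):
    n = len(b)
    # i) forward substitution: y = L^{-1} Pr Pc Dr P0 b
    t = [d * x for d, x in zip(dr, perm_apply(p0, b))]
    t = perm_apply(pr, perm_apply(pc, t))
    y = [0.0] * n
    for i in range(n):
        y[i] = t[i] - sum(L[i][k] * y[k] for k in range(i))
    # ii) backward substitution: z = U^{-1} y
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (y[i] - sum(U[i][k] * z[k] for k in range(i + 1, n))) / U[i][i]
    # iii) x = P0^T Dc Pc^T z
    t = [d * x for d, x in zip(dc, perm_apply_t(pc, z))]
    return perm_apply_t(p0, t)


A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]
p0, pc, pr = [2, 0, 1], [1, 0, 2], [0, 2, 1]
dr, dc = [0.5, 1.0, 2.0], [1.0, 0.25, 1.0]
n = 3
# Build B = Pr Pc Dr P0 A P0^T Dc Pc^T explicitly for this small example.
m1 = [[A[p0[i]][p0[j]] for j in range(n)] for i in range(n)]
m2 = [[dr[i] * m1[i][j] * dc[j] for j in range(n)] for i in range(n)]
m3 = [[m2[pc[i]][pc[j]] for j in range(n)] for i in range(n)]
B = [m3[pr[i]] for i in range(n)]
L, U = lu_nopivot(B)
x_true = [1.0, 2.0, 3.0]
b = [sum(A[i][j] * x_true[j] for j in range(n)) for i in range(n)]
x = solve(L, U, p0, pc, pr, dr, dc, b)
print(x)
```

Each permutation and scaling applied while building the factorization is undone in the matching place of the solve, which is exactly what equations (6) and (7) express.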

Claims (8)

1. An inverse-based multifrontal block ILU preconditioning method, comprising:
Step one: reordering the original matrix A and performing symbolic factorization, reorganizing the matrix into a series of dense matrices so as to reduce the number of fill-ins produced during sparse factorization and to increase the share of dense operations, finally forming the reordered matrix $P_0 A P_0^T$, where $P_0$ is the permutation matrix obtained from the reordering and T denotes transposition;
Step two: computing the diagonal scaling matrices $D_r$ and $D_c$ to obtain the matrix $\tilde{A} = D_r P_0 A P_0^T D_c$;
Step three: dividing the matrix into a number of supernodes;
Step four: dividing each supernode into a number of block matrices;
Step five: performing a block ILU factorization of each supernode using the super-block/adaptive-block incomplete factorization method and the multifrontal method that does not store update matrices and uses the inverse-based dropping strategy, the i-th supernode $F_i$ being written in block form as $F_i = \begin{pmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{pmatrix}$, to obtain the ILU factorization of $F_i$; after all supernodes have been factored, the ILU factorization $P_r P_c \tilde{A} P_c^T = P_r P_c D_r P_0 A P_0^T D_c P_c^T \approx LU$ is obtained, where $P_r$ is the permutation matrix generated by partial pivoting and $P_c$ is the permutation matrix generated by delayed pivoting;
Step six: performing forward and backward substitution to solve the matrix equation $Ax = b$:
i). performing forward substitution: solving $(P_r P_c D_r P_0)^{-1} L y = b$ to obtain $y = L^{-1} P_r P_c D_r P_0 b$;
ii). performing backward substitution: solving $Uz = y$ to obtain $z = U^{-1} y$;
iii). obtaining the solution vector $x = P_0^T D_c P_c^T z$.
2. An inverse-based multifrontal block ILU preconditioning method, characterized in that said step one comprises:
i). performing a fill-reducing reordering;
ii). performing symbolic factorization and building the elimination tree;
iii). reordering according to a postorder traversal of the elimination tree;
iv). performing symbolic factorization again and rebuilding the elimination tree.
3. An inverse-based multifrontal block ILU preconditioning method, characterized in that said step five comprises:
i). assembling the numerical values of the fully summed part;
ii). assembling the updates to the fully summed part from descendant supernodes;
iii). performing the ILU factorization of the fully summed block $F_{11}$;
iv). adjusting, according to the new pivot order, the already factored parts $L_0, L_1, \dots, L_{i-1}$ and $U_0, U_1, \dots, U_{i-1}$, where the subscripts $0, 1, \dots, i-1$ denote all supernodes factored before the i-th supernode;
v). assembling the numerical values of the partially summed part of the supernode according to the new pivot order, obtaining the partially summed blocks $F_{21}$ and $F_{12}$ adjusted to that order;
vi). computing and assembling the updates to the partially summed part from descendant nodes;
vii). solving the two matrix equations of the partially summed part separately.
4. The inverse-based multifrontal block ILU preconditioning method according to claim 1 or claim 2, characterized in that the reordering uses the sparse matrix ordering package METIS, which is based on multilevel nested dissection and minimum degree ordering techniques, to reorder the original sparse matrix A.
5. An inverse-based multifrontal block ILU preconditioning method, characterized in that step three uses the following two relaxed supernode techniques:
i). relaxed supernodes: the requirement that the nonzero structures of a supernode be "identical" is loosened so that the structures may differ by a certain proportion, and larger supernodes are obtained by adding explicit zero entries;
ii). supernode amalgamation: small consecutive leaf nodes are merged into one supernode, which bounds the dimension of the smallest supernode.
6. An inverse-based multifrontal block ILU preconditioning method, characterized in that said step five uses the multifrontal method that does not store update matrices, in which the update of the current frontal matrix to its ancestor nodes is deferred and is computed and assembled only when the frontal matrix of the corresponding ancestor node is factored; the frontal matrix is divided into a fully summed part and a partially summed part, each part is solved separately, and the corresponding numerical values are assembled only when that part is factored.
7. An inverse-based multifrontal block ILU preconditioning method, characterized in that the multifrontal method that does not store update matrices used in step five adopts an inverse-based dropping strategy whose dropping condition is:
$$|l_{jk}| \, \| e_k^T \tilde{L}^{-1} \|_\infty \le \tau, \qquad |u_{km}| \, \| \tilde{U}^{-1} e_k \|_\infty \le \tau$$
where $j > k$ and $m > k$; $l_{jk}$ and $u_{km}$ are elements of the k-th column of the factor L and of the k-th row of U, respectively; the unit vector $e_k$ selects the k-th row or k-th column of a matrix; $\tau$ is the drop tolerance; T is the transpose operator; and $\| e_k^T \tilde{L}^{-1} \|_\infty$ and $\| \tilde{U}^{-1} e_k \|_\infty$ are the norm of the k-th row of the inverse factor $\tilde{L}^{-1}$ and the norm of the k-th column of $\tilde{U}^{-1}$, respectively.
An inverse-based multifrontal block ILU preconditioning method, characterized in that the multifrontal method without stored update matrices used in step 5 adopts a super-block, adaptive-block incomplete factorization method, which comprises:
i) Dropping: the basic unit of dropping is the block. Every element in a block that satisfies the drop condition is dropped, and when all elements of a block have been dropped, the whole block is dropped;
ii) Storage: an adaptive storage strategy based on a memory pool is used. The costs of sparse storage and dense storage are compared, and the format with the lower storage cost is selected automatically; the storage for all blocks and all small memory allocations is managed by the memory pool;
iii) Computation: a super block method is used. Multiple contiguously stored blocks that take part in an operation with the same block or vector are copied into temporary storage to form one super block, and the operation is then carried out with dense matrix kernels;
iv) All block computations are implemented with the dense matrix operations of BLAS3 or LAPACK.
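Items i) to iii) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the patented implementation: the `keep` predicate stands in for the drop condition, the byte-count cost model (8-byte values, 4-byte indices, coordinate-style sparse layout) is assumed, the memory pool is omitted, and the final multiplication is written as a loop where a real code would call a single dense BLAS kernel. All function names are hypothetical.

```python
def drop_in_block(block, keep):
    """Zero every entry that fails keep(); return None if the whole block is dropped."""
    out = [[v if keep(v) else 0.0 for v in row] for row in block]
    if all(v == 0.0 for row in out for v in row):
        return None
    return out


def choose_storage(block, value_bytes=8, index_bytes=4):
    """Pick the cheaper of dense storage and a coordinate-style sparse layout."""
    rows, cols = len(block), len(block[0])
    nnz = sum(1 for row in block for v in row if v != 0.0)
    dense_cost = rows * cols * value_bytes
    sparse_cost = nnz * (value_bytes + 2 * index_bytes)
    return "sparse" if sparse_cost < dense_cost else "dense"


def super_block_matvec(blocks, vec):
    """Copy row blocks that share one vector into a super block, then multiply once."""
    super_block = [row[:] for block in blocks for row in block]  # temporary copy
    return [sum(a * x for a, x in zip(row, vec)) for row in super_block]


block = drop_in_block([[5.0, 0.001], [0.002, 0.0]], keep=lambda v: abs(v) > 0.01)
print(block, choose_storage(block))
print(super_block_matvec([[[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]]], [1.0, 1.0]))
```

Gathering the small blocks into one contiguous super block is what lets a single large dense kernel replace many small ones, which is where item iv) gets its performance benefit.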
CN201410245950.0A 2014-06-05 2014-06-05 Based on inverse many wavefront block ILU preprocess method Active CN104035915B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410245950.0A CN104035915B (en) 2014-06-05 2014-06-05 Based on inverse many wavefront block ILU preprocess method

Publications (2)

Publication Number Publication Date
CN104035915A (en) 2014-09-10
CN104035915B (en) 2016-12-07

Family

ID=51466686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410245950.0A Active CN104035915B (en) 2014-06-05 2014-06-05 Based on inverse many wavefront block ILU preprocess method

Country Status (1)

Country Link
CN (1) CN104035915B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101944145A (en) * 2010-08-31 2011-01-12 电子科技大学 Finite element simulation method capable of removing microwave tube high-frequency circuit in pseudo-DC mode
CN102334110A (en) * 2008-11-12 2012-01-25 兰德马克绘图国际公司 Systems and methods for improved parallel ilu factorization in distributed sparse linear systems

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
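A Multifrontal Block ILU Preconditioner for the 3D Finite-element Eigenvalue Analysis of Lossy Slow-wave Structures of Traveling-wave Tubes; Hao Wang et al.; Vacuum Electronics Conference; 20140424; pp. 83-84 *
IMF: An Incomplete Multifrontal LU-Factorization For Element-Structured Sparse Linear Systems; Nick Vannieuwenhoven et al.; SIAM Journal on Scientific Computing; 20130131; Vol. 35 (No. 1); pp. A270-293 *
A supernode-based incomplete LU factorization algorithm; Shao Meiyue; China Master's Theses Full-text Database, Basic Sciences; 20091231; Vol. 2009 (No. 12); A002-21 *
A hybrid multifrontal block ILU and p-type multigrid preconditioner for finite-element eigenanalysis of lossy slow-wave structures; Wang Hao et al.; Chinese Journal of Vacuum Science and Technology; 20150228; Vol. 35 (No. 2); pp. 201-206 *
Research on parallel finite element analysis software for large structures; Wu Weiwei et al.; Computer Engineering and Applications; 20050331; Vol. 2005 (No. 6); pp. 89-91 *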

Also Published As

Publication number Publication date
CN104035915A (en) 2014-09-10

Similar Documents

Publication Publication Date Title
Świrydowicz et al. Low synchronization Gram–Schmidt and generalized minimal residual algorithms
Golub et al. On solving block-structured indefinite linear systems
Li et al. FETI‐DP, BDDC, and block Cholesky methods
Sorensen et al. Direct methods for matrix Sylvester and Lyapunov equations
US20100082724A1 (en) Method For Solving Reservoir Simulation Matrix Equation Using Parallel Multi-Level Incomplete Factorizations
Yamazaki et al. One-sided dense matrix factorizations on a multicore with multiple GPU accelerators
Zhao et al. Power grid analysis with hierarchical support graphs
Ballard et al. Reconstructing Householder vectors from tall-skinny QR
Jia et al. A fast collocation approximation to a two-sided variable-order space-fractional diffusion equation and its analysis
Zheng et al. A power Schur complement low-rank correction preconditioner for general sparse linear systems
Fang et al. A fast finite volume method for spatial fractional diffusion equations on nonuniform meshes
Swirydowicz et al. Low synchronization GMRES algorithms
US7769571B2 (en) Constraint stabilization
CN104035915B (en) Based on inverse many wavefront block ILU preprocess method
Boumzough et al. THE INCOMPLETE LU PRECONDITIONER USING BOTH CSR AND CSC FORMATS.
Datta et al. Parallel and large scale matrix computations in control: some ideas
Imai et al. Efficient sequential and parallel algorithms for planar minimum cost flow
Song et al. An improvement to exact reanalysis algorithm for local non-topological structural modifications
Langseth et al. Three-dimensional Euler computations using clawpack
Chen et al. Implementation of block algorithm for LU factorization
CN112446004B (en) Non-structural grid DILU preconditioned sub-many-core parallel optimization method
Mohanty et al. I/O efficient QR and QZ algorithms
Marcia On solving sparse symmetric linear systems whose definiteness is unknown
OGINO et al. Two-level extension of the hierarchical domain decomposition method
Kulkarni et al. A framework for low communication approaches for large scale 3D convolution

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant