KR20060057691A

KR20060057691A - Numerical simulator or finite element method

Info

Publication number: KR20060057691A
Application number: KR1020040096678A
Authority: KR
Inventors: 원태영; 윤상호
Original assignee: 원태영; 윤상호
Priority date: 2004-11-24
Filing date: 2004-11-24
Publication date: 2006-05-29

Abstract

The present invention relates to a numerical analysis simulation method and, more particularly, provides a numerical analysis method that uses a parallel computing algorithm to make finite element method (FEM) calculations more efficient.

In the numerical analysis method of the present invention, the computational domain is divided into subdomains according to the number of processors, a mesh and a system equation are then formed for each subdomain for parallel computation, and each processor performs its computation independently and in parallel.

By having each parallel processor perform independent operations in this way, the present invention solves the loss of efficiency that computational dependencies between processors cause in parallel computation.

Numerical analysis, finite element method

Description

Finite element method numerical analysis processor {NUMERICAL SIMULATOR OR FINITE ELEMENT METHOD}

FIG. 1 is a schematic diagram showing a conventional parallel computing numerical analysis method.

FIGS. 2A to 2F are computation flowcharts showing unstructured mesh generation and a parallel finite element method numerical analysis procedure according to an embodiment of the present invention.

<Explanation of reference numerals for the main parts of the drawings>

10, 210: internal node

20, 200: boundary node

220, 230: Voronoi diagram and Delaunay mesh

The present invention relates to a numerical analysis method that, by parallelizing numerical calculations that cannot be carried out on a single processor, handles computation time and memory usage effectively. In particular, it relates to a numerical analysis technique that parallelizes the entire computation, from mesh generation through the finite element calculation, to maximize processor efficiency.

For numerical calculations such as the finite element method, the domain to be computed is first converted into a mesh, which is a set of discrete nodes, and a discretization step defines the relationships between the nodes. As the computational domain grows, or as higher accuracy is required, the number of nodes increases steeply. A single processor then needs a vast amount of memory and computation time to carry out the calculation, which raises the baseline hardware requirements.

That is, when N nodes are used, an N x N matrix calculation is required, so the computation time and memory requirement grow in proportion to the square of the number of nodes.
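To make the quadratic growth concrete, the following is a minimal sketch that estimates the storage of a dense N x N matrix. It assumes one unknown per node and 8-byte (float64) entries; the function name `dense_matrix_bytes` is hypothetical. Practical finite element matrices are usually stored in sparse form, so this illustrates the dense estimate used in the text rather than a real solver's footprint.

```python
def dense_matrix_bytes(n_nodes: int, bytes_per_entry: int = 8) -> int:
    """Storage for a dense n_nodes x n_nodes matrix (8 bytes = one float64 entry)."""
    return n_nodes * n_nodes * bytes_per_entry


# Ten times more nodes means one hundred times more storage.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} nodes -> {dense_matrix_bytes(n) / 1e9:.3f} GB")
```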

To overcome this problem, numerical analysts use parallel computing techniques such as parallel finite element algorithms based on the domain decomposition method. The domain decomposition method divides the whole domain into subdomains and, so that the subdomains can be computed in parallel, determines the boundary conditions between them by an iterative method. The processors must therefore exchange data at every iteration until the result converges, and the processes must be synchronized at every iteration. Because the volume of transferred data is large, each processor depends heavily on the others, and when the convergence rate differs from processor to processor the workload is not balanced, which reduces the efficiency of parallel processing.
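The per-iteration exchange described above can be illustrated with a toy alternating Schwarz iteration, one well-known iterative domain decomposition scheme. This is a sketch under stated assumptions, not the specific prior-art algorithm: the 1D problem u'' = 0 on [0, 1] with u(0) = 0 and u(1) = 1, the overlap points `a` and `b`, and the function name are all chosen for illustration. Each subdomain solve is just a straight line, so the only work visible is the repeated exchange of interface values.

```python
def alternating_schwarz(a=0.4, b=0.6, tol=1e-10, max_iter=1000):
    """Solve u'' = 0 on [0, 1], u(0) = 0, u(1) = 1, on two overlapping
    subdomains [0, b] and [a, 1], exchanging interface values every iteration."""
    u_at_b = 0.0  # initial guess for the boundary value of the left subdomain
    for iteration in range(1, max_iter + 1):
        # Left solve: the solution is linear from u(0) = 0 to u(b) = u_at_b.
        u_at_a = u_at_b * a / b
        # Exchange and synchronization point: the right solve needs u_at_a.
        # Right solve: the solution is linear from u(a) = u_at_a to u(1) = 1.
        new_u_at_b = u_at_a + (1.0 - u_at_a) * (b - a) / (1.0 - a)
        converged = abs(new_u_at_b - u_at_b) < tol
        u_at_b = new_u_at_b
        if converged:
            return u_at_a, u_at_b, iteration
    raise RuntimeError("alternating Schwarz iteration did not converge")


u_a, u_b, n_exchanges = alternating_schwarz()
print(u_a, u_b, n_exchanges)  # interface values approach the exact u(x) = x
```

Even for this trivial problem the interface values have to be exchanged many times before they settle, which is the dependency between processors that the text identifies as the drawback.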

To address this loss of efficiency and the memory management problem, Doltsinis and Nolting (St. Doltsinis and S. Nolting, Computer Methods in Mechanics and Engineering, Vol. 89, pp. 497-521, 1991) proposed a parallel computing technique based on the substructure method.

The substructure method forms the system equation in each assigned subdomain, derives from it a relation that involves only the boundary nodes of that subdomain, and uses these relations to solve the whole system. Its main advantages are that data transfer between processors is minimized and that the first solve over the whole system involves only the boundary nodes, with the internal nodes excluded, so the matrix is much smaller and computation time is saved.

To maximize processor efficiency and improve the accuracy of the calculation, it is desirable to parallelize the entire computation, from unstructured mesh generation through the finite element calculation.

In the prior art, however, it is difficult to define the numbering of the internal and boundary nodes of each subdomain, which makes the substructure method hard to apply to unstructured meshes or to boundaries of complex shape. The method has therefore been applied only to structured meshes, and fully parallelizing the whole procedure, from unstructured mesh generation to the finite element calculation, has been difficult. The problems of the prior art are described in detail below with reference to FIG. 1.

The computational domain is divided according to the number of processors, and the nodes of each subdomain are classified as internal nodes (10) or boundary nodes (20). When the nodes are numbered for the finite element method, the internal nodes are numbered first and the boundary nodes afterwards. With this numbering, the entries of the system matrix obtained from the weak formulation can be identified as belonging to internal-node terms or boundary-node terms. The boundary nodes formed on each processor are, when viewed over the whole domain, again included as internal nodes (30).
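The numbering rule can be sketched as a simple reordering. The example assumes that each node of a subdomain already carries a flag saying whether it is a boundary node; the function name `interior_first_order` and the flag list are hypothetical.

```python
def interior_first_order(is_boundary):
    """Number internal nodes first, then boundary nodes.

    is_boundary[n] is True when original node n lies on the subdomain boundary.
    Returns the new ordering (list of old node ids), a map from old id to
    new id, and the number of internal nodes.
    """
    interior = [n for n, flag in enumerate(is_boundary) if not flag]
    boundary = [n for n, flag in enumerate(is_boundary) if flag]
    order = interior + boundary
    new_index = {old: new for new, old in enumerate(order)}
    return order, new_index, len(interior)


flags = [True, False, False, True, False]  # nodes 0 and 3 are boundary nodes
order, new_index, n_int = interior_first_order(flags)
print(order, n_int)  # [1, 2, 4, 0, 3] 3
```

After this renumbering, every matrix row or column with a new index below `n_int` belongs to an internal node and every other one to a boundary node, which is what lets the internal and boundary terms be told apart by position alone.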

Because of the difficulty of mesh formation and node numbering, parallel computation of this kind has so far been carried out with structured meshes only. As a result, the mesh lacks adaptiveness, an appropriate mesh cannot be formed without extending into regions where it is unnecessary, and the whole procedure, from unstructured mesh generation to the finite element calculation, cannot be fully parallelized.

Accordingly, a first object of the present invention is to provide a method that maximizes processor efficiency by parallelizing the entire computation, from unstructured mesh generation through the finite element calculation.

A second object of the present invention, in addition to the first object, is to provide a method of forming the mesh of each subdomain as a Delaunay mesh after the domain has been divided for parallel computing.

A third object of the present invention, in addition to the first object, is to provide a method of setting the boundary nodes of each boundary region and setting the boundary of the Delaunay mesh.

To achieve these objects, the present invention provides a parallel computing numerical analysis method comprising: dividing the domain according to the number of processors; generating a mesh in parallel in each subdomain; forming the system equation of each subdomain in parallel; integrating the system equations; and calculating the internal node values of the system equation of each subdomain using the calculated node values.

A preferred embodiment of the parallel computing numerical analysis method according to the present invention is described in detail below with reference to FIGS. 2A to 2F.

FIG. 2A shows, as a preferred embodiment of the present invention, the result of dividing the domain for an example with four processors. Because the division follows the number of available processors, the whole domain is divided into four on the assumption that four processors have been allocated. The nodes filled in black (200) mark the boundaries between processors, and the white nodes (210) are the nodes inside each subdomain. A mesh is then formed in each subdomain independently of the other processors.
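A four-way division of this kind can be sketched on a regular grid of nodes. This is only an illustration: the square grid, the 2 x 2 layout, the labels PE0 to PE3 for the four parts, and the function name are assumptions, and the outer boundary of the whole domain is not treated specially here.

```python
def partition_square_grid(n):
    """Split an (n+1) x (n+1) node grid into 2 x 2 subdomains (n must be even).

    Nodes on the two cut lines are shared boundary nodes (200 in FIG. 2A);
    every other node is an internal node (210) owned by exactly one processor.
    """
    if n % 2:
        raise ValueError("n must be even so that a grid line falls on each cut")
    mid = n // 2
    interface = []                          # nodes on the cuts between processors
    interior = {pe: [] for pe in range(4)}  # nodes owned by PE0..PE3
    for j in range(n + 1):
        for i in range(n + 1):
            if i == mid or j == mid:
                interface.append((i, j))
            else:
                pe = (i > mid) + 2 * (j > mid)  # 0..3, one quadrant per processor
                interior[pe].append((i, j))
    return interface, interior


interface, interior = partition_square_grid(4)
print(len(interface), [len(nodes) for nodes in interior.values()])
```

Fixing the shared nodes first means each processor can mesh its own quadrant without consulting its neighbours, because the nodes it must agree on are already decided.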

To generate the mesh for each independent subdomain, a Delaunay mesh generation technique based on the Voronoi diagram is provided. In the Delaunay mesh generation of FIGS. 2B and 2C, nodes are first created at arbitrary positions with a random number generator, the Voronoi diagram is then formed (220), and the Delaunay mesh is formed from it (230). In a preferred embodiment, an unstructured mesh generator such as a Delaunay mesh generator or an advancing-front (surface advancing) node generator may be used for this mesh formation.
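The relationship between random nodes, the Delaunay mesh, and the Voronoi diagram can be sketched with a brute-force construction. This is not the mesh generator of the patent: it simply tests every node triple against the empty-circumcircle property, which is practical only for a handful of nodes, and all names are hypothetical. The circumcentres it returns are the vertices of the corresponding Voronoi diagram.

```python
import random
from itertools import combinations


def circumcircle(a, b, c):
    """Return (centre_x, centre_y, radius_squared) of the circle through
    a, b, c, or None when the three points are collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def delaunay_brute_force(points, eps=1e-12):
    """Keep every triangle whose circumcircle contains no other node."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        if all((px - ux) ** 2 + (py - uy) ** 2 >= r2 - eps
               for m, (px, py) in enumerate(points) if m not in (i, j, k)):
            triangles.append(((i, j, k), (ux, uy)))  # triangle + Voronoi vertex
    return triangles


random.seed(0)  # fixed seed so the random nodes are reproducible
nodes = [(random.random(), random.random()) for _ in range(12)]
mesh = delaunay_brute_force(nodes)
print(len(mesh), "triangles")
```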

After the mesh of each subdomain has been formed, the nodes are ordered consistently, that is, the internal nodes are numbered first and the boundary nodes afterwards, so that by the nature of the finite element method the whole matrix is formed block by block. Because the boundary nodes were fixed at the outset and allocated in memory, as shown in FIG. 2A, they are easy to identify during numbering. FIG. 2D shows only the subdomain assigned to PE0 out of the four subdomains assumed at the start. Of the four sub-matrices that make up the stiffness matrix in FIG. 2D, the one with subscript ii is built from internal nodes only (240), and the one with subscript bb is built from boundary (outer) nodes only (250). The off-diagonal sub-matrices ib and bi (260) couple the internal nodes with the boundary nodes. Rearranging this stiffness matrix with simple algebra eliminates the internal-node unknowns, so that A* and b* form a system in the boundary nodes alone that already accounts for the influence of the internal nodes. This yields the new stiffness matrix of equation (270) in FIG. 2D.
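The reduction to boundary nodes can be sketched as a standard static condensation (Schur complement). The figure with equation (270) is not reproduced in this text, so the formulas below are the textbook form and are assumed to match it: A* = A_bb - A_bi A_ii^-1 A_ib and b* = b_b - A_bi A_ii^-1 b_i, with internal nodes numbered first. The function name and the small test matrix are hypothetical.

```python
import numpy as np


def condense_to_boundary(A, b, n_int):
    """Eliminate the first n_int (internal) unknowns of A x = b.

    Returns (A_star, b_star) such that A_star @ x_b = b_star holds for the
    boundary part x_b of the full solution.
    """
    A_ii, A_ib = A[:n_int, :n_int], A[:n_int, n_int:]
    A_bi, A_bb = A[n_int:, :n_int], A[n_int:, n_int:]
    b_i, b_b = b[:n_int], b[n_int:]
    # Solve A_ii Y = [A_ib | b_i] once instead of forming the inverse of A_ii.
    Y = np.linalg.solve(A_ii, np.column_stack([A_ib, b_i]))
    A_star = A_bb - A_bi @ Y[:, :-1]
    b_star = b_b - A_bi @ Y[:, -1]
    return A_star, b_star


# Two internal and two boundary unknowns, internal nodes numbered first.
A = np.array([[4.0, 1.0, 1.0, 0.0],
              [1.0, 4.0, 0.0, 1.0],
              [1.0, 0.0, 3.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
A_star, b_star = condense_to_boundary(A, b, n_int=2)
print(A_star.shape)  # (2, 2): only the boundary unknowns remain
```

Each processor can run this step on its own subdomain without communicating, since it only needs that subdomain's matrix and right-hand side.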

The relation (270), which involves only boundary nodes, can be used to redefine the whole domain. As shown in FIG. 2E, when the system equations built by the four processors are combined into a single system equation for one domain, it has the same form as the stiffness matrix (270) built on each processor. The boundary nodes of the whole domain are given by the boundary conditions, so the value of Xb is known. The value of Xi can therefore be obtained with a simple matrix calculation, as in equation (280).
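The combination step can be sketched as a scatter-add of the boundary-only systems into one global system, followed by a solve for the nodes whose values are not fixed by the boundary conditions. The function names, the tuple layout of `contributions`, and the 1D two-subdomain example are assumptions made for illustration; equation (280) itself is not reproduced in this text.

```python
import numpy as np


def assemble_interface_system(n_interface, contributions):
    """Add each subdomain's boundary-only system into one global system.

    contributions: iterable of (A_star, b_star, ids), where ids lists the
    global index of each local boundary node (unique within one subdomain).
    """
    A = np.zeros((n_interface, n_interface))
    b = np.zeros(n_interface)
    for A_star, b_star, ids in contributions:
        ids = np.asarray(ids)
        A[np.ix_(ids, ids)] += A_star
        b[ids] += b_star
    return A, b


def solve_with_known_values(A, b, known_ids, known_values):
    """Solve for the remaining unknowns when some values are prescribed."""
    x = np.zeros(len(b))
    x[known_ids] = known_values
    free = np.setdiff1d(np.arange(len(b)), known_ids)
    rhs = b[free] - A[np.ix_(free, known_ids)] @ x[known_ids]
    x[free] = np.linalg.solve(A[np.ix_(free, free)], rhs)
    return x


# Two 1D subdomains sharing one node; each condensed system couples its two ends.
S = np.array([[0.5, -0.5], [-0.5, 0.5]])
A, b = assemble_interface_system(3, [(S, np.zeros(2), [0, 1]),
                                     (S, np.zeros(2), [1, 2])])
x = solve_with_known_values(A, b, known_ids=[0, 2], known_values=[0.0, 1.0])
print(x)  # the shared node takes the midpoint value between the two ends
```

This is the only step that needs data from all processors, and the system it solves contains boundary nodes only.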

Once the boundary values of each subdomain have been obtained in this way, the internal nodes (210) are calculated independently in each subdomain, as shown in FIG. 2F. As in the whole-domain calculation above, the boundary values are known, so the internal node values follow from a simple matrix calculation (290).
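The final per-subdomain step can be sketched as a back-substitution: with the boundary values x_b known, the internal values satisfy A_ii x_i = b_i - A_ib x_b. The formula is the standard one for a block system with internal nodes numbered first and is assumed to correspond to calculation (290); the function name and test matrix are hypothetical.

```python
import numpy as np


def recover_interior(A, b, n_int, x_b):
    """Given known boundary values x_b, solve for the first n_int (internal)
    unknowns of A x = b."""
    A_ii, A_ib = A[:n_int, :n_int], A[:n_int, n_int:]
    return np.linalg.solve(A_ii, b[:n_int] - A_ib @ x_b)


A = np.array([[4.0, 1.0, 1.0, 0.0],
              [1.0, 4.0, 0.0, 1.0],
              [1.0, 0.0, 3.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
x_full = np.linalg.solve(A, b)               # reference solution
x_i = recover_interior(A, b, 2, x_full[2:])  # uses only the boundary values
print(np.allclose(x_i, x_full[:2]))
```

Because this solve touches only one subdomain's own blocks, every processor can perform it at the same time with no further communication.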

The foregoing has outlined rather broadly the features and technical advantages of the present invention so that the claims that follow may be better understood. Additional features and advantages that form the subject of the claims are described below. Those skilled in the art should appreciate that the disclosed concept and specific embodiment may readily be used as a basis for designing or modifying other structures that serve purposes similar to those of the present invention.

Those skilled in the art may use the concept and embodiments disclosed here as a basis for modifying or designing other structures that serve the same purposes as the present invention. Such modified or altered equivalent structures admit various changes, substitutions, and alterations, provided they do not depart from the spirit or scope of the invention as set out in the claims.

As described above, in the parallel numerical analysis technique of the present invention, data is transferred between processors only once, which greatly improves the independence of each processor, and the number of nodes that must be solved for together is greatly reduced, which improves efficiency for high-speed parallel processing.

That is, suppose that of N nodes in total, R are internal nodes and N - R are boundary nodes. Because the time for a matrix calculation grows with the square of the matrix size, the computation time is not reduced merely in proportion to R: the boundary-only system has N - R unknowns, so its cost scales with (N - R)^2 rather than N^2.
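Under the quadratic cost model stated in the text, the saving can be estimated with a one-line calculation. This sketch assumes that cost is exactly proportional to the square of the number of unknowns and ignores the cost of eliminating the internal nodes in each subdomain; the function name is hypothetical.

```python
def condensed_cost_ratio(n_total, n_internal):
    """Cost of the boundary-only system relative to the full system,
    assuming cost proportional to (number of unknowns) squared."""
    n_boundary = n_total - n_internal
    return (n_boundary / n_total) ** 2


# With 1000 nodes of which 900 are internal, the boundary-only solve
# costs about 1% of the full solve under this model.
print(condensed_cost_ratio(1000, 900))
```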

Claims

A parallel computing numerical analysis method comprising:

dividing the computational domain according to the number of processors;

generating a mesh in parallel in each divided subdomain;

forming the system equation of each subdomain in parallel;

integrating the system equations of the subdomains; and

calculating the internal node values of the system equation of each subdomain using the calculated node values.

The method of claim 1, wherein, after the domain is divided for parallel computing, the mesh of each subdomain is formed as a Delaunay mesh.

The method of claim 1, wherein the boundary nodes of each boundary region are set and the boundary of the Delaunay mesh is set.