CN115455794B - LBM parallel optimization method, device and storage medium based on communication pore division calculation region - Google Patents
- Publication number
- CN115455794B (application number CN202210953478.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/25—Design optimisation, verification or simulation using particle-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/10—Numerical modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2113/00—Details relating to the application field
- G06F2113/08—Fluids
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides an LBM parallel optimization method and device that divide the calculation region based on connected pores. The number of subdomains is set freely according to the number of computing nodes and the subdomains are balanced in size, so that every computing node carries a balanced load and the computation is more efficient. The method comprises the following steps: step 1, determining the total number N of subdomains to be decomposed according to the number N of computing nodes in the system; step 2, decomposing the computational domain: the flow domain is divided along the x-axis into n_x regions having the same or a similar number of units, each region is then divided along the y-axis into n_y sub-regions having the same or a similar number of units, and finally each of the n_x × n_y sub-regions is divided along the z-axis into n_z parts, giving N subdomains having the same or a similar number of units, each subdomain being a three-dimensional region composed of a plurality of pore units; after decomposition, the difference in the number of units between the subdomain with the most units and the subdomain with the fewest units should not exceed one thousandth of the total number of units; and step 3, distributing calculation tasks.
Description
Technical Field
The invention belongs to the technical field of pore structure simulation, and in particular relates to an LBM parallel optimization method, device and storage medium that divide the calculation region based on connected pores.
Background
The lattice Boltzmann method (LBM) is an effective means of simulating fluid flow in the pore structure of porous media at the mesoscale. For the simulation results to be representative, the simulated sample must be large enough; because fluid flow in a pore structure is itself complex, pore-scale LBM simulations tend to demand large amounts of computing resources and storage space. Large-scale LBM simulation therefore requires parallel optimization, and its execution time and memory requirements depend on the data size and on the parallel optimization method.
The basic idea of domain decomposition in current parallel computing is to decompose the computational domain into several subdomains, then restore the interfaces and perform load balancing. For complex porous materials, however, this step often needs many time-consuming iterations to reach an acceptable load balance. The recursive bisection method is the most commonly used domain decomposition scheme: it divides the computational domain evenly into two subdomains, then bisects each subdomain again, and after R levels of recursion yields 2^R subdomains. This scheme balances the workload well for homogeneous media, whether their structure is regular or irregular, but it has three problems: 1) a good workload balance is still hard to achieve for non-uniform, highly heterogeneous porous structures; 2) the number of subdomains must be 2^R, which often does not equal the number of processors (computing nodes), so some processors cannot be used effectively; 3) when many subdomains are needed, the number R of recursive divisions grows and efficiency is low.
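To make problem 2) concrete, the following minimal Python sketch counts how many processors recursive bisection leaves idle when the subdomain count is restricted to a power of two. The function name `bisection_utilization` is illustrative and not part of the original disclosure.

```python
def bisection_utilization(num_processors):
    """Return (usable_subdomains, idle_processors) when the subdomain
    count is forced to be a power of two, as in recursive bisection."""
    if num_processors < 1:
        raise ValueError("need at least one processor")
    # Largest R such that 2**R <= num_processors.
    r = num_processors.bit_length() - 1
    usable = 1 << r
    return usable, num_processors - usable


if __name__ == "__main__":
    # With 12 processors only 8 subdomains can be formed, so 4 stay idle.
    print(bisection_utilization(12))
```

Any processor count that is not itself a power of two leaves some nodes without a subdomain, which is the inefficiency the invention sets out to remove.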
Disclosure of Invention
The invention has been made to solve the above problems, and aims to provide an LBM parallel optimization method, device and storage medium that divide the calculation region based on connected pores, in which the number of subdomains is chosen freely according to the number of computing nodes and the subdomains are balanced, so that every computing node processes a balanced load efficiently and the computation is faster.
In order to achieve the above object, the present invention adopts the following scheme:
< method >
The invention provides an LBM parallel optimization method that divides the calculation region based on connected pores, characterized by comprising the following steps:
step 1, determining the pore unit information and connectivity of the sample from the pore distribution data of the porous medium sample to be simulated; determining the total number N of subdomains to be decomposed according to the number N of computing nodes in the system;
step 2, decomposing the computational domain:
dividing the flow domain along the x-axis into n_x regions having the same or a similar number of units, then dividing each region along the y-axis into n_y sub-regions having the same or a similar number of units, and finally dividing each of the n_x × n_y sub-regions along the z-axis into n_z parts, obtaining n_x × n_y × n_z = N subdomains having the same or a similar number of units, each subdomain being a three-dimensional region composed of a plurality of pore units; after decomposition, the difference in the number of units between the subdomain having the maximum number of units M_max and the subdomain having the minimum number of units M_min should not exceed one thousandth of the total number of units; this allowable difference can be reduced further if a more precise division of the computational domain is required;
step 3, distributing calculation tasks:
during parallel computing, the N subdomains are allocated one-to-one to the N computing nodes for processing; when processing a subdomain, a computing node considers only the connected pore units, renumbers them, and stores them in a one-dimensional array p_i, where i is the number of the subdomain; each connected pore unit is associated with one coordinate and stored in the order of the one-dimensional array, giving a pore unit array that holds the ordinal of each pore unit and its coordinates, so that each pore unit is tracked through its unique ordinal and unique coordinates;
for each pore unit: each function in the main data structure of the pore unit is stored, indexed by the ordinal of the pore unit, as a one-dimensional array, giving a series of first derived arrays; the momentum and local fluid density data of the pore unit are likewise stored, indexed by the ordinal of the pore unit, as one-dimensional arrays, giving a series of second derived arrays.
Specifically, the LBM parallel optimization method provided by the invention may also have the following feature: in step 2, the maximum number of units M_max:
Minimum number of units M_min:
where NX, NY and NZ are the total numbers of connected pore units along the x, y and z axes of the simulation region, respectively.
Preferably, the LBM parallel optimization method provided by the invention may also have the following feature: in step 3, the main data structure of each pore unit consists of its fluid particle distribution function and its equilibrium fluid particle distribution function; the momentum and local fluid density data of the pore unit before and after the calculation are stored, indexed by the ordinal of the pore unit, as one-dimensional arrays, giving four second derived arrays.
Preferably, the LBM parallel optimization method provided by the invention further comprises the following step: step 4, setting the communication mode among subdomains: when adjacent or diagonal pore units belonging to different subdomains are processed, the interface layer on the interface of the current subdomain is extended by one additional layer of units so that it covers the adjacent and diagonal pore units, and the extended pore units then communicate in the same way as the interface units inside the current subdomain.
< device >
Further, the present invention also provides a device for automatically implementing the above < method >, characterized by comprising:
a determining part, which determines the pore unit information and connectivity of the sample from the pore distribution data of the porous medium sample to be simulated, and determines the total number N of subdomains to be decomposed according to the number N of computing nodes in the system;
a calculation domain decomposition part, which divides the flow domain along the x-axis into n_x regions having the same or a similar number of units, then divides each region along the y-axis into n_y sub-regions having the same or a similar number of units, and finally divides each of the n_x × n_y sub-regions along the z-axis into n_z parts, obtaining n_x × n_y × n_z = N subdomains having the same or a similar number of units; the decomposed subdomains should satisfy the condition that the difference in the number of units between the subdomain having the maximum number of units M_max and the subdomain having the minimum number of units M_min does not exceed one thousandth of the total number of units;
a calculation task allocation part, which allocates the N subdomains one-to-one to the N computing nodes for processing and performs the parallel computation; when processing a subdomain, a computing node considers only the connected pore units, renumbers them, and stores them in a one-dimensional array p_i, where i is the number of the subdomain; each connected pore unit is associated with one coordinate and stored in the order of the one-dimensional array, giving a pore unit array that holds the ordinal of each pore unit and its coordinates; for each pore unit: each function in the main data structure of the pore unit is stored, indexed by the ordinal of the pore unit, as a one-dimensional array, giving a series of first derived arrays; the momentum and local fluid density data of the pore unit are likewise stored, indexed by the ordinal of the pore unit, as one-dimensional arrays, giving a series of second derived arrays;
and a control part, which is communicatively connected to the determining part, the calculation domain decomposition part and the calculation task allocation part and controls their operation.
Preferably, the LBM parallel optimization device provided by the invention may also have the following feature: in the calculation domain decomposition part, the maximum number of units M_max:
Minimum number of units M_min:
where NX, NY and NZ are the total numbers of connected pore units along the x, y and z axes of the simulation region, respectively.
Preferably, the LBM parallel optimization device provided by the invention further comprises: a communication mode setting part, which is communicatively connected to the control part; when adjacent or diagonal pore units belonging to different subdomains are processed, it extends the interface layer on the interface of the current subdomain by one additional layer of units so that it covers the adjacent and diagonal pore units, and the extended pore units then communicate in the same way as the interface units inside the current subdomain.
Preferably, the LBM parallel optimization device provided by the invention further comprises: an input display part, which is communicatively connected to the determining part, the calculation domain decomposition part, the calculation task allocation part, the communication mode setting part and the control part, and which lets a user input operation instructions and displays the corresponding results.
Preferably, the LBM parallel optimization device provided by the invention may also have the following feature: according to the operation instruction, the input display part can display the sample pore unit information and connectivity determined by the determining part together with the number of computing nodes or the total number N of subdomains, display the decomposition produced by the calculation domain decomposition part, and display the allocation produced by the calculation task allocation part.
< storage Medium >
In addition, the present invention also provides a computer-readable storage medium storing a program for realizing the above < method >.
Effects of the invention
The LBM parallel optimization method, device and storage medium that divide the calculation region based on connected pores can set the total number of subdomains freely according to the number of computing nodes, without being limited to 2^R, and divide them in a balanced manner. They effectively reduce the communication complexity caused by tortuous division interfaces, balance the workload of all subdomains during the decomposition of the flow domain without secondary optimization such as iteration, reduce the memory consumption of the system, simplify the distribution of calculation tasks, make communication among subdomains efficient and easy to implement, and shorten the communication time. The method can therefore significantly improve computational efficiency, is particularly suited to pore-scale simulations of porous materials that involve heavy computation and high memory consumption, opens up a new approach to decomposing and processing calculation regions, and has great value for wider adoption.
Drawings
FIG. 1 is a schematic diagram of a computational domain decomposition process in an embodiment of the present invention, wherein (a) is decomposed along the x-axis, (b) is decomposed along the y-axis, and (c) is decomposed along the z-axis;
FIG. 2 is a schematic diagram of a process of marking fluid cells in a porous medium with a one-dimensional array after decomposing to obtain subfields in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the interface between subdomain 0# and subdomain 1# in an embodiment of the present invention (white cells are pore units, grey cells are solid units);
FIG. 4 is a diagram of the D3Q19 discrete velocity model used in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the process of exchanging information between diagonal units belonging to different subdomains in an embodiment of the present invention.
Detailed Description
The LBM parallel optimization method, device and storage medium that divide the calculation region based on connected pores according to the invention are explained in detail below with reference to the accompanying drawings.
< example >
The LBM parallel optimization method that divides the calculation region based on connected pores, as provided by this embodiment, comprises the following steps:
Step 1, determining the pore unit information and connectivity of the sample from the pore distribution data of the porous medium sample to be simulated; determining the total number N of subdomains to be decomposed according to the number N of computing nodes in the system. For a porous material with a uniform and simple structure, the pore distribution data can be defined directly in a programming language; for a non-uniform, highly heterogeneous porous structure, the pore distribution data can be obtained by X-ray CT.
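One possible way to obtain the connectivity information of step 1 is a breadth-first flood fill over the pore units. The sketch below is an assumption for illustration only: it represents pore units as a set of integer (x, y, z) tuples, treats units as connected when they share a face, and seeds the search from the plane x = 0. The source does not specify the connectivity criterion, and the names `connected_pores` and `inlet_x` are hypothetical.

```python
from collections import deque

# Face-sharing neighbour offsets (assumed connectivity criterion).
FACE_STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
              (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def connected_pores(pore_cells, inlet_x=0):
    """Return the subset of pore_cells reachable from the plane x == inlet_x."""
    seen = {c for c in pore_cells if c[0] == inlet_x}
    queue = deque(seen)
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in FACE_STEPS:
            nxt = (x + dx, y + dy, z + dz)
            if nxt in pore_cells and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    # A straight channel along x plus one isolated pore that is discarded.
    pores = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 2, 0)}
    print(sorted(connected_pores(pores)))
```

Isolated pores carry no flow, so dropping them at this stage keeps them out of every later array.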
Step 2, decomposing the computational domain:
The flow domain of the porous medium sample is divided along the x-axis into n_x regions having the same or a similar number of units, each region is then divided along the y-axis into n_y sub-regions having the same or a similar number of units, and finally each of the n_x × n_y sub-regions is divided along the z-axis into n_z parts, giving n_x × n_y × n_z = N subdomains having the same or a similar number of units. After decomposition, the difference in the number of units between the subdomain having the maximum number of units M_max and the subdomain having the minimum number of units M_min should not exceed one thousandth of the total number of units.
Maximum number of units M_max:
Minimum number of units M_min:
where NX, NY and NZ are the total numbers of connected pore units along the x, y and z axes of the simulation region, respectively.
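The balance condition itself can be checked in a few lines of Python. This sketch assumes the per-subdomain unit counts are already known; `is_balanced` and its `tolerance` parameter are hypothetical names, and whether `total_units` counts all units or only connected pore units is left to the caller.

```python
def is_balanced(unit_counts, total_units, tolerance=1e-3):
    """Check that M_max - M_min does not exceed tolerance * total_units.

    unit_counts: number of units in each subdomain after decomposition.
    tolerance:   allowed imbalance as a fraction of the total
                 (one thousandth by default, as stated in the text).
    """
    m_max = max(unit_counts)
    m_min = min(unit_counts)
    return (m_max - m_min) <= tolerance * total_units


if __name__ == "__main__":
    print(is_balanced([1000, 1001, 999], total_units=3000))   # difference of 2 against a limit of 3
    print(is_balanced([1010, 1000, 990], total_units=3000))   # difference of 20 against a limit of 3
```

Passing a smaller `tolerance` corresponds to tightening the allowable difference when a more precise division is required.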
Specifically, as shown in FIG. 1, every unit in the calculation region is first traversed by successive scans in the z, y and x directions. Note that the scan order z→y→x means that each cross-section x_i is traversed in z first and then in y. FIG. 1(b) shows each subdomain of FIG. 1(a) being divided into 2 smaller regions along the y-axis: the pore units of each subdomain are numbered in the order x→z→y and stored in a one-dimensional array, and the one-dimensional array associated with each subdomain is then split into two parts, so that the flow domain is now divided into 3×2×1 sub-regions. FIG. 1(c) shows the same procedure along the z-axis: the pore units of each subdomain are traversed in the scan order x→y→z and the division is repeated, so that the simulation region is finally divided into 3×2×2 subdomains. Next, as shown in FIG. 2, the connected pore units are screened out and stored in a one-dimensional array; the flow domain contains NX pore units in the x-axis direction. As shown in FIG. 3, all connected pore units are divided into several groups of equal size, each group being associated with one subdomain, and the total number of groups equals the number of subdomains after decomposition. The one-dimensional array is then divided into three subgroups, each containing NX/3 or NX/3+1 pore units, and the ordinals of the last units of the three subgroups are N_0, N_1 and NX, respectively. Finally, the interface between two adjacent subdomains is restored, as shown in FIG. 3 and FIG. 1(a); the interface includes the last pore unit of each partition.
This domain decomposition scheme divides the soil sample directly into n_x × n_y × n_z subdomains, and the difference in workload between subdomains is the difference in the number of pore units at the adjacent interfaces between them. The subsequent simulation operates on the divided one-dimensional arrays, and the simulation region of each subdomain runs from 1 to nx_i, where nx_i is the size in the x direction (the fluid flow direction) of the subdomain associated with processor i.
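The count-based splitting described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: pore units are assumed to be (x, y, z) tuples, the sort keys are one reading of the scan orders named in the text, and `split_evenly` and `decompose` are hypothetical helper names.

```python
def split_evenly(items, parts):
    """Split a list into `parts` contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for k in range(parts):
        size = base + (1 if k < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def decompose(cells, n_x, n_y, n_z):
    """Divide pore units into n_x * n_y * n_z subdomains of near-equal count."""
    subdomains = []
    # x split: x varies slowest, then y, with z fastest (scan order z -> y -> x).
    by_x = sorted(cells, key=lambda c: (c[0], c[1], c[2]))
    for slab in split_evenly(by_x, n_x):
        # y split: y varies slowest, then z, with x fastest (scan order x -> z -> y).
        by_y = sorted(slab, key=lambda c: (c[1], c[2], c[0]))
        for column in split_evenly(by_y, n_y):
            # z split: z varies slowest, then y, with x fastest (scan order x -> y -> z).
            by_z = sorted(column, key=lambda c: (c[2], c[1], c[0]))
            subdomains.extend(split_evenly(by_z, n_z))
    return subdomains


if __name__ == "__main__":
    cells = [(x, y, z) for x in range(7) for y in range(5) for z in range(4)]
    sizes = [len(s) for s in decompose(cells, 3, 2, 2)]
    print(len(sizes), min(sizes), max(sizes))
```

Because every split is made on the one-dimensional array by count rather than by geometric position, the subdomain sizes stay within one unit of each other however irregular the pore space is, and no iterative rebalancing is needed.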
Step 3, distributing calculation tasks:
During parallel computing, the N subdomains are allocated one-to-one to the N computing nodes for processing. When processing a subdomain, a computing node considers only the connected pore units, renumbers them, and stores them in a one-dimensional array p_i, where i is the number of the subdomain. Each connected pore unit is associated with one coordinate and stored in the order of the one-dimensional array, giving a pore unit array that holds the ordinal of each pore unit and its coordinates, so that each pore unit is tracked through its unique ordinal and coordinates.
For each pore unit: each function in the main data structure of the pore unit is stored, indexed by the ordinal of the pore unit, as a one-dimensional array, giving a series of first derived arrays; the momentum and local fluid density data of the pore unit are likewise stored, indexed by the ordinal of the pore unit, as one-dimensional arrays, giving a series of second derived arrays.
The main data structure of each pore unit consists of its fluid particle distribution function and its equilibrium fluid particle distribution function; the momentum and local fluid density data of the pore unit before and after the calculation are stored, indexed by the ordinal of the pore unit, as one-dimensional arrays, giving four second derived arrays.
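A compact storage layout of this kind might look like the following Python sketch. It is illustrative only: the dictionary keys, the function name `build_compact_storage` and the initial values are assumptions, momentum is held as a 3-tuple per unit for simplicity, and the number of distribution components `q` is left as a parameter.

```python
def build_compact_storage(subdomain_cells, q=19):
    """Renumber the connected pore units of one subdomain and allocate
    one-dimensional arrays indexed by the new ordinal."""
    coords = sorted(subdomain_cells)                 # ordinal -> (x, y, z)
    ordinal = {c: k for k, c in enumerate(coords)}   # (x, y, z) -> ordinal
    n = len(coords)
    return {
        "coords": coords,
        "ordinal": ordinal,
        # First derived arrays: one 1-D array per distribution component.
        "f": [[0.0] * n for _ in range(q)],
        "f_eq": [[0.0] * n for _ in range(q)],
        # Second derived arrays: momentum and density before and after a step.
        "momentum_old": [(0.0, 0.0, 0.0)] * n,
        "momentum_new": [(0.0, 0.0, 0.0)] * n,
        "density_old": [1.0] * n,
        "density_new": [1.0] * n,
    }


if __name__ == "__main__":
    store = build_compact_storage({(2, 0, 0), (0, 0, 0), (1, 0, 0)})
    print(store["coords"], store["ordinal"][(1, 0, 0)])
```

Only pore units receive an ordinal, so solid units consume no memory in any of the arrays.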
As shown in FIG. 4, according to the D3Q19 model in LBM, each connected pore unit x has up to 18 adjacent pore units, which are stored in 18 one-dimensional arrays, each array corresponding to one component. The main data structure of each pore unit consists of its 19 fluid particle distribution functions and 18 equilibrium fluid particle distribution functions, each described by a one-dimensional array; four further one-dimensional arrays store the momentum and local fluid density in the pore unit.
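The 18 neighbour arrays can be sketched as below. This is a minimal illustration under stated assumptions: the ordering of the 18 moving D3Q19 directions produced by `itertools.product` is arbitrary and does not follow the numbering of FIG. 4, a missing (solid or out-of-subdomain) neighbour is marked with -1, and `build_neighbor_arrays` is a hypothetical name.

```python
from itertools import product

# The 18 moving D3Q19 directions: 6 along the axes plus 12 in-plane diagonals.
MOVING = [v for v in product((-1, 0, 1), repeat=3)
          if 1 <= sum(abs(c) for c in v) <= 2]


def build_neighbor_arrays(coords):
    """For each of the 18 directions, return a 1-D array giving the ordinal
    of the neighbouring pore unit, or -1 if there is none."""
    ordinal = {c: k for k, c in enumerate(coords)}
    arrays = []
    for dx, dy, dz in MOVING:
        arrays.append([ordinal.get((x + dx, y + dy, z + dz), -1)
                       for x, y, z in coords])
    return arrays


if __name__ == "__main__":
    coords = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
    nb = build_neighbor_arrays(coords)
    print(len(nb), nb[MOVING.index((1, 0, 0))])
```

With the neighbour ordinals precomputed, the streaming step becomes a lookup in these arrays and never has to search the three-dimensional grid.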
Step 4, setting the communication mode among subdomains (how calculation data are extracted and called): as shown in FIG. 5, when adjacent or diagonal pore units belonging to different subdomains (d and d' in the figure) are processed, the interface layer on the interface of the current subdomain is extended by one additional layer of units so that it covers the adjacent and diagonal pore units, and the extended pore units then communicate in the same way as the interface units inside the current subdomain.
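The one-layer extension of the interface can be sketched as follows. The sketch assumes pore units are (x, y, z) tuples and that "adjacent and diagonal" means the 18 moving D3Q19 directions; `halo_units` is a hypothetical name.

```python
from itertools import product

# The 18 moving D3Q19 directions (axis-aligned and in-plane diagonal).
MOVING = [v for v in product((-1, 0, 1), repeat=3)
          if 1 <= sum(abs(c) for c in v) <= 2]


def halo_units(own_units, all_pore_units):
    """Return pore units outside `own_units` that are adjacent or diagonal
    to at least one unit inside it (the extra interface layer)."""
    own = set(own_units)
    halo = set()
    for x, y, z in own:
        for dx, dy, dz in MOVING:
            nb = (x + dx, y + dy, z + dz)
            if nb in all_pore_units and nb not in own:
                halo.add(nb)
    return halo


if __name__ == "__main__":
    own = {(0, 0, 0), (1, 0, 0)}
    everything = own | {(2, 0, 0), (2, 1, 0), (3, 0, 0)}
    print(sorted(halo_units(own, everything)))
```

In the small example, (2, 0, 0) and (2, 1, 0) enter the extended layer because they touch (1, 0, 0) directly or diagonally, while (3, 0, 0) is two units away and is left out.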
In each subdomain of this embodiment, the overlap region is stored in 18 one-dimensional arrays, and no redundant or repeated calculations are performed. The particle distribution functions and the equilibrium distribution functions of the fluid units from 1 to nx_i in each subdomain are stored in two separate sets of one-dimensional arrays (18 one-dimensional arrays per set). A pointer is used to locate the first pore unit of the interface, and only the pore unit data in the interface layer are exchanged with the adjacent partition. Since each subdomain stores only its pore units, including those of the interface layer, and the adjacent units of every pore unit are stored in 18 one-dimensional arrays, the adjacent units of every pore unit are easy to determine during the simulation. In each iteration, the fluid particle distribution functions of the corresponding subdomain are computed on each processor, and the data at the subdomain interfaces are then exchanged between the processors. In this data exchange, only 5 of the 18 distribution function components of the fluid units on the interface layer need to be transferred to the adjacent processor. For example, in the x direction, only the fluid particle distribution functions f_3[i], f_8[i], f_9[i], f_12[i] and f_13[i] at interface nx_i need to be transferred from the right subdomain to the left subdomain, and likewise only f_1[i], f_7[i], f_10[i], f_11[i] and f_14[i] need to be transferred from the left subdomain to the right subdomain.
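Why exactly five components cross an x-interface in each direction can be seen from the D3Q19 velocity set itself. The sketch below is illustrative: its component indices come from an arbitrary enumeration and do not match the f_1, f_3, ... numbering of the embodiment, and `components_crossing_x` and `pack_interface` are hypothetical helper names.

```python
from itertools import product

# D3Q19 velocity set: the rest vector followed by the 18 moving directions.
VELOCITIES = [(0, 0, 0)] + [v for v in product((-1, 0, 1), repeat=3)
                            if 1 <= sum(abs(c) for c in v) <= 2]


def components_crossing_x(sign):
    """Indices of components whose x-velocity equals `sign` (+1 or -1)."""
    return [k for k, v in enumerate(VELOCITIES) if v[0] == sign]


def pack_interface(f, interface_ordinals, sign):
    """Collect only the components that leave through the x-interface,
    restricted to the pore units on the interface layer."""
    return {k: [f[k][i] for i in interface_ordinals]
            for k in components_crossing_x(sign)}


if __name__ == "__main__":
    print(len(components_crossing_x(+1)), len(components_crossing_x(-1)))
```

Only components with a non-zero x-velocity can cross a plane normal to x, and D3Q19 has five of them pointing each way, so the message for each interface unit carries 5 values instead of 18.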
With this method, the total number of subdomains can be set freely according to the number of computing nodes and divided in a balanced manner, so that all computing nodes are fully used. The method effectively reduces the communication complexity caused by tortuous division interfaces, balances the workload of all subdomains during the decomposition of the flow domain without secondary optimization such as iteration, reduces the memory consumption of the system, simplifies the distribution of calculation tasks, and effectively shortens the communication time among subdomains. Computational efficiency is therefore significantly improved, and the advantage grows with the data volume; the method is particularly suited to pore-scale simulations of porous materials that involve heavy computation and high memory consumption.
Further, this embodiment also provides a device capable of automatically implementing the above method, which comprises a determining part, a calculation domain decomposition part, a calculation task allocation part, a communication mode setting part, an input display part and a control part.
The determining part determines the pore unit information and connectivity of the sample from the pore distribution data of the porous medium sample to be simulated, and determines the total number N of subdomains to be decomposed according to the number N of computing nodes in the system.
The calculation domain decomposition part divides the flow domain of the porous medium sample along the x-axis into n_x regions having the same or a similar number of units, then divides each region along the y-axis into n_y sub-regions having the same or a similar number of units, and finally divides each of the n_x × n_y sub-regions along the z-axis into n_z parts, obtaining n_x × n_y × n_z = N subdomains having the same or a similar number of units. The decomposed subdomains should satisfy the condition that the difference in the number of units between the subdomain having the maximum number of units M_max and the subdomain having the minimum number of units M_min does not exceed one thousandth of the total number of units.
Maximum number of units M_max:
Minimum number of units M_min:
where NX, NY and NZ are the total numbers of connected pore units along the x, y and z axes of the simulation region, respectively.
The computing task allocation part allocates the N sub-domains to N computing nodes one by one for processing and performs parallel computing; the computing node considers only all connected pore units when processing the sub-domains, renumbers the connected pore units, and then stores the connected pore units in a one-dimensional array p respectively i In the formula, i is the number of the subdomain; each communicated pore unit is associated with one coordinate and is stored according to the sequence of the one-dimensional array to obtain a pore unit array storing the ordinal number of the pore unit and the corresponding coordinate; for each void cell: storing each function in the main data structure of the pore unit and the ordinal number of the pore unit as a one-dimensional array to obtain a series of first derivative arrays; and storing the momentum and local fluid density data of the pore unit and the ordinal number of the pore unit into a one-dimensional array to obtain a series of second derivative arrays.
The communication mode setting part is in communication connection with the control part, and when adjacent or diagonal pore units among different subfields are processed, an interface layer on the interface of the current subfield is additionally expanded by one layer of units to cover the adjacent and diagonal pore units, so that the communication modes of the expanded pore units are consistent with the interface units in the current subfield.
The input display part is used for enabling a user to input an operation instruction and correspondingly display the operation instruction. For example, the input display unit may display the sample pore unit information and the connection condition determined by the determining unit, and the number of calculation nodes or the total number of subfields N, display the decomposition condition of the calculation domain decomposition unit, and display the distribution condition of the calculation task distribution unit, respectively, in accordance with the operation instruction.
The control unit is in communication with the determination unit, the calculation domain decomposition unit, the calculation task allocation unit, and the communication scheme setting unit, and controls the operation of the determination unit, the calculation domain decomposition unit, and the calculation task allocation unit.
The above embodiments merely illustrate the technical solutions of the present invention. The LBM parallel optimization method, device and storage medium based on connected-pore division of the calculation region according to the present invention are not limited to what is described in the above embodiments, but are defined by the scope of the claims. Any modifications, additions or equivalent substitutions made by those skilled in the art on the basis of these embodiments fall within the scope claimed by the claims.
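As a further illustration, the axis-wise division into parts with the same or similar unit counts, and the one-thousandth balance condition on the resulting subdomains, might be sketched in Python as follows. The names `split_balanced` and `is_balanced`, the prefix-sum cutting rule, and the use of the summed subdomain counts as the "total number of units" are assumptions made for this sketch; the patent does not prescribe a particular cutting rule.

```python
from bisect import bisect_left
from itertools import accumulate


def split_balanced(counts, parts):
    """Split per-slice pore unit counts into contiguous groups of similar size.

    counts: number of connected pore units in each slice along one axis.
    parts:  number of groups to produce (n_x, n_y or n_z).
    Returns a list of half-open (start, stop) slice ranges.
    """
    prefix = list(accumulate(counts))   # prefix[i] = sum(counts[:i + 1])
    total = prefix[-1]
    cuts = [0]
    for k in range(1, parts):
        target = total * k / parts
        # Cut after the first slice whose cumulative count reaches the target.
        i = bisect_left(prefix, target) + 1
        cuts.append(max(i, cuts[-1]))
    cuts.append(len(counts))
    return list(zip(cuts[:-1], cuts[1:]))


def is_balanced(sub_counts, tolerance=1e-3):
    """Check that M_max - M_min does not exceed one thousandth of the total."""
    return max(sub_counts) - min(sub_counts) <= tolerance * sum(sub_counts)
```

Applying `split_balanced` first to the per-slice counts along x, then within each resulting region along y, and finally within each sub-region along z gives n_x × n_y × n_z subdomains, whose unit counts can then be passed to `is_balanced`.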
Claims (8)
1. An LBM parallel optimization method based on connected-pore division of the calculation region, characterized by comprising the following steps:
step 1, determining the pore unit information and connectivity of the sample according to the pore distribution data of the porous medium sample to be simulated; determining the total number N of subdomains to be decomposed according to the number N of computing nodes in the system;
step 2, decomposing a calculation domain:
dividing the flow domain of the porous medium sample along the x-axis into n_x regions having the same or similar numbers of units; then dividing each region along the y-axis into n_y sub-regions having the same or similar numbers of units; and finally dividing each of the n_x × n_y sub-regions n_z times along the z-axis, obtaining n_x × n_y × n_z = N subdomains having the same or similar numbers of units, each subdomain being a three-dimensional region composed of a plurality of connected pore units; after decomposition, the difference in the number of units between the subdomain having the maximum number of units M_max and the subdomain having the minimum number of units M_min should not exceed one thousandth of the total number of units;
step 3, distributing calculation tasks:
during parallel computing, the N subdomains are allocated to the N computing nodes one by one for processing; when processing a subdomain, a computing node considers only the connected pore units, renumbers them, and stores them in a one-dimensional array p_i, where i is the number of the subdomain; each connected pore unit is associated with one coordinate and is stored in the order of the one-dimensional array to obtain a pore unit array storing the ordinal numbers of the pore units and the corresponding coordinates, each pore unit being tracked through its unique ordinal number and unique coordinate;
for each pore unit: storing each function in the main data structure of the pore unit together with the ordinal number of the pore unit as a one-dimensional array to obtain a series of first derivative arrays; and storing the momentum and local fluid density data of the pore unit together with the ordinal number of the pore unit as one-dimensional arrays to obtain a series of second derivative arrays.
2. The LBM parallel optimization method based on the connected pore dividing calculation region according to claim 1, wherein:
wherein in step 3, the primary data structure of each pore unit is its fluid particle distribution function and equilibrium fluid particle distribution function; and storing the momentum and local fluid density data before and after calculation of the pore unit and the ordinal number of the pore unit into a one-dimensional array correspondingly to obtain four second derivative arrays.
3. The LBM parallel optimization method based on the connected pore dividing calculation region according to claim 1, further comprising:
step 4, setting a communication mode among subdomains: when processing adjacent or diagonal pore units between different subdomains, an interface layer on the interface of the current subdomain is additionally expanded by one layer of units to cover the adjacent and diagonal pore units, so that the communication mode of the expanded pore units is consistent with that of the interface units in the current subdomain.
4. An LBM parallel optimization device based on connected-pore division of the calculation region, characterized by comprising:
a determining part for determining the pore unit information and connectivity of the sample according to the pore distribution data of the porous medium sample to be simulated, and determining the total number N of subdomains to be decomposed according to the number N of computing nodes in the system;
a calculation domain decomposition part for dividing the flow domain of the porous medium sample along the x-axis into n_x regions having the same or similar numbers of units, then dividing each region along the y-axis into n_y sub-regions having the same or similar numbers of units, and finally dividing each of the n_x × n_y sub-regions n_z times along the z-axis, obtaining n_x × n_y × n_z = N subdomains having the same or similar numbers of units, each subdomain being a three-dimensional region composed of a plurality of connected pore units; the decomposed subdomains should satisfy the condition that the difference in the number of units between the subdomain having the maximum number of units M_max and the subdomain having the minimum number of units M_min does not exceed one thousandth of the total number of units;
a calculation task allocation part which allocates the N subdomains to the N computing nodes one by one for processing and performs the parallel computation; when processing a subdomain, a computing node considers only the connected pore units, renumbers them, and stores them in a one-dimensional array p_i, where i is the number of the subdomain; each connected pore unit is associated with one coordinate and is stored in the order of the one-dimensional array to obtain a pore unit array storing the ordinal number of each pore unit and the corresponding coordinate; for each pore unit: each function in the main data structure of the pore unit is stored together with the ordinal number of the pore unit as a one-dimensional array to obtain a series of first derivative arrays, and the momentum and local fluid density data of the pore unit are stored together with the ordinal number of the pore unit as one-dimensional arrays to obtain a series of second derivative arrays;
and a control part which is in communication connection with the determining part, the calculation domain decomposition part, and the calculation task allocation part, and controls their operation.
5. The LBM parallel optimization device based on the connected pore dividing calculation region according to claim 4, further comprising:
a communication mode setting part which is in communication connection with the control part, wherein, when adjacent or diagonal pore units belonging to different subdomains are processed, the interface layer on the interface of the current subdomain is expanded by one additional layer of units to cover the adjacent and diagonal pore units, so that the communication mode of the expanded pore units is consistent with that of the interface units in the current subdomain.
6. The LBM parallel optimization device based on connected pore dividing calculation region according to claim 5, further comprising:
an input display part which is in communication connection with the determining part, the calculation domain decomposition part, the calculation task allocation part, the communication mode setting part, and the control part, and which allows a user to input an operation instruction and displays the corresponding information.
7. The LBM parallel optimization device based on connected pore dividing calculation region according to claim 6, wherein:
the input display part can, according to the operation instruction, display the sample pore unit information and connectivity determined by the determining part together with the number of computing nodes or the total number N of subdomains, display the decomposition result of the calculation domain decomposition part, and display the allocation result of the calculation task allocation part.
8. A storage medium, characterized in that:
a program for implementing the LBM parallel optimization method based on the connected pore dividing calculation region as claimed in any one of claims 1 to 3 is stored.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210953478.0A CN115455794B (en) | 2022-08-10 | 2022-08-10 | LBM parallel optimization method, device and storage medium based on communication pore division calculation region |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115455794A (en) | 2022-12-09 |
CN115455794B (en) | 2024-03-29 |
Family
ID=84297534
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202210953478.0A | Active | CN115455794B (en) | 2022-08-10 | 2022-08-10 | LBM parallel optimization method, device and storage medium based on communication pore division calculation region |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115455794B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109376481A (en) * | 2018-08-16 | 2019-02-22 | 清能艾科(深圳)能源技术有限公司 | Calculation method, device and the computer equipment of digital cores phase percolation curve based on more GPU |
CN112949112A (en) * | 2021-01-29 | 2021-06-11 | 中国石油大学(华东) | Rotor-sliding bearing system lubrication basin dynamic grid parallel computing method |
CN112992294A (en) * | 2021-04-19 | 2021-06-18 | 中国空气动力研究与发展中心计算空气动力研究所 | Porous medium LBM calculation grid generation method |
CN114565658A (en) * | 2022-01-14 | 2022-05-31 | 武汉理工大学 | Pore size calculation method and device based on CT technology |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8583411B2 (en) * | 2011-01-10 | 2013-11-12 | Saudi Arabian Oil Company | Scalable simulation of multiphase flow in a fractured subterranean reservoir as multiple interacting continua |
Non-Patent Citations (2)
Title |
---|
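High-performance simulation of pore-scale fluid flow and solute transport in porous media; Zhou Hongxiang et al.; Advances in Water Science; Vol. 31 (No. 3); pp. 422-430 * |
Study on the multi-GPU parallel performance of the lattice Boltzmann method; Zhang Gang; Wang Limin; Ge Wei; Computers and Applied Chemistry (No. 10); pp. 4-13 * |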
Also Published As
Publication number | Publication date |
---|---|
CN115455794A (en) | 2022-12-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||