CN115455794A - LBM parallel optimization method and device based on connected pore division calculation area and storage medium - Google Patents
- Publication number
- CN115455794A CN202210953478.0A
- Authority
- CN
- China
- Prior art keywords
- pore
- calculation
- units
- unit
- sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/25—Design optimisation, verification or simulation using particle-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/10—Numerical modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2113/00—Details relating to the application field
- G06F2113/08—Fluids
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides an LBM parallel optimization method and device based on a connected pore division calculation area, which can divide the subdomains in a balanced manner according to the number of calculation nodes, so that the load on each calculation node is balanced and processed efficiently, improving calculation efficiency. The method comprises the following steps: step 1, determining the total number N of subdomains to be decomposed according to the number N of calculation nodes in the system; step 2, decomposing the calculation domain: dividing the flow domain along the x-axis into n_x regions having the same or similar number of cells, dividing each region along the y-axis into n_y sub-regions having the same or similar number of cells, and finally dividing each of the n_x × n_y sub-regions along the z-axis into n_z parts, thereby obtaining N subdomains having the same or similar number of cells, wherein each subdomain is a three-dimensional region consisting of a plurality of pore units, and the difference in cell count between the subdomain with the most cells and the subdomain with the fewest cells after decomposition should not exceed one thousandth of the total number of cells; and step 3, distributing the calculation tasks.
Description
Technical Field
The invention belongs to the technical field of pore structure simulation calculation, and particularly relates to an LBM parallel optimization method, device and storage medium based on a connected pore partition calculation area.
Background
The Lattice Boltzmann Method (LBM for short) is an effective means of simulating fluid flow in the pore structure of a porous medium at the mesoscopic scale. For the simulation result to be representative, the simulation sample should be large enough; because of the complexity of fluid flow in the pore structure, pore-scale LBM numerical simulation tends to demand large amounts of computational resources and storage space. Therefore, large-scale LBM simulation needs parallel optimization, and its execution time and memory requirement depend on the data volume and the parallel optimization method.
The basic idea of domain decomposition in parallel computing is to decompose a computing domain into several subdomains, and then restore the interfaces and perform load balancing. However, for complex porous materials, this step often requires a significant amount of time and repeated iteration to achieve an acceptable load balance. Recursive bisection (Recursive Bisection Method) is the most commonly used scheme for decomposing a computation domain: it divides the computation domain uniformly into two subdomains, then performs the same division on each subdomain, and after R recursions yields a total of 2^R subdomains. This partitioning scheme can balance the workload well for regular or irregular structures in a uniform medium, but has the following three problems: 1) good workload balance is still difficult to achieve for non-uniform, highly heterogeneous porous structures; 2) the number of subdomains produced is 2^R, which is generally not exactly equal to the number of processors (computing nodes), so some processors cannot be used effectively; 3) when many subdomains are to be divided, the number of recursive divisions R becomes larger and efficiency drops.
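Problem 2) above can be illustrated with a few lines of arithmetic. The following is a minimal sketch, assuming one subdomain per computing node and that recursive bisection uses the largest 2^R not exceeding the node count; the function name `bisection_utilization` is hypothetical and does not come from the patent.

```python
def bisection_utilization(num_nodes: int) -> tuple[int, int, int]:
    """Return (R, subdomains, idle_nodes) for the largest 2**R <= num_nodes.

    Assumption (not from the patent): one subdomain is assigned per node.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be positive")
    r = num_nodes.bit_length() - 1      # largest R with 2**R <= num_nodes
    subdomains = 2 ** r
    return r, subdomains, num_nodes - subdomains


for nodes in (8, 12, 18):
    r, subs, idle = bisection_utilization(nodes)
    print(f"{nodes} nodes: R={r}, {subs} subdomains, {idle} idle nodes")
```

With 12 nodes, for instance, only 8 subdomains can be formed under this assumption, leaving 4 nodes without work.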
Disclosure of Invention
The present invention is made to solve the above problems, and an object of the present invention is to provide an LBM parallel optimization method, device and storage medium based on a connected pore partition calculation region, which can divide the subdomains in a balanced manner according to the number of calculation nodes, so that the load on each calculation node is balanced and processed efficiently, improving the calculation processing speed.
In order to achieve the purpose, the invention adopts the following scheme:
< method >
The invention provides an LBM parallel optimization method based on a connected pore division calculation area, which is characterized by comprising the following steps:
dividing the flow domain along the x-axis into n_x regions having the same or similar number of cells, dividing each region along the y-axis into n_y sub-regions having the same or similar number of cells, and finally dividing each of the n_x × n_y sub-regions along the z-axis into n_z parts, thereby obtaining n_x × n_y × n_z = N subdomains having the same or similar number of cells, each subdomain being a three-dimensional region composed of a plurality of pore units; the difference between the maximum cell count M_max and the minimum cell count M_min of the subdomains after decomposition should not exceed one thousandth of the total number of cells; the allowable difference in cell count between subdomains can be further reduced according to the required precision of the calculation domain division;
during parallel computing, distributing the N subdomains to N computing nodes one by one for processing; when a computing node processes its subdomain, only the connected pore units are considered, the connected pore units are renumbered and then stored in a one-dimensional array p_i, where i is the subdomain number; each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array to obtain a pore unit array in which the ordinal number and the corresponding coordinate of each pore unit are stored, and each pore unit is tracked through its unique ordinal number and coordinate;
for each pore unit: storing each function in the main data structure of the pore unit, together with the ordinal number of the pore unit, into a one-dimensional array to obtain a series of first derivative arrays; and storing the momentum and local fluid density data of the pore unit, together with the ordinal number of the pore unit, into a one-dimensional array to obtain a series of second derivative arrays.
Specifically, the LBM parallel optimization method based on the connected pore partition calculation region provided by the present invention may further have the following features: in step 2, the maximum number of cells M max :
Minimum number of cells M min :
Wherein NX, NY and NZ are the total number of units including pores and solids on the x, y and z axes of the simulation area respectively.
Preferably, the LBM parallel optimization method based on the connected pore partition calculation region provided by the present invention may further have the following features: in step 3, the main data structure of each pore cell is its fluid particle distribution function and equilibrium fluid particle distribution function; and correspondingly storing the momentum and local fluid density data before and after calculation of the pore unit and the ordinal number of the pore unit into a one-dimensional array to obtain four second derivative arrays.
Preferably, the LBM parallel optimization method based on the connected pore partition calculation region provided by the present invention may further include: step 4, setting a communication mode between sub-domains: when processing adjacent or diagonal pore units between different sub-domains, the interface layer on the current sub-domain interface is additionally expanded by a layer of units to cover the adjacent and diagonal pore units, so that the communication mode of the expanded pore units is consistent with that of the interface units in the current sub-domain.
< means >
Further, the present invention provides an apparatus for automatically implementing the < method >, which is characterized in that the apparatus comprises:
the determination part is used for determining the pore unit information and connectivity of the sample according to the pore distribution data of the porous medium sample to be simulated, and for determining the total number N of subdomains to be decomposed according to the number N of calculation nodes in the system;
a calculation domain decomposition part for dividing the flow domain along the x-axis into n_x regions having the same or similar number of cells, dividing each region along the y-axis into n_y sub-regions having the same or similar number of cells, and finally dividing each of the n_x × n_y sub-regions along the z-axis into n_z parts, thereby obtaining n_x × n_y × n_z = N subdomains having the same or similar number of cells; the decomposed subdomains should satisfy the condition that the difference between the maximum cell count M_max and the minimum cell count M_min should not exceed one thousandth of the total number of cells;
the calculation task allocation part allocates the N subdomains to N computing nodes one by one for processing and carries out parallel computing; when a computing node processes its subdomain, only the connected pore units are considered, the connected pore units are renumbered and then stored in a one-dimensional array p_i, where i is the subdomain number; each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array to obtain a pore unit array in which the ordinal number of each pore unit and the corresponding coordinate are stored; for each pore unit: storing each function in the main data structure of the pore unit, together with the ordinal number of the pore unit, into a one-dimensional array to obtain a series of first derivative arrays; and storing the momentum and local fluid density data of the pore unit, together with the ordinal number of the pore unit, into a one-dimensional array to obtain a series of second derivative arrays;
and the control part is in communication connection with the determination part, the calculation domain decomposition part and the calculation task allocation part and controls the operation of the determination part, the calculation domain decomposition part and the calculation task allocation part.
Preferably, the LBM parallel optimization device based on the connected pore partition calculation region provided by the present invention may further have the following features: maximum number of units M in the calculation domain decomposition part max :
Minimum number of cells M min :
Wherein NX, NY and NZ are the total number of units including pores and solids on the x, y and z axes of the simulation area respectively.
Preferably, the LBM parallel optimization apparatus based on connected pore partition calculation region provided by the present invention may further include: and the communication mode setting part is in communication connection with the control part, and when adjacent or diagonal pore units among different sub-domains are processed, the interface layer on the interface of the current sub-domain is additionally expanded by one layer of units to cover the adjacent and diagonal pore units, so that the communication modes of the expanded pore units are consistent with the communication modes of the interface units in the current sub-domain.
Preferably, the LBM parallel optimization apparatus based on connected pore partition calculation region provided by the present invention may further include: and the input display part is in communication connection with the determining part, the calculation domain decomposing part, the calculation task distributing part, the communication mode setting part and the control part and is used for enabling a user to input an operation instruction and performing corresponding display.
Preferably, the LBM parallel optimization device based on the connected pore partition calculation region provided by the present invention may further have the following features: according to the operation command, the input display part can display the pore unit information and connectivity of the sample determined by the determination part and the number of calculation nodes or the total number N of subdomains, display the decomposition result of the calculation domain decomposition part, and display the allocation result of the calculation task allocation part.
< storage Medium >
In addition, the present invention also provides a computer-readable storage medium storing a program for implementing the above < method >.
Action and Effect of the invention
The LBM parallel optimization method, device and storage medium based on the connected pore division calculation region can divide the flow domain into an arbitrary total number of balanced subdomains according to the number of calculation nodes, and the total number of subdomains is not limited to 2^R. Moreover, the invention can effectively reduce the communication complexity caused by the tortuosity of the divided interfaces and balance the workload of all subdomains during the flow domain decomposition, without secondary optimization such as iteration, thereby reducing the memory consumption of the system, simplifying the allocation of calculation tasks, making the communication between subdomains efficient and easy to implement, and reducing the communication time. Therefore, the method can significantly improve calculation efficiency, is particularly suitable for porous materials whose pore-scale simulation involves a large amount of calculation and high memory consumption, opens up a new approach to decomposing and processing the calculation area, and has great popularization value.
Drawings
FIG. 1 is a schematic diagram of a domain decomposition process in an embodiment of the present invention, wherein (a) is decomposition along the x-axis, (b) is decomposition along the y-axis, and (c) is decomposition along the z-axis;
FIG. 2 is a schematic diagram of a process for marking fluid cells in a porous medium with a one-dimensional array after decomposing sub-domains according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the interface between subdomain 0# and subdomain 1# in an embodiment of the present invention (white for pore units, grey for solid units);
FIG. 4 is a D3Q19 discrete velocity model diagram according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating an information exchange process of diagonal units between different sub-fields according to an embodiment of the present invention.
Detailed Description
The following describes in detail the LBM parallel optimization method, device and storage medium based on a connected pore partition calculation region according to the present invention with reference to the accompanying drawings.
< example >
The LBM parallel optimization method based on the connected pore partition calculation region provided by the embodiment comprises the following steps:
dividing the flow domain of the porous medium sample along the x-axis into n_x regions having the same or similar number of cells, dividing each region along the y-axis into n_y sub-regions having the same or similar number of cells, and finally dividing each of the n_x × n_y sub-regions along the z-axis into n_z parts, thereby obtaining n_x × n_y × n_z = N subdomains having the same or similar number of cells; the maximum cell count M_max and the minimum cell count M_min of the subdomains after decomposition should not differ by more than one thousandth of the total number of cells.
Maximum number of cells M max :
Minimum number of cells M min :
Wherein NX, NY and NZ are the total number of units including pores and solids on the x, y and z axes of the simulation area respectively.
Specifically, as shown in FIG. 1, each cell within the calculation region is first traversed by successive scans in the z, y, and x directions. Note that the z → y → x scan order means that each cross-section x_i is traversed in the order z, then y. In FIG. 1 (b), building on FIG. 1 (a), each subdomain is divided into 2 smaller regions along the y-axis: the pore units of each subdomain are traversed and numbered in the order x → z → y and stored in a one-dimensional array, and the one-dimensional array associated with each subdomain is then divided into two parts, so that the flow domain is now divided into 3 × 2 × 1 sub-regions. In FIG. 1 (c), building on the previous division, the pore units of each subdomain are traversed in the x → y → z scan order in the same manner and the partitioning process is repeated, so that the simulation region is finally divided into 3 × 2 × n_z subdomains. Next, as shown in FIG. 2, the connected pore units are screened, numbered and stored in a one-dimensional array; the flow domain has N pore units in total. As shown in FIG. 3, all connected pore units are divided into a number of equal-sized groups, each group being associated with one subdomain, the total number of groups being equal to the number of subdomains after decomposition. The one-dimensional array is then divided into three sub-groups, each containing N/3 or N/3+1 pore units, the last units of the three sub-groups being numbered N_0, N_1 and N. Finally, the interface between two adjacent subdomains is restored, as shown in FIG. 3 and FIG. 1 (a); the interface includes the last pore unit of each partition.
The calculation domain decomposition scheme directly divides the soil sample into n_x × n_y × n_z subdomains; the difference in workload between subdomains is the difference in the number of pore units at the adjacent interfaces between subdomains. The subsequent simulation is based on the divided one-dimensional arrays, and the simulation area runs from 1 to nx_i, where nx_i is the size, in the x-direction (fluid flow direction), of the subdomain associated with processor i.
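The sequential x, y, z division described above can be sketched as follows. This is an illustrative sketch rather than the patented implementation: it assumes the pore geometry is given as a boolean NumPy array `pore[x, y, z]` (True for a pore cell), uses `np.array_split` to cut each sorted one-dimensional list into near-equal parts, and the helper names `split_sorted` and `decompose` are hypothetical. Connectivity screening is assumed to have been done beforehand.

```python
import numpy as np


def split_sorted(coords, order, parts):
    """Sort cell coordinates by axis priority `order` and cut into near-equal parts."""
    # np.lexsort treats the LAST key as the primary key, hence the reversal.
    keys = tuple(coords[:, axis] for axis in reversed(order))
    return np.array_split(coords[np.lexsort(keys)], parts)


def decompose(pore, n_x, n_y, n_z):
    """Divide pore cells into n_x * n_y * n_z groups of near-equal size."""
    coords = np.argwhere(pore)                      # (M, 3) rows of (x, y, z)
    subdomains = []
    for slab in split_sorted(coords, (0, 1, 2), n_x):        # cut along x
        for bar in split_sorted(slab, (1, 2, 0), n_y):       # then along y
            subdomains.extend(split_sorted(bar, (2, 1, 0), n_z))  # then along z
    return subdomains


# Synthetic random geometry, used only for demonstration.
rng = np.random.default_rng(0)
pore = rng.random((30, 20, 10)) < 0.35
subs = decompose(pore, 3, 2, 2)
sizes = [len(s) for s in subs]
total = int(pore.sum())
print(len(subs), "subdomains, sizes between", min(sizes), "and", max(sizes))
print("balance criterion met:", max(sizes) - min(sizes) <= total / 1000)
```

Because each cut acts on a one-dimensional list of pore cells rather than on whole grid planes, group sizes at every level differ by at most one cell, regardless of how heterogeneous the geometry is.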
During parallel computing, the N subdomains are distributed one by one to the N computing nodes for processing. When a computing node processes its subdomain, only the connected pore units are considered; they are renumbered and stored in a one-dimensional array p_i, where i is the subdomain number. Each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array, yielding a pore unit array that stores the ordinal number and corresponding coordinate of each pore unit, so that each pore unit is tracked through its unique ordinal number and coordinate.
For each pore unit: storing each function in the main data structure of the pore unit and the ordinal number of the pore unit into a one-dimensional array correspondingly to obtain a series of first derivative arrays; and correspondingly storing the momentum and local fluid density data of the pore unit and the ordinal number of the pore unit into a one-dimensional array to obtain a series of second derivative arrays.
The primary data structure of each pore cell is its fluid particle distribution function and equilibrium fluid particle distribution function; and correspondingly storing the momentum and local fluid density data before and after calculation of the pore unit and the ordinal number of the pore unit into a one-dimensional array to obtain four second derivative arrays.
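A minimal sketch of this storage scheme is given below. The patent does not define connectivity precisely, so the sketch assumes that a pore cell is "connected" if it can be reached from the x = 0 face through face-adjacent pore cells; the function names `connected_pores` and `renumber`, and the names of the four density and momentum arrays, are hypothetical.

```python
from collections import deque

import numpy as np


def connected_pores(pore):
    """Flood fill from pore cells on the x = 0 face (assumed 6-neighbour connectivity)."""
    seen = np.zeros_like(pore, dtype=bool)
    queue = deque()
    for y, z in zip(*np.nonzero(pore[0])):
        seen[0, y, z] = True
        queue.append((0, int(y), int(z)))
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in steps:
            n = (x + dx, y + dy, z + dz)
            inside = all(0 <= n[k] < pore.shape[k] for k in range(3))
            if inside and pore[n] and not seen[n]:
                seen[n] = True
                queue.append(n)
    return seen


def renumber(mask):
    """Return (coords, ordinal): coords[i] is the (x, y, z) of pore ordinal i."""
    coords = np.argwhere(mask)
    ordinal = np.full(mask.shape, -1, dtype=np.int64)   # -1 marks non-stored cells
    ordinal[tuple(coords.T)] = np.arange(len(coords))
    return coords, ordinal


# Tiny example: a three-cell channel plus one isolated pore.
pore = np.zeros((4, 3, 1), dtype=bool)
pore[0:3, 0, 0] = True
pore[3, 2, 0] = True
mask = connected_pores(pore)
coords, ordinal = renumber(mask)

# Density and momentum before and after the update, indexed by pore ordinal
# (one possible reading of the "four second derivative arrays").
m = len(coords)
rho_old, rho_new = np.ones(m), np.ones(m)
mom_old, mom_new = np.zeros((m, 3)), np.zeros((m, 3))
print(coords.tolist())
```

Only the three connected cells receive ordinals; the isolated pore and all solid cells take no storage in the per-pore arrays.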
As shown in fig. 4, according to the D3Q19 model in LBM, for each connected pore unit x, up to 18 adjacent pore units are stored in 18 one-dimensional arrays. Each array corresponds to a component. The main data structure for each pore cell is its 19 fluid particle distribution functions and 18 equilibrium fluid particle distribution functions, each described by a one-dimensional array, the other four for storing the momentum and local fluid density in the pore cell.
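The 18 neighbour arrays can be sketched as below. This is an illustration under stated assumptions: the direction ordering is generated programmatically and is not the numbering of FIG. 4, the value -1 is used here to mark a missing (solid or out-of-domain) neighbour, and `neighbour_arrays` is a hypothetical name.

```python
import itertools

import numpy as np

# The 18 non-rest D3Q19 directions: lattice vectors of squared length 1 or 2.
# This ordering is arbitrary and is NOT the numbering used in FIG. 4.
DIRS = np.array([c for c in itertools.product((-1, 0, 1), repeat=3)
                 if 1 <= sum(v * v for v in c) <= 2])


def neighbour_arrays(mask):
    """Return (coords, nbr) where nbr[q, i] is the ordinal of the neighbour of
    pore i in direction q, or -1 if that neighbour is solid or outside."""
    coords = np.argwhere(mask)
    ordinal = np.full(mask.shape, -1, dtype=np.int64)
    ordinal[tuple(coords.T)] = np.arange(len(coords))
    padded = np.pad(ordinal, 1, constant_values=-1)   # -1 border: no neighbour
    nbr = np.empty((len(DIRS), len(coords)), dtype=np.int64)
    for q, c in enumerate(DIRS):
        shifted = coords + c + 1                      # +1 compensates the padding
        nbr[q] = padded[tuple(shifted.T)]
    return coords, nbr


mask = np.ones((3, 3, 3), dtype=bool)                 # fully open 3x3x3 block
coords, nbr = neighbour_arrays(mask)
centre = 13                                           # ordinal of cell (1, 1, 1)
print("centre cell neighbours:", int((nbr[:, centre] >= 0).sum()))
print("corner cell neighbours:", int((nbr[:, 0] >= 0).sum()))
```

Storing one array per direction means streaming for direction q reduces to a single indexed read over `nbr[q]`, with no search through the three-dimensional grid.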
In each subdomain of this implementation, the overlap region is stored in 18 one-dimensional arrays, and no redundant or duplicate calculations are performed. In each subdomain, the particle distribution functions and the equilibrium distribution functions of the fluid cells from 1 to nx_i are stored in two different sets of one-dimensional arrays (18 one-dimensional arrays each). Pointers are selected to locate the first pore unit of the interface, and only the pore unit data in the interface layer is exchanged with the adjacent partition. Since only the pore units of each subdomain, including those of the interface layer, are stored, and the neighboring units of each pore unit are stored in 18 one-dimensional arrays, the neighboring units of each pore unit can be easily determined in the simulation. In each iteration, the fluid particle distribution function of the corresponding subdomain is calculated on each processor, and then the data of the subdomain interfaces is exchanged between the processors. In the data exchange, only 5 of the 18 distribution function components of the fluid units on the interface layer need to be transferred to the adjacent processor. For example, in the x-direction, only the fluid particle distribution functions f_3[i], f_8[i], f_9[i], f_12[i] and f_13[i] at the interface nx_i need to be transferred from the right subdomain to the left subdomain, and likewise only f_1[i], f_7[i], f_10[i], f_11[i] and f_14[i] need to be transferred from the left subdomain to the right subdomain.
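The count of five components per direction follows from the D3Q19 velocity set: exactly five of its vectors have a positive x-component and five have a negative one. The sketch below checks this and packs only those components for a set of interface cells. The velocity ordering is generated programmatically and does not reproduce the f_1 ... f_18 numbering of the patent; `pack_interface` and the sample data are hypothetical.

```python
import itertools

import numpy as np

# D3Q19 velocity set (rest vector plus 18 moving directions), arbitrary ordering.
VEL = [c for c in itertools.product((-1, 0, 1), repeat=3)
       if sum(v * v for v in c) <= 2]
TO_RIGHT = [q for q, c in enumerate(VEL) if c[0] == 1]    # cross a +x interface
TO_LEFT = [q for q, c in enumerate(VEL) if c[0] == -1]    # cross a -x interface


def pack_interface(f, interface_cells, components):
    """Gather only the listed components for the interface-layer cells."""
    return f[np.ix_(components, interface_cells)]


# f[q, i]: distribution component q at pore ordinal i (dummy values).
f = np.arange(19 * 6, dtype=float).reshape(19, 6)
buf = pack_interface(f, [4, 5], TO_RIGHT)
print(len(VEL), len(TO_RIGHT), len(TO_LEFT), buf.shape)
```

In a distributed run the packed buffer would be what is sent to the neighbouring processor; the transport layer (for example MPI) is deliberately left out of this sketch.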
By this method, the flow domain can be divided into an arbitrary total number of balanced subdomains according to the number of calculation nodes, and all calculation nodes are fully utilized for calculation processing. The communication complexity caused by the tortuosity of the divided interfaces is effectively reduced, the workloads of all subdomains are balanced during the flow domain decomposition, and secondary optimization such as iteration is not needed, so that system memory consumption is reduced, the calculation task allocation process is simplified, and the communication time between subdomains is effectively reduced.
Further, the present embodiment also provides an apparatus capable of automatically implementing the above method, the apparatus including a determination section, a calculation domain decomposition section, a calculation task assignment section, a communication method setting section, an input display section, and a control section.
The determination part determines the pore unit information and connectivity of the sample according to the pore distribution data of the porous medium sample to be simulated, and determines the total number N of subdomains to be decomposed according to the number N of computing nodes in the system.
The calculation domain decomposition part divides the flow domain of the porous medium sample along the x-axis into n_x regions having the same or similar number of cells, divides each region along the y-axis into n_y sub-regions having the same or similar number of cells, and finally divides each of the n_x × n_y sub-regions along the z-axis into n_z parts, thereby obtaining n_x × n_y × n_z = N subdomains having the same or similar number of cells; the decomposed subdomains should satisfy the condition that the maximum cell count M_max and the minimum cell count M_min should not differ by more than one thousandth of the total number of cells.
Maximum number of cells M max :
Minimum number of cells M min :
Wherein NX, NY and NZ are the total number of units including pores and solids on the x, y and z axes of the simulation area respectively.
The calculation task allocation part distributes the N subdomains to the N computing nodes one by one for processing, and parallel calculation is carried out; when a computing node processes its subdomain, only the connected pore units are considered and renumbered, and the connected pore units are then stored in a one-dimensional array p_i, where i is the subdomain number; each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array to obtain a pore unit array in which the ordinal number of each pore unit and the corresponding coordinate are stored. For each pore unit: each function in the main data structure of the pore unit is stored, together with the ordinal number of the pore unit, into a one-dimensional array to obtain a series of first derivative arrays; and the momentum and local fluid density data of the pore unit are stored, together with the ordinal number of the pore unit, into a one-dimensional array to obtain a series of second derivative arrays.
The communication mode setting part is connected with the control part in a communication mode, when adjacent or diagonal pore units between different subdomains are processed, the interface layer on the current subdomain interface is additionally expanded by a layer of units to cover the adjacent and diagonal pore units, and the communication modes of the expanded pore units are consistent with the communication modes of the interface units in the current subdomain.
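The one-layer extension of the interface can be sketched as a search for pore cells owned by other subdomains that are adjacent or diagonal to the current subdomain. This is only an illustrative reading of the text: it assumes an integer array `owner[x, y, z]` holding the subdomain number of each pore cell and -1 for solid cells, and the name `halo_cells` is hypothetical.

```python
import itertools

import numpy as np

# 18 adjacent and diagonal directions of the D3Q19 lattice (arbitrary ordering).
DIRS = [c for c in itertools.product((-1, 0, 1), repeat=3)
        if 1 <= sum(v * v for v in c) <= 2]


def halo_cells(owner, rank):
    """Pore cells of other subdomains that touch subdomain `rank`
    along any of the 18 adjacent or diagonal directions."""
    padded = np.pad(owner, 1, constant_values=-1)   # -1 border: outside the domain
    mine = np.argwhere(owner == rank)
    halo = set()
    for c in DIRS:
        shifted = mine + np.array(c) + 1            # +1 compensates the padding
        ids = padded[tuple(shifted.T)]
        foreign = (ids >= 0) & (ids != rank)
        halo.update(map(tuple, (shifted[foreign] - 1).tolist()))
    return sorted(halo)


# Two subdomains split along x, with one solid cell next to the interface.
owner = np.zeros((4, 2, 1), dtype=np.int64)
owner[2:] = 1
owner[2, 1, 0] = -1
print(halo_cells(owner, 0))
print(halo_cells(owner, 1))
```

Because diagonal directions are included in the search, the extended layer also covers pore units that touch the current subdomain only across an edge, which is the case the text singles out.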
The input display part is used for allowing a user to input an operation instruction and performing corresponding display. For example, according to the operation command, the input display part may display the pore unit information and connectivity of the sample determined by the determination part and the number of calculation nodes or the total number N of subdomains, display the decomposition result of the calculation domain decomposition part, and display the allocation result of the calculation task allocation part.
The control part is connected with the determination part, the calculation domain decomposition part, the calculation task distribution part and the communication mode setting part in a communication way and controls the operation of the determination part, the calculation domain decomposition part, the calculation task distribution part and the communication mode setting part.
The above embodiments are merely illustrative of the technical solutions of the present invention. The method, apparatus and storage medium for LBM parallel optimization based on connected pore partition calculation region according to the present invention are not limited to the content described in the above embodiments, but shall be subject to the scope defined by the claims. Any modification or supplement or equivalent replacement made by a person skilled in the art on the basis of this embodiment is within the scope of the invention as claimed in the claims.
Claims (10)
1. The LBM parallel optimization method based on the connected pore division calculation area is characterized by comprising the following steps:
step 1, determining the pore unit information and connectivity of a sample according to the pore distribution data of the porous medium sample to be simulated; determining the total number N of subdomains to be decomposed according to the number N of the calculation nodes in the system;
step 2, decomposing the calculation domain:
dividing the flow domain of the porous medium sample along the x-axis into n_x regions having the same or similar number of cells, dividing each region along the y-axis into n_y sub-regions having the same or similar number of cells, and finally dividing each of the n_x × n_y sub-regions along the z-axis into n_z parts, thereby obtaining n_x × n_y × n_z = N subdomains having the same or similar number of cells; the difference between the maximum cell count M_max and the minimum cell count M_min of the subdomains after decomposition should not exceed one thousandth of the total number of cells;
step 3, distributing a calculation task:
during parallel computing, distributing the N subdomains to the N computing nodes one-to-one for processing; when a computing node processes its subdomain, only the connected pore units are considered; the connected pore units are renumbered and then stored in a one-dimensional array $p_i$, where i is the subdomain number; each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array, yielding a pore unit array that stores the ordinal number and the corresponding coordinate of each pore unit, so that each pore unit is tracked through its unique ordinal number and unique coordinate;
for each pore unit: storing each function in the main data structure of the pore unit and the ordinal number of the pore unit correspondingly into a one-dimensional array to obtain a series of first derivative arrays; and correspondingly storing the momentum and local fluid density data of the pore unit and the ordinal number of the pore unit into a one-dimensional array to obtain a series of second derivative arrays.
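The decomposition in step 2 can be illustrated with a minimal Python sketch. It is not the claimed implementation: it assumes the cells being balanced are given as integer `(x, y, z)` coordinates, and the names `split_by_planes`, `decompose` and `is_balanced` are hypothetical. Whether the balanced count covers pore cells only or all cells is left to the caller, who chooses which coordinates to pass in.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def split_by_planes(cells, axis, parts):
    """Cut `cells` into `parts` slabs with planes perpendicular to `axis`,
    placing the planes so the slabs hold similar numbers of cells."""
    slabs = [[] for _ in range(parts)]
    if not cells:
        return slabs
    lo = min(c[axis] for c in cells)
    hi = max(c[axis] for c in cells)
    counts = [0] * (hi - lo + 1)           # cells per grid layer along `axis`
    for c in cells:
        counts[c[axis] - lo] += 1
    cum = list(accumulate(counts))         # running total over the layers
    total = cum[-1]
    starts = [0]                           # starts[k] = first layer of slab k
    for k in range(1, parts):
        layer = bisect_left(cum, k * total / parts)
        starts.append(max(layer + 1, starts[-1]))
    for c in cells:
        slabs[bisect_right(starts, c[axis] - lo) - 1].append(c)
    return slabs


def decompose(cells, n_x, n_y, n_z):
    """Split along x, then each region along y, then each sub-region along z."""
    subdomains = []
    for region in split_by_planes(cells, 0, n_x):
        for sub_region in split_by_planes(region, 1, n_y):
            subdomains.extend(split_by_planes(sub_region, 2, n_z))
    return subdomains


def is_balanced(subdomains, total_units):
    """Check that M_max - M_min is at most one thousandth of the total."""
    sizes = [len(s) for s in subdomains]
    return max(sizes) - min(sizes) <= total_units / 1000


# Example: a 4 x 4 x 4 block split into 2 x 2 x 2 subdomains.
cells = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
subdomains = decompose(cells, 2, 2, 2)
```

Because the cut planes follow the cumulative cell count rather than fixed coordinates, slabs in a sparse sample can differ in thickness while holding similar numbers of cells.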
2. The LBM parallel optimization method based on the connected pore partition calculation region according to claim 1, wherein:
wherein, in step 2, the maximum unit number $M_{max}$:
the minimum unit number $M_{min}$:
wherein NX, NY and NZ are the total numbers of units, including pores and solids, along the x, y and z axes of the simulation area, respectively.
3. The LBM parallel optimization method based on the connected pore partition calculation region according to claim 1, wherein:
wherein, in step 3, the main data structure of each pore unit is its fluid particle distribution function and equilibrium fluid particle distribution function; and correspondingly storing the momentum and local fluid density data before and after calculation of the pore unit and the ordinal number of the pore unit into a one-dimensional array to obtain four second derivative arrays.
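The pore-only storage described in step 3 and in this claim might look roughly like the following Python sketch. It assumes the connected pore units of one subdomain are already known as `(x, y, z)` tuples; the class `PoreArrays`, the function `build_pore_arrays` and the default of 19 discrete velocities (a D3Q19 lattice) are illustrative assumptions, not taken from the claims.

```python
from dataclasses import dataclass


@dataclass
class PoreArrays:
    coords: list        # p_i: ordinal number -> coordinate of the pore unit
    ordinal: dict       # coordinate -> ordinal number (reverse lookup)
    f: list             # first derivative arrays: distribution functions
    f_eq: list          # first derivative arrays: equilibrium distributions
    momentum_old: list  # second derivative arrays: values before the update
    momentum_new: list  # ... and after the update
    density_old: list
    density_new: list


def build_pore_arrays(pore_coords, q=19):
    """Renumber the connected pore units of one subdomain and allocate
    one-dimensional arrays indexed by the new ordinal numbers.

    `q` is the number of discrete velocities (assumed, e.g. 19 for D3Q19)."""
    coords = sorted(set(pore_coords))              # fixed storage order
    ordinal = {c: k for k, c in enumerate(coords)}
    n = len(coords)
    zero_vec = (0.0, 0.0, 0.0)
    return PoreArrays(
        coords=coords,
        ordinal=ordinal,
        f=[[0.0] * n for _ in range(q)],           # one array per direction
        f_eq=[[0.0] * n for _ in range(q)],
        momentum_old=[zero_vec] * n,
        momentum_new=[zero_vec] * n,
        density_old=[0.0] * n,
        density_new=[0.0] * n,
    )


# Example: three connected pore units given in arbitrary order.
arrays = build_pore_arrays([(2, 0, 0), (0, 0, 0), (1, 0, 0)])
k = arrays.ordinal[(1, 0, 0)]   # ordinal number of the unit at (1, 0, 0)
arrays.density_new[k] = 1.0     # update its local fluid density
```

Solid units never receive an ordinal number, so every array has exactly as many entries as there are connected pore units in the subdomain.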
4. The LBM parallel optimization method based on the connected pore partition calculation region according to claim 1, further comprising:
step 4, setting a communication mode between sub-domains: when processing adjacent or diagonal pore units between different sub-domains, the interface layer on the current sub-domain interface is additionally expanded by a layer of units to cover the adjacent and diagonal pore units, so that the communication mode of the expanded pore units is consistent with that of the interface units in the current sub-domain.
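The extra interface layer of step 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: pore units are `(x, y, z)` tuples, "adjacent and diagonal" is taken to mean the full 26-cell neighbourhood, and the name `halo_cells` is hypothetical.

```python
from itertools import product

# All 26 neighbour offsets: face-adjacent, edge-diagonal and corner-diagonal.
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def halo_cells(own, all_pores):
    """Return the pore units outside `own` that touch it face-to-face or
    diagonally, i.e. the extra layer added around the subdomain interface."""
    own = set(own)
    all_pores = set(all_pores)
    halo = set()
    for x, y, z in own:
        for dx, dy, dz in OFFSETS:
            neighbour = (x + dx, y + dy, z + dz)
            if neighbour in all_pores and neighbour not in own:
                halo.add(neighbour)
    return sorted(halo)


# Example: a subdomain owning one unit, with pores in neighbouring subdomains.
all_pores = {(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 0, 0), (3, 3, 3)}
halo = halo_cells({(0, 0, 0)}, all_pores)
```

In this example the halo contains `(1, 0, 0)` and `(1, 1, 0)`; the units at `(2, 0, 0)` and `(3, 3, 3)` are more than one layer away and are left out. Once the halo units are stored alongside the subdomain's own units, the same exchange routine can serve both face and diagonal neighbours.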
5. LBM parallel optimization device based on connected pore partition calculation region, characterized by including:
a determination part for determining the pore unit information and the connectivity of the sample according to the pore distribution data of the porous medium sample to be simulated, and for determining the total number N of subdomains to be decomposed according to the number N of the calculation nodes in the system;
a calculation domain decomposition part for dividing the porous medium sample flow field along the x-axis into $n_x$ regions having the same or similar number of cells, dividing each region along the y-axis into $n_y$ sub-regions having the same or similar number of cells, and finally dividing each of the $n_x \times n_y$ sub-regions along the z-axis into $n_z$ parts, thereby obtaining $n_x \times n_y \times n_z = N$ subdomains having the same or similar number of cells; the decomposed subdomains should satisfy the condition that the difference in cell count between the subdomain having the maximum number of cells $M_{max}$ and the subdomain having the minimum number of cells $M_{min}$ does not exceed one thousandth of the total number of units;
a calculation task allocation part that allocates the N subdomains to the N computing nodes one-to-one for processing and carries out parallel computing; when a computing node processes its subdomain, only the connected pore units are considered; the connected pore units are renumbered and then stored in a one-dimensional array $p_i$, where i is the subdomain number; each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array, yielding a pore unit array that stores the ordinal number and the corresponding coordinate of each pore unit; for each pore unit: storing each function in the main data structure of the pore unit and the ordinal number of the pore unit correspondingly into a one-dimensional array to obtain a series of first derivative arrays; storing the momentum and local fluid density data of the pore unit and the ordinal number of the pore unit correspondingly into a one-dimensional array to obtain a series of second derivative arrays;
and the control part is in communication connection with the determining part, the calculation domain decomposing part and the calculation task distributing part and controls the operation of the determining part, the calculation domain decomposing part and the calculation task distributing part.
6. The device for LBM parallel optimization based on connected pore partition calculation area according to claim 5, wherein:
wherein, in the calculation domain decomposition part, the maximum unit number $M_{max}$:
the minimum unit number $M_{min}$:
wherein NX, NY and NZ are the total numbers of units, including pores and solids, along the x, y and z axes of the simulation area, respectively.
7. The device for LBM parallel optimization based on connected pore partition calculation area according to claim 5, further comprising:
and the communication mode setting part is in communication connection with the control part, and when the adjacent or diagonal pore units between different subdomains are processed, the interface layer on the current subdomain interface is additionally expanded by a layer of units to cover the adjacent and diagonal pore units, so that the communication modes of the expanded pore units are consistent with the communication modes of the interface units in the current subdomain.
8. The device for LBM parallel optimization based on connected pore partition calculation area according to claim 5, further comprising:
and the input display part is in communication connection with the determining part, the calculation domain decomposing part, the calculation task distributing part, the communication mode setting part and the control part, and is used for enabling a user to input an operation instruction and performing corresponding display.
9. The device for LBM parallel optimization based on connected pore partition calculation area according to claim 5, wherein:
the input display part can, according to an operation instruction, display the pore unit information and connectivity of the sample determined by the determination part together with the number of calculation nodes or the total number N of subdomains, display the decomposition result of the calculation domain decomposition part, and display the allocation result of the calculation task distribution part.
10. A storage medium, characterized by:
a program for implementing the LBM parallel optimization method based on connected pore partition calculation region according to any one of claims 1 to 4 is stored.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210953478.0A CN115455794B (en) | 2022-08-10 | 2022-08-10 | LBM parallel optimization method, device and storage medium based on communication pore division calculation region |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115455794A true CN115455794A (en) | 2022-12-09 |
CN115455794B CN115455794B (en) | 2024-03-29 |
Family
ID=84297534
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210953478.0A Active CN115455794B (en) | 2022-08-10 | 2022-08-10 | LBM parallel optimization method, device and storage medium based on communication pore division calculation region |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115455794B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120179436A1 (en) * | 2011-01-10 | 2012-07-12 | Saudi Arabian Oil Company | Scalable Simulation of Multiphase Flow in a Fractured Subterranean Reservoir as Multiple Interacting Continua |
CN109376481A (en) * | 2018-08-16 | 2019-02-22 | 清能艾科(深圳)能源技术有限公司 | Calculation method, device and the computer equipment of digital cores phase percolation curve based on more GPU |
CN112949112A (en) * | 2021-01-29 | 2021-06-11 | 中国石油大学(华东) | Rotor-sliding bearing system lubrication basin dynamic grid parallel computing method |
CN112992294A (en) * | 2021-04-19 | 2021-06-18 | 中国空气动力研究与发展中心计算空气动力研究所 | Porous medium LBM calculation grid generation method |
CN114565658A (en) * | 2022-01-14 | 2022-05-31 | 武汉理工大学 | Pore size calculation method and device based on CT technology |
Non-Patent Citations (2)
Title |
---|
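Zhou Hongxiang et al.: "High-performance simulation of fluid flow and solute transport in porous media at the pore scale", Advances in Water Science, vol. 31, no. 3, pages 422 - 430 * |
Zhang Gang; Wang Limin; Ge Wei: "Study on the multi-GPU parallel performance of the lattice Boltzmann method", Computers and Applied Chemistry, no. 10, pages 4 - 13 * |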
Also Published As
Publication number | Publication date |
---|---|
CN115455794B (en) | 2024-03-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||