CN115455794A - LBM parallel optimization method and device based on connected pore division calculation area and storage medium - Google Patents

LBM parallel optimization method and device based on connected pore division calculation area and storage medium

Info

Publication number
CN115455794A
Authority
CN
China
Prior art keywords
pore
calculation
units
unit
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210953478.0A
Other languages
Chinese (zh)
Other versions
CN115455794B (en
Inventor
胡五龙
许铭扬
吴卫国
李凡
肖一鹤
蒋张泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT
Priority to CN202210953478.0A
Publication of CN115455794A
Application granted
Publication of CN115455794B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/25 Design optimisation, verification or simulation using particle-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/10 Numerical modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00 Details relating to the application field
    • G06F2113/08 Fluids

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an LBM parallel optimization method and device based on a connected pore division calculation area, which can divide the calculation domain into a balanced set of subdomains whose number matches the number of computing nodes, so that every computing node is evenly loaded and processes its share efficiently, improving overall computational efficiency. The method comprises the following steps: step 1, determining the total number N of subdomains to be decomposed according to the number N of computing nodes in the system; step 2, decomposing the calculation domain: dividing the flow domain along the x-axis into $n_x$ regions having the same or similar number of cells, dividing each region along the y-axis into $n_y$ sub-regions having the same or similar number of cells, and finally dividing each of the $n_x \times n_y$ sub-regions into $n_z$ parts along the z-axis, thereby obtaining $n_x \times n_y \times n_z = N$ subdomains having the same or similar number of cells, wherein each subdomain is a three-dimensional region consisting of a plurality of pore units, and the difference in cell count between the subdomain with the most cells and the subdomain with the fewest cells after decomposition should not exceed one thousandth of the total number of cells; and step 3, distributing the calculation tasks.

Description

LBM parallel optimization method and device based on connected pore division calculation area and storage medium
Technical Field
The invention belongs to the technical field of pore structure simulation calculation, and particularly relates to an LBM parallel optimization method, a device and a storage medium based on a connected pore partition calculation area.
Background
The Lattice Boltzmann Method (LBM for short) is an effective means of simulating fluid flow in the pore structure of a porous medium at the mesoscopic scale. For the simulation result to be representative, the simulated sample must be large enough, and because fluid flow in a pore structure is itself complex, pore-scale LBM simulation tends to demand large amounts of computational resources and storage space. Large-scale LBM simulation therefore needs parallel optimization, and its execution time and memory requirement depend on the data volume and on the parallel optimization method.
The basic idea of domain decomposition in parallel computing is to decompose the computational domain into several subdomains, then restore the interfaces and balance the load. However, for complex porous materials, this step often requires many time-consuming iterations to reach an acceptable load balance. Recursive bisection (the Recursive Bisection Method) is the most commonly used scheme for decomposing a computational domain: it divides the domain evenly into two subdomains, applies the same division to each subdomain, and after R recursions yields a total of $2^R$ subdomains. This partitioning scheme balances the workload well for regular or irregular structures of a uniform medium, but has the following three problems: 1) good workload balance is still difficult to achieve for non-uniform, highly heterogeneous porous structures; 2) the number of subdomains produced is $2^R$, which is generally not exactly equal to the number of processors (computing nodes), so that some processors cannot be used effectively; 3) when a large number of subdomains is required, the number of recursive divisions R becomes large and efficiency drops.
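Problem 2) above can be illustrated with a few lines of Python. This is a minimal sketch, assuming one subdomain per computing node and assuming the largest power of two that does not exceed the node count is chosen; the function name `recursive_bisection_usage` is hypothetical and does not come from the patent.

```python
def recursive_bisection_usage(num_nodes: int) -> tuple[int, int]:
    """Return (subdomains, idle_nodes) when the subdomain count must equal 2**R."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    # Largest R such that 2**R <= num_nodes (assumed selection rule).
    r = num_nodes.bit_length() - 1
    subdomains = 2 ** r
    return subdomains, num_nodes - subdomains


for nodes in (8, 12, 18):
    subdomains, idle = recursive_bisection_usage(nodes)
    print(f"{nodes} nodes -> {subdomains} subdomains, {idle} idle nodes")
```

With 12 nodes, for instance, only 8 subdomains can be produced under this rule, leaving 4 nodes without work.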
Disclosure of Invention
The present invention is made to solve the above problems, and its object is to provide an LBM parallel optimization method, device, and storage medium based on a connected pore partition calculation region, which can divide the calculation domain into a balanced set of subdomains whose number matches the number of computing nodes, so that every computing node is evenly loaded and processes its share efficiently, improving the calculation processing speed.
In order to achieve the purpose, the invention adopts the following scheme:
< method >
The invention provides an LBM parallel optimization method based on a connected pore division calculation area, which is characterized by comprising the following steps:
step 1, determining the pore unit information and connectivity of the sample according to the pore distribution data of the porous medium sample to be simulated; determining the total number N of subdomains to be decomposed according to the number N of computing nodes in the system;
step 2, decomposing the calculation domain:
dividing the flow domain along the x-axis into $n_x$ regions having the same or similar number of cells, dividing each region along the y-axis into $n_y$ sub-regions having the same or similar number of cells, and finally dividing each of the $n_x \times n_y$ sub-regions into $n_z$ parts along the z-axis, thereby obtaining $n_x \times n_y \times n_z = N$ subdomains having the same or similar number of cells, each subdomain being a three-dimensional region composed of a plurality of pore units; after decomposition, the difference between the number of cells $M_{max}$ of the largest subdomain and the number of cells $M_{min}$ of the smallest subdomain should not exceed one thousandth of the total number of cells; the allowable difference in cell count between subdomains can be further reduced according to the required precision of the domain division;
step 3, distributing the calculation tasks:
during parallel computing, distributing the N subdomains to the N computing nodes one by one for processing; when a computing node processes its subdomain, only the connected pore units are considered; these are renumbered and stored in a one-dimensional array $p_i$, where i is the subdomain number; each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array, yielding a pore unit array that stores the ordinal number of each pore unit and its corresponding coordinate, so that each pore unit can be tracked through its unique ordinal number and coordinate;
for each pore unit: storing each function in the main data structure of the pore unit, together with the ordinal number of the pore unit, in a one-dimensional array to obtain a series of first derived arrays; and storing the momentum and local fluid density data of the pore unit, together with the ordinal number of the pore unit, in a one-dimensional array to obtain a series of second derived arrays.
Specifically, the LBM parallel optimization method based on the connected pore partition calculation region provided by the present invention may further have the following features: in step 2, the maximum number of cells $M_{max}$ and the minimum number of cells $M_{min}$ are given by formulas that appear only as images in this text (Figure BDA0003790104950000021 for $M_{max}$ and Figure BDA0003790104950000022 for $M_{min}$), wherein NX, NY and NZ are the total numbers of units, including pores and solids, along the x, y and z axes of the simulation area, respectively.
Preferably, the LBM parallel optimization method based on the connected pore partition calculation region provided by the present invention may further have the following features: in step 3, the main data structure of each pore unit is its fluid particle distribution function and its equilibrium fluid particle distribution function; and the momentum and local fluid density data of the pore unit before and after calculation are stored, together with the ordinal number of the pore unit, in one-dimensional arrays to obtain four second derived arrays.
Preferably, the LBM parallel optimization method based on the connected pore partition calculation region provided by the present invention may further include: step 4, setting the communication mode between subdomains: when processing adjacent or diagonal pore units that belong to different subdomains, the interface layer on the interface of the current subdomain is extended by one additional layer of units to cover the adjacent and diagonal pore units, so that the extended pore units communicate in the same way as the interface units inside the current subdomain.
< means >
Further, the present invention provides an apparatus for automatically implementing the < method >, which is characterized in that the apparatus comprises:
a determination part for determining the pore unit information and connectivity of the sample according to the pore distribution data of the porous medium sample to be simulated, and determining the total number N of subdomains to be decomposed according to the number N of computing nodes in the system;
a calculation domain decomposition part for dividing the flow domain along the x-axis into $n_x$ regions having the same or similar number of cells, dividing each region along the y-axis into $n_y$ sub-regions having the same or similar number of cells, and finally dividing each of the $n_x \times n_y$ sub-regions into $n_z$ parts along the z-axis, thereby obtaining $n_x \times n_y \times n_z = N$ subdomains having the same or similar number of cells; the decomposed subdomains should satisfy the condition that the difference between the number of cells $M_{max}$ of the largest subdomain and the number of cells $M_{min}$ of the smallest subdomain does not exceed one thousandth of the total number of cells;
a calculation task allocation part for allocating the N subdomains to the N computing nodes one by one for parallel processing; when a computing node processes its subdomain, only the connected pore units are considered; these are renumbered and stored in a one-dimensional array $p_i$, where i is the subdomain number; each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array, yielding a pore unit array that stores the ordinal number of each pore unit and its corresponding coordinate; for each pore unit, each function in the main data structure of the pore unit is stored, together with the ordinal number of the pore unit, in a one-dimensional array to obtain a series of first derived arrays, and the momentum and local fluid density data of the pore unit are stored, together with the ordinal number of the pore unit, in a one-dimensional array to obtain a series of second derived arrays;
and the control part is in communication connection with the determination part, the calculation domain decomposition part and the calculation task allocation part and controls the operation of the determination part, the calculation domain decomposition part and the calculation task allocation part.
Preferably, the LBM parallel optimization device based on the connected pore partition calculation region provided by the present invention may further have the following features: in the calculation domain decomposition part, the maximum number of cells $M_{max}$ and the minimum number of cells $M_{min}$ are given by formulas that appear only as images in this text (Figure BDA0003790104950000031 for $M_{max}$ and Figure BDA0003790104950000032 for $M_{min}$), wherein NX, NY and NZ are the total numbers of units, including pores and solids, along the x, y and z axes of the simulation area, respectively.
Preferably, the LBM parallel optimization apparatus based on connected pore partition calculation region provided by the present invention may further include: and the communication mode setting part is in communication connection with the control part, and when adjacent or diagonal pore units among different sub-domains are processed, the interface layer on the interface of the current sub-domain is additionally expanded by one layer of units to cover the adjacent and diagonal pore units, so that the communication modes of the expanded pore units are consistent with the communication modes of the interface units in the current sub-domain.
Preferably, the LBM parallel optimization apparatus based on connected pore partition calculation region provided by the present invention may further include: and the input display part is in communication connection with the determining part, the calculation domain decomposing part, the calculation task distributing part, the communication mode setting part and the control part and is used for enabling a user to input an operation instruction and performing corresponding display.
Preferably, the LBM parallel optimization device based on the connected pore partition calculation region provided by the present invention may further have the following features: according to the operation instruction, the input display part can display the sample pore unit information and connectivity determined by the determination part, together with the number of computing nodes or the total number N of subdomains, display the decomposition result of the calculation domain decomposition part, and display the allocation result of the calculation task allocation part.
< storage Medium >
In addition, the present invention also provides a computer-readable storage medium storing a program for implementing the above < method >.
Action and Effect of the invention
The LBM parallel optimization method, device and storage medium based on the connected pore division calculation region can divide the calculation domain in a balanced manner into an arbitrary total number of subdomains matching the number of computing nodes, and the total number of subdomains is not limited to $2^R$. Moreover, the invention effectively reduces the communication complexity caused by the tortuosity of the partition interfaces and balances the workload of all subdomains during the flow domain decomposition itself, without secondary optimization such as iteration. This reduces system memory consumption, simplifies the allocation of calculation tasks, makes communication between subdomains efficient and easy to implement, and shortens communication time. The method can therefore markedly improve computational efficiency and is particularly suitable for porous materials whose pore-scale simulation is computationally heavy and memory-intensive; it offers a new approach to decomposing and processing the calculation region and has considerable value for wider adoption.
Drawings
FIG. 1 is a schematic diagram of a domain decomposition process in an embodiment of the present invention, wherein (a) is decomposition along the x-axis, (b) is decomposition along the y-axis, and (c) is decomposition along the z-axis;
FIG. 2 is a schematic diagram of a process for marking fluid cells in a porous medium with a one-dimensional array after decomposing sub-domains according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the interface between subdomain 0# and subdomain 1# in an embodiment of the present invention (white denotes pore units, grey denotes solid units);
FIG. 4 is a D3Q19 discrete velocity model diagram according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating an information exchange process of diagonal units between different sub-fields according to an embodiment of the present invention.
Detailed Description
The following describes in detail the LBM parallel optimization method, device, and storage medium based on a connected pore partition calculation region according to the present invention with reference to the accompanying drawings.
< example >
The LBM parallel optimization method based on the connected pore partition calculation region provided by the embodiment comprises the following steps:
Step 1, determining the pore unit information and connectivity of the sample according to the pore distribution data of the porous medium sample to be simulated; determining the total number N of subdomains to be decomposed according to the number N of computing nodes in the system. For porous materials with a uniform and simple structure, the pore distribution data can be defined directly in a programming language; for non-uniform, highly heterogeneous porous structures, X-CT can be used to obtain the pore distribution data.
Step 2, decomposing the calculation domain:
dividing the flow domain of the porous medium sample along the x-axis into $n_x$ regions having the same or similar number of cells, dividing each region along the y-axis into $n_y$ sub-regions having the same or similar number of cells, and finally dividing each of the $n_x \times n_y$ sub-regions into $n_z$ parts along the z-axis, thereby obtaining $n_x \times n_y \times n_z = N$ subdomains having the same or similar number of cells; after decomposition, the number of cells $M_{max}$ of the largest subdomain and the number of cells $M_{min}$ of the smallest subdomain should not differ by more than one thousandth of the total number of cells.
The maximum number of cells $M_{max}$ and the minimum number of cells $M_{min}$ are given by formulas that appear only as images in this text (Figure BDA0003790104950000051 for $M_{max}$ and Figure BDA0003790104950000052 for $M_{min}$), wherein NX, NY and NZ are the total numbers of units, including pores and solids, along the x, y and z axes of the simulation area, respectively.
Specifically, as shown in FIG. 1, each cell within the calculation region is first traversed by successive scans in the z, y, and x directions. Note that the z → y → x scan order means that each cross-section $x_i$ is traversed in the order z, then y, before moving on to the next cross-section. FIG. 1 (b) builds on FIG. 1 (a): each subdomain is divided into 2 smaller regions along the y-axis. Specifically, the pore units of each subdomain are traversed and numbered in the order x → z → y and stored in a one-dimensional array, and the one-dimensional array associated with each subdomain is then divided into two parts; at this point the flow domain is divided into 3 × 2 × 1 sub-regions. FIG. 1 (c), building on the previous partition, traverses the pore units of each subdomain in the x → y → z scan order in the same manner and repeats the partitioning process, so that the simulation region is finally divided into 3 × 2 × 2 subdomains. Next, as shown in FIG. 2, the interconnected pore units are screened, numbered, and stored in a one-dimensional array; the flow domain has N pore units in total (here N denotes the pore unit count, not the number of subdomains). As shown in FIG. 3, all connected pore units are divided into a number of equal-sized groups, each group being associated with one subdomain, and the total number of groups equals the number of subdomains after decomposition. The one-dimensional array is then divided into three sub-groups, each containing N/3 or N/3+1 pore units, the last units of the three sub-groups being numbered $N_0$, $N_1$ and N. Finally, the interface between two adjacent subdomains is restored, as shown in FIG. 3 and FIG. 1 (a); the interface includes the last pore unit of each partition.
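The division of the one-dimensional pore array into three groups of N/3 or N/3+1 units can be sketched with NumPy. This is only an illustration under stated assumptions: ordinals are taken to start at 1, and `split_pore_indices` is a hypothetical helper name. `np.array_split` gives the extra unit to the leading groups, which may differ from the order used in the patent's figures.

```python
import numpy as np


def split_pore_indices(num_pores: int, groups: int) -> list[np.ndarray]:
    """Cut the pore ordinals 1..num_pores into `groups` nearly equal chunks."""
    ordinals = np.arange(1, num_pores + 1)
    # array_split allows uneven division: chunk sizes differ by at most one.
    return np.array_split(ordinals, groups)


chunks = split_pore_indices(10, 3)
sizes = [len(c) for c in chunks]                # [4, 3, 3]
last_ordinals = [int(c[-1]) for c in chunks]    # [4, 7, 10], i.e. N_0, N_1, N
print(sizes, last_ordinals)
```

The last ordinal of each chunk plays the role of $N_0$, $N_1$ and N in the text: it marks where one subdomain's share of the array ends and the next begins.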
This calculation domain decomposition scheme directly divides the soil sample into $n_x \times n_y \times n_z$ subdomains, and the difference in workload between subdomains is the difference in the number of pore units at the adjacent interfaces between subdomains. The subsequent simulation is based on the divided one-dimensional arrays; the simulation area runs from 1 to $nx_i$, where $nx_i$ is the size, in the x direction (the fluid flow direction), of the subdomain associated with processor i.
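The three-stage decomposition of step 2 can be sketched as follows. This is a simplified illustration, not the patented implementation: it assumes the sample is a boolean NumPy array (`True` marks a pore unit), it partitions all pore units without first filtering for connectivity, it does not build interface layers, and it checks the one-thousandth tolerance against the pore unit count. The names `decompose` and `owner` are hypothetical.

```python
import numpy as np


def decompose(pore: np.ndarray, nx: int, ny: int, nz: int) -> np.ndarray:
    """Return the subdomain id of every pore unit (-1 for solid units)."""
    coords = np.argwhere(pore)                      # one (x, y, z) row per pore unit
    owner = np.full(pore.shape, -1, dtype=np.int64)

    def split(idx, axis_order, parts):
        # np.lexsort uses the LAST key as the primary key, so the last axis in
        # axis_order varies slowest, matching a scan order such as z -> y -> x.
        keys = tuple(coords[idx, a] for a in axis_order)
        order = np.lexsort(keys)
        return np.array_split(idx[order], parts)

    sub_id = 0
    for x_part in split(np.arange(len(coords)), (2, 1, 0), nx):    # z -> y -> x
        for y_part in split(x_part, (0, 2, 1), ny):                # x -> z -> y
            for z_part in split(y_part, (0, 1, 2), nz):            # x -> y -> z
                x, y, z = coords[z_part].T
                owner[x, y, z] = sub_id
                sub_id += 1
    return owner


rng = np.random.default_rng(0)
pore = rng.random((30, 20, 20)) < 0.4               # synthetic sample, about 40 % porosity
owner = decompose(pore, 3, 2, 2)
counts = np.bincount(owner[owner >= 0], minlength=12)
tolerance = counts.sum() / 1000                     # one thousandth of the pore units
print(counts, counts.max() - counts.min() <= tolerance)
```

Because each stage cuts a sorted one-dimensional index array rather than a geometric slab, the subdomain sizes differ by at most one unit regardless of how heterogeneous the pore structure is, at the cost of interfaces that are not flat planes.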
Step 3, distributing a calculation task:
During parallel computing, the N subdomains are distributed to the N computing nodes one by one for processing. When a computing node processes its subdomain, only the connected pore units are considered; these are renumbered and stored in a one-dimensional array $p_i$, where i is the subdomain number. Each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array, yielding a pore unit array that stores the ordinal number of each pore unit and its corresponding coordinate, so that each pore unit can be tracked through its unique ordinal number and coordinate.
For each pore unit: each function in the main data structure of the pore unit is stored, together with the ordinal number of the pore unit, in a one-dimensional array to obtain a series of first derived arrays; and the momentum and local fluid density data of the pore unit are stored, together with the ordinal number of the pore unit, in a one-dimensional array to obtain a series of second derived arrays.
The main data structure of each pore unit is its fluid particle distribution function and its equilibrium fluid particle distribution function; the momentum and local fluid density data of the pore unit before and after calculation are stored, together with the ordinal number of the pore unit, in one-dimensional arrays to obtain four second derived arrays.
As shown in FIG. 4, according to the D3Q19 model in LBM, for each connected pore unit x, its up to 18 adjacent pore units are stored in 18 one-dimensional arrays, one array per direction component. The main data structure of each pore unit is its 19 fluid particle distribution functions and 18 equilibrium fluid particle distribution functions, each described by a one-dimensional array; four further arrays store the momentum and local fluid density of the pore unit.
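The storage layout described in step 3 can be sketched as below. Several points are assumptions made for illustration: "connected" is taken to mean reachable from the x = 0 face through face-adjacent pore units, the ordering of the 18 directions is arbitrary and is not the numbering of FIG. 4, ordinals start at 0, and the names `connected_pores` and `build_arrays` are hypothetical.

```python
from collections import deque
import itertools

import numpy as np

# The 18 non-rest D3Q19 directions: offsets with one or two non-zero components.
DIRECTIONS = [d for d in itertools.product((-1, 0, 1), repeat=3)
              if 1 <= sum(abs(c) for c in d) <= 2]
FACES = [d for d in DIRECTIONS if sum(abs(c) for c in d) == 1]


def connected_pores(pore: np.ndarray) -> np.ndarray:
    """Mask of pore units reachable from the x = 0 face (assumed criterion)."""
    reached = np.zeros(pore.shape, dtype=bool)
    queue = deque()
    for y, z in zip(*np.nonzero(pore[0])):
        reached[0, y, z] = True
        queue.append((0, int(y), int(z)))
    while queue:                                    # breadth-first flood fill
        x, y, z = queue.popleft()
        for dx, dy, dz in FACES:
            n = (x + dx, y + dy, z + dz)
            inside = all(0 <= n[a] < pore.shape[a] for a in range(3))
            if inside and pore[n] and not reached[n]:
                reached[n] = True
                queue.append(n)
    return reached


def build_arrays(pore: np.ndarray):
    """Return (coords, neighbours) indexed by pore unit ordinal."""
    coords = np.argwhere(connected_pores(pore))     # ordinal k -> (x, y, z)
    ordinal = np.full(pore.shape, -1, dtype=np.int64)
    ordinal[tuple(coords.T)] = np.arange(len(coords))
    # One 1D array per direction; -1 marks a solid or out-of-domain neighbour.
    neighbours = np.full((len(DIRECTIONS), len(coords)), -1, dtype=np.int64)
    shape = np.array(pore.shape)
    for q, d in enumerate(DIRECTIONS):
        target = coords + np.array(d)
        inside = np.all((target >= 0) & (target < shape), axis=1)
        neighbours[q, inside] = ordinal[tuple(target[inside].T)]
    return coords, neighbours


pore = np.zeros((4, 3, 3), dtype=bool)
pore[:, 1, 1] = True        # a straight channel along x
pore[2, 0, 0] = True        # an isolated pore unit, dropped by the filter
coords, neighbours = build_arrays(pore)
print(len(coords), neighbours[DIRECTIONS.index((1, 0, 0))])
```

Distribution functions, density and momentum can then be held as plain arrays of length `len(coords)`, so a streaming step reduces to indexed copies through `neighbours` and no memory is spent on solid units.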
Step 4, setting the communication mode (extraction and retrieval of calculation data) between subdomains: as shown in FIG. 5, when processing adjacent or diagonal pore units (d and d' in the figure) that belong to different subdomains, the interface layer on the interface of the current subdomain is extended by one additional layer of units to cover the adjacent and diagonal pore units, so that the extended pore units communicate in the same way as the interface units inside the current subdomain.
In each subdomain of this implementation, the overlap region is stored in 18 one-dimensional arrays, and no redundant or duplicate calculations are performed. In each subdomain, the particle distribution functions and the equilibrium distribution functions of the fluid units from 1 to $nx_i$ are stored in two different sets of one-dimensional arrays (18 one-dimensional arrays each). A pointer is used to locate the first pore unit of the interface, and only the pore unit data in the interface layer are exchanged with the adjacent partition. Because only the pore units of each subdomain (including those of the interface layer) are stored, and the neighboring units of each pore unit are stored in 18 one-dimensional arrays, the neighbors of each pore unit can be easily determined during the simulation. In each iteration, the fluid particle distribution function of the corresponding subdomain is calculated on each processor, and the data on the subdomain interfaces are then exchanged between the processors. In the data exchange, only 5 of the 18 distribution function components of the fluid units on the interface layer need to be transferred to the adjacent processor. For example, in the x direction, at the interface $nx_i$ only the fluid particle distribution functions $f_3[i]$, $f_8[i]$, $f_9[i]$, $f_{12}[i]$ and $f_{13}[i]$ need to be transferred from the right subdomain to the left subdomain, and likewise only $f_1[i]$, $f_7[i]$, $f_{10}[i]$, $f_{11}[i]$ and $f_{14}[i]$ need to be transferred from the left subdomain to the right subdomain.
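The reason only 5 components cross an interface can be checked directly: in D3Q19, exactly five discrete velocities have an x component of +1 and five have an x component of -1. The sketch below uses an assumed component ordering (rest velocity first), which is not the $f_1$ to $f_{18}$ numbering of FIG. 4; `components_crossing` and `pack_halo` are hypothetical helper names, and the actual transfer between processors (for example via MPI) is left out.

```python
import itertools

import numpy as np

# Assumed ordering: rest velocity first, then the 18 moving directions.
VELOCITIES = [(0, 0, 0)] + [d for d in itertools.product((-1, 0, 1), repeat=3)
                            if 1 <= sum(abs(c) for c in d) <= 2]


def components_crossing(axis: int, sign: int) -> list[int]:
    """Indices of the components that stream across an interface normal to `axis`."""
    return [q for q, c in enumerate(VELOCITIES) if c[axis] == sign]


def pack_halo(f: np.ndarray, interface_cells: np.ndarray, axis: int, sign: int) -> np.ndarray:
    """Gather only the crossing components of the interface-layer pore units."""
    comps = components_crossing(axis, sign)
    return f[np.ix_(comps, interface_cells)]


f = np.arange(19 * 6, dtype=float).reshape(19, 6)   # 19 components, 6 pore units
interface_cells = np.array([4, 5])                  # ordinals in the interface layer
send_right = pack_halo(f, interface_cells, axis=0, sign=+1)
send_left = pack_halo(f, interface_cells, axis=0, sign=-1)
print(send_right.shape, send_left.shape)
```

Each send buffer holds 5 values per interface unit instead of 18 or 19, which is the saving in communication volume that the paragraph above describes.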
With this method, the calculation domain can be divided in a balanced manner into an arbitrary total number of subdomains matching the number of computing nodes, so that all computing nodes are fully used; the communication complexity caused by the tortuosity of the partition interfaces is effectively reduced; and the workloads of all subdomains are balanced during the flow domain decomposition itself, without secondary optimization such as iteration. This reduces system memory consumption, simplifies the allocation of calculation tasks, and effectively shortens the communication time between subdomains.
Further, the present embodiment also provides an apparatus capable of automatically implementing the above method, the apparatus including a determination section, a calculation domain decomposition section, a calculation task assignment section, a communication method setting section, an input display section, and a control section.
The determination part determines the pore unit information and connectivity of the sample according to the pore distribution data of the porous medium sample to be simulated, and determines the total number N of subdomains to be decomposed according to the number N of computing nodes in the system.
The calculation domain decomposition part divides the flow domain of the porous medium sample along the x-axis into $n_x$ regions having the same or similar number of cells, divides each region along the y-axis into $n_y$ sub-regions having the same or similar number of cells, and finally divides each of the $n_x \times n_y$ sub-regions into $n_z$ parts along the z-axis, thereby obtaining $n_x \times n_y \times n_z = N$ subdomains having the same or similar number of cells. The decomposed subdomains should satisfy the condition that the number of cells $M_{max}$ of the largest subdomain and the number of cells $M_{min}$ of the smallest subdomain do not differ by more than one thousandth of the total number of cells.
The maximum number of cells $M_{max}$ and the minimum number of cells $M_{min}$ are given by formulas that appear only as images in this text (Figure BDA0003790104950000071 for $M_{max}$ and Figure BDA0003790104950000081 for $M_{min}$), wherein NX, NY and NZ are the total numbers of units, including pores and solids, along the x, y and z axes of the simulation area, respectively.
The calculation task allocation part allocates the N subdomains to the N computing nodes one by one for parallel processing. When a computing node processes its subdomain, only the connected pore units are considered; these are renumbered and stored in a one-dimensional array $p_i$, where i is the subdomain number. Each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array, yielding a pore unit array that stores the ordinal number of each pore unit and its corresponding coordinate. For each pore unit: each function in the main data structure of the pore unit is stored, together with the ordinal number of the pore unit, in a one-dimensional array to obtain a series of first derived arrays; and the momentum and local fluid density data of the pore unit are stored, together with the ordinal number of the pore unit, in a one-dimensional array to obtain a series of second derived arrays.
The communication mode setting part is connected with the control part in a communication mode, when adjacent or diagonal pore units between different subdomains are processed, the interface layer on the current subdomain interface is additionally expanded by a layer of units to cover the adjacent and diagonal pore units, and the communication modes of the expanded pore units are consistent with the communication modes of the interface units in the current subdomain.
The input display part allows a user to input operation instructions and displays the corresponding results. For example, according to the operation instruction, the input display part may display the sample pore unit information and connectivity determined by the determination part, together with the number of computing nodes or the total number N of subdomains, display the decomposition result of the calculation domain decomposition part, and display the allocation result of the calculation task allocation part.
The control part is connected with the determination part, the calculation domain decomposition part, the calculation task distribution part and the communication mode setting part in a communication way and controls the operation of the determination part, the calculation domain decomposition part, the calculation task distribution part and the communication mode setting part.
The above embodiments are merely illustrative of the technical solutions of the present invention. The method, apparatus and storage medium for LBM parallel optimization based on connected pore partition calculation region according to the present invention are not limited to the content described in the above embodiments, but shall be subject to the scope defined by the claims. Any modification or supplement or equivalent replacement made by a person skilled in the art on the basis of this embodiment is within the scope of the invention as claimed in the claims.

Claims (10)

1. An LBM parallel optimization method based on connected pore division of the calculation region, characterized by comprising the following steps:
step 1, determining the pore unit information and connectivity of a porous medium sample to be simulated according to its pore distribution data; determining the total number N of sub-domains to be decomposed according to the number N of calculation nodes in the system;
step 2, decomposing the calculation domain:
dividing the flow field of the porous medium sample along the x-axis into n_x regions having the same or a similar number of units, dividing each region along the y-axis into n_y sub-regions having the same or a similar number of units, and finally dividing each of the n_x × n_y sub-regions along the z-axis into n_z parts, thereby obtaining n_x × n_y × n_z = N sub-domains having the same or a similar number of units; after decomposition, the difference between the maximum number of units M_max and the minimum number of units M_min of the sub-domains should not exceed one thousandth of the total number of units;
step 3, distributing the calculation tasks:
during parallel computing, distributing the N sub-domains one by one to the N calculation nodes for processing; when a calculation node processes its sub-domain, only the connected pore units are considered, the connected pore units are renumbered and then stored in a one-dimensional array p_i, where i is the sub-domain number; each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array, giving a pore unit array that holds the ordinal number of each pore unit and its corresponding coordinate, so that each pore unit is tracked through its unique ordinal number and unique coordinate;
for each pore unit: storing each function in the main data structure of the pore unit, together with the ordinal number of the pore unit, in a one-dimensional array, giving a series of first derived arrays; and storing the momentum and local fluid density data of the pore unit, together with the ordinal number of the pore unit, in a one-dimensional array, giving a series of second derived arrays.
2. The LBM parallel optimization method based on connected pore division of the calculation region according to claim 1, wherein:
in step 2, the maximum number of units M_max is:
Figure FDA0003790104940000011
and the minimum number of units M_min is:
Figure FDA0003790104940000012
where NX, NY and NZ are the total numbers of units, including both pores and solids, along the x, y and z axes of the simulation region, respectively.
3. The LBM parallel optimization method based on connected pore division of the calculation region according to claim 1, wherein:
in step 3, the main data structure of each pore unit consists of its fluid particle distribution function and its equilibrium fluid particle distribution function; and the momentum and local fluid density data of the pore unit before and after calculation are stored, together with the ordinal number of the pore unit, in one-dimensional arrays, giving four second derived arrays.
4. The LBM parallel optimization method based on connected pore division of the calculation region according to claim 1, further comprising:
step 4, setting the communication mode between sub-domains: when adjacent or diagonal pore units belonging to different sub-domains are processed, the interface layer at the boundary of the current sub-domain is extended by one additional layer of units so as to cover those adjacent and diagonal pore units, so that the communication mode of the units in the extended layer is consistent with that of the interface units in the current sub-domain.
5. An LBM parallel optimization device based on connected pore division of the calculation region, characterized by comprising:
a determination part, which determines the pore unit information and connectivity of a porous medium sample to be simulated according to its pore distribution data, and determines the total number N of sub-domains to be decomposed according to the number N of calculation nodes in the system;
a calculation domain decomposition part, which divides the flow field of the porous medium sample along the x-axis into n_x regions having the same or a similar number of units, divides each region along the y-axis into n_y sub-regions having the same or a similar number of units, and finally divides each of the n_x × n_y sub-regions along the z-axis into n_z parts, thereby obtaining n_x × n_y × n_z = N sub-domains having the same or a similar number of units; the decomposed sub-domains should satisfy the condition that the difference between the maximum number of units M_max and the minimum number of units M_min of the sub-domains does not exceed one thousandth of the total number of units;
a calculation task distribution part, which distributes the N sub-domains one by one to the N calculation nodes for parallel processing; when a calculation node processes its sub-domain, only the connected pore units are considered, the connected pore units are renumbered and then stored in a one-dimensional array p_i, where i is the sub-domain number; each connected pore unit is associated with a coordinate and stored in the order of the one-dimensional array, giving a pore unit array that holds the ordinal number of each pore unit and its corresponding coordinate; for each pore unit: each function in the main data structure of the pore unit is stored, together with the ordinal number of the pore unit, in a one-dimensional array, giving a series of first derived arrays; and the momentum and local fluid density data of the pore unit are stored, together with the ordinal number of the pore unit, in a one-dimensional array, giving a series of second derived arrays;
and a control part, which is communicatively connected to the determination part, the calculation domain decomposition part and the calculation task distribution part, and controls their operation.
6. The LBM parallel optimization device based on connected pore division of the calculation region according to claim 5, wherein:
in the calculation domain decomposition part, the maximum number of units M_max is:
Figure FDA0003790104940000031
and the minimum number of units M_min is:
Figure FDA0003790104940000032
where NX, NY and NZ are the total numbers of units, including both pores and solids, along the x, y and z axes of the simulation region, respectively.
7. The LBM parallel optimization device based on connected pore division of the calculation region according to claim 5, further comprising:
a communication mode setting part, which is communicatively connected to the control part and which, when adjacent or diagonal pore units belonging to different sub-domains are processed, extends the interface layer at the boundary of the current sub-domain by one additional layer of units so as to cover those adjacent and diagonal pore units, so that the communication mode of the units in the extended layer is consistent with that of the interface units in the current sub-domain.
8. The LBM parallel optimization device based on connected pore division of the calculation region according to claim 5, further comprising:
an input display part, which is communicatively connected to the determination part, the calculation domain decomposition part, the calculation task distribution part, the communication mode setting part and the control part, and which allows a user to input operation instructions and displays the corresponding information.
9. The LBM parallel optimization device based on connected pore division of the calculation region according to claim 5, wherein:
the input display part can, according to an operation instruction, display the pore unit information and connectivity of the sample and the number of calculation nodes or total number N of sub-domains determined by the determination part, display the decomposition result of the calculation domain decomposition part, and display the allocation result of the calculation task distribution part.
10. A storage medium, characterized by:
storing a program for implementing the LBM parallel optimization method based on connected pore division of the calculation region according to any one of claims 1 to 4.
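The axis-by-axis decomposition recited in step 2 of claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the cuts are placed uniformly on the full grid (pores and solids together), the balance criterion is read as "largest minus smallest sub-domain is at most one thousandth of the total number of units", and the helper names `split_axis`, `decompose` and `is_balanced` are hypothetical. The formulas for M_max and M_min in claims 2 and 6 are given as figures that are not reproduced in this text, so the sketch simply measures the largest and smallest sub-domain directly.

```python
from itertools import product
from typing import List, Tuple

Span = Tuple[int, int]


def split_axis(n_units: int, n_parts: int) -> List[Span]:
    """Split range(n_units) into n_parts contiguous half-open spans
    whose lengths differ by at most one unit."""
    base, extra = divmod(n_units, n_parts)
    spans, start = [], 0
    for k in range(n_parts):
        size = base + (1 if k < extra else 0)
        spans.append((start, start + size))
        start += size
    return spans


def decompose(shape: Tuple[int, int, int],
              parts: Tuple[int, int, int]) -> List[Tuple[Span, Span, Span]]:
    """Return the n_x * n_y * n_z sub-domain boxes for a grid of the given
    shape (NX, NY, NZ), cutting along x, then y, then z."""
    axes = [split_axis(n, p) for n, p in zip(shape, parts)]
    return list(product(*axes))


def is_balanced(shape, boxes, tolerance: float = 1e-3):
    """Check the assumed criterion: max - min sub-domain size must not
    exceed `tolerance` times the total number of units."""
    sizes = [(x1 - x0) * (y1 - y0) * (z1 - z0)
             for (x0, x1), (y0, y1), (z0, z1) in boxes]
    total = shape[0] * shape[1] * shape[2]
    return max(sizes) - min(sizes) <= tolerance * total, max(sizes), min(sizes)


# 100^3 grid on 2 x 2 x 2 = 8 calculation nodes: perfectly even split.
boxes = decompose((100, 100, 100), (2, 2, 2))
ok, m_max, m_min = is_balanced((100, 100, 100), boxes)
```

With a grid whose side is not divisible by the number of parts, for example 10 × 10 × 10 split 3 × 1 × 1, the same check reports an imbalance, which suggests the criterion mainly matters for small grids or awkward node counts.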
CN202210953478.0A 2022-08-10 2022-08-10 LBM parallel optimization method, device and storage medium based on communication pore division calculation region Active CN115455794B (en)

Publications (2)

Publication Number Publication Date
CN115455794A true CN115455794A (en) 2022-12-09
CN115455794B CN115455794B (en) 2024-03-29

Family

ID=84297534

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120179436A1 (en) * 2011-01-10 2012-07-12 Saudi Arabian Oil Company Scalable Simulation of Multiphase Flow in a Fractured Subterranean Reservoir as Multiple Interacting Continua
CN109376481A (en) * 2018-08-16 2019-02-22 清能艾科(深圳)能源技术有限公司 Calculation method, device and the computer equipment of digital cores phase percolation curve based on more GPU
CN112949112A (en) * 2021-01-29 2021-06-11 中国石油大学(华东) Rotor-sliding bearing system lubrication basin dynamic grid parallel computing method
CN112992294A (en) * 2021-04-19 2021-06-18 中国空气动力研究与发展中心计算空气动力研究所 Porous medium LBM calculation grid generation method
CN114565658A (en) * 2022-01-14 2022-05-31 武汉理工大学 Pore size calculation method and device based on CT technology

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
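Zhou Hongxiang et al.: "High-performance simulation of fluid flow and solute transport in porous media at the pore scale", Advances in Water Science (水科学进展), vol. 31, no. 3, pages 422-430 *
Zhang Gang; Wang Limin; Ge Wei: "Study on the multi-GPU parallel performance of the lattice Boltzmann method", Computers and Applied Chemistry (计算机与应用化学), no. 10, pages 4-13 *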

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant