CN116911147B

CN116911147B - Material point simulation parallel method based on three-dimensional self-adaptive division

Info

Publication number: CN116911147B
Application number: CN202310846422.XA
Authority: CN
Inventors: 田少博; 周旭; 崔潇骁; 李佳霖; 张鉴; 周纯葆; 付晗; 汪嘉恒
Original assignee: Computer Network Information Center of CAS; 63921 Troops of PLA
Current assignee: Computer Network Information Center of CAS; 63921 Troops of PLA
Priority date: 2023-07-11
Filing date: 2023-07-11
Publication date: 2024-01-23
Anticipated expiration: 2043-07-11
Also published as: CN116911147A

Abstract

The invention relates to a material point simulation parallel method based on three-dimensional self-adaptive division, belonging to the crossing field of computational mechanics and high-performance computation. The invention designs and realizes the mass parallel simulation algorithm of the object particle method on the super computing platform. First, the invention performs parallel processing on the initialization of the substance point algorithm. And then, uniformly dividing the substance points to each process through a self-adaptive dividing strategy to realize large-scale simulation, or designating the dividing mode of the background grid according to the user requirement, so that the substance points are more uniformly distributed among each process. Finally, through the MPI multi-core parallel method, performance is improved by using non-blocking communication and establishing a neighbor list and using a temporary buffer area to transmit and receive.

Description

Material point simulation parallel method based on three-dimensional self-adaptive division

Technical Field

The invention belongs to the crossing field of computational mechanics and high-performance computation, and particularly relates to a material point simulation parallel method based on three-dimensional self-adaptive division.

Background

The mass point method (Material Point Method, MPM) is a gridless method that originates from the mass point grid method (PIC) proposed by Harlow et al at the end of the 50 th century. The PIC method represents the fluid by lagrangian quantities, particles (particles) moving in a computational grid, i.e. by mapping the physical quantities to the background grid to obtain the latest state of the object. With the subsequent continuous supplement and evolution, the mass point method has wider application due to the fact that the mass point carries more information.

The object point method can solve the fluid problem of complex geometric shapes, can accurately describe the motion state of an object, and is applied to the fields of free surface flow, gas-liquid two-phase flow, viscous fluid and the like. In engineering problems such as solid penetration, large rotation and the like, the Euler format is difficult to realize, because constitutive equations and history correlation are required to be in line with the characteristics of materials, and a substance point method can well solve the problem. The object point method is also advantageous over the Lagrangian method: the method can effectively solve the problem of difficult numerical simulation caused by serious deformation of the grid, and is an important means for analyzing the problems related to material deformation, such as ultra-high-speed collision, fluid-solid coupling and the like. In addition, the material point method can better solve some problems related to material breakage, and achieves more real physical effects.

The object point method has the following characteristics: it uses discrete material areas of material points carrying all information of the material to describe the motion and deformation state of the material areas. The continuous body is discretized into specific mass points, and the mass points carry all information related to physical quantities, so that the motion track of the mass points represents the morphological change of the object. The object points are closely connected with the background grid, the motion equation of the object points is solved on the background grid, all the object point information is mapped to the background grid point through interpolation operation, and the calculation process is concise. After the physical simulation at one moment is completed, the physical quantity of the grid point movement is fed back to the material point, and then the running track information and other physical information of the material point are obtained.

However, there are many limitations to the serial procedure, such as: the running memory of the common machine is small, and large-scale and fine application scenes are difficult to simulate. In addition, the object point method needs to track a large number of object points, each object point needs to be solved, and the repeated mapping between the object points and grid points increases the calculation amount, so that the simulation time of the whole program is long, and especially in high-precision and large-scale calculation, the running time is not acceptable. Therefore, the present embodiment uses the MPI interface to invoke more computing resources and storage resources based on the supercomputer platform "Orient" in preparation for mass point simulation.

The calculation amount of the mass point method is roughly divided into two parts, one part is the calculation of the mass point itself, for example: the physical quantity calculation of the position, speed, stress and the like of the mass points is performed, and the other part is the mutual mapping of the object mass points and the grid (including the calculation of the physical quantity of mass, momentum, force and the like on the grid points). All physical information is carried by the particles, and meanwhile, the calculation of the physical quantity needs to be mapped to a background grid for interpolation, so that the method has good parallelism conditions. Based on the research, the three-dimensional parallel design is carried out on the material method, all material points are divided into each process in an equalizing way by automatically counting the material points in three directions, the simulation time is reduced, and the program performance is improved by using a non-blocking mode. For different simulation examples or different process numbers of the same example, the function of adaptively and uniformly dividing the material points is realized, and finally, the material point simulation parallel algorithm based on three-dimensional adaptive division is realized.

Disclosure of Invention

The invention aims to implement a mass parallel simulation algorithm of a material point method by performing three-dimensional parallel design on the material point method. The mass simulation is realized by carrying out data dependency analysis on each module/function of the material point method and designing a parallel algorithm. Parallel simulation is performed on the eastern of the super-computing platform, so that the computing resources of multiple nodes are fully utilized, and the advantage of modern high-performance computing is exerted.

In order to achieve the above purpose, the present invention provides a three-dimensional adaptive division-based substance point simulation parallel method, which comprises the following steps:

acquiring an initialization data file, wherein the content of the initialization data file comprises initialization of physical data and initialization of three-dimensional parallel division; wherein, the initialization of the physical data comprises: defining object information in the form of components, a plurality of components may be defined, wherein a component includes one or more objects, and an object is composed of a plurality of object points; defining a physical range and boundary conditions of a background grid, wherein the information of the background grid comprises maximum and minimum values in three directions of x, y and z, so as to determine a specific physical area, and determining the boundary conditions through different numerical types;

establishing a neighbor process list, and equivalently converting the process numbers into elements in a three-dimensional neighbor process array;

initializing and dividing the background grid;

the boundary is processed in parallel, the range managed by the process is judged, and if the management area of the process is in the boundary, corresponding boundary processing is carried out;

the mass and momentum calculation of all grid points in the simulation process is completed by circulating all object points in the process;

establishing a neighbor list buffer zone, and transmitting and receiving data to adjacent processes;

carrying out boundary processing on momentum on grid points according to boundary conditions of each process;

calculating the output by interpolating the value carried by each material point to the grid point through circulating the material points; after interpolation of each process is completed, in order to keep consistency of each process to the same grid, data transmission is carried out on the forces on the grid points;

each process obtains own boundary conditions and carries out boundary processing on the forces on the grid points;

by cycling each object point in the process, reading the mass and momentum of the object point interpolated to the grid point, calculating the speed and acceleration, and mapping to the object point, thereby updating the position and speed of the object point;

each process obtains own boundary conditions, carries out zero clearing processing on momentum on grid points, and prepares for updating momentum calculation;

calculating momentum by using the updated speed, mass and weight calculated by the shape function of the object particles, and carrying out data transmission on the momentum;

each process obtains own boundary conditions and carries out boundary processing on momentum on grid points;

calculating the strain increment and the rotation increment of the material point through the momentum and the mass, and updating the density and the stress of the material point;

determining which process to manage the material point belongs to according to the position of the material point to be sent, storing all the material point to be sent into a large buffer area, and packaging and sending;

when the object point movement is finished, all information is carried by the object point, discarding the deformed grid, and then entering the next iteration, wherein a new background grid is adopted for the next iteration; if the current physical time reaches the preset physical time, stopping iteration, otherwise, continuing to circulate until the result requirement is met.

The invention has the beneficial effects that:

1. the invention carries out parallel design and optimization aiming at an object point method algorithm, adaptively divides object points according to the requirement of a simulation calculation example, realizes large-scale simulation, transfers software to a domestic super-calculation platform, solves physical problems such as penetration collision and the like, and obviously improves the solving performance of related simulation problems.

2. The invention uses non-blocking communication and the modes of establishing neighbor list buffer area to pack necessary transmission data and transmitting the data once and the like through the MPI multi-core parallel method, thereby improving the program performance.

3. According to the method and the device, the division mode of the background grids can be designated according to the requirements of the user, so that the substance points are relatively uniformly distributed among all processes.

4. The method is suitable for parallel simulation of various different calculation models, and has good universality.

Drawings

FIG. 1 is a schematic flow chart of a substance point simulation parallel method based on three-dimensional adaptive partitioning according to an embodiment of the present invention;

FIG. 2 is a background grid initialization partition diagram;

FIG. 3 is a schematic (one-dimensional) diagram of a static partitioning strategy;

FIG. 4 is a schematic illustration of material dot groupings;

fig. 5 is a grid point division schematic diagram;

fig. 6 is a schematic diagram of data transmission at a mesh point;

FIG. 7 is a state diagram before and after load balancing;

fig. 8 is a schematic (two-dimensional) operation of the same cell grid with different material points mapped to four grid points.

Detailed Description

The technical scheme of the invention is further described in detail below with reference to the accompanying drawings and the examples. The embodiment of the invention discloses a material point simulation parallel method based on three-dimensional self-adaptive division, which provides a material point balanced division method, and can also autonomously select a grid division method according to user requirements to realize load balance setting. The performance is improved by adopting a temporary buffer area transmitting and receiving mode through non-blocking communication and neighbor list establishment, and the method can quickly solve the problems of collision penetration and the like, and realize parallel simulation of self-adaptive division.

Fig. 1 is a schematic flow chart of a substance point simulation parallel method based on three-dimensional adaptive partitioning according to an embodiment of the present invention. As shown in fig. 1, the three-dimensional adaptive partitioning-based material point simulation parallel method includes 22 sub-steps: software preparation, hardware resource preparation, preparation of an initialization data file, reading of the initialization data file (reading of an initialization value of physical data and a specific numerical value of three-dimensional parallelization division), establishment of a neighbor process list, background grid initialization division (background grid division, counting of the number of object particles in three directions respectively, dividing of the area definition of the object particles managed by each process and the parallelization of the background grid) and parallel processing of the boundary, object initialization parallelization, material initialization parallelization, physical time step calculation, mass and momentum calculation, data transmission design, boundary processing of mass and momentum physical quantity, calculation of force and transmission force related data, physical quantity force boundary processing, update of the position and speed of the object particles, momentum initialization, calculation of momentum and transmission momentum related data, momentum boundary processing on the grid, physical quantity of the object particles, material point movement and iterative loop, and a final simulation result is obtained. The method specifically comprises the following steps:

step 1, software preparation. The purpose of the software preparation is to configure the environment on which the software depends. And searching a dependency library path required by simulation software by using a CMake List.txt file, completing searching by using a command CMake C MakeList.txt, compiling an execution source file, and generating an executable file, so that the three-dimensional parallel large-scale simulation software of the substance point method can be deployed to a domestic super-computing platform.

And 2, preparing hardware resources. The purpose of the hardware resource preparation is to apply for the hardware resource calling authority required by calculation on the supercomputer platform-Zhongke Pilot number one. And using a slm script to control resources, applying for 64 CPU computing nodes by an execution instruction "# SBATCH-N64", and executing a program by each node according to a mode of 32 CPU threads by an execution instruction "# SBATCH-ntasks-per-node=32", so as to apply for 2048 CPU resources, and thus 2048 process running authorities of an MPI program can be obtained. A total of 2048 CPU operating rights are obtained. The preparation of hardware resources comprises the following specific steps: the step controller, typically a bash, that is first designated to run the entire program. Instruction "#SBATCH-J" is then used to specify the name of the program that needs to be run. Instruction "#SBATCH-p" is then used to specify the job queue that needs to run the program, the common queue being normal. Instruction "#sbatch-N" is then used to specify #sbatch-ntasks-per-node=32 to be run

#SBATCH--mem＝100G

Step 3, preparing an initialization data file. The initialization data parameters are written to xmp file. xmp is to write an initialization data file in a software XML format, and the content of the initialization data file includes initialization of physical data and initialization of three-dimensional parallel division. In the process of initializing and preparing the physical model, the format of the input file is a standard XML data format, and the method has good expandability. XML (eXtensible Markup Language) is a scalable markup language whose storage mechanism is a tree structure, suitable for representing complex data structures. The use of XML documents has strict writing specifications and an initialization data file (xmp file) is formulated according to this format. Such as: the three-dimensional parallel division format in the xmp file is "< DomainDecomposition dimension 0=" M "division 1=" N "division 2=" K "/>", initialization of Material "< Material > … … </Material >".

In one example, initialization data, such as material modules of object points, component numbers (including number of objects, names, material types, and shapes), and parameter configurations of background grids, are written in xmp file. The collision process of two objects is simulated, two components are adopted, the first component comprises one object, the name sphere, the shape is spherical, the second component comprises three objects, the names Block, honeycombPanel and Block respectively, the shape of the component is a cube, and the interior of the component is honeycomb. The left boundary value xmin= -75, the right boundary value xmax=75, the upper boundary value ymin=0, the lower boundary value ymax=75, the front boundary value zmin= -100, the rear boundary value zmax= 43.5, the grid cell lengths in three directions are dx=0.6, dy=0.6 and dz=0.6.

And 4, reading the xmp file in the XML format to obtain initial data. First, the contents of the initialization data file include initialization of physical data and initialization of three-dimensional parallel partitioning. Then, the initialization of the physical data mainly includes: first, material information is defined, including reference density, intensity model, state equation, and failure model. Second, defining object information in the form of components may define a plurality of components, where a component includes one or more objects and an object is made up of a plurality of object points. Third, defining the physical range and boundary conditions of the background grid, wherein the information of the background grid comprises the maximum and minimum values of x, y and z directions, so as to determine the specific physical area, and determining the boundary conditions through different numerical types, wherein 0 represents a free boundary surface, 1 represents a fixed boundary surface, 2 represents a symmetrical boundary surface, and 3 represents a projected boundary surface. Finally, the process of three-dimensional parallel partition initialization is that the following formats exist in the XMP file: < DomainDecomposition dimension 0= "M" dimension 1= "N" dimension 2= "K"/>, where "</>" is an XML fixed format, domaincomposition represents a three-dimensional parallel divided keyword, "dimension 0=" is a format set for obtaining the number of processes in the X direction, M represents a specific number of processes divided into the X direction, "dimension 1=" N represents a specific number of processes divided into the Y direction, "dimension 2=" K represents a specific number of processes divided into the Z direction. The corresponding keyword 'XMin' in the XMP file is a left boundary value in the X direction, the keyword 'XMax' is a right boundary value in the X direction, the keyword 'YMin' is an upper boundary value in the Y direction, the keyword 'YMax' is a lower boundary value in the Y direction, the keyword 'ZMin' is a front boundary value in the Z direction, the keyword 'ZMax' is a rear boundary value in the Z direction, and the keyword 'DCell' is followed by values corresponding to cell grid lengths dx, dy, dz in the three directions. The corresponding grid length can be obtained by dividing the grid area range by the cell grid length, for example: nx= (Xmax-Xmin)/dx, ny= (Ymax-Ymin)/dy and nz= (Zmax-Zmin)/dz. Each process of the process is simultaneously initialized. Therefore, the three-dimensional background grid formed by Nx, ny, nz is divided into m×n×k processes by inputting parameters in the file, where Nx and M are the grid number and the process number in the X direction, ny and N are the grid number and the process number in the Y direction, and Nz and K are the grid number and the process number in the Z direction, respectively.

Step 5, the process of establishing the neighbor process list is to equivalently convert the process number into elements in a three-dimensional neighbor process array neighbor_proc [ K ] [ N ] [ M ], one process is taken, the process number is neighbor_proc [ K ] [ M ], and the process numbers on the left two sides are as follows: k and n are unchanged, m minus 1. Such as: when m=n=k=3, process number 0 is denoted neighbor_proc [0] [0] [0], its left neighbor is NULL (mpi_proc_null) and not denoted, its right neighbor is 1 (neighbor_proc [0] [1 ]), its upper neighbor is NULL (mpi_proc_null) and not denoted, its lower neighbor is 3 (neighbor_proc [0] [1] [0 ]), its front neighbor is NULL (mpi_proc_null), and its rear neighbor is 6 (neighbor_proc [1] [0] [0 ]) and other processes are similar.

And 6, initializing and dividing the background grids. As shown in fig. 2, first, the background grid is divided equally, and the background grid (nx×ny×nz) is divided equally into each process, and each process obtains its own administered area range. Then, the process of counting the number of the object points in three directions respectively is to count the number of the object points in three directions in the area which is administrated by the user, and then the sum of the specifications is carried out, and the total number of the object points in three directions X, Y and Z is obtained in each process. Then, counting the number of material particles and preliminarily defining the area of the material points managed by each process, dividing the total number of material points in the X direction by the total number of processes in the X direction to obtain an average number of particles x_average_features, as shown in a state diagram before and after load balancing in fig. 7. For the X direction, a background grid is designated to extend one cell from left to right and gradually divide the cell to right until the number of the cells is equal to or similar to the average particle number, and the process No. 0 obtains a region range (mpi _ xgd [ 0-X1 ]) managed in the X direction. Then, the right boundary of the process No. 0 is marked as the left boundary of the process No. 1, and the process No. 1 is continuously divided to the right and divided into the same substance points, and the process No. 1 obtains the region range (mpi _ xgd [ X1-X2)) managed in the X direction, and the region management range of the (k, n, M-1) th process is (mpi _ xgd [ xM-Nx)) (note: mpi _ xgd is an array storing the distribution of material points of a cell grid in the X direction). Each process performs the same operation. To ensure the integrity of the original program data structure and the convenience of MPI boundary data exchange, when distributing material points to each process, the right boundary of the current process in the X direction needs to be shifted to the right (dx/2), as shown in the schematic substance point division diagrams in FIGS. 3 and 4. The Y/Z direction and the X direction are divided in the same way, and each process further obtains the area defined by the managed substance points. (remarks: k.epsilon. { 0- (K-1), n.epsilon. { 0- (N-1) }, dx represents the cell grid length in the X direction). The embodiment also provides another option, and the user can divide the background grid manually according to own requirements, and in three-dimensional parallel, the background grid can be divided according to own requirements. Assuming that there are px processes in the x direction, there are py processes in the y direction, and there are pz processes in the z direction, the dividing positions are automatically selected according to own requirements:

{ x_1, x_2, …, x_px }; { y_1, y_2, …, y_py }; { z_1, z_2, …, z_pz }; forming a px-py-pz sub-region. Finally, the background grids are divided in parallel, and the background grids are offset by half of the unit grids on the basis of material dot division, so that the background grids are expanded into a ghost region, and as shown in a grid dot division schematic diagram in fig. 5.

And 7, in the process of boundary parallel processing, 4 situations exist in boundary processing, and boundary processing options are required to be determined according to material properties or simulation requirements. This embodiment focuses mainly on the boundary problem of each process in parallel and how to handle it. In three-dimensional parallelism, each process has the number of 26 neighbor processes, and the boundary is processed in the following manner: judging according to the process number, each process stores adjacent neighbors, when the neighbors are empty, the process is indicated to be at the boundary, and assignment is carried out according to the boundary processing condition of the xmp file. When 26 directions have neighboring processes, the boundary condition is set to 0. The boundary processing has determined the value of fix s, and then the value of fix x/fix y/fix z at the grid point is given by the value of fix s, and then whether to boundary the grid point is selected. Since local grid point indexes are used among processes, whether boundary processing is performed cannot be directly determined, conversion into a global index is required, assignment is further performed on FixX/FixY/FixZ, and then boundary processing is performed according to whether the value is wire.

And 8, the object initialization parallel process is that all processes read key word information of the XMP file, wherein "< Component > … </Component >" represents Component information, the middle of the Component comprises information such as the number, shape, material, name and the like of the objects, the Component can appear for a plurality of times, and the program can be marked according to the sequence of appearance. And then the material points are defined according to the area in the step 6, each process adds the material points for managing the area into the related array, otherwise, skips the material points, and continues to judge whether the next material point is in the area. The existence forms of the substance points are more, and the existence forms of the substances are cuboid, round and the like, and the existence forms of the substances are processed in different files, so that the embodiment needs to modify related files according to the specific form of the simulated object, if the substance points after the object is discrete are in the process, the substance points are stored in the substance point array of the process, and the same substance point cannot appear in a plurality of processes, namely, the situation that the substance points among the processes are not overlapped is avoided.

And 9, material initialization is parallel. The keyword "< Material > … … </Material >" indicates the property of the Material. The material is initialized without data dependency, each process is initialized at the same time, and then the process enters a cycle.

Step 10, the process of calculating the physical time step is that the velocity has three components, the absolute values of the three components of the velocity are taken, and the maximum value of the three components and the sound velocity of the material point are accumulated. Each object point then performs its own same operation. Then, the maximum value of all the particles was selected and designated as maxinfo. Each cell grid also has three directions, and the minimum length in the cell grid is marked as MinDCell. Finally, the MinDCell is divided by the maxinfo and then multiplied by a fixed parameter to obtain the time step of the current moment. In parallel, the present embodiment calculates the minimum maxinfo of the present process first, and then calculates the maximum maxinfo through data transmission. The time step of each process is consistent at the current moment after transmission.

And 11, calculating mass and momentum. The material point is in a certain cell grid, and the cell grid is composed of 8 grid points. The data of the object points are mapped to 8 grid points, and the weights of the 8 grid points are calculated by calling a shape function, so that interpolation is carried out, and the mass and momentum of the 8 corresponding grid points are calculated. And each process completes the calculation of the mass and momentum of all grid points in the simulation process by circulating all object points in the process. Calculate massAnd momentum->The calculation process of (2) is as follows:

wherein,weight value, m, of 8 grid points calculated by the representational function _p Mass of object point, +.>Is the velocity of the particle, n _p Is the number of object points.

Step 12, data dependent transmission design. If a plurality of object points are interpolated on the same grid point, accumulation is needed, so that data transmission between processes is needed after calculation is completed in order to keep the consistency of data on each inter-process grid. For example, two substance points are interpolated to 8 grid points in the calculation of the same unit grid, two substance points are interpolated to the same grid, two substance points are respectively interpolated, as shown in fig. 8, the operation schematic diagram (two dimensions) of different substance points mapped to four grid points is three-dimensionally divided by the background grid, the grid point marks between adjacent processes are determined, a corresponding neighbor list is established for the adjacent 26 processes, and the data is copied into the list. Such as: the neighbor process number of process number 15 is 14, then the mass and momentum of the grid point where process number 15 overlaps process number 14 are packed and transmitted once. Assuming that there are only two processes and one mesh is overlapped, data communication is required, as shown in fig. 6, which is a schematic diagram of data transmission of the mesh point

As previously described, each process has 26 neighbors, from which the process needs to obtain the required value or send it to the neighbor value. Aiming at three-dimensional division of an object point method, parallel design is roughly divided into the following points: first, the overlapping grid positions are determined, and the physical quantity is transmitted by the index of the grid points. The overlapping area of the grid is relatively fixed according to the x, y and z directions, and the calculation of the object points needs to be mapped to the grid, so that the grid point numbers to be transmitted need to be stored in an array, and the physical quantity on the grid point is found according to the index in the embodiment. Next, a transmission method of the physical quantity is determined. All grid points are packed into one array for transmission, the length of the whole array is required to be calculated in order to ensure the length of transmission data, then a corresponding buffer space is applied, and then all data to be transmitted are put into the buffer area, so that a handshake is required between adjacent processes. In addition, the packing and communication of different processes are mutually covered, so that the calculation efficiency is improved. In the substance point parallel program, the present embodiment mainly uses the modes of reducing the number of communication times, reducing the traffic and using non-blocking communication between processes to improve the communication efficiency, such as: in order to reduce the traffic, only the mass and momentum at the grid points are transmitted when calculating the mass and momentum, not all the member variables are transmitted with reference to the object of the grid point class. In addition, in order to avoid frequent inter-process data transmission, the present embodiment opens up temporary buffer arrays for packaging data to be transmitted, such as: there are 1000 dependent data for process number 0 and process number 1, then the 1000 masses and momentums are packed and then transferred to process number 1. Furthermore, by adopting a non-blocking mode, when the process No. 2 is transmitted to the process No. 3, the process No. 0 is transmitted to the process No. 1, and meanwhile, the process No. 1 and the process No. 1 are transmitted, so that mutual influence is avoided.

And 13, carrying out boundary processing on the mass and momentum physical quantities. And carrying out parallel boundary processing on the momentum on the grid points according to the boundary conditions of each process. For a fixed boundary, momentum

And 14, calculating the calculation force. All processes sequentially make differences between the average stress carried by each material point and the artificial viscosity coefficient by circulating the material points of the process, then multiply the average stress with the volume, multiply the average stress with the weight value of the grid point and interpolate the average stress to the grid point, and then calculate the output. After interpolation of each process is completed, in order to maintain consistency of each process for the same grid, data transmission is also required for the force on the grid point, and the transmission process is shown in fig. 6.

Calculating internal force of background grid nodeNode external force->And total node force->The method comprises the following steps:

wherein,weight value of grid point calculated by representational function, +.>Indicating the bias stress of the particle->Representing density of object particles, m _p Indicating mass of the object point, < >>Indicating physical forces acting on the object, +.>Representing the face force at the boundary of the object.

And 15, calculating boundary processing of physical quantity force. Each process obtains own boundary conditions, performs boundary processing on the forces on the grid points, and if the node I is fixed in the I directionWhile integrating momentum over background grid points, the equation is as follows:

wherein,representing updated momentum, ++>Representing movementsQuantity (S)>Indicating the total node force.

At step 16, the calculation of the position and velocity of the object point is updated. Data dependence does not exist among the processes of the step, and parallel calculation can be performed. The velocity is calculated by cycling each object point and then reading the momentum of the object point interpolated to the grid point divided by the mass multiplied by the weight calculated by the shape function, and the acceleration is calculated from the force on the grid point divided by the mass multiplied by the weight calculated by the shape function. Further, the new position of the object point is obtained by mapping the new position to the object point, adding the product of the velocity and the time step to the original object point, adding the product of the velocity of the original object point and the acceleration times the time step to obtain the new velocity of the object point, and the calculation equation is as follows:

wherein,is the particle velocity, +.>Representing total node force, +.>Weight value of grid point calculated by representational function, +.>Representing the updated momentum.

Step 17, calculation of the initialization of the momentum of the grid points. Each process obtains own boundary conditions, carries out zero clearing processing on the momentum on the grid points, and prepares for updating momentum calculation.

In step 18, momentum is calculated. And (3) obtaining the updated speed, mass and weight calculated by the shape function of the object point through the 16 th step calculation, multiplying the speed, mass and weight by the shape function to calculate momentum, and then carrying out data transmission on physical quantity information such as momentum in order to keep the consistency of data on grid points, wherein the calculation formula is as follows:

wherein,representing the weight value, m, of the grid point calculated by the form function _p Indicating mass of the object point, < >>Indicating the speed.

Step 19, calculation of boundary processing of momentum on the grid. Each process obtains own boundary conditions and carries out boundary processing on momentum on grid points.

Step 20, calculate mass point strain delta, spin delta and update density and stress of individual particles. The background grid speed is calculated first, and then each process calculates relevant physical quantities such as strain increment and the like. Calculating background grid speedThe formula of (2) is as follows:

wherein,expressed as momentum on grid points, +.>Represented as the mass at the grid points.

Calculating strain delta of particlesAnd rotation increment->Is defined by the formula:

updating the formula for particle density:

then utilizeAnd rotation increment->The constitutive relationship is invoked to update the stress on the object points.

Step 21, the material point moves. The updated position of the material point causes that data needs to be moved, the process determines which process to manage the material point belongs to according to the position and the position correlation of the material point to be sent, and then all the material point data to be sent are stored in a large buffer area and packaged and sent.

Step 22, iterating the loop. When the object point movement is finished, all information is carried by the object point, the deformed grid can be discarded, then the next iteration is carried out, and a new background grid can be adopted for the next iteration. If the current physical time reaches the preset physical time, stopping iteration, ending the program operation, otherwise, continuing to circulate, and repeating the steps 10 to 22 until the result requirement is met.

The embodiment of the invention designs and realizes a mass parallel simulation algorithm of an object point method on a domestic super computing platform. First, the present embodiment performs parallel processing for initializing the substance point algorithm. And then, uniformly distributing the substance points to each process, and realizing a substance point simulation parallel algorithm based on three-dimensional self-adaptive partitioning, or designating a partitioning mode of a background grid according to user requirements, so that the substance points are distributed among each process more uniformly. Finally, the embodiment improves the program performance by using modes such as non-blocking communication, neighbor list establishment and the like through an MPI multi-core parallel method.

The foregoing description of the embodiments has been provided for the purpose of illustrating the general principles of the invention, and is not meant to limit the scope of the invention, but to limit the invention to the particular embodiments, and any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims

1. The material point simulation parallel method based on three-dimensional self-adaptive division is characterized by comprising the following steps of:

acquiring an initialization data file, wherein the content of the initialization data file comprises initialization of physical data and initialization of three-dimensional parallel division; wherein, the initialization of the physical data comprises: defining object information in the form of components, and defining a plurality of components, wherein a component includes one or more objects, and an object is composed of a plurality of object points; defining a physical range and boundary conditions of a background grid, wherein the information of the background grid comprises maximum and minimum values in three directions of x, y and z, so as to determine a specific physical area, and determining the boundary conditions through different numerical types;

initializing and dividing the background grid;

the value carried by each material point is interpolated to grid points to calculate the output by circulating the material points in the process; after interpolation of each process is completed, in order to keep consistency of each process to the same grid, data transmission is carried out on the forces on the grid points;

when the object point movement is finished, all information is carried by the object point, discarding the deformed grid, and then entering the next iteration, wherein a new background grid is adopted in the next iteration; if the current physical time reaches the preset physical time, stopping iteration, otherwise, continuing to circulate until the result requirement is met.

2. The method according to claim 1, wherein the step of creating a neighbor process list and equivalently converting the process number into an element in the three-dimensional neighbor process array comprises:

equivalently converting the process number into elements in a three-dimensional neighbor process array neighbor_proc [ K ] [ N ] [ M ], taking one process, wherein the process number is neighbor_proc [ K ] [ N ] [ M ], and the process number of the left neighbor is: k and n are unchanged, m minus 1.

3. The method of claim 1, wherein the step of initially partitioning the background grid comprises:

dividing the background grid into various processes in an equalizing way, and acquiring the area range of each process; counting the number of object points in the process; counting the quantity of object points of three components of a unit grid of an area in jurisdiction, then carrying out protocol summation, and obtaining the total point quantity of the object points in each process; equally dividing the number of the object points, and preliminarily realizing the definition of the area of the object point managed by each process: dividing the total substance point number by the total process number in the X direction to obtain the average particle number in the X direction; determining the area range managed by each process through the average point number in each direction; the background grids are offset by half of the unit grids on the basis of dividing the material points.

4. The method of claim 1 wherein the process of boundary parallel processing is that each process is judged according to the process number, each process stores adjacent neighbors, when the neighbors are empty, the process is indicated to be at the boundary, and assignment is performed according to the boundary processing condition of xmp file; when 26 directions have neighbor processes, the boundary condition is set to 0; the boundary processing determines the value of fix s, and then gives the value of fix x/fix y/fix z on the grid point by the value of fix s, thereby selecting whether to boundary-process the grid point.

5. The method of claim 1, wherein all processes simultaneously circulate all material points of the process to complete the calculation steps of mass and momentum of all grid points of the simulation process, comprising:

mapping the data of the object points to 8 grid points, calling a shape function to calculate the weights of the 8 grid points, so as to interpolate, and calculating the mass and momentum of the 8 corresponding grid points; the mass and momentum calculation of all grid points in the simulation process is completed by circulating all material points, and data communication is carried out between processes after the calculation is completed; calculate massAnd momentum->The calculation process of (2) is as follows:

wherein,weight value, m, of 8 grid points calculated by the representational function _p Mass of object point, +.>Is the velocity of the particle, n _p Is the number of points of the substance。

6. The method of claim 1, wherein the step of establishing a neighbor list buffer for data transmission and reception for neighboring processes comprises:

firstly, determining the positions of overlapped grids, and transmitting physical quantities according to indexes of the grid points; dividing according to x, y and z directions, wherein the overlapped area of the grid is relatively fixed, and meanwhile, calculation of object points needs to be mapped to the grid; storing the grid point marks to be transmitted into an array, and finding out physical quantities on the grid points according to the indexes; secondly, determining a transmission mode of physical quantity, packing all grid points into a group for transmission, calculating the length of the whole group, applying for a corresponding buffer space, and then placing all data to be transmitted into the buffer area.

7. The method of claim 1, wherein all processes calculate the forces on the grid, and wherein the data transfer is performed after the calculation is completed to calculate the internal forces of the background grid nodesNode external force->And total node force->The method comprises the following steps:

wherein,weight value of grid point calculated by representational function, +.>Indicating the bias stress of the particle->Representing density of object particles, m _p Indicating mass of the object point, < >>Indicating physical forces acting on the object, +.>Representing the surface forces acting on the object boundary.

8. The method of claim 1 wherein each process updates the location and velocity of the object points in its own managed area by the following equation:

wherein,weight value of grid point calculated by representational function, +.>Is the velocity of the particle.

9. The method of claim 1, wherein each process calculates the momentum on the grid, and the data transmission is performed after the calculation is completed, and the calculation formula is as follows:

wherein,representing the weight value, m, of the grid point calculated by the form function _p Indicating mass of the object point, < >>Indicating the velocity of the particle.

10. The method of claim 1, wherein the steps of calculating strain and curl increments of the mass point by momentum and mass and updating the density and stress of the mass point comprise:

firstly, calculating the background grid speed, and then, calculating the strain increment and the rotation increment of each process; calculating background grid speedThe formula of (2) is as follows:

wherein,represented as grid pointsMomentum of (1)>Represented as mass on grid points;

calculating strain delta of particlesAnd rotation increment->The formula of (2) is as follows:

updating the formula for particle density: