CN112463360A - Parallel read-in method for billion-hundred-GB-level grid data file - Google Patents

Parallel read-in method for billion-hundred-GB-level grid data file

Info

Publication number
CN112463360A
Authority
CN
China
Prior art keywords
file
files
processes
data
grid
Prior art date
Legal status
Pending
Application number
CN202011183930.7A
Other languages
Chinese (zh)
Inventor
王年华 (Wang Nianhua)
常兴华 (Chang Xinghua)
赵钟 (Zhao Zhong)
张来平 (Zhang Laiping)
Current Assignee
AERODYNAMICS NATIONAL KEY LABORATORY
Original Assignee
AERODYNAMICS NATIONAL KEY LABORATORY
Priority date
Filing date
Publication date
Application filed by AERODYNAMICS NATIONAL KEY LABORATORY filed Critical AERODYNAMICS NATIONAL KEY LABORATORY
Priority to CN202011183930.7A priority Critical patent/CN112463360A/en
Publication of CN112463360A publication Critical patent/CN112463360A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing

Abstract

The invention provides a parallel read-in method for billion-cell, hundred-GB-scale grid data files. The ultra-large-scale grid data files generated by multiple objects are stored as grouped files: each group comprises multiple files, and each file comprises multiple data partitions. During reading, multiple file processes read the files, send the grid data to the corresponding non-file processes, and perform data load balancing. The method greatly improves grid file IO efficiency: when reading the same ultra-large-scale grid data, the technical scheme of the invention consumes only 1/50 to 1/10 of the time consumed by the prior art, greatly reducing grid data read-in time and improving computational efficiency and economic benefit.

Description

Parallel read-in method for billion-hundred-GB-level grid data file
Technical Field
The invention relates to the field of data processing, and in particular to a parallel read-in method for billion-cell, hundred-GB-scale grid data files.
Background
With the rapid development of computer technology and numerical methods, Computational Fluid Dynamics (CFD) numerical simulation has been widely applied in aerospace and other fields. After decades of development, predicting aerodynamic forces and moments in conventional flight states by solving the Reynolds-averaged Navier-Stokes (RANS) equations is no longer difficult. However, for flows dominated by unsteady, nonlinear phenomena such as vortices, separation, transition, turbulent noise and turbulent combustion, solving the RANS equations on a million-cell grid cannot yield a sufficiently accurate numerical solution, and larger grids together with higher-fidelity numerical methods, such as large eddy simulation (LES) and direct numerical simulation (DNS), must be adopted. The common feature of these methods is their demanding grid requirements: resolving the near-wall viscous sublayer with LES is generally considered to require a grid count on the order of Re^1.8, while DNS requires up to Re^9/4. For practical aircraft configurations the Reynolds number Re is typically of order 10^6 or above, so the grid count must reach at least the billion-cell level to meet the algorithms' resolution requirements for multi-scale flow structures.
When the grid scale is in the tens of millions of cells, all grid and flow field data can be stored in a single file on the order of 1 GB; file storage and serial read/write then put little pressure on the computer file system, and storage and IO efficiency remain at an acceptable level. However, when the cell count reaches the billion level, the grid and flow field files reach several hundred GB. If the data are still stored in a single file and read and written serially by a single process, the file IO speed inevitably drops sharply, resulting in unacceptable CFD computational efficiency.
Therefore, to meet the practical demands of efficient storage and IO for billion-cell grids, a new file storage and parallel IO method must be developed, laying the foundation for future high-resolution numerical simulation on ultra-large-scale aircraft grids.
Disclosure of Invention
To address the problems in the prior art, a parallel read-in method for billion-cell, hundred-GB-scale grid data files is provided.
The technical scheme adopted by the invention is as follows: a parallel read-in method for billion-cell, hundred-GB-scale grid data files, in which the ultra-large-scale grid data files generated by multiple objects are stored in groups, each group comprising multiple files and each file comprising multiple data partitions; during reading, multiple file processes read the files, send the grid data to the corresponding non-file processes, and perform data load balancing.
Further, the specific reading method comprises:
if the number of processes is greater than the number of files, the number of file processes equals the number of files; each file process reads one file and, after reading, sends the grid data to the non-file processes, thereby achieving load balancing;
if the number of processes is smaller than the number of files: when the number of files is an integer multiple of the number of processes, the number of file processes equals the total number of processes, all processes read files, and each process reads the same number of files; when the number of files is not an integer multiple of the number of processes, the number of file processes is no greater than the total number of processes, under the constraint that every file process reads the same number of files.
Further, the grid data generated by multiple objects are stored in groups: each object forms one group, each group comprises multiple files, and each grid data file contains multiple grid partitions.
Further, the amount of data in each grid partition is approximately the same.
Further, if the number of partitions on each process differs, the partition data are reallocated evenly across processes.
Further, the specific balancing method is as follows: MPI communication is used so that every process handles the same number of partitions.
Compared with the prior art, the beneficial effects of adopting this technical scheme are as follows: when reading the same ultra-large-scale grid data, the technical scheme of the invention consumes only 1/50 to 1/10 of the time consumed by the prior art, greatly reducing grid data read-in time and improving computational efficiency and economic benefit.
Drawings
FIG. 1 shows the geometry of the wing/store benchmark model according to an embodiment of the present invention.
FIG. 2 is a schematic view of the ultra-large-scale wing/store grid according to an embodiment of the invention.
FIG. 3 illustrates grid reading in the prior-art CS mode.
FIG. 4 illustrates grid reading in the P2P mode according to an embodiment of the present invention.
FIG. 5 illustrates reading when the number of partitions in each file is the same, according to an embodiment of the present invention.
FIG. 6 illustrates reading when the number of partitions in each file differs, according to an embodiment of the present invention.
FIG. 7 illustrates reading when there are redundant non-file processes, according to an embodiment of the invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention provides a parallel read-in method for billion-cell, hundred-GB-scale grid data files: the ultra-large-scale grid data files generated by multiple objects are stored in groups, each group comprising multiple files and each file comprising multiple data partitions; during reading, multiple file processes read the files, send the grid data to the corresponding non-file processes, and perform data load balancing.
In CFD calculations, the main large files are the grid file and the flow field file, which contain the topology and flow field values of every discrete cell and occupy a large amount of storage. For example, for an unstructured grid with ten million cells under a second-order finite volume framework, the grid file occupies about 1.5 GB and the flow field file about 5 GB; for a billion-cell unstructured grid, the grid file occupies about 150 GB and the flow field file about 500 GB. Therefore, for billion-cell grids, and the even larger grids of the future, the files reach the ultra-large hundred-GB scale, and single-process, single-file storage and IO cannot meet practical requirements.
The invention therefore provides a new storage mode: the grid of each object is divided into multiple files that form one group, i.e., each object corresponds to one group, each group comprises multiple files, and each file comprises multiple partitions. In this embodiment, the multi-body separation numerical simulation of an ultra-large-scale wing/store configuration (as shown in FIG. 1) is taken as an example; each wing grid data file contains 16 partitions, and each store grid data file contains 8 or 16 partitions, as follows:
The wing and the external store each have a separate grid. Because the grid of a single object is itself ultra-large, it is stored as a group of files; thus, for a multi-object problem, the ultra-large-scale grid is stored as multiple groups of multiple files, as shown in Table 1.
TABLE 1 Storage layout of ultra-large-scale grids for multiple objects
(The table content is provided as an image in the original publication.)
For example, for the wing/store benchmark case, the grid contains 2.88 billion unstructured cells in total, of which 1.98 billion belong to the wing and 0.9 billion to the external store, as shown in FIG. 2. The grids are divided into 2 groups and stored in 8192 files (4096 + 4096) or 6144 files (4096 + 2048); the total file size is 160 GB, so after the grids are split across many files, each file is only about 10-20 MB, as shown in Table 2.
TABLE 2 Ultra-large-scale grid storage example for the wing/store case
(The table content is provided as an image in the original publication.)
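To make the grouped layout concrete, the following is a minimal sketch of the storage hierarchy as in-memory descriptors in C++. It is an illustration only: the type names, the double-precision payload, and the example partition and file counts are assumptions of this sketch, not definitions from the patent.

```cpp
#include <string>
#include <vector>

struct GridPartition {              // one grid partition inside a file
    std::vector<double> cellData;   // partition payload (illustrative)
};

struct GridFile {                   // one file of a group, e.g. ~10-20 MB
    std::string path;
    std::vector<GridPartition> partitions;   // e.g. 16 for a wing file
};

struct GridGroup {                  // one group per object (wing, store, ...)
    std::string objectName;
    std::vector<GridFile> files;    // e.g. 4096 wing files
};

// The whole multi-object grid: multiple groups of multiple files,
// as summarized in Table 1.
using MultiObjectGrid = std::vector<GridGroup>;
```

Under this layout, the wing/store case of Table 2 would correspond to two GridGroup entries holding 4096 files each (or 4096 and 2048).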
For data organized as multiple groups of multiple files, the prior-art reading method is as follows: only the master process reads the data and sends it in turn to the corresponding slave processes, by analogy with a server and its clients; this is called the Client-Server (CS) mode, and its reading procedure is shown in FIG. 3. The method has severe limitations: grid reading is very slow and time-consuming. As shown in Table 3, reading a 2.5 GB data file takes about 90 s.
Therefore, in this embodiment, multiple processes read the files (file processes) and, after reading, send the grid partition data to the corresponding processes; this is referred to as the Point-to-Point (P2P) mode.
The specific rules are as follows: if the number of processes is greater than the number of files, the number of file processes equals the number of files; each file process reads one file and, after reading, sends the grid data to the non-file processes, thereby achieving load balancing. If the number of processes is smaller than the number of files: when the number of files is an integer multiple of the number of processes, the number of file processes equals the total number of processes, all processes read files, and each process reads the same number of files; when the number of files is not an integer multiple of the number of processes, the number of file processes is no greater than the total number of processes, under the constraint that every file process reads the same number of files.
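A minimal sketch of this selection logic in C++ follows. The function name and the largest-divisor preference are assumptions of this sketch; the preference follows the embodiment's later remark that a larger file-process count reduces communication.

```cpp
// Illustrative sketch: choose (M, K) = (file process count, files per file
// process) from P processes and F files, per the rules stated above.
#include <utility>

std::pair<int, int> chooseFileProcesses(int P, int F) {
    if (P >= F) {
        return {F, 1};          // more processes than files: one file each
    }
    // P < F: require M * K == F with M <= P, preferring the largest such M.
    // Note: for the worked example below (P = 6, F = 20) this picks M = 5,
    // whereas the embodiment enumerates M = 2 and M = 4 and picks M = 4.
    for (int M = P; M >= 1; --M) {
        if (F % M == 0) {
            return {M, F / M};  // M == P when F is a multiple of P
        }
    }
    return {1, F};              // never reached: M == 1 always divides F
}
```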
If the number of partitions on each process differs, MPI communication is used to reallocate the partition data evenly across processes. The specific balancing method is as follows: MPI communication ensures that every process handles the same amount of partition data.
The reading method of this embodiment is explained in detail below.
Two situations can occur during reading:
1. The number of processes is greater than the number of files
In this case, the number of file processes is taken equal to the number of files, i.e., each file process reads exactly 1 file and, after reading, sends the grid data to the non-file processes, as shown in FIG. 4.
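In an MPI implementation, one convenient way to realize the file-process / non-file-process split is a communicator split, sketched below. MPI_Comm_split is standard MPI; assigning the M file processes to the lowest ranks is an assumption of this sketch, not something the embodiment prescribes.

```cpp
// Illustrative sketch: give the M file processes their own communicator so
// they can coordinate reading, while non-file processes only receive data.
#include <mpi.h>

MPI_Comm makeFileComm(int M, MPI_Comm world) {
    int rank;
    MPI_Comm_rank(world, &rank);
    int color = (rank < M) ? 0 : 1;   // 0 = file process, 1 = non-file process
    MPI_Comm fileComm;
    MPI_Comm_split(world, color, rank, &fileComm);
    return fileComm;
}
```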
2. The number of processes is less than the number of files
In this case, the number of file processes is necessarily smaller than the number of files, and 2 sub-cases must be distinguished:
(1) The number of files F is an integer multiple of the number of processes P
The number of file processes M is then taken as the total number of processes P, i.e., M = P; all processes read files, and each process reads K = F/M files. Two further cases arise: a) if every file contains the same number of partitions, every process also ends up with the same number of partitions, and no MPI communication is required, as shown in FIG. 5; b) if the files contain different numbers of partitions, the processes end up with different numbers of partitions, and MPI communication is required to reallocate the partition data evenly across processes, as shown in FIG. 6.
For example, with P = 4 processes and F = 12 files, the number of file processes is M = P = 4 and each file process reads K = F/M = 3 files. a) If each file contains 2 partitions, 24 partitions in total, every process holds exactly 6 partitions and no communication is needed. b) If the first 6 files contain 1 partition each and the last 6 files contain 3 partitions each, 6 + 18 = 24 partitions in total, then process 2 and process 3 must each send the grid data of 3 partitions to process 0 and process 1, respectively, to ensure that every process holds 6 partitions.
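The rebalancing step of case b) can be sketched in MPI as follows. Assumptions of this sketch, not taken from the patent text: every partition payload is a fixed-length block of doubles (consistent with the statement that partition data amounts are approximately the same), the total partition count is divisible by the process count, and every rank derives the same greedy transfer plan so that matching sends and receives pair up deterministically.

```cpp
#include <mpi.h>
#include <vector>

static const int PART_LEN = 1024;   // doubles per partition (assumed fixed)

// Move whole partitions between ranks until every rank holds total/size.
void balancePartitions(std::vector<std::vector<double>>& parts, MPI_Comm comm) {
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    int myCount = (int)parts.size();
    std::vector<int> counts(size);
    MPI_Allgather(&myCount, 1, MPI_INT, counts.data(), 1, MPI_INT, comm);

    int total = 0;
    for (int c : counts) total += c;
    const int target = total / size;  // assumed exact

    // Every rank walks the same plan: pair the first surplus rank s with the
    // first deficit rank d and move one partition at a time.
    std::vector<int> bal = counts;
    int s = 0, d = 0;
    while (true) {
        while (s < size && bal[s] <= target) ++s;
        while (d < size && bal[d] >= target) ++d;
        if (s >= size || d >= size) break;
        if (rank == s) {              // send my last partition to rank d
            MPI_Send(parts.back().data(), PART_LEN, MPI_DOUBLE, d, 0, comm);
            parts.pop_back();
        } else if (rank == d) {       // receive one partition from rank s
            parts.emplace_back(PART_LEN);
            MPI_Recv(parts.back().data(), PART_LEN, MPI_DOUBLE, s, 0,
                     comm, MPI_STATUS_IGNORE);
        }
        --bal[s];
        ++bal[d];
    }
}
```

For the counts of example b) above (3, 3, 9, 9 with target 6), this plan moves 3 partitions from rank 2 to rank 0 and 3 partitions from rank 3 to rank 1, matching the transfers described in the text.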
(2) The number of files F is not an integer multiple of the number of processes P.
In this case, a subset of M processes act as file processes and read the files, while the remaining N processes act as redundant non-file processes and only receive data (M + N = P). Under the constraint that every file process reads the same number of files K (M × K = F), the number of file processes must not exceed the total number of processes, i.e., M ≤ P; to reduce communication and improve efficiency, the larger the number of file processes M, the better.
For example, with P = 6 processes and F = 20 files, under the constraint that every file process reads the same number of files and the number of file processes does not exceed the total number of processes, the choices considered are M = 2 with K = 10, or M = 4 with K = 5. To reduce communication volume, M = 4 is taken, i.e., each file process reads K = 5 files, which leaves N = 2 processes as non-file processes; these redundant non-file processes obtain their grid data through MPI communication, as shown in FIG. 7.
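Putting the pieces together for this worked example, the sketch below shows the read phase only: ranks 0-3 act as file processes and each reads K = 5 files, while ranks 4 and 5 wait to receive grid data in the subsequent MPI balancing step. The file-name pattern and the stub reader are hypothetical; a real implementation would parse the embodiment's actual grid format.

```cpp
#include <mpi.h>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for parsing one grid file into partition payloads.
static std::vector<std::vector<double>> readGridFile(const char* path) {
    std::printf("reading %s\n", path);
    return {std::vector<double>(1024, 0.0)};   // dummy: one partition per file
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int F = 20, M = 4, K = F / M;        // worked example: P = 6 ranks

    std::vector<std::vector<double>> parts;    // partitions held by this rank
    if (rank < M) {                            // file process: read K files
        for (int i = 0; i < K; ++i) {
            char path[64];
            std::snprintf(path, sizeof(path), "grid_%04d.dat", rank * K + i);
            for (auto& p : readGridFile(path)) parts.push_back(std::move(p));
        }
    }
    // Ranks M..P-1 (here 4 and 5) are the redundant non-file processes; they
    // hold nothing yet and obtain their grid data through MPI communication,
    // as in the balancing sketch above.
    MPI_Finalize();
    return 0;
}
```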
The method was tested on the wing/store model case, examining grid read-in time. The results show that on a domestic cluster, reading the 360-million-cell and 2.88-billion-cell grids in single-process CS mode takes 161.64 s and 2142.71 s respectively, whereas reading them in P2P mode with 1024 and 8192 processes takes only 12.30 s and 37.28 s respectively, i.e., only 1/50 to 1/10 of the CS-mode time. The P2P mode therefore markedly improves grid file IO efficiency.
TABLE 3 Wing/store grid read-in times

Grid      Cell count     Files   File size   P2P file processes   P2P read time   CS read time
original  45 million     1024    2.5 GB      1024                 7.12 s          94.21 s
adapt1    360 million    1024    20 GB       1024                 12.30 s         161.64 s
adapt2    2.88 billion   8192    160 GB      8192                 37.28 s         2142.71 s
The invention is not limited to the foregoing embodiments. The invention extends to any novel feature, or any novel combination of features, disclosed in this specification, and to any novel method or process step, or any novel combination of steps, disclosed. Those skilled in the art will appreciate that insubstantial changes or modifications can be made without departing from the spirit of the invention as defined by the appended claims.
All of the features disclosed in this specification, or all of the steps in any method or process so disclosed, may be combined in any combination, except combinations of features and/or steps that are mutually exclusive.
Any feature disclosed in this specification may be replaced by alternative features serving equivalent or similar purposes, unless expressly stated otherwise. That is, unless expressly stated otherwise, each feature is only an example of a generic series of equivalent or similar features.

Claims (6)

1. A parallel read-in method for billion-cell, hundred-GB-scale grid data files, characterized in that ultra-large-scale grid data generated by multiple objects are stored as grouped files, each group comprising multiple files and each file comprising multiple data partitions; and in that, during reading, multiple file processes read the files, send the grid data to the corresponding non-file processes, and perform data load balancing.
2. The parallel read-in method according to claim 1, characterized in that the specific read-in method comprises:
if the number of processes is greater than the number of files, the number of file processes equals the number of files; each file process reads one file and, after reading, sends the grid data to the non-file processes, thereby achieving load balancing;
if the number of processes is smaller than the number of files: when the number of files is an integer multiple of the number of processes, the number of file processes equals the total number of processes, all processes read files, and each process reads the same number of files; and when the number of files is not an integer multiple of the number of processes, the number of file processes is no greater than the total number of processes, under the constraint that every file process reads the same number of files.
3. The parallel read-in method according to claim 2, characterized in that the grid data generated by multiple objects are stored in groups: each object forms one group, each group comprises multiple files, and each grid data file contains multiple grid partitions.
4. The parallel read-in method according to claim 3, characterized in that the amount of data in each grid partition is approximately the same.
5. The parallel read-in method according to claim 3 or 4, characterized in that if the number of partitions on each process is the same, no partition reallocation is needed, and if the number of partitions on each process differs, the partition data are reallocated evenly across processes.
6. The parallel read-in method according to claim 5, characterized in that the specific balancing method is as follows: MPI communication is used so that every process handles the same amount of grid partition data.
CN202011183930.7A 2020-10-29 2020-10-29 Parallel read-in method for billion-hundred-GB-level grid data file Pending CN112463360A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011183930.7A CN112463360A (en) 2020-10-29 2020-10-29 Parallel read-in method for billion-hundred-GB-level grid data file

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011183930.7A CN112463360A (en) 2020-10-29 2020-10-29 Parallel read-in method for billion-hundred-GB-level grid data file

Publications (1)

Publication Number Publication Date
CN112463360A true CN112463360A (en) 2021-03-09

Family

ID=74835617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011183930.7A Pending CN112463360A (en) 2020-10-29 2020-10-29 Parallel read-in method for billion-hundred-GB-level grid data file

Country Status (1)

Country Link
CN (1) CN112463360A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1859286A (en) * 2005-11-19 2006-11-08 华为技术有限公司 Load sharing method
CN102447638A (en) * 2012-01-12 2012-05-09 中兴通讯股份有限公司 Load balancing method and forwarding apparatus
CN109920059A (en) * 2019-03-14 2019-06-21 空气动力学国家重点实验室 Ultra-large overlapping grid concurrent assembly method based on auxiliary grid

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
常兴华; 马戎; 张来平: "Parallelized implicit assembly technique for unstructured overset grids" [并行化非结构重叠网格隐式装配技术], Acta Aeronautica et Astronautica Sinica (航空学报), no. 06 *
王年华; 常兴华; 赵钟; 张来平: "MPI+OpenMP hybrid parallelization of unstructured CFD software and its application to ultra-large-scale unsteady parallel computing" [非结构CFD软件MPI+OpenMP混合并行及超大规模非定常并行计算的应用], Acta Aeronautica et Astronautica Sinica (航空学报), vol. 41, no. 010, pages 1-4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116755636A (en) * 2023-08-16 2023-09-15 中国空气动力研究与发展中心计算空气动力研究所 Parallel reading method, device and equipment for grid files and storage medium
CN116755636B (en) * 2023-08-16 2023-10-27 中国空气动力研究与发展中心计算空气动力研究所 Parallel reading method, device and equipment for grid files and storage medium

Similar Documents

Publication Publication Date Title
Dai et al. ForeGraph: Exploring large-scale graph processing on multi-FPGA architecture
Bode et al. The tree particle-mesh N-body gravity solver
Hall Towards a practical parallelisation of the simplex method
WO2017156968A1 (en) Neural network computing method, system and device therefor
CN105740424A (en) Spark platform based high efficiency text classification method
WO2010024027A1 (en) Virtual server system and physical server selection method
CN105824780A (en) Stand-alone multi-FPGA (Field Programmable Gate Array)-based parallel development method
CN107609141A (en) It is a kind of that quick modelling method of probabilistic is carried out to extensive renewable energy source data
CN112613243B (en) Method, device and computer readable storage medium for hydrodynamic simulation
CN111210879A (en) Hierarchical storage optimization method for super-large-scale drug data
CN101419600A (en) Data copy mapping method and device based on object-oriented LANGUAGE
CN112463360A (en) Parallel read-in method for billion-hundred-GB-level grid data file
CN107992358A (en) A kind of asynchronous IO suitable for the outer figure processing system of core performs method and system
Xiang et al. GPU acceleration of CFD algorithm: HSMAC and SIMPLE
CN103064991A (en) Mass data clustering method
CN102799750B (en) Method for quickly generating common side and non-common sides of geometry surface triangle
CN103324577B (en) Based on the extensive itemize file allocation system minimizing IO access conflict and file itemize
CN103473368A (en) Virtual machine real-time migration method and system based on counting rank ordering
CN110110158A (en) A kind of the memory space division methods and system of three-dimensional mesh data
CN116303219A (en) Grid file acquisition method and device and electronic equipment
CN103425787B (en) The triangle gridding of a kind of asymptotic optimization repeats the quick minimizing technology in summit
Kim et al. MapReduce Based Experimental Frame for Parallel and Distributed Simulation Using Hadoop Platform.
CN113900808A (en) MPI parallel data structure based on arbitrary polyhedron unstructured grid
CN105631067A (en) Low-computation complexity construction method of pneumatic flow field grid undirected graph
del Campo et al. Task replication and control for highly parallel in-memory stores

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination