CN114995754A

CN114995754A - High-performance read-write method for single scientific big data HDF5 file

Info

Publication number: CN114995754A
Application number: CN202210585275.0A
Authority: CN
Inventors: 张承龙; 张一�; 何旭; 李想; 朱中柱
Original assignee: Institute of High Energy Physics of CAS
Current assignee: Institute of High Energy Physics of CAS
Priority date: 2022-05-26
Filing date: 2022-05-26
Publication date: 2022-09-02
Anticipated expiration: 2042-05-26
Also published as: CN114995754B

Abstract

The invention relates to the technical field of computers, and in particular to a high-performance read-write method for a single scientific big data HDF5 file. The method applies fine-grained parallelism to the datasets within a single HDF5 file: each process handles a portion of the datasets in a batch, a process moves on to its next batch only after finishing its current one, and no process starts the next batch until all processes have finished the current batch.

Description

High-performance read-write method for single scientific big data HDF5 file

Technical Field

The invention relates to the technical field of computers, in particular to a high-performance read-write method for a single scientific big data HDF5 file.

Background

The Hierarchical Data Format (HDF) is a file format developed by the National Center for Supercomputing Applications (NCSA) in the United States for storing and organizing large amounts of data. It is mainly used to store various types of scientific data generated on different computing platforms, and it offers parallel I/O (input/output) capability, cross-platform portability, and easy extensibility. It has become the standard format of the Earth Observing System (EOS) data and information system and, because it is self-describing, general, flexible, and extensible, it is widely used to store and process complex scientific data in big data fields such as physics, biology, chemistry, environmental science, materials science, earth science, aviation, and oceanography.

To facilitate data acquisition, transmission, processing, and pre-experiments, scientific experiments generally aggregate their data into a small number of files, which makes each single HDF5 file very large. These HDF5 files are stored in a disk file system. Because the disk is a slow storage medium and the data volume of the HDF5 file is huge, scientific big data software systems spend a large amount of time waiting for read and write operations on the HDF5 file to complete.

In the prior art, a single HDF5 file is mainly read and written serially. Consistency is difficult to maintain under parallel I/O, and no method is available for improving the read-write performance of a single HDF5 file through parallel I/O, so scientific big data software systems spend a large amount of time reading and writing a single HDF5 file.

Therefore, a high-performance read-write method for a single scientific big data HDF5 file is needed to solve the problems of coarse-grained parallel read-write that takes the single file as the unit, low parallel efficiency of multi-core processors and disk file systems, and complex synchronization among processes, and thereby indirectly improve the efficiency of scientific experiments. Performing finer-grained parallelism with the dataset inside a single HDF5 file as the granularity can effectively solve these problems, improve the parallel efficiency of multi-core processors and disk file systems, and significantly improve the read-write performance of a single HDF5 file.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a high-performance read-write method for a single scientific big data HDF5 file. It addresses three problems that seriously restrict the efficiency of scientific experiments: coarse-grained parallel read-write that takes the single file as the unit, low parallel efficiency of multi-core processors and disk file systems, and complex synchronization among processes. By performing finer-grained parallelism with the dataset inside a single HDF5 file as the granularity, the invention improves the parallel efficiency of multi-core processors and disk file systems and significantly improves the read-write performance of a single HDF5 file.

To achieve this purpose, the invention provides a high-performance read-write method for a single scientific big data HDF5 file. The read method and the write method are similar in principle; the read method comprises the following steps:

S1: set the number of processes in the communication domain as size, and record the number of the current process as ID;

S2: open the HDF5 file using parallel I/O;

S3: obtain the total number of datasets in the HDF5 file and record it as count; obtain the data byte size of a single dataset in the HDF5 file and record it as M;

S4: set the number of datasets held in the send buffer of each process as N; allocate the send buffer send_buf of each process, with size nbytes = N × M; the total number of datasets processed by all processes in each round is NN = N × size;

S5: determine whether the process is process 0; if so, allocate a memory space array of size count × M for storing the read data, and allocate a receive buffer recv_buf of size NN × M; if not, take no action in this step;

S6: set the iteration start point start1 = ID × N, the iteration end point end1 = int(count / NN) × NN, and the iteration variable x1 = start1; determine whether x1 is less than end1;

when the result of S6 is yes, the following steps are performed:

S6-1: set the iteration start point start2 = 0, the iteration end point end2 = N, and the iteration variable x2 = start2;

S6-2: determine whether x2 is less than end2;

S6-3: when the result of S6-2 is yes, read the (x1 + x2)-th dataset from storage, copy it to the x2-th dataset position of the send buffer send_buf, execute x2 = x2 + 1, and return to S6-2;

S6-4: when the result of S6-2 is no, perform a gather operation on the buffered data of all processes, namely Gather(send_buf, recv_buf);

S7: determine whether the process is process 0;

S7-1: when the result of S7 is yes, store the data in the receive buffer recv_buf at the address in array beginning at the x1-th dataset, execute x1 = x1 + NN, and return to S6;

S7-2: when the result of S7 is no, return directly to S6;

when the result of S6 is no, the following steps are performed:

S8: determine whether the process is process 0;

when the result of S8 is yes, serially read the remaining datasets and store them in array;

when the result of S8 is no, perform no processing and skip this step;

S9: close the HDF5 file;

S10: determine whether the process is a non-zero process;

when the result of S10 is yes, execute exit(0);

when the result of S10 is no, the entire algorithm ends.

According to the invention, each process handles a portion of the datasets in a batch and moves on to its next batch only after finishing the current one, and all processes advance to the next batch only after every process has finished the same batch.
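The batch discipline described above can be sketched in a few lines of Python. This is only an illustration of the index arithmetic in S4 and S6, not the patented implementation: the function name batch_schedule and its return shape are hypothetical, no HDF5 or MPI calls are made, and end1 is assumed to be the largest multiple of NN that does not exceed count, which is what keeps every process on the same number of batches.

```python
def batch_schedule(count, size, n):
    """Return (schedule, remainder) for `count` datasets, `size` processes, N = `n`.

    schedule[rank] is the list of batches for that rank; each batch is the list
    of dataset indices the rank copies into its send buffer.
    remainder lists the indices left for process 0 to read serially.
    """
    nn = n * size                  # NN: datasets handled by all processes per batch
    end1 = (count // nn) * nn      # assumed: end1 = int(count / NN) * NN
    schedule = {}
    for rank in range(size):
        batches = []
        x1 = rank * n              # start1 = ID * N
        while x1 < end1:
            batches.append(list(range(x1, x1 + n)))
            x1 += nn               # every process advances by one full batch
        schedule[rank] = batches
    remainder = list(range(end1, count))
    return schedule, remainder


# Hypothetical example: 10 datasets, 2 processes, 2 datasets per send buffer.
schedule, remainder = batch_schedule(count=10, size=2, n=2)
print(schedule)   # batches per rank
print(remainder)  # datasets left for the serial tail
```

With these example values, process 0 would take datasets 0, 1 and then 4, 5; process 1 would take 2, 3 and then 6, 7; datasets 8 and 9 fall outside the last full batch and are left for process 0.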

Compared with the prior art, the invention performs finer-grained parallelism over the datasets within a single HDF5 file, which effectively solves the problems in the prior art, improves the parallel efficiency of multi-core processors and disk file systems, and significantly improves the read-write performance of a single HDF5 file.

Drawings

FIG. 1 is a schematic diagram of the logical structure of the HDF5 file according to the present invention;

FIG. 2 is a first schematic diagram of the algorithm of the present invention;

FIG. 3 is a schematic diagram of the algorithm flow of the present invention;

Detailed Description

The invention will now be further described with reference to the accompanying drawings. Referring to FIG. 1 to FIG. 3, the invention provides a high-performance read-write method for a single scientific big data HDF5 file. The read method and the write method are similar in principle; the read method comprises the following steps:

S1: set the number of processes in the communication domain as size, and record the number of the current process as ID;

S2: open the HDF5 file using parallel I/O;

S5: determine whether the process is process 0; if so, allocate a memory space array of size count × M for storing the read data, and allocate a receive buffer recv_buf of size NN × M; if not, take no action in this step;

S6: set the iteration start point start1 = ID × N, the iteration end point end1 = int(count / NN) × NN, and the iteration variable x1 = start1; determine whether x1 is less than end1;

when the result of S6 is yes, the following steps are performed:

S6-2: determine whether x2 is less than end2;

S6-3: when the result of S6-2 is yes, read the (x1 + x2)-th dataset from storage, copy it to the x2-th dataset position of the send buffer send_buf, execute x2 = x2 + 1, and return to S6-2;

S6-4: when the result of S6-2 is no, perform a gather operation on the buffered data of all processes, namely Gather(send_buf, recv_buf);

S7: determine whether the process is process 0;

S7-2: when the result of S7 is no, return directly to S6;

when the result of S6 is no, the following steps are performed:

S8: determine whether the process is process 0;

when the result of S8 is no, perform no processing and skip this step;

S9: close the HDF5 file;

S10: determine whether the process is a non-zero process;

when the result of S10 is yes, execute exit(0);

when the result of S10 is no, the entire algorithm ends.
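The memory each process needs follows directly from count, M, N, and size. The sketch below tallies the buffer sizes named in S4 and S5; BufferPlan and plan_buffers are hypothetical names introduced for illustration, and the dataset size used in the usage note is an assumed value, not one taken from the invention.

```python
from dataclasses import dataclass


@dataclass
class BufferPlan:
    send_buf_bytes: int   # nbytes = N * M, allocated by every process
    recv_buf_bytes: int   # NN * M, allocated by process 0 only
    array_bytes: int      # count * M, allocated by process 0 only


def plan_buffers(count, m, n, size, rank):
    """Return the buffer sizes in bytes that process `rank` allocates."""
    nn = n * size                      # NN = N * size
    is_root = (rank == 0)              # only process 0 gathers and stores results
    return BufferPlan(
        send_buf_bytes=n * m,
        recv_buf_bytes=nn * m if is_root else 0,
        array_bytes=count * m if is_root else 0,
    )
```

For example, with 1000 datasets of 8 MiB each, N = 4, and 8 processes, process 0 would hold a 32 MiB send buffer, a 256 MiB receive buffer, and roughly 7.8 GiB for array, while every other process would hold only its 32 MiB send buffer. Choosing N therefore trades gather frequency against memory on process 0.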

Example:

The method runs mainly on platforms such as servers, desktop computers, and notebook computers. Assuming that the HDF5 file is named file.h5, the principle of the invention is described below using the example of storing each picture as a dataset and reading the HDF5 file (FIG. 1 shows the logical structure of an HDF5 file).

All pictures are stored as a 3D array (each slice along the third dimension of the 3D array is regarded as a dataset); the process of writing a single HDF5 file is similar. As shown in FIGS. 2 and 3, the specific algorithm steps are as follows:

1. The number of processes in the communication domain is size

2. The process number is ID

3. Open file.h5 using MPI-IO

4. Obtain the total number of datasets in file.h5 and record it as count

5. Obtain the data byte size of a single dataset in file.h5 and record it as M

6. Set the number of datasets held in the send buffer of each process and record it as N

7. Allocate the send buffer (send_buf) of each process, with size nbytes = N × M

8. The total number of datasets processed by all processes in each round is NN = N × size

9. If the process is process 0, allocate a memory space array of size count × M for storing the read data

10. If the process is process 0, allocate the receive buffer recv_buf, with size NN × M

11. Iteration start point start1 = ID × N

12. Iteration end point end1 = int(count / NN) × NN

13. Iteration variable x1 = start1

14. If x1 < end1, then all processes perform the following in parallel:

14.1. Iteration start point start2 = 0

14.2. Iteration end point end2 = N

14.3. Iteration variable x2 = start2

14.4. If x2 < end2, then the following is performed:

14.4.1. Read the (x1 + x2)-th dataset from storage and copy it to the x2-th dataset position of the send buffer send_buf

14.4.2. x2 = x2 + 1

14.4.3. Jump back to step 14.4

14.5. Perform a gather operation on the buffered data of all processes: Gather(send_buf, recv_buf)

14.6. If the process is process 0, store the data in the receive buffer recv_buf at the address in array beginning at the x1-th dataset

14.7. x1 = x1 + NN

14.8. Jump back to step 14

15. If the process is process 0, serially read the remaining datasets and store them in array

16. Close file.h5

17. If the process is not process 0, execute exit(0)
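To check that the steps above place every dataset exactly once and in order, the following sketch replays them inside a single Python process. It is a simulation under stated assumptions, not the patented implementation: the ranks are run in lock step in a loop, the HDF5 file is stood in for by a non-empty list of equal-length bytes objects, the gather is modeled as concatenating the send buffers in rank order, and end1 is taken as int(count / NN) × NN. The function name simulated_parallel_read is hypothetical.

```python
def simulated_parallel_read(datasets, size, n):
    """Replay steps 1-17 with `size` simulated ranks and N = `n`.

    `datasets` stands in for the file: a non-empty list of equal-length bytes.
    Returns the contents of `array` on process 0 after the read.
    """
    count = len(datasets)                    # step 4
    m = len(datasets[0])                     # step 5: M, bytes per dataset
    nn = n * size                            # step 8: NN = N * size
    array = bytearray(count * m)             # step 9: result buffer on process 0
    end1 = (count // nn) * nn                # step 12 (assumed form)
    x1 = [rank * n for rank in range(size)]  # steps 11 and 13, one x1 per rank

    while x1[0] < end1:                      # step 14: same trip count on every rank
        send_bufs = []
        for rank in range(size):
            send_buf = bytearray(n * m)      # step 7
            for x2 in range(n):              # steps 14.1 to 14.4
                send_buf[x2 * m:(x2 + 1) * m] = datasets[x1[rank] + x2]
            send_bufs.append(send_buf)
        recv_buf = b"".join(send_bufs)       # step 14.5: gather in rank order
        base = x1[0] * m                     # step 14.6: process 0 stores at its x1
        array[base:base + nn * m] = recv_buf
        for rank in range(size):
            x1[rank] += nn                   # step 14.7

    for i in range(end1, count):             # step 15: serial tail on process 0
        array[i * m:(i + 1) * m] = datasets[i]
    return bytes(array)


# Hypothetical data: 11 three-byte "datasets", 2 ranks, N = 2.
fake_file = [bytes([i]) * 3 for i in range(11)]
result = simulated_parallel_read(fake_file, size=2, n=2)
print(result == b"".join(fake_file))
```

In a real deployment the inner copy would be a dataset read through a parallel HDF5 driver and the concatenation would be an MPI gather rooted at process 0; the index arithmetic would stay the same.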

The above are only preferred embodiments of the present invention and are intended only to help in understanding the method and core idea of the present application. The scope of the present invention is not limited to the above embodiments, and all technical solutions that follow the idea of the present invention fall within its scope. It should be noted that modifications and adaptations made by those skilled in the art without departing from the principles of the present invention should also be considered within the scope of the present invention.

As a whole, the invention solves the prior-art problems of low efficiency and complex operation that arise when experimental data stored in a single HDF5 file is read and written with coarse-grained parallelism that takes the single file as the unit. By performing finer-grained parallelism with the dataset inside a single HDF5 file as the granularity, the invention improves the parallel efficiency of multi-core processors and disk file systems, significantly improves the read-write performance of a single HDF5 file, and greatly improves the efficiency of scientific experiments.

Claims

1. A high-performance read-write method for a single scientific big data HDF5 file, characterized in that the read method and the write method are similar in principle, wherein the read method comprises the following steps:

S2: open the HDF5 file using parallel I/O;

S4: set the number of datasets held in the send buffer of each process as N; allocate the send buffer send_buf of each process, with size nbytes = N × M; the total number of datasets processed by all processes in each round is NN = N × size;

S5: determine whether the process is process 0; if so, allocate a memory space array of size count × M for storing the read data, and allocate a receive buffer recv_buf of size NN × M; if not, take no action in this step;

S6: set the iteration start point start1 = ID × N, the iteration end point end1 = int(count / NN) × NN, and the iteration variable x1 = start1; determine whether x1 is less than end1;

when the result of S6 is yes, the following steps are performed:

S6-2: determine whether x2 is less than end2;

S7: determine whether the process is process 0;

S7-2: when the result of S7 is no, return directly to S6;

when the result of S6 is no, the following steps are performed:

S8: determine whether the process is process 0;

when the result of S8 is yes, serially read the remaining datasets and store them in array;

S9: close the HDF5 file;

S10: determine whether the process is a non-zero process;

when the result of S10 is yes, execute exit(0);

when the result of S10 is no, the entire algorithm ends.