CN116755636B

CN116755636B - Parallel reading method, device and equipment for grid files and storage medium

Info

Publication number: CN116755636B
Application number: CN202311029868.XA
Authority: CN
Inventors: 陈呈; 何舟桥; 杨超; 赵丹; 郭宁波; 邢德; 王岳青; 杨文祥; 喻杰
Original assignee: Computational Aerodynamics Institute of China Aerodynamics Research and Development Center
Current assignee: Computational Aerodynamics Institute of China Aerodynamics Research and Development Center
Priority date: 2023-08-16
Filing date: 2023-08-16
Publication date: 2023-10-27
Anticipated expiration: 2043-08-16
Also published as: CN116755636A

Abstract

The invention relates to the field of grid file processing, and discloses a parallel reading method, a device, equipment and a storage medium of grid files, wherein the method comprises the following steps: preprocessing the grid file and constructing metadata; grid segmentation is carried out by utilizing the information of the metadata, and each sub-grid data set obtained by segmentation is distributed to different processes; reading in the sub-grid data set in parallel, constructing grid topology and carrying out grid mapping to obtain a mapping table; and analyzing the mapping table, and reading corresponding attribute data in parallel to obtain an output result. The method can read the large-scale data file in parallel by utilizing the multi-core characteristic of the cluster system, greatly improves the reading speed of the large-scale file and solves the problem of memory limitation during serial reading.

Description

Parallel reading method, device and equipment for grid files and storage medium

Technical Field

The present invention relates to the field of grid file processing, and in particular, to a method, an apparatus, a device, and a storage medium for parallel reading of grid files.

Background

With the rapid development of computer technology and numerical simulation methods, the computing and data processing capabilities of computational fluid dynamics (CFD) have greatly increased. High-precision, high-fidelity CFD solving often generates data files such as ultra-large-scale grids, flow fields and attributes, and this ultra-large data volume poses challenges for visualization in the post-processing stage, where reading the ultra-large-scale data is the first unavoidable problem to be solved.

After simulation, different post-processing tools typically generate data in a specific format. For example, the EnSight Gold data file format shown in FIG. 1 contains three-dimensional position and size information, scalar fields, vector fields, and other three-dimensional data, and can directly describe vector data. The geometric model is stored in a file with the geo suffix, and attribute data related to the model, such as scalars, vectors and tensors, are stored in files with user-defined suffixes. The EnSight Gold format typically stores unsteady flow field data, and uses a case file to maintain the file associations between the geometric model and the attributes, as well as the organizational relationships among the time steps of the unsteady data.

Current methods for reading data in the EnSight Gold format are still mainly based on the traditional serial reading mode, which generally has the following problems. First, because processor speed and disk speed are mismatched, the hardware limits the data reading speed: when a very large file of tens to hundreds of GB is read on a single CPU, the disk is in a state of frequent access, the processing speed of the CPU is far greater than the I/O speed, and a great deal of time is wasted while the CPU waits for I/O. Second, large-scale flow field data are huge in volume and are usually stored in binary form; when they are read into memory, the operating system performs operations such as memory alignment and data type conversion, so that the amount of memory occupied increases sharply. For example, a 3 GB EnSight Gold file stored in binary form can occupy 10 GB of memory when read serially. The serial reading mode thus puts great pressure on memory, and insufficient memory may even crash the program so that the read fails.

Disclosure of Invention

Accordingly, the present invention is directed to a method, apparatus, device and storage medium for parallel reading of grid files, which can increase the reading speed of large-scale files and solve the problem of memory limitation during serial reading. The specific scheme is as follows:

a parallel reading method of grid files comprises the following steps:

preprocessing the grid file and constructing metadata;

grid segmentation is carried out by utilizing the information of the metadata, and each sub-grid data set obtained by segmentation is distributed to different processes;

reading in the sub-grid data set in parallel, constructing grid topology and carrying out grid mapping to obtain a mapping table;

and analyzing the mapping table, and reading corresponding attribute data in parallel to obtain an output result.

Preferably, in the parallel reading method of a grid file provided by the embodiment of the present invention, preprocessing and metadata construction are performed on the grid file, including:

pre-scanning the grid file to obtain scanning data;

and extracting key information of the grid file from the scanning data to construct metadata.

Preferably, in the parallel reading method of a grid file provided by the embodiment of the present invention, grid segmentation is performed by using information of the metadata, and each sub-grid data set obtained by segmentation is allocated to a different process, including:

dividing the grids of each component according to the number of processes according to the information of the metadata to obtain a plurality of sub-grid data sets;

calculating the start-stop positions of each sub-grid data set in the grid file;

and distributing each sub-grid data set to different processes according to the start-stop positions of each sub-grid data set in the grid file so as to create file view ports for each process.

Preferably, in the parallel reading method of a grid file provided by the embodiment of the present invention, calculating a start-stop position of each sub-grid data set in the grid file includes:

calculating the total number of the sub-grid data sets allocated by the processes before each process;

calculating the initial byte position of the sub-grid data set in the grid file according to the first byte position of the sub-grid data set, the number of points forming the sub-grid data set and the total number of the sub-grid data sets distributed by the processes before each process;

acquiring the number of the sub-grid data sets distributed by the current process;

and calculating the ending byte position of the sub-grid data set in the grid file according to the starting byte position of the sub-grid data set in the grid file, the number of points forming the sub-grid data set and the number of the sub-grid data sets distributed by the current process.

Preferably, in the parallel reading method of a grid file provided by the embodiment of the present invention, reading the sub-grid dataset in parallel, constructing a grid topology and performing grid mapping to obtain a mapping table, including:

mapping the global point set shared by all the components to the corresponding process local point set in each file viewport to generate a mapping result;

and reading in the sub-grid data set in parallel, constructing grid topology by using the mapping result, and mapping the global grid unit into a process to obtain a mapping table.

Preferably, in the parallel reading method of a grid file provided by the embodiment of the present invention, after obtaining the mapping table, the method further includes:

counting the points used by the current process;

copying the memory space corresponding to the used points into the mapping table; and the mapping table stores the mapping relation between the ID of the coordinate point used by the current process in the grid file and the ID in the process.

Preferably, in the parallel reading method of a grid file provided by the embodiment of the present invention, reading corresponding attribute data in parallel to obtain an output result includes:

when the grid attribute is read in parallel, locating the corresponding grid through the ID, the type name and the type ID of the component; the length of the mapping table represents the number of grids;

when the coordinate point attributes are read in parallel, positioning the coordinate point used in the current process through the ID of the component; the length of the mapping table represents the number of real points used in the current process.

The embodiment of the invention also provides a parallel reading device of the grid file, which comprises:

the metadata construction module is used for preprocessing the grid file and constructing metadata;

the grid segmentation and distribution module is used for carrying out grid segmentation by utilizing the information of the metadata and distributing each sub-grid data set obtained by segmentation to different processes;

the grid construction and mapping module is used for reading the sub-grid data sets in parallel, constructing grid topology and carrying out grid mapping to obtain a mapping table;

and the data reading module is used for analyzing the mapping table and reading corresponding attribute data in parallel to obtain an output result.

The embodiment of the invention also provides parallel reading equipment for the grid file, which comprises a processor and a memory, wherein the parallel reading method of the grid file provided by the embodiment of the invention is implemented when the processor executes the computer program stored in the memory.

The embodiment of the invention also provides a computer readable storage medium for storing a computer program, wherein the computer program realizes the parallel reading method of the grid file provided by the embodiment of the invention when being executed by a processor.

From the above technical solution, the parallel reading method of grid files provided by the present invention includes: preprocessing the grid file and constructing metadata; grid segmentation is carried out by utilizing the information of the metadata, and each sub-grid data set obtained by segmentation is distributed to different processes; reading in the sub-grid data set in parallel, constructing grid topology and carrying out grid mapping to obtain a mapping table; and analyzing the mapping table, and reading corresponding attribute data in parallel to obtain an output result.

According to the parallel reading method for the grid files, the grid files are preprocessed and metadata are constructed, grid segmentation is carried out by utilizing the metadata, segmented sub-grid data sets are distributed to different processes, construction and mapping of grid topology are further achieved, and corresponding attribute data are read in parallel by analyzing a mapping table, so that the files are read in parallel by utilizing the multi-core characteristic of the cluster system, the reading speed of large-scale files is greatly improved, the I/O performance is improved, and the memory restriction problem in serial reading is solved.

In addition, the invention also provides a corresponding device, equipment and a computer readable storage medium for the parallel reading method of the grid file, so that the method has more practicability, and the device, the equipment and the computer readable storage medium have corresponding advantages.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the related art, the drawings that are required to be used in the embodiments or the related technical descriptions will be briefly described, and it is apparent that the drawings in the following description are only embodiments of the present invention, and other drawings may be obtained according to the provided drawings without inventive effort for those skilled in the art.

FIG. 1 is a conventional EnSight Gold format file structure;

FIG. 2 is a flow chart of a parallel reading method of grid files provided by an embodiment of the invention;

FIG. 3 is a schematic diagram of a parallel reading method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a metadata data structure according to an embodiment of the present invention;

FIG. 5 is a schematic view of a view port of a process reading file according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of grid construction and mapping according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of coordinate two-time mapping provided in an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a parallel reading device for grid files according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The invention provides a parallel reading method of grid files, which is shown in fig. 2 and comprises the following steps:

s201, preprocessing a grid file and constructing metadata;

it should be noted that, before formally reading in data to construct a grid, the grid file needs to be preprocessed and metadata needs to be constructed, and the constructed metadata can assist parallel reading in.

S202, grid segmentation is carried out by utilizing the information of the metadata, and each sub-grid data set obtained by segmentation is distributed to different processes;

in the invention, the grid data is divided by utilizing the information of the metadata, a plurality of sub-grid data sets can be obtained, then the divided sub-grid data sets are distributed to different computing nodes, and each computing node can load the sub-grid data sets into the memory, so that the memory limit of serial reading is broken through.

The result of step S202 is that a file viewport is created for each process, conditions are created for parallel reading, each process maintains a file pointer, and operates in the respective file viewport.

S203, parallelly reading in the sub-grid data set, constructing grid topology and carrying out grid mapping to obtain a mapping table;

it should be noted that the present invention uses process-level parallelism based on the Message Passing Interface (MPI) for file processing in a cluster environment: the grid is partitioned and distributed to different computing nodes for parallel reading, which improves the reading speed and the I/O performance as well as the efficiency of subsequent visual rendering.

S204, analyzing the mapping table, and reading corresponding attribute data in parallel to obtain an output result.

In the parallel reading method of the grid file provided by the embodiment of the invention, the grid file is preprocessed and metadata is constructed, the metadata is utilized to carry out grid segmentation, the segmented sub-grid data sets are distributed to different processes, further, construction and mapping of grid topology are realized, and the corresponding attribute data are read in parallel by analyzing the mapping table, so that the file is read in parallel by utilizing the multi-core characteristic of the cluster system, the reading speed of a large-scale file is greatly improved, the I/O performance is improved, and the memory limitation problem during serial reading is solved.

In practical application, the parallel reading method provided by the invention is independent of a file storage mode, and can be applied to centralized storage and distributed storage. The invention mainly aims to realize parallel reading of large-scale data files in EnSight Gold format.

Further, in the implementation, in the parallel reading method of the grid file provided by the embodiment of the present invention, step S201 performs preprocessing and metadata construction on the grid file, as shown in fig. 3, may include: firstly, pre-scanning a grid file to obtain scanning data; then, key information of the grid file is extracted from the scan data, and metadata is constructed.

It should be noted that data files in the EnSight Gold format differ greatly from other data formats in the way grids and grid point coordinates are stored. Each part uses its own global set of point coordinates (whether two-dimensional or three-dimensional, the XYZ three-dimensional coordinate values are included); the different types of grids in the part share the IDs of this coordinate set, and by default the IDs of the points and of the grids are incremented from 0 in the order in which they are stored. The number of parts is not positively correlated with the file size, so parts cannot be allocated directly to the processes; the number of grids in the parts, however, is positively correlated with the file size. The strategy adopted by the invention is therefore to divide the grids within each part into sub-grids and allocate the sub-grids to different processes. Because the geo file that stores the geometric model information contains no overview of the information of each part, the invention designs a data structure (partsintype), as shown in fig. 4. Before the geometric model file is read in for grid construction, the geo file is scanned once: each part is scanned in a loop and its basic information is stored in this data structure as the metadata of the geometric model file.
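As a rough illustration of what such per-part metadata might hold, the following Python sketch defines two record types. All class and field names here are hypothetical; the actual fields of the data structure are those of fig. 4, which is not reproduced in this text. The sketch only includes quantities that the description itself relies on later (the number of cells of each type, the number of points per cell, and the first byte position of the first cell), and the numbers in the example are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CellTypeInfo:
    """Hypothetical record for one cell type inside a part."""
    type_name: str           # e.g. "tetra4"
    num_cells: int           # number of cells of this type in the part
    num_points_of_cell: int  # points per cell (NUMPOINTSOFCELL in the text)
    cell_start_pos: int      # first byte of the first cell (CELLSTARTPOS)


@dataclass
class PartInfo:
    """Hypothetical per-part metadata record filled in by the pre-scan."""
    part_id: int
    num_points: int
    cell_types: List[CellTypeInfo] = field(default_factory=list)

    def total_cells(self) -> int:
        # Total number of cells of all types in this part.
        return sum(ct.num_cells for ct in self.cell_types)


# Example metadata for one part (illustrative values only).
part = PartInfo(part_id=1, num_points=500)
part.cell_types.append(CellTypeInfo("tetra4", 1200, 4, 4096))
part.cell_types.append(CellTypeInfo("hexa8", 300, 8, 23296))
```

Keeping the byte offset next to the cell count is what lets a process later jump straight to its own data without rescanning the file.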

Further, in the embodiment of the present invention, in the parallel reading method of the grid file, step S202 performs grid segmentation using metadata information, and distributes each sub-grid data set obtained by segmentation to different processes, as shown in fig. 3, may include: firstly, dividing grids of each part according to the number of processes according to the information of metadata to obtain a plurality of sub-grid data sets; then, calculating the start and stop positions of each sub-grid data set in the grid file; and then, distributing each sub-grid data set to different processes according to the start-stop positions of each sub-grid data set in the grid file so as to create a file viewport for each process.

It should be noted that each part is generally composed of cells of several different types; the number and topology of the cells of each type are given explicitly, while the IDs of the cells of each type are expressed implicitly. In the process of constructing the metadata, the grid is further divided according to the number of processes. The invention designs a data structure (CellTypes) to represent how the grid is divided. Optionally, dividing the grid of each part according to the number of processes in the above step may include: for each cell type, dividing the cells evenly among all processes according to the number of cells of that type; if n cells remain, allocating them one by one to processes 0 to n-1; and storing the number of cells allocated to each process in vBlock (a vector whose subscript represents the process number and whose value at that subscript represents the number of cells allocated to that process).
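The division rule described above can be sketched in a few lines of Python. The function name split_cells is an assumption made for illustration; only vBlock and its meaning come from the description.

```python
from typing import List


def split_cells(num_cells: int, num_procs: int) -> List[int]:
    """Return vBlock: vBlock[rank] is the number of cells given to that process.

    Cells are divided evenly; the n leftover cells go one each to
    processes 0 .. n-1.
    """
    base, remainder = divmod(num_cells, num_procs)
    return [base + 1 if rank < remainder else base for rank in range(num_procs)]


# 10 cells over 4 processes: 2 each, and the 2 leftover cells go to ranks 0 and 1.
v_block = split_cells(10, 4)
print(v_block)  # [3, 3, 2, 2]
```

With this rule no two processes differ by more than one cell, so the read load per cell type stays balanced.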

In addition, since the calculated split case cannot be directly used by the file pointer, it is necessary to further calculate the start-stop positions of the split grid in the file.

Optionally, in a specific implementation, calculating the start-stop position of each sub-grid dataset in the grid file in the step above may include: calculating the total number of sub-grid data sets allocated by the processes before each process; calculating the initial byte position of the sub-grid data set in the grid file according to the first byte position of the sub-grid data set, the number of points forming the sub-grid data set and the total number of the sub-grid data sets allocated by the processes before each process; acquiring the number of sub-grid data sets distributed by the current process; and calculating the ending byte position of the sub-grid data set in the grid file according to the starting byte position of the sub-grid data set in the grid file, the number of points forming the sub-grid data set and the number of sub-grid data sets distributed by the current process.

Specifically, the total number of cells allocated to the processes preceding each process (PREBLOCK) is first calculated from vBlock, and the number of cells allocated to the current process (CURBLOCK) is obtained from the corresponding value in vBlock. Each number in EnSight Gold is represented by an INT type; therefore, the start position of the cells is calculated by formula (1) and the end position by formula (2):

REALSTARTPOS = CELLSTARTPOS + PREBLOCK * sizeof(INT) * NUMPOINTSOFCELL; (1)

REALENDPOS = REALSTARTPOS + CURBLOCK * sizeof(INT) * NUMPOINTSOFCELL - 1; (2)

where REALSTARTPOS indicates the starting byte position of the cells that the current process should handle, CELLSTARTPOS represents the first byte position of the first cell, NUMPOINTSOFCELL indicates the number of points constituting a cell, and REALENDPOS indicates the ending byte position of the cells that the current process should handle.
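The start and end byte positions can be worked through with a short Python sketch. It assumes that sizeof(INT) is 4 bytes, and the function name cell_byte_range is hypothetical; the variable names mirror PREBLOCK, CURBLOCK, CELLSTARTPOS and NUMPOINTSOFCELL, and the second returned value is the ending byte position that the description calls REALENDPOS.

```python
from typing import List, Tuple

SIZEOF_INT = 4  # assumption: each connectivity entry is a 4-byte integer


def cell_byte_range(cell_start_pos: int, v_block: List[int], rank: int,
                    num_points_of_cell: int) -> Tuple[int, int]:
    """Return the inclusive (start, end) byte range of the cells owned by `rank`."""
    pre_block = sum(v_block[:rank])  # cells owned by the preceding processes
    cur_block = v_block[rank]        # cells owned by this process
    real_start = cell_start_pos + pre_block * SIZEOF_INT * num_points_of_cell
    real_end = real_start + cur_block * SIZEOF_INT * num_points_of_cell - 1
    return real_start, real_end


# 10 tetrahedra (4 points each) starting at byte 1000, split over 4 processes.
v_block = [3, 3, 2, 2]
ranges = [cell_byte_range(1000, v_block, r, 4) for r in range(4)]
print(ranges)  # [(1000, 1047), (1048, 1095), (1096, 1127), (1128, 1159)]
```

The ranges are contiguous and do not overlap: each process starts one byte after the previous process ends, and together they cover the 160 bytes of connectivity exactly.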

Further, in the implementation, in the parallel reading method of the grid file provided by the embodiment of the present invention, step S203 reads the sub-grid dataset in parallel, constructs a grid topology and performs grid mapping to obtain a mapping table, which may include: mapping the global point set shared by all the components to the corresponding process local point set in each file viewport to generate a mapping result; and reading in the sub-grid data set in parallel, constructing grid topology by using the mapping result, and mapping the global grid unit into a process to obtain a mapping table. After performing step S203 to obtain the mapping table, it may further include: counting the points used by the current process; copying the memory space corresponding to the used points into a mapping table; the mapping table stores the mapping relation between the ID of the coordinate point used by the current process in the grid file and the ID in the process.

Fig. 5 shows the read viewport situation for each process. Each process opens the file and maintains the file pointer in each process, and each process only reads the data belonging to the own viewport to construct the grid topology and make the grid mapping with the help of the metadata, so that the contents processed by each process are not interfered with each other.
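The viewport idea can be illustrated without MPI by letting each simulated process open the file, seek to its own start byte and read only its own byte range. This is a minimal sketch, assuming little-endian 4-byte integers and a made-up 8-byte header; the function name read_viewport is hypothetical, and a real implementation would run one such read per MPI process rather than sequentially.

```python
import os
import struct
import tempfile
from typing import List


def read_viewport(path: str, start: int, end: int) -> List[int]:
    """Read the inclusive byte range [start, end] and decode it as int32 values."""
    with open(path, "rb") as f:   # each process keeps its own file pointer
        f.seek(start)
        raw = f.read(end - start + 1)
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}i", raw[:count * 4]))


# Build a small test file: an 8-byte header followed by 3 cells of 4 node IDs.
ids = list(range(100, 112))
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HEADER__")
    tmp.write(struct.pack("<12i", *ids))
    path = tmp.name

view0 = read_viewport(path, 8, 8 + 2 * 16 - 1)   # "process 0": cells 0 and 1
view1 = read_viewport(path, 8 + 32, 8 + 48 - 1)  # "process 1": cell 2
os.remove(path)
print(view0, view1)
```

Because the two byte ranges are disjoint, neither reader touches data that belongs to the other.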

In the invention, the coordinate point set is used to construct the grid topology. In EnSight Gold, each part shares one coordinate point set; after the grid is divided into a plurality of sub-grid data sets that are distributed to different processes, each process needs to maintain its own point set independently. The invention provides a point set mapping method: first, the global point set shared by a part is mapped to a process-local point set; then, when the data are formally read in to construct grids, a new grid topology is constructed from the local point set and the global grid cells are mapped into the process. Through these two mappings, the point set of a part in the geometric model is mapped into each process, which ensures the correct construction and mapping of the grids and the correct reading of the attribute files.

Fig. 6 shows the two mapping processes. The point set of the first mapping is used to construct the grid, and the specific scheme is as follows: each process maintains a mapping table pointsIdReflectTable (whose length is the total number of points in the part). When constructing the grid, the first coordinate ID is acquired from NODEIDLIST, the ID is taken as a subscript, and the value at that subscript in the array is set to pointsCnt (initial value 1, incremented cyclically). Each time a coordinate ID is acquired, pointsIdReflectTable is first checked to determine whether a mapping already exists for it: if no mapping exists, the mapping is created and the mapping result is then used to construct the grid; if the mapping exists, the mapping result is used directly to construct the new cell topology. After all grid topologies have been built, the construction of pointsIdReflectTable is complete; a value of 0 indicates no mapping relation, and a non-zero value together with its subscript forms a mapping relation.
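The first mapping can be sketched in Python as follows. The names points_id_reflect_table, points_cnt and node_id_list follow pointsIdReflectTable, pointsCnt and NODEIDLIST in the description; the function name, the 0-based local ID (stored value minus 1) and the sample connectivity are assumptions made for illustration.

```python
from typing import List, Tuple


def build_local_topology(node_id_list: List[int], num_part_points: int,
                         points_per_cell: int) -> Tuple[List[List[int]], List[int]]:
    """Map file point IDs to process-local IDs while rebuilding cell topology."""
    points_id_reflect_table = [0] * num_part_points  # 0 means "not mapped yet"
    points_cnt = 1                                   # starts at 1, so 0 stays free
    local_cells = []
    for i in range(0, len(node_id_list), points_per_cell):
        cell = []
        for global_id in node_id_list[i:i + points_per_cell]:
            if points_id_reflect_table[global_id] == 0:   # first time seen
                points_id_reflect_table[global_id] = points_cnt
                points_cnt += 1
            # Assumed convention: local ID is the stored value minus 1.
            cell.append(points_id_reflect_table[global_id] - 1)
        local_cells.append(cell)
    return local_cells, points_id_reflect_table


# Two tetrahedra that share three points (illustrative connectivity).
cells, table = build_local_topology([198, 21, 33, 18, 21, 33, 18, 7], 200, 4)
print(cells)  # [[0, 1, 2, 3], [1, 2, 3, 4]]
```

Shared points receive a single local ID, so the second cell reuses local points 1, 2 and 3 and only adds one new point.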

The point set of the second mapping is stored in the object and used for reading attribute data, and the specific scheme is as follows: each process maintains a mapping table realPointsIdReflectTable (this array cannot be constructed directly, because the number of points used by the current process can only be counted after all grids have been constructed; an intermediate array is therefore needed to map the points temporarily, after which its memory is copied into this array). The first coordinate ID is acquired from NODEIDLIST and is stored, as a value, into the temporary mapping table at subscript pointsCnt - 1. When all grids have been constructed, the portion of the temporary mapping table with length equal to pointsCnt is copied into the mapping table realPointsIdReflectTable, so that this mapping table stores only the mapping relation between the IDs, in the geometric model file, of the coordinate points used by the current process and their IDs within the process.
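A minimal Python sketch of the second mapping follows. The temporary array is allocated at the size of the whole part because the number of used points is unknown until every cell has been visited, and it is trimmed afterwards. The function name and the helper array seen are hypothetical; in this sketch the counter starts at 1, so the number of used points is points_cnt - 1 at the end, which is an interpretation of the copy length rather than a statement of the original implementation.

```python
from typing import List


def build_real_points_table(node_id_list: List[int],
                            num_part_points: int) -> List[int]:
    """Return a table where index = process-local point ID, value = file point ID."""
    seen = [0] * num_part_points        # non-zero once a file ID has been mapped
    temp_table = [0] * num_part_points  # oversized intermediate array
    points_cnt = 1
    for global_id in node_id_list:
        if seen[global_id] == 0:
            seen[global_id] = points_cnt
            temp_table[points_cnt - 1] = global_id  # store file ID at local index
            points_cnt += 1
    used = points_cnt - 1               # number of distinct points actually used
    return temp_table[:used]            # copy only the used prefix


real_points_id_reflect_table = build_real_points_table(
    [198, 21, 33, 18, 21, 33, 18, 7], 200)
print(real_points_id_reflect_table)  # [198, 21, 33, 18, 7]
```

The resulting table is the inverse of the first mapping: it answers "which file point does local point i correspond to", which is the lookup needed when attribute values are read.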

In addition, it should be noted that the parallel construction and mapping of the grid are realized by parsing the metadata of the geometric model file, instead of reading line by line as in the serial I/O scheme. Each process parses partsinftype, obtains the information of the parts it is to process, and moves its file pointer within each part. First, using REALSTARTPOS and REALENDPOS, the topologies of all cells to be processed by the current process are read directly into an array NODEIDLIST (this array stores the topology of all the cells; this part of the file consists of consecutive bytes and can therefore be read in directly). The IDs of the cells in the geometric model file increase from 0 by default, and the IDs of the cells likewise increase from 0 when the serial method reads in the file and constructs the grid. PREBLOCK and CURBLOCK are used to calculate the start ID and end ID of the cells to be processed in the process, and these serve as the loop bounds for grid construction and mapping. The IDs of the coordinate points that make up a single cell are taken out of NODEIDLIST in sequence. Because the coordinate point sets used in different processes differ, the point set in the geometric model file needs to be mapped separately in each process: each process maintains the point set of the current process and uses it to construct the grid and perform the point set mapping. The loop variable K represents the ID of the cell in the geometric model file; after each grid is built, K is inserted into a list whose subscripts start from 0 by default, which maps the ID of the cell in the geometric model to the process. Fig. 7 shows the mapping method.

A specific example: suppose the grid type handled by process 2 is TETRA4 and the ID of the cell being processed is 1000, with original topology 198, 21, 33, 18 (the topology means that the coordinates of these 4 point IDs in the geometric model file are connected in sequence to form a grid of type TETRA4). After mapping by the method provided by the invention, the coordinate IDs actually inserted when the cell is built in the process are 0, 1, 2 and 3, the value of K is 1000, and K is inserted into the list, so that the cell with ID 1000 in the geometric model file is mapped to the cell with ID 0 in process 2.
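The cell ID mapping in this example can be reproduced with a small Python sketch. It assumes a single cell type whose cells are split evenly so that process 2 starts at cell 1000; the function name map_cell_ids and the list name cell_id_list are hypothetical.

```python
from typing import List


def map_cell_ids(v_block: List[int], rank: int) -> List[int]:
    """Return a list where index = process-local cell ID, value = file cell ID K."""
    pre_block = sum(v_block[:rank])   # start ID of this process's cells
    cur_block = v_block[rank]         # number of cells this process owns
    cell_id_list = []
    for k in range(pre_block, pre_block + cur_block):
        # A real reader would build the cell from its node IDs here,
        # then record K once the cell has been built.
        cell_id_list.append(k)
    return cell_id_list


# 2000 cells over 4 processes: process 2 owns file cells 1000 .. 1499.
cell_id_list = map_cell_ids([500, 500, 500, 500], rank=2)
print(cell_id_list[0])  # 1000, i.e. file cell 1000 is local cell 0 in process 2
```

Since the list index is the local cell ID, looking up the file ID of any local cell is a constant-time operation.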

Further, in the parallel reading method of the grid file provided by the embodiment of the present invention, step S204 reads corresponding attribute data in parallel to obtain an output result, which may include: when the grid attribute is read in parallel, locating the corresponding grid through the ID, the type name and the type ID of the component; the length of the mapping table represents the number of grids; when the coordinate point attributes are read in parallel, positioning the coordinate point used in the current process through the ID of the component; the length of the mapping table represents the number of real points used in the current process.

In the invention, when constructing the grid, the ID mapping tables of the grid and the coordinate points in the files and the processes are constructed and stored in the Reader object, and when the attribute data is read in, the mapping tables can be directly called to acquire the mapping relation, so that the attribute data is associated with the corresponding grid and coordinate points. The real IDs of the grids and coordinate points in the process are obtained through analyzing the mapping table, when the grid attribute is read in, the corresponding grids can be quickly positioned through the part ID, the type name and the type ID, the length of the mapping table indicates the number of the grids, and therefore attribute data are associated; when the coordinate point attribute is read in, the corresponding coordinate point can be quickly positioned through the ID of the part, and the length of the mapping table indicates the number of the real points used in the process, so that attribute data are associated.
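How the mapping tables associate attribute data with local grids and points can be sketched as a simple lookup. This Python sketch assumes, purely for illustration, that the attribute values of one part are available as a sequence indexed by the ID in the file; the function name select_attributes and the sample values are hypothetical, and the sketch does not show how the values are fetched from the attribute file.

```python
from typing import List, Sequence


def select_attributes(file_values: Sequence[float],
                      mapping_table: List[int]) -> List[float]:
    """Pick the attribute values this process needs.

    mapping_table[i] is the file ID of local item i (a point or a cell),
    so the result is ordered by process-local ID and has the same length
    as the mapping table.
    """
    return [file_values[file_id] for file_id in mapping_table]


# Per-point scalar for a part with 8 points (illustrative values).
point_scalar = [10.0 * i for i in range(8)]
real_points_table = [5, 2, 7]   # local points 0, 1, 2 are file points 5, 2, 7
local_scalar = select_attributes(point_scalar, real_points_table)
print(local_scalar)  # [50.0, 20.0, 70.0]
```

The same lookup applies to cell attributes when the cell ID list is passed as the mapping table; in both cases the length of the table gives the number of values the process keeps.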

Based on the same inventive concept, the embodiment of the invention also provides a parallel reading device of the grid file, and because the principle of solving the problem of the device is similar to that of the parallel reading method of the grid file, the implementation of the device can refer to the implementation of the parallel reading method of the grid file, and the repetition is omitted.

In a specific implementation, the parallel reading device for a grid file provided by the embodiment of the present invention, as shown in fig. 8, specifically includes:

the metadata construction module 11 is used for preprocessing the grid file and constructing metadata;

the grid dividing and distributing module 12 is configured to divide the grid by using information of metadata, and distribute each sub-grid data set obtained by dividing to different processes;

the grid construction and mapping module 13 is used for reading in the sub-grid data set in parallel, constructing grid topology and performing grid mapping to obtain a mapping table;

the data reading module 14 is configured to parse the mapping table, and read in parallel the corresponding attribute data to obtain an output result.

In the parallel reading device for the grid files provided by the embodiment of the invention, the parallel reading of the files can be realized through the interaction of the four modules, the reading speed of large-scale files is greatly improved, the I/O performance is improved, and the memory limitation problem during serial reading is solved.

For more specific working procedures of the above modules, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.

Correspondingly, the embodiment of the invention also discloses parallel reading equipment of the grid file, which comprises a processor and a memory; the parallel reading method of the grid file disclosed in the foregoing embodiment is implemented when the processor executes the computer program stored in the memory. For more specific procedures of the above method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.

Further, the invention also discloses a computer readable storage medium for storing a computer program; the computer program, when executed by the processor, implements the parallel reading method of the grid file disclosed above. For more specific procedures of the above method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.

In this specification, the embodiments are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be cross-referenced. Because the apparatus, device, and storage medium disclosed in the embodiments correspond to the methods disclosed in the embodiments, they are described relatively briefly, and the relevant details can be found in the description of the method.

Those of skill in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Finally, it is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

The parallel reading method, device, equipment and storage medium for grid files provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. Meanwhile, those skilled in the art may, in accordance with the idea of the invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this specification should not be construed as limiting the invention.

Claims

1. A parallel reading method of grid files, comprising:

preprocessing the grid file and constructing metadata;

distributing each sub-grid data set to different processes according to the start-stop positions of each sub-grid data set in the grid file so as to create a file viewport for each process;

reading in the sub-grid data set in parallel, constructing grid topology by using the mapping result, and mapping the global grid unit into a process to obtain a mapping table;

2. The parallel reading method of grid files according to claim 1, wherein preprocessing the grid file and constructing metadata comprises:

pre-scanning the grid file to obtain scanning data;

3. The parallel reading-in method of a grid file according to claim 2, wherein calculating start-stop positions of each of the sub-grid data sets in the grid file comprises:

4. The parallel reading method of grid files according to claim 3, further comprising, after obtaining the mapping table:

counting the points used by the current process;

5. The parallel reading-in method of grid files according to claim 4, wherein reading-in corresponding attribute data in parallel, obtaining an output result, comprises:

6. A parallel reading device for a grid file, comprising:

the grid dividing and distributing module is used for dividing the grids of all the components by the number of processes, based on the information of the metadata, to obtain a plurality of sub-grid data sets; calculating the start-stop positions of each sub-grid data set in the grid file; distributing each sub-grid data set to different processes according to the start-stop positions of each sub-grid data set in the grid file so as to create a file viewport for each process;

the grid construction and mapping module is used for mapping the global point set shared by all the components to the corresponding process local point set in each file viewport to generate a mapping result; reading in the sub-grid data set in parallel, constructing grid topology by using the mapping result, and mapping the global grid unit into a process to obtain a mapping table;

7. A parallel reading-in device of a grid file, characterized by comprising a processor and a memory, wherein the processor implements the parallel reading-in method of a grid file according to any one of claims 1 to 5 when executing a computer program stored in the memory.

8. A computer readable storage medium for storing a computer program, wherein the computer program when executed by a processor implements a parallel read-in method of a grid file according to any one of claims 1 to 5.