CN112235356B - Distributed PB-level CFD simulation data management system based on cluster - Google Patents

Info

Publication number
CN112235356B
Authority
CN
China
Prior art keywords
data
file
storage
module
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011007979.7A
Other languages
Chinese (zh)
Other versions
CN112235356A (en)
Inventor
唐滨
李宝君
段文洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital Intelligent Ship&ocean Technology Co ltd
Original Assignee
Digital Intelligent Ship&ocean Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/025 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/28 Design optimisation, verification or simulation using fluid dynamics, e.g. using Navier-Stokes equations or computational fluid dynamics [CFD]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/08 Protocols specially adapted for terminal emulation, e.g. Telnet
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The invention discloses a cluster-based distributed PB-level CFD simulation data management system and relates to the technical field of simulation data management. The system hardware is divided into two main parts, a client and a server side, which are connected over the Internet through routers. The client is the human-machine interface that interacts directly with the user: it sends the user's instructions to the server side and presents the results returned by the server side. With distributed storage, the remaining data stays usable even if part of the data is lost. Data generation (computation) and storage both take place inside the cluster, so the resources of the cluster machines are fully used, and splitting large data sets into small units makes effective use of idle disk space. The system lets several engineers access the data at the same time, shortens data transfer time, and makes results accessible as soon as they are computed.

Description

Distributed PB-level CFD simulation data management system based on cluster
Technical Field
The invention relates to the technical field of simulation data management, in particular to a distributed PB-level CFD simulation data management system based on a cluster.
Background
With the growth of computing power, and especially the use of supercomputers such as Tianhe and Sunway TaihuLight in the CFD field, the volume of CFD simulation result data is increasing rapidly, and a single computation producing TB-level data is becoming the norm. Such volumes exceed the storage capacity of a single computer, so the invention provides a cluster-based distributed PB-level CFD simulation result data management system. It offers an efficient method of storing and retrieving data and, at the same time, makes full use of the idle storage resources of the cluster.
In a research institute or an enterprise, engineering-level CFD simulation analysis is often run on a supercomputer or a private computing cluster to speed up the analysis, with the number of available computing cores typically ranging from several hundred to several thousand. During a computation, each node sends its results to one or a few fixed nodes, which merge the data and store it on their own hard disks. Only the storage of those few nodes is used; the storage resources of the other nodes are left underutilized.
CFD result data is usually large. On the one hand, a single file is large, typically hundreds of MB and sometimes more than 1 GB; on the other hand, a single computation produces many time-series files, up to hundreds of thousands. Research institutions and enterprises usually run CFD simulations on dedicated private clusters. A simulation engineer typically copies the results away on a single removable storage device, or downloads them over the local area network before analyzing them. The window in which results can be downloaded over the local area network is usually short, because result files may quickly be overwritten by other computations.
A CFD simulation result consists of two parts: a mesh part that represents the geometry and an attribute data part that represents the physical quantities. The mesh part is in turn divided into nodes and units (cells), the latter describing the connectivity of the nodes. Attribute data falls into three categories: scalar, vector, and tensor. In addition, CFD results are usually time series, that is, one simulation result is output at every time step.
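The structure described above can be pictured as a small data model. The following is a minimal sketch rather than the patent's actual file format: the class names, the choice of per-node attribute values, and the nine-component tensor are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class AttributeKind(Enum):
    """The three attribute categories; the value is the assumed component count."""
    SCALAR = 1   # e.g. pressure
    VECTOR = 3   # e.g. velocity
    TENSOR = 9   # e.g. a full 3x3 stress tensor


@dataclass
class Mesh:
    nodes: list[tuple[float, float, float]]  # node coordinates
    cells: list[tuple[int, ...]]             # node indices forming each cell (unit)


@dataclass
class Attribute:
    name: str
    kind: AttributeKind
    values: list[tuple[float, ...]]          # assumption: one tuple per mesh node

    def is_consistent(self, mesh: Mesh) -> bool:
        """True if there is one value per node with the right component count."""
        return len(self.values) == len(mesh.nodes) and all(
            len(v) == self.kind.value for v in self.values
        )


@dataclass
class TimeStepResult:
    """One output of the time series: a mesh plus its attributes at one step."""
    case: str
    step: int
    mesh: Mesh
    attributes: list[Attribute] = field(default_factory=list)
```

A result for one time step is then a `TimeStepResult` holding one `Mesh` and any number of `Attribute` objects; a full computation is a sequence of such results indexed by `step`.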
Disclosure of Invention
To solve the problems in the prior art, the invention provides a cluster-based distributed PB-level CFD simulation data management system with the following technical scheme:
a distributed PB-level CFD simulation data management system based on a cluster comprises a client 1, a client 2, a router 1, a router 2, a server, a central switch, a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 are connected with a router 1, the router 1 is connected with the router 2 through the Internet, the router 2 is connected with a server, the router 2 is connected with a central switch, and the central switch is connected with a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 comprise a TCP/IP service module, a data management module, a control center, a service function module and a user interaction interface;
the TCP/IP service module is connected with the data management module and the control center, the data management module is connected with the control center, the control center is connected with the service function module, and the control center and the service function module are connected with the user interaction interface.
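The hardware connections listed above form a small graph. As an illustration only, the sketch below encodes those links and finds the route a request would take from a client to a storage node; the breadth-first search and the function names are assumptions added here, not part of the patent.

```python
from collections import deque

# Links exactly as enumerated in the technical scheme.
LINKS = [
    ("client 1", "router 1"),
    ("client 2", "router 1"),
    ("router 1", "router 2"),          # connected through the Internet
    ("router 2", "server"),
    ("router 2", "central switch"),
    ("central switch", "storage node 1"),
    ("central switch", "storage node 2"),
    ("central switch", "storage node 3"),
    ("central switch", "storage node 4"),
]


def build_graph(links):
    """Build an undirected adjacency map from a list of (a, b) links."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of devices from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph[path[-1]]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no route between the two devices
```

For example, `shortest_path(build_graph(LINKS), "client 1", "storage node 3")` walks through router 1, router 2, and the central switch, which shows that storage nodes are reachable from clients only via the server-side router.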
Preferably, a management system is deployed on the server to manage and maintain the whole system cluster; the server responds to instructions from the client, merges the files an instruction requires, and transmits the merged file to the specified client;
the storage node is the final storage end of the data, that is, all the data will be stored in the hard disks of different storage nodes in a distributed manner.
Preferably, the management system stores the result file in a distributed storage mode, splits the result file into smaller data units based on the data characteristics of the CFD simulation result file, and then performs distributed storage.
Preferably, the result file is split in a grid-attribute manner, and a mapping relation database is established so that the split parts can be retrieved and merged again. A single CFD result file contains grid (mesh) data and physical attribute data. The grid data is split into a grid node sequence and unit (cell) topology data, which are stored on different storage nodes, and their storage paths are recorded in the mapping relation database; or a physical attribute list is established, the physical attribute data is split and stored by scalar, vector, and tensor, and the storage paths are recorded in the mapping relation database.
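The grid-attribute split can be sketched as follows. This is a hedged illustration: the part names (`mesh_nodes`, `scalar:pressure`), the path layout, and the round-robin placement are assumptions, since the patent only requires that the parts land on different storage nodes and that their paths be recorded.

```python
from itertools import cycle

STORAGE_NODES = ["storage node 1", "storage node 2", "storage node 3", "storage node 4"]


def split_result(result):
    """Split one result into named parts.

    `result` is a dict with keys 'nodes', 'cells', and 'attributes', where
    'attributes' maps a name to a (kind, values) pair and kind is one of
    'scalar', 'vector', or 'tensor'.
    """
    parts = [("mesh_nodes", result["nodes"]), ("mesh_cells", result["cells"])]
    for name, (kind, values) in sorted(result["attributes"].items()):
        parts.append((f"{kind}:{name}", values))
    return parts


def place_parts(case, step, parts, nodes=STORAGE_NODES):
    """Assign each part to a storage node (round-robin, an assumed policy)
    and return the rows that would go into the mapping relation database."""
    rows = []
    targets = cycle(nodes)
    for part_name, _payload in parts:
        file_name = part_name.replace(":", "_") + ".dat"
        rows.append({
            "case": case,
            "step": step,
            "part": part_name,
            "node": next(targets),
            "path": f"/{case}/{step:06d}/{file_name}",  # hypothetical layout
        })
    return rows
```

With four storage nodes and at least two parts, the grid node sequence and the unit topology data always end up on different nodes, which matches the requirement stated above.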
Preferably, after splitting, a single file becomes several small data files stored on different storage nodes; case-level data must therefore be maintained and managed again so that results can be looked up forward by case and by time step.
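One way to picture the case-level bookkeeping is an index keyed first by case and then by time step. The sketch below is an in-memory stand-in with hypothetical names; the patent keeps this information in a database table whose layout is not reproduced in the text.

```python
from bisect import insort
from collections import defaultdict


class CaseIndex:
    """Forward lookup: case -> ordered time steps -> stored parts."""

    def __init__(self):
        self._steps = defaultdict(list)   # case -> sorted list of time steps
        self._parts = defaultdict(list)   # (case, step) -> [(node, path), ...]

    def register(self, case, step, node, path):
        """Record that one part of (case, step) lives at `path` on `node`."""
        if step not in self._steps[case]:
            insort(self._steps[case], step)   # keep time steps sorted
        self._parts[(case, step)].append((node, path))

    def steps(self, case):
        """All time steps known for a case, in ascending order."""
        return list(self._steps.get(case, []))

    def locate(self, case, step):
        """All (node, path) pairs holding parts of one time step."""
        return list(self._parts.get((case, step), []))
```

Because the time steps are kept sorted on insertion, a caller can iterate through a case in time order even when result files were registered out of order.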
Preferably, the TCP/IP service module connects to the server side and transmits instructions through network port mapping, and downloads files sent by the server side to the local machine over FTP;
the data management module manages and parses the downloaded files;
the control center is the core of the whole client software, realizes the processing and forwarding of user instructions, and organizes and coordinates data to realize the functional instructions of the software;
the business function module realizes the core business function and mainly comprises the functions of data processing and data visualization;
the user interaction interface is a software window directly operated by a user and is the foremost end of human-computer interaction.
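The patent does not specify how an instruction is encoded when the TCP/IP service module sends it to the server side. Purely as an illustration, the sketch below frames each instruction as length-prefixed JSON; the message fields `command` and `params` and the 4-byte big-endian length header are assumptions.

```python
import json
import struct


def encode_instruction(command, **params):
    """Serialize an instruction as a 4-byte big-endian length plus a JSON body."""
    body = json.dumps({"command": command, "params": params}, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(body)) + body


def decode_instruction(frame):
    """Inverse of encode_instruction; raises ValueError on a truncated frame."""
    if len(frame) < 4:
        raise ValueError("frame shorter than the length header")
    (length,) = struct.unpack(">I", frame[:4])
    body = frame[4:4 + length]
    if len(body) != length:
        raise ValueError("incomplete frame")
    message = json.loads(body.decode("utf-8"))
    return message["command"], message["params"]
```

A length prefix lets the receiver know where one instruction ends on a TCP stream, which has no message boundaries of its own. The control center would build such a frame from a user action and hand it to the TCP/IP service module for sending.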
Preferably, the server side mainly stores and reads data. Its TCP/IP service connects to the client and to the storage nodes through a network port to transmit instructions and files; the file storage module decomposes the result file, stores the decomposed files on the storage nodes, and records the storage paths in a database;
the file reading module collects the files stored on different nodes, merges them into a complete file as the instruction requires, and transmits the complete file to the client;
the database management module maintains the database in a unified way, uses a MySQL database, and encapsulates the basic insert, delete, update, and query operations.
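The encapsulated insert, delete, update, and query operations might look like the wrapper below. Note the substitution: the patent uses MySQL, whereas this sketch uses Python's built-in `sqlite3` so that it runs without a database server. The table name, column names, and method names are hypothetical.

```python
import sqlite3


class MappingDatabase:
    """Thin wrapper around the mapping relation table (SQLite stand-in for MySQL)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS mapping ("
            " case_name TEXT, step INTEGER, part TEXT, node TEXT, path TEXT,"
            " PRIMARY KEY (case_name, step, part))"
        )

    def add(self, case, step, part, node, path):
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO mapping VALUES (?, ?, ?, ?, ?)",
                (case, step, part, node, path),
            )

    def update_location(self, case, step, part, node, path):
        with self.conn:
            self.conn.execute(
                "UPDATE mapping SET node = ?, path = ?"
                " WHERE case_name = ? AND step = ? AND part = ?",
                (node, path, case, step, part),
            )

    def delete(self, case, step):
        """Remove all parts of one time step; returns the number of rows removed."""
        with self.conn:
            cursor = self.conn.execute(
                "DELETE FROM mapping WHERE case_name = ? AND step = ?", (case, step)
            )
        return cursor.rowcount

    def query(self, case, step):
        """Return (part, node, path) rows for one time step, ordered by part name."""
        cursor = self.conn.execute(
            "SELECT part, node, path FROM mapping"
            " WHERE case_name = ? AND step = ? ORDER BY part",
            (case, step),
        )
        return cursor.fetchall()
```

Keeping all SQL inside one class means the file storage and file reading modules only deal with cases, steps, and paths, and the backing database could be swapped without touching them.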
Preferably, the storage node manages the upload and download of its own local files. Its TCP/IP service connects to the server-side software through a network port to transmit instructions and files; file storage saves the files sent from the server side locally; resource maintenance manages the local storage resources, adding and deleting files and querying information such as the number of files; and file uploading sends locally stored files to the server side.
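The storage-node duties of local storage and resource maintenance can be sketched with the standard library alone. The class and method names are hypothetical, and the path check that keeps files inside the storage root is an added safeguard, not something the patent describes.

```python
import tempfile
from pathlib import Path


class LocalStore:
    """Local file storage and resource maintenance for one storage node."""

    def __init__(self, root):
        root_path = Path(root)
        root_path.mkdir(parents=True, exist_ok=True)
        self.root = root_path.resolve()

    def _resolve(self, relative_path):
        target = (self.root / relative_path).resolve()
        if self.root not in target.parents:
            raise ValueError("path escapes the storage root")
        return target

    def store(self, relative_path, payload):
        """Save bytes received from the server side under the storage root."""
        target = self._resolve(relative_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(payload)
        return target

    def read(self, relative_path):
        """Return the bytes of a stored file, e.g. for uploading to the server side."""
        return self._resolve(relative_path).read_bytes()

    def delete(self, relative_path):
        self._resolve(relative_path).unlink()

    def file_count(self):
        """Number of files currently held by this node."""
        return sum(1 for p in self.root.rglob("*") if p.is_file())


def demo():
    """Store one file, count, delete it, and count again."""
    with tempfile.TemporaryDirectory() as tmp:
        store = LocalStore(tmp)
        store.store("hull_A/000010/mesh_nodes.dat", b"xyz")
        before = store.file_count()
        store.delete("hull_A/000010/mesh_nodes.dat")
        return before, store.file_count()
```

Calling `demo()` exercises the add, query, and delete operations in a temporary directory, so nothing is left on disk afterwards.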
The invention has the following beneficial effects:
the system hardware of the invention is mainly divided into two parts, namely a client and a server, which are connected with the Internet through a router.
The client is the human-machine interface that interacts directly with the user; it sends the user's instructions to the server side and presents the results returned by the server side to the user.
The server side is the core of the whole system. In hardware terms, it comprises a router, a server, and several storage nodes connected through a switch. The management system is deployed on the server to manage and maintain the whole system cluster, and the storage nodes are the final storage end of the data, that is, all data is stored in a distributed manner on the hard disks of different storage nodes. Functionally, the server responds to instructions from the client, merges the files an instruction requires, and transmits the merged file to the specified client. With distributed storage the data is safer: even if part of the data is lost, the rest remains usable. Data generation (computation) and storage both take place inside the cluster, so the resources of the cluster machines are fully used, and splitting large data sets into small units makes effective use of idle disk space. The system lets several engineers access the data at the same time, shortens data transfer time, and makes results accessible as soon as they are computed.
Drawings
FIG. 1 is a block diagram of a cluster-based distributed PB-level CFD simulation data management system;
FIG. 2 is a diagram of a client architecture;
FIG. 3 is a diagram of a server side architecture;
FIG. 4 is a storage node side architecture diagram;
FIG. 5 is a file storage flow diagram;
FIG. 6 is a flowchart of file query.
Detailed Description
The present invention will be described in detail with reference to specific examples.
The first embodiment is as follows:
as shown in fig. 1 to fig. 6, the present invention provides a cluster-based distributed PB-level CFD simulation data management system, which specifically includes:
a distributed PB-level CFD simulation data management system based on a cluster comprises a client 1, a client 2, a router 1, a router 2, a server, a central switch, a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 are connected with a router 1, the router 1 is connected with the router 2 through the Internet, the router 2 is connected with a server, the router 2 is connected with a central switch, and the central switch is connected with a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 comprise a TCP/IP service module, a data management module, a control center, a service function module and a user interaction interface;
the TCP/IP service module is connected with the data management module and the control center, the data management module is connected with the control center, the control center is connected with the service function module, and the control center and the service function module are connected with the user interaction interface.
A management system is deployed on the server to manage and maintain the whole system cluster; the server responds to instructions from the client, merges the files an instruction requires, and transmits the merged file to the specified client;
the storage node is the final storage end of the data, that is, all the data will be stored in the hard disks of different storage nodes in a distributed manner.
The management system stores the result file in a distributed storage mode, splits the result file into smaller data units based on the data characteristics of the CFD simulation result file, and then performs distributed storage.
The result file is split in a grid-attribute manner, and a mapping relation database is established so that the split parts can be retrieved and merged again; the splitting scheme is shown in Table 1 below. As described above, a single CFD result file contains grid (mesh) data and physical attribute data. The grid data is split into a grid node sequence and unit (cell) topology data, which are stored on different storage nodes, and their storage paths are recorded in the mapping relation database; or a physical attribute list is established, the physical attribute data is split and stored by scalar, vector, and tensor, and the storage paths are recorded in the mapping relation database.
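TABLE 1
[Table 1 is available only as images in the source (Figure BDA0002696605320000041 and Figure BDA0002696605320000042); its contents are not recoverable from the text.]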
After splitting, a single file becomes several small data files stored on different storage nodes; case-level data must therefore be maintained and managed again in the form of Table 2, so that results can be looked up forward by case and by time step.
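TABLE 2
[Table 2 is available only as an image in the source (Figure BDA0002696605320000051); its contents are not recoverable from the text.]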
The TCP/IP service module connects to the server side and transmits instructions through network port mapping, and downloads files sent by the server side to the local machine over FTP;
the data management module manages and parses the downloaded files;
the control center is the core of the whole client software, realizes the processing and forwarding of user instructions, and organizes and coordinates data to realize the functional instructions of the software;
the business function module realizes the core business function and mainly comprises the functions of data processing and data visualization;
the user interaction interface is a software window directly operated by a user and is the foremost end of human-computer interaction.
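The FTP download performed by the client's TCP/IP service module could be written with Python's standard `ftplib`. This is a sketch under assumptions: the host, credentials, remote file name, and download directory are placeholders supplied by the caller, and the patent does not describe the server's directory layout.

```python
from ftplib import FTP
from pathlib import Path


def local_target(download_dir, remote_name):
    """Local path for a remote file: same base name inside download_dir."""
    return Path(download_dir) / Path(remote_name).name


def download_result(host, user, password, remote_name, download_dir, port=21):
    """Fetch one file from the server side over FTP and return its local path."""
    target = local_target(download_dir, remote_name)
    target.parent.mkdir(parents=True, exist_ok=True)
    with FTP() as ftp:                       # closes the connection on exit
        ftp.connect(host, port)
        ftp.login(user, password)
        with open(target, "wb") as out:
            # retrbinary streams the file in blocks to the callback
            ftp.retrbinary(f"RETR {remote_name}", out.write)
    return target
```

The returned path is what the data management module would then manage and parse. The function needs a reachable FTP server, so only `local_target` can be exercised offline.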
The server side mainly stores and reads data. Its TCP/IP service connects to the client and to the storage nodes through a network port to transmit instructions and files; the file storage module decomposes the result file, stores the decomposed files on the storage nodes, and records the storage paths in a database;
the file reading module collects the files stored on different nodes, merges them into a complete file as the instruction requires, and transmits the complete file to the client;
the database management module maintains the database in a unified way, uses a MySQL database, and encapsulates the basic insert, delete, update, and query operations.
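The merge step of the file reading module can be sketched as below. The row format `(part, node, path)`, the part names (`mesh_nodes`, `mesh_cells`, `kind:name`), and the `fetch` callback that retrieves a payload from a storage node are all assumptions for illustration; the patent does not define them.

```python
def merge_parts(mapping_rows, fetch):
    """Rebuild one complete result from its distributed parts.

    mapping_rows: iterable of (part, node, path) tuples read from the
                  mapping relation database.
    fetch:        callable (node, path) -> payload, standing in for the
                  transfer from a storage node.
    """
    result = {"attributes": {}}
    for part, node, path in mapping_rows:
        payload = fetch(node, path)
        if part == "mesh_nodes":
            result["nodes"] = payload
        elif part == "mesh_cells":
            result["cells"] = payload
        else:
            # Attribute parts are assumed to be named "<kind>:<name>".
            kind, name = part.split(":", 1)
            result["attributes"][name] = (kind, payload)
    missing = {"nodes", "cells"} - result.keys()
    if missing:
        raise KeyError(f"missing mesh parts: {sorted(missing)}")
    return result
```

Raising an error when a mesh part is missing reflects that attribute data cannot be interpreted without the grid it is defined on, whereas a missing attribute simply yields a result with fewer physical quantities.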
The storage node manages the upload and download of its own local files. Its TCP/IP service connects to the server-side software through a network port to transmit instructions and files; file storage saves the files sent from the server side locally; resource maintenance manages the local storage resources, adding and deleting files and querying information such as the number of files; and file uploading sends locally stored files to the server side.
The above is only a preferred embodiment of the cluster-based distributed PB-level CFD simulation data management system. The protection scope of the system is not limited to this embodiment; all technical solutions that follow the same idea fall within the protection scope of the invention. It should be noted that modifications and variations made by those skilled in the art without departing from the principle of the invention are also intended to fall within its protection scope.

Claims (6)

1. A distributed PB-level CFD simulation data management system based on clusters is characterized in that: the management system comprises a client 1, a client 2, a router 1, a router 2, a server, a central switch, a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 are connected with a router 1, the router 1 is connected with the router 2 through the Internet, the router 2 is connected with a server, the router 2 is connected with a central switch, and the central switch is connected with a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 comprise a TCP/IP service module, a data management module, a control center, a service function module and a user interaction interface;
the TCP/IP service module is connected with a data management module and a control center, the data management module is connected with the control center, the control center is connected with a service function module, and the control center and the service function module are connected with a user interaction interface;
the management system stores the result file in a distributed storage mode, splits the result file into smaller data units based on the data characteristics of the CFD simulation result file, and then performs distributed storage;
splitting the result file in a grid-attribute mode, and simultaneously establishing a mapping relation database to ensure that the split result is retrieved and merged again; for a single CFD result file, the single CFD result file comprises grid data and physical attribute data, the grid data is divided into grid node sequences and unit topology data, the grid node sequences and the unit topology data are respectively stored in different storage nodes, and storage paths are stored in a mapping relation database; or establishing a physical attribute list, splitting and storing the physical attribute data according to a scalar, a vector and a tensor, and storing the storage path into a mapping relation database.
2. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: the server is provided with a management system to realize management and maintenance of the whole system cluster, and the server needs to complete instruction response to the client and transmit files required by the instruction to the specified client after combining;
the storage node is the final storage end of the data, that is, all the data will be stored in the hard disks of different storage nodes in a distributed manner.
3. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: after splitting, a single file becomes several small data files stored on different storage nodes, and case-level data is maintained and managed again so that results can be looked up forward by case and by time step.
4. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: the TCP/IP service module realizes the connection and instruction transmission with the server end through the mapping of the network port and realizes the downloading of the file transmitted by the server end to the local through the FTP protocol;
the data management module manages and parses the downloaded files;
the control center is the core of the whole client software, realizes the processing and forwarding of user instructions, and organizes and coordinates data to realize the functional instructions of the software;
the business function module realizes the core business function and mainly comprises the functions of data processing and data visualization;
the user interaction interface is a software window directly operated by a user and is the foremost end of human-computer interaction.
5. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: the server side stores and reads data and comprises a TCP/IP service module, a file storage module, a file reading module and a database management module, wherein the TCP/IP service module connects to the client and to the storage nodes through a network port to transmit instructions and files; the file storage module decomposes the result file, stores the decomposed files on the storage nodes, and records the storage paths in a database;
the file reading module collects and combines files stored in different nodes into a complete file according to the requirement of the instruction, and transmits the complete file to the client;
the database management module maintains the database in a unified way, uses a MySQL database, and encapsulates the basic insert, delete, update, and query operations.
6. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: the storage node manages the upload and download of its own local files and comprises a TCP/IP service module, a file storage module, a resource maintenance module and a file uploading module, wherein the TCP/IP service module connects to the server-side software through a network port to transmit instructions and files, the file storage module saves the files sent from the server side locally, the resource maintenance module manages the local storage resources, adds and deletes files, and queries the number of files, and the file uploading module sends locally stored files to the server side.
CN202011007979.7A 2020-09-23 2020-09-23 Distributed PB-level CFD simulation data management system based on cluster Active CN112235356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011007979.7A CN112235356B (en) 2020-09-23 2020-09-23 Distributed PB-level CFD simulation data management system based on cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011007979.7A CN112235356B (en) 2020-09-23 2020-09-23 Distributed PB-level CFD simulation data management system based on cluster

Publications (2)

Publication Number Publication Date
CN112235356A CN112235356A (en) 2021-01-15
CN112235356B true CN112235356B (en) 2021-09-07

Family

ID=74108611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011007979.7A Active CN112235356B (en) 2020-09-23 2020-09-23 Distributed PB-level CFD simulation data management system based on cluster

Country Status (1)

Country Link
CN (1) CN112235356B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant