CN111221541A - Cluster parallel program deployment method and device - Google Patents

Publication number
CN111221541A
Authority
CN
China
Prior art keywords
parallel program
mirror image
library
program
container
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911368791.2A
Other languages
Chinese (zh)
Inventor
解西国
韩孟之
翟健
孙建鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN201911368791.2A
Publication of CN111221541A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/61 Installation
    • G06F8/63 Image based installation; Cloning; Build to order

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a cluster parallel program deployment method and device. The method comprises the following steps: sequentially building images for a plurality of program libraries of the running environment of a parallel program, to generate an image of the running environment; building an image for the parallel program with the image of the running environment as the base image, to generate an image of the parallel program; and deploying the generated image of the parallel program to the required nodes. This technical solution improves the efficiency of HPC application distribution and deployment, solves the problems encountered with the conventional deployment approach, and simplifies the deployment process.

Description

Cluster parallel program deployment method and device
Technical Field
The present invention relates to the field of high performance computing technologies, and in particular, to a cluster parallel program deployment method and apparatus.
Background
In High Performance Computing (HPC), MPI (Message Passing Interface) is a common programming interface. An HPC job runs a large number of processes in parallel, and these processes exchange messages and synchronize data through MPI. Underneath, MPI can use shared memory or an Infiniband high-speed network to achieve efficient intra-node and inter-node communication.
Conventional HPC parallel programs are typically distributed as source code packages. To deploy such a program, a suitable compiler, math library, and MPI library must first be installed on the high-performance computing cluster; build scripts then perform build-environment detection, processor optimization, and low-level communication optimization so that the program can make efficient use of the cluster's computing capacity. Installing this base environment and choosing compilation optimization parameters pose a growing challenge for cluster users.
In recent years, container technology, represented by Docker, has become increasingly widely used. A container packages an application together with its running environment (dynamic libraries, configuration files, and so on), and once packaged it can run on various Linux operating system versions. Container technology is a lightweight virtualization technology built on the cgroups and namespace mechanisms of the Linux kernel. Unlike conventional virtualization, a container runs directly on the host kernel and avoids the performance loss of an intermediate virtualization layer, which makes it more efficient and well suited to packaging performance-sensitive HPC parallel programs. A container can start in seconds, whereas a virtual machine typically needs several minutes. In addition, container image files are usually a few hundred MB, much smaller than virtual machine images (generally more than a few GB), and are therefore better suited to application distribution.
Docker, as the representative container technology, has been widely adopted in cloud computing and related fields in recent years, and it is also used to package some HPC applications. However, several of its technical characteristics, including its container start-up mode, image management, and network isolation, make Docker containers poorly compatible with the conventional HPC software stack, so Docker is not an ideal method for containerized HPC deployment.
Specifically, Docker uses the cgroups and namespaces provided by the Linux kernel to limit and isolate resources, giving the applications in a container a running environment (runtime libraries, network stack, and so on) that is independent of the host. For resource isolation, Docker applies several namespaces, such as UTS, IPC, PID, and NET. As a result, HPC applications packaged in Docker containers run into problems at run time and are poorly compatible with the conventional HPC software stack. The main problems are as follows:
Docker uses technologies such as cgroups and namespaces to isolate resources, whereas HPC applications do the opposite: they aggregate computing resources through MPI and similar means to achieve massively parallel computation. Excessive resource isolation therefore makes it difficult to run HPC applications under Docker;
Docker is complex to configure for communication across physical machines and is poorly suited to the parallel communication of HPC applications. Because Docker enables network isolation by default, that is, each container has a network stack independent of the host, cross-host container communication requires configuring virtual network interfaces (veth) and iptables rules, or communicating through VXLAN or a similar mechanism. HPC applications often need large-scale cross-node parallel computing, so cross-node container communication has to be configured before the application can run, which makes the application harder to run;
The way Docker containers are started is poorly compatible with the parallel launch mode of conventional HPC applications. Conventional HPC applications usually run in parallel through MPI: at run time, the process manager provided by MPI (typically mpirun) starts the application processes on the corresponding nodes. Docker containers, by contrast, are started and managed with the docker run command;
The Docker image storage model does not fit HPC clusters. Docker stores images on the local disk (generally under the /var directory) and manages them in layers. In an HPC cluster, applications are generally kept on a shared storage system that every node mounts, so that all nodes have consistent access. Storing images on local disks first of all consumes a relatively large amount of disk space. In addition, when a node's local disk does not hold the required image, the image must be downloaded from the image registry node. During large-scale parallel computing, a large number of compute nodes download the image from the registry node at the same time; this slows application start-up, and the heavy load often crashes the registry node, causing the application to fail to start;
Conventional HPC parallel programs are usually distributed and deployed as source code packages. Installing and deploying such an application requires installing the corresponding compiler, library files, and so on on the deployment machine to prepare the application's running environment. This usually takes a relatively long time and requires the installer to be familiar with how the related software is installed and deployed, which is a demanding task.
Disclosure of Invention
In view of the above problems in the related art, the present invention provides a cluster parallel program deployment method and device that solve the problems encountered with the conventional deployment approach and simplify the deployment process of cluster parallel programs.
The technical solution of the invention is realized as follows:
According to one aspect of the present invention, there is provided a cluster parallel program deployment method, including:
sequentially building images for a plurality of program libraries of the running environment of the parallel program, to generate an image of the running environment;
building an image for the parallel program with the image of the running environment as the base image, to generate an image of the parallel program;
and deploying the generated image of the parallel program to the required nodes.
According to an embodiment of the invention, the plurality of program libraries includes a system base library, an Infiniband driver-layer library, a compiler library, an MPI parallel library, and a math library, all located in the application software stack; in the application software stack, the parallel program sits in the layer above the plurality of program libraries.
According to an embodiment of the invention, sequentially building images for the plurality of program libraries of the running environment of the parallel program includes: installing the Singularity container software; and using Singularity to build images for the plurality of program libraries in sequence.
According to an embodiment of the invention, building images for the plurality of program libraries includes building an image of the system base library, which includes: reading a Recipe file with the Singularity build command and executing the commands in the corresponding fields of the Recipe file to build the image of the system base library, wherein the Recipe file contains the image build procedure for the system base library.
According to an embodiment of the invention, building images for the plurality of program libraries includes building an image of the compiler library, where the compiler may comprise several compilers, which includes: copying the packed compiler files to a directory in the Singularity container using the %files field of the Recipe file; installing and configuring the compiler using the %post field of the Recipe file; and configuring the compiler-related environment variables in the Singularity container using the %environment field of the Recipe file.
According to an embodiment of the invention, building images for the plurality of program libraries includes building an image of the MPI parallel library, which includes: copying the MPI source code package to a directory in the Singularity container using the %files field of the Recipe file; installing the MPI source code package using the %post field of the Recipe file; and configuring the MPI-related environment variables using the %environment field of the Recipe file.
According to an embodiment of the invention, building the image of the parallel program with the image of the running environment as the base image includes: specifying the image of the running environment as the base image in the Recipe file; copying the source code package and build configuration file of the parallel program; and compiling and installing the parallel program using the %post field of the Recipe file.
According to another aspect of the present invention, there is provided a cluster parallel program deployment device, including:
an environment image construction module for sequentially building images for a plurality of program libraries of the running environment of the parallel program, to generate an image of the running environment;
an application image construction module for building an image of the parallel program with the image of the running environment as the base image, to generate an image of the parallel program;
and a deployment module for deploying the generated image of the parallel program to the required nodes.
According to an embodiment of the invention, the plurality of program libraries includes a system base library, an Infiniband driver-layer library, a compiler library, an MPI parallel library, and a math library, all located in the application software stack; in the application software stack, the parallel program sits in the layer above the plurality of program libraries.
According to an embodiment of the invention, the environment image construction module includes: a container installation submodule for installing the Singularity container software; and an image construction submodule for using Singularity to build images for the plurality of program libraries in sequence.
With this technical solution, the HPC parallel program and the environment it depends on are packaged into an image by the containerized deployment method. The program can run once the image has been copied to the target machine, with no need to install the related running environment on the deployment machine or to carry out a compilation process. This speeds up deployment and lowers its difficulty: no complex build has to be run, and no compile or link errors have to be handled. The HPC parallel program image is made by HPC professionals, who compile and optimize the program during image creation, which improves its running speed. Containerized deployment does not slow the parallel program down, so the approach is suitable for HPC applications.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a cluster parallel program deployment method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of the HPC software stack according to an embodiment of the invention;
FIG. 3 is a schematic diagram of the image formats supported by Singularity container technology according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
As shown in FIG. 1, the cluster parallel program deployment method according to an embodiment of the present invention includes the following steps:
S12, sequentially building images for a plurality of program libraries of the running environment of the parallel program, to generate an image of the running environment;
S14, building an image of the parallel program with the image of the running environment as the base image, to generate an image of the parallel program;
S16, deploying the generated image of the parallel program to the required nodes.
The above technical solution of the invention provides an HPC cluster parallel program deployment method which, through container technology, lets an HPC application be installed once and run everywhere, improving the efficiency of HPC application distribution and deployment. In addition, container technology allows the workflows associated with an HPC application (such as preprocessing and post-processing) to be packaged into the container together with it, which makes it easier for multiple applications to work together, solves the problems of the conventional deployment approach, and simplifies the deployment process. The containerized deployment method packages the HPC parallel program and the environment it depends on into an image. The program can run once the image has been copied to the target machine, with no need to install the related running environment on the deployment machine or to carry out a compilation process. This speeds up deployment and lowers its difficulty: no complex build has to be run, and no compile or link errors have to be handled. The HPC parallel program image is made by HPC professionals, who compile and optimize the program during image creation, which improves its running speed. Containerized deployment does not slow the parallel program down, so the approach is suitable for HPC applications.
In some embodiments, the cluster parallel program deployment method of the invention is implemented with Singularity container technology. This avoids the problems that Docker containers have on HPC systems and achieves containerized deployment of HPC parallel programs.
According to an embodiment of the present invention, at step S12, the plurality of program libraries may include a system base library, an Infiniband driver-layer library, a compiler library, an MPI parallel library, and a math library, all located in the application software stack; in the application software stack, the parallel program sits in the layer above the plurality of program libraries.
FIG. 2 shows a typical HPC application software stack. An HPC application depends on several libraries at run time, mainly math libraries (BLAS, LAPACK, FFTW, etc.), MPI parallel libraries (implementations such as Intel MPI and OpenMPI), and compiler runtime libraries (Intel compiler, GCC, etc.). In addition, because the MPI library communicates over the Infiniband network, it also needs the user-space libraries of the Infiniband network. Finally, all programs and libraries typically rely on the system base libraries (e.g., glibc). Because a Singularity container shares the operating system kernel with the host, once the physical host has loaded a device driver (kernel module), applications in the container can use the device directly without loading the kernel module again inside the container. Given these properties, all programs and libraries that run above the kernel (the parts above the dotted line in the figure) need to be packaged into the Singularity container image. Since the HPC software stack has the layered structure shown in the figure, the images can be built layer by layer, from the bottom of the stack upward, until the packaging is complete.
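The bottom-up build order implied by this layered stack can be sketched as a small dry-run script. This is only an illustration: the layer names follow the Recipe file names used later in this description, while the `RUN` switch, the `IMAGE_DIR` variable, and the `build_layers` helper are assumptions introduced here.

```shell
#!/bin/sh
# Dry-run sketch: print the build command for each layer, bottom to top.
# RUN, IMAGE_DIR and build_layers are illustrative names, not part of Singularity.
RUN="${RUN-echo}"    # invoke with RUN= (empty) to execute the commands for real
IMAGE_DIR="${IMAGE_DIR:-/public/singularity/images}"

build_layers() {
    # Each layer's Recipe file is expected to name the previous image as its base.
    for layer in ib_mlnx intel openmpi hpl; do
        $RUN singularity build "${IMAGE_DIR}/${layer}.img" "${layer}.def" || return 1
    done
}

build_layers
```

Because every layer starts from the image below it, a failure in a lower layer stops the loop before any dependent image is built.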
In some embodiments, at step S12, the Singularity container software is installed, and Singularity is then used to build images for the plurality of program libraries in sequence.
According to an embodiment of the present invention, building images for the plurality of program libraries at step S12 includes building an image of the system base library, which includes: reading a Recipe file with the Singularity build command and executing the commands in the corresponding fields of the Recipe file to build the image of the system base library, wherein the Recipe file contains the image build procedure for the system base library.
According to an embodiment of the present invention, building images for the plurality of program libraries at step S12 includes building an image of the compiler library, where the compiler may comprise several compilers, which includes: copying the packed compiler files to a directory in the Singularity container using the %files field of the Recipe file; installing and configuring the compiler using the %post field of the Recipe file; and configuring the compiler-related environment variables in the Singularity container using the %environment field of the Recipe file.
According to an embodiment of the present invention, building images for the plurality of program libraries at step S12 includes building an image of the MPI parallel library, which includes: copying the MPI source code package to a directory in the Singularity container using the %files field of the Recipe file; installing the MPI source code package using the %post field of the Recipe file; and configuring the MPI-related environment variables using the %environment field of the Recipe file.
According to an embodiment of the present invention, building the image of the parallel program with the image of the running environment as the base image at step S14 includes: specifying the image of the running environment as the base image in the Recipe file; copying the source code package and build configuration file of the parallel program; and compiling and installing the parallel program using the %post field of the Recipe file.
The following describes how to install Singularity, how Singularity containers use an Infiniband/OPA high-speed network for cross-node communication, how to install the MPI parallel library in a Singularity container, and how to apply Singularity to the containerized deployment of HPC applications. The Singularity installation method is described first, followed by the image build method for each layer.
Singularity installation
Singularity is released as a source code package, which is first downloaded from the Singularity website (https://github.com/singularityware/singularity). It is then compiled and installed with the following commands:
tar xvf singularity-2.4.2.tar.gz
cd singularity-2.4.2
./configure --prefix=/public/software/apps/singularity/2.4.2
make
make install
Alternatively, RPM packages can be generated with the rpmbuild command and then installed, as follows:
# Specify the RPM package installation path
PREFIX=/opt/singularity
rpmbuild -ta --define="_prefix $PREFIX" --define "_sysconfdir $PREFIX/etc" --define "_defaultdocdir $PREFIX/share" singularity-2.4.2.tar.gz
The generated RPM packages can be found under /root/rpmbuild/RPMS/x86_64; executing the yum install command completes the installation.
singularity-2.4.2-1.el7.x86_64.rpm
singularity-devel-2.4.2-1.el7.x86_64.rpm
singularity-debuginfo-2.4.2-1.el7.x86_64.rpm
singularity-runtime-2.4.2-1.el7.x86_64.rpm
yum install singularity*
System base image construction
As shown in FIG. 3, Singularity supports several image formats, such as ext3, sandbox, and squashfs. A sandbox is a directory containing the image's root file system, and an ext3 image is an image file holding a file system in ext3 format; both formats are editable but take up more space. Once the image has been modified, it can be rebuilt into the read-only squashfs format, which compresses the image considerably and makes it easier to use and distribute. squashfs is also the default image format from Singularity version 2.4 onward.
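The edit-then-freeze workflow described above might look like the following dry-run sketch. The `--sandbox` option and the rebuilding of a sandbox directory into an image are ordinary uses of `singularity build` in the 2.4 series; the paths, file names, the `RUN` switch, and the `freeze_image` helper are assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: develop in an editable sandbox, then freeze it to squashfs.
# All paths and names below are hypothetical.
RUN="${RUN-echo}"    # invoke with RUN= (empty) to execute the commands for real
SANDBOX="${SANDBOX:-./hpc_dev}"
FINAL="${FINAL:-./hpc_final.img}"
RECIPE="${RECIPE:-ib_mlnx.def}"

freeze_image() {
    # 1. Build an editable sandbox directory from the Recipe file.
    $RUN singularity build --sandbox "$SANDBOX" "$RECIPE" || return 1
    # 2. After editing, rebuild the sandbox into a read-only image.
    $RUN singularity build "$FINAL" "$SANDBOX"
}

freeze_image
```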
Singularity uses the build command to carry out the image build process. The build command accepts a Docker image, a local container, a local compressed archive, or a Recipe file as its source. The Recipe file is similar to the Dockerfile in Docker. It defines fields such as Bootstrap, %setup, %post, %files, %environment, and %runscript, which specify how the image is created, the scripts executed while it is created, the files to be copied, the environment variable settings, the default program run when the container starts, and so on. The build command reads the Recipe file and executes the commands specified in each field to create the image. Because the Recipe file records the entire image creation process, it is a good way to create images.
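To make the roles of these fields concrete, the following sketch writes a minimal Recipe skeleton to disk. The section names are those listed above; the section bodies, the copied file `app.conf`, the output file name, and the `write_skeleton` helper are placeholders assumed for illustration and do not come from the patent.

```shell
#!/bin/sh
# Write a minimal Recipe skeleton; all section bodies are placeholders.
#   Bootstrap/From  how the image is created and from which base
#   %setup          script run on the host during the build
#   %files          files to copy: <source on host> <destination in container>
#   %post           script run inside the container during the build
#   %environment    environment variables set when the container runs
#   %runscript      default program started with the container
write_skeleton() {
    cat > "$1" <<'EOF'
Bootstrap: docker
From: centos:7.4.1708

%setup
    echo "setup step"

%files
    app.conf /etc/app.conf

%post
    echo "post step"

%environment
    export LC_ALL=C

%runscript
    exec "$@"
EOF
}

write_skeleton skeleton.def
```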
The system base image includes the system base libraries (e.g., glibc) and the Infiniband network-related libraries. In one embodiment, the image is built on CentOS 7.4, and the Infiniband network driver is installed in it. The Recipe file used to build the image is given below.
The Bootstrap field of the Recipe file specifies that the image is based on the Docker image centos:7.4.1708, and the %files field specifies that the Infiniband driver files are copied to the /tmp directory in the container. The %post field then specifies the commands executed while the image is built: mainly configuring the yum source for the driver and then installing the user-space RPM packages of the Infiniband driver (mainly libibverbs, rdma, etc.). The commands in the %test field are executed after the image has been created. In this embodiment, ibstat is executed after image creation to check whether the Infiniband network can be used in the container.
[Figure: Recipe file ib_mlnx.def (image not reproduced in this text)]
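Since the Recipe file itself appears only as an image, the sketch below reconstructs a plausible version from the prose description alone. It is not the patent's actual file: the repo file name `mlnx_ofed.repo`, the exact package list, and the `write_ib_recipe` helper are assumptions.

```shell
#!/bin/sh
# Sketch of ib_mlnx.def reconstructed from the description only.
# The repo file name and the package list are assumptions.
write_ib_recipe() {
    cat > "$1" <<'EOF'
Bootstrap: docker
From: centos:7.4.1708

%files
    mlnx_ofed.repo /tmp/mlnx_ofed.repo

%post
    # configure the driver yum source, then install user-space packages
    cp /tmp/mlnx_ofed.repo /etc/yum.repos.d/
    yum install -y libibverbs rdma-core infiniband-diags

%test
    # check that the Infiniband network is usable inside the container
    ibstat
EOF
}

write_ib_recipe ib_mlnx.def
```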
The Recipe file is saved as ib_mlnx.def, and the following command is executed to create the image, which is saved as /public/singularity/images/ib_mlnx.img.
singularity build /public/singularity/images/ib_mlnx.img ib_mlnx.def
At this point, the system base image containing the Infiniband driver has been created.
Compiler image construction
The compiler image is then built on top of the system base image. The GNU compilers are included in common Linux distributions and can be installed with the yum command. For the Intel compiler, the compiler's binaries and libraries are first extracted and packed into a compressed archive, which is then packaged into the image. The Intel compiler includes the MKL math library, which can be packaged into the image together with the compiler.
The image is built with the following Recipe file. The Bootstrap field first specifies that the build is based on a local image, namely the previously created base image ib_mlnx.img; the %files field then specifies that the packed Intel compiler is copied to the /opt directory in the container. In the %post section, the yum command is executed to install the GNU compilers, and the Intel compiler archive is unpacked. Finally, the %environment field configures the environment variables in the container, adding the compiler's binaries, header files, and library files to system environment variables such as PATH, INCLUDE, LIBRARY_PATH, and LD_LIBRARY_PATH.
[Figure: Recipe file intel.def (image not reproduced in this text)]
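A Recipe file matching this description could look roughly like the sketch below. It is a reconstruction under assumptions, not the patent's file: the archive name `intel_compiler.tar.gz`, the `/opt/intel` directory layout, the GNU package names, and the `write_intel_recipe` helper are all hypothetical.

```shell
#!/bin/sh
# Sketch of intel.def reconstructed from the description only.
# Archive name, /opt/intel layout and package names are assumptions.
write_intel_recipe() {
    cat > "$1" <<'EOF'
Bootstrap: localimage
From: /public/singularity/images/ib_mlnx.img

%files
    intel_compiler.tar.gz /opt/intel_compiler.tar.gz

%post
    # GNU compilers from the distribution, Intel compiler from the archive
    yum install -y gcc gcc-c++ gcc-gfortran
    tar xzf /opt/intel_compiler.tar.gz -C /opt
    rm -f /opt/intel_compiler.tar.gz

%environment
    export PATH=/opt/intel/bin:$PATH
    export INCLUDE=/opt/intel/include:$INCLUDE
    export LIBRARY_PATH=/opt/intel/lib/intel64:$LIBRARY_PATH
    export LD_LIBRARY_PATH=/opt/intel/lib/intel64:$LD_LIBRARY_PATH
EOF
}

write_intel_recipe intel.def
```

The heredoc delimiter is quoted so that `$PATH` and the other variables are written literally and expanded only inside the container.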
The file is saved as intel.def, and the following command is executed to build the compiler image.
singularity build /public/singularity/images/intel.img intel.def
MPI image construction
For HPC applications, MPI is the common means of parallelization. The MPI image is built on top of the previously completed compiler image, and OpenMPI is packaged in it. Images for other MPI implementations can be created in the same way as for OpenMPI.
The Recipe file used to create the image is given below. The Bootstrap field first specifies a local image, namely the previously completed intel.img. The %files section then specifies that the OpenMPI source code package is copied to the /mnt directory in the container. Next, the %post field carries out the compilation and installation of OpenMPI. Finally, the %environment field configures the related environment variables.
[Figure: Recipe file openmpi.def (image not reproduced in this text)]
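The description can be turned into a sketch such as the following. The OpenMPI version is not stated in the text, so it is passed as a parameter with the placeholder `x.y.z`; the `/opt/openmpi` install prefix, the bare `configure` invocation (compiler selection omitted), and the `write_openmpi_recipe` helper are assumptions.

```shell
#!/bin/sh
# Sketch of openmpi.def reconstructed from the description only.
# Version, install prefix and configure options are assumptions.
write_openmpi_recipe() {
    out=$1
    ver=$2
    # Unquoted heredoc: ${ver} is expanded now, \$ keeps a literal dollar sign.
    cat > "$out" <<EOF
Bootstrap: localimage
From: /public/singularity/images/intel.img

%files
    openmpi-${ver}.tar.gz /mnt/openmpi-${ver}.tar.gz

%post
    cd /mnt
    tar xzf openmpi-${ver}.tar.gz
    cd openmpi-${ver}
    ./configure --prefix=/opt/openmpi
    make
    make install

%environment
    export PATH=/opt/openmpi/bin:\$PATH
    export LD_LIBRARY_PATH=/opt/openmpi/lib:\$LD_LIBRARY_PATH
EOF
}

write_openmpi_recipe openmpi.def "${OMPI_VERSION:-x.y.z}"
```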
The file is saved as openmpi.def, and the following command is executed to build the image.
singularity build /public/singularity/images/openmpi.img openmpi.def
HPC application image construction
After the above steps, the Infiniband communication libraries, the compilers (including the math library), the MPI library, and so on have all been packaged into the Singularity container image, which now provides the complete environment for an HPC application, so the HPC application can be installed and its image built.
In this embodiment, HPL, a typical HPC application, is used as an example to describe how an HPC application image is built. The Recipe file used to create the image is given below. The previously completed openmpi.img is used as the base image; the HPL source code package hpl-2.2.tar.gz and the build configuration file make.intel are then copied into the image, and executing the script in the %post section compiles and installs HPL.
[Figure: Recipe file hpl.def (image not reproduced in this text)]
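A sketch of such a Recipe file, reconstructed from the description, is shown below. The `/opt` destination, the `TOPdir` setting, and the `write_hpl_recipe` helper are assumptions; the configuration file is written as `Make.intel` because HPL's build looks for `Make.<arch>` in its top directory.

```shell
#!/bin/sh
# Sketch of hpl.def reconstructed from the description only.
# Destination paths and the TOPdir value are assumptions.
write_hpl_recipe() {
    cat > "$1" <<'EOF'
Bootstrap: localimage
From: /public/singularity/images/openmpi.img

%files
    hpl-2.2.tar.gz /opt/hpl-2.2.tar.gz
    Make.intel /opt/Make.intel

%post
    cd /opt
    tar xzf hpl-2.2.tar.gz
    cp Make.intel hpl-2.2/
    cd hpl-2.2
    # Make.intel is assumed to set TOPdir=/opt/hpl-2.2
    make arch=intel

%environment
    # HPL places the xhpl binary under bin/<arch>
    export PATH=/opt/hpl-2.2/bin/intel:$PATH
EOF
}

write_hpl_recipe hpl.def
```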
The file is saved as hpl.def, and the following command is executed to build the image.
singularity build /public/singularity/images/hpl.img hpl.def
This completes the creation of the HPL image. Because the image contains the entire software environment needed to run HPL, once it has been created the image file can simply be copied to other nodes to deploy HPL, with no complex compilation and installation process. Other HPC applications can be containerized and deployed in the same way as HPL.
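The copy-based deployment step could be scripted as in the dry-run sketch below. The node names, the use of `scp`, the `RUN` switch, and the `deploy_image` helper are assumptions; on a cluster where every node mounts the same shared storage, placing the image once on that storage would serve the same purpose.

```shell
#!/bin/sh
# Dry-run sketch: copy the finished image to each target node.
# Node names and the use of scp are illustrative assumptions.
RUN="${RUN-echo}"    # invoke with RUN= (empty) to execute the commands for real

deploy_image() {
    image=$1
    shift
    for node in "$@"; do
        # Same path on every node, so job scripts need no per-node changes.
        $RUN scp "$image" "${node}:${image}" || return 1
    done
}

deploy_image /public/singularity/images/hpl.img node01 node02
```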
The HPC parallel program containerized deployment method provided by the invention solves the problems encountered by the traditional deployment mode, and simplifies the deployment process and difficulty. Taking the HPL application image manufactured above as an example, the application deployment can be completed by directly copying the HPL. The HPL program in the container can be run by executing the following command.
Singularity exec/public/Singularity/images/hpl.img xhpl
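The copy-based deployment described above could be scripted roughly as follows. This is a sketch under assumptions: the host names node01 and node02 are hypothetical, scp is just one possible transfer tool, and the script defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/bash
# Sketch only: copy one image file to the same path on several nodes.
set -eu

IMAGE="${IMAGE:-/public/singularity/images/hpl.img}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

# Copy the image to every node given as an argument.
deploy_image() {
    local node
    for node in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "scp $IMAGE $node:$IMAGE"
        else
            scp "$IMAGE" "$node:$IMAGE"
        fi
    done
}

# node01 and node02 are hypothetical host names.
PLAN="$(deploy_image node01 node02)"
echo "$PLAN"
```

Setting DRY_RUN=0 would perform the actual copies. On clusters where the image directory is on a shared file system, a single copy is already visible to all nodes and this step may be unnecessary.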
For HPC clusters that use a job scheduling system, HPL compute tasks can be submitted with the following Slurm script. The job submission script requests 2 nodes with 24 processes per node, and the srun command launches the HPL processes.
#!/bin/bash
#SBATCH -J HPL
#SBATCH -N 2
#SBATCH --ntasks-per-node=24
#SBATCH -p test
srun --mpi=pmi2 singularity exec /public/singularity/images/hpl.img xhpl
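To make the relationship between the requested resources and the number of MPI ranks explicit, the job script can be generated from parameters. The following is a sketch, not part of the described method: the file name hpl.sbatch and the variable names are hypothetical, while the default values mirror the script above.

```shell
#!/bin/bash
# Sketch only: generate the Slurm job script from parameters.
set -eu

NODES="${NODES:-2}"
TASKS_PER_NODE="${TASKS_PER_NODE:-24}"
PARTITION="${PARTITION:-test}"
IMAGE="${IMAGE:-/public/singularity/images/hpl.img}"

# Total number of MPI ranks that srun will start.
TOTAL_RANKS=$((NODES * TASKS_PER_NODE))

cat > hpl.sbatch <<EOF
#!/bin/bash
#SBATCH -J HPL
#SBATCH -N ${NODES}
#SBATCH --ntasks-per-node=${TASKS_PER_NODE}
#SBATCH -p ${PARTITION}
srun --mpi=pmi2 singularity exec ${IMAGE} xhpl
EOF

echo "hpl.sbatch requests ${TOTAL_RANKS} MPI ranks"
```

With the defaults, 2 nodes times 24 tasks per node gives 48 ranks, so the process grid P x Q configured in HPL.dat should not require more than 48 processes. The generated file would be submitted with sbatch hpl.sbatch.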
For HPC parallel programs, computational efficiency is critical. Conventional virtualization techniques reduce deployment difficulty, but HPC parallel programs running in virtual machines suffer a significant performance penalty. By contrast, an application in a Singularity container runs directly on the host kernel, so the performance loss is almost negligible. The following table compares the efficiency of HPL running inside the container with that of HPL running directly on the physical host.
[Figure BDA0002339137340000141: table comparing HPL efficiency inside the container and on the physical host]
It can be seen that HPL running inside the container, performing cross-node parallel computation, shows almost no difference in efficiency from HPL running on the physical host. The cluster parallel program deployment method provided by the invention is therefore suitable as a distribution and deployment method for HPC parallel programs.
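A comparison of this kind reduces to a single ratio between the two measured HPL results. The helper below is a sketch of that arithmetic; the function name is hypothetical, and the inputs 3 and 4 are arbitrary placeholders chosen for easy arithmetic, not measurements from the comparison table.

```shell
#!/bin/bash
# Sketch only: express the in-container HPL result as a percentage of the host result.
set -eu

# Usage: relative_efficiency <container_result> <host_result>
relative_efficiency() {
    awk -v c="$1" -v h="$2" 'BEGIN { printf "%.1f\n", 100 * c / h }'
}

# Placeholder inputs, not measured values.
RATIO="$(relative_efficiency 3 4)"
echo "container reaches ${RATIO}% of host performance"
```

Feeding in the two measured HPL results, in the same unit, yields the percentage that the statement above describes as nearly indistinguishable from the physical host.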
According to an embodiment of the invention, a cluster parallel program deployment device is also provided. In some embodiments, the cluster parallel program deployment device is configured to execute the cluster parallel program deployment method described above. The cluster parallel program deployment device may include:
an environment image construction module, configured to construct images for a plurality of program libraries of the running environment of the parallel program in sequence, so as to generate an image of the running environment;
an application image construction module, configured to construct an image for the parallel program using the image of the running environment as the base image, so as to generate an image of the parallel program;
and a deployment module, configured to deploy the generated image of the parallel program to the required nodes.
In one embodiment, the plurality of program libraries include an underlying system library, an Infiniband driver layer library, a compiler library, an MPI parallel library, and a math library, all located in the application software stack; in the application software stack, the parallel program is located in the layer above the plurality of program libraries.
In one embodiment, the environment image construction module comprises: a container installation sub-module, configured to install a Singularity container; and an image construction sub-module, configured to construct images for the plurality of program libraries in sequence using the Singularity container.
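The three modules can be pictured as three steps of one driver script. The sketch below is an illustration under assumptions rather than the claimed device: the function names, the layer names base and infiniband, and the host names node01 and node02 are hypothetical, while intel, openmpi, and hpl follow the image names used in the embodiment. It defaults to a dry run that prints each command.

```shell
#!/bin/bash
# Sketch only: the three modules as shell functions, in dry-run mode by default.
set -eu

IMAGE_DIR="${IMAGE_DIR:-/public/singularity/images}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

# Run a command, or only print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Environment image construction module: build the library layers in order,
# each Recipe file using the previous image as its base.
build_environment_images() {
    local layer
    for layer in "$@"; do
        run singularity build "$IMAGE_DIR/$layer.img" "$layer.def"
    done
}

# Application image construction module: build on top of the last layer.
build_application_image() {
    run singularity build "$IMAGE_DIR/$1.img" "$1.def"
}

# Deployment module: copy the application image to the required nodes.
deploy_image() {
    local app="$1" node
    shift
    for node in "$@"; do
        run scp "$IMAGE_DIR/$app.img" "$node:$IMAGE_DIR/$app.img"
    done
}

PLAN="$(
    build_environment_images base infiniband intel openmpi
    build_application_image hpl
    deploy_image hpl node01 node02
)"
echo "$PLAN"
```

The ordering matters because each layer's Recipe file names the previous image as its base, which is why the environment images are built strictly in sequence before the application image.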
The above description presents only preferred embodiments of the present invention and is not to be construed as limiting the invention. Any modifications, equivalent substitutions, improvements, and the like made within the spirit and principle of the present invention are intended to be included within its scope of protection.

Claims (10)

1. A cluster parallel program deployment method, comprising:
sequentially constructing images for a plurality of program libraries of the running environment of a parallel program to generate an image of the running environment;
constructing an image for the parallel program, using the image of the running environment as a base image, to generate an image of the parallel program;
and deploying the generated image of the parallel program to a required node.
2. The cluster parallel program deployment method of claim 1, wherein the plurality of program libraries comprise: an underlying system library, an Infiniband driver layer library, a compiler library, an MPI parallel library, and a math library, all located in an application software stack;
wherein, in the application software stack, the parallel program is located at an upper layer of the plurality of program libraries.
3. The cluster parallel program deployment method of claim 1, wherein sequentially constructing images for the plurality of program libraries of the running environment of the parallel program comprises:
installing a Singularity container;
and using the Singularity container to sequentially construct images for the plurality of program libraries.
4. The cluster parallel program deployment method of claim 2, wherein constructing images for the plurality of program libraries comprises:
constructing an image of the underlying system library, comprising:
reading a Recipe file with the build command of the Singularity container and executing the commands in the corresponding fields of the Recipe file to construct the image of the underlying system library, wherein the Recipe file describes the image construction process for the underlying system library.
5. The cluster parallel program deployment method of claim 2, wherein constructing images for the plurality of program libraries comprises:
constructing an image of the compiler library, wherein the compiler library may include a plurality of compilers, comprising:
copying the packaged compiler files to a directory in the Singularity container using the %files field of the Recipe file;
executing the commands of the %post field of the Recipe file to install and configure the compiler;
and configuring the compiler-related environment variables within the Singularity container using the %environment field of the Recipe file.
6. The cluster parallel program deployment method of claim 2, wherein constructing images for the plurality of program libraries comprises:
constructing an image of the MPI parallel library, comprising:
copying the MPI source code package to a directory in the Singularity container using the %files field of the Recipe file;
installing the MPI source code package using the %post field of the Recipe file;
and configuring the MPI-related environment variables using the %environment field of the Recipe file.
7. The cluster parallel program deployment method of claim 1, wherein constructing an image for the parallel program using the image of the running environment as a base image comprises:
designating the image of the running environment as the base image in a Recipe file;
copying the source code package and the compilation configuration file of the parallel program;
and compiling and installing the parallel program using the %post field of the Recipe file.
8. A cluster parallel program deployment device, comprising:
an environment image construction module, configured to sequentially construct images for a plurality of program libraries of the running environment of a parallel program so as to generate an image of the running environment;
an application image construction module, configured to construct an image for the parallel program, using the image of the running environment as a base image, so as to generate an image of the parallel program;
and a deployment module, configured to deploy the generated image of the parallel program to a required node.
9. The cluster parallel program deployment device of claim 8, wherein the plurality of program libraries comprise: an underlying system library, an Infiniband driver layer library, a compiler library, an MPI parallel library, and a math library, all located in an application software stack;
wherein, in the application software stack, the parallel program is located at an upper layer of the plurality of program libraries.
10. The cluster parallel program deployment device of claim 9, wherein the environment image construction module comprises:
a container installation sub-module, configured to install a Singularity container;
and an image construction sub-module, configured to sequentially construct images for the plurality of program libraries using the Singularity container.
CN201911368791.2A 2019-12-26 2019-12-26 Cluster parallel program deployment method and device Withdrawn CN111221541A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911368791.2A CN111221541A (en) 2019-12-26 2019-12-26 Cluster parallel program deployment method and device

Publications (1)

Publication Number Publication Date
CN111221541A (en) 2020-06-02

Family

ID=70827824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911368791.2A Withdrawn CN111221541A (en) 2019-12-26 2019-12-26 Cluster parallel program deployment method and device

Country Status (1)

Country Link
CN (1) CN111221541A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112445595A (en) * 2020-11-26 2021-03-05 深圳晶泰科技有限公司 Multitask submission system based on slurm computing platform
CN112463123A (en) * 2020-11-25 2021-03-09 北京字跳网络技术有限公司 Task compiling method, device, network node, system and storage medium
CN113076109A (en) * 2021-04-08 2021-07-06 成都安恒信息技术有限公司 Cross-platform script language deploying method
CN113835683A (en) * 2021-09-17 2021-12-24 博锐尚格科技股份有限公司 Mirror image making method and device for target program
CN114116455A (en) * 2021-11-03 2022-03-01 郑州埃文计算机科技有限公司 Clustering fuzzy test method and device for open-source basic component library
CN114996134A (en) * 2022-05-30 2022-09-02 阿里巴巴(中国)有限公司 Containerized deployment method, electronic equipment and storage medium
CN115102851A (en) * 2022-08-26 2022-09-23 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Fusion platform for HPC and AI fusion calculation and resource management method thereof
CN117407008A (en) * 2023-12-14 2024-01-16 之江实验室 System component cluster deployment method and device for microminiature data center

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102404385A (en) * 2011-10-25 2012-04-04 华中科技大学 Virtual cluster deployment system and deployment method for high performance computing
CN109034386A (en) * 2018-06-26 2018-12-18 中国科学院计算机网络信息中心 A kind of deep learning system and method based on Resource Scheduler
WO2019153829A1 (en) * 2018-02-12 2019-08-15 人和未来生物科技(长沙)有限公司 Method and system for rapid generation of container dockerfile and container mirror image
CN110543311A (en) * 2019-09-05 2019-12-06 曙光信息产业(北京)有限公司 Mirror image construction method and device and storage medium
CN106776005B (en) * 2016-11-23 2019-12-13 华中科技大学 Resource management system and method for containerized application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20200602)