CN117170812B

CN117170812B - Numerical forecasting calculation cloud system based on research and development operation and maintenance integrated architecture

Info

Publication number: CN117170812B
Application number: CN202311148883.6A
Authority: CN
Inventors: 汪祥; 朱俊星; 韩毅; 张卫民; 任开军; 王辉赞; 赵娟
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2023-09-07
Filing date: 2023-09-07
Publication date: 2024-05-03
Anticipated expiration: 2043-09-07
Also published as: CN117170812A

Abstract

The invention relates to a numerical forecast computing cloud system based on an integrated R&D and operation-and-maintenance architecture. The system comprises: a hybrid cluster including at least one hybrid node; an image repository for storing multiple Docker base images and multiple numerical forecast application images; a conversion unit for synchronously converting the multiple Docker application images into multiple Singularity application images; and a node scheduling unit for receiving an application R&D task and determining a first target node from the hybrid cluster, where the first target node is scheduled by the K8S scheduler and is used for R&D of the numerical forecast application. The node scheduling unit is also used for receiving a business job task and determining a second target node from the hybrid cluster, where the second target node is scheduled by the Slurm scheduler and is used for pulling the numerical forecast application and running the job. The system integrates the R&D and operation environments of numerical forecasting and improves the utilization of hardware resources.

Description

A numerical forecast computing cloud system based on R&D and operation and maintenance integrated architecture

Technical Field

The present invention relates to the field of data processing, and in particular to a numerical forecast computing cloud system based on an integrated R&D and operation-and-maintenance architecture.

Background

At present, the cluster scheduler mainly used on high-performance clusters in the field of numerical forecasting is Slurm, and the mainstream high-performance container is Singularity; the cluster scheduler mainly used for computing is Kubernetes (K8S for short), and the mainstream container there is Docker. Container technology enables the encapsulation and preservation, rapid deployment, reuse and secure isolation of computing environments, and has developed rapidly in recent years. Docker containers, which are more mature and more strongly isolated, together with their mainstream orchestration system K8S, focus on building and managing containerized applications and are suited to numerical forecast R&D systems. The lightweight, weakly isolated Singularity containers, together with their mainstream job scheduling system Slurm, focus on resource management and job scheduling for high-performance computing clusters and are suited to numerical forecast business systems. However, Slurm and K8S are currently not compatible. As a result, it is difficult for a numerical forecast business system to quickly deploy the containerized applications created by the R&D system, and the two cannot share the underlying computing resources. This separates the R&D environment from the business environment and wastes a great deal of hardware computing resources.

Therefore, there is a need for a numerical forecast computing cloud system based on an integrated R&D and operation-and-maintenance architecture that realizes resource sharing and hybrid scheduling of Slurm nodes and K8S nodes, thereby integrating the R&D and operation-and-maintenance environments of numerical forecasting and improving hardware resource utilization.

Summary of the Invention

One embodiment of this specification provides a numerical forecast computing cloud system based on an integrated R&D and operation-and-maintenance architecture, including: a hybrid cluster, which includes a Slurm cluster and a K8S cluster and comprises at least one Slurm node, at least one hybrid node and at least one K8S node, each hybrid node being available for scheduling by only one of the Slurm cluster and the K8S cluster at any given time; an image repository, used to store multiple Docker base images and multiple numerical forecast application images; a conversion unit, used to synchronously convert the multiple Docker application images into multiple Singularity application images; a shared storage unit, used to store the multiple Singularity application images produced by the conversion unit; and a node scheduling unit, used to receive an application R&D task and determine a first target node from the hybrid cluster, where the first target node is scheduled by the K8S scheduler and is used for the R&D of numerical forecast applications. The node scheduling unit is also used to receive a business job task and determine a second target node from the hybrid cluster, where the second target node is scheduled by the Slurm scheduler and is used to pull the numerical forecast application and run the job.

In some embodiments, the R&D of numerical forecast applications on the first target node includes: the first target node schedules and allocates computing resources for the image production task; pulls a target Docker base image from the image repository; builds a numerical forecast application image from the target Docker base image and user instructions; and commits the resulting image and uploads it to the image repository.

In some embodiments, pulling the numerical forecast application and running the job on the second target node includes: the second target node schedules and allocates computing resources for the business job task; pulls a target Singularity application image from the shared storage unit; and runs the numerical forecast application based on the target Singularity application image and a numerical forecast task script.

In some embodiments, the multiple Docker base images include at least a MySQL application image, a programming language image and an operating system image.

In some embodiments, the multiple numerical forecast application images include at least an HPL application image, an FVCOM application image and a WRF application image.

In some embodiments, the node scheduling unit determines the first target node from the hybrid cluster as follows: a node group priority plug-in is installed on the Volcano scheduler; the at least one hybrid node and the at least one K8S node are grouped by resource type to generate multiple node groups, and a priority is configured for each node group; the Volcano scheduler then determines the first target node from the at least one hybrid node and the at least one K8S node based on the priority configured for each node group.

In some embodiments, the multiple node groups include at least a Slurm node group, a hybrid CPU node group, a hybrid GPU node group, a K8S CPU node group and a K8S GPU node group. The Volcano scheduler determines the first target node based on the priority configured for each node group as follows: it determines whether the first target node exists in the K8S CPU node group; if not, it determines whether the first target node exists in the hybrid CPU node group; if not, it determines whether the first target node exists in the K8S GPU node group; if not, it determines whether the first target node exists in the hybrid GPU node group.

In some embodiments, the node scheduling unit is further used to maintain a hybrid node list, which records the running flag of each Slurm node, each hybrid node and each K8S node.

In some embodiments, the first target node is used at least for the R&D of numerical forecast applications, the management of R&D environments and resources, the creation of numerical forecast application images, and the management of containers; the second target node is used at least for managing the runtime environment, computing resources, run results and run logs of the numerical forecast application.

In some embodiments, the image production task is initiated by an R&D user with root privileges, and the business job task is initiated by a business user without root privileges.

Brief Description of the Drawings

This specification is further described by way of exemplary embodiments, which are described in detail with reference to the accompanying drawings. These embodiments are not restrictive; in them, the same reference number denotes the same structure, wherein:

FIG. 1 is a module diagram of a numerical forecast computing cloud system based on an integrated R&D and operation-and-maintenance architecture according to some embodiments of this specification;

FIG. 2 is a schematic structural diagram of the integrated R&D and operation-and-maintenance environment system architecture according to some embodiments of this specification;

FIG. 3 is a schematic flow chart of developing numerical forecast applications and running jobs according to some embodiments of this specification;

FIG. 4 is a schematic flow chart of determining a first target node from a hybrid cluster according to some embodiments of this specification.

Detailed Description

To explain the technical solutions of the embodiments of this specification more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are only some examples or embodiments of this specification; a person of ordinary skill in the art can, without creative effort, apply this specification to other similar scenarios based on these drawings. Unless it is obvious from the context or otherwise stated, the same reference numerals in the figures denote the same structure or operation.

It should be understood that the terms "system", "device", "unit" and/or "module" used herein are a way of distinguishing different components, elements, parts, portions or assemblies at different levels. These terms may be replaced by other expressions that achieve the same purpose.

As used in this specification and the claims, unless the context clearly indicates otherwise, the words "a", "an" and/or "the" do not specifically denote the singular and may also include the plural. In general, the terms "comprise" and "include" only indicate that the explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or device may also include other steps or elements.

Flow charts are used in this specification to illustrate the operations performed by the system according to the embodiments of this specification. It should be understood that the operations are not necessarily performed exactly in the order shown. The steps may instead be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more operations may be removed from them.

First, the terms used in this specification are explained.

Numerical forecasting: an application of HPC in which large computers perform numerical calculations to solve the basic equations of atmospheric motion, thereby predicting the state of atmospheric motion and weather phenomena at future times.

Kubernetes: a portable, extensible, open-source platform for managing containerized workloads and services that facilitates declarative configuration and automation.

Pod (container group): the smallest unit managed by Kubernetes. A group of one or more containers is called a Pod.

Volcano: Volcano is the first and only Kubernetes-based container batch computing platform under CNCF, mainly used in high-performance computing scenarios. It provides a set of mechanisms that Kubernetes currently lacks and that are commonly required by high-performance workloads such as machine learning, big data applications, scientific computing and special-effects rendering.

Volcano Job: VcJob for short, a custom Job resource type defined by Volcano. Unlike a Kubernetes Job, a VcJob provides more advanced features, such as specifying the scheduler, a minimum number of running Pods, Tasks, lifecycle management, a specified queue, and priority scheduling. Volcano Job is better suited to high-performance computing scenarios such as machine learning, big data and scientific computing.

CPU: the central processing unit (CPU) is the computing and control core of a computer system and the final execution unit for information processing and program execution.

GPU: the graphics processing unit (GPU), also known as a display core, visual processor or display chip, is a microprocessor that specializes in image and graphics computation on personal computers, workstations, game consoles and some mobile devices (such as tablets and smartphones).

Volcano Controller: the Volcano controller, which manages Volcano Jobs on the cluster.

Volcano Scheduler: schedules a Volcano Job through a series of actions and plug-ins and finds the most suitable node for it.

Slurm: the Slurm workload manager is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world's supercomputers and computer clusters.

Singularity: Singularity is a container platform. It allows you to create and run containers that package software in a portable and reproducible way. You can build a container with Singularity on your laptop and then run it on many of the world's largest HPC clusters, a local university or corporate cluster, a single server, in the cloud, or on a workstation down the hall.

Harbor: an enterprise-grade image registry server for storing and distributing Docker images. It provides practical functions such as permission management, log auditing, layered transfer, horizontal scaling, image replication and a graphical interface.

Container image: a read-only template packaged with Docker whose content does not change after it is built. It can be regarded as a standardized container template, and a container is a running instance of an image.
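To make the Volcano Job term above more concrete, the following is a minimal sketch of a VcJob manifest built as a Python dictionary. The job name, image reference, command and replica count are hypothetical placeholders that do not come from the specification. The field names (`schedulerName`, `queue`, `minAvailable`, `tasks`) follow the Volcano `batch.volcano.sh/v1alpha1` Job resource as commonly documented and should be checked against the Volcano version in use. The sketch only builds the manifest; it does not submit it to a cluster.

```python
import json


def make_vcjob(name, image, command, replicas, queue="default"):
    """Build a Volcano Job manifest as a plain dict (nothing is submitted)."""
    return {
        "apiVersion": "batch.volcano.sh/v1alpha1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "schedulerName": "volcano",   # hand the job to the Volcano scheduler
            "queue": queue,               # Volcano queue the job is placed in
            "minAvailable": replicas,     # minimum number of Pods that must run together
            "tasks": [
                {
                    "name": "worker",
                    "replicas": replicas,
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [
                                {"name": "main", "image": image, "command": command}
                            ],
                        }
                    },
                }
            ],
        },
    }


# Hypothetical values, for illustration only.
job = make_vcjob(
    name="wrf-dev",
    image="harbor.example.com/nwp/wrf:latest",
    command=["sleep", "3600"],
    replicas=2,
)
print(json.dumps(job, indent=2))
```

Setting `minAvailable` equal to the replica count asks for all Pods of the job to be scheduled together, which is the "minimum number of running Pods" feature mentioned in the definition.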

The numerical forecast computing cloud system based on the integrated R&D and operation-and-maintenance architecture can be used to realize an integrated R&D and operation-and-maintenance environment system architecture. FIG. 2 is a schematic structural diagram of this architecture according to some embodiments of this specification. As shown in FIG. 2, the architecture includes two parts, a numerical forecast R&D system and a numerical forecast business system, which are applied to R&D scenarios and business scenarios respectively. In the numerical forecast R&D system, R&D users use K8S and Docker to develop numerical forecast applications, create application images and manage containers. In the numerical forecast business system, ordinary business users use Slurm and Singularity to pull the forecast applications built by the R&D system and run jobs. A container is a virtualization technology that provides an isolated runtime space for an application and encapsulates the system environment in which the program runs so that it can be saved and quickly ported. Different containers share one system kernel, so compared with virtual machines, containers use fewer resources, start faster, and are easier to migrate and deploy.

The architecture builds both the K8S cluster of the numerical forecast R&D system and the Slurm cluster of the numerical forecast business system, and establishes a communication mechanism between the two so that they schedule the physical compute nodes cooperatively and share the underlying hardware computing resources by dynamically acquiring and releasing node resources.

Docker, used in the numerical forecast R&D system of the architecture, is currently the most widely used container technology, but it is not suitable for ordinary numerical forecast business users, who do not have root privileges.

Singularity, used in the numerical forecast business system of the architecture, is simple, portable, easy to extend and easy to distribute, and user permissions are the same inside and outside the container. It is better suited than Docker to the containerized deployment of numerical forecast applications, but it lacks Docker's more mature community support and high-quality images, and cannot use efficient container orchestration systems such as K8S.

In some embodiments, the Volcano framework can be introduced on top of the K8S cluster in the numerical forecast R&D system to process numerical forecast batch tasks. The Volcano framework strengthens the job scheduling capability of K8S, compensating for its shortcomings, and supports a large number of mainstream computing frameworks in fields such as machine learning, deep learning and big data.

FIG. 1 is a module diagram of a numerical forecast computing cloud system based on an integrated R&D and operation-and-maintenance architecture according to some embodiments of this specification. As shown in FIG. 1, the system includes at least a hybrid cluster, an image repository, a conversion unit, a shared storage unit and a node scheduling unit.

The hybrid cluster may include a Slurm cluster and a K8S cluster. The hybrid cluster includes at least one Slurm node, at least one hybrid node and at least one K8S node; at any given time, each hybrid node is available for scheduling by only one of the Slurm cluster and the K8S cluster.

The image repository can be used to store multiple Docker base images and multiple numerical forecast application images. The multiple Docker base images include at least a MySQL application image, a programming language (e.g., Python) image and an operating system (e.g., CentOS) image. The multiple numerical forecast application images include at least an HPL application image, an FVCOM (Finite-Volume Coastal Ocean Model) application image and a WRF (Weather Research and Forecasting) application image.

The conversion unit can be used to synchronously convert multiple Docker application images into multiple Singularity application images.

The shared storage unit can be used to store the multiple Singularity application images produced by the conversion unit. These images can be stored in the cloud platform's shared storage system, and a directory of the shared storage system can be mounted into an instantiated container for persistent data storage. The shared storage system is mounted on the physical cluster, and the Singularity application images (SIF files) on it can be instantiated by business users and executed under Slurm scheduling.
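The specification does not state how the conversion unit performs the Docker-to-Singularity conversion. One common way is the `singularity build <target.sif> docker://<image>` command, and the sketch below assumes that approach. The registry host, image name and shared directory are hypothetical placeholders, and the SIF naming scheme is an illustrative choice. The function only assembles the command; it does not execute it.

```python
from pathlib import PurePosixPath


def sif_name(image_ref):
    """Derive a SIF file name from a Docker reference of the form [registry/]name[:tag]."""
    last = image_ref.rsplit("/", 1)[-1]      # drop registry host and project path
    name, _, tag = last.partition(":")       # split off the tag, if any
    return f"{name}_{tag or 'latest'}.sif"


def conversion_command(image_ref, shared_dir):
    """Return the command that would convert a Docker image into a SIF file on shared storage."""
    target = PurePosixPath(shared_dir) / sif_name(image_ref)
    return ["singularity", "build", str(target), f"docker://{image_ref}"]


# Hypothetical registry and shared-storage path, for illustration only.
cmd = conversion_command("harbor.example.com/nwp/wrf:4.5", "/shared/sif")
print(" ".join(cmd))
```

Writing the SIF file straight into the shared storage directory matches the description above, where business users later instantiate the images from shared storage under Slurm. References pinned by digest (`name@sha256:...`) are not handled by this simple naming helper.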

The node scheduling unit can be used to receive application R&D tasks and determine a first target node from the hybrid cluster. The first target node is scheduled by the K8S scheduler and is used at least for the R&D of numerical forecast applications, the management of R&D environments and resources, the creation of numerical forecast application images, and the management of containers. The image production task can be initiated by an R&D user with root privileges.

The node scheduling unit is also used to receive business job tasks and determine a second target node from the hybrid cluster. The second target node is scheduled by the Slurm scheduler and is used to pull the numerical forecast application and run the job. The second target node is also used at least for managing the runtime environment, computing resources, run results and run logs of the numerical forecast application. The business job task can be initiated by a business user without root privileges.

FIG. 3 is a schematic flow chart of developing numerical forecast applications and running jobs according to some embodiments of this specification. As shown in FIG. 3, in some embodiments, the R&D of numerical forecast applications on the first target node includes:

The first target node schedules and allocates computing resources for the image production task;

The target Docker base image is pulled from the image repository;

A numerical forecast application image is built from the target Docker base image and user instructions, and the resulting image is committed and uploaded to the image repository.
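The image production steps above can be sketched as a sequence of Docker CLI commands. This is an illustration under assumptions, not the method of the specification: it assumes the R&D user customizes an interactive container and then saves it with `docker commit`, and the image names and container name are hypothetical. Resource allocation by the K8S scheduler is outside the scope of the sketch, and the commands are only assembled, not executed.

```python
def image_production_commands(base_image, container, app_image):
    """Return the Docker CLI commands for: pull base image, customize, commit, upload."""
    return [
        # Pull the target Docker base image from the image repository.
        ["docker", "pull", base_image],
        # Start an interactive container in which the user installs the application.
        ["docker", "run", "-it", "--name", container, base_image, "/bin/bash"],
        # Commit the customized container as the application image.
        ["docker", "commit", container, app_image],
        # Upload the application image to the image repository.
        ["docker", "push", app_image],
    ]


# Hypothetical image and container names, for illustration only.
cmds = image_production_commands(
    base_image="harbor.example.com/base/centos:7",
    container="wrf-build",
    app_image="harbor.example.com/nwp/wrf:4.5",
)
for c in cmds:
    print(" ".join(c))
```

A Dockerfile with `docker build` would be an equally valid way to realize the "user instructions" step; the interactive variant is shown only because it maps one-to-one onto the listed steps.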

As shown in FIG. 3, in some embodiments, pulling the numerical forecast application and running the job on the second target node includes:

The second target node schedules and allocates computing resources for the business job task;

The target Singularity application image is pulled from the shared storage unit;

The numerical forecast application is run based on the target Singularity application image and the numerical forecast task script.
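The business job steps above can be sketched as a Slurm batch script that runs the task script inside the Singularity image. The job name, node counts, SIF path and task script name below are hypothetical; the sketch assumes the SIF file is already visible on the compute nodes through the shared storage mount. It only generates the script text, which a business user would then submit with `sbatch`.

```python
def slurm_batch_script(job_name, nodes, ntasks_per_node, sif_path, task_script):
    """Generate a Slurm batch script that runs a task script inside a Singularity image."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={ntasks_per_node}",
        "",
        # srun launches the tasks; singularity exec runs the script in the container.
        f"srun singularity exec {sif_path} bash {task_script}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job parameters, for illustration only.
script = slurm_batch_script(
    job_name="wrf-run",
    nodes=2,
    ntasks_per_node=16,
    sif_path="/shared/sif/wrf_4.5.sif",
    task_script="run_wrf.sh",
)
print(script)
```

Because Singularity runs the container with the calling user's own permissions, this flow needs no root privileges, which fits the description of business users above.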

In some embodiments, the node scheduling unit determines the first target node from the hybrid cluster as follows:

A node group priority plug-in is installed on the Volcano scheduler;

The at least one hybrid node and the at least one K8S node are grouped by resource type to generate multiple node groups, and a priority is configured for each node group, where the multiple node groups include at least a Slurm node group, a hybrid CPU node group, a hybrid GPU node group, a K8S CPU node group and a K8S GPU node group;

The Volcano scheduler determines the first target node from the at least one hybrid node and the at least one K8S node based on the priority configured for each node group.
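The grouping step above can be sketched in plain Python. This is not the Volcano plug-in itself: the `Node` type, the group names and the numeric priority values are hypothetical, chosen only so that a higher number means the group is tried earlier for K8S tasks. The Slurm node group receives no K8S priority in this sketch because those nodes are not candidates for the first target node.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    owner: str      # "slurm", "hybrid" or "k8s" (hypothetical labels)
    has_gpu: bool


# Hypothetical priority values: higher means preferred for K8S R&D tasks.
GROUP_PRIORITY = {"k8s-cpu": 4, "hybrid-cpu": 3, "k8s-gpu": 2, "hybrid-gpu": 1}


def group_of(node):
    """Map a node to its node group by owner and resource type."""
    if node.owner == "slurm":
        return "slurm"
    return f"{node.owner}-{'gpu' if node.has_gpu else 'cpu'}"


def group_nodes(nodes):
    """Group node names by node group."""
    groups = defaultdict(list)
    for node in nodes:
        groups[group_of(node)].append(node.name)
    return dict(groups)


nodes = [
    Node("s1", "slurm", False),
    Node("h1", "hybrid", False),
    Node("h2", "hybrid", True),
    Node("k1", "k8s", False),
    Node("k2", "k8s", True),
]
groups = group_nodes(nodes)
print(groups)
```

In a real deployment the grouping would more likely be expressed as node labels that the scheduler plug-in reads; the dictionary here stands in for that labeling.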

FIG. 4 is a schematic flow chart of determining a first target node from a hybrid cluster according to some embodiments of this specification. As shown in FIG. 4, in some embodiments, the Volcano scheduler determines the first target node from the at least one hybrid node and the at least one K8S node based on the priority configured for each node group as follows:

Determine whether the first target node exists in the K8S CPU node group;

If the first target node does not exist in the K8S CPU node group, determine whether it exists in the hybrid CPU node group;

If the first target node does not exist in the hybrid CPU node group, determine whether it exists in the K8S GPU node group;

If the first target node does not exist in the K8S GPU node group, determine whether it exists in the hybrid GPU node group.
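The fallback order above can be sketched as a simple loop over node groups. The group names and the idea of passing in a mapping from group to currently available nodes are assumptions made for illustration; the actual check performed by the Volcano plug-in for "whether the first target node exists" is not specified in the text.

```python
# Order in which node groups are examined, following the steps above.
FALLBACK_ORDER = ["k8s-cpu", "hybrid-cpu", "k8s-gpu", "hybrid-gpu"]


def pick_first_target(available_by_group):
    """Return (group, node) for the first group that has an available node, else None."""
    for group in FALLBACK_ORDER:
        candidates = available_by_group.get(group, [])
        if candidates:
            return group, candidates[0]
    return None


# Hypothetical availability: all dedicated K8S CPU nodes are busy.
choice = pick_first_target({
    "k8s-cpu": [],
    "hybrid-cpu": ["h1"],
    "k8s-gpu": ["k2"],
    "hybrid-gpu": ["h2"],
})
print(choice)
```

Trying the CPU groups before the GPU groups, and the dedicated K8S group before the hybrid group within each resource type, keeps GPU and shared nodes free for as long as possible.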

In some embodiments, the node scheduling unit is also used to maintain a hybrid node list, which records the running flag of each Slurm node, each hybrid node and each K8S node. When a node's running flag is idle, tasks from either cluster can be scheduled onto that node. When a node is running a task from one cluster, its running flag is set so that only tasks of that type can run on it. After the node's task completes, its running flag is reset to idle.
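The running-flag behavior described above can be sketched as a small lock-protected state table. The class and method names are hypothetical, and the sketch makes one assumption the text leaves open: a node already claimed by one cluster may accept further tasks from that same cluster, and returns to idle only when all of them have finished.

```python
import threading

IDLE = "idle"


class HybridNodeList:
    """Tracks which cluster ('slurm' or 'k8s') currently owns each hybrid node."""

    def __init__(self, node_names):
        self._flag = {name: IDLE for name in node_names}
        self._running = {name: 0 for name in node_names}
        self._lock = threading.Lock()

    def acquire(self, node, cluster):
        """Try to place a task from `cluster` on `node`; return True on success."""
        if cluster not in ("slurm", "k8s"):
            raise ValueError(f"unknown cluster: {cluster}")
        with self._lock:
            if self._flag[node] not in (IDLE, cluster):
                return False          # node is owned by the other cluster
            self._flag[node] = cluster
            self._running[node] += 1
            return True

    def release(self, node):
        """Mark one task on `node` as finished; reset the flag when none remain."""
        with self._lock:
            if self._running[node] == 0:
                raise ValueError(f"no running task on {node}")
            self._running[node] -= 1
            if self._running[node] == 0:
                self._flag[node] = IDLE

    def flag(self, node):
        with self._lock:
            return self._flag[node]


node_list = HybridNodeList(["h1"])
print(node_list.acquire("h1", "slurm"))   # Slurm claims the idle node
print(node_list.acquire("h1", "k8s"))     # K8S is refused while Slurm owns it
node_list.release("h1")
print(node_list.flag("h1"))               # back to idle after the task completes
```

The lock stands in for whatever communication mechanism the two schedulers use to agree on node ownership; in a real system that state would live in a shared service rather than in one process.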

可以理解的，基于研发运维一体化架构的数值预报计算云系统可以至少包括以下有益效果：It can be understood that the numerical forecast computing cloud system based on the R&D and operation and maintenance integrated architecture can at least include the following beneficial effects:

1. The R&D environment can be shared and built quickly, reducing the time multiple users spend installing and deploying numerical forecasting systems;

2. The R&D environment can be packaged and saved, and the packaged files allow the numerical forecasting R&D environment to be restored quickly;

3. Migration from the R&D environment to the operating environment is efficient, avoiding the complex environments, numerous dependencies, difficult deployment and poor portability of traditional numerical forecasting systems;

4. The K8S and Slurm clusters communicate with each other and share the underlying computing resources, improving resource utilization;

5. Slurm schedules physical nodes directly rather than through a multi-level scheduling strategy, which greatly improves scheduling efficiency;

6. Container instance environments are scheduled and used directly by the Slurm physical cluster, so only one container layer is involved and the performance overhead of containerization is smaller;

7. Numerical forecast applications are packaged in Singularity containers, further reducing the performance overhead of containerization;

8. The Slurm cluster scales elastically, so that cluster storage and computing resources can be scheduled dynamically according to user task requirements;

9. Slurm and K8S run natively in a shared environment, which avoids the complexity and uncertainty of nested Slurm and K8S scheduling and ensures system stability.
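
To make the migration path behind effects 3, 6 and 7 concrete, the following Python sketch builds the two command lines that such a workflow might use: one to convert a Docker image into a Singularity image file on shared storage, and one to run a program from that image under Slurm. The registry address, the shared directory, the file-naming rule and the program name are hypothetical. The sketch only constructs argument lists and does not execute anything.

```python
from pathlib import PurePosixPath
from typing import List


def sif_path(shared_dir: str, docker_ref: str) -> str:
    """Map a Docker reference to an image file path on shared storage.

    The naming rule (last path component, ':' replaced by '_') is an
    assumption made for this example.
    """
    name = docker_ref.rsplit("/", 1)[-1].replace(":", "_")
    return str(PurePosixPath(shared_dir) / f"{name}.sif")


def build_command(shared_dir: str, docker_ref: str) -> List[str]:
    """Command that converts a Docker image into a Singularity image file."""
    return [
        "singularity", "build",
        sif_path(shared_dir, docker_ref),
        f"docker://{docker_ref}",
    ]


def run_command(shared_dir: str, docker_ref: str, ntasks: int,
                program: List[str]) -> List[str]:
    """Command that runs `program` from the converted image under Slurm."""
    return [
        "srun", f"--ntasks={ntasks}",
        "singularity", "exec",
        sif_path(shared_dir, docker_ref),
        *program,
    ]
```

Because Slurm launches the Singularity image directly on physical nodes, only a single container layer sits between the scheduler and the application in this arrangement.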

The basic concepts have been described above. Obviously, for those skilled in the art, the above detailed disclosure is provided only as an example and does not limit this specification. Although not explicitly stated here, those skilled in the art may make various modifications, improvements and corrections to this specification. Such modifications, improvements and corrections are suggested by this specification and therefore remain within the spirit and scope of its exemplary embodiments.

In addition, this specification uses specific terms to describe its embodiments. For example, "one embodiment", "an embodiment" and/or "some embodiments" refer to a certain feature, structure or characteristic related to at least one embodiment of this specification. It should therefore be emphasized and noted that "an embodiment", "one embodiment" or "an alternative embodiment" mentioned two or more times in different places in this specification does not necessarily refer to the same embodiment. Furthermore, certain features, structures or characteristics of one or more embodiments of this specification may be combined as appropriate.

Moreover, unless explicitly stated in the claims, the order of the processing elements and sequences described in this specification, the use of numbers and letters, or the use of other names is not intended to limit the order of the processes and methods of this specification. Although the above disclosure discusses, through various examples, some embodiments of the invention currently considered useful, it should be understood that such details are for illustrative purposes only, and the appended claims are not limited to the disclosed embodiments. On the contrary, the claims are intended to cover all modifications and equivalent combinations that are consistent with the essence and scope of the embodiments of this specification. For example, although the system components described above can be implemented by hardware devices, they can also be implemented by a software-only solution, such as installing the described system on an existing server or mobile device.

Similarly, it should be noted that, in order to simplify the presentation of this disclosure and thereby aid understanding of one or more embodiments of the invention, the foregoing description of the embodiments sometimes combines multiple features into a single embodiment, figure or description thereof. However, this method of disclosure does not mean that the subject matter of this specification requires more features than are recited in the claims. In fact, an embodiment may have fewer than all the features of a single embodiment disclosed above.

Finally, it should be understood that the embodiments described in this specification serve only to illustrate the principles of the embodiments of this specification. Other variations may also fall within the scope of this specification. Therefore, by way of example and not limitation, alternative configurations of the embodiments of this specification may be regarded as consistent with the teachings of this specification. Accordingly, the embodiments of this specification are not limited to those explicitly introduced and described herein.

Claims

1. A numerical forecast computing cloud system based on an integrated R&D and operation and maintenance architecture, characterized by comprising:

A hybrid cluster, comprising a Slurm cluster and a K8S cluster, wherein the hybrid cluster comprises at least one Slurm node, at least one hybrid node and at least one K8S node, and at any given time the hybrid node is scheduled by only one of the Slurm cluster and the K8S cluster;

An image repository, configured to store multiple Docker base images and multiple numerical forecast application images;

A conversion unit, configured to synchronously convert the multiple Docker application images into multiple Singularity application images;

A shared storage unit, used to store the multiple Singularity application images converted by the conversion unit;

A node scheduling unit, configured to receive an application development task and determine a first target node from the hybrid cluster, wherein the first target node is scheduled by the K8S scheduler and is used for the development of numerical forecasting applications;

The node scheduling unit is further used to receive a business job task and determine a second target node from the hybrid cluster, wherein the second target node is scheduled by the Slurm scheduler, and the second target node is used to pull the numerical forecast application and run the job;

The node scheduling unit determines a first target node from the hybrid cluster, including:

Installing a node group priority plugin on a Volcano scheduler;

The at least one hybrid node and the at least one K8S node are grouped according to resource types to generate multiple node groups, and a priority is configured for each of the node groups;

The Volcano scheduler determines the first target node from the at least one hybrid node and the at least one K8S node based on the configuration priority of each of the node groups;

The multiple node groups include at least a Slurm node group, a hybrid CPU node group, a hybrid GPU node group, a K8S CPU node group and a K8S GPU node group;

The Volcano scheduler determines the first target node from the at least one hybrid node and the at least one K8S node based on the configuration priority of each of the node groups, including:

Determining whether the first target node exists in the K8S CPU node group;

If the first target node does not exist in the K8S CPU node group, determining whether the first target node exists in the hybrid CPU node group;

If the first target node does not exist in the hybrid CPU node group, determining whether the first target node exists in the K8S GPU node group;

If the first target node does not exist in the K8S GPU node group, determining whether the first target node exists in the hybrid GPU node group.

2. The numerical forecast computing cloud system based on an integrated R&D and operation and maintenance architecture according to claim 1, characterized in that the first target node performing research and development of numerical forecast applications includes:

The first target node schedules and allocates computing resources for performing the image creation task;

Pulling the target Docker base image from the image repository;

Creating a numerical forecast application image based on the target Docker base image and user instructions, and solidifying the created numerical forecast application image and uploading it to the image repository.

3. The numerical forecast computing cloud system based on an integrated R&D and operation and maintenance architecture according to claim 1, characterized in that the second target node pulling the numerical forecast application and running the job includes:

The second target node schedules and allocates computing resources for performing the business job task;

Pulling the target Singularity application image from the shared storage unit;

Running the numerical forecast application based on the target Singularity application image and the numerical forecast task script.

4. The numerical forecast computing cloud system based on an integrated R&D and operation and maintenance architecture according to claim 1, characterized in that the multiple Docker base images include at least a MySQL application image, a programming language image and an operating system image.

5. The numerical forecast computing cloud system based on an integrated R&D and operation and maintenance architecture according to claim 4, characterized in that the multiple numerical forecast application images include at least an HPL application image, an Fvcom application image and a WRF application image.

6. The numerical forecast computing cloud system based on an integrated R&D and operation and maintenance architecture according to any one of claims 1 to 5, characterized in that the node scheduling unit is further configured to maintain a hybrid node list, wherein the hybrid node list records the operation identifier of each of the Slurm nodes, each of the hybrid nodes and each of the K8S nodes.

7. The numerical forecast computing cloud system based on an integrated R&D and operation and maintenance architecture according to any one of claims 1 to 5, characterized in that the first target node is used at least for the R&D of numerical forecast applications, the management of R&D environments and resources, the creation of numerical forecast application images, and the management of containers;

The second target node is at least used to manage the operating environment, computing resources, operating results and operating log of the numerical forecast application.

8. The numerical forecast computing cloud system based on an integrated R&D and operation and maintenance architecture according to any one of claims 1 to 5, characterized in that the image creation task is initiated by a research and development user with root authority;

The business job task is initiated by a business user who does not have root authority.