WO2024077885A1 - Management method, apparatus and device for container cluster, and non-volatile readable storage medium

Publication number
WO2024077885A1
Authority
WO
WIPO (PCT)
Prior art keywords
cluster
target load
infrastructure
resource pool
container
Application number
PCT/CN2023/085261
Inventor
刘岩岩
Original Assignee
济南浪潮数据技术有限公司
Application filed by 济南浪潮数据技术有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/084 Configuration by using pre-existing information, e.g. using templates or copying from other elements
    • H04L41/0843 Configuration by using pre-existing information, e.g. using templates or copying from other elements based on generic templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45562 Creating, deleting, cloning virtual machine instances

Abstract

The present application relates to the technical field of cloud computing and specifically discloses a management method, apparatus and device for a container cluster, and a non-volatile readable storage medium. In the management method, a Kubernetes management cluster is created in advance, a computing resource management module and a partition management module are provided, and virtual machine images of multiple versions, container images and container cluster application deployment templates are prepared; a Kubernetes load cluster is then quickly created by way of a cluster declaration on the basis of a cluster application program interface. This simplifies the repetitive operations of creating different types of infrastructure resource pools and deploying different types of basic resources, and achieves unified management of both. The problem in the related art that the infrastructure of a load cluster cannot be managed is thereby effectively solved, making it possible to support dedicated cloud delivery scenarios such as multi-CPU-architecture infrastructure, unified multi-cluster management, rapid deployment and delivery, high availability, and disaster recovery and backup for container clusters.

Description

Container cluster management method, apparatus, device and non-volatile readable storage medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese patent application No. 202211231347.8, filed with the China Patent Office on October 10, 2022 and entitled "Management method, apparatus, device and computer-readable storage medium for container cluster", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of cloud computing technology, and in particular to a management method, apparatus, device and non-volatile readable storage medium for a container cluster.
Background
Kubernetes (K8s for short) is an open-source system for managing containerized applications across multiple hosts in a cloud platform, and makes it convenient to manage containerized applications that span machines. As more and more containerized applications go into production, derivative projects and architecture solutions built on Kubernetes to meet various needs keep emerging. Unified multi-cluster management based on Kubernetes can support scenarios such as rapid deployment and delivery, high availability, and disaster recovery of container clusters. However, most such implementations rely on Kubernetes cluster federation, lack overall planning and design based on the cloud platform, and lack unified management of container clusters across CPU (Central Processing Unit) architectures, which makes it difficult to adapt to users' diverse infrastructure environments.
Moreover, current multi-cluster management solutions merely register clusters that users have already created with the cloud platform and use the cloud platform to monitor the clusters' resource status; they do not manage the infrastructure of the user clusters. This makes creating a user cluster cumbersome and also results in poor availability of clusters and applications.
Providing a unified management solution for cloud-platform-based multi-architecture clusters that offers rapid deployment and delivery and high availability is a technical problem to be solved by those skilled in the art.
Summary of the Invention
The purpose of this application is to provide a container cluster management method, apparatus, device and non-volatile readable storage medium, configured to achieve unified management of cloud-platform-based multi-architecture clusters with rapid deployment and delivery and high availability.
To solve the above technical problem, the present application provides a container cluster management method, including:
deploying in advance, based on a cloud platform, a cluster application program interface in a Kubernetes cluster to create a management cluster; configuring, based on the cloud platform, a computing resource management module for unified management of the infrastructure resource pools of different types of cloud platforms, and a partition management module configured to maintain virtual machines of different versions; pushing virtual machine images of multiple versions to the image repository of each infrastructure resource pool; pushing the container images of the auxiliary components required by Kubernetes load clusters to a first container image repository of the cloud platform, and pushing container cluster application deployment templates to a chart repository of the cloud platform;
upon receiving a Kubernetes load cluster creation request sent from a client, identifying the infrastructure resource pool type of a first target load cluster to be created and the infrastructure type required by the first target load cluster;
based on the cluster application program interface and by way of a management-cluster declaration, calling the computing resource management module and the partition management module to create the first target load cluster in a first infrastructure resource pool corresponding to the infrastructure resource pool type and to complete infrastructure deployment for the first target load cluster.
In some embodiments, pushing virtual machine images of multiple versions to the image repository of each infrastructure resource pool includes:
in the course of building a virtual machine image, placing the Kubernetes-related binary resource configuration files of a specific version for the corresponding CPU architecture in a specified directory inside the image, installing a second container image repository, and pushing the Kubernetes-related images for the corresponding CPU architecture and version to the second container image repository;
pushing the completed virtual machine image to the image repository of each infrastructure resource pool.
In some embodiments, the partitions of the partition management module correspond one-to-one to the virtual machine images;
correspondingly, the partition information of a partition includes the CPU architecture, operating system type, operating system version, Kubernetes version, container runtime type, container runtime version and virtual machine image ID corresponding to the virtual machines in the first target load cluster.
In some embodiments, calling, based on the cluster application program interface and by way of a management-cluster declaration, the computing resource management module and the partition management module to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and to complete infrastructure deployment for the first target load cluster includes:
after calling the computing resource management module to obtain the image service component interface of the first infrastructure resource pool, calling the partition management module to obtain, from the image repository of the first infrastructure resource pool, the virtual machine image ID required by the first target load cluster;
generating and creating an extended resource file according to the input parameters, and creating, for the first target load cluster, an extended resource used to manage and control custom resources in the Kubernetes cluster;
when the custom resource is detected, calling, based on the cluster application program interface and the resource provisioning interface corresponding to the type of the first infrastructure resource pool, a Kubernetes orchestration tool to create and deploy the Kubernetes load cluster in the first infrastructure resource pool;
when it is detected that the application interface services of all custom resources of the first target load cluster have been deployed, obtaining and recording the identifier of the first target load cluster;
setting roles by modifying the labels of the worker nodes based on the client, pulling the container images required by the first target load cluster from the container image repository, obtaining the container cluster application deployment template required by the first target load cluster from the chart repository, and completing the installation and startup of the auxiliary components in the first target load cluster.
In some embodiments, calling the partition management module to obtain, from the image repository of the first infrastructure resource pool, the virtual machine image ID required by the first target load cluster includes:
calling the partition management module to query for the ID of a virtual machine image that satisfies the CPU architecture and Kubernetes version required by the Kubernetes load cluster creation request, and using that ID as the virtual machine image ID.
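The lookup described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `Partition` class, its field names, the `find_image_id` helper and all sample values are hypothetical, chosen only to mirror the partition information listed in the preceding embodiment.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Partition:
    """One partition record; field names are illustrative assumptions."""
    cpu_arch: str
    os_type: str
    os_version: str
    k8s_version: str
    runtime_type: str
    runtime_version: str
    image_id: str  # each partition maps to exactly one virtual machine image


def find_image_id(partitions: Iterable[Partition],
                  cpu_arch: str,
                  k8s_version: str) -> Optional[str]:
    """Return the image ID of the first partition matching both conditions."""
    for p in partitions:
        if p.cpu_arch == cpu_arch and p.k8s_version == k8s_version:
            return p.image_id
    return None  # no image prepared for this architecture/version pair


# Sample data invented for the example.
PARTITIONS = [
    Partition("x86_64", "linux", "8.4", "v1.23.6", "containerd", "1.6.4", "img-x86-123"),
    Partition("aarch64", "linux", "8.4", "v1.23.6", "containerd", "1.6.4", "img-arm-123"),
]
```

Returning `None` rather than raising lets the caller decide whether a missing image should fail the creation request or trigger a fallback.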
In some embodiments, generating and creating an extended resource file according to the input parameters, and creating, for the first target load cluster, an extended resource used to manage and control custom resources in the Kubernetes cluster includes:
generating and creating the extended resource file according to the input parameters and creating the extended resource, wherein the extended resource is a CRD (CustomResourceDefinition) resource used for management and control in the Kubernetes cluster.
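As a rough illustration of generating such a file from input parameters, the sketch below builds a manifest in the style of the open-source Cluster API project's `Cluster` kind. It is an assumption that the extended resource file resembles this shape; the function name and parameters are hypothetical, `cluster.x-k8s.io/v1beta1` is one published API version and may differ from the one actually used, and the infrastructure reference values are supplied by the caller rather than taken from the application.

```python
import json
from typing import Any, Dict


def build_cluster_manifest(name: str,
                           namespace: str,
                           pod_cidr: str,
                           infra_api_version: str,
                           infra_kind: str) -> Dict[str, Any]:
    """Build a Cluster-style custom resource as a plain dictionary."""
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",  # assumed version
        "kind": "Cluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "clusterNetwork": {"pods": {"cidrBlocks": [pod_cidr]}},
            # Points at the provider-specific object for the resource pool type.
            "infrastructureRef": {
                "apiVersion": infra_api_version,
                "kind": infra_kind,
                "name": name,
            },
        },
    }


def to_resource_file(manifest: Dict[str, Any]) -> str:
    """Serialize the manifest; JSON is accepted wherever YAML manifests are."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Keeping the manifest as a dictionary until the last moment makes it easy to validate or patch fields before the file is written.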
In some embodiments, when the custom resource is detected, calling, based on the cluster application program interface and the resource provisioning interface corresponding to the type of the first infrastructure resource pool, a Kubernetes orchestration tool to create and deploy the Kubernetes load cluster in the first infrastructure resource pool includes:
when the custom resource is detected, calling the interface of the infrastructure resource pool, through the resource provisioning interface of the cloud platform corresponding to the type of the first infrastructure resource pool, to create resources including virtual machines, security groups and cloud disks; injecting a cloud platform initialization script into the virtual machines, wherein the cloud platform initialization script is used to initialize Kubernetes and to create and modify Kubernetes configuration parameters; and running the cloud platform initialization script to call the Kubernetes orchestration tool preinstalled in the virtual machine image to deploy the first target load cluster.
In some embodiments, when it is detected that the application interface services of all custom resources of the first target load cluster have been deployed, obtaining and recording the identifier of the first target load cluster includes:
monitoring the status of the cluster.x-k8s resource, querying the node list once the application interface service is ready, and, after the application interface service has been deployed, calling the cluster application program interface to obtain the identifier of the first target load cluster and recording the identifier in the management record corresponding to the first target load cluster.
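The monitoring step can be pictured as a polling loop with a deadline. The sketch below is hypothetical: `get_phase` stands in for whatever call reads the resource status, and treating `"Provisioned"` as the ready phase is an assumption borrowed from the open-source Cluster API project, not something the application states.

```python
import time
from typing import Callable


def wait_until_ready(get_phase: Callable[[], str],
                     ready_phase: str = "Provisioned",
                     timeout_s: float = 600.0,
                     interval_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll get_phase() until it reports ready_phase or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if get_phase() == ready_phase:
            return True  # safe to query the node list and record the identifier
        if clock() >= deadline:
            return False  # caller decides how to report the failed creation
        sleep(interval_s)
```

The `sleep` and `clock` parameters are injected so the loop can be exercised without real waiting.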
In some embodiments, completing the installation and startup of the auxiliary components in the first target load cluster includes:
installing the auxiliary components into the first target load cluster by using the software package management tool in the container cluster application deployment template;
after the auxiliary components have been installed, starting the auxiliary components and recording the deployment-complete status in the management record corresponding to the first target load cluster.
In some embodiments, the method further includes:
upon receiving a deletion command, sent from the client, for an already created second target load cluster, calling the computing resource management module and the resource provisioning interface of a second infrastructure resource pool in which the second target load cluster is located, to delete the infrastructure of the second target load cluster in the second infrastructure resource pool.
In some embodiments, before calling, based on the cluster application program interface and by way of a management-cluster declaration, the computing resource management module and the partition management module to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and to complete infrastructure deployment for the first target load cluster, the method further includes:
creating a one-to-one correspondence between the first target load cluster and a user, and configuring security authentication rules for the first target load cluster.
In some embodiments, creating a one-to-one correspondence between the first target load cluster and a user, and configuring security authentication rules for the first target load cluster, may be:
calling the security authentication service interface of the first infrastructure resource pool to create the user, and configuring the corresponding security authentication rules for the user.
In some embodiments, when the first infrastructure resource pool is an OpenStack infrastructure resource pool, configuring the corresponding security authentication rules for the user may be:
creating a key pair by using nova, the compute service component of the OpenStack infrastructure resource pool.
In some embodiments, the method further includes:
upon receiving a deletion command, sent from the client, for an already created second target load cluster, calling the computing resource management module and the resource provisioning interface of a second infrastructure resource pool in which the second target load cluster is located, to delete the infrastructure of the second target load cluster in the second infrastructure resource pool;
calling the computing resource management module and the resource provisioning interface of the second infrastructure resource pool to obtain the authentication information of the second target load cluster, then deleting the volumes of the corresponding user under the second target load cluster, deleting the corresponding security authentication rules, and deleting the user information and the record of the second target load cluster in the cloud platform database.
In some embodiments, calling the computing resource management module and the resource provisioning interface of the second infrastructure resource pool in which the second target load cluster is located, to delete the infrastructure of the second target load cluster in the second infrastructure resource pool, may include:
calling the computing resource management module to perform a deletion operation on the custom resources of the second target load cluster in the second infrastructure resource pool;
after the deletion operation is detected based on the cluster application program interface, deleting the infrastructure of the second target load cluster in the second infrastructure resource pool through the resource provisioning interface of the second infrastructure resource pool.
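The deletion flow in the preceding embodiments amounts to a fixed sequence of teardown steps. The following sketch only illustrates that ordering under stated assumptions: the step names paraphrase the text, and `run_teardown` and the `actions` mapping are hypothetical stand-ins for calls to the computing resource management module and the resource provisioning interface.

```python
from typing import Callable, Dict, List

# Step labels paraphrasing the order described above; they are not real API names.
TEARDOWN_ORDER: List[str] = [
    "delete_custom_resources",         # triggers infrastructure removal
    "wait_infrastructure_deleted",     # provisioning interface removes the infrastructure
    "delete_user_volumes",             # volumes of the cluster's user
    "delete_security_rules",           # security authentication rules, e.g. key pairs
    "delete_user_and_cluster_record",  # user info and the cloud platform database record
]


def run_teardown(actions: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every step in order and return the names of the completed steps.

    An exception raised by a step propagates to the caller, so later steps
    never run after an earlier one has failed.
    """
    completed: List[str] = []
    for step in TEARDOWN_ORDER:
        actions[step]()
        completed.append(step)
    return completed
```

Deleting the database record last means a partially failed teardown still leaves a record from which the remaining steps can be retried.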
In some embodiments, before calling, based on the cluster application program interface and by way of a management-cluster declaration, the computing resource management module and the partition management module to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and to complete infrastructure deployment for the first target load cluster, the method further includes:
performing permission verification and request parameter verification on the Kubernetes load cluster creation request;
if the verification passes, proceeding to the step of calling, based on the cluster application program interface and by way of a management-cluster declaration, the computing resource management module and the partition management module to complete infrastructure resource pool deployment and infrastructure deployment for the first target load cluster.
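A minimal sketch of the two checks follows. The required field names and the `cluster:create` permission string are invented for illustration; the application does not specify which parameters or permissions are verified.

```python
from typing import Dict, List, Set

# Hypothetical request fields; the actual parameter set is not given in the text.
REQUIRED_FIELDS = ("name", "pool_type", "cpu_arch", "k8s_version")


def validate_create_request(request: Dict[str, str],
                            user_permissions: Set[str]) -> List[str]:
    """Return a list of validation errors; an empty list means the request may proceed."""
    errors: List[str] = []
    # Permission verification.
    if "cluster:create" not in user_permissions:
        errors.append("permission denied: cluster:create")
    # Request parameter verification: every required field must be non-empty.
    for field in REQUIRED_FIELDS:
        if not request.get(field):
            errors.append(f"missing parameter: {field}")
    return errors
```

Collecting all errors instead of stopping at the first lets the client correct the whole request in one round trip.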
In some embodiments, the method further includes:
creating a corresponding management record for the first target load cluster in the database of the cloud platform, and synchronously updating the cluster information of the first target load cluster to the management record.
To solve the above technical problem, the present application also provides a container cluster management apparatus, including:
an environment preparation unit, configured to deploy in advance, based on a cloud platform, a cluster application program interface in a Kubernetes cluster to create a management cluster; configure, based on the cloud platform, a computing resource management module for unified management of the infrastructure resource pools of different types of cloud platforms, and a partition management module for maintaining virtual machines of different versions; push virtual machine images of multiple versions to the image repository of each infrastructure resource pool; push the container images of the auxiliary components required by Kubernetes load clusters to a first container image repository of the cloud platform, and push container cluster application deployment templates to a chart repository of the cloud platform;
an identification unit, configured to, upon receiving a Kubernetes load cluster creation request sent from a client, identify the infrastructure resource pool type of a first target load cluster to be created and the infrastructure type required by the first target load cluster;
a creation unit, configured to call, based on the cluster application program interface and by way of a management-cluster declaration, the computing resource management module and the partition management module to complete infrastructure resource pool deployment and infrastructure deployment for the first target load cluster.
To solve the above technical problem, the present application also provides a container cluster management device, including:
a memory, configured to store a computer program;
a processor, configured to execute the computer program, wherein the computer program, when executed by the processor, implements the steps of any one of the container cluster management methods described above.
To solve the above technical problem, the present application also provides a non-volatile readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any one of the container cluster management methods described above.
In the container cluster management method provided by the present application, a cluster application program interface is deployed in advance in a Kubernetes cluster based on a cloud platform to create a management cluster; a computing resource management module for unified management of the infrastructure resource pools of different types of cloud platforms and a partition management module for maintaining virtual machines of different versions are configured; virtual machine images of different versions are pushed to each infrastructure resource pool; and the container images of the auxiliary components and the container cluster application deployment templates are prepared in advance. When a Kubernetes load cluster creation request sent from a client is received, the infrastructure resource pool type and the infrastructure type of the first target load cluster are identified, and the computing resource management module and the partition management module can then be called, based on the cluster application program interface and by way of a management-cluster declaration, to create the first target load cluster and complete its infrastructure deployment. This simplifies the repetitive operations of creating different types of infrastructure resource pools and deploying different types of basic resources, maintains consistency and repeatability across infrastructures of different architectures, and achieves unified management of different types of infrastructure resource pools and basic resources. The problem in the related art that the infrastructure of a load cluster cannot be managed is thereby effectively solved, making it possible to support dedicated cloud delivery scenarios such as multi-CPU-architecture infrastructure, unified multi-cluster management, rapid deployment and delivery, high availability, and disaster recovery and backup for container clusters.
The present application also provides a container cluster management apparatus, device and non-volatile readable storage medium, which have the beneficial effects described above; these are not repeated here.
BRIEF DESCRIPTION OF THE DRAWINGS
To explain the technical solutions of the embodiments of the present application or of the related art more clearly, the drawings needed in the description of the embodiments or the related art are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a container cluster management method provided in an embodiment of the present application;
FIG. 2 is a flowchart of an optional implementation of S103 provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a container cluster management apparatus provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a container cluster management device provided in an embodiment of the present application.
Detailed Description
The core of this application is to provide a container cluster management method, apparatus, device and non-volatile readable storage medium for achieving unified management of cloud-platform-based multi-architecture clusters with rapid deployment and delivery and high availability.
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the scope of protection of the present application.
FIG. 1 is a flowchart of a container cluster management method provided in an embodiment of the present application.
As shown in FIG. 1, the container cluster management method provided in the embodiment of the present application includes:
S101: Deploy in advance, based on a cloud platform, a cluster application program interface (Cluster API) in a Kubernetes cluster to create a management cluster; configure, based on the cloud platform, a computing resource management module for unified management of the infrastructure resource pools of different types of cloud platforms, and a partition management module for maintaining virtual machines of different versions; push virtual machine images of multiple versions to the image repository of each infrastructure resource pool; push the container images of the auxiliary components required by Kubernetes load clusters to a first container image repository of the cloud platform, and push container cluster application deployment templates to a chart repository of the cloud platform.
S102: Upon receiving a Kubernetes load cluster creation request sent from a client, identify the infrastructure resource pool type of a first target load cluster to be created and the infrastructure type required by the first target load cluster.
S103: Based on the cluster application program interface and by way of a management-cluster declaration, call the computing resource management module and the partition management module to create the first target load cluster in a first infrastructure resource pool corresponding to the infrastructure resource pool type and to complete infrastructure deployment for the first target load cluster.
The mainstream architecture in the related art manages the life cycle of servers and operating systems separately from that of kubernetes nodes. Typically, machines of a single architecture are designated and preconfigured with an operating system (OS), and the kubernetes components are then initialized onto the running system using kubeadm, kops, or another kubernetes orchestration tool. If the infrastructure and its operating system are not part of what kubernetes manages, the availability of clusters and applications is also reduced. For example, when a user needs to create a kubernetes load cluster, the load cluster can be created on existing infrastructure, but operations such as virtual machine creation and operating system installation require the user to first create the load cluster and then upload installation packages and perform the creation through the cloud platform. This involves many repetitive upload and installation tasks. Different users may follow different procedures, so the creation processes of different load clusters cannot be repeated, and consistency and repeatability cannot be maintained. Moreover, this infrastructure is not, and cannot be, brought under the unified management of the cloud platform, so consistency across multi-architecture CPUs cannot be guaranteed, which burdens cluster lifecycle management.
To support delivery scenarios such as multi-architecture infrastructure, rapid deployment and delivery, high availability, and container cluster disaster recovery and backup, and to quickly build unified management of cloud-platform-based multi-architecture clusters, the container cluster management method provided in the embodiment of this application designs modules of the cloud platform such as the computing resource management module and the partition management module. After a series of preliminary preparations, the user only needs to send a kubernetes load cluster creation request from the client, providing the infrastructure resource pool type of the first target load cluster to be created and the infrastructure type it requires; a multi-architecture kubernetes container cluster can then be created quickly based on the Cluster API in the manner of a management cluster declaration. This simplifies the repetitive tasks of the kubernetes life cycle, maintains consistency and repeatability across infrastructure of multiple CPU architectures, and reduces the practical burden of cluster lifecycle management. It makes it possible to support proprietary cloud delivery scenarios such as multi-CPU-architecture infrastructure, unified multi-cluster management, rapid deployment and delivery, high availability, and container cluster disaster recovery and backup.
In an optional implementation, for S101, the Cluster API is first deployed, based on the cloud platform, on an existing kubernetes cluster, which yields the kubernetes management cluster (the management cluster for short). A cluster declared and created through the management cluster is called a kubernetes load cluster (the load cluster for short), that is, the cluster that actually runs workloads. The management cluster is normally used only to manage load clusters and does not run other applications.
The computing resource management module is configured based on the cloud platform and serves as a unified infrastructure management module. For different types of cloud platforms, the infrastructure may include AWS, Azure, Docker, Google Cloud Platform (GCP), Metal3, VMware vSphere Integrated Containers (VIC), OpenStack, and even bare metal. The infrastructure resource pools corresponding to the different types of cloud platforms are preconfigured and are all managed uniformly by the computing resource management module, so that the cloud platform deploys and manages kubernetes clusters across all kinds of infrastructure in a consistent way.
Taking OpenStack as an example, if the cloud platform configures OpenStack as an infrastructure resource pool, the required components are preconfigured for the OpenStack infrastructure resource pool, such as the compute service component nova, the network-as-a-service component neutron, the security authentication service component keystone (responsible for user authentication and the service catalog), the highly available distributed object storage service component swift, the persistent block storage component cinder, and the image service component glance. The image service component glance is used to pull the virtual machine image required by the first target load cluster from the storage of the OpenStack infrastructure resource pool and to create virtual machines in that pool. The storage here is the image repository, which holds virtual machine images such as amd64-mirror-template-1, amd64-mirror-template-2, arm64-mirror-template-1, and arm64-mirror-template-2, covering the requirements of various CPU architectures, operating system types and versions, kubernetes versions, and container runtimes.
Further, virtual machine images of different versions are built and pushed to the image repository of each infrastructure resource pool. To meet the needs of various CPU architectures, operating system types and versions, kubernetes versions, and container runtimes, virtual machine images are built in advance for different CPU architectures (x86, ARM, etc.), operating system types and versions, kubernetes versions, and container runtime (Docker, containerd, etc.) versions, and pushed to the image repository.
Pushing virtual machine images of multiple versions to the image repository of each infrastructure resource pool may then include:
While building a virtual machine image, placing the kubernetes-related binary resource configuration files of the specific version for the corresponding CPU architecture in a designated directory inside the image, installing a second container image repository (harbor, registry, etc.), and pushing the kubernetes-related images of the corresponding CPU architecture and version to the second container image repository;
Pushing the finished virtual machine image to the image repository of each infrastructure resource pool.
That is, virtual machine images with the operating system and the kubernetes orchestration tool preinstalled are pushed to the image repository of each infrastructure resource pool, so that infrastructure deployment is completed together with the deployment of the first target load cluster and the cloud platform manages operating systems, virtual machines, and the like in a unified way.
The container images of the auxiliary components required to create a kubernetes load cluster (such as Prometheus, Jenkins, Ingress-controller, localpath-provisioner, Istio, CNI plug-ins, and Cluster-Agent) are pushed to the first container image repository of the cloud platform, and the container cluster application deployment (helm-chart) templates from the helm client are pushed to the chart repository of the cloud platform.
A partition management module is set up based on the cloud platform to maintain virtual machines of different versions. The partitions of the partition management module may be set to correspond one-to-one with the virtual machine images;
Accordingly, the partition information of a partition may include the CPU architecture, operating system type, operating system version, kubernetes version, container runtime type, container runtime version, and virtual machine image ID corresponding to the virtual machines in the first target load cluster.
That is, the partition management module maintains information such as the operating system type, operating system version, kubernetes and container runtime versions, and virtual machine image ID for different CPU architectures. Each partition corresponds to one CPU architecture, and each CPU architecture in turn corresponds to multiple operating system types, operating system versions, kubernetes versions, container runtime types, container runtime versions, and virtual machine image IDs. Setting up multiple partitions means maintaining multiple partition information records, where each partition corresponds to one virtual machine image with a specified CPU architecture, kubernetes version, operating system type and version, and container runtime type and version.
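The partition records described above can be pictured as a small lookup table keyed by the requested fields. The following is a minimal sketch, not the application's implementation: the class name `Partition`, the function `find_image_id`, and all field values other than the image template names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Partition:
    """One partition record; each record maps to exactly one VM image."""
    cpu_arch: str          # e.g. "amd64" or "arm64"
    os_type: str           # hypothetical value, e.g. "linux-distro-a"
    os_version: str
    k8s_version: str
    runtime_type: str      # e.g. "containerd" or "docker"
    runtime_version: str
    image_id: str          # ID of the VM image in the pool's image repository


def find_image_id(partitions: List[Partition], **wanted: str) -> Optional[str]:
    """Return the image ID of the first partition matching every requested field."""
    for partition in partitions:
        if all(getattr(partition, key) == value for key, value in wanted.items()):
            return partition.image_id
    return None


# Hypothetical records; only the image names come from the text above.
PARTITIONS = [
    Partition("amd64", "linux-distro-a", "1.0", "1.26.3", "containerd", "1.6.20",
              "amd64-mirror-template-1"),
    Partition("arm64", "linux-distro-a", "1.0", "1.26.3", "containerd", "1.6.20",
              "arm64-mirror-template-1"),
]

print(find_image_id(PARTITIONS, cpu_arch="arm64", k8s_version="1.26.3"))
```

Keeping one image ID per record mirrors the one-to-one correspondence between partitions and virtual machine images, so a creation request only has to name the fields it cares about.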
On the cloud platform, the computing resource management module and the partition management module may be managed by a pre-deployed cluster management module. In addition, the cluster management module is configured to manage the custom resources of load clusters (CRD resources, such as cluster.x-k8s) through the resource supply interface by which the Cluster API connects to the infrastructure resource pool.
After the above preparations are complete, for S102, when a kubernetes load cluster creation request sent by the user from the client is received, the request is parsed to identify the infrastructure resource pool type of the first target load cluster to be created (for example, an OpenStack infrastructure resource pool) and the infrastructure type required by the first target load cluster (information such as the kubernetes version, CPU architecture, and partition). To ensure that the kubernetes load cluster creation request can be executed, before S103 (based on the Cluster API, in the manner of a management cluster declaration, calling the computing resource management module and the partition management module to complete infrastructure resource pool deployment and infrastructure deployment for the first target load cluster), the container cluster management method provided in the embodiment of this application further includes:
Performing permission verification and request parameter verification on the kubernetes load cluster creation request;
If the verification passes, proceeding to the step of S103: based on the Cluster API, in the manner of a management cluster declaration, calling the computing resource management module and the partition management module to complete infrastructure resource pool deployment and infrastructure deployment for the first target load cluster.
The cloud platform is configured as a module that performs registration (registry) management of load clusters and may provide an application interface service (api server), a storage service (etcd), a proxy service (proxy), a domain name resolution service (coredns), a cluster record service (scheduler), and so on. The client sends a request to create a new load cluster, carrying the selected infrastructure resource pool type and infrastructure type, to the cluster management module, which performs permission verification and request parameter verification. Once verification is complete, the cluster record service creates a management record for the first target load cluster in the database of the cloud platform and synchronizes the cluster information of the first target load cluster to that record. Throughout the life cycle management of the first target load cluster, the management record can be used to maintain its creation progress, maintenance changes during operation, and deletion information, so that information on multi-architecture clusters is maintained and managed.
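The verify-then-record flow above can be sketched as follows. This is an illustration under stated assumptions rather than the application's code: the role name `cluster-admin`, the parameter names in `REQUIRED_PARAMS`, and the in-memory dictionary standing in for the cloud platform database are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List

# Hypothetical parameter names a creation request is assumed to carry.
REQUIRED_PARAMS = ("name", "pool_type", "cpu_arch", "k8s_version")


@dataclass
class ManagementRecord:
    """Management record kept for one load cluster over its life cycle."""
    cluster_name: str
    pool_type: str
    status: str = "creating"
    history: List[str] = field(default_factory=lambda: ["creating"])


def handle_create_request(roles: Iterable[str], params: Dict[str, str],
                          database: Dict[str, ManagementRecord]) -> ManagementRecord:
    """Run the permission check and parameter check, then create the record."""
    if "cluster-admin" not in roles:  # hypothetical role name
        raise PermissionError("caller may not create load clusters")
    missing = [key for key in REQUIRED_PARAMS if not params.get(key)]
    if missing:
        raise ValueError("missing request parameters: " + ", ".join(missing))
    if params["name"] in database:
        raise ValueError("cluster already exists: " + params["name"])
    record = ManagementRecord(params["name"], params["pool_type"])
    database[record.cluster_name] = record
    return record


def update_status(record: ManagementRecord, status: str) -> None:
    """Append a life cycle event (creation progress, change, deletion)."""
    record.status = status
    record.history.append(status)
```

Rejecting the request before any record is written keeps the database free of entries for clusters that could never be created.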
For S103, the cluster management module calls the computing resource management module and the partition management module respectively, finds the first infrastructure resource pool corresponding to the infrastructure resource pool type, and pulls the virtual machine image that meets the infrastructure type requirements from the second container image repository of the first infrastructure resource pool. It creates the first target load cluster, the virtual machines, and the corresponding operating system in the first infrastructure resource pool. It then pulls the container images of the auxiliary components that meet the infrastructure type requirements from the first container image repository of the cloud platform, pulls the container cluster application deployment template that meets the infrastructure type requirements from the chart repository of the cloud platform, and completes the deployment of the first target load cluster in the first infrastructure resource pool.
After the above creation process of the first target load cluster is complete, the deployment completion status of the first target load cluster is recorded in the management record corresponding to the first target load cluster in the database.
Because both the infrastructure resource pool deployment and the infrastructure deployment of the first target load cluster are created through the Cluster API in the manner of a management cluster declaration, the infrastructure of the kubernetes load cluster is brought under management.
FIG. 2 is a flowchart of an optional implementation of S103 provided in an embodiment of this application.
On the basis of the above embodiment, this embodiment describes the steps of creating a kubernetes load cluster in more detail.
As shown in FIG. 2, on the basis of the above embodiment, in the container cluster management method provided in the embodiment of this application, S103 (based on the Cluster API, in the manner of a management cluster declaration, calling the computing resource management module and the partition management module to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and complete infrastructure deployment for the first target load cluster) may include:
S201: After calling the computing resource management module to obtain the image service component interface of the first infrastructure resource pool, call the partition management module to obtain, from the image repository of the first infrastructure resource pool, the ID of the virtual machine image required by the first target load cluster.
In an optional implementation, after the kubernetes load cluster creation request has been verified and the creation progress of the first target load cluster has been created and synchronized to the database, the cluster management module calls the computing resource management module to obtain the authentication information of the first infrastructure resource pool, calls the image service component interface of the first infrastructure resource pool according to the request parameters of the kubernetes load cluster creation request, and calls the partition management module to query the ID of the virtual machine image that meets the conditions required by the request, such as the CPU architecture and kubernetes version. For example, when the infrastructure resource pool type is an OpenStack infrastructure resource pool, the cluster management module obtains the authentication information of the OpenStack infrastructure resource pool and calls the interface of its image service component glance according to the request parameters, so as to query through that interface the virtual machine image that meets the infrastructure type requirements and obtain its ID.
It should be noted that the life cycle management of a container cluster includes deleting clusters as well as creating them. To make it easier to clean up cloud disk resources when a load cluster is deleted, before S103 (based on the Cluster API, in the manner of a management cluster declaration, calling the computing resource management module and the partition management module to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and complete infrastructure deployment for the first target load cluster), or before S202, the container cluster management method provided in the embodiment of this application further includes: creating a one-to-one correspondence between the first target load cluster and a user, and configuring security authentication rules for the first target load cluster.
In some embodiments, the cluster management module may call the security authentication service interface of the first infrastructure resource pool to create a user and configure the corresponding security authentication rules for that user. For example, when the infrastructure resource pool type is an OpenStack infrastructure resource pool, the cluster management module calls the interface of the security authentication service component keystone of the OpenStack infrastructure resource pool to create a user.
Further, the security authentication rule may be authentication with a key pair. When the first infrastructure resource pool is an OpenStack infrastructure resource pool, configuring the corresponding security authentication rule for the user may be: creating a key pair with the compute service component nova of the OpenStack infrastructure resource pool.
It can be understood that, according to actual needs, other types of security authentication rules may also be configured for the user of the first target load cluster to protect the user information and the data of the first target load cluster.
S202: Generate and create an extended resource file according to the input parameters, and create for the first target load cluster an extended resource used to manage the custom resources in the kubernetes cluster.
In an optional implementation, the cluster management module generates and creates a cluster.x-k8s extended resource file in YAML format according to the input parameters and creates the extended resource. The extended resource is a CRD resource used for management and control in the kubernetes cluster, that is, a custom resource that defines the load cluster.
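As an illustration of what such a file might contain, the sketch below builds a manifest shaped like the `Cluster` kind of the upstream Cluster API project. It is an assumption that the cluster.x-k8s resource in this application follows that schema; the function names, the naming scheme for the referenced objects, and the default infrastructure API version are hypothetical and would need to match the provider actually installed. The output is written as JSON, which YAML parsers also accept, so the example needs no third-party library.

```python
import json


def build_cluster_manifest(name: str, namespace: str, pod_cidr: str,
                           infra_kind: str = "OpenStackCluster",
                           infra_api_version: str =
                           "infrastructure.cluster.x-k8s.io/v1alpha6") -> dict:
    """Build a Cluster manifest from request parameters (assumed schema)."""
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",
        "kind": "Cluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "clusterNetwork": {"pods": {"cidrBlocks": [pod_cidr]}},
            # The control plane is assumed to be bootstrapped with kubeadm.
            "controlPlaneRef": {
                "apiVersion": "controlplane.cluster.x-k8s.io/v1beta1",
                "kind": "KubeadmControlPlane",
                "name": name + "-control-plane",  # hypothetical naming scheme
            },
            # Points at the pool-specific resource handled by the provider.
            "infrastructureRef": {
                "apiVersion": infra_api_version,
                "kind": infra_kind,
                "name": name,
            },
        },
    }


def to_manifest_text(manifest: dict) -> str:
    """Serialize as JSON, which is also valid YAML input."""
    return json.dumps(manifest, indent=2)


print(to_manifest_text(build_cluster_manifest("demo", "default", "10.244.0.0/16")))
```

Separating the generic `Cluster` object from the pool-specific object it references is what lets one declaration style cover several kinds of infrastructure resource pool.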
S203: When the custom resource is detected, based on the Cluster API and the resource supply interface corresponding to the first infrastructure resource pool type, call the kubernetes orchestration tool to create and deploy the kubernetes load cluster in the first infrastructure resource pool.
In an optional implementation, when the Cluster API of the cloud platform detects a new custom resource (that is, a cluster.x-k8s resource), it uses the cloud platform's resource supply interface corresponding to the first infrastructure resource pool type to call the interfaces of the infrastructure resource pool, create resources such as virtual machines, security groups, and cloud disks, and inject a cloud platform initialization (cloud-init) script into the virtual machines. The resource supply interface is the provider through which the Cluster API connects to the infrastructure resource pool, and it may be implemented as a set of functions that manage the infrastructure resource pool according to cluster parameters. For example, when the first infrastructure resource pool is an OpenStack infrastructure resource pool, the Cluster API calls the OpenStack interfaces through Provider-OpenStack (the provider the Cluster API uses to connect to the OpenStack infrastructure resource pool, which implements a set of functions that manage OpenStack infrastructure resources according to cluster parameters) to deploy the infrastructure.
The cloud platform initialization script implements functions such as initializing kubernetes and creating or modifying kubernetes configuration parameters. Running the script calls the kubernetes orchestration tool preinstalled in the virtual machine image (such as kubeadm) to deploy the first target load cluster.
S204: After detecting that the application interface services of all custom resources of the first target load cluster have been deployed, obtain and record the identifier of the first target load cluster.
In some embodiments, the cluster management module monitors the status of the cluster.x-k8s resource, waits until the application interface service is ready (api server ready), and then queries the node list. Once the application interface service (api server) has been deployed, the cluster management module calls the Cluster API to obtain the identifier (kubeconfig) of the first target load cluster and records it in the management record corresponding to the first target load cluster in the database.
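The wait-then-record step can be sketched as a polling loop with a deadline. This is a minimal sketch under assumptions: the callables `api_server_ready` and `fetch_kubeconfig` are hypothetical stand-ins for whatever status query and kubeconfig retrieval the cluster management module performs, and the record is modeled as a plain dictionary.

```python
import time
from typing import Callable


def wait_until(ready: Callable[[], bool], timeout_s: float,
               interval_s: float = 5.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll ready() until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def record_kubeconfig(record: dict,
                      api_server_ready: Callable[[], bool],
                      fetch_kubeconfig: Callable[[], str],
                      timeout_s: float = 600.0,
                      **poll_options) -> bool:
    """Wait for the api server, then store the kubeconfig in the record."""
    if not wait_until(api_server_ready, timeout_s, **poll_options):
        record["status"] = "api_server_timeout"
        return False
    record["kubeconfig"] = fetch_kubeconfig()
    record["status"] = "api_server_ready"
    return True
```

Passing `sleep` and `clock` as parameters is a design choice that lets the loop be exercised without real delays; a timeout status is recorded so a stalled deployment remains visible in the management record.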
S205: Through the client, modify the labels of the worker nodes to set their roles, pull the container images required by the first target load cluster from the container image repository, obtain the container cluster application deployment template required by the first target load cluster from the chart repository, and complete the installation and startup of the auxiliary components in the first target load cluster.
In some embodiments, the cluster management module modifies the labels of the worker nodes through the client (k8s-client) to set their roles (such as node-role.kubernetes.io/node), pulls the container images of the auxiliary components required by the first target load cluster from the first container image repository, obtains the container cluster application deployment template (helm-chart) required by the first target load cluster from the chart repository, and uses helm with that template to install the auxiliary components (such as prometheus, Ingress-controller, localpath-provisioner, Istio, CNI plug-ins, and Cluster-Agent) into the first target load cluster. After the auxiliary components have been installed, the deployment completion status is recorded in the management record corresponding to the first target load cluster in the database.
Because the life cycle management tasks of a container cluster include deleting load clusters as well as creating them, on the basis of the above embodiment, the container cluster management method provided in the embodiment of this application further includes:
When a deletion command for an already created second target load cluster is received from the client, calling the computing resource management module and the resource supply interface of the second infrastructure resource pool where the second target load cluster is located, and deleting the infrastructure of the second target load cluster in the second infrastructure resource pool.
In an optional implementation, the user sends a deletion command for the already created second target load cluster from the client to the cluster management module of the cloud platform, and the cluster management module updates the status of the second target load cluster to "deleting".
The cluster management module calls the computing resource management module and, through the resource supply interface of the second infrastructure resource pool where the second target load cluster is located, deletes the infrastructure of the second target load cluster (such as the cluster.x-k8s CRD resource) in the second infrastructure resource pool, thereby completing the deletion of the second target load cluster.
After the above deletion of the second target load cluster is complete, the management record corresponding to the second target load cluster is deleted from the cloud platform database.
在本申请上述实施例中提到,负载集群的创建过程包括对云硬盘资源的占用,为便于在删除负载集群中清理云硬盘资源,在进行负载集群的创建时就将负载集群创建为与用户一一对应的关系。则在此基础上,对于负载集群的删除过程,在上述实施例的基础上,本申请实施例提供的容器集群的管理方法还包括:In the above embodiment of the present application, it is mentioned that the creation process of the load cluster includes the occupation of cloud hard disk resources. In order to facilitate the cleanup of cloud hard disk resources when deleting the load cluster, the load cluster is created as a one-to-one correspondence with the user when the load cluster is created. On this basis, for the deletion process of the load cluster, on the basis of the above embodiment, the management method of the container cluster provided in the embodiment of the present application also includes:
当接收到基于客户端发送的对已创建的第二目标负载集群的删除命令时,调用计算资源管理模块以及第二目标负载集群所在的第二基础设施资源池的资源供应接口,在第二基础设施资源池中删除第二目标负载集群的基础设施; When receiving a deletion command for the created second target load cluster sent by the client, calling the computing resource management module and the resource supply interface of the second infrastructure resource pool where the second target load cluster is located, and deleting the infrastructure of the second target load cluster in the second infrastructure resource pool;
调用计算资源管理模块以及第二基础设施资源池的资源供应接口,获取第二目标负载集群的认证信息后删除第二目标负载集群下对应用户的卷,并删除对应的安全认证规则,删除用户信息及第二目标负载集群在云平台数据库中的记录。Call the computing resource management module and the resource supply interface of the second infrastructure resource pool, obtain the authentication information of the second target load cluster, delete the volume of the corresponding user under the second target load cluster, delete the corresponding security authentication rules, and delete the user information and the record of the second target load cluster in the cloud platform database.
在创建负载集群中设置负载集群与用户一一对应的前提下,在完成对第二目标负载集群的基础设施的删除后,进一步删除第二负载集群对应的用户的信息,从而在第二基础设施资源池中彻底清空第二目标负载集群占用的资源,再在云平台数据库中删除第二目标负载集群的状态信息以及用户信息。On the premise of setting a one-to-one correspondence between load clusters and users when creating a load cluster, after completing the deletion of the infrastructure of the second target load cluster, the user information corresponding to the second load cluster is further deleted, thereby completely clearing the resources occupied by the second target load cluster in the second infrastructure resource pool, and then deleting the status information and user information of the second target load cluster in the cloud platform database.
在一些实施例中,待第二目标负载集群的基础设施删除完成后,运行集群管理模块调用计算资源管理模块,获取第二基础设施资源池的认证信息,然后调用第二基础设施资源池的组件服务接口,删除第二目标负载集群对应用户下的卷,并删除与该用户对应的安全认证规则。例如,当第二基础设施资源池为OpenStack基础设施资源池时,即调用持久性块存储功能组件cinder删除第二目标负载集群对应用户下的卷,并调用计算服务组件nova删除该用户对应的密钥对,最后调用安全认证服务组件keystone删除该用户信息。在完成上述对第二目标负载集群的删除操作后,再从云平台数据库中删除第二目标负载集群对应的管理记录。In some embodiments, after the infrastructure of the second target load cluster has been deleted, the cluster management module calls the computing resource management module to obtain the authentication information of the second infrastructure resource pool, and then calls the component service interface of the second infrastructure resource pool to delete the volumes under the user corresponding to the second target load cluster and the security authentication rules corresponding to that user. For example, when the second infrastructure resource pool is an OpenStack infrastructure resource pool, the persistent block storage component cinder is called to delete the volumes under the user corresponding to the second target load cluster, the computing service component nova is called to delete the key pair corresponding to the user, and finally the security authentication service component keystone is called to delete the user information. After the above deletion operations on the second target load cluster are complete, the management record corresponding to the second target load cluster is deleted from the cloud platform database.
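The cleanup order in the OpenStack example (volumes, then key pair, then user, then database record) can be illustrated with a small sketch. It deliberately makes no real OpenStack calls: `RecordingPool` and its method names are hypothetical placeholders that only record which component would be invoked, and the database is a plain dict.

```python
class RecordingPool:
    """Hypothetical stand-in for an OpenStack resource pool.

    Each method records the component named in the text instead of
    calling a real service.
    """

    def __init__(self):
        self.calls = []

    def delete_volumes(self, user):
        self.calls.append(("cinder", user))    # block storage volumes

    def delete_keypair(self, user):
        self.calls.append(("nova", user))      # key pair of the user

    def delete_user(self, user):
        self.calls.append(("keystone", user))  # the user itself


def cleanup_user_resources(pool, database, cluster_id):
    """Remove per-user resources of a deleted cluster, then its record."""
    user = database[cluster_id]["user"]
    pool.delete_volumes(user)
    pool.delete_keypair(user)
    pool.delete_user(user)       # last: earlier steps still need the user
    del database[cluster_id]     # the management record goes at the very end
    return user


pool = RecordingPool()
database = {"wl-2": {"user": "wl-2-user", "status": "deleting"}}
removed_user = cleanup_user_resources(pool, database, "wl-2")
```

Because each load cluster has exactly one user, deleting everything owned by that user is enough to release the volumes the cluster occupied.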
针对本申请上述实施例提供的两种删除负载集群的方法,调用计算资源管理模块以及第二目标负载集群所在的第二基础设施资源池的资源供应接口,在第二基础设施资源池中删除第二目标负载集群的基础设施,可以包括:With respect to the two methods for deleting a load cluster provided in the above embodiments of the present application, calling the computing resource management module and the resource supply interface of the second infrastructure resource pool where the second target load cluster is located, and deleting the infrastructure of the second target load cluster in the second infrastructure resource pool may include:
调用计算资源管理模块执行对第二基础设施资源池中第二目标负载集群的自定义资源的删除操作;Invoke the computing resource management module to execute a deletion operation on the custom resources of the second target load cluster in the second infrastructure resource pool;
基于集群应用程序接口监测到删除操作后,通过第二基础设施资源池的资源供应接口,在第二基础设施资源池中删除第二目标负载集群的基础设施。After the deletion operation is detected based on the cluster application program interface, the infrastructure of the second target load cluster is deleted in the second infrastructure resource pool through the resource supply interface of the second infrastructure resource pool.
即是说,在删除第二目标负载集群的基础设施时,先删除自定义资源(cluster.x-k8s CRD资源),再删除对应的基础设施。That is to say, when deleting the infrastructure of the second target load cluster, delete the custom resources (cluster.x-k8s CRD resources) first, and then delete the corresponding infrastructure.
例如,当第二基础设施资源池为OpenStack基础设施资源池时,运行集群管理模块调用计算资源管理模块,执行删除自定义资源(cluster.x-k8s CRD资源)的操作。For example, when the second infrastructure resource pool is an OpenStack infrastructure resource pool, the cluster management module is run to call the computing resource management module to execute the operation of deleting custom resources (cluster.x-k8s CRD resources).
云平台的集群应用程序接口监听到上述删除操作后,通过Provider-OpenStack调用OpenStack删除第二目标负载集群的基础设施,等待其删除完成。After the cluster application program interface of the cloud platform monitors the above deletion operation, it calls OpenStack through Provider-OpenStack to delete the infrastructure of the second target load cluster and waits for the deletion to be completed.
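The two-stage deletion (delete the custom resource first, then let the component that watches it remove the infrastructure) can be modelled with an event queue. The sketch below is a simplified in-memory model under stated assumptions: `ManagementCluster`, `Provider` and the event tuple format are hypothetical and do not reproduce the actual cluster API or Provider-OpenStack code.

```python
from collections import deque


class ManagementCluster:
    """Hypothetical management cluster holding custom resources."""

    def __init__(self, custom_resources):
        self.custom_resources = dict(custom_resources)
        self.events = deque()   # events visible to watchers

    def delete_custom_resource(self, name):
        # Step 1: the computing resource management module deletes the CR.
        self.custom_resources.pop(name)
        self.events.append(("DELETED", name))


class Provider:
    """Hypothetical provider that reacts to deletion events."""

    def __init__(self, infrastructure):
        self.infrastructure = infrastructure   # name -> list of VM names

    def reconcile(self, events):
        # Step 2: on each observed deletion, remove matching infrastructure.
        removed = []
        while events:
            kind, name = events.popleft()
            if kind == "DELETED" and name in self.infrastructure:
                del self.infrastructure[name]
                removed.append(name)
        return removed


mgmt = ManagementCluster({"wl-2": {"kind": "Cluster"}})
provider = Provider({"wl-2": ["vm-a", "vm-b"], "wl-3": ["vm-c"]})
mgmt.delete_custom_resource("wl-2")
removed = provider.reconcile(mgmt.events)
```

The caller in this model would then wait until `wl-2` no longer appears in `provider.infrastructure` before continuing with any later cleanup.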
上文详述了容器集群的管理方法对应的各个实施例,在此基础上,本申请还公开了与上述方法对应的容器集群的管理装置、设备及非易失性可读存储介质。The above describes in detail various embodiments corresponding to the container cluster management method. On this basis, the present application also discloses a container cluster management device, equipment and non-volatile readable storage medium corresponding to the above method.
图3为本申请实施例提供的一种容器集群的管理装置的结构示意图。FIG3 is a schematic diagram of the structure of a container cluster management device provided in an embodiment of the present application.
如图3所示,本申请实施例提供的容器集群的管理装置包括:As shown in FIG3 , the container cluster management device provided in the embodiment of the present application includes:
环境准备单元301,被设置为预先基于云平台在kubernetes集群部署集群应用程序接口,以创建管理集群;基于云平台配置针对不同类型云平台的基础设施资源池进行统一管理的计算资源管理模块,以及被设置为维护不同版本的虚拟机的分区管理模块;将多种版本的虚拟机镜像推送至各基础设施资源池的镜像仓库;将kubernetes负载集群所需附属组件的容器镜像推送至云平台的第一容器镜像仓库,并将容器集群应用部署模板推送至云平台的图表仓库;The environment preparation unit 301 is configured to pre-deploy a cluster application program interface in a kubernetes cluster based on a cloud platform to create a management cluster; configure a computing resource management module for unified management of infrastructure resource pools of different types of cloud platforms based on the cloud platform, and a partition management module configured to maintain different versions of virtual machines; push multiple versions of virtual machine images to the image warehouse of each infrastructure resource pool; push the container image of the auxiliary components required by the kubernetes load cluster to the first container image warehouse of the cloud platform, and push the container cluster application deployment template to the chart warehouse of the cloud platform;
识别单元302,被设置为接收到基于客户端发送的kubernetes负载集群创建请求时,识别得到待创建的第一目标负载集群的基础设施资源池类型和第一目标负载集群所需的基础设施类型;The identification unit 302 is configured to, upon receiving a kubernetes load cluster creation request sent by a client, identify an infrastructure resource pool type of a first target load cluster to be created and an infrastructure type required by the first target load cluster;
创建单元303,被设置为基于集群应用程序接口采用管理集群声明的方式,调用计算资源管理模块和分区管理模块在基础设施资源池类型对应的第一基础设施资源池创建第一目标负载集群并为第一目标负载集群完成基础设施部署。The creation unit 303 is configured to use a management cluster declaration method based on the cluster application program interface to call the computing resource management module and the partition management module to create a first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and complete infrastructure deployment for the first target load cluster.
进一步的,环境准备单元301将多种版本的虚拟机镜像推送至各基础设施资源池的镜像仓库,可以包括:Furthermore, the environment preparation unit 301 pushes multiple versions of virtual machine images to the image warehouse of each infrastructure resource pool, which may include:
制作虚拟机镜像的过程中,在镜像内放置对应CPU架构的特定版本的kubernetes相关的二进制资源配置文件到指定目录,安装第二容器镜像仓库,并将对应CPU架构和对应版本的kubernetes相关镜像推送至第二容器镜像仓库;In the process of making the virtual machine image, placing the kubernetes-related binary resource configuration files of the specific version for the corresponding CPU architecture in a specified directory inside the image, installing a second container image repository, and pushing the kubernetes-related images of the corresponding CPU architecture and version to the second container image repository;
将制作完成的虚拟机镜像推送至各基础设施资源池的镜像仓库。Push the completed virtual machine image to the image repository of each infrastructure resource pool.
进一步的,分区管理模块的分区与虚拟机镜像一一对应;Furthermore, the partitions of the partition management module correspond one-to-one with the virtual machine images;
相应的,分区的分区信息可以包括与第一目标负载集群中的虚拟机对应的CPU架构、操作系统类型、操作系统版本、kubernetes版本、容器运行时类型、容器运行时版本、虚拟机镜像ID。Correspondingly, the partition information of the partition may include the CPU architecture, operating system type, operating system version, kubernetes version, container runtime type, container runtime version, and virtual machine image ID corresponding to the virtual machine in the first target load cluster.
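The partition information listed above maps naturally onto a record type with one field per attribute. The following is a minimal sketch, assuming a lookup by exact attribute match; the class name `Partition`, the helper `find_image_id` and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """One partition, corresponding one-to-one to a virtual machine image."""
    cpu_arch: str
    os_type: str
    os_version: str
    kubernetes_version: str
    container_runtime: str
    container_runtime_version: str
    vm_image_id: str


def find_image_id(partitions, **wanted):
    """Return the image ID of the first partition matching every wanted field."""
    for partition in partitions:
        if all(getattr(partition, key) == value for key, value in wanted.items()):
            return partition.vm_image_id
    return None


# Sample values are made up for illustration.
partitions = [
    Partition("x86_64", "linux-a", "1.0", "v1.0", "runtime-a", "1.0", "img-001"),
    Partition("aarch64", "linux-a", "1.0", "v1.0", "runtime-a", "1.0", "img-002"),
]
image_id = find_image_id(partitions, cpu_arch="aarch64", kubernetes_version="v1.0")
missing = find_image_id(partitions, cpu_arch="riscv64")
```

Since partitions and images correspond one-to-one, a match on the requested attributes identifies a single image ID.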
进一步的,创建单元303可以包括:Furthermore, the creation unit 303 may include:
基础设施拉取子单元,被设置为调用计算资源管理模块获取第一基础设施资源池的镜像服务组件接口后,调用分区管理模块在第一基础设施资源池的镜像仓库获取第一目标负载集群所需的虚拟机镜像ID;The infrastructure pulling subunit is configured to call the computing resource management module to obtain the image service component interface of the first infrastructure resource pool, and then call the partition management module to obtain the virtual machine image ID required by the first target load cluster from the image repository of the first infrastructure resource pool;
扩展资源创建子单元,被设置为根据入参生成并创建扩展资源文件,并为第一目标负载集群创建用于管控kubernetes集群中的自定义资源的扩展资源;The extended resource creation subunit is configured to generate and create an extended resource file according to input parameters, and to create an extended resource for the first target load cluster for managing custom resources in the kubernetes cluster;
集群编排子单元,被设置为当监听到自定义资源时,基于集群应用程序接口和与第一基础设施资源池类型对应的资源供应接口,调用kubernetes编排工具在第一基础设施资源池中进行kubernetes负载集群的创建与部署;The cluster orchestration subunit is configured to, when monitoring a custom resource, call a kubernetes orchestration tool to create and deploy a kubernetes load cluster in the first infrastructure resource pool based on a cluster application program interface and a resource provisioning interface corresponding to the first infrastructure resource pool type;
集群注册子单元,被设置为当监听到第一目标负载集群的所有自定义资源的应用接口服务部署完毕后,获取第一目标负载集群的标识并记录;The cluster registration subunit is configured to obtain and record the identifier of the first target load cluster after monitoring that the application interface services of all custom resources of the first target load cluster are deployed;
集群部署子单元,被设置为基于客户端修改工作节点的标签设置角色,自容器镜像仓库中调用第一目标负载集群所需的容器镜像,自图表仓库中获取第一目标负载集群所需的容器集群应用部署模板,在第一目标负载集群中完成附属组件的安装与启动。The cluster deployment subunit is configured to set roles by modifying the labels of the worker nodes via the client, pull the container images required by the first target load cluster from the container image repository, obtain the container cluster application deployment template required by the first target load cluster from the chart repository, and complete the installation and startup of the auxiliary components in the first target load cluster.
进一步的,本申请实施例提供的容器集群的管理装置还包括:Furthermore, the container cluster management device provided in the embodiment of the present application also includes:
删除单元,被设置为当接收到基于客户端发送的对已创建的第二目标负载集群的删除命令时,调用计算资源管理模块以及第二目标负载集群所在的第二基础设施资源池的资源供应接口,在第二基础设施资源池中删除第二目标负载集群的基础设施。The deletion unit is configured to call the computing resource management module and the resource supply interface of the second infrastructure resource pool where the second target load cluster is located when receiving a deletion command for the created second target load cluster sent by the client, and delete the infrastructure of the second target load cluster in the second infrastructure resource pool.
进一步的,本申请实施例提供的容器集群的管理装置还包括:Furthermore, the container cluster management device provided in the embodiment of the present application also includes:
账户管理单元,被设置为在基于集群应用程序接口采用管理集群声明的方式,调用计算资源管理模块和分区管理模块在基础设施资源池类型对应的第一基础设施资源池创建第一目标负载集群并为第一目标负载集群完成基础设施部署之前,为第一目标负载集群创建与用户的一一对应关系,并为第一目标负载集群配置安全认证规则。The account management unit is configured to create a one-to-one correspondence between the first target load cluster and the user, and configure security authentication rules for the first target load cluster before calling the computing resource management module and the partition management module to create a first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and completing infrastructure deployment for the first target load cluster in a management cluster declaration manner based on the cluster application program interface.
进一步的,为第一目标负载集群创建与用户的一一对应关系,并为第一目标负载集群配置安全认证规则,可以为:Furthermore, a one-to-one correspondence between the first target load cluster and the user is created, and a security authentication rule is configured for the first target load cluster, which may be:
调用第一基础设施资源池的安全认证服务接口创建用户,并为用户配置对应的安全认证规则。The security authentication service interface of the first infrastructure resource pool is called to create a user, and corresponding security authentication rules are configured for the user.
进一步的,当第一基础设施资源池为OpenStack基础设施资源池时,为用户配置对应的安全认证规则,可以为:Furthermore, when the first infrastructure resource pool is an OpenStack infrastructure resource pool, the corresponding security authentication rules are configured for the user, which may be:
采用OpenStack基础设施资源池的计算服务组件nova创建密钥对。Use nova, a computing component of the OpenStack infrastructure resource pool, to create a key pair.
进一步的,本申请实施例提供的容器集群的管理装置还包括:Furthermore, the container cluster management device provided in the embodiment of the present application also includes:
基础设施删除单元,被设置为当接收到基于客户端发送的对已创建的第二目标负载集群的删除命令时,调用计算资源管理模块以及第二目标负载集群所在的第二基础设施资源池的资源供应接口,在第二基础设施资源池中删除第二目标负载集群的基础设施;The infrastructure deletion unit is configured to, when receiving a deletion command for the created second target load cluster sent based on the client, call the computing resource management module and the resource supply interface of the second infrastructure resource pool where the second target load cluster is located, and delete the infrastructure of the second target load cluster in the second infrastructure resource pool;
账户信息删除单元,被设置为调用计算资源管理模块以及第二基础设施资源池的资源供应接口,获取第二目标负载集群的认证信息后删除第二目标负载集群下对应用户的卷,并删除对应的安全认证规则,删除用户信息及第二目标负载集群在云平台数据库中的记录。The account information deletion unit is configured to call the computing resource management module and the resource supply interface of the second infrastructure resource pool, obtain the authentication information of the second target load cluster, delete the volume of the corresponding user under the second target load cluster, delete the corresponding security authentication rules, and delete the user information and the record of the second target load cluster in the cloud platform database.
进一步的,删除单元或基础设施删除单元调用计算资源管理模块以及第二目标负载集群所在的第二基础设施资源池的资源供应接口,在第二基础设施资源池中删除第二目标负载集群的基础设施,包括:Further, the deletion unit or the infrastructure deletion unit calling the computing resource management module and the resource supply interface of the second infrastructure resource pool where the second target load cluster is located, and deleting the infrastructure of the second target load cluster in the second infrastructure resource pool, includes:
调用计算资源管理模块执行对第二基础设施资源池中第二目标负载集群的自定义资源的删除操作;Invoke the computing resource management module to execute a deletion operation on the custom resources of the second target load cluster in the second infrastructure resource pool;
基于集群应用程序接口监测到删除操作后,通过第二基础设施资源池的资源供应接口,在第二基础设施资源池中删除第二目标负载集群的基础设施。After the deletion operation is detected based on the cluster application program interface, the infrastructure of the second target load cluster is deleted in the second infrastructure resource pool through the resource supply interface of the second infrastructure resource pool.
进一步的,本申请实施例提供的容器集群的管理装置还包括:Furthermore, the container cluster management device provided in the embodiment of the present application also includes:
校验单元,被设置为在基于集群应用程序接口采用管理集群声明的方式,调用计算资源管理模块和分区管理模块在基础设施资源池类型对应的第一基础设施资源池创建第一目标负载集群并为第一目标负载集群完成基础设施部署之前,对kubernetes负载集群创建请求进行权限校验和请求参数校验;若通过校验,则进入创建单元303。The verification unit is configured to perform permission verification and request parameter verification on the kubernetes load cluster creation request before the computing resource management module and the partition management module are called, in the management cluster declaration manner based on the cluster application program interface, to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and complete infrastructure deployment for the first target load cluster; if the verification passes, control proceeds to the creation unit 303.
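The pre-creation check performed by the verification unit can be sketched as a function that returns a list of problems, with an empty list meaning the request may proceed to creation. This is an illustrative sketch only: the parameter names in `REQUIRED_PARAMS`, the request layout and the permission model (a set of allowed users) are assumptions not stated in the text.

```python
# Hypothetical set of parameters a creation request must carry.
REQUIRED_PARAMS = ("name", "pool_type", "kubernetes_version")


def validate_create_request(request, allowed_users):
    """Return a list of validation errors; an empty list means 'passed'."""
    errors = []
    # Permission verification: is the requesting user allowed to create?
    if request.get("user") not in allowed_users:
        errors.append("permission denied")
    # Request parameter verification: every required field must be non-empty.
    for key in REQUIRED_PARAMS:
        if not request.get(key):
            errors.append(f"missing parameter: {key}")
    return errors


allowed = {"alice"}
good = {"user": "alice", "name": "wl-1", "pool_type": "openstack",
        "kubernetes_version": "v1.0"}
bad = {"user": "bob", "name": "wl-9", "pool_type": ""}

good_errors = validate_create_request(good, allowed)
bad_errors = validate_create_request(bad, allowed)
```

Collecting every error rather than stopping at the first lets the client correct a rejected request in one round trip.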
进一步的,本申请实施例提供的容器集群的管理装置还包括:Furthermore, the container cluster management device provided in the embodiment of the present application also includes:
记录单元,被设置为在云平台的数据库为第一目标负载集群创建对应的管理记录,并将第一目标负载集群的集群信息同步更新至管理记录。The recording unit is configured to create a corresponding management record for the first target load cluster in the database of the cloud platform, and synchronously update the cluster information of the first target load cluster to the management record.
由于装置部分的实施例与方法部分的实施例相互对应,因此装置部分的实施例请参见方法部分的实施例的描述,这里暂不赘述。Since the embodiments of the apparatus part correspond to the embodiments of the method part, please refer to the description of the embodiments of the method part for the embodiments of the apparatus part, which will not be repeated here.
图4为本申请实施例提供的一种容器集群的管理设备的结构示意图。FIG4 is a schematic diagram of the structure of a container cluster management device provided in an embodiment of the present application.
如图4所示,本申请实施例提供的容器集群的管理设备包括:As shown in FIG4 , the management device of the container cluster provided in the embodiment of the present application includes:
存储器410,被设置为存储计算机程序411;A memory 410 configured to store a computer program 411;
处理器420,被设置为执行计算机程序411,该计算机程序411被处理器420执行时实现如上述任意一项实施例上述容器集群的管理方法的步骤。The processor 420 is configured to execute a computer program 411. When the computer program 411 is executed by the processor 420, the steps of the container cluster management method according to any one of the above embodiments are implemented.
其中,处理器420可以包括一个或多个处理核心,比如3核心处理器、8核心处理器等。处理器420可以采用数字信号处理DSP(Digital Signal Processing)、现场可编程门阵列FPGA(Field-Programmable Gate Array)、可编程逻辑阵列PLA(Programmable Logic Array)中的至少一种硬件形式来实现。处理器420也可以包括主处理器和协处理器,主处理器是被设置为对在唤醒状态下的数据进行处理的处理器,也称中央处理器CPU(Central Processing Unit);协处理器是被设置为对在待机状态下的数据进行处理的低功耗处理器。在一些实施例中,处理器420可以集成有图像处理器GPU(Graphics Processing Unit),GPU被设置为负责显示屏所需要显示的内容的渲染和绘制。一些实施例中,处理器420还可以包括人工智能AI(Artificial Intelligence)处理器,该AI处理器被设置为处理有关机器学习的计算操作。The processor 420 may include one or more processing cores, such as a 3-core processor, an 8-core processor, etc. The processor 420 may be implemented in at least one of the following hardware forms: a digital signal processor DSP (Digital Signal Processing), a field-programmable gate array FPGA (Field-Programmable Gate Array), and a programmable logic array PLA (Programmable Logic Array). The processor 420 may also include a main processor and a coprocessor. The main processor is a processor configured to process data in an awake state, also known as a central processing unit CPU (Central Processing Unit); the coprocessor is a low-power processor configured to process data in a standby state. In some embodiments, the processor 420 may be integrated with a graphics processor GPU (Graphics Processing Unit), and the GPU is configured to be responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 420 may also include an artificial intelligence AI (Artificial Intelligence) processor, which is configured to process computing operations related to machine learning.
存储器410可以包括一个或多个非易失性可读存储介质,该非易失性可读存储介质可以是非暂态的。存储器410还可包括高速随机存取存储器,以及非易失性存储器,比如一个或多个磁盘存储设备、闪存存储设备。本实施例中,存储器410至少被设置为存储以下计算机程序411,其中,该计算机程序411被处理器420加载并执行之后,能够实现前述任一实施例公开的容器集群的管理方法中的相关步骤。另外,存储器410所存储的资源还可以包括操作系统412和数据413等,存储方式可以是短暂存储或者永久存储。其中,操作系统412可以为Windows。数据413可以包括但不限于上述方法所涉及到的数据。The memory 410 may include one or more non-volatile readable storage media, which may be non-transitory. The memory 410 may also include a high-speed random access memory, and a non-volatile memory, such as one or more disk storage devices, flash memory storage devices. In this embodiment, the memory 410 is at least configured to store the following computer program 411, wherein the computer program 411, after being loaded and executed by the processor 420, can implement the relevant steps in the management method of the container cluster disclosed in any of the aforementioned embodiments. In addition, the resources stored in the memory 410 may also include an operating system 412 and data 413, etc., and the storage method may be short-term storage or permanent storage. Among them, the operating system 412 may be Windows. Data 413 may include, but is not limited to, the data involved in the above method.
在一些实施例中,容器集群的管理设备还可包括有显示屏430、电源440、通信接口450、输入输出接口460、传感器470以及通信总线480。In some embodiments, the management device of the container cluster may further include a display screen 430 , a power supply 440 , a communication interface 450 , an input/output interface 460 , a sensor 470 , and a communication bus 480 .
本领域技术人员可以理解,图4中示出的结构并不构成对容器集群的管理设备的限定,可以包括比图示更多或更少的组件。Those skilled in the art will appreciate that the structure shown in FIG. 4 does not constitute a limitation on the management device of the container cluster, and may include more or fewer components than shown in the figure.
本申请实施例提供的容器集群的管理设备,包括存储器和处理器,处理器在执行存储器存储的程序时,能够实现如上上述的容器集群的管理方法,效果同上。The container cluster management device provided in an embodiment of the present application includes a memory and a processor. When the processor executes a program stored in the memory, it can implement the container cluster management method as described above, and the effect is the same as above.
需要说明的是,以上所描述的装置、设备实施例仅仅是示意性的,例如,模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个模块或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或模块的间接耦合或通信连接,可以是电性,机械或其它的形式。作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理模块,即可以位于一个地方,或者也可以分布到多个网络模块上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。It should be noted that the apparatus and device embodiments described above are merely illustrative. For example, the division into modules is merely a division by logical function, and other divisions are possible in actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, apparatuses or modules, and may be electrical, mechanical or of another form. Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
另外,在本申请各个实施例中的各功能模块可以集成在一个处理模块中,也可以是各个模块单独物理存在,也可以两个或两个以上模块集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。In addition, each functional module in each embodiment of the present application can be integrated into a processing module, or each module can exist physically separately, or two or more modules can be integrated into one module. The above integrated modules can be implemented in the form of hardware or software functional modules.
集成的模块如果以软件功能模块的形式实现并作为独立的产品销售或使用时,可以存储在一个非易失性可读存储介质中。基于这样的理解,本申请的技术方案本质上或者说对相关技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个非易失性可读存储介质中,执行本申请各个实施例上述方法的全部或部分步骤。If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it can be stored in a non-volatile readable storage medium. Based on this understanding, the technical solution of the present application, or the part that contributes to the relevant technology, or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a non-volatile readable storage medium to execute all or part of the steps of the above methods of each embodiment of the present application.
为此,本申请实施例还提供一种非易失性可读存储介质,该非易失性可读存储介质上存储有计算机程序,计算机程序被处理器执行时实现如容器集群的管理方法的步骤。To this end, an embodiment of the present application further provides a non-volatile readable storage medium, on which a computer program is stored. When the computer program is executed by a processor, the steps of the container cluster management method are implemented.
该非易失性可读存储介质可以包括:U盘、移动硬盘、只读存储器ROM(Read-Only Memory)、随机存取存储器RAM(Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。The non-volatile readable storage medium may include: a U disk, a mobile hard disk, a read-only memory ROM (Read-Only Memory), a random access memory RAM (Random Access Memory), a magnetic disk or an optical disk, and other media that can store program codes.
本实施例中提供的非易失性可读存储介质所包含的计算机程序能够在被处理器执行时实现如上上述的容器集群的管理方法的步骤,效果同上。The computer program included in the non-volatile readable storage medium provided in this embodiment can implement the steps of the above-mentioned container cluster management method when executed by the processor, and the effect is the same as above.
以上对本申请所提供的一种容器集群的管理方法、装置、设备及非易失性可读存储介质进行了详细介绍。说明书中各个实施例采用递进的方式描述,每个实施例重点说明的都是与其他实施例的不同之处,各个实施例之间相同相似部分互相参见即可。对于实施例公开的装置、设备及非易失性可读存储介质而言,由于其与实施例公开的方法相对应,所以描述的比较简单,相关之处参见方法部分说明即可。应当指出,对于本技术领域的普通技术人员来说,在不脱离本申请原理的前提下,还可以对本申请进行若干改进和修饰,这些改进和修饰也落入本申请权利要求的保护范围内。The above is a detailed introduction to a container cluster management method, device, equipment and non-volatile readable storage medium provided by the present application. The various embodiments in the specification are described in a progressive manner, and each embodiment focuses on the differences from other embodiments. The same and similar parts between the embodiments can refer to each other. For the devices, equipment and non-volatile readable storage medium disclosed in the embodiments, since they correspond to the methods disclosed in the embodiments, the description is relatively simple, and the relevant parts can be referred to the method part description. It should be pointed out that for ordinary technicians in this technical field, without departing from the principles of the present application, several improvements and modifications can be made to the present application, and these improvements and modifications also fall within the scope of protection of the claims of the present application.
还需要说明的是,在本说明书中,诸如第一和第二等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括上述要素的过程、方法、物品或者设备中还存在另外的相同要素。 It should also be noted that, in this specification, relational terms such as first and second, etc. are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements, but also other elements not explicitly listed, or also includes elements inherent to such process, method, article or device. In the absence of further restrictions, the elements defined by the sentence "comprise a ..." do not exclude the presence of other identical elements in the process, method, article or device including the above elements.

Claims (20)

  1. 一种容器集群的管理方法,其特征在于,包括:A method for managing a container cluster, comprising:
    预先基于云平台在kubernetes集群部署集群应用程序接口,以创建管理集群;基于所述云平台配置针对不同类型所述云平台的基础设施资源池进行统一管理的计算资源管理模块,以及用于维护不同版本的虚拟机的分区管理模块;将多种版本的虚拟机镜像推送至各所述基础设施资源池的镜像仓库;将kubernetes负载集群所需附属组件的容器镜像推送至所述云平台的第一容器镜像仓库,并将容器集群应用部署模板推送至所述云平台的图表仓库;Pre-deploy a cluster application program interface in the kubernetes cluster based on the cloud platform to create a management cluster; configure a computing resource management module for unified management of different types of infrastructure resource pools of the cloud platform based on the cloud platform, and a partition management module for maintaining different versions of virtual machines; push multiple versions of virtual machine images to the image warehouse of each infrastructure resource pool; push the container image of the auxiliary components required by the kubernetes load cluster to the first container image warehouse of the cloud platform, and push the container cluster application deployment template to the chart warehouse of the cloud platform;
    接收到基于客户端发送的kubernetes负载集群创建请求时,识别得到待创建的第一目标负载集群的基础设施资源池类型和所述第一目标负载集群所需的基础设施类型;When receiving a kubernetes load cluster creation request sent by a client, identifying an infrastructure resource pool type of a first target load cluster to be created and an infrastructure type required by the first target load cluster;
    基于所述集群应用程序接口采用管理集群声明的方式,调用所述计算资源管理模块和所述分区管理模块在所述基础设施资源池类型对应的第一基础设施资源池创建所述第一目标负载集群并为所述第一目标负载集群完成基础设施部署。Based on the cluster application program interface, the computing resource management module and the partition management module are called to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and complete infrastructure deployment for the first target load cluster.
  2. 根据权利要求1所述的容器集群的管理方法,其特征在于,所述将多种版本的虚拟机镜像推送至各所述基础设施资源池的镜像仓库,具体包括:The method for managing a container cluster according to claim 1, wherein pushing multiple versions of virtual machine images to the image repository of each of the infrastructure resource pools specifically includes:
    制作所述虚拟机镜像的过程中,在镜像内放置对应CPU架构的特定版本的kubernetes相关的二进制资源配置文件到指定目录,安装第二容器镜像仓库,并将对应CPU架构和对应版本的kubernetes相关镜像推送至所述第二容器镜像仓库;In the process of making the virtual machine image, a binary resource configuration file related to Kubernetes of a specific version corresponding to the CPU architecture is placed in the image to a specified directory, a second container image repository is installed, and the Kubernetes-related image corresponding to the CPU architecture and the corresponding version is pushed to the second container image repository;
    将制作完成的所述虚拟机镜像推送至各所述基础设施资源池的镜像仓库。The produced virtual machine image is pushed to the image warehouse of each infrastructure resource pool.
  3. 根据权利要求2所述的容器集群的管理方法,其特征在于,所述分区管理模块的分区与虚拟机镜像一一对应;The container cluster management method according to claim 2, characterized in that the partitions of the partition management module correspond one-to-one to the virtual machine images;
    相应的,所述分区的分区信息具体包括与所述第一目标负载集群中的虚拟机对应的CPU架构、操作系统类型、操作系统版本、kubernetes版本、容器运行时类型、容器运行时版本、虚拟机镜像ID。Correspondingly, the partition information of the partition specifically includes the CPU architecture, operating system type, operating system version, kubernetes version, container runtime type, container runtime version, and virtual machine image ID corresponding to the virtual machine in the first target load cluster.
  4. The method for managing a container cluster according to claim 1, wherein, based on the cluster application program interface and using a management cluster declaration, calling the computing resource management module and the partition management module to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and to complete infrastructure deployment for the first target load cluster specifically comprises:
    after calling the computing resource management module to obtain the image service component interface of the first infrastructure resource pool, calling the partition management module to obtain, from the image repository of the first infrastructure resource pool, the virtual machine image ID required by the first target load cluster;
    generating and creating an extended resource file according to the input parameters, and creating, for the first target load cluster, an extended resource for managing custom resources in the Kubernetes cluster;
    when the custom resource is detected, calling, based on the cluster application program interface and the resource provisioning interface corresponding to the type of the first infrastructure resource pool, a Kubernetes orchestration tool to create and deploy a Kubernetes load cluster in the first infrastructure resource pool;
    after detecting that the application interface services of all the custom resources of the first target load cluster have been deployed, obtaining and recording the identifier of the first target load cluster; and
    modifying, based on the client, the labels of the worker nodes to set their roles, pulling from the container image repository the container images required by the first target load cluster, obtaining from the chart repository the container cluster application deployment template required by the first target load cluster, and completing the installation and startup of the auxiliary components in the first target load cluster.
  5. The method for managing a container cluster according to claim 4, wherein calling the partition management module to obtain, from the image repository of the first infrastructure resource pool, the virtual machine image ID required by the first target load cluster comprises:
    calling the partition management module to query the ID of a virtual machine image that satisfies the CPU architecture and Kubernetes version required by the Kubernetes load cluster creation request, and using that ID as the virtual machine image ID.
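The lookup in claim 5 amounts to filtering partitions on two keys. A minimal sketch, assuming partitions are held as plain dictionaries with hypothetical keys `cpu_arch`, `kubernetes_version` and `vm_image_id` (the application does not specify a storage format):

```python
def find_vm_image_id(partitions, cpu_arch, kubernetes_version):
    """Return the image ID of the first partition matching both conditions."""
    for partition in partitions:
        if (partition["cpu_arch"] == cpu_arch
                and partition["kubernetes_version"] == kubernetes_version):
            return partition["vm_image_id"]
    raise LookupError(
        f"no virtual machine image for {cpu_arch} / Kubernetes {kubernetes_version}"
    )
```

Raising an error when nothing matches, rather than returning a default image, is a design choice of this sketch: a creation request for an unsupported architecture or version then fails before any infrastructure is provisioned.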
  6. The method for managing a container cluster according to claim 4, wherein generating and creating an extended resource file according to the input parameters, and creating, for the first target load cluster, an extended resource for managing custom resources in the Kubernetes cluster comprises:
    generating and creating the extended resource file according to the input parameters and creating the extended resource, wherein the extended resource is a CRD resource used for management and control in the Kubernetes cluster.
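As an illustration of generating a resource file from input parameters, the sketch below builds a Cluster API style `Cluster` object. This is an assumption: the claims only mention `cluster.x-k8s` resources and do not give a schema. The parameter names (`name`, `namespace`, `pod_cidr`, `infra_api_version`, `infra_kind`) are hypothetical, and the infrastructure provider's kind and API version are passed in rather than guessed. The manifest is rendered as JSON, which Kubernetes tooling accepts alongside YAML.

```python
import json


def build_cluster_manifest(params):
    """Build a Cluster custom resource as a plain dict from input parameters."""
    name = params["name"]
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",  # assumed Cluster API version
        "kind": "Cluster",
        "metadata": {
            "name": name,
            "namespace": params.get("namespace", "default"),
        },
        "spec": {
            "clusterNetwork": {"pods": {"cidrBlocks": [params["pod_cidr"]]}},
            # Reference to the provider-specific infrastructure object.
            "infrastructureRef": {
                "apiVersion": params["infra_api_version"],
                "kind": params["infra_kind"],
                "name": name,
            },
        },
    }


def render_manifest(params):
    """Serialize the manifest to text that can be written to a resource file."""
    return json.dumps(build_cluster_manifest(params), indent=2)
```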
  7. The method for managing a container cluster according to claim 4, wherein, when the custom resource is detected, calling, based on the cluster application program interface and the resource provisioning interface corresponding to the type of the first infrastructure resource pool, a Kubernetes orchestration tool to create and deploy a Kubernetes load cluster in the first infrastructure resource pool comprises:
    when the custom resource is detected, calling the interface of the infrastructure resource pool through the cloud platform's resource provisioning interface corresponding to the type of the first infrastructure resource pool to create virtual machine, security group and cloud disk resources, and injecting a cloud platform initialization script into the virtual machine, wherein the cloud platform initialization script is used to initialize Kubernetes and to create and modify Kubernetes configuration parameters; and running the cloud platform initialization script to call the Kubernetes orchestration tool preset in the virtual machine image to deploy the first target load cluster.
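The initialization script injected into the virtual machine could be rendered from the request parameters before it is passed to the resource pool as user data. The sketch below assumes that the preset orchestration tool is `kubeadm`; the claim does not name the tool, so this is an illustrative choice, and the function name and parameters are hypothetical.

```python
def render_init_script(kubernetes_version, pod_cidr):
    """Render a shell script that bootstraps the first control-plane node.

    Assumes kubeadm is already present in the virtual machine image.
    """
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        # kubeadm pulls its images and writes the cluster configuration.
        "kubeadm init"
        f" --kubernetes-version {kubernetes_version}"
        f" --pod-network-cidr {pod_cidr}",
    ]
    return "\n".join(lines) + "\n"
```

Because the virtual machine image already carries the binaries and a local container image repository (claim 2), a script of this kind would not need to download anything from outside the resource pool.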
  8. The method for managing a container cluster according to claim 4, wherein, after detecting that the application interface services of all the custom resources of the first target load cluster have been deployed, obtaining and recording the identifier of the first target load cluster comprises:
    monitoring the status of the cluster.x-k8s resource, querying the node list once the application interface service is ready, and, after the application interface service has been deployed, calling the cluster application program interface to obtain the identifier of the first target load cluster and recording the identifier in the management record corresponding to the first target load cluster.
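The monitoring and recording steps of claim 8 follow a poll-until-ready pattern. In this minimal sketch the two callables `get_phase` and `get_cluster_id` stand in for calls to the cluster application program interface and are hypothetical, as is the `ready_phase` default of `"Provisioned"`; the management record is modelled as a dictionary.

```python
import time


def register_cluster(get_phase, get_cluster_id, record,
                     ready_phase="Provisioned", timeout_s=600.0,
                     interval_s=5.0, sleep=time.sleep, clock=time.monotonic):
    """Poll the cluster resource until ready, then store its identifier."""
    deadline = clock() + timeout_s
    while get_phase() != ready_phase:
        if clock() >= deadline:
            raise TimeoutError("API service did not become ready in time")
        sleep(interval_s)
    # Only record the identifier once the API service is known to be up.
    record["cluster_id"] = get_cluster_id()
    return record
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; a production controller would more likely use a watch on the resource than fixed-interval polling.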
  9. The method for managing a container cluster according to claim 4, wherein completing the installation and startup of the auxiliary components in the first target load cluster comprises:
    installing the auxiliary components into the first target load cluster using the software package management tool in the container cluster application deployment template; and
    after the auxiliary components have been installed, starting the auxiliary components and recording the deployment-complete status in the management record corresponding to the first target load cluster.
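If the software package management tool is Helm (an assumption suggested by the chart repository, not stated in the claims), the installation step could be sketched as follows. The release names, chart names, repository URL and status value are placeholders; the `run` parameter defaults to `subprocess.run` and can be replaced for dry runs.

```python
import subprocess


def helm_install_command(release, chart, repo_url, kubeconfig,
                         namespace="kube-system"):
    """Build the argument list for one `helm install` invocation."""
    return [
        "helm", "install", release, chart,
        "--repo", repo_url,          # chart repository of the cloud platform
        "--namespace", namespace,
        "--kubeconfig", kubeconfig,  # credentials of the target load cluster
        "--wait",                    # block until the release is ready
    ]


def install_addons(addons, repo_url, kubeconfig, record, run=subprocess.run):
    """Install each (release, chart) pair, then mark the record as deployed."""
    for release, chart in addons:
        run(helm_install_command(release, chart, repo_url, kubeconfig),
            check=True)
    record["status"] = "deployed"
    return record
```

With `check=True`, a failed installation raises an exception before the status is written, so the management record is only marked as deployed when every component installed successfully.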
  10. The method for managing a container cluster according to claim 1, further comprising:
    upon receiving a deletion command, sent by the client, for an already created second target load cluster, calling the computing resource management module and the resource provisioning interface of the second infrastructure resource pool in which the second target load cluster is located, and deleting the infrastructure of the second target load cluster from the second infrastructure resource pool.
  11. The method for managing a container cluster according to claim 1, wherein, before calling, based on the cluster application program interface and using a management cluster declaration, the computing resource management module and the partition management module to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and to complete infrastructure deployment for the first target load cluster, the method further comprises:
    creating a one-to-one correspondence between the first target load cluster and a user, and configuring a security authentication rule for the first target load cluster.
  12. The method for managing a container cluster according to claim 11, wherein creating a one-to-one correspondence between the first target load cluster and a user, and configuring a security authentication rule for the first target load cluster is specifically:
    calling the security authentication service interface of the first infrastructure resource pool to create the user, and configuring the corresponding security authentication rule for the user.
  13. The method for managing a container cluster according to claim 12, wherein, when the first infrastructure resource pool is an OpenStack infrastructure resource pool, configuring the corresponding security authentication rule for the user is specifically:
    creating a key pair using nova, the compute service component of the OpenStack infrastructure resource pool.
  14. The method for managing a container cluster according to claim 11, further comprising:
    upon receiving a deletion command, sent by the client, for an already created second target load cluster, calling the computing resource management module and the resource provisioning interface of the second infrastructure resource pool in which the second target load cluster is located, and deleting the infrastructure of the second target load cluster from the second infrastructure resource pool; and
    calling the computing resource management module and the resource provisioning interface of the second infrastructure resource pool to obtain the authentication information of the second target load cluster, then deleting the volumes of the corresponding user under the second target load cluster, deleting the corresponding security authentication rule, and deleting the user information and the record of the second target load cluster in the cloud platform database.
  15. The method for managing a container cluster according to claim 10 or 14, wherein calling the computing resource management module and the resource provisioning interface of the second infrastructure resource pool in which the second target load cluster is located, and deleting the infrastructure of the second target load cluster from the second infrastructure resource pool specifically comprises:
    calling the computing resource management module to delete the custom resources of the second target load cluster in the second infrastructure resource pool; and
    after the deletion operation is detected based on the cluster application program interface, deleting the infrastructure of the second target load cluster from the second infrastructure resource pool through the resource provisioning interface of the second infrastructure resource pool.
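The deletion flow of claims 14 and 15 is an ordered teardown: the custom resource is deleted first, the infrastructure is removed in response, and the user-related data and database record are cleaned up last. A minimal sketch of that ordering, in which the step names and the `ops` mapping of callables are hypothetical stand-ins for the module and interface calls:

```python
# Order follows claims 14 and 15: infrastructure first, then user data.
TEARDOWN_STEPS = (
    "delete_custom_resource",       # triggers the cluster API controller
    "wait_infrastructure_removed",  # VMs, security groups, disks are gone
    "delete_user_volumes",
    "delete_auth_rules",
    "delete_user",
    "delete_db_record",
)


def delete_cluster(cluster_id, ops):
    """Run each teardown step in order; return the steps that completed."""
    done = []
    for step in TEARDOWN_STEPS:
        ops[step](cluster_id)  # an exception here stops the teardown
        done.append(step)
    return done
```

Deleting the database record last means that a teardown interrupted by an error still leaves a record from which the remaining cleanup can be retried.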
  16. The method for managing a container cluster according to claim 1, wherein, before calling, based on the cluster application program interface and using a management cluster declaration, the computing resource management module and the partition management module to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and to complete infrastructure deployment for the first target load cluster, the method further comprises:
    performing permission verification and request parameter verification on the Kubernetes load cluster creation request; and
    if the verification passes, proceeding to the step of calling, based on the cluster application program interface and using a management cluster declaration, the computing resource management module and the partition management module to complete infrastructure resource pool deployment and infrastructure deployment for the first target load cluster.
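The two checks in claim 16 can be sketched as a single validation function that returns a list of problems, where an empty list means the request may proceed. The permission string `cluster:create` and the required field names are hypothetical; the application does not enumerate them.

```python
REQUIRED_FIELDS = ("name", "cpu_arch", "kubernetes_version", "resource_pool_type")


def validate_create_request(request, user_permissions):
    """Return a list of validation errors; empty means the request is valid."""
    errors = []
    # Permission verification.
    if "cluster:create" not in user_permissions:
        errors.append("permission denied: cluster:create")
    # Request parameter verification.
    for field in REQUIRED_FIELDS:
        if not request.get(field):
            errors.append(f"missing parameter: {field}")
    return errors
```

Collecting all errors instead of stopping at the first one lets the client correct a request in a single round trip.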
  17. The method for managing a container cluster according to claim 1, further comprising:
    creating a corresponding management record for the first target load cluster in the database of the cloud platform, and synchronously updating the cluster information of the first target load cluster to the management record.
  18. An apparatus for managing a container cluster, comprising:
    an environment preparation unit, configured to: deploy in advance, based on a cloud platform, a cluster application program interface in a Kubernetes cluster so as to create a management cluster; configure, based on the cloud platform, a computing resource management module for unified management of the different types of infrastructure resource pools of the cloud platform, and a partition management module for maintaining virtual machines of different versions; push multiple versions of virtual machine images to the image repository of each of the infrastructure resource pools; and push the container images of the auxiliary components required by a Kubernetes load cluster to a first container image repository of the cloud platform, and push a container cluster application deployment template to a chart repository of the cloud platform;
    an identification unit, configured to, upon receiving a Kubernetes load cluster creation request sent by a client, identify the infrastructure resource pool type of a first target load cluster to be created and the infrastructure type required by the first target load cluster; and
    a creation unit, configured to call, based on the cluster application program interface and using a management cluster declaration, the computing resource management module and the partition management module to create the first target load cluster in the first infrastructure resource pool corresponding to the infrastructure resource pool type and to complete infrastructure deployment for the first target load cluster.
  19. A device for managing a container cluster, comprising:
    a memory, configured to store a computer program; and
    a processor, configured to execute the computer program, wherein the computer program, when executed by the processor, implements the steps of the method for managing a container cluster according to any one of claims 1 to 17.
  20. A non-volatile readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method for managing a container cluster according to any one of claims 1 to 17.
PCT/CN2023/085261 2022-10-10 2023-03-30 Management method, apparatus and device for container cluster, and non-volatile readable storage medium WO2024077885A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211231347.8 2022-10-10
CN202211231347.8A CN115292026B (en) 2022-10-10 2022-10-10 Management method, device and equipment of container cluster and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2024077885A1 true WO2024077885A1 (en) 2024-04-18

Family

ID=83819219

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/085261 WO2024077885A1 (en) 2022-10-10 2023-03-30 Management method, apparatus and device for container cluster, and non-volatile readable storage medium

Country Status (2)

Country Link
CN (1) CN115292026B (en)
WO (1) WO2024077885A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116301932B (en) * 2022-12-21 2023-09-29 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Rapid deployment method for large project in kubernetes environment
CN116225624B (en) * 2023-05-09 2023-06-30 江苏博云科技股份有限公司 Bare metal management method, system and device based on kubernets
CN116661979B (en) * 2023-08-02 2023-11-28 之江实验室 Heterogeneous job scheduling system and method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107426034B (en) * 2017-08-18 2020-09-01 国网山东省电力公司信息通信公司 Large-scale container scheduling system and method based on cloud platform
US10592290B2 (en) * 2018-01-10 2020-03-17 International Business Machines Corporation Accelerating and maintaining large-scale cloud deployment
CN109656686A (en) * 2018-12-17 2019-04-19 武汉烽火信息集成技术有限公司 The upper deployment container cloud method of OpenStack, storage medium, electronic equipment and system
CN110750332A (en) * 2019-10-23 2020-02-04 广西梯度科技有限公司 Method for setting static IP (Internet protocol) in Pod in Kubernetes
CN110750335A (en) * 2019-10-25 2020-02-04 北京金山云网络技术有限公司 Resource creating method and device and server
EP4165505A1 (en) * 2020-06-12 2023-04-19 Telefonaktiebolaget LM Ericsson (publ) Container orchestration system
CN112187860A (en) * 2020-08-28 2021-01-05 苏州浪潮智能科技有限公司 Construction method and device of kubernets cluster node mirror image

Also Published As

Publication number Publication date
CN115292026A (en) 2022-11-04
CN115292026B (en) 2023-02-28

Similar Documents

Publication Publication Date Title
CN110389900B (en) Distributed database cluster testing method and device and storage medium
CN108809722B (en) Method, device and storage medium for deploying Kubernetes cluster
WO2024077885A1 (en) Management method, apparatus and device for container cluster, and non-volatile readable storage medium
CN109120678B (en) Method and apparatus for service hosting of distributed storage system
US9459856B2 (en) Effective migration and upgrade of virtual machines in cloud environments
EP3974962A1 (en) Method, apparatus, electronic device, readable storage medium and program for deploying application
CN111614490B (en) Management system and method for managed container cluster based on top-level container cluster
US10061665B2 (en) Preserving management services with self-contained metadata through the disaster recovery life cycle
CN111527474B (en) Dynamic delivery of software functions
US11210132B2 (en) Virtual machine migration in virtualization environment having different virtualization systems
US11144432B2 (en) Testing and reproduction of concurrency issues
CN113474751A (en) Managing software programs
WO2017105897A1 (en) Resource provider sdk
EP4209894A1 (en) Cloud code development system, method, and apparatus, device, and storage medium
US10341181B2 (en) Method and apparatus to allow dynamic changes of a replica network configuration in distributed systems
CN117616395A (en) Continuous liveness and integrity of applications during migration
Tang et al. Application centric lifecycle framework in cloud
CN110019059B (en) Timing synchronization method and device
CN109491762B (en) Container state control method and device, storage medium and electronic equipment
US20180329782A1 (en) Data Migration For A Shared Database
CN117112122A (en) Cluster deployment method and device
CN111767345B (en) Modeling data synchronization method, modeling data synchronization device, computer equipment and readable storage medium
CN112241293A (en) Application management method, device, equipment and medium for industrial internet cloud platform
CN116225624B (en) Bare metal management method, system and device based on kubernets
US20240028335A1 (en) Application state synchronization across computing environments to an alternate application