WO2021109686A1 - Method and apparatus for controlling cluster resources, and cloud computing system

Publication number
WO2021109686A1
WO2021109686A1 PCT/CN2020/117413 CN2020117413W WO2021109686A1 WO 2021109686 A1 WO2021109686 A1 WO 2021109686A1 CN 2020117413 W CN2020117413 W CN 2020117413W WO 2021109686 A1 WO2021109686 A1 WO 2021109686A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2020/117413
Other languages
English (en)
French (fr)
Inventor
沈伯伟
都海峰
李文乔
王俊
白石
韩楚怡
Original Assignee
北京京东尚科信息技术有限公司
北京京东世纪贸易有限公司
Application filed by 北京京东尚科信息技术有限公司, 北京京东世纪贸易有限公司
Priority to JP2022534290A (published as JP2023504870A)
Priority to EP20897260.4A (published as EP4071611A4)
Priority to US17/782,110 (published as US20230004439A1)
Publication of WO2021109686A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5022: Mechanisms to release resources
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5055: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061: Partitioning or combining of resources
    • G06F9/5077: Logical partitioning of resources; Management or configuration of virtualized resources
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00: Indexing scheme relating to G06F9/00
    • G06F2209/50: Indexing scheme relating to G06F9/50
    • G06F2209/5011: Pool

Definitions

  • the present disclosure relates to the field of computer technology, and in particular to a method for controlling cluster resources, a device for controlling cluster resources, a cloud computing system, and a non-volatile computer-readable storage medium.
  • Resource control, such as capacity expansion and contraction, has become an operation that operations and maintenance personnel need to perform on a regular basis.
  • A method for controlling cluster resources, including: when the resource to be controlled is a resource to be expanded, determining a binding relationship between the resource to be expanded and an application; adding the initialized resource to be expanded to the resource pool of the application that has the binding relationship with it; generating a to-be-executed data package of the application to be processed according to the deployment type of that application; and deploying the to-be-executed data package on the resource to be expanded in the resource pool of the application to be processed for execution.
  • Adding the initialized resource to be expanded to the resource pool of the application that has the binding relationship with it includes: passing the relevant information of the resource to be expanded to the pre-script of the corresponding application; and executing the pre-script to complete the initialization of the resource to be expanded.
  • Generating the to-be-executed data package of the application to be processed includes: when the deployment type is package deployment, determining that the resource to be expanded is a physical machine, and generating the program package of the application to be processed as the to-be-executed data package; when the deployment type is image deployment, determining that the resource to be expanded is a container image, generating the program package of the application to be processed, and generating the to-be-executed data package from the program package and the running image of the application to be processed.
  • Deploying the to-be-executed data package on the resource to be expanded in the resource pool of the application to be processed for execution includes: when the resource to be expanded is a physical machine, sending the to-be-executed data package to the physical machine for execution; when the resource to be expanded is a container image, sending the to-be-executed data package to an idle physical machine in the corresponding resource pool for execution.
  • Deploying the to-be-executed data package on the resource to be expanded in the resource pool of the application to be processed for execution includes: obtaining the relevant information of the resource to be expanded in the resource pool; and sending that information, through the deployment interface configured for the application to be processed, to the third-party program of the application to be processed, so that the third-party program deploys the to-be-executed data package on the resource to be expanded for execution according to its own deployment method.
  • The method further includes: executing a post script of the application to be processed, the post script being used for at least one of the following: returning the expansion result to the management node of the cluster; creating a volume for the expanded resource in the resource pool; and cleaning up the garbage generated by the expansion process.
  • The method further includes: establishing an SSH (Secure Shell) connection with each resource to be expanded, the connection being used to execute the related scripts of the corresponding application, where only one SSH connection can be established with each resource to be expanded at a time.
  • the method further includes: retaining the corresponding SSH connection for a preset period of time after the execution of the to-be-executed data packet of the corresponding application is completed.
  • The method further includes: when the resource to be controlled is a resource to be scaled down, determining whether the resource to be scaled down holds important data and whether any service depends on it; and when there is no important data and no dependent service, removing the resource to be scaled down from the cluster.
  • The acquired information about the resource to be scaled down is passed through the configured scale-down interface, so that the resource to be scaled down is removed from the cluster.
  • The method further includes: obtaining the scale-down result by polling the configured query interface.
  • The method further includes: when the resource to be scaled down is a physical machine, adding it to the resource pool; when the resource to be scaled down is a container image, destroying the container image and adding the resource to be scaled down to the resource pool.
  • the method further includes: when the resource to be scaled down is a physical machine, executing a post script of the scale-down processing, the post script is used to start the installation process to reinstall the operating system for the resource to be scaled down.
  • A device for controlling cluster resources, including: a determining unit, configured to determine the binding relationship between the resource to be expanded and an application when the resource to be controlled is a resource to be expanded; an adding unit, configured to add the initialized resource to be expanded to the resource pool of the application that has the binding relationship with it; a generating unit, configured to generate the to-be-executed data package of the application to be processed according to the deployment type of that application; and an execution unit, configured to deploy the to-be-executed data package on the resource to be expanded in the resource pool of the application to be processed for execution.
  • the adding unit transmits the relevant information of the resource to be expanded to the pre-script of the corresponding application, and executes the pre-script to complete the initialization of the resource to be expanded.
  • When the deployment type is package deployment, the resource to be expanded is determined to be a physical machine, and the generating unit generates the program package of the application to be processed as the to-be-executed data package; when the deployment type is image deployment, the resource to be expanded is determined to be a container image, and the generating unit generates the program package of the application to be processed and generates the to-be-executed data package from the program package and the running image of the application to be processed.
  • When the resource to be expanded is a physical machine, the execution unit sends the to-be-executed data package to the physical machine for execution; when the resource to be expanded is a container image, the execution unit sends the to-be-executed data package to an idle physical machine in the corresponding resource pool for execution.
  • The execution unit obtains the relevant information of the resource to be expanded in the resource pool and sends it, through the deployment interface configured for the application to be processed, to the third-party program of that application, so that the third-party program deploys the to-be-executed data package on the resource to be expanded for execution according to its own deployment method.
  • The execution unit executes a post script of the application to be processed, the post script being used for at least one of the following: returning the expansion result to the management node of the cluster; creating a volume for the expanded resource in the resource pool; and cleaning up the garbage generated by the expansion process.
  • The device further includes an establishment unit, configured to establish an SSH connection with each resource to be expanded for executing the related scripts of the corresponding application; only one SSH connection can be established with each resource to be expanded at a time.
  • the establishment unit retains the corresponding SSH connection for a preset period of time after the execution of the to-be-executed data packet of the corresponding application is completed.
  • The device further includes a shrinking unit, configured to determine, when the resource to be controlled is a resource to be scaled down, whether the resource to be scaled down holds important data and whether any service depends on it.
  • The shrinking unit removes the resource to be scaled down from the cluster when there is no important data and no service depends on it.
  • The shrinking unit passes the acquired information about the resource to be scaled down through the configured scale-down interface, so as to remove the resource from the cluster.
  • the shrinking unit obtains the shrinking result by polling the configured query interface.
  • When the resource to be scaled down is a physical machine, the adding unit adds it to the resource pool; when the resource to be scaled down is a container image, the shrinking unit destroys the container image, and the adding unit adds the resource to be scaled down to the resource pool.
  • the execution unit executes a post script of the scale-down processing, and the post script is used to start the installation process to reinstall the operating system for the resource to be scaled down.
  • A device for controlling cluster resources, including: a memory; and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the method for controlling cluster resources in any of the foregoing embodiments.
  • a non-volatile computer-readable storage medium having a computer program stored thereon, and when the program is executed by a processor, the method for controlling cluster resources in any of the above embodiments is implemented.
  • a cloud computing system including: a cluster resource control device, configured to execute the cluster resource control method in any one of the above embodiments.
  • FIG. 1 shows a flowchart of some embodiments of the method for controlling cluster resources of the present disclosure.
  • FIG. 2 shows a flowchart of other embodiments of the method for controlling cluster resources of the present disclosure.
  • FIG. 3 shows a schematic diagram of some embodiments of the cluster resource control device of the present disclosure.
  • FIG. 4 shows a flowchart of still other embodiments of the method for controlling cluster resources of the present disclosure.
  • FIG. 5 shows a schematic diagram of other embodiments of the cluster resource control device of the present disclosure.
  • FIG. 6 shows a schematic diagram of still other embodiments of the cluster resource control device of the present disclosure.
  • FIG. 7 shows a schematic diagram of some embodiments of the method for controlling cluster resources of the present disclosure.
  • FIG. 8 shows a schematic diagram of still other embodiments of the cluster resource control device of the present disclosure.
  • FIG. 9 shows a schematic diagram of still other embodiments of the cluster resource control device of the present disclosure.
  • FIG. 10 shows a block diagram of some embodiments of the cluster resource control device of the present disclosure.
  • FIG. 11 shows a block diagram of other embodiments of the cluster resource control device of the present disclosure.
  • FIG. 12 shows a block diagram of still other embodiments of the cluster resource control device of the present disclosure.
  • FIG. 13 shows a block diagram of some embodiments of the cluster resource control system of the present disclosure.
  • the inventor of the present disclosure found that the above-mentioned related technology has the following problem: it cannot be applied to resource expansion of applications of different services, resulting in poor applicability.
  • the present disclosure proposes a technical solution for controlling cluster resources, which can improve the applicability of resource expansion.
  • FIG. 1 shows a flowchart of some embodiments of the method for controlling cluster resources of the present disclosure.
  • the method includes: step S11, determining the binding relationship; step S12, adding resources to be expanded; step S13, generating data packets to be executed; and step S14, deploying data packets to be executed.
  • In step S11, when the resource to be controlled is a resource to be expanded, the binding relationship between the resource to be expanded and the corresponding application is determined.
  • The resource to be expanded can be a physical machine or a container image.
  • In step S12, according to the binding relationship, the initialized resource to be expanded is added to the resource pool of the corresponding application.
  • The relevant information of the resource to be expanded is passed to the pre-script of the corresponding application, and the pre-script is executed to complete the initialization of the resource to be expanded.
  • In step S13, the to-be-executed data package of the application to be processed is generated according to the deployment type of that application.
  • When the deployment type is package deployment, the resource to be expanded is a physical machine, and the program package of the application to be processed is generated as the to-be-executed data package.
  • When the deployment type is image deployment, the resource to be expanded is a container image; the program package of the application to be processed is generated, and the to-be-executed data package is generated from the program package and the running image of the application to be processed.
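The branching in step S13 can be illustrated with a minimal Python sketch. The names `DeployType`, `DataPackage`, and `build_data_package` are hypothetical and do not come from the disclosure; the sketch only models which artifacts end up in the to-be-executed data package for each deployment type.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DeployType(Enum):
    PACKAGE = "package"  # program package deployed directly on a physical machine
    IMAGE = "image"      # program package combined with the application's running image


@dataclass
class DataPackage:
    """Hypothetical model of the to-be-executed data package."""
    app: str
    program_package: str          # identifier of the compiled program package
    running_image: Optional[str]  # set only for image deployment
    target_kind: str              # "physical_machine" or "container"


def build_data_package(app: str, deploy_type: DeployType, program_package: str,
                       running_image: Optional[str] = None) -> DataPackage:
    """Generate the data package according to the deployment type (step S13)."""
    if deploy_type is DeployType.PACKAGE:
        # Package deployment: the program package itself is the data package.
        return DataPackage(app, program_package, None, "physical_machine")
    if running_image is None:
        raise ValueError("image deployment needs the application's running image")
    # Image deployment: the data package is built from program package + running image.
    return DataPackage(app, program_package, running_image, "container")
```

Calling `build_data_package("datanode", DeployType.PACKAGE, "datanode.tar.gz")` yields a package targeted at a physical machine, while the image variant additionally requires a running image.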
  • In step S14, the to-be-executed data package is deployed on the corresponding resource to be expanded in the resource pool of the application to be processed for execution.
  • When the resource to be expanded is a physical machine, the to-be-executed data package is sent to that physical machine for execution; when the resource to be expanded is a container image, the to-be-executed data package is sent to an idle physical machine in the corresponding resource pool (the spare machine pool) for execution.
  • the spare machine pool is a spare physical machine resource pool in the cloud computing system.
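As a rough illustration of how the execution target in step S14 might be chosen, the following Python sketch picks the host that receives the data package. The function name `pick_execution_host` and the string labels for resource kinds are assumptions made for this example, not part of the disclosure.

```python
from typing import List


def pick_execution_host(resource_kind: str, resource_host: str,
                        idle_spare_hosts: List[str]) -> str:
    """Return the host on which the to-be-executed data package should run.

    resource_kind is a hypothetical label: "physical_machine" or "container".
    idle_spare_hosts models the idle physical machines of the spare machine pool.
    """
    if resource_kind == "physical_machine":
        # The package runs on the physical machine being added.
        return resource_host
    if resource_kind == "container":
        if not idle_spare_hosts:
            raise RuntimeError("no idle physical machine in the spare machine pool")
        # Take the first idle machine; it is no longer idle afterwards.
        return idle_spare_hosts.pop(0)
    raise ValueError(f"unknown resource kind: {resource_kind}")
```

Taking the first idle machine is only one possible policy; the disclosure does not state how an idle machine is selected.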
  • The relevant information of the resource to be expanded is obtained and sent, through the deployment interface configured for the application to be processed, to the third-party program of that application; the third-party program then deploys the to-be-executed data package on the resource to be expanded for execution according to its own deployment method.
  • The post script of the application to be processed is executed. The post script is used for at least one of the following: returning the expansion result to the management node of the cluster; creating a volume for the corresponding expanded resource; and cleaning up the garbage generated by the expansion.
  • an SSH connection is established with each resource to be expanded to execute each related script of the corresponding application, and only one SSH connection can be established with each resource to be expanded at the same time.
  • After the to-be-executed data package of the corresponding application has been executed, the corresponding SSH connection is retained for a preset period of time, so that it can be reused the next time a to-be-executed data package of that application is executed.
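The two SSH constraints described here (at most one connection per resource, and retention for a preset period) can be sketched as a small connection cache. This is a simplified Python model under stated assumptions: `ConnectionPool`, its `connect` factory, and the injected `clock` are hypothetical, no real SSH library is used, and thread safety is left out.

```python
import time
from typing import Callable, Dict, Tuple


class ConnectionPool:
    """Keeps at most one connection per host and retains it for ttl seconds after last use."""

    def __init__(self, connect: Callable[[str], object], ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self._connect = connect  # hypothetical factory that opens a connection to a host
        self._ttl = ttl          # the "preset period of time"
        self._clock = clock
        self._conns: Dict[str, Tuple[object, float]] = {}  # host -> (connection, last_used)

    def get(self, host: str) -> object:
        """Reuse the retained connection for host, or open a new one."""
        self.evict_expired()
        entry = self._conns.get(host)
        conn = entry[0] if entry else self._connect(host)
        self._conns[host] = (conn, self._clock())
        return conn

    def evict_expired(self) -> None:
        """Close and drop connections that have not been used within ttl seconds."""
        now = self._clock()
        for host, (conn, last_used) in list(self._conns.items()):
            if now - last_used > self._ttl:
                close = getattr(conn, "close", None)
                if close:
                    close()
                del self._conns[host]
```

Because the cache is keyed by host, a second request for the same host within the retention window returns the existing connection instead of opening another one.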
  • FIG. 2 shows a flowchart of other embodiments of the method for controlling cluster resources of the present disclosure.
  • The method further includes: step S21, checking for important data and dependent services; and step S22, removing the resource to be scaled down.
  • In step S21, when the resource to be controlled is a resource to be scaled down, it is determined whether the resource to be scaled down holds important data and whether any service depends on it.
  • In step S22, when there is no important data and no service depends on the resource to be scaled down, the resource is removed from the cluster.
  • The acquired information about the resource to be scaled down can be passed to the system through the configured scale-down interface, so that the resource is removed from the cluster; the scale-down result can then be obtained by polling the configured query interface.
  • When the resource to be scaled down is a physical machine, it is added to the resource pool (the spare machine pool); when the resource to be scaled down is a container image, the container image is destroyed and the resource to be scaled down is added to the resource pool.
  • a post script of the scale-down process is executed.
  • the post script is used to start the installation process to reinstall the operating system for the resources to be scaled down. For example, reinstallation can be achieved through PXE (Preboot eXecution Environment).
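The precondition check of steps S21 and S22 can be sketched as follows. This is a minimal Python illustration; the `Resource` fields and the `try_scale_down` function are hypothetical names, and how "important data" or "dependent services" are actually detected is not specified in the text and is simply modelled as stored flags here.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Resource:
    """Hypothetical view of a resource that is a candidate for scale-down."""
    name: str
    has_important_data: bool = False
    dependent_services: List[str] = field(default_factory=list)


def try_scale_down(resource: Resource, cluster: Set[str]) -> bool:
    """Remove the resource from the cluster only if both checks pass.

    Step S21: check for important data and dependent services.
    Step S22: remove the resource from the cluster when neither exists.
    """
    if resource.has_important_data or resource.dependent_services:
        return False  # the resource does not support the scale-down operation
    cluster.discard(resource.name)
    return True
```

A resource with either important data or at least one dependent service is left in the cluster untouched.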
  • Fig. 3 shows a schematic diagram of some embodiments of the cluster resource control apparatus of the present disclosure.
  • the user can interact with the back-end expansion (or contraction) controller (ie, the control device) through the front-end Web (network) page.
  • The front-end Web page can be served by an Nginx server; the control device can be implemented in the Golang programming language.
  • Internally, the control device can be a unified resource control system for complex proprietary cloud clusters that supports concurrent expansion and contraction of multiple components.
  • The scripts of each line of business can be connected to the control device.
  • The control device may include multiple modules, such as a multi-component expansion and contraction controller, a heterogeneous component resource management module, a concurrent expansion and contraction executor, a unified component construction system, a multi-component deployment system, and a component reduction and recovery system.
  • The multi-component expansion and contraction controller (which may include, for example, a determining unit and an adding unit) is responsible for controlling the overall expansion and contraction process.
  • a multi-component expansion and contraction controller can coordinate the work of other modules; control and call each module used in expansion and contraction; and provide necessary data parameters.
  • the heterogeneous component resource management module is responsible for managing all metadata required in the process of capacity expansion and contraction.
  • Metadata includes the management IP (Internet Protocol) addresses of servers (physical machine resources), server specification parameters, expansion and contraction application data, and so on.
  • The heterogeneous component resource management module can use the relational database MySQL to store data persistently.
  • The heterogeneous component resource management module can be deployed independently and exposes an OpenAPI (Open Application Programming Interface) for external calls.
  • the concurrent expansion and contraction executor (for example, it may include an execution unit, an establishment unit, etc.) is responsible for executing the scripts that each resource server needs to run at different stages.
  • the bottom layer of the concurrent expansion and contraction executor can be a connection pool based on the SSH protocol.
  • the concurrent expansion and contraction executor can be implemented through the Golang programming language to ensure high concurrency.
  • The concurrent expansion and contraction executor can establish an SSH connection between the control device and the resource to be expanded.
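To illustrate the idea of running a stage script on many resource servers at once, here is a minimal Python sketch that uses a thread pool as a stand-in for the Golang concurrency mentioned in the text. The function `run_on_hosts` and the injected `run` callable (which would wrap an SSH command execution in practice) are assumptions for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_on_hosts(hosts: List[str], script: str,
                 run: Callable[[str, str], int],
                 max_workers: int = 8) -> Dict[str, int]:
    """Run script on every host concurrently and return host -> exit code.

    run(host, script) is a hypothetical callable that executes the script on
    one host (for example over an SSH connection) and returns its exit code.
    """
    if not hosts:
        return {}
    with ThreadPoolExecutor(max_workers=min(max_workers, len(hosts))) as executor:
        # executor.map preserves the order of hosts in its results.
        codes = list(executor.map(lambda host: run(host, script), hosts))
    return dict(zip(hosts, codes))
```

Injecting `run` keeps the sketch testable without a network; a non-zero exit code for one host does not stop the scripts on the other hosts.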
  • the concurrent expansion and contraction executor can execute a custom expansion and contraction process.
  • the concurrent expansion and contraction executor can perform standardized expansion and contraction procedures.
  • the concurrent expansion and contraction executor can trigger a unified component construction system, a multi-component deployment system, and a component reduction and recycling system.
  • the unified component construction system, the multi-component deployment system, and the component shrinking and recycling system perform related processing on the expansion and shrinkage of the cloud computing system cluster.
  • the unified component construction system (for example, may include a generation unit) is responsible for compiling the source code of each resource server for online deployment; based on the Docker container, different compilation environments are prepared for different resource servers.
  • the unified component construction system can realize the isolation of the compilation environment, so that the dependent services required by the compilation do not interfere with each other, thereby ensuring the smooth progress of the compilation work.
  • a multi-component deployment system (for example, it may include an execution unit) is responsible for online deployment of the compiled program to a designated physical server or container.
  • a multi-component deployment system can include two types: package deployment and image deployment.
  • Package deployment puts the compiled program online on the designated physical machine; image deployment runs the image packaged at compile time, deploying and starting the container according to the required resources (including the number of CPU cores, memory size, hard disk size, etc.).
  • the component shrinking and recycling system (for example, it may include a shrinking unit and an adding unit) is responsible for recovering the shrinking resources.
  • the recycled container resources will be placed in the resource pool again, and the recycled physical machine will be placed in the spare machine pool.
  • the component shrinking and recycling system destroys container resources, reinstalls the recovered physical machine, and formats the data disk.
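The recycling rules above can be summarized in a short Python sketch. The function `recycle`, the action labels, and the two list-based pools are hypothetical; the sketch only records which actions would be taken and where the reclaimed resource ends up.

```python
from typing import List


def recycle(name: str, kind: str,
            resource_pool: List[str], spare_pool: List[str]) -> List[str]:
    """Reclaim a scaled-down resource and return the actions that were applied.

    kind is a hypothetical label: "container" or "physical_machine".
    """
    actions: List[str] = []
    if kind == "container":
        # Container resources are destroyed and returned to the resource pool.
        actions.append("destroy_container")
        resource_pool.append(name)
    elif kind == "physical_machine":
        # Physical machines are reinstalled, their data disk is formatted,
        # and they are placed in the spare machine pool.
        actions.extend(["reinstall_os", "format_data_disk"])
        spare_pool.append(name)
    else:
        raise ValueError(f"unknown resource kind: {kind}")
    return actions
```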
  • FIG. 4 shows a flowchart of still other embodiments of the method for controlling cluster resources of the present disclosure.
  • the method may include a capacity expansion process (the process on the left in the figure) and a contraction process (the process on the right in the figure).
  • The expansion process may include a standardized expansion process and a custom expansion process; the scale-down process may include a standardized scale-down process and a custom scale-down process.
  • The expansion and scale-down of the resource nodes of IaaS (Infrastructure as a Service) and cloud database products (such as MySQL resource nodes and mass storage resource nodes) belong to the custom process; the expansion and scale-down of the resource nodes of cloud storage and data cloud products (such as ds2-datanode resource nodes and big data datanode resource nodes) belong to the standardized process.
  • The expansion and scale-down processes of the above products all go through three common stages: purpose allocation (that is, binding the resource to be expanded or scaled down to the corresponding product application), execution of the pre-script, and execution of the post script. These common stages can be extracted to build a widely applicable resource control method.
  • The standardized expansion process mainly includes: mounting the spare machine (resource) to be expanded into the spare machine pool of the specified product application; compiling and building the application's program package or running image; deploying the program package or image to the servers in the spare machine pool; and polling the deployment result.
  • the expansion process of data cloud products is mainly for big data datanode resource nodes, and the standardized expansion process may include the following steps.
  • In step 1, a purpose is allocated to the host to be expanded.
  • The server (resource) to be expanded is assigned the tag of the big data datanode resource node, which binds the resource to be expanded to the application.
  • In step 2, the pre-script is executed. The multi-component expansion and contraction controller can call the preset script (pre-script) of the data cloud in the FTP (File Transfer Protocol) directory of the control machine, passing the IP list of the servers to be expanded to the script as command-line parameters.
  • the control machine may be a computer running the control method of the present disclosure in a cluster.
  • The preset script is mainly responsible for initializing the physical machine to be expanded, for example by installing some basic software packages.
  • In step 3, the host to be expanded is mounted. For example, the physical machine to be expanded is mounted into the spare machine pool of the corresponding product line in the product service tree to facilitate unified resource allocation.
  • the resource to be expanded can be added to the resource pool of the product line to which the bound application belongs based on the relevant information of the product to be deployed in the cloud computing system.
  • The related information of the products to be deployed in the cloud computing system may be stored in a tree structure, for example a five-level tree of departments, product lines, products, systems, and applications.
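A five-level service tree and the mounting of a host into a product line's spare machine pool might be modelled as in the Python sketch below. The `Node` class, the `LEVELS` tuple, and the functions `add_path` and `mount_host` are hypothetical names chosen for this example; the disclosure does not describe the storage layout beyond the five levels.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# The five levels named in the text, from top to bottom.
LEVELS = ("department", "product_line", "product", "system", "application")


@dataclass
class Node:
    name: str
    level: str
    children: Dict[str, "Node"] = field(default_factory=dict)
    spare_pool: List[str] = field(default_factory=list)  # used on product-line nodes


def add_path(root: Node, path: Tuple[str, ...]) -> Node:
    """Create (or reuse) the nodes along a five-level path; return the application node."""
    if len(path) != len(LEVELS):
        raise ValueError("path must name all five levels")
    node = root
    for level, name in zip(LEVELS, path):
        node = node.children.setdefault(name, Node(name, level))
    return node


def mount_host(root: Node, path: Tuple[str, ...], host: str) -> Node:
    """Mount host into the spare pool of the product line the application belongs to."""
    add_path(root, path)
    product_line = root.children[path[0]].children[path[1]]
    product_line.spare_pool.append(host)
    return product_line
```

For example, mounting a host for the path `("cloud", "data-cloud", "bigdata", "hdfs", "datanode")` places it in the spare pool of the `data-cloud` product line; the path values are illustrative only.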
  • In step 4, the data package is compiled and built according to the deployment type of the application. If the big data datanode resource node uses package deployment, the corresponding program package is compiled as the to-be-executed data package; for image deployment, the program package and the running image are packaged together to generate the to-be-executed data package. The compiled and built package is then transferred to the bound machine to be expanded in the spare machine pool, and the startup script is executed.
  • In step 5, the expansion result is polled.
  • The deployment status can be polled regularly, and the deployment result can be recorded in the deployment unit of the information management module CMDB (Configuration Management Database).
  • The Web front end can query the deployment result through the API and present it to the user.
  • In step 6, the post script is executed to handle post-expansion tasks.
  • the post script can inform the management node in the cluster of the expanded resource node to complete the expansion of the cluster.
  • the ds2-datanode resource node expansion is mainly completed, and the process steps are similar to the above data cloud.
  • the post script of the cloud storage expansion process can perform the process of creating a volume.
  • the standardized scale-down process may include business data checking, dependent service checking, removal from the cluster, resource reclamation, result polling, and post-script execution.
  • the standardized scale-down process may include the following steps.
  • in step 1, the business data of the resource to be scaled down is checked to determine whether the resource holds important data. If it does, the resource does not support the scale-down operation.
  • in step 2, a dependent service check is performed on the resource to be scaled down to determine whether any service depends on it. If such a service exists, the resource does not support the scale-down operation.
  • in step 3, the management node of the cluster is notified to move the resource to be scaled down out of the cluster.
  • in step 4, the component scale-down recycling system is called to reclaim the scaled-down resource.
  • in step 5, the scale-down result is polled.
  • in step 6, the post-script is executed to handle post-scale-down tasks. For example, a PXE installation operation can be triggered on the reclaimed resource.
  • the custom expansion process may include: mounting the standby machine to be expanded into the standby machine pool of the designated product application; a third party providing its own expansion program (third-party program), which must provide two standard expansion interfaces, namely an interface for triggering expansion (deployment interface) and an interface for querying the expansion result (query interface); the multi-component scaling controller calling the trigger interface and passing the information of the standby machine to be expanded as parameters; and polling the expansion result through the query interface.
  • for example, the custom expansion process for applications of IaaS products can be implemented through the following steps.
  • in step 1, a purpose is assigned to the expansion host.
  • the server to be expanded is assigned the tag of the resource node of the IaaS product application.
  • in step 2, the pre-script is executed.
  • the multi-component scaling controller calls the data cloud preset script in the FTP directory of the control machine and passes the IP list of the machines to be expanded to it as command-line parameters; executing the expansion script completes the initialization of the machines.
  • the preset script is mainly responsible for initializing the physical machine to be expanded, for example by installing some basic software packages.
  • in step 3, the expansion host is mounted. For example, the physical machine to be expanded is mounted into the standby machine pool of the corresponding product line in the service tree of the product, where third-party programs can obtain it in a unified way.
  • in step 4, the expansion host is handed over.
  • the multi-component scaling controller triggers the IaaS deployment interface and obtains the relevant information of the server to be expanded from the information management CMDB; it passes that information to the IaaS product as parameters; the IaaS product deployment program fetches the server to be expanded from the standby machine pool; and the IaaS deployment service deploys it online according to its own business logic.
  • in step 5, the polling interface is called.
  • the multi-component scaling controller periodically polls the IaaS expansion result and records the deployment result in the deployment unit of the information management module CMDB; the web front-end can query the deployment result through the API and present it to the user.
  • in step 6, the post-script is executed. If the product has no post-script, this step can be empty.
  • the custom scale-down process may include business data checking, dependent service checking, calling a custom scale-down interface, polling the scale-down result, reclaiming resources to the resource pool or standby machine pool, and executing the post-script. For example, a custom scale-down process can be implemented through the following steps.
  • in step 1, the business data of the resource to be scaled down is checked to determine whether the resource holds important data. If it does, the resource does not support the scale-down operation.
  • in step 2, a dependent service check is performed on the resource to be scaled down to determine whether any service depends on it. If such a service exists, the resource does not support the scale-down operation.
  • in step 3, the custom scale-down interface is called to pass the relevant information of the resource to be scaled down.
  • in step 4, the custom scale-down result interface (query interface) is polled to obtain the scale-down result.
  • in step 5, the reclaimed resources are put into the resource pool or the standby machine pool.
  • Fig. 5 shows a schematic diagram of other embodiments of the cluster resource control device of the present disclosure.
  • the heterogeneous component resource management module mainly consists of a service layer and a data layer, and provides metadata to other service modules and to the Web front-end UI (User Interface) through an HTTP (HyperText Transfer Protocol) API. Supervisor can monitor the services of the heterogeneous component resource management module.
  • the service layer includes an API server, a server information management unit, a container information management unit, a product information management unit, a compilation information management unit, and a deployment information management unit.
  • the API server may be an HTTP server developed in Golang.
  • the API server exposes a RESTful (Representational State Transfer) style interface and is responsible for providing internally managed data to users safely and legitimately.
  • the server information management unit is mainly responsible for managing all physical machine information in the cluster.
  • the physical machine information may cover the management node servers, the resource node servers already in use, and the standby machines to be expanded or scaled down.
  • by managing the full set of server information, the server information management unit can provide the information needed for subsequent server deployment, configuration, and recovery.
  • the container information management unit is responsible for managing all container information in the cluster.
  • the container information may include the number of CPU cores used by the running container, the memory size, the hard disk size, the physical machine to which the container belongs, and the product application.
  • the product information management unit is responsible for managing the information of products deployed in the private cloud.
  • product information can adopt a tree structure.
  • the tree structure can be divided into five levels: department, product line, product, system, and application.
  • before expansion or scale-down, the physical machines concerned need to be allocated to a specific system application, to facilitate centralized management of the expansion and scale-down process.
  • the compilation information management unit is responsible for managing the build programs of each product application.
  • the compilation information management unit can be developed on Jenkins and supports automatically pulling code from the source code repository (GitLab).
  • the compilation information management unit can compile and build an application binary or image (including the running image and the program package) according to the configuration information.
  • the deployment information management unit is responsible for deploying the compiled program package or packaged image to a designated physical machine or container, and for keeping all deployment records.
  • the data layer can include a primary storage node, a secondary storage node, and a memory.
  • the primary storage node stores the information sent from the service layer and synchronously backs it up to the secondary storage node. For example, the data in the primary storage node can be backed up to the memory periodically, so that backup data can still be obtained if both the primary storage node and the secondary storage node are contaminated.
  • FIG. 6 shows a schematic diagram of still other embodiments of the cluster resource control device of the present disclosure.
  • the multi-component scaling controller can be an SSH controller built with Golang.
  • the scripts of each business line can be connected to the internal structure of the multi-component scaling controller.
  • when the multi-component scaling controller executes the pre-script and the post-script, it relies on the concurrent scaling executor to complete them.
  • the bottom layer of the concurrent scaling executor is a high-performance concurrent SSH connection pool based on Golang.
  • the connection pool includes SSH connections to the resources to be expanded.
  • for example, a series of operation interfaces, such as connecting to physical machines and executing script commands, can be encapsulated on top of the connection pool. The high-concurrency characteristics of Golang, combined with this connection pool, allow large-scale concurrent operations on servers and containers, which improves the efficiency of expansion and scale-down.
  • for operational safety, the connection pool establishes only one SSH connection for each resource (physical machine or container image) to be expanded or scaled down, and periodically cleans up expired SSH connections.
  • FIG. 7 shows a schematic diagram of some embodiments of the method for controlling cluster resources of the present disclosure.
  • the method proceeds as follows.
  • the user submits the relevant code to the git distributed version control system; git triggers the compile-build and online deployment flow of the Jenkins automation server.
  • the compile-build flow can be triggered manually, or automatically after the user submits code.
  • Jenkins uses Docker to compile and build the program and the image, and deploys them online.
  • a program package or program image can be produced according to the type of application.
  • for example, the big data datanode resource node uses package deployment, so the build output is only the program package without the running image; for the image deployment type, the program package and the running image are packaged together as a program image.
  • FIG. 8 shows a schematic diagram of still other embodiments of the cluster resource control device of the present disclosure.
  • after the application has been compiled and built, the multi-component scaling controller, according to the type of the application being expanded, uses the concurrent scaling executor to transmit the data package built by the component unified construction system to the physical machine to be expanded or to an idle machine in the resource pool.
  • in the case of physical machine package expansion, the multi-component deployment system executes the startup script, deploys the package on the physical machine of the private cloud cluster, and checks the execution status; in the case of program image deployment, the multi-component deployment system starts the image and checks the running status of the container in the private cloud cluster.
  • FIG. 9 shows a schematic diagram of still other embodiments of the cluster resource control device of the present disclosure.
  • the component scale-down recycling system performs container resource recycling and physical machine resource recycling.
  • when what is recovered is a physical machine of the private cloud cluster, the component scale-down recycling system adds the physical machine to the standby machine pool and calls the PXE reinstallation system; when what is recovered is a program image (container) of the private cloud cluster, the system destroys the container and returns the resources to the resource pool.
  • before scaling down, the component scale-down recycling system needs to check whether the server or container being scaled down holds business data, and whether it hosts dependent services used by other deployed applications. If business data or dependent services exist, the system exits the scale-down process.
  • to clear the services deployed on a physical machine, the component scale-down recycling system can start the PXE installation process, reinstall the operating system on the scaled-down physical machine and format its data disk, and then put the resource into the standby machine pool; for a container, the resources are put into the resource pool after the container is destroyed.
  • FIG. 10 shows a block diagram of some embodiments of the device for controlling cluster resources of the present disclosure.
  • the device 10 for controlling cluster resources includes a determining unit 101, an adding unit 102, a generating unit 103, and an executing unit 104.
  • the determining unit 101 determines the binding relationship between the resource to be expanded and the corresponding application.
  • the adding unit 102 adds the initialized resources to be expanded to the resource pool of the corresponding application according to the binding relationship.
  • the adding unit 102 transmits the relevant information of the resource to be expanded to the pre-script of the corresponding application, and executes the pre-script to complete the initialization of the resource to be expanded.
  • the generating unit 103 generates the to-be-executed data package of the to-be-processed application according to the deployment type of the to-be-processed application.
  • when the deployment type is package deployment, the resource to be expanded is determined to be a physical machine, and the generating unit 103 generates the program package of the application to be processed as the data package to be executed; when the deployment type is image deployment, the resource to be expanded is determined to be a container image, and the generating unit 103 generates a program package of the application to be processed and generates the data package to be executed from the program package and the running image of the application to be processed.
  • the execution unit 104 deploys the data package to be executed on the corresponding resource to be expanded in the resource pool of the application to be processed for execution.
  • when the resource to be expanded is a physical machine, the execution unit 104 sends the data package to be executed to the physical machine for execution; when the resource to be expanded is a container image, the execution unit 104 sends the data package to be executed to an idle physical machine in the corresponding resource pool for execution.
  • the execution unit 104 obtains the relevant information of the resource to be expanded and sends it to the third-party program of the application to be processed through the deployment interface configured for that application, so that the third-party program deploys the data package to be executed on the resource to be expanded according to its own deployment method.
  • the execution unit 104 executes a post-script of the application to be processed, which performs at least one of the following: returning the expansion result to the management node of the cluster; creating a corresponding volume for the corresponding expansion resource; cleaning up garbage generated by the expansion processing.
  • the device 10 for controlling cluster resources further includes an establishing unit 105 for establishing an SSH connection with each resource to be expanded.
  • the SSH connection is used to execute the relevant scripts of the corresponding application.
  • the establishing unit 105 can establish only one SSH connection with each resource to be expanded at any one time.
  • the establishing unit 105 retains the corresponding SSH connection for a preset period of time after the execution of the to-be-executed data packet of the corresponding application is completed.
  • the cluster resource control device 10 further includes a scale-down unit 106 which, when the resource to be controlled is a resource to be scaled down, determines whether the resource holds important data and whether any service depends on it; the scale-down unit 106 removes the resource to be scaled down from the cluster when there is no important data and no service that depends on the resource.
  • the scale-down unit 106 passes the acquired information about the resource to be scaled down through the configured scale-down interface, so that the resource is removed from the cluster.
  • the scale-down unit 106 obtains the scale-down result by polling the configured query interface.
  • after the scale-down unit 106 removes the resource to be scaled down from the cluster, if the resource is a physical machine, the adding unit 102 adds it to the resource pool; if the resource is a container image, the scale-down unit 106 destroys the container image and the adding unit 102 adds the resource to the resource pool.
  • when the resource to be scaled down is a physical machine, the execution unit 104 executes a post-script of the scale-down processing, which starts the installation process to reinstall the operating system on the resource to be scaled down.
  • FIG. 11 shows a block diagram of other embodiments of the device for controlling cluster resources of the present disclosure.
  • the cluster resource control device 11 of this embodiment includes a memory 111 and a processor 112 coupled to the memory 111.
  • the processor 112 is configured to execute, based on instructions stored in the memory 111, the method for controlling cluster resources in any embodiment of the present disclosure.
  • the memory 111 may include, for example, a system memory, a fixed non-volatile storage medium, and the like.
  • the system memory stores, for example, an operating system, application programs, a boot loader, a database, and other programs.
  • FIG. 12 shows a block diagram of still other embodiments of the cluster resource control device of the present disclosure.
  • the cluster resource control device 12 of this embodiment includes a memory 1210 and a processor 1220 coupled to the memory 1210.
  • the processor 1220 is configured to execute, based on instructions stored in the memory 1210, the method for controlling cluster resources in any of the foregoing embodiments.
  • the memory 1210 may include, for example, a system memory, a fixed non-volatile storage medium, and the like.
  • the system memory stores, for example, an operating system, an application program, a boot loader, and other programs.
  • the cluster resource control device 12 may also include an input/output interface 1230, a network interface 1240, a storage interface 1250, and the like. These interfaces 1230, 1240, 1250, the memory 1210 and the processor 1220 may be connected via a bus 1260, for example.
  • the input and output interface 1230 provides a connection interface for input and output devices such as a display, a mouse, a keyboard, and a touch screen.
  • the network interface 1240 provides a connection interface for various networked devices.
  • the storage interface 1250 provides connection interfaces for external storage devices such as SD cards and U disks.
  • the embodiments of the present disclosure can be provided as a method, a system, or a computer program product. Therefore, the present disclosure may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media containing computer-usable program codes.
  • FIG. 13 shows a block diagram of some embodiments of the cluster resource control system of the present disclosure.
  • the cloud computing system 13 includes a cluster resource control device 131 for executing the cluster resource control method in any of the above embodiments.
  • the embodiments of the present disclosure can be provided as a method, a system, or a computer program product. Therefore, the present disclosure may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • the method and system of the present disclosure may be implemented in many ways.
  • the method and system of the present disclosure can be implemented by software, hardware, firmware or any combination of software, hardware, and firmware.
  • the above-mentioned order of the steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above, unless specifically stated otherwise.
  • the present disclosure can also be implemented as programs recorded in a recording medium, and these programs include machine-readable instructions for implementing the method according to the present disclosure.
  • the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

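The present disclosure relates to a method and device for controlling cluster resources and to a cloud computing system, in the field of computer technology. The method includes: when a resource to be controlled is a resource to be expanded, determining a binding relationship between the resource to be expanded and an application; adding the initialized resource to be expanded to the resource pool of the corresponding application to which it is bound; generating a to-be-executed data package of an application to be processed according to the deployment type of that application; and deploying the to-be-executed data package for execution on the resource to be expanded in the resource pool of the application to be processed.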

Description

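Method and Device for Controlling Cluster Resources, and Cloud Computing System
Cross-Reference to Related Applications
This application is based on, and claims priority to, CN application No. 201911232841.4, filed on December 5, 2019, the disclosure of which is incorporated into this application in its entirety.
Technical Field
The present disclosure relates to the field of computer technology, and in particular to a method for controlling cluster resources, a device for controlling cluster resources, a cloud computing system, and a non-volatile computer-readable storage medium.
Background
As a cloud system keeps consuming resources during use, the resources of the various products in the cluster need to be controlled (for example, expanded or scaled down). Such resource control has become an operation that operations and maintenance personnel must perform regularly.
In the related art, an expansion method is developed for the application of one particular service of a cluster in one particular scenario.
Summary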
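According to some embodiments of the present disclosure, a method for controlling cluster resources is provided, including: when a resource to be controlled is a resource to be expanded, determining a binding relationship between the resource to be expanded and an application; adding the initialized resource to be expanded to the resource pool of the corresponding application with which it has the binding relationship; generating a to-be-executed data package of an application to be processed according to the deployment type of the application to be processed; and deploying the to-be-executed data package for execution on the resource to be expanded in the resource pool of the application to be processed.
In some embodiments, adding the initialized resource to be expanded to the resource pool of the corresponding application with which it has the binding relationship includes: passing the relevant information of the resource to be expanded to the pre-script of the corresponding application; and executing the pre-script to complete the initialization of the resource to be expanded.
In some embodiments, generating the to-be-executed data package of the application to be processed includes: when the deployment type is package deployment, determining that the resource to be expanded is a physical machine, and generating the program package of the application to be processed as the to-be-executed data package; when the deployment type is image deployment, determining that the resource to be expanded is a container image, generating the program package of the application to be processed, and generating the to-be-executed data package from that program package and the running image of the application to be processed.
In some embodiments, deploying the to-be-executed data package for execution on the resource to be expanded in the resource pool of the application to be processed includes: when the resource to be expanded is a physical machine, sending the to-be-executed data package to the physical machine for execution; when the resource to be expanded is a container image, sending the to-be-executed data package to an idle physical machine in the corresponding resource pool for execution.
In some embodiments, deploying the to-be-executed data package for execution on the resource to be expanded in the resource pool of the application to be processed includes: obtaining the relevant information of the resource to be expanded in the resource pool; and sending that information, through the deployment interface configured for the application to be processed, to the third-party program of the application to be processed, so that the third-party program deploys the to-be-executed data package on the resource to be expanded according to its own deployment method.
In some embodiments, the method further includes: executing a post-script of the application to be processed, the post-script being used for at least one of the following: returning the expansion result to the management node of the cluster; creating a volume for the expansion resource in the resource pool; cleaning up garbage generated by the expansion processing.
In some embodiments, the method further includes: establishing an SSH (Secure Shell) connection with each resource to be expanded, used to execute the relevant scripts of the corresponding application, where only one SSH connection can be established with each resource to be expanded at any one time.
In some embodiments, the method further includes: after the to-be-executed data package of the corresponding application has finished executing, retaining the corresponding SSH connection for a preset period of time.
In some embodiments, the method further includes: when the resource to be controlled is a resource to be scaled down, determining whether the resource to be scaled down holds important data and whether any service depends on it; and, when there is no important data and no service that depends on the resource to be scaled down, removing the resource from the cluster.
In some embodiments, the acquired relevant information of the resource to be scaled down is passed through a configured scale-down interface, so that the resource is removed from the cluster.
In some embodiments, the method further includes: obtaining the scale-down result by polling a configured query interface.
In some embodiments, after the resource to be scaled down is removed from the cluster, the method further includes: when the resource is a physical machine, adding it to the resource pool; when the resource is a container image, destroying the container image and adding the resource to the resource pool.
In some embodiments, the method further includes: when the resource to be scaled down is a physical machine, executing a post-script of the scale-down processing, the post-script being used to start the installation process that reinstalls the operating system on the resource.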
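According to other embodiments of the present disclosure, a device for controlling cluster resources is provided, including: a determining unit, configured to determine, when a resource to be controlled is a resource to be expanded, a binding relationship between the resource to be expanded and an application; an adding unit, configured to add the initialized resource to be expanded to the resource pool of the application with which it has the binding relationship; a generating unit, configured to generate a to-be-executed data package of an application to be processed according to the deployment type of that application; and an execution unit, configured to deploy the to-be-executed data package for execution on the resource to be expanded in the resource pool of the application to be processed.
In some embodiments, the adding unit passes the relevant information of the resource to be expanded to the pre-script of the corresponding application and executes the pre-script to complete the initialization of the resource to be expanded.
In some embodiments, when the deployment type is package deployment, the resource to be expanded is determined to be a physical machine, and the generating unit generates the program package of the application to be processed as the to-be-executed data package; when the deployment type is image deployment, the resource to be expanded is determined to be a container image, and the generating unit generates the program package of the application to be processed and generates the to-be-executed data package from that program package and the running image of the application to be processed.
In some embodiments, when the resource to be expanded is a physical machine, the execution unit sends the to-be-executed data package to the physical machine for execution; when the resource to be expanded is a container image, the execution unit sends the to-be-executed data package to an idle physical machine in the corresponding resource pool for execution.
In some embodiments, the execution unit obtains the relevant information of the resource to be expanded in the resource pool and sends it, through the deployment interface configured for the application to be processed, to the third-party program of that application, so that the third-party program deploys the to-be-executed data package on the resource to be expanded according to its own deployment method.
In some embodiments, the execution unit executes a post-script of the application to be processed, the post-script being used for at least one of the following: returning the expansion result to the management node of the cluster; creating a volume for the expansion resource in the resource pool; cleaning up garbage generated by the expansion processing.
In some embodiments, the device further includes an establishing unit, configured to establish an SSH connection with each resource to be expanded, used to execute the relevant scripts of the corresponding application, where only one SSH connection can be established with each resource to be expanded at any one time.
In some embodiments, the establishing unit retains the corresponding SSH connection for a preset period of time after the to-be-executed data package of the corresponding application has finished executing.
In some embodiments, the device further includes a scale-down unit, configured to determine, when the resource to be controlled is a resource to be scaled down, whether the resource holds important data and whether any service depends on it; the scale-down unit removes the resource from the cluster when there is no important data and no service that depends on the resource.
In some embodiments, the scale-down unit passes the acquired relevant information of the resource to be scaled down through a configured scale-down interface, so that the resource is removed from the cluster.
In some embodiments, the scale-down unit obtains the scale-down result by polling a configured query interface.
In some embodiments, after the scale-down unit removes the resource to be scaled down from the cluster, when the resource is a physical machine, the adding unit adds it to the resource pool; when the resource is a container image, the scale-down unit destroys the container image and the adding unit adds the resource to the resource pool.
In some embodiments, when the resource to be scaled down is a physical machine, the execution unit executes a post-script of the scale-down processing, the post-script being used to start the installation process that reinstalls the operating system on the resource.
According to still other embodiments of the present disclosure, a device for controlling cluster resources is provided, including: a memory; and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the method for controlling cluster resources in any of the above embodiments.
According to yet other embodiments of the present disclosure, a non-volatile computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the program implements the method for controlling cluster resources in any of the above embodiments.
According to still other embodiments of the present disclosure, a cloud computing system is provided, including: a device for controlling cluster resources, configured to execute the method for controlling cluster resources in any of the above embodiments.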
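Brief Description of the Drawings
The drawings described here provide a further understanding of the present disclosure and form part of this application. The exemplary embodiments of the present disclosure and their descriptions are used to explain the present disclosure and do not unduly limit it. In the drawings:
FIG. 1 shows a flowchart of some embodiments of the method for controlling cluster resources of the present disclosure;
FIG. 2 shows a flowchart of other embodiments of the method for controlling cluster resources of the present disclosure;
FIG. 3 shows a schematic diagram of some embodiments of the device for controlling cluster resources of the present disclosure;
FIG. 4 shows a flowchart of still other embodiments of the method for controlling cluster resources of the present disclosure;
FIG. 5 shows a schematic diagram of other embodiments of the device for controlling cluster resources of the present disclosure;
FIG. 6 shows a schematic diagram of still other embodiments of the device for controlling cluster resources of the present disclosure;
FIG. 7 shows a schematic diagram of some embodiments of the method for controlling cluster resources of the present disclosure;
FIG. 8 shows a schematic diagram of yet other embodiments of the device for controlling cluster resources of the present disclosure;
FIG. 9 shows a schematic diagram of yet other embodiments of the device for controlling cluster resources of the present disclosure;
FIG. 10 shows a block diagram of some embodiments of the device for controlling cluster resources of the present disclosure;
FIG. 11 shows a block diagram of other embodiments of the device for controlling cluster resources of the present disclosure;
FIG. 12 shows a block diagram of still other embodiments of the device for controlling cluster resources of the present disclosure;
FIG. 13 shows a block diagram of some embodiments of the system for controlling cluster resources of the present disclosure.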
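Detailed Description
The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings of those embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present disclosure or its application or use. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present disclosure, without creative effort, fall within the scope of protection of the present disclosure.
Unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure. It should also be understood that, for ease of description, the dimensions of the parts shown in the drawings are not drawn to actual scale. Techniques, methods, and equipment known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the granted specification. In all examples shown and discussed here, any specific value should be interpreted as merely exemplary rather than limiting; other examples of the exemplary embodiments may therefore have different values. Note that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing it need not be discussed further in subsequent drawings.
The inventors of the present disclosure found the following problem in the related art described above: it cannot be applied to resource expansion for applications of different services, which results in poor applicability.
In view of this, the present disclosure proposes a technical solution for controlling cluster resources that can improve the applicability of resource expansion.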
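FIG. 1 shows a flowchart of some embodiments of the method for controlling cluster resources of the present disclosure.
As shown in FIG. 1, the method includes: step S11, determining a binding relationship; step S12, adding the resource to be expanded; step S13, generating a to-be-executed data package; and step S14, deploying the to-be-executed data package.
In step S11, when the resource to be controlled is a resource to be expanded, the binding relationship between the resource to be expanded and the corresponding application is determined. For example, the resource to be expanded may be a physical machine or a container image.
In step S12, the initialized resource to be expanded is added to the resource pool of the corresponding application according to the binding relationship.
In some embodiments, the relevant information of the resource to be expanded is passed to the pre-script of the corresponding application, and the pre-script is executed to complete the initialization of the resource to be expanded.
In step S13, the to-be-executed data package of the application to be processed is generated according to the deployment type of that application.
In some embodiments, when the deployment type is package deployment, the program package of the application to be processed is generated as the to-be-executed data package. In this case, the resource to be expanded is a physical machine.
In some embodiments, when the deployment type is image deployment, the program package of the application to be processed is generated, and the to-be-executed data package is generated from that program package and the running image of the application to be processed. In this case, the resource to be expanded is a container image.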
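The branching in step S13 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the names `DeployType`, `DataPackage`, and `build_data_package`, and the `.tar.gz` artifact name, are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DeployType(Enum):
    PACKAGE = "package"  # package deployment: target is a physical machine
    IMAGE = "image"      # image deployment: target is a container image


@dataclass(frozen=True)
class DataPackage:
    app: str
    program_package: str
    runtime_image: Optional[str]  # None for package deployment
    target_kind: str              # "physical_machine" or "container_image"


def build_data_package(app: str, deploy_type: DeployType,
                       runtime_image: Optional[str] = None) -> DataPackage:
    """Generate the to-be-executed data package for one application."""
    program_package = f"{app}.tar.gz"  # hypothetical artifact name
    if deploy_type is DeployType.PACKAGE:
        # Package deployment: the program package alone is the data package.
        return DataPackage(app, program_package, None, "physical_machine")
    if runtime_image is None:
        raise ValueError("image deployment needs a runtime image")
    # Image deployment: program package and running image travel together.
    return DataPackage(app, program_package, runtime_image, "container_image")
```

Keeping the target kind on the package lets a later deployment step decide where to send it without consulting the deployment type again.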
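In step S14, the to-be-executed data package is deployed for execution on the corresponding resource to be expanded in the resource pool of the application to be processed.
In some embodiments, when the resource to be expanded is a physical machine, the to-be-executed data package is sent to the physical machine for execution; when the resource to be expanded is a container image, the to-be-executed data package is sent to an idle physical machine in the corresponding resource pool (standby machine pool) for execution. For example, the standby machine pool is the pool of spare physical machine resources in the cloud computing system.
In some embodiments, the relevant information of the resource to be expanded is obtained; that information is sent to the third-party program of the application to be processed through the deployment interface configured for the application; and the third-party program deploys the to-be-executed data package on the resource to be expanded according to its own deployment method.
In some embodiments, a post-script of the application to be processed is executed, the post-script being used for at least one of the following: returning the expansion result to the management node of the cluster; creating a corresponding volume for the corresponding expansion resource; cleaning up garbage generated by the expansion processing.
In some embodiments, an SSH connection is established with each resource to be expanded and used to execute the relevant scripts of the corresponding application; only one SSH connection can be established with each resource to be expanded at any one time.
In some embodiments, after the to-be-executed data package of the corresponding application has finished executing, the corresponding SSH connection is retained for a preset period of time, so that it can be reused the next time a to-be-executed data package of that application is executed.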
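FIG. 2 shows a flowchart of other embodiments of the method for controlling cluster resources of the present disclosure.
As shown in FIG. 2, the method further includes: step S21, checking for important data and dependent services; and step S22, removing the resource to be scaled down.
In step S21, when the resource to be controlled is a resource to be scaled down, it is determined whether the resource holds important data and whether any service depends on it.
In step S22, when there is no important data and no service that depends on the resource to be scaled down, the resource is removed from the cluster.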
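Steps S21 and S22 amount to a guard followed by a removal. The following Python sketch shows that shape; the `Resource` fields and the representation of the cluster as a set of names are assumptions made for illustration, not details from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Resource:
    name: str
    has_important_data: bool = False
    dependent_services: List[str] = field(default_factory=list)


def can_scale_down(res: Resource) -> bool:
    """S21: a resource may be scaled down only if both checks pass."""
    return not res.has_important_data and not res.dependent_services


def scale_down(cluster: Set[str], res: Resource) -> bool:
    """S22: remove the resource from the cluster if S21 allows it."""
    if not can_scale_down(res):
        return False  # the resource does not support the scale-down operation
    cluster.discard(res.name)
    return True
```

Running both checks before any state changes means a refused scale-down leaves the cluster untouched.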
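In some embodiments, the acquired relevant information of the resource to be scaled down can be passed within the system through a configured scale-down interface, so that the resource is removed from the cluster; the scale-down result is obtained by polling a configured query interface.
In some embodiments, when the resource to be scaled down is a physical machine, it is added to the resource pool (standby machine pool); when the resource to be scaled down is a container image, the container image is destroyed and the resource is added to the resource pool.
In some embodiments, when the resource to be scaled down is a physical machine, a post-script of the scale-down processing is executed. The post-script starts the installation process that reinstalls the operating system on the resource. For example, the reinstallation can be carried out through PXE (Preboot eXecution Environment).
In the above embodiments, the data package of an application is deployed for execution on the corresponding resource to be expanded according to the binding relationship between that resource and the application. The method can therefore be applied to resource expansion for applications of different services, which improves the applicability of resource expansion.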
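FIG. 3 shows a schematic diagram of some embodiments of the device for controlling cluster resources of the present disclosure.
As shown in FIG. 3, a user can interact with the back-end expansion (or scale-down) controller (that is, the control device) through a front-end Web page. For example, the front-end Web page can be served by an Nginx (engine x) server, and the control device can be implemented in the Golang programming language.
The internal structure of the control device can be a unified resource control system for complex private cloud clusters that supports concurrent expansion and scale-down of multiple components. The scripts of each business line can be connected to the control device.
For example, the control device may include several modules: a multi-component scaling controller, a heterogeneous component resource management module, a concurrent scaling executor, a component unified construction system, a multi-component deployment system, and a component scale-down recycling system.
In some embodiments, the multi-component scaling controller (which includes, for example, the determining unit and the adding unit) is responsible for controlling the overall expansion and scale-down flow. For example, it can coordinate the work of the other modules, control the invocation of every module used in expansion and scale-down, and provide the necessary data parameters.
In some embodiments, the heterogeneous component resource management module is responsible for managing all metadata needed during expansion and scale-down. For example, the metadata includes server (physical machine resource) management IP (Internet Protocol) addresses, server specification parameters, and expansion and scale-down application data.
For example, the heterogeneous component resource management module can persist data in the relational database MySQL (My Structured Query Language).
For example, the heterogeneous component resource management module can be deployed independently and expose an OpenAPI (Open Application Programming Interface) calling interface.
In some embodiments, the concurrent scaling executor (which may include, for example, the execution unit and the establishing unit) is responsible for executing the scripts that the various resource servers need to run at different stages.
For example, the bottom layer of the concurrent scaling executor can be a connection pool implemented on the SSH protocol. The executor can be implemented in the Golang programming language to ensure high concurrency, and it can establish SSH connections between the control device and the resources to be expanded.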
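For example, the concurrent scaling executor can execute custom expansion and scale-down flows, such as those for applications related to IaaS (Infrastructure as a Service).
For example, the concurrent scaling executor can execute standardized expansion and scale-down flows, such as those for applications related to big data, cloud storage, and other services apart from IaaS.
For example, the concurrent scaling executor can trigger the component unified construction system, the multi-component deployment system, and the component scale-down recycling system, which carry out the expansion and scale-down processing on the cluster of the cloud computing system.
In some embodiments, the component unified construction system (which may include, for example, the generating unit) is responsible for compiling the source code that each resource server deploys online, and prepares a different Docker-based build environment for each resource server. The component unified construction system can thereby isolate the build environments so that the dependencies needed for compilation do not interfere with one another, which keeps the builds running smoothly.
In some embodiments, the multi-component deployment system (which may include, for example, the execution unit) is responsible for deploying the compiled program online to a designated physical server or container.
For example, the multi-component deployment system can support two deployment types, package deployment and image deployment. Package deployment puts the compiled program online on a designated physical machine; image deployment deploys and starts a container from the running image packaged at build time, according to the required resources (including the number of CPU cores, the memory size, and the hard disk size needed at run time).
In some embodiments, the component scale-down recycling system (which may include, for example, the scale-down unit and the adding unit) is responsible for reclaiming scaled-down resources.
For example, reclaimed container resources are put back into the resource pool, while reclaimed physical machines are put into the standby machine pool. The component scale-down recycling system destroys container resources, and it reinstalls reclaimed physical machines and formats their data disks.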
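The two reclamation paths just described can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the `Pools` structure is hypothetical, and the destroy, reinstall, and format actions are only recorded as log entries rather than performed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pools:
    resource_pool: List[str] = field(default_factory=list)  # reclaimed container resources
    standby_pool: List[str] = field(default_factory=list)   # reclaimed physical machines
    log: List[str] = field(default_factory=list)            # actions a real system would run


def recycle(pools: Pools, name: str, kind: str) -> None:
    """Reclaim one scaled-down resource according to its kind."""
    if kind == "container":
        pools.log.append(f"destroy container {name}")
        pools.resource_pool.append(name)
    elif kind == "physical_machine":
        pools.log.append(f"reinstall operating system on {name}")
        pools.log.append(f"format data disk on {name}")
        pools.standby_pool.append(name)
    else:
        raise ValueError(f"unknown resource kind: {kind}")
```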
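FIG. 4 shows a flowchart of still other embodiments of the method for controlling cluster resources of the present disclosure.
As shown in FIG. 4, the method may include an expansion flow (on the left of the figure) and a scale-down flow (on the right of the figure). The expansion flow may include a standardized expansion flow and a custom expansion flow; the scale-down flow may include a standardized scale-down flow and a custom scale-down flow.
For example, the expansion and scale-down of the various resource nodes related to IaaS and cloud database products (such as MySQL resource nodes and large-capacity storage resource nodes) follow the custom flows; the expansion and scale-down of the various resource nodes related to cloud storage and the data cloud (such as ds2-datanode resource nodes and big data datanode resource nodes) follow the standardized flows.
The expansion and scale-down flows of all the above products pass through three common steps: assigning a purpose (that is, binding the resource to be expanded or scaled down to the corresponding product application), executing the pre-script, and executing the post-script. These common steps can be extracted to build a resource control method with high applicability.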
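In some embodiments, the standardized expansion flow mainly includes: mounting the standby machine (resource) to be expanded into the standby machine pool of the designated product application; compiling and building the program package or running image of the application; deploying the program package or image to the servers in the standby machine pool; and polling the deployment result. For example, the expansion flow of the data cloud product mainly targets big data datanode resource nodes, and this standardized expansion flow may include the following steps.
In step 1, a purpose is assigned to the expansion host. For example, the server (resource) to be expanded is given the tag of the big data datanode resource node, which binds the resource to be expanded to the application.
In step 2, the pre-script is executed. For example, the multi-component scaling controller can call the preset script (pre-script) of the data cloud in the FTP (File Transfer Protocol) directory of the control machine and pass the IP list of the servers to be expanded to the preset script as command-line parameters. For example, the control machine may be the computer in the cluster that runs the control method of the present disclosure.
Executing the expansion script completes the initialization of the servers to be expanded. The preset script is mainly responsible for initializing the physical machines to be expanded, for example by installing some basic software packages.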
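As an illustration of passing the IP list to the pre-script as command-line parameters, the Python sketch below builds the argument vector without running it. The script path, the use of `sh`, and the choice of one argument per IP address are assumptions; the disclosure does not specify how the list is encoded on the command line.

```python
import ipaddress
from typing import List


def build_pre_script_command(script_path: str, ips: List[str]) -> List[str]:
    """Build the argv for a pre-script call, one argument per server IP."""
    if not ips:
        raise ValueError("no servers to expand")
    for ip in ips:
        ipaddress.ip_address(ip)  # raises ValueError for a malformed address
    return ["sh", script_path, *ips]


# Hypothetical path inside the control machine's FTP directory.
argv = build_pre_script_command("/ftp/datacloud/pre.sh", ["10.0.0.1", "10.0.0.2"])
```

Building an argument list instead of a single shell string avoids quoting problems if the resulting command is later handed to a process launcher.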
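In step 3, the expansion host is mounted. For example, the physical machine to be expanded is mounted into the standby machine pool of the corresponding product line in the product service tree, to facilitate uniform resource allocation. For example, based on the relevant information of the products to be deployed in the cloud computing system, the resource to be expanded can be added to the resource pool of the product line to which the bound application belongs.
In some embodiments, the relevant information of the products to be deployed in the cloud computing system can be stored in a tree structure. For example, it can be stored in a five-level tree of department, product line, product, system, and application.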
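The five-level tree and the mounting of a machine into a product line's standby machine pool can be sketched in Python as below. The `Node` and `ServiceTree` classes and the sample names are hypothetical; only the five level names come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence

LEVELS = ("department", "product_line", "product", "system", "application")


@dataclass
class Node:
    name: str
    level: str
    children: Dict[str, "Node"] = field(default_factory=dict)
    standby_pool: List[str] = field(default_factory=list)  # used on product_line nodes


class ServiceTree:
    def __init__(self) -> None:
        self.roots: Dict[str, Node] = {}

    def add_application(self, path: Sequence[str]) -> Node:
        """Create (or reuse) the five nodes named by `path`; return the leaf."""
        if len(path) != len(LEVELS):
            raise ValueError("path must name all five levels")
        siblings = self.roots
        for level, name in zip(LEVELS, path):
            node = siblings.setdefault(name, Node(name, level))
            siblings = node.children
        return node

    def mount(self, path: Sequence[str], machine: str) -> None:
        """Put `machine` into the standby pool of the product line owning `path`."""
        product_line = self.roots[path[0]].children[path[1]]
        product_line.standby_pool.append(machine)
```

Because the pool hangs off the product-line node, every application under that product line draws from the same set of standby machines.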
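In step 4, the data package is compiled and created. For example, the program package of the application is compiled and built according to the deployment type of the application. If the big data datanode resource node uses package deployment, the corresponding program package is compiled as the to-be-executed data package; for image deployment, the program package and the running image are packaged together to generate the to-be-executed data package. The compiled and built program package is transferred to the bound machine to be expanded in the standby machine pool, and the startup script is executed.
In step 5, the expansion result is polled. For example, the deployment status can be polled at regular intervals, and the deployment result is recorded in the deployment unit of the information management module CMDB (Configuration Management Database). The Web front-end can query the deployment result through the API and present it to the user.
In step 6, the post-script is executed to handle post-expansion tasks. For example, the post-script can notify the management node in the cluster of the expanded resource node, which completes the expansion of the cluster.
In some embodiments, the cloud storage expansion flow mainly completes the expansion of ds2-datanode resource nodes, and its steps are similar to those of the data cloud above. However, the post-script of the cloud storage expansion flow can also create a volume.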
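In some embodiments, the standardized scale-down flow may include business data checking, dependent service checking, removal from the cluster, resource reclamation, result polling, and post-script execution. For example, the standardized scale-down flow may include the following steps.
In step 1, the business data of the resource to be scaled down is checked to determine whether the resource holds important data. If it does, the resource does not support the scale-down operation.
In step 2, a dependent service check is performed on the resource to be scaled down to determine whether any service depends on it. If such a service exists, the resource does not support the scale-down operation.
In step 3, the management node of the cluster is notified to move the resource to be scaled down out of the cluster.
In step 4, the component scale-down recycling system is called to reclaim the scaled-down resource.
In step 5, the scale-down result is polled.
In step 6, the post-script is executed to handle post-scale-down tasks. For example, a PXE installation operation can be triggered on the reclaimed resource.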
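In some embodiments, the custom expansion flow may include: mounting the standby machine to be expanded into the standby machine pool of the designated product application; a third party providing its own expansion program (third-party program), which must provide two standard expansion interfaces, namely an interface for triggering expansion (deployment interface) and an interface for querying the expansion result (query interface); the multi-component scaling controller calling the trigger interface and passing the information of the standby machine to be expanded as parameters; and polling the expansion result through the query interface.
For example, the custom expansion flow for applications of IaaS products can be implemented through the following steps.
In step 1, a purpose is assigned to the expansion host. For example, the server to be expanded is given the tag of the resource node of the IaaS product application.
In step 2, the pre-script is executed. For example, the multi-component scaling controller calls the data cloud preset script in the FTP directory of the control machine and passes the IP list of the machines to be expanded to it as command-line parameters; executing the expansion script completes the initialization of the machines. The preset script is mainly responsible for initializing the physical machines to be expanded, for example by installing some basic software packages.
In step 3, the expansion host is mounted. For example, the physical machine to be expanded is mounted into the standby machine pool of the corresponding product line in the service tree of the product, where third-party programs can obtain it in a unified way.
In step 4, the expansion host is handed over. For example, the multi-component scaling controller triggers the IaaS deployment interface and obtains the relevant information of the server to be expanded from the information management CMDB; it passes that information to the IaaS product as parameters; the IaaS product deployment program fetches the server to be expanded from the standby machine pool; and the IaaS deployment service deploys it online according to its own business logic.
In step 5, the polling interface is called. For example, the multi-component scaling controller periodically polls the IaaS expansion result and records the deployment result in the deployment unit of the information management module CMDB; the Web front-end can query the deployment result through the API and present it to the user.
In step 6, the post-script is executed. If the product has no post-script, this step can be empty.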
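The contract between the controller and a third-party expansion program (a deployment interface plus a query interface that is polled) can be sketched in Python as follows. The method names, the task identifier, the status strings, and the `FakeIaaS` stand-in are all assumptions introduced for illustration.

```python
import time
from typing import Callable, List, Protocol


class ThirdPartyExpander(Protocol):
    def deploy(self, servers: List[str]) -> str: ...  # returns a task id
    def query(self, task_id: str) -> str: ...         # "running", "success" or "failed"


def expand_via_third_party(expander: ThirdPartyExpander, servers: List[str],
                           interval: float = 0.0, max_polls: int = 10,
                           sleep: Callable[[float], None] = time.sleep) -> str:
    """Trigger the deployment interface, then poll the query interface."""
    task_id = expander.deploy(servers)
    for _ in range(max_polls):
        status = expander.query(task_id)
        if status in ("success", "failed"):
            return status
        sleep(interval)
    return "timeout"


class FakeIaaS:
    """Stand-in third-party program that finishes after a fixed number of polls."""

    def __init__(self, polls_until_done: int) -> None:
        self.remaining = polls_until_done
        self.received: List[str] = []

    def deploy(self, servers: List[str]) -> str:
        self.received = list(servers)
        return "task-1"

    def query(self, task_id: str) -> str:
        self.remaining -= 1
        return "success" if self.remaining <= 0 else "running"
```

Bounding the number of polls keeps the controller from waiting indefinitely on a third-party program that never reports a final state.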
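In some embodiments, the custom scale-down flow may include business data checking, dependent service checking, calling a custom scale-down interface, polling the scale-down result, reclaiming resources to the resource pool or standby machine pool, and executing the post-script. For example, the custom scale-down flow can be implemented through the following steps.
In step 1, the business data of the resource to be scaled down is checked to determine whether the resource holds important data. If it does, the resource does not support the scale-down operation.
In step 2, a dependent service check is performed on the resource to be scaled down to determine whether any service depends on it. If such a service exists, the resource does not support the scale-down operation.
In step 3, the custom scale-down interface is called to pass the relevant information of the resource to be scaled down.
In step 4, the custom scale-down result interface (query interface) is polled to obtain the scale-down result.
In step 5, the reclaimed resources are put into the resource pool or the standby machine pool.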
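FIG. 5 shows a schematic diagram of other embodiments of the cluster resource control apparatus of the present disclosure.
As shown in FIG. 5, the heterogeneous component resource management module mainly consists of a service layer and a data layer, and provides metadata to other service modules and to the Web front-end UI (User Interface) through an HTTP (HyperText Transfer Protocol) API. A Supervisor may monitor the services of the heterogeneous component resource management module.
In some embodiments, the service layer may include an API server, a server information management unit, a container information management unit, a product information management unit, a compilation information management unit, and a deployment information management unit.
For example, the API server may be an HTTP server developed in Golang. The API server exposes RESTful (Representational State Transfer) interfaces and is responsible for providing the internally managed data to users securely and legitimately.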
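For example, the server information management unit is mainly responsible for managing the information of all physical machines in the cluster. Physical machine information may cover management node servers, resource node servers already in use, and spare machines awaiting scale-out or scale-in. By managing the full set of server information, the unit can provide the information needed for later server deployment, configuration, and recycling.
For example, the container information management unit manages the information of all containers in the cluster. Container information may include the number of CPU cores, the memory size, and the disk size used by a running container, as well as the physical machine and product application it belongs to.
For example, the product information management unit is responsible for managing information about the products deployed in the private cloud. Product information may use a tree structure divided into five levels: department, product line, product, system, and application. Before scale-out or scale-in, the physical machines involved need to be assigned to a specific system application so that the scaling flow can be managed centrally.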
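One way to picture the five-level product tree is as nested nodes addressed by a slash-separated path, with hosts attached at the application level. The `Node` type, the path syntax, and the sample names below are assumptions for this sketch only.

```go
package main

import (
	"fmt"
	"strings"
)

// Levels names the five tree levels described in the text.
var Levels = []string{"department", "product line", "product", "system", "application"}

// Node is one entry in the product tree. Hosts is only used on
// application-level nodes. The layout is hypothetical.
type Node struct {
	Name     string
	Children map[string]*Node
	Hosts    []string
}

func NewTree() *Node { return &Node{Name: "root", Children: map[string]*Node{}} }

// Assign binds a host to the application at a path of the form
// "dept/line/product/system/app", creating missing nodes on the way.
func (root *Node) Assign(path, host string) error {
	parts := strings.Split(path, "/")
	if len(parts) != len(Levels) {
		return fmt.Errorf("path %q must have %d levels", path, len(Levels))
	}
	n := root
	for _, p := range parts {
		child, ok := n.Children[p]
		if !ok {
			child = &Node{Name: p, Children: map[string]*Node{}}
			n.Children[p] = child
		}
		n = child
	}
	n.Hosts = append(n.Hosts, host)
	return nil
}

// HostsAt returns the hosts bound at path, or nil if the path is unknown.
func (root *Node) HostsAt(path string) []string {
	n := root
	for _, p := range strings.Split(path, "/") {
		child, ok := n.Children[p]
		if !ok {
			return nil
		}
		n = child
	}
	return n.Hosts
}

func main() {
	tree := NewTree()
	app := "data/bigdata/datacloud/hdfs/datanode"
	if err := tree.Assign(app, "10.0.0.1"); err != nil {
		fmt.Println(err)
	}
	fmt.Println(tree.HostsAt(app))
}
```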
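For example, the compilation information management unit is responsible for managing the build program of each product application. It may be developed on Jenkins and support automatically pulling code from the source code repository (gitlab). According to the configuration information, it may compile and build the application's binary program or image (including the runtime image and the program package).
For example, the deployment information management unit is responsible for deploying the compiled program package or packaged image to the designated physical machines and containers, and for keeping a record of every deployment.
The data layer may include a master storage node, a slave storage node, and a storage device. The master storage node stores the information sent by the service layer and synchronously backs it up to the slave storage node. For example, the data in the master storage node may be periodically backed up to the storage device, so that backup data can still be obtained if both the master and slave storage nodes are corrupted.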
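FIG. 6 shows a schematic diagram of still other embodiments of the cluster resource control apparatus of the present disclosure.
As shown in FIG. 6, the multi-component scaling controller may be an SSH controller built with Golang. The scripts of each business line may be connected to the internal structure of the multi-component scaling controller.
In some embodiments, the multi-component scaling controller relies on the concurrent scaling executor when executing pre-scripts and post-scripts. For example, the bottom layer of the concurrent scaling executor is a high-performance, Golang-based pool of concurrent SSH connections, which holds the SSH connections to the resources to be scaled out.
For example, a series of operation interfaces, such as connecting to a physical machine and executing script commands, can be encapsulated on top of the connection pool. Thanks to Golang's strong concurrency support, combined with the connection pool, servers and containers can be operated on concurrently at large scale, which improves the efficiency of scaling operations.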
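For example, to keep operations safe, the connection pool establishes only one SSH connection for each resource to be scaled out or scaled in (physical machine or container image) and periodically cleans up expired SSH connections.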
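FIG. 7 shows a schematic diagram of some embodiments of the cluster resource control method of the present disclosure.
As shown in FIG. 7, the method proceeds as follows. A user commits code to the git distributed version control system, and git triggers the build flow and the release deployment flow of the Jenkins automation server. For example, the build flow may be triggered either manually or automatically after the user commits code.
Jenkins performs the compilation, build, and image build through Docker, and carries out the release deployment. For example, a program package or a program image may be produced depending on the application type. If package deployment is used, as with big-data datanode resource nodes, the build produces only the program package, without a runtime image; for the image deployment type, the program package and the runtime image are packaged together into a program image.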
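The difference between the two build outputs can be written down as a small function that returns what the build hands to deployment. The `DeployType` values, the `Artifact` fields, and the `.tar.gz` naming are assumptions for illustration and do not reflect actual Jenkins or Docker configuration.

```go
package main

import "fmt"

// DeployType distinguishes the two deployment types named in the text.
type DeployType string

const (
	PackageDeploy DeployType = "package"
	ImageDeploy   DeployType = "image"
)

// Artifact is what the build hands to the deployment stage.
type Artifact struct {
	ProgramPackage string // always present
	RuntimeImage   string // empty for package deployment
}

// BuildArtifact picks the output shape for the given deployment type.
func BuildArtifact(app string, t DeployType, runtimeImage string) (Artifact, error) {
	pkg := app + ".tar.gz" // hypothetical naming scheme
	switch t {
	case PackageDeploy:
		return Artifact{ProgramPackage: pkg}, nil
	case ImageDeploy:
		if runtimeImage == "" {
			return Artifact{}, fmt.Errorf("image deployment of %q needs a runtime image", app)
		}
		return Artifact{ProgramPackage: pkg, RuntimeImage: runtimeImage}, nil
	default:
		return Artifact{}, fmt.Errorf("unknown deployment type %q", t)
	}
}

func main() {
	fmt.Println(BuildArtifact("datanode", PackageDeploy, ""))
	fmt.Println(BuildArtifact("web", ImageDeploy, "runtime:1.0"))
}
```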
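FIG. 8 shows a schematic diagram of yet other embodiments of the cluster resource control apparatus of the present disclosure.
As shown in FIG. 8, once the application has been compiled and built, the multi-component scaling controller, according to the type of the application being scaled out, uses the concurrent scaling executor to transfer the data package built by the unified component build system to the physical machine to be scaled out or to an idle machine in the resource pool.
When scaling out with a physical-machine program package, the multi-component deployment system executes the startup script, deploys and runs the program package on a physical machine of the private cloud cluster, and checks the execution status; when deploying a program image, the multi-component deployment system starts the image and checks the running status of the containers in the private cloud cluster.
FIG. 9 shows a schematic diagram of yet other embodiments of the cluster resource control apparatus of the present disclosure.
As shown in FIG. 9, the component scale-in recycling system recycles container resources and physical machine resources according to the application's deployment type. When what is recycled is a physical machine of the private cloud cluster, the system adds the physical machine to the spare pool and calls PXE to reinstall the operating system; when what is recycled is a program image (container) of the private cloud cluster, the system destroys the container and returns the resources to the resource pool.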
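The branching described for FIG. 9 can be summarized as a dispatch on the kind of resource being recycled. In the sketch below the pools are plain string slices and the returned action names are labels only; no PXE call or container runtime is invoked, and all identifiers are hypothetical.

```go
package main

import "fmt"

// Kind distinguishes the two kinds of recycled resources.
type Kind int

const (
	PhysicalMachine Kind = iota
	Container
)

// Pools holds the identifiers of recycled resources (illustrative only).
type Pools struct {
	Spare    []string // recycled physical machines
	Resource []string // capacity freed by destroyed containers
}

// Recycle records the resource in the matching pool and returns the
// ordered actions the recycling system would take for that kind.
func (p *Pools) Recycle(id string, k Kind) ([]string, error) {
	switch k {
	case PhysicalMachine:
		p.Spare = append(p.Spare, id)
		return []string{"add-to-spare-pool", "pxe-reinstall"}, nil
	case Container:
		p.Resource = append(p.Resource, id)
		return []string{"destroy-container", "return-to-resource-pool"}, nil
	default:
		return nil, fmt.Errorf("unknown resource kind %d", k)
	}
}

func main() {
	pools := &Pools{}
	fmt.Println(pools.Recycle("pm-01", PhysicalMachine))
	fmt.Println(pools.Recycle("ctr-42", Container))
	fmt.Println(pools.Spare, pools.Resource)
}
```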
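In some embodiments, before scale-in, the component scale-in recycling system needs to check whether business data already exists on the server or container to be scaled in, and whether that server or container hosts services that other applications depend on. If business data or dependent services exist, the component scale-in recycling system exits the scale-in flow.
In some embodiments, to remove the services deployed on a physical machine, the component recycling system may start the PXE installation flow, reinstall the operating system on the scaled-in physical machine and format its data disks, and then place the resource in the spare pool; or, after destroying a container, place the freed resource in the resource pool.
FIG. 10 shows a block diagram of some embodiments of the cluster resource control apparatus of the present disclosure.
As shown in FIG. 10, the cluster resource control apparatus 10 includes a determining unit 101, an adding unit 102, a generating unit 103, and an executing unit 104.
When the resource to be controlled is a resource to be scaled out, the determining unit 101 determines the binding relationship between the resource to be scaled out and the corresponding application.
According to the binding relationship, the adding unit 102 adds the initialized resource to be scaled out to the resource pool of the corresponding application.
In some embodiments, the adding unit 102 passes the information about the resource to be scaled out to the pre-script of the corresponding application and executes the pre-script to complete the initialization of the resource to be scaled out.
The generating unit 103 generates the data package to be executed for the application to be processed according to the deployment type of that application.
In some embodiments, when the deployment type is package deployment, the resource to be scaled out is a physical machine, and the generating unit 103 generates the program package of the application to be processed as the data package to be executed; when the deployment type is image deployment, the resource to be scaled out is a container image, and the generating unit 103 generates the program package of the application to be processed and then generates the data package to be executed from that program package and the runtime image of the application to be processed.
The executing unit 104 deploys the data package to be executed on the corresponding resource to be scaled out in the resource pool of the application to be processed, where it is executed.
In some embodiments, when the resource to be scaled out is a physical machine, the executing unit 104 sends the data package to be executed to that physical machine for execution; when the resource to be scaled out is a container image, the executing unit 104 sends the data package to be executed to an idle physical machine in the corresponding resource pool for execution.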
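In some embodiments, the executing unit 104 obtains the information about the resource to be scaled out and, through the deployment interface configured for the application to be processed, sends that information to the third-party program of the application to be processed, so that the third-party program deploys the data package to be executed on the resource to be scaled out in its own way.
In some embodiments, the executing unit 104 executes the post-script of the application to be processed. The post-script performs at least one of the following: returning the scale-out result to the management node of the cluster; creating the corresponding volume for the corresponding scaled-out resource; and cleaning up the garbage produced by the scale-out processing.
In some embodiments, the cluster resource control apparatus 10 further includes an establishing unit 105 configured to establish SSH connections to the resources to be scaled out. The SSH connections are used to execute the relevant scripts of the corresponding application. At any given time, the establishing unit 105 can hold only one SSH connection to each resource to be scaled out.
In some embodiments, after the data package to be executed of the corresponding application has finished executing, the establishing unit 105 keeps the corresponding SSH connection for a preset period of time.
In some embodiments, the cluster resource control apparatus 10 further includes a scale-in unit 106 configured to determine, when the resource to be controlled is a resource to be scaled in, whether the resource holds important data and whether any service depends on it; when there is no important data and no service depends on the resource to be scaled in, the scale-in unit 106 removes the resource from the cluster.
In some embodiments, the scale-in unit 106 passes the obtained information about the resource to be scaled in through the configured scale-in interface, so that the resource is removed from the cluster.
In some embodiments, the scale-in unit 106 obtains the scale-in result by polling the configured query interface.
In some embodiments, after the scale-in unit 106 removes the resource to be scaled in from the cluster, if the resource is a physical machine, the adding unit 102 adds it to the resource pool; if the resource is a container image, the scale-in unit 106 destroys the container image and the adding unit 102 adds the resource to the resource pool.
In some embodiments, when the resource to be scaled in is a physical machine, the executing unit 104 executes the post-script of the scale-in processing, which starts the installation flow to reinstall the operating system on the resource.
In the above embodiments, the application's data package is deployed and executed on the corresponding resource to be scaled out according to the binding relationship between the resource and the application. This approach applies to scaling out resources for applications of different businesses, and thus improves the applicability of resource scale-out.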
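FIG. 11 shows a block diagram of other embodiments of the cluster resource control apparatus of the present disclosure.
As shown in FIG. 11, the cluster resource control apparatus 11 of this embodiment includes a memory 111 and a processor 112 coupled to the memory 111. The processor 112 is configured to perform the cluster resource control method of any embodiment of the present disclosure based on instructions stored in the memory 111.
The memory 111 may include, for example, a system memory and a fixed non-volatile storage medium. The system memory stores, for example, an operating system, application programs, a boot loader, a database, and other programs.
FIG. 12 shows a block diagram of still other embodiments of the cluster resource control apparatus of the present disclosure.
As shown in FIG. 12, the cluster resource control apparatus 12 of this embodiment includes a memory 1210 and a processor 1220 coupled to the memory 1210. The processor 1220 is configured to perform the cluster resource control method of any of the foregoing embodiments based on instructions stored in the memory 1210.
The memory 1210 may include, for example, a system memory and a fixed non-volatile storage medium. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs.
The cluster resource control apparatus 12 may further include an input/output interface 1230, a network interface 1240, a storage interface 1250, and the like. These interfaces 1230, 1240, and 1250, the memory 1210, and the processor 1220 may be connected, for example, through a bus 1260. The input/output interface 1230 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 1240 provides a connection interface for various networked devices. The storage interface 1250 provides a connection interface for external storage devices such as SD cards and USB flash drives.
Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media containing computer-usable program code.
FIG. 13 shows a block diagram of some embodiments of the cluster resource control system of the present disclosure.
As shown in FIG. 13, the cloud computing system 13 includes a cluster resource control apparatus 131 configured to perform the cluster resource control method of any of the above embodiments.
Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present disclosure has now been described in detail. To avoid obscuring the concepts of the present disclosure, some details well known in the art have not been described. From the above description, those skilled in the art can fully understand how to implement the technical solutions disclosed here.
The methods and systems of the present disclosure may be implemented in many ways, for example through software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the method steps is for illustration only, and the steps of the methods of the present disclosure are not limited to the order specifically described above unless otherwise stated. In addition, in some embodiments, the present disclosure may also be implemented as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. The present disclosure therefore also covers a recording medium storing programs for performing the methods according to the present disclosure.
Although some specific embodiments of the present disclosure have been described in detail by way of example, those skilled in the art should understand that the above examples are for illustration only and are not intended to limit the scope of the present disclosure. Those skilled in the art should understand that the above embodiments may be modified without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.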

Claims (16)

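  1. A method for controlling cluster resources, comprising:
    when a resource to be controlled is a resource to be scaled out, determining a binding relationship between the resource to be scaled out and an application;
    adding the initialized resource to be scaled out to the resource pool of the corresponding application with which it has the binding relationship;
    generating a data package to be executed for an application to be processed according to the deployment type of the application to be processed;
    deploying the data package to be executed on a resource to be scaled out in the resource pool of the application to be processed for execution.
  2. The control method according to claim 1, wherein adding the initialized resource to be scaled out to the resource pool of the corresponding application with which it has the binding relationship comprises:
    passing information about the resource to be scaled out to a pre-script of the corresponding application;
    executing the pre-script to complete the initialization of the resource to be scaled out.
  3. The control method according to claim 1, wherein generating the data package to be executed for the application to be processed comprises:
    when the deployment type is package deployment, determining that the resource to be scaled out is a physical machine, and generating a program package of the application to be processed as the data package to be executed;
    when the deployment type is image deployment, determining that the resource to be scaled out is a container image, generating a program package of the application to be processed, and generating the data package to be executed from the program package and a runtime image of the application to be processed.
  4. The control method according to claim 3, wherein deploying the data package to be executed on a resource to be scaled out in the resource pool of the application to be processed for execution comprises:
    when the resource to be scaled out is a physical machine, sending the data package to be executed to the physical machine for execution;
    when the resource to be scaled out is a container image, sending the data package to be executed to an idle physical machine in the corresponding resource pool for execution.
  5. The control method according to claim 1, wherein deploying the data package to be executed on a resource to be scaled out in the resource pool of the application to be processed for execution comprises:
    obtaining information about the resource to be scaled out in the resource pool;
    sending the information about the resource to be scaled out, through a deployment interface configured for the application to be processed, to a third-party program of the application to be processed, so that the third-party program deploys the data package to be executed on the resource to be scaled out for execution in its own way.
  6. The control method according to claim 1, further comprising:
    executing a post-script of the application to be processed, the post-script being used for at least one of the following:
    returning a scale-out result to a management node of the cluster;
    creating a volume for the scaled-out resource in the resource pool;
    cleaning up garbage produced by the scale-out processing.
  7. The control method according to claim 1, further comprising:
    establishing Secure Shell (SSH) connections to the resources to be scaled out for executing the relevant scripts of the corresponding application, wherein only one SSH connection can be established with each resource to be scaled out at any given time.
  8. The control method according to claim 7, further comprising:
    after the data package to be executed of the corresponding application has finished executing, keeping the corresponding SSH connection for a preset period of time.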
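  9. The control method according to any one of claims 1 to 8, further comprising:
    when the resource to be controlled is a resource to be scaled in, determining whether the resource to be scaled in holds important data and whether any service depends on the resource to be scaled in;
    when there is no important data and no service depends on the resource to be scaled in, removing the resource to be scaled in from the cluster.
  10. The control method according to claim 9, wherein removing the resource to be scaled in from the cluster comprises:
    passing the obtained information about the resource to be scaled in through a configured scale-in interface, so that the resource to be scaled in is removed from the cluster;
    the method further comprising:
    obtaining a scale-in result by polling a configured query interface.
  11. The control method according to claim 9, further comprising, after the resource to be scaled in is removed from the cluster:
    when the resource to be scaled in is a physical machine, adding the resource to be scaled in to a resource pool;
    when the resource to be scaled in is a container image, destroying the container image and adding the resource to be scaled in to a resource pool.
  12. The control method according to claim 11, further comprising:
    when the resource to be scaled in is a physical machine, executing a post-script of the scale-in processing, the post-script being used to start an installation flow that reinstalls the operating system on the resource to be scaled in.
  13. An apparatus for controlling cluster resources, comprising:
    a determining unit configured to determine, when a resource to be controlled is a resource to be scaled out, a binding relationship between the resource to be scaled out and an application;
    an adding unit configured to add the initialized resource to be scaled out to the resource pool of the corresponding application with which it has the binding relationship;
    a generating unit configured to generate a data package to be executed for an application to be processed according to the deployment type of the application to be processed;
    an executing unit configured to deploy the data package to be executed on a resource to be scaled out in the resource pool of the application to be processed for execution.
  14. An apparatus for controlling cluster resources, comprising:
    a memory; and
    a processor coupled to the memory, the processor being configured to perform the method for controlling cluster resources according to any one of claims 1 to 12 based on instructions stored in the memory.
  15. A cloud computing system, comprising:
    an apparatus for controlling cluster resources, configured to perform the method for controlling cluster resources according to any one of claims 1 to 12.
  16. A non-volatile computer-readable storage medium storing a computer program which, when executed by a processor, implements the method for controlling cluster resources according to any one of claims 1 to 12.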
PCT/CN2020/117413 2019-12-05 2020-09-24 Control method and apparatus of cluster resource, and cloud computing system WO2021109686A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2022534290A JP2023504870A (ja) 2019-12-05 2020-09-24 Control method and apparatus of cluster resource, and cloud computing system
EP20897260.4A EP4071611A4 (en) 2019-12-05 2020-09-24 METHOD FOR CONTROLLING CLUSTER RESOURCES AND APPARATUS AND CLOUD COMPUTING SYSTEM
US17/782,110 US20230004439A1 (en) 2019-12-05 2020-09-24 Control method and apparatus of cluster resource, and cloud computing system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911232841.4 2019-12-05
CN201911232841.4A CN110968427A (zh) 2019-12-05 2019-12-05 Control method and apparatus of cluster resource, and cloud computing system

Publications (1)

Publication Number Publication Date
WO2021109686A1 true WO2021109686A1 (zh) 2021-06-10

Family

ID=70033132

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/117413 WO2021109686A1 (zh) 2019-12-05 2020-09-24 Control method and apparatus of cluster resource, and cloud computing system

Country Status (5)

Country Link
US (1) US20230004439A1 (zh)
EP (1) EP4071611A4 (zh)
JP (1) JP2023504870A (zh)
CN (1) CN110968427A (zh)
WO (1) WO2021109686A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116010096A (zh) * 2023-01-04 2023-04-25 上海弘积信息科技有限公司 Centralized management method and system for load balancing devices

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
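CN110968427A (zh) * 2019-12-05 2020-04-07 北京京东尚科信息技术有限公司 Control method and apparatus of cluster resource, and cloud computing system
CN112698916B (zh) * 2020-12-31 2024-04-12 北京千方科技股份有限公司 Management and control system and method for multiple container clusters, and storage medium
CN113472565B (zh) * 2021-06-03 2024-02-20 北京闲徕互娱网络科技有限公司 Method, apparatus, and device for scaling out server functions, and computer-readable medium
CN115292045B (zh) * 2022-08-09 2023-06-23 安超云软件有限公司 Method, system, and electronic device for Jenkins multi-node reuse
CN117648173B (zh) * 2024-01-26 2024-05-14 杭州阿里云飞天信息技术有限公司 Resource scheduling method and apparatus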

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
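CN105867955A (zh) * 2015-09-18 2016-08-17 乐视云计算有限公司 Application deployment system and deployment method
TW201812611A (zh) * 2016-09-01 2018-04-01 中華電信股份有限公司 System and method for automatic construction and horizontal scaling of service components based on dynamic cluster rule templates
CN109062666A (zh) * 2018-07-27 2018-12-21 浪潮电子信息产业股份有限公司 Virtual machine cluster management method and related apparatus
CN110597623A (zh) * 2019-08-13 2019-12-20 平安普惠企业管理有限公司 Container resource allocation method and apparatus, computer device, and storage medium
CN110825494A (zh) * 2019-11-01 2020-02-21 北京京东尚科信息技术有限公司 Physical machine scheduling method and apparatus, and computer-storable medium
CN110968427A (zh) * 2019-12-05 2020-04-07 北京京东尚科信息技术有限公司 Control method and apparatus of cluster resource, and cloud computing system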

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8584131B2 (en) * 2007-03-30 2013-11-12 International Business Machines Corporation Method and system for modeling and analyzing computing resource requirements of software applications in a shared and distributed computing environment
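CN109803018B (zh) * 2019-01-24 2022-06-03 云南电网有限责任公司信息中心 DCOS cloud management platform based on a combination of Mesos and YARN
CN110333877A (zh) * 2019-07-09 2019-10-15 西安点告网络科技有限公司 Application-based visual container configuration management method, apparatus, and system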

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4071611A4 *

Also Published As

Publication number Publication date
US20230004439A1 (en) 2023-01-05
CN110968427A (zh) 2020-04-07
JP2023504870A (ja) 2023-02-07
EP4071611A1 (en) 2022-10-12
EP4071611A4 (en) 2023-12-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20897260

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022534290

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 202237038494

Country of ref document: IN

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020897260

Country of ref document: EP

Effective date: 20220705