WO2023098614A1 - Method for expanding and contracting a cloud instance and related devices - Google Patents

Method for expanding and contracting a cloud instance and related devices

Info

Publication number
WO2023098614A1
Authority
WO
WIPO (PCT)
Prior art keywords
cloud
cloud instance
working node
instance
node
Prior art date
Application number
PCT/CN2022/134647
Other languages
English (en)
French (fr)
Inventor
蔡灏旻
敬锐
雷钟凯
卢景晓
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023098614A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5022 Mechanisms to release resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present application relates to the field of cloud technology, and in particular to a method for expanding and contracting a cloud instance and related equipment.
  • a cloud service system usually includes multiple working nodes (workers) and management nodes (master), where multiple containers (docker) are deployed on each working node, and the management node can centrally manage all the containers.
  • the cloud service system uses Kubernetes as the container management standard, which can implement functions such as orchestration and deployment, grayscale upgrade, and automatic expansion and contraction of containers.
  • Kubernetes supports two automatic scaling methods: vertical pod autoscaling (vertical pod autoscale, VPA) and horizontal pod autoscaling (horizontal pod autoscale, HPA), where a pod is a group of containers.
  • VPA: vertical pod autoscale
  • HPA: horizontal pod autoscale
  • the management node can calculate the recommended resource quota value of the pod based on the resource occupancy rate of the pod in a certain working node, and send it to the working node.
  • when the working node creates a pod, it can set a new resource quota for the pod based on the recommended value (for example, increasing or reducing the resource quota of the pod, which is equivalent to expanding or shrinking the pod).
  • however, the resource quota of a pod can only be modified at the time the pod is created.
  • to change the quota of a running pod, the working node must therefore first release the pod and then set the new resource quota when the pod is recreated, which interrupts the business running on the pod.
  • the embodiments of the present application provide a method for expanding and contracting a cloud instance and related equipment, which can ensure that services running on the cloud instance will not be interrupted when the resource quota of the cloud instance is increased or decreased.
  • the first aspect of the embodiments of the present application provides a method for expanding and contracting a cloud instance.
  • the method is applied to a cloud service system.
  • the cloud service system includes multiple working nodes; one of them is taken as an example below and is referred to as the first working node. The method includes:
  • the first working node is deployed with multiple cloud instances.
  • multiple cloud instances can be multiple containers.
  • multiple cloud instances can be multiple groups of containers.
  • multiple cloud instances can be multiple virtual machines.
  • multiple cloud instances may be multiple groups of virtual machines and so on.
  • the first working node can acquire the status information of the multiple cloud instances, and the status information of each cloud instance can be used to indicate the status of the service run by the cloud instance.
  • the first working node can analyze the multiple cloud instances one by one based on their status information, so as to determine, among the multiple cloud instances, the cloud instance to be expanded and the amount of resources required to expand it.
  • the first working node can determine its amount of idle resources and detect whether that amount is greater than or equal to the amount of resources required for expansion.
  • if the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the idle resources of the first working node are sufficient, so the first working node can directly expand the cloud instance to be expanded, that is, increase its resource quota based on the amount of resources required for expansion, thereby realizing the expansion of the cloud instance.
  • the first working node can determine the cloud instance to be expanded and the amount of resources required for capacity expansion among the multiple cloud instances based on the state information. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node may increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion. Based on the foregoing process, it can be seen that this application provides a new cloud instance expansion mechanism. After the first working node determines the cloud instance to be expanded, it can modify the cgroup configuration in real time to increase the resource quota of the cloud instance to be expanded. This is not perceived by the business running on the cloud instance to be expanded, so it will not interrupt the business running on the cloud instance to be expanded.
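The in-place expansion path described above can be sketched as follows. This is a minimal illustration, not the implementation from the application: the class names `CloudInstance` and `WorkingNode`, the single abstract resource unit, and the rule that idle resources equal capacity minus assigned quotas are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CloudInstance:
    name: str
    quota: int  # current resource quota, in an abstract unit (assumed)


@dataclass
class WorkingNode:
    capacity: int  # total resources of the node, same abstract unit (assumed)
    instances: dict = field(default_factory=dict)

    def idle(self) -> int:
        # Assumption: idle resources = capacity minus the quotas already assigned.
        return self.capacity - sum(i.quota for i in self.instances.values())

    def try_expand(self, name: str, required: int) -> bool:
        """Raise the quota in place if the node has enough idle resources."""
        if self.idle() < required:
            return False  # not enough idle resources; the caller must fall back
        self.instances[name].quota += required
        return True


node = WorkingNode(capacity=10)
node.instances["a"] = CloudInstance("a", quota=4)
node.instances["b"] = CloudInstance("b", quota=3)
expanded = node.try_expand("a", 3)  # idle is 3, so the quota of "a" is raised
rejected = node.try_expand("b", 1)  # idle is now 0, so this request is refused
```

When `try_expand` returns `False`, the fallback steps described in the following paragraphs (migrating low-priority instances, or using a third working node) would apply.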
  • the first working node determining, among the multiple cloud instances, the cloud instance to be expanded and the amount of resources required for expansion includes: among the multiple cloud instances, the first working node determines a cloud instance whose state information satisfies the preset expansion condition as the cloud instance to be expanded; the first working node determines the amount of resources required for expansion based on the state information of the cloud instance to be expanded.
  • the first working node can detect whether the state information of a cloud instance satisfies the preset expansion condition. If so, the cloud instance needs to be expanded and is determined as the cloud instance to be expanded; if not, the cloud instance does not need to be expanded and the operation ends. After a cloud instance is determined as the cloud instance to be expanded, the amount of resources required for its expansion can also be accurately calculated based on its state information.
  • the cloud service system further includes a second working node, and the method further includes: if the amount of idle resources of the first working node is less than the amount of resources required for expansion, the first working node determines, among the multiple cloud instances, a cloud instance to be migrated, where the priority of the business run by the cloud instance to be migrated is lower than a preset priority; the first working node migrates the cloud instance to be migrated to the second working node, so as to update the amount of idle resources of the first working node; if the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node increases the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion.
  • the first working node can determine at least one cloud instance to be migrated among the multiple cloud instances. The priority of the business run by these cloud instances is lower than the preset priority, so they can be migrated to the second working node. The resources allocated to them in the first working node are then released and become new idle resources, which updates (that is, increases) the amount of idle resources of the first working node.
  • the first working node can then detect whether its updated amount of idle resources is greater than or equal to the amount of resources required for expansion of the cloud instance to be expanded. If so, the idle resources of the updated first working node are sufficient and can be used to expand the cloud instance directly, so the first working node can increase the resource quota of the cloud instance based on the amount of resources required for expansion, thereby realizing the expansion of the cloud instance.
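The migration step above can be sketched as follows. The text only requires that migrated instances run business below the preset priority; migrating the lowest priority first and stopping as soon as enough resources are freed are assumptions of this sketch, as are the names `Instance` and `free_by_migration` and the convention that a larger number means higher priority.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    quota: int
    priority: int  # larger means more important (assumed convention)


def free_by_migration(first, second, target, required, capacity, preset_priority):
    """Move low-priority instances from `first` to `second` until `required` fits.

    `first` and `second` are lists of Instance. Returns True if the first
    node ends up with enough idle resources for the expansion of `target`.
    """
    def idle():
        return capacity - sum(i.quota for i in first)

    # Candidates: instances below the preset priority, excluding the target itself.
    candidates = sorted(
        (i for i in first if i.priority < preset_priority and i.name != target),
        key=lambda i: i.priority,
    )
    for inst in candidates:
        if idle() >= required:
            break
        first.remove(inst)   # resources allocated to inst become idle again
        second.append(inst)  # the instance continues on the second working node
    return idle() >= required


first = [Instance("web", 5, 9), Instance("batch", 3, 1), Instance("log", 2, 2)]
second = []
ok = free_by_migration(first, second, "web", 3, capacity=10, preset_priority=5)
```

In this example only `batch` is moved, because releasing its quota already frees the 3 units that `web` needs. The sketch does not check the idle resources of the second working node, which a real system would have to do.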
  • the cloud service system further includes a third working node, and the method further includes: if the updated amount of idle resources of the first working node is less than the amount of resources required for expansion, the first working node detects the type of business run by the cloud instance to be expanded; if the business run by the cloud instance to be expanded is a stateless application, the first working node creates a new cloud instance at the third working node, and the new cloud instance and the cloud instance to be expanded jointly run the stateless application; if the business run by the cloud instance to be expanded is a stateful application, the first working node migrates the cloud instance to be expanded to the third working node.
  • the first working node can first detect the type of business run by the cloud instance (which can also be understood as the application run by the cloud instance) and perform the corresponding processing: if the business run by the cloud instance is a stateless application, the first working node can apply to create a new cloud instance at the third working node; the cloud instance and the new cloud instance then jointly run the business originally run by the cloud instance, which is equivalent to realizing expansion.
  • if the business run by the cloud instance is a stateful application, the node agent can migrate the cloud instance to the third working node, where the amount of idle resources of the third working node is greater than or equal to the amount of resources required for the expansion of the cloud instance. After the cloud instance is migrated to the third working node, its resource quota can be increased, which is equivalent to completing the expansion.
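The choice between the two fallbacks depends only on the business type, which can be sketched as a small dispatch function. The enum `AppKind` and the returned action names are hypothetical labels introduced for this example.

```python
from enum import Enum


class AppKind(Enum):
    STATELESS = "stateless"
    STATEFUL = "stateful"


def fallback_action(kind: AppKind) -> str:
    """Pick the fallback once the first node still lacks idle resources."""
    if kind is AppKind.STATELESS:
        # A new instance on the third node shares the work with the original,
        # so the original instance stays where it is.
        return "create_new_instance_on_third_node"
    # A stateful application is moved as a whole (cold or hot migration),
    # and its quota is increased on the third node afterwards.
    return "migrate_instance_to_third_node"
```

For instance, `fallback_action(AppKind.STATELESS)` selects the creation of a new instance, while `fallback_action(AppKind.STATEFUL)` selects migration.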
  • the state information includes at least one of the following: resource occupancy rate, load level, and service success rate. It can be seen that the status information of the cloud instance is related to the business logic of the cloud instance.
  • the preset expansion conditions include at least one of the following: the resource occupancy rate is greater than or equal to the preset first resource occupancy rate, the load level is greater than or equal to the preset first load level, and the business success rate is less than the preset first service success rate.
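A preset expansion condition of this kind can be sketched as a predicate over the status information. The threshold values below are hypothetical, and treating the condition as satisfied when any one of the three comparisons holds is an assumption; the text does not state how several conditions are combined.

```python
from dataclasses import dataclass


@dataclass
class Status:
    resource_occupancy: float  # fraction of the quota in use, 0.0 to 1.0
    load_level: float          # e.g. requests per second (assumed unit)
    success_rate: float        # fraction of successful business requests


@dataclass
class ExpansionThresholds:
    # Hypothetical preset values, chosen only for illustration.
    first_resource_occupancy: float = 0.8
    first_load_level: float = 1000.0
    first_success_rate: float = 0.99


def needs_expansion(s: Status, t: ExpansionThresholds) -> bool:
    # Assumption: meeting any single condition is enough to trigger expansion.
    return (
        s.resource_occupancy >= t.first_resource_occupancy
        or s.load_level >= t.first_load_level
        or s.success_rate < t.first_success_rate
    )
```

Including the success rate lets the check react to a degraded service even when resource occupancy alone looks healthy.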
  • in VPA, whether a cloud instance needs to be expanded is detected only based on the resource occupancy rate of the cloud instance, so business needs cannot be understood in depth. Multiple detections are required to accurately determine whether the cloud instance needs to be expanded, and detection takes a long time.
  • the aforementioned implementation can detect whether a cloud instance needs to be expanded based on the status information of the cloud instance.
  • the status information of the cloud instance includes information such as the resource occupancy rate, the load level, and the business success rate of the cloud instance.
  • the status information of the cloud instance is related to business logic and can better reflect business needs. Therefore, based on the status information of cloud instances, working nodes can sense business needs in real time and accurately detect whether cloud instances need to be expanded, which can effectively reduce the number of detections and thereby shorten the detection time.
  • the manner of migrating the cloud instance to be expanded to the third working node is cold migration or hot migration.
  • the cloud service system further includes a management node, and the method further includes: the first working node sends the increased resource quota of the cloud instance to be expanded to the management node.
  • after the first working node increases the resource quota of the cloud instance to be expanded, it can send the increased resource quota to the management node, so that the management node and the first working node keep the resource quota of the cloud instance synchronized and the global resource configuration information (that is, that of the entire cloud service system) stays accurate and consistent.
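The synchronization with the management node can be sketched as a simple report call. The class `ManagementNode`, its `report` method, and the node and instance identifiers are hypothetical; the text does not describe the message format or transport.

```python
class ManagementNode:
    """Hypothetical management node that keeps the global quota view."""

    def __init__(self):
        self.quotas = {}  # (node_id, instance_name) -> resource quota

    def report(self, node_id: str, instance: str, quota: int) -> None:
        # Overwrite the previous value so the global view matches the node.
        self.quotas[(node_id, instance)] = quota

    def total_allocated(self, node_id: str) -> int:
        return sum(q for (n, _), q in self.quotas.items() if n == node_id)


master = ManagementNode()
master.report("worker-1", "a", 4)
master.report("worker-1", "b", 3)
master.report("worker-1", "a", 7)  # sent after the quota of "a" was increased
```

Because each report replaces the stored value, the management node sees the adjusted quota without having taken part in the scaling decision.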
  • the second aspect of the embodiment of the present application provides a method for reducing the capacity of a cloud instance, the method is applied to a cloud service system, the cloud service system includes a first working node, and the method includes:
  • the first working node is deployed with multiple cloud instances.
  • multiple cloud instances can be multiple containers.
  • multiple cloud instances can be multiple groups of containers.
  • multiple cloud instances can be multiple virtual machines.
  • multiple cloud instances may be multiple groups of virtual machines and so on.
  • the first working node can acquire the status information of the multiple cloud instances, and the status information of each cloud instance can be used to indicate the status of the service run by the cloud instance.
  • the first working node can analyze the multiple cloud instances one by one based on the status information of the multiple cloud instances, so as to determine the cloud instance to be scaled down among the multiple cloud instances.
  • the first working node can determine the non-idle resources and idle resources in the cloud instance to be scaled down, release the idle resources of the cloud instance to be scaled down, and calculate the size of the released resources, that is, determine the amount of released idle resources of the cloud instance to be scaled down. Then, the first working node may reduce the resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down, thereby realizing the scaling down of the cloud instance.
  • the first working node can determine the cloud instance to be scaled down among the multiple cloud instances based on the status information. Then, the first working node releases the idle resources of the cloud instance to be scaled down, and determines the released idle resource amount of the cloud instance to be scaled down. Finally, the first working node may reduce the resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down. Based on the foregoing process, it can be seen that this application provides a new cloud instance scaling mechanism.
  • after the first working node determines the cloud instance to be scaled down by itself, it can modify the cgroup configuration in real time to reduce the resource quota of the cloud instance to be scaled down. The business running on the cloud instance to be scaled down does not perceive this change, so the business is not interrupted.
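Modifying the cgroup configuration in real time can be sketched as rewriting a limit file of the cgroup that backs the cloud instance. The text does not say which cgroup version or which files are used; this sketch assumes cgroup v2 and its `memory.max` file, and the function name `shrink_memory_quota` is hypothetical. The demonstration writes to a scratch directory rather than to `/sys/fs/cgroup`.

```python
import tempfile
from pathlib import Path


def shrink_memory_quota(cgroup_dir: Path, released_bytes: int) -> int:
    """Lower the memory limit of a cgroup by the amount that was released."""
    limit_file = cgroup_dir / "memory.max"
    current = limit_file.read_text().strip()
    if current == "max":
        raise ValueError("cgroup has no finite memory limit to reduce")
    new_limit = int(current) - released_bytes
    if new_limit <= 0:
        raise ValueError("released amount exceeds the current quota")
    # On a real cgroup filesystem this write takes effect for the running
    # processes; nothing is stopped or recreated.
    limit_file.write_text(str(new_limit))
    return new_limit


# Demonstration against a scratch directory instead of a real cgroup.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    (demo / "memory.max").write_text("1073741824\n")  # 1 GiB
    result = shrink_memory_quota(demo, 268435456)      # release 256 MiB
    written = (demo / "memory.max").read_text()
```

Expansion would use the same mechanism with a larger value written to the limit file.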
  • the first working node determining, based on the state information, the cloud instance to be scaled down among the multiple cloud instances includes: among the multiple cloud instances, the first working node determines a cloud instance whose state information satisfies the preset shrinking condition as the cloud instance to be scaled down.
  • the first working node can detect whether the state information of a cloud instance satisfies the preset shrinking condition. If so, the cloud instance needs to be scaled down and is determined as the cloud instance to be scaled down; if not, the cloud instance does not need to be scaled down and the operation ends.
  • the state information includes at least one of the following: resource occupancy rate, load level, and service success rate. It can be seen that the status information of the cloud instance is related to the business logic of the cloud instance.
  • the preset shrinking conditions include at least one of the following: the resource occupancy rate is lower than the preset second resource occupancy rate, the load level is lower than the preset second load level, and the service success rate is greater than or equal to the preset second service success rate.
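A preset shrinking condition can be sketched in the same style as a predicate whose individual conditions are optional. Requiring every configured condition to hold is a conservative assumption of this sketch, since the text does not state how several conditions are combined; the names `ShrinkThresholds` and `needs_shrinking` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShrinkThresholds:
    # None means the condition is not part of the preset shrinking conditions.
    second_resource_occupancy: Optional[float] = None
    second_load_level: Optional[float] = None
    second_success_rate: Optional[float] = None


def needs_shrinking(occupancy: float, load: float, success_rate: float,
                    t: ShrinkThresholds) -> bool:
    checks = []
    if t.second_resource_occupancy is not None:
        checks.append(occupancy < t.second_resource_occupancy)
    if t.second_load_level is not None:
        checks.append(load < t.second_load_level)
    if t.second_success_rate is not None:
        checks.append(success_rate >= t.second_success_rate)
    # Assumption: at least one condition is configured, and all configured
    # conditions must hold before the instance is scaled down.
    return bool(checks) and all(checks)
```

With this combination rule, a high success rate alone does not trigger scaling down when a low-occupancy condition is also configured and not met.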
  • in VPA, whether a cloud instance needs to be scaled down is detected only based on the resource occupancy rate of the cloud instance, so business needs cannot be understood in depth. Multiple detections are required to accurately determine whether the cloud instance needs to be scaled down, and detection takes a long time.
  • the aforementioned implementation can detect whether a cloud instance needs to be scaled down based on the status information of the cloud instance.
  • the status information of the cloud instance includes information such as the resource occupancy rate, the load level, and the business success rate of the cloud instance.
  • the status information of the cloud instance is related to the business logic of the cloud instance and can better reflect business needs. Therefore, based on the status information of the cloud instance, the working node can sense business needs in real time and accurately detect whether the cloud instance needs to be scaled down, which can effectively reduce the number of detections and thereby shorten the detection time.
  • the cloud service system further includes a management node, and the method further includes: the first working node sends the reduced resource quota of the cloud instance to be scaled down to the management node.
  • after the first working node reduces the resource quota of the cloud instance, it can send the reduced resource quota to the management node, so that the management node and the first working node keep the resource quota of the cloud instance synchronized and the global resource configuration information (that is, that of the entire cloud service system) stays accurate and consistent.
  • the third aspect of the embodiment of the present application provides a working node.
  • the working node is used as the first working node.
  • the first working node is set in the cloud service system.
  • the first working node is deployed with multiple cloud instances.
  • the first working node includes: an acquisition module, used to obtain status information of multiple cloud instances; a first determination module, used to determine, based on the status information, the cloud instance to be expanded among the multiple cloud instances and the amount of resources required for expansion; and a first adjustment module, used to increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion if the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion.
  • the first working node can determine the cloud instance to be expanded and the amount of resources required for capacity expansion among the multiple cloud instances based on the state information. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node can increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion. Based on the foregoing process, it can be seen that this application provides a new cloud instance expansion mechanism. After the first working node determines the cloud instance to be expanded, it can modify the cgroup configuration in real time to increase the resource quota of the cloud instance to be expanded. This is not perceived by the business running on the cloud instance to be expanded, so it will not interrupt the business running on the cloud instance to be expanded.
  • the first determining module is configured to: among the multiple cloud instances, determine a cloud instance whose state information satisfies the preset expansion condition as the cloud instance to be expanded; and determine the amount of resources required for expansion based on the state information of the cloud instance to be expanded.
  • the cloud service system further includes a second working node, and the first working node further includes: a second determination module, configured to determine a cloud instance to be migrated among the multiple cloud instances if the amount of idle resources of the first working node is less than the amount of resources required for expansion, where the priority of the business run by the cloud instance to be migrated is lower than the preset priority; a first migration module, used to migrate the cloud instance to be migrated to the second working node, so as to update the amount of idle resources of the first working node; and a second adjustment module, used to increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion if the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion.
  • the cloud service system further includes a third working node, and the first working node further includes: a detection module, configured to detect the type of business run by the cloud instance to be expanded if the updated amount of idle resources of the first working node is less than the amount of resources required for expansion; a creation module, used to create a new cloud instance at the third working node if the business run by the cloud instance to be expanded is a stateless application, where the new cloud instance and the cloud instance to be expanded jointly run the stateless application; and a second migration module, used to migrate the cloud instance to be expanded to the third working node if the business run by the cloud instance to be expanded is a stateful application.
  • the state information includes at least one of the following: resource occupancy rate, load level, and service success rate.
  • the preset expansion conditions include at least one of the following: the resource occupancy rate is greater than or equal to the preset first resource occupancy rate, the load level is greater than or equal to the preset first load level, and the business success rate is less than the preset first service success rate.
  • the aforementioned migration is cold migration or hot migration.
  • the cloud service system further includes a management node
  • the first working node further includes: a feedback module, configured to send the resource quota of the cloud instance to be expanded to the management node.
  • the fourth aspect of the embodiment of the present application provides a working node.
  • the working node is used as the first working node.
  • the first working node is set in the cloud service system.
  • the first working node is deployed with multiple cloud instances.
  • the first working node includes:
  • the acquisition module is used to obtain the status information of multiple cloud instances;
  • the determination module is used to determine the cloud instance to be scaled down among multiple cloud instances based on the status information;
  • the release module is used to release the idle resources of the cloud instance to be scaled down and determine the amount of released idle resources of the cloud instance to be scaled down;
  • the adjustment module is configured to reduce the resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down.
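The four modules listed above can be sketched as methods of one class that are called in sequence. The names `Instance` and `ScaleInNode` are hypothetical, the shrinking decision is assumed to be precomputed in the `shrinkable` field, and releasing the entire unused part of the quota is a simplification that leaves the instance no headroom.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    quota: int        # resource quota currently assigned
    in_use: int       # non-idle part of the quota
    shrinkable: bool  # outcome of the preset shrinking condition (assumed given)


class ScaleInNode:
    """Hypothetical first working node composed of the four modules."""

    def __init__(self, instances):
        self.instances = instances

    def acquire(self):
        # Acquisition module: collect the status information of all instances.
        return list(self.instances)

    def determine(self, statuses):
        # Determination module: select the cloud instances to be scaled down.
        return [i for i in statuses if i.shrinkable]

    def release(self, inst):
        # Release module: the idle amount is the part of the quota not in use.
        return inst.quota - inst.in_use

    def adjust(self, inst, released):
        # Adjustment module: reduce the quota by the released amount.
        inst.quota -= released

    def run(self):
        changed = {}
        for inst in self.determine(self.acquire()):
            released = self.release(inst)
            self.adjust(inst, released)
            changed[inst.name] = inst.quota
        return changed


node = ScaleInNode([Instance("a", 8, 3, True), Instance("b", 4, 4, False)])
new_quotas = node.run()
```

In this run only instance `a` is adjusted: its 5 idle units are released and its quota drops from 8 to 3, while `b` is left untouched.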
  • the first working node can determine the cloud instance to be scaled down among the multiple cloud instances based on the status information. Then, the first working node releases the idle resources of the cloud instance to be scaled down, and determines the released idle resource amount of the cloud instance to be scaled down. Finally, the first working node may reduce the resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down. Based on the foregoing process, it can be seen that this application provides a new cloud instance scaling mechanism.
  • after the first working node determines the cloud instance to be scaled down by itself, it can modify the cgroup configuration in real time to reduce the resource quota of the cloud instance to be scaled down. The business running on the cloud instance to be scaled down does not perceive this change, so the business is not interrupted.
  • the determining module is configured to, among multiple cloud instances, determine a cloud instance whose state information satisfies a preset shrinkage condition as a cloud instance to be scaled down.
  • the state information includes at least one of the following: resource occupancy rate, load level, and service success rate.
  • the preset shrinking conditions include at least one of the following: the resource occupancy rate is lower than the preset second resource occupancy rate, the load level is lower than the preset second load level, and the service success rate is greater than or equal to the preset second service success rate.
  • the cloud service system further includes a management node
  • the first working node further includes: a feedback module, configured to send the resource quota of the cloud instance to be scaled down to the management node.
  • the fifth aspect of the embodiment of the present application provides a working node, the working node includes a memory and a processor; the memory stores codes, the processor is configured to execute the codes, and when the codes are executed, the working node performs the same as the first aspect , any possible implementation of the first aspect, the second aspect, or the method described in any possible implementation of the second aspect.
  • the sixth aspect of the embodiments of the present application provides a computer storage medium.
  • the computer storage medium stores one or more instructions. When executed by one or more computers, the instructions enable the one or more computers to implement the method described in the first aspect, any possible implementation of the first aspect, the second aspect, or any possible implementation of the second aspect.
  • the seventh aspect of the embodiments of the present application provides a computer program product.
  • the computer program product stores instructions.
  • When the instructions are executed by a computer, the computer implements the method described in the first aspect, any possible implementation of the first aspect, the second aspect, or any possible implementation of the second aspect.
  • the first working node may determine the cloud instance to be scaled down among the multiple cloud instances based on the status information. Then, the first working node releases the idle resources of the cloud instance to be scaled down, and determines the released idle resource amount of the cloud instance to be scaled down. Finally, the first working node may reduce the resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down. Based on the foregoing process, it can be seen that this application provides a new cloud instance scaling mechanism.
  • After the first working node determines the cloud instance to be scaled down on its own, it can modify the cgroup configuration in real time to reduce the resource quota of the cloud instance to be scaled down. The change is not perceived by the business running on the cloud instance to be scaled down, so it does not interrupt that business.
  • Fig. 1 is a schematic diagram of VPA
  • FIG. 2 is a schematic structural diagram of a cloud service system provided by an embodiment of the present application.
  • Fig. 3 is another structural schematic diagram of the cloud service system provided by the embodiment of the present application.
  • FIG. 4 is a schematic flow diagram of a method for expanding the capacity of a cloud instance provided by an embodiment of the present application
  • FIG. 5 is a schematic flowchart of a method for reducing the capacity of a cloud instance provided by an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of a working node provided by an embodiment of the present application.
  • FIG. 7 is another schematic structural diagram of a working node provided by an embodiment of the present application.
  • FIG. 8 is another schematic structural diagram of a working node provided by an embodiment of the present application.
  • the embodiments of the present application provide a method for expanding and contracting a cloud instance and related equipment, which can ensure that services running on the cloud instance will not be interrupted when the resource quota of the cloud instance is increased or decreased.
  • a cloud service system usually includes multiple working nodes and management nodes, where multiple cloud instances are deployed on each working node, and the management node can centrally manage all cloud instances.
  • The following description takes a container (docker) as an example of a cloud instance.
  • the cloud service system uses kubernetes as the container management standard, which can implement functions such as orchestration and deployment, grayscale upgrade, and automatic expansion and contraction of containers.
  • kubernetes can support two automatic scaling methods, namely the vertical pod autoscaler (VPA) and the horizontal pod autoscaler (HPA).
  • As shown in Fig. 1 (Fig. 1 is a schematic diagram of VPA), the management node can calculate a recommended resource quota for a pod based on the resource occupancy rate of the pod in a certain working node, and send the recommended value to the working node.
  • When the working node creates a pod, it can set a new resource quota for the pod based on the recommended value, for example, increase the resource quota of the pod (that is, increase the amount of resources allocated to the pod, i.e. expand the pod) or reduce the resource quota of the pod (that is, reduce the amount of resources allocated to the pod, i.e. shrink the pod).
  • However, the resource quota of a pod can only be modified at the time the pod is created.
  • To change the quota of a running pod, the working node must therefore first release the pod and then set the new resource quota when the pod is recreated, which interrupts the business of the pod.
  • the embodiment of the present application provides a method for expanding and contracting a cloud instance, which can be applied to the cloud service system shown in Figure 2 (Figure 2 is a schematic structural diagram of the cloud service system provided by the embodiment of the present application). The cloud scenarios to which the system applies include public cloud, private cloud, hybrid cloud and other scenarios. The system includes a management node and multiple working nodes, and the management node is used to centrally manage these working nodes. The management node and the working nodes are introduced below:
  • the management node is usually a single physical server (also called a network device).
  • The management node contains multiple functional modules: an application programming interface (API), a scheduler, a main controller (controller-manager) and a database (DB).
  • the API can be understood as the interface of the cloud service system, and the user can create a cloud instance on the working node through the API.
  • the scheduler can be used to schedule cloud instances to appropriate nodes.
  • the scheduler is often a replaceable component, and its form can be set according to the requirements of different manufacturers.
  • the main controller is used to implement functions such as resource management of each working node and each cloud instance in the cloud service system.
  • the database is used to store the configuration information in the cloud service system, for example, the resource quota of each cloud instance and so on.
  • a working node is usually a separate physical server. Based on virtualization technology, a node agent (kubelet) and multiple cloud instances can be deployed on the operating system of the working node. Among them, the operating system is used to realize the hardware resource management of the working nodes. As an agent process on the working node, the node agent is used to manage all cloud instances on the working node, including life cycle management, resource configuration and so on.
  • a cloud instance can be presented in various forms. For example, a cloud instance can be a virtual machine (VM), a container (docker), a group of virtual machines, or a group of containers (also called a pod), and so on.
  • the working node itself contains certain hardware resources (computing resources, storage resources, network resources, etc.), and the node agent can set a resource quota for each cloud instance. The resource quota of a cloud instance is the amount of resources allocated to that cloud instance (including the amount of computing resources, the amount of storage resources, the amount of network resources, etc.), and the node agent can manage and adjust the resource quotas of all cloud instances.
  • As shown in FIG. 3 (FIG. 3 is another schematic structural diagram of the cloud service system provided by the embodiment of the present application), the node agent of the working node provides a scaling interface (scaleAPI) that can be called by the cloud instances of the working node, and a two-way channel is built between the node agent and the cloud instances of the working node, so the two can communicate with each other.
  • the cloud instance of the working node can judge whether it needs to expand or shrink according to its own state information.
  • If so, the cloud instance sends a request to the node agent of the working node, and the node agent scales the cloud instance of the working node based on the request.
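  • The exchange over the scaling interface can be pictured with a minimal sketch. The field names (`instance_id`, `action`, `resources`) and the JSON encoding below are assumptions made for illustration only; the application does not specify a message format for the scaleAPI.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ScaleRequest:
    """Hypothetical payload sent from a cloud instance to the node agent."""
    instance_id: str
    action: str                                     # "expand" or "shrink"
    resources: dict = field(default_factory=dict)   # e.g. {"memory_bytes": ...}

    def __post_init__(self):
        if self.action not in ("expand", "shrink"):
            raise ValueError(f"unknown action: {self.action}")


def encode(request: ScaleRequest) -> str:
    """Serialize the request on the cloud instance side."""
    return json.dumps(asdict(request), sort_keys=True)


def decode(payload: str) -> ScaleRequest:
    """Parse the request on the node agent side."""
    return ScaleRequest(**json.loads(payload))


# A cloud instance asking for 5 GiB more memory.
request = ScaleRequest("instance-a", "expand", {"memory_bytes": 5 * 1024**3})
assert decode(encode(request)) == request
```

  • For an expansion request the `resources` field would carry the amount required for expansion; for a shrink request it would carry the idle amount the instance has released.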
  • Figure 4 is a schematic flow chart of the expansion method of the cloud instance provided by the embodiment of the present application. It should be noted that this method can be applied to the cloud service system shown in the aforementioned Figure 2 or Figure 3. The method may be executed by any one of the multiple working nodes in the cloud service system, and that working node is referred to as the first working node hereinafter. As shown in Figure 4, the method includes:
  • 401. The first working node acquires status information of multiple cloud instances.
  • 402. Based on the status information, the first working node determines the cloud instance to be expanded and the amount of resources required for expansion among the multiple cloud instances.
  • multiple cloud instances are deployed on the first working node.
  • the cloud instance can periodically obtain its own state information, so as to determine whether to expand capacity based on its own state information.
  • the period is usually on the order of seconds.
  • the status information of the cloud instance may include at least one of the following: the resource occupancy rate of the cloud instance, the load level of the cloud instance, and the business success rate of the cloud instance.
  • the resource occupancy rate of the cloud instance may include at least one of the following: the central processing unit (CPU) usage of the cloud instance, the memory usage of the cloud instance, the storage input/output operations per second (IOPS) of the cloud instance, the network IOPS of the cloud instance, etc.
  • the load level of the cloud instance may include at least one of the following: task response time of the cloud instance, task processing delay of the cloud instance, database index of the cloud instance, message queue index of the cloud instance, the task queue length of the cloud instance, and so on.
  • the service success rate of the cloud instance may include at least one of the following: a task completion rate of the cloud instance, a message transmission success rate of the cloud instance, and the like.
  • the cloud instance can determine whether it needs to be expanded in the following ways:
  • the cloud instance can detect whether its state information satisfies the preset expansion conditions, and if so, it determines that it needs to be expanded, that is, it determines itself as the cloud instance to be expanded, and if it does not meet, it determines that it does not need to be expanded and ends the operation.
  • the status information of the cloud instance meets the preset expansion conditions in at least one of the following situations: the resource occupancy rate of the cloud instance is greater than or equal to the preset first resource occupancy rate, the load level of the cloud instance is greater than or equal to the preset first load level, and the service success rate of the cloud instance is lower than the preset first service success rate.
  • the preset first resource occupancy rate can be understood as the resource occupancy rate threshold that meets the expansion requirement, the preset first load level can be understood as the load level threshold that meets the expansion requirement, and the preset first service success rate can be understood as the service success rate threshold that meets the expansion requirement.
  • the values of these three thresholds can be set according to actual needs and are not limited here.
  • For example, suppose the state information of the cloud instance is the task response time of the cloud instance; correspondingly, the preset first load level is a preset response time threshold.
  • If the task response time of the cloud instance is 3 s and the preset response time threshold is 1 s, the task response time is greater than the threshold, so the cloud instance can determine that it needs to be expanded.
  • For another example, suppose the state information of the cloud instance is the memory usage of the cloud instance; correspondingly, the preset first resource occupancy rate is a preset memory usage threshold.
  • If the memory usage of the cloud instance is 8G and the preset memory usage threshold is 1G, the memory usage is greater than the threshold, so the cloud instance can determine that it needs to be expanded, and so on.
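  • The threshold comparison in these examples can be sketched as follows. The class and field names are hypothetical, only three of the listed metrics are modeled, and treating the conditions as "any one is enough" is one reading of "at least one of the following", not something the application states outright.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Status information sampled by a cloud instance (illustrative fields)."""
    memory_usage_gb: float        # resource occupancy
    response_time_s: float        # load level
    task_completion_rate: float   # service success rate, in [0, 1]


@dataclass
class ExpansionThresholds:
    """Preset 'first' thresholds; the values are deployment choices."""
    memory_usage_gb: float
    response_time_s: float
    task_completion_rate: float


def needs_expansion(state: State, limits: ExpansionThresholds) -> bool:
    """True if at least one preset expansion condition is met."""
    return (
        state.memory_usage_gb >= limits.memory_usage_gb
        or state.response_time_s >= limits.response_time_s
        or state.task_completion_rate < limits.task_completion_rate
    )


limits = ExpansionThresholds(memory_usage_gb=1.0, response_time_s=1.0,
                             task_completion_rate=0.99)
slow = State(memory_usage_gb=0.5, response_time_s=3.0, task_completion_rate=1.0)
idle = State(memory_usage_gb=0.5, response_time_s=0.2, task_completion_rate=1.0)
print(needs_expansion(slow, limits), needs_expansion(idle, limits))
```

  • The `slow` state mirrors the 3 s versus 1 s example above and triggers expansion; the `idle` state meets none of the conditions.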
  • the cloud instance can determine the amount of resources required for its own expansion based on its own state information. For example, the cloud instance may determine the amount of computing resources, storage resources, and network resources required for its own capacity expansion based on the length of the task queue of the cloud instance.
  • 403. The first working node detects whether the amount of idle resources of the first working node is greater than or equal to the amount of resources required for capacity expansion.
  • the cloud instance may initiate a capacity expansion request to the node agent of the first working node.
  • the node agent can parse the request, thereby determining that the cloud instance is a cloud instance to be expanded and the amount of resources required for expansion of the cloud instance.
  • the node agent can detect whether the amount of idle resources of the first working node (that is, the amount of unused local resources of the first working node) is greater than or equal to the amount of resources required for the expansion of the cloud instance. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for capacity expansion of the cloud instance, step 404 is performed; if it is less than the amount of resources required for capacity expansion of the cloud instance, step 405 is performed.
  • 404. The first working node increases the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion.
  • If the amount of idle resources of the first working node is sufficient, the node agent can increase the resource quota of the cloud instance based on the amount of resources required for the expansion of the cloud instance. For example, assuming that the cloud instance runs a large amount of business, the amount of memory required for expansion of the cloud instance is 5G, and the original memory quota of the cloud instance (that is, the amount of memory originally allocated to the cloud instance) is 1G, the node agent can modify the cgroup configuration so that the modified memory quota of the cloud instance is 6G.
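  • As a rough illustration of what modifying the cgroup configuration can look like, the sketch below raises a memory limit by rewriting the `memory.max` file of a cgroup directory. It assumes cgroup v2 (the application does not say which cgroup version is used), and the function names and the way the directory is located are hypothetical.

```python
from pathlib import Path
from typing import Optional

GIB = 1024**3


def read_memory_limit(cgroup_dir: Path) -> Optional[int]:
    """Return the current limit in bytes, or None if the cgroup is unlimited."""
    raw = (cgroup_dir / "memory.max").read_text().strip()
    return None if raw == "max" else int(raw)


def raise_memory_limit(cgroup_dir: Path, extra_bytes: int) -> int:
    """Increase the memory limit by extra_bytes and return the new limit.

    The running processes in the cgroup are not restarted; only the limit
    file is rewritten.
    """
    if extra_bytes <= 0:
        raise ValueError("extra_bytes must be positive")
    current = read_memory_limit(cgroup_dir)
    if current is None:
        raise ValueError("cgroup has no memory limit to raise")
    new_limit = current + extra_bytes
    (cgroup_dir / "memory.max").write_text(str(new_limit))
    return new_limit
```

  • With an original limit of 1 GiB, calling `raise_memory_limit(cgroup_dir, 5 * GIB)` leaves a 6 GiB limit, matching the example above. On a real host the directory lives under the cgroup filesystem, its exact path depends on the container runtime, and writing to it requires sufficient privileges.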
  • 405. The first working node determines the cloud instance to be migrated among the multiple cloud instances, where the priority of the business run by the cloud instance to be migrated is lower than a preset priority.
  • 406. The first working node migrates the cloud instance to be migrated to the second working node, so as to update the idle resource amount of the first working node.
  • If the amount of idle resources of the first working node is less than the amount of resources required for expansion, the node agent can try to solve the problem of insufficient resources. That is, the node agent can determine, among the multiple cloud instances of the first working node, at least one cloud instance to be migrated whose business has a priority lower than the preset priority, and migrate these cloud instances to the second working node. The resources allocated to these cloud instances in the first working node are then released and become new idle resources, thereby updating (that is, increasing) the amount of idle resources of the first working node.
  • the migration method of the cloud instance to be migrated can be cold migration, that is, the node agent of the first working node sends a scheduling request to the management node, and based on the scheduling request, the management node can select a working node from the remaining working nodes except the first working node as the migration destination, that is, the second working node.
  • the management node can notify the node agent of the second working node to create a new cloud instance, and control the new cloud instance to rerun the service run by the cloud instance to be migrated in the first working node.
  • the management node may notify the node agent of the first working node to release the cloud instance to be migrated, so that the resources allocated to the cloud instance to be migrated are released, thereby updating the idle resource amount of the first working node.
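  • One way to pick the cloud instances to be migrated is sketched below. The source only requires that their business priority be lower than the preset priority; choosing the lowest-priority instances first and stopping once the shortfall is covered are additional assumptions, as are all names used here.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    priority: int   # smaller number = lower business priority (assumption)
    quota: int      # resources currently allocated, in arbitrary units


def pick_migration_candidates(instances, preset_priority, shortfall):
    """Choose low-priority instances until their quotas cover the shortfall.

    The returned list may free less than `shortfall` when there are not
    enough instances below the preset priority.
    """
    eligible = sorted(
        (inst for inst in instances if inst.priority < preset_priority),
        key=lambda inst: inst.priority,
    )
    chosen, freed = [], 0
    for inst in eligible:
        if freed >= shortfall:
            break
        chosen.append(inst)
        freed += inst.quota
    return chosen


pool = [Instance("a", 1, 2), Instance("b", 5, 4),
        Instance("c", 0, 1), Instance("d", 2, 3)]
print([i.name for i in pick_migration_candidates(pool, 3, 3)])  # ['c', 'a']
```

  • Instance "b" is never selected because its priority is not below the preset priority of 3, and "d" is spared because "c" and "a" already free the 3 units needed.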
  • 407. The first working node detects whether the updated idle resource amount of the first working node is greater than or equal to the resource amount required for capacity expansion.
  • the node agent can detect whether the updated idle resource amount of the first working node is greater than or equal to the resource amount required for capacity expansion. If the updated idle resource amount of the first working node is greater than or equal to the amount of resources required for capacity expansion of the cloud instance, step 408 is executed; if it is less than the amount of resources required for capacity expansion of the cloud instance, step 410 is executed.
  • 408. The first working node increases the resource quota of the cloud instance to be expanded based on the resource amount required for capacity expansion.
  • the node agent can increase the resource quota of the cloud instance based on the amount of resources required for the expansion of the cloud instance.
  • 409. The first working node sends the resource quota of the cloud instance to be expanded to the management node.
  • After the node agent increases the resource quota of the cloud instance, it can send the increased resource quota to the management node, so that the management node and the first working node synchronize the resource quota of the cloud instance and the global resource configuration information (that is, that of the entire cloud service system) stays accurate and consistent.
  • 410. The first working node detects the type of service run by the cloud instance to be expanded.
  • 411. If the service is a stateless application, the first working node creates a new cloud instance at the third working node, and the new cloud instance and the cloud instance to be expanded are used to jointly run the stateless application.
  • 412. If the service is a stateful application, the first working node migrates the cloud instance to be expanded to the third working node.
  • If the updated amount of idle resources of the first working node is still insufficient, the node agent can first detect the type of business run by the cloud instance (which can also be understood as the application run by the cloud instance), and perform corresponding processing based on that type:
  • If the business is a stateless application, the node agent can apply to create a new cloud instance at the third working node; the cloud instance and the new cloud instance can then jointly run the business previously run by the cloud instance, which is equivalent to realizing expansion.
  • Specifically, after the node agent of the first working node determines that the service run by the cloud instance is a stateless application, it may send a scheduling request to the management node, and the management node may, based on the scheduling request, select a working node from the remaining working nodes except the first working node, that is, the third working node.
  • Then, the management node can notify the node agent of the third working node to create a new cloud instance, and control the new cloud instance to run the business run by the cloud instance in the first working node. Since the business is a stateless application, it makes little difference which cloud instance runs it. Therefore, although the new cloud instance and the cloud instance are located on different working nodes, they can share the business, which is equivalent to completing the expansion.
  • If the business is a stateful application, the node agent can migrate the cloud instance to the third working node, where the amount of idle resources of the third working node is greater than or equal to the amount of resources required for the expansion of the cloud instance. After the cloud instance is migrated to the third working node, its resource quota can be increased, which is equivalent to completing the expansion.
  • the cloud instance may be migrated to the third working node by live (hot) migration or by cold migration.
  • The two methods are introduced separately below. (1) Live migration: after the node agent of the first working node determines that the business running on the cloud instance is a stateful application, it can send a scheduling request to the management node. Based on the scheduling request, the management node may select a working node from the remaining working nodes except the first working node as the migration destination, that is, the third working node. Then, the management node can notify the node agent of the third working node to create a new cloud instance and keep the business state of the new cloud instance consistent with the business state of the cloud instance, so that the new cloud instance continues running the business run by the cloud instance in the first working node. Finally, the management node can notify the node agent of the first working node to release the cloud instance. Since the node agent of the third working node can make the resource quota of the new cloud instance larger than the resource quota of the cloud instance in the first working node, this is equivalent to completing the expansion.
  • (2) Cold migration: after the node agent of the first working node determines that the business running on the cloud instance is a stateful application, it can send a scheduling request to the management node. Based on the scheduling request, the management node may select a working node from the remaining working nodes except the first working node as the migration destination, that is, the third working node. Then, the management node can notify the node agent of the third working node to create a new cloud instance, and control the new cloud instance to rerun the service run by the cloud instance in the first working node. Finally, the management node can notify the node agent of the first working node to release the cloud instance. Since the node agent of the third working node can make the resource quota of the new cloud instance larger than the resource quota of the cloud instance in the first working node, this is equivalent to completing the expansion.
  • this embodiment uses only one of the cloud instances of the first working node for schematic illustration. The other cloud instances of the first working node can perform similar operations, that is, the operations described in steps 401 to 412 can be executed for each cloud instance of the first working node, which will not be repeated here.
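  • The branching described in the expansion flow can be condensed into a small decision function. This is a simplified sketch: it assumes the amount that migrating low-priority instances would free (`reclaimable`) is known up front, whereas in the flow above the migration is carried out first and the idle amount is checked again afterwards. All names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    RAISE_QUOTA = "raise the quota on the first working node"
    FREE_THEN_RAISE_QUOTA = "migrate low-priority instances, then raise the quota"
    ADD_INSTANCE_ELSEWHERE = "create a new instance on a third working node"
    MIGRATE_INSTANCE = "migrate the instance to a third working node"


def plan_expansion(idle, needed, reclaimable, stateless):
    """Decide how an expansion request could be satisfied.

    idle        -- idle resource amount of the first working node
    needed      -- resource amount required for expansion
    reclaimable -- amount freed by migrating low-priority instances away
    stateless   -- True if the instance runs a stateless application
    """
    if idle >= needed:
        return Action.RAISE_QUOTA
    if idle + reclaimable >= needed:
        return Action.FREE_THEN_RAISE_QUOTA
    # Local resources remain insufficient: the service type decides.
    if stateless:
        return Action.ADD_INSTANCE_ELSEWHERE
    return Action.MIGRATE_INSTANCE


print(plan_expansion(idle=2, needed=5, reclaimable=1, stateless=False).value)
```

  • The printed case is a stateful instance on a node that cannot free enough resources locally, so the plan is to migrate the instance to a third working node.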
  • the first working node after obtaining the state information of multiple cloud instances, can determine the cloud instance to be expanded and the amount of resources required for expansion among the multiple cloud instances based on the state information. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node may increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion. Based on the foregoing process, it can be seen that this application provides a new cloud instance expansion mechanism. After the first working node determines the cloud instance to be expanded, it can modify the cgroup configuration in real time to increase the resource quota of the cloud instance to be expanded. This is not perceived by the business running on the cloud instance to be expanded, so it will not interrupt the business running on the cloud instance to be expanded.
  • In the related art, the working node can only modify the resource quota of a cloud instance when the cloud instance is rebuilt, but rebuilding often has to wait for a specific opportunity, for example, the cloud instance being evicted or killed. This waiting time is usually uncontrollable, so increasing the resource quota of the cloud instance takes too long. The embodiment of the present application does not need to wait for the cloud instance to be rebuilt and can modify the resource quota of the cloud instance in real time, which effectively shortens the time required to increase the resource quota of the cloud instance.
  • Likewise, in the related art, rebuilding the cloud instance and starting its business often take a long time, which also makes increasing the resource quota of the cloud instance too slow. Because the embodiment of the present application modifies the resource quota in real time without rebuilding the cloud instance, it effectively shortens the time required to increase the resource quota of the cloud instance.
  • the embodiment of the present application can detect whether the cloud instance needs to be expanded based on the status information of the cloud instance. Since the status information includes information such as the resource occupancy rate, load level, and business success rate of the cloud instance, it is related to the business logic of the cloud instance and can better reflect business needs.
  • the working node can therefore sense the business demand in real time and accurately detect, based on that demand, whether the cloud instance needs to be expanded, which can effectively reduce the number of detections and thereby shorten the detection duration.
  • FIG. 5 is a schematic flowchart of a method for reducing the capacity of a cloud instance provided by the embodiment of the present application. It should be noted that this method can be applied to the cloud service system shown in Fig. 2 or Fig. 3 above, and the execution body of this method can be Any one of the multiple working nodes in the cloud service system is referred to as the first working node hereinafter. As shown in Figure 5, the method includes:
  • the first working node acquires status information of multiple cloud instances.
  • the first working node determines a cloud instance to be scaled down among multiple cloud instances.
  • multiple cloud instances are deployed on the first working node.
  • the cloud instance can periodically obtain its own status information, so as to determine whether it needs to be scaled down based on its own status information.
  • the status information of the cloud instance may include at least one of the following: the resource occupancy rate of the cloud instance, the load level of the cloud instance, and the business success rate of the cloud instance.
  • the resource occupancy rate of the cloud instance may include at least one of the following: CPU usage of the cloud instance, memory usage of the cloud instance, storage IOPS of the cloud instance, network IOPS of the cloud instance, and the like.
  • the load level of the cloud instance may include at least one of the following: task response time of the cloud instance, task processing delay of the cloud instance, database index of the cloud instance, message queue index of the cloud instance, the task queue length of the cloud instance, and so on.
  • the service success rate of the cloud instance may include at least one of the following: a task completion rate of the cloud instance, a message transmission success rate of the cloud instance, and the like.
  • the cloud instance can determine whether it needs to be scaled down in the following ways:
  • the cloud instance can detect whether its own state information meets the preset shrinking conditions. If so, it determines that it needs to be scaled down, that is, it determines itself as the cloud instance to be scaled down; if not, it determines that it does not need to be scaled down and ends the operation.
  • the status information of the cloud instance meets the preset shrinking conditions in at least one of the following situations: the resource occupancy rate of the cloud instance is lower than the preset second resource occupancy rate, the load level of the cloud instance is lower than the preset second load level, and the service success rate of the cloud instance is greater than or equal to the preset second service success rate.
  • the preset second resource occupancy rate can be understood as the resource occupancy rate threshold that meets the scale-down requirement, the preset second load level can be understood as the load level threshold that meets the scale-down requirement, and the preset second service success rate can be understood as the service success rate threshold that meets the scale-down requirement.
  • the size of these three thresholds can be set according to actual needs, and there is no limit here.
  • the first working node releases idle resources of the cloud instance to be scaled down, and determines the released idle resource amount of the cloud instance to be scaled down.
  • After the cloud instance determines that it needs to be scaled down, it can release its own idle resources (that is, the resources not used by the cloud instance; the resources occupied by the cloud instance to run its service are its non-idle resources) and, after the release, calculate the amount of idle resources (that is, the size of the resources released by the cloud instance).
  • The cloud instance can try to scale down in the following ways: (1) For system-level languages, memory is managed by the service itself. Most memory is dynamically allocated and freed, so it can be reclaimed promptly. Another part is managed as a memory pool, which can be shrunk when the service volume is small and expanded when the service volume is large. (2) For high-level languages with garbage collection, memory is reclaimed by the language runtime. However, the runtime's memory reclamation is usually not synchronized with the service volume; in this case, the service can actively invoke a forced garbage collection to release memory.
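The two release strategies above (shrinking a memory pool, then forcing a garbage collection) can be illustrated with a small Python sketch. The `BufferPool` class and `release_idle_memory` function are hypothetical names invented for this example; only `gc.collect()` is a real standard-library call. The returned figure counts only the bytes dropped from the pool, since the amount reclaimed by the collector is not measured here.

```python
import gc


class BufferPool:
    """A toy pool of fixed-size buffers (hypothetical, for illustration only)."""

    def __init__(self, buffer_size: int, count: int) -> None:
        self.buffer_size = buffer_size
        self.free = [bytearray(buffer_size) for _ in range(count)]

    def shrink_to(self, keep: int) -> int:
        """Drop free buffers beyond `keep` (keep >= 0); return bytes released."""
        extra = max(0, len(self.free) - keep)
        del self.free[keep:]  # no-op when the pool already holds <= keep buffers
        return extra * self.buffer_size


def release_idle_memory(pool: BufferPool, keep: int) -> int:
    """Shrink the pool, force a garbage collection, and report the pool bytes freed."""
    released = pool.shrink_to(keep)
    gc.collect()  # ask the runtime to reclaim unreachable objects now
    return released
```

In this sketch the value returned by `release_idle_memory` plays the role of the idle resource amount that the cloud instance would report to the node agent.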
  • After the cloud instance obtains its own idle resource amount, it sends that amount to the node agent of the first working node.
  • the first working node reduces the resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down.
  • After the node agent obtains the idle resource amount of the cloud instance, it can reduce the resource quota of the cloud instance based on that amount. For example, suppose the cloud instance runs a relatively small service volume, so that its free memory is 2 GB, and its original memory quota (that is, the amount of memory originally allocated to the cloud instance) is 5 GB. The node agent can modify the cgroup configuration so that the memory quota of the cloud instance becomes 3 GB.
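The quota arithmetic in this example (5G minus 2G of idle memory gives 3G) and the cgroup update can be sketched as below. The sketch assumes a cgroup v2 hierarchy, where a group's memory limit is set by writing to its `memory.max` file; the text itself only says "cgroup configuration", so this choice, the function names, and treating 1G as 2**30 bytes are all assumptions. The cgroup directory is passed in by the caller because the real path depends on the container runtime.

```python
from pathlib import Path

GIB = 1024 ** 3  # assumption: "1G" in the example means 2**30 bytes


def reduced_quota(current_quota: int, idle_amount: int) -> int:
    """New quota after giving back the released idle amount."""
    if idle_amount < 0 or idle_amount >= current_quota:
        raise ValueError("idle amount must be in [0, current quota)")
    return current_quota - idle_amount


def write_memory_limit(cgroup_dir: Path, limit_bytes: int) -> None:
    """Write the limit to the cgroup v2 `memory.max` file in `cgroup_dir`."""
    (cgroup_dir / "memory.max").write_text(f"{limit_bytes}\n")
```

Because the limit is changed on the running cgroup, the processes inside the instance are not restarted, which matches the non-interruption property the text claims for this mechanism.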
  • the first working node sends the resource quota of the cloud instance to be scaled down to the management node.
  • After the node agent reduces the resource quota of the cloud instance, it can send the reduced resource quota to the management node. The management node and the first working node can thus synchronize the resource quota of the cloud instance, so that the global resource configuration information (that is, that of the entire cloud service system) is accurate and consistent.
  • It should be understood that this embodiment uses only one of the cloud instances of the first working node for illustration. The other cloud instances of the first working node can perform similar operations; that is, each cloud instance of the first working node can execute the operations described in step 501 to step 502, which are not repeated here.
  • In the embodiment of the present application, after obtaining the state information of the multiple cloud instances, the first working node may determine the cloud instance to be scaled down among the multiple cloud instances based on the state information. Then, the first working node releases the idle resources of the cloud instance to be scaled down and determines the amount of idle resources released. Finally, the first working node may reduce the resource quota of the cloud instance to be scaled down based on that amount. Based on the foregoing process, this application provides a new cloud instance scale-down mechanism: after the first working node itself determines the cloud instance to be scaled down, it can modify the cgroup configuration in real time to reduce the resource quota of that cloud instance. This is imperceptible to the service running on the cloud instance to be scaled down, so that service is not interrupted.
  • Further, the cloud instance to be scaled down can release its idle resources by itself, which helps ensure that modifying its resource quota succeeds.
  • Further, in VPA, the working node can modify the resource quota of a cloud instance only when the cloud instance is rebuilt, but the rebuild often has to wait for a specific occasion, for example, the cloud instance being evicted or killed. This waiting time is usually uncontrollable, so reducing the resource quota of the cloud instance takes too long. The embodiment of the present application does not need to wait for the cloud instance to be rebuilt and can modify its resource quota in real time, which effectively shortens the time required to reduce the resource quota of the cloud instance.
  • Further, in VPA, the working node can modify the resource quota of a cloud instance only when the cloud instance is rebuilt, but rebuilding the cloud instance and starting its service often take a long time, so reducing the resource quota of the cloud instance takes too long. The embodiment of the present application does not need to wait for the cloud instance to be rebuilt and can modify its resource quota in real time, which effectively shortens the time required to reduce the resource quota of the cloud instance.
  • Further, in VPA, whether a cloud instance needs to be scaled down is detected only from the resource occupancy rate of the cloud instance, which gives no deep insight into service requirements; multiple detections are needed to determine accurately whether scaling down is required, so a long time is spent on detection. The embodiment of the present application detects whether the cloud instance needs to be scaled down based on the state information of the cloud instance. Because the state information includes the resource occupancy rate, load level, service success rate and similar information, it is related to the service logic of the cloud instance and better reflects service requirements. The working node can therefore sense service requirements in real time from the state information and accurately detect, based on them, whether the cloud instance needs to be scaled down, which effectively reduces the number of detections and shortens the detection time.
  • Figure 6 is a schematic structural diagram of a working node provided by the embodiment of the present application. As shown in Figure 6, the working node serves as the first working node, the first working node is set in the cloud service system, multiple cloud instances are deployed on the first working node, and the first working node includes:
  • An acquisition module 601, configured to acquire status information of multiple cloud instances
  • the first determining module 602 is configured to determine, among multiple cloud instances, the cloud instance to be expanded and the amount of resources required for expansion based on the state information;
  • the first adjustment module 603 is configured to increase the resource quota of the cloud instance to be expanded based on the resource amount required for capacity expansion if the idle resource amount of the first working node is greater than or equal to the resource amount required for capacity expansion.
  • In the embodiment of the present application, after obtaining the state information of the multiple cloud instances, the first working node can determine, among the multiple cloud instances, the cloud instance to be expanded and the amount of resources required for expansion based on the state information. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node may increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion. Based on the foregoing process, this application provides a new cloud instance expansion mechanism: after the first working node itself determines the cloud instance to be expanded, it can modify the cgroup configuration in real time to increase the resource quota of that cloud instance. This is imperceptible to the service running on the cloud instance to be expanded, so that service is not interrupted.
  • In a possible implementation, the first determining module 602 is configured to: determine, among the multiple cloud instances, a cloud instance whose state information satisfies the preset expansion condition as the cloud instance to be expanded; and determine, based on the state information of the cloud instance to be expanded, the amount of resources required for expansion.
  • In a possible implementation, the cloud service system further includes a second working node, and the first working node further includes: a second determination module, configured to, if the amount of idle resources of the first working node is less than the amount of resources required for expansion, determine a cloud instance to be migrated among the multiple cloud instances, where the priority of the service run by the cloud instance to be migrated is lower than the preset priority; a first migration module, configured to migrate the cloud instance to be migrated to the second working node, so as to update the amount of idle resources of the first working node; and a second adjustment module, configured to, if the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion.
  • In a possible implementation, the cloud service system further includes a third working node, and the first working node further includes: a detection module, configured to, if the updated amount of idle resources of the first working node is less than the amount of resources required for expansion, detect the type of service run by the cloud instance to be expanded; a creation module, configured to, if the service run by the cloud instance to be expanded is a stateless application, create a new cloud instance at the third working node, where the new cloud instance and the cloud instance to be expanded jointly run the stateless application; and a second migration module, configured to, if the service run by the cloud instance to be expanded is a stateful application, migrate the cloud instance to be expanded to the third working node.
  • the state information includes at least one of the following: resource occupancy rate, load level, and service success rate.
  • the preset expansion conditions include at least one of the following: the resource occupancy rate is greater than or equal to the preset first resource occupancy rate, the load level is greater than or equal to the preset first load level, and the business success rate is less than the preset first service success rate.
  • the aforementioned migration is cold migration or hot migration.
  • In a possible implementation, the cloud service system further includes a management node, and the first working node further includes: a feedback module, configured to send the resource quota of the cloud instance to be expanded to the management node.
  • Fig. 7 is another schematic structural diagram of the working node provided by the embodiment of the present application. As shown in Fig. 7, the working node serves as the first working node, the first working node is set in the cloud service system, multiple cloud instances are deployed on the first working node, and the first working node includes:
  • An acquisition module 701, configured to acquire status information of multiple cloud instances
  • a determination module 702 configured to determine a cloud instance to be scaled down among multiple cloud instances based on the status information
  • a release module 703, configured to release idle resources of the cloud instance to be scaled down, and determine the amount of idle resources of the released cloud instance to be scaled down;
  • An adjustment module 704 configured to reduce the resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down.
  • In the embodiment of the present application, after obtaining the state information of the multiple cloud instances, the first working node may determine the cloud instance to be scaled down among the multiple cloud instances based on the state information. Then, the first working node releases the idle resources of the cloud instance to be scaled down and determines the amount of idle resources released. Finally, the first working node may reduce the resource quota of the cloud instance to be scaled down based on that amount. Based on the foregoing process, this application provides a new cloud instance scale-down mechanism: after the first working node itself determines the cloud instance to be scaled down, it can modify the cgroup configuration in real time to reduce the resource quota of that cloud instance. This is imperceptible to the service running on the cloud instance to be scaled down, so that service is not interrupted.
  • In a possible implementation, the determining module 702 is configured to determine, among the multiple cloud instances, a cloud instance whose state information satisfies the preset scale-down condition as the cloud instance to be scaled down.
  • In a possible implementation, the state information includes at least one of the following: resource occupancy rate, load level, and service success rate.
  • In a possible implementation, the preset scale-down condition includes at least one of the following: the resource occupancy rate is lower than the preset second resource occupancy rate, the load level is lower than the preset second load level, and the service success rate is greater than or equal to the preset second service success rate.
  • In a possible implementation, the cloud service system further includes a management node, and the first working node further includes: a feedback module, configured to send the resource quota of the cloud instance to be expanded to the management node.
  • FIG. 8 is another schematic structural diagram of a working node provided by an embodiment of the present application.
  • the working node is used as the first working node, the first working node is set in the cloud service system, the first working node is deployed with multiple cloud instances, and the first working node may include one or more central processing units 801, memory 802, input and output interface 803, wired or wireless network interface 804, power supply 805.
  • Memory 802 may be transient or persistent storage. Furthermore, the central processing unit 801 may be configured to communicate with the memory 802, and execute a series of instruction operations in the memory 802 on the first working node.
  • the central processing unit 801 may execute the operations performed by the first working node in the foregoing embodiment shown in FIG. 4 or FIG. 5 , and details are not described here again.
  • The division of specific functional modules in the central processing unit 801 may be similar to the division, described above for the working node that performs expansion, into modules such as the acquisition module, the first determining module, the first adjustment module, the second determination module, the first migration module, the second adjustment module, the detection module, the creation module, the second migration module and the feedback module, and is not repeated here; or,
  • the division of specific functional modules in the central processing unit 801 may be similar to the division of modules such as the acquisition module, determination module, release module, adjustment module and feedback module described in FIG. 7 above, and will not be repeated here.
  • The embodiment of the present application also relates to a computer storage medium including computer-readable instructions which, when executed, implement the steps performed by the first working node in the foregoing method embodiments, including the embodiment shown in FIG. 5.
  • The embodiment of the present application also relates to a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps performed by the first working node in the foregoing method embodiments, including the embodiment shown in FIG. 5.
  • the disclosed system, device and method can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • For example, the division of the units is only a logical function division; in actual implementation, there may be other division methods. For example, multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • If the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.


Abstract

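The present application provides a method for expanding and scaling down cloud instances and related devices, which can ensure that the service run by a cloud instance is not interrupted when the resource quota of the cloud instance is increased or decreased. The method of the present application includes: a first working node acquires state information of multiple cloud instances; based on the state information, the first working node determines, among the multiple cloud instances, a cloud instance to be expanded and the amount of resources required for expansion; and if the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node increases the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion.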

Description

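A method for expanding and scaling down cloud instances and related devices
This application claims priority to Chinese patent application No. 202111450334.5, filed with the China Patent Office on November 30, 2021 and entitled "一种云实例的扩缩容方法及其相关设备" (a method for expanding and scaling down cloud instances and related devices), which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of cloud technologies, and in particular to a method for expanding and scaling down cloud instances and related devices.
Background
With the rapid development of technology, cloud service systems are growing ever larger. A cloud service system usually includes multiple working nodes (worker) and a management node (master). Multiple containers (docker) are deployed on each working node, and the management node can centrally manage all containers.
Currently, cloud service systems use kubernetes as the container management standard, which provides functions such as orchestrated deployment, grayscale upgrade and downgrade, and automatic expansion and scaling down for containers. For automatic scaling, kubernetes supports two methods: vertical pod (container group) autoscaling (vertical pod autoscale, VPA) and horizontal pod autoscaling (horizontal pod autoscale, HPA). In VPA, the management node can calculate a recommended resource quota for a pod based on the resource occupancy rate of the pod on a working node, and send the recommended value to that working node. When the working node creates the pod, it can set a new resource quota for the pod based on the recommended value (for example, increase or decrease the resource quota of the pod, which is equivalent to expansion or scaling down).
In the foregoing process, the resource quota of a pod can be modified only when the pod is created. When the resource quota of a pod needs to be modified, the working node must first release the pod and can apply the new resource quota only when re-creating it, which interrupts the service run by the pod.
Summary
Embodiments of the present application provide a method for expanding and scaling down cloud instances and related devices, which can ensure that the service run by a cloud instance is not interrupted when the resource quota of the cloud instance is increased or decreased.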
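A first aspect of the embodiments of the present application provides a method for expanding and scaling down cloud instances. The method is applied to a cloud service system that includes multiple working nodes. One of the working nodes is used as an example and is referred to as the first working node. The method includes:
Multiple cloud instances are deployed on the first working node. For example, the multiple cloud instances may be multiple containers, multiple groups of containers, multiple virtual machines, or multiple groups of virtual machines. The first working node can acquire the state information of these cloud instances; the state information of each cloud instance indicates the state of the service run by that cloud instance.
After obtaining the state information of the multiple cloud instances, the first working node can analyze the cloud instances one by one based on their state information, so as to determine, among them, the cloud instance to be expanded and the amount of resources required to expand it.
The first working node can determine its amount of idle resources and check whether that amount is greater than or equal to the amount of resources required for expansion. If it is, the idle resources of the first working node are sufficient, so the first working node can directly expand the cloud instance to be expanded, that is, increase its resource quota based on the amount of resources required for expansion, thereby expanding the cloud instance.
As can be seen from the above method: after acquiring the state information of the multiple cloud instances, the first working node can determine, based on this state information, the cloud instance to be expanded among the multiple cloud instances and the amount of resources required for expansion. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node can increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion. Based on the foregoing process, the present application provides a new cloud instance expansion mechanism: after the first working node itself determines the cloud instance to be expanded, it can increase the resource quota of that cloud instance by modifying the cgroup configuration in real time. This is imperceptible to the service run by the cloud instance to be expanded, so that service is not interrupted.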
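In a possible implementation, the first working node determining, based on the state information, the cloud instance to be expanded among the multiple cloud instances and the amount of resources required for expansion includes: among the multiple cloud instances, the first working node determines a cloud instance whose state information satisfies the preset expansion condition as the cloud instance to be expanded; and the first working node determines the amount of resources required for expansion based on the state information of the cloud instance to be expanded. In this implementation, for any one of the multiple cloud instances, the first working node can check whether the state information of the cloud instance satisfies the preset expansion condition. If it does, expansion is needed and the cloud instance is determined as the cloud instance to be expanded; if it does not, the cloud instance does not need expansion and the operation ends. After the cloud instance is determined as the cloud instance to be expanded, the amount of resources required to expand it can be calculated accurately based on its state information.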
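In a possible implementation, the cloud service system further includes a second working node, and the method further includes: if the amount of idle resources of the first working node is less than the amount of resources required for expansion, the first working node determines a cloud instance to be migrated among the multiple cloud instances, where the priority of the service run by the cloud instance to be migrated is lower than a preset priority; the first working node migrates the cloud instance to be migrated to the second working node, so as to update the amount of idle resources of the first working node; and if the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node increases the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion. In this implementation, if the amount of idle resources of the first working node is less than the amount of resources required to expand the cloud instance, the idle resources of the first working node are insufficient. The first working node can determine at least one cloud instance to be migrated among its cloud instances. The services run by these cloud instances have a priority lower than the preset priority (that is, they tend to be low-priority services), so these cloud instances can be migrated to the second working node. The resources of the first working node that were allocated to them are then released and become new idle resources, which updates (that is, increases) the amount of idle resources of the first working node. The first working node can then check whether the updated amount of idle resources is greater than or equal to the amount of resources required for expansion. If it is, the updated idle resources of the first working node are sufficient and can be used to expand the cloud instance (the cloud instance to be expanded) directly, so the first working node can increase the resource quota of the cloud instance based on the amount of resources required to expand it, thereby expanding the cloud instance.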
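In a possible implementation, the cloud service system further includes a third working node, and the method further includes: if the updated amount of idle resources of the first working node is less than the amount of resources required for expansion, the first working node detects the type of service run by the cloud instance to be expanded; if the service is a stateless application, the first working node creates a new cloud instance at the third working node, and the new cloud instance and the cloud instance to be expanded jointly run the stateless application; if the service is a stateful application, the first working node migrates the cloud instance to be expanded to the third working node. In this implementation, if the updated amount of idle resources of the first working node is less than the amount of resources required to expand the cloud instance (the cloud instance to be expanded), the updated idle resources are insufficient. The first working node can first detect the type of service run by the cloud instance (which can also be understood as the application run by the cloud instance) and handle it accordingly. If the service is a stateless application, the first working node can request that a new cloud instance be created at the third working node; the cloud instance and the new cloud instance can then jointly run the service originally run by the cloud instance, which is equivalent to expansion. If the service is a stateful application, the node agent can migrate the cloud instance to the third working node, whose amount of idle resources is greater than or equal to the amount of resources required to expand the cloud instance; after the cloud instance is migrated to the third working node, its resource quota can be increased, which is equivalent to completing the expansion.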
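In a possible implementation, the state information includes at least one of the following: resource occupancy rate, load level, and service success rate. It can be seen that the state information of a cloud instance is related to the service logic of the cloud instance.
In a possible implementation, the preset expansion condition includes at least one of the following: the resource occupancy rate is greater than or equal to the preset first resource occupancy rate, the load level is greater than or equal to the preset first load level, and the service success rate is lower than the preset first service success rate. In VPA, whether a cloud instance needs expansion is detected only from its resource occupancy rate, which gives no deep insight into service requirements; multiple detections are needed to determine accurately whether expansion is required, so a long time is spent on detection. The foregoing implementation detects whether the cloud instance needs expansion based on its state information. Because the state information includes the resource occupancy rate, load level, service success rate and similar information, it is related to the service logic of the cloud instance and better reflects service requirements. The working node can therefore sense service requirements in real time from the state information and accurately detect, based on them, whether the cloud instance needs expansion, which effectively reduces the number of detections and shortens the detection time.
In a possible implementation, the cloud instance to be expanded is migrated to the third working node by cold migration or hot migration.
In a possible implementation, the cloud service system further includes a management node, and after the first working node increases the resource quota of the cloud instance to be expanded, the method further includes: the first working node sends the resource quota of the cloud instance to be expanded to the management node. In this implementation, after increasing the resource quota of the cloud instance to be expanded, the first working node can send the increased resource quota to the management node. The management node and the first working node can thus synchronize the resource quota of the cloud instance, so that the global resource configuration information (that is, that of the entire cloud service system) is accurate and consistent.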
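A second aspect of the embodiments of the present application provides a method for scaling down cloud instances. The method is applied to a cloud service system that includes a first working node. The method includes:
Multiple cloud instances are deployed on the first working node. For example, the multiple cloud instances may be multiple containers, multiple groups of containers, multiple virtual machines, or multiple groups of virtual machines. The first working node can acquire the state information of these cloud instances; the state information of each cloud instance indicates the state of the service run by that cloud instance.
After obtaining the state information of the multiple cloud instances, the first working node can analyze the cloud instances one by one based on their state information, so as to determine, among them, the cloud instance to be scaled down.
After determining the cloud instance to be scaled down, the first working node can determine the non-idle resources and idle resources of that cloud instance, release its idle resources, and calculate the size of the released resources, that is, determine the amount of idle resources released by the cloud instance to be scaled down. The first working node can then reduce the resource quota of the cloud instance to be scaled down based on that amount, thereby scaling down the cloud instance.
As can be seen from the above method: after acquiring the state information of the multiple cloud instances, the first working node can determine the cloud instance to be scaled down among the multiple cloud instances based on the state information. Then, the first working node releases the idle resources of the cloud instance to be scaled down and determines the amount of idle resources released. Finally, the first working node can reduce the resource quota of the cloud instance to be scaled down based on that amount. Based on the foregoing process, the present application provides a new cloud instance scale-down mechanism: after the first working node itself determines the cloud instance to be scaled down, it can reduce the resource quota of that cloud instance by modifying the cgroup configuration in real time. This is imperceptible to the service run by the cloud instance to be scaled down, so that service is not interrupted.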
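In a possible implementation, the first working node determining, based on the state information, the cloud instance to be scaled down among the multiple cloud instances includes: among the multiple cloud instances, the first working node determines a cloud instance whose state information satisfies the preset scale-down condition as the cloud instance to be scaled down. In this implementation, for any one of the multiple cloud instances, the first working node can check whether the state information of the cloud instance satisfies the preset scale-down condition. If it does, scaling down is needed and the cloud instance is determined as the cloud instance to be scaled down; if it does not, the cloud instance does not need to be scaled down and the operation ends.
In a possible implementation, the state information includes at least one of the following: resource occupancy rate, load level, and service success rate. It can be seen that the state information of a cloud instance is related to the service logic of the cloud instance.
In a possible implementation, the preset scale-down condition includes at least one of the following: the resource occupancy rate is lower than the preset second resource occupancy rate, the load level is lower than the preset second load level, and the service success rate is greater than or equal to the preset second service success rate. In VPA, whether a cloud instance needs to be scaled down is detected only from its resource occupancy rate, which gives no deep insight into service requirements; multiple detections are needed to determine accurately whether scaling down is required, so a long time is spent on detection. The foregoing implementation detects whether the cloud instance needs to be scaled down based on its state information. Because the state information includes the resource occupancy rate, load level, service success rate and similar information, it is related to the service logic of the cloud instance and better reflects service requirements. The working node can therefore sense service requirements in real time from the state information and accurately detect, based on them, whether the cloud instance needs to be scaled down, which effectively reduces the number of detections and shortens the detection time.
In a possible implementation, the cloud service system further includes a management node, and after the first working node reduces the resource quota of the cloud instance to be scaled down based on its amount of idle resources, the method further includes: the first working node sends the resource quota of the cloud instance to be scaled down to the management node. In this implementation, after reducing the resource quota of the cloud instance, the first working node can send the reduced resource quota to the management node. The management node and the first working node can thus synchronize the resource quota of the cloud instance, so that the global resource configuration information (that is, that of the entire cloud service system) is accurate and consistent.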
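A third aspect of the embodiments of the present application provides a working node. The working node serves as the first working node, the first working node is set in a cloud service system, multiple cloud instances are deployed on the first working node, and the first working node includes: an acquisition module, configured to acquire state information of the multiple cloud instances; a first determining module, configured to determine, based on the state information, the cloud instance to be expanded among the multiple cloud instances and the amount of resources required for expansion; and a first adjustment module, configured to, if the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion.
As can be seen from the above working node: after acquiring the state information of the multiple cloud instances, the first working node can determine, based on this state information, the cloud instance to be expanded among the multiple cloud instances and the amount of resources required for expansion. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node can increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion. Based on the foregoing process, the present application provides a new cloud instance expansion mechanism: after the first working node itself determines the cloud instance to be expanded, it can increase the resource quota of that cloud instance by modifying the cgroup configuration in real time. This is imperceptible to the service run by the cloud instance to be expanded, so that service is not interrupted.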
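In a possible implementation, the first determining module is configured to: determine, among the multiple cloud instances, a cloud instance whose state information satisfies the preset expansion condition as the cloud instance to be expanded; and determine the amount of resources required for expansion based on the state information of the cloud instance to be expanded.
In a possible implementation, the cloud service system further includes a second working node, and the first working node further includes: a second determination module, configured to, if the amount of idle resources of the first working node is less than the amount of resources required for expansion, determine a cloud instance to be migrated among the multiple cloud instances, where the priority of the service run by the cloud instance to be migrated is lower than the preset priority; a first migration module, configured to migrate the cloud instance to be migrated to the second working node, so as to update the amount of idle resources of the first working node; and a second adjustment module, configured to, if the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, increase the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion.
In a possible implementation, the cloud service system further includes a third working node, and the first working node further includes: a detection module, configured to, if the updated amount of idle resources of the first working node is less than the amount of resources required for expansion, detect the type of service run by the cloud instance to be expanded; a creation module, configured to, if the service run by the cloud instance to be expanded is a stateless application, create a new cloud instance at the third working node, where the new cloud instance and the cloud instance to be expanded jointly run the stateless application; and a second migration module, configured to, if the service run by the cloud instance to be expanded is a stateful application, migrate the cloud instance to be expanded to the third working node.
In a possible implementation, the state information includes at least one of the following: resource occupancy rate, load level, and service success rate.
In a possible implementation, the preset expansion condition includes at least one of the following: the resource occupancy rate is greater than or equal to the preset first resource occupancy rate, the load level is greater than or equal to the preset first load level, and the service success rate is lower than the preset first service success rate.
In a possible implementation, the aforementioned migration is cold migration or hot migration.
In a possible implementation, the cloud service system further includes a management node, and the first working node further includes: a feedback module, configured to send the resource quota of the cloud instance to be expanded to the management node.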
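A fourth aspect of the embodiments of the present application provides a working node. The working node serves as the first working node, the first working node is set in a cloud service system, multiple cloud instances are deployed on the first working node, and the first working node includes: an acquisition module, configured to acquire state information of the multiple cloud instances; a determining module, configured to determine, based on the state information, the cloud instance to be scaled down among the multiple cloud instances; a release module, configured to release the idle resources of the cloud instance to be scaled down and determine the amount of idle resources released; and an adjustment module, configured to reduce the resource quota of the cloud instance to be scaled down based on its amount of idle resources.
As can be seen from the above working node: after acquiring the state information of the multiple cloud instances, the first working node can determine the cloud instance to be scaled down among the multiple cloud instances based on the state information. Then, the first working node releases the idle resources of the cloud instance to be scaled down and determines the amount of idle resources released. Finally, the first working node can reduce the resource quota of the cloud instance to be scaled down based on that amount. Based on the foregoing process, the present application provides a new cloud instance scale-down mechanism: after the first working node itself determines the cloud instance to be scaled down, it can reduce the resource quota of that cloud instance by modifying the cgroup configuration in real time. This is imperceptible to the service run by the cloud instance to be scaled down, so that service is not interrupted.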
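In a possible implementation, the determining module is configured to determine, among the multiple cloud instances, a cloud instance whose state information satisfies the preset scale-down condition as the cloud instance to be scaled down.
In a possible implementation, the state information includes at least one of the following: resource occupancy rate, load level, and service success rate.
In a possible implementation, the preset scale-down condition includes at least one of the following: the resource occupancy rate is lower than the preset second resource occupancy rate, the load level is lower than the preset second load level, and the service success rate is greater than or equal to the preset second service success rate.
In a possible implementation, the cloud service system further includes a management node, and the first working node further includes: a feedback module, configured to send the resource quota of the cloud instance to be expanded to the management node.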
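A fifth aspect of the embodiments of the present application provides a working node including a memory and a processor. The memory stores code, and the processor is configured to execute the code; when the code is executed, the working node performs the method according to the first aspect, any possible implementation of the first aspect, the second aspect, or any possible implementation of the second aspect.
A sixth aspect of the embodiments of the present application provides a computer storage medium storing one or more instructions which, when executed by one or more computers, cause the one or more computers to implement the method according to the first aspect, any possible implementation of the first aspect, the second aspect, or any possible implementation of the second aspect.
A seventh aspect of the embodiments of the present application provides a computer program product storing instructions which, when executed by a computer, cause the computer to implement the method according to the first aspect, any possible implementation of the first aspect, the second aspect, or any possible implementation of the second aspect.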
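In the embodiments of the present application, after acquiring the state information of the multiple cloud instances, the first working node can determine the cloud instance to be scaled down among the multiple cloud instances based on the state information. Then, the first working node releases the idle resources of the cloud instance to be scaled down and determines the amount of idle resources released. Finally, the first working node can reduce the resource quota of the cloud instance to be scaled down based on that amount. Based on the foregoing process, the present application provides a new cloud instance scale-down mechanism: after the first working node itself determines the cloud instance to be scaled down, it can reduce the resource quota of that cloud instance by modifying the cgroup configuration in real time. This is imperceptible to the service run by the cloud instance to be scaled down, so that service is not interrupted.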
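Brief Description of the Drawings
FIG. 1 is a schematic diagram of VPA;
FIG. 2 is a schematic structural diagram of the cloud service system provided by an embodiment of the present application;
FIG. 3 is another schematic structural diagram of the cloud service system provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of the cloud instance expansion method provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of the cloud instance scale-down method provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of the working node provided by an embodiment of the present application;
FIG. 7 is another schematic structural diagram of the working node provided by an embodiment of the present application;
FIG. 8 is another schematic structural diagram of the working node provided by an embodiment of the present application.
Detailed Description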
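Embodiments of the present application provide a method for expanding and scaling down cloud instances and related devices, which can ensure that the service run by a cloud instance is not interrupted when the resource quota of the cloud instance is increased or decreased.
The terms "first", "second" and the like in the specification, claims and drawings of the present application are used to distinguish similar objects and not necessarily to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate; this is merely the manner of distinguishing objects with the same attributes when describing the embodiments of the present application. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, system, product or device comprising a series of units is not necessarily limited to those units, but may include other units not explicitly listed or inherent to the process, method, product or device.
With the rapid development of technology, cloud service systems are growing ever larger. A cloud service system usually includes multiple working nodes and a management node. Multiple cloud instances are deployed on each working node, and the management node can centrally manage all cloud instances.
For ease of description, the cloud instance is taken to be a container (docker). Currently, cloud service systems use kubernetes as the container management standard, which provides functions such as orchestrated deployment, grayscale upgrade and downgrade, and automatic expansion and scaling down for containers. For automatic scaling, kubernetes supports two methods, VPA and HPA. As shown in FIG. 1 (FIG. 1 is a schematic diagram of VPA), in VPA the management node can calculate a recommended resource quota for a pod based on the resource occupancy rate of the pod on a working node, and send the recommended value to that working node. When the working node creates the pod, it can set a new resource quota for the pod based on the recommended value, for example, increase the resource quota of the pod (that is, increase the amount of resources allocated to the pod, or expand the pod) or decrease the resource quota of the pod (that is, decrease the amount of resources allocated to the pod, or scale down the pod).
In the foregoing process, the resource quota of a pod can be modified only when the pod is created. When the resource quota of a pod needs to be modified, the working node must first release the pod and can apply the new resource quota only when re-creating it, which interrupts the service run by the pod.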
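To solve the above problem, an embodiment of the present application provides a method for expanding and scaling down cloud instances. The method can be applied to the cloud service system shown in FIG. 2 (FIG. 2 is a schematic structural diagram of the cloud service system provided by an embodiment of the present application). The cloud scenarios to which the system applies include public cloud, private cloud, hybrid cloud and the like. The system includes a management node and multiple working nodes, and the management node centrally manages these working nodes. The management node and the working nodes are described separately below:
Management node: usually a separate physical server (which may also be called a network device). Several functional modules are provided in the management node: an application programming interface (API), a scheduler, a controller manager (controller-manager) and a database (DB). The API can be understood as the interface of the cloud service system, through which a user can create cloud instances on working nodes. The scheduler schedules cloud instances onto suitable nodes; it is usually a replaceable component whose form can be set according to the requirements of different vendors, which is not limited here. The controller manager implements functions such as resource management for the working nodes and cloud instances in the cloud service system. The database stores the configuration information of the cloud service system, for example, the resource quota of each cloud instance.
Working node: usually also a separate physical server. Based on virtualization technology, a node agent (kubelet) and multiple cloud instances can be deployed on top of the operating system of the working node. The operating system manages the hardware resources of the working node. The node agent, as an agent process on the working node, manages all cloud instances on the working node, including lifecycle management, resource configuration and so on. A cloud instance can take several forms: for example, one cloud instance may be a virtual machine (VM), a container (docker), a group of virtual machines, or a group of containers (also called a pod). It can be understood that the working node itself has certain hardware resources (computing resources, storage resources, network resources and so on). The node agent can set a resource quota for each cloud instance; the resource quota of a cloud instance is the amount of resources allocated to it (including amounts of computing, storage and network resources), and the node agent can manage and adjust the resource quotas of all cloud instances.
It should be noted that, as shown in FIG. 3 (FIG. 3 is another schematic structural diagram of the cloud service system provided by an embodiment of the present application), for a given working node, the node agent of the working node provides a scaling interface (scaleAPI) for the cloud instances of the working node to call, and a bidirectional channel is built between the node agent and the cloud instances of that working node, so the two can communicate with each other. Specifically, a cloud instance of the working node can determine by itself, based on its own state information, whether it needs to be expanded or scaled down. After determining that it does, it sends an autoscaling request to the node agent of the working node, so that the node agent expands or scales down the cloud instance based on the request.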
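As a rough illustration of the autoscaling request that a cloud instance might send to the node agent over the scaling interface, the sketch below encodes and decodes a request as JSON. The message layout, field names and action values are hypothetical; the application does not specify a wire format for the scaleAPI.

```python
import json
from dataclasses import asdict, dataclass

VALID_ACTIONS = ("expand", "scale_down")


@dataclass
class ScaleRequest:
    """Hypothetical autoscaling request sent by a cloud instance to the node agent."""
    instance_id: str
    action: str              # "expand" or "scale_down"
    memory_bytes: int = 0    # amount required (expand) or released (scale_down)
    cpu_millicores: int = 0


def encode(request: ScaleRequest) -> bytes:
    """Serialize a request for sending over the bidirectional channel."""
    if request.action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {request.action!r}")
    return json.dumps(asdict(request)).encode("utf-8")


def decode(payload: bytes) -> ScaleRequest:
    """Parse a request on the node-agent side and validate its action."""
    request = ScaleRequest(**json.loads(payload.decode("utf-8")))
    if request.action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {request.action!r}")
    return request
```

A node agent receiving such a payload would call `decode` and then proceed with the idle-resource check described in the steps that follow.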
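Notably, adding the scaling interface and its implementation to the node agent requires either an intrusive modification of the node agent or development as a plug-in. The bidirectional channel between the node agent and the cloud instances requires no intrusive modification; only the relevant network path needs to be configured.
To further understand the foregoing process, it is described below in two parts: expansion of a cloud instance and scaling down of a cloud instance. Expansion is described first. FIG. 4 is a schematic flowchart of the cloud instance expansion method provided by an embodiment of the present application. It should be noted that the method can be applied to the cloud service system shown in FIG. 2 or FIG. 3, and can be executed by any one of the multiple working nodes in the cloud service system; that working node is hereinafter called the first working node. As shown in FIG. 4, the method includes:
401. The first working node acquires state information of multiple cloud instances.
402. The first working node determines, based on the state information, the cloud instance to be expanded among the multiple cloud instances and the amount of resources required for expansion.
In this embodiment, multiple cloud instances are deployed on the first working node. Any one of these cloud instances can periodically acquire its own state information and determine, based on it, whether it needs to be expanded. The period is usually on the order of seconds.
Further, the state information of the cloud instance may include at least one of the following: the resource occupancy rate of the cloud instance, the load level of the cloud instance, and the service success rate of the cloud instance.
Further, the resource occupancy rate of the cloud instance may include at least one of the following: the central processing unit (CPU) usage of the cloud instance, the memory usage of the cloud instance, the storage input/output operations per second (IOPS) of the cloud instance, the network IOPS of the cloud instance, and so on.
Further, the load level of the cloud instance may include at least one of the following: the task response time of the cloud instance, the task processing delay of the cloud instance, the database metrics of the cloud instance, the message queue metrics of the cloud instance, the task queue length of the cloud instance, and so on.
Further, the service success rate of the cloud instance may include at least one of the following: the task completion rate of the cloud instance, the message transmission success rate of the cloud instance, and so on.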
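Specifically, the cloud instance can determine whether it needs to be expanded in the following way:
The cloud instance can check whether its own state information satisfies the preset expansion condition. If it does, the cloud instance determines that it needs to be expanded, that is, it determines itself as the cloud instance to be expanded. If it does not, the cloud instance determines that it does not need to be expanded and ends the operation.
The state information of the cloud instance satisfying the preset expansion condition may include at least one of the following cases: the resource occupancy rate of the cloud instance is greater than or equal to the preset first resource occupancy rate, the load level of the cloud instance is greater than or equal to the preset first load level, and the service success rate of the cloud instance is lower than the preset first service success rate. It should be noted that the preset first resource occupancy rate can be understood as the resource occupancy rate threshold at which the expansion requirement is reached, the preset first load level as the load level threshold at which the expansion requirement is reached, and the preset first service success rate as the service success rate threshold at which the expansion requirement is reached. The values of these three thresholds can be set according to actual needs and are not limited here.
For example, suppose the state information of the cloud instance is its task response time; the preset first load level is then a preset response time threshold. If the task response time of the cloud instance is 3 s and the preset response time threshold is 1 s, the task response time exceeds the threshold, so the cloud instance can determine that it needs to be expanded. As another example, suppose the state information of the cloud instance is its memory usage; the preset first resource occupancy rate is then a preset memory usage threshold. If the memory usage of the cloud instance is 8 GB and the preset memory usage threshold is 1 GB, the memory usage exceeds the threshold, so the cloud instance can determine that it needs to be expanded, and so on.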
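The two examples above (a 3 s response time against a 1 s threshold, and 8G of memory usage against a 1G threshold) can be reproduced with a small Python sketch. The names are hypothetical, each category is represented by one number, and a threshold left as `None` means that category is not part of the configured condition; the condition is treated as met when any configured case holds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExpansionThresholds:
    """The preset 'first' thresholds; None means the category is not configured."""
    resource_occupancy: Optional[float] = None
    load_level: Optional[float] = None
    success_rate: Optional[float] = None


def needs_expansion(
    limits: ExpansionThresholds,
    *,
    resource_occupancy: Optional[float] = None,
    load_level: Optional[float] = None,
    success_rate: Optional[float] = None,
) -> bool:
    """Return True if any configured and reported metric meets its expansion case."""
    if (limits.resource_occupancy is not None and resource_occupancy is not None
            and resource_occupancy >= limits.resource_occupancy):
        return True
    if (limits.load_level is not None and load_level is not None
            and load_level >= limits.load_level):
        return True
    if (limits.success_rate is not None and success_rate is not None
            and success_rate < limits.success_rate):
        return True
    return False
```

With `ExpansionThresholds(load_level=1.0)`, a reported `load_level=3.0` (seconds of response time) yields `True`; with `ExpansionThresholds(resource_occupancy=1.0)`, a reported `resource_occupancy=8.0` (memory usage in G) also yields `True`.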
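If the cloud instance determines that it needs to be expanded, it can determine the amount of resources required for its expansion based on its own state information. For example, the cloud instance can determine the amounts of computing, storage and network resources required for its expansion based on its task queue length.
403. The first working node checks whether the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion.
After the cloud instance determines that it needs to be expanded and the amount of resources required, it can send an expansion request to the node agent of the first working node. After receiving the expansion request from the cloud instance, the node agent can parse the request and thereby determine that this cloud instance is the cloud instance to be expanded, along with the amount of resources required to expand it.
The node agent can then check whether the amount of idle resources of the first working node (that is, the amount of local resources of the first working node that are not in use) is greater than or equal to the amount of resources required to expand the cloud instance. If it is, step 404 is performed; if it is less, step 405 is performed.
404. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for expansion, the first working node increases the resource quota of the cloud instance to be expanded based on the amount of resources required for expansion.
If the amount of idle resources of the first working node is greater than or equal to the amount of resources required to expand the cloud instance, the idle resources of the first working node are sufficient and can be used to expand the cloud instance directly, so the node agent can increase the resource quota of the cloud instance based on the amount of resources required to expand it. For example, suppose the cloud instance runs a large service volume, so that its expansion requires 5 GB of memory, and its original memory quota (that is, the amount of memory originally allocated to the cloud instance) is 1 GB. The node agent can modify the cgroup configuration so that the memory quota of the cloud instance becomes 6 GB.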
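Steps 403 and 404 amount to a comparison followed by simple arithmetic, which the sketch below illustrates for a single resource (memory). The function name and return convention are hypothetical, and treating 1G as 2**30 bytes is an assumption; the actual cgroup update is omitted.

```python
from typing import Optional, Tuple

GIB = 1024 ** 3  # assumption: "1G" in the example means 2**30 bytes


def try_expand_in_place(
    quota: int, required: int, node_idle: int
) -> Optional[Tuple[int, int]]:
    """Steps 403-404 for one resource.

    Returns (new_quota, remaining_node_idle) when the node has enough idle
    resources, or None when step 405 (migration) would be needed instead.
    """
    if required <= 0:
        raise ValueError("required amount must be positive")
    if node_idle < required:
        return None
    return quota + required, node_idle - required
```

With the numbers from the example, `try_expand_in_place(1 * GIB, 5 * GIB, 8 * GIB)` gives a new quota of 6G and leaves 3G idle on the node, while a node with only 4G idle returns `None`.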
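405. If the amount of idle resources of the first working node is less than the amount of resources required for the scale-up, the first working node determines, among the multiple cloud instances, a cloud instance to be migrated, where the priority of the service run by the cloud instance to be migrated is lower than a preset priority.
406. The first working node migrates the cloud instance to be migrated to a second working node, so as to update the amount of idle resources of the first working node.
If the amount of idle resources of the first working node is less than the amount of resources required for the scale-up of the cloud instance, the idle resources of the first working node are insufficient, and the node agent may try to resolve the shortage. Specifically, the node agent may determine, among the multiple cloud instances of the first working node, at least one cloud instance to be migrated, where the priority of the services run by these cloud instances is lower than a preset priority. These cloud instances can therefore be migrated to the second working node. The resources of the first working node that were allocated to them are then released and become new idle resources, which updates (that is, increases) the amount of idle resources of the first working node.
It should be noted that, because the services run by the cloud instances to be migrated have low priority, they may be migrated by cold migration. The node agent of the first working node sends a scheduling request to the management node. Based on the scheduling request, the management node may select one of the working nodes other than the first working node as the migration destination, that is, the second working node. The management node may then notify the node agent of the second working node to create a new cloud instance and have the new cloud instance re-run the service that was run by the cloud instance to be migrated on the first working node. Finally, the management node may notify the node agent of the first working node to release the cloud instance to be migrated, so that the resources allocated to it are released and the amount of idle resources of the first working node is updated.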
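One way to choose the cloud instances to migrate in step 405 is sketched below. The application only requires that the migrated instances run services whose priority is below a preset priority; the lowest-priority-first ordering, the stop-when-enough rule, and all names here are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical record of a cloud instance on the first working node."""
    name: str
    priority: int  # larger value means a more important service
    quota: int     # resources currently allocated to the instance


def pick_migration_candidates(instances, target_name, preset_priority, idle, required):
    """Select low-priority instances to migrate until `idle` would cover `required`.

    Returns the chosen instances and the idle amount expected after migration.
    """
    candidates = sorted(
        (i for i in instances
         if i.name != target_name and i.priority < preset_priority),
        key=lambda i: i.priority,
    )
    chosen = []
    for instance in candidates:
        if idle >= required:
            break
        chosen.append(instance)
        idle += instance.quota  # the instance's quota becomes idle once migrated
    return chosen, idle
```

If the returned idle amount is still below `required`, the node agent would fall through to the handling in steps 410 to 412.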
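407. The first working node checks whether its updated amount of idle resources is greater than or equal to the amount of resources required for the scale-up.
After the amount of idle resources of the first working node is updated, the node agent may check whether the updated amount is greater than or equal to the amount of resources required for the scale-up of the cloud instance. If it is greater than or equal to that amount, step 408 is performed; if it is less than that amount, step 410 is performed.
408. If the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up, the first working node increases the resource quota of the cloud instance to be scaled up based on the amount of resources required for the scale-up.
If the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up of the cloud instance, the updated idle resources are sufficient and can be used to scale up the cloud instance directly. The node agent may therefore increase the resource quota of the cloud instance based on the amount of resources required for its scale-up.
409. The first working node sends the resource quota of the cloud instance to be scaled up to the management node.
After increasing the resource quota of the cloud instance, the node agent may send the increased quota to the management node. The management node and the first working node thus stay synchronized on the resource quota of the cloud instance, which keeps the global resource configuration information (that is, for the entire cloud service system) accurate and consistent.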
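410. If the updated amount of idle resources of the first working node is less than the amount of resources required for the scale-up, the first working node detects the type of service run by the cloud instance to be scaled up.
411. If the service run by the cloud instance to be scaled up is a stateless application, the first working node creates a new cloud instance on a third working node, and the new cloud instance and the cloud instance to be scaled up jointly run the stateless application.
412. If the service run by the cloud instance to be scaled up is a stateful application, the first working node migrates the cloud instance to be scaled up to the third working node.
If the updated amount of idle resources of the first working node is less than the amount of resources required for the scale-up of the cloud instance, the updated idle resources are still insufficient, and the node agent needs to resolve the shortage again. The node agent may first detect the type of service run by the cloud instance (which can also be understood as the application run by the cloud instance) and then handle the case according to that type:
If the service run by the cloud instance is a stateless application, the node agent may request that a new cloud instance be created on the third working node. The cloud instance and the new cloud instance can then jointly run the service originally run by the cloud instance, which is equivalent to a scale-up. Specifically, after determining that the service run by the cloud instance is a stateless application, the node agent of the first working node may send a scheduling request to the management node. Based on the scheduling request, the management node may select one of the working nodes other than the first working node, that is, the third working node. The management node may then notify the node agent of the third working node to create a new cloud instance and have it run the service run by the cloud instance on the first working node. Because the service is a stateless application, it makes little difference which cloud instance runs it. So although the new cloud instance and the original cloud instance are located on different working nodes, they can share the service, which is equivalent to completing the scale-up.
If the service run by the cloud instance is a stateful application, the node agent may migrate the cloud instance to the third working node, where the amount of idle resources of the third working node is greater than or equal to the amount of resources required for the scale-up of the cloud instance. After the cloud instance is migrated to the third working node, its resource quota can be increased, which is equivalent to completing the scale-up. Specifically, the cloud instance may be migrated to the third working node by either live migration or cold migration. The two approaches are described below:
(1) Live migration: after determining that the service run by the cloud instance is a stateful application, the node agent of the first working node may send a scheduling request to the management node. Based on the scheduling request, the management node may select one of the working nodes other than the first working node as the migration destination, that is, the third working node. The management node may then notify the node agent of the third working node to create a new cloud instance and keep the service state of the new cloud instance consistent with that of the original cloud instance, so that the new cloud instance continues running the service run by the cloud instance on the first working node. Finally, the management node may notify the node agent of the first working node to release the cloud instance. Because the node agent of the third working node can give the new cloud instance a larger resource quota than the cloud instance had on the first working node, this is equivalent to completing the scale-up.
(2) Cold migration: after determining that the service run by the cloud instance is a stateful application, the node agent of the first working node may send a scheduling request to the management node. Based on the scheduling request, the management node may select one of the working nodes other than the first working node as the migration destination, that is, the third working node. The management node may then notify the node agent of the third working node to create a new cloud instance and have the new cloud instance re-run the service run by the cloud instance on the first working node. Finally, the management node may notify the node agent of the first working node to release the cloud instance. Because the node agent of the third working node can give the new cloud instance a larger resource quota than the cloud instance had on the first working node, this is equivalent to completing the scale-up.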
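The branching in steps 403 to 412 can be summarized as a small planning function. This is a simplified sketch: the step labels and the `reclaimable` parameter (the amount that migrating low-priority instances would free) are hypothetical, and a real node agent would perform the migration and re-measure idle resources rather than predict them.

```python
def plan_scale_up(idle: int, reclaimable: int, required: int, stateless: bool) -> list:
    """Return the ordered actions the node agent would take for one scale-up request."""
    # Steps 403-404: enough idle resources locally, so resize in place.
    if idle >= required:
        return ["resize_locally"]

    # Steps 405-406: migrate low-priority instances to the second working node.
    steps = ["migrate_low_priority"]

    # Steps 407-409: re-check idle resources after the migration.
    if idle + reclaimable >= required:
        steps.append("resize_locally")
    # Steps 410-411: stateless service, so add a replica on the third working node.
    elif stateless:
        steps.append("create_replica_on_third_node")
    # Step 412: stateful service, so move the instance itself (live or cold migration).
    else:
        steps.append("migrate_instance_to_third_node")
    return steps
```

For instance, `plan_scale_up(2, 1, 5, stateless=False)` yields the low-priority migration followed by migrating the instance itself, matching the path through steps 405, 406, 407, 410, and 412.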
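It should be understood that this embodiment uses only one of the cloud instances of the first working node for illustration. The other cloud instances of the first working node may perform the same operations; that is, the operations described in steps 401 to 412 may be performed for every cloud instance of the first working node. Details are not repeated here.
In this embodiment of this application, after obtaining the status information of multiple cloud instances, the first working node may determine, based on that status information, a cloud instance to be scaled up among the multiple cloud instances and the amount of resources required for the scale-up. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up, the first working node may increase the resource quota of the cloud instance to be scaled up based on that amount. It follows that this application provides a new cloud instance scale-up mechanism: after determining the cloud instance to be scaled up by itself, the first working node can increase its resource quota by modifying the cgroup configuration in real time. This is transparent to the service run by the cloud instance to be scaled up, so the service is not interrupted.
Further, in VPA, if the resource quota of a cloud instance needs to be increased, the working node can modify the quota only when the cloud instance is rebuilt, and a rebuild often has to wait for a specific occasion, for example, the instance being evicted or killed. This waiting time is usually uncontrollable, so increasing the resource quota of the cloud instance takes too long. This embodiment of this application does not need to wait for a rebuild to modify the resource quota of the cloud instance; it can modify the quota in real time, which effectively shortens the time needed to increase it.
Further, in VPA, if the resource quota of a cloud instance needs to be increased, the working node can modify the quota only when the cloud instance is rebuilt, and rebuilding the cloud instance and starting the service often take a long time, so increasing the resource quota of the cloud instance takes too long. This embodiment of this application does not need to wait for a rebuild to modify the resource quota of the cloud instance; it can modify the quota in real time, which effectively shortens the time needed to increase it.
Further, VPA decides whether a cloud instance needs to be scaled up based only on its resource occupancy. It has no deeper insight into service requirements, so multiple checks are needed before it can determine accurately whether scale-up is needed, and considerable time is spent on detection. This embodiment of this application decides whether scale-up is needed based on the status information of the cloud instance. Because the status information includes the resource occupancy, load level, service success rate, and similar information, it is tied to the service logic of the cloud instance and reflects service requirements better. Based on the status information, the working node can therefore sense service requirements in real time and determine accurately whether the cloud instance needs to be scaled up, which effectively reduces the number of checks and shortens the detection time.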
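The foregoing is a detailed description of the scale-up process of a cloud instance. The following describes the scale-down process. FIG. 5 is a schematic flowchart of a cloud instance scale-down method according to an embodiment of this application. It should be noted that the method may be applied to the cloud service system shown in FIG. 2 or FIG. 3, and may be performed by any one of the multiple working nodes in the cloud service system; this working node is referred to below as the first working node. As shown in FIG. 5, the method includes:
501. The first working node obtains status information of multiple cloud instances.
502. Based on the status information, the first working node determines, among the multiple cloud instances, a cloud instance to be scaled down.
In this embodiment, multiple cloud instances are deployed on the first working node. Any one of these cloud instances may periodically obtain its own status information and use it to decide whether it needs to be scaled down.
Further, the status information of the cloud instance may include at least one of the following: the resource occupancy of the cloud instance, the load level of the cloud instance, and the service success rate of the cloud instance.
Further, the resource occupancy of the cloud instance may include at least one of the following: the CPU usage of the cloud instance, the memory usage of the cloud instance, the storage IOPS of the cloud instance, the network IOPS of the cloud instance, and so on.
Further, the load level of the cloud instance may include at least one of the following: the task response time of the cloud instance, the task processing latency of the cloud instance, the database metrics of the cloud instance, the message queue metrics of the cloud instance, the task queue length of the cloud instance, and so on.
Further, the service success rate of the cloud instance may include at least one of the following: the task completion rate of the cloud instance, the message transmission success rate of the cloud instance, and so on.
Specifically, the cloud instance may determine whether it needs to be scaled down as follows:
The cloud instance may check whether its own status information satisfies a preset scale-down condition. If so, it determines that scale-down is needed, that is, it identifies itself as the cloud instance to be scaled down. If not, it determines that scale-down is not needed and ends the operation.
The status information of the cloud instance satisfies the preset scale-down condition in at least one of the following cases: the resource occupancy of the cloud instance is less than a preset second resource occupancy, the load level of the cloud instance is less than a preset second load level, and the service success rate of the cloud instance is greater than or equal to a preset second service success rate. It should be noted that the preset second resource occupancy can be understood as the resource occupancy threshold at which scale-down is required, the preset second load level as the load level threshold at which scale-down is required, and the preset second service success rate as the service success rate threshold at which scale-down is required. The three thresholds may be set according to actual requirements and are not limited here.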
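The scale-down condition can be sketched in the same spirit as the scale-up check. The function and parameter names are hypothetical. The sketch assumes that only the thresholds actually configured take part in the decision and that satisfying any one of them is enough; a deployment could equally require all configured checks to hold.

```python
from typing import Optional


def needs_scale_down(
    resource_usage: float,
    load_level: float,
    success_rate: float,
    second_usage: Optional[float] = None,
    second_load: Optional[float] = None,
    second_success_rate: Optional[float] = None,
) -> bool:
    """Return True if at least one configured scale-down case holds."""
    checks = []
    if second_usage is not None:
        checks.append(resource_usage < second_usage)
    if second_load is not None:
        checks.append(load_level < second_load)
    if second_success_rate is not None:
        checks.append(success_rate >= second_success_rate)
    return any(checks)
```

With no threshold configured the function returns `False`, so an instance is never scaled down by default.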
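503. The first working node releases the idle resources of the cloud instance to be scaled down and determines the amount of idle resources released.
After determining that it needs to be scaled down, the cloud instance may release its own idle resources (that is, the resources it is not using; the resources occupied by the services it runs are its non-idle resources) and, once the release is complete, calculate its amount of idle resources (that is, the size of the resources it released).
It should be noted that when a cloud instance determines that its workload has dropped and it needs to be scaled down, it may try the following approaches: (1) For system-level languages, memory is managed by the service itself, and most memory is reclaimed promptly through dynamic allocation and release. Some memory is managed as a memory pool, which can be shrunk when the workload is light and enlarged when the workload is heavy. (2) For high-level languages with garbage collection, memory is reclaimed by the language runtime, but the timing of garbage collection is usually not synchronized with the workload. In this case the service can explicitly trigger a forced garbage collection to release memory.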
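Both release approaches can be illustrated together in Python, which is a garbage-collected language. The `BufferPool` class is a hypothetical stand-in for a service-managed memory pool; `gc.collect()` is the standard-library call that forces a full collection.

```python
import gc


class BufferPool:
    """Hypothetical pool of preallocated buffers kept by a service."""

    def __init__(self, buffer_size: int, capacity: int):
        self.buffer_size = buffer_size
        self._free = [bytearray(buffer_size) for _ in range(capacity)]

    def shrink(self, keep: int) -> int:
        """Drop free buffers beyond `keep` (non-negative) and return the bytes given up."""
        released = max(0, len(self._free) - keep)
        del self._free[keep:]
        return released * self.buffer_size


def release_idle_memory(pool: BufferPool, keep: int) -> int:
    """Shrink the pool, then force a garbage collection pass."""
    freed = pool.shrink(keep)
    gc.collect()  # reclaim unreachable objects now instead of waiting for the runtime
    return freed
```

The returned byte count corresponds to the amount of idle resources that the cloud instance would report to the node agent. Whether the freed memory is actually returned to the operating system depends on the runtime's allocator.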
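After obtaining its amount of idle resources, the cloud instance sends that amount to the node agent of the first working node.
504. The first working node reduces the resource quota of the cloud instance to be scaled down based on the amount of idle resources of that cloud instance.
After determining the amount of idle resources of the cloud instance, the node agent may reduce the resource quota of the cloud instance based on that amount. For example, suppose the cloud instance is running a light workload, so that it has 2 GB of idle memory, and its original memory quota (the amount of memory originally allocated to it) is 5 GB. The node agent may modify the cgroup configuration so that the memory quota of the cloud instance becomes 3 GB.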
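The quota reduction in step 504 is a subtraction with one safety check, sketched below. The names are hypothetical, and the guard against dropping the quota below current usage is an added assumption: depending on the cgroup version, a limit set below current usage may be rejected or may force memory reclamation.

```python
GIB = 1024 ** 3  # assumption: the "G" in the text is treated as GiB


def shrink_memory_quota(current_quota: int, released: int, in_use: int) -> int:
    """Return the reduced quota after an instance has released `released` bytes."""
    if released < 0 or released > current_quota:
        raise ValueError("released amount must be between 0 and the current quota")
    new_quota = current_quota - released
    if new_quota < in_use:
        # Releasing idle resources first is what keeps this branch from triggering.
        raise ValueError("new quota would be below current usage")
    return new_quota
```

For the example in the text, `shrink_memory_quota(5 * GIB, 2 * GIB, 3 * GIB)` returns 3 GiB, the value the node agent would then write into the cgroup configuration.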
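505. The first working node sends the resource quota of the cloud instance to be scaled down to the management node.
After reducing the resource quota of the cloud instance, the node agent may send the reduced quota to the management node. The management node and the first working node thus stay synchronized on the resource quota of the cloud instance, which keeps the global resource configuration information (that is, for the entire cloud service system) accurate and consistent.
It should be understood that this embodiment uses only one of the cloud instances of the first working node for illustration. The other cloud instances of the first working node may perform the same operations; that is, the operations described in steps 501 to 505 may be performed for every cloud instance of the first working node. Details are not repeated here.
In this embodiment of this application, after obtaining the status information of multiple cloud instances, the first working node may determine, based on that status information, a cloud instance to be scaled down among the multiple cloud instances. The first working node then releases the idle resources of the cloud instance to be scaled down and determines the amount of idle resources released. Finally, the first working node may reduce the resource quota of the cloud instance to be scaled down based on that amount. It follows that this application provides a new cloud instance scale-down mechanism: after determining the cloud instance to be scaled down by itself, the first working node can reduce its resource quota by modifying the cgroup configuration in real time. This is transparent to the service run by the cloud instance to be scaled down, so the service is not interrupted.
Further, before the resource quota of the cloud instance to be scaled down is reduced, the cloud instance can release its idle resources by itself, which helps ensure that the modification of its resource quota succeeds.
Further, in VPA, if the resource quota of a cloud instance needs to be reduced, the working node can modify the quota only when the cloud instance is rebuilt, and a rebuild often has to wait for a specific occasion, for example, the instance being evicted or killed. This waiting time is usually uncontrollable, so reducing the resource quota of the cloud instance takes too long. This embodiment of this application does not need to wait for a rebuild to modify the resource quota of the cloud instance; it can modify the quota in real time, which effectively shortens the time needed to reduce it.
Further, in VPA, if the resource quota of a cloud instance needs to be reduced, the working node can modify the quota only when the cloud instance is rebuilt, and rebuilding the cloud instance and starting the service often take a long time, so reducing the resource quota of the cloud instance takes too long. This embodiment of this application does not need to wait for a rebuild to modify the resource quota of the cloud instance; it can modify the quota in real time, which effectively shortens the time needed to reduce it.
Further, VPA decides whether a cloud instance needs to be scaled down based only on its resource occupancy. It has no deeper insight into service requirements, so multiple checks are needed before it can determine accurately whether scale-down is needed, and considerable time is spent on detection. This embodiment of this application decides whether scale-down is needed based on the status information of the cloud instance. Because the status information includes the resource occupancy, load level, service success rate, and similar information, it is tied to the service logic of the cloud instance and reflects service requirements better. Based on the status information, the working node can therefore sense service requirements in real time and determine accurately whether the cloud instance needs to be scaled down, which effectively reduces the number of checks and shortens the detection time.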
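The foregoing is a detailed description of the cloud instance scale-down method provided in the embodiments of this application. The following describes the working node provided in the embodiments of this application. FIG. 6 is a schematic structural diagram of a working node according to an embodiment of this application. As shown in FIG. 6, the working node serves as a first working node, the first working node is arranged in a cloud service system, multiple cloud instances are deployed on the first working node, and the first working node includes:
an obtaining module 601, configured to obtain status information of the multiple cloud instances;
a first determining module 602, configured to determine, based on the status information, a cloud instance to be scaled up among the multiple cloud instances and the amount of resources required for the scale-up; and
a first adjusting module 603, configured to, if the amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up, increase the resource quota of the cloud instance to be scaled up based on the amount of resources required for the scale-up.
In this embodiment of this application, after obtaining the status information of multiple cloud instances, the first working node may determine, based on that status information, a cloud instance to be scaled up among the multiple cloud instances and the amount of resources required for the scale-up. If the amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up, the first working node may increase the resource quota of the cloud instance to be scaled up based on that amount. It follows that this application provides a new cloud instance scale-up mechanism: after determining the cloud instance to be scaled up by itself, the first working node can increase its resource quota by modifying the cgroup configuration in real time. This is transparent to the service run by the cloud instance to be scaled up, so the service is not interrupted.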
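In a possible implementation, the first determining module 602 is configured to: determine, among the multiple cloud instances, a cloud instance whose status information satisfies a preset scale-up condition as the cloud instance to be scaled up; and determine the amount of resources required for the scale-up based on the status information of the cloud instance to be scaled up.
In a possible implementation, the cloud service system further includes a second working node, and the first working node further includes: a second determining module, configured to, if the amount of idle resources of the first working node is less than the amount of resources required for the scale-up, determine a cloud instance to be migrated among the multiple cloud instances, where the priority of the service run by the cloud instance to be migrated is lower than a preset priority; a first migration module, configured to migrate the cloud instance to be migrated to the second working node, so as to update the amount of idle resources of the first working node; and a second adjusting module, configured to, if the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up, increase the resource quota of the cloud instance to be scaled up based on the amount of resources required for the scale-up.
In a possible implementation, the cloud service system further includes a third working node, and the first working node further includes: a detection module, configured to, if the updated amount of idle resources of the first working node is less than the amount of resources required for the scale-up, detect the type of service run by the cloud instance to be scaled up; a creation module, configured to, if the service run by the cloud instance to be scaled up is a stateless application, create a new cloud instance on the third working node, where the new cloud instance and the cloud instance to be scaled up jointly run the stateless application; and a second migration module, configured to, if the service run by the cloud instance to be scaled up is a stateful application, migrate the cloud instance to be scaled up to the third working node.
In a possible implementation, the status information includes at least one of the following: resource occupancy, load level, and service success rate.
In a possible implementation, the preset scale-up condition includes at least one of the following: the resource occupancy is greater than or equal to a preset first resource occupancy, the load level is greater than or equal to a preset first load level, and the service success rate is less than a preset first service success rate.
In a possible implementation, the foregoing migration is cold migration or live migration.
In a possible implementation, the cloud service system further includes a management node, and the first working node further includes: a feedback module, configured to send the resource quota of the cloud instance to be scaled up to the management node.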
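FIG. 7 is another schematic structural diagram of a working node according to an embodiment of this application. As shown in FIG. 7, the working node serves as a first working node, the first working node is arranged in a cloud service system, multiple cloud instances are deployed on the first working node, and the first working node includes:
an obtaining module 701, configured to obtain status information of the multiple cloud instances;
a determining module 702, configured to determine, based on the status information, a cloud instance to be scaled down among the multiple cloud instances;
a releasing module 703, configured to release the idle resources of the cloud instance to be scaled down and determine the amount of idle resources released; and
an adjusting module 704, configured to reduce the resource quota of the cloud instance to be scaled down based on the amount of idle resources of that cloud instance.
In this embodiment of this application, after obtaining the status information of multiple cloud instances, the first working node may determine, based on that status information, a cloud instance to be scaled down among the multiple cloud instances. The first working node then releases the idle resources of the cloud instance to be scaled down and determines the amount of idle resources released. Finally, the first working node may reduce the resource quota of the cloud instance to be scaled down based on that amount. It follows that this application provides a new cloud instance scale-down mechanism: after determining the cloud instance to be scaled down by itself, the first working node can reduce its resource quota by modifying the cgroup configuration in real time. This is transparent to the service run by the cloud instance to be scaled down, so the service is not interrupted.
In a possible implementation, the determining module 702 is configured to determine, among the multiple cloud instances, a cloud instance whose status information satisfies a preset scale-down condition as the cloud instance to be scaled down.
In a possible implementation, the status information includes at least one of the following: resource occupancy, load level, and service success rate.
In a possible implementation, the preset scale-down condition includes at least one of the following: the resource occupancy is less than a preset second resource occupancy, the load level is less than a preset second load level, and the service success rate is greater than or equal to a preset second service success rate.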
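In a possible implementation, the cloud service system further includes a management node, and the first working node further includes: a feedback module, configured to send the resource quota of the cloud instance to be scaled down to the management node.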
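It should be noted that the information exchange between the modules/units of the foregoing apparatus, their execution processes, and related content are based on the same concept as the method embodiments of this application and bring the same technical effects. For details, refer to the descriptions in the foregoing method embodiments of this application; they are not repeated here.
FIG. 8 is another schematic structural diagram of a working node according to an embodiment of this application. As shown in FIG. 8, the working node serves as a first working node, the first working node is arranged in a cloud service system, multiple cloud instances are deployed on the first working node, and the first working node may include one or more central processing units 801, a memory 802, an input/output interface 803, a wired or wireless network interface 804, and a power supply 805.
The memory 802 may be transient storage or persistent storage. Further, the central processing unit 801 may be configured to communicate with the memory 802 and execute, on the first working node, a series of instruction operations stored in the memory 802.
In this embodiment, the central processing unit 801 may perform the operations performed by the first working node in the embodiment shown in FIG. 4 or FIG. 5. Details are not repeated here.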
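In this embodiment, the functional modules in the central processing unit 801 may be divided in a manner similar to the division into the obtaining module, first determining module, first adjusting module, second determining module, first migration module, second adjusting module, detection module, creation module, second migration module, and feedback module described with reference to FIG. 6, and details are not repeated here; or
the functional modules in the central processing unit 801 may be divided in a manner similar to the division into the obtaining module, determining module, releasing module, adjusting module, and feedback module described with reference to FIG. 7, and details are not repeated here.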
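An embodiment of this application further relates to a computer storage medium including computer-readable instructions. When the computer-readable instructions are executed, the steps performed by the first working node in the embodiment shown in FIG. 4 or FIG. 5 are implemented.
An embodiment of this application further relates to a computer program product containing instructions. When the computer program product runs on a computer, the computer performs the steps performed by the first working node in the embodiment shown in FIG. 4 or FIG. 5.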
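A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.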

Claims (26)

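  1. A cloud instance scale-up method, wherein the method is applied to a cloud service system, the cloud service system comprises a first working node, multiple cloud instances are deployed on the first working node, and the method comprises:
    obtaining, by the first working node, status information of the multiple cloud instances;
    determining, by the first working node based on the status information, a cloud instance to be scaled up among the multiple cloud instances and an amount of resources required for the scale-up; and
    if an amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up, increasing, by the first working node, a resource quota of the cloud instance to be scaled up based on the amount of resources required for the scale-up.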
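  2. The method according to claim 1, wherein the determining, by the first working node based on the status information, a cloud instance to be scaled up among the multiple cloud instances and an amount of resources required for the scale-up comprises:
    determining, by the first working node among the multiple cloud instances, a cloud instance whose status information satisfies a preset scale-up condition as the cloud instance to be scaled up; and
    determining, by the first working node, the amount of resources required for the scale-up based on the status information of the cloud instance to be scaled up.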
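  3. The method according to claim 1 or 2, wherein the cloud service system further comprises a second working node, and the method further comprises:
    if the amount of idle resources of the first working node is less than the amount of resources required for the scale-up, determining, by the first working node, a cloud instance to be migrated among the multiple cloud instances, wherein a priority of a service run by the cloud instance to be migrated is lower than a preset priority;
    migrating, by the first working node, the cloud instance to be migrated to the second working node, so as to update the amount of idle resources of the first working node; and
    if the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up, increasing, by the first working node, the resource quota of the cloud instance to be scaled up based on the amount of resources required for the scale-up.
  4. The method according to claim 3, wherein the cloud service system further comprises a third working node, and the method further comprises:
    if the updated amount of idle resources of the first working node is less than the amount of resources required for the scale-up, detecting, by the first working node, a type of a service run by the cloud instance to be scaled up;
    if the service run by the cloud instance to be scaled up is a stateless application, creating, by the first working node, a new cloud instance on the third working node, wherein the new cloud instance and the cloud instance to be scaled up jointly run the stateless application; and
    if the service run by the cloud instance to be scaled up is a stateful application, migrating, by the first working node, the cloud instance to be scaled up to the third working node.
  5. The method according to claim 2, wherein the status information comprises at least one of the following: resource occupancy, load level, and service success rate.
  6. The method according to claim 5, wherein the preset scale-up condition comprises at least one of the following: the resource occupancy is greater than or equal to a preset first resource occupancy, the load level is greater than or equal to a preset first load level, and the service success rate is less than a preset first service success rate.
  7. The method according to claim 4, wherein the migration is cold migration or live migration.
  8. The method according to any one of claims 1 to 7, wherein the cloud service system further comprises a management node, and after the first working node increases the resource quota of the cloud instance to be scaled up, the method further comprises:
    sending, by the first working node, the resource quota of the cloud instance to be scaled up to the management node.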
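  9. A cloud instance scale-down method, wherein the method is applied to a cloud service system, the cloud service system comprises a first working node, multiple cloud instances are deployed on the first working node, and the method comprises:
    obtaining, by the first working node, status information of the multiple cloud instances;
    determining, by the first working node based on the status information, a cloud instance to be scaled down among the multiple cloud instances;
    releasing, by the first working node, idle resources of the cloud instance to be scaled down, and determining an amount of the released idle resources of the cloud instance to be scaled down; and
    reducing, by the first working node, a resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down.
  10. The method according to claim 9, wherein the determining, by the first working node based on the status information, a cloud instance to be scaled down among the multiple cloud instances comprises:
    determining, by the first working node among the multiple cloud instances, a cloud instance whose status information satisfies a preset scale-down condition as the cloud instance to be scaled down.
  11. The method according to claim 9 or 10, wherein the status information comprises at least one of the following: resource occupancy, load level, and service success rate.
  12. The method according to claim 11, wherein the preset scale-down condition comprises at least one of the following: the resource occupancy is less than a preset second resource occupancy, the load level is less than a preset second load level, and the service success rate is greater than or equal to a preset second service success rate.
  13. The method according to any one of claims 9 to 12, wherein the cloud service system further comprises a management node, and after the first working node reduces the resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down, the method further comprises:
    sending, by the first working node, the resource quota of the cloud instance to be scaled down to the management node.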
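  14. A working node, wherein the working node serves as a first working node, the first working node is arranged in a cloud service system, multiple cloud instances are deployed on the first working node, and the first working node comprises:
    an obtaining module, configured to obtain status information of the multiple cloud instances;
    a first determining module, configured to determine, based on the status information, a cloud instance to be scaled up among the multiple cloud instances and an amount of resources required for the scale-up; and
    a first adjusting module, configured to, if an amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up, increase a resource quota of the cloud instance to be scaled up based on the amount of resources required for the scale-up.
  15. The working node according to claim 14, wherein the first determining module is configured to:
    determine, among the multiple cloud instances, a cloud instance whose status information satisfies a preset scale-up condition as the cloud instance to be scaled up; and
    determine the amount of resources required for the scale-up based on the status information of the cloud instance to be scaled up.
  16. The working node according to claim 14 or 15, wherein the cloud service system further comprises a second working node, and the first working node further comprises:
    a second determining module, configured to, if the amount of idle resources of the first working node is less than the amount of resources required for the scale-up, determine a cloud instance to be migrated among the multiple cloud instances, wherein a priority of a service run by the cloud instance to be migrated is lower than a preset priority;
    a first migration module, configured to migrate the cloud instance to be migrated to the second working node, so as to update the amount of idle resources of the first working node; and
    a second adjusting module, configured to, if the updated amount of idle resources of the first working node is greater than or equal to the amount of resources required for the scale-up, increase the resource quota of the cloud instance to be scaled up based on the amount of resources required for the scale-up.
  17. The working node according to claim 16, wherein the cloud service system further comprises a third working node, and the first working node further comprises:
    a detection module, configured to, if the updated amount of idle resources of the first working node is less than the amount of resources required for the scale-up, detect a type of a service run by the cloud instance to be scaled up;
    a creation module, configured to, if the service run by the cloud instance to be scaled up is a stateless application, create a new cloud instance on the third working node, wherein the new cloud instance and the cloud instance to be scaled up jointly run the stateless application; and
    a second migration module, configured to, if the service run by the cloud instance to be scaled up is a stateful application, migrate the cloud instance to be scaled up to the third working node.
  18. The working node according to claim 15, wherein the status information comprises at least one of the following: resource occupancy, load level, and service success rate.
  19. The working node according to claim 18, wherein the preset scale-up condition comprises at least one of the following: the resource occupancy is greater than or equal to a preset first resource occupancy, the load level is greater than or equal to a preset first load level, and the service success rate is less than a preset first service success rate.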
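  20. A working node, wherein the working node serves as a first working node, the first working node is arranged in a cloud service system, multiple cloud instances are deployed on the first working node, and the first working node comprises:
    an obtaining module, configured to obtain status information of the multiple cloud instances;
    a determining module, configured to determine, based on the status information, a cloud instance to be scaled down among the multiple cloud instances;
    a releasing module, configured to release idle resources of the cloud instance to be scaled down and determine an amount of the released idle resources of the cloud instance to be scaled down; and
    an adjusting module, configured to reduce a resource quota of the cloud instance to be scaled down based on the amount of idle resources of the cloud instance to be scaled down.
  21. The working node according to claim 20, wherein the determining module is configured to determine, among the multiple cloud instances, a cloud instance whose status information satisfies a preset scale-down condition as the cloud instance to be scaled down.
  22. The working node according to claim 20 or 21, wherein the status information comprises at least one of the following: resource occupancy, load level, and service success rate.
  23. The working node according to claim 22, wherein the preset scale-down condition comprises at least one of the following: the resource occupancy is less than a preset second resource occupancy, the load level is less than a preset second load level, and the service success rate is greater than or equal to a preset second service success rate.
  24. A working node, wherein the working node comprises a memory and a processor; the memory stores code, the processor is configured to execute the code, and when the code is executed, the working node performs the method according to any one of claims 1 to 13.
  25. A computer storage medium, wherein the computer storage medium stores one or more instructions, and the instructions, when executed by one or more computers, cause the one or more computers to implement the method according to any one of claims 1 to 13.
  26. A computer program product, wherein the computer program product stores instructions, and the instructions, when executed by a computer, cause the computer to implement the method according to any one of claims 1 to 13.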
PCT/CN2022/134647 2021-11-30 2022-11-28 一种云实例的扩缩容方法及其相关设备 WO2023098614A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111450334.5A CN116204268A (zh) 2021-11-30 2021-11-30 一种云实例的扩缩容方法及其相关设备
CN202111450334.5 2021-11-30

Publications (1)

Publication Number Publication Date
WO2023098614A1 true WO2023098614A1 (zh) 2023-06-08

Family

ID=86508193

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/134647 WO2023098614A1 (zh) 2021-11-30 2022-11-28 一种云实例的扩缩容方法及其相关设备

Country Status (2)

Country Link
CN (1) CN116204268A (zh)
WO (1) WO2023098614A1 (zh)

Also Published As

Publication number Publication date
CN116204268A (zh) 2023-06-02

Similar Documents

Publication Publication Date Title
WO2023098614A1 (zh) 一种云实例的扩缩容方法及其相关设备
JP7138126B2 (ja) リソース配置を最適化するための適時性リソース移行
US11500832B2 (en) Data management method and server
CN104077212A (zh) 压力测试系统及方法
CN111641515B (zh) Vnf的生命周期管理方法及装置
JP2013513174A (ja) 仮想マシンのストレージスペースおよび物理ホストを管理するための方法およびシステム
CN107431696A (zh) 用于应用自动化部署的方法和云管理节点
CN110247984B (zh) 业务处理方法、装置及存储介质
CN104461744A (zh) 一种资源分配方法及装置
US20140282540A1 (en) Performant host selection for virtualization centers
JP2018026042A (ja) 移動制御プログラム、移動制御装置及び移動制御方法
KR101660514B1 (ko) 분산 렌더링 시스템
CN106484528A (zh) 分布式框架中用于实现集群动态伸缩的方法及装置
US10320892B2 (en) Rolling capacity upgrade control
TW201322134A (zh) 虛擬機管理系統及方法
WO2020134364A1 (zh) 一种虚拟机迁移方法、云计算管理平台和存储介质
US20220329651A1 (en) Apparatus for container orchestration in geographically distributed multi-cloud environment and method using the same
CN109582459A (zh) 应用的托管进程进行迁移的方法及装置
CN103488538B (zh) 云计算系统中的应用扩展装置和应用扩展方法
AU2015203316A1 (en) Intelligent application back stack management
CN111078628A (zh) 一种多盘并发数据迁移方法、系统、装置及可读存储介质
Van Tendeloo et al. Activity in pythonpdevs
CN112199192A (zh) 基于服务器部署Kubernetes集群精细化管理配额的方法及系统
CN114546587A (zh) 一种在线图像识别服务的扩缩容方法及相关装置
CN109905258B (zh) PaaS的管理方法、装置及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22900418

Country of ref document: EP

Kind code of ref document: A1