CN117349001A - Node telescoping method and related device - Google Patents


Info

Publication number
CN117349001A
Authority
CN
China
Prior art keywords
cluster
nodes
node
current period
period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210743228.4A
Other languages
Chinese (zh)
Inventor
袁诗宇
迟勇欣
李星泽
陈明
朱锦鸿
莫介水
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Cloud Computing Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co Ltd
Priority to CN202210743228.4A
Publication of CN117349001A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a node scaling method, which comprises the following steps: acquiring the state of a cluster in the current period; acquiring a prediction result of the number of instances the cluster needs to deploy in the next period; determining the ideal number of nodes of the cluster in the next period according to the state of the cluster in the current period and the prediction result; and scaling the nodes of the cluster in the current period according to the ideal number of nodes. On top of an automatic node-layer scaling strategy, the method adds a prediction of the number of instances the cluster will need to deploy, and scales nodes in the current period by combining the real-time state with the prediction result, so that the number of nodes in the cluster can meet the demand of the upcoming next period. This improves scaling efficiency, ensures service stability, raises resource utilization, reduces cost, and meets and improves service performance.

Description

Node telescoping method and related device
Technical Field
The present application relates to the field of cloud computing technologies, and in particular, to a node scaling method, a cluster scaler, a scaling decision system, a computer cluster, a computer readable storage medium, and a computer program product.
Background
An application (APP) is a program written for a user for a specific purpose, such as a text processor, a spreadsheet, an accounting application, a browser, a media player, a flight simulator, a command line game, or an image editor. An application typically needs to be deployed to a computer, such as a terminal or a server, which runs the program code of the application to implement the corresponding functionality.
Some large applications require a large amount of computation, and a single computer may lack the computing power to run them. Such applications can instead be deployed on a cluster. A cluster is a group of mutually independent computers interconnected by a high-speed network, each computer being referred to as a node of the cluster. The program code of an application running on a node is called an instance, which is typically dynamic code.
Because service traffic may change dynamically, a cluster may be automatically scaled (autoscaling). Autoscaling refers to automatically adding or removing nodes or instances depending on the state of the cluster. For example, node-layer autoscaling triggers an increase or decrease in the number of nodes depending on the state of the cluster.
Currently, the node scaling strategy for many applications is as follows: the states of the instances and the states of the nodes in the cluster are detected periodically; when the number of instances is found to be so large that the current nodes cannot host all of them, nodes are added to the cluster; when nodes in the cluster are found to have been underutilized for a long period of time, those nodes are removed from the cluster.
However, when a node is added to a cluster, that is, when the cluster is scaled up/out, a certain time elapses between the creation of the node and the point at which it can be used. A large number of requests may therefore fail before the node is ready, which affects reliability.
Disclosure of Invention
The present application provides a node scaling method. On top of an automatic node-layer scaling strategy, the method adds a prediction of the number of instances the cluster will need to deploy, and scales nodes in the current period by combining the real-time state with the prediction result, so that the number of nodes in the cluster can meet the demand of the upcoming next period. This improves scaling efficiency, ensures service stability, raises resource utilization, reduces cost, and meets and improves service performance. The application also provides a corresponding cluster scaler, a scaling decision system, a computer cluster, a computer readable storage medium, and a computer program product.
In a first aspect, the present application provides a node scaling method. The method may be performed by a cluster scaler. The cluster scaler may be software deployed in a computer cluster, in which case the computer cluster implements the node scaling method of the embodiments of the application by executing the program code of the software. In some embodiments, the cluster scaler may also be hardware, such as a computer cluster with a node scaling function.
Specifically, the cluster scaler acquires the state of a cluster in the current period and acquires a prediction result of the number of instances the cluster needs to deploy in the next period. The cluster scaler then determines the ideal number of nodes of the cluster in the next period according to the state of the cluster in the current period and the prediction result, and scales the nodes of the cluster in the current period according to the ideal number of nodes.
On top of an automatic node-layer scaling strategy, the method adds a prediction of the number of instances the cluster will need to deploy. For example, future traffic can be predicted at minute-level granularity, and the number of instances to be deployed can then be predicted at minute-level granularity from the predicted traffic distribution. The cluster scaler accordingly scales nodes in the current period by combining the real-time state with the prediction result, so that the number of nodes in the cluster can meet the demand of the upcoming next period. This improves scaling efficiency, ensures service stability, raises resource utilization, reduces cost, and meets and improves service performance.
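The steps of the first aspect can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the field names, the uniform `instances_per_node` capacity, and the choice never to plan for fewer instances than the cluster currently holds are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class PeriodState:
    """Simplified cluster state for the current period (hypothetical fields)."""
    ready_nodes: int        # nodes that are ready now
    upcoming_nodes: int     # nodes being created, ready soon
    running_instances: int  # instances already running
    pending_instances: int  # instances waiting for capacity


def ideal_node_count(state: PeriodState, predicted_instances: int,
                     instances_per_node: int) -> int:
    """Nodes needed in the next period, assuming identical nodes."""
    # Assumption: never plan for fewer instances than the cluster holds now.
    current_demand = state.running_instances + state.pending_instances
    demand = max(predicted_instances, current_demand)
    return math.ceil(demand / instances_per_node)


def scaling_delta(state: PeriodState, ideal: int) -> int:
    """Positive: nodes to add in the current period; negative: nodes to remove."""
    return ideal - (state.ready_nodes + state.upcoming_nodes)


state = PeriodState(ready_nodes=3, upcoming_nodes=1,
                    running_instances=10, pending_instances=2)
ideal = ideal_node_count(state, predicted_instances=20, instances_per_node=4)
print(ideal, scaling_delta(state, ideal))  # 5 1
```

Counting nodes that are about to become ready alongside ready nodes keeps the sketch from requesting capacity that is already on its way.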
In some possible implementations, the cluster scaler may add nodes from a target node pool to the cluster in the current period, so that the number of nodes in the cluster reaches the ideal number of nodes. The cluster can thus be scaled out according to service demand. Because the scale-out is performed in advance in the current period, the delay between creating a node and being able to use it no longer causes a large number of failed requests before the node is ready, and reliability is not affected.
In some possible implementations, the target node pool is determined from the result of simulating instance deployment according to a target policy. The method runs a bin-packing simulation according to the target policy to determine a suitable target node pool, and adds nodes of the corresponding type from that pool for scale-out, so that service demand can be met.
In some possible implementations, the target policy includes one or more of a random policy, a maximum number of deployed instances policy, or a maximum resource utilization policy. The random policy randomly selects one of the known node pools of the cluster and adds one or more nodes to that node pool. The maximum number of deployed instances policy selects, from all node pools, the node pool on which the largest number of instances can be deployed. The maximum resource utilization policy, also called the minimum waste policy, selects the node pool with the highest resource utilization after all instances are deployed.
The method thus offers several deployment policies. A user can select the random policy, the maximum number of deployed instances policy, or the maximum resource utilization policy as needed for the bin-packing simulation, for example to maximize the number of deployed instances or to maximize resource utilization.
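The three policies can be illustrated with a small Python sketch. It is a simplification under assumptions that do not come from the application: each node pool is described only by CPU and memory per node, the simulation places instances greedily on a single new node, and utilization is taken as the mean of CPU and memory utilization. The names `NodePool`, `simulate`, and `choose_pool` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePool:
    name: str
    cpu: float  # CPU cores offered by one node of this pool
    mem: float  # memory (GiB) offered by one node of this pool


def simulate(pool, requests):
    """Greedily place (cpu, mem) requests on one new node of `pool`.

    Returns (instances placed, mean of CPU and memory utilization)."""
    used_cpu = used_mem = 0.0
    placed = 0
    for cpu, mem in requests:
        if used_cpu + cpu <= pool.cpu and used_mem + mem <= pool.mem:
            used_cpu, used_mem = used_cpu + cpu, used_mem + mem
            placed += 1
    utilization = (used_cpu / pool.cpu + used_mem / pool.mem) / 2
    return placed, utilization


def choose_pool(pools, requests, policy, rng=random):
    """Pick the target node pool under one of the three policies."""
    if policy == "random":
        return rng.choice(pools)
    if policy == "most_instances":
        return max(pools, key=lambda p: simulate(p, requests)[0])
    if policy == "max_utilization":
        return max(pools, key=lambda p: simulate(p, requests)[1])
    raise ValueError(f"unknown policy: {policy}")


pools = [NodePool("small", cpu=2, mem=4), NodePool("large", cpu=8, mem=16)]
pending = [(1, 2), (1, 2), (1, 2)]
print(choose_pool(pools, pending, "most_instances").name)   # large
print(choose_pool(pools, pending, "max_utilization").name)  # small
```

With three pending instances, the large node hosts all of them but stays mostly idle, while the small node hosts only two and is fully used, which is why the two policies pick different pools.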
In some possible implementations, the cluster scaler may also obtain a scaling constraint. The scaling constraint represents constraint conditions on node scaling and may include an effective time interval and a scaling interval. The cluster scaler can scale the nodes of the cluster in the current period according to the ideal number of nodes and the scaling constraint.
This avoids over-reliance on the simulation result and allows the cluster to be scaled precisely.
In some possible implementations, the scaling constraint includes an effective time interval and a scaling interval, where the scaling interval includes at least one of a maximum number of nodes and a minimum number of nodes. When the next period falls within the effective time interval: if the number of nodes in the cluster in the current period is smaller than the minimum number of nodes, the cluster scaler can add nodes to the cluster in the current period so that the number of nodes in the cluster reaches the minimum number of nodes; if the number of nodes in the cluster in the current period is larger than the maximum number of nodes, the cluster scaler can remove nodes from the cluster in the current period so that the number of nodes in the cluster reaches the maximum number of nodes; and if the number of nodes in the cluster in the current period is not smaller than the minimum number of nodes and not larger than the maximum number of nodes, the cluster scaler may add nodes to the cluster or remove nodes from the cluster in the current period so that the number of nodes in the cluster reaches the ideal number of nodes.
This prevents an inaccurate simulation result from causing excessive scale-out and thus wasted resources, or from causing excessive scale-in and thus service unavailability.
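The three cases described above can be written down directly. The following Python sketch follows the text literally; representing the effective time interval as a half-open range of period indices, and falling back to the ideal number of nodes outside that interval, are assumptions of the example, as are all names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingConstraint:
    start: int                       # effective interval start (period index)
    end: int                         # effective interval end, exclusive
    min_nodes: Optional[int] = None  # at least one of min/max is set
    max_nodes: Optional[int] = None


def target_node_count(current: int, ideal: int, next_period: int,
                      c: ScalingConstraint) -> int:
    """Number of nodes the cluster should reach in the current period."""
    if not (c.start <= next_period < c.end):
        return ideal  # assumption: the constraint does not apply here
    if c.min_nodes is not None and current < c.min_nodes:
        return c.min_nodes  # scale out to the minimum
    if c.max_nodes is not None and current > c.max_nodes:
        return c.max_nodes  # scale in to the maximum
    return ideal            # within bounds: follow the ideal number


c = ScalingConstraint(start=600, end=720, min_nodes=3, max_nodes=10)
print(target_node_count(current=2, ideal=5, next_period=630, c=c))   # 3
print(target_node_count(current=12, ideal=5, next_period=630, c=c))  # 10
print(target_node_count(current=5, ideal=7, next_period=630, c=c))   # 7
```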
In some possible implementations, the maximum number of nodes or the minimum number of nodes is determined from a historical number. Specifically, the cluster scaler may use historical data (including the historical number and historical prediction results) to detect historical time periods in which the prediction accuracy did not reach the required standard. The effective time interval may be determined from the distribution of those historical time periods, and the maximum number of nodes or the minimum number of nodes may be determined from the distribution of the historical number within those time periods.
Because the maximum number of nodes and the minimum number of nodes are obtained by mining real historical data, they have high reference value.
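One way such mining could work is sketched below in Python. The application does not specify the accuracy standard or how the distributions are used, so everything concrete here is an assumption: a relative-error threshold marks a prediction as inaccurate, an hour of the day joins the effective interval when it is inaccurate on at least a given share of days, and the node bounds come from the smallest and largest actual instance counts observed in those hours.

```python
import math
from collections import Counter


def derive_constraint(records, instances_per_node,
                      max_rel_error=0.2, min_share=0.5):
    """records: list of (day, hour, predicted_instances, actual_instances)."""
    days = {day for day, _, _, _ in records}
    # Samples whose prediction missed the (assumed) accuracy standard.
    bad = [(hour, actual) for _, hour, predicted, actual in records
           if actual > 0 and abs(predicted - actual) / actual > max_rel_error]
    per_hour = Counter(hour for hour, _ in bad)
    hours = sorted(h for h, n in per_hour.items() if n / len(days) >= min_share)
    if not hours:
        return None  # predictions were accurate enough everywhere
    counts = [actual for hour, actual in bad if hour in hours]
    return {
        "start_hour": hours[0],
        "end_hour": hours[-1] + 1,  # exclusive
        "min_nodes": math.ceil(min(counts) / instances_per_node),
        "max_nodes": math.ceil(max(counts) / instances_per_node),
    }


history = [
    (1, 9, 10, 10), (1, 20, 10, 30), (1, 21, 18, 20),
    (2, 9, 12, 11), (2, 20, 10, 40), (2, 21, 10, 20),
]
print(derive_constraint(history, instances_per_node=10))
```

In this toy history the predictions for hours 20 and 21 miss often enough to form the effective interval, and the actual counts seen there (20 to 40 instances) translate into a bound of 2 to 4 nodes.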
In some possible implementations, the historical number includes the number of instances of an application deployed over a historical time period, or the number of instances of a similar application deployed over a historical time period. When the application is newly released, the historical number can refer to the number of instances of a similar application deployed over a historical time period. This allows the model to be reused and provides a reliable node scaling service for the newly released application.
In some possible implementations, the maximum number of nodes or the minimum number of nodes is configured by a user through a configuration interface. In other words, the cluster scaler allows a user to set the maximum or minimum number of nodes manually. In some planned business scenarios, such as a store's anniversary sale, the user expects heavy service traffic and can manually configure the maximum and minimum numbers of nodes for the duration of the sale, so that the cluster scaler scales nodes according to them. Personalized business requirements can thus be met.
In some possible implementations, the state of the cluster in the current period includes one or more of: the types of ready nodes in the cluster and the number of nodes of each type; the types of nodes about to become ready and the number of nodes of each type; the type and number of instances running on each node; and the type and number of instances to be deployed.
A node type may be identified by one or more of the processor architecture, processor frequency, memory type, or memory size of the node. Similarly, an instance type is identified by one or more of the processor architecture, processor frequency, memory type, or memory size used to run the instance.
This information represents the real-time state of the cluster. Combined with the prediction result, it allows nodes to be scaled in the current period to meet the demand of the upcoming next period, which improves scaling efficiency, ensures service stability, raises resource utilization, reduces cost, and meets and improves service performance.
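The cluster state described above maps naturally onto a small data structure. The Python sketch below is one possible representation, with hypothetical class and field names; it uses the same four attributes to identify node types and instance types.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ResourceType:
    """Identifies a node type or an instance type (hypothetical fields)."""
    arch: str        # processor architecture, e.g. "x86" or "arm"
    freq_ghz: float  # processor frequency
    mem_type: str    # memory type, e.g. "DDR4"
    mem_gib: int     # memory size


@dataclass
class ClusterState:
    ready_nodes: Counter = field(default_factory=Counter)     # node type -> count
    upcoming_nodes: Counter = field(default_factory=Counter)  # node type -> count
    running: dict = field(default_factory=dict)  # node name -> Counter of instance types
    pending: Counter = field(default_factory=Counter)         # instance type -> count

    def node_count(self) -> int:
        """Ready nodes plus nodes about to become ready."""
        return sum(self.ready_nodes.values()) + sum(self.upcoming_nodes.values())

    def instance_count(self) -> int:
        """Running instances plus instances still to be deployed."""
        running = sum(sum(c.values()) for c in self.running.values())
        return running + sum(self.pending.values())


x86 = ResourceType("x86", 4.0, "DDR4", 16)
web = ResourceType("x86", 4.0, "DDR4", 2)
state = ClusterState(
    ready_nodes=Counter({x86: 2}),
    upcoming_nodes=Counter({x86: 1}),
    running={"node-a": Counter({web: 3}), "node-b": Counter({web: 2})},
    pending=Counter({web: 4}),
)
print(state.node_count(), state.instance_count())  # 3 9
```

Making `ResourceType` frozen lets it serve as a dictionary key, so counts per type stay a simple `Counter` lookup.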
In a second aspect, the present application provides a cluster scaler. The cluster scaler comprises:
an acquisition module, configured to acquire the state of a cluster in the current period and to acquire a prediction result of the number of instances the cluster needs to deploy in the next period;
a determining module, configured to determine the ideal number of nodes of the cluster in the next period according to the state of the cluster in the current period and the prediction result;
and a scaling module, configured to scale the nodes of the cluster in the current period according to the ideal number of nodes.
In some possible implementations, the scaling module is specifically configured to:
add nodes from the target node pool to the cluster in the current period, so that the number of nodes in the cluster reaches the ideal number of nodes.
In some possible implementations, the target node pool is determined by simulating results of instance deployment according to a target policy.
In some possible implementations, the target policy includes one or more of a random policy, a maximum number of deployed instances policy, or a maximum resource utilization policy.
In some possible implementations, the acquisition module is further configured to:
obtain a scaling constraint;
and the scaling module is specifically configured to:
scale the nodes of the cluster in the current period according to the ideal number of nodes and the scaling constraint.
In some possible implementations, the scaling constraint includes an effective time interval and a scaling interval, the scaling interval including at least one of a maximum number of nodes and a minimum number of nodes, and the scaling module is specifically configured to:
when the next period falls within the effective time interval: if the number of nodes in the cluster in the current period is smaller than the minimum number of nodes, add nodes to the cluster in the current period so that the number of nodes in the cluster reaches the minimum number of nodes; if the number of nodes in the cluster in the current period is larger than the maximum number of nodes, remove nodes from the cluster in the current period so that the number of nodes in the cluster reaches the maximum number of nodes; and if the number of nodes in the cluster in the current period is not smaller than the minimum number of nodes and not larger than the maximum number of nodes, add nodes to the cluster or remove nodes from the cluster in the current period so that the number of nodes in the cluster reaches the ideal number of nodes.
In some possible implementations, the maximum number of nodes or the minimum number of nodes is determined from a historical number.
In some possible implementations, the historical number includes the number of instances of an application deployed over a historical time period, or the number of instances of a similar application deployed over a historical time period.
In some possible implementations, the maximum number of nodes or the minimum number of nodes is configured by a user through a configuration interface.
In some possible implementations, the state of the cluster in the current period includes one or more of: the types of ready nodes in the cluster and the number of nodes of each type; the types of nodes about to become ready and the number of nodes of each type; the type and number of instances running on each node; and the type and number of instances to be deployed.
In a third aspect, the present application provides a scaling decision system. The scaling decision system comprises:
a prediction subsystem, configured to predict the number of instances the cluster needs to deploy in the next period;
and a cluster scaler, configured to obtain the prediction result of the number of instances the cluster needs to deploy in the next period, obtain the state of the cluster in the current period, determine the ideal number of nodes of the cluster in the next period according to the state of the cluster in the current period and the prediction result, and scale the nodes of the cluster in the current period according to the ideal number of nodes.
In a fourth aspect, the present application provides a computer cluster. The computer cluster includes at least one computer including at least one processor and at least one memory. The at least one processor and the at least one memory are in communication with each other. The at least one processor is configured to execute instructions stored in the at least one memory to cause the computer or the computer cluster to perform the node scaling method of the first aspect or any implementation of the first aspect.
In a fifth aspect, the present application provides a computer-readable storage medium having stored therein instructions for instructing a computer or a cluster of computers to perform the node scaling method according to the first aspect or any implementation manner of the first aspect.
In a sixth aspect, the present application provides a computer program product comprising instructions which, when run on a computer or a cluster of computers, cause the computer or the cluster of computers to perform the node scaling method of the first aspect or any implementation of the first aspect.
The implementations provided in the above aspects may be further combined to provide additional implementations.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the embodiments are briefly described below.
FIG. 1 is a system architecture diagram of a scaling decision system according to an embodiment of the present application;
FIG. 2 is an interface schematic diagram of a recommendation interface according to an embodiment of the present application;
FIG. 3 is a flowchart of a node scaling method provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of an instance number prediction model provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of node number simulation according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of node scaling provided in an embodiment of the present application;
FIG. 7 is a schematic diagram of a node scale-out scenario provided in an embodiment of the present application;
FIG. 8A is a schematic diagram of a node scale-in scenario provided in an embodiment of the present application;
FIG. 8B is a schematic diagram of a node scale-in scenario provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of a node scaling scenario provided in an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a cluster scaler according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a computer cluster according to an embodiment of the present application.
Detailed Description
The terms "first" and "second" in the embodiments of the present application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features.
Some technical terms related to the embodiments of the present application will be first described.
An application is a program written for a user for a specific purpose, such as a text processor, a spreadsheet, an accounting application, a media player, a flight simulator, a command line game, or an image editor. The program may be deployed on a cloud platform.
A cloud platform is a platform that provides computing, storage, or networking capabilities to users in the form of cloud services. Specifically, the cloud platform may provide computing, storage, or networking capabilities on demand so that applications can be deployed on a cluster. A cluster is a group of mutually independent computers interconnected by a high-speed network, each computer being referred to as a node of the cluster.
The nodes in a cluster are grouped into node pools, and all nodes in a node pool are of the same type. The node type may be identified by the processor architecture, the processor frequency, and/or the memory size of the node. For example, the nodes in one node pool all have an x86-architecture central processing unit (CPU) with a frequency of 4 gigahertz (GHz), and the nodes in another node pool all have an Advanced RISC Machine (ARM) architecture graphics processing unit (GPU) with a frequency of 8 GHz.
The program code of an application running on a node is called an instance, which is typically dynamic code. The dynamic code may be a process or a thread that serves the corresponding application purpose, such as video playback or image editing. One or more instances can be deployed on one node. Because service traffic changes over time, instances can be scaled by adjusting the number of instances on a node, so as to meet the service demand of different time periods.
In addition to instance scaling, a cluster can also perform node scaling to meet the service demand of different time periods. Node scaling refers to adding nodes to or removing nodes from a cluster. Adding a node to the cluster may also be referred to as scaling up/out, and removing a node from the cluster may also be referred to as scaling down/in.
Node scaling is typically implemented by a cluster scaler (Cluster Autoscaler, CA). Currently, many applications use the Kubernetes Cluster Autoscaler for node scaling. Its node scaling strategy is as follows: the states of the instances and of the nodes in the cluster are detected periodically; when the cluster is found to have insufficient nodes, that is, the number of instances is so large that the current nodes cannot host all of them, nodes are added to the cluster; when nodes in the cluster are found to have been underutilized for a long period of time, those nodes are removed from the cluster.
However, when a node is added to a cluster, a certain time elapses between the creation of the node and the point at which it can be used, so a large number of requests may fail before the node is ready, which affects reliability. When nodes are removed from the cluster, scale-in is triggered by the detected performance indicators and a preset cool-down time, so the timing of scale-in may not match the actual situation: nodes that should be retained may be removed, or nodes that should be removed may be retained. It is therefore difficult to balance service reliability against resource utilization.
In view of this, the embodiments of the application provide a node scaling method. The method may be performed by a cluster scaler. The cluster scaler may be software deployed in a computer cluster, in which case the computer cluster implements the node scaling method of the embodiments of the application by executing the program code of the software. In some embodiments, the cluster scaler may also be hardware, such as a computer cluster with a node scaling function.
Specifically, the cluster scaler acquires the state of a cluster in the current period and acquires a prediction result of the number of instances the cluster needs to deploy in the next period. It then determines the ideal number of nodes of the cluster in the next period according to the state of the cluster in the current period and the prediction result, and scales the nodes of the cluster in the current period according to the ideal number of nodes.
On top of an automatic node-layer scaling strategy, the method adds a prediction of the number of instances the cluster will need to deploy. For example, future traffic can be predicted at minute-level granularity, and the number of instances to be deployed can then be predicted at minute-level granularity from the predicted traffic distribution. The cluster scaler accordingly scales nodes in the current period by combining the real-time state with the prediction result, so that the number of nodes in the cluster can meet the demand of the upcoming next period. This improves scaling efficiency, ensures service stability, raises resource utilization, reduces cost, and meets and improves service performance.
In order to make the technical solution of the present application clearer and easier to understand, the scaling decision system of the embodiments of the present application is described below with reference to the accompanying drawings.
Referring to the system architecture diagram of the scaling decision system shown in FIG. 1, the scaling decision system 10 includes a cluster scaler 102 and a prediction subsystem 104. The cluster scaler 102 and the prediction subsystem 104 may be software deployed in a computer cluster, in which case the computer cluster implements the corresponding functions by executing the program code of the software. The cluster scaler 102 may also be hardware, such as a computer cluster with the corresponding function. For ease of description, the following takes the components of the scaling decision system 10 as software by way of example.
The prediction subsystem 104 is configured to predict the number of instances that the cluster 20 needs to deploy in the next period, so as to obtain a prediction result. The prediction subsystem 104 may first predict the traffic of the cluster 20 in the next period, for example from the traffic of the cluster 20 in the current period through a time series model, and then predict, from that predicted traffic, the number of instances that the cluster 20 needs to deploy in the next period. In some embodiments, the prediction subsystem 104 may also directly predict the number of instances that the cluster 20 needs to deploy in the next period through a trained instance number prediction model.
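The two-stage prediction (traffic first, then instances) can be sketched in Python as follows. The application does not name the time series model, so simple exponential smoothing is used here purely as a stand-in, and the per-instance capacity `requests_per_instance` is a hypothetical parameter.

```python
import math


def forecast_next(traffic, alpha=0.5):
    """One-step-ahead forecast by simple exponential smoothing (a stand-in model)."""
    level = traffic[0]
    for observed in traffic[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level


def instances_needed(predicted_traffic, requests_per_instance):
    """Instances required to serve the predicted traffic."""
    return math.ceil(predicted_traffic / requests_per_instance)


recent_traffic = [100, 200, 400]           # requests per period, oldest first
predicted = forecast_next(recent_traffic)  # 275.0
print(instances_needed(predicted, requests_per_instance=50))  # 6
```

A trained instance number prediction model, as mentioned in the text, would replace both functions with a single mapping from history to an instance count.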
The cluster scaler 102 is configured to obtain a state of the cluster 20 in a current cycle, and obtain a prediction result of the number of instances that the cluster 20 needs to deploy in a next cycle, for example, obtain the prediction result from the prediction subsystem 104, then determine an ideal node number of the cluster 20 in the next cycle according to the state of the cluster 20 in the current cycle and the prediction result, and then scale the cluster 20 in the current cycle according to the ideal node number.
The state of the cluster 20 in the current period includes one or more of the ready node types and the number of nodes of each type in the cluster 20, the types and number of instances running on each node, and the types and number of instances to be deployed. When the application is deployed using Kubernetes, the number of instances running on a node may be the number of pods, and the cluster 20 may include multiple namespaces, such as an Inner Client namespace and an Admin namespace. Instances or components in different namespaces may be isolated from each other. For example, the pods belong to the Inner Client namespace and the cluster scaler belongs to the Admin namespace, and the two are isolated from each other.
In some possible implementations, the telescopic decision system 10 may also include a data collection subsystem 106. The data collection subsystem 106 is configured to collect historical data applied in the cluster 20, such as traffic flows, number of instances, performance, etc. applied over a historical time period, so as to provide sample data for training or updating the time series model, the number of instances prediction model, for the prediction subsystem 104.
The data collection subsystem 106 may include a data acquisition device 1062, a data processing device 1064, and a data storage device 1066. The data collection device 1062 is configured to collect historical data applied in the cluster 20, the data processing device 1064 is configured to screen or integrate the collected historical data, and the data storage device 1066 is configured to store the screened or integrated historical data, so as to provide the historical data to the prediction subsystem 104 for model training or updating.
In some embodiments, the data collection device 1062 may be implemented by a Prometheus service, the data processing device 1064 may be implemented by a remote storage adapter service, and the data storage device 1066 may be implemented by a time-series database such as InfluxDB or GaussDB.
Accordingly, the prediction subsystem 104 may include a model management device 1042 and a model storage device 1044. The model management device 1042 may obtain historical data of the applications in the cluster 20 from a time-series database, such as InfluxDB, and perform model training using the historical data. The model storage device 1044 may store a trained model, such as a time series model or an instance number prediction model. The model management device 1042 may then make predictions using a trained model such as a time series model or an instance number prediction model. Specifically, when the model management device 1042 receives a prediction request sent by the cluster scaler 102, the model management device 1042 may obtain the trained model from the model storage device 1044 to perform prediction, and return the prediction result to the cluster scaler 102. Further, the model management device 1042 may also store the prediction results in the data storage device 1066 to provide sample data for subsequent model updates.
It should be noted that, after the training of the time series model or the instance number prediction model is completed, the history data does not need to be permanently saved. Data store 1066 (e.g., a time series database) may purge or dump historical data to other storage devices, saving database resources.
In some possible implementations, the telescopic decision system 10 may also include a recommendation subsystem 108. The recommendation subsystem 108 is configured to present to a user at least one of the cost change and the performance change caused by node scaling under different scaling strategies, so that the user can select an appropriate scaling strategy. The different scaling strategies may include a hybrid scaling strategy and a native automatic scaling strategy. The hybrid scaling strategy refers to scaling nodes in the current period according to the ideal node number, which is determined by combining the prediction result with the automatic scaling strategy. The performance change may include one or more of a change in resource utilization or a change in application response delay.
The recommendation subsystem 108 may place the hybrid scaling strategy provided in the embodiments of the present application and the native automatic scaling strategy into a simulation environment and use real historical traffic as input to the simulation environment, thereby calculating the cost change and/or performance change that using the hybrid scaling strategy would have caused, relative to the native automatic scaling strategy, over the same past time period. The recommendation subsystem 108 may then present these changes to the user through a recommendation interface. See the schematic diagram of the recommendation interface shown in fig. 2, where the recommendation interface 200 is a graphical user interface (GUI). The recommendation interface 200 includes a text box 202, and the text box 202 includes the rate of change in cost, the rate of change in response time, and the rate of change in resource utilization resulting from using the hybrid scaling strategy relative to using the native automatic scaling strategy. In this example, using the hybrid scaling strategy results in a 31% cost reduction, a 28% reduction in application response delay, and a 45% improvement in resource utilization compared with using the native automatic scaling strategy. The recommendation interface 200 also carries a policy selection control 204, through which a user may select the hybrid scaling strategy of the embodiments of the present application or the native automatic scaling strategy. When the user selects the hybrid scaling strategy, the cluster scaler 102 may scale the nodes of the cluster 20 in the current cycle according to the ideal node number.
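The rates of change shown in the text box can be illustrated with a minimal sketch. The function name and the simulation outputs below are hypothetical values chosen only for illustration; they are not data from the embodiments.

```python
def relative_change(native: float, hybrid: float) -> float:
    """Relative change of a metric under the hybrid strategy versus the native one."""
    if native == 0:
        raise ValueError("native baseline must be non-zero")
    return (hybrid - native) / native


# Hypothetical simulation outputs for the same historical time period.
native = {"cost": 100.0, "response_delay_ms": 250.0, "resource_utilization": 0.40}
hybrid = {"cost": 69.0, "response_delay_ms": 180.0, "resource_utilization": 0.58}

# One rate of change per metric, as would be listed in the text box.
changes = {name: relative_change(native[name], hybrid[name]) for name in native}
for name, change in changes.items():
    print(f"{name}: {change:+.0%}")
```

A negative value for cost or response delay and a positive value for resource utilization both indicate an improvement of the hybrid strategy over the native one.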
Having described the telescopic decision system 10 in detail, a detailed description of a node telescopic method according to an embodiment of the present application will be provided below with reference to the accompanying drawings.
Referring to the flow chart of the node scaling method shown in fig. 3, the method comprises:
S302: the cluster scaler 102 obtains the state of the cluster 20 at the current cycle.
Specifically, the cluster scaler 102 may periodically obtain the state of the cluster 20. The period may be set at the minute level. In some embodiments, the period may be set with reference to the time it takes to add a node to the cluster 20. For example, if adding a node to the cluster 20 takes 3 minutes, the period may be set to 3 minutes. Because the cluster scaler 102 acquires the state of the cluster 20 at a short period, which is equivalent to acquiring the state in real time, node scaling of the cluster 20 can be performed in time and service availability can be guaranteed.
The state of the cluster 20 is used to describe the operational status of the nodes in the cluster 20. Specifically, the state of the cluster 20 in the current cycle includes one or more of the ready node types and the number of nodes of each type in the cluster 20, the types and number of instances running on each node, and the types and number of instances to be deployed.
The node type may be identified by one or more of a processor architecture, a processor frequency, a memory type, or a memory size in the node. Similarly, instance types are identified by one or more of the processor architecture, processor frequency, memory type, or memory size of the running instance.
The cluster scaler 102 may send an application programming interface (API) request to the cluster 20, so that the module responsible for metric collection in the cluster 20 returns the metrics indicated in the API request, such as the ready node types and the number of nodes of each type, the types and number of instances running on each node, and the types and number of instances to be deployed. In this way, the cluster scaler 102 obtains the state of the cluster 20 in the current cycle. In some possible implementations, the cluster scaler 102 may also obtain the state of the cluster 20 in the current cycle after the user triggers a state query operation through the console.
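As an illustration of the state described above, the following minimal sketch shows one possible in-memory representation. The class name, field names and sample values are assumptions made for illustration and are not defined by the embodiments.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ClusterState:
    """State of the cluster in the current cycle (all field names are hypothetical)."""
    # node type -> number of ready nodes of that type
    ready_nodes: Dict[str, int] = field(default_factory=dict)
    # node name -> {instance type -> number of running instances}
    running_instances: Dict[str, Dict[str, int]] = field(default_factory=dict)
    # instance type -> number of instances waiting to be deployed
    pending_instances: Dict[str, int] = field(default_factory=dict)

    def total_ready_nodes(self) -> int:
        return sum(self.ready_nodes.values())

    def total_pending(self) -> int:
        return sum(self.pending_instances.values())


# Example: two ready nodes of one type, one idle, one instance still pending.
state = ClusterState(
    ready_nodes={"4c8g": 2},
    running_instances={"node-1": {"app1": 2}, "node-2": {}},
    pending_instances={"app1": 1},
)
```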
S304: the cluster scaler 102 obtains a prediction of the number of instances of the cluster 20 that need to be deployed in the next cycle from the prediction subsystem 104.
The prediction subsystem 104 may provide the cluster scaler 102 with a prediction of the number of instances that each application in the cluster 20 needs to deploy in the future. The cluster scaler 102 may send a prediction request to the prediction subsystem 104. In response to the prediction request, the prediction subsystem 104 invokes a corresponding model (e.g., an instance number prediction model) to obtain a prediction result, and then returns the prediction result to the cluster scaler 102.
The prediction principle of the instance number prediction model is described below. Referring to the prediction schematic of the instance number prediction model shown in fig. 4, the instance number prediction model includes an encoder and decoders. The prediction subsystem 104 obtains historical data $\{x, z\}$, where $x$ denotes the historical data of the target (i.e., the historical number of instances of each application) and $z$ denotes the historical data of important performance indicators, including CPU utilization, memory utilization, horizontal pod autoscaling (Horizontal Pod Autoscaling, HPA) policies, etc. $z$ may be fed into the encoder as a covariate along with $x$. The encoder groups all sequences into $K$ classes, and each class corresponds to one decoder. The encoder can extract an embedded feature $f_E$, and the embedded feature may be used to select one decoder from the $K$ classes of decoders, for example, the class $i$ decoder.
The encoder comprises a plurality of encoding modules. Each encoding module outputs a predicted value, and the encoder may also output the sum of the predicted values of all the encoding modules. Each encoding module also outputs a reconstruction value of $x$, and the encoder may also output the sum of the reconstruction values of all the encoding modules. From $x$ and $z$ and their reconstruction values, the encoder can obtain the residuals $x_E$ and $z_E$. After the decoder corresponding to the embedded feature $f_E$, such as the class $i$ decoder, is selected from all the decoders, that decoder can obtain a predicted value of the target and a reconstruction value of the target based on the output of the encoder. The predicted value of the target is the prediction result of the prediction subsystem 104 for the number of instances that need to be deployed in the next period.
When training the instance number prediction model, the prediction subsystem 104 may use the data of the past day or the past hour to update the parameters of the instance number prediction model, such as the parameters of the encoder and the decoders, thereby training the instance number prediction model. When training is completed, the prediction subsystem 104 may predict the number of instances to be deployed in the next cycle using the trained instance number prediction model. In particular, the prediction subsystem 104 may input the historical data of the application into the instance number prediction model; the encoder in the model extracts the embedded feature, a decoder is selected based on the embedded feature, and the selected decoder then decodes the output of the encoder to obtain the number of instances that need to be deployed in the next cycle. For example, the prediction subsystem 104 may input the instance numbers from 2021-10-1 07:00 to 2021-10-1 07:59 into the instance number prediction model to predict the instance numbers from 2021-10-1 08:00 to 2021-10-1 08:02.
In this approach, the prediction subsystem 104 may use an unsupervised clustering approach to categorize applications into several types. Applications of the same type can share one instance number prediction model: analyzing the change patterns common to a class enables model reuse, and also makes it possible to provide predictions for a newly added application that has almost no historical data. Different types can use different instance number prediction models, so that differences in change patterns are taken into account and prediction accuracy can be improved.
When predicting the number of instances to be deployed in the future, the prediction subsystem 104 considers other modal information, including running state information such as CPU utilization rate and memory utilization rate, or setting information such as HPA policy, category embedded feature information, and the like, and improves the prediction performance through integration of multi-modal information.
It should be noted that S302 and S304 may be executed in parallel, or may be executed in a set sequence. For example, the cluster scaler 102 may first perform S302 and then S304; as another example, the cluster scaler 102 may first perform S304 and then S302.
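The parallel execution of S302 and S304 can be sketched as follows, assuming each step is wrapped in a function. Both functions here are hypothetical placeholders that return fixed values; a real implementation would query the cluster and the prediction subsystem.

```python
from concurrent.futures import ThreadPoolExecutor


def get_cluster_state():
    # Placeholder for S302: a real implementation would query the cluster.
    return {"ready_nodes": 1, "pending_instances": 0}


def get_prediction():
    # Placeholder for S304: a real implementation would call the prediction subsystem.
    return {"instances_next_period": 4}


def fetch_inputs():
    """Run S302 and S304 concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        state_future = pool.submit(get_cluster_state)
        prediction_future = pool.submit(get_prediction)
        return state_future.result(), prediction_future.result()


state, prediction = fetch_inputs()
```

Since neither step depends on the output of the other, the order of submission does not affect the result.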
S306: the cluster scaler 102 determines the ideal node number of the cluster 20 in the next cycle according to the state of the cluster 20 in the current cycle and the prediction result.
The cluster scaler 102 may provide a hybrid scaling strategy (also referred to as a scaling strategy) that combines real-time states (e.g., the state of the cluster 20 at the current cycle) with predicted results. In this hybrid scaling strategy, node scaling operations may be triggered either by the real-time state of the cluster 20 or by the predicted outcome.
Referring to the node number deduction schematic diagram shown in fig. 5, the cluster scaler 102 creates a cluster snapshot according to the prediction result (denoted as forecast info, including the number of instances to be deployed in the future, such as the number of instances to be deployed in the next period) and the real-time state of the cluster 20 (denoted as cluster info, which may specifically be the state of the cluster 20 in the current period, including the ready nodes, the node types and the number of ready nodes, the scheduled instances running on the nodes, and the types and number of pending instances to be deployed), and performs a bin-packing deduction according to the cluster snapshot to obtain the ideal node number.
Specifically, the cluster scaler 102 may examine the instances to be deployed in the cluster 20 and simulate the deployment of those instances onto the nodes of the cluster 20. If the existing nodes cannot accommodate all of these instances, one node is added to the simulated cluster 20 at a time until the nodes in the cluster 20 can accommodate all the instances to be deployed, which yields the ideal node number. In this case, the ideal node number is greater than the number of nodes of the cluster 20 in the current cycle.
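The simulated deployment described above can be sketched as a first-fit bin-packing loop. This is a simplified illustration under stated assumptions: a single resource dimension (for example, CPU), a single node type for newly added nodes, and a first-fit placement rule. The embodiments do not specify the placement rule, and the function name is hypothetical.

```python
from typing import List


def ideal_node_count(free_capacity: List[float],
                     instance_requests: List[float],
                     new_node_capacity: float) -> int:
    """Simulate first-fit placement; add a simulated node when nothing fits.

    free_capacity: remaining capacity of each existing node.
    instance_requests: resource request of each instance to be deployed.
    new_node_capacity: capacity of a node added during the simulation.
    """
    free = list(free_capacity)  # work on a copy, like a cluster snapshot
    for request in instance_requests:
        for i, capacity in enumerate(free):
            if request <= capacity:
                free[i] = capacity - request
                break
        else:
            # No existing or simulated node can hold this instance.
            if request > new_node_capacity:
                raise ValueError("instance does not fit on a new node")
            free.append(new_node_capacity - request)
    return len(free)


# One node with room for one more instance, three instances to deploy,
# and each new node can hold two instances.
print(ideal_node_count([2.0], [2.0, 2.0, 2.0], 4.0))
```

With these inputs one additional node is simulated, so the sketch returns an ideal node number of 2.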
As shown in fig. 5, after obtaining the information of the instances to be deployed, the cluster scaler 102 may perform a bin-packing deduction for each node pool, so as to obtain the options corresponding to the different node pools. The option corresponding to a node pool includes the information of the node pool, the number of nodes to be created, and the information of all instances that a single node can accommodate. Then, the cluster scaler 102 obtains the best option from the options corresponding to the different node pools according to the capacity expansion policy set by the user (or the default policy if the user has not set a capacity expansion policy).
The capacity expansion policy set by the user may be one or more of a random policy, a policy of deploying the largest number of instances, or a least-waste policy. The random policy refers to randomly selecting one node pool from all known node pools of the cluster 20 and adding one or more nodes to that node pool. The policy of deploying the largest number of instances, denoted as "most-pods", refers to selecting, from all node pools, the node pool that can deploy the largest number of instances for capacity expansion. The least-waste policy, also referred to as the maximum resource utilization policy and denoted as "least-waste", refers to selecting, for capacity expansion, the node pool that has the highest resource utilization (e.g., CPU utilization) after all instances are deployed.
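The selection of the best option under the three policies can be sketched as follows. The `Option` fields and the policy keys ("random", "most_instances", "least_waste") are hypothetical names used only in this sketch, and waste is measured here as the unused fraction of the added CPU capacity, which is one possible reading of resource utilization.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Option:
    node_pool: str
    nodes_to_create: int
    instances_placed: int
    cpu_requested: float  # CPU requested by the instances placed on the new nodes
    cpu_added: float      # total CPU capacity of the new nodes (assumed > 0)


def pick_best_option(options: List[Option], policy: str = "random",
                     rng: Optional[random.Random] = None) -> Option:
    """Pick one option according to the capacity expansion policy."""
    if not options:
        raise ValueError("no options to choose from")
    if policy == "most_instances":
        return max(options, key=lambda o: o.instances_placed)
    if policy == "least_waste":
        # Lowest unused fraction of added CPU, i.e. highest utilization.
        return min(options, key=lambda o: (o.cpu_added - o.cpu_requested) / o.cpu_added)
    if policy == "random":
        return (rng or random).choice(options)
    raise ValueError(f"unknown policy: {policy}")


options = [
    Option("pool-a", nodes_to_create=2, instances_placed=4, cpu_requested=6.0, cpu_added=8.0),
    Option("pool-b", nodes_to_create=1, instances_placed=5, cpu_requested=7.5, cpu_added=16.0),
]
print(pick_best_option(options, "most_instances").node_pool)
print(pick_best_option(options, "least_waste").node_pool)
```

In this example the two policies disagree: pool-b places more instances, while pool-a leaves a smaller share of the added CPU unused.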
The best option may include the information of a node pool (e.g., a target node pool) determined based on the user-configured capacity expansion policy or the default policy, the number of nodes that the node pool needs to create, and the instances that each node should deploy. Because the scaling ranges of different node pools may differ, for example in the number of nodes and the node types that can be provided, the cluster scaler 102 may further check the node creation conditions, for example by checking the number of nodes to be created in the best option so that this number meets the necessary requirements (for example, the maximum number of nodes in the node pool is not exceeded), and thereby obtain a final option. The cluster scaler 102 may record the final option in the capacity expansion information for subsequent capacity expansion.
In some possible implementations, the cluster scaler 102 may further perform a bin-packing deduction on the selected node pool according to the node pool expansion scheme recorded in the final option, to obtain the remaining instances. If the remaining instances are not empty, instances still remain to be deployed after the instance deployment scheme of the final option is executed. The cluster scaler 102 may then update the cluster snapshot to reflect the simulated deployment of the instances recorded in the final option, perform a new round of bin-packing deduction in each node pool for the instances to be deployed that are recorded in the new cluster snapshot, obtain the option corresponding to each node pool, and continue to determine a final option from them, until the remaining instances are empty.
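The repeated rounds of deduction can be sketched as a loop that stops when no instances remain. The sketch makes several simplifying assumptions that are not part of the embodiments: all instances are identical, each node pool is described only by how many instances one node holds and how many nodes it may still create, and the best option in each round is the one that places the most instances. The function name is hypothetical.

```python
import math
from typing import Dict, List, Tuple


def plan_scale_up(pending: int,
                  pools: Dict[str, Tuple[int, int]]) -> Tuple[List[Tuple[str, int]], int]:
    """pools maps pool name -> (instances per node, max nodes that may be created).

    Returns the list of (pool, nodes to create) per round and the number of
    instances that still cannot be placed.
    """
    headroom = {name: max_new for name, (_, max_new) in pools.items()}
    plan: List[Tuple[str, int]] = []
    while pending > 0:
        options = []
        for name, (per_node, _) in pools.items():
            # Clamp the number of nodes to what the pool may still create.
            nodes = min(math.ceil(pending / per_node), headroom[name])
            if nodes > 0:
                options.append((min(pending, nodes * per_node), name, nodes))
        if not options:
            break  # every pool is at its limit; some instances remain
        placed, name, nodes = max(options)  # option placing the most instances
        plan.append((name, nodes))
        headroom[name] -= nodes
        pending -= placed
    return plan, pending


print(plan_scale_up(7, {"pool-a": (2, 2), "pool-b": (3, 1)}))
```

In this example the first round fills pool-a up to its limit and leaves three instances, so a second round is needed and selects pool-b.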
In some possible implementations, cluster scaler 102 may also monitor whether there are underutilized nodes in cluster 20 to determine an ideal number of nodes. Wherein the underutilized node refers to an empty node (empty nodes), or a node on which the CPU utilization is below a preset value and on which the running instance can be migrated to other nodes. In this case, the ideal number of nodes is smaller than the number of nodes of cluster 20 in the current cycle.
S308: the cluster scaler 102 scales the nodes of the cluster 20 in the current cycle according to the ideal node number.
Specifically, the cluster scaler 102 may add nodes to the cluster from the target node pool at the current cycle such that the number of nodes in the cluster reaches the ideal number of nodes. As shown in fig. 5, the cluster scaler 102 truly deploys a corresponding number of nodes of a corresponding type into the cluster 20 in the current period according to the node pool information and the node capacity expansion number (the number of nodes that the node pool needs to create) in the capacity expansion information.
In the case where the ideal node number is less than the number of nodes of the cluster 20 in the current cycle, the cluster scaler 102 may remove the monitored underutilized nodes (e.g., empty nodes or nodes with CPU utilization below a preset value) from the cluster 20. The cluster scaler 102 may obtain a list of underutilized nodes, perform a node check according to the ideal node number, select from the list a number of nodes equal to the difference between the number of nodes in the current cycle and the ideal node number, and remove the selected nodes. If there are instances on an underutilized node that need to be migrated, the cluster scaler 102 may migrate those instances to other nodes before removing the node from the cluster 20.
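The selection of nodes to remove can be sketched as follows. The ordering rule, which prefers nodes with fewer running instances so that fewer migrations are needed, is an assumption of this sketch and is not specified by the embodiments; the function and key names are hypothetical.

```python
from typing import Dict, List


def select_nodes_to_remove(current_nodes: int, ideal_nodes: int,
                           underutilized: List[Dict]) -> List[str]:
    """Pick (current - ideal) nodes from the underutilized list.

    Each entry is {"name": str, "instances": int}. Nodes with fewer running
    instances are preferred (assumption), so empty nodes are removed first.
    """
    surplus = max(0, current_nodes - ideal_nodes)
    ranked = sorted(underutilized, key=lambda node: node["instances"])
    return [node["name"] for node in ranked[:surplus]]


underutilized = [
    {"name": "node-3", "instances": 1},
    {"name": "node-4", "instances": 0},
    {"name": "node-2", "instances": 2},
]
print(select_nodes_to_remove(4, 2, underutilized))
```

If the list contains fewer nodes than the surplus, the sketch simply returns all of them, so the cluster is never reduced by nodes that are not underutilized.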
Based on the above description, the method adds prediction of the number of instances that the cluster 20 needs to deploy on top of the node-layer automatic scaling strategy. Accordingly, the cluster scaler 102 can scale nodes in the current period by combining the real-time state with the prediction result, so that the number of nodes in the cluster 20 meets the demand of the upcoming next period. This improves scaling efficiency and service stability, and thereby improves resource utilization, reduces cost, and maintains or improves service performance.
In the above embodiment, when the deduction result indicates that the number of nodes needed in the next period is greater than the number of nodes existing in the current period, a capacity expansion operation is usually performed. If the result is inaccurate, the newly added nodes are idle or underutilized, which lowers resource utilization. When the deduction result indicates that the number of nodes needed in the next period is smaller than the number of nodes existing in the current period, a capacity reduction operation is usually performed. If the result is inaccurate, the removed nodes are in fact needed by the user, which degrades service performance. Therefore, the embodiment of the application also provides a node scaling method combined with a scaling constraint.
In particular, the cluster scaler 102 may also obtain a scaling constraint that characterizes the constraint conditions on node scaling, which may include an effective time interval and a scaling interval. The effective time interval may be characterized by a start time and an end time. The scaling interval may include at least one of a minimum number of nodes and a maximum number of nodes. The cluster scaler 102 may scale the nodes of the cluster 20 in the current cycle according to the ideal node number and the scaling constraint.
Specifically, when the next period is within the effective time interval, three cases are distinguished. If the number of nodes in the cluster 20 in the current period is smaller than the minimum number of nodes, nodes are added to the cluster 20 in the current period so that the number of nodes in the cluster 20 reaches the minimum number of nodes. If the number of nodes in the cluster 20 in the current period is larger than the maximum number of nodes, nodes are removed from the cluster 20 in the current period so that the number of nodes in the cluster 20 reaches the maximum number of nodes. If the number of nodes in the cluster 20 in the current period is not smaller than the minimum number of nodes and not larger than the maximum number of nodes, nodes are added to or removed from the cluster 20 in the current period so that the number of nodes in the cluster 20 reaches the ideal node number.
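The three cases above can be sketched as a small decision function. The function and parameter names are hypothetical, and the sketch follows the conditions as stated: the comparison uses the current number of nodes, and outside the effective time interval the ideal node number is used directly.

```python
from typing import Optional


def target_node_count(current: int, ideal: int,
                      min_nodes: Optional[int] = None,
                      max_nodes: Optional[int] = None,
                      in_effective_interval: bool = True) -> int:
    """Return the node count the cluster should reach in the current period."""
    if not in_effective_interval:
        return ideal  # the constraint does not apply
    if min_nodes is not None and current < min_nodes:
        return min_nodes  # expand to the lower bound
    if max_nodes is not None and current > max_nodes:
        return max_nodes  # shrink to the upper bound
    return ideal  # within the interval: follow the hybrid scaling result


print(target_node_count(current=1, ideal=5, min_nodes=2, max_nodes=4))
print(target_node_count(current=6, ideal=3, min_nodes=2, max_nodes=4))
print(target_node_count(current=3, ideal=4, min_nodes=2, max_nodes=4))
```

Either bound may be omitted, which matches a scaling interval that includes only a minimum or only a maximum number of nodes.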
The maximum number of nodes or the minimum number of nodes may be determined according to the historical number. Referring to the flow chart of node scaling shown in fig. 6, the cluster scaler 102 may detect, based on the historical data (including the historical number and the historical prediction results), the historical time periods in which the prediction accuracy did not reach the standard. The historical number may include the number of instances of the application deployed during the historical period, or the number of instances of applications of the same class deployed during the historical period. The cluster scaler 102 may perform statistical analysis on these historical time periods to set a planning strategy. Specifically, when the historical time periods are periodic, such as 8:00 to 9:00 every day, the time period may be determined as an effective time interval, and the cluster scaler 102 may set a scaling interval for that time period in the future, thereby formulating the planning strategy. The planning strategy may be characterized by a start time, an end time, and a scaling interval. The scaling interval may be identified by at least one of a minimum number of nodes or a maximum number of nodes. Further, the planning strategy may also specify a node pool, indicating that scaling is performed within the scaling interval of that node pool.
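The detection of periodic time periods with substandard prediction accuracy can be sketched as follows. The record layout, the relative-error threshold and the rule that an hour counts as periodic when it is inaccurate on at least `min_days` distinct days are all assumptions made for this sketch; the embodiments do not specify the statistical analysis.

```python
from collections import defaultdict
from typing import List, Tuple


def periodic_inaccurate_hours(records: List[Tuple[str, int, float, float]],
                              error_threshold: float = 0.2,
                              min_days: int = 2) -> List[int]:
    """records: (day, hour of day, predicted number, actual number).

    Returns the hours of day whose relative prediction error exceeded the
    threshold on at least min_days distinct days.
    """
    bad_days = defaultdict(set)
    for day, hour, predicted, actual in records:
        if actual > 0 and abs(predicted - actual) / actual > error_threshold:
            bad_days[hour].add(day)
    return sorted(hour for hour, days in bad_days.items() if len(days) >= min_days)


records = [
    ("day-1", 8, 2, 4),   # 50% error
    ("day-2", 8, 5, 3),   # about 67% error
    ("day-1", 9, 4, 4),   # accurate
    ("day-2", 14, 1, 2),  # inaccurate on one day only
]
print(periodic_inaccurate_hours(records))
```

Each returned hour could then serve as the start time of an effective time interval for which a scaling interval is configured.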
The cluster scaler 102 scales nodes according to the above planning strategy. Specifically, when the current time is within the effective time interval (between the start time and the end time) defined by the planning strategy, if the number of nodes in the designated node pool is smaller than the minimum number of nodes, the cluster scaler 102 expands the node pool to the minimum number of nodes to guarantee service performance. If the number of nodes in the designated node pool is greater than the maximum number of nodes, the cluster scaler 102 shrinks the node pool to the maximum number of nodes to guarantee resource utilization. If the number of nodes in the designated node pool is not less than the minimum number of nodes and not greater than the maximum number of nodes, the cluster scaler 102 may scale the nodes according to the hybrid scaling strategy, specifically by adding nodes to the cluster 20 or removing nodes from the cluster 20 in the current cycle, so that the number of nodes in the cluster 20 reaches the ideal node number.
It should be noted that, in some possible implementations, the minimum number of nodes or the maximum number of nodes may be set manually. Specifically, the user may configure the minimum number of nodes and/or the maximum number of nodes through a configuration interface, and accordingly the cluster scaler 102 may scale the nodes according to the minimum number of nodes and/or the maximum number of nodes configured by the user. For example, traffic is generally heavy during a store's anniversary sale, and the user may configure the minimum number of nodes and the maximum number of nodes for the anniversary sale period, so that the cluster scaler 102 scales nodes according to these values and meets the traffic demand during the sale.
In order to make the technical scheme of the application clearer and easier to understand, the node telescoping method of the application is described below in combination with a specific scene.
Referring to the schematic diagram of a node capacity expansion scenario shown in fig. 7, in the initial state, the node in the cluster is denoted as node 1, the instance of application 1 running on node 1 can process at most 1000 requests, and the current number of requests is 1000. At this point, the prediction subsystem 104 may predict the number of instances in the future. When the prediction indicates that 4 instances will run on the cluster in the future, node expansion may be triggered directly. If a node can accommodate two instances, the cluster scaler 102 triggers the capacity expansion operation to add a new node (node 2). At this point, the current 1000 requests are still processed normally.
When the same burst traffic scenario is encountered, that is, the number of requests increases from 1000 to 4000, two nodes are already ready, so the 3 new instances added by the HPA can be deployed directly on the two nodes. All 4000 requests are processed, no failed requests are generated, and service availability is guaranteed.
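The arithmetic of this scenario can be checked with a short sketch. The two constants are taken from the scenario (1000 requests per instance, two instances per node); the function names are hypothetical.

```python
import math

REQUESTS_PER_INSTANCE = 1000  # one instance handles at most 1000 requests
INSTANCES_PER_NODE = 2        # one node accommodates two instances


def instances_needed(requests: int) -> int:
    return math.ceil(requests / REQUESTS_PER_INSTANCE)


def nodes_needed(instances: int) -> int:
    return math.ceil(instances / INSTANCES_PER_NODE)


for requests in (1000, 4000):
    instances = instances_needed(requests)
    print(requests, "requests ->", instances, "instances ->", nodes_needed(instances), "nodes")
```

Going from 1000 to 4000 requests raises the requirement from one instance on one node to four instances on two nodes, which is why preparing node 2 in advance lets the three new instances be deployed immediately.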
Referring to a schematic view of a node capacity reduction scenario shown in fig. 8A, there are two nodes in the current cluster, where two instances are deployed on one node (node 1), and the other node (node 2) is in an idle state, and the number of current requests is 2000. The cluster scaler 102 detects that node 2 is in an idle state and predicts that the number of future instances will be reduced to 1, so only 1 node is needed in the future, thus directly moving node 2 out of the cluster.
Referring next to another node scaling scenario shown in fig. 8B, when the cluster scaler 102 detects that node 2 is idle, it obtains at the same time a prediction result for the number of future instances (e.g., 3). The cluster scaler 102 may calculate, according to the prediction result and the real-time state of the cluster, that the ideal node number is 2. This equals the number of nodes of the cluster in the current period, so node 2 is retained. When the number of requests suddenly increases from 2000 to 2500, the HPA adds 1 instance (application 1, instance 3) to handle the 500 additional requests. Since node 2 is idle at this time, the newly added instance can be deployed directly, so that all 2500 requests are processed normally without generating failed requests.
According to the method and the system, the prediction subsystem 104 is added to provide minute-level prediction, and both the real-time traffic and the prediction result are taken into account. This greatly improves the effectiveness of automatic scaling: the cluster is prepared for future traffic while current traffic is still served, and the timing of expansion and reduction is chosen more effectively, so that resource utilization is improved, cost is reduced, and service performance is maintained or improved.
When the traffic fluctuation pattern is not fixed, the prediction accuracy tends to decrease. For this reason, the cluster scaler 102 may also perform node scaling in combination with a scaling constraint. The following description is presented in connection with a specific scenario.
Referring to the node scaling scenario diagram shown in fig. 9, the cluster scaler 102 determines, according to the historical data, the historical time periods in which the prediction accuracy did not reach the standard. If these historical time periods are periodic, a scaling interval, namely an interval defined by the maximum number of nodes and the minimum number of nodes, can be set for the node pool in the corresponding future time period based on the historical data.
When 2 instances are predicted to exist in the future in this period, and node 2 is not only currently idle but will also be idle in the future, the cluster scaler 102 may check the scaling interval of the node pool. The historical data shows that the number of nodes frequently fluctuates between 1 and 2. If the lower limit of the node pool, that is, the minimum number of nodes, is set to 2, then in order to satisfy this lower limit, node 2 is not removed from the cluster but is retained. In this way, when the request volume fluctuates again, the minimum number of nodes of the node pool prevents the service instability that the native automatic scaling algorithm would produce.
The hybrid scaling algorithm provides an adaptive node pool scaling interval strategy based on historical statistical data. By adjusting the maximum number of nodes and the minimum number of nodes of a node pool, it guarantees service stability, so that user cost is saved while user requirements are met and service stability is improved.
Based on the node scaling method provided in the embodiments of the present application, the embodiments of the present application further provide the cluster scaler 102 described above. The cluster scaler 102 provided in the embodiments of the present application is described below with reference to the accompanying drawings.
Referring to the schematic structural diagram of the cluster scaler 102 shown in fig. 10, the cluster scaler 102 includes:
an acquisition module 1022, configured to acquire a state of the cluster 20 in a current period, and acquire a prediction result of the number of instances that the cluster 20 needs to deploy in a next period;
a determining module 1024, configured to determine an ideal number of nodes of the cluster 20 in the next period according to the state of the cluster 20 in the current period and the prediction result;
a scaling module 1026, configured to perform node scaling on the cluster 20 in the current period according to the ideal number of nodes.
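The division of work among the three modules can be sketched as follows. This is a minimal illustration, not the implementation described in this application: the rule for the ideal number of nodes (predicted instances divided by a uniform per-node capacity, rounded up) is an assumption, and `SimpleState`, `determine_ideal_nodes`, and `scaling_delta` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class SimpleState:
    """Simplified state of the cluster in the current period."""
    ready_nodes: int
    upcoming_nodes: int      # nodes about to become ready
    instances_per_node: int  # assumption: every node hosts the same number of instances


def determine_ideal_nodes(state: SimpleState, predicted_instances: int) -> int:
    """Determining step: nodes needed to host the predicted instances."""
    return math.ceil(predicted_instances / state.instances_per_node)


def scaling_delta(state: SimpleState, ideal_nodes: int) -> int:
    """Scaling step: positive means add nodes now, negative means remove."""
    return ideal_nodes - (state.ready_nodes + state.upcoming_nodes)


# Acquisition step: the state and the prediction would come from the cluster
# and from the prediction subsystem; here they are illustrative constants.
state = SimpleState(ready_nodes=2, upcoming_nodes=1, instances_per_node=4)
predicted_instances = 18

ideal = determine_ideal_nodes(state, predicted_instances)
delta = scaling_delta(state, ideal)
```

Counting nodes that are about to become ready avoids requesting capacity that is already on its way.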
In some possible implementations, the scaling module 1026 is specifically configured to:
add nodes to the cluster 20 from a target node pool in the current period, so that the number of nodes in the cluster 20 reaches the ideal number of nodes.
In some possible implementations, the target node pool is determined from a result of simulating instance deployment according to a target policy.
In some possible implementations, the target policy includes one or more of a random policy, a maximum number of deployed instances policy, or a maximum resource utilization policy.
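Choosing a target node pool by simulated deployment might look like the sketch below. It is an assumption-laden illustration: the simulation (first-fit of pending instances onto one new node, by CPU only) and the names `NodePool`, `simulate`, and `choose_target_pool` are hypothetical; only the three policy kinds come from the text.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class NodePool:
    name: str
    cpu_per_node: float  # CPU cores offered by one node of this pool


def simulate(pool: NodePool, pending_cpu: List[float]) -> Tuple[int, float]:
    """First-fit the pending instances onto one new node of the pool.

    Returns (number of instances placed, CPU utilization of the node).
    """
    used, placed = 0.0, 0
    for cpu in pending_cpu:
        if used + cpu <= pool.cpu_per_node:
            used += cpu
            placed += 1
    return placed, used / pool.cpu_per_node


def choose_target_pool(pools: List[NodePool], pending_cpu: List[float],
                       policy: str) -> NodePool:
    if policy == "random":
        return random.choice(pools)
    if policy == "max_instances":      # maximum number of deployed instances
        return max(pools, key=lambda p: simulate(p, pending_cpu)[0])
    if policy == "max_utilization":    # maximum resource utilization
        return max(pools, key=lambda p: simulate(p, pending_cpu)[1])
    raise ValueError(f"unknown policy: {policy}")


pools = [NodePool("small", 4.0), NodePool("large", 16.0)]
pending_cpu = [2.0, 2.0, 3.0]  # CPU requests of the instances to be deployed
```

Here the large pool hosts all three instances on one node, while the small pool fills its node completely with two, so the two non-random policies pick different pools.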
In some possible implementations, the acquisition module 1022 is further configured to:
obtain a scaling constraint;
the scaling module 1026 is specifically configured to:
perform node scaling on the cluster 20 in the current period according to the ideal number of nodes and the scaling constraint.
In some possible implementations, the scaling constraint includes an effective time interval and a scaling interval, the scaling interval includes at least one of a maximum number of nodes and a minimum number of nodes, and the scaling module 1026 is specifically configured to:
when the next period is within the effective time interval: if the number of nodes in the cluster 20 in the current period is smaller than the minimum number of nodes, add nodes to the cluster 20 in the current period so that the number of nodes in the cluster 20 reaches the minimum number of nodes; if the number of nodes in the cluster 20 in the current period is larger than the maximum number of nodes, remove nodes from the cluster 20 in the current period so that the number of nodes in the cluster 20 reaches the maximum number of nodes; and if the number of nodes in the cluster 20 in the current period is not smaller than the minimum number of nodes and not larger than the maximum number of nodes, add nodes to the cluster 20 or remove nodes from the cluster 20 in the current period so that the number of nodes in the cluster 20 reaches the ideal number of nodes.
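The three branches above map directly onto a small function. The sketch follows the wording literally; the behavior outside the effective time interval is not stated in this passage, so falling back to the ideal number of nodes there is an assumption, and `target_node_count` is a hypothetical name.

```python
from typing import Optional


def target_node_count(current: int, ideal: int, in_effective_interval: bool,
                      min_nodes: Optional[int] = None,
                      max_nodes: Optional[int] = None) -> int:
    """Return the node count the cluster should reach in the current period."""
    if in_effective_interval:
        if min_nodes is not None and current < min_nodes:
            return min_nodes   # below the lower limit: add up to the minimum
        if max_nodes is not None and current > max_nodes:
            return max_nodes   # above the upper limit: remove down to the maximum
    # Inside the scaling interval, or (assumption) outside the effective
    # time interval: follow the ideal number of nodes.
    return ideal
```

Either bound may be omitted, which matches a scaling interval that includes only one of the maximum and the minimum number of nodes.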
In some possible implementations, the maximum number of nodes or the minimum number of nodes is determined according to a historical quantity.
In some possible implementations, the historical quantity includes the number of instances of an application deployed over a historical time period, or the number of instances of an application of the same type deployed over a historical time period.
In some possible implementations, the maximum number of nodes or the minimum number of nodes is configured by a user through a configuration interface.
In some possible implementations, the state of the cluster 20 in the current period includes one or more of: the types of ready nodes in the cluster 20 and the number of nodes of each type; the types of nodes about to become ready and the number of nodes of each type; the types and number of instances running on each node; and the types and number of instances to be deployed.
The cluster scaler 102 according to the embodiments of the present application may correspondingly perform the methods described in the embodiments of the present application, and the above and other operations and/or functions of the modules/units of the cluster scaler 102 respectively implement the corresponding flows of the methods in the embodiment shown in fig. 3, which are not repeated here for brevity.
Based on the cluster scaler 102 provided in the embodiments of the present application, the embodiments of the present application further provide the scaling decision system 10 described above. The scaling decision system 10 provided in the embodiments of the present application is described below with reference to the accompanying drawings.
Referring to the schematic architecture diagram of the scaling decision system 10 shown in fig. 1, the scaling decision system 10 includes a cluster scaler 102 and a prediction subsystem 104;
the prediction subsystem 104 is configured to predict the number of instances that the cluster needs to deploy in the next period;
the cluster scaler 102 is configured to acquire a prediction result of the number of instances that the cluster needs to deploy in the next period, acquire a state of the cluster in the current period, determine an ideal number of nodes of the cluster in the next period according to the state of the cluster in the current period and the prediction result, and perform node scaling on the cluster in the current period according to the ideal number of nodes.
The cluster scaler 102 may further perform the method steps corresponding to the various implementations of the embodiment shown in fig. 3, which is not limited in this embodiment.
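One possible in-memory shape for this cluster state is sketched below. The field names and the choice of dictionaries keyed by type are assumptions made for illustration; the text only lists which pieces of information the state may contain.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ClusterState:
    # node type -> number of ready nodes of that type
    ready_nodes: Dict[str, int] = field(default_factory=dict)
    # node type -> number of nodes of that type about to become ready
    upcoming_nodes: Dict[str, int] = field(default_factory=dict)
    # node name -> (instance type -> number of running instances)
    running_instances: Dict[str, Dict[str, int]] = field(default_factory=dict)
    # instance type -> number of instances waiting to be deployed
    pending_instances: Dict[str, int] = field(default_factory=dict)

    def total_nodes(self) -> int:
        """Ready nodes plus nodes about to become ready."""
        return sum(self.ready_nodes.values()) + sum(self.upcoming_nodes.values())

    def total_pending(self) -> int:
        return sum(self.pending_instances.values())


# Illustrative values only; the node and instance type names are made up.
state = ClusterState(
    ready_nodes={"4c8g": 2},
    upcoming_nodes={"8c16g": 1},
    running_instances={"node-1": {"web": 3}, "node-2": {"web": 1}},
    pending_instances={"web": 2, "worker": 1},
)
```

Because every field defaults to an empty mapping, a state that carries only some of the listed items is still valid, in line with the "one or more of" wording.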
Further, the scaling decision system 10 may also include a data collection subsystem 106. The data collection subsystem 106 is configured to collect historical data of applications in the cluster 20, such as the traffic, number of instances, and performance of an application over a historical time period, so as to provide the prediction subsystem 104 with sample data for training or updating the time series model and the instance number prediction model.
In some possible implementations, the scaling decision system 10 may also include a recommendation subsystem 108. The recommendation subsystem 108 is configured to present to a user at least one of the cost change and the performance change caused by node scaling under different scaling policies, so that the user can select an appropriate scaling policy. The different scaling policies may include the hybrid scaling policy of the embodiments of the present application and the native automatic scaling policy, where the hybrid scaling policy is a policy that performs node scaling in the current period according to the ideal number of nodes.
The embodiments of the present application also provide a computer cluster. The computer cluster includes at least one computer, and any one of the at least one computer may come from a cloud environment or an edge environment, or may be a terminal device. The computer cluster is specifically configured to implement the functions of the cluster scaler 102 in the embodiment shown in fig. 10.
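A comparison of the kind the recommendation subsystem presents could be computed as in the sketch below. Everything specific here is an assumption: cost is modeled as node-periods multiplied by a flat price, performance is approximated by the share of periods in which the cluster had fewer nodes than required, and the node counts are illustrative inputs rather than measured results. The names `summarize` and `compare` are hypothetical.

```python
from typing import Dict, List


def summarize(node_counts: List[int], required: List[int],
              price_per_node_period: float) -> Dict[str, float]:
    """Cost and a simple performance proxy for one scaling policy."""
    cost = sum(node_counts) * price_per_node_period
    short = sum(1 for have, need in zip(node_counts, required) if have < need)
    return {"cost": cost, "shortfall_ratio": short / len(required)}


def compare(policies: Dict[str, List[int]], required: List[int],
            price_per_node_period: float) -> Dict[str, Dict[str, float]]:
    """Summaries per policy, ready to be shown to the user side by side."""
    return {name: summarize(counts, required, price_per_node_period)
            for name, counts in policies.items()}


required = [2, 3, 3, 2]            # nodes actually needed in each period
policies = {
    "hybrid": [2, 3, 3, 2],        # scales ahead of the predicted demand
    "native": [2, 2, 3, 3],        # reacts one period late
}
report = compare(policies, required, price_per_node_period=1.0)
```

In this made-up input both policies cost the same, but the reactive one is short of nodes in one of four periods, which is the kind of trade-off a user would weigh when choosing a policy.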
Fig. 11 provides a schematic structural diagram of a computer cluster, and as shown in fig. 11, the computer cluster 110 includes a plurality of computers 1100, and the computers 1100 include a bus 1101, a processor 1102, a communication interface 1103 and a memory 1104. The processor 1102, the memory 1104 and the communication interface 1103 communicate via the bus 1101.
The bus 1101 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses may be classified into address buses, data buses, control buses, and so on. For ease of illustration, only one thick line is shown in fig. 11, but this does not mean that there is only one bus or only one type of bus.
The processor 1102 may be any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
The communication interface 1103 is used for communication with the outside. For example, the communication interface 1103 is configured to obtain a state of the cluster 20 in a current period, obtain a prediction result of the number of instances that need to be deployed by the cluster 20 in a next period, and so on.
The memory 1104 may include volatile memory, such as random access memory (RAM). The memory 1104 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
The memory 1104 stores computer readable instructions, and the processor 1102 executes the computer readable instructions to cause the computer cluster 110 to perform the aforementioned node scaling method (or to implement the functions of the aforementioned cluster scaler 102).
In particular, when the embodiment of the cluster scaler 102 shown in fig. 10 is implemented, and the functions of the modules of the cluster scaler 102 described in fig. 10, such as the acquisition module 1022, the determining module 1024, and the scaling module 1026, are implemented by software, the software or program code required to perform the functions of the modules in fig. 10 may be stored in the at least one memory 1104 in the computer cluster 110. The at least one processor 1102 executes the program code stored in the memory 1104 to cause the computer cluster 110 to perform the node scaling method described above.
The embodiments of the present application also provide a computer-readable storage medium. The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device, such as a data center, that contains one or more usable media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), a semiconductor medium (e.g., a solid state drive), or the like. The computer-readable storage medium includes instructions that instruct a computer or a computer cluster to perform the node scaling method described above.
The embodiments of the present application also provide a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, or data center to another website, computer, or data center in a wired manner (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, or microwave). The computer program product may be a software installation package, which may be downloaded and executed on a computer or a computer cluster when any one of the aforementioned node scaling methods is required.
The descriptions of the processes or structures corresponding to the drawings each have their own emphasis; for parts of a process or structure that are not described in detail, refer to the related descriptions of the other processes or structures.

Claims (24)

1. A node scaling method, the method comprising:
acquiring a state of a cluster in a current period, and acquiring a prediction result of the number of instances that the cluster needs to deploy in a next period;
determining an ideal number of nodes of the cluster in the next period according to the state of the cluster in the current period and the prediction result;
performing node scaling on the cluster in the current period according to the ideal number of nodes.
2. The method of claim 1, wherein performing node scaling on the cluster in the current period according to the ideal number of nodes comprises:
adding nodes from a target node pool to the cluster in the current period, so that the number of nodes in the cluster reaches the ideal number of nodes.
3. The method of claim 2, wherein the target node pool is determined from a result of simulating instance deployment according to a target policy.
4. The method of claim 3, wherein the target policy comprises one or more of a random policy, a maximum number of deployed instances policy, or a maximum resource utilization policy.
5. The method of claim 1, wherein the method further comprises:
obtaining a scaling constraint;
wherein performing node scaling on the cluster in the current period according to the ideal number of nodes comprises:
performing node scaling on the cluster in the current period according to the ideal number of nodes and the scaling constraint.
6. The method of claim 5, wherein the scaling constraint comprises an effective time interval and a scaling interval, the scaling interval comprises at least one of a maximum number of nodes and a minimum number of nodes, and performing node scaling on the cluster in the current period according to the ideal number of nodes and the scaling constraint comprises:
when the next period is within the effective time interval: if the number of nodes in the cluster in the current period is smaller than the minimum number of nodes, adding nodes to the cluster in the current period so that the number of nodes in the cluster reaches the minimum number of nodes; if the number of nodes in the cluster in the current period is larger than the maximum number of nodes, removing nodes from the cluster in the current period so that the number of nodes in the cluster reaches the maximum number of nodes; and if the number of nodes in the cluster in the current period is not smaller than the minimum number of nodes and not larger than the maximum number of nodes, adding nodes to the cluster or removing nodes from the cluster in the current period so that the number of nodes in the cluster reaches the ideal number of nodes.
7. The method of claim 6, wherein the maximum number of nodes or the minimum number of nodes is determined based on a historical number.
8. The method of claim 7, wherein the historical quantity comprises the number of instances of an application deployed over a historical time period, or the number of instances of an application of the same type deployed over a historical time period.
9. The method of claim 6, wherein the maximum number of nodes or the minimum number of nodes is configured by a user through a configuration interface.
10. The method of any one of claims 1 to 9, wherein the state of the cluster in the current period comprises one or more of: the types of ready nodes in the cluster and the number of nodes of each type; the types of nodes about to become ready and the number of nodes of each type; the types and number of instances running on each node; and the types and number of instances to be deployed.
11. A cluster scaler, the cluster scaler comprising:
an acquisition module, configured to acquire a state of a cluster in a current period, and acquire a prediction result of the number of instances that the cluster needs to deploy in a next period;
a determining module, configured to determine an ideal number of nodes of the cluster in the next period according to the state of the cluster in the current period and the prediction result;
a scaling module, configured to perform node scaling on the cluster in the current period according to the ideal number of nodes.
12. The cluster scaler of claim 11, wherein the scaling module is specifically configured to:
add nodes from a target node pool to the cluster in the current period, so that the number of nodes in the cluster reaches the ideal number of nodes.
13. The cluster scaler of claim 12, wherein the target node pool is determined from a result of simulating instance deployment according to a target policy.
14. The cluster scaler of claim 13, wherein the target policy comprises one or more of a random policy, a maximum number of deployed instances policy, or a maximum resource utilization policy.
15. The cluster scaler of claim 11, wherein the acquisition module is further configured to:
obtain a scaling constraint;
and the scaling module is specifically configured to:
perform node scaling on the cluster in the current period according to the ideal number of nodes and the scaling constraint.
16. The cluster scaler of claim 15, wherein the scaling constraint comprises an effective time interval and a scaling interval, the scaling interval comprises at least one of a maximum number of nodes and a minimum number of nodes, and the scaling module is specifically configured to:
when the next period is within the effective time interval: if the number of nodes in the cluster in the current period is smaller than the minimum number of nodes, add nodes to the cluster in the current period so that the number of nodes in the cluster reaches the minimum number of nodes; if the number of nodes in the cluster in the current period is larger than the maximum number of nodes, remove nodes from the cluster in the current period so that the number of nodes in the cluster reaches the maximum number of nodes; and if the number of nodes in the cluster in the current period is not smaller than the minimum number of nodes and not larger than the maximum number of nodes, add nodes to the cluster or remove nodes from the cluster in the current period so that the number of nodes in the cluster reaches the ideal number of nodes.
17. The cluster scaler of claim 16, wherein the maximum number of nodes or the minimum number of nodes is determined according to a historical quantity.
18. The cluster scaler of claim 17, wherein the historical quantity comprises the number of instances of an application deployed over a historical time period, or the number of instances of an application of the same type deployed over a historical time period.
19. The cluster scaler of claim 16, wherein the maximum number of nodes or the minimum number of nodes is configured by a user through a configuration interface.
20. The cluster scaler of any one of claims 11 to 19, wherein the state of the cluster in the current period comprises one or more of: the types of ready nodes in the cluster and the number of nodes of each type; the types of nodes about to become ready and the number of nodes of each type; the types and number of instances running on each node; and the types and number of instances to be deployed.
21. A scaling decision system, the scaling decision system comprising:
a prediction subsystem, configured to predict the number of instances that a cluster needs to deploy in a next period;
a cluster scaler, configured to acquire a prediction result of the number of instances that the cluster needs to deploy in the next period, acquire a state of the cluster in a current period, determine an ideal number of nodes of the cluster in the next period according to the state of the cluster in the current period and the prediction result, and perform node scaling on the cluster in the current period according to the ideal number of nodes.
22. A computer cluster comprising at least one computer, the at least one computer comprising at least one processor and at least one memory, the at least one memory having computer readable instructions stored therein; the at least one processor executing the computer readable instructions to cause the computer cluster to perform the method of any one of claims 1 to 10.
23. A computer-readable storage medium comprising computer-readable instructions; the computer readable instructions are for implementing the method of any one of claims 1 to 10.
24. A computer program product comprising computer readable instructions; the computer readable instructions are for implementing the method of any one of claims 1 to 10.
CN202210743228.4A 2022-06-28 2022-06-28 Node telescoping method and related device Pending CN117349001A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210743228.4A CN117349001A (en) 2022-06-28 2022-06-28 Node telescoping method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210743228.4A CN117349001A (en) 2022-06-28 2022-06-28 Node telescoping method and related device

Publications (1)

Publication Number Publication Date
CN117349001A true CN117349001A (en) 2024-01-05

Family

ID=89356174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210743228.4A Pending CN117349001A (en) 2022-06-28 2022-06-28 Node telescoping method and related device

Country Status (1)

Country Link
CN (1) CN117349001A (en)

Legal Events

Date Code Title Description
PB01 Publication