WO2024021467A1

WO2024021467A1 - Cluster resource planning method, device, apparatus, and medium

Info

Publication number: WO2024021467A1
Application number: PCT/CN2022/141378
Authority: WO
Inventors: 陈赜
Original assignee: 天翼云科技有限公司
Priority date: 2022-07-26
Filing date: 2022-12-23
Publication date: 2024-02-01
Also published as: CN115309501A

Abstract

Disclosed are a cluster resource planning method, a device, an apparatus, and a medium, which are applied to a public cloud platform. The method comprises: after acquiring a first list of components to be deployed, sorting the first list of components according to a first preset priority and a second preset priority, to obtain a list of instances to be scheduled; for each instance in a module, selecting a host node from a host node list according to a first preset rule, to obtain a preselected host node list corresponding to the current instance; after selecting an optimal host node from the preselected host node list according to the preselected host node list and a second preset rule, binding the optimal host node to the current instance, to obtain a binding relationship; deploying to the host node the instance corresponding to the host node in the binding relationship. The present invention can improve the efficiency of resource planning by implementing automatic deployment of cluster resources under the public cloud platform.

Description

A cluster resource planning method, equipment, device and medium

Technical field

The present invention relates to the field of computer technology, and in particular to a cluster resource planning method, equipment, device and medium.

Background art

With the advent of the cloud computing era, various big data components have been widely used with the help of public cloud platforms. In particular, public cloud platforms can be used to deploy multi-component clusters.

Currently, when deploying multiple components of a cluster based on a public cloud platform, relevant experts are required to intervene and use expert experience to deploy instances for cluster nodes.

Technical problem

However, since expert experience relies on the experts themselves, experts are required to manually process work orders, so the deployment efficiency is low.

Technical solutions

The present invention provides a cluster resource planning method, equipment, device and medium to solve the problem of low efficiency of cluster resource planning existing in the prior art.

In a first aspect, embodiments of the present invention provide a cluster resource planning method applied to a public cloud platform, including:

After obtaining the host node list and the first component list that needs to be deployed, sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled;

For each instance in each module in the to-be-scheduled instance list, select a host node in the host node list according to the first preset rule to obtain a pre-selected host node list corresponding to the current instance;

According to the second preset rule, after selecting the optimal host node from the preselected host node list, bind the optimal host node to the current instance to obtain a binding relationship;

Deploy the instance corresponding to the host node in the binding relationship to the host node.

In the cluster resource planning method provided by an embodiment of the present invention, the obtained first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled. For each instance in each module in the list of instances to be scheduled, a preselected host node list is determined according to the first preset rule, the optimal host node to be bound to the current instance is selected from the preselected host node list according to the second preset rule to obtain a binding relationship, and the instance corresponding to each host node is deployed according to the binding relationship. Since this cluster resource planning method automatically allocates and deploys the instances of multiple components to suitable host nodes, it can improve the efficiency of multi-component cluster resource planning based on public cloud platforms.

In an optional implementation, the first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled, including:

Arrange each component in the first component list in descending order according to the first preset priority to obtain a second component list;

Arrange the modules corresponding to each component in the second component list in descending order according to the second preset priority to obtain a component module list corresponding to the current component;

For each module in the component module list, create an instance for the module according to the attribute information of the number of instances corresponding to the module, to obtain the list of instances to be scheduled.

In the above method, the components in the first component list are arranged in descending order according to the first preset priority, the modules corresponding to each component are arranged in descending order according to the second preset priority, and an instance is created for each module according to the attribute information of the number of instances corresponding to the module, to obtain a list of instances to be scheduled. Sorting each component and each module corresponding to the component by priority ensures that the affinity dependency strategy of the deployed instances is parsed correctly, which improves the quality of the allocation results.

In an optional implementation, arranging each component in the first component list in descending order according to the first preset priority includes:

Traverse the first component list;

Sort each component in the first component list according to priority value from small to large.

As above, arranging each component in the first component list in descending order according to the first preset priority means sorting the components by priority value from small to large. The smaller the priority value, the higher the priority of the corresponding component, so the components in the first component list end up ordered from high to low priority. Prioritizing the components in this way allows each component to be traversed one by one in the subsequent process, which optimizes the allocation of host nodes to the instances in each component.
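The ordering described above can be illustrated with a minimal sketch. It is not part of the original disclosure: the class name `Component`, the field `priority_value`, and the function `sort_components` are hypothetical names chosen for illustration, and the sample priority values are invented.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    priority_value: int  # positive integer; a smaller value means a higher priority


def sort_components(first_component_list):
    """Return the second component list, ordered from high to low priority.

    Sorting by ascending priority value puts the highest-priority component
    first. sorted() is stable, so components with equal values keep their
    original relative order.
    """
    return sorted(first_component_list, key=lambda c: c.priority_value)


# Illustrative input only; the values are not taken from the source.
components1 = [Component("B1", 3), Component("B2", 1), Component("B3", 2)]
components2 = sort_components(components1)
print([c.name for c in components2])
```

The same pattern could be applied a second time to the modules of each component, using the second preset priority as the sort key.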

In an optional implementation, the first preset rule includes:

The current host node satisfies the strong affinity rule;

Moreover, the current host node satisfies the strong anti-affinity rule;

Moreover, the available resources of the current host node meet the deployment requirements of the current instance.

The above method selects a host node in the host node list according to the first preset rule, that is, it determines whether the current host node satisfies the strong affinity rule, whether the current host node satisfies the strong anti-affinity rule, and whether the available resources of the current host node are sufficient. If the current node meets all three conditions at the same time, the node is used as a preselected host node and added to the preselected host node list corresponding to the current instance. If the current node fails any one of the three conditions, the next host node is evaluated. This pre-screening initially determines the preselected host nodes available for the current instance, which narrows the selection range and improves the efficiency of node allocation.
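A rough sketch of this three-condition pre-screening follows. It is only an illustration under stated assumptions: the `Host` and `Instance` classes, their fields, and the helper functions are hypothetical, resources are reduced to CPU and memory, and strong affinity is simplified to "every required partner instance is already on the host".

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    free_cpu: int
    free_mem_gb: int
    deployed: set = field(default_factory=set)  # names of instances already on this host


@dataclass
class Instance:
    name: str
    cpu: int
    mem_gb: int
    must_join: set = field(default_factory=set)   # strong affinity partners (assumed model)
    must_avoid: set = field(default_factory=set)  # strong anti-affinity partners (assumed model)


def passes_affinity(host, inst):
    # Assumption: all strong affinity partners must already be on the host.
    return inst.must_join <= host.deployed


def passes_anti_affinity(host, inst):
    # The host must not carry any instance that is mutually exclusive with inst.
    return not (inst.must_avoid & host.deployed)


def has_resources(host, inst):
    return host.free_cpu >= inst.cpu and host.free_mem_gb >= inst.mem_gb


def preselect_hosts(hosts, inst):
    """Return the preselected host node list for one instance."""
    preselected = []  # initialized fresh for every instance
    for host in hosts:
        if (passes_affinity(host, inst)
                and passes_anti_affinity(host, inst)
                and has_resources(host, inst)):
            preselected.append(host)
        # Otherwise the host is skipped and the next one is examined.
    return preselected


hosts = [
    Host("A1", free_cpu=4, free_mem_gb=8, deployed={"C21"}),
    Host("A2", free_cpu=8, free_mem_gb=16, deployed={"C16"}),
    Host("A3", free_cpu=1, free_mem_gb=2, deployed={"C21"}),
]
c11 = Instance("C11", cpu=2, mem_gb=4, must_join={"C21"}, must_avoid={"C16"})
print([h.name for h in preselect_hosts(hosts, c11)])
```

In this invented data set, A2 fails the affinity and anti-affinity checks and A3 lacks resources, so only A1 is preselected.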

In an optional implementation, selecting the optimal host node from the host node list according to the preselected host node list and the second preset rule includes:

For each host node corresponding to each instance, determine the score of the host node corresponding to the instance according to the preset corresponding relationship between the host node, the instance and the score;

The scores of each host node corresponding to each instance are compared, and the host node with the highest score is regarded as the optimal host node.

The above method determines the score of the host node corresponding to the instance based on the preset corresponding relationship between the host node, instance and score, and compares the scores of the host nodes, thereby selecting the host node with the highest score as the optimal host node. Because the scoring rules for the host nodes corresponding to an instance are flexible, the method can be adapted to different types of host nodes, which improves its universality.

In an optional implementation, if the host nodes with the highest scores include at least two, the method further includes:

The idle resources of the at least two host nodes with the highest scores are compared, and the host node with the most idle resources is used as the optimal host node.

In the above method, if at least two host nodes share the highest score, the idle resources of those host nodes are compared, and the host node with the most idle resources is used as the optimal host node. This two-level rule, highest score first and most idle resources second, further ensures the quality of the resource allocation results.
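The score comparison and the idle-resource tie-break can be sketched as follows. This is a simplified illustration rather than the disclosed implementation: the function name `select_optimal_host` is hypothetical, and scores and idle resources are assumed to be available as plain dictionaries keyed by host name, with idle resources collapsed to a single number.

```python
def select_optimal_host(preselected, scores, idle_resources):
    """Pick the optimal host from the preselected host node list.

    preselected    -- list of host names
    scores         -- dict: host name -> score for the current instance
    idle_resources -- dict: host name -> idle resources (single number, assumed)
    """
    if not preselected:
        return None  # no candidate survived pre-screening

    best_score = max(scores[h] for h in preselected)
    top = [h for h in preselected if scores[h] == best_score]

    if len(top) == 1:
        return top[0]
    # At least two hosts share the highest score: prefer the most idle one.
    return max(top, key=lambda h: idle_resources[h])


# Illustrative numbers only.
best = select_optimal_host(
    ["A1", "A2", "A3"],
    scores={"A1": 80, "A2": 90, "A3": 90},
    idle_resources={"A1": 10, "A2": 4, "A3": 6},
)
print(best)
```

Here A2 and A3 tie on score, and A3 is chosen because it has more idle resources.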

In an optional implementation, before selecting a host node in the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance, the method further includes:

Traverse the list of instances to be scheduled;

For each instance in the to-be-scheduled instance list, a preselected host node list corresponding to the current instance is initialized.

In the above method, for each instance in the instance list to be scheduled, a preselected host node list corresponding to the current instance needs to be initialized to ensure the accuracy of the preselected host list obtained according to the first preset rule.

In a second aspect, embodiments of the present invention provide cluster resource planning equipment, which is applied to a public cloud platform and includes a memory and a processor. The memory stores a computer program, and when the processor executes the computer program, the steps of the cluster resource planning method described in any one of the above embodiments are implemented.

In a third aspect, embodiments of the present invention provide a cluster resource planning device applied to a public cloud platform, including:

A priority sorting module, used to obtain the host node list and the first component list that needs to be deployed, and sort the first component list according to the first preset priority and the second preset priority to obtain the list of instances to be scheduled;

A rule verification module, configured to select a host node in the host node list according to the first preset rule for each instance in each module in the to-be-scheduled instance list to obtain a preselected host node list corresponding to the current instance;

A resource optimization module, configured to select an optimal host node from the preselected host node list according to the second preset rule, and then bind the optimal host node to the current instance to obtain a binding relationship;

A deployment module is used to deploy the instance corresponding to the host node in the binding relationship to the host node.

In an optional implementation, the priority sorting module is specifically used to:

Traverse the first component list;

In an optional implementation, the first preset rule includes:

The current host node satisfies the strong affinity rule;

Moreover, the current host node satisfies the strong anti-affinity rule;

In an optional implementation, the resource optimization module is specifically used to:

In an optional implementation, if the host nodes with the highest scores include at least two, the resource optimization module is also used to:

In an optional implementation, the rule checking module is also used to:

Traverse the list of instances to be scheduled;

In a fourth aspect, embodiments of the present invention provide a computer storage medium. The computer storage medium stores computer instructions which, when run on a computer, cause the computer to execute the steps of the cluster resource planning method described in any of the above embodiments.

For the technical effects that may be achieved by the cluster resource planning equipment disclosed in the second aspect, the cluster resource planning device disclosed in the third aspect, and the computer storage medium disclosed in the fourth aspect, please refer to the above description of the technical effects achievable by the first aspect or by the various possible solutions of the first aspect, which will not be repeated here.

Description of drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the following will briefly introduce the drawings needed to describe the embodiments. Obviously, the drawings in the following description are only some embodiments of the present invention. Those of ordinary skill in the art can also obtain other drawings based on these drawings without exerting any creative effort.

Figure 1 is a schematic flow chart of a cluster resource planning method provided by an embodiment of the present invention;

Figure 2 is a schematic flow chart of another cluster resource planning method provided by an embodiment of the present invention;

Figure 3 is a schematic flow chart of an affinity rule checking method provided by an embodiment of the present invention;

Figure 4 is a schematic flow chart of a method for determining available resources of a host node provided by an embodiment of the present invention;

Figure 5 is a schematic flowchart of host node resource optimization provided by an embodiment of the present invention;

Figure 6 is a schematic module structure diagram of a cluster resource planning device provided by an embodiment of the present invention;

Figure 7 is a schematic structural diagram of a cluster resource planning device provided by an embodiment of the present invention;

Figure 8 is a schematic diagram of a program product of a cluster resource planning method provided by an embodiment of the present invention.

Embodiments of the invention

In order to make the purpose, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection of the present invention.

It should be noted that the terms "first", "second", etc. in the description and claims of the present invention and the above-mentioned drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances so that the embodiments of the invention described herein are capable of being practiced in sequences other than those illustrated or described herein. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the appended claims.

Currently, when deploying multiple components to a customer cluster based on a public cloud platform, relevant experts are required to intervene. Expert experience is used to determine the compatibility and mutual exclusivity of different components on the cluster nodes, confirm the resource requirements of the instances, and, balancing the deployment as much as possible, pre-allocate the different instances of different components across multiple host nodes. However, since this approach requires experts to process work orders manually, the work orders cannot be processed automatically, so as business volume increases the creation of public cloud clusters is limited by the processing efficiency of the experts. In addition, even with manual intervention by experts, the quality of the resource planning and allocation results is difficult to guarantee.

In order to solve the above problems, embodiments of the present invention provide a cluster resource planning method, equipment, device and medium to improve the efficiency of cluster resource planning.

Example 1

The following describes a cluster resource planning method provided by the present invention through specific embodiments. This method is applied to a public cloud platform, as shown in Figure 1, and includes:

Step 101: After obtaining the host node list and the first component list that needs to be deployed, sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled;

Step 102: For each instance in each module in the instance list to be scheduled, select a host node in the host node list according to the first preset rule to obtain a pre-selected host node list corresponding to the current instance;

Step 103: According to the second preset rule, after selecting the optimal host node from the preselected host node list, bind the optimal host node to the current instance to obtain the binding relationship;

Step 104: Deploy the instance corresponding to the host node in the binding relationship to the host node.

It should be noted that the cluster resource planning method provided by the embodiment of the present invention can be applied to cloud hosts, network-side devices, and GPU (Graphics Processing Unit) computing devices, and can also be applied to terminals. The application scenarios of this cluster resource planning method are not specifically limited here.

Embodiments of the present invention provide a cluster resource planning method. The obtained first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled. For each instance in each module in the list of instances to be scheduled, the preselected host node list is determined according to the first preset rule, the optimal host node to be bound to the current instance is selected from the preselected host node list according to the second preset rule to obtain the binding relationship, and the instance corresponding to each host node is deployed according to the binding relationship. This cluster resource planning method therefore automatically allocates and deploys the instances of multiple components to suitable host nodes, which improves the efficiency of multi-component cluster resource planning based on public cloud platforms and also ensures the quality of the allocation results.

In addition, this method can be widely adapted to various resource allocation scenarios and has good versatility.

As an optional implementation, to sort the first component list according to the first preset priority and the second preset priority and obtain a list of instances to be scheduled, the components in the first component list can first be arranged in descending order according to the first preset priority to obtain a second component list. The modules corresponding to each component in the second component list are then arranged in descending order according to the second preset priority to obtain a component module list corresponding to the current component. Finally, for each module in the component module list, an instance is created for the module according to the attribute information of the number of instances corresponding to the module, to obtain a list of instances to be scheduled.

In specific implementation, Figure 2 shows an overall flow chart of a cluster resource planning method provided by the present invention. Referring to 21 in Figure 2, the specific process of sorting the first component list according to the first preset priority and the second preset priority to obtain the list of instances to be scheduled includes the following steps:

Step 201: Obtain the host node list Hosts and obtain the first component list Components1 that needs to be deployed;

Specifically, the host node list Hosts includes all host nodes owned by the cluster.

Step 202: Sort the first component list according to the first preset priority to obtain the second component list Components2;

Specifically, the first component list Components1 is traversed, and each component in the first component list Components1 is sorted from small to large according to its priority value, where the priority value of a component is the preconfigured piriority attribute of the component, and piriority is a positive integer. The smaller the value of piriority, the higher the priority of the corresponding component. That is to say, after the first component list Components1 that needs to be deployed is obtained, it is traversed and its components are sorted by piriority value from small to large to obtain the second component list Components2, in which the components are arranged from high to low priority.

Step 203: Traverse the second component list and sort the modules corresponding to each component according to the second preset priority;

Specifically, the modules corresponding to each component in the second component list Components2 are arranged in descending order according to the second preset priority to obtain a component module list corresponding to the current component, where the second preset priority is the preset priority of the module. The order of the modules included in a component is defined by the internal structure of the corresponding component, and no additional adjustments are made.

Step 204: Determine whether the current module is a dynamic module. If so, set the dynamic flag true for the current module. Otherwise, perform step 205;

Specifically, modules can be divided into dynamic modules and static modules according to their types. The attribute information of a module is set in advance through its replica value. If the replica value of the current module is a magic number, for example 999999 or 999998, the current module is a dynamic module; if the replica value of the current module is not a magic number, for example 3, the current module is a static module.

Step 205: Create a corresponding instance for each module according to the attribute information of the number of instances corresponding to the module.

For example, if the current module is a static module and the number of instances of the module is 3, three instances a1, a2, and a3 are created and added to the list of instances to be scheduled; if the current module is a dynamic module, the corresponding instances are created according to the number of host nodes and added to the list of instances to be scheduled.

Specifically, for each module in the component module list, an instance is created for the module according to the attribute information of the number of instances corresponding to the module to obtain the component module instance list, that is, the number of instances generated for the module is determined by the replica value corresponding to the module. If the replica value of the current module is a magic number, the current module is a dynamic module: instances are created according to the given number of host nodes and added to the list of instances to be scheduled. For example, if the replica value of the current module is 999999, the module is deployed on all host nodes; if the replica value of the current module is 999998, the module is deployed on two-thirds of the host nodes. If the replica value of the current module is not a magic number, the current module is a static module, and instances are created based on the replica value of the current module. For example, if the replica value of the current module is 3, the current module contains 3 instances, so 3 instances are created and added to the list of instances to be scheduled.
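The mapping from replica value to instance count can be sketched as below. This is an illustrative sketch only: the constant and function names are hypothetical, the instance naming scheme is invented, and rounding two-thirds of the host count upward is an assumption, since the text does not say how a fractional result is handled.

```python
ALL_HOSTS = 999999         # magic number: deploy the module on all host nodes
TWO_THIRDS_HOSTS = 999998  # magic number: deploy the module on two-thirds of the host nodes
MAGIC_NUMBERS = {ALL_HOSTS, TWO_THIRDS_HOSTS}


def is_dynamic(replica):
    """A module is dynamic when its replica value is a magic number."""
    return replica in MAGIC_NUMBERS


def instance_count(replica, host_count):
    """Number of instances to create for a module."""
    if replica == ALL_HOSTS:
        return host_count
    if replica == TWO_THIRDS_HOSTS:
        # Assumption: round up. Integer arithmetic avoids float rounding issues.
        return (2 * host_count + 2) // 3
    return replica  # static module: the replica value is the instance count


def create_instances(module_name, replica, host_count):
    """Create named instances for one module (the naming scheme is illustrative)."""
    count = instance_count(replica, host_count)
    return [f"{module_name}-{i}" for i in range(1, count + 1)]


to_be_scheduled = []
to_be_scheduled += create_instances("B11", 2, host_count=5)          # static module
to_be_scheduled += create_instances("B12", ALL_HOSTS, host_count=5)  # dynamic module
print(to_be_scheduled)
```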

For example, assume that the cluster has 5 host nodes and that 3 different big data components need to be allocated to it. The 5 host nodes are A1, A2, ..., A5, and the 3 components are B1, B2, and B3. Component B1 contains 3 modules: B11, B12, and B13; component B2 contains 2 modules: B21 and B22; component B3 contains 4 modules: B31, B32, B33, and B34. Modules B12 and B33 are dynamic modules, the remaining modules are static modules, and each static module contains 2 instances. The above-mentioned first component list, second component list, component module list and to-be-scheduled instance list are shown in Table 1.

Table 1

In the embodiment of the present invention, the components in the first component list are arranged in descending order according to the first preset priority, the modules corresponding to each component are arranged in descending order according to the second preset priority, and an instance is created for each module according to the attribute information of the number of instances corresponding to the module, thereby obtaining a list of instances to be scheduled. Sorting each component and each module corresponding to the component by priority ensures that the affinity dependency strategy of the deployed instances is parsed correctly, which improves the quality of the allocation results.

As an optional implementation, the first preset rule may include: the current host node satisfies the strong affinity rule; and the current host node satisfies the strong anti-affinity rule; and the available resources of the current host node satisfy the deployment requirements of the current instance.

It should be noted that, given a host node A1, a component B1, and a component B2: if, once instance C11 in component B1 is deployed to host node A1, host node A1 must also deploy instance C21 in component B2, then instance C11 and instance C21 comply with the strong affinity rule. That is to say, instance C11 in component B1 and instance C21 in component B2 have a binding relationship when deployed on host node A1. If, once instance C11 in component B1 is deployed to host node A1, host node A1 can no longer deploy instance C21 in component B2, then instance C11 and instance C21 comply with the strong anti-affinity rule. That is to say, instance C11 in component B1 and instance C21 in component B2 have a mutually exclusive relationship when deployed on host node A1. The above strong affinity rules and strong anti-affinity rules are predefined and can be reflected in the affinity rule list.

In specific implementation, referring to 22 in Figure 2, the specific process of selecting a host node in the host node list according to the first preset rule and obtaining the preselected host node list corresponding to the current instance includes the following steps:

Step 206: Traverse the list of instances to be scheduled and initialize the preselected host node list PreChooseHosts corresponding to the current instance;

Specifically, for each instance in the instance list to be scheduled, a preselected host node list corresponding to the current instance needs to be initialized to ensure the accuracy of the preselected host list obtained according to the first preset rule.

Step 207: Traverse the host node list Hosts, checking each host node against the affinity rules and the hard limits on CPU, memory and disk performance;

Step 208: Determine whether the current host node satisfies the strong affinity rule. If so, execute step 209; otherwise, return to step 207;

Specifically, the instances that are allowed to be deployed on the current host node can first be selected from the list of instances to be scheduled. Then, for each instance that is allowed to be deployed, if the current instance has a binding relationship with that instance, that is, the strong affinity rule is satisfied, the current instance may be deployed to the current host node. If the current instance does not satisfy the strong affinity rule, it cannot be deployed on the current host node, and the next host node in the host node list is traversed.

Step 209: Determine whether the current host node satisfies the strong anti-affinity rule. If so, execute step 210; otherwise, return to step 207;

Specifically, the instances that are allowed to be deployed on the current host node are first selected from the list of instances to be scheduled. Then, for each instance that is allowed to be deployed, if the current instance and that instance are not mutually exclusive, that is, the pair does not fall under the strong anti-affinity rule, the current instance may be deployed to the current host node. If the current instance and that instance do fall under the strong anti-affinity rule, the current instance cannot be deployed to the current host node, and the next host node in the host node list is traversed.

Specifically, Figure 3 shows a flow chart of affinity rule verification, which includes the following steps:

Step 301: Input the affinity rule list AffinityList and obtain the current host node list Hosts;

Specifically, the rule expression to be verified in the affinity rule list AffinityList includes strong affinity rules and strong anti-affinity rules.

Step 302: Traverse the affinity rule list AffinityList to obtain the current rule expression to be verified;

Here it is judged whether the current rule to be verified is a strong affinity rule or a strong anti-affinity rule.

Step 303: Obtain the attribute value X corresponding to the node according to the node attribute key specified by the current rule expression to be verified;

Specifically, if the current rule to be verified is a strong anti-affinity rule, and the node attribute specified by the strong anti-affinity rule is a module, the attribute value X1 of the current host node is obtained, where X1 is the list of all instances that are allowed to be deployed on the current node. If the current rule to be verified is a strong affinity rule, and the node attribute specified by the strong affinity rule is a module, the attribute value X2 of the current host node is obtained, where X2 is the list of all instances that are allowed to be deployed on the current node.

Step 304: Call different operation logic according to the operation type operator specified by the current rule expression to be verified;

Specifically, if the current rule to be verified is a strong anti-affinity rule, and the node attribute specified by the strong anti-affinity rule is a module, the attribute value X1 of the current host node is obtained and the IN operation logic is called, then step 305 is executed. If the current rule to be verified is a strong affinity rule, and the node attribute specified by the strong affinity rule is a module, the attribute value X2 of the current host node is obtained and the EXIST operation logic is called, then step 307 is executed.

Step 305: Determine which operation logic is called. If the IN operation logic is called, step 306 is executed. If the NOTIN operation logic is called, step 307 is executed. If the EXIST operation logic is called, step 308 is executed. If the NOTEXIST operation logic is called, step 309 is executed;

It should be noted that the four operation logics IN, NOTIN, EXIST, and NOTEXIST are independent of which affinity rule is selected. They only indicate whether the current rule to be verified is triggered; that is, each affinity rule can call any of the four operation logics.

Step 306: Traverse X and the adaptation values specified by the current rule expression to be verified. If there is a match, return the value true; otherwise, return the value false;

For example, if the current rule to be verified is a strong anti-affinity rule, suppose the values of values are C16 and C33. This means that the strong anti-affinity rule holds between C35 and C16, so the two instances cannot be deployed on the same host node at the same time, and that it also holds between C35 and C33, so those two instances cannot be deployed on the same host node either. Since there is a match between X and values, that is, both contain C16, the value true is returned and the strong anti-affinity rule is triggered.

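Step 307: Traverse X and the adaptation values specified by the current rule expression to be verified. If there is no match, return the value true; otherwise, return the value false;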

For example, if the current rule to be verified is a strong affinity rule, suppose the values of values are C14 and C33. This means that C35 and C14 satisfy the strong affinity rule, so the two instances must be deployed on the same host node, and that C35 and C33 also satisfy it, so those two instances must be deployed on the same host node as well. Since there is no match between X and values, that is, they share no instance, the value true is returned and the strong affinity rule is not triggered.

Step 308: Traverse X. If it is not empty, return the value true; otherwise, return the value false;

For example, if the current rule to be verified is a strong anti-affinity rule, let X be the list of instances allowed to be deployed on host node A1 corresponding to instance C35. Because X is not empty, the value true is returned.

Step 309: Traverse X. If it is empty, return the value true; otherwise, return the value false;

For example, if the current rule to be verified is a strong anti-affinity rule, let X be the list of instances allowed to be deployed on host node A1 corresponding to instance C35. If X is empty, the value true is returned.
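The dispatch in steps 303 to 309 can be illustrated with a minimal Python sketch. The four operator names come from the description above; the function name `evaluate_rule`, the dictionary layout of a host node, and the sample instance names on host node A1 are assumptions made only for illustration. NOTIN is taken to be the complement of IN, as the strong affinity example suggests.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical operator table: the operator names are from the description,
# the function shapes are illustrative assumptions.
OPERATORS: Dict[str, Callable[[Sequence[str], Sequence[str]], bool]] = {
    # Step 306: true when at least one element of X also appears in values.
    "IN": lambda x, values: any(item in values for item in x),
    # Step 307 (assumed complement of IN): true when X and values share nothing.
    "NOTIN": lambda x, values: not any(item in values for item in x),
    # Step 308: true when X is not empty.
    "EXIST": lambda x, values: len(x) > 0,
    # Step 309: true when X is empty.
    "NOTEXIST": lambda x, values: len(x) == 0,
}


def evaluate_rule(node_attrs: Dict[str, List[str]], key: str,
                  operator: str, values: Sequence[str] = ()) -> bool:
    """Steps 303-305: look up X by attribute key, then dispatch on the operator."""
    x = node_attrs.get(key, [])
    return OPERATORS[operator](x, values)


# Assumed state of host node A1: instances C16 and C20 under the "module" key.
a1 = {"module": ["C16", "C20"]}
print(evaluate_rule(a1, "module", "IN", ["C16", "C33"]))
print(evaluate_rule(a1, "module", "NOTIN", ["C14", "C33"]))
print(evaluate_rule(a1, "module", "EXIST"))
print(evaluate_rule(a1, "module", "NOTEXIST"))
```

Keeping the operators in a lookup table reflects the note that any affinity rule may call any of the four operation logics: the rule type and the operator are resolved independently.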

Step 210: Determine whether the available resources of the current host node are sufficient. If so, execute step 211. Otherwise, return to step 207;

Specifically, select from the list of instances to be scheduled the instances that are allowed to be deployed on the current host node. For each such instance, if the available resources of the current host node meet the deployment requirements of the current instance, the current instance is allowed to be deployed on the current host node; if they do not, the current instance cannot be deployed on the current host node, and the next host node in the host node list is traversed.

Specifically, whether the available resources of the current host node meet the deployment requirements of the current instance can be determined along three dimensions: central processing unit (CPU), memory, and disk. If the available resources of the current host node meet the deployment requirements of the current instance in all three dimensions, the current host node is added to the preselected host node list; if any of the three dimensions does not meet the deployment requirements of the current instance, the next host node in the host node list is evaluated.

As shown in Figure 4, the specific flow chart for determining whether the available resources of the current host node are sufficient includes the following steps:

Step 401: Obtain the available resources of the current host node and obtain the preset resource requirements of the current instance;

Step 402: Determine whether the available CPU of the current host node is greater than the required CPU of the current instance. If so, execute step 403; otherwise, execute step 406;

Step 403: Determine whether the available memory of the current host node is greater than the required memory of the current instance. If so, execute step 404; otherwise, execute step 406;

Step 404: Determine whether the available disk of the current host node is greater than the required disk of the current instance. If so, perform step 405; otherwise, perform step 406;

Step 405: The available resources of the current host node are sufficient, and the current host node is added to the preselected host node list PreChooseHosts;

Step 406: The available resources of the current host node are insufficient, and the current host node is not added to the preselected host node list PreChooseHosts.
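Steps 401 to 406 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the `Resources` type, its field units, and the function name are assumptions. The strict "greater than" comparison follows the wording of steps 402 to 404.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Resources:
    cpu: float     # cores (unit is an assumption)
    memory: float  # GiB (unit is an assumption)
    disk: float    # GiB (unit is an assumption)


def has_sufficient_resources(available: Resources, required: Resources) -> bool:
    """Steps 402-404: every dimension must strictly exceed the requirement."""
    if not available.cpu > required.cpu:        # step 402
        return False
    if not available.memory > required.memory:  # step 403
        return False
    if not available.disk > required.disk:      # step 404
        return False
    return True


# Steps 405/406: only sufficient host nodes enter PreChooseHosts.
hosts = {"A1": Resources(8, 32, 500), "A2": Resources(8, 8, 500)}
need = Resources(4, 16, 100)
pre_choose_hosts: List[str] = [
    name for name, free in hosts.items() if has_sufficient_resources(free, need)
]
print(pre_choose_hosts)
```

Checking the dimensions one after another mirrors the flow chart: the first failing dimension sends the flow directly to step 406.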

It should be noted that in different types of clusters, the weight proportions of the three dimensions of CPU, memory, and disk can differ. For example, for computing clusters, the weight proportion of CPU and memory can be increased when preselecting host nodes, so that host nodes with more CPU and/or more memory are preferred; for storage clusters, the weight proportion of disk can be increased when preselecting host nodes, so that host nodes with larger disks are preferred. By configuring different weight proportions for different types of clusters, better host nodes can be obtained.

Specifically, first select from the list of instances to be scheduled the instances that are allowed to be deployed on the current host node. Then, for each such instance, if the current instance has a binding relationship with the instances allowed to be deployed, complies with the principle of non-mutual exclusion, and the available resources of the current host node meet the deployment requirements of the current instance, the current host node satisfies the first preset rule and is added to the preselected host node list PreChooseHosts; if the current host node fails any one of the rules in the first preset rule, the next host node in the host node list is traversed.

Step 211: Add the current host node to the preselected host node list PreChooseHosts;

Specifically, if the preselected host node list PreChooseHosts corresponding to any instance is empty, you can obtain other host nodes outside the host node list, add the host node to the host node list, and deploy the instance to the host node. This method reduces the probability of instance allocation failure by obtaining other host nodes outside the host node list, thereby improving the reliability of cluster resource allocation results.

In the embodiment of the present invention, the host node is selected from the host node list according to the first preset rule; that is, it is judged whether the current host node satisfies the strong affinity rule, whether it satisfies the strong anti-affinity rule, and whether its available resources are sufficient. If the current node meets all three conditions, it is used as a preselected host node and added to the preselected host node list corresponding to the current instance. If the current node fails any one of the three conditions, the next host node is evaluated. This pre-screening initially determines the preselected host nodes available to the current instance, which narrows the selection range and improves the efficiency of node allocation.
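The pre-screening described above amounts to a filter over the host node list. The following minimal sketch assumes a simplified data model that is not part of the embodiment: each host node is a dictionary with the hypothetical keys `deployed` and `free`, and each instance lists the instances it must share a node with (`must_with`) and must never share a node with (`never_with`).

```python
from typing import Any, Callable, Dict, List, Sequence

Host = Dict[str, Any]
Instance = Dict[str, Any]
Check = Callable[[Host, Instance], bool]


def strong_affinity_ok(host: Host, inst: Instance) -> bool:
    # Assumed model: every required companion is already on the host node.
    return set(inst["must_with"]) <= set(host["deployed"])


def strong_anti_affinity_ok(host: Host, inst: Instance) -> bool:
    # Assumed model: no mutually exclusive instance is on the host node.
    return not (set(inst["never_with"]) & set(host["deployed"]))


def resources_ok(host: Host, inst: Instance) -> bool:
    # CPU, memory, and disk must each strictly exceed the requirement.
    return all(host["free"][d] > inst["need"][d] for d in ("cpu", "memory", "disk"))


FIRST_PRESET_RULE: Sequence[Check] = (
    strong_affinity_ok, strong_anti_affinity_ok, resources_ok,
)


def preselect_hosts(inst: Instance, hosts: List[Host],
                    checks: Sequence[Check] = FIRST_PRESET_RULE) -> List[Host]:
    """Keep only host nodes that pass all three conditions."""
    pre_choose_hosts = []
    for host in hosts:
        if all(check(host, inst) for check in checks):
            pre_choose_hosts.append(host)
    return pre_choose_hosts


free = {"cpu": 8, "memory": 32, "disk": 500}
hosts = [
    {"name": "A1", "deployed": ["C16"], "free": free},
    {"name": "A2", "deployed": ["C14"], "free": free},
    {"name": "A3", "deployed": ["C14"], "free": {**free, "cpu": 2}},
]
c35 = {"name": "C35", "must_with": ["C14"], "never_with": ["C16"],
       "need": {"cpu": 4, "memory": 16, "disk": 100}}
print([h["name"] for h in preselect_hosts(c35, hosts)])
```

Because `all` stops at the first failing check, a host node that violates a strong rule is rejected without its resources being inspected.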

As an optional implementation, the optimal host node is selected from the preselected host node list according to the second preset rule. Specifically, for each host node corresponding to each instance, the score of the host node for that instance is determined according to the preset correspondence between host nodes, instances, and scores; the scores of the host nodes corresponding to each instance are then compared, and the host node with the highest score is used as the optimal host node.

It should be noted that, given a host node A1, a component B1, and a component B2: if instance C11 of component B1 is deployed to host node A1, it is recommended to also deploy instance C21 of component B2 to host node A1 to improve system performance; however, if component B2 has no instance C21, system functions can still be achieved by deploying only instance C11 of component B1 to host node A1. In this case the relationship between instance C11 and instance C21 satisfies the weak affinity rule. That is, system performance is better when instance C11 and instance C21 are deployed together on host node A1, but system functions can still be achieved when only one of them is deployed there. Conversely, if instance C11 of component B1 has already been deployed to host node A1, it is not recommended to deploy instance C21 of component B2 to host node A1; however, if force majeure requires that instance C11 and instance C21 be deployed together on host node A1, system functions can still be implemented, though system performance is reduced. In this case instance C11 and instance C21 satisfy the weak anti-affinity rule. That is, system performance is better when only one of the two instances is deployed on host node A1, but system functions can still be achieved when both are deployed there. The weak affinity rules and weak anti-affinity rules above are predefined and can be expressed in the form of an affinity rule list.

During specific implementation, the weak affinity rules and weak anti-affinity rules specified in the affinity rule list each preset a score (weight) for each host node, where the score defined in a weak affinity rule is a positive integer and the score defined in a weak anti-affinity rule is a negative integer. Referring to 23 in Figure 2, for each node in the preselected host node list corresponding to the current instance, the rule to be verified for the current host node is determined according to the affinity rule list. If that rule is a weak affinity rule, the score of the current host node for deploying the current instance is obtained from the score defined in the weak affinity rule, and the score is a positive integer; if that rule is a weak anti-affinity rule, the score of the current host node for deploying the current instance is obtained from the score defined in the weak anti-affinity rule, and the score is a negative integer.

Specifically, if the current host node A1 can deploy n instances, C1, C2, ..., Cn, assume that for instance C1 the score of the current host node A1 is Y1, for instance C2 the score is Y2, ..., and for instance Cn the score is Yn. By summing the scores Y1, Y2, ..., Yn, the total score Y of the current host node is obtained; the totals are then compared, and the host node with the highest total score is regarded as the optimal host node.

In the embodiment of the present invention, the score of the host node corresponding to the instance is determined according to the preset correspondence between host nodes, instances, and scores, and the scores of the host nodes are compared to select the host node with the highest score as the optimal host node. The scoring rules for host nodes corresponding to instances are flexible and can be widely adapted to different types of host nodes, which improves the universality of the method.

Specifically, if there are at least two host nodes with the highest scores, the idle resources of the at least two host nodes with the highest scores can be compared, and the host node with the most idle resources can be used as the optimal host node.
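The summation Y = Y1 + Y2 + ... + Yn and the selection of the highest total can be sketched as below. The score table keyed by (host node, instance) and the function names are hypothetical; positive entries stand for weak affinity scores and negative entries for weak anti-affinity scores, as described above. The function returns every host node that shares the highest total, so that a tie can be handed to the idle resource comparison.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (host node, instance) -> preset score.
ScoreTable = Dict[Tuple[str, str], int]


def total_score(host: str, instances: List[str], scores: ScoreTable) -> int:
    """Sum Y1..Yn for one host node; pairs without a rule contribute 0."""
    return sum(scores.get((host, inst), 0) for inst in instances)


def highest_scoring_hosts(pre_choose_hosts: List[str], instances: List[str],
                          scores: ScoreTable) -> List[str]:
    """Return all host nodes that share the highest total score."""
    if not pre_choose_hosts:
        raise ValueError("PreChooseHosts is empty")
    totals = {h: total_score(h, instances, scores) for h in pre_choose_hosts}
    top = max(totals.values())
    return [h for h in pre_choose_hosts if totals[h] == top]


scores: ScoreTable = {
    ("A1", "C1"): 10, ("A1", "C2"): -3,  # weak affinity, weak anti-affinity
    ("A2", "C1"): 5, ("A2", "C2"): 2,
    ("A3", "C1"): 4,
}
print(highest_scoring_hosts(["A1", "A2", "A3"], ["C1", "C2"], scores))
```

In this sample, A1 and A2 both total 7 and A3 totals 4, so two candidates remain and a tie-break is needed.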

In the specific implementation, referring to 23 in Figure 2, after the optimal host node is selected from the preselected host node list, it is bound to the current instance to obtain the binding relationship between the host node and the instance. The specific flow includes the following steps:

Step 212: Traverse the pre-selected host node list PreChooseHosts, score each host node respectively, and select the optimal host node;

Step 213: Determine whether the weak affinity rule score of the current host node is greater than the highest score of the host node. If greater, proceed to step 215. If equal, proceed to step 214. If less, return to step 212;

Step 214: Compare the idle resource levels of the host nodes and select the host node with the most idle resources as the optimal host node;

Specifically, the host node with more idle resources is preferred as the optimal host node. The idle resources of a host node can be analyzed along the three dimensions of CPU, memory, and disk, and for different types of clusters these three dimensions are preset with different weights. For example, for computing clusters, the weight proportion of CPU and memory can be increased; for storage clusters, the weight proportion of disk can be increased.

As shown in Figure 5, the workflow of resource optimization includes the following steps:

Step 501, enter the host node host1 and the host node host2;

Among them, the host node host1 and the host node host2 have the highest scores and the scores are equal.

Step 502: Collect idle resources of the host node host1 and collect idle resources of the host node host2;

Step 503: Calculate the idle resource status score S1 of the host node host1 in a weighted manner, and calculate the idle resource status score S2 of the host node host2 in a weighted manner;

Specifically, the idle resources of the host node host1 and the idle resources of the host node host2 are collected respectively, and a weighted operation is performed on the preset weights and the absolute values of the idle resources to obtain the idle resource status score S1 of the host node host1 and the idle resource status score S2 of the host node host2.

Step 504, determine whether the condition S1>S2 is satisfied, if so, execute step 505, otherwise, execute step 506;

Step 505, use the host node host1 as the optimal host node;

Step 506: Use the host node host2 as the optimal host node.

In the embodiment of the present invention, S1 and S2 are compared, and the host node with the highest score is selected as the optimal host node.
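Steps 501 to 506 can be sketched as a weighted sum followed by a comparison. The weight values and the resource figures below are illustrative assumptions; the description only states that the weights are preset per cluster type and applied to the absolute idle amounts.

```python
from typing import Dict

DIMENSIONS = ("cpu", "memory", "disk")


def idle_score(idle: Dict[str, float], weights: Dict[str, float]) -> float:
    """Step 503: weighted sum of the absolute idle resources."""
    return sum(weights[d] * idle[d] for d in DIMENSIONS)


def pick_optimal(host1: str, idle1: Dict[str, float],
                 host2: str, idle2: Dict[str, float],
                 weights: Dict[str, float]) -> str:
    """Steps 504-506: host1 wins only when S1 > S2, otherwise host2."""
    s1 = idle_score(idle1, weights)
    s2 = idle_score(idle2, weights)
    return host1 if s1 > s2 else host2


# Assumed weights for a computing cluster and a storage cluster.
compute_weights = {"cpu": 0.5, "memory": 0.4, "disk": 0.1}
storage_weights = {"cpu": 0.1, "memory": 0.1, "disk": 0.8}

idle_host1 = {"cpu": 16, "memory": 64, "disk": 100}
idle_host2 = {"cpu": 8, "memory": 32, "disk": 200}

print(pick_optimal("host1", idle_host1, "host2", idle_host2, compute_weights))
print(pick_optimal("host1", idle_host1, "host2", idle_host2, storage_weights))
```

The same two host nodes yield different winners under the two weight sets, which is the effect the per-cluster weights are meant to produce. Because the sum uses absolute amounts, dimensions measured in larger units dominate unless the weights compensate for them.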

Step 215: Bind the current instance to the optimal host node to obtain the binding relationship between the host node and the instance;

Step 216: Determine whether a host node is deployed for each instance. If so, perform step 217; otherwise, perform step 218;

Step 217: Output the adjusted binding relationship between the host node and the component module instance;

Step 218: Output the reason for the allocation failure and the allocated host node list.
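Steps 215 to 218 can be sketched as a loop that binds each instance and then reports either the complete binding relationship or the failure reasons. The function `plan`, the callback `choose_host`, the result dictionary layout, and the wording of the failure reason are all hypothetical; `choose_host` stands in for the preselection and optimization described above and returns `None` when no host node qualifies.

```python
from typing import Callable, Dict, List, Optional


def plan(instances: List[str],
         choose_host: Callable[[str], Optional[str]]) -> Dict[str, object]:
    bindings: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for inst in instances:
        host = choose_host(inst)
        if host is None:
            failures[inst] = "no host node satisfied the preset rules"
        else:
            bindings[inst] = host  # step 215: bind instance to optimal host node
    if not failures:               # step 216: every instance has a host node
        return {"ok": True, "bindings": bindings}            # step 217
    return {                                                  # step 218
        "ok": False,
        "reasons": failures,
        "allocated_hosts": sorted(set(bindings.values())),
    }


# Assumed lookup table standing in for the real selection logic.
table = {"C11": "A1", "C21": "A1", "C35": None}
print(plan(["C11", "C21"], table.get))
print(plan(["C11", "C35"], table.get))
```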

Embodiments of the present invention provide a cluster resource planning method. If at least two host nodes share the highest score, the idle resources of those host nodes are compared, and the host node with the highest idle resource score is used as the optimal host node. By configuring different dimensions and weights to rank the idle resources of host nodes, the scoring rules are flexible and can be widely applied to different types of clusters. In addition, selecting the host node that has both the highest score and the highest idle resource score applies a double optimization rule that further improves the quality of the resource allocation result.

As an implementation, the method can also quantify expert experience into a set of JSON template files, including the affinity rule list, the first preset priority, the second preset priority, the available resources of the host nodes, and the preset weights.

If there is no template file based on expert experience, the relationships between the modules in each component, the affinity rule list, the first preset priority, the second preset priority, the preset weights, and the available resources of the host nodes need to be set in advance.

By consolidating expert experience into template files, the cluster resource planning method can be efficiently automated. Component affinity rules are defined in the template file, which can accurately express the dependencies and mutual exclusion relationships between components. Through the template file, the resource requirements of components can also be considered in layers, with the requirements of different layers considered separately. In addition, different template combinations can be introduced as needed, and their priority relationships adjusted as needed, to ensure the rationality and quality of resource allocation.
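A template of this kind might look like the following sketch. The description names the five kinds of content but not a schema, so every key name, the rule layout, and the sample values below are assumptions chosen for illustration. The loader only checks that the five top-level sections are present.

```python
import json
from typing import Any, Dict

# Hypothetical template; key names and values are illustrative assumptions.
TEMPLATE_JSON = """
{
  "affinityList": [
    {"type": "strong_anti_affinity", "key": "module",
     "operator": "IN", "values": ["C16", "C33"]},
    {"type": "weak_affinity", "key": "module",
     "operator": "IN", "values": ["C21"], "score": 10}
  ],
  "firstPresetPriority": {"B1": 2, "B2": 1},
  "secondPresetPriority": {"B1": {"master": 2, "worker": 1}},
  "hostResources": {"A1": {"cpu": 16, "memory": 64, "disk": 500}},
  "weights": {"cpu": 0.4, "memory": 0.4, "disk": 0.2}
}
"""

REQUIRED_KEYS = (
    "affinityList", "firstPresetPriority", "secondPresetPriority",
    "hostResources", "weights",
)


def load_template(text: str) -> Dict[str, Any]:
    """Parse a template and verify that all five sections exist."""
    template = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in template]
    if missing:
        raise ValueError(f"template is missing keys: {missing}")
    return template


template = load_template(TEMPLATE_JSON)
print(template["weights"])
```

Failing fast on a missing section corresponds to the case described above in which no complete template exists and the values must be set in advance by other means.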

Example 2

Based on the same concept, embodiments of the present invention also provide a cluster resource planning device, which is applied to a public cloud platform. This device is the device corresponding to the method in the embodiment of the present invention, and the principle by which the device solves the problem is similar to that of the method, so the implementation of the device can refer to the implementation of the method, and repeated details are not described again.

As shown in Figure 6, the above device includes the following modules:

The priority sorting module 601 is used to obtain the host node list and the first component list that needs to be deployed, and sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled;

The rule verification module 602 is configured to select a host node in the host node list according to the first preset rule for each instance in each module in the to-be-scheduled instance list, and obtain a preselected host node list corresponding to the current instance;

The resource optimization module 603 is configured to select the optimal host node from the pre-selected host node list according to the second preset rule, and then bind the optimal host node to the current instance to obtain the binding relationship;

The deployment module 604 is used to deploy the instance corresponding to the host node in the binding relationship to the host node.

As an optional implementation, the priority sorting module is specifically used to:

Arrange the modules corresponding to each component in the second component list in descending order according to the second preset priority to obtain a list of component modules corresponding to the current component;

For each module in the component module list, create an instance for the module according to the attribute information of the number of instances corresponding to the module to obtain a list of instances to be scheduled.

The prioritization module is specifically used for:

Traverse the first component list;
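The work of the priority sorting module can be sketched as two nested sorts followed by instance creation. The descending order for modules follows the description above; applying the same descending order to components, the dictionary layout, and the instance naming scheme are assumptions made for illustration.

```python
from typing import Any, Dict, List

Component = Dict[str, Any]


def build_instance_list(components: List[Component]) -> List[str]:
    """Return the list of instances to be scheduled, highest priority first."""
    to_schedule: List[str] = []
    # First preset priority orders the components (descending is an assumption).
    for comp in sorted(components, key=lambda c: c["priority"], reverse=True):
        # Second preset priority orders the modules of one component, descending.
        modules = sorted(comp["modules"], key=lambda m: m["priority"], reverse=True)
        for mod in modules:
            # Create as many instances as the module's instance count attribute.
            for index in range(mod["instances"]):
                to_schedule.append(f'{comp["name"]}/{mod["name"]}#{index}')
    return to_schedule


components = [
    {"name": "B2", "priority": 1,
     "modules": [{"name": "m1", "priority": 1, "instances": 1}]},
    {"name": "B1", "priority": 2,
     "modules": [{"name": "worker", "priority": 1, "instances": 2},
                 {"name": "master", "priority": 2, "instances": 1}]},
]
print(build_instance_list(components))
```

Python's `sorted` is stable, so components or modules with equal priority keep their input order.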

As an optional implementation, the first preset rule includes:

The current host node satisfies the strong affinity rule;

Moreover, the current host node satisfies the strong anti-affinity rule;

As an optional implementation, the resource optimization module is specifically used to:

For each host node corresponding to each instance, determine the score of the host node corresponding to the instance based on the preset correspondence between the host node, instance and score;

Compare the scores of each host node corresponding to each instance, and use the host node with the highest score as the optimal host node.

As an optional implementation, if the host nodes with the highest scores include at least two, the resource optimization module is also used to:

Compare the idle resources of at least two host nodes with the highest scores, and use the host node with the most idle resources as the optimal host node.

As an optional implementation manner, before selecting a host node in the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance, the rule verification module is also used to:

Traverse the list of instances to be scheduled;

For each instance in the list of instances to be scheduled, initialize a list of preselected host nodes corresponding to the current instance.

Example 3

Based on the same concept, embodiments of the present invention also provide a cluster resource planning device, which is applied to a public cloud platform. Since this cluster resource planning device is the device corresponding to the method in the embodiment of the present invention, and the principle by which it solves the problem is similar to that of the method, the implementation of the cluster resource planning device can refer to the implementation of the method, and repeated details are not described again.

The cluster resource planning device 70 according to this embodiment of the present invention is described below with reference to FIG. 7 . The cluster resource planning device 70 shown in FIG. 7 is only an example and should not impose any restrictions on the functions and usage scope of the embodiments of the present invention.

As shown in FIG. 7, the cluster resource planning device 70 may be in the form of a general computing device, for example, a terminal device. The components of the cluster resource planning device 70 may include, but are not limited to: the above-mentioned at least one processor 71, the above-mentioned at least one memory 72 that stores executable instructions of the processor 71, and a bus 73 connecting different system components (including the memory 72 and the processor 71), where the processor 71 is the processor of the smart device.

The processor 71 executes executable instructions to implement the following steps:

For each instance in each module in the instance list to be scheduled, select a host node in the host node list according to the first preset rule to obtain a pre-selected host node list corresponding to the current instance;

According to the second preset rule, after selecting the optimal host node from the preselected host node list, bind the optimal host node to the current instance to obtain the binding relationship;

As an optional implementation, the processor 71 is specifically used to:

Traverse the first component list;

As an optional implementation, the first preset rule includes:

The current host node satisfies the strong affinity rule;

Moreover, the current host node satisfies the strong anti-affinity rule;

Moreover, the number of available resources on the current host node meets the deployment requirements of the current instance.

As an optional implementation, the processor 71 is specifically used to:

As an optional implementation, if the host nodes with the highest scores include at least two, the processor 71 is also used to:

As an optional implementation, before selecting a host node in the host node list according to the first preset rule and obtaining the preselected host node list corresponding to the current instance, the processor 71 is also used to:

Traverse the list of instances to be scheduled;

Bus 73 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus structures.

Memory 72 may include readable media in the form of volatile memory, such as random access memory (RAM) 721 and/or cache memory 722 , and may further include read only memory (ROM) 723 .

Memory 72 may also include a program/utility 725 having a set of (at least one) program modules 724 including, but not limited to: an operating system, one or more application programs, other program modules, and program data. Each of the examples, or some combination thereof, may include the implementation of a network environment.

Cluster resource planning device 70 may also communicate with one or more external devices 74 (e.g., keyboard, pointing device, etc.), with one or more devices that enable a user to interact with cluster resource planning device 70, and/or with any device (e.g., router, modem, etc.) that enables cluster resource planning device 70 to communicate with one or more other computing devices. This communication may occur through input/output (I/O) interface 75. Furthermore, the cluster resource planning device 70 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through a network adapter 76. As shown, network adapter 76 communicates with other modules of the cluster resource planning device 70 via bus 73. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the cluster resource planning device 70, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.

Example 4

In some possible implementations, various aspects of the present invention can also be implemented in the form of a program product, which includes program code. When the program product is run on a terminal device, the program code is used to cause the terminal device to execute the steps of each module in the cluster resource planning device according to the various exemplary embodiments of the present disclosure described in the "Example Method" section above. For example, the network side device can be used to: after obtaining the host node list and the first component list that needs to be deployed, sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled; for each instance in each module in the list of instances to be scheduled, select a host node in the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance; and, according to the second preset rule, after selecting the optimal host node from the preselected host node list, bind the optimal host node to the current instance to obtain the binding relationship;

The program product may take the form of one or more readable media in any combination. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (non-exhaustive list) of readable storage media include: electrical connection with one or more conductors, portable disk, hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.

As shown in Figure 8, a program product 80 for cluster resource planning according to an embodiment of the present invention is described, which can adopt a portable compact disk read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present invention is not limited thereto. In this document, a readable storage medium may be any tangible medium containing or storing a program that may be used by or in combination with an instruction execution system, apparatus or device.

The readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying readable program code therein. Such propagated data signals may take a variety of forms, including - but not limited to - electromagnetic signals, optical signals, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any suitable medium, including - but not limited to - wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.

Program code for performing the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as "C" or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In situations involving remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., utilizing an Internet service provider to connect via the Internet).

It should be noted that although several modules or sub-modules of the system are mentioned in the above detailed description, this division is only exemplary and not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more modules described above may be embodied in one module. Conversely, the features and functions of a module described above can be further divided into being embodied by multiple modules.

In addition, although the operations of the various modules of the system of the present invention are described in a specific order in the drawings, this does not require or imply that these operations must be performed in this specific order, or that all of the illustrated operations must be performed to achieve the desired result. Additionally or alternatively, certain operations may be omitted, multiple operations combined into one operation execution, and/or one operation broken into multiple operation executions.

The present application is described above with reference to block diagrams and/or flowcharts illustrating methods, apparatus (systems) and/or computer program products according to embodiments of the application. It will be understood that one block of the block diagrams and/or flowchart illustrations, and combinations of blocks of the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a general-purpose computer, a processor of a special-purpose computer, and/or other programmable data processing apparatus to produce a machine, such that the instructions, executed via the computer processor and/or other programmable data processing apparatus, create means for implementing the functions/actions specified in the block diagram and/or flowchart blocks.

Accordingly, the present application can also be implemented using hardware and/or software (including firmware, resident software, microcode, etc.). Furthermore, the present application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. In the context of this application, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, transmit, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the invention. In this way, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and equivalent technologies, the present invention is also intended to include these modifications and variations.

Claims

A cluster resource planning method is characterized in that it is applied to a public cloud platform and includes:

After obtaining the host node list and the first component list that needs to be deployed, sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled;

For each instance in each module in the to-be-scheduled instance list, select a host node in the host node list according to the first preset rule to obtain a pre-selected host node list corresponding to the current instance;

According to the second preset rule, after selecting the optimal host node from the preselected host node list, bind the optimal host node to the current instance to obtain a binding relationship;

Deploy the instance corresponding to the host node in the binding relationship to the host node.
The method of claim 1, wherein the first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled, including:

Arrange each component in the first component list in descending order according to the first preset priority to obtain a second component list;

Arrange the modules corresponding to each component in the second component list in descending order according to the second preset priority to obtain a component module list corresponding to the current component;

For each module in the component module list, create an instance for the module according to the attribute information of the number of instances corresponding to the module, to obtain the list of instances to be scheduled.
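
By way of illustration only, the sorting and instance-creation steps of claims 2 and 3 can be sketched in Python as follows. The class names `Component` and `Module`, the function `build_instance_list`, and the field names are hypothetical and do not appear in the application. The sketch assumes that a smaller priority value denotes a higher priority, which is one reading of claim 3.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Module:
    name: str
    priority: int        # assumption: smaller value means higher priority
    instance_count: int  # the "number of instances" attribute of the module


@dataclass
class Component:
    name: str
    priority: int        # assumption: smaller value means higher priority
    modules: List[Module] = field(default_factory=list)


def build_instance_list(components: List[Component]) -> List[Tuple[str, str, int]]:
    """Return the to-be-scheduled instances as (component, module, index) tuples."""
    instances = []
    # First preset priority: order the components.
    for comp in sorted(components, key=lambda c: c.priority):
        # Second preset priority: order the modules of the current component.
        for mod in sorted(comp.modules, key=lambda m: m.priority):
            # Create as many instances as the module's instance count requires.
            for index in range(mod.instance_count):
                instances.append((comp.name, mod.name, index))
    return instances
```

Because Python's `sorted` is stable, components or modules with equal priority values keep their original relative order in this sketch. The application does not specify how such ties are handled.
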
The method of claim 2, wherein arranging each component in the first component list in descending order according to a first preset priority includes:

Traverse the first component list;

Sort each component in the first component list according to priority value from small to large.
The method of claim 1, wherein the first preset rule includes:

The current host node satisfies the strong affinity rule;

Moreover, the current host node satisfies the strong anti-affinity rule;

Moreover, the available resources of the current host node meet the deployment requirements of the current instance.
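
As a non-limiting sketch, the three conditions of the first preset rule in claim 4 might be checked as shown below. The claims do not define how affinity rules are represented, so this example assumes that strong affinity means "every listed module is already on the host" and that strong anti-affinity means "no listed module is on the host". The names `Host`, `Instance`, `preselect`, `must_colocate` and `must_avoid`, and the CPU and memory fields, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Host:
    name: str
    free_cpu: int
    free_mem: int
    modules: Set[str] = field(default_factory=set)  # modules already placed here


@dataclass
class Instance:
    module: str
    cpu: int
    mem: int
    must_colocate: Set[str] = field(default_factory=set)  # assumed strong affinity
    must_avoid: Set[str] = field(default_factory=set)     # assumed strong anti-affinity


def preselect(instance: Instance, hosts: List[Host]) -> List[Host]:
    """Return the hosts that satisfy all three conditions for this instance."""
    preselected = []
    for host in hosts:
        # Strong affinity: every required module is already on the host.
        affinity_ok = instance.must_colocate <= host.modules
        # Strong anti-affinity: no excluded module is on the host.
        anti_affinity_ok = not (instance.must_avoid & host.modules)
        # Available resources cover the instance's requirements.
        fits = host.free_cpu >= instance.cpu and host.free_mem >= instance.mem
        if affinity_ok and anti_affinity_ok and fits:
            preselected.append(host)
    return preselected
```
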
The method of claim 1, wherein selecting the optimal host node from the preselected host node list according to the second preset rule includes:

For each host node corresponding to each instance, determine the score of the host node corresponding to the instance according to the preset corresponding relationship between the host node, the instance and the score;

The scores of each host node corresponding to each instance are compared, and the host node with the highest score is regarded as the optimal host node.
The method of claim 5, wherein if the host nodes with the highest scores include at least two, the method further includes:

The idle resources of the at least two host nodes with the highest scores are compared, and the host node with the most idle resources is used as the optimal host node.
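
The selection in claims 5 and 6 can be illustrated with the following minimal sketch. It assumes that the preset correspondence between host node, instance and score is available as a lookup table keyed by (host, instance), that a missing entry scores 0, and that idle resources can be compared as a single number per host. These representations, and the names `pick_optimal`, `score_table` and `idle`, are hypothetical and not taken from the application.

```python
from typing import Dict, List, Optional, Tuple


def pick_optimal(
    instance_id: str,
    candidates: List[str],
    score_table: Dict[Tuple[str, str], int],
    idle: Dict[str, int],
) -> Optional[str]:
    """Return the highest-scoring candidate host, breaking ties by idle resources."""
    if not candidates:
        return None  # no host passed preselection for this instance

    def score(host: str) -> int:
        # Assumption: hosts without a preset score are treated as scoring 0.
        return score_table.get((host, instance_id), 0)

    best = max(score(host) for host in candidates)
    tied = [host for host in candidates if score(host) == best]
    # Claim 6: among the top-scoring hosts, prefer the one with the most idle resources.
    return max(tied, key=lambda host: idle[host])
```

The returned host would then be bound to the current instance to form the binding relationship of claim 1. How the binding is stored is left open here.
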
The method according to any one of claims 1 to 6, characterized in that before selecting a host node in the host node list according to the first preset rule to obtain a pre-selected host node list corresponding to the current instance, the method further includes:

Traverse the list of instances to be scheduled;

For each instance in the to-be-scheduled instance list, a preselected host node list corresponding to the current instance is initialized.
Cluster resource planning equipment, characterized in that it is applied to a public cloud platform and includes a memory and a processor, wherein the memory stores a computer program, and when the processor executes the computer program, the steps of the cluster resource planning method of any one of claims 1 to 7 are implemented.
A cluster resource planning device is characterized in that it is applied to a public cloud platform and includes:

A priority sorting module, used to obtain the host node list and the first component list that needs to be deployed, and sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled;

A rule verification module, configured to select, for each instance in each module in the to-be-scheduled instance list, a host node in the host node list according to the first preset rule to obtain a preselected host node list corresponding to the current instance;

A resource optimization module, configured to select an optimal host node from the preselected host node list according to the second preset rule, and then bind the optimal host node to the current instance to obtain a binding relationship;

A deployment module is used to deploy the instance corresponding to the host node in the binding relationship to the host node.
A computer storage medium, characterized in that the computer storage medium stores computer instructions, and when the computer instructions are run on a computer, the computer is caused to execute the steps of the cluster resource planning method as described in any one of claims 1 to 7.