WO2024021467A1 - Cluster resource planning method, device, apparatus, and medium - Google Patents

Cluster resource planning method, device, apparatus, and medium

Info

Publication number
WO2024021467A1
WO2024021467A1 PCT/CN2022/141378 CN2022141378W WO2024021467A1 WO 2024021467 A1 WO2024021467 A1 WO 2024021467A1 CN 2022141378 W CN2022141378 W CN 2022141378W WO 2024021467 A1 WO2024021467 A1 WO 2024021467A1
Authority
WO
WIPO (PCT)
Prior art keywords
host node
list
instance
component
current
Prior art date
Application number
PCT/CN2022/141378
Other languages
French (fr)
Chinese (zh)
Inventor
陈赜
Original Assignee
天翼云科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 天翼云科技有限公司 filed Critical 天翼云科技有限公司
Publication of WO2024021467A1 publication Critical patent/WO2024021467A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45562 Creating, deleting, cloning virtual machine instances
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5021 Priority

Definitions

  • the present invention relates to the field of computer technology, and in particular to a cluster resource planning method, equipment, device and medium.
  • public cloud platforms can be used to deploy multi-component clusters.
  • the present invention provides a cluster resource planning method, equipment, device and medium to solve the problem of low efficiency of cluster resource planning existing in the prior art.
  • embodiments of the present invention provide a cluster resource planning method applied to a public cloud platform, including:
  • according to the second preset rule, after selecting the optimal host node from the preselected host node list, bind the optimal host node to the current instance to obtain a binding relationship;
  • in the cluster resource planning method provided by an embodiment of the present invention, the obtained first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled. For each instance in each module in the list of instances to be scheduled, the preselected host node list is determined according to the first preset rule, and the optimal host node to be bound to the current instance is selected from the preselected host node list according to the second preset rule to obtain a binding relationship; the instance is then deployed to the corresponding host node according to the binding relationship. Since this cluster resource planning method automatically allocates and deploys instances in multiple components to suitable host nodes, it can improve the efficiency of multi-component cluster resource planning based on public cloud platforms.
  • the first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled, including:
  • the components in the first component list are arranged in descending order according to the first preset priority
  • the modules corresponding to each component are arranged in descending order according to the second preset priority
  • for each module, instances are created according to the attribute information on the number of instances corresponding to the module.
  • arranging each component in the first component list in descending order according to the first preset priority includes:
  • each component in the first component list is arranged in descending order according to the first preset priority, that is, each component in the first component list is sorted according to the priority value from small to large.
  • the smaller the priority value, the higher the priority of the corresponding component; that is, the components in the first component list are sorted in order from high to low priority.
  • the first preset rule includes:
  • the current host node satisfies the strong affinity rule
  • the current host node satisfies the strong anti-affinity rule
  • the available resources of the current host node meet the deployment requirements of the current instance.
  • the above method selects a host node in the host node list according to the first preset rule, that is, it determines whether the current host node satisfies the strong affinity rule, whether the current host node satisfies the strong anti-affinity rule, and whether the available resources of the current host node are sufficient. If the current node meets all three conditions at the same time, the node is used as a preselected host node and added to the preselected host node list corresponding to the current instance. If the current node fails any one of the three conditions, the next host node is evaluated. Pre-screening in this way initially determines the preselected host nodes available for the current instance, which narrows the selection range and improves the efficiency of node allocation.
  • selecting the optimal host node from the host node list according to the preselected host node list and the second preset rule includes:
  • the scores of each host node corresponding to each instance are compared, and the host node with the highest score is regarded as the optimal host node.
  • the above method determines the score of the host node corresponding to the instance based on the preset correspondence among host node, instance and score, and compares the scores of the host nodes, thereby selecting the host node with the highest score as the optimal host node. This method has flexible scoring rules for host nodes corresponding to instances and can be widely adapted to different types of host nodes, which improves the universality of the method.
  • the method further includes:
  • the idle resources of the at least two host nodes with the highest scores are compared, and the host node with the most idle resources is used as the optimal host node.
  • the idle resources of at least two host nodes with the highest scores are compared, and the host node with the most idle resources is used as the optimal host node. Therefore, the excellence of the resource allocation results is further ensured by the double optimization rule of optimizing the host node with the highest score and the most idle resources.
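The second preset rule and its tie-break can be illustrated with a minimal Python sketch. The score table keyed by (host node, instance), the single scalar used for idle resources, and the function name `choose_optimal_host` are assumptions made for illustration; the source only states that a preset correspondence among host node, instance and score exists.

```python
from typing import Dict, List, Tuple


def choose_optimal_host(
    instance: str,
    preselected: List[str],
    scores: Dict[Tuple[str, str], int],
    idle: Dict[str, float],
) -> str:
    """Return the highest-scoring host; break ties by most idle resources."""
    if not preselected:
        raise ValueError("preselected host node list is empty")
    # Assumption: a host missing from the score table scores 0.
    best = max(scores.get((h, instance), 0) for h in preselected)
    tied = [h for h in preselected if scores.get((h, instance), 0) == best]
    # Among the top-scoring hosts, the one with the most idle resources wins.
    return max(tied, key=lambda h: idle[h])


# Hypothetical data: A1 and A2 tie on score, A3 has more idle
# resources but a lower score, so it is not considered.
scores = {("A1", "C11"): 80, ("A2", "C11"): 80, ("A3", "C11"): 60}
idle = {"A1": 4.0, "A2": 16.0, "A3": 64.0}
chosen = choose_optimal_host("C11", ["A1", "A2", "A3"], scores, idle)
print(chosen)  # A2
```

Idle resources are compared only among hosts that share the top score, which matches the order of the two rules in the text: score first, idle resources second.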
  • the method before selecting a host node in the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance, the method further includes:
  • a preselected host node list corresponding to the current instance is initialized.
  • a preselected host node list corresponding to the current instance needs to be initialized to ensure the accuracy of the preselected host list obtained according to the first preset rule.
  • embodiments of the present invention provide a cluster resource planning device, which is applied to a public cloud platform and includes a memory and a processor.
  • the memory stores a computer program.
  • when the processor executes the computer program, the steps of the cluster resource planning method described in any one of the above embodiments are implemented.
  • embodiments of the present invention provide a cluster resource planning device applied to a public cloud platform, including:
  • a priority sorting module used to obtain the host node list and the first component list that needs to be deployed, and sort the first component list according to the first preset priority and the second preset priority to obtain the list of instances to be scheduled;
  • a rule verification module configured to select, for each instance in each module in the list of instances to be scheduled, a host node in the host node list according to the first preset rule to obtain a preselected host node list corresponding to the current instance;
  • a resource optimization module configured to select an optimal host node from the preselected host node list according to the second preset rule, and then bind the optimal host node to the current instance to obtain a binding relationship;
  • a deployment module is used to deploy the instance corresponding to the host node in the binding relationship to the host node.
  • the priority sorting module is specifically used to:
  • the first preset rule includes:
  • the current host node satisfies the strong affinity rule
  • the current host node satisfies the strong anti-affinity rule
  • the available resources of the current host node meet the deployment requirements of the current instance.
  • the resource optimization module is specifically used to:
  • the scores of each host node corresponding to each instance are compared, and the host node with the highest score is regarded as the optimal host node.
  • the resource optimization module is also used to:
  • the idle resources of the at least two host nodes with the highest scores are compared, and the host node with the most idle resources is used as the optimal host node.
  • the rule verification module is also used to:
  • a preselected host node list corresponding to the current instance is initialized.
  • embodiments of the present invention provide a computer storage medium.
  • the computer-readable storage medium stores computer instructions. When the computer instructions are run on a computer, they cause the computer to execute the steps of the cluster resource planning method of any of the above embodiments.
  • Figure 1 is a schematic flow chart of a cluster resource planning method provided by an embodiment of the present invention
  • Figure 2 is a schematic flow chart of another cluster resource planning method provided by an embodiment of the present invention.
  • Figure 3 is a schematic flow chart of an affinity rule checking method provided by an embodiment of the present invention.
  • Figure 4 is a schematic flow chart of a method for determining available resources of a host node provided by an embodiment of the present invention
  • Figure 5 is a schematic flow chart of host node resource optimization provided by an embodiment of the present invention.
  • Figure 6 is a schematic module structure diagram of a cluster resource planning device provided by an embodiment of the present invention.
  • Figure 7 is a schematic structural diagram of a cluster resource planning device provided by an embodiment of the present invention.
  • Figure 8 is a schematic diagram of a program product of a cluster resource planning method provided by an embodiment of the present invention.
  • embodiments of the present invention provide a cluster resource planning method, equipment, device and medium to improve the efficiency of cluster resource planning.
  • Step 101 After obtaining the host node list and the first component list that needs to be deployed, sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled;
  • Step 102 For each instance in each module in the instance list to be scheduled, select a host node in the host node list according to the first preset rule to obtain a pre-selected host node list corresponding to the current instance;
  • Step 103 According to the second preset rule, after selecting the optimal host node from the preselected host node list, bind the optimal host node to the current instance to obtain the binding relationship;
  • Step 104 Deploy the instance corresponding to the host node in the binding relationship to the host node.
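Steps 101 to 104 can be read as a simple planning loop. The sketch below is a hedged illustration only: `plan_cluster`, `preselect` and `choose` are hypothetical names, the two rules are passed in as callables rather than implemented, and leaving an instance unbound when no candidate exists is an assumption.

```python
from typing import Callable, Dict, List


def plan_cluster(
    hosts: List[str],
    instances: List[str],
    preselect: Callable[[str, List[str]], List[str]],
    choose: Callable[[str, List[str]], str],
) -> Dict[str, str]:
    """Map each instance to a host node (the binding relationship)."""
    bindings: Dict[str, str] = {}
    # `instances` is assumed to be already sorted by priority (Step 101).
    for inst in instances:
        candidates = preselect(inst, hosts)  # Step 102: first preset rule
        if not candidates:
            continue  # assumption: the instance is left unbound in this sketch
        bindings[inst] = choose(inst, candidates)  # Step 103: second preset rule
    return bindings  # Step 104 would deploy according to these bindings


# Hypothetical rules: instance a2 may not run on A1; pick the first candidate.
bindings = plan_cluster(
    ["A1", "A2"],
    ["a1", "a2"],
    preselect=lambda inst, hs: [h for h in hs if not (inst == "a2" and h == "A1")],
    choose=lambda inst, cands: cands[0],
)
print(bindings)  # {'a1': 'A1', 'a2': 'A2'}
```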
  • the cluster resource planning method provided by the embodiment of the present invention can be applied to cloud hosts, network-side devices, and GPU (Graphics Processing Unit) computing devices, and can also be applied to terminals; the application scenarios of this cluster resource planning method are not specifically limited here.
  • embodiments of the present invention provide a cluster resource planning method. The obtained first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled. For each instance in each module in the list of instances to be scheduled, the preselected host node list is determined according to the first preset rule, and the optimal host node to be bound to the current instance is selected from the preselected host node list according to the second preset rule to obtain the binding relationship; the instance is then deployed to the corresponding host node according to the binding relationship.
  • this cluster resource planning method realizes the automatic allocation and deployment of instances in multiple components to suitable host nodes, improves the efficiency of multi-component cluster resource planning based on public cloud platforms, and also ensures the quality of the allocation results.
  • this method can be widely adapted to various resource allocation scenarios and has good versatility.
  • the first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled.
  • specifically, the components in the first component list can first be arranged in descending order according to the first preset priority to obtain a second component list; then the modules corresponding to each component in the second component list are arranged in descending order according to the second preset priority to obtain a component module list corresponding to the current component; finally, for each module in the component module list, instances are created for the module according to the attribute information on the number of instances corresponding to the module, to obtain a list of instances to be scheduled.
  • Figure 2 is an overall flow chart of a cluster resource planning method provided by the present invention.
  • the specific process of sorting the first component list to obtain the list of instances to be scheduled includes the following steps:
  • Step 201 Obtain the host node list Hosts and obtain the first component list Components1 that needs to be deployed;
  • the host node list Hosts includes all host nodes owned by the cluster.
  • Step 202 Sort the first component list according to the first preset priority to obtain the second component list Components2;
  • the first component list Components1 is traversed, and each component in the first component list Components1 is sorted from small to large according to the priority value, where the priority value of the component is the piriority attribute of the preconfigured component, and piriority is a positive integer value.
  • Step 203 Traverse the second component list and sort the modules corresponding to each component according to the second preset priority
  • the modules corresponding to each component in the second component list Components2 are arranged in descending order according to the second preset priority to obtain a component module list corresponding to the current component, where the second preset priority is the preset value of the module.
  • the order of modules included in the component is defined by the internal structure of the corresponding component, and no additional adjustments are made.
  • Step 204 Determine whether the current module is a dynamic module. If so, set the dynamic flag true for the current module. Otherwise, perform step 205;
  • modules can be divided into dynamic modules and static modules according to their types.
  • the attribute information of the module is set in advance through the replica value. If the replica value of the current module is a magic number, the current module is a dynamic module; for example, the replica value of the current module is 999999 or 999998. If the replica value of the current module is not a magic number, the current module is a static module; for example, the replica value of the current module is 3.
  • Step 205 Create a corresponding instance for each module according to the attribute information of the number of instances corresponding to the module.
  • if the current module is a static module and the number of instances of the module is 3, create three instances a1, a2, and a3 and add them to the list of instances to be scheduled; if the current module is a dynamic module, create the corresponding instances according to the number of host nodes and add them to the list of instances to be scheduled;
  • for each module in the component module list, instances are created for the module according to the attribute information on the number of instances corresponding to the module to obtain the component module instance list; that is, the number of instances generated for the module is given by the replica value corresponding to the module. If the replica value of the current module is a magic number, the current module is a dynamic module: instances are created according to the given number of host nodes and added to the list of instances to be scheduled. For example, if the replica value of the current module is 999999, the module is deployed on all host nodes; if the replica value of the current module is 999998, the module is deployed on two-thirds of the host nodes.
  • if the replica value of the current module is not a magic number, the current module is a static module, and instances are created based on the replica value of the current module. For example, if the replica value of the current module is 3, the current module contains 3 instances, so 3 instances are created and added to the list of instances to be scheduled.
  • component B1 contains 3 modules, namely: B11, B12, B13; component B2 contains 2 modules, namely: B21, B22; component B3 contains 4 modules, namely: B31, B32, B33, B34. Modules B12 and B33 are dynamic modules, the remaining modules are static modules, and each static module contains 2 instances.
  • the above-mentioned first component list, second component list, component module list and to-be-scheduled instance list are shown in Table 1.
  • the components in the first component list are arranged in descending order according to the first preset priority, the modules corresponding to each component are arranged in descending order according to the second preset priority, and the attribute information of each module is used to create instances for the module, thereby obtaining a list of instances to be scheduled. Because each component and each module corresponding to the component is sorted by priority, the affinity dependency strategy of the deployed instances is parsed correctly, which improves the quality of the allocation results.
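Steps 201 to 205 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `Component` and `Module` classes, the instance naming scheme, and rounding down for the two-thirds case are not given in the source; the magic numbers 999999 and 999998 and the `piriority` attribute name are taken from the text.

```python
from dataclasses import dataclass, field
from typing import List

ALL_HOSTS = 999999         # magic replica value: deploy on all host nodes
TWO_THIRDS_HOSTS = 999998  # magic replica value: deploy on two-thirds of them


@dataclass
class Module:
    name: str
    replica: int


@dataclass
class Component:
    name: str
    piriority: int  # spelled as in the text; a smaller value means higher priority
    modules: List[Module] = field(default_factory=list)


def instance_count(replica: int, host_count: int) -> int:
    """Translate a replica value into a concrete number of instances."""
    if replica == ALL_HOSTS:
        return host_count
    if replica == TWO_THIRDS_HOSTS:
        return (2 * host_count) // 3  # assumption: round down
    return replica  # static module: replica is the instance count


def build_schedule(components: List[Component], host_count: int) -> List[str]:
    """Return the list of instances to be scheduled, in priority order."""
    schedule: List[str] = []
    for comp in sorted(components, key=lambda c: c.piriority):
        for mod in comp.modules:  # module order is kept as the component defines it
            for i in range(1, instance_count(mod.replica, host_count) + 1):
                schedule.append(f"{mod.name}-{i}")  # hypothetical naming scheme
    return schedule


components = [
    Component("B2", 2, [Module("B21", 2)]),
    Component("B1", 1, [Module("B11", 2), Module("B12", ALL_HOSTS)]),
]
schedule = build_schedule(components, host_count=3)
print(schedule)
# ['B11-1', 'B11-2', 'B12-1', 'B12-2', 'B12-3', 'B21-1', 'B21-2']
```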
  • the first preset rule may include: the current host node satisfies the strong affinity rule; and the current host node satisfies the strong anti-affinity rule; and the available resources of the current host node satisfy the deployment requirements of the current instance.
  • for example, if, after instance C11 in component B1 is deployed to host node A1, host node A1 must also deploy instance C21 in component B2, then instance C11 and instance C21 comply with the strong affinity rule; that is, instance C11 in component B1 and instance C21 in component B2 have a binding relationship when deployed on host node A1. If, after instance C11 in component B1 is deployed to host node A1, host node A1 can no longer deploy instance C21 in component B2, then instance C11 and instance C21 comply with the strong anti-affinity rule; that is, instance C11 in component B1 and instance C21 in component B2 have a mutually exclusive relationship when deployed on host node A1. The above strong affinity rules and strong anti-affinity rules are predefined and can be reflected in the affinity rule list.
  • the specific process of selecting a host node in the host node list according to the first preset rule and obtaining the pre-selected host node list corresponding to the current instance includes the following steps:
  • Step 206 Traverse the list of instances to be scheduled and initialize the preselected host node list PreChooseHosts corresponding to the current instance;
  • a preselected host node list corresponding to the current instance needs to be initialized to ensure the accuracy of the preselected host list obtained according to the first preset rule.
  • Step 207 Traverse the host node list Hosts, complying with the affinity rules and the hard limits on CPU, memory, and disk performance;
  • Step 208 Determine whether the current host node satisfies the strong affinity rule. If so, execute step 209; otherwise, return to step 207;
  • Step 209 Determine whether the current host node satisfies the strong anti-affinity rule. If so, execute step 210; otherwise, return to step 207;
  • the current instance has the possibility of being deployed to the current host node. If the current instance satisfies the strong anti-affinity rule, the current instance cannot be deployed to the current host node, and the next host node in the host node list is traversed.
  • Figure 3 is a flow chart of affinity rule verification, including the following steps:
  • Step 301 Input the affinity rule list AffinityList and obtain the current host node list Hosts;
  • the rule expression to be verified in the affinity rule list AffinityList includes strong affinity rules and strong anti-affinity rules.
  • Step 302 Traverse the affinity rule list AffinityList to obtain the current rule expression to be verified;
  • the current rule to be verified is a strong affinity rule or a strong anti-affinity rule.
  • Step 303 Obtain the attribute value X corresponding to the node according to the node attribute key specified by the current rule expression to be verified;
  • the current rule to be verified is a strong anti-affinity rule
  • the node attribute specified according to the strong anti-affinity rule is a module
  • Step 304 Call different operation logic according to the operation type operator specified by the current rule expression to be verified;
  • step 305 is executed. If the current rule to be verified is a strong anti-affinity rule, and the node attribute specified according to the strong anti-affinity rule is a module, the attribute value X1 of the current host node is obtained, and the IN operation logic is called, then step 305 is executed. If the current rule to be verified is a strong affinity rule, and the node attribute specified according to the strong affinity rule is a module, the attribute value X2 of the current host node is obtained, and the EXIST operation logic is called, then step 307 is executed.
  • Step 305 Determine which operation logic is called. If the IN operation logic is called, step 306 is executed. If the NOTIN operation logic is called, step 307 is executed. If the EXIST operation logic is called, step 308 is executed. If the NOTEXIST operation logic is called, step 309 is executed;
  • Step 306 Traverse X and the adaptation values specified by the current rule expression to be verified. If there is a match, return the value true; otherwise, return the value false;
  • for example, if the current rule to be verified is a strong anti-affinity rule, set the values of values to C16 and C33. This means that the strong anti-affinity rule holds between C35 and C16, so the two instances cannot be deployed on the same host node at the same time; the strong anti-affinity rule also holds between C35 and C33, so those two instances cannot be deployed on the same host node at the same time either. Since there is a match between X and values, that is, both contain C16, the value true is returned, triggering the strong anti-affinity rule.
  • for example, if the current rule to be verified is a strong affinity rule, set the values of values to C14 and C33. This means that the strong affinity rule holds between C35 and C14, so the two instances must be deployed on one host node at the same time; the strong affinity rule also holds between C35 and C33, so those two instances must be deployed on one host node at the same time as well. Since there is no match between X and values, that is, there is no identical instance, the value true is returned and the strong affinity rule is not triggered.
  • Step 308 Traverse X. If it is not empty, return the value true; otherwise, return the value false;
  • for example, if the current rule to be verified is a strong anti-affinity rule, set X to be the list of instances allowed to be deployed on host node A1 corresponding to instance C35; because X is not empty, the value true is returned.
  • Step 309 Traverse X. If it is empty, return the value true; otherwise, return the value false;
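The four operation types dispatched in Step 305 can be sketched as one small Python function. The function name `evaluate` and the use of sets are assumptions; the NOTIN behaviour is inferred from the strong affinity example in the text (no match between X and values returns true), since the step that defines it does not survive in this extract.

```python
from typing import Iterable


def evaluate(operator: str, x: Iterable[str], values: Iterable[str] = ()) -> bool:
    """Evaluate one rule expression against the node attribute value X."""
    x_set = set(x)
    value_set = set(values)
    if operator == "IN":        # Step 306: true if X and values share an element
        return bool(x_set & value_set)
    if operator == "NOTIN":     # inferred: true if X and values share nothing
        return not (x_set & value_set)
    if operator == "EXIST":     # Step 308: true if X is not empty
        return bool(x_set)
    if operator == "NOTEXIST":  # Step 309: true if X is empty
        return not x_set
    raise ValueError(f"unknown operator: {operator}")


# X contents below are hypothetical; the values lists follow the examples in the text.
print(evaluate("IN", ["C16", "C20"], ["C16", "C33"]))  # True
print(evaluate("NOTIN", ["C20"], ["C14", "C33"]))      # True
print(evaluate("EXIST", ["C20"]))                      # True
print(evaluate("NOTEXIST", []))                        # True
```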
  • Step 210 Determine whether the available resources of the current host node are sufficient. If so, execute step 211. Otherwise, return to step 207;
  • for the current host node, the instances that are allowed to be deployed on it are selected from the list of instances to be scheduled; for each instance that is allowed to be deployed, if the available resources of the current host node meet the deployment requirements of the current instance, the current instance is allowed to be deployed to the current host node; if the available resources of the current host node do not meet the deployment requirements of the current instance, the current instance cannot be deployed on the current host node, and the next host node in the host node list is traversed.
  • whether the available resources of the current host node meet the deployment requirements can be determined from the three dimensions of central processing unit (CPU), memory, and disk. If all three dimensions of the available resources of the current host node meet the deployment requirements of the current instance, the current host node is added to the preselected host node list; if any of the three dimensions does not meet the deployment requirements of the current instance, the next host node in the host node list is evaluated.
  • the specific flow chart for determining whether the available resources of the current host node are sufficient includes the following steps:
  • Step 401 Obtain the available resources of the current host node and obtain the preset resource requirements of the current instance
  • Step 402 Determine whether the available CPU of the current host node is greater than the required CPU of the current instance. If so, execute step 403; otherwise, execute step 406;
  • Step 403 Determine whether the available memory of the current host node is greater than the required memory of the current instance. If so, execute step 404; otherwise, execute step 406;
  • Step 404 Determine whether the available disk of the current host node is greater than the required disk of the current instance. If so, perform step 405; otherwise, perform step 406;
  • Step 405 The available resources of the current host node are sufficient, and the current host node is added to the preselected host node list PreChooseHosts;
  • Step 406 The available resources of the current host node are insufficient and cannot be added to the preselected host node list PreChooseHosts.
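Steps 401 to 406 amount to a three-way comparison. A minimal Python sketch, assuming a hypothetical `Resources` record with unitless numeric fields; the strict "greater than" comparison follows the wording of steps 402 to 404.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu: float
    memory: float
    disk: float


def has_sufficient_resources(available: Resources, required: Resources) -> bool:
    """Return True only if CPU, memory and disk all exceed the requirement."""
    return (
        available.cpu > required.cpu            # Step 402
        and available.memory > required.memory  # Step 403
        and available.disk > required.disk      # Step 404
    )


need = Resources(cpu=2, memory=4, disk=50)  # hypothetical instance requirement
print(has_sufficient_resources(Resources(8, 32, 500), need))  # True
print(has_sufficient_resources(Resources(8, 2, 500), need))   # False
```

Because the comparison is strict, a host whose available amount exactly equals the requirement is rejected in this sketch; a deployment that wants to admit exact fits would use `>=` instead.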
  • the weight proportions of the three dimensions of CPU, memory, and disk can be different.
  • the weight ratio of CPU and memory can be increased; in other words, host nodes with high CPU and/or large memory will be preferred. For storage clusters, in the process of pre-selecting host nodes, the weight proportion of disk can be increased; in other words, host nodes with larger disks will be preferred.
  • for the current host node, first select the instances that are allowed to be deployed on the current host node from the list of instances to be scheduled; then, for each instance that is allowed to be deployed, if the instance has a binding relationship with the instances allowed to be deployed and complies with the principle of non-mutual exclusion, and the available resources of the current host node meet the deployment requirements of the current instance, then the current host node meets the first preset rule and is added to the preselected host node list PreChooseHosts; if the current host node does not meet any one of the rules in the first preset rule, the next host node in the host node list is traversed.
  • Step 211 Add the current host node to the preselected host node list PreChooseHosts;
  • If the preselected host node list PreChooseHosts corresponding to any instance is empty, another host node outside the host node list can be obtained and added to the host node list, and the instance can be deployed to that host node.
  • This method reduces the probability of instance allocation failure by obtaining other host nodes outside the host node list, thereby improving the reliability of cluster resource allocation results.
  • The host node is selected from the host node list according to the first preset rule; that is, it is determined whether the current host node satisfies the strong affinity rule, whether it satisfies the strong anti-affinity rule, and whether its available resources are sufficient. If the current node meets all three conditions, it is taken as a preselected host node and added to the preselected host node list corresponding to the current instance. If the current node fails any one of the three conditions, the next host node is evaluated. This pre-screening determines the candidate host nodes for the current instance, which narrows the selection range and improves the efficiency of node allocation.
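The three-condition pre-screening can be sketched as below. The text does not give a data model for strong affinity or strong anti-affinity, so the encoding here is an assumption: `requires` holds components that must already be on the host node, and `excludes` holds components that must not be. All class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    free: dict                                    # e.g. {"cpu": 8, "memory": 32, "disk": 500}
    deployed: set = field(default_factory=set)    # components already on this host node


@dataclass
class Instance:
    component: str
    need: dict
    requires: set = field(default_factory=set)    # strong affinity (assumed encoding)
    excludes: set = field(default_factory=set)    # strong anti-affinity (assumed encoding)


def satisfies_first_rule(host: Host, inst: Instance) -> bool:
    """All three conditions of the first preset rule must hold at once."""
    strong_affinity_ok = inst.requires <= host.deployed
    strong_anti_affinity_ok = not (inst.excludes & host.deployed)
    resources_ok = all(host.free[dim] > amount for dim, amount in inst.need.items())
    return strong_affinity_ok and strong_anti_affinity_ok and resources_ok


def preselect(hosts: list, inst: Instance) -> list:
    """Build the preselected host node list for one instance."""
    return [h.name for h in hosts if satisfies_first_rule(h, inst)]
```

A host node that fails any single condition is skipped, which matches the "evaluate the next host node" behaviour described above.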
  • According to the second preset rule, the optimal host node is selected from the preselected host node list. Specifically, for each host node corresponding to each instance, the score of the host node for that instance is determined from the preset correspondence among host nodes, instances, and scores; the scores of the host nodes corresponding to each instance are then compared, and the host node with the highest score is taken as the optimal host node.
  • For example, if instance C11 in component B1 is deployed to host node A1, it is recommended to also deploy instance C21 in component B2 to host node A1 to improve system performance; however, if instance C21 in component B2 is absent, system functions can still be achieved by deploying only instance C11 in component B1 to host node A1.
  • In this case, the relationship between instance C11 and instance C21 satisfies the weak affinity rule: system performance is better when instance C11 in component B1 and instance C21 in component B2 are deployed together on host node A1, but system functions can also be achieved when only one of them is deployed on host node A1.
  • Conversely, if instance C11 in component B1 has been deployed to host node A1, it is not recommended to deploy instance C21 in component B2 to host node A1. If, owing to force majeure, instance C11 in component B1 and instance C21 in component B2 must be deployed together on host node A1, system functions can still be implemented, but system performance is reduced. Here instance C11 and instance C21 satisfy the weak anti-affinity rule: system performance is better when only one of them is deployed on host node A1, but system functions can also be achieved when both are deployed together on host node A1.
  • The above weak affinity rules and weak anti-affinity rules are predefined and can be expressed in the form of an affinity rule list.
  • Each weak affinity rule and weak anti-affinity rule in the affinity rule list presets a score (weight) for each host node: the score defined in a weak affinity rule is a positive integer, and the score defined in a weak anti-affinity rule is a negative integer.
  • If the rule to be verified for the current host node is a weak affinity rule, the score of the current host node for deploying the current instance is obtained, and that score is a positive integer; if the rule to be verified for the current host node is a weak anti-affinity rule, the score obtained is a negative integer.
  • Suppose the current host node A1 can deploy n instances C1, C2, ..., Cn, and that the score of the current host node A1 is Y1 for instance C1, Y2 for instance C2, ..., and Yn for instance Cn.
  • From Y1, Y2, ..., Yn, the total score Y of the current host node is obtained; the total scores are then compared, and the host node with the highest score is taken as the optimal host node.
  • In this way, the score of the host node corresponding to the instance is determined, and the scores of the host nodes are compared to select the host node with the highest score as the optimal host node.
  • This method has flexible scoring rules for host nodes corresponding to instances and can be widely adapted to different types of host nodes, which improves the universality of the method.
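The scoring step can be illustrated with a small sketch. Two points are assumptions made here rather than statements from the text: the total score Y is taken to be the sum of the individual rule scores, and rules are keyed by a hypothetical `(component, neighbour)` pair with a positive score for weak affinity and a negative score for weak anti-affinity.

```python
def host_score(deployed: set, component: str, rules: dict) -> int:
    """Add up the weak-rule scores triggered by components already on the host node.

    rules maps (component, neighbour) to an integer score: positive for weak
    affinity, negative for weak anti-affinity (assumed encoding).
    """
    return sum(
        score
        for (source, neighbour), score in rules.items()
        if source == component and neighbour in deployed
    )


def best_hosts(candidates: dict, component: str, rules: dict):
    """Return the host nodes sharing the highest score, plus all scores.

    candidates maps a host node name to the set of components deployed on it
    and is assumed to be non-empty.
    """
    scores = {name: host_score(dep, component, rules) for name, dep in candidates.items()}
    top = max(scores.values())
    return [name for name, s in scores.items() if s == top], scores


rules = {("B2", "B1"): 10, ("B2", "B3"): -5}
candidates = {"A1": {"B1"}, "A2": {"B3"}, "A3": {"B1", "B3"}}
winners, scores = best_hosts(candidates, "B2", rules)
print(winners, scores)
```

Returning every host node that shares the top score, rather than a single one, leaves room for a separate tie-breaking step.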
  • If at least two host nodes share the highest score, the idle resources of those host nodes can be compared, and the host node with the most idle resources is taken as the optimal host node.
  • Step 212 Traverse the pre-selected host node list PreChooseHosts, score each host node respectively, and select the optimal host node;
  • Step 213 Determine whether the weak affinity rule score of the current host node is greater than the highest score of the host node. If greater, proceed to step 215. If equal, proceed to step 214. If less, return to step 212;
  • Step 214 Compare the idle resource levels of the host nodes and select the host node with the most idle resources as the optimal host node;
  • the host node with more idle resources is preferred as the optimal host node.
  • the idle resources of the host node can be analyzed from the three dimensions of CPU, memory, and disk.
  • CPU, memory, and disk are preset with different weights. For example, for computing clusters the weights of CPU and memory can be increased; for storage clusters the weight of disk can be increased.
  • The resource optimization workflow for resource planning includes the following steps:
  • Step 501 Input the host node host1 and the host node host2;
  • the host node host1 and the host node host2 have the highest scores and the scores are equal.
  • Step 502 Collect idle resources of the host node host1 and collect idle resources of the host node host2;
  • Step 503 Calculate the idle resource status score S1 of the host node host1 in a weighted manner, and calculate the idle resource status score S2 of the host node host2 in a weighted manner;
  • That is, the idle resources of the host node host1 and of the host node host2 are collected separately, and a weighted calculation over the preset weights and the absolute values of the idle resources yields the idle resource score S1 of the host node host1 and the idle resource score S2 of the host node host2.
  • Step 504 Determine whether the condition S1 > S2 is satisfied. If so, execute step 505; otherwise, execute step 506;
  • Step 505 Use the host node host1 as the optimal host node;
  • Step 506 Use the host node host2 as the optimal host node.
  • S1 and S2 are compared, and the host node with the higher score is selected as the optimal host node.
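Steps 501 to 506 amount to a weighted comparison of idle resources. The sketch below assumes the weighted calculation is a weighted sum over the three dimensions; the weight values and the idle figures are hypothetical and chosen only for illustration.

```python
def idle_score(idle: dict, weights: dict) -> float:
    """Step 503: weighted sum of the absolute idle amounts (assumed formula)."""
    return sum(weights[dim] * idle[dim] for dim in weights)


def pick_between(host1: str, idle1: dict, host2: str, idle2: dict, weights: dict) -> str:
    """Steps 504-506: host1 wins only if S1 > S2; otherwise host2 is chosen."""
    s1 = idle_score(idle1, weights)
    s2 = idle_score(idle2, weights)
    return host1 if s1 > s2 else host2


# Hypothetical weights for a computing cluster: CPU and memory dominate.
compute_weights = {"cpu": 0.5, "memory": 0.4, "disk": 0.1}
idle_host1 = {"cpu": 16, "memory": 64, "disk": 100}
idle_host2 = {"cpu": 8, "memory": 32, "disk": 200}
print(pick_between("host1", idle_host1, "host2", idle_host2, compute_weights))
```

Because the calculation uses absolute values, a dimension measured in large units (for example disk in GiB) can outweigh the others even with a small weight, so the weights would need to be chosen with the units in mind. When S1 equals S2, the flow as written falls through to step 506 and selects host2.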
  • Step 215 Bind the current instance to the optimal host node to obtain the binding relationship between the host node and the instance;
  • Step 216 Determine whether a host node is deployed for each instance. If so, perform step 217; otherwise, perform step 218;
  • Step 217 Output the adjusted binding relationship between the host node and the component module instance;
  • Step 218 Output the reason for the allocation failure and the allocated host node list.
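The closing steps 215 to 218 can be sketched as a loop that binds each instance and then reports either the bindings or the failure reasons. The `preselect` and `choose_best` callbacks, the result dictionary layout, and the failure message are all hypothetical; they stand in for the first and second preset rules.

```python
def plan(instances, preselect, choose_best):
    """Bind every instance to a host node, or report why allocation failed.

    preselect(instance) returns the candidate host nodes for that instance;
    choose_best(instance, candidates) returns the optimal one among them.
    """
    bindings, failures = {}, {}
    for inst in instances:
        candidates = preselect(inst)
        if not candidates:
            failures[inst] = "no host node satisfies the first preset rule"
            continue
        bindings[inst] = choose_best(inst, candidates)  # step 215

    if not failures:  # step 216: every instance has a host node
        return {"ok": True, "bindings": bindings}  # step 217
    return {  # step 218: failure reasons plus the host nodes already allocated
        "ok": False,
        "reasons": failures,
        "allocated_hosts": sorted(set(bindings.values())),
    }
```

Passing the two rules in as callbacks keeps this sketch independent of how pre-screening and scoring are actually implemented.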
  • Embodiments of the present invention provide a cluster resource planning method. If at least two host nodes share the highest score, the idle resources of those host nodes are compared, and the host node with the highest idle resource score is taken as the optimal host node. Because the idle resources of host nodes are ranked using configurable dimensions and weights, the scoring rules are flexible and can be applied to many types of clusters. In addition, the two-stage selection rule, which prefers the host node with the highest score and then the highest idle resource score, further improves the quality of the resource allocation result.
  • The method can also quantify expert experience into a set of JSON template files, which include the affinity rule list, the first preset priority, the second preset priority, the available resources of the host nodes, and the preset weights.
  • As a result, the cluster resource planning method can be automated efficiently.
  • Component affinity rules are defined in the template file, which can accurately express the dependencies and mutual-exclusion relationships between components; the template file also allows the resource requirements of components to be considered in layers, with the requirements of each layer considered separately.
  • Different template combinations can be introduced as needed, and their priority relationships can be adjusted as needed, to keep the resource allocation reasonable and of high quality.
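As an illustration of what such a template might contain, the sketch below parses a small JSON document and checks the sign convention for weak rules. The schema is entirely hypothetical: the text lists which kinds of information a template holds but does not give field names or a file layout.

```python
import json

# Hypothetical template layout; every key name here is an assumption.
TEMPLATE = """
{
  "components": [
    {"name": "B1", "priority": 1,
     "modules": [{"name": "M1", "priority": 1, "instances": 2}]},
    {"name": "B2", "priority": 2,
     "modules": [{"name": "M1", "priority": 1, "instances": 1}]}
  ],
  "affinity_rules": [
    {"type": "weak_affinity", "source": "B2", "target": "B1", "score": 10},
    {"type": "weak_anti_affinity", "source": "B2", "target": "B3", "score": -5}
  ],
  "hosts": [
    {"name": "A1", "cpu": 8, "memory": 32, "disk": 500}
  ],
  "weights": {"cpu": 0.5, "memory": 0.4, "disk": 0.1}
}
"""


def load_template(text: str) -> dict:
    """Parse a template and enforce the score signs described for weak rules."""
    template = json.loads(text)
    for rule in template.get("affinity_rules", []):
        if rule["type"] == "weak_affinity" and rule["score"] <= 0:
            raise ValueError("weak affinity scores must be positive integers")
        if rule["type"] == "weak_anti_affinity" and rule["score"] >= 0:
            raise ValueError("weak anti-affinity scores must be negative integers")
    return template


template = load_template(TEMPLATE)
print(sorted(template))
```

Validating the template when it is loaded means a mis-signed rule is rejected before any scheduling decision depends on it.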
  • Embodiments of the present invention also provide a cluster resource planning device, which is applied to a public cloud platform.
  • This device corresponds to the method in the embodiments of the present invention, and the principle by which the device solves the problem is similar to that of the method; the implementation of the device can therefore refer to the implementation of the method, and repeated details are omitted.
  • the above device includes the following modules:
  • the priority sorting module 601 is used to obtain the host node list and the first component list that needs to be deployed, and sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled;
  • the rule verification module 602 is configured to select a host node in the host node list according to the first preset rule for each instance in each module in the to-be-scheduled instance list, and obtain a preselected host node list corresponding to the current instance;
  • the resource optimization module 603 is configured to select the optimal host node from the pre-selected host node list according to the second preset rule, and then bind the optimal host node to the current instance to obtain the binding relationship;
  • the deployment module 604 is used to deploy the instance corresponding to the host node in the binding relationship to the host node.
  • the priority sorting module is specifically used to:
  • the prioritization module is specifically used for:
  • the first preset rule includes:
  • the current host node satisfies the strong affinity rule
  • the current host node satisfies the strong anti-affinity rule
  • the available resources of the current host node meet the deployment requirements of the current instance.
  • the resource optimization module is specifically used to:
  • the resource optimization module is also used to:
  • the rule verification module is also used to:
  • Embodiments of the present invention also provide a cluster resource planning device, which is applied to a public cloud platform. Since this cluster resource planning device is the device that carries out the method in the embodiments of the present invention, and the principle by which it solves the problem is similar to that of the method, the implementation of the cluster resource planning device can refer to the implementation of the method, and repeated details are omitted.
  • the cluster resource planning device 70 according to this embodiment of the present invention is described below with reference to FIG. 7 .
  • the cluster resource planning device 70 shown in FIG. 7 is only an example and should not impose any restrictions on the functions and usage scope of the embodiments of the present invention.
  • the cluster resource planning device 70 may be in the form of a general computing device, for example, it may be a terminal device.
  • the components of the cluster resource planning device 70 may include, but are not limited to: the above-mentioned at least one processor 71, the above-mentioned at least one memory 72 that stores executable instructions of the processor 71, and a bus 73 connecting different system components (including the memory 72 and the processor 71).
  • the processor 71 is the processor of the smart device.
  • the processor 71 executes executable instructions to implement the following steps:
  • the optimal host node after selecting the optimal host node from the preselected host node list, bind the optimal host node to the current instance to obtain the binding relationship;
  • the processor 71 is specifically used to:
  • the processor 71 is specifically used to:
  • the first preset rule includes:
  • the current host node satisfies the strong affinity rule
  • the current host node satisfies the strong anti-affinity rule
  • the number of available resources on the current host node meets the deployment requirements of the current instance.
  • the processor 71 is specifically used to:
  • the processor 71 is also used to:
  • the processor 71 before selecting a host node in the host node list according to the first preset rule and obtaining the preselected host node list corresponding to the current instance, the processor 71 is also used to:
  • Bus 73 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus structures.
  • Memory 72 may include readable media in the form of volatile memory, such as random access memory (RAM) 721 and/or cache memory 722 , and may further include read only memory (ROM) 723 .
  • Memory 72 may also include a program/utility 725 having a set of (at least one) program modules 724 including, but not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination thereof, may include an implementation of a network environment.
  • Cluster resource planning device 70 may also communicate with one or more external devices 74 (e.g., keyboard, pointing device, etc.), with one or more devices that enable a user to interact with cluster resource planning device 70, and/or with any device (e.g., router, modem, etc.) that enables cluster resource planning device 70 to communicate with one or more other computing devices. This communication may occur through input/output (I/O) interface 75.
  • the cluster resource planning device 70 may also communicate with one or more networks (eg, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through a network adapter 76 . As shown, network adapter 76 communicates with other modules of electronic device 70 via bus 73 .
  • Other hardware and/or software modules may be used in conjunction with the cluster resource planning device 70, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
  • Various aspects of the present invention can also be implemented in the form of a program product that includes program code. When the program product runs on a terminal device, the program code causes the terminal device to execute the steps of each module in the cluster resource planning device according to the various exemplary embodiments of the present disclosure described in the "Example Method" section. For example, the network-side device can obtain the host node list and the first component list that needs to be deployed, and sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled; for each instance in each module in the list of instances to be scheduled, select a host node from the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance; and, after selecting the optimal host node from the preselected host node list according to the second preset rule, bind the optimal host node to the current instance to obtain the binding relationship.
  • The program product may take the form of one or more readable media in any combination.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • A program product 80 for cluster resource planning according to an embodiment of the present invention is described; it can take the form of a portable compact disk read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer.
  • the program product of the present invention is not limited thereto.
  • a readable storage medium may be any tangible medium containing or storing a program that may be used by or in combination with an instruction execution system, apparatus or device.
  • The readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code therein. Such propagated data signals may take a variety of forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the above.
  • a readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium, including, but not limited to, wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., utilizing an Internet service provider to connect via the Internet).
  • the present application can also be implemented using hardware and/or software (including firmware, resident software, microcode, etc.).
  • The present application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by, or in conjunction with, an instruction execution system.
  • A computer-usable or computer-readable medium may be any medium that can contain, store, communicate, transmit, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed are a cluster resource planning method, a device, an apparatus, and a medium, which are applied to a public cloud platform. The method comprises: after acquiring a first list of components to be deployed, sorting the first list of components according to a first preset priority and a second preset priority, to obtain a list of instances to be scheduled; for each instance in a module, selecting a host node from a host node list according to a first preset rule, to obtain a preselected host node list corresponding to the current instance; after selecting an optimal host node from the preselected host node list according to the preselected host node list and a second preset rule, binding the optimal host node to the current instance, to obtain a binding relationship; deploying to the host node the instance corresponding to the host node in the binding relationship. The present invention can improve the efficiency of resource planning by implementing automatic deployment of cluster resources under the public cloud platform.

Description

A cluster resource planning method, device, apparatus, and medium
Technical field
The present invention relates to the field of computer technology, and in particular to a cluster resource planning method, device, apparatus, and medium.
Background
With the advent of the cloud computing era, various big data components have been widely adopted on public cloud platforms; in particular, public cloud platforms can be used to deploy multiple components to a cluster.
Currently, deploying multiple components to a cluster on a public cloud platform requires the intervention of experts, who rely on their experience to deploy instances to the cluster nodes.
Technical problem
However, because expert experience depends on the experts themselves and requires them to process work orders manually, deployment efficiency is low.
Technical solutions
The present invention provides a cluster resource planning method, device, apparatus, and medium to solve the problem of inefficient cluster resource planning in the prior art.
In a first aspect, embodiments of the present invention provide a cluster resource planning method applied to a public cloud platform, including:
After obtaining the host node list and the first component list that needs to be deployed, sorting the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled;
For each instance in each module in the list of instances to be scheduled, selecting a host node from the host node list according to the first preset rule to obtain a preselected host node list corresponding to the current instance;
After selecting the optimal host node from the preselected host node list according to the second preset rule, binding the optimal host node to the current instance to obtain a binding relationship;
Deploying the instance corresponding to the host node in the binding relationship to the host node.
In the cluster resource planning method provided by this embodiment of the present invention, the obtained first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled. For each instance in each module in that list, a preselected host node list is determined according to the first preset rule, and the optimal host node to bind to the current instance is selected from the preselected host node list according to the second preset rule, yielding a binding relationship; the instance corresponding to each host node is then deployed according to the binding relationship. Because the method automatically allocates the instances of multiple components and deploys them to suitable host nodes, it can improve the efficiency of multi-component cluster resource planning on a public cloud platform.
In an optional implementation, sorting the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled includes:
Arranging each component in the first component list in descending order according to the first preset priority to obtain a second component list;
Arranging the modules corresponding to each component in the second component list in descending order according to the second preset priority to obtain a component module list corresponding to the current component;
For each module in the component module list, creating instances for the module according to the attribute information giving the number of instances corresponding to the module, to obtain the list of instances to be scheduled.
In the above method, the components in the first component list are arranged in descending order according to the first preset priority, the modules corresponding to each component are arranged in descending order according to the second preset priority, and instances are created for each module according to the attribute information giving the module's number of instances, yielding the list of instances to be scheduled. Sorting every component, and every module of each component, by priority ensures that the affinity dependency policies of instances deployed later are resolved correctly, which improves the quality of the allocation result.
In an optional implementation, arranging each component in the first component list in descending order according to the first preset priority includes:
Traversing the first component list;
Sorting each component in the first component list by priority value from small to large.
As described above, arranging each component in the first component list in descending order according to the first preset priority means sorting the components by priority value from small to large. The smaller the priority value, the higher the priority of the corresponding component; that is, the components in the first component list are sorted from highest to lowest priority. Sorting the components by priority allows each component to be traversed in turn in the subsequent process, optimizing the host nodes allocated to the instances in each component.
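The two-level sort and instance creation can be sketched as follows. The dictionary keys and the instance naming scheme are hypothetical, and the sketch assumes that module priorities follow the same convention as component priorities (a smaller value means a higher priority), which the text states explicitly only for components.

```python
def build_instance_list(components: list) -> list:
    """Sort components, then modules, by priority value and expand the instances.

    A smaller priority value means a higher priority, so an ascending sort on
    the value yields a descending order of priority.
    """
    instances = []
    for comp in sorted(components, key=lambda c: c["priority"]):
        for mod in sorted(comp["modules"], key=lambda m: m["priority"]):
            for index in range(mod["instances"]):
                instances.append(f'{comp["name"]}/{mod["name"]}#{index}')
    return instances


components = [
    {"name": "B2", "priority": 2,
     "modules": [{"name": "M1", "priority": 1, "instances": 1}]},
    {"name": "B1", "priority": 1,
     "modules": [{"name": "M2", "priority": 2, "instances": 1},
                 {"name": "M1", "priority": 1, "instances": 2}]},
]
print(build_instance_list(components))
```

Python's `sorted` is stable, so components or modules with equal priority values keep their original relative order.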
In an optional implementation, the first preset rule includes:
The current host node satisfies the strong affinity rule;
And, the current host node satisfies the strong anti-affinity rule;
And, the available resources of the current host node meet the deployment requirements of the current instance.
In the above method, a host node is selected from the host node list according to the first preset rule; that is, it is determined whether the current host node satisfies the strong affinity rule, whether it satisfies the strong anti-affinity rule, and whether its available resources are sufficient. If the current node meets all three conditions, it is taken as a preselected host node and added to the preselected host node list corresponding to the current instance. If the current node fails any one of the three conditions, the next host node is evaluated. This pre-screening determines the candidate host nodes for the current instance, which narrows the selection range and improves the efficiency of node allocation.
In an optional implementation, selecting the optimal host node from the host node list according to the preselected host node list and the second preset rule includes:
For each host node corresponding to each instance, determining the score of the host node corresponding to the instance according to the preset correspondence among host nodes, instances, and scores;
Comparing the scores of the host nodes corresponding to each instance, and taking the host node with the highest score as the optimal host node.
In the above method, the score of the host node corresponding to the instance is determined according to the preset correspondence among host nodes, instances, and scores, and the scores of the host nodes are compared to select the host node with the highest score as the optimal host node. The scoring rules for the host nodes corresponding to instances are flexible and can be adapted to many types of host nodes, which improves the generality of the method.
In an optional implementation, if at least two host nodes share the highest score, the method further includes:
comparing the idle resources of the at least two host nodes with the highest score, and taking the host node with the most idle resources as the optimal host node.
In the above method, if at least two host nodes share the highest score, their idle resources are compared and the host node with the most idle resources is taken as the optimal host node. This two-level selection rule, which prefers the highest score first and the most idle resources second, further safeguards the quality of the resource allocation result.
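The two-level selection rule can be illustrated with a short sketch. The `Candidate` type and the single aggregated `idle_resources` figure are assumptions made for this example; the source does not specify how scores or idle resources are represented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    score: int           # looked up from the preset host/instance/score mapping
    idle_resources: int  # assumption: one aggregated free-resource figure


def pick_optimal_host(candidates: List[Candidate]) -> Optional[Candidate]:
    """Pick the highest score; break ties by the most idle resources."""
    if not candidates:
        return None
    # Tuples compare element by element, so idle_resources only matters
    # when two candidates have the same score.
    return max(candidates, key=lambda c: (c.score, c.idle_resources))


# Illustrative data only: A2 and A3 tie on score, A3 has more idle resources.
best = pick_optimal_host(
    [Candidate("A1", 80, 16), Candidate("A2", 90, 8), Candidate("A3", 90, 32)]
)
```

Using a composite key keeps both levels of the rule in one comparison instead of a separate tie-breaking pass.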
In an optional implementation, before selecting host nodes from the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance, the method further includes:
traversing the list of instances to be scheduled;
for each instance in the list of instances to be scheduled, initializing the preselected host node list corresponding to the current instance.
In the above method, the preselected host node list corresponding to the current instance is initialized for each instance in the list of instances to be scheduled, which ensures the accuracy of the preselected host node list obtained according to the first preset rule.
In a second aspect, an embodiment of the present invention provides a cluster resource planning device, applied to a public cloud platform, including a memory and a processor, where the memory stores a computer program and the processor, when executing the computer program, implements the steps of the cluster resource planning method according to any one of the above embodiments.
In a third aspect, an embodiment of the present invention provides a cluster resource planning apparatus, applied to a public cloud platform, including:
a priority sorting module, configured to, after a host node list and a first component list to be deployed are obtained, sort the first component list according to a first preset priority and a second preset priority to obtain a list of instances to be scheduled;
a rule verification module, configured to, for each instance in each module in the list of instances to be scheduled, select host nodes from the host node list according to a first preset rule to obtain a preselected host node list corresponding to the current instance;
a resource optimization module, configured to select an optimal host node from the preselected host node list according to a second preset rule and then bind the optimal host node to the current instance to obtain a binding relationship;
a deployment module, configured to deploy the instance corresponding to each host node in the binding relationship onto that host node.
In an optional implementation, the priority sorting module is specifically configured to:
arrange the components in the first component list in descending order according to the first preset priority to obtain a second component list;
arrange the modules corresponding to each component in the second component list in descending order according to the second preset priority to obtain a component module list corresponding to the current component;
for each module in the component module list, create instances for the module according to the attribute information on the number of instances corresponding to the module, to obtain the list of instances to be scheduled.
In an optional implementation, the priority sorting module is specifically configured to:
traverse the first component list;
sort the components in the first component list by priority value in ascending order.
In an optional implementation, the first preset rule includes:
the current host node satisfies the strong affinity rule;
and the current host node satisfies the strong anti-affinity rule;
and the available resources of the current host node meet the deployment requirements of the current instance.
In an optional implementation, the resource optimization module is specifically configured to:
for each host node corresponding to each instance, determine the score of the host node for the instance according to a preset correspondence among host nodes, instances, and scores;
compare the scores of the host nodes corresponding to each instance, and take the host node with the highest score as the optimal host node.
In an optional implementation, if at least two host nodes share the highest score, the resource optimization module is further configured to:
compare the idle resources of the at least two host nodes with the highest score, and take the host node with the most idle resources as the optimal host node.
In an optional implementation, the rule verification module is further configured to:
traverse the list of instances to be scheduled;
for each instance in the list of instances to be scheduled, initialize the preselected host node list corresponding to the current instance.
In a fourth aspect, an embodiment of the present invention provides a computer storage medium. The computer-readable storage medium stores computer instructions which, when run on a computer, cause the computer to perform the steps of the cluster resource planning method according to any one of the above embodiments.
For the technical effects achievable by the cluster resource planning device of the second aspect, the cluster resource planning apparatus of the third aspect, and the computer storage medium of the fourth aspect, refer to the description of the technical effects of the first aspect and its possible implementations above; they are not repeated here.
Description of drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is a schematic flow chart of a cluster resource planning method provided by an embodiment of the present invention;
Figure 2 is a schematic flow chart of another cluster resource planning method provided by an embodiment of the present invention;
Figure 3 is a schematic flow chart of an affinity rule verification method provided by an embodiment of the present invention;
Figure 4 is a schematic flow chart of a method for determining the available resources of a host node provided by an embodiment of the present invention;
Figure 5 is a schematic flow chart of host node resource optimization provided by an embodiment of the present invention;
Figure 6 is a schematic diagram of the module structure of a cluster resource planning apparatus provided by an embodiment of the present invention;
Figure 7 is a schematic structural diagram of a cluster resource planning device provided by an embodiment of the present invention;
Figure 8 is a schematic diagram of a program product for a cluster resource planning method provided by an embodiment of the present invention.
Embodiments of the invention
To make the purpose, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described herein can be practiced in orders other than those illustrated or described herein. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the invention as detailed in the appended claims.
At present, deploying multiple components to a customer cluster on a public cloud platform requires the involvement of experts. They rely on experience to judge which components are compatible or mutually exclusive on the cluster nodes, confirm the resource requirements of each instance, and pre-allocate the different instances of the different components across multiple host nodes as evenly as possible. Because this approach requires experts to process work orders manually, the work orders cannot be handled automatically, so as business volume grows, cluster creation on the public cloud is limited by how fast the experts can work. Moreover, even with manual expert involvement, the quality of the resulting resource allocation is hard to guarantee.
To solve the above problems, embodiments of the present invention provide a cluster resource planning method, device, apparatus, and medium to improve the efficiency of cluster resource planning.
Embodiment 1
A cluster resource planning method provided by the present invention is described below through specific embodiments. The method is applied to a public cloud platform and, as shown in Figure 1, includes:
Step 101: after a host node list and a first component list to be deployed are obtained, sort the first component list according to a first preset priority and a second preset priority to obtain a list of instances to be scheduled;
Step 102: for each instance in each module in the list of instances to be scheduled, select host nodes from the host node list according to a first preset rule to obtain a preselected host node list corresponding to the current instance;
Step 103: select an optimal host node from the preselected host node list according to a second preset rule, then bind the optimal host node to the current instance to obtain a binding relationship;
Step 104: deploy the instance corresponding to each host node in the binding relationship onto that host node.
It should be noted that the cluster resource planning method provided by the embodiments of the present invention can be applied to cloud hosts, network-side devices, GPU (Graphics Processing Unit) computing devices, and terminals; the application scenarios of the method are not specifically limited here.
An embodiment of the present invention provides a cluster resource planning method. The obtained first component list is sorted according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled. For each instance in each module in that list, a preselected host node list is determined according to the first preset rule, and the optimal host node to bind to the current instance is selected from the preselected host node list according to the second preset rule, yielding a binding relationship. The instances are then deployed to their corresponding host nodes according to the binding relationship. The method thus automatically allocates the instances of multiple components to suitable host nodes and deploys them there, which improves the efficiency of multi-component cluster resource planning on a public cloud platform while also safeguarding the quality of the allocation result.
In addition, the method can be adapted to a wide range of resource allocation scenarios and has good generality.
As an optional implementation, sorting the first component list according to the first preset priority and the second preset priority to obtain the list of instances to be scheduled may proceed as follows. First, the components in the first component list are arranged in descending order according to the first preset priority to obtain a second component list. Next, the modules corresponding to each component in the second component list are arranged in descending order according to the second preset priority to obtain a component module list corresponding to the current component. Finally, for each module in the component module list, instances are created for the module according to the attribute information on the number of instances corresponding to the module, to obtain the list of instances to be scheduled.
In a specific implementation, Figure 2 shows the overall flow chart of a cluster resource planning method provided by the present invention. Part 21 of Figure 2 is the specific process of sorting the first component list according to the first preset priority and the second preset priority to obtain the list of instances to be scheduled, and includes the following steps:
Step 201: obtain the host node list Hosts and the first component list Components1 to be deployed;
Specifically, the host node list Hosts includes all host nodes owned by the cluster.
Step 202: sort the first component list according to the first preset priority to obtain the second component list Components2;
Specifically, the first component list Components1 is traversed and its components are sorted by priority value in ascending order. The priority value of a component is its preconfigured piriority attribute, which is a positive integer; the smaller the piriority value, the higher the priority of the corresponding component. In other words, after the first component list Components1 to be deployed is obtained, it is traversed and its components are sorted by piriority value in ascending order to obtain the second component list Components2, in which the components are arranged from highest to lowest priority.
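Step 202 amounts to an ascending sort on the piriority attribute. The sketch below assumes a hypothetical `Component` type holding only a name and that attribute (spelled as in the source); the sample values are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    name: str
    piriority: int  # positive integer; a smaller value means a higher priority


def sort_components(components1: List[Component]) -> List[Component]:
    """Return Components2: ascending piriority, i.e. highest priority first."""
    # sorted() is stable, so components with equal piriority keep their
    # original relative order.
    return sorted(components1, key=lambda c: c.piriority)


components1 = [Component("B2", 3), Component("B1", 1), Component("B3", 2)]
components2 = sort_components(components1)
```

Sorting in ascending order of the numeric value and sorting in descending order of priority are the same operation here, which is why the text can describe the result both ways.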
Step 203: traverse the second component list and sort the modules corresponding to each component according to the second preset priority;
Specifically, the modules corresponding to each component in the second component list Components2 are arranged in descending order according to the second preset priority to obtain the component module list corresponding to the current component. The second preset priority is the preset priority of the modules: the internal structure of the corresponding component defines the order of the modules the component contains, and no additional adjustment is made.
Step 204: determine whether the current module is a dynamic module; if so, set the dynamic flag of the current module to true; otherwise, perform step 205;
Specifically, modules are divided by type into dynamic modules and static modules, and the attribute information of a module is set in advance through its replica value. If the replica value of the current module is a magic number, for example 999999 or 999998, the current module is a dynamic module. If the replica value is not a magic number, for example 3, the current module is a static module.
Step 205: create the corresponding instances for each module according to the attribute information on the number of instances corresponding to the module.
For example, if the current module is a static module with 3 instances, three instances a1, a2, and a3 are created and added to the list of instances to be scheduled; if the current module is a dynamic module, instances are created according to the number of host nodes and added to the list of instances to be scheduled;
Specifically, for each module in the component module list, instances are created for the module according to the attribute information on the number of instances corresponding to the module, to obtain the component module instance list; that is, as many instances as the module's replica value are generated for the module. If the replica value of the current module is a magic number, the current module is a dynamic module: instances are created according to the given number of host nodes and added to the list of instances to be scheduled. For example, a replica value of 999999 can indicate that the module is deployed on all host nodes, and a replica value of 999998 can indicate that the module is deployed on two-thirds of the host nodes. If the replica value is not a magic number, the current module is a static module and instances are created according to the replica value. For example, a replica value of 3 means the module contains 3 instances, so 3 instances are created and added to the list of instances to be scheduled.
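The replica handling in steps 204 and 205 can be sketched as follows. The two magic numbers and their meanings come from the examples in the text; the constant names, the instance naming scheme, and rounding down for the two-thirds case are assumptions of this sketch, since the source does not specify them.

```python
from typing import List

ALL_HOSTS = 999999         # magic number: one instance per host node
TWO_THIRDS_HOSTS = 999998  # magic number: instances for two-thirds of the hosts


def is_dynamic(replica: int) -> bool:
    """A module is dynamic when its replica value is one of the magic numbers."""
    return replica in (ALL_HOSTS, TWO_THIRDS_HOSTS)


def instance_count(replica: int, host_count: int) -> int:
    """Number of instances to create for a module with the given replica value."""
    if replica == ALL_HOSTS:
        return host_count
    if replica == TWO_THIRDS_HOSTS:
        return (2 * host_count) // 3  # assumption: round down
    return replica  # static module: replica is the literal instance count


def create_instances(module: str, replica: int, host_count: int) -> List[str]:
    """Create named instances for a module, e.g. 'a' with replica 3 gives a1..a3."""
    count = instance_count(replica, host_count)
    return [f"{module}{i}" for i in range(1, count + 1)]


static_instances = create_instances("a", 3, host_count=5)
dynamic_instances = create_instances("b", ALL_HOSTS, host_count=5)
```

Keeping the magic numbers as named constants makes the dynamic cases explicit and avoids scattering literal values such as 999999 through the scheduling code.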
For example, suppose the cluster has 5 host nodes and 3 different big data components are to be allocated to it. The 5 host nodes are A1, A2, ..., A5, and the 3 components are B1, B2, and B3. Component B1 contains 3 modules: B11, B12, and B13. Component B2 contains 2 modules: B21 and B22. Component B3 contains 4 modules: B31, B32, B33, and B34. Modules B12 and B33 are dynamic modules; the remaining modules are static modules, each containing 2 instances. The first component list, second component list, component module list, and list of instances to be scheduled are shown in Table 1.
Table 1
In this embodiment of the present invention, the components in the first component list are arranged in descending order according to the first preset priority, the modules corresponding to each component are arranged in descending order according to the second preset priority, and instances are created for each module according to the attribute information on the number of instances corresponding to the module, yielding the list of instances to be scheduled. Sorting every component, and every module of each component, by priority ensures that the affinity dependency policies of the deployed instances are resolved correctly, which improves the quality of the allocation result.
As an optional implementation, the first preset rule may include: the current host node satisfies the strong affinity rule; and the current host node satisfies the strong anti-affinity rule; and the available resources of the current host node meet the deployment requirements of the current instance.
It should be noted that, given a host node A1, a component B1, and a component B2: if, once instance C11 of component B1 is deployed on host node A1, host node A1 must also deploy instance C21 of component B2, then instance C11 and instance C21 are subject to the strong affinity rule; that is, instance C11 of component B1 and instance C21 of component B2 have a binding relationship when deployed on host node A1. If, once instance C11 of component B1 is deployed on host node A1, host node A1 can no longer deploy instance C21 of component B2, then instance C11 and instance C21 are subject to the strong anti-affinity rule; that is, instance C11 of component B1 and instance C21 of component B2 are mutually exclusive when deployed on host node A1. The strong affinity rules and strong anti-affinity rules are predefined and can be expressed as an affinity rule list.
In a specific implementation, part 22 of Figure 2 is the specific process of selecting host nodes from the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance, and includes the following steps:
Step 206: traverse the list of instances to be scheduled and initialize the preselected host node list PreChooseHosts corresponding to the current instance;
Specifically, for each instance in the list of instances to be scheduled, the preselected host node list corresponding to the current instance is initialized, which ensures the accuracy of the preselected host node list obtained according to the first preset rule.
Step 207: traverse the host node list Hosts for nodes that conform to the affinity rules and to the hard limits on CPU, memory, and disk performance;
Step 208: determine whether the current host node satisfies the strong affinity rule; if so, perform step 209; otherwise, return to step 207;
Specifically, the instances that are allowed to be deployed on the current host node may first be selected from the list of instances to be scheduled. Then, for each such instance, if the current instance has a binding relationship with it, that is, the strong affinity rule is satisfied, the current instance can be deployed on the current host node. If the current instance does not satisfy the strong affinity rule, it cannot be deployed on the current host node, and the next host node in the host node list is evaluated.
Step 209: determine whether the current host node satisfies the strong anti-affinity rule; if so, perform step 210; otherwise, return to step 207;
Specifically, the instances that are allowed to be deployed on the current host node are first selected from the list of instances to be scheduled. Then, for each such instance, if the current instance is not mutually exclusive with it, that is, the strong anti-affinity rule is not triggered, the current instance may be deployed on the current host node. If the current instance triggers the strong anti-affinity rule, it cannot be deployed on the current host node, and the next host node in the host node list is evaluated.
Specifically, Figure 3 shows the flow chart of affinity rule verification, which includes the following steps:
Step 301: input the affinity rule list AffinityList and obtain the current host node list Hosts;
Specifically, the rules to be verified (expression) in the affinity rule list AffinityList include strong affinity rules and strong anti-affinity rules.
Step 302: traverse the affinity rule list AffinityList and obtain the current rule to be verified, expression;
Here the current rule to be verified is examined to determine whether it is a strong affinity rule or a strong anti-affinity rule.
Step 303: obtain the attribute value X of the node according to the node attribute key specified by the current rule to be verified, expression;
Specifically, if the current rule to be verified is a strong anti-affinity rule and the node attribute it specifies is the module, the attribute value X1 of the current host node is obtained, where X1 is the list of all instances allowed to be deployed on the current node. If the current rule to be verified is a strong affinity rule and the node attribute it specifies is the module, the attribute value X2 of the current host node is obtained, where X2 is the list of all instances allowed to be deployed on the current node.
Step 304: call the operation logic that corresponds to the operation type operator specified by the current rule to be verified, expression;
Specifically, if the current rule to be verified is a strong anti-affinity rule, the node attribute it specifies is the module, the attribute value X1 of the current host node is obtained, and the IN operation logic is called, then step 305 is performed. If the current rule to be verified is a strong affinity rule, the node attribute it specifies is the module, the attribute value X2 of the current host node is obtained, and the EXIST operation logic is called, then step 307 is performed.
Step 305: determine which operation logic is called; if the IN operation logic is called, perform step 306; if the NOTIN operation logic is called, perform step 307; if the EXIST operation logic is called, perform step 308; if the NOTEXIST operation logic is called, perform step 309;
It should be noted that the four operation logics IN, NOTIN, EXIST, and NOTEXIST are independent of which kind of affinity rule is used; they only indicate whether the current rule is triggered. In other words, either kind of affinity rule can call any of the four operation logics.
Step 306: Traverse X and traverse the adaptation values (values) specified by the expression of the current rule to be verified. If there is a match, return true; otherwise, return false;
For example, if the current rule to be verified is a strong anti-affinity rule, let X be the list of instances allowed to be deployed on host node A1, which corresponds to instance C35, where X contains C31, C35, C39, C16, and C23, and let values be C16 and C33. This means that the strong anti-affinity rule applies between C35 and C16, so the two instances cannot be deployed on the same host node, and it likewise applies between C35 and C33. Since X and values have a match, namely both contain C16, true is returned and the strong anti-affinity rule is triggered.
Step 307: Traverse X and traverse the adaptation values (values) specified by the expression of the current rule to be verified. If there is no match, return true; otherwise, return false;
For example, if the current rule to be verified is a strong affinity rule, let X be the list of instances allowed to be deployed on host node A1, which corresponds to instance C35, where X contains C31, C35, C39, C16, and C23, and let values be C14 and C33. This means that the strong affinity rule applies between C35 and C14, so the two instances must be deployed on the same host node, and it likewise applies between C35 and C33. Since X and values have no match, that is, they share no instance, true is returned and the strong affinity rule is not triggered.
Step 308: Traverse X. If X is not empty, return true; otherwise, return false;
For example, if the current rule to be verified is a strong anti-affinity rule, let X be the list of instances allowed to be deployed on host node A1, which corresponds to instance C35, where X contains C31, C35, C39, C16, and C23. X is traversed, and because X is not empty, true is returned.
Step 309: Traverse X. If X is empty, return true; otherwise, return false;
For example, if the current rule to be verified is a strong anti-affinity rule, let X be the list of instances allowed to be deployed on host node A1, which corresponds to instance C35, where X is the empty set. X is traversed, and because X is empty, true is returned.
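The four operation logics of steps 306 to 309 can be sketched as a single dispatch function. This is a minimal illustration rather than the patented implementation: the function name `evaluate` and its signature are assumptions, while the operator names and the example values come from the text above.

```python
from typing import Iterable, Sequence


def evaluate(operator: str, x: Sequence[str], values: Iterable[str] = ()) -> bool:
    """Evaluate one rule expression against the host attribute value X.

    `evaluate` is a hypothetical helper name; only the operator names
    (IN, NOTIN, EXIST, NOTEXIST) are taken from the description.
    """
    wanted = set(values)
    if operator == "IN":        # step 306: true if any element of X matches values
        return any(item in wanted for item in x)
    if operator == "NOTIN":     # step 307: true if no element of X matches values
        return all(item not in wanted for item in x)
    if operator == "EXIST":     # step 308: true if X is not empty
        return len(x) > 0
    if operator == "NOTEXIST":  # step 309: true if X is empty
        return len(x) == 0
    raise ValueError(f"unknown operator: {operator}")


# Values from the examples in the description.
x = ["C31", "C35", "C39", "C16", "C23"]
in_result = evaluate("IN", x, ["C16", "C33"])        # C16 is shared
notin_result = evaluate("NOTIN", x, ["C14", "C33"])  # nothing is shared
exist_result = evaluate("EXIST", x)
notexist_result = evaluate("NOTEXIST", [])
```

Keeping the operators separate from the rule type mirrors the note above that any affinity rule may invoke any of the four operation logics.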
Step 210: Determine whether the available resources of the current host node are sufficient. If so, execute step 211; otherwise, return to step 207;
Specifically, the instances allowed to be deployed that correspond to the current host node are selected from the list of instances to be scheduled. For each such instance, if the available resources of the current host node meet the deployment requirements of the current instance, the current instance is allowed to be deployed on the current host node; if they do not, the current instance cannot be deployed on the current host node, and the next host node in the host node list is traversed.
Specifically, whether the available resources of the current host node meet the deployment requirements of the current instance can be judged in three dimensions: central processing unit (CPU), memory, and disk. If the available resources of the current host node meet the deployment requirements of the current instance in all three dimensions, the current host node is added to the preselected host node list; if any one of the three dimensions does not meet the deployment requirements of the current instance, the next host node in the host node list is evaluated.
Figure 4 shows the specific flow for determining whether the available resources of the current host node are sufficient, which includes the following steps:
Step 401: Obtain the available resources of the current host node and the preset resource requirements of the current instance;
Step 402: Determine whether the available CPU of the current host node is greater than the CPU required by the current instance. If so, execute step 403; otherwise, execute step 406;
Step 403: Determine whether the available memory of the current host node is greater than the memory required by the current instance. If so, execute step 404; otherwise, execute step 406;
Step 404: Determine whether the available disk of the current host node is greater than the disk required by the current instance. If so, execute step 405; otherwise, execute step 406;
Step 405: The available resources of the current host node are sufficient, and the current host node is added to the preselected host node list PreChooseHosts;
Step 406: The available resources of the current host node are insufficient, and the current host node cannot be added to the preselected host node list PreChooseHosts.
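Steps 401 to 406 amount to three sequential comparisons. The sketch below assumes a hypothetical `Resources` record with units chosen only for illustration; the strict greater-than comparison follows the wording of steps 402 to 404.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource record; the units are an assumption."""
    cpu: float     # cores
    memory: float  # GiB
    disk: float    # GiB


def has_sufficient_resources(available: Resources, required: Resources) -> bool:
    # Steps 402-404: each dimension must be strictly greater than the demand.
    # `and` short-circuits, so a failed check skips the later ones (step 406).
    return (
        available.cpu > required.cpu
        and available.memory > required.memory
        and available.disk > required.disk
    )


host = Resources(cpu=16, memory=64, disk=500)
small_instance = Resources(cpu=4, memory=8, disk=100)
large_instance = Resources(cpu=4, memory=128, disk=100)

small_fits = has_sufficient_resources(host, small_instance)
large_fits = has_sufficient_resources(host, large_instance)
```

A host for which the function returns true would be appended to PreChooseHosts (step 405); otherwise it is skipped.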
It should be noted that the weights of the three dimensions (CPU, memory, and disk) may differ between cluster types. For example, for a compute-oriented cluster, the weights of CPU and memory can be increased when preselecting host nodes, so that host nodes with more CPU and/or more memory are preferred; for a storage-oriented cluster, the weight of disk can be increased, so that host nodes with more disk are preferred. Configuring different weight distributions for different cluster types yields better host nodes.
Specifically, the instances allowed to be deployed that correspond to the current host node are first selected from the list of instances to be scheduled. Then, for each instance allowed to be deployed, if the instance has a binding relationship with the instances allowed to be deployed, complies with the non-mutual-exclusion principle, and the available resources of the current host node meet the deployment requirements of the current instance, the current host node satisfies the first preset rule and is added to the preselected host node list PreChooseHosts. If the current host node fails any one of the rules in the first preset rule, the next host node in the host node list is traversed.
Step 211: Add the current host node to the preselected host node list PreChooseHosts;
Specifically, if the preselected host node list PreChooseHosts corresponding to any instance is empty, other host nodes outside the host node list can be obtained and added to the host node list, and the instance is deployed on such a host node. By obtaining host nodes outside the host node list, this method reduces the probability that instance allocation fails and thereby improves the reliability of the cluster resource allocation result.
In this embodiment of the present invention, host nodes are selected from the host node list according to the first preset rule; that is, it is determined whether the current host node satisfies the strong affinity rule, whether it satisfies the strong anti-affinity rule, and whether its available resources are sufficient. If the current node meets all three conditions, it is taken as a preselected host node and added to the preselected host node list corresponding to the current instance; if it fails any one of the three conditions, the next host node is evaluated. This pre-screening makes an initial determination of the preselected host nodes available to the current instance, which narrows the selection range and improves the efficiency of node allocation.
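The pre-screening described above can be sketched as a filter that keeps a host only when every check passes. The function name `preselect_hosts` and the three stand-in checks are assumptions made for illustration; in practice each check would wrap the strong affinity, strong anti-affinity, and resource tests described earlier.

```python
from typing import Callable, List, Sequence

# A check takes (host, instance) and reports whether the host passes.
Check = Callable[[str, str], bool]


def preselect_hosts(instance: str, hosts: Sequence[str], checks: Sequence[Check]) -> List[str]:
    """Return the hosts that pass all checks for the given instance."""
    return [host for host in hosts if all(check(host, instance) for check in checks)]


# Hypothetical stand-ins for the three conditions of the first preset rule.
def strong_affinity_ok(host: str, instance: str) -> bool:
    return True


def strong_anti_affinity_ok(host: str, instance: str) -> bool:
    return host != "A2"  # pretend A2 hosts a mutually exclusive instance


def resources_ok(host: str, instance: str) -> bool:
    return host != "A3"  # pretend A3 lacks free resources


pre_choose_hosts = preselect_hosts(
    "C35",
    ["A1", "A2", "A3"],
    [strong_affinity_ok, strong_anti_affinity_ok, resources_ok],
)
```

Because `all` stops at the first failing check, a host that fails one condition is dropped without evaluating the rest, which matches moving on to the next host node.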
As an optional implementation, selecting the optimal host node from the preselected host node list according to the second preset rule may proceed as follows: for each host node corresponding to each instance, the score of that host node is determined according to the preset correspondence between host nodes, instances, and scores; the scores of the host nodes corresponding to each instance are then compared, and the host node with the highest score is taken as the optimal host node.
It should be noted that, given a host node A1, a component B1, and a component B2: if instance C11 of component B1 has been deployed on host node A1, it is recommended that instance C21 of component B2 also be deployed on host node A1 to improve system performance; however, if instance C21 of component B2 does not exist, deploying only instance C11 of component B1 on host node A1 still achieves the system function. In this case the relationship between instance C11 and instance C21 satisfies the weak affinity rule. In other words, system performance is better when instance C11 of component B1 and instance C21 of component B2 are deployed together on host node A1, and the system function is still achieved when only one of them is deployed on host node A1. Conversely, if instance C11 of component B1 has been deployed on host node A1, it may be not recommended to deploy instance C21 of component B2 on host node A1 as well; however, if unavoidable factors require instance C11 of component B1 and instance C21 of component B2 to be deployed together on host node A1, the system function is still achieved, only with reduced performance. In this case instance C11 and instance C21 satisfy the weak anti-affinity rule. In other words, system performance is better when only one of instance C11 of component B1 and instance C21 of component B2 is deployed on host node A1, and the system function is still achieved when both are deployed together on host node A1. The weak affinity rules and weak anti-affinity rules above are predefined and can be expressed in the form of an affinity rule list.
In specific implementation, the weak affinity rules and weak anti-affinity rules in the affinity rule list each predefine a score (weight) for each host node, where the score defined in a weak affinity rule is a positive integer and the score defined in a weak anti-affinity rule is a negative integer. Referring to 23 in Figure 2, for each node among the preselected host nodes corresponding to the current instance, the rule to be verified for the current host node is determined according to the affinity rule list. If that rule is a weak affinity rule, the score of the current host node for deploying the current instance is obtained from the score defined in the weak affinity rule, and the score is a positive integer; if that rule is a weak anti-affinity rule, the score of the current host node for deploying the current instance is obtained from the score defined in the weak anti-affinity rule, and the score is a negative integer.
Specifically, suppose the current host node A1 can deploy n instances C1, C2, ..., Cn, and that the score of host node A1 is Y1 for instance C1, Y2 for instance C2, ..., and Yn for instance Cn. Summing the scores Y1, Y2, ..., Yn gives the total score Y of the current host node; the total scores are then compared, and the host node with the highest score is selected as the optimal host node.
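The summation Y = Y1 + Y2 + ... + Yn and the selection of the highest total can be sketched as follows. The data layout (a dictionary keyed by host and instance) and the function names are assumptions; the sign convention (positive for weak affinity, negative for weak anti-affinity) follows the text.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical layout of the preset host/instance/score correspondence.
Weights = Dict[Tuple[str, str], int]


def host_score(host: str, instances: Iterable[str], weights: Weights) -> int:
    """Total score Y of a host: the sum of its per-instance scores."""
    return sum(weights.get((host, instance), 0) for instance in instances)


def top_hosts(scores: Dict[str, int]) -> List[str]:
    """All hosts sharing the highest total score (more than one means a tie)."""
    best = max(scores.values())
    return [host for host, score in scores.items() if score == best]


weights: Weights = {
    ("A1", "C1"): 10,  # weak affinity: positive integer
    ("A1", "C2"): -3,  # weak anti-affinity: negative integer
    ("A2", "C1"): 5,
}
instances = ["C1", "C2"]
scores = {host: host_score(host, instances, weights) for host in ("A1", "A2")}
winners = top_hosts(scores)
```

Returning every host with the top score, rather than a single one, leaves room for a later tie-break on idle resources.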
In this embodiment of the present invention, the score of the host node corresponding to an instance is determined according to the preset correspondence between host nodes, instances, and scores, and the scores of the host nodes are compared so that the host node with the highest score is selected as the optimal host node. The scoring rules are flexible and can be adapted to many different types of host nodes, which broadens the applicability of the method.
Specifically, if at least two host nodes share the highest score, the idle resources of those host nodes can be compared, and the host node with the most idle resources is taken as the optimal host node.
In specific implementation, referring to 23 in Figure 2, after the optimal host node is selected from the preselected host node list, the optimal host node is bound to the current instance to obtain the binding relationship between the host node and the instance. The specific flow includes the following steps:
Step 212: Traverse the preselected host node list PreChooseHosts, score each host node, and select the optimal host node;
Step 213: Determine whether the weak affinity rule score of the current host node is greater than the highest host node score. If it is greater, execute step 215; if it is equal, execute step 214; if it is less, return to step 212;
Step 214: Compare the idle resources of the host nodes and select the host node with the most idle resources as the optimal host node;
Specifically, a host node with more idle resources is preferred as the optimal host node. The idle resources of a host node can be analyzed in three dimensions (CPU, memory, and disk), and different weights can be preset for these three dimensions for different cluster types. For example, the weights of CPU and memory can be increased for a compute-oriented cluster, and the weight of disk can be increased for a storage-oriented cluster.
Figure 5 shows the workflow of optimal selection in resource planning, which includes the following steps:
Step 501: Input host node host1 and host node host2;
Here, host node host1 and host node host2 have the highest score, and their scores are equal.
Step 502: Collect the idle resources of host node host1 and the idle resources of host node host2;
Step 503: Compute the weighted idle-resource score S1 of host node host1 and the weighted idle-resource score S2 of host node host2;
Specifically, the idle resources of host node host1 and of host node host2 are collected separately, and a weighted calculation is performed on the absolute values of the idle resources using the preset weights, giving the idle-resource score S1 of host node host1 and the idle-resource score S2 of host node host2.
Step 504: Determine whether the condition S1>S2 is satisfied. If so, execute step 505; otherwise, execute step 506;
Step 505: Take host node host1 as the optimal host node;
Step 506: Take host node host2 as the optimal host node.
In this embodiment of the present invention, S1 and S2 are compared, and the host node with the higher score is selected as the optimal host node.
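Steps 501 to 506 can be sketched as a weighted sum followed by one comparison. The weight values, units, and function names below are assumptions for illustration; the description only states that preset weights are applied to the absolute idle amounts of CPU, memory, and disk.

```python
from typing import Dict

DIMENSIONS = ("cpu", "memory", "disk")


def idle_score(idle: Dict[str, float], weights: Dict[str, float]) -> float:
    """Step 503: weighted sum over the absolute idle amounts."""
    return sum(weights[dim] * idle[dim] for dim in DIMENSIONS)


def pick_optimal(name1: str, idle1: Dict[str, float],
                 name2: str, idle2: Dict[str, float],
                 weights: Dict[str, float]) -> str:
    s1 = idle_score(idle1, weights)
    s2 = idle_score(idle2, weights)
    # Step 504: host1 wins only if S1 > S2; otherwise host2 (steps 505/506).
    return name1 if s1 > s2 else name2


# Hypothetical weights for a compute-oriented cluster.
compute_weights = {"cpu": 0.5, "memory": 0.4, "disk": 0.1}
host1_idle = {"cpu": 8, "memory": 16, "disk": 100}
host2_idle = {"cpu": 4, "memory": 32, "disk": 200}

optimal = pick_optimal("host1", host1_idle, "host2", host2_idle, compute_weights)
```

Because the sum uses absolute amounts, a dimension with large raw numbers (such as disk in GiB) can dominate unless the weights compensate; normalizing each dimension first is one possible refinement that the text does not specify.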
Step 215: Bind the current instance to the optimal host node to obtain the binding relationship between the host node and the instance;
Step 216: Determine whether a host node has been assigned to every instance. If so, execute step 217; otherwise, execute step 218;
Step 217: Output the adjusted binding relationships between host nodes and component module instances;
Step 218: Output the reason for the allocation failure and the list of host nodes already allocated.
This embodiment of the present invention provides a cluster resource planning method in which, if at least two host nodes share the highest score, the idle resources of those host nodes are compared and the host node with the highest idle-resource score is taken as the optimal host node. Because the idle resources of host nodes are scored using configurable dimensions and weights, the scoring rules are flexible and applicable to many different types of clusters. In addition, the two-level selection rule, which prefers the host node with the highest score and then the highest idle-resource score, further improves the quality of the resource allocation result.
As an implementation, the method may also quantify expert experience into a set of JSON template files, which include the affinity rule list, the first preset priority, the second preset priority, the available resources of the host nodes, and the preset weights.
If no template file based on expert experience is available, the relationships between the modules in each component, the affinity rule list, the first preset priority, the second preset priority, the preset weights, and the available resources of the host nodes must be set in advance.
Consolidating expert experience into template files allows the cluster resource planning method to be automated efficiently. The template files define the component affinity rules, which precisely express the dependency and mutual-exclusion relationships between components. The template files also allow the resource requirements of components to be considered layer by layer, with the requirements of each layer handled separately. In addition, different template combinations can be introduced as needed and their priority relationships adjusted on demand, which helps keep resource allocation reasonable and of high quality.
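A template of this kind might look like the sketch below, loaded here with Python's standard `json` module. Every field name and value is hypothetical: the description lists what the templates contain (affinity rule list, the two preset priorities, host resources, preset weights) but does not give a schema.

```python
import json

# Hypothetical template; all keys and values are illustrative assumptions.
TEMPLATE = """
{
  "affinity_rules": [
    {"type": "strong_anti_affinity", "key": "module",
     "operator": "IN", "values": ["C16", "C33"]},
    {"type": "weak_affinity", "key": "module",
     "operator": "IN", "values": ["C21"], "weight": 10}
  ],
  "component_priority": {"B1": 1, "B2": 2},
  "module_priority": {"B1": {"M1": 1, "M2": 2}},
  "host_resources": {"A1": {"cpu": 16, "memory": 64, "disk": 500}},
  "resource_weights": {"cpu": 0.4, "memory": 0.4, "disk": 0.2}
}
"""

template = json.loads(TEMPLATE)

# Split the rule list into hard constraints and scoring hints.
strong_rules = [r for r in template["affinity_rules"] if r["type"].startswith("strong")]
weak_rules = [r for r in template["affinity_rules"] if r["type"].startswith("weak")]
```

Separating strong rules from weak rules at load time matches the two phases of the method: strong rules filter hosts during preselection, and weak rules contribute scores during optimal selection.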
Embodiment 2
Based on the same concept, an embodiment of the present invention further provides a cluster resource planning apparatus applied to a public cloud platform. Because this apparatus is the apparatus used in the method of the embodiments of the present invention, and the principle by which it solves the problem is similar to that of the method, the implementation of the apparatus may refer to the implementation of the method, and repeated details are omitted.
As shown in Figure 6, the apparatus includes the following modules:
a priority sorting module 601, configured to, after obtaining the host node list and the first component list to be deployed, sort the first component list according to the first preset priority and the second preset priority to obtain the list of instances to be scheduled;
a rule verification module 602, configured to, for each instance in each module in the list of instances to be scheduled, select host nodes from the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance;
a resource optimization module 603, configured to, after selecting the optimal host node from the preselected host node list according to the second preset rule, bind the optimal host node to the current instance to obtain a binding relationship;
a deployment module 604, configured to deploy the instance corresponding to the host node in the binding relationship on that host node.
As an optional implementation, the priority sorting module is specifically configured to:
sort the components in the first component list in descending order according to the first preset priority to obtain a second component list;
sort the modules corresponding to each component in the second component list in descending order according to the second preset priority to obtain the component module list corresponding to the current component;
for each module in the component module list, create instances for the module according to the attribute information giving the number of instances of that module, to obtain the list of instances to be scheduled.
The priority sorting module is specifically configured to:
traverse the first component list;
sort the components in the first component list by priority value from smallest to largest.
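The ordering performed by the priority sorting module can be sketched as two nested sorts followed by instance creation. The class names, field names, and instance naming scheme are assumptions; the sketch also assumes that a smaller priority value means a higher priority, which is one way to reconcile "descending order by priority" with "sort by priority value from smallest to largest".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Module:
    name: str
    priority: int        # second preset priority (assumed: smaller = higher)
    instance_count: int  # number of instances to create for this module


@dataclass
class Component:
    name: str
    priority: int        # first preset priority (assumed: smaller = higher)
    modules: List[Module] = field(default_factory=list)


def build_instance_list(components: List[Component]) -> List[str]:
    """Order components, then modules, then expand each module into instances."""
    pending: List[str] = []
    for comp in sorted(components, key=lambda c: c.priority):
        for mod in sorted(comp.modules, key=lambda m: m.priority):
            for index in range(1, mod.instance_count + 1):
                pending.append(f"{comp.name}/{mod.name}/{index}")
    return pending


components = [
    Component("B2", priority=2, modules=[Module("M1", 1, 1)]),
    Component("B1", priority=1, modules=[Module("M2", 2, 1), Module("M1", 1, 2)]),
]
to_schedule = build_instance_list(components)
```

Python's `sorted` is stable, so components or modules with equal priority keep their input order.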
As an optional implementation, the first preset rule includes:
the current host node satisfies the strong affinity rule;
and the current host node satisfies the strong anti-affinity rule;
and the available resources of the current host node meet the deployment requirements of the current instance.
As an optional implementation, the resource optimization module is specifically configured to:
for each host node corresponding to each instance, determine the score of the host node corresponding to the instance according to the preset correspondence between host nodes, instances, and scores;
compare the scores of the host nodes corresponding to each instance, and take the host node with the highest score as the optimal host node.
As an optional implementation, if at least two host nodes share the highest score, the resource optimization module is further configured to:
compare the idle resources of the at least two host nodes with the highest score, and take the host node with the most idle resources as the optimal host node.
As an optional implementation, before host nodes are selected from the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance, the rule verification module is further configured to:
traverse the list of instances to be scheduled;
for each instance in the list of instances to be scheduled, initialize the preselected host node list corresponding to the current instance.
Embodiment 3
Based on the same concept, an embodiment of the present invention further provides a cluster resource planning device applied to a public cloud platform. Because this device is the cluster resource planning device used in the method of the embodiments of the present invention, and the principle by which it solves the problem is similar to that of the method, the implementation of the device may refer to the implementation of the method, and repeated details are omitted.
The cluster resource planning device 70 according to this embodiment of the present invention is described below with reference to Figure 7. The cluster resource planning device 70 shown in Figure 7 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Figure 7, the cluster resource planning device 70 may take the form of a general-purpose computing device; for example, it may be a terminal device. The components of the cluster resource planning device 70 may include, but are not limited to: the at least one processor 71, the at least one memory 72 storing instructions executable by the processor 71, and a bus 73 connecting different system components (including the memory 72 and the processor 71). The processor 71 is the processor of the smart device.
处理器71通过运行可执行指令以实现如下步骤:The processor 71 executes executable instructions to implement the following steps:
获取到主机节点列表以及需要部署的第一组件列表后,根据第一预设优先级以及第二预设优先级对第一组件列表进行排序,得到待调度实例列表;After obtaining the host node list and the first component list that needs to be deployed, sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled;
针对待调度实例列表中每个模块中的每个实例,根据第一预设规则在主机节点列表中选择主机节点,得到与当前实例对应的预选主机节点列表;For each instance in each module in the instance list to be scheduled, select a host node in the host node list according to the first preset rule to obtain a pre-selected host node list corresponding to the current instance;
根据第二预设规则,从预选主机节点列表中选择最优主机节点后,将最优主机节点与当前实例进行绑定,得到绑定关系;According to the second preset rule, after selecting the optimal host node from the preselected host node list, bind the optimal host node to the current instance to obtain the binding relationship;
将绑定关系中主机节点对应的实例部署到主机节点上。Deploy the instance corresponding to the host node in the binding relationship to the host node.
作为一种可选的实施方式,处理器71具体用于:As an optional implementation, the processor 71 is specifically used to:
根据第一预设优先级对第一组件列表中每个组件进行降序排列,得到第二组件列表;Arrange each component in the first component list in descending order according to the first preset priority to obtain a second component list;
根据第二预设优先级对第二组件列表中每个组件对应的模块进行降序排列,得到与当前组件对应的组件模块列表;Arrange the modules corresponding to each component in the second component list in descending order according to the second preset priority to obtain a list of component modules corresponding to the current component;
针对组件模块列表中的每个模块,根据模块对应的实例个数的属性信息,为模块创建实例,以得到待调度实例列表。For each module in the component module list, create an instance for the module according to the attribute information of the number of instances corresponding to the module to obtain a list of instances to be scheduled.
作为一种可选的实施方式,处理器71具体用于:As an optional implementation, the processor 71 is specifically used to:
遍历第一组件列表;Traverse the first component list;
将第一组件列表中每个组件按照优先级数值从小到大排序。Sort each component in the first component list according to priority value from small to large.
作为一种可选的实施方式,第一预设规则包括:As an optional implementation, the first preset rule includes:
当前主机节点满足强亲和性规则;The current host node satisfies the strong affinity rule;
且,当前主机节点满足强反亲和性规则;Moreover, the current host node satisfies the strong anti-affinity rule;
且,当前主机节点的可用资源数量满足当前实例的部署要求。Moreover, the number of available resources on the current host node meets the deployment requirements of the current instance.
As an optional implementation, the processor 71 is specifically configured to:
for each host node corresponding to each instance, determine the score of that host node for the instance according to a preset correspondence among host nodes, instances, and scores;
compare the scores of the host nodes corresponding to each instance, and take the host node with the highest score as the optimal host node.
As an optional implementation, if at least two host nodes share the highest score, the processor 71 is further configured to:
compare the idle resources of the at least two highest-scoring host nodes, and take the host node with the most idle resources as the optimal host node.
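The score comparison and the idle-resource tie-break can be sketched as below. The `scores` dictionary keyed by `(host, instance)` stands in for the preset correspondence among host nodes, instances, and scores, and `free_resources` reduces idle resources to a single number per host; both representations are assumptions made for illustration.

```python
def pick_best_host(instance, candidates, scores, free_resources):
    """Pick the highest-scoring candidate; break ties by most idle resources."""
    if not candidates:
        return None  # assumption: no preselected host means nothing to pick
    # Look up each candidate's score for this instance and find the maximum.
    best_score = max(scores[(h, instance)] for h in candidates)
    top = [h for h in candidates if scores[(h, instance)] == best_score]
    # With a single top scorer this returns it directly; otherwise the
    # host with the most idle resources among the tied hosts wins.
    return max(top, key=lambda h: free_resources[h])
```

If the tied hosts also have equal idle resources, `max` returns the first of them in candidate order; the source text does not specify that case.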
As an optional implementation, before selecting host nodes from the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance, the processor 71 is further configured to:
traverse the list of instances to be scheduled;
for each instance in the list of instances to be scheduled, initialize the preselected host node list corresponding to the current instance.
Bus 73 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and a processor or local bus using any of a variety of bus structures.
Memory 72 may include readable media in the form of volatile memory, such as random access memory (RAM) 721 and/or cache memory 722, and may further include read-only memory (ROM) 723.
Memory 72 may also include a program/utility 725 having a set of (at least one) program modules 724. Such program modules 724 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The cluster resource planning device 70 may also communicate with one or more external devices 74 (for example, a keyboard or a pointing device), with one or more devices that enable a user to interact with the cluster resource planning device 70, and/or with any device (for example, a router or a modem) that enables the cluster resource planning device 70 to communicate with one or more other computing devices. Such communication may take place through the input/output (I/O) interface 75. In addition, the cluster resource planning device 70 may communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 76. As shown in the figure, the network adapter 76 communicates with the other modules of the cluster resource planning device 70 via the bus 73. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the cluster resource planning device 70, including but not limited to microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
Example 4
In some possible implementations, aspects of the present invention may also be implemented in the form of a program product that includes program code. When the program product runs on a terminal device, the program code causes the terminal device to perform the steps of the modules of the cluster resource planning apparatus according to the various exemplary embodiments of the present disclosure described in the "Exemplary Method" section of this specification. For example, the network-side device may be configured to: after obtaining the host node list and the first component list to be deployed, sort the first component list according to the first preset priority and the second preset priority to obtain a list of instances to be scheduled; for each instance of each module in the list of instances to be scheduled, select host nodes from the host node list according to the first preset rule to obtain a preselected host node list corresponding to the current instance; after selecting the optimal host node from the preselected host node list according to the second preset rule, bind the optimal host node to the current instance to obtain a binding relationship;
and deploy the instance corresponding to each host node in the binding relationship onto that host node.
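The overall flow summarized above (preselect, pick the optimal host, bind) can be sketched end to end as follows. This is a simplified sketch under stated assumptions: `passes_first_rule` and `score` are hypothetical callables supplied by the caller, `free` maps each host to a single idle-resource number, instances with no eligible host are simply skipped, and the bookkeeping of resources consumed by each binding is omitted.

```python
def plan(hosts, pending_instances, passes_first_rule, score, free):
    """Return a binding relationship {instance: host} for a cluster."""
    bindings = {}
    for inst in pending_instances:
        # Initialise and fill the preselected host node list for this instance.
        preselected = [h for h in hosts if passes_first_rule(h, inst)]
        if not preselected:
            continue  # assumption: unschedulable instances are skipped
        # Second rule: highest score wins, idle resources break ties.
        best = max(preselected, key=lambda h: (score(h, inst), free[h]))
        bindings[inst] = best
    return bindings
```

Deployment would then iterate over the returned bindings and place each instance on its bound host; how that placement is carried out is outside this sketch.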
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
As shown in Figure 8, a program product 80 for cluster resource planning according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. However, the program product of the present invention is not limited to this. In this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in combination with, an instruction execution system, apparatus, or device.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device.
Program code contained on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
Program code for performing the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
It should be noted that although several modules or sub-modules of the system are mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more modules described above may be embodied in a single module. Conversely, the features and functions of one module described above may be further divided and embodied by multiple modules.
In addition, although the operations of the modules of the system of the present invention are described in a specific order in the drawings, this does not require or imply that the operations must be performed in that order, or that all of the operations shown must be performed to achieve the desired result. Additionally or alternatively, certain operations may be omitted, multiple operations may be combined into one, and/or one operation may be split into several.
The present application is described above with reference to block diagrams and/or flowcharts illustrating methods, apparatuses (systems), and/or computer program products according to embodiments of the application. It should be understood that a block of the block diagrams and/or flowchart illustrations, and combinations of such blocks, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, and/or another programmable data processing apparatus to produce a machine, such that the instructions executed via the computer processor and/or other programmable data processing apparatus create means for implementing the functions/actions specified in the block diagram and/or flowchart blocks.
Accordingly, the present application may also be implemented in hardware and/or software (including firmware, resident software, microcode, and so on). Furthermore, the present application may take the form of a computer program product on a computer-usable or computer-readable storage medium, with computer-usable or computer-readable program code embodied in the medium for use by, or in connection with, an instruction execution system. In the context of this application, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, transmit, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to cover them.

Claims (10)

  1. A cluster resource planning method, characterized in that it is applied to a public cloud platform and comprises:
    after obtaining a host node list and a first component list to be deployed, sorting the first component list according to a first preset priority and a second preset priority to obtain a list of instances to be scheduled;
    for each instance of each module in the list of instances to be scheduled, selecting host nodes from the host node list according to a first preset rule to obtain a preselected host node list corresponding to the current instance;
    after selecting an optimal host node from the preselected host node list according to a second preset rule, binding the optimal host node to the current instance to obtain a binding relationship;
    deploying the instance corresponding to the host node in the binding relationship onto the host node.
  2. The method of claim 1, characterized in that sorting the first component list according to the first preset priority and the second preset priority to obtain the list of instances to be scheduled comprises:
    sorting the components in the first component list in descending order of the first preset priority to obtain a second component list;
    sorting the modules corresponding to each component in the second component list in descending order of the second preset priority to obtain a component module list corresponding to the current component;
    for each module in the component module list, creating instances for the module according to the attribute information that specifies the module's number of instances, to obtain the list of instances to be scheduled.
  3. The method of claim 2, characterized in that sorting the components in the first component list in descending order of the first preset priority comprises:
    traversing the first component list;
    sorting the components in the first component list by priority value in ascending order.
  4. The method of claim 1, characterized in that the first preset rule comprises:
    the current host node satisfies the strong affinity rule;
    and the current host node satisfies the strong anti-affinity rule;
    and the available resources of the current host node meet the deployment requirements of the current instance.
  5. The method of claim 1, characterized in that selecting the optimal host node from the host node list according to the preselected host node list and the second preset rule comprises:
    for each host node corresponding to each instance, determining the score of the host node for the instance according to a preset correspondence among host nodes, instances, and scores;
    comparing the scores of the host nodes corresponding to each instance, and taking the host node with the highest score as the optimal host node.
  6. The method of claim 5, characterized in that, if at least two host nodes share the highest score, the method further comprises:
    comparing the idle resources of the at least two highest-scoring host nodes, and taking the host node with the most idle resources as the optimal host node.
  7. The method of any one of claims 1 to 6, characterized in that, before selecting host nodes from the host node list according to the first preset rule to obtain the preselected host node list corresponding to the current instance, the method further comprises:
    traversing the list of instances to be scheduled;
    for each instance in the list of instances to be scheduled, initializing the preselected host node list corresponding to the current instance.
  8. A cluster resource planning device, characterized in that it is applied to a public cloud platform and comprises a memory and a processor, wherein the memory stores a computer program and the processor, when executing the computer program, implements the steps of the cluster resource planning method of any one of claims 1 to 7.
  9. A cluster resource planning apparatus, characterized in that it is applied to a public cloud platform and comprises:
    a priority sorting module, configured to, after obtaining a host node list and a first component list to be deployed, sort the first component list according to a first preset priority and a second preset priority to obtain a list of instances to be scheduled;
    a rule verification module, configured to, for each instance of each module in the list of instances to be scheduled, select host nodes from the host node list according to a first preset rule to obtain a preselected host node list corresponding to the current instance;
    a resource optimization module, configured to, after selecting an optimal host node from the preselected host node list according to a second preset rule, bind the optimal host node to the current instance to obtain a binding relationship;
    a deployment module, configured to deploy the instance corresponding to the host node in the binding relationship onto the host node.
  10. A computer storage medium, characterized in that the computer-readable storage medium stores computer instructions which, when run on a computer, cause the computer to perform the steps of the cluster resource planning method of any one of claims 1 to 7.
PCT/CN2022/141378 2022-07-26 2022-12-23 Cluster resource planning method, device, apparatus, and medium WO2024021467A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210887793.8 2022-07-26
CN202210887793.8A CN115309501A (en) 2022-07-26 2022-07-26 Cluster resource planning method, device, apparatus and medium

Publications (1)

Publication Number Publication Date
WO2024021467A1 true WO2024021467A1 (en) 2024-02-01

Family

ID=83858998

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/141378 WO2024021467A1 (en) 2022-07-26 2022-12-23 Cluster resource planning method, device, apparatus, and medium

Country Status (2)

Country Link
CN (1) CN115309501A (en)
WO (1) WO2024021467A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115309501A (en) * 2022-07-26 2022-11-08 天翼云科技有限公司 Cluster resource planning method, device, apparatus and medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102360308A (en) * 2011-09-29 2012-02-22 用友软件股份有限公司 Distributed deployment system and method of components
US20180131583A1 (en) * 2016-11-07 2018-05-10 General Electric Company Automatic provisioning of cloud services
CN109960585A (en) * 2019-02-02 2019-07-02 浙江工业大学 A kind of resource regulating method based on kubernetes
CN110297658A (en) * 2018-03-21 2019-10-01 腾讯科技(深圳)有限公司 Functional unit sharing method, device and computer equipment
CN113342478A (en) * 2021-08-04 2021-09-03 阿里云计算有限公司 Resource management method, device, network system and storage medium
CN114138486A (en) * 2021-12-02 2022-03-04 中国人民解放军国防科技大学 Containerized micro-service arranging method, system and medium for cloud edge heterogeneous environment
CN115309501A (en) * 2022-07-26 2022-11-08 天翼云科技有限公司 Cluster resource planning method, device, apparatus and medium
CN115309544A (en) * 2022-07-26 2022-11-08 天翼云科技有限公司 Cluster resource planning method, device and apparatus

Also Published As

Publication number Publication date
CN115309501A (en) 2022-11-08

Similar Documents

Publication Publication Date Title
US11204793B2 (en) Determining an optimal computing environment for running an image
US20230281041A1 (en) File operation task optimization
US20210149743A1 (en) Resource processing method of cloud platform, related device, and storage medium
CN107734052B (en) Load balancing container scheduling method facing component dependence
CN109710405B (en) Block chain intelligent contract management method and device, electronic equipment and storage medium
Baker et al. Cloud-SEnergy: A bin-packing based multi-cloud service broker for energy efficient composition and execution of data-intensive applications
US10896058B2 (en) Managing virtual clustering environments according to requirements
WO2021093783A1 (en) Real-time resource scheduling method and apparatus, computer device, and storage medium
US9645852B2 (en) Managing a workload in an environment
US9407523B2 (en) Increasing performance of a streaming application by running experimental permutations
CN115309544A (en) Cluster resource planning method, device and apparatus
CN115134371A (en) Scheduling method, system, equipment and medium containing edge network computing resources
WO2024021467A1 (en) Cluster resource planning method, device, apparatus, and medium
CN111435354A (en) Data export method and device, storage medium and electronic equipment
US20130227113A1 (en) Managing virtualized networks based on node relationships
CN112433844B (en) Resource allocation method, system, equipment and computer readable storage medium
US20140351823A1 (en) Strategic Placement of Jobs for Spatial Elasticity in a High-Performance Computing Environment
CN110069319A (en) A kind of multiple target dispatching method of virtual machine and system towards cloudlet resource management
CN114416357A (en) Method and device for creating container group, electronic equipment and medium
JP5641064B2 (en) Execution control program, execution control apparatus, and execution control method
US20220122038A1 (en) Process Version Control for Business Process Management
CN114090234A (en) Request scheduling method and device, electronic equipment and storage medium
US20210141670A1 (en) Function performance trigger
CN113095645B (en) Heterogeneous unmanned aerial vehicle task allocation method aiming at emergency scene with uneven task distribution
CN113574506A (en) Request allocation based on compute node identifiers

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22952898

Country of ref document: EP

Kind code of ref document: A1