WO2024016596A1 - Container cluster scheduling method and apparatus, device, and storage medium - Google Patents

Container cluster scheduling method and apparatus, device, and storage medium

Info

Publication number
WO2024016596A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
pod
group
resource
groups
Prior art date
Application number
PCT/CN2022/141606
Other languages
English (en)
Chinese (zh)
Inventor
闫海娜
景宇
刘磊
杨帆
甄富
鞠娜
Original Assignee
天翼云科技有限公司
Priority date
Filing date
Publication date
Application filed by 天翼云科技有限公司
Publication of WO2024016596A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45562Creating, deleting, cloning virtual machine instances
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/4557Distribution of virtual machine instances; Migration and load balancing

Definitions

  • This application relates to the field of computer technology, and in particular to a container cluster scheduling method, apparatus, device, and storage medium.
  • Background
  • Container cluster scheduling algorithms usually use container resource utilization, node load balancing, and disaster recovery strategies as the main indicators for evaluating cluster performance.
  • The soundness of the container scheduling algorithm therefore strongly affects cluster performance.
  • Kubernetes container orchestration and scheduling technology, which is widely used in cloud computing, takes these issues into account.
  • It uses filtering and scoring algorithms to select the most suitable node to bind each POD to. Kubernetes therefore largely solves the problem of container resource utilization and has clear advantages in achieving load balancing and disaster recovery.
  • However, Kubernetes must allocate a large number of PODs to nodes one by one, so container scheduling efficiency is low, and its sequential, repetitive scoring and filtering of nodes reduces efficiency further. Improving container scheduling efficiency has thus become an urgent problem.
  • In addition, Kubernetes' native scheduler is responsible for node filtering and scoring as well as node load balancing and disaster recovery, resulting in excessive load pressure on the scheduler.
  • Embodiments of the present application provide a container cluster scheduling method, apparatus, device, and storage medium to address the technical problems of low container cluster scheduling efficiency and excessive load pressure on the native scheduler.
  • In one aspect, a container cluster scheduling method includes:
  • creating multiple PODs based on a received container scheduling request, where the scheduling request indicates the resource requirements of the multiple PODs;
  • grouping the multiple PODs based on the resource requirements of each POD and the upper limit of resources that each node group can provide, to obtain multiple POD groups, each POD group containing at least one POD;
  • allocating a corresponding node group to each POD group from the node groups included in the container cluster; and
  • for each POD in each POD group, determining the corresponding target node from the node group allocated to that POD group, and binding each POD to its target node.
  • In another aspect, a container cluster scheduling device includes:
  • a creation unit, configured to create multiple PODs based on the received container scheduling request, where the scheduling request indicates the resource requirements of the multiple PODs;
  • a grouping unit, configured to group the multiple PODs based on the resource requirements of each POD and the upper limit of resources that each node group can provide, to obtain multiple POD groups, each POD group containing at least one POD;
  • an allocation unit, configured to allocate a corresponding node group to each POD group from the node groups included in the container cluster; and
  • a binding unit, configured to determine, for each POD in each POD group, the corresponding target node from the node group allocated to that POD group, and to bind each POD to its target node.
  • In some embodiments, the grouping unit is specifically configured such that:
  • the node attributes of the multiple nodes in each node group have one or more of the following relationships:
  • the resource attributes are the same;
  • the difference between resource utilization rates is not greater than a first preset difference threshold.
  • In some embodiments, the grouping unit is specifically configured to:
  • split each common resource node queue into N sub-queues, where:
  • the order of the common resource nodes in each sub-queue is consistent with that of the corresponding common resource node queue;
  • for any two sub-queues, the minimum resource utilization in one is greater than the maximum resource utilization in the other; and
  • the nodes in the sub-queues at corresponding positions are combined to obtain N common resource node groups.
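The sub-queue construction above can be sketched as follows. This is a minimal illustration, not the patented algorithm itself: it assumes one utilization-sorted queue of (node, utilization) pairs per topology, splits each queue into N contiguous sub-queues (so any two sub-queues of a queue occupy disjoint utilization bands), and merges the sub-queues at corresponding positions into N common resource node groups. All names are hypothetical.

```python
def group_common_nodes(queues, n):
    """Split each utilization-sorted node queue into n contiguous
    sub-queues, then merge the sub-queues at matching positions so
    that group i holds the i-th utilization band of every topology.

    queues: list of lists of (node_name, utilization), each sorted
    ascending by utilization (one queue per topology).
    Returns n common resource node groups.
    """
    groups = [[] for _ in range(n)]
    for queue in queues:
        size, rest = divmod(len(queue), n)
        start = 0
        for i in range(n):
            # Earlier sub-queues absorb the remainder one node each,
            # keeping sub-queue sizes as even as possible.
            end = start + size + (1 if i < rest else 0)
            groups[i].extend(queue[start:end])
            start = end
    return groups

queues = [
    [("a1", 0.10), ("a2", 0.40), ("a3", 0.80)],  # topology A
    [("b1", 0.15), ("b2", 0.45), ("b3", 0.85)],  # topology B
]
groups = group_common_nodes(queues, 3)
# group 0 mixes the lowest-utilization band of both topologies,
# so each group spans at least two topologies, as the text requires.
```

Because each queue is sorted and the split is contiguous, nodes in sub-queue i of a queue never have higher utilization than nodes in sub-queue i+1, matching the disjoint-band condition above.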
  • In some embodiments, the grouping unit is specifically configured to:
  • obtain, for each node group in the plurality of node groups, a resource utilization set of the node group, where the set includes the resource utilization rate of each node in the corresponding node group; and
  • group-adjust the multiple node groups to obtain the adjusted multiple node groups.
  • In some embodiments, the device also includes a migration unit, specifically configured such that:
  • the activity represents the probability that a POD's resource demand will expand, or the amount of resources when the demand expands; and
  • when the activity of a target POD is lower than a preset activity threshold, the target POD is migrated to a node whose available resources are lower than a preset resource threshold.
  • In some embodiments, the grouping unit is specifically configured such that:
  • each POD group has the same resource requirements within the group;
  • the total resource demand of each POD group is less than or equal to the upper limit of resources, and the difference between the total demand and the upper limit does not exceed a third preset difference threshold; and
  • the correlation between the PODs in each POD group is not lower than a preset correlation threshold.
  • In some embodiments, the allocation unit is specifically configured to:
  • assign a special resource node group to the other, unallocated POD groups when those POD groups cannot be allocated a common resource node group.
  • In another aspect, a computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • When the processor executes the computer program, the steps of any of the above methods are implemented.
  • In another aspect, a computer storage medium stores computer program instructions.
  • When the instructions are executed by a processor, the steps of any of the above methods are implemented.
  • In another aspect, a computer program product or computer program includes computer instructions stored in a computer-readable storage medium.
  • The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the steps of any of the above methods.
  • In the embodiments of the present application, multiple PODs are created according to the resource requirements indicated by the received container scheduling request, and the created PODs are grouped according to those requirements and the upper limit of resources that each node group can provide. A corresponding node group is then assigned to each POD group from the node groups in the container cluster, and for each POD in each POD group the corresponding target node is determined from the assigned node group, and each POD is bound to its target node.
  • The container cluster scheduling method of the embodiments thus groups PODs and nodes separately, then combines large-grained batch scheduling from POD groups to node groups with POD-level scheduling within each node group, forming a hierarchical scheduling mechanism.
  • This hierarchical scheduling improves the scheduling efficiency of the container cluster while reducing the load pressure on the cluster's native scheduler, and yields higher node resource utilization.
  • Figure 1 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • Figure 2 is a system architecture diagram of a container cluster scheduling device provided by an embodiment of the present application.
  • Figure 3 is a schematic flowchart of a container cluster scheduling method provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of a two-level scheduling module for POD scheduling provided by an embodiment of the present application.
  • Figure 5 is a schematic flow chart of node grouping provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of grouping nodes in a cluster topology provided by an embodiment of the present application.
  • Figure 7 is a grouping flow chart of a node grouping module provided by an embodiment of the present application.
  • Figure 8 is a flow chart of a container cluster scheduling process provided by an embodiment of the present application.
  • Figure 9 is a schematic structural diagram of a container cluster scheduling device provided by an embodiment of the present application.
  • Figure 10 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • POD: the most basic resource type and the smallest scheduling unit in container cluster technology.
  • One POD holds one or more containers that implement the same business function and share the same network and storage. All containers in a POD are scheduled together with the POD, bound to the same node, and cannot be separated.
  • Containers in a POD have two characteristics: shared network and shared storage. Shared network means all containers in the POD share the same network namespace, including IP address and network ports. Shared storage means all containers in the POD can access shared storage volumes, allowing the containers to share data.
  • Container: packages application code, tool libraries, the runtime environment, and the settings and dependencies required to run, achieving portability. Different containers can run independently in different environments without affecting each other.
  • Embodiments of this application combine each node's resource attributes, its resource utilization, and the topology of the container cluster to which it belongs.
  • Nodes with special resources are divided into special resource node groups.
  • Clustering and grouping algorithms are applied to the remaining common resource nodes, placing nodes with similar resource utilization, drawn from different topologies, into the same group. After the node groups have been running, the groups are adjusted according to the resource utilization of each node group.
  • This approach allows nodes with special resources to be fully utilized, further improving node resource utilization. Grouping common resource nodes with similar resource utilization, and dynamically adjusting the groups according to utilization, keeps the resource usage of the nodes in each group similar, so no extra computing resources need to be spent on load balancing when scheduling PODs within each group. This ensures node load balancing and further improves container scheduling efficiency. Including nodes from different topologies in each common resource node group ensures high availability of the nodes in the group.
  • The embodiments of this application divide PODs with the same resource requirements into one group, and make the total resource demand of each POD group less than, and closest to, the upper limit of the node group's resources.
  • The PODs within each POD group are highly correlated with one another.
  • Special resource node groups are assigned to the POD groups with special resource requirements, and common resource node groups are assigned to the other POD groups.
  • When some of the other POD groups cannot be assigned a common resource node group, special resource node groups are assigned to those unassigned POD groups. And when the load of the container cluster is too low, PODs with low activity are migrated to nodes with few available resources.
  • Dividing PODs with the same resource requirements into one group realizes adaptive allocation between special and common resource node groups, and saves the computing resources needed to achieve load balancing when scheduling PODs within each node group.
  • Keeping the total resource demand of each POD group less than, and closest to, the node group's upper limit matches the POD group's demand to the node group's resources and avoids situations where a node group has insufficient resources and scheduling fails.
  • The high correlation between the PODs in each group means PODs in the same group come from the same application or depend on each other, so affinity and anti-affinity strategies can be applied more conveniently when scheduling PODs within a node group, improving container scheduling efficiency.
  • Allocating special resource node groups to otherwise unallocated POD groups, and migrating PODs based on activity, effectively utilize the nodes' fragmented resources and improve node resource utilization without affecting POD operation.
  • FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • The scenario may include a user terminal 101, a container cluster scheduling device 102, and a cluster node device 103.
  • The user terminal 101 can be a mobile phone, personal computer (PC), tablet computer (PAD), notebook computer, desktop computer, mobile Internet device (MID), or any other device that can connect to a server and provide local services to users; this embodiment does not limit it specifically.
  • The container cluster scheduling device 102 may be a management device in the container cluster, used to implement functions such as deployment, management, and monitoring of containers.
  • The cluster node device 103 provides the necessary operating environment for scheduled PODs through the container cluster's agent program. It can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery network (CDN), big data, and artificial intelligence platforms, but is not limited to these.
  • FIG. 2 is a system architecture diagram of the container cluster scheduling device 102 provided by an embodiment of the present application.
  • This architecture may include a control module 1021, a cache module 1022, a node grouping module 1023, a POD grouping module 1024, a first-level scheduling module 1025, and a second-level scheduling module 1026.
  • The components and structure of the container cluster scheduling device 102 shown in FIG. 2 are only exemplary, not restrictive; in actual scenarios, other components and structures may be included as needed.
  • The control module 1021 is used to receive container scheduling requests triggered by the user terminal and to obtain, through the list-watch mechanism, the container data information indicated by a request in order to create the PODs to be scheduled.
  • The node information obtained from the cluster server and the created information on the PODs to be scheduled can also be transmitted to the node grouping module 1023 and the POD grouping module 1024, respectively, for the subsequent scheduling processes.
  • The cache module 1022 is used to cache the data of the entire container cluster scheduling device 102.
  • Intermediate scheduling data generated by the other modules in the first-level and second-level scheduling processes, such as data from the node grouping module 1023, is temporarily stored in the cache module.
  • The node grouping module 1023 is used to obtain node data information from the control module 1021, dynamically group the nodes according to node attributes, and store the grouping results in the cache module 1022.
  • The POD grouping module 1024 is used to obtain the data information of the PODs to be scheduled from the control module 1021, group the PODs according to preset grouping rules, and store the generated POD group list in the cache module 1022.
  • The first-level scheduling module 1025 is used to obtain the node group information and POD group list from the cache module 1022, allocate the corresponding node group to each POD group, and bind the POD groups to node groups through batch scheduling.
  • After the first-level scheduling module 1025 completes large-grained batch scheduling, the second-level scheduling module 1026 determines, for each POD contained in each POD group, the corresponding target node from the node group to which the POD group is bound, and binds each POD to its target node.
  • In the container cluster scheduling device, one first-level scheduling module is responsible for batch scheduling of POD groups and binding them to node groups, while the number of second-level scheduling modules equals the number of node groups, in one-to-one correspondence; each second-level scheduling module is responsible for POD scheduling within its corresponding node group.
  • FIG. 3 is a schematic flow chart of the container cluster scheduling method provided by the embodiment of the present application.
  • In the following, the container cluster scheduling platform is used as the execution subject as an example.
  • The specific implementation process of the method is as follows:
  • Step 301: Create multiple PODs based on the received container scheduling request, where the scheduling request indicates the resource requirements of the multiple PODs.
  • When a user deploys an application, scheduling requests for the multiple containers required to implement the business functions are triggered.
  • The control module processes the received container scheduling requests, analyzes the resources required by the multiple containers indicated by the requests, and creates multiple PODs based on the containers' resource requirements, where one POD contains one or more containers that implement the same business function.
  • The control module can follow the design of the native Kubernetes Informer notifier. The Informer is a core Kubernetes toolkit; Kubernetes components use the list-watch and get functions it provides to obtain the latest data of resource objects.
  • Based on this design, the control module obtains the data related to the scheduling request from the cluster server, such as the application code, tool libraries, operating environment, and the settings and dependencies required by the scheduled containers, and performs the corresponding POD creation process.
  • The created PODs are stored in the cache module and can also be used directly as input to the POD grouping module.
  • Step 302: Based on the resource requirements of each POD and the upper limit of resources that each node group can provide, group the multiple PODs to obtain multiple POD groups, each POD group containing at least one POD.
  • The POD grouping module groups the PODs according to the resource requirements of each POD and the upper limit of resources that each node group can provide, so that the resulting POD groups meet the following conditions:
  • Each POD group has the same resource requirements within the group.
  • The resource requirements of a POD may include, but are not limited to, the storage, CPU, and disk resources required to run the POD's containers.
  • The POD grouping module divides the created PODs into groups by identical resource requirements, so that the PODs in each group have the same resource requirements. In this way, PODs with special resource requirements, such as special GPUs or high-performance disks, can be placed in one group, facilitating subsequent batch scheduling to node groups with special resources and making full use of those nodes.
  • The total resource demand of each POD group is less than or equal to the node group's upper limit of resources, and the difference between the two does not exceed the third preset difference threshold.
  • The POD grouping module can obtain node group information from the control module and, while grouping, compute the total resource demand of the POD group being formed in real time, comparing it against the node group's resource upper limit to ensure that each POD group's total demand is less than or equal to the upper limit and that the difference between them does not exceed the third preset difference threshold.
  • The correlation between two PODs represents the possibility that the two PODs come from the same application or depend on each other.
  • When grouping, the POD grouping module ensures that the correlation between the PODs in each POD group is not lower than the preset correlation threshold, so that the PODs in a group are highly correlated with one another and, as far as possible, come from the same application or depend on each other; this makes it more convenient to schedule PODs within the node group.
  • Affinity and anti-affinity strategies can then be applied properly: highly correlated PODs are scheduled onto the same node so that their containers can coordinate calls with each other, improving container calling efficiency, while PODs with low correlation are kept off the same node, avoiding conflicts in which the runtime environments or tool components of the containers are mutually exclusive and cause container downtime.
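The POD-grouping conditions above (identical resource requirements per group; total demand at or below, and close to, the node group's resource upper limit) can be sketched roughly as follows. This is an illustrative approximation with simplified one-dimensional resource demands, not the patent's actual grouping module; all names are hypothetical.

```python
def group_pods(pods, capacity):
    """Partition PODs into groups such that every group shares one
    resource-requirement profile and its total demand stays at or
    below the node group's capacity (closeness to the capacity is
    approximated greedily here by packing as many PODs as fit).

    pods: list of (pod_name, demand) with demand in abstract units;
    capacity: upper limit of resources a node group can provide.
    """
    # Condition 1: only PODs with identical requirements share a group.
    by_demand = {}
    for name, demand in pods:
        by_demand.setdefault(demand, []).append(name)

    groups = []
    for demand, names in by_demand.items():
        per_group = max(1, int(capacity // demand))  # PODs that fit
        for i in range(0, len(names), per_group):
            batch = names[i:i + per_group]
            groups.append({"demand": demand, "pods": batch,
                           "total": demand * len(batch)})
    return groups

# Example: with capacity 6, three PODs of demand 2 fill one group
# exactly, and the demand-3 POD forms its own group.
groups = group_pods([("p1", 2), ("p2", 2), ("p3", 2), ("p4", 3)], 6)
```

A production grouping module would also weigh the correlation condition; here the sketch only covers the two resource conditions.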
  • The node grouping module can group nodes by adding labels, for example adding the serial-number label of the corresponding node group to the nodes in that group; the first-level and second-level scheduling modules can then use this label to identify the node group to which a node belongs.
  • Step 303: Allocate a corresponding node group to each POD group from the node groups included in the container cluster.
  • Before allocation, the node grouping module groups the nodes according to the node attributes of each node, so that the resulting node groups meet the following conditions:
  • The resource attributes of a node may include, but are not limited to, the storage, CPU, and disk resources of the cluster server corresponding to the node.
  • The control module obtains the resource attribute information of the server corresponding to each node from the cluster node device and sends it to the node grouping module.
  • The node grouping module can divide nodes with the same resource attributes into one group based on the received resource attribute information. In this way, nodes with special resources, such as special GPUs or high-performance disks, can be divided into special resource node groups, facilitating the batch provision of resources to multiple PODs with special resource requirements.
  • The control module can monitor the resource utilization of each node in real time through the list-watch mechanism and can send each node's resource utilization at a given time to the node grouping module.
  • The node grouping module ensures that the difference in resource utilization between any two nodes in a group is not greater than the first preset difference threshold, that is, nodes with similar resource utilization are grouped together. For subsequent POD scheduling within a node group, the container cluster scheduling device then no longer needs to spend extra computing resources on node load balancing, which improves container scheduling efficiency.
  • Each node group contains nodes from at least two topologies in the container cluster.
  • Specifically, the control module can obtain the topology information of the server corresponding to each node and send it to the node grouping module.
  • The node grouping module uses the cluster topology information of each node's server to group the nodes so that each node group contains nodes belonging to different topologies of the server cluster. When a node in one topology of the group goes down, the nodes in other topologies can continue to provide services, ensuring the high availability of the node group.
  • The node grouping module periodically obtains the resource utilization set of each node group from the control module; the set includes the resource utilization rate of each node in the corresponding node group.
  • For example, the second preset difference threshold may be 10% of the maximum value.
  • The grouping adjustment may redistribute, according to the original grouping rules, only the nodes of those node groups in which the difference between the maximum and minimum values in the resource utilization set exceeds the preset threshold, or it may regroup all node groups in the container cluster according to the original grouping rules.
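The regrouping trigger described above can be illustrated as follows. This is a small sketch assuming, as in the example, that the second preset difference threshold is 10% of each group's maximum utilization; the function name and data shapes are hypothetical.

```python
def needs_regrouping(utilization_sets, ratio=0.10):
    """Return the indices of node groups whose utilization spread
    (max - min over the group's resource utilization set) exceeds
    the second preset difference threshold, taken here as `ratio`
    times the group's maximum utilization.
    """
    flagged = []
    for i, utils in enumerate(utilization_sets):
        spread = max(utils) - min(utils)
        if spread > ratio * max(utils):
            flagged.append(i)
    return flagged

# Group 0 is balanced (spread 0.02); group 1 drifted apart (spread
# 0.60 > 10% of 0.90), so only group 1 would be redistributed.
flagged = needs_regrouping([[0.50, 0.52], [0.90, 0.30]])
```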
  • The first-level scheduling module allocates a corresponding node group to each POD group based on the node group and POD group information obtained from the cache module, and binds the POD groups to node groups through large-grained batch scheduling.
  • The first-level scheduling module first allocates the node groups with special resources to the POD groups with special resource requirements, and then allocates the common resource node groups to the remaining POD groups. When some of those POD groups cannot be assigned a corresponding common resource node group, the first-level scheduling module assigns special resource node groups to them, ensuring that all PODs can run normally, efficiently utilizing the nodes' fragmented resources, and improving node resource utilization.
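The two-pass allocation described above can be sketched as follows. This is an assumption-laden illustration (greedy one-to-one assignment of node groups to POD groups, hypothetical names), not the patent's first-level scheduling module.

```python
def allocate_node_groups(pod_groups, special_groups, common_groups):
    """First-level allocation sketch: special resource node groups go
    to POD groups with special resource requirements first; common
    node groups serve the rest; leftover special groups back-fill POD
    groups that could not get a common group, so every POD can run.

    pod_groups: list of (name, needs_special: bool). Returns a dict
    mapping POD group name -> node group name (or None if exhausted).
    """
    special = list(special_groups)
    common = list(common_groups)
    plan = {}
    # Pass 1: special-demand POD groups take special node groups.
    for name, needs_special in pod_groups:
        if needs_special:
            plan[name] = special.pop(0) if special else None
    # Pass 2: remaining POD groups take common groups, then fall back
    # to any unused special groups (utilizing fragmented resources).
    for name, needs_special in pod_groups:
        if not needs_special:
            if common:
                plan[name] = common.pop(0)
            elif special:
                plan[name] = special.pop(0)
            else:
                plan[name] = None
    return plan

# pg3 gets no common group, so it falls back to the spare special
# group sg2 rather than going unscheduled.
plan = allocate_node_groups(
    [("pg1", True), ("pg2", False), ("pg3", False)],
    special_groups=["sg1", "sg2"], common_groups=["cg1"])
```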
  • Step 304 For each POD included in each POD group, determine the corresponding target node from the node group allocated to the POD group, and bind each POD to the corresponding target node.
  • the second-level scheduling module implements the process of scheduling and binding each POD in each POD group to its corresponding target node.
  • the secondary scheduling module can use the native scheduler of the container cluster.
  • the secondary scheduling module can use the native scheduler of Kubernetes to schedule and bind PODs within the node group, taking into account the node's load balancing, resource utilization rate and high availability.
  • the first-level scheduling module allocates the general resource demand POD group and the special resource demand POD group in batches according to the resource demand attributes of the POD group obtained from the cache module.
  • the secondary scheduling module binds the PODs in each node group to the designated nodes by considering factors such as load balancing and disaster recovery strategies.
  • when the workload of the container cluster server is lower than the preset load threshold, the control module obtains the current activity of each POD through the list-watch mechanism and determines whether it is below the preset activity threshold; the activity represents the probability that the POD's resource demand will expand, or the amount of resources involved when it expands.
  • when the control module determines that the activity of a target POD is lower than the threshold, the secondary scheduling module migrates the target POD to a node whose available resources are lower than the preset resource threshold.
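The migration planning step can be sketched as pairing low-activity pods with nodes that have little spare capacity, consolidating fragmented resources. Thresholds, names and data shapes are illustrative assumptions; a real implementation would also verify that each target node can actually hold the migrated pod.

```python
# Sketch: when cluster load is low, pods whose "activity" (likelihood of
# needing more resources) falls below a threshold are migrated onto nodes
# whose available-resource ratio is below a resource threshold.

def plan_migrations(pod_activity, node_available,
                    activity_threshold=0.3, resource_threshold=0.2):
    """Return [(pod, target_node)] pairs of low-activity pods and
    nodes with little remaining capacity."""
    idle_pods = [p for p, a in sorted(pod_activity.items())
                 if a < activity_threshold]
    tight_nodes = [n for n, avail in sorted(node_available.items())
                   if avail < resource_threshold]
    return list(zip(idle_pods, tight_nodes))

activity = {"pod-a": 0.1, "pod-b": 0.8, "pod-c": 0.2}
available = {"node-1": 0.15, "node-2": 0.6, "node-3": 0.1}
print(plan_migrations(activity, available))
# [('pod-a', 'node-1'), ('pod-c', 'node-3')]
```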
  • the control module can periodically obtain the workload from the cluster server, or obtain it through event triggering, for example upon receiving user feedback that the server response is too slow.
  • when grouping nodes, the node grouping module can also combine the node's resource attributes, its resource utilization rate and the topology of the container cluster to which it belongs. Referring to Figure 5, a schematic flow chart of node grouping provided by an embodiment of the present application, the specific implementation process of this method is as follows:
  • Step 501 Obtain at least one special resource node group based on the nodes with specific resources in the container cluster.
  • the node grouping module directly divides the nodes with specific resources in each cluster topology into special resource node groups, so that resources can be provided in batches for multiple pods with specific resource requirements.
  • Step 502 Based on the resource utilization of each node, perform clustering processing on common resource nodes in the container cluster except nodes with specific resources, and determine the number of groups N according to the clustering results.
  • the node grouping module performs clustering processing on common resource nodes in the container cluster except nodes with specific resources, and uses the obtained clustering results as the group number of common resource node groups.
  • the node grouping module can use the elbow method of the clustering algorithm: it obtains the number of common resource nodes in the container cluster and the resource utilization corresponding to each common resource node from the control unit, and determines the number of node classes through the following formula:

dist = Σ_{i=1}^{q} Σ_{j} | p_i^j − p̄_i |

  • q is the number of classes of common resource nodes
  • p is the total number of common resource nodes in the container cluster
  • p_i^j is the resource utilization rate of the j-th node within the i-th class of nodes, and p̄_i is the average resource utilization of the i-th class
  • dist is the sum, over every node in the container cluster, of the difference between the node's resource utilization and the average resource utilization of its class
  • the corresponding elbow curve diagram is obtained according to the results of the above formula.
  • the number of classes q corresponding to the inflection point of the elbow curve is the final number of node classes obtained through the clustering algorithm. Assuming that the number of container cluster topologies is t, the final number of node groups m is: m = max(q, t, 2)
  • that is, the number of node groups m is determined by comparing the final number of node classes q, the number of container cluster topologies t and the value 2, and the maximum value is taken as the final number of common resource node groups.
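The group-count selection above can be sketched end to end: cluster the node utilizations for each candidate class count q, compute dist(q), pick the elbow, and take m = max(q, t, 2). This is an illustrative sketch only; the tiny 1-D k-means and the "largest drop in dist" elbow criterion are assumptions, since the patent text does not spell out the exact clustering or inflection-point test.

```python
# Sketch: elbow-method selection of the common-resource group count.

def one_d_kmeans(values, q, iters=20):
    """Very small 1-D k-means; returns per-class lists of values."""
    values = sorted(values)
    # initialize centers at evenly spaced positions in the sorted values
    centers = [values[int(i * (len(values) - 1) / max(q - 1, 1))]
               for i in range(q)]
    for _ in range(iters):
        classes = [[] for _ in range(q)]
        for v in values:
            idx = min(range(q), key=lambda i: abs(v - centers[i]))
            classes[idx].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(classes)]
    return classes

def dist(classes):
    """Sum of |utilization - class mean| over all nodes."""
    return sum(abs(v - sum(c) / len(c)) for c in classes if c for v in c)

def choose_group_count(utilizations, topologies):
    """m = max(q_elbow, t, 2), with q_elbow picked by the largest dist drop."""
    p = len(utilizations)
    costs = {q: dist(one_d_kmeans(utilizations, q))
             for q in range(1, min(p, 6) + 1)}
    q_elbow = max(range(2, len(costs) + 1),
                  key=lambda q: costs[q - 1] - costs[q])
    return max(q_elbow, topologies, 2)

utils = [0.1, 0.12, 0.5, 0.52, 0.9, 0.92]  # three natural clusters
print(choose_group_count(utils, topologies=3))  # 3
```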
  • Step 503 Sort the common resource nodes in each topology in the container cluster in ascending order of resource utilization, and obtain the common resource node queue corresponding to each topology.
  • to ensure that the difference in resource utilization between common resource node groups is not greater than the first preset difference threshold and that each group includes nodes from at least two topologies in the container cluster, the node grouping module arranges the common resource nodes in each topology into a common resource node queue, as shown in Figure 6, in ascending order of resource utilization for the subsequent grouping process.
  • Step 504 Divide each common resource node queue into N sub-queues.
  • the order of common resource nodes in each sub-queue is consistent with the corresponding common resource node queue.
  • the minimum value of resource utilization in one of any two sub-queues is greater than the maximum value of resource utilization in the other sub-queue.
  • the node grouping module divides the common resource node queue of each topology into N sub-queues whose order is consistent with the queue, according to the determined number of node groups N, thereby ensuring that the difference in resource utilization within each sub-queue is not greater than the first preset difference threshold.
  • the node grouping module can also extract a certain number of nodes from each common resource node queue, in queue order, to form a node group, where the number of extracted nodes is determined by the ratio of the total number of nodes in each common resource node queue to the determined number of node groups. This not only makes each node group contain nodes from different topologies as far as possible, but also ensures that node resource utilization within each node group is similar, so that when POD scheduling is performed within the node group, the high availability of node resources can be ensured and load-balancing needs can be met.
  • Step 505 Combine the nodes included in the sub-queues at corresponding positions across the N sub-queues of each common resource node queue to obtain N common resource node groups.
  • the node grouping module combines the nodes included in the sub-queues at corresponding positions across the N sub-queues of each topology's common resource node queue to obtain N common resource node groups. This ensures that each node group contains nodes from different topologies and that the resource utilization of the nodes within each node group is similar, ensuring high node availability and meeting the load-balancing requirements of POD scheduling within the node group.
  • the node grouping module can also extract a certain number of nodes from each common resource node queue in queue order to form a node group, finally obtaining N common resource node groups, where the number of extracted nodes is determined by the ratio of the total number of nodes in each common resource node queue to the determined number of node groups.
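Steps 503-505 can be sketched as follows: sort each topology's common resource nodes by utilization, split each queue into N contiguous sub-queues, then merge the sub-queues at the same position across topologies into one node group. The node names and utilization values are illustrative assumptions.

```python
# Sketch of queue splitting and positional combination into node groups.

def split_into(queue, n):
    """Split a sorted queue into n contiguous sub-queues of near-equal size."""
    size, rem = divmod(len(queue), n)
    subs, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < rem else 0)
        subs.append(queue[start:end])
        start = end
    return subs

def group_nodes(topologies, n):
    """topologies: {topology: {node: utilization}} -> list of n node groups.
    Each group mixes nodes from every topology at similar utilization."""
    groups = [[] for _ in range(n)]
    for nodes in topologies.values():
        queue = sorted(nodes, key=nodes.get)  # ascending utilization
        for i, sub in enumerate(split_into(queue, n)):
            groups[i].extend(sub)
    return groups

topo = {
    "t1": {"a": 0.1, "b": 0.3, "c": 0.6, "d": 0.8},
    "t2": {"e": 0.2, "f": 0.5, "g": 0.7},
    "t3": {"h": 0.4, "i": 0.9},
}
print(group_nodes(topo, 2))
# [['a', 'b', 'e', 'f', 'h'], ['c', 'd', 'g', 'i']]
```

Group 0 collects the low-utilization sub-queues of all three topologies, group 1 the high-utilization ones, matching the Figure 6 example.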
  • FIG. 6 is a schematic diagram of grouping nodes within a cluster topology.
  • the example cluster contains 3 topologies, and topology 1 contains 4 common resource nodes and 1 node with specific resources.
  • Topology 2 contains 3 ordinary resource nodes and 1 node with specific resources.
  • Topology 3 contains 2 ordinary resource nodes and 1 node with specific resources.
  • the black areas in the node rectangles in the figure represent the resource utilization of the nodes.
  • the node grouping module will directly divide the nodes with specific resources in the three topologies into a special resource node group. For the other common resource nodes in each topology, the node grouping module will arrange them in ascending order of resource utilization, obtaining three common resource node queues corresponding to the three topologies. Each of the three common resource node queues is then divided into two sub-queues in queue order, and the nodes included in the sub-queues at corresponding positions across the three topologies are combined to obtain two common resource node groups.
  • In a possible implementation, refer to Figure 7, which is a grouping flow chart of the node grouping module. The specific implementation steps of the method are as follows:
  • Step 701 Determine the number of groups that all nodes in the container cluster need to be divided into.
  • Step 702 Based on the node's resource utilization, sort the nodes in each cluster topology from small to large according to the node's resource utilization, and obtain the node queue corresponding to each cluster topology.
  • Step 703 Extract a certain number of nodes from the node queue corresponding to each cluster topology in proportion to form a node group, and obtain multiple node groups.
  • the number of nodes extracted is determined by the ratio of the total number of nodes to the number of groups.
  • Step 704 Add a node group serial number label to each node group, and add corresponding node group serial number labels to the nodes in the same node group.
  • Step 705 Determine whether a new node is generated in the container cluster. If so, perform steps 706-707. If not, perform step 708.
  • Step 706 Obtain the resource utilization rate of the new node, compare it with the resource utilization rates of the original nodes in the cluster, and determine the original node whose resource utilization differs least from that of the new node.
  • Step 707 Query the node group serial number label of the node, and divide the new node into the node group corresponding to the serial number label.
  • Step 708 Determine whether there is a node group in which the difference between the maximum value and the minimum value of node resource utilization in the node group is greater than the second preset difference threshold. If so, perform step 701.
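Steps 706-707 above, where a newly added node joins the group of the existing node with the closest utilization, can be sketched like this. Group labels and utilization values are illustrative assumptions.

```python
# Sketch of steps 706-707: assign a new node to the group containing the
# existing node whose resource utilization is closest to the new node's.

def assign_new_node(new_util, node_groups):
    """node_groups: {group_label: {node: utilization}}.
    Return the label of the group with the closest existing node."""
    best_label, best_diff = None, float("inf")
    for label, nodes in node_groups.items():
        for util in nodes.values():
            diff = abs(util - new_util)
            if diff < best_diff:
                best_label, best_diff = label, diff
    return best_label

groups = {"g1": {"n1": 0.15, "n2": 0.25},
          "g2": {"n3": 0.60, "n4": 0.75}}
print(assign_new_node(0.70, groups))  # g2
```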
  • Figure 8 is a flow chart of a container cluster scheduling process. The specific implementation steps of the method are as follows:
  • Step 801 Group all nodes in the container cluster to obtain multiple node groups.
  • Step 802 Determine whether a container scheduling request is received. If yes, execute step 803. If not, end scheduling.
  • Step 803 Create multiple PODs according to the container scheduling request.
  • Step 804 Group multiple PODs according to the resource requirements of each POD to obtain multiple POD groups.
  • Step 805 Determine whether there is a special resource requirement POD group. If yes, execute step 806. If not, execute step 807.
  • Step 806 Assign the node group with special resources to the special resource requirement POD group.
  • Step 807 Assign the common resource node group to the common resource requirement POD group.
  • Step 808 Determine whether there is a common resource requirement POD group that is not allocated to the common resource node group. If yes, execute step 809. If not, execute step 810.
  • Step 809 Assign the node group with special resources to the unallocated general resource requirement POD group.
  • Step 810 Determine the corresponding target node from the node group allocated to each POD group, and bind each POD to the corresponding target node.
  • Step 811 Determine whether the workload of the container cluster server is lower than the preset load threshold. If yes, execute step 812. If not, execute step 815.
  • Step 812 Determine whether there is a POD group whose activity is lower than the preset activity threshold. If yes, execute step 813. If not, execute step 815.
  • Step 813 Group multiple PODs according to the activity level of each POD to obtain multiple POD groups.
  • Step 814 Assign the POD group whose activity is lower than the preset activity threshold to the node group whose available resources are lower than the preset resource threshold.
  • Step 815 Determine whether there is a node group in which the difference between the maximum value and the minimum value of node resource utilization in the node group is greater than the second preset difference threshold. If yes, perform step 801; if not, perform step 802.
  • the embodiment of this application proposes a hierarchical pod scheduling method, adds a batch scheduling algorithm, and realizes pod batch scheduling.
  • the purpose is to improve the scheduling efficiency of containers in the cluster while satisfying resource balance, to disperse the pressure on the central scheduler, and to achieve higher resource utilization.
  • hierarchical scheduling improves scheduling efficiency and reduces scheduler pressure.
  • the Node grouping algorithm comprehensively considers the node resource specificity and resource usage, and adds node topology characteristics to dynamically group during operation.
  • the Pod grouping algorithm is divided into two stages: grouping at creation time and grouping at runtime. According to the pod activity grouping strategy, the cluster's fragmented resources are consolidated while pods are running, further improving resource utilization.
  • this embodiment of the present application also provides a container cluster scheduling device 90, which includes:
  • the creation unit 901 is configured to create multiple PODs based on the received container scheduling request, where the scheduling request is used to indicate the resource requirements of the multiple PODs;
  • the grouping unit 902 is used to group multiple PODs based on the resource requirements of each POD and the upper limit of the resource amount provided by each node group to obtain multiple POD groups, each POD group containing at least one POD;
  • the allocation unit 903 is used to allocate corresponding node groups to each POD group from the node groups included in the container cluster;
  • the binding unit 904 is configured to determine the corresponding target node from the node group allocated to the POD group for each POD included in each POD group, and bind each POD to the corresponding target node.
  • grouping unit 902 is specifically used for:
  • the node attributes between multiple nodes in each node group have one or more of the following relationships:
  • the resource attributes are the same;
  • the difference between resource utilization rates is not greater than the first preset difference threshold
  • grouping unit 902 is specifically used for:
  • divide each common resource node queue into N sub-queues;
  • the order of common resource nodes in each sub-queue is consistent with the corresponding common resource node queue.
  • the minimum value of resource utilization in one of any two sub-queues is greater than the maximum value of resource utilization in the other sub-queue;
  • the nodes included in the sub-queues at corresponding positions are combined to obtain N common resource node groups.
  • grouping unit 902 is specifically used for:
  • the resource utilization set includes the resource utilization rate of each node in the corresponding node group
  • grouping unit 902 is specifically used for:
  • the PODs within each POD group have the same resource requirements;
  • the total resource demand of each POD group is less than or equal to the resource upper limit, and the difference between the total resource demand and the resource upper limit does not exceed the third preset difference threshold;
  • the correlation between each POD in each POD group is not lower than the preset correlation threshold.
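The first two grouping constraints above can be sketched as a simple packing: pods with identical demand are grouped so that each group's total demand stays at or below the node group's resource upper limit. The limit value and pod shapes are illustrative assumptions, and the correlation constraint is omitted for brevity.

```python
# Sketch of POD grouping: same-demand pods packed under a resource upper
# limit, one group per batch of pods that fits within the limit.

def group_pods(pods, upper_limit):
    """pods: {pod: demand}. Returns a list of groups (lists of pod names);
    each group holds same-demand pods with total demand <= upper_limit."""
    by_demand = {}
    for pod, demand in pods.items():
        by_demand.setdefault(demand, []).append(pod)
    groups = []
    for demand, names in sorted(by_demand.items()):
        per_group = max(int(upper_limit // demand), 1)
        for i in range(0, len(names), per_group):
            groups.append(names[i:i + per_group])
    return groups

pods = {"p1": 2, "p2": 2, "p3": 2, "p4": 4, "p5": 4}
print(group_pods(pods, upper_limit=4))
# [['p1', 'p2'], ['p3'], ['p4'], ['p5']]
```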
  • allocation unit 903 is specifically used for:
  • the device also includes a migration unit 905, specifically used for:
  • the activity represents the probability of the POD's resource demand, or the amount of resources when the resource demand expands;
  • when the activity of the target POD is lower than the preset activity threshold, the target POD is migrated to a node whose available resources are lower than the preset resource threshold.
  • multiple PODs are created according to the resource requirements indicated by the received container scheduling request, and the created PODs are grouped according to those resource requirements and the upper limit of the amount of resources a node group can provide; a corresponding node group is then allocated to each POD group from the node groups included in the container cluster, and for each POD included in each POD group, the corresponding target node is determined from the node group allocated to that POD group and each POD is bound to the corresponding target node.
  • the container cluster scheduling method adopted in the embodiment of the present application groups PODs and nodes respectively, and then uses large-grained batch scheduling from POD groups to node groups together with the hierarchical mechanism of POD scheduling within the node group, satisfying node load balancing.
  • hierarchical scheduling improves the scheduling efficiency of the container cluster, while reducing the load pressure on the native scheduler of the container cluster, making the resource utilization of the node higher.
  • the functions of each unit or module can be implemented in one or more pieces of software or hardware.
  • the device can be used to perform the methods shown in the embodiments of the present application. Therefore, for the functions that can be implemented by each functional module of the device, reference can be made to the description of the previous embodiments and will not be described again.
  • an embodiment of the present application also provides a computer device.
  • the computer device may be the container cluster scheduling device shown in Figure 1.
  • the computer device includes a memory 1001, a communication module 1003 and one or more processors 1002.
  • Memory 1001 is used to store computer programs executed by processor 1002.
  • the memory 1001 may mainly include a program storage area and a data storage area.
  • the program storage area may store the operating system and programs required to run instant messaging functions.
  • the storage data area may store various instant messaging information and operating instruction sets.
  • the memory 1001 may be a volatile memory, such as a random-access memory (RAM); the memory 1001 may also be a non-volatile memory, such as a read-only memory, a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or the memory 1001 may be, without limitation, any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • the memory 1001 may be a combination of the above memories.
  • Processor 1002 may include one or more central processing units (CPUs), digital processing units, etc.
  • the processor 1002 is configured to implement the above container cluster scheduling method when calling the computer program stored in the memory 1001.
  • the communication module 1003 is used to communicate with external devices, such as sensors and other servers.
  • the embodiment of the present application does not limit the specific connection medium between the above-mentioned memory 1001, communication module 1003 and processor 1002.
  • the memory 1001 and the processor 1002 are connected through a bus 1004 in Figure 10.
  • the bus 1004 is depicted as a thick line in Figure 10; the connections between other components are shown only schematically and are not limiting.
  • the bus 1004 can be divided into an address bus, a data bus, a control bus, etc. For ease of description, only one thick line is used in Figure 10, but this does not mean that there is only one bus or only one type of bus.
  • a computer storage medium is stored in the memory 1001, and computer executable instructions are stored in the computer storage medium.
  • the computer executable instructions are used to implement the container cluster scheduling method in the embodiment of the present application.
  • the processor 1002 is used to execute the container cluster scheduling methods in the above embodiments.
  • various aspects of the container cluster scheduling method provided by this application can also be implemented in the form of a program product, which includes program code.
  • when the program product is run on a computer device, the program code is used to cause the computer device to perform the steps in the container cluster scheduling method according to the various exemplary embodiments of the present application described above in this specification; for example, the computer device may perform the steps of each embodiment.
  • the Program Product may take the form of one or more readable media in any combination.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device or device, or any combination thereof. More specific examples (non-exhaustive list) of readable storage media include: electrical connection with one or more wires, portable disk, hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • the program product of embodiments of the present application may take the form of a portable compact disk read-only memory (CD-ROM), include the program code, and run on a computing device.
  • the program product of the present application is not limited thereto.
  • the readable storage medium may be any tangible medium containing or storing a program, which may be used by or in combination with a command execution system, device or device.
  • the readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transport a program for use by or in connection with a command execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.
  • the program code for performing the operations of the present application can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, via the Internet using an Internet service provider).
  • embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

Abstract

The present application discloses a container cluster scheduling method and apparatus, a device and a storage medium, relating to the field of computer technology. The method comprises: creating multiple PODs according to the POD resource requirements indicated by a received container scheduling request; grouping the created PODs according to the POD resource requirements and an upper-limit value of the amount of resources that a node group can provide; then allocating a corresponding node group to each POD group from the node groups included in a container cluster, and, for each POD included in each POD group, determining a corresponding target node from the node group allocated to the POD group and binding each POD to the corresponding target node. The container cluster scheduling method used in an embodiment of the present application improves container cluster scheduling efficiency, relieves the load pressure on the container cluster's native scheduler, and increases node resource utilization while satisfying node load balancing.
PCT/CN2022/141606 2022-07-21 2022-12-23 Procédé et appareil de planification de grappe de conteneurs, dispositif et support d'enregistrement WO2024016596A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210869509.4 2022-07-21
CN202210869509.4A CN115408100A (zh) 2022-07-21 2022-07-21 容器集群调度的方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2024016596A1 true WO2024016596A1 (fr) 2024-01-25

Family

ID=84157725

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/141606 WO2024016596A1 (fr) 2022-07-21 2022-12-23 Procédé et appareil de planification de grappe de conteneurs, dispositif et support d'enregistrement

Country Status (2)

Country Link
CN (1) CN115408100A (fr)
WO (1) WO2024016596A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117729204A (zh) * 2024-02-06 2024-03-19 山东大学 一种基于监控感知的k8s容器调度方法及系统
CN117971505B (zh) * 2024-03-29 2024-06-07 苏州元脑智能科技有限公司 部署容器应用的方法及装置

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115408100A (zh) * 2022-07-21 2022-11-29 天翼云科技有限公司 容器集群调度的方法、装置、设备及存储介质
CN116132447A (zh) * 2022-12-21 2023-05-16 天翼云科技有限公司 一种基于Kubernetes的负载均衡方法及其装置
CN116483547A (zh) * 2023-06-21 2023-07-25 之江实验室 资源调度方法、装置、计算机设备和存储介质
CN117170811A (zh) * 2023-09-07 2023-12-05 中国人民解放军国防科技大学 一种基于volcano的节点分组作业调度方法及系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108228354A (zh) * 2017-12-29 2018-06-29 杭州朗和科技有限公司 调度方法、系统、计算机设备和介质
US20190243687A1 (en) * 2018-02-05 2019-08-08 Red Hat, Inc. Baselining for compute resource allocation
CN112905297A (zh) * 2019-12-03 2021-06-04 中国电信股份有限公司 容器集群资源调度方法和装置
CN113204428A (zh) * 2021-05-28 2021-08-03 北京市商汤科技开发有限公司 资源调度方法、装置、电子设备以及计算机可读存储介质
CN114706596A (zh) * 2022-04-11 2022-07-05 中国电信股份有限公司 容器部署方法、资源调度方法、装置、介质和电子设备
CN115408100A (zh) * 2022-07-21 2022-11-29 天翼云科技有限公司 容器集群调度的方法、装置、设备及存储介质

Also Published As

Publication number Publication date
CN115408100A (zh) 2022-11-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22951852

Country of ref document: EP

Kind code of ref document: A1