AU2018100381A4

AU2018100381A4 - A physical resource scheduling method in cloud cluster

Info

Publication number: AU2018100381A4
Application number: AU2018100381A
Authority: AU
Inventors: Chenquan Gan; Yi Jiang; Qingyi ZHU
Original assignee: Chongqing University of Post and Telecommunications
Current assignee: Chongqing University of Post and Telecommunications
Priority date: 2018-03-27
Filing date: 2018-03-27
Publication date: 2018-05-10
Anticipated expiration: 2026-03-27

Abstract

A cloud task scheduling algorithm and efficient physical resource scheduling method to improve the utilization of resources and enhance the reliability of their application, which includes the following steps: 1. When a new task arrives at the resource pool, the type and priority of the task must be determined; 2. Gather each node's resource utilization information in the cloud cluster, where the resources include CPU, memory, network and storage; 3. Calculate the score of each node; 4. Obtain a queue by sorting the scores of nodes in the cloud cluster; 5. Assign the pods required by the task to the optimum nodes.

[Figure 1: task arrival, analysis module, scheduling system and resource information collector, above Node 1 to Node N of the cloud cluster, each node with CPU, MEM, NET and DISK resources.]

Description

A physical resource scheduling method in cloud cluster

1 TECHNICAL FIELD

The present invention relates to cloud computing and, more specifically, to a cloud task scheduling algorithm.

2 BACKGROUND

With the development of cloud computing, many large enterprises, research institutions, and governments have established their own data centers to provide storage and/or computing services to the public. Due to the large scale of resources, the different types of service requirements, and the complexity of applications, it is important to develop an efficient physical resource scheduling method to improve the utilization of resources and enhance the reliability of applications.

Kubernetes is a portable, extensible open-source platform for managing containerized workloads and services that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.

However, the resource scheduling methods built into Kubernetes are quite simple and cannot meet the diverse needs of users. At present, the main scheduling methods of Kubernetes are of two kinds: Predicates and Priorities. These approaches have some disadvantages. Firstly, the scoring method for CPU and memory resources cannot be applied in a heterogeneous cluster. Secondly, only CPU and memory resources are considered, whereas network and storage resources are not involved. Thirdly, because only a single Pod is scheduled at a time, the consistency of several Pods in the same Service cannot be guaranteed.

3 DESCRIPTION

The scheduling method proposed in this invention includes the following steps:

Step 1. When a new task arrives at the resource pool, the type $T$ and priority $P$ of the task must be determined, where $T \in \{1, 2, 3, \ldots, T_{max}\}$ and $P \in \{1, 2, 3, \ldots, P_{max}\}$. More specifically, the type $T$ can be determined through a K-means clustering algorithm based on the resource requirements of the task, and the priority $P$ can likewise be determined through the K-means clustering algorithm based on the request URL.
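As an illustration of how Step 1 might classify tasks, the following is a minimal sketch of K-means (Lloyd's algorithm) applied to resource requirements. The feature layout (CPU cores, memory, network, disk), the sample values, and the seeding of centroids with the first k points are all assumptions made for this example; the source does not specify them. In practice the features would probably need to be normalised so that one dimension does not dominate the distance.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    # Assumption: the first k points seed the centroids, which keeps the run deterministic.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels

# Hypothetical task requirements: (CPU cores, memory GiB, network Mbit/s, disk GiB).
tasks = [
    (8.0, 4.0, 10.0, 5.0),    # compute-heavy
    (0.5, 1.0, 900.0, 5.0),   # transfer-heavy
    (7.5, 4.5, 12.0, 6.0),    # compute-heavy
    (0.6, 1.2, 950.0, 4.0),   # transfer-heavy
]
centroids, labels = kmeans(tasks, k=2)
task_types = [lab + 1 for lab in labels]  # shift to T in {1, ..., T_max}
```

Here k plays the role of T_max, and each cluster index becomes a task type.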

Step 2. Gather each node’s resource utilization information in the cloud cluster, where the resources include CPU, memory, network and storage.

Step 3. Calculate the score of each node as follows:

$$S^i_{total} = w_{cpu} S^i_{CPU} + w_{memory} S^i_{memory} + w_{network} S^i_{network} + w_{disk} S^i_{disk},$$

where $S^i_{total}$ is the total score of node $i$; $S^i_{CPU}$, $S^i_{memory}$, $S^i_{network}$, $S^i_{disk}$ denote the resource scores of node $i$ in CPU, memory, network and storage respectively; and $w_{cpu}$, $w_{memory}$, $w_{network}$, $w_{disk}$ denote the weights of the corresponding indexes. Moreover, $w_{cpu}$, $w_{memory}$, $w_{network}$, $w_{disk}$ are positive constants, and $w_{cpu} + w_{memory} + w_{network} + w_{disk} = 1$.
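The weighted sum of Step 3 can be sketched as below. The function name, the dictionary keys, and the numeric values are hypothetical; only the two constraints (positive weights that sum to 1) come from the text.

```python
import math

RESOURCES = ("cpu", "memory", "network", "disk")

def total_score(scores, weights):
    """Weighted sum of the four per-resource scores of one node."""
    if any(weights[r] <= 0 for r in RESOURCES):
        raise ValueError("weights must be positive")
    if not math.isclose(sum(weights[r] for r in RESOURCES), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(weights[r] * scores[r] for r in RESOURCES)

# Hypothetical weights and per-resource scores for a single node.
weights = {"cpu": 0.4, "memory": 0.3, "network": 0.2, "disk": 0.1}
node_scores = {"cpu": 0.5, "memory": 0.8, "network": 1.0, "disk": 0.6}
s_total = total_score(node_scores, weights)  # 0.20 + 0.24 + 0.20 + 0.06
```

Because every per-resource score lies between 0 and 1 and the weights sum to 1, the total score also lies between 0 and 1.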

In a hybrid cloud consisting of heterogeneous nodes, it is highly likely that there are various types of CPUs, memory modules, network cards and disks. Although many indicators can affect the performance of a CPU, here we take the dominant frequency as its Key Performance Indicator (KPI) for simplicity. Similarly, we take the capacity, bandwidth, and size as the KPIs of memory modules, network cards and disks respectively.

Furthermore, we set the score of the node whose CPU has the best performance in the hybrid cloud to 1 when its CPU is completely unoccupied. Then for any node $i$ in this hybrid cloud, we have

$$S^i_{CPU} = (1 - R\_CPU_i) \cdot \frac{f\_CPU_i}{f\_CPU_{max}},$$

where $f\_CPU_{max}$ represents the highest dominant CPU frequency in the cloud, $f\_CPU_i$ is the CPU frequency of node $i$, and $R\_CPU_i$ ($0 \le R\_CPU_i \le 100\%$) is its utilization rate. Obviously, for any node $i$, $0 \le S^i_{CPU} \le 1$, and lower utilization and higher performance lead to a higher score.
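A short worked example may help. It reads the CPU score as the free fraction of the CPU multiplied by the ratio of the node's frequency to the highest frequency in the cloud, which is what the definitions above imply. The function name and the frequencies are hypothetical.

```python
def cpu_score(utilization, frequency_ghz, max_frequency_ghz):
    """CPU score: (1 - utilization) * frequency / max frequency, utilization in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    if not 0.0 < frequency_ghz <= max_frequency_ghz:
        raise ValueError("frequency must be positive and at most the cluster maximum")
    return (1.0 - utilization) * frequency_ghz / max_frequency_ghz

# The fastest node, fully idle, gets the reference score of 1.
fast_idle = cpu_score(0.0, 3.6, 3.6)
# A slower node (2.4 GHz) at 50% utilization: 0.5 * 2.4 / 3.6.
slow_half = cpu_score(0.5, 2.4, 3.6)
```

The slower, half-busy node scores about one third, so it ranks well below the idle reference node even though half of its CPU is free.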

Similarly, we set the score of the node which has the maximum memory size in the cloud to 1 when its memory is completely unoccupied. Then for any node $i$, we have

$$S^i_{memory} = (1 - R\_memory_i) \cdot \frac{f\_memory_i}{f\_memory_{max}},$$

where $f\_memory_{max}$ represents the maximum memory capacity of one node in the cloud, $f\_memory_i$ is the memory capacity of node $i$, and $R\_memory_i$ ($0 \le R\_memory_i \le 100\%$) is its utilization rate. Obviously, for any node $i$, $0 \le S^i_{memory} \le 1$.

For the network resource score of node $i$, we similarly have

$$S^i_{network} = (1 - R\_network_i) \cdot \frac{f\_network_i}{f\_network_{max}},$$

where $f\_network_{max}$ represents the maximum bandwidth of the network cards in the cloud, $f\_network_i$ is the network card bandwidth of node $i$, and $R\_network_i$ ($0 \le R\_network_i \le 100\%$) is its utilization rate. Obviously, for any node $i$, $0 \le S^i_{network} \le 1$.

And for the storage resource score, we have

$$S^i_{disk} = (1 - R\_disk_i) \cdot \frac{f\_disk_i}{f\_disk_{max}},$$

where $f\_disk_{max}$ represents the maximum storage capacity of one node in the cloud, $f\_disk_i$ is the disk storage capacity of node $i$, and $R\_disk_i$ ($0 \le R\_disk_i \le 100\%$) is its utilization rate. Obviously, for any node $i$, $0 \le S^i_{disk} \le 1$.
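Since the four resource scores share one form, they can be computed in a single pass over the cluster. The sketch below assumes each node is described by a hypothetical record with a "kpi" entry (frequency in GHz, memory in GiB, bandwidth in Gbit/s, disk in GB) and a "util" entry (utilization as a fraction); these field names and values are illustrative only.

```python
RESOURCES = ("cpu", "memory", "network", "disk")

def resource_scores(nodes):
    """Return {node_name: {resource: score}} with score = (1 - util) * kpi / max kpi."""
    # The per-resource maximum over the whole cluster is the normalisation reference.
    kpi_max = {r: max(n["kpi"][r] for n in nodes.values()) for r in RESOURCES}
    return {
        name: {
            r: (1.0 - n["util"][r]) * n["kpi"][r] / kpi_max[r]
            for r in RESOURCES
        }
        for name, n in nodes.items()
    }

# Two hypothetical heterogeneous nodes.
nodes = {
    "node-1": {
        "kpi": {"cpu": 3.0, "memory": 64.0, "network": 10.0, "disk": 2000.0},
        "util": {"cpu": 0.5, "memory": 0.25, "network": 0.0, "disk": 0.5},
    },
    "node-2": {
        "kpi": {"cpu": 1.5, "memory": 32.0, "network": 1.0, "disk": 1000.0},
        "util": {"cpu": 0.0, "memory": 0.0, "network": 0.5, "disk": 0.0},
    },
}
scores = resource_scores(nodes)
```

In this example a busy but powerful node and an idle but weaker node end up with the same CPU score (0.5), which is the heterogeneity effect the normalisation is meant to capture.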

Moreover, different tasks commonly require different resources. For example, a compute-intensive task requires higher performance in computing resources such as CPU and memory, whereas a data-transmission task requires higher performance in network resources. Hence, we initially define different combinations of weights for different types of tasks. The user can also change the weights dynamically according to the actual situation.
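One possible way to hold per-type weight combinations with user adjustments is sketched below. The profile names, the weight values, and the choice to renormalise after an override (so that the weights still sum to 1) are assumptions for illustration, not details given in the source.

```python
# Hypothetical default weight profiles, one per task type.
DEFAULT_WEIGHTS = {
    "compute-intensive": {"cpu": 0.4, "memory": 0.4, "network": 0.1, "disk": 0.1},
    "data-transmission": {"cpu": 0.1, "memory": 0.1, "network": 0.7, "disk": 0.1},
}

def weights_for(task_type, overrides=None):
    """Return the weight profile for a task type, renormalised after user overrides."""
    weights = dict(DEFAULT_WEIGHTS[task_type])
    weights.update(overrides or {})
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must stay positive")
    total = sum(weights.values())
    # Assumption: rescale so the adjusted weights sum to 1 again.
    return {name: w / total for name, w in weights.items()}

# A user raises the disk weight for a data-transmission task.
tuned = weights_for("data-transmission", {"disk": 0.3})
```

After the override the raw weights sum to 1.2, so each is divided by 1.2; the network weight remains the largest.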

Step 4. Obtain a queue $Q[N]$ by sorting the scores $S^i_{total}$ of all $N$ nodes in the cloud cluster, where $Q[1] \ge Q[2] \ge Q[3] \ge \ldots \ge Q[N]$.
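Step 4 amounts to a descending sort. In the sketch below the node names and scores are hypothetical, and breaking ties by node name is an added assumption that only serves to make the order reproducible.

```python
def build_queue(total_scores):
    """Return [(node_name, score), ...] sorted by descending total score."""
    # Ties are broken by node name so that the order is reproducible.
    return sorted(total_scores.items(), key=lambda item: (-item[1], item[0]))

queue = build_queue({"node-a": 0.42, "node-b": 0.91, "node-c": 0.67, "node-d": 0.67})
```

Keeping the node name next to each score lets the later selection step map the chosen queue positions back to actual nodes.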

Step 5. Assuming that the new task requires $m$ Pods in Kubernetes, they must be assigned to the $m$ optimum nodes from $Q[N]$, where $m \ll N$. For a given task priority $P$, $C_P$ is defined as the corresponding consistency threshold. Then the $m$ nodes can be selected through the following substeps.

Substep 5.1. Set $i = 1$.

Substep 5.2. If $i > N - m + 1$, return empty and end the process; otherwise, choose the $m$ consecutive scores of $Q[N]$ starting from $Q[i]$, i.e., $Q[i], Q[i+1], Q[i+2], \ldots, Q[i+m-1]$.

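Substep 5.3. Calculate the consistency of the chosen $m$ scores, which is defined as their statistical variance, i.e.,

$$\sigma = \frac{1}{m} \sum_{j=i}^{i+m-1} \left(Q[j] - \bar{Q}_i\right)^2, \quad \text{where} \quad \bar{Q}_i = \frac{1}{m} \sum_{j=i}^{i+m-1} Q[j].$$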

Substep 5.4. Compare $\sigma$ with $C_P$. If $\sigma > C_P$, set $i = i + 1$ and repeat Substeps 5.2 to 5.4; otherwise, take the current value of $i$ and end the process.

As a result, we obtain the $m$ nodes, i.e., node $i$, node $i + 1$, ..., node $i + m - 1$.
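Substeps 5.1 to 5.4 describe a sliding window over the sorted queue. The following is a minimal sketch under two assumptions: indices are 0-based (so the text's i = 1 is index 0), and the variance is normalised by m. The scores and the threshold are hypothetical.

```python
def select_nodes(queue, m, threshold):
    """Slide a window of m consecutive scores over a descending queue.

    Returns the 0-based positions of the chosen nodes, or [] if no window
    has a variance at or below the threshold.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    n = len(queue)
    i = 0                                  # Substep 5.1 (i = 1 in the 1-based text)
    while i <= n - m:                      # Substep 5.2: stop once i > N - m + 1
        window = queue[i:i + m]
        mean = sum(window) / m
        # Substep 5.3: variance of the window (1/m normalisation is an assumption).
        variance = sum((q - mean) ** 2 for q in window) / m
        if variance <= threshold:          # Substep 5.4: accept unless variance > threshold
            return list(range(i, i + m))
        i += 1
    return []

scores = [0.95, 0.60, 0.58, 0.57, 0.20]   # already sorted in descending order
chosen = select_nodes(scores, m=3, threshold=0.001)
```

With these numbers the first window (0.95, 0.60, 0.58) is rejected because one node is much stronger than the others, and the second window (0.60, 0.58, 0.57) is accepted. The procedure therefore trades some raw score for consistency among the Pods of one task, and a tighter threshold pushes the selection further down the queue.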

4 FIGURES

Figure 1 is the system framework of the proposed physical resource scheduling method.

Figure 2 briefly illustrates the proposed physical resource scheduling.

Claims

The claims defining the invention are as follows:

1. The scheduling method proposed in this invention includes the following steps:

Step 1. When a new task arrives at the resource pool, the type $T$ and priority $P$ of the task must be determined, where $T \in \{1, 2, 3, \ldots, T_{max}\}$ and $P \in \{1, 2, 3, \ldots, P_{max}\}$.

Step 2. Gather each node's resource utilization information in the cloud cluster, where the resources include CPU, memory, network and storage.

Step 3. Calculate the score of each node as follows:

$$S^i_{total} = w_{cpu} S^i_{CPU} + w_{memory} S^i_{memory} + w_{network} S^i_{network} + w_{disk} S^i_{disk},$$

where $S^i_{total}$ is the total score of node $i$; $S^i_{CPU}$, $S^i_{memory}$, $S^i_{network}$, $S^i_{disk}$ denote the resource scores of node $i$ in CPU, memory, network and storage respectively; and $w_{cpu}$, $w_{memory}$, $w_{network}$, $w_{disk}$ denote the weights of the corresponding indexes. Moreover, $w_{cpu}$, $w_{memory}$, $w_{network}$, $w_{disk}$ are positive constants, and $w_{cpu} + w_{memory} + w_{network} + w_{disk} = 1$.

Furthermore, we set the score of the node whose CPU has the best performance in the hybrid cloud to 1 when its CPU is completely unoccupied. Then for any node $i$ in this hybrid cloud, we have

$$S^i_{CPU} = (1 - R\_CPU_i) \cdot \frac{f\_CPU_i}{f\_CPU_{max}},$$

where $f\_CPU_{max}$ represents the highest dominant CPU frequency in the cloud, $f\_CPU_i$ is the CPU frequency of node $i$, and $R\_CPU_i$ ($0 \le R\_CPU_i \le 100\%$) is its utilization rate. Obviously, for any node $i$, $0 \le S^i_{CPU} \le 1$, and lower utilization and higher performance lead to a higher score.

Similarly,

$$S^i_{memory} = (1 - R\_memory_i) \cdot \frac{f\_memory_i}{f\_memory_{max}},$$

where $f\_memory_{max}$ represents the maximum memory capacity of one node in the cloud, $f\_memory_i$ is the memory capacity of node $i$, and $R\_memory_i$ ($0 \le R\_memory_i \le 100\%$) is its utilization rate. Obviously, for any node $i$, $0 \le S^i_{memory} \le 1$.

$$S^i_{network} = (1 - R\_network_i) \cdot \frac{f\_network_i}{f\_network_{max}},$$

where $f\_network_{max}$ represents the maximum bandwidth of the network cards in the cloud, $f\_network_i$ is the network card bandwidth of node $i$, and $R\_network_i$ ($0 \le R\_network_i \le 100\%$) is its utilization rate. Obviously, for any node $i$, $0 \le S^i_{network} \le 1$.

$$S^i_{disk} = (1 - R\_disk_i) \cdot \frac{f\_disk_i}{f\_disk_{max}},$$

where $f\_disk_{max}$ represents the maximum storage capacity of one node in the cloud, $f\_disk_i$ is the disk storage capacity of node $i$, and $R\_disk_i$ ($0 \le R\_disk_i \le 100\%$) is its utilization rate.

Step 4. Obtain a queue $Q[N]$ by sorting the scores $S^i_{total}$ of all $N$ nodes in the cloud cluster, where $Q[1] \ge Q[2] \ge Q[3] \ge \ldots \ge Q[N]$.

Step 5. Assuming that the new task requires $m$ Pods in Kubernetes, they must be assigned to the $m$ optimum nodes from $Q[N]$, where $m \ll N$. For a given task priority $P$, $C_P$ is defined as the corresponding consistency threshold. Then the $m$ nodes can be selected through the following substeps.

Substep 5.1. Set $i = 1$.

Substep 5.2. If $i > N - m + 1$, return empty and end the process; otherwise, choose the $m$ consecutive scores of $Q[N]$ starting from $Q[i]$, i.e., $Q[i], Q[i+1], Q[i+2], \ldots, Q[i+m-1]$.

Substep 5.3. Calculate the consistency of the chosen $m$ scores, which is defined as their statistical variance, i.e.,

$$\sigma = \frac{1}{m} \sum_{j=i}^{i+m-1} \left(Q[j] - \bar{Q}_i\right)^2, \quad \text{where} \quad \bar{Q}_i = \frac{1}{m} \sum_{j=i}^{i+m-1} Q[j].$$

Substep 5.4. Compare $\sigma$ with $C_P$. If $\sigma > C_P$, set $i = i + 1$ and repeat Substeps 5.2 to 5.4; otherwise, take the current value of $i$ and end the process.

As a result, we obtain the $m$ nodes, i.e., node $i$, node $i + 1$, ..., node $i + m - 1$.