WO2024055809A1 - Cloud computing virtual resource scheduling method based on a clustering integration algorithm
- Publication number
- WO2024055809A1 (application PCT/CN2023/113666)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- host
- clustering
- load
- algorithm
- clustering algorithm
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
- G06F9/5088—Techniques for rebalancing the load in a distributed system involving task migration
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- The present invention relates to the field of cloud computing technology, and in particular to a cloud computing virtual resource scheduling method based on a clustering integration algorithm.
- The computing resources of a cloud are usually clusters of computers located in different geographical locations. The computers may be heterogeneous, differing in bandwidth, CPU, storage, and so on.
- Virtual resource scheduling plays a very important role in cloud computing. User requests are first allocated to virtual machines, which are virtualized from physical hosts and do not interfere with each other, so the problem of scheduling physical cloud resources becomes a problem of scheduling virtual resources. Because physical hosts differ in hardware resources and processing capability, load imbalance arises easily during actual scheduling: computers with strong processing capability are always allocated too many requests and become overloaded, while computers with weak processing capability remain lightly loaded. Load imbalance leads to low utilization of computer resources.
- Current clustering-based cloud resource matching methods fall into two categories. The first uses a single clustering algorithm to cluster the tasks requested by users, for example using k-means to cluster tasks from different time periods, which gives a refined classification from the perspective of task cycles. The second uses a single clustering algorithm to cluster cloud computing resources, for example using fuzzy c-means to cluster the resources and checking the offset of the cluster centers; when the offset exceeds a threshold, the cloud computing resources are taken to have changed, and the resources are reacquired and clustered again. Both rely on a single clustering algorithm, which is unstable and easily affected by outliers, so the clustering results are inaccurate.
- In view of the problems in the prior art, the present invention provides a cloud computing virtual resource scheduling method based on a clustering integration algorithm. The clustering integration algorithm is used to cluster the attribute characteristics of the hosts in the cloud computing resources, which improves clustering accuracy and thereby improves the efficiency of cloud computing virtual resource scheduling, reduces the energy consumption of the hosts, and achieves the goal of energy conservation and emission reduction.
- A cloud computing virtual resource scheduling method based on a clustering integration algorithm, which specifically includes the following steps:
- Step S1: Obtain the attribute characteristics of the hosts in the cloud computing resources, normalize each type of attribute characteristic, form the normalized attribute characteristics of each host into a feature vector, and form the feature vectors into a matrix;
- Step S2: Cluster the matrix separately with each base clustering algorithm to obtain the base clustering results;
- Step S3: Use the voting-based integration function to integrate the attribute features that belong to the same cluster in the base clustering results, obtaining an integration matrix;
- Step S4: Cluster the integration matrix with any one of the base clustering algorithms to obtain the final clustering result;
- Step S5: In any cluster of the final clustering result, calculate the load of each host in the cluster and sort the hosts by load; migrate a virtual machine from the host with the largest load to the host with the smallest load in the cluster; after the migration, recalculate and re-sort the load of each host and calculate the difference between the most loaded host and the least loaded host; if the difference is within the user-preset threshold range, stop the migration; otherwise, again migrate a virtual machine from the host that now has the largest load to the host with the smallest load, until the difference is within the user-preset threshold range, and then stop the migration.
- the attribute characteristics of the host in the cloud computing resource include: storage capacity, occupied bandwidth, CPU and memory.
- x′ij is the normalized result of the i-th attribute feature on the j-th host
- xij is the i-th attribute feature on the j-th host
- The base clustering algorithms are the k-means, fuzzy C-means, Median K-flats, Gaussian mixture model, Subtract Clustering, Single-linkage Euclidean, Single-linkage cosine, Single-linkage hamming, Complete-linkage Euclidean, Complete-linkage cosine, Complete-linkage hamming, Ward-linkage Euclidean, Ward-linkage cosine, Ward-linkage hamming, Average-linkage Euclidean, Average-linkage cosine, Average-linkage hamming, Spectral using a sparse similarity matrix, Spectral using Nystrom method with orthogonalization, and Spectral using Nystrom method without orthogonalization clustering algorithms.
- In step S3, the base clustering results are integrated as follows:
- x′ij is the normalized result of the i-th attribute feature on the j-th host
- x′ab is the normalized result of the a-th attribute feature on the b-th host, and when j = b, i ≠ a
- Sm{x′ij, x′ab} is the integration result of x′ij and x′ab given by the voting-based integration function
- C(x′ij) = C(x′ab) means that the labels of x′ij and x′ab are the same and they belong to the same cluster
- C(x′ij) ≠ C(x′ab) means that the labels of x′ij and x′ab are different and they do not belong to the same cluster.
- In step S5, the load of each host is calculated as Lw = αCPU + βMem + λBw
- Lw is the load of the w-th host in the cluster
- α is the weight of the host CPU
- β is the weight of the host memory Mem
- λ is the weight of the host bandwidth Bw.
- In step S5, the difference between the host with the largest load and the host with the smallest load after migration is calculated using the following quantities:
- n is the number of hosts in the cluster after migration
- Lmax is the maximum host load after migration
- Lmin is the minimum host load after migration.
- The cloud computing virtual resource scheduling method based on the clustering integration algorithm of the present invention clusters the attribute characteristics of the hosts in the cloud computing resources through the clustering integration algorithm. Clustering integration learns from and integrates multiple clustering results of the original data set to obtain a data partition that better reflects the internal structure of the data set. This effectively avoids the low accuracy that results when a single clustering algorithm is affected by its cluster centers.
- Figure 1 is a flow chart of the cloud computing virtual resource scheduling method based on the clustering integration algorithm of the present invention
- Figure 2 is a flow chart of clustering by the clustering integration algorithm in the present invention.
- FIG. 1 is a flow chart of the cloud computing virtual resource scheduling method based on the clustering integration algorithm of the present invention.
- the cloud computing virtual resource scheduling method specifically includes the following steps:
- Step S1: Obtain the attribute characteristics of the hosts in the cloud computing resources, including storage capacity, occupied bandwidth, CPU and memory. Because the attribute characteristics differ greatly in magnitude, large values tend to swamp small ones during calculation, so each type of attribute characteristic is normalized; the normalized attribute characteristics of each host are formed into a feature vector, and the feature vectors are formed into a matrix;
- x′ ij is the normalized result of the i-th attribute feature on the j-th host
- x ij is the i-th attribute feature on the j-th host
- the present invention uses the clustering integration method to cluster cloud computing resources and obtain more accurate clustering results.
- the clustering process through the clustering integration algorithm in the present invention is specifically:
- Step S2 Use the base clustering algorithm to cluster the matrices respectively to obtain the base clustering results;
- The base clustering algorithms in the present invention are the k-means, fuzzy C-means, Median K-flats, Gaussian mixture model, Subtract Clustering, Single-linkage Euclidean, Single-linkage cosine, Single-linkage hamming, Complete-linkage Euclidean, Complete-linkage cosine, Complete-linkage hamming, Ward-linkage Euclidean, Ward-linkage cosine, Ward-linkage hamming, Average-linkage Euclidean, Average-linkage cosine, Average-linkage hamming, Spectral using a sparse similarity matrix, Spectral using Nystrom method with orthogonalization, and Spectral using Nystrom method without orthogonalization clustering algorithms. These 20 base algorithms are of different types, so clustering the matrix separately with each of them produces base clustering results that differ considerably, and clustering these diverse results improves the accuracy of the final clustering result.
- Step S3: Use the voting-based integration function to integrate the attribute features that belong to the same cluster in the base clustering results, obtaining the integration matrix.
- The voting method follows the majority rule, which effectively improves the accuracy of the final clustering result.
- x′ij is the normalized result of the i-th attribute feature on the j-th host
- x′ab is the normalized result of the a-th attribute feature on the b-th host, and when j = b, i ≠ a
- Sm{x′ij, x′ab} is the integration result of x′ij and x′ab given by the voting-based integration function
- C(x′ij) = C(x′ab) means that the labels of x′ij and x′ab are the same and they belong to the same cluster
- C(x′ij) ≠ C(x′ab) means that the labels of x′ij and x′ab are different and they do not belong to the same cluster.
- Step S4 Use any one of the base clustering algorithms to cluster the integrated matrix to obtain the final clustering result.
- Step S5: In any cluster of the final clustering result, calculate the load of each host in the cluster and sort the hosts by load; migrate a virtual machine from the host with the largest load to the host with the smallest load in the cluster; after the migration, recalculate and re-sort the load of each host and calculate the difference between the most loaded host and the least loaded host. If the difference is within the user-preset threshold range, stop the migration. Otherwise some host still carries more load than the others, the load is not yet balanced, and migration must continue: again migrate a virtual machine from the host that now has the largest load to the host with the smallest load, until the difference is within the user-preset threshold range. Migration then stops and the load is balanced.
- Lw is the load of the w-th host in the cluster
- α is the weight of the host CPU
- β is the weight of the host memory Mem
- λ is the weight of the host bandwidth Bw.
- the calculation process of the difference between the host with the largest load and the host with the smallest load after migration is:
- n is the number of migrated hosts in the cluster
- L max is the maximum load of the migrated host
- L min is the minimum load of the migrated host.
- The cloud computing virtual resource scheduling method based on the clustering integration algorithm of the present invention groups hosts with similar performance into a cluster. During virtual machine migration this narrows the range of candidate target hosts and shortens the time needed to find one, which improves the efficiency of resource scheduling, reduces the energy consumption of the hosts, and achieves the goal of energy conservation and emission reduction.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
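The present invention discloses a cloud computing virtual resource scheduling method based on a clustering integration algorithm, comprising: obtaining the attribute characteristics of the hosts in the cloud computing resources, normalizing each type of attribute characteristic, and forming them into a matrix; clustering the matrix separately with each base clustering algorithm; using a voting-based integration function to integrate the attribute features that belong to the same cluster in the base clustering results, obtaining an integration matrix; clustering the integration matrix with any one of the base clustering algorithms to obtain the final clustering result; and, in any cluster of the final clustering result, calculating the load of each host in the cluster, sorting the hosts by load, and migrating virtual machines from the host with the largest load to the host with the smallest load in the cluster, stopping once the difference between the most and least loaded hosts after migration is within a user-preset threshold range. The present invention improves the efficiency of cloud computing virtual resource scheduling, reduces host energy consumption, and achieves energy conservation and emission reduction.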
Description
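The present invention relates to the field of cloud computing technology, and in particular to a cloud computing virtual resource scheduling method based on a clustering integration algorithm.
The computing resources of a cloud are usually clusters of computers located in different geographical locations. The computers may be heterogeneous, differing in bandwidth, CPU, storage, and so on. Virtual resource scheduling plays a very important role in cloud computing. User requests are first allocated to virtual machines, which are virtualized from physical hosts and do not interfere with each other, so the problem of scheduling physical cloud resources becomes a problem of scheduling virtual resources. Because physical hosts differ in hardware resources and processing capability, load imbalance arises easily during actual scheduling: computers with strong processing capability are always allocated too many requests and become overloaded, while computers with weak processing capability remain lightly loaded. Load imbalance leads to low utilization of computer resources.
Current clustering-based cloud resource matching methods fall into two categories. The first uses a single clustering algorithm to cluster the tasks requested by users, for example using k-means to cluster tasks from different time periods, which gives a refined classification from the perspective of task cycles. The second uses a single clustering algorithm to cluster cloud computing resources, for example using fuzzy c-means to cluster the resources and checking the offset of the cluster centers; when the offset exceeds a threshold, the cloud computing resources are taken to have changed, and the resources are reacquired and clustered again. Both rely on a single clustering algorithm, which is unstable and easily affected by outliers, so the clustering results are inaccurate.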
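Summary of the Invention
In view of the problems in the prior art, the present invention provides a cloud computing virtual resource scheduling method based on a clustering integration algorithm. A clustering integration algorithm is used to cluster the attribute characteristics of the hosts in the cloud computing resources, which improves clustering accuracy and thereby improves the efficiency of cloud computing virtual resource scheduling, reduces host energy consumption, and achieves the goal of energy conservation and emission reduction.
To achieve the above object, the present invention adopts the following technical solution: a cloud computing virtual resource scheduling method based on a clustering integration algorithm, which specifically includes the following steps:
Step S1: Obtain the attribute characteristics of the hosts in the cloud computing resources, normalize each type of attribute characteristic, form the normalized attribute characteristics of each host into a feature vector, and form the feature vectors into a matrix;
Step S2: Cluster the matrix separately with each base clustering algorithm to obtain the base clustering results;
Step S3: Use the voting-based integration function to integrate the attribute features that belong to the same cluster in the base clustering results, obtaining an integration matrix;
Step S4: Cluster the integration matrix with any one of the base clustering algorithms to obtain the final clustering result;
Step S5: In any cluster of the final clustering result, calculate the load of each host in the cluster and sort the hosts by load; migrate a virtual machine from the host with the largest load to the host with the smallest load in the cluster; after the migration, recalculate and re-sort the load of each host and calculate the difference between the most loaded host and the least loaded host; if the difference is within the user-preset threshold range, stop the migration; otherwise, again migrate a virtual machine from the host that now has the largest load to the host with the smallest load, until the difference is within the user-preset threshold range, and then stop the migration.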
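Further, the attribute characteristics of the hosts in the cloud computing resources include: storage capacity, occupied bandwidth, CPU and memory.
Further, each type of attribute characteristic is normalized as follows:
$$x'_{ij} = \frac{x_{ij} - x_i^{\min}}{x_i^{\max} - x_i^{\min}}$$
where $x'_{ij}$ is the normalized result of the i-th attribute feature on the j-th host, $x_{ij}$ is the i-th attribute feature on the j-th host, $x_i^{\min}$ is the minimum value of the i-th attribute feature, and $x_i^{\max}$ is the maximum value of the i-th attribute feature.
Further, the base clustering algorithms are the k-means, fuzzy C-means, Median K-flats, Gaussian mixture model, Subtract Clustering, Single-linkage Euclidean, Single-linkage cosine, Single-linkage hamming, Complete-linkage Euclidean, Complete-linkage cosine, Complete-linkage hamming, Ward-linkage Euclidean, Ward-linkage cosine, Ward-linkage hamming, Average-linkage Euclidean, Average-linkage cosine, Average-linkage hamming, Spectral using a sparse similarity matrix, Spectral using Nystrom method with orthogonalization, and Spectral using Nystrom method without orthogonalization clustering algorithms.
Further, in step S3 the base clustering results are integrated as follows:
$$S_m\{x'_{ij}, x'_{ab}\} = \begin{cases} 1, & C(x'_{ij}) = C(x'_{ab}) \\ 0, & C(x'_{ij}) \neq C(x'_{ab}) \end{cases}$$
where $x'_{ij}$ is the normalized result of the i-th attribute feature on the j-th host, and $x'_{ab}$ is the normalized result of the a-th attribute feature on the b-th host, with $i \neq a$ when $j = b$; $S_m\{x'_{ij}, x'_{ab}\}$ is the integration result of $x'_{ij}$ and $x'_{ab}$ given by the voting-based integration function; $C(x'_{ij}) = C(x'_{ab})$ means that $x'_{ij}$ and $x'_{ab}$ have the same label and belong to the same cluster; $C(x'_{ij}) \neq C(x'_{ab})$ means that they have different labels and do not belong to the same cluster.
Further, in step S5 the load of each host is calculated as follows:
$$L_w = \alpha\,\mathrm{CPU} + \beta\,\mathrm{Mem} + \lambda\,\mathrm{Bw}$$
where $L_w$ is the load of the w-th host in the cluster, $\alpha$ is the weight of the host CPU, $\beta$ is the weight of the host memory Mem, and $\lambda$ is the weight of the host bandwidth Bw.
Further, in step S5 the difference between the host with the largest load and the host with the smallest load after migration is calculated from the following quantities:
n is the number of hosts in the cluster after migration, $L_{\max}$ is the maximum host load after migration, and $L_{\min}$ is the minimum host load after migration.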
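Compared with the prior art, the present invention has the following beneficial effects. The cloud computing virtual resource scheduling method based on a clustering integration algorithm clusters the attribute characteristics of the hosts in the cloud computing resources with a clustering integration algorithm. Clustering integration learns from and integrates multiple clustering results of the original data set to obtain a data partition that better reflects the internal structure of the data set. This effectively avoids the low accuracy that results when a single clustering algorithm is affected by its cluster centers, and it improves the accuracy and stability of the clustering result, so that hosts within the same cluster are as similar in performance as possible and hosts in different clusters differ as much as possible. Because hosts with similar performance are grouped into one cluster, virtual machine migration searches a smaller range of target hosts and finds one in less time, which improves the efficiency of resource scheduling, reduces host energy consumption, and achieves the goal of energy conservation and emission reduction. Within any cluster, the host loads are calculated and sorted, and virtual machines are migrated from the most loaded host to the least loaded host, which avoids the wasted resources, service interruptions and impact on users caused by virtual machines repeatedly migrating back.
Figure 1 is a flow chart of the cloud computing virtual resource scheduling method based on a clustering integration algorithm of the present invention;
Figure 2 is a flow chart of the clustering performed by the clustering integration algorithm in the present invention.
The technical solution of the present invention is described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.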
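Figure 1 is a flow chart of the cloud computing virtual resource scheduling method based on a clustering integration algorithm of the present invention. The method specifically includes the following steps:
Step S1: Obtain the attribute characteristics of the hosts in the cloud computing resources, including storage capacity, occupied bandwidth, CPU and memory. Because the attribute characteristics differ greatly in magnitude, large values tend to swamp small ones during calculation, so each type of attribute characteristic is normalized; the normalized attribute characteristics of each host are formed into a feature vector, and the feature vectors are formed into a matrix;
In the present invention, each type of attribute characteristic is normalized as follows:
$$x'_{ij} = \frac{x_{ij} - x_i^{\min}}{x_i^{\max} - x_i^{\min}}$$
where $x'_{ij}$ is the normalized result of the i-th attribute feature on the j-th host, $x_{ij}$ is the i-th attribute feature on the j-th host, $x_i^{\min}$ is the minimum value of the i-th attribute feature, and $x_i^{\max}$ is the maximum value of the i-th attribute feature.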
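As an illustration of step S1, the following minimal Python sketch applies min-max scaling to each attribute column of a small host table. The host values, the column order and the choice to map a constant column to 0.0 are assumptions made for the example; the source does not give them.

```python
from typing import List


def min_max_normalize(features: List[List[float]]) -> List[List[float]]:
    """Scale every attribute column to [0, 1]; rows are hosts, columns are attribute classes."""
    n_attrs = len(features[0])
    mins = [min(row[i] for row in features) for i in range(n_attrs)]
    maxs = [max(row[i] for row in features) for i in range(n_attrs)]
    normalized = []
    for row in features:
        scaled = []
        for i, value in enumerate(row):
            span = maxs[i] - mins[i]
            # Assumption: a constant column carries no information and maps to 0.0
            scaled.append((value - mins[i]) / span if span else 0.0)
        normalized.append(scaled)
    return normalized


# Hypothetical hosts: [storage GB, occupied bandwidth Mbps, CPU cores, memory GB]
hosts = [
    [2000.0, 100.0, 16.0, 64.0],
    [500.0, 1000.0, 8.0, 32.0],
    [1000.0, 550.0, 32.0, 128.0],
]
matrix = min_max_normalize(hosts)  # one normalized feature vector per host
```

After scaling, storage in gigabytes and CPU core counts both lie in [0, 1], so no single attribute dominates the distance computations used by the clustering steps.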
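Because cloud computing resources contain many types of hosts, which differ considerably in storage capacity, occupied bandwidth, CPU and memory, similar hosts are grouped into a cluster according to the principle of minimum difference, so that hosts within a cluster are as similar in performance as possible while different clusters differ considerably. Within a cluster, migrating virtual machines from the most loaded host to a lightly loaded host achieves load balancing. Existing methods use a single clustering algorithm to cluster the tasks requested by users; because a single clustering algorithm has poor stability and low accuracy, the resulting clusters are unstable. A clustering integration algorithm, by contrast, is robust and more accurate, so the present invention uses clustering integration to cluster the cloud computing resources and obtain a more accurate clustering result. As shown in Figure 2, the clustering performed by the clustering integration algorithm in the present invention is as follows:
Step S2: Cluster the matrix separately with each base clustering algorithm to obtain the base clustering results. The base clustering algorithms in the present invention are the k-means, fuzzy C-means, Median K-flats, Gaussian mixture model, Subtract Clustering, Single-linkage Euclidean, Single-linkage cosine, Single-linkage hamming, Complete-linkage Euclidean, Complete-linkage cosine, Complete-linkage hamming, Ward-linkage Euclidean, Ward-linkage cosine, Ward-linkage hamming, Average-linkage Euclidean, Average-linkage cosine, Average-linkage hamming, Spectral using a sparse similarity matrix, Spectral using Nystrom method with orthogonalization, and Spectral using Nystrom method without orthogonalization clustering algorithms. These 20 base algorithms are of different types, so clustering the matrix separately with each of them produces base clustering results that differ considerably, and clustering these diverse results improves the accuracy of the final clustering result.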
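To make step S2 concrete, the sketch below produces several base clusterings of the same matrix. It is only an illustration under stated assumptions: a plain k-means (Lloyd's algorithm) run with different cluster counts and random seeds stands in for the 20 base algorithms named in the source, and the function names, the `ks` and `seeds` defaults and the sample points are hypothetical.

```python
import random
from typing import List, Sequence


def kmeans(points: List[List[float]], k: int, seed: int, iters: int = 50) -> List[int]:
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # k distinct points as initial centers
    labels = [0] * len(points)
    for step in range(iters):
        # Assignment step: nearest center by squared Euclidean distance
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if step > 0 and new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def base_clusterings(points: List[List[float]],
                     ks: Sequence[int] = (2, 3),
                     seeds: Sequence[int] = (0, 1, 2)) -> List[List[int]]:
    """Stand-in for the base algorithms: one label vector per (k, seed) pair."""
    return [kmeans(points, k, s) for k in ks for s in seeds]


# Hypothetical normalized host features
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
base_results = base_clusterings(points)
```

Individual runs can settle in poor local optima depending on the initial centers, which is the kind of instability a later integration step is meant to absorb.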
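Step S3: Use the voting-based integration function to integrate the attribute features that belong to the same cluster in the base clustering results, obtaining the integration matrix. The voting method follows the majority rule, which effectively improves the accuracy of the final clustering result.
In the present invention, the base clustering results are integrated as follows:
$$S_m\{x'_{ij}, x'_{ab}\} = \begin{cases} 1, & C(x'_{ij}) = C(x'_{ab}) \\ 0, & C(x'_{ij}) \neq C(x'_{ab}) \end{cases}$$
where $x'_{ij}$ is the normalized result of the i-th attribute feature on the j-th host, and $x'_{ab}$ is the normalized result of the a-th attribute feature on the b-th host, with $i \neq a$ when $j = b$; $S_m\{x'_{ij}, x'_{ab}\}$ is the integration result of $x'_{ij}$ and $x'_{ab}$ given by the voting-based integration function; $C(x'_{ij}) = C(x'_{ab})$ means that $x'_{ij}$ and $x'_{ab}$ have the same label and belong to the same cluster; $C(x'_{ij}) \neq C(x'_{ab})$ means that they have different labels and do not belong to the same cluster.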
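One common way to realize a voting-based integration is a co-association matrix, sketched below in Python. Two points are assumptions rather than statements from the source: the indicator is applied to whole hosts instead of individual attribute features, and the votes of the base clusterings are averaged to fill the integration matrix. The label vectors are made up for the example.

```python
from typing import List


def same_cluster(labels: List[int], i: int, j: int) -> int:
    """Indicator for one base clustering: 1 if items i and j share a label, else 0."""
    return 1 if labels[i] == labels[j] else 0


def integrate(base_results: List[List[int]]) -> List[List[float]]:
    """Fraction of base clusterings that put each pair of items in the same cluster."""
    n = len(base_results[0])
    m = len(base_results)
    return [
        [sum(same_cluster(r, i, j) for r in base_results) / m for j in range(n)]
        for i in range(n)
    ]


# Hypothetical base clustering results for four hosts
base_results = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first, with the label names swapped
    [0, 0, 0, 1],
]
S = integrate(base_results)
```

Because only label equality is compared, the result does not depend on how each base algorithm happens to name its clusters. An entry above 0.5 means a majority of the base clusterings voted to keep the pair together.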
Step S4: Cluster the integration matrix with any one of the base clustering algorithms to obtain the final clustering result.
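A minimal Python sketch of step S4 follows, under the assumption that the integration matrix holds pairwise vote fractions in [0, 1]. It groups items by a single-linkage style cut: any two items whose vote fraction exceeds `min_votes` end up in the same cluster, which corresponds to cutting a single-linkage dendrogram on the distance 1 − S. The threshold 0.5 and the example matrix are hypothetical, and the source allows any of its base algorithms here.

```python
from typing import Dict, List


def final_clusters(S: List[List[float]], min_votes: float = 0.5) -> List[int]:
    """Join items linked by a vote fraction above min_votes (connected components)."""
    n = len(S)
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if S[i][j] > min_votes:
                parent[find(i)] = find(j)

    # Relabel the component roots as 0, 1, 2, ... in order of first appearance
    ids: Dict[int, int] = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Hypothetical integration matrix for four hosts
S = [
    [1.0, 1.0, 1 / 3, 0.0],
    [1.0, 1.0, 1 / 3, 0.0],
    [1 / 3, 1 / 3, 1.0, 2 / 3],
    [0.0, 0.0, 2 / 3, 1.0],
]
labels = final_clusters(S)  # hosts 0 and 1 form one cluster, hosts 2 and 3 another
```

The third host sided with the first two in only a minority of the votes, so the majority rule places it with the fourth host.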
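Step S5: In any cluster of the final clustering result, calculate the load of each host in the cluster and sort the hosts by load; migrate a virtual machine from the host with the largest load to the host with the smallest load in the cluster; after the migration, recalculate and re-sort the load of each host and calculate the difference between the most loaded host and the least loaded host. If the difference is within the user-preset threshold range, stop the migration. Otherwise some host still carries more load than the others, the load is not yet balanced, and migration must continue: again migrate a virtual machine from the host that now has the largest load to the host with the smallest load, until the difference is within the user-preset threshold range. Migration then stops and the load is balanced.
In the present invention, the load of each host is calculated as follows:
$$L_w = \alpha\,\mathrm{CPU} + \beta\,\mathrm{Mem} + \lambda\,\mathrm{Bw}$$
where $L_w$ is the load of the w-th host in the cluster, $\alpha$ is the weight of the host CPU, $\beta$ is the weight of the host memory Mem, and $\lambda$ is the weight of the host bandwidth Bw.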
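The weighted load and the sorting that step S5 relies on can be sketched in a few lines of Python. The `Host` class, the utilization values and the weights 0.5, 0.3 and 0.2 are assumptions for illustration; the source does not state weight values or units.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    cpu: float  # CPU utilization, assumed to lie in [0, 1]
    mem: float  # memory utilization, assumed to lie in [0, 1]
    bw: float   # bandwidth utilization, assumed to lie in [0, 1]


def host_load(h: Host, alpha: float = 0.5, beta: float = 0.3, lam: float = 0.2) -> float:
    """Weighted load L_w = alpha*CPU + beta*Mem + lambda*Bw (weights are hypothetical)."""
    return alpha * h.cpu + beta * h.mem + lam * h.bw


# Hypothetical hosts from one cluster
cluster: List[Host] = [
    Host("h1", 0.8, 0.5, 0.5),
    Host("h2", 0.2, 0.5, 0.5),
    Host("h3", 0.5, 0.5, 0.5),
]
ranked = sorted(cluster, key=host_load, reverse=True)  # most loaded host first
```

With these numbers `h1` comes first and `h2` last, so `h1` would be the migration source and `h2` the target.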
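In the present invention, the difference between the host with the largest load and the host with the smallest load after migration is calculated from the following quantities:
n is the number of hosts in the cluster after migration, $L_{\max}$ is the maximum host load after migration, and $L_{\min}$ is the minimum host load after migration.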
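The migration loop of step S5 can be sketched as follows. Several simplifications are assumptions, not the source's method: a host's load is taken as the sum of the loads of its virtual machines, the gap is the plain difference between the largest and smallest host load (the source's own difference formula, which also involves n, did not survive extraction), and the VM chosen for each move is the largest one that is smaller than the gap.

```python
from typing import Dict, List, Tuple


def rebalance(cluster: Dict[str, List[float]], threshold: float,
              max_rounds: int = 100) -> List[Tuple[str, str, float]]:
    """Move VMs from the most to the least loaded host until the gap is within threshold.

    cluster maps a host name to the loads of the VMs it runs and is modified in place.
    Returns the migrations performed as (source, target, vm_load) tuples.
    """
    moves: List[Tuple[str, str, float]] = []
    for _ in range(max_rounds):
        loads = {host: sum(vms) for host, vms in cluster.items()}
        ranked = sorted(loads, key=loads.get)  # host names, ascending by load
        low, high = ranked[0], ranked[-1]
        gap = loads[high] - loads[low]
        if gap <= threshold:
            break  # balanced enough: stop migrating
        # Assumption: only move a VM smaller than the gap, so every move strictly
        # lowers the sum of squared host loads and the loop cannot cycle.
        candidates = [vm for vm in cluster[high] if vm < gap]
        if not candidates:
            break  # no move can improve the balance
        vm = max(candidates)
        cluster[high].remove(vm)
        cluster[low].append(vm)
        moves.append((high, low, vm))
    return moves


# Hypothetical cluster: host name -> loads of its virtual machines
cluster = {"a": [4.0, 3.0, 2.0], "b": [1.0], "c": [3.0, 2.0]}
moves = rebalance(cluster, threshold=2.0)
```

In this example a single migration of the 4.0 VM from host `a` to host `b` brings all three hosts to a load of 5.0, so the loop stops on the next check.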
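Because hosts within the same cluster are similar in structure and performance, migration consumes few resources; and because migration goes from the most loaded host to the least loaded host, it avoids virtual machines repeatedly migrating back and reduces service interruptions that would affect users. The cloud computing virtual resource scheduling method based on a clustering integration algorithm of the present invention groups hosts with similar performance into a cluster, so that virtual machine migration searches a smaller range of target hosts and finds one in less time, which improves the efficiency of resource scheduling, reduces host energy consumption, and achieves the goal of energy conservation and emission reduction.
The above are only preferred embodiments of the present invention. The protection scope of the present invention is not limited to the above embodiments; all technical solutions under the concept of the present invention fall within its protection scope. It should be noted that, for a person of ordinary skill in the art, improvements and refinements made without departing from the principle of the present invention shall also be regarded as within the protection scope of the present invention.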
Claims (7)
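- A cloud computing virtual resource scheduling method based on a clustering integration algorithm, characterized in that it specifically includes the following steps: Step S1: obtain the attribute characteristics of the hosts in the cloud computing resources, normalize each type of attribute characteristic, form the normalized attribute characteristics of each host into a feature vector, and form the feature vectors into a matrix; Step S2: cluster the matrix separately with each base clustering algorithm to obtain the base clustering results; Step S3: use the voting-based integration function to integrate the attribute features that belong to the same cluster in the base clustering results, obtaining an integration matrix; Step S4: cluster the integration matrix with any one of the base clustering algorithms to obtain the final clustering result; Step S5: in any cluster of the final clustering result, calculate the load of each host in the cluster and sort the hosts by load; migrate a virtual machine from the host with the largest load to the host with the smallest load in the cluster; after the migration, recalculate and re-sort the load of each host and calculate the difference between the most loaded host and the least loaded host; if the difference is within the user-preset threshold range, stop the migration; otherwise, again migrate a virtual machine from the host that now has the largest load to the host with the smallest load, until the difference is within the user-preset threshold range, and then stop the migration.
- The cloud computing virtual resource scheduling method based on a clustering integration algorithm according to claim 1, characterized in that the attribute characteristics of the hosts in the cloud computing resources include: storage capacity, occupied bandwidth, CPU and memory.
- The cloud computing virtual resource scheduling method based on a clustering integration algorithm according to claim 1, characterized in that each type of attribute characteristic is normalized as follows:
$$x'_{ij} = \frac{x_{ij} - x_i^{\min}}{x_i^{\max} - x_i^{\min}}$$
where $x'_{ij}$ is the normalized result of the i-th attribute feature on the j-th host, $x_{ij}$ is the i-th attribute feature on the j-th host, $x_i^{\min}$ is the minimum value of the i-th attribute feature, and $x_i^{\max}$ is the maximum value of the i-th attribute feature.
- The cloud computing virtual resource scheduling method based on a clustering integration algorithm according to claim 1, characterized in that the base clustering algorithms are the k-means, fuzzy C-means, Median K-flats, Gaussian mixture model, Subtract Clustering, Single-linkage Euclidean, Single-linkage cosine, Single-linkage hamming, Complete-linkage Euclidean, Complete-linkage cosine, Complete-linkage hamming, Ward-linkage Euclidean, Ward-linkage cosine, Ward-linkage hamming, Average-linkage Euclidean, Average-linkage cosine, Average-linkage hamming, Spectral using a sparse similarity matrix, Spectral using Nystrom method with orthogonalization, and Spectral using Nystrom method without orthogonalization clustering algorithms.
- The cloud computing virtual resource scheduling method based on a clustering integration algorithm according to claim 1, characterized in that in step S3 the base clustering results are integrated as follows:
$$S_m\{x'_{ij}, x'_{ab}\} = \begin{cases} 1, & C(x'_{ij}) = C(x'_{ab}) \\ 0, & C(x'_{ij}) \neq C(x'_{ab}) \end{cases}$$
where $x'_{ij}$ is the normalized result of the i-th attribute feature on the j-th host, and $x'_{ab}$ is the normalized result of the a-th attribute feature on the b-th host, with $i \neq a$ when $j = b$; $S_m\{x'_{ij}, x'_{ab}\}$ is the integration result of $x'_{ij}$ and $x'_{ab}$ given by the voting-based integration function; $C(x'_{ij}) = C(x'_{ab})$ means that $x'_{ij}$ and $x'_{ab}$ have the same label and belong to the same cluster; $C(x'_{ij}) \neq C(x'_{ab})$ means that they have different labels and do not belong to the same cluster.
- The cloud computing virtual resource scheduling method based on a clustering integration algorithm according to claim 1, characterized in that in step S5 the load of each host is calculated as follows:
$$L_w = \alpha\,\mathrm{CPU} + \beta\,\mathrm{Mem} + \lambda\,\mathrm{Bw}$$
where $L_w$ is the load of the w-th host in the cluster, $\alpha$ is the weight of the host CPU, $\beta$ is the weight of the host memory Mem, and $\lambda$ is the weight of the host bandwidth Bw.
- The cloud computing virtual resource scheduling method based on a clustering integration algorithm according to claim 1, characterized in that in step S5 the difference between the host with the largest load and the host with the smallest load after migration is calculated from the following quantities:
n is the number of hosts in the cluster after migration, $L_{\max}$ is the maximum host load after migration, and $L_{\min}$ is the minimum host load after migration.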
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211120488.2 | 2022-09-15 | ||
CN202211120488.2A CN115543609B (zh) | 2022-09-15 | 2022-09-15 | Cloud computing virtual resource scheduling method based on a clustering integration algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024055809A1 (zh) | 2024-03-21 |
Family
ID=84728589
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2023/113666 WO2024055809A1 (zh) | 2022-09-15 | 2023-08-18 | Cloud computing virtual resource scheduling method based on a clustering integration algorithm |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115543609B (zh) |
WO (1) | WO2024055809A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115543609B (zh) * | 2022-09-15 | 2023-11-21 | 中电信数智科技有限公司 | Cloud computing virtual resource scheduling method based on a clustering integration algorithm |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102724277A (zh) * | 2012-05-04 | 2012-10-10 | 华为技术有限公司 | Method, server and cluster system for live migration and deployment of virtual machines |
CN104111867A (zh) * | 2013-04-19 | 2014-10-22 | 杭州迪普科技有限公司 | Virtual machine migration device and method |
US20200099597A1 (en) * | 2018-08-20 | 2020-03-26 | Arbor Networks. Inc. | Scalable unsupervised host clustering based on network metadata |
CN115543609A (zh) * | 2022-09-15 | 2022-12-30 | 中电信数智科技有限公司 | Cloud computing virtual resource scheduling method based on a clustering integration algorithm |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
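JP2007102332A (ja) * | 2005-09-30 | 2007-04-19 | Toshiba Corp | Load distribution system and load distribution method |
CN104156463A (zh) * | 2014-08-21 | 2014-11-19 | 南京信息工程大学 | MapReduce-based big data clustering integration method |
CN106686039B (zh) * | 2015-11-10 | 2020-07-21 | 华为技术有限公司 | Resource scheduling method and device in a cloud computing system |
CN106897116A (zh) * | 2017-02-27 | 2017-06-27 | 郑州云海信息技术有限公司 | Virtual machine migration method and device |
CN110275759A (zh) * | 2019-06-21 | 2019-09-24 | 长沙学院 | Dynamic deployment method for virtual machine clusters |
CN113886674B (zh) * | 2020-07-01 | 2024-09-06 | 北京达佳互联信息技术有限公司 | Resource recommendation method and apparatus, electronic device and storage medium |
CN112232383A (zh) * | 2020-09-27 | 2021-01-15 | 江南大学 | Ensemble clustering method based on super-cluster weighting |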
- 2022-09-15: CN application CN202211120488.2A, patent CN115543609B (zh), status: Active
- 2023-08-18: WO application PCT/CN2023/113666, patent WO2024055809A1 (zh), status: unknown
Non-Patent Citations (3)
Title |
---|
Dong, Shilong. "Research on Optimization of Cloud Job Scheduling Strategy Based on Fuzzy Clustering." Master's thesis, Guangxi University, CN, 15 February 2015, pp. 1-89, XP009553040 * |
Hu, Meng. "Research on Task Scheduling in The Cloud Computing." Master's thesis, Hebei Agricultural University, CN, 15 February 2016, pp. 1-56, XP009553042 * |
Zhang, Jingjing. "Research on Soft Voting Clustering Ensemble And Its Parallel Implementation." Master's thesis, Southwest Jiaotong University, CN, 15 January 2017, pp. 1-72, XP009553041 * |
Also Published As
Publication number | Publication date |
---|---|
CN115543609A (zh) | 2022-12-30 |
CN115543609B (zh) | 2023-11-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
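WO2024055809A1 (zh) | Cloud computing virtual resource scheduling method based on a clustering integration algorithm | |
US9405806B2 (en) | Systems and methods of modeling object networks | |
CN108196935B (zh) | Energy-saving virtual machine migration method for cloud computing | |
US10445344B2 (en) | Load balancing for large in-memory databases | |
CN112835698A (zh) | Dynamic load balancing method with request classification for heterogeneous clusters | |
US10031962B2 (en) | Method and system for partitioning database | |
CN108549904A (zh) | Differential-privacy-preserving K-means clustering method based on the silhouette coefficient | |
CN115718644A (zh) | Cross-region migration method and system for computing tasks in cloud data centers | |
WO2015051685A1 (zh) | Task scheduling method, device and system | |
CN110308973A (zh) | Dynamic container migration method based on energy consumption optimization | |
CN109815987A (zh) | Crowd classification method and classification system | |
CN109976879B (zh) | Cloud computing virtual machine placement method based on complementary resource usage curves | |
CN110888713A (zh) | Trusted virtual machine migration algorithm for heterogeneous cloud data centers | |
US10180712B2 (en) | Apparatus and method for limiting power in symmetric multiprocessing system | |
CN111813512B (zh) | Energy-efficient Spark task scheduling method based on dynamic partitioning | |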
Ganegedara et al. | Redundancy reduction in self-organising map merging for scalable data clustering | |
Wang et al. | S-CDA: A smart cloud disk allocation approach in cloud block storage system | |
Tang et al. | A classification-based virtual machine placement algorithm in mobile cloud computing | |
Zhang et al. | Parallel Clustering Optimization Algorithm Based on MapReduce in Big Data Mining. | |
US20200175011A1 (en) | Scalable implementations of exact distinct counts and multiple exact distinct counts in distributed query processing systems | |
US11256686B2 (en) | Scalable implementations of multi-dimensional aggregations with input blending in distributed query processing systems | |
Khan et al. | Scalable diversification of multiple search results | |
Kasemtaweechok et al. | Adaptive geometric median prototype selection method for k-nearest neighbors classification | |
US11263202B2 (en) | Scalable implementations of exact distinct counts and multiple exact distinct counts in distributed query processing systems | |
CN109947530B (zh) | Multi-dimensional virtual machine mapping method for cloud platforms |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23864542 Country of ref document: EP Kind code of ref document: A1 |