WO2021051441A1 - Hadoop cluster energy-saving system - Google Patents

Hadoop cluster energy-saving system Download PDF

Info

Publication number
WO2021051441A1
WO2021051441A1 (PCT application PCT/CN2019/108323)
Authority
WO
WIPO (PCT)
Prior art keywords
node
cluster
data
module
energy consumption
Prior art date
Application number
PCT/CN2019/108323
Other languages
English (en)
French (fr)
Inventor
Ni Lina (倪丽娜)
Zhang Jinquan (张金泉)
Liu Haoran (刘浩然)
Han Qingliang (韩庆亮)
Original Assignee
Shandong University of Science and Technology (山东科技大学)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University of Science and Technology
Publication of WO2021051441A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234Power saving characterised by the action undertaken
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4893Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues taking into account power or heat criteria
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the invention belongs to the field of information technology processing, and specifically relates to a Hadoop cluster energy-saving system.
  • Hadoop clusters are deployed in many fields, such as web search, data mining, and recommendation advertising.
  • the current design of Hadoop task scheduling and data-block storage mainly considers issues such as cluster performance and data security. As a result, the load-balancing strategy of a Hadoop cluster keeps all of its nodes running at all times without considering energy consumption.
  • Some clusters reach several hundred machines, so Hadoop clusters are, to a certain extent, one of the main contributors to data-center energy consumption. Studying energy-saving strategies for Hadoop clusters in job scheduling and storage is of great significance for reducing the power usage effectiveness (PUE) of data centers, and will also play a positive role in the further development of the Hadoop open-source project.
  • the present invention proposes a Hadoop cluster energy-saving system, which has a reasonable design, overcomes the shortcomings of the prior art, and has good effects.
  • a Hadoop cluster energy-saving system including a data collection module at the bottom layer, a load prediction module and an energy consumption model module at the middle layer, and a job scheduling module at the upper layer;
  • the data collection module is configured to obtain cluster node data
  • the cluster node data includes: (1) the resource utilization of the node; (2) the system resource occupied by the tasks run by the cluster node;
  • the data collection module monitors cluster performance indicators by means of Zabbix's agent-probe technology, and operates in a server/proxy/agent fashion;
  • the data collection module covers multiple clusters; each cluster contains two hosts, and each host corresponds to n cluster nodes; a server is installed on each host and an agent on each cluster node; at regular intervals the server sends the agents a request to collect the indicator data of the monitored items, the agents return the requested data, and the server writes the data into the corresponding database, completing data collection and analysis;
  • the data collection module uses proxy to share the analysis and collection of cluster data to ensure the stability of the underlying system
  • the load prediction module and energy consumption model module are configured to monitor cluster performance and, using the cluster node data collected by the underlying data collection module, train the constructed LSTM network model that predicts node load, providing support for upper-level task scheduling;
  • the load prediction module and energy consumption model module obtain real-time data of cluster node monitoring indicators by analyzing the CPU utilization and memory allocation collected on the server side, and realize the monitoring of each cluster node through the set threshold;
  • Specifically: (1) performance-indicator visualization; a visualization window dynamically displays real-time data collected on the server side, including CPU utilization, memory allocation, the tasks running on a node and the resources allocated to them; (2) monitoring-log collection; the CPU utilization, memory allocation, and per-task resource occupancy of each node, as collected on the server side, are written into the cluster log library; (3) monitoring-frequency control; used to set how often the server side collects data, i.e. the collection interval;
  • the LSTM network model constructed to predict the node load is trained to provide support for the upper-level task scheduling; the specific implementation method is as follows:
  • the job scheduling module, which includes the job scheduler, is configured to schedule tasks according to the job information to be processed for the user, using the node load predicted by the trained LSTM model;
  • the job scheduling module uses a scheduling algorithm based on host state prediction.
  • the algorithm needs to obtain the job information input by the user in advance.
  • the job information indicates whether a job is CPU-intensive or memory-intensive; a node in the cluster that can meet the energy-consumption requirement is then selected to process it.
  • the job scheduler assigns nodes to the job to complete it based on the user's job information and predicted node load;
  • the specific implementation method of the job scheduling module function is as follows:
  • (1) Task-oscillation migration control; (2) threshold-trigger mechanism: thresholds for putting nodes to sleep or activating them are set to support task scheduling; (3) checking whether the minimum requirement of the user's computation is met: the scheduler assigns a task to an active node and then checks the node's resources against the task's requirements; if they are not met, a dormant node is activated, and finally the node's CPU utilization and memory utilization are recorded; (4) node sleep-queue suggestions: according to CPU and memory utilization, nodes are selected for sleep and added to the node sleep-suggestion queue.
  • the resource usage of the cluster host in a given time period includes the trend of CPU utilization, the trend of memory usage, and the load situation of the node in the future time period, and the prediction result will provide a reference decision for the top-level scheduling.
  • the modules obtain data and call functions through the API interface.
  • in the data collection module, a newly added Hadoop node can be connected to the system seamlessly once the agent probe is installed.
  • the node is discovered automatically, and its CPU, memory and other indicator data are collected automatically for use by the model-training layer; at the same time,
  • the computing resources of the node are also placed into the resource pool; if a node fails and cannot work normally, the states of the other nodes are unaffected, reducing the impact of the failure.
  • the middle-layer load prediction module divides the original data into several intervals. In each interval, actual data is used for prediction first; predicted values are then continuously folded into the known data as history and used to predict the next value, so the overall behavior is a rolling forward forecast. Because the actual data set is used again as input when the next interval is reached, which amounts to a data correction, the predicted macroscopic trend remains correct.
  • Figure 1 is a diagram of the overall system architecture.
  • the middle-layer model-training module divides the original data into several intervals. In each interval, actual data is used for prediction first; predicted values are then continuously folded into the known data as history and used to predict the next value, so the overall behavior is a rolling forward forecast. Because the actual data set is used again as input when the next interval is reached, which amounts to a data correction, the predicted macroscopic trend remains correct, although some details are lost.
  • this application models energy consumption on the two indicators CPU and memory, while the other components of system energy consumption, such as disks, network inflow/outflow, and other conventional system indicators, are treated as a base constant.
  • C_0 is a constant, representing the baseline power unrelated to CPU utilization and memory utilization;
  • C_α is the coefficient of the impact of CPU utilization on energy consumption;
  • C_β is the coefficient of the impact of memory utilization on energy consumption;
  • C_0, C_α and C_β are linear-regression coefficient values obtained through extensive model training; different servers yield different coefficient values.
  • the energy consumption, denoted E, is calculated by integrating the node power over time, as shown in formula (3):
  • the tools used in this application to control host resource utilization are server stress-testing tools: the CPU benchmark COREMark and the memory benchmark HPCC. The specific values are shown in the following table:
  • the power calculation formula can be expressed as:
  • This application first presents the design of the energy-saving system, which mainly comprises resource data collection at the bottom layer, load prediction and an energy-consumption calculation model at the middle layer, and job scheduling at the upper layer, and describes in detail the key technologies and strategies used at each layer.
  • An energy-consumption calculation model is then established on the basis of CPU and memory utilization, and the coefficient values of C_α, C_β and C_0 in the model are calculated with benchmarks for the specific experimental environment of this application.
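The fitted power model above can be sketched in a few lines. This is a minimal illustration of formulas (1) and (2), not code from the patent; the coefficient values (C_0 = 102.16, C_α = 16.24, C_β = 7.46) are the ones fitted for the application's IBM x336 test environment and would have to be re-fitted on other hardware.

```python
# Sketch of the node power model, formula (1): P = C0 + Ca*U_cpu + Cb*U_mem.
# Coefficients are the patent's fitted values for its IBM x336 environment.
C0, CA, CB = 102.16, 16.24, 7.46  # baseline, CPU and memory coefficients (watts)

def node_power(u_cpu: float, u_mem: float) -> float:
    """Estimated power draw (W) of one node; utilizations are in [0, 1]."""
    assert 0.0 <= u_cpu <= 1.0 and 0.0 <= u_mem <= 1.0
    return C0 + CA * u_cpu + CB * u_mem

def cluster_power(utils: list[tuple[float, float]]) -> float:
    """Formula (2): total power of n nodes is the sum of per-node powers."""
    return sum(node_power(u_cpu, u_mem) for u_cpu, u_mem in utils)

# An idle 3-node cluster draws roughly 3 * C0 watts (≈ 306.5 W):
print(cluster_power([(0.0, 0.0)] * 3))
```

Because the model is linear, summing per-node powers is equivalent to the expanded n·C_0 + C_α·ΣU_cpu + C_β·ΣU_mem form used later in the description.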

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Disclosed is a Hadoop cluster energy-saving system, belonging to the field of information technology processing. The system mainly comprises resource data collection at the bottom layer, load prediction and an energy-consumption calculation model at the middle layer, and job scheduling at the upper layer; the key technologies and strategies used at each layer are described in detail. An energy-consumption calculation model is then established on the basis of CPU and memory utilization, and the coefficient values of C_α, C_β and C_0 in the model are calculated with benchmarks for the specific experimental environment of the present application.

Description

Hadoop cluster energy-saving system — Technical field
The present invention belongs to the field of information technology processing, and specifically relates to a Hadoop cluster energy-saving system.
Background
Data centers are receiving ever more attention, and the focus has shifted from the initial pursuit of data-center scale and quantity to today's advocacy of green data centers. As enterprises continue to rely on data centers, more workloads are migrating from on-premises infrastructure to cloud platforms. Current data centers nevertheless face many challenges. First, resources are designed and deployed for peak demand, while business and computing tasks are generally staged, so most servers remain powered on even at off-peak times. Second, the number of data centers keeps growing; in a typical data center, servers consume about 70% of the electricity while communication equipment, storage, air conditioning and other facilities consume only 30%, and IT energy consumption rises year by year. Hadoop clusters account for a high proportion of deployments in data centers of all sizes at home and abroad, and large numbers of Hadoop clusters are configured in fields such as web search, data mining and recommendation advertising. However, the current design of Hadoop task scheduling and data-block storage mainly considers cluster performance and data security. As a result, the load-balancing strategy of a Hadoop cluster keeps all of its nodes running at all times without considering energy consumption. Some clusters reach several hundred machines, so Hadoop clusters are, to a certain extent, one of the main contributors to data-center energy consumption. Studying energy-saving strategies for Hadoop clusters in job scheduling and storage is of great significance for reducing the power usage effectiveness (PUE) of data centers, and will also play a positive role in the further development of the Hadoop open-source project.
To provide an accurate basis for resource-allocation decisions, the state changes of the nodes must be monitored in real time, and an energy-optimal job-scheduling queue must be obtained on the basis of the job information submitted by users.
Summary of the invention
In view of the above technical problems in the prior art, the present invention proposes a Hadoop cluster energy-saving system that is reasonably designed, overcomes the shortcomings of the prior art, and achieves good results.
To achieve the above objective, the present invention adopts the following technical solution:
A Hadoop cluster energy-saving system comprises a data collection module at the bottom layer, a load prediction module and an energy-consumption model module at the middle layer, and a job scheduling module at the upper layer.
The data collection module is configured to obtain cluster node data.
The cluster node data includes: (1) the resource utilization of each node; (2) the system resources occupied by the tasks running on each cluster node.
The data collection module monitors cluster performance indicators by means of Zabbix's agent-probe technology, and operates in a server/proxy/agent fashion.
The data collection module covers multiple clusters; each cluster contains two hosts, and each host corresponds to n cluster nodes. A server is installed on each host and an agent on each cluster node. At regular intervals the server sends the agents a request to collect the indicator data of the monitored items; the agents return the requested data, and the server writes the data into the corresponding database, completing data collection and analysis.
When the Hadoop cluster becomes too large, the pressure on the server side increases; the data collection module therefore uses proxies to share the work of analyzing and collecting cluster data, ensuring the stability of the bottom layer.
The load prediction module and energy-consumption model module are configured to monitor cluster performance and, using the cluster node data collected by the bottom-layer data collection module, train the constructed LSTM network model for predicting node load, providing support for upper-layer task scheduling.
Cluster performance monitoring is implemented as follows:
The load prediction module and energy-consumption model module analyze the CPU utilization and memory allocation collected on the server side to obtain real-time data of the cluster-node monitoring indicators, and monitor each cluster node against the configured thresholds.
Specifically: (1) performance-indicator visualization: a visualization window dynamically displays real-time data collected on the server side, including CPU utilization, memory allocation, the tasks running on a node and the resources allocated to them; (2) monitoring-log collection: the CPU utilization, memory allocation and per-task resource occupancy of each node, as collected on the server side, are written into the cluster log library; (3) monitoring-frequency control: sets how often the server side collects data, i.e. the collection interval.
Using the cluster node data collected by the bottom-layer data collection module, the constructed LSTM network model for predicting node load is trained to support upper-layer task scheduling, as follows:
(1) Predict the trend of a host's key indicators within a set period.
First, the LSTM network model for predicting node load is built and trained on the bottom-layer data, with its parameters adjusted iteratively to obtain a trained model. The trained model is then used to predict the resource usage of the cluster hosts over a given time period, yielding the task-processing characteristics of each node; suitable tasks are assigned according to these characteristics, producing a list of tasks executable within a certain period. Finally, the data is analyzed using the sequence length found experimentally to work best.
(2) Calculate the cluster energy-consumption value.
First, an energy-consumption calculation model is established; then the index coefficients of the model are determined through actual tests on the Hadoop cluster; finally, the cluster energy consumption is calculated during actual task scheduling.
The job scheduling module, which includes the job scheduler, is configured to schedule tasks according to the job information to be processed for the user, using the node load predicted by the trained LSTM model.
The job scheduling module uses a scheduling algorithm based on host-state prediction. The algorithm obtains the job information entered by the user in advance; this information indicates whether a job is CPU-intensive or memory-intensive. A node in the cluster that can meet the energy-consumption requirement is then selected to process the job: the job scheduler assigns to each job the node that will complete it, based on the user's job information and the predicted node load.
The functions of the job scheduling module are implemented as follows:
(1) Task-oscillation migration control. (2) Threshold-trigger mechanism: thresholds for putting nodes to sleep or activating them are set to support task scheduling. (3) Check whether the minimum requirement of the user's computation is met: the scheduler assigns a task to an active node and then checks the node's resources against the task's requirements; if they are not met, a dormant node is activated; finally, the node's CPU utilization and memory utilization are recorded. (4) Node sleep-queue suggestions: according to CPU and memory utilization, nodes are selected for sleep and added to the node sleep-suggestion queue.
Preferably, the resource usage of the cluster hosts over a given time period includes the trend of CPU utilization, the trend of memory utilization, and the load of the nodes in the coming period; the prediction results provide reference decisions for the top-layer scheduling.
The beneficial technical effects brought by the present invention are:
(1) Low coupling between modules.
The modules obtain data and invoke functions through API interfaces. In the data collection module, a newly added Hadoop node can be connected to the system seamlessly once the agent probe is installed: the node is discovered automatically, and its CPU, memory and other indicator data are collected automatically for use by the model-training layer; at the same time, the node's computing resources are placed into the resource pool. If a node fails and cannot work normally, the states of the other nodes are unaffected, reducing the impact of the failure.
(2) High accuracy of host-state prediction.
The middle-layer load prediction module divides the original data into several intervals. In each interval, actual data is used for prediction first; predicted values are then continuously folded into the known data as history and used to predict the next value, so the overall behavior is a rolling forward forecast. Because the actual data set is used again as input when the next interval is reached, which amounts to a data correction, the predicted macroscopic trend remains correct.
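The rolling-forward forecast described above can be sketched as follows. The patent uses a trained LSTM as the predictor; a simple moving average stands in for it here, since only the interval-splitting and data-correction mechanism is being illustrated.

```python
# Rolling-forward forecast sketch. A moving average is an illustrative
# stand-in for the patent's trained LSTM predictor.
def predict_next(history, window=3):
    """Stand-in predictor: mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def rolling_forecast(actual, interval=4, window=3):
    """Split the series into intervals. Inside an interval, predictions are
    fed back as 'known' history; at each interval boundary the real data is
    used again as input, which amounts to a correction step."""
    preds = []
    for start in range(window, len(actual), interval):
        history = list(actual[:start])   # correction: resync with actual data
        for step in range(interval):
            if start + step >= len(actual):
                break
            p = predict_next(history, window)
            preds.append(p)
            history.append(p)            # prediction folded in as history
    return preds

load = [0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7, 0.6, 0.5, 0.4, 0.4]
print(rolling_forecast(load))
```

Because the history is resynchronized with real measurements at every interval boundary, errors cannot compound indefinitely, which is why the macroscopic trend stays correct even though in-interval detail is smoothed away.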
Brief description of the drawings
Figure 1 is a diagram of the overall system architecture.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments:
1. System architecture design
As shown in Figure 1, a Hadoop cluster energy-saving system comprises a data collection module at the bottom layer, a load prediction module and an energy-consumption model module at the middle layer, and a job scheduling module at the upper layer.
The data collection module is configured to obtain cluster node data.
The cluster node data includes: (1) the resource utilization of each node; (2) the system resources occupied by the tasks running on each cluster node.
The data collection module monitors cluster performance indicators by means of Zabbix's agent-probe technology, and operates in a server/proxy/agent fashion.
The data collection module covers multiple clusters; each cluster contains two hosts, and each host corresponds to n cluster nodes. A server is installed on each host and an agent on each cluster node. At regular intervals the server sends the agents a request to collect the indicator data of the monitored items; the agents return the requested data, and the server writes the data into the corresponding database, completing data collection and analysis.
When the Hadoop cluster becomes too large, the pressure on the server side increases; the data collection module therefore uses proxies to share the work of analyzing and collecting cluster data, ensuring the stability of the bottom layer.
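The server/agent collection cycle described above can be sketched generically. The `Agent` and `Server` classes below are hypothetical stand-ins, not Zabbix's actual API; in the real system, Zabbix agents expose monitored items over Zabbix's own protocol and the server persists the samples to a database rather than a list.

```python
# Generic sketch of the server -> agent polling cycle. Hypothetical classes,
# not Zabbix's API; shown only to illustrate the request/response/store flow.
import random
import time

class Agent:
    """Pretend agent: returns current metric values for its node."""
    def __init__(self, node):
        self.node = node
    def collect(self):
        return {"node": self.node,
                "cpu_util": random.random(),   # placeholder measurements
                "mem_util": random.random()}

class Server:
    """Polls every agent and appends the samples to a 'database' (a list)."""
    def __init__(self, agents, db):
        self.agents, self.db = agents, db
    def poll_once(self):
        for agent in self.agents:
            sample = agent.collect()    # request -> agent returns data
            sample["ts"] = time.time()  # server timestamps and stores it
            self.db.append(sample)

db = []
server = Server([Agent(f"node{i}") for i in range(4)], db)
server.poll_once()   # one collection round; a scheduler would repeat this
print(len(db))       # one sample per monitored node
```

The monitoring-frequency control described earlier corresponds to how often `poll_once` is invoked; a proxy tier would simply interpose more `Server`-like aggregators between the agents and the database.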
The load prediction module and energy-consumption model module are configured to monitor cluster performance and, using the cluster node data collected by the bottom-layer data collection module, train the constructed LSTM network model for predicting node load, providing support for upper-layer task scheduling.
Cluster performance monitoring is implemented as follows:
The load prediction module and energy-consumption model module analyze the CPU utilization and memory allocation collected on the server side to obtain real-time data of the cluster-node monitoring indicators, and monitor each cluster node against the configured thresholds.
Specifically: (1) performance-indicator visualization: a visualization window dynamically displays real-time data collected on the server side, including CPU utilization, memory allocation, the tasks running on a node and the resources allocated to them; (2) monitoring-log collection: the CPU utilization, memory allocation and per-task resource occupancy of each node, as collected on the server side, are written into the cluster log library; (3) monitoring-frequency control: sets how often the server side collects data, i.e. the collection interval.
Using the cluster node data collected by the bottom-layer data collection module, the constructed LSTM network model for predicting node load is trained to support upper-layer task scheduling, as follows:
(1) Predict the trend of a host's key indicators within a set period.
First, the LSTM network model for predicting node load is built and trained on the bottom-layer data, with its parameters adjusted iteratively to obtain a trained model. The trained model is then used to predict the resource usage of the cluster hosts over a given time period, yielding the task-processing characteristics of each node; suitable tasks are assigned according to these characteristics, producing a list of tasks executable within a certain period. Finally, the data is analyzed using the sequence length found experimentally to work best.
(2) Calculate the cluster energy-consumption value.
First, an energy-consumption calculation model is established; then the index coefficients of the model are determined through actual tests on the Hadoop cluster; finally, the cluster energy consumption is calculated during actual task scheduling.
The job scheduling module, which includes the job scheduler, is configured to schedule tasks according to the job information to be processed for the user, using the node load predicted by the trained LSTM model.
The job scheduling module uses a scheduling algorithm based on host-state prediction. The algorithm obtains the job information entered by the user in advance; this information indicates whether a job is CPU-intensive or memory-intensive. A node in the cluster that can meet the energy-consumption requirement is then selected to process the job: the job scheduler assigns to each job the node that will complete it, based on the user's job information and the predicted node load.
The functions of the job scheduling module are implemented as follows:
(1) Task-oscillation migration control. (2) Threshold-trigger mechanism: thresholds for putting nodes to sleep or activating them are set to support task scheduling. (3) Check whether the minimum requirement of the user's computation is met: the scheduler assigns a task to an active node and then checks the node's resources against the task's requirements; if they are not met, a dormant node is activated; finally, the node's CPU utilization and memory utilization are recorded. (4) Node sleep-queue suggestions: according to CPU and memory utilization, nodes are selected for sleep and added to the node sleep-suggestion queue.
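Steps (2)–(4) above can be sketched as follows. The threshold values and the `Node` structure are illustrative assumptions, not values given in the patent.

```python
# Sketch of the threshold-trigger scheduling steps (2)-(4). The thresholds
# and Node fields are illustrative assumptions, not the patent's values.
from dataclasses import dataclass

SLEEP_CPU, SLEEP_MEM = 0.10, 0.15   # below both -> candidate for sleeping

@dataclass
class Node:
    name: str
    cpu_util: float
    mem_util: float
    active: bool = True
    def free_cpu(self):
        return 1.0 - self.cpu_util

def schedule(nodes, task_cpu_need):
    """Step (3): place a task on an active node that satisfies its minimum
    requirement; if none fits, activate a dormant node."""
    for n in nodes:
        if n.active and n.free_cpu() >= task_cpu_need:
            n.cpu_util += task_cpu_need
            return n
    for n in nodes:
        if not n.active:                 # wake a dormant node as a fallback
            n.active = True
            n.cpu_util += task_cpu_need
            return n
    return None

def sleep_suggestions(nodes):
    """Step (4): active nodes under both thresholds join the suggestion queue."""
    return [n.name for n in nodes
            if n.active and n.cpu_util < SLEEP_CPU and n.mem_util < SLEEP_MEM]

nodes = [Node("n1", 0.95, 0.80), Node("n2", 0.05, 0.10),
         Node("n3", 0.50, 0.40, active=False)]
chosen = schedule(nodes, task_cpu_need=0.30)
print(chosen.name, sleep_suggestions(nodes))
```

Step (1), task-oscillation migration control, would sit on top of this: it suppresses repeated migrations of the same task between nodes, which a bare threshold trigger like the one above could otherwise cause.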
The resource usage of the cluster hosts over a given time period includes the trend of CPU utilization, the trend of memory utilization, and the load of the nodes in the coming period; the prediction results provide reference decisions for the top-layer scheduling.
2. Analysis of the energy-saving scheme
This layered energy-saving scheme has the following main characteristics:
(1) Low coupling between modules.
The modules obtain data and invoke functions through API interfaces. In the data collection module, a newly added Hadoop node can be connected to the system seamlessly once the agent probe is installed: the node is discovered automatically, and its CPU, memory and other indicator data are collected automatically for use by the model-training layer; at the same time, the node's computing resources are placed into the resource pool. If a node fails and cannot work normally, the states of the other nodes are unaffected, reducing the impact of the failure.
(2) High accuracy of host-state prediction.
The middle-layer model-training module divides the original data into several intervals. In each interval, actual data is used for prediction first; predicted values are then continuously folded into the known data as history and used to predict the next value, so the overall behavior is a rolling forward forecast. Because the actual data set is used again as input when the next interval is reached, which amounts to a data correction, the predicted macroscopic trend remains correct, although some details are lost.
3. Energy-consumption model
3.1 Selecting the model indicators
Research shows that the energy consumption of a Hadoop cluster is determined mainly by the CPU, the memory, and the network inflow/outflow. CPU and memory account for the main part of node energy consumption, while network energy consumption is produced mainly by switching equipment and is closely tied to hardware such as switches. Other indicators also affect energy consumption, such as disk I/O and server fan operating modes, but since this application focuses on resource allocation and data storage, those indicators are out of scope.
Based on the above analysis, this application models energy consumption on the two indicators CPU and memory, and treats the other components of system energy consumption, such as disks, network inflow/outflow and other conventional system indicators, as a base constant.
In a real environment, many factors would have to be considered to build an energy model based on CPU and memory, including the host state (shut down, sleeping, idle, etc.) and the instruction-set type, since complex and reduced instruction sets involve different numbers of computing units. Modeling all of these factors, however, is costly. Research shows that cluster load is positively correlated with node CPU utilization and memory utilization, so node power can be calculated with formula (1):
P = C_0 + C_α·U_cpu + C_β·U_mem  (0 ≤ U_cpu ≤ 1, 0 ≤ U_mem ≤ 1)     (1)
In the formula above, C_0 is a constant representing the baseline power unrelated to CPU and memory utilization; C_α is the coefficient of the impact of CPU utilization on energy consumption; and C_β is the coefficient of the impact of memory utilization on energy consumption. C_0, C_α and C_β are linear-regression coefficient values obtained through extensive model training, and different servers yield different values.
If the Hadoop cluster consists of n nodes, the total power can be expressed by formula (2):
P_total = n·C_0 + C_α·(U_cpu,1 + … + U_cpu,n) + C_β·(U_mem,1 + … + U_mem,n)     (2)
The total energy consumption of the cluster from t_0 to t_1, denoted E, is then obtained by integrating this power over time, as shown in formula (3):
E = ∫_{t_0}^{t_1} P_total dt     (3)
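Formula (3) can be approximated numerically when only sampled utilizations are available, for example with the trapezoidal rule. The sample data below is hypothetical; the coefficient values are the ones fitted in section 3.2 for the application's test environment.

```python
# Sketch of formula (3): cluster energy E = integral of P_total over time,
# approximated from sampled utilizations with the trapezoidal rule.
# Coefficients are the patent's fitted values; the samples are hypothetical.
C0, CA, CB = 102.16, 16.24, 7.46

def total_power(cpu_utils, mem_utils):
    """Formula (2): P = n*C0 + CA*sum(U_cpu) + CB*sum(U_mem)."""
    n = len(cpu_utils)
    return n * C0 + CA * sum(cpu_utils) + CB * sum(mem_utils)

def energy(samples, dt):
    """Trapezoidal integration of power samples taken every dt seconds -> joules."""
    powers = [total_power(c, m) for c, m in samples]
    return sum((powers[i] + powers[i + 1]) / 2 * dt
               for i in range(len(powers) - 1))

# Two nodes sampled 3 times at 60 s spacing (hypothetical utilizations):
samples = [([0.2, 0.3], [0.4, 0.4]),
           ([0.6, 0.7], [0.5, 0.5]),
           ([0.1, 0.1], [0.3, 0.3])]
print(energy(samples, dt=60.0) / 3600.0, "Wh")  # watt-hours over 2 minutes
```

With a constant load the trapezoidal sum reduces exactly to P·(t_1 − t_0), matching the closed-form integral; finer sampling intervals improve the approximation for varying loads.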
3.2 Calculating the model coefficients
To obtain accurate energy-consumption values, the coefficients in the model must be measured experimentally. The experimental environment of this application is built on IBM x336 servers, and a power analyzer was used to obtain the following data:
(1) the power at CPU idle and at CPU full load;
(2) the power at different memory utilizations with CPU utilization held nearly constant;
(3) the power with CPU and memory utilization simultaneously held nearly constant.
The tools used in this application to control host resource utilization are server stress-testing tools: the CPU benchmark COREMark and the memory benchmark HPCC. The specific values are shown in the following table:
Table 1. Measured server power values
[The measured power values of Table 1 are published only as images in the original document and are not reproduced here.]
The parameter values of C_0, C_α and C_β in the actual cluster environment of this application are calculated from the data in Table 1. With memory utilization held nearly constant, the CPU coefficient is calculated as:
C_α = 100·(P_2 − P_1)/(CPU_2 − CPU_1) = 16.24
The memory coefficient is calculated in the same way: C_β = 7.46
Then, since P_4 = C_0 + C_α·U_cpu + C_β·U_mem, substituting C_β = 7.46 and C_α = 16.24 gives C_0 = 102.16.
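The coefficient calculation above can be reproduced in a few lines. Since Table 1 is published only as an image, the measured power values below are hypothetical stand-ins chosen so that the results match the patent's fitted coefficients; utilizations passed to the coefficient helpers are in percent, hence the factor 100.

```python
# Reproducing the coefficient derivation with HYPOTHETICAL measured values
# (the patent's Table 1 is only available as an image). The inputs are chosen
# so the outputs match the patent's fitted coefficients: 16.24, 7.46, 102.16.
def cpu_coeff(p1, p2, cpu1_pct, cpu2_pct):
    """C_alpha = 100 * (P2 - P1) / (CPU2 - CPU1), memory held constant."""
    return 100.0 * (p2 - p1) / (cpu2_pct - cpu1_pct)

def mem_coeff(p1, p2, mem1_pct, mem2_pct):
    """C_beta, analogously, with CPU utilization held constant."""
    return 100.0 * (p2 - p1) / (mem2_pct - mem1_pct)

def base_power(p4, ca, cb, u_cpu, u_mem):
    """Solve P4 = C0 + CA*U_cpu + CB*U_mem for C0 (utilizations in [0, 1])."""
    return p4 - ca * u_cpu - cb * u_mem

# Hypothetical measurements (watts, percent):
ca = cpu_coeff(p1=105.0, p2=117.992, cpu1_pct=10.0, cpu2_pct=90.0)
cb = mem_coeff(p1=104.0, p2=109.968, mem1_pct=10.0, mem2_pct=90.0)
c0 = base_power(p4=121.12, ca=ca, cb=cb, u_cpu=0.8, u_mem=0.8)
print(round(ca, 2), round(cb, 2), round(c0, 2))
```

The factor 100 converts the per-percent slope into the per-unit slope used in formula (1), where utilizations lie in [0, 1].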
From the above calculations, the power formula can be expressed as:
P = n·102.16 + 16.24·(U_cpu,1 + U_cpu,2 + … + U_cpu,n) + 7.46·(U_mem,1 + U_mem,2 + … + U_mem,n)
(0 ≤ U_cpu,i ≤ 1, 0 ≤ U_mem,i ≤ 1)
This application first presents the design of the energy-saving system, which mainly comprises resource data collection at the bottom layer, load prediction and an energy-consumption calculation model at the middle layer, and job scheduling at the upper layer, and describes in detail the key technologies and strategies used at each layer. An energy-consumption calculation model is then established on the basis of CPU and memory utilization, and the coefficient values of C_α, C_β and C_0 in the model are calculated with benchmarks for the specific experimental environment described here.
Of course, the above description does not limit the present invention, nor is the present invention limited to the above examples. Changes, modifications, additions or substitutions made by those skilled in the art within the essential scope of the present invention shall also fall within its protection scope.

Claims (2)

  1. A Hadoop cluster energy-saving system, characterized in that it comprises a data collection module at the bottom layer, a load prediction module and an energy-consumption model module at the middle layer, and a job scheduling module at the upper layer;
    the data collection module is configured to obtain cluster node data;
    the cluster node data includes: (1) the resource utilization of each node; (2) the system resources occupied by the tasks running on each cluster node;
    the data collection module monitors cluster performance indicators by means of Zabbix's agent-probe technology, and operates in a server/proxy/agent fashion;
    the data collection module covers multiple clusters; each cluster contains two hosts, and each host corresponds to n cluster nodes; a server is installed on each host and an agent on each cluster node; at regular intervals the server sends the agents a request to collect the indicator data of the monitored items, the agents return the requested data, and the server writes the data into the corresponding database, completing data collection and analysis;
    when the Hadoop cluster becomes too large, the pressure on the server side increases; the data collection module therefore uses proxies to share the work of analyzing and collecting cluster data, ensuring the stability of the bottom layer;
    the load prediction module and energy-consumption model module are configured to monitor cluster performance and, using the cluster node data collected by the bottom-layer data collection module, train the constructed LSTM network model for predicting node load, providing support for upper-layer task scheduling;
    cluster performance monitoring is implemented as follows:
    the load prediction module and energy-consumption model module analyze the CPU utilization and memory allocation collected on the server side to obtain real-time data of the cluster-node monitoring indicators, and monitor each cluster node against the configured thresholds;
    specifically: (1) performance-indicator visualization: a visualization window dynamically displays real-time data collected on the server side, including CPU utilization, memory allocation, the tasks running on a node and the resources allocated to them; (2) monitoring-log collection: the CPU utilization, memory allocation and per-task resource occupancy of each node, as collected on the server side, are written into the cluster log library; (3) monitoring-frequency control: sets how often the server side collects data, i.e. the collection interval;
    using the cluster node data collected by the bottom-layer data collection module, the constructed LSTM network model for predicting node load is trained to support upper-layer task scheduling, as follows:
    (1) predict the trend of a host's key indicators within a set period;
    first, the LSTM network model for predicting node load is built and trained on the bottom-layer data, with its parameters adjusted iteratively to obtain a trained model; the trained model is then used to predict the resource usage of the cluster hosts over a given time period, yielding the task-processing characteristics of each node; suitable tasks are assigned according to these characteristics, producing a list of tasks executable within a certain period; finally, the data is analyzed using the sequence length found experimentally to work best;
    (2) calculate the cluster energy-consumption value;
    first, an energy-consumption calculation model is established; then the index coefficients of the model are determined through actual tests on the Hadoop cluster; finally, the cluster energy consumption is calculated during actual task scheduling;
    the job scheduling module, which includes the job scheduler, is configured to schedule tasks according to the job information to be processed for the user, using the node load predicted by the trained LSTM model;
    the job scheduling module uses a scheduling algorithm based on host-state prediction; the algorithm obtains the job information entered by the user in advance, the job information indicating whether a job is CPU-intensive or memory-intensive, and then selects a node in the cluster that can meet the energy-consumption requirement to process the job; the job scheduler assigns to each job the node that will complete it, based on the user's job information and the predicted node load;
    the functions of the job scheduling module are implemented as follows:
    (1) task-oscillation migration control; (2) threshold-trigger mechanism: thresholds for putting nodes to sleep or activating them are set to support task scheduling; (3) checking whether the minimum requirement of the user's computation is met: the scheduler assigns a task to an active node and then checks the node's resources against the task's requirements; if they are not met, a dormant node is activated, and finally the node's CPU utilization and memory utilization are recorded; (4) node sleep-queue suggestions: according to CPU and memory utilization, nodes are selected for sleep and added to the node sleep-suggestion queue.
  2. The Hadoop cluster energy-saving system according to claim 1, characterized in that the resource usage of the cluster hosts over a given time period includes the trend of CPU utilization, the trend of memory utilization, and the load of the nodes in the coming period, and the prediction results provide reference decisions for the top-layer scheduling.
PCT/CN2019/108323 2019-09-16 2019-09-27 Hadoop cluster energy-saving system WO2021051441A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910868588.5 2019-09-16
CN201910868588.5A CN110618861A (zh) 2019-09-16 2019-09-16 Hadoop cluster energy-saving system

Publications (1)

Publication Number Publication Date
WO2021051441A1 true WO2021051441A1 (zh) 2021-03-25

Family

ID=68923026

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/108323 WO2021051441A1 (zh) 2019-09-16 2019-09-27 一种Hadoop集群节能系统

Country Status (2)

Country Link
CN (1) CN110618861A (zh)
WO (1) WO2021051441A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117148955B (zh) * 2023-10-30 2024-02-06 北京阳光金力科技发展有限公司 一种基于能耗数据的数据中心能耗管理方法
CN117421131B (zh) * 2023-12-18 2024-03-26 武汉泽塔云科技股份有限公司 一种监控服务器功耗负载的智能调度方法及系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130024496A1 (en) * 2011-07-21 2013-01-24 Yahoo! Inc Method and system for building an elastic cloud web server farm
CN104915407A (zh) * 2015-06-03 2015-09-16 华中科技大学 一种基于Hadoop多作业环境下的资源调度方法
CN109614210A (zh) * 2018-11-28 2019-04-12 重庆邮电大学 基于能耗感知的Storm大数据节能调度方法
CN110096349A (zh) * 2019-04-10 2019-08-06 山东科技大学 一种基于集群节点负载状态预测的作业调度方法
CN110209494A (zh) * 2019-04-22 2019-09-06 西北大学 一种面向大数据的分布式任务调度方法及Hadoop集群

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6358042B2 (ja) * 2014-10-21 2018-07-18 富士通株式会社 情報処理システム、制御装置および情報処理システムの制御方法
CN105487930B (zh) * 2015-12-01 2018-10-16 中国电子科技集团公司第二十八研究所 一种基于Hadoop的任务优化调度方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130024496A1 (en) * 2011-07-21 2013-01-24 Yahoo! Inc Method and system for building an elastic cloud web server farm
CN104915407A (zh) * 2015-06-03 2015-09-16 华中科技大学 一种基于Hadoop多作业环境下的资源调度方法
CN109614210A (zh) * 2018-11-28 2019-04-12 重庆邮电大学 基于能耗感知的Storm大数据节能调度方法
CN110096349A (zh) * 2019-04-10 2019-08-06 山东科技大学 一种基于集群节点负载状态预测的作业调度方法
CN110209494A (zh) * 2019-04-22 2019-09-06 西北大学 一种面向大数据的分布式任务调度方法及Hadoop集群

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU KEWEI: "The Distributed Cloud Monitor Design and Implementation of Large Data Storage based on Zabbix", CHINESE MASTER'S THESES FULL-TEXT DATABASE, no. 3, 15 March 2017 (2017-03-15), pages 1 - 70, XP055792754, ISSN: 1674-0246 *

Also Published As

Publication number Publication date
CN110618861A (zh) 2019-12-27

Similar Documents

Publication Publication Date Title
Hsieh et al. Utilization-prediction-aware virtual machine consolidation approach for energy-efficient cloud data centers
EP2539791B1 (en) Virtual machine power consumption measurement and management
WO2018137402A1 (zh) 基于滚动灰色预测模型的云数据中心节能调度实现方法
Ruan et al. Virtual machine allocation and migration based on performance-to-power ratio in energy-efficient clouds
Li et al. Energy-aware and multi-resource overload probability constraint-based virtual machine dynamic consolidation method
US11269677B2 (en) System and method to analyze and optimize application level resource and energy consumption by data center servers
Kaushik et al. T*: A data-centric cooling energy costs reduction approach for Big Data analytics cloud
US20110106935A1 (en) Power management for idle system in clusters
CN111737078B (zh) 基于负载类型的自适应云服务器能耗测算方法、系统及设备
Diouri et al. Your cluster is not power homogeneous: Take care when designing green schedulers!
Zuo et al. Dynamically weighted load evaluation method based on self-adaptive threshold in cloud computing
WO2021051441A1 (zh) 一种Hadoop集群节能系统
Zhou et al. IECL: an intelligent energy consumption model for cloud manufacturing
Maroulis et al. Express: Energy efficient scheduling of mixed stream and batch processing workloads
Lent Analysis of an energy proportional data center
Ismaeel et al. Energy-consumption clustering in cloud data centre
Tian et al. Modeling and analyzing power management policies in server farms using stochastic petri nets
Alrajeh et al. Using Virtual Machine live migration in trace-driven energy-aware simulation of high-throughput computing systems
Zhang et al. Estimating power consumption of containers and virtual machines in data centers
Lin et al. An adaptive workload-aware power consumption measuring method for servers in cloud data centers
Wang et al. Cloud workload analytics for real-time prediction of user request patterns
Ismaeel et al. Real-time energy-conserving vm-provisioning framework for cloud-data centers
Chinnici et al. Data center, a cyber-physical system: improving energy efficiency through the power management
Grishina et al. DC energy data measurement and analysis for productivity and waste energy assessment
Chinnici et al. An HPC-data center case study on the power consumption of workload

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19946215

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19946215

Country of ref document: EP

Kind code of ref document: A1