CN110618861A - Hadoop cluster energy-saving system - Google Patents

Info

Publication number
CN110618861A
Authority
CN
China
Prior art keywords
cluster
node
data
module
energy consumption
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910868588.5A
Other languages
Chinese (zh)
Inventor
倪丽娜
张金泉
刘浩然
韩庆亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University of Science and Technology
Original Assignee
Shandong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University of Science and Technology
Priority to CN201910868588.5A (patent CN110618861A)
Priority to PCT/CN2019/108323 (patent WO2021051441A1)
Publication of CN110618861A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234Power saving characterised by the action undertaken
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4893Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues taking into account power or heat criteria
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a Hadoop cluster energy-saving system in the field of information technology processing. The system mainly comprises resource data collection at the bottom layer, load prediction and an energy consumption calculation model at the middle layer, and job scheduling at the upper layer. The key technologies and strategies used by each layer are introduced in detail. An energy consumption calculation model is then established based on CPU and memory utilization, and, for the specific experimental environment of the application, the coefficient values $C_\alpha$, $C_\beta$ and $C_0$ of the energy consumption model are calculated using a benchmark.

Description

Hadoop cluster energy-saving system
Technical Field
The invention belongs to the field of information technology processing, and particularly relates to a Hadoop cluster energy-saving system.
Background
Data centers are attracting more and more attention. The focus has shifted from the early pursuit of data center size and number to the construction of green data centers. Enterprises continue to rely on data centers, and more workloads may move from local infrastructure to cloud platforms. However, current data centers face a number of challenges. First, resources are designed and deployed for peak demand, yet traffic and computing tasks usually arrive in phases, so most servers remain powered on during off-peak periods. Second, the number of data centers keeps growing; in a typical data center, about 70% of the electric energy is consumed by servers and only 30% by communication equipment, storage and air conditioning equipment, and IT energy consumption increases year by year. In data centers of all scales at home and abroad, Hadoop clusters account for a high proportion of deployments, and large numbers of Hadoop clusters are configured in fields such as web search, data mining and advertisement recommendation. However, the design of Hadoop task scheduling and data block storage focuses mainly on cluster performance, data security and similar concerns. As a result, the load balancing strategy of a Hadoop cluster keeps every node in the running state all the time and does not consider energy consumption. Some clusters contain hundreds of machines, so Hadoop clusters are, to a certain extent, among the main contributors to data center energy consumption. Research on energy-saving strategies for Hadoop clusters in job scheduling and storage is of great significance for reducing the Power Usage Effectiveness (PUE) of data centers, and also has a positive effect on the further development of the Hadoop open source project.
To provide an accurate basis for resource allocation decisions, node state changes must be monitored in real time, and a job scheduling queue with optimal energy consumption must be derived from the job information submitted by users.
Disclosure of Invention
Aiming at the technical problems in the prior art, the invention provides the Hadoop cluster energy-saving system which is reasonable in design, overcomes the defects in the prior art and has a good effect.
In order to achieve the purpose, the invention adopts the following technical scheme:
a Hadoop cluster energy-saving system comprises a data collection module at the bottom layer, a load prediction module and an energy consumption model module at the middle layer, and a job scheduling module at the upper layer;
a data collection module configured to obtain cluster node data;
the cluster node data includes: (1) the condition of resource utilization of a node; (2) the situation that the tasks operated by the cluster nodes occupy the system resources;
the data collection module monitors the cluster performance index by means of the Agent probe technology of Zabbix; the data collection module works in a server end, a proxy end and an agent end mode;
the data collection module comprises a plurality of clusters, each cluster comprises two hosts, and each host corresponds to n cluster nodes; each host is provided with a server, each cluster node is provided with an agent, the server sends a request to the agent at intervals to collect the index data of the monitored item, the agent returns the requested data to the server, and the server writes the obtained data into a corresponding database to complete the collection and analysis of the data;
when the cluster scale of the Hadoop is too large, the pressure of the server end is increased, and the data collection module shares the analysis and collection work of cluster data by adopting proxy, so that the stability of a bottom system is ensured;
the load prediction module and the energy consumption model module are configured to be used for monitoring cluster performance, training the constructed LSTM network model for predicting the node load through the cluster node data collected by the data collection module at the bottom layer and providing support for the task scheduling at the upper layer;
the cluster performance monitoring method is specifically realized as follows:
the load prediction module and the energy consumption model module obtain real-time data of cluster node monitoring indexes by analyzing the CPU utilization rate and the memory allocation condition collected by the server end, and realize the monitoring of each cluster node through a set threshold;
the method specifically comprises the following steps: (1) performance index visualization: a visual window is constructed to dynamically display the real-time data collected by the server end, including CPU utilization, memory allocation, the tasks running on each node and the resources allocated to each task; (2) monitoring log collection: the CPU utilization, memory allocation and per-task resource occupation of each node collected by the server end are written into a cluster log library; (3) monitoring frequency control: the frequency at which the server end collects data is set, that is, data is collected once per interval;
training a constructed LSTM network model for predicting node load through cluster node data collected by a data collection module at the bottom layer, and providing support for upper-layer task scheduling; the specific implementation method comprises the following steps:
(1) predicting the trend of key indexes of the host within set time;
first, an LSTM network model for predicting node load is constructed, the bottom-layer data is used as training data, and the parameters of the LSTM network model are tuned iteratively to obtain a trained model; second, the trained model is used to predict the resource usage of the cluster hosts in a given time period, the task-processing characteristics of each node are derived, suitable tasks are assigned according to those characteristics, and a list of tasks executable within a certain time period is obtained; finally, the data is analyzed and processed using the sequence length that gave the best results in experiments;
(2) calculating a cluster energy consumption value;
firstly, establishing an energy consumption calculation model; then, determining an index coefficient of the established model through actual test on the Hadoop cluster; finally, calculating cluster energy consumption in actual task scheduling;
the job scheduling module comprises a job scheduler which is configured to use the node load condition predicted by the trained LSTM model to schedule tasks according to the job information to be processed by the user;
the job scheduling module adopts a scheduling algorithm based on host state prediction; the algorithm needs to obtain in advance the job information input by the user, including whether the job is CPU-intensive or memory-intensive, and then selects from the cluster a node that can meet the energy consumption requirement to process the job; the job scheduler allocates a node for completing the job according to the job information of the user and the predicted node load condition;
the specific implementation method of the job scheduling module function is as follows:
(1) task oscillation migration control; (2) a threshold triggering mechanism: dormancy and activation thresholds are set for the nodes to support task scheduling; (3) checking whether the user's minimum computing requirement is met: the scheduler distributes tasks to the active nodes, then compares the resources of the nodes with the requirements of the user's tasks, activates dormant nodes if the requirements are not met, and finally collects the CPU utilization and memory utilization of the nodes; (4) node dormancy queue suggestion: nodes to be put into dormancy are selected according to their CPU utilization and memory utilization and added to the node dormancy suggestion queue.
Preferably, the resource usage of the cluster host in a given time period includes a trend of CPU utilization, a trend of memory utilization, and a load condition of the node in a future time period, and the prediction result provides a reference decision for scheduling in the uppermost layer.
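As a rough illustration of the server and agent polling cycle described above, the following Python sketch models a server that requests the monitored items from each agent at a fixed interval and writes the replies into a database. It does not use Zabbix itself: the `Agent` and `Server` classes, the `read_metrics` callback, the metric names and the in-memory SQLite table are all hypothetical stand-ins chosen for the example.

```python
import sqlite3
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Hypothetical stand-in for the agent running on one cluster node."""
    node: str
    read_metrics: Callable[[], Dict[str, float]]  # e.g. {"cpu": 0.4, "mem": 0.6}


class Server:
    """Hypothetical stand-in for the server that polls agents and stores replies."""

    def __init__(self, agents: List[Agent], interval_s: float) -> None:
        self.agents = agents
        self.interval_s = interval_s
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE metrics (ts REAL, node TEXT, item TEXT, value REAL)"
        )

    def poll_once(self) -> None:
        """Request the monitored items from every agent and persist the replies."""
        now = time.time()
        for agent in self.agents:
            for item, value in agent.read_metrics().items():
                self.db.execute(
                    "INSERT INTO metrics VALUES (?, ?, ?, ?)",
                    (now, agent.node, item, value),
                )
        self.db.commit()

    def run(self, rounds: int) -> None:
        """Poll a fixed number of times, sleeping for the interval between polls."""
        for i in range(rounds):
            self.poll_once()
            if i < rounds - 1:
                time.sleep(self.interval_s)


# Two simulated nodes that always report the same (made-up) utilization values.
agents = [
    Agent("node-1", lambda: {"cpu": 0.35, "mem": 0.50}),
    Agent("node-2", lambda: {"cpu": 0.80, "mem": 0.65}),
]
server = Server(agents, interval_s=0.01)
server.run(rounds=3)
row_count = server.db.execute("SELECT COUNT(*) FROM metrics").fetchone()[0]
```

In this sketch the collection frequency is the single `interval_s` parameter; a proxy tier for large clusters, as described above, would sit between `Server` and the agents and is not modeled here.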
The invention has the following beneficial technical effects:
(1) Low coupling between modules
The modules obtain data and call functions through API (application programming interface) interfaces. Once the Agent probe is installed on a newly added Hadoop node, the data collection module connects the node to the system seamlessly: the node is discovered automatically, and its index data such as CPU and memory usage is collected automatically for use by the model training layer. At the same time, the computing resources of the node are put into the resource pool. If a node fails and cannot work normally, the states of the other nodes are not affected, which limits the impact of the failure.
(2) The accuracy of the state prediction of the host is higher
The load prediction module of the middle layer divides the original data into several intervals. Within each interval, prediction starts from actual data; each predicted value is then added to the known data as history and used to predict the next value, so that the overall process is a rolling forward prediction. Because the actual data set is used as input again when the next interval is reached, which is equivalent to a data correction, the overall macroscopic trend of the prediction is correct.
Drawings
Fig. 1 is a diagram of the overall architecture of the system.
Detailed Description
The invention is described in further detail below with reference to the following figures and detailed description:
1. System architecture design
As shown in Fig. 1, a Hadoop cluster energy-saving system includes a data collection module at the bottom layer, a load prediction module and an energy consumption model module at the middle layer, and a job scheduling module at the upper layer;
a data collection module configured to obtain cluster node data;
the cluster node data includes: (1) the condition of resource utilization of a node; (2) the situation that the tasks operated by the cluster nodes occupy the system resources;
the data collection module monitors the cluster performance index by means of the Agent probe technology of Zabbix; the data collection module works in a server end, a proxy end and an agent end mode;
the data collection module comprises a plurality of clusters, each cluster comprises two hosts, and each host corresponds to n cluster nodes; each host is provided with a server, each cluster node is provided with an agent, the server sends a request to the agent at intervals to collect the index data of the monitored item, the agent returns the requested data to the server, and the server writes the obtained data into a corresponding database to complete the collection and analysis of the data;
when the cluster scale of the Hadoop is too large, the pressure of the server end is increased, and the data collection module shares the analysis and collection work of cluster data by adopting proxy, so that the stability of a bottom system is ensured;
the load prediction module and the energy consumption model module are configured to be used for monitoring cluster performance, training the constructed LSTM network model for predicting the node load through the cluster node data collected by the data collection module at the bottom layer and providing support for the task scheduling at the upper layer;
the cluster performance monitoring method is specifically realized as follows:
the load prediction module and the energy consumption model module obtain real-time data of cluster node monitoring indexes by analyzing the CPU utilization rate and the memory allocation condition collected by the server end, and realize the monitoring of each cluster node through a set threshold;
the method specifically comprises the following steps: (1) performance index visualization: a visual window is constructed to dynamically display the real-time data collected by the server end, including CPU utilization, memory allocation, the tasks running on each node and the resources allocated to each task; (2) monitoring log collection: the CPU utilization, memory allocation and per-task resource occupation of each node collected by the server end are written into a cluster log library; (3) monitoring frequency control: the frequency at which the server end collects data is set, that is, data is collected once per interval;
training a constructed LSTM network model for predicting node load through cluster node data collected by a data collection module at the bottom layer, and providing support for upper-layer task scheduling; the specific implementation method comprises the following steps:
(1) predicting the trend of key indexes of the host within set time;
first, an LSTM network model for predicting node load is constructed, the bottom-layer data is used as training data, and the parameters of the LSTM network model are tuned iteratively to obtain a trained model; second, the trained model is used to predict the resource usage of the cluster hosts in a given time period, the task-processing characteristics of each node are derived, suitable tasks are assigned according to those characteristics, and a list of tasks executable within a certain time period is obtained; finally, the data is analyzed and processed using the sequence length that gave the best results in experiments;
(2) calculating a cluster energy consumption value;
firstly, establishing an energy consumption calculation model; then, determining an index coefficient of the established model through actual test on the Hadoop cluster; finally, calculating cluster energy consumption in actual task scheduling;
the job scheduling module comprises a job scheduler which is configured to use the node load condition predicted by the trained LSTM model to schedule tasks according to the job information to be processed by the user;
the job scheduling module adopts a scheduling algorithm based on host state prediction; the algorithm needs to obtain in advance the job information input by the user, including whether the job is CPU-intensive or memory-intensive, and then selects from the cluster a node that can meet the energy consumption requirement to process the job; the job scheduler allocates a node for completing the job according to the job information of the user and the predicted node load condition;
the specific implementation method of the job scheduling module function is as follows:
(1) task oscillation migration control; (2) a threshold triggering mechanism: dormancy and activation thresholds are set for the nodes to support task scheduling; (3) checking whether the user's minimum computing requirement is met: the scheduler distributes tasks to the active nodes, then compares the resources of the nodes with the requirements of the user's tasks, activates dormant nodes if the requirements are not met, and finally collects the CPU utilization and memory utilization of the nodes; (4) node dormancy queue suggestion: nodes to be put into dormancy are selected according to their CPU utilization and memory utilization and added to the node dormancy suggestion queue.
The resource utilization conditions of the cluster host in a given time period comprise the trend of CPU utilization rate, the trend of memory utilization rate and the load condition of the node in the future time period, and the prediction result provides a reference decision for the scheduling of the uppermost layer.
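The scheduling steps above (threshold triggering, checking resources against the job's requirements, activating dormant nodes, and building a dormancy suggestion queue) can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the scheduling algorithm of the application: the `Node` and `Job` types, the `SLEEP_THRESHOLD` value and the rule of preferring the busiest active node that still fits are hypothetical choices made for illustration, and task oscillation migration control is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    cpu_util: float      # predicted CPU utilization, 0..1
    mem_util: float      # predicted memory utilization, 0..1
    active: bool = True  # False means the node is dormant


@dataclass
class Job:
    name: str
    cpu_demand: float    # fraction of one node's CPU the job needs
    mem_demand: float    # fraction of one node's memory the job needs


SLEEP_THRESHOLD = 0.10   # hypothetical dormancy threshold for both metrics


def fits(node: Node, job: Job) -> bool:
    """True if the node's predicted headroom covers the job's demand."""
    return (node.cpu_util + job.cpu_demand <= 1.0
            and node.mem_util + job.mem_demand <= 1.0)


def schedule(job: Job, nodes: List[Node]) -> Optional[Node]:
    """Place the job on an active node; wake a dormant node only if none fits."""
    candidates = [n for n in nodes if n.active and fits(n, job)]
    if candidates:
        # Assumed policy: pack onto the busiest node that still fits,
        # so lightly loaded nodes can drain and become dormancy candidates.
        target = max(candidates, key=lambda n: n.cpu_util + n.mem_util)
    else:
        dormant = [n for n in nodes if not n.active and fits(n, job)]
        if not dormant:
            return None          # no node can satisfy the job's requirements
        target = dormant[0]
        target.active = True     # activation step
    target.cpu_util += job.cpu_demand
    target.mem_util += job.mem_demand
    return target


def dormancy_suggestions(nodes: List[Node]) -> List[str]:
    """Active nodes whose CPU and memory utilization are both under the threshold."""
    return [n.name for n in nodes
            if n.active
            and n.cpu_util < SLEEP_THRESHOLD
            and n.mem_util < SLEEP_THRESHOLD]


nodes = [
    Node("n1", 0.70, 0.60),
    Node("n2", 0.05, 0.04),
    Node("n3", 0.0, 0.0, active=False),
]
placed_small = schedule(Job("cpu-heavy", 0.25, 0.10), nodes)
suggested = dormancy_suggestions(nodes)
placed_large = schedule(Job("large", 0.98, 0.98), nodes)
```

In the usage lines, the small job is packed onto the already busy node, the nearly idle node is suggested for dormancy, and the large job forces the dormant node to be activated because no active node has enough headroom.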
2. Energy saving scheme analysis
The layered energy-saving system scheme is mainly characterized in that:
(1) Low coupling between modules
The modules obtain data and call functions through API (application programming interface) interfaces. Once the Agent probe is installed on a newly added Hadoop node, the data collection module connects the node to the system seamlessly: the node is discovered automatically, and its index data such as CPU and memory usage is collected automatically for use by the model training layer. At the same time, the computing resources of the node are put into the resource pool. If a node fails and cannot work normally, the states of the other nodes are not affected, which limits the impact of the failure.
(2) The accuracy of the state prediction of the host is higher
The model training module of the middle layer divides the original data into several intervals. Within each interval, prediction starts from actual data; each predicted value is then added to the known data as history and used to predict the next value, so that the overall process is a rolling forward prediction. Because the actual data set is used as input again when the next interval is reached, which is equivalent to a data correction, the overall macroscopic trend of the prediction is correct; the disadvantage is that some details are lost.
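The rolling forward prediction described above can be illustrated with a short Python sketch. To keep the example self-contained, the one-step predictor is a simple window average that stands in for the trained LSTM; the function names, the `seq_len` and `interval` parameters and the sample values are assumptions introduced for the example.

```python
from typing import Callable, List, Sequence


def mean_of_window(history: Sequence[float]) -> float:
    """Placeholder one-step predictor; a trained LSTM would be used in its place."""
    return sum(history) / len(history)


def rolling_forecast(
    actual: Sequence[float],
    seq_len: int,
    interval: int,
    predict_next: Callable[[Sequence[float]], float] = mean_of_window,
) -> List[float]:
    """Forecast actual[seq_len:] one interval at a time.

    At the start of each interval the history is reset to real observations
    (the correction step). Inside the interval every prediction is appended
    to the history and becomes input for the next step.
    """
    forecasts: List[float] = []
    start = seq_len
    while start < len(actual):
        history = list(actual[start - seq_len:start])  # re-anchor on actual data
        steps = min(interval, len(actual) - start)
        for _ in range(steps):
            nxt = predict_next(history[-seq_len:])
            forecasts.append(nxt)
            history.append(nxt)                        # feed the prediction back
        start += steps
    return forecasts


# Made-up utilization series, chosen only to exercise the function.
forecasts = rolling_forecast([0.25, 0.75, 0.5, 0.5, 1.0, 0.0], seq_len=2, interval=2)
```

Within an interval the second prediction depends on the first, so errors can accumulate; resetting the history to actual data at each interval boundary is what keeps the long-run trend aligned, at the cost of smoothing over short-term detail.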
3. Energy consumption model
3.1 Selecting energy consumption model indexes
Research shows that the energy consumption of a Hadoop cluster is mainly determined by the CPU, the memory, and network inflow and outflow. The CPU and the memory account for the main part of a node's energy consumption, while network energy consumption is mainly generated by switching equipment and is closely tied to hardware such as switches. Other indexes, such as disk I/O and the server fan operating mode, also affect energy consumption, but they are not considered because the present application is primarily concerned with resource allocation and data storage.
In summary, energy consumption is modeled on two indexes, CPU and memory, and the energy consumed by other parts of the system, such as the disk, network inflow and outflow, and other conventional system indexes, is treated as a base constant.
In a real environment, many factors must be considered when building an energy consumption model based on the CPU and the memory, including host states such as shutdown, hibernation and idle, and the type of instruction set (complex or reduced), which may involve a different number of computing units. Taking all of these factors into account, however, makes modeling costly. Research shows that cluster load is positively correlated with the CPU utilization and memory utilization of the nodes, so node power can be calculated by formula (1):
$$P = C_0 + C_\alpha U_{cpu} + C_\beta U_{mem}, \quad 0 \le U_{cpu} \le 1,\ 0 \le U_{mem} \le 1 \tag{1}$$
In the above formula, $C_0$ is a constant representing the base power that is independent of CPU utilization and memory utilization, $C_\alpha$ is the coefficient of influence of CPU utilization on energy consumption, and $C_\beta$ is the coefficient of influence of memory utilization on energy consumption. The coefficient values are obtained by linear regression over a large amount of model training, and they differ from server to server.
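If the Hadoop cluster consists of n nodes, the total power can be represented by equation (2), where $U_{cpu,i}$ and $U_{mem,i}$ are the CPU utilization and memory utilization of node $i$:
$$P = \sum_{i=1}^{n} \left( C_0 + C_\alpha U_{cpu,i} + C_\beta U_{mem,i} \right) \tag{2}$$
From this, the total energy consumption of the cluster during the period from $t_0$ to $t_1$, denoted $E$, is obtained by integrating the power of the nodes over time, as shown in equation (3):
$$E = \int_{t_0}^{t_1} P(t)\,dt \tag{3}$$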
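Since monitoring data arrives as discrete samples, the integral of cluster power over the period from t0 to t1 has to be approximated numerically in practice. The Python sketch below uses the trapezoidal rule over (time, power) samples; the function name, the sample values and the assumption that power is in watts and time in seconds are illustrative and do not come from the application.

```python
from typing import Sequence, Tuple


def energy_joules(samples: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal integral of (time_s, power_w) samples, in joules.

    Samples must be ordered by time; the first timestamp plays the role
    of t0 and the last the role of t1.
    """
    total = 0.0
    for (t_a, p_a), (t_b, p_b) in zip(samples, samples[1:]):
        total += 0.5 * (p_a + p_b) * (t_b - t_a)
    return total


# Hypothetical cluster power readings taken one minute apart.
samples = [(0.0, 200.0), (60.0, 220.0), (120.0, 210.0)]
energy_j = energy_joules(samples)
energy_kwh = energy_j / 3.6e6   # 1 kWh = 3.6e6 J
```

The accuracy of the estimate depends on the collection frequency set at the server end: shorter intervals between samples track load changes more closely.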
3.2 Energy consumption model coefficient calculation
To obtain a relatively accurate energy consumption value, the coefficients of the energy consumption model must be measured experimentally. The experimental environment used in this application is built on IBM x336 servers, and a power analyzer is used to obtain the following data:
(1) The power values of the CPU in the idle state and in the fully loaded state.
(2) The power at different memory utilization rates while the CPU utilization is kept approximately constant.
(3) The power when the CPU utilization and the memory utilization are simultaneously close to the same level.
Host resource utilization is controlled with CPU and memory stress-testing tools used for server evaluation: CoreMark, and the memory test benchmark HPCC. Specific values are given in the following table:
Table 1. Measured values of server power
The values of $C_0$, $C_\alpha$ and $C_\beta$ for the actual cluster environment of this application are calculated from the data in Table 1. With the memory utilization held approximately constant, the CPU coefficient is calculated as:
$$C_\alpha = 100 \cdot (P_2 - P_1) / (CPU_2 - CPU_1) = 16.24$$
The memory coefficient is calculated in the same way: $C_\beta = 7.46$.
Substituting the calculated $C_\beta = 7.46$ and $C_\alpha = 16.24$ into $P_4 = C_0 + C_\alpha U_{cpu} + C_\beta U_{mem}$ gives $C_0 = 102.16$.
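The two calculation steps above, a two-point slope for each utilization coefficient followed by solving the power formula for the base constant, can be written as small Python helpers. The function names are hypothetical, and the readings used in the example are invented round numbers for illustration only; they are not the measurements of Table 1 and do not reproduce the values 16.24, 7.46 or 102.16.

```python
def slope_coefficient(p_low: float, p_high: float,
                      util_low_pct: float, util_high_pct: float) -> float:
    """Power change per unit utilization (0..1), from two power readings
    taken at different utilization percentages while the other resource
    is held approximately constant. The factor 100 converts from
    percent to a 0..1 utilization scale."""
    return 100.0 * (p_high - p_low) / (util_high_pct - util_low_pct)


def base_power(p_measured: float, c_alpha: float, c_beta: float,
               u_cpu: float, u_mem: float) -> float:
    """Solve P = C0 + C_alpha * U_cpu + C_beta * U_mem for C0."""
    return p_measured - c_alpha * u_cpu - c_beta * u_mem


# Invented readings: 100.0 at 25 % CPU and 108.0 at 75 % CPU.
c_alpha = slope_coefficient(100.0, 108.0, 25.0, 75.0)
# Invented memory coefficient and a third reading at 50 % CPU, 50 % memory.
c_beta = 8.0
c_0 = base_power(120.0, c_alpha, c_beta, u_cpu=0.5, u_mem=0.5)
```

With more than two readings per resource, an ordinary least-squares fit over all measurements would be the natural generalization of the two-point slope.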
From the above calculations, the power calculation formula can be expressed as:
$$P = n \cdot 102.16 + 16.24 \cdot (U_{cpu,1} + U_{cpu,2} + \dots + U_{cpu,n}) + 7.46 \cdot (U_{mem,1} + U_{mem,2} + \dots + U_{mem,n}), \quad 0 \le U_{cpu,i} \le 1,\ 0 \le U_{mem,i} \le 1$$
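For a concrete reading of the cluster power formula, the Python sketch below evaluates it with the coefficient values stated above (102.16, 16.24 and 7.46). The function name, the input format (one CPU and memory utilization pair per node) and the sample utilizations are assumptions made for the example.

```python
from typing import Sequence, Tuple

C_0 = 102.16      # base power per node
C_ALPHA = 16.24   # CPU utilization coefficient
C_BETA = 7.46     # memory utilization coefficient


def cluster_power(utilizations: Sequence[Tuple[float, float]]) -> float:
    """Total power of a cluster given one (u_cpu, u_mem) pair per node.

    Each utilization must lie in [0, 1], matching the constraint
    stated with the formula.
    """
    for u_cpu, u_mem in utilizations:
        if not (0.0 <= u_cpu <= 1.0 and 0.0 <= u_mem <= 1.0):
            raise ValueError("utilization values must be within [0, 1]")
    n = len(utilizations)
    cpu_sum = sum(u_cpu for u_cpu, _ in utilizations)
    mem_sum = sum(u_mem for _, u_mem in utilizations)
    return n * C_0 + C_ALPHA * cpu_sum + C_BETA * mem_sum


# Hypothetical two-node cluster: one node half loaded, one CPU-saturated.
power = cluster_power([(0.5, 0.5), (1.0, 0.0)])
```

For this two-node example the terms are 2 * 102.16 = 204.32, 16.24 * 1.5 = 24.36 and 7.46 * 0.5 = 3.73, which sum to 232.41. The base term dominates, which is why putting idle nodes into dormancy saves far more than lowering the utilization of active ones.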
This application first introduces the design of the energy-saving system, which mainly comprises resource data collection at the bottom layer, load prediction and an energy consumption calculation model at the middle layer, and job scheduling at the upper layer, and describes in detail the key technologies and strategies used by each layer. An energy consumption calculation model is then established based on CPU and memory utilization, and, for the specific experimental environment of this application, the coefficient values $C_\alpha$, $C_\beta$ and $C_0$ of the energy consumption model are calculated using a benchmark.
It is to be understood that the above description is not intended to limit the present invention, and the present invention is not limited to the above examples, and those skilled in the art may make modifications, alterations, additions or substitutions within the spirit and scope of the present invention.

Claims (2)

1. A Hadoop cluster energy-saving system, characterized in that it comprises a data collection module at the bottom layer, a load prediction module and an energy consumption model module at the middle layer, and a job scheduling module at the upper layer;
a data collection module configured to obtain cluster node data;
the cluster node data includes: (1) the condition of resource utilization of a node; (2) the situation that the tasks operated by the cluster nodes occupy the system resources;
the data collection module monitors the cluster performance index by means of the Agent probe technology of Zabbix; the data collection module works in a server end, a proxy end and an agent end mode;
the data collection module comprises a plurality of clusters, each cluster comprises two hosts, and each host corresponds to n cluster nodes; each host is provided with a server, each cluster node is provided with an agent, the server sends a request to the agent at intervals to collect the index data of the monitored item, the agent returns the requested data to the server, and the server writes the obtained data into a corresponding database to complete the collection and analysis of the data;
when the cluster scale of the Hadoop is too large, the pressure of the server end is increased, and the data collection module shares the analysis and collection work of cluster data by adopting proxy, so that the stability of a bottom system is ensured;
the load prediction module and the energy consumption model module are configured to be used for monitoring cluster performance, training the constructed LSTM network model for predicting the node load through the cluster node data collected by the data collection module at the bottom layer and providing support for the task scheduling at the upper layer;
the cluster performance monitoring method is specifically realized as follows:
the load prediction module and the energy consumption model module obtain real-time data of cluster node monitoring indexes by analyzing the CPU utilization rate and the memory allocation condition collected by the server end, and realize the monitoring of each cluster node through a set threshold;
the method specifically comprises the following steps: (1) performance index visualization: a visual window is constructed to dynamically display the real-time data collected by the server end, including CPU utilization, memory allocation, the tasks running on each node and the resources allocated to each task; (2) monitoring log collection: the CPU utilization, memory allocation and per-task resource occupation of each node collected by the server end are written into a cluster log library; (3) monitoring frequency control: the frequency at which the server end collects data is set, that is, data is collected once per interval;
the constructed LSTM network model for predicting node load is trained using the cluster node data collected by the underlying data collection module, providing support for upper-layer task scheduling; the specific implementation is as follows:
(1) predicting the trend of the host's key indexes within a set time;
first, an LSTM network model for predicting node load is constructed, the bottom-layer data is used as training data, and the index parameters of the LSTM network model are adjusted repeatedly to obtain a trained model; second, the trained model is used to predict the resource usage of the cluster hosts in a given time period, yielding the task-processing characteristics of each node, according to which suitable tasks are assigned and a list of tasks executable within a certain time period is obtained; finally, the data is analyzed and processed using the sequence length that experiments showed to perform better;
(2) calculating the cluster energy consumption value;
first, an energy consumption calculation model is established; then, the index coefficients of the model are determined through actual tests on the Hadoop cluster; finally, the cluster energy consumption is calculated during actual task scheduling;
the job scheduling module comprises a job scheduler configured to schedule tasks according to the job information submitted by the user, using the node load predicted by the trained LSTM model;
the job scheduling module adopts a scheduling algorithm based on host state prediction; the algorithm requires the job information entered by the user in advance, including whether the job is CPU-intensive or memory-intensive, and then selects from the cluster a node that can meet the energy consumption requirement to process the job; the job scheduler assigns a node to complete the job according to the user's job information and the predicted node load;
the functions of the job scheduling module are specifically implemented as follows:
(1) task oscillation migration control; (2) threshold triggering mechanism: sleep and activation thresholds are set for the nodes to support task scheduling; (3) checking whether the user's minimum computing requirement is met: the scheduler assigns tasks to the active nodes, then compares the resources of those nodes with the requirements of the user's tasks, activates dormant nodes if the requirements are not met, and finally records the CPU utilization and memory utilization of the nodes; (4) node dormancy queue suggestion: nodes to be put to sleep are selected according to their CPU utilization and memory utilization and added to the node dormancy suggestion queue.
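The scheduling steps (2) to (4) above can be illustrated with a minimal Python sketch. It is not the claimed algorithm: the `Node` and `Job` classes, the `SLEEP_THRESHOLD` value, and the "tightest fit" selection rule are all assumptions introduced here for illustration, and task oscillation migration control from step (1) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_free: float       # predicted free CPU fraction, 0.0 to 1.0 (hypothetical field)
    mem_free: float       # predicted free memory fraction, 0.0 to 1.0 (hypothetical field)
    active: bool = True   # False means the node is dormant


@dataclass
class Job:
    name: str
    cpu_need: float
    mem_need: float
    kind: str = "cpu"     # "cpu" or "memory" intensive, as declared by the user


# Assumed threshold: an active node with at least this much free CPU and
# memory is suggested for dormancy. The patent does not give a value.
SLEEP_THRESHOLD = 0.9


def fits(node, job):
    """Check whether the node's free resources cover the job's requirements."""
    return node.cpu_free >= job.cpu_need and node.mem_free >= job.mem_need


def schedule(job, nodes):
    """Place the job on an active node; wake a dormant node only if none fits."""
    if job.kind == "cpu":
        key = lambda n: n.cpu_free
    else:
        key = lambda n: n.mem_free
    for wanted_active in (True, False):
        candidates = [n for n in nodes if n.active == wanted_active and fits(n, job)]
        if candidates:
            # Assumed policy: tightest fit, so load stays on few nodes.
            chosen = min(candidates, key=key)
            chosen.active = True
            chosen.cpu_free -= job.cpu_need
            chosen.mem_free -= job.mem_need
            return chosen
    return None  # no node, active or dormant, can meet the requirement


def sleep_suggestions(nodes):
    """Build the node dormancy suggestion queue from nearly idle active nodes."""
    return [
        n.name
        for n in nodes
        if n.active and n.cpu_free >= SLEEP_THRESHOLD and n.mem_free >= SLEEP_THRESHOLD
    ]
```

In this sketch a dormant node is activated only after every active node has failed the resource check, which mirrors the order of operations in step (3); a real scheduler would also need to account for wake-up latency and the energy consumption requirement.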
2. The Hadoop cluster energy-saving system of claim 1, wherein: the resource usage of the cluster hosts in a given time period comprises the trend of CPU utilization, the trend of memory utilization, and the load of each node in the future time period, and the prediction result provides a decision reference for the uppermost scheduling layer.
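As a rough illustration of how collected utilization data might be prepared for a sequence model and how a predicted value might be turned into a trend, the following Python sketch builds fixed-length training windows and labels the direction of change. The function names, the `tol` tolerance, and the three trend labels are assumptions made here; the LSTM model itself is not shown, and the patent does not specify the sequence length.

```python
def make_windows(series, seq_len):
    """Split a utilization series into (input window, next value) pairs.

    seq_len is the sequence length fed to the model; the claims say only
    that a well-performing value is found by experiment.
    """
    if seq_len < 1 or seq_len >= len(series):
        raise ValueError("seq_len must be at least 1 and shorter than the series")
    return [
        (series[i:i + seq_len], series[i + seq_len])
        for i in range(len(series) - seq_len)
    ]


def trend(last_observed, predicted, tol=0.05):
    """Label the predicted change as 'rising', 'falling', or 'steady'.

    tol is an assumed dead band that keeps small fluctuations from being
    reported as a trend.
    """
    delta = predicted - last_observed
    if delta > tol:
        return "rising"
    if delta < -tol:
        return "falling"
    return "steady"
```

For example, a CPU utilization series of five samples with `seq_len=3` yields two training pairs, and the same helper can be applied unchanged to the memory utilization series.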
CN201910868588.5A 2019-09-16 2019-09-16 Hadoop cluster energy-saving system Pending CN110618861A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910868588.5A CN110618861A (en) 2019-09-16 2019-09-16 Hadoop cluster energy-saving system
PCT/CN2019/108323 WO2021051441A1 (en) 2019-09-16 2019-09-27 Energy conservation system for hadoop cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910868588.5A CN110618861A (en) 2019-09-16 2019-09-16 Hadoop cluster energy-saving system

Publications (1)

Publication Number Publication Date
CN110618861A true CN110618861A (en) 2019-12-27

Family

ID=68923026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910868588.5A Pending CN110618861A (en) 2019-09-16 2019-09-16 Hadoop cluster energy-saving system

Country Status (2)

Country Link
CN (1) CN110618861A (en)
WO (1) WO2021051441A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117148955A (en) * 2023-10-30 2023-12-01 北京阳光金力科技发展有限公司 Data center energy consumption management method based on energy consumption data
CN117421131A (en) * 2023-12-18 2024-01-19 武汉泽塔云科技股份有限公司 Intelligent scheduling method and system for monitoring power consumption load of server

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915407A (en) * 2015-06-03 2015-09-16 华中科技大学 Resource scheduling method under Hadoop-based multi-job environment
CN105487930A (en) * 2015-12-01 2016-04-13 中国电子科技集团公司第二十八研究所 Task optimization scheduling method based on Hadoop
US20160110227A1 (en) * 2014-10-21 2016-04-21 Fujitsu Limited System, method of controlling to execute a job, and apparatus
CN110096349A (en) * 2019-04-10 2019-08-06 山东科技大学 A kind of job scheduling method based on the prediction of clustered node load condition

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8954568B2 (en) * 2011-07-21 2015-02-10 Yahoo! Inc. Method and system for building an elastic cloud web server farm
CN109614210B (en) * 2018-11-28 2022-11-04 重庆邮电大学 Storm big data energy-saving scheduling method based on energy consumption perception
CN110209494B (en) * 2019-04-22 2022-11-25 西北大学 Big data-oriented distributed task scheduling method and Hadoop cluster

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
于科伟: ""基于Zabbix分布式监控大数据存储设计与实现"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117148955A (en) * 2023-10-30 2023-12-01 北京阳光金力科技发展有限公司 Data center energy consumption management method based on energy consumption data
CN117148955B (en) * 2023-10-30 2024-02-06 北京阳光金力科技发展有限公司 Data center energy consumption management method based on energy consumption data
CN117421131A (en) * 2023-12-18 2024-01-19 武汉泽塔云科技股份有限公司 Intelligent scheduling method and system for monitoring power consumption load of server
CN117421131B (en) * 2023-12-18 2024-03-26 武汉泽塔云科技股份有限公司 Intelligent scheduling method and system for monitoring power consumption load of server

Also Published As

Publication number Publication date
WO2021051441A1 (en) 2021-03-25

Similar Documents

Publication Publication Date Title
Sarood et al. Maximizing throughput of overprovisioned hpc data centers under a strict power budget
Donyanavard et al. SPARTA: Runtime task allocation for energy efficient heterogeneous many-cores
US10289184B2 (en) Methods of achieving cognizant power management
US9207993B2 (en) Dynamic application placement based on cost and availability of energy in datacenters
US9098351B2 (en) Energy-aware job scheduling for cluster environments
US9037880B2 (en) Method and system for automated application layer power management solution for serverside applications
US20100318827A1 (en) Energy use profiling for workload transfer
Patel et al. What does power consumption behavior of hpc jobs reveal?: Demystifying, quantifying, and predicting power consumption characteristics
Zhou et al. ECMS: An edge intelligent energy efficient model in mobile edge computing
Lent Analysis of an energy proportional data center
CN110618861A (en) Hadoop cluster energy-saving system
Zhang et al. An Energy and SLA‐Aware Resource Management Strategy in Cloud Data Centers
Tian et al. Modeling and analyzing power management policies in server farms using stochastic petri nets
Teodoro et al. Energy efficiency management in computational grids through energy-aware scheduling
Sampaio et al. Optimizing energy-efficiency in high-available scientific cloud environments
Moghaddam Dynamic energy and reliability management in network-on-chip based chip multiprocessors
Guo et al. Heuristic algorithms for energy and performance dynamic optimization in cloud computing
Huixi et al. A combination of host overloading detection and virtual machine selection in cloud server consolidation based on learning method
Diouri et al. Sesames: a smart-grid based framework for consuming less and better in extreme-scale infrastructures
Jiang et al. Power Aware Job Scheduling in Multi-Processor System with Service Level Agreements Constraints.
Tang et al. Exploring hardware profile-guided green datacenter scheduling
US20230418688A1 (en) Energy efficient computing workload placement
TWI476694B (en) Virtual Operation Resource Water Level Early Warning and Energy Saving Control System and Method
Gan et al. Energy-efficient optimization strategy for simulation task scheduling based on supercomputing platform
Antici et al. PM100: A Job Power Consumption Dataset of a Large-scale Production HPC System

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191227