CN108647135B - Hadoop parameter automatic tuning method based on micro-operation - Google Patents

Hadoop parameter automatic tuning method based on micro-operation

Info

Publication number
CN108647135B
CN108647135B CN201810426699.6A
Authority
CN
China
Prior art keywords
micro
stage
model
phase
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201810426699.6A
Other languages
Chinese (zh)
Other versions
CN108647135A (en)
Inventor
滕飞
李耘书
李天瑞
杜圣东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Jiaotong University filed Critical Southwest Jiaotong University
Priority to CN201810426699.6A priority Critical patent/CN108647135B/en
Publication of CN108647135A publication Critical patent/CN108647135A/en
Application granted granted Critical
Publication of CN108647135B publication Critical patent/CN108647135B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3447Performance evaluation by modeling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466Performance evaluation by tracing or monitoring
    • G06F11/3476Data logging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Debugging And Monitoring (AREA)
  • Organic Low-Molecular-Weight Compounds And Preparation Thereof (AREA)

Abstract

The invention belongs to the technical field of cloud computing, and particularly relates to a Hadoop parameter automatic tuning method based on micro-operations. The method decouples a MapReduce task into different types of micro-operations at different stages, establishes for each selected micro-operation a model relating single execution time to single processed data volume, reconstructs and combines the job process according to the established models to obtain the relation between stage running time and system parameters, and finally searches the models for the parameter combination that minimizes task running time. The method does not change with job type or cluster configuration, and its search for the optimal parameters is fast, efficient, and portable.

Description

Hadoop parameter automatic tuning method based on micro-operation
Technical Field
The invention belongs to the technical field of cloud computing, and particularly relates to a Hadoop parameter automatic tuning method based on micro-operation reconstruction.
Background
Resource optimization of distributed platforms has long been one of the hot topics for users; in particular, optimizing job running time has always been an important research subject. With the popularization of cloud services in recent years, shortening job running time helps tenants improve working efficiency and reduce rental costs, while helping providers maximize resource utilization.
In recent years, Hadoop distributed computing platforms have matured and been widely applied in industry, while in academia, optimization of various aspects of the Hadoop platform remains a key research subject. As Hadoop versions are continuously updated, computing efficiency is no longer the main concern; instead, huge production clusters gradually incur expensive operation and maintenance costs, and unreasonable allocation of cloud resources increasingly highlights companies' cost problems. The cost optimization of cloud distributed computing frameworks during computation is therefore one of the problems large IT companies currently need to solve.
There have been some research efforts directed at optimizing Hadoop job run time:
1) Method and system for automatically tuning Hadoop parameters based on machine learning, Shi, Feng Dan, Rui Li. CN106202431A. 2016.
By clustering the resource-consumption characteristics of different job types and establishing different performance models, this method automatically identifies the parameters that strongly influence each type of application and gives quantitative suggested parameter values. It effectively addresses the heavy dependence of existing empirical-rule-based methods on user experience and the limitations of qualitative parameter suggestions.
2) An iterative MapReduce job parameter auto-tuning method, Zhao Taosen, Gao Xiaojie, Tang Hua. CN106326005A. 2017.
This method executes the actual job and evaluates the execution effect, determines a new parameter configuration combination in the parameter space, and then executes the job iteratively until a termination criterion is met.
Judging from the patents of the last two years, the main emphasis has been on characterizing how parameter changes influence running time. Another important concern is the portability of automatic Hadoop parameter tuning across platforms. A method that can quickly establish a tuning model for different job types on different clusters would therefore have important practical significance.
Disclosure of Invention
The invention aims to provide a Hadoop 2.0 parameter automatic tuning method based on micro-operation reconstruction, proposed in view of the important practical significance of automatic Hadoop parameter tuning amid the current rise of cloud computing services.
The technical scheme adopted by the invention is as follows:
a Hadoop parameter automatic tuning method based on micro-operation is used for optimizing parameter combination during MapReduce operation execution, and is characterized by comprising the following steps:
s1, establishing a micro-operation model:
s11, selecting a micro operation: decoupling a MapReduce task, selecting a single memory write operation cm _ micro _ op and a single disk write operation cd _ micro _ op in a collection stage in the Map task, and taking a shuffle stage single memory write operation sm _ micro _ op, a single memory overflow disk write operation sd _ micro _ op and a single file merge disk write operation merge _ micro _ op in a Reduce task as micro-operations;
s12, determining the parameter change space influencing the micro-operation according to the micro-operation selected in the step S11;
s13, determining the difference of data volume processed by single micro-operation according to different parameter values, discretely taking values in each dimension in a parameter space, executing actual operation as a micro-operation model benchmark test, and testing the speed of the single micro-operation under the condition of processing different data volumes;
s14, collecting execution logs of the benchmark test case under different parameter conditions at different stages, and respectively establishing models of single execution time and single processing data volume for single disk write operation and single memory write operation at different stages:
T_micro_op = α * D_micro_op + β
T_micro_op denotes the micro-operation execution time, D_micro_op denotes the data volume processed by a single micro-operation, and α and β are model parameters;
s2, reconstructing and combining the micro-operation model according to the operation process of the collection phase to obtain the relation between the phase operation time and the system parameters:
s21, establishing the relationship between the micro-operation time and the system parameters influencing the operation according to the model of the step S14;
s22, reconstructing the collection stage based on the micro-operation to obtain the running process of the actual collection stage, and obtaining the relation between the relevant parameters of the collection stage and the running time of the stage;
s3, reconstructing and combining the micro-operation model according to the operation process of the shuffling stage to obtain the relationship between the stage operation time and the system parameters:
s31, establishing the relationship between the micro-operation time and the system parameters influencing the operation according to the model of the step S14;
s32, reconstructing the shuffling stage based on the micro-operation to obtain the running process of the actual shuffling stage, and obtaining the relation between the relevant parameters of the shuffling stage and the running time of the stage;
s33, independently modeling the execution time of the sequencing write stage in the Reduce task, discretely taking values in a parameter space determined by the write times of the memory overflow disk and the data volume, executing the actual job task, testing the execution time of the stage under different parameters and the relationship between the write times of the memory overflow disk and the data volume processed by the stage in the shuffle stage, and establishing a model of the execution time of the sequencing write stage, the write times of the memory overflow disk and the total data volume processed by the stage:
T_sw_phase = D_sw_input * (N_spill * α_sw_phase + β_sw_phase)
T_sw_phase denotes the sequencing write (sort_write) stage running time, D_sw_input denotes the input data volume of a single reduce task, N_spill denotes the number of memory-overflow disk writes in the shuffle stage, and α_sw_phase and β_sw_phase are model parameters;
s4, finding the parameter combination which enables the task running time to be shortest in the model: and obtaining the parameter combination with the shortest execution time in the stages in the model by adopting a search optimization algorithm, and searching in different stages to obtain the respective optimal parameter combination.
The invention has the beneficial effects that:
(1) A fine-grained micro-operation model capable of accurately depicting the influence of parameter changes is provided. The model can intuitively and accurately depict the influence of system parameter changes on execution time, and facilitates accurate analysis, from the viewpoint of data flow, of how job execution time changes when multiple parameters change simultaneously.
(2) A strategy for performing micro-operation reconstruction according to the operating principle is presented. The method does not change with job type or cluster configuration, and the search for the optimal parameters is fast, efficient, and portable. It can serve as a description method and analysis framework for the optimization problem, depicting the principle of parameter variation and establishing models at a finer granularity to search for the optimal parameter combination.
Drawings
FIG. 1 is a logic block diagram of a MapReduce job in the present invention.
Detailed Description
The technical scheme of the invention is described in detail below with reference to the accompanying drawings:
the method comprises the following steps: aiming at fine-grained operation directly influenced by parameters, different models are established according to different operation types, and the core steps are as follows:
1) Decoupling the MapReduce task in the manner shown in FIG. 1 to determine the different types of micro-operations at different stages: the collection phase single memory write operation cm_micro_op and the collection phase single disk write operation cd_micro_op; the shuffle phase single memory write operation sm_micro_op, the shuffle phase single spill disk write operation sd_micro_op, and the shuffle phase single merge disk write operation merge_micro_op.
2) Determining the parameter change space influencing the selected micro-operations. The parameters affecting cm_micro_op and cd_micro_op are io.sort.mb and sort.spill.percentage, and the value space is their respective value range; the parameters affecting sm_micro_op, sd_micro_op and merge_micro_op are reduce.java.ops, shuffle.input.buffer.percentage, shuffle.merge.percentage and io.sort.factor, and the value space is their respective variation range.
3) Determining how the data volume processed by a single micro-operation differs under different parameter values: taking discrete values along each dimension of the parameter space, executing actual jobs as the benchmark test of the micro-operation model, and testing the speed of a single micro-operation when processing different data volumes.
4) Collecting the execution logs of the benchmark test cases at different stages under different parameter conditions, and establishing, for the single disk write operation and the single memory write operation of each stage, a model of single execution time versus single processed data volume.
T_micro_op = α * D_micro_op + β
The above equation is the established linear model relating a micro-operation's single execution time to its single processed data volume. T_micro_op denotes the micro-operation execution time, D_micro_op denotes the data volume processed in a single pass of the micro-operation, and α and β are model parameters.
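As a concrete illustration, the α and β of each micro-operation can be fitted from the benchmark logs by ordinary least squares. This is a minimal sketch; the sample values are hypothetical, not measurements from the invention's benchmark:

```python
# Least-squares fit of the micro-operation model T = alpha * D + beta.
# `samples` is a hypothetical benchmark log of (bytes processed, seconds)
# pairs for one micro-operation type; the numbers are illustrative only.

def fit_micro_op_model(samples):
    """Fit T = alpha * D + beta by ordinary least squares."""
    n = len(samples)
    sum_d = sum(d for d, _ in samples)
    sum_t = sum(t for _, t in samples)
    sum_dd = sum(d * d for d, _ in samples)
    sum_dt = sum(d * t for d, t in samples)
    alpha = (n * sum_dt - sum_d * sum_t) / (n * sum_dd - sum_d * sum_d)
    beta = (sum_t - alpha * sum_d) / n
    return alpha, beta

samples = [(16e6, 0.20), (32e6, 0.38), (64e6, 0.74), (128e6, 1.46)]
alpha, beta = fit_micro_op_model(samples)  # slope and intercept of the model
```

A separate (α, β) pair would be fitted for each micro-operation type and each stage.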
Step two: reconstructing and combining the micro-operation model according to the operation process of the collection phase to obtain the relationship between the phase operation time and the system parameters, wherein the core steps are as follows:
1) The data volume processed by a single micro-operation is jointly determined by multiple parameters. Through the relationship between collection phase micro-operation execution time and data volume established in step one, the relationship between micro-operation time and the system parameters influencing it is established.
2) Reconstructing the collection phase shown in FIG. 1 based on the micro-operations to obtain the actual running process of the collection phase, depicting the relationship between the collection phase parameters and the collection phase execution time.
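One way such a reconstruction could compose the collection phase from the two micro-operations is sketched below, under two stated assumptions: every map output byte passes once through the memory buffer (cm_micro_op), and a disk spill (cd_micro_op) occurs each time the buffer fills to io.sort.mb * sort.spill.percentage. The spill rule and the cost constants are illustrative assumptions, not the patent's fitted model:

```python
import math

def collection_phase_time(map_output_bytes, io_sort_mb, spill_percentage,
                          t_cm_per_byte, t_cd_per_spill):
    """Predict collection phase time as buffer-write cost plus spill cost.

    Assumes one memory write per byte and one disk spill each time the
    buffer reaches io.sort.mb * spill_percentage bytes (illustrative rule).
    """
    spill_threshold = io_sort_mb * 1024 * 1024 * spill_percentage  # bytes
    n_spills = math.ceil(map_output_bytes / spill_threshold)
    return map_output_bytes * t_cm_per_byte + n_spills * t_cd_per_spill

# 200 MB of map output, io.sort.mb = 100, spill at 80% full; the per-byte
# and per-spill costs stand in for fitted micro-operation model values.
t_collect = collection_phase_time(200 * 1024 * 1024, 100, 0.8, 1e-9, 0.5)
```

Because the parameters enter only through the spill threshold and spill count, changing io.sort.mb or sort.spill.percentage changes the predicted stage time directly.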
Step three: reconstructing and combining the micro-operation model according to the running process of the shuffle phase to obtain the relation between the phase running time and the system parameters, wherein the core steps are as follows:
1) Multiple parameters jointly determine the data volume processed by a single micro-operation. Through the relationship between shuffle phase micro-operation execution time and data volume established in step one, the relationship between micro-operation time and the system parameters influencing it is established.
2) Reconstructing the shuffle phase shown in FIG. 1 based on the micro-operations to obtain the actual running process of the shuffle phase, and obtaining the relationship between the shuffle phase parameters and the shuffle phase execution time.
3) Independently modeling the execution time of the sort_write phase in the reduce task: taking discrete values in the parameter space determined by the number of spills and the data volume, executing actual job tasks, testing the stage's execution time under different parameters and its relationship with the number of shuffle phase spills and the data volume processed by the stage, and establishing a model of the sort_write phase execution time versus the number of spills and the total data processed by the stage:
T_sw_phase = D_sw_input * (N_spill * α_sw_phase + β_sw_phase)
T_sw_phase denotes the sort_write phase running time, D_sw_input denotes the input data volume of a single reduce task, N_spill denotes the number of shuffle phase spills, and α_sw_phase and β_sw_phase are model parameters.
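Once α_sw_phase and β_sw_phase have been fitted, the sort_write model is evaluated directly; the coefficient values below are illustrative placeholders rather than fitted results:

```python
def sort_write_time(d_sw_input, n_spill, alpha_sw_phase, beta_sw_phase):
    """T_sw_phase = D_sw_input * (N_spill * alpha_sw_phase + beta_sw_phase)."""
    return d_sw_input * (n_spill * alpha_sw_phase + beta_sw_phase)

# 100 MB of reduce input and 4 shuffle-phase spills, with placeholder
# coefficients (seconds per byte per spill, seconds per byte).
t_sw = sort_write_time(1e8, 4, 2e-9, 5e-9)
```

Note that the per-byte cost grows with N_spill, which is how the model captures the extra merge work caused by more spill files.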
Step four: finding the parameter combination in the model that minimizes the task running time. The core steps are as follows:
1) Steps one to three yield a description method and an analysis framework for the optimization problem, depicting the relationship between the varied parameters and the analysis target at a finer granularity; the framework can accommodate different algorithms.
2) On the basis of the model, various search optimization algorithms can be applied to obtain the parameter combination with the shortest execution time of the stages in the model.
3) Searching each stage to obtain its respective optimal parameter combination, thereby completing the parameter tuning and obtaining the optimal combination of all relevant parameters.
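One simple instance of the search in steps 2) and 3) is an exhaustive grid search over the discretized parameter space; the patent leaves the concrete search optimization algorithm open, and the parameter names and toy cost model below are purely illustrative:

```python
import itertools

def grid_search(stage_time_model, param_grid):
    """Return the (combination, time) minimizing the modeled stage time."""
    names = list(param_grid)
    best_combo, best_time = None, float("inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        combo = dict(zip(names, values))
        t = stage_time_model(combo)
        if t < best_time:
            best_combo, best_time = combo, t
    return best_combo, best_time

# Toy stage-time model with a fictitious optimum at io.sort.mb = 200.
model = lambda p: (p["io.sort.mb"] - 200) ** 2 + p["sort.spill.percentage"]
grid = {"io.sort.mb": [100, 200, 300], "sort.spill.percentage": [0.6, 0.8]}
best, best_t = grid_search(model, grid)
```

Because the model is cheap to evaluate (no actual job runs), any other search optimization algorithm, such as hill climbing or random search, could be substituted for the grid search.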

Claims (1)

1. A Hadoop parameter automatic tuning method based on micro-operation is used for optimizing parameter combination during MapReduce operation execution, and is characterized by comprising the following steps:
s1, establishing a micro-operation model:
s11, selecting a micro operation: decoupling a MapReduce task, selecting a single memory write operation cm _ micro _ op and a single disk write operation cd _ micro _ op in a collection stage in the Map task, and taking a shuffle stage single memory write operation sm _ micro _ op, a single memory overflow disk write operation sd _ micro _ op and a single file merge disk write operation merge _ micro _ op in a Reduce task as micro-operations;
s12, according to the micro-operation selected in the step S11, determining a parameter change space which influences the micro-operation, specifically: parameters affecting the cm _ micro _ op and the cd _ micro _ op are io.sort.mb and sort.spill.percentage, and the value space is the respective value range; parameters affecting sm _ micro _ op, sd _ micro _ op and merge _ micro _ op are reduce.java.ops, shuffle.input.buffer.percentage, shuffle.merge.percentage and io.sort.factor, and the value space is the respective variation range;
s13, determining the difference of data volume processed by single micro-operation according to different parameter values, discretely taking values in each dimension in a parameter space, executing actual operation as a micro-operation model benchmark test, and testing the speed of the single micro-operation under the condition of processing different data volumes;
s14, collecting execution logs of the benchmark test case under different parameter conditions at different stages, and respectively establishing models of single execution time and single processing data volume for single disk write operation and single memory write operation at different stages:
T_micro_op = α * D_micro_op + β
T_micro_op denotes the micro-operation execution time, D_micro_op denotes the data volume processed by a single micro-operation, and α and β are model parameters;
s2, reconstructing and combining the micro-operation model according to the operation process of the collection phase to obtain the relation between the phase operation time and the system parameters:
s21, establishing the relationship between the micro-operation time and the system parameters influencing the operation according to the model of the step S14;
s22, reconstructing the collection stage based on the micro-operation to obtain the running process of the actual collection stage, and obtaining the relation between the relevant parameters of the collection stage and the running time of the stage;
s3, reconstructing and combining the micro-operation model according to the operation process of the shuffling stage to obtain the relationship between the stage operation time and the system parameters:
s31, establishing the relationship between the micro-operation time and the system parameters influencing the operation according to the model of the step S14;
s32, reconstructing the shuffling stage based on the micro-operation to obtain the running process of the actual shuffling stage, and obtaining the relation between the relevant parameters of the shuffling stage and the running time of the stage;
s33, independently modeling the execution time of the sequencing write stage in the Reduce task, discretely taking values in a parameter space determined by the write times of the memory overflow disk and the data volume, executing the actual job task, testing the execution time of the stage under different parameters and the relationship between the write times of the memory overflow disk and the data volume processed by the stage in the shuffle stage, and establishing a model of the execution time of the sequencing write stage, the write times of the memory overflow disk and the total data volume processed by the stage:
T_sw_phase = D_sw_input * (N_spill * α_sw_phase + β_sw_phase)
T_sw_phase denotes the sequencing write (sort_write) stage running time, D_sw_input denotes the input data volume of a single reduce task, N_spill denotes the number of memory-overflow disk writes in the shuffle stage, and α_sw_phase and β_sw_phase are model parameters;
s4, finding out the parameter combination which enables the task running time to be shortest in the model: and obtaining the parameter combination with the shortest execution time in the stages in the model by adopting a search optimization algorithm, and searching in different stages to obtain the respective optimal parameter combination.
CN201810426699.6A 2018-05-07 2018-05-07 Hadoop parameter automatic tuning method based on micro-operation Expired - Fee Related CN108647135B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810426699.6A CN108647135B (en) 2018-05-07 2018-05-07 Hadoop parameter automatic tuning method based on micro-operation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810426699.6A CN108647135B (en) 2018-05-07 2018-05-07 Hadoop parameter automatic tuning method based on micro-operation

Publications (2)

Publication Number Publication Date
CN108647135A CN108647135A (en) 2018-10-12
CN108647135B true CN108647135B (en) 2021-02-12

Family

ID=63749200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810426699.6A Expired - Fee Related CN108647135B (en) 2018-05-07 2018-05-07 Hadoop parameter automatic tuning method based on micro-operation

Country Status (1)

Country Link
CN (1) CN108647135B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427619B (en) * 2019-07-23 2022-06-21 西南交通大学 Chinese text automatic proofreading method based on multi-channel fusion and reordering
CN111858003B (en) * 2020-07-16 2021-05-28 山东大学 Hadoop optimal parameter evaluation method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361183A * 2014-11-21 2015-02-18 National University of Defense Technology Microprocessor micro-architecture parameter optimization method based on a simulator
CN106383746A * 2016-08-30 2017-02-08 Beihang University Configuration parameter determination method and apparatus of big data processing system
US9665404B2 * 2013-11-26 2017-05-30 International Business Machines Corporation Optimization of map-reduce shuffle performance through shuffler I/O pipeline actions and planning
CN107612886A * 2017-08-15 2018-01-19 University of Chinese Academy of Sciences Spark platform Shuffle process compression algorithm decision method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Towards automatic optimization of MapReduce programs"; Shivnath Babu; SoCC '10: Proceedings of the 1st ACM Symposium on Cloud Computing; 20101231; full text *
"Hadoop Parameter Tuning Method Based on Machine Learning" (in Chinese); Tong Ying; China Masters' Theses Full-Text Database; 20170115; full text *

Also Published As

Publication number Publication date
CN108647135A (en) 2018-10-12

Similar Documents

Publication Publication Date Title
CN107612886B (en) Spark platform Shuffle process compression algorithm decision method
Song et al. A hadoop mapreduce performance prediction method
EP3180695A1 (en) Systems and methods for auto-scaling a big data system
CN102799486A (en) Data sampling and partitioning method for MapReduce system
Mustafa et al. A machine learning approach for predicting execution time of spark jobs
CN110727506B (en) SPARK parameter automatic tuning method based on cost model
CN108647135B (en) Hadoop parameter automatic tuning method based on micro-operation
CN106383746A (en) Configuration parameter determination method and apparatus of big data processing system
CN113157421B (en) Distributed cluster resource scheduling method based on user operation flow
Pettijohn et al. User-Centric Heterogeneity-Aware MapReduce Job Provisioning in the Public Cloud
Li et al. An adaptive auto-configuration tool for hadoop
CN106326005A (en) Automatic parameter tuning method for iterative MapReduce operation
Zhou et al. Model and application of product conflict problem with integrated TRIZ and Extenics for low-carbon design
CN113032367A (en) Dynamic load scene-oriented cross-layer configuration parameter collaborative tuning method and system for big data system
CN109635473B (en) Heuristic high-flux material simulation calculation optimization method
CN102546235A (en) Performance diagnosis method and system of web-oriented application under cloud computing environment
CN110377525A (en) A kind of parallel program property-predication system based on feature and machine learning when running
Li et al. Data balancing-based intermediate data partitioning and check point-based cache recovery in Spark environment
Gu et al. Characterizing job-task dependency in cloud workloads using graph learning
JP2015191428A (en) Distributed data processing apparatus, distributed data processing method, and distributed data processing program
Liu et al. A survey of speculative execution strategy in MapReduce
Lu et al. On the auto-tuning of elastic-search based on machine learning
Zhang et al. Getting more for less in optimized mapreduce workflows
CN107621970B (en) Virtual machine migration method and device for heterogeneous CPU
CN111813512A (en) High-energy-efficiency Spark task scheduling method based on dynamic partition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210212