CN108647135B - Hadoop parameter automatic tuning method based on micro-operation - Google Patents
- Publication number
- CN108647135B (application CN201810426699.6A)
- Authority
- CN
- China
- Prior art keywords
- micro
- stage
- model
- phase
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3447—Performance evaluation by modeling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3476—Data logging
Abstract
The invention belongs to the technical field of cloud computing, and particularly relates to a Hadoop parameter automatic tuning method based on micro-operations. The method decouples a MapReduce task into different types of micro-operations at different stages, establishes a model of single execution time versus single-execution data volume for the selected micro-operations, reconstructs and combines the job's running process from the established models to obtain the relation between stage running time and system parameters, and finally searches the model for the parameter combination that minimizes task running time. The method does not change with job type or cluster configuration, and its search for optimal parameters is fast, efficient, and portable.
Description
Technical Field
The invention belongs to the technical field of cloud computing, and particularly relates to a Hadoop parameter automatic tuning method based on micro-operation reconstruction.
Background
Resource optimization of distributed platforms has long been one of the hot topics users focus on, and optimizing job running time in particular has been an important research subject. With the popularization of cloud services in recent years, shortening job running time helps tenants improve working efficiency and reduce rental cost, while helping providers maximize resource utilization.
In recent years, the Hadoop distributed computing platform has seen mature and wide industrial adoption, while in academia, optimization of the Hadoop platform in various aspects remains a key research subject. As Hadoop versions are continuously updated, computing efficiency is no longer the main concern; instead, huge production clusters incur expensive operation and maintenance costs, and unreasonable allocation of cloud resources increasingly drives up company costs. Cost optimization of cloud distributed computing frameworks is therefore one of the problems large IT companies currently need to solve.
There have been some research efforts directed at optimizing hadoop job run time:
1) A Hadoop parameter automatic tuning method and system based on machine learning. Shi, Feng Dan, Ruili. CN106202431A, 2016.
By clustering the resource-consumption characteristics of different job types and establishing separate performance models, this method automatically identifies the parameters that most affect each type of application and gives quantitative suggested parameter values. It effectively addresses the heavy dependence on user experience in existing rule-of-thumb methods and the limitation of qualitative parameter suggestions.
2) An iterative MapReduce job parameter auto-tuning method. Zhao Taosen, Gao Xiaojie, Tang Hua. CN106326005A, 2017.
This method executes the actual job and evaluates the execution effect, determines a new parameter-configuration combination in the parameter space, and then executes the job iteratively until the termination criterion is met.
Judging from the patents of the last two years, the main emphasis has been on characterizing how parameter changes affect job running time. Another important concern is the portability of Hadoop parameter auto-tuning across platforms: a method that can quickly establish a tuning model for different job types on different clusters is of great practical significance.
Disclosure of Invention
The invention aims to provide a Hadoop 2.0 parameter automatic tuning method based on micro-operation reconstruction, proposed in view of the great practical significance of Hadoop parameter auto-tuning amid the rise of cloud computing services.
The technical scheme adopted by the invention is as follows:
a Hadoop parameter automatic tuning method based on micro-operation is used for optimizing parameter combination during MapReduce operation execution, and is characterized by comprising the following steps:
s1, establishing a micro-operation model:
s11, selecting micro-operations: decoupling a MapReduce task, selecting the single memory write operation cm_micro_op and the single disk write operation cd_micro_op of the collect stage in the Map task, and taking the shuffle-stage single memory write operation sm_micro_op, the single memory-spill disk write operation sd_micro_op, and the single file-merge disk write operation merge_micro_op in the Reduce task as the micro-operations;
s12, determining the parameter change space influencing the micro-operation according to the micro-operation selected in the step S11;
s13, since the data volume processed by a single micro-operation differs with parameter values, taking discrete values along each dimension of the parameter space, executing actual jobs as the benchmark test of the micro-operation model, and testing the speed of a single micro-operation when processing different data volumes;
s14, collecting execution logs of the benchmark test cases at each stage under different parameter conditions, and establishing a model of single execution time versus single-execution data volume for the single disk write and single memory write operations of each stage:
T_micro_op = α * D_micro_op + β

where T_micro_op denotes the micro-operation execution time, D_micro_op denotes the data volume processed by a single execution of the micro-operation, and α and β are model parameters;
s2, reconstructing and combining the micro-operation model according to the operation process of the collection phase to obtain the relation between the phase operation time and the system parameters:
s21, establishing the relationship between the micro-operation time and the system parameters influencing the operation according to the model of the step S14;
s22, reconstructing the collection stage based on the micro-operation to obtain the running process of the actual collection stage, and obtaining the relation between the relevant parameters of the collection stage and the running time of the stage;
s3, reconstructing and combining the micro-operation model according to the operation process of the shuffling stage to obtain the relationship between the stage operation time and the system parameters:
s31, establishing the relationship between the micro-operation time and the system parameters influencing the operation according to the model of the step S14;
s32, reconstructing the shuffling stage based on the micro-operation to obtain the running process of the actual shuffling stage, and obtaining the relation between the relevant parameters of the shuffling stage and the running time of the stage;
s33, independently modeling the execution time of the sort-write stage in the Reduce task: taking discrete values in the parameter space determined by the number of memory-spill disk writes and the data volume, executing actual job tasks, testing the execution time of this stage under different parameters and its relation to the number of shuffle-stage memory-spill disk writes and the data volume processed by the stage, and establishing a model of sort-write stage execution time against the number of memory-spill disk writes and the total data volume processed by the stage:
T_sw_phase = D_sw_input * (N_spill * α_sw_phase + β_sw_phase)

where T_sw_phase denotes the sort-write stage running time, D_sw_input denotes the input data volume of a single reduce task, N_spill denotes the number of memory-spill disk writes in the shuffle stage, and α_sw_phase and β_sw_phase are model parameters;
s4, finding the parameter combination in the model that minimizes task running time: using a search optimization algorithm to obtain the parameter combination with the shortest stage execution time in the model, and searching each stage to obtain its optimal parameter combination.
The invention has the beneficial effects that:
(1) A fine-grained micro-operation model that accurately characterizes the influence of parameter changes is provided. The model intuitively and accurately depicts the influence of system-parameter changes on execution time, and makes it convenient to analyze, from the viewpoint of data flow, how job execution time changes when multiple parameters vary simultaneously.
(2) A strategy for micro-operation reconstruction according to the job's operating principle is presented. The strategy does not change with job type or cluster configuration; at the same time, its search for optimal parameters is fast, efficient, and portable. It can serve as a description method and analysis framework for the optimization problem, characterizing how parameters act and establishing models at a finer granularity to search for the optimal parameter combination.
Drawings
Fig. 1 is a logic block diagram of MapReduce job in the present invention.
Detailed Description
The technical scheme of the invention is described in detail in the following with the accompanying drawings:
Step one: for the fine-grained operations directly affected by parameters, establish different models for different operation types. The core steps are as follows:
1) Decouple the MapReduce task in the manner shown in Figure 1 to determine the different types of micro-operations at different stages: the collect-phase single memory write operation cm_micro_op and the collect-phase single disk write operation cd_micro_op; the shuffle-phase single memory write operation sm_micro_op, the shuffle-phase single spill disk write operation sd_micro_op, and the shuffle-phase single merge disk write operation merge_micro_op.
2) Determine the parameter variation space affecting the selected micro-operations. The parameters affecting cm_micro_op and cd_micro_op are io.sort.mb and sort.spill.percentage, the value space being their respective value ranges; the parameters affecting sm_micro_op, sd_micro_op and merge_micro_op are reduce.java.ops, shuffle.input.buffer.percentage, shuffle.merge.percentage and io.sort.factor, the value space being their respective variation ranges.
3) Since the data volume processed by a single micro-operation differs with parameter values, take discrete values along each dimension of the parameter space, execute actual jobs as the benchmark test of the micro-operation model, and test the speed of a single micro-operation when processing different data volumes.
4) Collect the execution logs of the benchmark test cases at each stage under different parameter conditions, and establish a model of single execution time versus single-execution data volume for the single disk write and single memory write operations of each stage.
T_micro_op = α * D_micro_op + β

The above equation is the linear model of a micro-operation's single execution time versus its single-execution data volume. T_micro_op denotes the micro-operation execution time, D_micro_op denotes the data volume processed by a single execution, and α and β are model parameters.
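The per-micro-operation linear model above can be fitted by ordinary least squares over the (data volume, time) pairs parsed from the benchmark logs. The sketch below is illustrative only; it is not part of the patent, and the sample numbers are hypothetical:

```python
def fit_linear(samples):
    """Ordinary least squares fit of the micro-operation model T = alpha * D + beta.

    samples: (D, T) pairs, where D is the data volume processed by one
    execution of the micro-operation and T is the measured execution time.
    """
    n = len(samples)
    sum_d = sum(d for d, _ in samples)
    sum_t = sum(t for _, t in samples)
    sum_dd = sum(d * d for d, _ in samples)
    sum_dt = sum(d * t for d, t in samples)
    alpha = (n * sum_dt - sum_d * sum_t) / (n * sum_dd - sum_d ** 2)
    beta = (sum_t - alpha * sum_d) / n
    return alpha, beta

# Hypothetical benchmark samples (MB processed, ms elapsed) parsed from task logs.
samples = [(10, 6.0), (20, 11.0), (40, 21.0), (80, 41.0)]
alpha, beta = fit_linear(samples)  # these points lie exactly on T = 0.5 * D + 1.0
```

One such fit is performed per micro-operation type and per stage, yielding the α and β constants used in the reconstruction steps below.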
Step two: reconstructing and combining the micro-operation model according to the operation process of the collection phase to obtain the relationship between the phase operation time and the system parameters, wherein the core steps are as follows:
1) the data volume processed by a single micro-operation is determined by a plurality of parameters, and the relationship between the micro-operation time and the system parameters influencing the operation is established through the relationship between the collection phase micro-operation execution time and the data volume established in the step one.
2) Reconstructing the collection phase shown in FIG. 1 based on the micro-operation to obtain the actual running process of the collection phase, and depicting the relationship between the relevant parameters of the collection phase and the execution time of the collection phase.
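As an illustrative sketch of the reconstruction idea (not the patent's actual formulas), collect-phase time can be assembled from the fitted micro-operation models once the parameter values fix the spill size. Here it is assumed that each spill writes io.sort.mb * sort.spill.percentage MB to disk, and all numeric constants are hypothetical:

```python
import math

def estimate_collect_time(map_output_mb, io_sort_mb, spill_percent,
                          alpha_mem, beta_mem, alpha_disk, beta_disk):
    """Reconstruct collect-phase time from the two micro-operation models.

    Assumption (illustrative): every record passes through the in-memory
    buffer (cm_micro_op), and each spill writes a buffer of
    io_sort_mb * spill_percent MB to disk (cd_micro_op).
    alpha_*/beta_* are the fitted linear-model parameters.
    """
    spill_size = io_sort_mb * spill_percent           # MB written per spill
    n_spills = math.ceil(map_output_mb / spill_size)  # number of cd_micro_op
    t_mem = alpha_mem * map_output_mb + beta_mem      # aggregate memory writes
    t_disk = n_spills * (alpha_disk * spill_size + beta_disk)
    return t_mem + t_disk

# Hypothetical example: 500 MB of map output, io.sort.mb = 100, threshold 0.8.
t = estimate_collect_time(500, 100, 0.8, 0.2, 1.0, 0.5, 2.0)
```

The point of the reconstruction is that changing io.sort.mb or the spill threshold changes the spill count and spill size, and the model propagates that change into phase time without rerunning the job.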
Step three: reconstructing and combining the micro-operation model according to the running process of the shuffle phase to obtain the relation between the phase running time and the system parameters, wherein the core steps are as follows:
1) the multiple parameters jointly determine the data volume processed by a single micro-operation, and the relationship between the micro-operation time and the system parameters influencing the operation is established through the relationship between the execution time and the data volume of the shuffle phase micro-operation established in the step one.
2) Reconstructing the shuffle phase shown in FIG. 1 based on the micro-operation to obtain the running process of the actual shuffle phase, and obtaining the relationship between the parameters related to the shuffle phase and the execution time of the shuffle phase
3) Model the execution time of the sort_write phase in the reduce task independently: take discrete values in the parameter space determined by the number of spills and the data volume, execute actual job tasks, test the execution time of this phase under different parameters and its relation to the number of shuffle-phase spills and the data volume processed by the phase, and establish a model of sort_write phase execution time against the number of spills and the total data volume processed by the phase:
T_sw_phase = D_sw_input * (N_spill * α_sw_phase + β_sw_phase)

T_sw_phase denotes the sort_write phase running time, D_sw_input denotes the input data volume of a single reduce task, N_spill denotes the number of shuffle-phase spills, and α_sw_phase and β_sw_phase are model parameters.
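Once α_sw_phase and β_sw_phase have been fitted, the sort_write model above is a one-line function. The sketch below is illustrative (not from the patent), and the constants are hypothetical:

```python
def sort_write_time(d_sw_input_mb: float, n_spill: int,
                    alpha_sw: float, beta_sw: float) -> float:
    """T_sw_phase = D_sw_input * (N_spill * alpha_sw + beta_sw).

    d_sw_input_mb: input data volume of a single reduce task (MB)
    n_spill: number of shuffle-phase spills (memory-spill disk writes)
    alpha_sw, beta_sw: fitted model parameters (hypothetical values below)
    """
    return d_sw_input_mb * (n_spill * alpha_sw + beta_sw)

# A reduce task reading 100 MB that saw 4 spills, with hypothetical
# constants alpha_sw = 0.1 and beta_sw = 0.5.
t = sort_write_time(100.0, 4, 0.1, 0.5)
```

The linear dependence on N_spill captures why shuffle-stage parameters that change the spill count also change sort_write time, which is what couples the two phases in the search.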
Step four: the method is characterized in that a parameter combination which enables the task running time to be shortest in a model is found, and the core steps are as follows:
1) through the first step to the third step, a description method and an analysis framework for establishing an optimization problem are obtained, the relationship between a change parameter and an analysis target is described from the perspective of finer granularity, and the framework can be adapted to different algorithms.
2) On the basis of the model, various search optimization algorithms can be applied to obtain the parameter combination with the shortest execution time of the stages in the model.
3) And searching in different stages to obtain respective optimal parameter combinations. And completing parameter tuning to obtain the optimal combination of all relevant parameters.
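The search described above can use any optimization algorithm; the patent does not prescribe one. As a sketch, an exhaustive grid search over a discretized parameter space picks the combination with the smallest modeled time. The parameter names and the toy model below are purely illustrative:

```python
from itertools import product

def grid_search(param_grid, phase_time_model):
    """Return the parameter combination minimizing the modeled phase time.

    param_grid: dict mapping parameter name -> list of candidate values
    phase_time_model: callable taking a {name: value} dict, returning time
    """
    names = list(param_grid)
    best_combo, best_time = None, float("inf")
    for values in product(*(param_grid[n] for n in names)):
        combo = dict(zip(names, values))
        t = phase_time_model(combo)
        if t < best_time:
            best_combo, best_time = combo, t
    return best_combo, best_time

# Toy stand-in for a fitted stage model (purely illustrative, not from the patent).
model = lambda p: abs(p["io.sort.mb"] - 150) + abs(p["spill.percent"] - 0.8) * 100
grid = {"io.sort.mb": [50, 100, 150, 200], "spill.percent": [0.6, 0.8, 0.9]}
best, t = grid_search(grid, model)
```

Because evaluation happens against the fitted models rather than by running jobs, even an exhaustive search over a modest discretization is cheap, which is what makes the per-stage search fast.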
Claims (1)
1. A Hadoop parameter automatic tuning method based on micro-operation is used for optimizing parameter combination during MapReduce operation execution, and is characterized by comprising the following steps:
s1, establishing a micro-operation model:
s11, selecting micro-operations: decoupling a MapReduce task, selecting the single memory write operation cm_micro_op and the single disk write operation cd_micro_op of the collect stage in the Map task, and taking the shuffle-stage single memory write operation sm_micro_op, the single memory-spill disk write operation sd_micro_op, and the single file-merge disk write operation merge_micro_op in the Reduce task as the micro-operations;
s12, determining, according to the micro-operations selected in step S11, the parameter variation space affecting the micro-operations, specifically: the parameters affecting cm_micro_op and cd_micro_op are io.sort.mb and sort.spill.percentage, the value space being their respective value ranges; the parameters affecting sm_micro_op, sd_micro_op and merge_micro_op are reduce.java.ops, shuffle.input.buffer.percentage, shuffle.merge.percentage and io.sort.factor, the value space being their respective variation ranges;
s13, since the data volume processed by a single micro-operation differs with parameter values, taking discrete values along each dimension of the parameter space, executing actual jobs as the benchmark test of the micro-operation model, and testing the speed of a single micro-operation when processing different data volumes;
s14, collecting execution logs of the benchmark test cases at each stage under different parameter conditions, and establishing a model of single execution time versus single-execution data volume for the single disk write and single memory write operations of each stage:
T_micro_op = α * D_micro_op + β

where T_micro_op denotes the micro-operation execution time, D_micro_op denotes the data volume processed by a single execution of the micro-operation, and α and β are model parameters;
s2, reconstructing and combining the micro-operation model according to the operation process of the collection phase to obtain the relation between the phase operation time and the system parameters:
s21, establishing the relationship between the micro-operation time and the system parameters influencing the operation according to the model of the step S14;
s22, reconstructing the collection stage based on the micro-operation to obtain the running process of the actual collection stage, and obtaining the relation between the relevant parameters of the collection stage and the running time of the stage;
s3, reconstructing and combining the micro-operation model according to the operation process of the shuffling stage to obtain the relationship between the stage operation time and the system parameters:
s31, establishing the relationship between the micro-operation time and the system parameters influencing the operation according to the model of the step S14;
s32, reconstructing the shuffling stage based on the micro-operation to obtain the running process of the actual shuffling stage, and obtaining the relation between the relevant parameters of the shuffling stage and the running time of the stage;
s33, independently modeling the execution time of the sort-write stage in the Reduce task: taking discrete values in the parameter space determined by the number of memory-spill disk writes and the data volume, executing actual job tasks, testing the execution time of this stage under different parameters and its relation to the number of shuffle-stage memory-spill disk writes and the data volume processed by the stage, and establishing a model of sort-write stage execution time against the number of memory-spill disk writes and the total data volume processed by the stage:
T_sw_phase = D_sw_input * (N_spill * α_sw_phase + β_sw_phase)

where T_sw_phase denotes the sort-write stage running time, D_sw_input denotes the input data volume of a single reduce task, N_spill denotes the number of memory-spill disk writes in the shuffle stage, and α_sw_phase and β_sw_phase are model parameters;
s4, finding the parameter combination in the model that minimizes task running time: using a search optimization algorithm to obtain the parameter combination with the shortest stage execution time in the model, and searching each stage to obtain its optimal parameter combination.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810426699.6A CN108647135B (en) | 2018-05-07 | 2018-05-07 | Hadoop parameter automatic tuning method based on micro-operation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108647135A CN108647135A (en) | 2018-10-12 |
CN108647135B true CN108647135B (en) | 2021-02-12 |
Family
ID=63749200
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810426699.6A Expired - Fee Related CN108647135B (en) | 2018-05-07 | 2018-05-07 | Hadoop parameter automatic tuning method based on micro-operation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108647135B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110427619B (en) * | 2019-07-23 | 2022-06-21 | 西南交通大学 | Chinese text automatic proofreading method based on multi-channel fusion and reordering |
CN111858003B (en) * | 2020-07-16 | 2021-05-28 | 山东大学 | Hadoop optimal parameter evaluation method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104361183A (en) * | 2014-11-21 | 2015-02-18 | 中国人民解放军国防科学技术大学 | Microprocessor micro system structure parameter optimizing method based on simulator |
CN106383746A (en) * | 2016-08-30 | 2017-02-08 | 北京航空航天大学 | Configuration parameter determination method and apparatus of big data processing system |
US9665404B2 (en) * | 2013-11-26 | 2017-05-30 | International Business Machines Corporation | Optimization of map-reduce shuffle performance through shuffler I/O pipeline actions and planning |
CN107612886A (en) * | 2017-08-15 | 2018-01-19 | 中国科学院大学 | A kind of Spark platforms Shuffle process compresses algorithm decision-making techniques |
Non-Patent Citations (2)
- Shivnath Babu. "Towards automatic optimization of MapReduce programs." SoCC '10: Proceedings of the 1st ACM Symposium on Cloud Computing, 2010.
- Tong Ying. "Hadoop Parameter Tuning Method Based on Machine Learning" (in Chinese). China Master's Theses Full-text Database, 2017-01-15.
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20210212 |