CN103605662A - Distributed computation frame parameter optimizing method, device and system - Google Patents


Info

Publication number
CN103605662A
Authority
CN
China
Prior art keywords
distributed computing
computing framework
historical
framework operation
similarity
Legal status
Granted
Application number
CN201310495879.7A
Other languages
Chinese (zh)
Other versions
CN103605662B (en)
Inventor
方育柯
Current Assignee
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201310495879.7A
Publication of CN103605662A
Priority to PCT/CN2014/084483 (WO2015058578A1)
Application granted
Publication of CN103605662B
Status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G06F 9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs

Abstract

The invention is applicable to the field of IT technology and provides a distributed computing framework parameter optimization method, apparatus, and system. The method comprises: obtaining a currently submitted distributed computing framework job; retrieving, in a distributed computing framework historical job database, historical distributed computing framework jobs of the same type as the distributed computing framework job, the historical job database containing execution information and configuration parameters of historical distributed computing framework jobs; retrieving, among the same-type distributed computing framework jobs, historical distributed computing framework jobs similar to the distributed computing framework job; and optimizing the configuration parameters of the distributed computing framework job according to the configuration parameters of the similar historical distributed computing framework jobs. By adopting the distributed computing framework parameter optimization method, apparatus, and system, the parameter configuration of MapReduce jobs can be made rational, automated, and self-learning.

Description

Distributed computing framework parameter optimization method, apparatus, and system
Technical field
The present invention belongs to the field of IT technology, and in particular relates to a distributed computing framework parameter optimization method, apparatus, and system.
Background technology
As the global information industry continues to converge and develop, network resources and data scales keep growing; in fields such as Internet applications and e-commerce in particular, data volumes show a trend of rapid growth. To solve these data-intensive computing problems, cloud computing has emerged, and the Map/Reduce programming model has been applied more and more widely as an important means of simplifying large-scale data processing. MapReduce is a general software framework proposed by Google for implementing distributed parallel computing tasks; it simplifies concurrent programming on very large clusters composed of commodity computers and can be used for parallel computation over large-scale data sets. In a MapReduce distributed computing system, the parameter optimization strategy for system parameters directly affects both the utilization of the overall system resources and the fairness of resource usage among users. The parameter optimization of system parameters has therefore become a major challenge faced by MapReduce systems.
At present there are two common MapReduce parameter optimization schemes, described as follows:
1. While a MapReduce job is running, monitoring tools such as nmon are used to monitor the performance indicators of the cluster system (such as CPU usage, memory usage, disk and network I/O, etc.) to quickly detect performance bottlenecks, helping engineers improve and optimize parameters for the bottleneck points in a more targeted way.
2. The MapReduce job is pre-executed on a simulation cluster while the characteristics of the task are monitored (including output files, the running time of each stage, the amount of data processed and transferred, and the resources occupied by each worker process); the resource consumption cost of each stage is computed, and the running time of the actual MapReduce job under modified resource parameters is then estimated until the estimated running time falls within an acceptable range, thereby achieving MapReduce performance optimization.
The main drawback of scheme 1 is that it only provides a way to detect performance bottleneck points and gives no concrete performance improvement plan; even when the bottleneck point is known, ordinary users in many cases still do not know how to modify the parameters. Even if a user does know how to modify MapReduce parameters, the scheme is inefficient: many rounds of manual trial and error are needed before performance approaches an optimum. It therefore cannot solve the problem of making MapReduce parameter configuration rational and automated.
The main drawback of scheme 2 is the cost of the additional pre-execution and analysis of the program. Moreover, because the tuning is performed for an individual task, pre-execution must be repeated each time when the types of tasks submitted by users are complex and changeable, which reduces its generality; it therefore cannot solve the problem of making MapReduce parameter configuration self-learning.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a distributed computing framework parameter optimization method, aiming to solve the problem of making MapReduce parameter configuration rational, automated, and self-learning.
In a first aspect, a distributed computing framework parameter optimization method comprises:
obtaining a currently submitted distributed computing framework job;
retrieving, in a pre-established distributed computing framework historical job database, historical distributed computing framework jobs of the same type as the distributed computing framework job, the distributed computing framework historical job database containing execution information and configuration parameters of historical distributed computing framework jobs;
retrieving, among the same-type distributed computing framework jobs, historical distributed computing framework jobs similar to the distributed computing framework job;
optimizing the configuration parameters of the distributed computing framework job according to the configuration parameters of the similar historical distributed computing framework jobs.
In conjunction with the first aspect, retrieving, in the distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job comprises:
when the distributed computing framework job does not carry specified distributed computing framework job parameters, retrieving, in the distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job.
In conjunction with the first aspect, optimizing the configuration parameters of the distributed computing framework job according to the configuration parameters of the similar historical distributed computing framework jobs comprises:
obtaining, among the similar historical distributed computing framework jobs, the historical distributed computing framework job with the highest score, and using the configuration parameters of that highest-scoring historical job as the configuration parameters of the distributed computing framework job, thereby optimizing the configuration parameters of the distributed computing framework job.
In conjunction with the first aspect, when execution of the distributed computing framework job is complete, the execution information and configuration parameters of the distributed computing framework job are collected;
the execution information of the distributed computing framework job is scored, and the distributed computing framework job is saved into the distributed computing framework historical job database.
In conjunction with the first aspect, scoring the execution information of the distributed computing framework job specifically comprises:
obtaining the time consumption and the space consumption of the distributed computing framework job run;
generating, from the time consumption and the space consumption of the distributed computing framework job run, the time consumption cost and the space consumption cost of the run;
scoring the distributed computing framework job run according to a pre-established scoring model for distributed computing framework jobs and the time consumption cost and space consumption cost of the run.
In conjunction with the first aspect, before the distributed computing framework job run is scored according to the pre-established scoring model for distributed computing framework jobs and the time consumption cost and space consumption cost of the run, the method comprises:
establishing the scoring model;
the scoring model being:
[The scoring formula is given only as an image in the original and is not reproduced here.]
where F_job(τ, υ) denotes the score of a job run and is a function of τ and υ, the time consumption cost and the space consumption cost of the job execution; α and β are the weights of the time-cost score and the space-cost score, used to adjust which of the time consumption cost and the space consumption cost is given more importance; i is the index of an attribute value within the time consumption cost, with τ_i the i-th time-cost attribute value; j is the index of an attribute value within the space consumption cost, with υ_j the j-th space-cost attribute value; in addition, each cost attribute value is assigned a weight whose purpose is to eliminate the differences in magnitude between the cost attribute values, so that F_job(τ, υ) is computed in a normalized way.
In conjunction with the first aspect, saving the distributed computing framework job into the distributed computing framework historical job database specifically comprises:
saving the distributed computing framework job into the distributed computing framework historical job database using a tree-structured storage scheme.
In conjunction with the first aspect, retrieving, in the pre-established distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job comprises:
retrieving, in the pre-established distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job by searching the tree nodes.
In conjunction with the first aspect, retrieving the historical distributed computing framework jobs similar to the distributed computing framework job comprises:
checking whether the key attributes of the distributed computing framework job are equal to the key attributes of a historical distributed computing framework job;
when the key attributes of the distributed computing framework job are equal to the key attributes of the historical distributed computing framework job, the historical distributed computing framework job is a historical distributed computing framework job similar to the distributed computing framework job.
In conjunction with the first aspect, retrieving the historical distributed computing framework jobs similar to the distributed computing framework job also comprises:
obtaining the attribute fields of the distributed computing framework job and of the historical distributed computing framework jobs, the attribute fields comprising a key attribute set and a non-key attribute set;
generating, according to a pre-established similarity model for distributed computing framework jobs and the attribute fields, the job similarity and the cluster-environment similarity, where the job similarity is the similarity between the distributed computing framework job and a historical distributed computing framework job, and the cluster-environment similarity is the similarity between the cluster environment information under which the distributed computing framework job and the historical distributed computing framework job respectively ran;
generating, according to a pre-established comprehensive similarity model for distributed computing framework jobs and a weighting strategy, the comprehensive similarity between the distributed computing framework job and the historical distributed computing framework job.
In conjunction with the first aspect, before the job similarity and the cluster-environment similarity are generated according to the pre-established similarity model for distributed computing framework jobs and the attribute fields, the method comprises:
establishing the similarity model of distributed computing framework jobs,
the similarity model being:
sim(A, B) = \prod_{k \in K} \begin{cases} 1, & p_{A,k} = p_{B,k} \\ 0, & p_{A,k} \neq p_{B,k} \end{cases} \times \frac{\sum_{i \in I} p_{A,i} \cdot p_{B,i}}{\sqrt{\sum_{i \in I} p_{A,i}^2 \cdot \sum_{i \in I} p_{B,i}^2}}
where K is the key attribute set, k is the index of a key attribute within the key attribute set, I is the non-key attribute set, i is the index of a non-key attribute within the non-key attribute set, p_{A,k} is the k-th key attribute of job A, p_{A,i} is the i-th non-key attribute of job A, p_{B,k} is the k-th key attribute of job B, and p_{B,i} is the i-th non-key attribute of job B.
In conjunction with the first aspect, before the comprehensive similarity between the distributed computing framework job and a historical distributed computing framework job is generated according to the pre-established comprehensive similarity model for distributed computing framework jobs and the weighting strategy, the method comprises:
establishing the comprehensive similarity model of distributed computing framework jobs,
the comprehensive similarity model being:
sim(A, B) = α × sim(JobA, JobB) + β × sim(ClusterA, ClusterB)
where sim(JobA, JobB) is the similarity between jobs A and B, sim(ClusterA, ClusterB) is the similarity between the cluster environment information under which jobs A and B respectively ran, sim(A, B) is the similarity between jobs A and B taking the cluster environment information into account, α is the first weight parameter of the weighting strategy, and β is the second weight parameter of the weighting strategy.
In a second aspect, a MapReduce parameter optimization apparatus comprises:
an obtaining unit configured to obtain a currently submitted distributed computing framework job;
a first retrieval unit configured to retrieve, in a pre-established distributed computing framework historical job database, historical distributed computing framework jobs of the same type as the distributed computing framework job, the distributed computing framework historical job database containing execution information and configuration parameters of historical distributed computing framework jobs;
a second retrieval unit configured to retrieve, among the same-type distributed computing framework jobs, historical distributed computing framework jobs similar to the distributed computing framework job;
a configuration unit configured to optimize the configuration parameters of the distributed computing framework job according to the configuration parameters of the similar historical distributed computing framework jobs.
In conjunction with the second aspect, the first retrieval unit comprises:
a checking subunit configured to check whether the distributed computing framework job carries specified distributed computing framework job parameters;
an execution subunit configured to, when the distributed computing framework job does not carry specified distributed computing framework job parameters, perform the step of retrieving, in the distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job.
In conjunction with the second aspect, the configuration unit is further configured to obtain, among the similar historical distributed computing framework jobs, the historical distributed computing framework job with the highest score, and to use the configuration parameters of that highest-scoring historical job as the configuration parameters of the distributed computing framework job, thereby optimizing the configuration parameters of the distributed computing framework job.
In conjunction with the second aspect, the apparatus also comprises:
a collection unit configured to collect the execution information and configuration parameters of the distributed computing framework job when execution of the job is complete;
a scoring unit configured to score the execution information of the distributed computing framework job and to save the distributed computing framework job into the distributed computing framework historical job database.
In conjunction with the second aspect, the scoring unit comprises:
an obtaining subunit configured to obtain the time consumption and the space consumption of the distributed computing framework job run;
a generation subunit configured to generate, from the time consumption and the space consumption of the distributed computing framework job run, the time consumption cost and the space consumption cost of the run;
a scoring subunit configured to score the distributed computing framework job run according to the pre-established scoring model for distributed computing framework jobs and the time consumption cost and space consumption cost of the run.
In conjunction with the second aspect, the scoring unit also comprises:
an establishing subunit configured to establish the scoring model;
the scoring model being:
[The scoring formula is given only as an image in the original and is not reproduced here.]
where F_job(τ, υ) denotes the score of a job run and is a function of τ and υ, the time consumption cost and the space consumption cost of the job execution; α and β are the weights of the time-cost score and the space-cost score, used to adjust which of the time consumption cost and the space consumption cost is given more importance; i is the index of an attribute value within the time consumption cost, with τ_i the i-th time-cost attribute value; j is the index of an attribute value within the space consumption cost, with υ_j the j-th space-cost attribute value; in addition, each cost attribute value is assigned a weight whose purpose is to eliminate the differences in magnitude between the cost attribute values, so that F_job(τ, υ) is computed in a normalized way.
In conjunction with the second aspect, the scoring unit also comprises:
a saving subunit configured to save the distributed computing framework job into the distributed computing framework historical job database using a tree-structured storage scheme.
In conjunction with the second aspect, the first retrieval unit comprises:
a first retrieval subunit configured to retrieve, in the pre-established distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job by searching the tree nodes.
In conjunction with the second aspect, the second retrieval unit comprises:
a second retrieval subunit configured to check whether the key attributes of the distributed computing framework job are equal to the key attributes of a historical distributed computing framework job;
a similarity subunit configured to, when the key attributes of the distributed computing framework job are equal to the key attributes of the historical distributed computing framework job, indicate that the historical distributed computing framework job is a historical distributed computing framework job similar to the distributed computing framework job.
In conjunction with the second aspect, the second retrieval unit also comprises:
an obtaining subunit configured to obtain the attribute fields of the distributed computing framework job and of the historical distributed computing framework jobs, the attribute fields comprising a key attribute set and a non-key attribute set;
a first generation subunit configured to generate, according to the pre-established similarity model for distributed computing framework jobs and the attribute fields, the job similarity and the cluster-environment similarity, where the job similarity is the similarity between the distributed computing framework job and a historical distributed computing framework job, and the cluster-environment similarity is the similarity between the cluster environment information under which the distributed computing framework job and the historical distributed computing framework job respectively ran;
a second generation subunit configured to generate, according to the pre-established comprehensive similarity model for distributed computing framework jobs and the weighting strategy, the comprehensive similarity between the distributed computing framework job and the historical distributed computing framework job.
In conjunction with the second aspect, the first generation subunit also comprises:
a first establishing subunit configured to establish the similarity model of distributed computing framework jobs,
the similarity model being:
sim(A, B) = \prod_{k \in K} \begin{cases} 1, & p_{A,k} = p_{B,k} \\ 0, & p_{A,k} \neq p_{B,k} \end{cases} \times \frac{\sum_{i \in I} p_{A,i} \cdot p_{B,i}}{\sqrt{\sum_{i \in I} p_{A,i}^2 \cdot \sum_{i \in I} p_{B,i}^2}}
where K is the key attribute set, k is the index of a key attribute within the key attribute set, I is the non-key attribute set, i is the index of a non-key attribute within the non-key attribute set, p_{A,k} is the k-th key attribute of job A, p_{A,i} is the i-th non-key attribute of job A, p_{B,k} is the k-th key attribute of job B, and p_{B,i} is the i-th non-key attribute of job B.
In conjunction with the second aspect, the second generation subunit also comprises:
a second establishing subunit configured to establish the comprehensive similarity model of distributed computing framework jobs,
the comprehensive similarity model being:
sim(A, B) = α × sim(JobA, JobB) + β × sim(ClusterA, ClusterB)
where sim(JobA, JobB) is the similarity between jobs A and B, sim(ClusterA, ClusterB) is the similarity between the cluster environment information under which jobs A and B respectively ran, sim(A, B) is the similarity between jobs A and B taking the cluster environment information into account, α is the first weight parameter of the weighting strategy, and β is the second weight parameter of the weighting strategy.
In a third aspect, a system comprises the above parameter optimization apparatus, a client that submits distributed computing framework jobs, and a computing management node, wherein the client of the distributed computing framework jobs and the computing management node are connected through the parameter optimization apparatus.
In this embodiment, the configuration parameters of the distributed computing framework job are optimized according to the configuration parameters of similar historical distributed computing framework jobs. This avoids the serious waste of cluster computing resources caused by unreasonable parameter configuration, and it also avoids the situation in which users must manually adjust the configuration parameters of MapReduce jobs, a tuning approach that is inefficient and produces parameters that apply only to the current task and lack generality. As a result, when a user submits a MapReduce job, its parameters can be optimized automatically, previously executed MapReduce jobs can be learned from, and the MapReduce parameters do not need to be re-optimized each time, so that MapReduce job parameter configuration becomes rational, automated, and self-learning.
Brief description of the drawings
Fig. 1 is a flowchart of the implementation of a distributed computing framework parameter optimization method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of saving a distributed computing framework job provided by an embodiment of the present invention;
Fig. 3 is an example diagram of the tree-structured storage of distributed computing framework jobs provided by a preferred embodiment of the present invention;
Fig. 4 is a flowchart of a preferred implementation of distributed computing framework parameter optimization provided by this embodiment;
Fig. 5 is a schematic structural diagram of a parameter optimization apparatus provided by an embodiment of the present invention;
Fig. 6 is a network architecture diagram of a preferred distributed computing framework parameter optimization system provided by this embodiment;
Fig. 7 is a schematic structural diagram of a parameter optimization apparatus provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention and are not intended to limit it.
Fig. 1 is a flowchart of the implementation of a distributed computing framework parameter optimization method provided by an embodiment of the present invention.
In step S101, the currently submitted distributed computing framework job is obtained.
In this embodiment, distributed computing frameworks include, but are not limited to, the parallel computing framework MapReduce.
In step S102, historical distributed computing framework jobs of the same type as the distributed computing framework job are retrieved in a pre-established distributed computing framework historical job database, the database containing the execution information and configuration parameters of historical distributed computing framework jobs.
In this embodiment, distributed computing framework jobs are stored in the distributed computing framework historical job database, and the historical jobs of the same type as the current job are retrieved according to the job type identification; the implementation of this retrieval is described in the following embodiments and is not repeated here.
In this embodiment, when the distributed computing framework job carries specified distributed computing framework job parameters, the distributed computing framework job is executed according to the specified parameters.
In step S103, the historical distributed computing framework jobs similar to the distributed computing framework job are retrieved among the same-type distributed computing framework jobs.
In this embodiment, it is checked whether the key attributes of the distributed computing framework job are equal to the key attributes of a historical distributed computing framework job; when the key attributes of both are equal, that is, when the values of their key attributes are consistent, the historical distributed computing framework job is a historical job similar to the distributed computing framework job.
In this embodiment, after the historical jobs of the same type as the current job have been retrieved, the jobs similar to the current job are retrieved among them using a pre-established similarity model; the implementation of this retrieval is described in the following embodiments and is not repeated here.
In step S104, the configuration parameters of the distributed computing framework job are optimized according to the configuration parameters of the similar historical distributed computing framework jobs.
In this embodiment, the configuration parameters of the distributed computing framework job are optimized according to the configuration parameters of the similar historical distributed computing framework jobs. Specifically, when the historical jobs similar to the current job have been retrieved, the highest-scoring historical distributed computing framework job among them is obtained, its configuration parameters are used as the configuration parameters of the current distributed computing framework job, and the job is executed with these parameters; the distributed computing framework historical job database contains the execution information and configuration parameters of the historical distributed computing framework jobs.
In this embodiment, by analyzing the run history of distributed computing framework jobs, a series of historical jobs similar to the current distributed computing framework job are found, and the configuration parameters of the highest-scoring job among them are taken from the distributed computing framework history database as the configuration reference for the parameters of the current job. This avoids the serious waste of cluster computing resources caused by unreasonable parameter configuration; it also avoids the situation in which users must manually adjust the configuration parameters of distributed computing framework jobs, an inefficient tuning approach whose resulting parameters apply only to the current task and lack generality. As a result, when a user submits a distributed computing framework job, its parameters can be optimized automatically, previously executed distributed computing framework jobs can be learned from, and the distributed computing framework parameters do not need to be re-optimized each time, so that distributed computing framework job parameter configuration becomes rational, automated, and self-learning.
As a preferred embodiment of the present invention, retrieving, in the pre-established distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job comprises:
checking whether the distributed computing framework job carries specified distributed computing framework job parameters;
when the distributed computing framework job does not carry specified distributed computing framework job parameters, performing the step of retrieving, in the distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job.
In this embodiment, when a task is submitted, the system obtains the parameter list file of the distributed computing framework job and checks, through the parameter list file, whether specified distributed computing framework job parameters exist.
As a preferred embodiment of the present invention, when the configuration parameters of the distributed computing framework job are optimized according to the configuration parameters of similar historical distributed computing framework jobs, it is also possible to obtain, among the similar historical distributed computing framework jobs, several distributed computing framework jobs whose scores are greater than a preset threshold, to weight the configuration parameters of these jobs according to their degree of similarity, and to use the weighted configuration parameters as the configuration parameters of the distributed computing framework job, thereby optimizing its configuration. For example, for a new job Job_A, a parameter value can be computed from the K most similar jobs B_i as follows:
P_A = \sum_{i=1,2,\ldots,K} sim(A, B_i) \times P_i
where P_A is a configuration parameter of the new job Job_A, K is the number of selected jobs whose score is greater than the threshold, B_i is the i-th of these jobs, and P_i is the value of that parameter for job B_i. In general, the parameters of a new job can be produced by similarity-weighted computation over the top-K highest-scoring jobs; because the parameter configurations of several distributed computing framework jobs are combined, the optimization model for the new job is more stable.
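For illustration only, the following Python sketch shows one way such a similarity-weighted combination could be computed. The record shapes, field names, and the normalization of the similarity weights so that they sum to 1 are assumptions of this sketch and are not prescribed by the embodiment.

```python
def weighted_parameters(candidate_jobs, score_threshold, k=5):
    """Combine the configuration parameters of the top-K similar historical jobs.

    candidate_jobs: list of dicts such as
        {"score": 0.87, "similarity": 0.9, "params": {"io.sort.mb": 200.0}}
    (an illustrative, hypothetical record shape).
    """
    # Keep only jobs whose score exceeds the preset threshold, then take the top K by score.
    selected = [j for j in candidate_jobs if j["score"] > score_threshold]
    selected = sorted(selected, key=lambda j: j["score"], reverse=True)[:k]
    if not selected:
        return {}

    # P_A = sum_i sim(A, B_i) * P_i; dividing by the total similarity is an
    # added normalization assumption so the result stays on the parameter scale.
    total_sim = sum(j["similarity"] for j in selected) or 1.0
    combined = {}
    for job in selected:
        weight = job["similarity"] / total_sim
        for name, value in job["params"].items():
            combined[name] = combined.get(name, 0.0) + weight * value
    return combined
```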
Referring to Fig. 2, Fig. 2 is a flowchart of saving a distributed computing framework job provided by an embodiment of the present invention, described as follows:
S201: when execution of the distributed computing framework job is complete, the execution information and configuration parameters of the distributed computing framework job are collected;
S202: the execution information of the distributed computing framework job is scored, and the distributed computing framework job is saved into the distributed computing framework historical job database.
As a preferred embodiment of the present invention, collecting the execution information and configuration parameters of the distributed computing framework job comprises:
collecting the current cluster environment information and the job configuration parameter information of the distributed computing framework job.
In this embodiment, the cluster environment information includes, but is not limited to, the number of computing nodes, total memory, total CPU, DFS block size, DFS replication factor, network bandwidth, and disk I/O.
In this embodiment, the job configuration parameters include, but are not limited to, the Mapper class, the Reducer class, the corresponding byte counts, the input/output file formats, the corresponding paths, and the split information.
In this embodiment, the current cluster environment information and job configuration parameter information of the distributed computing framework job are collected mainly for the subsequent similarity computation between distributed computing framework jobs.
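To make the collected information concrete, the sketch below shows one possible shape of the record gathered for a finished job. The field names are illustrative assumptions loosely based on the cluster-environment and job-configuration items listed above; they are not the exact fields of the embodiment.

```python
# Hypothetical record collected when a distributed computing framework job finishes.
finished_job_record = {
    "job_class": "com.huawei.pagerank.PageRank",  # class name, used for tree-structured storage
    "cluster_env": {                               # cluster environment information
        "node_count": 10,
        "total_memory_gb": 640,
        "total_cpu_cores": 160,
        "dfs_block_size_mb": 128,
        "dfs_replication": 3,
        "network_bandwidth_gbps": 10,
        "disk_io_mb_s": 400,
    },
    "job_config": {                                # job configuration parameters
        "mapper_class": "PageRankMapper",
        "reducer_class": "PageRankReducer",
        "input_format": "TextInputFormat",
        "input_path": "/data/pagerank/in",
        "split_size_mb": 128,
    },
    "execution_info": {                            # later turned into cost attributes and a score
        "map_time_s": 420,
        "reduce_time_s": 310,
        "hdfs_read_bytes": 8_000_000_000,
        "hdfs_read_time_s": 95,
    },
}
```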
As a preferred embodiment of the present invention, scoring the execution information of the distributed computing framework job comprises:
obtaining the time consumption and the space consumption of the distributed computing framework job run;
generating, from the time consumption and the space consumption of the distributed computing framework job run, the time consumption cost and the space consumption cost of the run;
scoring the distributed computing framework job run according to the pre-established scoring model for distributed computing framework jobs and the time consumption cost and space consumption cost of the run.
In this embodiment, the time consumption is expressed as time-consumption cost attributes.
Referring to Table 1, Table 1 is a time-consumption cost attribute table provided by a preferred embodiment of the present invention; it contains the names of some of the cost attributes related to computing time consumption.
Table 1. Time-consumption cost attribute table
[Table content is given only as an image in the original and is not reproduced here.]
Note: in a specific implementation, the attribute fields are not limited to those listed above.
In this embodiment, the time consumption cost of a distributed computing framework job run is generated from the time consumption of the run; it can be computed from the time consumed in each stage and the amount of input and output data. For example, HdfsReadCost in the time-consumption cost attribute table is the HDFS read time per byte; it is generated by dividing the total HDFS read time by the number of bytes read.
In this embodiment, the space consumption is expressed as space-consumption cost attributes.
Referring to Table 2, Table 2 is a space-consumption cost attribute table provided by a preferred embodiment of the present invention; it contains the names of some of the cost attributes related to computing space consumption.
Table 2. Space-consumption cost attribute table
[Table content is given only as an image in the original and is not reproduced here.]
Note: in a specific implementation, the attribute fields are not limited to those listed above.
In this embodiment, the space consumption cost of a distributed computing framework job run is generated from the space consumption of the run; it can be computed from the space consumed in each stage and the amount of input and output data.
For example, for the cost attribute MapAvgMemBytes, the space consumption cost is the average memory consumption per byte of Map input; it is generated by dividing the average memory consumption by the amount of Map input, as follows:
MapAvgMemBytes = MapAvgMem / MapInputBytes
where MapAvgMem is the average memory consumption and MapInputBytes is the amount of Map input; both can be obtained directly from the Metrics of the distributed computing framework. Likewise, for the cost attribute MapCPUCostPerBytes, the space consumption cost is the average CPU consumption per byte in the Map stage; it is generated by dividing the total CPU consumption of the Map stage by the amount of Map input, as follows:
MapCPUCostPerBytes = MapCPUCost / MapInputBytes
where MapCPUCost and MapInputBytes can be obtained directly from the Metrics of the distributed computing framework.
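The per-byte cost attributes above are simple ratios of a total consumption to an input size. The following sketch computes HdfsReadCost, MapAvgMemBytes, and MapCPUCostPerBytes in that spirit; the raw metric names and the guard against division by zero are assumptions of this illustration.

```python
def compute_cost_attributes(metrics):
    """Derive per-byte cost attributes from raw job metrics (illustrative metric names)."""
    def ratio(total, denominator):
        # Avoid division by zero for degenerate runs (an assumption of this sketch).
        return total / denominator if denominator else 0.0

    return {
        # time-consumption cost: HDFS read time per byte read
        "HdfsReadCost": ratio(metrics["hdfs_read_time_s"], metrics["hdfs_read_bytes"]),
        # space-consumption cost: average Map-stage memory per byte of Map input
        "MapAvgMemBytes": ratio(metrics["map_avg_mem_bytes"], metrics["map_input_bytes"]),
        # space-consumption cost: Map-stage CPU consumption per byte of Map input
        "MapCPUCostPerBytes": ratio(metrics["map_cpu_cost"], metrics["map_input_bytes"]),
    }
```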
As a preferred embodiment of the present invention, before the distributed computing framework job run is scored according to the pre-established scoring model for distributed computing framework jobs and the time consumption cost and space consumption cost of the run, the method comprises:
establishing the scoring model;
the scoring model being:
[The scoring formula is given only as an image in the original and is not reproduced here.]
where F_job(τ, υ) denotes the score of a job run and is a function of τ and υ, the time consumption cost and the space consumption cost of the job execution; α and β are the weights of the time-cost score and the space-cost score, used to adjust which of the time consumption cost and the space consumption cost is given more importance; i is the index of an attribute value within the time consumption cost, with τ_i the i-th time-cost attribute value; j is the index of an attribute value within the space consumption cost, with υ_j the j-th space-cost attribute value; in addition, each cost attribute value is assigned a weight whose purpose is to eliminate the differences in magnitude between the cost attribute values, so that F_job(τ, υ) is computed in a normalized way.
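Because the scoring formula itself is available only as an image, the sketch below shows one plausible reading of the description: each cost attribute is scaled by a per-attribute normalization weight, the normalized time and space costs are combined with the weights α and β, and a lower combined cost yields a higher score. The exact functional form, including the use of 1/(1 + cost) to turn a cost into a score, is an assumption of this illustration.

```python
def score_job(time_costs, space_costs, time_weights, space_weights, alpha=0.5, beta=0.5):
    """Score a job run from its time/space cost attribute values (one plausible form).

    time_costs / space_costs: lists of cost attribute values (tau_i, upsilon_j).
    time_weights / space_weights: per-attribute normalization weights that remove
    magnitude differences between attributes (e.g. 1 / a typical value; assumed).
    """
    normalized_time = sum(w * t for w, t in zip(time_weights, time_costs))
    normalized_space = sum(w * s for w, s in zip(space_weights, space_costs))
    combined_cost = alpha * normalized_time + beta * normalized_space
    # Lower cost -> higher score, so the highest-scoring historical job corresponds
    # to the cheapest comparable run (an assumption of this sketch).
    return 1.0 / (1.0 + combined_cost)
```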
As a preferred embodiment of the present invention, saving the distributed computing framework job into the distributed computing framework historical job database specifically comprises:
saving the distributed computing framework job into the distributed computing framework historical job database using a tree-structured storage scheme.
In this embodiment, a tree-structured storage scheme is used in which the path from the tree root to a leaf node represents the class name of the corresponding job, and the distributed computing framework job is saved into the distributed computing framework historical job database accordingly.
Referring to Fig. 3, Fig. 3 is an example diagram of the tree-structured storage of distributed computing framework jobs provided by a preferred embodiment of the present invention.
In this embodiment, jobs with the same class name are stored in the same leaf node (as shown by the dashed box in the figure), so that similar distributed computing framework jobs can be found quickly in the subsequent search.
As a preferred embodiment of the present invention, retrieving, in the pre-established distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job comprises:
retrieving, in the pre-established distributed computing framework historical job database, the tree-stored distributed computing framework jobs of the same type as the distributed computing framework job by searching the tree nodes.
In this embodiment, tree-node search is used to retrieve the tree-stored distributed computing framework jobs of the same type as the distributed computing framework job. For example, referring to Fig. 3, for the job com.huawei.pagerank.PageRank-1, searching from the root node along root -> com -> huawei -> pagerank, a total of 3 node lookups, yields the distributed computing framework jobs of the same type as this job.
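A minimal way to realize the tree-structured storage and the 3-step lookup described above is a nested dictionary keyed by the segments of the job class name, as in the following sketch; the data layout and helper names are assumptions of this illustration.

```python
class JobHistoryTree:
    """Tree-structured store: the path of package-name segments leads to a leaf
    that holds all historical jobs of that class (illustrative layout)."""

    def __init__(self):
        self.root = {}

    def save(self, job_class, job_record):
        # "com.huawei.pagerank.PageRank" -> path root -> com -> huawei -> pagerank,
        # with all jobs of that class kept together in the leaf node.
        node = self.root
        for segment in job_class.split(".")[:-1]:
            node = node.setdefault(segment, {})
        node.setdefault("__jobs__", []).append(job_record)

    def find_same_type(self, job_class):
        node = self.root
        for segment in job_class.split(".")[:-1]:  # 3 lookups for com.huawei.pagerank.PageRank
            node = node.get(segment)
            if node is None:
                return []
        return node.get("__jobs__", [])

# Usage: tree.save("com.huawei.pagerank.PageRank", record)
#        same_type_jobs = tree.find_same_type("com.huawei.pagerank.PageRank")
```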
As a preferred embodiment of the present invention, retrieving the historical distributed computing framework jobs similar to the distributed computing framework job comprises:
obtaining the attribute fields of the distributed computing framework job and of the historical distributed computing framework jobs, the attribute fields comprising a key attribute set and a non-key attribute set;
generating, according to the pre-established similarity model for distributed computing framework jobs and the attribute fields, the job similarity and the cluster-environment similarity, where the job similarity is the similarity between the distributed computing framework job and a historical distributed computing framework job, and the cluster-environment similarity is the similarity between the cluster environment information under which the distributed computing framework job and the historical distributed computing framework job respectively ran;
generating, according to the pre-established comprehensive similarity model for distributed computing framework jobs and a weighting strategy, the comprehensive similarity between the distributed computing framework job and the historical distributed computing framework job.
In this embodiment, referring to Table 3, Table 3 is an attribute field table provided by a preferred embodiment of the present invention; it contains some of the fields used when computing the similarity between distributed computing framework jobs.
Table 3. Job task attribute table
[Table content is given only as an image in the original and is not reproduced here.]
Note: in a specific implementation, the attribute fields are not limited to those listed above.
In this embodiment, referring to Table 4, Table 4 is an attribute field table of the cluster environment provided by a preferred embodiment of the present invention; it contains some of the fields used when computing the similarity between distributed computing framework jobs.
Table 4. Cluster environment parameter table
[Table content is given only as an image in the original and is not reproduced here.]
Note: in a specific implementation, the attribute fields are not limited to those listed above.
As a preferred embodiment of the present invention, before the job similarity and the cluster-environment similarity are generated according to the pre-established similarity model for distributed computing framework jobs and the attribute fields, the method comprises:
establishing the similarity model of distributed computing framework jobs,
the similarity model being:
sim(A, B) = \prod_{k \in K} \begin{cases} 1, & p_{A,k} = p_{B,k} \\ 0, & p_{A,k} \neq p_{B,k} \end{cases} \times \frac{\sum_{i \in I} p_{A,i} \cdot p_{B,i}}{\sqrt{\sum_{i \in I} p_{A,i}^2 \cdot \sum_{i \in I} p_{B,i}^2}}
where K is the key attribute set, k is the index of a key attribute within the key attribute set, I is the non-key attribute set, i is the index of a non-key attribute within the non-key attribute set, p_{A,k} is the k-th key attribute of job A, p_{A,i} is the i-th non-key attribute of job A, p_{B,k} is the k-th key attribute of job B, and p_{B,i} is the i-th non-key attribute of job B.
It should be noted that the key attributes are matched on an exact-equality basis: as soon as any key attribute differs, a similarity of 0 is returned directly, meaning that the distributed computing framework job and the historical distributed computing framework job are not similar.
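The similarity model above multiplies an all-or-nothing match over the key attributes by a cosine similarity over the non-key attributes. The following sketch implements that reading; the attribute names and record shapes are assumptions of this illustration.

```python
import math

def job_similarity(job_a, job_b, key_attrs, non_key_attrs):
    """sim(A, B): exact match on key attributes times cosine similarity on non-key attributes."""
    # Any mismatch in a key attribute makes the jobs dissimilar (similarity 0).
    for k in key_attrs:
        if job_a[k] != job_b[k]:
            return 0.0

    a = [float(job_a[i]) for i in non_key_attrs]
    b = [float(job_b[i]) for i in non_key_attrs]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```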
As a preferred embodiment of the present invention, before the comprehensive similarity between the distributed computing framework job and a historical distributed computing framework job is generated according to the pre-established comprehensive similarity model for distributed computing framework jobs and the weighting strategy, the method comprises:
establishing the comprehensive similarity model of distributed computing framework jobs,
the comprehensive similarity model being:
sim(A, B) = α × sim(JobA, JobB) + β × sim(ClusterA, ClusterB)
where sim(JobA, JobB) is the similarity between jobs A and B, sim(ClusterA, ClusterB) is the similarity between the cluster environment information under which jobs A and B respectively ran, sim(A, B) is the similarity between jobs A and B taking the cluster environment information into account, α is the first weight parameter of the weighting strategy, and β is the second weight parameter of the weighting strategy.
In this embodiment, the similarity between jobs is obtained from the formula below, with the parameters defined as in the preceding description:
sim(JobA, JobB) = \prod_{k \in K} \begin{cases} 1, & p_{A,k} = p_{B,k} \\ 0, & p_{A,k} \neq p_{B,k} \end{cases} \times \frac{\sum_{i \in I} p_{A,i} \cdot p_{B,i}}{\sqrt{\sum_{i \in I} p_{A,i}^2 \cdot \sum_{i \in I} p_{B,i}^2}}
For the similarity between clusters, assuming that no cluster parameter is treated as a key attribute, the computation simplifies to:
sim(ClusterA, ClusterB) = \frac{\sum_{i \in I} p_{A,i} \cdot p_{B,i}}{\sqrt{\sum_{i \in I} p_{A,i}^2 \cdot \sum_{i \in I} p_{B,i}^2}}
where I is the cluster parameter set, p_{A,i} is the i-th attribute value of cluster A, and p_{B,i} is the i-th attribute value of cluster B; this is a simple computation model based on cosine similarity.
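Reusing job_similarity from the previous sketch, the comprehensive similarity is then just the weighted combination of the job similarity and the cluster-environment similarity (a plain cosine over the cluster parameters). The α and β values below are arbitrary illustrative choices, not values prescribed by the embodiment.

```python
def cluster_similarity(cluster_a, cluster_b, cluster_attrs):
    # Cluster parameters are not treated as key attributes, so this reduces
    # to plain cosine similarity over the cluster parameter values.
    return job_similarity(cluster_a, cluster_b, key_attrs=[], non_key_attrs=cluster_attrs)

def comprehensive_similarity(job_a, job_b, cluster_a, cluster_b,
                             key_attrs, non_key_attrs, cluster_attrs,
                             alpha=0.7, beta=0.3):
    """sim(A, B) = alpha * sim(JobA, JobB) + beta * sim(ClusterA, ClusterB)."""
    return (alpha * job_similarity(job_a, job_b, key_attrs, non_key_attrs)
            + beta * cluster_similarity(cluster_a, cluster_b, cluster_attrs))
```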
As a preferred embodiment of the present invention, Fig. 4 is a flowchart of a preferred implementation of distributed computing framework parameter optimization provided by this embodiment, described as follows:
S401: obtain the currently submitted distributed computing framework job;
S402: judge whether the user has specified distributed computing framework parameters for the distributed computing framework job; if yes, perform S403, otherwise perform S404;
S403: submit the distributed computing framework job with the user-specified parameters;
S404: retrieve similar jobs from the historical run database;
S405: check whether a job of the same type exists; if yes, perform S407, otherwise perform S406;
S406: submit the distributed computing framework job with the default parameters;
S407: select the distributed computing framework job with the highest score;
S408: submit the distributed computing framework job with the optimized parameters;
S409: when the distributed computing framework job run finishes, collect the distributed computing framework configuration and run logs;
S410: score the current distributed computing framework job and store the result in the database.
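The flow of Fig. 4 can be summarized in the following sketch. The injected callables (submit, collect, score) and the history_db methods stand in for the components described elsewhere in this embodiment; their names and the dictionary record shapes are assumptions of this illustration.

```python
def optimize_and_submit(job, history_db, default_params, *, submit, collect, score):
    """Illustrative sketch of the Fig. 4 decision flow for one submitted job."""
    if job.get("user_params"):                                   # S402: user-specified parameters?
        params = job["user_params"]                              # S403: submit with user parameters
    else:
        same_type = history_db.find_same_type(job["job_class"])  # S404: search the history database
        if not same_type:                                        # S405: no job of the same type
            params = default_params                              # S406: fall back to the defaults
        else:
            best = max(same_type, key=lambda h: h["score"])      # S407: highest-scoring history job
            params = best["params"]                              # S408: submit with optimized params
    result = submit(job, params)
    record = collect(result)                                     # S409: collect config and run logs
    record["score"] = score(record)                              # S410: score and store in the database
    history_db.save(job["job_class"], record)
    return result
```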
Referring to Fig. 5, Fig. 5 is a schematic structural diagram of a parameter optimization apparatus provided by an embodiment of the present invention. For ease of explanation, only the parts relevant to this embodiment are shown, described as follows:
an obtaining unit 51, configured to obtain a currently submitted distributed computing framework job;
a first retrieval unit 52, configured to retrieve, in a pre-established distributed computing framework historical job database, historical distributed computing framework jobs of the same type as the distributed computing framework job, the database containing execution information and configuration parameters of historical distributed computing framework jobs;
a second retrieval unit 53, configured to retrieve, among the same-type distributed computing framework jobs, historical distributed computing framework jobs similar to the distributed computing framework job;
a configuration unit 54, configured to optimize the configuration parameters of the distributed computing framework job according to the configuration parameters of the similar historical distributed computing framework jobs.
Further, in this apparatus, the first retrieval unit comprises:
a checking subunit configured to check whether the distributed computing framework job carries specified distributed computing framework job parameters;
an execution subunit configured to, when the distributed computing framework job does not carry specified distributed computing framework job parameters, perform the step of retrieving, in the distributed computing framework historical job database, the historical distributed computing framework jobs of the same type as the distributed computing framework job.
Further, in this apparatus, the configuration unit is further configured to obtain, among the similar historical distributed computing framework jobs, the historical distributed computing framework job with the highest score, and to use the configuration parameters of that highest-scoring historical job as the configuration parameters of the distributed computing framework job, thereby optimizing the configuration parameters of the distributed computing framework job.
Further, the apparatus also comprises:
a collection unit configured to collect the execution information and configuration parameters of the distributed computing framework job when execution of the job is complete;
a scoring unit configured to score the execution information of the distributed computing framework job and to save the distributed computing framework job into the distributed computing framework historical job database.
Further, in this apparatus, the scoring unit comprises:
an obtaining subunit configured to obtain the time consumption and the space consumption of the distributed computing framework job run;
a generation subunit configured to generate, from the time consumption and the space consumption of the distributed computing framework job run, the time consumption cost and the space consumption cost of the run;
a scoring subunit configured to score the distributed computing framework job run according to the pre-established scoring model for distributed computing framework jobs and the time consumption cost and space consumption cost of the run.
Further, in this device, the scoring unit also comprises:
An establishing subunit, configured to establish the rating model;
The rating model is F_job(τ, υ) (the expression itself is reproduced only as an image in the original publication),
where F_job(τ, υ) is the score of a run of an operation, a function of the time loss cost τ and the space consuming cost υ of the execution; α and β are the weights of the time loss cost score and the space consuming cost score, and are used to adjust whether the time loss cost or the space consuming cost is preferred; i is the index of an attribute value within the time loss cost, with τ_i the i-th time loss cost attribute value; j is the index of an attribute value within the space consuming cost, with υ_j the j-th space consuming cost attribute value. Normalization factors for the two kinds of attributes (also reproduced only as images) are set so as to eliminate differences in magnitude between the cost attribute values, and F_job(τ, υ) is computed on the normalized values.
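Because the published formula survives only as an image, the sketch below is one plausible reading assembled from the definitions above: per-attribute time and space costs are divided by reference values to remove magnitude differences and then combined with the weights α and β. The reference values and the sign convention are assumptions, not the patent's exact expression.

```python
def rate_run(time_costs, space_costs, time_refs, space_refs, alpha=0.5, beta=0.5):
    """One plausible reading of F_job(tau, upsilon): a weighted sum of normalized
    time and space cost attributes.

    time_costs[i] ~ tau_i, space_costs[j] ~ upsilon_j
    time_refs / space_refs ~ assumed reference values that cancel magnitude differences
    alpha / beta ~ preference between time and space (alpha + beta = 1 here)
    """
    time_term = sum(t / r for t, r in zip(time_costs, time_refs))
    space_term = sum(s / r for s, r in zip(space_costs, space_refs))
    # Lower normalized cost should mean a better run, so negate the weighted sum
    # to make "highest score" mean "cheapest run".
    return -(alpha * time_term + beta * space_term)
```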
Further, in this device, the scoring unit also comprises:
A saving subunit, configured to save the distributed computing framework operation into the distributed computing framework historical operation database in a tree-like storage mode.
Further, in this device, the first retrieval unit comprises:
A first retrieval subunit, configured to retrieve, in the pre-established distributed computing framework historical operation database, the historical distributed computing framework operations similar to the distributed computing framework operation by searching the tree nodes.
Further, in this device, the second retrieval unit comprises:
A second retrieval subunit, configured to check whether the determinant attributes of the distributed computing framework operation are equal to the determinant attributes of a historical distributed computing framework operation;
A similarity subunit, configured to, when the determinant attributes of the distributed computing framework operation are equal to the determinant attributes of the historical distributed computing framework operation, indicate that the historical distributed computing framework operation is a historical distributed computing framework operation similar to the distributed computing framework operation.
Further, in this device, the second retrieval unit also comprises:
An obtaining subunit, configured to obtain the attribute fields of the distributed computing framework operation and of the historical distributed computing framework operations, where the attribute fields comprise a determinant attribute set and a non-key attribute set;
A first generating subunit, configured to generate the operation similarity and the cluster environment information similarity according to the pre-established similarity model of distributed computing framework operations and the attribute fields, where the operation similarity is the similarity between the distributed computing framework operation and a historical distributed computing framework operation, and the cluster environment information similarity is the similarity between the cluster environment information of the distributed computing framework operation and that of the historical distributed computing framework operation;
A second generating subunit, configured to generate the comprehensive similarity between the distributed computing framework operation and the historical distributed computing framework operation according to the pre-established comprehensive similarity model of distributed computing framework operations and a weighting strategy.
Further, in this device, the first generating subunit also comprises:
A first establishing subunit, configured to establish the similarity model of distributed computing framework operations;
The similarity model is:
sim(A, B) = [ ∏_{k∈K} (1 if p_{A,k} = p_{B,k}; 0 if p_{A,k} ≠ p_{B,k}) ] × ( Σ_{i∈I} p_{A,i}·p_{B,i} ) / ( √(Σ_{i∈I} p_{A,i}²) · √(Σ_{i∈I} p_{B,i}²) )
where K is the determinant attribute set, k is the index of a determinant attribute in the determinant attribute set, I is the non-key attribute set, i is the index of a non-key attribute in the non-key attribute set, p_{A,k} is the k-th determinant attribute of operation A, p_{A,i} is the i-th non-key attribute of operation A, p_{B,k} is the k-th determinant attribute of operation B, and p_{B,i} is the i-th non-key attribute of operation B.
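Read literally, the model gates on exact equality of the determinant attributes and then applies cosine similarity to the non-key attributes. A hedged Python transcription follows, treating each operation as a dictionary of attribute values (the dictionary representation, and reading the denominator as the usual cosine norm, are assumptions):

```python
import math

def operation_similarity(op_a, op_b, key_attrs, non_key_attrs):
    # Determinant (key) attributes act as an all-or-nothing gate: any mismatch
    # makes the similarity 0, exactly like the product term in the model above.
    for k in key_attrs:
        if op_a[k] != op_b[k]:
            return 0.0

    # Non-key attributes are compared with cosine similarity.
    a = [op_a[i] for i in non_key_attrs]
    b = [op_b[i] for i in non_key_attrs]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```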
Further, in this device, the second generating subunit also comprises:
A second establishing subunit, configured to establish the comprehensive similarity model of distributed computing framework operations;
The comprehensive similarity model is:
sim(A,B)=α×sim(JobA,JobB)+β×sim(ClusterA,ClusterB)
where sim(JobA, JobB) is the similarity between operations A and B, sim(ClusterA, ClusterB) is the similarity between the cluster environment information of operations A and B, sim(A, B) is the comprehensive similarity between operations A and B taking the cluster environment information into account, α is the first weight parameter in the weighting strategy, and β is the second weight parameter in the weighting strategy.
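The comprehensive similarity is then simply a weighted sum of the two partial similarities. A minimal sketch, with example weights that the patent does not fix:

```python
def comprehensive_similarity(sim_job, sim_cluster, alpha=0.7, beta=0.3):
    # sim(A, B) = alpha * sim(JobA, JobB) + beta * sim(ClusterA, ClusterB)
    # alpha and beta come from the weighting strategy; the 0.7 / 0.3 split is
    # only an example value.
    return alpha * sim_job + beta * sim_cluster

# e.g. comprehensive_similarity(0.8, 0.6) == 0.7 * 0.8 + 0.3 * 0.6 == 0.74
```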
As a preferred embodiment of the present invention, a distributed computing framework parameter optimization system comprises the parameter optimization device, a client that submits the distributed computing framework operation, and a computing management node, where the client of the distributed computing framework operation and the computing management node are connected through the parameter optimization device.
With reference to Fig. 6, Fig. 6 is a preferred network architecture diagram of the distributed computing framework parameter optimization system provided by the present embodiment.
The automatic parameter optimization configuration module sits between the client that submits distributed computing framework operations and the distributed computing framework Master. When a user submits a task through the distributed computing framework client, the parameter optimization configuration module applies a series of automated parameter configurations to the task and then submits it to the computing management node (Master) of the distributed computing framework; the Master then distributes the distributed computing framework operation to the Worker nodes under its management for execution.
The distributed computing framework historical task runtime library stores the information of the distributed computing framework operations that have already run on the cluster. Its storage format may be local files, HDFS, or a database, and it is used by the parameter optimization configuration module when retrieving similar tasks.
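A hedged sketch of such a historical runtime library follows, using an in-memory tree keyed by determinant attribute values; the backend (local files, HDFS, or a database) and all names are assumptions, and only the tree-walking retrieval used by the first retrieval subunit is shown.

```python
class HistoryStore:
    """Hypothetical tree-like history store: determinant attribute values form the
    path, leaves hold finished-operation records (execution info, configuration, score)."""

    def __init__(self, key_attrs):
        self.key_attrs = key_attrs
        self.root = {}

    def save(self, record):
        node = self.root
        for k in self.key_attrs:        # walk / extend the tree along the key attributes
            node = node.setdefault(record[k], {})
        node.setdefault("_records", []).append(record)

    def retrieve_candidates(self, op):
        node = self.root
        for k in self.key_attrs:        # search tree nodes by the submitted operation's key attributes
            node = node.get(op.get(k))
            if node is None:
                return []               # no historical operation shares all key attributes
        return node.get("_records", [])
```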
With reference to Fig. 7, Fig. 7 is a schematic structural diagram of a parameter optimization device provided by an embodiment of the present invention. This embodiment does not limit the specific implementation of the parameter optimization device. The parameter optimization device 700 comprises:
a processor 701, a communications interface 702, a memory 703, and a bus 704.
The processor 701, the communications interface 702, and the memory 703 communicate with each other through the bus 704.
The communications interface 702 is configured to communicate with other communication devices;
The processor 701 is configured to execute a program.
In particular, the program may comprise program code, and the program code comprises computer operation instructions.
The processor 701 may be a central processing unit (CPU).
The memory 703 is configured to store the program. The program is configured to obtain the currently submitted distributed computing framework operation and check whether the distributed computing framework operation carries specified distributed computing framework operation parameters;
when the distributed computing framework operation does not carry specified distributed computing framework operation parameters, to retrieve, in the pre-established distributed computing framework historical operation database, the historical distributed computing framework operations similar to the distributed computing framework operation, where the distributed computing framework historical operation database comprises execution information and configuration parameters of historical distributed computing framework operations; to retrieve, among the similar distributed computing framework operations, the historical distributed computing framework operations similar to the distributed computing framework operation; and to optimize the configuration parameters of the distributed computing framework operation according to the configuration parameters of the similar historical distributed computing framework operations.
The foregoing are only preferred embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (25)

1. A method for distributed computing framework parameter optimization, characterized by comprising:
obtaining the currently submitted distributed computing framework operation;
retrieving, in a distributed computing framework historical operation runtime database, historical distributed computing framework operations similar to the distributed computing framework operation, where the distributed computing framework historical operation runtime database comprises execution information and configuration parameters of historical distributed computing framework operations;
retrieving, among the similar distributed computing framework operations, the historical distributed computing framework operations similar to the distributed computing framework operation;
optimizing the configuration parameters of the distributed computing framework operation according to the configuration parameters of the similar historical distributed computing framework operations.
2. The method as claimed in claim 1, characterized in that, before retrieving, in the distributed computing framework historical operation runtime database, the historical distributed computing framework operations similar to the distributed computing framework operation, the method further comprises:
checking whether the distributed computing framework operation carries specified distributed computing framework operation parameters;
when the distributed computing framework operation does not carry specified distributed computing framework operation parameters, executing the step of retrieving, in the distributed computing framework historical operation runtime database, the historical distributed computing framework operations similar to the distributed computing framework operation.
3. The method as claimed in claim 1, characterized in that optimizing the configuration parameters of the distributed computing framework operation according to the configuration parameters of the similar historical distributed computing framework operations comprises:
obtaining, among the similar historical distributed computing framework operations, the historical distributed computing framework operation with the highest score, and optimizing the configuration parameters of the distributed computing framework operation by using the configuration parameters of the highest-scoring historical distributed computing framework operation as the configuration parameters of the distributed computing framework operation; or
obtaining, among the similar historical distributed computing framework operations, a plurality of distributed computing framework operations whose scores are greater than a preset threshold, calculating a weighted combination of the configuration parameters of the plurality of distributed computing framework operations according to their degrees of similarity, and optimizing the configuration parameters of the distributed computing framework operation by using the weighted configuration parameters as the configuration parameters of the distributed computing framework operation.
4. The method as claimed in claim 1, characterized in that the method further comprises:
when the distributed computing framework operation finishes executing, collecting the execution information and configuration parameters of the distributed computing framework operation;
scoring the execution information of the distributed computing framework operation, and saving the distributed computing framework operation into the distributed computing framework historical operation database.
5. The method as claimed in claim 4, characterized in that scoring the execution information of the distributed computing framework operation specifically comprises:
obtaining the time consumption and space consumption of the run of the distributed computing framework operation;
generating the time loss cost and the space consuming cost of the run of the distributed computing framework operation according to its time consumption and space consumption;
scoring the run of the distributed computing framework operation according to a pre-established rating model of distributed computing framework operations and the time loss cost and space consuming cost of the run.
6. The method as claimed in claim 4, characterized in that, before scoring the run of the distributed computing framework operation according to the pre-established rating model of distributed computing framework operations and the time loss cost and space consuming cost of the run, the method comprises:
establishing the rating model;
the rating model is F_job(τ, υ) (the expression itself is reproduced only as an image in the original publication),
where F_job(τ, υ) is the score of a run of an operation, a function of the time loss cost τ and the space consuming cost υ of the execution; α and β are the weights of the time loss cost score and the space consuming cost score, and are used to adjust whether the time loss cost or the space consuming cost is preferred; i is the index of an attribute value within the time loss cost, with τ_i the i-th time loss cost attribute value; j is the index of an attribute value within the space consuming cost, with υ_j the j-th space consuming cost attribute value; normalization factors for the two kinds of attributes (also reproduced only as images) are set so as to eliminate differences in magnitude between the cost attribute values, and F_job(τ, υ) is computed on the normalized values.
7. The method as claimed in claim 4, characterized in that saving the distributed computing framework operation into the distributed computing framework historical operation database specifically comprises:
saving the distributed computing framework operation into the distributed computing framework historical operation database in a tree-like storage mode.
8. The method as claimed in claim 1, characterized in that retrieving, in the pre-established distributed computing framework historical operation database, the historical distributed computing framework operations similar to the distributed computing framework operation comprises:
retrieving, in the pre-established distributed computing framework historical operation database, the historical distributed computing framework operations similar to the distributed computing framework operation by searching the tree nodes.
9. The method as claimed in claim 1, characterized in that retrieving the historical distributed computing framework operations similar to the distributed computing framework operation comprises:
checking whether the determinant attributes of the distributed computing framework operation are equal to the determinant attributes of a historical distributed computing framework operation;
when the determinant attributes of the distributed computing framework operation are equal to the determinant attributes of the historical distributed computing framework operation, indicating that the historical distributed computing framework operation is a historical distributed computing framework operation similar to the distributed computing framework operation.
10. The method as claimed in claim 9, characterized in that retrieving the historical distributed computing framework operations similar to the distributed computing framework operation further comprises:
obtaining the attribute fields of the distributed computing framework operation and of the historical distributed computing framework operations, where the attribute fields comprise a determinant attribute set and a non-key attribute set;
generating the operation similarity and the cluster environment information similarity according to a pre-established similarity model of distributed computing framework operations and the attribute fields, where the operation similarity is the similarity between the distributed computing framework operation and a historical distributed computing framework operation, and the cluster environment information similarity is the similarity between the cluster environment information of the distributed computing framework operation and that of the historical distributed computing framework operation;
generating the comprehensive similarity between the distributed computing framework operation and the historical distributed computing framework operation according to a pre-established comprehensive similarity model of distributed computing framework operations and a weighting strategy.
11. The method as claimed in claim 10, characterized in that, before generating the operation similarity and the cluster environment information similarity according to the pre-established similarity model of distributed computing framework operations and the attribute fields, the method comprises:
establishing the similarity model of distributed computing framework operations,
the similarity model being:
sim(A, B) = [ ∏_{k∈K} (1 if p_{A,k} = p_{B,k}; 0 if p_{A,k} ≠ p_{B,k}) ] × ( Σ_{i∈I} p_{A,i}·p_{B,i} ) / ( √(Σ_{i∈I} p_{A,i}²) · √(Σ_{i∈I} p_{B,i}²) )
where K is the determinant attribute set, k is the index of a determinant attribute in the determinant attribute set, I is the non-key attribute set, i is the index of a non-key attribute in the non-key attribute set, p_{A,k} is the k-th determinant attribute of operation A, p_{A,i} is the i-th non-key attribute of operation A, p_{B,k} is the k-th determinant attribute of operation B, and p_{B,i} is the i-th non-key attribute of operation B.
12. The method as claimed in claim 10, characterized in that, before generating the comprehensive similarity between the distributed computing framework operation and the historical distributed computing framework operation according to the pre-established comprehensive similarity model of distributed computing framework operations and the weighting strategy, the method comprises:
establishing the comprehensive similarity model of distributed computing framework operations,
the comprehensive similarity model being:
sim(A,B)=α×sim(JobA,JobB)+β×sim(ClusterA,ClusterB)
where sim(JobA, JobB) is the similarity between operations A and B, sim(ClusterA, ClusterB) is the similarity between the cluster environment information of operations A and B, sim(A, B) is the comprehensive similarity between operations A and B taking the cluster environment information into account, α is the first weight parameter in the weighting strategy, and β is the second weight parameter in the weighting strategy.
13. A distributed computing framework parameter optimization device, characterized by comprising:
An acquiring unit, configured to obtain the currently submitted distributed computing framework operation;
A first retrieval unit, configured to retrieve, in a pre-established distributed computing framework historical operation database, historical distributed computing framework operations similar to the distributed computing framework operation, where the distributed computing framework historical operation database comprises execution information and configuration parameters of historical distributed computing framework operations;
A second retrieval unit, configured to retrieve, among the similar distributed computing framework operations, the historical distributed computing framework operations similar to the distributed computing framework operation;
A dispensing unit, configured to optimize the configuration parameters of the distributed computing framework operation according to the configuration parameters of the similar historical distributed computing framework operations.
14. The parameter optimization device as claimed in claim 13, characterized in that the first retrieval unit comprises:
A checking subunit, configured to check whether the distributed computing framework operation carries specified distributed computing framework operation parameters;
An executing subunit, configured to, when the distributed computing framework operation does not carry specified distributed computing framework operation parameters, execute the step of retrieving, in the distributed computing framework historical operation runtime database, the historical distributed computing framework operations similar to the distributed computing framework operation.
15. The parameter optimization device as claimed in claim 13, characterized in that the dispensing unit is further configured to obtain, among the similar historical distributed computing framework operations, the historical distributed computing framework operation with the highest score, and to optimize the configuration parameters of the distributed computing framework operation by using the configuration parameters of the highest-scoring historical distributed computing framework operation as the configuration parameters of the distributed computing framework operation; or
to obtain, among the similar historical distributed computing framework operations, a plurality of distributed computing framework operations whose scores are greater than a preset threshold, to calculate a weighted combination of the configuration parameters of the plurality of distributed computing framework operations according to their degrees of similarity, and to optimize the configuration parameters of the distributed computing framework operation by using the weighted configuration parameters as the configuration parameters of the distributed computing framework operation.
16. The parameter optimization device as claimed in claim 13, characterized by further comprising:
A collecting unit, configured to collect the execution information and configuration parameters of the distributed computing framework operation when the distributed computing framework operation finishes executing;
A scoring unit, configured to score the execution information of the distributed computing framework operation, and to save the distributed computing framework operation into the distributed computing framework historical operation database.
17. The parameter optimization device as claimed in claim 16, characterized in that the scoring unit comprises:
An obtaining subunit, configured to obtain the time consumption and space consumption of the run of the distributed computing framework operation;
A generating subunit, configured to generate the time loss cost and the space consuming cost of the run of the distributed computing framework operation according to its time consumption and space consumption;
A scoring subunit, configured to score the run of the distributed computing framework operation according to the pre-established rating model of distributed computing framework operations and the time loss cost and space consuming cost of the run.
18. The parameter optimization device as claimed in claim 16, characterized in that the scoring unit further comprises:
An establishing subunit, configured to establish the rating model;
the rating model is F_job(τ, υ) (the expression itself is reproduced only as an image in the original publication),
where F_job(τ, υ) is the score of a run of an operation, a function of the time loss cost τ and the space consuming cost υ of the execution; α and β are the weights of the time loss cost score and the space consuming cost score, and are used to adjust whether the time loss cost or the space consuming cost is preferred; i is the index of an attribute value within the time loss cost, with τ_i the i-th time loss cost attribute value; j is the index of an attribute value within the space consuming cost, with υ_j the j-th space consuming cost attribute value; normalization factors for the two kinds of attributes (also reproduced only as images) are set so as to eliminate differences in magnitude between the cost attribute values, and F_job(τ, υ) is computed on the normalized values.
19. The parameter optimization device as claimed in claim 16, characterized in that the scoring unit further comprises:
A saving subunit, configured to save the distributed computing framework operation into the distributed computing framework historical operation database in a tree-like storage mode.
20. The parameter optimization device as claimed in claim 13, characterized in that the first retrieval unit comprises:
A first retrieval subunit, configured to retrieve, in the pre-established distributed computing framework historical operation database, the historical distributed computing framework operations similar to the distributed computing framework operation by searching the tree nodes.
21. The parameter optimization device as claimed in claim 13, characterized in that the second retrieval unit comprises:
A second retrieval subunit, configured to check whether the determinant attributes of the distributed computing framework operation are equal to the determinant attributes of a historical distributed computing framework operation;
A similarity subunit, configured to, when the determinant attributes of the distributed computing framework operation are equal to the determinant attributes of the historical distributed computing framework operation, indicate that the historical distributed computing framework operation is a historical distributed computing framework operation similar to the distributed computing framework operation.
22. The parameter optimization device as claimed in claim 21, characterized in that the second retrieval unit further comprises:
An obtaining subunit, configured to obtain the attribute fields of the distributed computing framework operation and of the historical distributed computing framework operations, where the attribute fields comprise a determinant attribute set and a non-key attribute set;
A first generating subunit, configured to generate the operation similarity and the cluster environment information similarity according to the pre-established similarity model of distributed computing framework operations and the attribute fields, where the operation similarity is the similarity between the distributed computing framework operation and a historical distributed computing framework operation, and the cluster environment information similarity is the similarity between the cluster environment information of the distributed computing framework operation and that of the historical distributed computing framework operation;
A second generating subunit, configured to generate the comprehensive similarity between the distributed computing framework operation and the historical distributed computing framework operation according to the pre-established comprehensive similarity model of distributed computing framework operations and a weighting strategy.
23. The parameter optimization device as claimed in claim 22, characterized in that the first generating subunit further comprises:
A first establishing subunit, configured to establish the similarity model of distributed computing framework operations,
the similarity model being:
sim(A, B) = [ ∏_{k∈K} (1 if p_{A,k} = p_{B,k}; 0 if p_{A,k} ≠ p_{B,k}) ] × ( Σ_{i∈I} p_{A,i}·p_{B,i} ) / ( √(Σ_{i∈I} p_{A,i}²) · √(Σ_{i∈I} p_{B,i}²) )
where K is the determinant attribute set, k is the index of a determinant attribute in the determinant attribute set, I is the non-key attribute set, i is the index of a non-key attribute in the non-key attribute set, p_{A,k} is the k-th determinant attribute of operation A, p_{A,i} is the i-th non-key attribute of operation A, p_{B,k} is the k-th determinant attribute of operation B, and p_{B,i} is the i-th non-key attribute of operation B.
24. The parameter optimization device as claimed in claim 22, characterized in that the second generating subunit further comprises:
A second establishing subunit, configured to establish the comprehensive similarity model of distributed computing framework operations,
the comprehensive similarity model being:
sim(A,B)=α×sim(JobA,JobB)+β×sim(ClusterA,ClusterB)
where sim(JobA, JobB) is the similarity between operations A and B, sim(ClusterA, ClusterB) is the similarity between the cluster environment information of operations A and B, sim(A, B) is the comprehensive similarity between operations A and B taking the cluster environment information into account, α is the first weight parameter in the weighting strategy, and β is the second weight parameter in the weighting strategy.
25. A distributed computing framework parameter optimization system, characterized by comprising the parameter optimization device as claimed in any one of claims 13 to 24, a client that submits the distributed computing framework operation, and a computing management node, where the client of the distributed computing framework operation and the computing management node are connected through the parameter optimization device.
CN201310495879.7A 2013-10-21 2013-10-21 Distributed computation frame parameter optimizing method, device and system Active CN103605662B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201310495879.7A CN103605662B (en) 2013-10-21 2013-10-21 Distributed computation frame parameter optimizing method, device and system
PCT/CN2014/084483 WO2015058578A1 (en) 2013-10-21 2014-08-15 Method, apparatus and system for optimizing distributed computation framework parameters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310495879.7A CN103605662B (en) 2013-10-21 2013-10-21 Distributed computation frame parameter optimizing method, device and system

Publications (2)

Publication Number Publication Date
CN103605662A true CN103605662A (en) 2014-02-26
CN103605662B CN103605662B (en) 2017-02-22

Family

ID=50123887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310495879.7A Active CN103605662B (en) 2013-10-21 2013-10-21 Distributed computation frame parameter optimizing method, device and system

Country Status (2)

Country Link
CN (1) CN103605662B (en)
WO (1) WO2015058578A1 (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060200552A1 (en) * 2005-03-07 2006-09-07 Beigi Mandis S Method and apparatus for domain-independent system parameter configuration
JP4401347B2 (en) * 2005-10-27 2010-01-20 シャープ株式会社 Distributed printing control system and distributed printing control method
CN101697141B (en) * 2009-10-30 2012-09-05 清华大学 Prediction method of operational performance based on historical data modeling in grid
US9367601B2 (en) * 2012-03-26 2016-06-14 Duke University Cost-based optimization of configuration parameters and cluster sizing for hadoop
CN103064664B (en) * 2012-11-28 2015-07-22 华中科技大学 Hadoop parameter automatic optimization method and system based on performance pre-evaluation
CN103605662B (en) * 2013-10-21 2017-02-22 华为技术有限公司 Distributed computation frame parameter optimizing method, device and system

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015058578A1 (en) * 2013-10-21 2015-04-30 华为技术有限公司 Method, apparatus and system for optimizing distributed computation framework parameters
US10831716B2 (en) 2014-07-31 2020-11-10 International Business Machines Corporation Method and apparatus for configuring relevant parameters of MapReduce applications
CN105302536A (en) * 2014-07-31 2016-02-03 国际商业机器公司 Configuration method and apparatus for related parameters of MapReduce application
CN105511957A (en) * 2014-09-25 2016-04-20 国际商业机器公司 Method and system for generating work alarm
US10705935B2 (en) 2014-09-25 2020-07-07 International Business Machines Corporation Generating job alert
CN105511957B (en) * 2014-09-25 2019-05-07 国际商业机器公司 For generating the method and system of operation alarm
CN104462216B (en) * 2014-11-06 2018-01-26 上海南洋万邦软件技术有限公司 Occupy committee's standard code converting system and method
CN104462216A (en) * 2014-11-06 2015-03-25 上海南洋万邦软件技术有限公司 Resident committee standard code conversion system and method
CN107209670A (en) * 2015-02-13 2017-09-26 华为技术有限公司 A kind of collocation method, device and the terminal of application attribute parameter
CN107209670B (en) * 2015-02-13 2020-01-17 华为技术有限公司 Configuration method and device of application attribute parameters and terminal
CN106021495B (en) * 2016-05-20 2017-10-31 清华大学 A kind of task parameters optimization method of distributed iterative computing system
CN106021495A (en) * 2016-05-20 2016-10-12 清华大学 Task parameter optimization method for distributed iterative computing system
CN106383746A (en) * 2016-08-30 2017-02-08 北京航空航天大学 Configuration parameter determination method and apparatus of big data processing system
CN109710395A (en) * 2017-10-26 2019-05-03 中国电信股份有限公司 Parameter optimization control method, device and distributed computing system
CN109710395B (en) * 2017-10-26 2021-05-14 中国电信股份有限公司 Parameter optimization control method and device and distributed computing system
CN108733750A (en) * 2018-04-04 2018-11-02 安徽水利开发股份有限公司 A kind of database optimizing method
CN110554910A (en) * 2018-05-30 2019-12-10 中国电信股份有限公司 Method and apparatus for optimizing distributed computing performance
CN109614236A (en) * 2018-12-07 2019-04-12 深圳前海微众银行股份有限公司 Cluster resource dynamic adjusting method, device, equipment and readable storage medium storing program for executing
CN110609850A (en) * 2019-08-01 2019-12-24 联想(北京)有限公司 Information determination method, electronic equipment and computer storage medium
CN110908803A (en) * 2019-11-22 2020-03-24 神州数码融信软件有限公司 Operation distribution method based on cosine similarity algorithm
CN110908803B (en) * 2019-11-22 2022-07-05 神州数码融信软件有限公司 Operation distribution method based on cosine similarity algorithm
WO2022121518A1 (en) * 2020-12-11 2022-06-16 清华大学 Parameter configuration optimization method and system for distributed computing jobs
US11768712B2 (en) 2020-12-11 2023-09-26 Tsinghua University Method and system for optimizing parameter configuration of distributed computing job
CN113552856A (en) * 2021-09-22 2021-10-26 成都数之联科技有限公司 Process parameter root factor positioning method and related device
CN113552856B (en) * 2021-09-22 2021-12-10 成都数之联科技有限公司 Process parameter root factor positioning method and related device
CN113688602A (en) * 2021-10-26 2021-11-23 中电云数智科技有限公司 Task processing method and device
WO2023103624A1 (en) * 2021-12-06 2023-06-15 中兴通讯股份有限公司 Task optimization method and apparatus, and computer readable storage medium
CN117269180A (en) * 2023-11-24 2023-12-22 成都数之联科技股份有限公司 Vehicle appearance detection method, device, server and computer readable storage medium
CN117269180B (en) * 2023-11-24 2024-03-12 成都数之联科技股份有限公司 Vehicle appearance detection method, device, server and computer readable storage medium
CN117573359A (en) * 2023-11-28 2024-02-20 之江实验室 Heterogeneous cluster-based computing framework management system and method

Also Published As

Publication number Publication date
CN103605662B (en) 2017-02-22
WO2015058578A1 (en) 2015-04-30

Similar Documents

Publication Publication Date Title
CN103605662A (en) Distributed computation frame parameter optimizing method, device and system
US8359305B1 (en) Query metadata engine
US8326825B2 (en) Automated partitioning in parallel database systems
US11228489B2 (en) System and methods for auto-tuning big data workloads on cloud platforms
CN105550268A (en) Big data process modeling analysis engine
Hasani et al. Lambda architecture for real time big data analytic
CN102799622A (en) Distributed structured query language (SQL) query method based on MapReduce expansion framework
CN114416855A (en) Visualization platform and method based on electric power big data
CN115335821B (en) Offloading statistics collection
CN116097247A (en) Automated ETL workflow generation
Elsayed et al. Mapreduce: State-of-the-art and research directions
Dagade et al. Big data weather analytics using hadoop
CN103646079A (en) Distributed index for graph database searching and parallel generation method of distributed index
CN103699696A (en) Data online gathering method in cloud computing environment
CN105138600A (en) Graph structure matching-based social network analysis method
CN114297173A (en) Knowledge graph construction method and system for large-scale mass data
CN106648839A (en) Method and device for processing data
CN110825526A (en) Distributed scheduling method and device based on ER relationship, equipment and storage medium
CN113722564A (en) Visualization method and device for energy and material supply chain based on space map convolution
US10417228B2 (en) Apparatus and method for analytical optimization through computational pushdown
CN114817226A (en) Government data processing method and device
US20140379691A1 (en) Database query processing with reduce function configuration
Liu et al. A survey of speculative execution strategy in MapReduce
Ma et al. Cloud-based multidimensional parallel dynamic programming algorithm for a cascade hydropower system
Jamal et al. Performance Comparison between S3, HDFS and RDS storage technologies for real-time big-data applications

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220208

Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province

Patentee after: Huawei Cloud Computing Technology Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.

TR01 Transfer of patent right