CN106021495B - A kind of task parameters optimization method of distributed iterative computing system
- Publication number
- CN106021495B (application number CN201610341201.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/217—Database tuning
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Complex Calculations (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a task parameter optimization method for a distributed iterative computing system and belongs to the technical field of distributed data processing. The method first collects the running data of historical tasks in the distributed iterative computing system and builds a historical database. When task parameters are to be optimized, running data in the historical database that is clearly irrelevant is removed in a first filtering pass according to constraint conditions. Directed acyclic graph (DAG) similarity is then computed between the running data in the historical database that corresponds to the task to be optimized and the running data that survived the first pass, and running data whose similarity is below a given threshold is removed in a second filtering pass. Finally, the results of the two filtering passes are scored and sorted, and the task parameters corresponding to the sorted running data are taken as the task parameter optimization result. The invention optimizes the task parameters of a distributed iterative computing system automatically. It is a plug-and-play, adaptive tuning method that can significantly lower the barrier to using such systems.
Description
Technical field
The invention belongs to the technical field of distributed data processing, and in particular relates to a task parameter optimization method for a distributed iterative computing system.
Background
Processing large-scale data sets with distributed iterative computing systems has become the dominant approach to data processing. Compared with traditional single-machine solutions, popular and widely used distributed iterative computing systems such as Apache Spark partition the data across multiple machines, which greatly increases the scale of data that can be processed. Because multiple machines take part in the processing flow, the degree of parallelism rises and large-scale data is processed faster.
Despite these advantages, a task in a distributed iterative computing system needs reasonable task parameters to run properly. Unreasonable task parameters slow the task down. Reasonable task parameters increase the parallelism of the task's data processing and reduce network transfer overhead and scheduling overhead, and therefore speed the task up. A distributed iterative computing system involves dozens of task parameters, and the relationships among them are intricate. Configuring task parameters is extra work for developers, and manually chosen task parameters do not necessarily perform well.
Distributed iterative computing systems thus have many task parameters that are hard to configure well, which raises the question of whether those parameters can be optimized. At present, task parameter optimization relies mainly on the experience of engineers. This approach is too subjective: experienced engineers can often arrive at good task parameters, while inexperienced engineers cannot.
Summary of the invention
The purpose of the invention is to address the problem that existing distributed iterative computing systems have many task parameters that are hard to configure well, by proposing a task parameter optimization method for such systems. The invention optimizes the task parameters of a distributed iterative computing system automatically. It is a plug-and-play, adaptive tuning method that can significantly lower the barrier to using such systems.
The task parameter optimization method proposed by the invention first collects the running data of historical tasks in the distributed iterative computing system and builds a historical database. When task parameters are to be optimized, running data in the historical database that is clearly irrelevant is removed in a first filtering pass according to constraint conditions. Directed acyclic graph (DAG) similarity is then computed between the running data in the historical database that corresponds to the task to be optimized and the running data that survived the first pass, and running data whose similarity is below a given threshold is removed in a second filtering pass. Finally, the results of the two filtering passes are scored and sorted, and the task parameters corresponding to the sorted running data are taken as the task parameter optimization result. The method comprises the following steps:
(1) Obtain the running data of each historical task from the distributed iterative computing system and save it to the historical database. Each item in the historical database represents the running data of one historical task.
(2) At the user's request, optimize the task parameters of a task in the distributed iterative computing system. Let J_src be the running data of the historical task found in the historical database that is identical to this task.
(3) From the historical database, select the historical task running data that satisfies all hardware resource constraints to form the data set S_hardware.
(4) From all running data in S_hardware obtained in step (3), select the running data whose total input data size differs from the total input data size of J_src obtained in step (2) by a relative difference smaller than the set input data size difference threshold, to form the data set S_datasize.
(5) From all running data in S_datasize obtained in step (4), select the running data whose directed acyclic graph is similar in scale to the directed acyclic graph of J_src, to form the data set S_dag.
(6) Compute the similarity between the directed acyclic graph of each item of running data in S_dag obtained in step (5) and the directed acyclic graph of J_src, and set a similarity threshold.
(7) Traverse the results of step (6) and discard from S_dag the running data whose directed acyclic graph similarity to J_src is below the set similarity threshold. Let the data set formed by the remaining running data be S_sim.
(8) Sort the running data in S_sim obtained in step (7) from high to low by the result of the scoring formula, in which time_dst denotes the running time of J_dst, and keep only the first n items of the sorted results, where n is a positive integer. Let the data set formed by the sorted results be S_rank.
(9) Display the task parameters of each item of running data in S_rank obtained in step (8) to the user on a graphical display interface. The task parameter optimization flow ends.
(10) When the user again requests optimization of a task in the distributed iterative computing system, return to step (2).
The features and beneficial effects of the task parameter optimization method proposed by the invention are as follows:
1. The method lets the computer take over the work of optimizing task parameters in a distributed iterative computing system, which reduces the user's workload. For users who are unfamiliar with the system, it supplies reasonably effective task parameters and so eases the burden of using the system.
2. The method combines empirical rules of system optimization with optimization based on similarity search, which improves the reliability and usability of task parameter optimization.
3. The method adapts to changes in the system and is therefore an adaptive tuning method. While the distributed iterative computing system runs, the method keeps collecting the running data the system produces, so the historical database grows and covers more and more categories of task. As the amount of data increases, the task parameter optimization results improve over the life of the system.
4. The method requires no modification of the existing distributed iterative computing system and is therefore a plug-and-play method.
Description of the drawings
Fig. 1 is the overall flowchart of the task parameter optimization method for a distributed iterative computing system proposed by the invention.
Detailed description
The task parameter optimization method for a distributed iterative computing system proposed by the invention is described in further detail below with reference to the drawing and a specific embodiment.
The overall flow of the method is shown in Fig. 1. The method first collects the running data of historical tasks in the distributed iterative computing system and builds a historical database. When task parameters are to be optimized, running data in the historical database that is clearly irrelevant is removed in a first filtering pass according to constraint conditions. Directed acyclic graph (DAG) similarity is then computed between the running data in the historical database that corresponds to the task to be optimized and the running data that survived the first pass, and running data whose similarity is below a given threshold is removed in a second filtering pass. Finally, the results of the two filtering passes are scored and sorted, and the task parameters corresponding to the sorted running data are taken as the task parameter optimization result. The method comprises the following steps:
(1) Obtain the running data of each historical task from the distributed iterative computing system. The running data of one historical task comprises the task parameters, the hardware resource information (total memory, number of available CPU cores, and number of machine nodes), the total input data size, and the corresponding directed acyclic graph. During execution a task is divided into multiple sub-tasks, and the directed acyclic graph reflects the dependencies among them: each node represents a sub-task, each edge represents the execution order between two sub-tasks, and the label on a node is the specific name of that sub-task. Then save the running data of each historical task to the historical database. Each item in the historical database represents the running data of one historical task.
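The record described in step (1) might be modeled as in the following minimal sketch. The class names, field names, and units are illustrative assumptions, not part of the patent text. The `run_time_s` field is included because step (8) later refers to the running time of a historical task, and the Spark parameter shown is only an example value.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Dag:
    """Directed acyclic graph of sub-tasks (hypothetical representation)."""
    labels: Dict[int, str]        # node id -> sub-task name (the node label)
    edges: Set[Tuple[int, int]]   # (earlier sub-task, later sub-task)


@dataclass
class RunRecord:
    """Running data of one historical task, as listed in step (1)."""
    params: Dict[str, str]        # task parameters used for the run
    total_memory_mb: float        # hardware: total memory
    cpu_cores: int                # hardware: available CPU cores
    num_nodes: int                # hardware: number of machine nodes
    input_size_mb: float          # total input data size
    dag: Dag                      # sub-task dependency graph
    run_time_s: float             # running time (assumed field, used for ranking)


# The historical database is modeled here as a plain in-memory list.
history: List[RunRecord] = [
    RunRecord(
        params={"spark.executor.memory": "4g"},
        total_memory_mb=65536,
        cpu_cores=32,
        num_nodes=4,
        input_size_mb=2048,
        dag=Dag(labels={0: "textFile", 1: "map", 2: "reduceByKey"},
                edges={(0, 1), (1, 2)}),
        run_time_s=120.0,
    )
]
```

A real deployment would presumably persist these records in a database rather than a list; the list only keeps the sketch self-contained.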
(2) At the user's request, optimize the task parameters of a task in the distributed iterative computing system. Let J_src be the running data of the historical task found in the historical database that is identical to this task.
(3) From the historical database, select the historical task running data that satisfies all hardware resource constraints to form the data set S_hardware. The constraints are: the relative difference between the total memory of the running data and the total memory of J_src is smaller than the set memory difference threshold (30% in this embodiment); the relative difference between the number of available CPU cores of the running data and that of J_src is smaller than the set core count difference threshold (30% in this embodiment); and the relative difference between the number of machine nodes of the running data and that of J_src is smaller than the set machine node count difference threshold (30% in this embodiment).
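The hardware filter of step (3) can be sketched as follows. The text does not say which value the relative difference is normalized by; dividing by the J_src value is an assumption made here, and the dictionary keys are hypothetical names. The 30% default is the threshold given for this embodiment.

```python
def relative_difference(value, reference):
    """Relative difference of value with respect to reference.

    Assumption: normalized by the reference (J_src) value.
    """
    if reference == 0:
        return 0.0 if value == 0 else float("inf")
    return abs(value - reference) / abs(reference)


HARDWARE_KEYS = ("total_memory_mb", "cpu_cores", "num_nodes")


def satisfies_hardware_constraints(candidate, src, threshold=0.30):
    """True if memory, CPU cores and node count are all within the threshold."""
    return all(
        relative_difference(candidate[key], src[key]) < threshold
        for key in HARDWARE_KEYS
    )


def build_s_hardware(history, src, threshold=0.30):
    """Select the records that satisfy every hardware constraint."""
    return [r for r in history if satisfies_hardware_constraints(r, src, threshold)]
```

A single threshold is used for all three resources because the embodiment sets each of them to 30%; separate thresholds per resource would be a small extension.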
(4) From all running data in S_hardware obtained in step (3), select the running data whose total input data size (in megabytes) differs from the total input data size of J_src obtained in step (2) by a relative difference smaller than the set input data size difference threshold (30% in this embodiment), to form the data set S_datasize.
(5) From all running data in S_datasize obtained in step (4), select the running data whose directed acyclic graph is similar in scale to the directed acyclic graph of J_src, to form the data set S_dag. Two directed acyclic graphs are similar in scale when both of the following conditions hold: first, the relative difference between their node counts is smaller than the set node count difference threshold (30% in this embodiment); second, the relative difference between their edge counts is smaller than the set edge count difference threshold (30% in this embodiment).
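Steps (4) and (5) narrow S_hardware in two further passes, which could look like the sketch below. As before, normalizing the relative difference by the J_src value is an assumption, and the keys `input_size_mb`, `dag_nodes`, and `dag_edges` are hypothetical names for the quantities the steps compare.

```python
def relative_difference(value, reference):
    """Relative difference with respect to the J_src value (assumed normalization)."""
    if reference == 0:
        return 0.0 if value == 0 else float("inf")
    return abs(value - reference) / abs(reference)


def filter_by_size_and_dag_scale(s_hardware, src, threshold=0.30):
    """Return (S_datasize, S_dag) computed from S_hardware."""
    # Step (4): keep records with a similar total input data size.
    s_datasize = [
        r for r in s_hardware
        if relative_difference(r["input_size_mb"], src["input_size_mb"]) < threshold
    ]
    # Step (5): keep records whose DAG has similar node AND edge counts.
    s_dag = [
        r for r in s_datasize
        if relative_difference(r["dag_nodes"], src["dag_nodes"]) < threshold
        and relative_difference(r["dag_edges"], src["dag_edges"]) < threshold
    ]
    return s_datasize, s_dag
```

These cheap numeric checks run before the DAG similarity computation of step (6), so the more expensive graph comparison only sees candidates that are already plausible.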
(6) Compute the similarity between the directed acyclic graph of each item of running data in S_dag obtained in step (5) and the directed acyclic graph of J_src, and set a similarity threshold. The computation proceeds as follows:
(6-1) Let J_dst be any item of running data in S_dag, and compute the similarity between the directed acyclic graph of J_dst and that of J_src. Define the directed acyclic graph of J_src as G_src = (N_src, E_src, L_src), where N_src is the set of nodes in G_src, E_src is the set of edges in G_src, and L_src is the set of labels on the nodes of G_src. Define the directed acyclic graph of J_dst as G_dst = (N_dst, E_dst, L_dst), where N_dst is the set of nodes in G_dst, E_dst is the set of edges in G_dst, and L_dst is the set of labels on the nodes of G_dst.
(6-2) The similarity between the directed acyclic graphs of J_dst and J_src is defined by a formula in which:
sim(G_src, G_dst) is the similarity between the directed acyclic graph of J_dst and that of J_src, with values in the range [0, 1];
skipN(G_src, G_dst) is the total number of nodes added to or deleted from G_src and G_dst in the process of making the two graphs equal;
skipE(G_src, G_dst) is the total number of edges added to or deleted from G_src and G_dst in the process of making the two graphs equal;
n_src and n_dst are any node in G_src and any node in G_dst, respectively;
l_src and l_dst are the labels of nodes n_src and n_dst, respectively;
edit(l_src, l_dst) is the edit distance between the labels l_src and l_dst, that is, the minimum number of edit operations needed to transform l_src into l_dst (the allowed operations are: replacing one character of l_src with another character; deleting one character from l_src; inserting one character into l_src);
|l_src| and |l_dst| are the string lengths of labels l_src and l_dst, respectively.
(6-3) Repeat steps (6-1) and (6-2) to obtain the similarity between the directed acyclic graph of each item of running data in S_dag and the directed acyclic graph of J_src.
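The label edit distance edit(l_src, l_dst) used in step (6-2) is the standard Levenshtein distance with the three operations listed there, and can be computed with dynamic programming as sketched below. The `label_similarity` helper, which normalizes by the longer label length, is an illustrative assumption about how the distance and the lengths |l_src| and |l_dst| might be combined into a value in [0, 1]; it is not the patent's sim formula.

```python
def edit_distance(l_src: str, l_dst: str) -> int:
    """Minimum number of substitutions, deletions and insertions
    needed to turn l_src into l_dst (Levenshtein distance)."""
    # prev[j] holds the distance between the processed prefix of l_src
    # and the first j characters of l_dst.
    prev = list(range(len(l_dst) + 1))
    for i, c_src in enumerate(l_src, start=1):
        curr = [i]
        for j, c_dst in enumerate(l_dst, start=1):
            substitution = prev[j - 1] + (0 if c_src == c_dst else 1)
            deletion = prev[j] + 1        # drop c_src
            insertion = curr[j - 1] + 1   # add c_dst
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def label_similarity(l_src: str, l_dst: str) -> float:
    """Assumed normalization: 1 - edit / max(|l_src|, |l_dst|), in [0, 1]."""
    longest = max(len(l_src), len(l_dst))
    if longest == 0:
        return 1.0  # two empty labels are treated as identical
    return 1.0 - edit_distance(l_src, l_dst) / longest
```

For example, identical sub-task names give a label similarity of 1.0, while names that share no characters give 0.0.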
(7) Traverse the results of step (6) and discard from S_dag the running data whose directed acyclic graph similarity to J_src is below the set similarity threshold (0.3 in this embodiment). Let the data set formed by the remaining running data be S_sim.
(8) Sort the running data in S_sim obtained in step (7) from high to low by the result of the scoring formula, in which time_dst denotes the running time of J_dst, and keep only the first n items of the sorted results, where n is a positive integer (the number kept is decided case by case; this embodiment keeps the first 10). The formula takes two factors into account: the directed acyclic graph similarity between J_dst and J_src, and the running time of the historical task corresponding to J_dst. Under this metric, the higher the directed acyclic graph similarity or the shorter the running time of the historical task, the higher the running data ranks. Let the data set formed by the sorted results be S_rank.
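Steps (7) and (8) can be sketched together as below. The score used here, similarity divided by running time, is an assumption chosen only because it grows with similarity and shrinks with running time, as the text requires; the patent's own scoring formula may differ. The keys `sim` and `time_s` are hypothetical, while the defaults 0.3 and 10 are the values given for this embodiment.

```python
def rank_candidates(s_dag, sim_threshold=0.3, top_n=10):
    """Return S_rank: threshold-filtered records, best score first.

    Each record is a dict with 'sim' (DAG similarity to J_src, in [0, 1])
    and 'time_s' (running time, assumed to be strictly positive).
    """
    # Step (7): discard records whose similarity is below the threshold.
    s_sim = [r for r in s_dag if r["sim"] >= sim_threshold]
    # Step (8): sort from high to low by the (assumed) score sim / time.
    s_sim.sort(key=lambda r: r["sim"] / r["time_s"], reverse=True)
    return s_sim[:top_n]
```

With this score, a moderately similar task that finished quickly can outrank a very similar task that ran slowly, which matches the stated intent of balancing the two factors.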
(9) Display the task parameters of each item of running data in S_rank obtained in step (8) to the user on a graphical display interface. The task parameter optimization flow ends.
(10) When the user again requests optimization of a task in the distributed iterative computing system, return to step (2).
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610341201.7A CN106021495B (en) | 2016-05-20 | 2016-05-20 | A kind of task parameters optimization method of distributed iterative computing system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106021495A CN106021495A (en) | 2016-10-12 |
CN106021495B (en) | 2017-10-31 |
Family
ID=57095624
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610341201.7A Active CN106021495B (en) | 2016-05-20 | 2016-05-20 | A kind of task parameters optimization method of distributed iterative computing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106021495B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109710395B (en) * | 2017-10-26 | 2021-05-14 | 中国电信股份有限公司 | Parameter optimization control method and device and distributed computing system |
CN108108843A (en) * | 2017-12-22 | 2018-06-01 | 冶金自动化研究设计院 | A kind of industrial data optimization system iterated to calculate online based on label data |
CN114840555A (en) * | 2022-05-26 | 2022-08-02 | 中国平安财产保险股份有限公司 | Script optimization method, device, equipment and storage medium |
CN117077598B (en) * | 2023-10-13 | 2024-01-26 | 青岛展诚科技有限公司 | 3D parasitic parameter optimization method based on Mini-batch gradient descent method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103175516A (en) * | 2013-02-26 | 2013-06-26 | 中国人民解放军信息工程大学 | Distributed computing method for adjustment of large-scale geodesic control net |
CN103605662A (en) * | 2013-10-21 | 2014-02-26 | 华为技术有限公司 | Distributed computation frame parameter optimizing method, device and system |
US9015083B1 (en) * | 2012-03-23 | 2015-04-21 | Google Inc. | Distribution of parameter calculation for iterative optimization methods |
CN104679590A (en) * | 2013-11-27 | 2015-06-03 | 阿里巴巴集团控股有限公司 | Map optimization method and device in distributive calculating system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9582775B2 (en) * | 2012-12-10 | 2017-02-28 | International Business Machines Corporation | Techniques for iterative reduction of uncertainty in water distribution networks |
Also Published As
Publication number | Publication date |
---|---|
CN106021495A (en) | 2016-10-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||