CN110413389A - Task scheduling optimization method under resource imbalance Spark environment - Google Patents

Task scheduling optimization method under resource imbalance Spark environment

Info

Publication number
CN110413389A
Authority
CN
China
Prior art keywords
node
cpu
task
priority
spark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910669809.6A
Other languages
Chinese (zh)
Other versions
CN110413389B (en)
Inventor
胡亚红
盛夏
毛家发
吴寅超
邱圆圆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201910669809.6A
Publication of CN110413389A
Application granted
Publication of CN110413389B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/48 Indexing scheme relating to G06F9/48
    • G06F2209/484 Precedence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5021 Priority

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a task scheduling optimization method for a resource-imbalanced Spark environment. The invention optimizes Spark's underlying scheduling algorithm and proposes the Spark Dynamic Adaptive Scheduling Algorithm (SDASA), which is based on node priority. SDASA uses a node's priority to represent its computing capability and updates that priority in real time while tasks run, fully taking into account node heterogeneity, resource utilization, and load. Experiments show that SDASA improves the operating efficiency of the Spark system and shortens job execution time. When tasks of the same type are executed on different data volumes, cluster performance improves by 6.99% on average with SDASA; when tasks of different types are executed, it improves by 6.32% on average.

Description

Task scheduling optimization method under resource imbalance Spark environment
Technical field
The present invention relates to the field of big data processing, and in particular to a task scheduling optimization method for a resource-imbalanced Spark environment.
Background art
As large data centers, supercomputing centers, and Internet companies upgrade their facilities and introduce high-performance components (such as GPUs), the nodes in a cluster gradually become heterogeneous. Differences in CPU, memory, and other characteristics give compute nodes different processing capacities. The overall computing capability of the nodes therefore varies considerably, and the cluster as a whole is in a resource-imbalanced state. Because node capabilities differ, assigning the same task to different nodes affects node load differently. Spark's default task scheduling is designed on the idealized assumption that cluster nodes are homogeneous. It considers neither cluster heterogeneity nor changes in node resource utilization and load, so it cannot meet requirements such as system efficiency and load balancing in a resource-heterogeneous environment.
At present, research on task scheduling in parallel frameworks focuses mainly on the Hadoop platform; task scheduling algorithms for resource-imbalanced Spark environments have received comparatively little study. One adaptive task scheduling method improves cluster performance by monitoring node load and resource utilization. However, that algorithm does not consider a sufficiently comprehensive set of influencing factors, and its weights depend heavily on preset thresholds, which makes it rather subjective. Some task scheduling optimization algorithms based on artificial intelligence and biological heuristics, such as ant colony algorithms and genetic algorithms, can perform multi-objective optimization, but their principles are complex and they are computationally expensive to run, so their scheduling efficiency is low. Therefore, to improve Spark's performance in a resource-imbalanced environment, an efficient task scheduling algorithm is needed.
Summary of the invention
To overcome the above shortcomings, the object of the present invention is to provide a task scheduling optimization method for a resource-imbalanced Spark environment. By analyzing the computing capability of each node in the cluster, the invention optimizes Spark's underlying scheduling algorithm and proposes the Spark Dynamic Adaptive Scheduling Algorithm (SDASA), which is based on node priority. SDASA fully considers node heterogeneity, resource utilization, and load, and can improve the operating efficiency of the Spark system and shorten job execution time.
The present invention achieves the above object through the following technical solution: a task scheduling optimization method for a resource-imbalanced Spark environment, comprising the following steps:
(1) Screen the static factors and dynamic factors that affect node priority, establish a node priority evaluation index system, and calculate the weight of each index;
(2) Deploy the distributed cluster resource monitoring system Ganglia in the cluster; when the cluster starts, trigger the monitoring system to start its heartbeat;
(3) When the cluster is established or a new node joins the cluster, the Master node calculates the static performance index value of each Slave node or of the newly added node;
(4) The Master node calculates the dynamic performance index value of each Slave node;
(5) The Master node calculates the priority of each Slave node;
(6) The Master node reads the priority of each Slave node and sorts the nodes by priority value;
(7) The Master node selects Slave nodes according to the sorting result, traverses the selected nodes, and assigns each task that needs to run to the Slave node with the highest degree of locality;
(8) If task execution has finished, return the task execution result; otherwise return to step (3).
Preferably, step (1) is specifically as follows:
(1.1) Use principal component analysis to determine that the static factors of a node are its CPU speed, number of CPU cores, memory size, and disk capacity;
(1.2) Use principal component analysis to determine that the dynamic factors of a node are its CPU remaining ratio, memory remaining ratio, disk capacity remaining ratio, and CPU load;
(1.3) Based on the analysis results of steps (1.1) and (1.2), establish the node priority evaluation index system and assess the importance of each index;
(1.4) Obtain the weight of each static factor and dynamic factor using the analytic hierarchy process (AHP).
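Step (1.4) can be illustrated with a minimal sketch. It assumes the common row geometric-mean approximation of AHP weights together with Saaty's consistency ratio; the source does not say which AHP variant is used, and the pairwise judgment matrix below is hypothetical, not taken from the patent.

```python
import math

# Saaty's random consistency index for matrix sizes 1..5.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Approximate AHP weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Return CR = CI / RI; CR < 0.1 is the usual acceptance threshold."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical expert judgments for the four static factors, in the order
# CPU speed, CPU cores, memory size, disk capacity (entry [i][j] states how
# much more important factor i is than factor j).
STATIC_JUDGMENTS = [
    [1,     2,     3,   5],
    [1 / 2, 1,     2,   4],
    [1 / 3, 1 / 2, 1,   2],
    [1 / 5, 1 / 4, 1 / 2, 1],
]

weights = ahp_weights(STATIC_JUDGMENTS)
cr = consistency_ratio(STATIC_JUDGMENTS, weights)
```

The resulting `weights` sum to 1, as the method requires of n_1 to n_4 and m_1 to m_4; a judgment matrix whose `cr` is 0.1 or more would normally be sent back to the experts for revision.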
Preferably, step (3) is specifically as follows:
(3.1) Each Slave node obtains its own static factor values through the Ganglia cluster resource monitoring system, including CPU speed s_cpu_speed, number of CPU cores s_cpu_num, memory size s_mem, and disk capacity s_disk;
(3.2) The Slave nodes send the collected data to the Master node by unicast;
(3.3) The Master node calculates the static performance index S_i of the i-th Slave node using formula (1), where i = 1 to h and h is the number of Slave nodes in the cluster;
where n_1, n_2, n_3, n_4 are the weights of the static factors CPU speed, number of CPU cores, memory size, and disk capacity, respectively, and n_1 + n_2 + n_3 + n_4 = 1; the values of n_1, n_2, n_3, n_4 are calculated using the analytic hierarchy process.
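Formula (1) is not reproduced in this text, so the following is only a sketch under stated assumptions: S_i is taken to be a weighted sum of the four static factors, and each factor is scaled by its cluster-wide maximum so that values in different units become comparable. Both the linear form and the normalization are assumptions; the weights and hardware figures are hypothetical.

```python
STATIC_KEYS = ("cpu_speed", "cpu_num", "mem", "disk")


def static_indices(nodes, weights):
    """Return {node name: S_i}, assuming a weighted sum of max-normalized factors."""
    if abs(sum(weights[k] for k in STATIC_KEYS) - 1.0) > 1e-9:
        raise ValueError("static weights n_1..n_4 must sum to 1")
    # Assumed normalization: divide each factor by the largest value in the cluster.
    maxima = {k: max(node[k] for node in nodes.values()) for k in STATIC_KEYS}
    return {
        name: sum(weights[k] * node[k] / maxima[k] for k in STATIC_KEYS)
        for name, node in nodes.items()
    }


# Hypothetical AHP weights (n_1..n_4) and hardware (MHz, cores, GB, GB).
WEIGHTS = {"cpu_speed": 0.4, "cpu_num": 0.3, "mem": 0.2, "disk": 0.1}
NODES = {
    "slave1": {"cpu_speed": 3200, "cpu_num": 8, "mem": 32, "disk": 1000},
    "slave2": {"cpu_speed": 2400, "cpu_num": 4, "mem": 16, "disk": 500},
}

S = static_indices(NODES, WEIGHTS)
```

Because static factors do not change while tasks run, this computation only needs to be repeated when the cluster is established or a node joins, which matches the trigger described in step (3).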
Preferably, step (4) is specifically as follows:
(4.1) Each Slave node periodically obtains its own dynamic factor values at the interval specified in the Ganglia cluster resource monitoring system configuration file, including CPU remaining ratio d_cpu, memory remaining ratio d_mem, disk capacity remaining ratio d_disk, and CPU load d_length;
(4.2) The Slave nodes send the collected data to the Master node by unicast;
(4.3) The Master node calculates the dynamic performance index D_i of the i-th Slave node using formula (2), where i = 1 to h and h is the number of Slave nodes in the cluster;
where m_1, m_2, m_3, m_4 are the weights of the dynamic factors CPU remaining ratio, memory remaining ratio, disk capacity remaining ratio, and CPU load, respectively, and m_1 + m_2 + m_3 + m_4 = 1; the values of m_1, m_2, m_3, m_4 are calculated using the analytic hierarchy process.
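Formula (2) is likewise not reproduced in this text. The sketch below assumes a weighted sum in which the three remaining ratios (already in the range 0 to 1) are used directly and the CPU load, for which a smaller value is better, is converted to a score of 1 / (1 + load). That conversion is an assumption made for illustration, as are the weights and samples.

```python
def dynamic_index(sample, weights):
    """Return D_i for one monitoring sample, assuming a weighted linear form."""
    # Assumed transform: a longer CPU queue lowers the score toward 0.
    load_score = 1.0 / (1.0 + sample["load"])
    return (
        weights["cpu"] * sample["cpu_free"]
        + weights["mem"] * sample["mem_free"]
        + weights["disk"] * sample["disk_free"]
        + weights["load"] * load_score
    )


# Hypothetical AHP weights m_1..m_4 (they sum to 1).
DYNAMIC_WEIGHTS = {"cpu": 0.4, "mem": 0.3, "disk": 0.1, "load": 0.2}

# Two hypothetical samples: a mostly idle node and a busy node.
idle_node = {"cpu_free": 0.9, "mem_free": 0.8, "disk_free": 0.7, "load": 0.0}
busy_node = {"cpu_free": 0.2, "mem_free": 0.3, "disk_free": 0.7, "load": 3.0}

d_idle = dynamic_index(idle_node, DYNAMIC_WEIGHTS)
d_busy = dynamic_index(busy_node, DYNAMIC_WEIGHTS)
```

Unlike the static index, this value would be recomputed each time a new monitoring sample arrives, so a node that becomes loaded sees its index fall on the next update.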
Preferably, step (5) is specifically as follows: using the static index value S_i and dynamic index value D_i of each Slave node obtained in steps (3) and (4), the Master node calculates the priority of each node using formula (3):
P_i = αD_i + βS_i (3)
where α and β are the weights of D_i and S_i, respectively, and are calculated using the analytic hierarchy process.
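Formula (3) and the sorting in step (6) can be sketched directly. The values of α and β and the index values below are hypothetical; in the method they come from AHP and from steps (3) and (4).

```python
def priorities(static_idx, dynamic_idx, alpha, beta):
    """Apply formula (3): P_i = alpha * D_i + beta * S_i for every node."""
    return {
        node: alpha * dynamic_idx[node] + beta * static_idx[node]
        for node in static_idx
    }


def rank_nodes(prio):
    """Return node names sorted by descending priority (step (6))."""
    return sorted(prio, key=lambda node: prio[node], reverse=True)


# Hypothetical index values: slave1 has the best hardware but is busy.
S = {"slave1": 1.0, "slave2": 0.6, "slave3": 0.8}
D = {"slave1": 0.2, "slave2": 0.9, "slave3": 0.5}

P = priorities(S, D, alpha=0.6, beta=0.4)  # alpha and beta are assumed values
ranking = rank_nodes(P)
```

In this example the node with the strongest hardware ranks last because its dynamic index is low, which is the behavior the priority is meant to capture: ranking follows current capacity, not nominal capacity.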
Preferably, step (7) is specifically as follows:
(7.1) The Master node traverses the node set WorkerOffer in order of node priority;
(7.2) For each node, traverse each task in the task set in turn and execute step (7.3) in a loop;
(7.3) Obtain the locality parameter of the task on the current node; if the parameter is the maximum, execute step (7.4); otherwise execute step (7.2);
(7.4) Assign the task to the node.
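Steps (7.1) to (7.4) can be read as a greedy loop, sketched below. Several details are assumptions not given in the source: locality is modeled as a numeric score per task and node (higher is better, 0 when no preference is recorded), each node has a fixed number of free slots, "maximum" is interpreted as the best locality among nodes that still have a free slot, and the traversal repeats until no further assignment is possible.

```python
def assign_tasks(ranked_nodes, tasks, locality, slots):
    """Greedy sketch of steps (7.1)-(7.4); returns {task: node}."""
    free = dict(slots)
    assignment = {}
    progress = True
    while progress:
        progress = False
        for node in ranked_nodes:  # (7.1) nodes in descending priority
            for task in tasks:  # (7.2) every task in the task set
                if task in assignment or free[node] == 0:
                    continue
                # (7.3) best locality this task can get on any node with a free slot
                candidates = [n for n in ranked_nodes if free[n] > 0]
                best = max(locality[task].get(n, 0) for n in candidates)
                if locality[task].get(node, 0) == best:
                    assignment[task] = node  # (7.4) assign the task to this node
                    free[node] -= 1
                    progress = True
    return assignment


# Hypothetical input: nodes already sorted by priority, one free slot each.
ranked = ["slave2", "slave3", "slave1"]
slots = {"slave1": 1, "slave2": 1, "slave3": 1}
locality = {
    "t1": {"slave1": 2, "slave2": 2},  # tie, so the higher-priority node wins
    "t2": {"slave3": 2},
    "t3": {},  # no locality preference
}

result = assign_tasks(ranked, ["t1", "t2", "t3"], locality, slots)
```

Visiting nodes in priority order means that ties in locality are resolved in favor of the higher-priority node, so priority and locality both influence placement.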
The beneficial effects of the invention are as follows: the invention uses priority to describe the computing capability of the nodes in a resource-imbalanced heterogeneous cluster and schedules tasks according to node priority. While the cluster is running, the dynamic factor values of each Slave node are obtained in real time and the node priorities are updated. The proposed algorithm schedules tasks according to the current performance of each node, which effectively improves cluster performance and shortens task execution time.
Description of the drawings
Fig. 1 is a schematic flowchart of the method of the invention;
Fig. 2 is a schematic diagram of the node priority evaluation index system of the invention;
Fig. 3 is the implementation architecture diagram of the SDASA algorithm of the invention;
Fig. 4 compares the task completion times of the SDASA algorithm of the invention and Spark's default algorithm when executing tasks of the same type on different data volumes;
Fig. 5 compares the task completion times of the SDASA algorithm of the invention and Spark's default algorithm when executing tasks of different types.
Specific embodiments
The present invention is further described below with reference to a specific embodiment, but the scope of protection of the invention is not limited thereto:
Embodiment: Spark's default task scheduling is designed on the idealized assumption that cluster nodes are homogeneous. To address this problem, the invention analyzes the computing capability of each node in the cluster, optimizes Spark's underlying scheduling algorithm, and proposes the Spark Dynamic Adaptive Scheduling Algorithm (SDASA), which is based on node priority. SDASA fully considers node heterogeneity, resource utilization, and load, and can improve the operating efficiency of the Spark system and shorten job execution time.
A node's computing capability is represented by its priority: the higher the priority, the stronger the node's computing capability and the greater the probability that it is selected to execute a task. Node priority is calculated from a set of indices that describe node performance (node performance indices). The node performance indices comprise a static performance index and a dynamic performance index. The static performance index is independent of task execution state, and its value is determined by several static factors. The dynamic performance index changes with task execution state, and its value is determined by several dynamic factors.
As shown in Fig. 1, a task scheduling optimization method for a resource-imbalanced Spark environment comprises the following steps:
(1) Screen the static factors and dynamic factors that affect node priority, establish a node priority evaluation index system, and calculate the weight of each index.
(1.1) Analyze the factors that affect node performance and establish the node priority evaluation index system, as shown in Fig. 2. The analysis uses principal component analysis to determine that the static factors of a node are its CPU speed, number of CPU cores, memory size, and disk capacity, and that the dynamic factors of a node are its CPU remaining ratio, memory remaining ratio, disk capacity remaining ratio, and CPU load (that is, the length of the CPU run queue).
(1.2) Domain experts assess the importance of each index;
(1.3) The weight of each static performance index and dynamic performance index is calculated using the analytic hierarchy process.
(2) Deploy the distributed cluster resource monitoring system Ganglia in the cluster to monitor information such as the memory, CPU, hard disk, and network traffic of each Slave node. When the cluster starts, trigger the monitoring system to start its heartbeat.
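As an illustration of how monitored values could be turned into the factors used by the method, the sketch below parses a small XML report shaped like the output of Ganglia's gmond daemon. The sample document is hand-written and trimmed, and the mapping from Ganglia metric names (cpu_speed, cpu_num, mem_total, disk_total, cpu_idle, mem_free, disk_free, load_one) to the patent's static and dynamic factors is an assumption; the source does not state which metrics are read.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample in the shape of a gmond XML report.
SAMPLE_XML = """
<GANGLIA_XML VERSION="3.7.2" SOURCE="gmond">
  <CLUSTER NAME="spark">
    <HOST NAME="slave1" IP="10.0.0.11">
      <METRIC NAME="cpu_speed" VAL="3200"/>
      <METRIC NAME="cpu_num" VAL="8"/>
      <METRIC NAME="mem_total" VAL="33554432"/>
      <METRIC NAME="disk_total" VAL="1000.0"/>
      <METRIC NAME="cpu_idle" VAL="75.0"/>
      <METRIC NAME="mem_free" VAL="8388608"/>
      <METRIC NAME="disk_free" VAL="400.0"/>
      <METRIC NAME="load_one" VAL="1.5"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""


def read_factors(xml_text):
    """Return {host: {"static": {...}, "dynamic": {...}}} from a gmond-style report."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for host in root.iter("HOST"):
        m = {e.get("NAME"): float(e.get("VAL")) for e in host.iter("METRIC")}
        result[host.get("NAME")] = {
            # Assumed mapping to the static factors of step (3.1).
            "static": {
                "cpu_speed": m["cpu_speed"],
                "cpu_num": m["cpu_num"],
                "mem": m["mem_total"],
                "disk": m["disk_total"],
            },
            # Assumed mapping to the dynamic factors of step (4.1).
            "dynamic": {
                "cpu_free": m["cpu_idle"] / 100.0,
                "mem_free": m["mem_free"] / m["mem_total"],
                "disk_free": m["disk_free"] / m["disk_total"],
                "load": m["load_one"],
            },
        }
    return result


factors = read_factors(SAMPLE_XML)
```

Expressing the remaining quantities as ratios of free to total capacity keeps them in the range 0 to 1 regardless of the units in which the monitor reports them.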
(3) When the cluster is established or a new node joins the cluster, the Master node calculates the static performance index value of each Slave node or of the newly added node.
(3.1) When the cluster is established or a new node joins the cluster, each Slave node (or the newly added Slave node) uses Ganglia to obtain its own static factor values, including CPU speed s_cpu_speed, number of CPU cores s_cpu_num, memory size s_mem, and disk capacity s_disk;
(3.2) Each Slave node sends the collected data to the Master node by unicast;
(3.3) The Master node calculates the static performance index S_i of the i-th Slave node using formula (1), where i = 1 to h and h is the number of Slave nodes in the cluster.
(4) The Master node calculates the dynamic performance index value of each Slave node.
(4.1) Each Slave node periodically obtains its own dynamic factor values at the interval specified in the Ganglia system configuration file, including CPU remaining ratio d_cpu, memory remaining ratio d_mem, disk capacity remaining ratio d_disk, and CPU load d_length;
(4.2) The Slave nodes send the collected data to the Master node by unicast;
(4.3) The Master node calculates the dynamic performance index D_i of the i-th Slave node using formula (2), where i = 1 to h and h is the number of Slave nodes in the cluster.
(5) The Master node calculates the priority of each node.
When a node sorting request arrives, the Master node reads the static index value S_i and dynamic index value D_i of each node from the database and calculates the priority of each node using formula (3).
(6) The Master node reads the priority of each Slave node and sorts the nodes by priority value.
(7) The Master node selects Slave nodes in order of priority, then traverses the selected nodes and assigns each task that needs to run to the Slave node with the highest degree of locality.
(8) If task execution has finished, return the task execution result; otherwise return to step (3).
The above method is implemented on the architecture shown in Fig. 3. The experimental comparison between the method of the invention and Spark's default task scheduling algorithm is shown in Fig. 4 and Fig. 5.
In summary, on the basis of the node priority evaluation index system, the invention uses the analytic hierarchy process to determine the weight of each static factor and dynamic factor. The SDASA algorithm obtains the dynamic index value of each Slave node in real time, calculates node priorities, and assigns tasks according to the priority of each node. Experiments show that, compared with Spark's default scheduling algorithm, the proposed algorithm effectively improves cluster performance. When tasks of the same type are executed on different data volumes, cluster performance improves by 6.99% on average with SDASA; when tasks of different types are executed, it improves by 6.32% on average.
The above describes specific embodiments of the invention and the technical principles employed. Any change made in accordance with the concept of the invention whose resulting function does not go beyond the spirit covered by the specification and drawings shall fall within the scope of protection of the invention.

Claims (6)

1. A task scheduling optimization method for a resource-imbalanced Spark environment, characterized by comprising the following steps:
(1) screening the static factors and dynamic factors that affect node priority, establishing a node priority evaluation index system, and calculating the weight of each index;
(2) deploying the distributed cluster resource monitoring system Ganglia in the cluster and, when the cluster starts, triggering the monitoring system to start its heartbeat;
(3) when the cluster is established or a new node joins the cluster, calculating, by the Master node, the static performance index value of each Slave node or of the newly added node;
(4) calculating, by the Master node, the dynamic performance index value of each Slave node;
(5) calculating, by the Master node, the priority of each Slave node;
(6) reading, by the Master node, the priority of each Slave node and sorting the nodes by priority value;
(7) selecting, by the Master node, Slave nodes according to the sorting result, traversing the selected nodes, and assigning each task that needs to run to the Slave node with the highest degree of locality;
(8) if task execution has finished, returning the task execution result; otherwise returning to step (3).
2. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (1) is specifically as follows:
(1.1) using principal component analysis to determine that the static factors of a node are its CPU speed, number of CPU cores, memory size, and disk capacity;
(1.2) using principal component analysis to determine that the dynamic factors of a node are its CPU remaining ratio, memory remaining ratio, disk capacity remaining ratio, and CPU load;
(1.3) based on the analysis results of steps (1.1) and (1.2), establishing the node priority evaluation index system and assessing the importance of each index;
(1.4) obtaining the weight of each static factor and dynamic factor using the analytic hierarchy process.
3. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (3) is specifically as follows:
(3.1) each Slave node obtains its own static factor values through the Ganglia cluster resource monitoring system, including CPU speed s_cpu_speed, number of CPU cores s_cpu_num, memory size s_mem, and disk capacity s_disk;
(3.2) the Slave nodes send the collected data to the Master node by unicast;
(3.3) the Master node calculates the static performance index S_i of the i-th Slave node using formula (1), where i = 1 to h and h is the number of Slave nodes in the cluster;
where n_1, n_2, n_3, n_4 are the weights of the static factors CPU speed, number of CPU cores, memory size, and disk capacity, respectively, and n_1 + n_2 + n_3 + n_4 = 1; the values of n_1, n_2, n_3, n_4 are calculated using the analytic hierarchy process.
4. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (4) is specifically as follows:
(4.1) each Slave node periodically obtains its own dynamic factor values at the interval specified in the Ganglia cluster resource monitoring system configuration file, including CPU remaining ratio d_cpu, memory remaining ratio d_mem, disk capacity remaining ratio d_disk, and CPU load d_length;
(4.2) the Slave nodes send the collected data to the Master node by unicast;
(4.3) the Master node calculates the dynamic performance index D_i of the i-th Slave node using formula (2), where i = 1 to h and h is the number of Slave nodes in the cluster;
where m_1, m_2, m_3, m_4 are the weights of the dynamic factors CPU remaining ratio, memory remaining ratio, disk capacity remaining ratio, and CPU load, respectively, and m_1 + m_2 + m_3 + m_4 = 1; the values of m_1, m_2, m_3, m_4 are calculated using the analytic hierarchy process.
5. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (5) is specifically as follows: using the static index value S_i and dynamic index value D_i of each Slave node obtained in steps (3) and (4), the Master node calculates the priority of each node using formula (3):
P_i = αD_i + βS_i (3)
where α and β are the weights of D_i and S_i, respectively, and are calculated using the analytic hierarchy process.
6. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (7) is specifically as follows:
(7.1) the Master node traverses the node set WorkerOffer in order of node priority;
(7.2) for each node, each task in the task set is traversed in turn and step (7.3) is executed in a loop;
(7.3) the locality parameter of the task on the current node is obtained; if the parameter is the maximum, step (7.4) is executed; otherwise step (7.2) is executed;
(7.4) the task is assigned to the node.
CN201910669809.6A 2019-07-24 2019-07-24 Task scheduling optimization method under resource imbalance Spark environment Active CN110413389B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910669809.6A CN110413389B (en) 2019-07-24 2019-07-24 Task scheduling optimization method under resource imbalance Spark environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910669809.6A CN110413389B (en) 2019-07-24 2019-07-24 Task scheduling optimization method under resource imbalance Spark environment

Publications (2)

Publication Number Publication Date
CN110413389A (en) 2019-11-05
CN110413389B (en) 2021-09-28

Family

ID=68362792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910669809.6A Active CN110413389B (en) 2019-07-24 2019-07-24 Task scheduling optimization method under resource imbalance Spark environment

Country Status (1)

Country Link
CN (1) CN110413389B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110928666A (en) * 2019-12-09 2020-03-27 湖南大学 Method and system for optimizing task parallelism based on memory in Spark environment
CN110928648A (en) * 2019-12-10 2020-03-27 浙江工商大学 Heuristic and intelligent computing-fused cloud workflow segmentation online scheduling optimization method
CN110955526A (en) * 2019-12-16 2020-04-03 湖南大学 Method and system for realizing multi-GPU scheduling in distributed heterogeneous environment
CN111459628A (en) * 2020-03-12 2020-07-28 重庆邮电大学 Spark platform task scheduling method based on improved quantum ant colony algorithm
CN111694789A (en) * 2020-04-22 2020-09-22 西安电子科技大学 Embedded reconfigurable heterogeneous determination method, system, storage medium and processor
CN111985845A (en) * 2020-09-02 2020-11-24 浙江工业大学 Node priority tuning method for heterogeneous Spark cluster
CN112068959A (en) * 2020-09-04 2020-12-11 北京明略昭辉科技有限公司 Self-adaptive task scheduling method and system and retrieval method comprising method
CN112231081A (en) * 2020-10-14 2021-01-15 山东大学 PSO-AHP-based monotonic rate resource scheduling method and system in cloud environment
CN112256434A (en) * 2020-10-30 2021-01-22 中国科学院信息工程研究所 Resource matching method in encrypted data cracking scene
CN112764906A (en) * 2021-01-26 2021-05-07 浙江工业大学 Cluster resource scheduling method based on user job type and node performance bias
CN113377495A (en) * 2021-05-17 2021-09-10 杭州中港科技有限公司 Method for optimizing docker cluster deployment based on heuristic ant colony algorithm
CN114780247A (en) * 2022-05-17 2022-07-22 中国地质大学(北京) Flow application scheduling method and system with flow rate and resource sensing
CN115408136A (en) * 2022-11-01 2022-11-29 安徽思高智能科技有限公司 RPA flow scheduling method based on genetic algorithm

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060184939A1 (en) * 2005-02-15 2006-08-17 International Business Machines Corporation Method for using a priority queue to perform job scheduling on a cluster based on node rank and performance
CN102073546A (en) * 2010-12-13 2011-05-25 北京航空航天大学 Task-dynamic dispatching method under distributed computation mode in cloud computing environment
CN103218233A (en) * 2013-05-09 2013-07-24 福州大学 Data allocation strategy in hadoop heterogeneous cluster
CN104270322A (en) * 2014-10-30 2015-01-07 中电海康集团有限公司 Self-adaptive load balance scheduling mechanism for internet-of-things device access processing platform
US20170010918A1 (en) * 2015-07-06 2017-01-12 Fujitsu Limited Information processing apparatus, parallel computer system and job schedule setting program
CN108762921A (en) * 2018-05-18 2018-11-06 电子科技大学 A kind of method for scheduling task and device of the on-line optimization subregion of Spark group systems

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
冯兴杰等: "改进的Hadoop作业调度算法", 《计算机工程与应用》 *
贺阳: "基于Yarn的负载均衡研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110928666A (en) * 2019-12-09 2020-03-27 湖南大学 Method and system for optimizing task parallelism based on memory in Spark environment
CN110928666B (en) * 2019-12-09 2022-03-22 湖南大学 Method and system for optimizing task parallelism based on memory in Spark environment
CN110928648A (en) * 2019-12-10 2020-03-27 浙江工商大学 Heuristic and intelligent computing-fused cloud workflow segmentation online scheduling optimization method
CN110928648B (en) * 2019-12-10 2022-05-20 浙江工商大学 Heuristic and intelligent computing-fused cloud workflow segmentation online scheduling optimization method
CN110955526B (en) * 2019-12-16 2022-10-21 湖南大学 Method and system for realizing multi-GPU scheduling in distributed heterogeneous environment
CN110955526A (en) * 2019-12-16 2020-04-03 湖南大学 Method and system for realizing multi-GPU scheduling in distributed heterogeneous environment
CN111459628B (en) * 2020-03-12 2023-11-28 大庆市凯德信信息技术有限公司 Spark platform task scheduling method based on improved quantum ant colony algorithm
EP3907609A4 (en) * 2020-03-12 2022-05-11 Chongqing University of Posts and Telecommunications Improved quantum ant colony algorithm-based spark platform task scheduling method
CN111459628A (en) * 2020-03-12 2020-07-28 重庆邮电大学 Spark platform task scheduling method based on improved quantum ant colony algorithm
WO2021179462A1 (en) * 2020-03-12 2021-09-16 重庆邮电大学 Improved quantum ant colony algorithm-based spark platform task scheduling method
CN111694789A (en) * 2020-04-22 2020-09-22 西安电子科技大学 Embedded reconfigurable heterogeneous determination method, system, storage medium and processor
CN111985845A (en) * 2020-09-02 2020-11-24 浙江工业大学 Node priority tuning method for heterogeneous Spark cluster
CN111985845B (en) * 2020-09-02 2024-03-19 浙江工业大学 Node priority optimization method of heterogeneous Spark cluster
CN112068959A (en) * 2020-09-04 2020-12-11 北京明略昭辉科技有限公司 Self-adaptive task scheduling method and system and retrieval method comprising method
CN112231081B (en) * 2020-10-14 2022-08-16 山东大学 PSO-AHP-based monotonic rate resource scheduling method and system in cloud environment
CN112231081A (en) * 2020-10-14 2021-01-15 山东大学 PSO-AHP-based monotonic rate resource scheduling method and system in cloud environment
CN112256434A (en) * 2020-10-30 2021-01-22 中国科学院信息工程研究所 Resource matching method in encrypted data cracking scene
CN112256434B (en) * 2020-10-30 2024-04-05 中国科学院信息工程研究所 Resource matching method in encrypted data cracking scene
CN112764906A (en) * 2021-01-26 2021-05-07 浙江工业大学 Cluster resource scheduling method based on user job type and node performance bias
CN112764906B (en) * 2021-01-26 2024-03-15 浙江工业大学 Cluster resource scheduling method based on user job type and node performance bias
CN113377495A (en) * 2021-05-17 2021-09-10 杭州中港科技有限公司 Method for optimizing docker cluster deployment based on heuristic ant colony algorithm
CN113377495B (en) * 2021-05-17 2024-02-27 杭州中港科技有限公司 Docker cluster deployment optimization method based on heuristic ant colony algorithm
CN114780247A (en) * 2022-05-17 2022-07-22 中国地质大学(北京) Flow application scheduling method and system with flow rate and resource sensing
CN114780247B (en) * 2022-05-17 2022-12-13 中国地质大学(北京) Flow application scheduling method and system with flow rate and resource sensing
CN115408136A (en) * 2022-11-01 2022-11-29 安徽思高智能科技有限公司 RPA flow scheduling method based on genetic algorithm

Also Published As

Publication number Publication date
CN110413389B (en) 2021-09-28

Similar Documents

Publication Publication Date Title
CN110413389A (en) A kind of task schedule optimization method under the unbalanced Spark environment of resource
CN107404523A (en) Cloud platform adaptive resource dispatches system and method
Sun et al. PACO: A period ACO based scheduling algorithm in cloud computing
CN110351348B (en) Cloud computing resource scheduling optimization method based on DQN
CN107357652B (en) Cloud computing task scheduling method based on segmentation ordering and standard deviation adjustment factor
CN102708011A (en) Multistage load estimating method facing task scheduling of cloud computing platform
CN107908536B (en) Performance evaluation method and system for GPU application in CPU-GPU heterogeneous environment
CN110086855B (en) Intelligent Spark task perception scheduling method based on ant colony algorithm
CN103401939A (en) Load balancing method adopting mixing scheduling strategy
CN105744006A (en) Particle swarm optimization user request dispatching method facing multi-type service
CN105740059B (en) A population scheduling method for divisible tasks
CN112269632A (en) Scheduling method and system for optimizing cloud data center
Shukla et al. FAT-ETO: Fuzzy-AHP-TOPSIS-Based efficient task offloading algorithm for scientific workflows in heterogeneous fog–cloud environment
Jiang et al. An energy-aware virtual machine migration strategy based on three-way decisions
CN113722112B (en) Service resource load balancing processing method and system
Lyu et al. Dynamic pricing scheme for edge computing services: A two-layer reinforcement learning approach
CN108897625B (en) Parallel scheduling method based on DAG model
LawanyaShri et al. Energy-Aware Fruitfly Optimisation Algorithm for Load Balancing in Cloud Computing Environments.
Zhu et al. A multi-resource scheduling scheme of Kubernetes for IIoT
Shuang et al. Task Scheduling Based on Grey Wolf Optimizer Algorithm for Smart Meter Embedded Operating System
CN112035234A (en) Distributed batch job distribution method and device
CN114398148A (en) Power industry K8S dynamic container arrangement method and storage medium
Sit et al. An adaptive clustering approach to dynamic load balancing
Sun et al. Optimizing grid resource allocation by combining fuzzy clustering with application preference
CN111324444A (en) Cloud computing task scheduling method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant