CN110413389A - Task scheduling optimization method for a resource-imbalanced Spark environment - Google Patents
Task scheduling optimization method for a resource-imbalanced Spark environment
- Publication number
- CN110413389A (application number CN201910669809.6A)
- Authority
- CN
- China
- Prior art keywords
- node
- cpu
- task
- priority
- spark
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/48—Indexing scheme relating to G06F9/48
- G06F2209/484—Precedence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5021—Priority
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multi Processors (AREA)
- Debugging And Monitoring (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention relates to a task scheduling optimization method for a resource-imbalanced Spark environment. The invention optimizes Spark's underlying scheduling algorithm and proposes a node-priority-based Spark Dynamic Adaptive Scheduling Algorithm (SDASA). SDASA uses the priority of a node to represent its computing capability and updates that priority in real time while tasks run, taking full account of node heterogeneity, resource utilization and load. Experiments show that SDASA improves the operating efficiency of the Spark system and shortens job execution time. When executing the same kind of task on different data volumes, cluster performance with SDASA improves by 6.99% on average; when executing different kinds of tasks, cluster performance with SDASA improves by 6.32% on average.
Description
Technical field
The present invention relates to the field of big data processing, and in particular to a task scheduling optimization method for a resource-imbalanced Spark environment.
Background art
With the upgrading of infrastructure at large data centers, supercomputing centers and Internet companies, and the introduction of high-performance components (such as GPUs), the nodes in a cluster gradually become heterogeneous. Differences among compute nodes in CPU, memory and other properties lead to differences in their processing capability. The overall computing capability of the nodes therefore differs considerably, and the cluster as a whole is in a resource-imbalanced state. Because the capabilities of the nodes differ, assigning the same task to different nodes affects node load differently. Spark's default task scheduling rests on the idealized assumption that cluster nodes are homogeneous. It considers neither cluster heterogeneity nor changes in node resource utilization and load, so it cannot meet requirements such as system efficiency and load balancing in a resource-heterogeneous setting.
Current research on task scheduling under parallel frameworks focuses mainly on the Hadoop platform; task scheduling algorithms for resource-imbalanced Spark environments have received comparatively little study. One adaptive task scheduling method improves cluster performance by monitoring node load and resource utilization. However, that algorithm does not consider the influencing factors comprehensively, and its weights rely too heavily on preset thresholds, which makes it subjective. Some task scheduling optimization algorithms based on artificial intelligence and biologically inspired methods, such as ant colony algorithms and genetic algorithms, can perform multi-objective optimization, but their principles are complex and they are computationally expensive to implement, so their scheduling efficiency is low. An efficient task scheduling algorithm is therefore needed to improve the performance of Spark in a resource-imbalanced environment.
Summary of the invention
To overcome the above shortcomings, the object of the present invention is to provide a task scheduling optimization method for a resource-imbalanced Spark environment. By analyzing the computing capability of each node in the cluster, the invention optimizes Spark's underlying scheduling algorithm and proposes a node-priority-based Spark Dynamic Adaptive Scheduling Algorithm (SDASA). SDASA takes full account of node heterogeneity, resource utilization and load, improves the operating efficiency of the Spark system, and shortens job execution time.
The present invention achieves the above object through the following technical solution: a task scheduling optimization method for a resource-imbalanced Spark environment, comprising the following steps:
(1) Screen the static factors and dynamic factors that affect node priority, establish a node priority evaluation index system, and calculate the weight of each index;
(2) Deploy the distributed cluster resource monitoring system Ganglia in the cluster; when the cluster starts, monitoring is triggered and the heartbeat is started;
(3) When the cluster is established or a new node joins the cluster, the Master node calculates the static performance index value of each Slave node, or the static performance index value of the newly added node;
(4) The Master node calculates the dynamic performance index value of each Slave node;
(5) The Master node calculates the priority of each Slave node;
(6) The Master node reads the priority of each Slave node and sorts the nodes by priority value;
(7) The Master node selects Slave nodes according to the sorting result, traverses the selected nodes, and assigns each task to be run to the Slave node with the highest locality level;
(8) If task execution has finished, the task execution result is returned; otherwise, return to step (3).
Preferably, step (1) is specifically as follows:
(1.1) Use principal component analysis to determine that the static factors of a node are its CPU speed, number of CPU cores, memory size and disk capacity;
(1.2) Use principal component analysis to determine that the dynamic factors of a node are its remaining CPU ratio, remaining memory ratio, remaining disk capacity ratio and CPU load;
(1.3) Based on the analysis results of steps (1.1) and (1.2), establish the node priority evaluation index system and assess the importance of each index;
(1.4) Obtain the weight of each static factor and dynamic factor using the analytic hierarchy process (AHP).
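Step (1.4) can be illustrated with a minimal sketch of how AHP turns pairwise importance judgments into weights. The patent text does not give the comparison matrix or say which AHP variant is used; the row geometric-mean approximation, the function names and the matrix values below are assumptions chosen for illustration only.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(pairwise, weights):
    """Saaty's consistency ratio; values below 0.1 are usually accepted."""
    n = len(pairwise)
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}  # Saaty's random index values
    return ci / random_index[n]

# Hypothetical expert judgments (not from the patent) comparing, in order:
# CPU speed, number of CPU cores, memory size, disk capacity.
pairwise = [
    [1,     2,     3,   5],
    [1 / 2, 1,     2,   4],
    [1 / 3, 1 / 2, 1,   2],
    [1 / 5, 1 / 4, 1 / 2, 1],
]
weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
```

The resulting weights sum to 1, which matches the constraint the method places on the static and dynamic factor weights.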
Preferably, step (3) is specifically as follows:
(3.1) Each Slave node uses the Ganglia cluster resource monitoring system to obtain its own static factor values, including CPU speed s_cpu_speed, number of CPU cores s_cpu_num, memory size s_mem and disk capacity s_disk;
(3.2) The Slave nodes send the collected data to the Master node by unicast;
(3.3) The Master node calculates the static performance index S_i of the i-th Slave node using formula (1), where i = 1 to h and h is the number of Slave nodes in the cluster;
Here n_1, n_2, n_3 and n_4 are the weights of the static factors CPU speed, number of CPU cores, memory size and disk capacity respectively, with n_1 + n_2 + n_3 + n_4 = 1; the values of n_1, n_2, n_3 and n_4 are calculated using the analytic hierarchy process.
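Formula (1) itself is not reproduced in this text. The sketch below therefore only assumes a common form: a weighted sum of the four static factors, each scaled by the largest value observed in the cluster so that the units are comparable. The scaling, the weight values and the node data are all hypothetical.

```python
def static_index(factors, weights, maxima):
    """Assumed form of S_i: weighted sum of factors scaled by cluster maxima."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[k] * factors[k] / maxima[k] for k in weights)

# Hypothetical weights n_1..n_4 (in practice these would come from AHP).
weights = {"cpu_speed": 0.4, "cpu_num": 0.3, "mem": 0.2, "disk": 0.1}

# Hypothetical static factor values reported by two Slave nodes.
nodes = {
    "slave1": {"cpu_speed": 2400, "cpu_num": 4, "mem": 8, "disk": 500},
    "slave2": {"cpu_speed": 3200, "cpu_num": 8, "mem": 16, "disk": 1000},
}

# Scale each factor by the largest value seen in the cluster.
maxima = {k: max(n[k] for n in nodes.values()) for k in weights}
S = {name: static_index(f, weights, maxima) for name, f in nodes.items()}
```

With these inputs the better-equipped node scores higher; the static index only needs recomputing when the cluster is created or a node joins.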
Preferably, step (4) is specifically as follows:
(4.1) Each Slave node periodically obtains its own dynamic factor values at the interval given in the Ganglia cluster resource monitoring system configuration file, including the node's remaining CPU ratio d_cpu, remaining memory ratio d_mem, remaining disk capacity ratio d_disk and CPU load d_length;
(4.2) The Slave nodes send the collected data to the Master node by unicast;
(4.3) The Master node calculates the dynamic performance index D_i of the i-th Slave node using formula (2), where i = 1 to h and h is the number of Slave nodes in the cluster;
Here m_1, m_2, m_3 and m_4 are the weights of the dynamic factors remaining CPU ratio, remaining memory ratio, remaining disk capacity ratio and CPU load respectively, with m_1 + m_2 + m_3 + m_4 = 1; the values of m_1, m_2, m_3 and m_4 are calculated using the analytic hierarchy process.
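Formula (2) is likewise not reproduced in this text. As a hedged illustration, the sketch below assumes a weighted sum in which the three remaining ratios count directly and the CPU load (a queue length, where larger is worse) is first mapped to a score in (0, 1]. That mapping, the function name and the weights are assumptions, not the patent's definition.

```python
def dynamic_index(d_cpu, d_mem, d_disk, d_length, weights):
    """Assumed form of D_i: weighted sum of dynamic factors.

    d_cpu, d_mem, d_disk are remaining ratios in [0, 1];
    d_length is the CPU queue length (larger means busier).
    """
    m1, m2, m3, m4 = weights
    if abs(m1 + m2 + m3 + m4 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Illustrative choice: turn queue length into a benefit-type score.
    load_score = 1.0 / (1.0 + d_length)
    return m1 * d_cpu + m2 * d_mem + m3 * d_disk + m4 * load_score

M = (0.4, 0.3, 0.1, 0.2)  # hypothetical m_1..m_4
idle = dynamic_index(1.0, 1.0, 1.0, 0.0, M)
busy = dynamic_index(0.2, 0.5, 0.9, 3.0, M)
```

Because the dynamic factors are sampled periodically, this value would be recomputed on every monitoring cycle, unlike the static index.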
Preferably, step (5) is specifically: using the static index value S_i and dynamic index value D_i of each Slave node obtained in steps (3) and (4), the Master node calculates the priority of each node using formula (3):
P_i = α D_i + β S_i    (3)
where α and β are the weights of D_i and S_i respectively, calculated using the analytic hierarchy process.
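Formula (3) and the sorting in step (6) translate directly into a few lines of code. In the sketch below the values of α and β and the per-node index values are hypothetical; only the form P_i = α D_i + β S_i comes from the text.

```python
def priority(d, s, alpha, beta):
    """Formula (3): P_i = alpha * D_i + beta * S_i."""
    return alpha * d + beta * s

def rank_nodes(indices, alpha, beta):
    """Return node names sorted by priority, highest first (step 6)."""
    scored = {name: priority(d, s, alpha, beta) for name, (d, s) in indices.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Hypothetical (D_i, S_i) pairs for three Slave nodes.
indices = {
    "slave1": (0.9, 0.6),
    "slave2": (0.3, 1.0),
    "slave3": (0.7, 0.8),
}
ranked = rank_nodes(indices, alpha=0.6, beta=0.4)
```

Note that slave2 has the strongest hardware in this example but ranks last because it is currently the most heavily loaded, which is the behavior the dynamic term is meant to produce.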
Preferably, step (7) is specifically as follows:
(7.1) The Master node traverses the node set WorkerOffer in order of node priority;
(7.2) For each node, each task in the task set is traversed in turn, and step (7.3) is executed in a loop;
(7.3) Obtain the locality parameter of the task on the current node; if the parameter is the maximum, execute step (7.4); otherwise, return to step (7.2);
(7.4) Assign the task to the node.
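Steps (7.1) to (7.4) can be read as a nested loop over nodes (in priority order) and tasks. The sketch below is one possible interpretation, not the patent's implementation: it takes "the parameter is the maximum" to mean that the current node offers the best locality level the task can get among nodes that still have free slots. The locality level names follow Spark's task locality levels; the task structure, slot counts and helper names are hypothetical.

```python
# Spark-style locality levels, best first.
LOCALITY_ORDER = ["PROCESS_LOCAL", "NODE_LOCAL", "NO_PREF", "RACK_LOCAL", "ANY"]

def locality(task, node):
    """Hypothetical lookup: rank of the task's locality on a node (lower is better)."""
    return LOCALITY_ORDER.index(task["locality"].get(node, "ANY"))

def assign(tasks, ranked_nodes, free_slots):
    """Assign tasks to nodes, visiting nodes in priority order."""
    free = dict(free_slots)  # do not mutate the caller's slot counts
    assignment = {}
    for node in ranked_nodes:                                  # (7.1)
        for task in tasks:                                     # (7.2)
            if task["id"] in assignment or free[node] == 0:
                continue
            candidates = [n for n in ranked_nodes if free[n] > 0]
            best = min(locality(task, n) for n in candidates)  # (7.3)
            if locality(task, node) == best:
                assignment[task["id"]] = node                  # (7.4)
                free[node] -= 1
    return assignment

ranked_nodes = ["slave1", "slave2"]      # output of the priority sort
free_slots = {"slave1": 1, "slave2": 2}  # hypothetical free task slots
tasks = [
    {"id": "t1", "locality": {"slave2": "NODE_LOCAL"}},
    {"id": "t2", "locality": {"slave1": "NODE_LOCAL"}},
    {"id": "t3", "locality": {}},
]
assignment = assign(tasks, ranked_nodes, free_slots)
```

In this run t1 is skipped on the higher-priority slave1 because its data lives on slave2, which shows how node priority and data locality interact: priority decides the visiting order, locality decides whether a task is accepted.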
The beneficial effects of the present invention are as follows: the invention uses priority to describe the computing capability of nodes in a resource-imbalanced heterogeneous cluster and schedules tasks according to node priority. While the cluster is running, the dynamic factor values of each Slave node are obtained in real time and the node priority values are updated. The proposed algorithm schedules tasks according to the current performance of the nodes, which effectively improves cluster performance and shortens task execution time.
Description of the drawings
Fig. 1 is a schematic flowchart of the method of the present invention;
Fig. 2 is a schematic diagram of the node priority evaluation index system of the present invention;
Fig. 3 is the implementation architecture diagram of the SDASA algorithm of the present invention;
Fig. 4 is a schematic comparison of task completion times when the SDASA algorithm of the present invention and the Spark default algorithm execute the same kind of task on different data volumes;
Fig. 5 is a schematic comparison of task completion times when the SDASA algorithm of the present invention and the Spark default algorithm execute different kinds of tasks.
Specific embodiments
The present invention is further described below with reference to a specific embodiment, but the protection scope of the present invention is not limited thereto:
Embodiment: Spark's default task scheduling is based on the idealized assumption that cluster nodes are homogeneous. To address this problem, the present invention analyzes the computing capability of each node in the cluster, optimizes Spark's underlying scheduling algorithm, and proposes a node-priority-based Spark Dynamic Adaptive Scheduling Algorithm (SDASA). SDASA takes full account of node heterogeneity, resource utilization and load, improves the operating efficiency of the Spark system, and shortens job execution time.
The computing capability of a node is represented by its node priority: the higher the priority, the stronger the node's computing capability and the greater the probability that it is selected to execute tasks. Node priority is calculated from a group of indices that describe node performance (the node performance indices). The node performance indices comprise a static performance index and a dynamic performance index. The static performance index is independent of task execution state, and its value is determined by several static factors. The dynamic performance index changes with task execution state, and its value is determined by several dynamic factors.
As shown in Fig. 1, a task scheduling optimization method for a resource-imbalanced Spark environment includes the following steps:
(1) Screen the static factors and dynamic factors that affect node priority, establish a node priority evaluation index system, and calculate the weight of each index.
(1.1) Analyze the factors that affect node performance and establish the node priority evaluation index system, as shown in Fig. 2. The analysis uses principal component analysis to determine that the static factors of a node are its CPU speed, number of CPU cores, memory size and disk capacity, and that its dynamic factors are its remaining CPU ratio, remaining memory ratio, remaining disk capacity ratio and CPU load (that is, the length of the CPU usage queue).
(1.2) Domain experts assess the importance of each index;
(1.3) Calculate the weight of each static performance index and dynamic performance index using the analytic hierarchy process.
(2) Deploy the distributed cluster resource monitoring system Ganglia in the cluster to monitor information such as the memory, CPU, hard disk and network traffic of each Slave node. When the cluster starts, monitoring is triggered and the heartbeat is started.
(3) When the cluster is established or a new node joins the cluster, the Master node calculates the static performance index value of each Slave node, or the static performance index value of the newly added node.
(3.1) When the cluster is established or a new node joins the cluster, each Slave node (or the newly added Slave node) uses Ganglia to obtain its own static factor values, including CPU speed s_cpu_speed, number of CPU cores s_cpu_num, memory size s_mem and disk capacity s_disk;
(3.2) Each Slave node sends the collected data to the Master node by unicast;
(3.3) The Master node calculates the static performance index S_i of the i-th Slave node using formula (1), where i = 1 to h and h is the number of Slave nodes in the cluster.
(4) The Master node calculates the dynamic performance index value of each Slave node.
(4.1) Each Slave node periodically obtains its own dynamic factor values at the interval given in the Ganglia system configuration file, including the node's remaining CPU ratio d_cpu, remaining memory ratio d_mem, remaining disk capacity ratio d_disk and CPU load d_length;
(4.2) The Slave nodes send the collected data to the Master node by unicast;
(4.3) The Master node calculates the dynamic performance index D_i of the i-th Slave node using formula (2), where i = 1 to h and h is the number of Slave nodes in the cluster.
(5) The Master node calculates the priority of each node.
When a node sorting request arrives, the Master node reads the static index value S_i and dynamic index value D_i of each node from the database and calculates the priority of each node using formula (3).
(6) The Master node reads the priority of each Slave node and sorts the nodes by priority value.
(7) The Master node selects Slave nodes according to priority, then traverses the selected nodes and assigns each task to be run to the Slave node with the highest locality level.
(8) If task execution has finished, the task execution result is returned; otherwise, return to step (3).
The above method is implemented on the architecture shown in Fig. 3. The experimental results comparing the method of the present invention with the default Spark task scheduling algorithm are shown in Fig. 4 and Fig. 5.
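Steps (2) and (4.1) above rely on Ganglia for the raw measurements. As a rough sketch of how the dynamic factors might be derived, the code below parses a gmond-style XML report and maps standard Ganglia metrics (`cpu_idle`, `mem_free`, `mem_total`, `disk_free`, `disk_total`, `load_one`) to d_cpu, d_mem, d_disk and d_length. The mapping from metrics to factors, the use of `load_one` as a stand-in for the CPU queue length, and the sample values are assumptions; the patent does not specify them.

```python
import xml.etree.ElementTree as ET

# Illustrative gmond-style report; host name and values are made up.
SAMPLE_XML = """
<GANGLIA_XML VERSION="3.7.2" SOURCE="gmond">
  <CLUSTER NAME="spark-cluster">
    <HOST NAME="slave1" IP="10.0.0.11">
      <METRIC NAME="cpu_idle" VAL="80.0" TYPE="float" UNITS="%"/>
      <METRIC NAME="mem_free" VAL="4194304" TYPE="float" UNITS="KB"/>
      <METRIC NAME="mem_total" VAL="8388608" TYPE="float" UNITS="KB"/>
      <METRIC NAME="disk_free" VAL="250.0" TYPE="double" UNITS="GB"/>
      <METRIC NAME="disk_total" VAL="500.0" TYPE="double" UNITS="GB"/>
      <METRIC NAME="load_one" VAL="1.5" TYPE="float" UNITS=" "/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""

WANTED = {"cpu_idle", "mem_free", "mem_total", "disk_free", "disk_total", "load_one"}

def dynamic_factors(xml_text):
    """Map each host's Ganglia metrics to the four dynamic factors."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for host in root.iter("HOST"):
        # Keep only the numeric metrics this sketch needs.
        m = {
            x.get("NAME"): float(x.get("VAL"))
            for x in host.iter("METRIC")
            if x.get("NAME") in WANTED
        }
        result[host.get("NAME")] = {
            "d_cpu": m["cpu_idle"] / 100.0,             # remaining CPU ratio
            "d_mem": m["mem_free"] / m["mem_total"],    # remaining memory ratio
            "d_disk": m["disk_free"] / m["disk_total"], # remaining disk ratio
            "d_length": m["load_one"],                  # assumed proxy for CPU load
        }
    return result

factors = dynamic_factors(SAMPLE_XML)
```

In a live deployment the XML would be read from the monitoring daemon rather than from a string constant; that part is left out here because the connection details depend on the cluster's Ganglia configuration.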
In summary, on the basis of the node priority evaluation index system, the present invention uses the analytic hierarchy process to determine the weight of each static factor and dynamic factor. The SDASA algorithm obtains the dynamic index value of each Slave node in real time, calculates node priority, and assigns tasks according to the priority of each node. Experiments show that, compared with the Spark default scheduling algorithm, the proposed algorithm effectively improves cluster performance. When executing the same kind of task on different data volumes, cluster performance with SDASA improves by 6.99% on average; when executing different kinds of tasks, cluster performance with SDASA improves by 6.32% on average.
The above describes specific embodiments of the present invention and the technical principles applied. Any change made according to the conception of the present invention whose resulting function does not go beyond the spirit covered by the specification and drawings shall fall within the protection scope of the present invention.
Claims (6)
1. A task scheduling optimization method for a resource-imbalanced Spark environment, characterized by comprising the following steps:
(1) screening the static factors and dynamic factors that affect node priority, establishing a node priority evaluation index system, and calculating the weight of each index;
(2) deploying the distributed cluster resource monitoring system Ganglia in the cluster, wherein when the cluster starts, monitoring is triggered and the heartbeat is started;
(3) when the cluster is established or a new node joins the cluster, the Master node calculating the static performance index value of each Slave node, or the static performance index value of the newly added node;
(4) the Master node calculating the dynamic performance index value of each Slave node;
(5) the Master node calculating the priority of each Slave node;
(6) the Master node reading the priority of each Slave node and sorting the nodes by priority value;
(7) the Master node selecting Slave nodes according to the sorting result, traversing the selected nodes, and assigning each task to be run to the Slave node with the highest locality level;
(8) if task execution has finished, returning the task execution result; otherwise, returning to step (3).
2. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (1) is specifically as follows:
(1.1) using principal component analysis to determine that the static factors of a node are its CPU speed, number of CPU cores, memory size and disk capacity;
(1.2) using principal component analysis to determine that the dynamic factors of a node are its remaining CPU ratio, remaining memory ratio, remaining disk capacity ratio and CPU load;
(1.3) based on the analysis results of steps (1.1) and (1.2), establishing the node priority evaluation index system and assessing the importance of each index;
(1.4) obtaining the weight of each static factor and dynamic factor using the analytic hierarchy process.
3. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (3) is specifically as follows:
(3.1) each Slave node uses the Ganglia cluster resource monitoring system to obtain its own static factor values, including CPU speed s_cpu_speed, number of CPU cores s_cpu_num, memory size s_mem and disk capacity s_disk;
(3.2) the Slave nodes send the collected data to the Master node by unicast;
(3.3) the Master node calculates the static performance index S_i of the i-th Slave node using formula (1), where i = 1 to h and h is the number of Slave nodes in the cluster;
wherein n_1, n_2, n_3 and n_4 are the weights of the static factors CPU speed, number of CPU cores, memory size and disk capacity respectively, with n_1 + n_2 + n_3 + n_4 = 1; the values of n_1, n_2, n_3 and n_4 are calculated using the analytic hierarchy process.
4. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (4) is specifically as follows:
(4.1) each Slave node periodically obtains its own dynamic factor values at the interval given in the Ganglia cluster resource monitoring system configuration file, including the node's remaining CPU ratio d_cpu, remaining memory ratio d_mem, remaining disk capacity ratio d_disk and CPU load d_length;
(4.2) the Slave nodes send the collected data to the Master node by unicast;
(4.3) the Master node calculates the dynamic performance index D_i of the i-th Slave node using formula (2), where i = 1 to h and h is the number of Slave nodes in the cluster;
wherein m_1, m_2, m_3 and m_4 are the weights of the dynamic factors remaining CPU ratio, remaining memory ratio, remaining disk capacity ratio and CPU load respectively, with m_1 + m_2 + m_3 + m_4 = 1; the values of m_1, m_2, m_3 and m_4 are calculated using the analytic hierarchy process.
5. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (5) is specifically: using the static index value S_i and dynamic index value D_i of each Slave node obtained in steps (3) and (4), the Master node calculates the priority of each node using formula (3):
P_i = α D_i + β S_i    (3)
wherein α and β are the weights of D_i and S_i respectively, calculated using the analytic hierarchy process.
6. The task scheduling optimization method for a resource-imbalanced Spark environment according to claim 1, characterized in that step (7) is specifically as follows:
(7.1) the Master node traverses the node set WorkerOffer in order of node priority;
(7.2) for each node, each task in the task set is traversed in turn, and step (7.3) is executed in a loop;
(7.3) the locality parameter of the task on the current node is obtained; if the parameter is the maximum, step (7.4) is executed; otherwise, step (7.2) is executed;
(7.4) the task is assigned to the node.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910669809.6A CN110413389B (en) | 2019-07-24 | 2019-07-24 | Task scheduling optimization method under resource imbalance Spark environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110413389A (en) | 2019-11-05 |
CN110413389B (en) | 2021-09-28 |
Family
ID=68362792
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910669809.6A Active CN110413389B (en) | 2019-07-24 | 2019-07-24 | Task scheduling optimization method under resource imbalance Spark environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110413389B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060184939A1 (en) * | 2005-02-15 | 2006-08-17 | International Business Machines Corporation | Method for using a priority queue to perform job scheduling on a cluster based on node rank and performance |
CN102073546A (en) * | 2010-12-13 | 2011-05-25 | 北京航空航天大学 | Task-dynamic dispatching method under distributed computation mode in cloud computing environment |
CN103218233A (en) * | 2013-05-09 | 2013-07-24 | 福州大学 | Data allocation strategy in hadoop heterogeneous cluster |
CN104270322A (en) * | 2014-10-30 | 2015-01-07 | 中电海康集团有限公司 | Self-adaptive load balance scheduling mechanism for internet-of-things device access processing platform |
US20170010918A1 (en) * | 2015-07-06 | 2017-01-12 | Fujitsu Limited | Information processing apparatus, parallel computer system and job schedule setting program |
CN108762921A (en) * | 2018-05-18 | 2018-11-06 | 电子科技大学 | A kind of method for scheduling task and device of the on-line optimization subregion of Spark group systems |
Non-Patent Citations (2)
Title |
---|
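Feng Xingjie et al.: "Improved Hadoop job scheduling algorithm", Computer Engineering and Applications * |
He Yang: "Research on load balancing based on Yarn", China Master's Theses Full-text Database, Information Science and Technology * |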
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110928666A (en) * | 2019-12-09 | 2020-03-27 | 湖南大学 | Method and system for optimizing task parallelism based on memory in Spark environment |
CN110928666B (en) * | 2019-12-09 | 2022-03-22 | 湖南大学 | Method and system for optimizing task parallelism based on memory in Spark environment |
CN110928648A (en) * | 2019-12-10 | 2020-03-27 | 浙江工商大学 | Heuristic and intelligent computing-fused cloud workflow segmentation online scheduling optimization method |
CN110928648B (en) * | 2019-12-10 | 2022-05-20 | 浙江工商大学 | Heuristic and intelligent computing-fused cloud workflow segmentation online scheduling optimization method |
CN110955526B (en) * | 2019-12-16 | 2022-10-21 | 湖南大学 | Method and system for realizing multi-GPU scheduling in distributed heterogeneous environment |
CN110955526A (en) * | 2019-12-16 | 2020-04-03 | 湖南大学 | Method and system for realizing multi-GPU scheduling in distributed heterogeneous environment |
CN111459628B (en) * | 2020-03-12 | 2023-11-28 | 大庆市凯德信信息技术有限公司 | Spark platform task scheduling method based on improved quantum ant colony algorithm |
EP3907609A4 (en) * | 2020-03-12 | 2022-05-11 | Chongqing University of Posts and Telecommunications | Improved quantum ant colony algorithm-based spark platform task scheduling method |
CN111459628A (en) * | 2020-03-12 | 2020-07-28 | 重庆邮电大学 | Spark platform task scheduling method based on improved quantum ant colony algorithm |
WO2021179462A1 (en) * | 2020-03-12 | 2021-09-16 | 重庆邮电大学 | Improved quantum ant colony algorithm-based spark platform task scheduling method |
CN111694789A (en) * | 2020-04-22 | 2020-09-22 | 西安电子科技大学 | Embedded reconfigurable heterogeneous determination method, system, storage medium and processor |
CN111985845A (en) * | 2020-09-02 | 2020-11-24 | 浙江工业大学 | Node priority tuning method for heterogeneous Spark cluster |
CN111985845B (en) * | 2020-09-02 | 2024-03-19 | 浙江工业大学 | Node priority optimization method of heterogeneous Spark cluster |
CN112068959A (en) * | 2020-09-04 | 2020-12-11 | 北京明略昭辉科技有限公司 | Self-adaptive task scheduling method and system and retrieval method comprising method |
CN112231081B (en) * | 2020-10-14 | 2022-08-16 | 山东大学 | PSO-AHP-based monotonic rate resource scheduling method and system in cloud environment |
CN112231081A (en) * | 2020-10-14 | 2021-01-15 | 山东大学 | PSO-AHP-based monotonic rate resource scheduling method and system in cloud environment |
CN112256434A (en) * | 2020-10-30 | 2021-01-22 | 中国科学院信息工程研究所 | Resource matching method in encrypted data cracking scene |
CN112256434B (en) * | 2020-10-30 | 2024-04-05 | 中国科学院信息工程研究所 | Resource matching method in encrypted data cracking scene |
CN112764906A (en) * | 2021-01-26 | 2021-05-07 | 浙江工业大学 | Cluster resource scheduling method based on user job type and node performance bias |
CN112764906B (en) * | 2021-01-26 | 2024-03-15 | 浙江工业大学 | Cluster resource scheduling method based on user job type and node performance bias |
CN113377495A (en) * | 2021-05-17 | 2021-09-10 | 杭州中港科技有限公司 | Method for optimizing docker cluster deployment based on heuristic ant colony algorithm |
CN113377495B (en) * | 2021-05-17 | 2024-02-27 | 杭州中港科技有限公司 | Docker cluster deployment optimization method based on heuristic ant colony algorithm |
CN114780247A (en) * | 2022-05-17 | 2022-07-22 | 中国地质大学(北京) | Flow application scheduling method and system with flow rate and resource sensing |
CN114780247B (en) * | 2022-05-17 | 2022-12-13 | 中国地质大学(北京) | Flow application scheduling method and system with flow rate and resource sensing |
CN115408136A (en) * | 2022-11-01 | 2022-11-29 | 安徽思高智能科技有限公司 | RPA flow scheduling method based on genetic algorithm |
Also Published As
Publication number | Publication date |
---|---|
CN110413389B (en) | 2021-09-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110413389A (en) | A kind of task schedule optimization method under the unbalanced Spark environment of resource | |
CN107404523A (en) | Cloud platform adaptive resource dispatches system and method | |
Sun et al. | PACO: A period ACO based scheduling algorithm in cloud computing | |
CN110351348B (en) | Cloud computing resource scheduling optimization method based on DQN | |
CN107357652B (en) | Cloud computing task scheduling method based on segmentation ordering and standard deviation adjustment factor | |
CN102708011A (en) | Multistage load estimating method facing task scheduling of cloud computing platform | |
CN107908536B (en) | Performance evaluation method and system for GPU application in CPU-GPU heterogeneous environment | |
CN110086855B (en) | Intelligent Spark task perception scheduling method based on ant colony algorithm | |
CN103401939A (en) | Load balancing method adopting mixing scheduling strategy | |
CN105744006A (en) | Particle swarm optimization user request dispatching method facing multi-type service | |
CN105740059B (en) | A kind of population dispatching method towards Divisible task | |
CN112269632A (en) | Scheduling method and system for optimizing cloud data center | |
Shukla et al. | FAT-ETO: Fuzzy-AHP-TOPSIS-Based efficient task offloading algorithm for scientific workflows in heterogeneous fog–cloud environment | |
Jiang et al. | An energy-aware virtual machine migration strategy based on three-way decisions | |
CN113722112B (en) | Service resource load balancing processing method and system | |
Lyu et al. | Dynamic pricing scheme for edge computing services: A two-layer reinforcement learning approach | |
CN108897625B (en) | Parallel scheduling method based on DAG model | |
LawanyaShri et al. | Energy-Aware Fruitfly Optimisation Algorithm for Load Balancing in Cloud Computing Environments. | |
Zhu et al. | A multi-resource scheduling scheme of Kubernetes for IIoT | |
Shuang et al. | Task Scheduling Based on Grey Wolf Optimizer Algorithm for Smart Meter Embedded Operating System | |
CN112035234A (en) | Distributed batch job distribution method and device | |
CN114398148A (en) | Power industry K8S dynamic container arrangement method and storage medium | |
Sit et al. | An adaptive clustering approach to dynamic load balancing | |
Sun et al. | Optimizing grid resource allocation by combining fuzzy clustering with application preference | |
CN111324444A (en) | Cloud computing task scheduling method and device | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |