CN111061553B - Parallel task scheduling method and system for super computing center - Google Patents


Info

Publication number
CN111061553B
Authority
CN
China
Prior art keywords
job
scheduling
queue
scheduling queue
information
Prior art date
Legal status
Active
Application number
CN201911296937.7A
Other languages
Chinese (zh)
Other versions
CN111061553A (en)
Inventor
李肯立
肖雄
唐卓
蒋冰婷
李文
朱锦涛
唐小勇
阳王东
周旭
刘楚波
曹嵘晖
Current Assignee
Hunan University
Original Assignee
Hunan University
Priority date
Filing date
Publication date
Application filed by Hunan University
Priority claimed from CN201911296937.7A
Publication of CN111061553A
Application granted
Publication of CN111061553B
Legal status: Active


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/548Queue
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a parallel task scheduling method for a supercomputing center. It provides an implementation that addresses the processor usage pricing found in existing supercomputing environments and the task parallel execution problem common to grid job scheduling systems. The invention makes full use of existing hardware resources for computation, demonstrates in operation the execution efficiency of the method and the reliability of the parallel scheduling algorithm it uses, and at the same time ensures load balance among the processors.

Description

Parallel task scheduling method and system for super computing center
Technical Field
The invention belongs to the technical field of computer high-performance computing, and particularly relates to a parallel task scheduling method and system for a supercomputing center.
Background
At present, high-performance computing research using the computing resources of supercomputers has gained great popularity in China.
However, the task scheduling strategies of most supercomputing centers currently suffer from several non-negligible problems. First, because task scheduling is inadequate, jobs queue for too long and scheduling efficiency is low. Second, because the price of supercomputer usage differs from place to place, jobs that require large-scale processor resources may have to pay a high price to finish their computation, which greatly increases cost. Third, because the scheduling strategy does not use an effective load balancing policy, tasks cannot be efficiently dispatched to an idle queue among the many queues able to provide computation; lightly loaded queues therefore sit idle while heavily loaded queues run at full capacity, causing serious load imbalance and, in turn, a severe scheduling performance bottleneck.
Disclosure of Invention
In view of the above defects or improvement needs of the prior art, the invention provides a parallel task scheduling method and system for a supercomputing center. It aims to solve three technical problems of the scheduling strategies used by existing supercomputing centers: jobs queue too long and are scheduled inefficiently because task scheduling is inadequate; jobs that require large-scale processor resources incur greatly increased cost because pricing differs across supercomputing centers; and a severe scheduling performance bottleneck arises because no effective load balancing strategy is used.
To achieve the above object, according to one aspect of the present invention, there is provided a parallel task scheduling method for a supercomputing center, applied to a client, the method comprising the steps of:
(1) Acquiring a text file from a user, wherein the text file records job information to be scheduled, schedulable queue information and server computing capability information;
(2) Preprocessing the obtained text file to obtain a preprocessed text file;
(3) Normalizing all the processing frequencies of the CPU of the server in the computing capacity information of the server, and updating the computing capacity information of the server by using the normalized processing frequencies of the CPU of the server;
(4) Screening the scheduling queue names in the schedulable queue information according to the job information to be scheduled to obtain a screened scheduling queue set;
(5) Calculating the use price of each scheduling queue in the scheduling queue set screened in step (4) according to the CPU core number required for job running in the job information to be scheduled;
(6) Setting a counter i=1;
(7) Judging whether i is larger than the total number of jobs corresponding to the job names in the job information to be scheduled; if so, going to step (11), otherwise going to step (8);
(8) Selecting a scheduling queue corresponding to the minimum use price from the standard use prices of the plurality of scheduling queues obtained in the step (5), and scheduling the ith job corresponding to the job name in the job information to be scheduled to the scheduling queue corresponding to the minimum use price for execution;
(9) After the execution of the ith job is completed by the corresponding scheduling queue, updating the predicted CPU core number occupied by the job operation of the scheduling queue in the schedulable queue information;
(10) Setting i=i+1, and returning to step (7);
(11) Storing, for each job that has been executed, its sequence number, the job name corresponding to the job in the job information to be scheduled, the corresponding job global ID, the name of the server whose scheduling queue in the schedulable queue information executed the job, and the scheduling queue name.
Preferably, the job information to be scheduled includes a job global ID, a job name, a job execution required software version, an estimated job execution completion time, and a job execution required CPU core number.
Preferably, the schedulable queue information includes the name of the server to which the schedulable queue belongs, the scheduling queue names, the maximum/minimum CPU core number provided by each scheduling queue in the scheduling queue names for job running, the maximum time limit imposed by each scheduling queue in the scheduling queue names on job running, the software names contained in each scheduling queue in the scheduling queue names, the software versions contained in each scheduling queue in the scheduling queue names, and the per-unit-time usage cost of each scheduling queue in the scheduling queue names.
Preferably, the server computing capability information includes a server name, a server available dispatch queue name, and a server CPU processing frequency.
Preferably, step (4) specifically searches for scheduling queues that simultaneously satisfy four conditions: the software names contained in the scheduling queue include the software name required for job running in the job information to be scheduled; the software versions contained in the scheduling queue include the software version required for job running; the maximum/minimum CPU core number provided by the scheduling queue for job running covers the CPU core number required for job running; and the maximum time limit imposed by the scheduling queue on job running covers the estimated job completion time. The scheduling queues meeting all four conditions together form the screened scheduling queue set.
Preferably, step (5) specifically comprises: first, for each scheduling queue in the screened scheduling queue set, querying in the schedulable queue information obtained in step (4) the predicted CPU core number occupied by job running; then querying the corresponding server in the schedulable queue information according to the scheduling queue; then multiplying the CPU core number required for job running by the time required for job running, further multiplying by the per-unit-time usage cost (hpcPrice) of the scheduling queue, and finally multiplying by the server CPU processing frequency corresponding to the queried server in the server computing capability information updated in step (3), thereby obtaining the standard use price of the scheduling queue.
Preferably, step (9) updates the predicted number of CPU cores that the scheduling queue will occupy for the job to run in the schedulable queue information by subtracting the number of CPU cores that the scheduling queue uses to execute the ith job from the original value.
Preferably, the number of each job that has been executed in step (11) is arranged in the order in which the scheduled queues are executed.
According to another aspect of the present invention, there is provided a parallel task scheduling system for a supercomputing center, which is provided in a client, the parallel task scheduling system comprising:
the first module is used for acquiring a text file from a user, wherein the text file records job information to be scheduled, schedulable queue information and server computing capability information;
the second module is used for preprocessing the obtained text file to obtain a preprocessed text file;
the third module is used for carrying out normalization processing on all the processing frequencies of the CPU of the server in the computing capacity information of the server, and updating the computing capacity information of the server by using the normalized processing frequencies of the CPU of the server;
a fourth module, configured to screen a scheduling queue name in the schedulable queue information according to the job information to be scheduled, so as to obtain a screened scheduling queue set;
a fifth module for calculating the use price of each scheduling queue in the scheduling queue set screened by the fourth module according to the CPU core number required for job running in the job information to be scheduled;
a sixth module for setting a counter i=1;
a seventh module, configured to determine whether i is greater than the total number of jobs corresponding to the job names in the job information to be scheduled; if so, go to the eleventh module, otherwise go to the eighth module;
an eighth module, configured to select a scheduling queue corresponding to the minimum use price from the standard use prices of the plurality of scheduling queues obtained by the fifth module, and schedule the ith job corresponding to the job name in the job information to be scheduled to the scheduling queue corresponding to the minimum use price for execution;
a ninth module, configured to update, in the schedulable queue information, a predicted CPU core number that the scheduling queue will be occupied by the job operation after the execution of the ith job by the corresponding scheduling queue is completed;
a tenth module, configured to set i=i+1, and return to the seventh module;
an eleventh module, configured to store the number of each job that has been executed, a job name corresponding to the job in the job information to be scheduled, a job global ID corresponding to the job in the job information to be scheduled, a service end name corresponding to the scheduling queue of the service end that executes the job in the schedulable queue information, and a scheduling queue name.
In general, the above technical solutions conceived by the present invention, compared with the prior art, enable the following beneficial effects to be obtained:
(1) Through steps (1) to (11), the invention uses a minimum-price-first scheduling strategy based on the supercomputing-center processors to execute scheduling efficiently, and dispatches jobs, via a standard price calculation formula, to the queue with the minimum price; this solves the technical problems of overlong job queuing time and low scheduling efficiency caused by the inadequate task scheduling of the strategies used by existing supercomputing centers;
(2) Through steps (4) and (5), the invention effectively screens out schedulable queues and calculates their standard prices, so that jobs needing large-scale processing can be accurately dispatched to the lowest-priced scheduling queue for execution; this solves the technical problem that, under the scheduling strategies used by existing supercomputing centers, jobs requiring large-scale processor resources cannot be efficiently dispatched to the corresponding lowest-priced scheduling queue, which adds a large amount of cost;
(3) Through steps (5) to (11), the invention uses a load-balancing scheduling strategy, so that load balance among the servers is well maintained; this solves the technical problem that existing supercomputing centers, lacking an effective load balancing strategy, suffer serious load imbalance and form a severe scheduling performance bottleneck.
Drawings
FIG. 1 is a flow chart of a parallel task scheduling method for a supercomputer center of the present invention;
FIG. 2 is a comparison of the performance of the method of the present invention in terms of average job overhead with the scheduling policy used by existing supercomputer centers.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
The basic idea of the invention is to make the final task-to-processor mapping decision with a calculation method that prioritizes scheduling by the lowest price of the supercomputing-center processors. The method stores all parsed data separately, calculates the use price of each job to be scheduled on each queue, sorts the resulting (queue, price) tuples, selects the entry with the lowest use price for priority scheduling, and periodically updates the resource information after a job is dispatched to its queue, so that the resource data of every queue remains accurate when the remaining jobs are scheduled. By executing this scheme, higher performance and a better load balancing effect are achieved, and cost is reduced.
As shown in fig. 1, the present invention provides a parallel task scheduling method for a supercomputing center, comprising the following steps:
(1) The method comprises the steps that a client acquires a text file from a user, wherein Job to be scheduled (Job) information, schedulable Queue (Queue) information and server computing capability information are recorded in the text file;
specifically, job information to be scheduled includes a job global ID (Jobgid), a job name (Username), a job execution required software name (application name), a job execution required software version (application version), an estimated job execution completion time (Walltime), and a job execution required CPU core number (CPU count).
As shown in table 1 below, which shows an example of job information to be scheduled:
TABLE 1
The schedulable queue information includes the name of the server to which the schedulable queue belongs (Servername), the scheduling queue name (Queuename), the maximum/minimum CPU core number (Max/MinCPUcount) provided by each scheduling queue in the scheduling queue names for job running, the maximum time limit (Walltimelimit) imposed by each scheduling queue in the scheduling queue names on job running, the software names (Applicationnames) contained in each scheduling queue in the scheduling queue names, the software versions (Applicationversions) contained in each scheduling queue in the scheduling queue names, and the per-unit-time usage cost (hpcPrice) of each scheduling queue in the scheduling queue names.
As shown in table 2 below, which illustrates an example of schedulable queue information:
TABLE 2
The server computing capability information includes a server name (Servername), a server available dispatch queue name (queue names), and a server CPU processing Frequency (Frequency).
As shown in table 3 below, which shows an example of server-side computing capability information:
service end name 1 Dispatch queue name 1 CPU processing frequency 1
Service end name 2 Dispatch queue name 2 CPU processing frequency 2
…… …… ……
Service end name n Dispatch queue name n CPU processing frequency n
TABLE 3 Table 3
(2) The client preprocesses the obtained text file to obtain a preprocessed text file;
Specifically, preprocessing the text file means removing the redundant symbols it contains, such as brackets, double quotation marks, and colons.
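The symbol stripping of step (2) can be sketched as follows; the exact symbol set beyond the bracketed examples above is an assumption, and Python is used only for illustration:

```python
import re

def preprocess(text: str) -> str:
    """Step (2): strip redundant symbols -- brackets, double quotation
    marks, and colons -- from the raw text file. Any symbols beyond the
    examples named in the description are an assumption."""
    return re.sub(r'[\[\]{}()":]', '', text)
```
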
(3) The client normalizes all the processing frequencies (frequencies) of the CPU of the server in the computing capacity information of the server, and updates the computing capacity information of the server by using the normalized processing frequencies of the CPU of the server;
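The normalization of step (3) is not spelled out in the text; one plausible reading, sketched here under that assumption, divides every server CPU frequency by the maximum so the fastest server maps to 1.0:

```python
def normalize_frequencies(freqs: dict) -> dict:
    """Step (3): normalize all server CPU processing frequencies.
    The patent does not name the normalization; max-scaling (divide by
    the largest frequency, fastest server -> 1.0) is one plausible
    choice and is what is sketched here."""
    fmax = max(freqs.values())
    return {server: f / fmax for server, f in freqs.items()}
```
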
(4) The client screens the scheduling queue names in the schedulable queue information according to the job information to be scheduled to obtain a screened scheduling queue set;
Specifically, this step searches for scheduling queues that simultaneously satisfy four conditions: the software names (Applicationnames) contained in the scheduling queue include the software name required for job running in the job information to be scheduled; the software versions (Applicationversions) contained in the scheduling queue include the software version (Applicationversion) required for job running; the maximum/minimum CPU core number (Max/MinCPUcount) provided by the scheduling queue for job running covers the CPU core number (CPUcount) required for job running; and the maximum time limit (Walltimelimit) imposed by the scheduling queue on job running covers the estimated job completion time (Walltime). The scheduling queues meeting all four conditions together form the screened scheduling queue set.
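The four screening conditions can be expressed as a single filter; the field names (`appname`, `min_cpu`, etc.) are illustrative stand-ins, not the patent's actual data format:

```python
def screen_queues(job: dict, queues: list) -> list:
    """Step (4): keep only scheduling queues that satisfy all four
    conditions at once. Field names are illustrative."""
    return [
        q for q in queues
        if job["appname"] in q["appnames"]                   # software name matches
        and job["appversion"] in q["appversions"]            # software version matches
        and q["min_cpu"] <= job["cpucount"] <= q["max_cpu"]  # core range covers request
        and job["walltime"] <= q["walltime_limit"]           # time limit covers estimate
    ]
```
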
(5) The client calculates the use price of each scheduling queue in the scheduling queue set screened in step (4) according to the CPU core number (CPUcount) required for job running in the job information to be scheduled;
Specifically, this step comprises: first, for each scheduling queue in the screened scheduling queue set, querying in the schedulable queue information obtained in step (4) the predicted CPU core number occupied by job running; then querying the corresponding server in the schedulable queue information according to the scheduling queue; then multiplying the CPU core number (CPUcount) required for job running by the time (Walltime) required for job running, further multiplying by the per-unit-time usage cost (hpcPrice) of the scheduling queue, and finally multiplying by the server CPU processing frequency corresponding to the queried server in the server computing capability information updated in step (3), thereby obtaining the standard use price of the scheduling queue.
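Read literally, the standard use price is the product of four factors; a minimal sketch, with illustrative units:

```python
def standard_price(cpucount: int, walltime: float,
                   hpc_price: float, norm_frequency: float) -> float:
    """Step (5): standard use price of a scheduling queue, read from the
    description as cores x walltime x per-unit-time queue cost (hpcPrice)
    x normalized server CPU frequency. Units are illustrative."""
    return cpucount * walltime * hpc_price * norm_frequency
```
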
(6) The client sets a counter i=1;
(7) The client judges whether i is larger than the total number of jobs corresponding to the job names in the job information to be scheduled; if so, the process goes to step (11), otherwise it goes to step (8);
(8) The client selects a scheduling queue corresponding to the minimum use price from the standard use prices of the plurality of scheduling queues obtained in the step (5), and schedules the ith job corresponding to the job name in the job information to be scheduled to the scheduling queue corresponding to the minimum use price for execution;
(9) After the corresponding scheduling queue completes execution of the ith job, the client updates, in the schedulable queue information, the predicted CPU core number occupied by job running for that scheduling queue (namely, subtracting the CPU core number the scheduling queue used to execute the ith job from the original value);
(10) The client sets i=i+1, and returns to step (7).
(11) The client saves, for each job that has been executed, its sequence number (arranged in the order of execution by the scheduled queues), the job name corresponding to the job in the job information to be scheduled, the corresponding job global ID, the name of the server whose scheduling queue in the schedulable queue information executed the job, and the scheduling queue name.
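Steps (6) through (11) amount to a greedy loop over the jobs; the sketch below simulates dispatch with a caller-supplied price function, and the field names are illustrative rather than the patent's actual format:

```python
def schedule_all(jobs: list, queues: list, price_of) -> list:
    """Steps (6)-(11) as a greedy loop: each job is dispatched to the
    queue with the lowest standard use price (step (8)); after the job
    completes, the queue's predicted occupied core count is reduced by
    the cores the job used (step (9)). Dispatch and completion are only
    simulated here."""
    record = []
    for i, job in enumerate(jobs, start=1):          # steps (6), (7), (10)
        best = min(queues, key=lambda q: price_of(job, q))   # step (8)
        record.append((i, job["name"], best["server"], best["name"]))
        best["occupied_cores"] -= job["cpucount"]            # step (9)
    return record                                            # step (11)
```
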
Performance testing
The present invention is compared with an existing scheduling algorithm (the min-min algorithm) by calculating average job overhead.
As shown in fig. 2, the abscissa indicates the time at which a job is submitted and the ordinate indicates the calculated price of the scheduled task. The price (Cost) is calculated as the time used multiplied by the number of CPU cores applied and by the per-unit-time price, where the time used is [user job end time − actual start time]; the lower the calculated price, the earlier the job is scheduled. As is evident from fig. 2, the average job overhead of the method of the present invention (shown in the figure as the ACFS algorithm, short for Application Cost First Scheduling, i.e., an application-aware price-first scheduling algorithm) is better than that of the existing min-min algorithm, since the method always dispatches tasks to the scheduling queue with the lowest price to ensure low-overhead execution of the overall computation.
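The cost metric described for Fig. 2 can be stated directly; variable names are illustrative:

```python
def job_cost(end_time: float, start_time: float,
             cpu_cores: int, unit_price: float) -> float:
    """Average-overhead metric behind the Fig. 2 comparison:
    Cost = (user job end time - actual start time)
           x CPU cores applied x per-unit-time price."""
    return (end_time - start_time) * cpu_cores * unit_price
```
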
To ensure low-overhead scheduling and load balance among processors, the invention makes the final task-to-processor mapping decision with a calculation method that schedules tasks by the lowest standard queue price: it stores all parsed data separately, calculates the standard use price of each job to be scheduled on each queue, sorts the resulting tuples, selects the entry with the lowest standard use price for priority scheduling, and periodically updates the resource information after a job is dispatched to its queue, so that the resource data of every queue remains accurate for the remaining job scheduling. Executing this scheme achieves higher performance and a better load balancing effect while reducing cost.
The invention provides a parallel task scheduling method based on minimum-price-first scheduling on supercomputing-center processors, which plays a key role in maintaining load-balancing performance and reducing cost, and also improves overall parallel efficiency.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (4)

1. A parallel task scheduling method for a supercomputing center, which is applied to a client, characterized in that the method comprises the following steps:
(1) Acquiring a text file from a user, wherein the text file records job information to be scheduled, schedulable queue information and server computing capability information; the job information to be scheduled comprises a job global ID, a job name, a software name required for job running, a software version required for job running, an estimated job completion time, and a CPU core number required for job running; the schedulable queue information comprises the name of the server to which the schedulable queue belongs, the scheduling queue names, a maximum/minimum CPU core number provided by each scheduling queue in the scheduling queue names for job running, a maximum time limit imposed by each scheduling queue in the scheduling queue names on job running, the software names contained in each scheduling queue in the scheduling queue names, the software versions contained in each scheduling queue in the scheduling queue names, and the per-unit-time usage cost of each scheduling queue in the scheduling queue names; and the server computing capability information comprises the server name, the scheduling queue names available to the server, and the CPU processing frequency of the server;
(2) Preprocessing the obtained text file to obtain a preprocessed text file;
(3) Normalizing all the processing frequencies of the CPU of the server in the computing capacity information of the server, and updating the computing capacity information of the server by using the normalized processing frequencies of the CPU of the server;
(4) Screening the scheduling queue names in the schedulable queue information according to the job information to be scheduled to obtain a screened scheduling queue set; specifically, searching for scheduling queues that simultaneously satisfy four conditions: the software names contained in the scheduling queue include the software name required for job running in the job information to be scheduled; the software versions contained in the scheduling queue include the software version required for job running; the maximum/minimum CPU core number provided by the scheduling queue for job running covers the CPU core number required for job running; and the maximum time limit imposed by the scheduling queue on job running covers the estimated job completion time; the scheduling queues meeting all four conditions together form the screened scheduling queue set;
(5) Calculating the use price of each scheduling queue in the scheduling queue set screened in step (4) according to the CPU core number required for job running in the job information to be scheduled; specifically, first, for each scheduling queue in the screened scheduling queue set, querying in the schedulable queue information obtained in step (4) the predicted CPU core number occupied by job running; then querying the corresponding server in the schedulable queue information according to the scheduling queue; then multiplying the CPU core number required for job running by the time required for job running, further multiplying by the per-unit-time usage cost of the scheduling queue, and finally multiplying by the server CPU processing frequency corresponding to the queried server in the server computing capability information updated in step (3), thereby obtaining the standard use price of the scheduling queue;
(6) Setting a counter i=1;
(7) Judging whether i is greater than the total number of jobs corresponding to the job names in the job information to be scheduled; if so, proceeding to step (11), otherwise proceeding to step (8);
(8) Selecting the scheduling queue corresponding to the minimum use price from the standard use prices of the scheduling queues obtained in step (5), and scheduling the ith job corresponding to the job name in the job information to be scheduled to that scheduling queue for execution;
(9) After the corresponding scheduling queue finishes executing the ith job, updating, in the schedulable queue information, the predicted CPU core number to be occupied by job operation for that scheduling queue;
(10) Setting i=i+1, and returning to step (7);
(11) Storing the number of each job that has been executed, the job name corresponding to the job in the job information to be scheduled, the job global ID corresponding to the job in the job information to be scheduled, the service end name of the service end whose scheduling queue executed the job in the schedulable queue information, and the scheduling queue name.
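Steps (6) through (11) form a greedy loop: each job goes to the cheapest feasible queue, and that queue's predicted occupied core count is then updated (by subtraction, as claim 2 specifies). A self-contained sketch under an assumed data layout, with the per-job re-screening omitted for brevity:

```python
def schedule_jobs(jobs, queues, servers):
    """Greedy min-price loop of steps (6)-(11): send each job to the
    queue with the lowest standard use price, update that queue's
    predicted occupied CPU core count, and record the result.
    Data layout and field names are illustrative assumptions."""
    results = []
    for i, job in enumerate(jobs, start=1):          # steps (6), (7), (10)
        prices = {
            q["name"]: job["cores"] * job["est_runtime"]
            * q["core_hour_cost"] * servers[q["server"]]["cpu_freq"]
            for q in queues
        }
        best = min(queues, key=lambda q: prices[q["name"]])  # step (8)
        # Step (9), per claim 2: subtract the cores used by the ith job
        # from the queue's predicted occupied core count.
        best["occupied_cores"] -= job["cores"]
        results.append({                             # step (11)
            "number": i,
            "job_name": job["name"],
            "job_global_id": job["id"],
            "server": best["server"],
            "queue": best["name"],
        })
    return results

jobs = [{"name": "j1", "id": "g-001", "cores": 4, "est_runtime": 2.0}]
queues = [
    {"name": "q1", "server": "s1", "core_hour_cost": 1.0, "occupied_cores": 100},
    {"name": "q2", "server": "s2", "core_hour_cost": 0.5, "occupied_cores": 100},
]
servers = {"s1": {"cpu_freq": 1.0}, "s2": {"cpu_freq": 1.0}}
log = schedule_jobs(jobs, queues, servers)
```

Here `q2` wins (price 4.0 versus 8.0 for `q1`), and its predicted occupied core count drops from 100 to 96.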
2. The parallel task scheduling method according to claim 1, wherein in step (9) the predicted CPU core number to be occupied by job operation for the scheduling queue in the schedulable queue information is updated by subtracting the CPU core number used by the scheduling queue to execute the ith job from its original value.
3. The parallel task scheduling method according to claim 2, wherein the numbers of the executed jobs in step (11) are arranged in the order in which the scheduling queues executed them.
4. A parallel task scheduling system for a super computing center, provided in a client, characterized in that the parallel task scheduling system comprises:
a first module, configured to acquire a text file from a user, wherein the text file records job information to be scheduled, schedulable queue information, and server computing capability information; the job information to be scheduled comprises a job global ID, a job name, the software name required for the job to run, the software version required for the job to run, the estimated job completion time, and the CPU core number required for the job to run; the schedulable queue information comprises a service end name, scheduling queue names, the maximum/minimum CPU core numbers each scheduling queue provides for job operation, the maximum job run-time limit of each scheduling queue, the software names contained in each scheduling queue, the software versions contained in each scheduling queue, and the per core-hour cost of each scheduling queue; and the server computing capability information comprises the service end name, the scheduling queue names available on the service end, and the CPU processing frequency of the service end;
a second module, configured to preprocess the acquired text file to obtain a preprocessed text file;
a third module, configured to normalize all of the server CPU processing frequencies in the server computing capability information, and to update the server computing capability information with the normalized CPU processing frequencies;
a fourth module, configured to screen the scheduling queue names in the schedulable queue information according to the job information to be scheduled, so as to obtain a screened scheduling queue set; specifically, the fourth module searches for every scheduling queue that simultaneously satisfies the following four conditions: the software names contained in the scheduling queue include the software name required for the job to run in the job information to be scheduled; the software versions contained in the scheduling queue include the software version required for the job to run; the maximum/minimum CPU core numbers provided by the scheduling queue for job operation bracket the CPU core number required for the job to run; and the maximum job run-time limit of the scheduling queue covers the estimated job completion time in the job information to be scheduled; the scheduling queues that simultaneously satisfy these four conditions together form the screened scheduling queue set;
a fifth module, configured to calculate the use price of each scheduling queue in the scheduling queue set screened by the fourth module according to the CPU core number required for the job to run in the job information to be scheduled; specifically, the fifth module first queries, for each scheduling queue in the screened scheduling queue set, the predicted CPU core number to be occupied by job operation in the schedulable queue information obtained by the fourth module, then queries the corresponding service end for that scheduling queue in the schedulable queue information, then multiplies the CPU core number required for the job to run by the time required for the job to run, multiplies the result by the per core-hour cost of that scheduling queue, and finally multiplies by the normalized CPU processing frequency of the queried service end in the server computing capability information updated by the third module, thereby obtaining the standard use price of the scheduling queue;
a sixth module for setting a counter i=1;
a seventh module, configured to determine whether i is greater than the total number of jobs corresponding to the job names in the job information to be scheduled, and if so, go to the eleventh module, otherwise go to the eighth module;
an eighth module, configured to select the scheduling queue corresponding to the minimum use price from the standard use prices of the scheduling queues obtained by the fifth module, and to schedule the ith job corresponding to the job name in the job information to be scheduled to that scheduling queue for execution;
a ninth module, configured to update, in the schedulable queue information, the predicted CPU core number to be occupied by job operation for the corresponding scheduling queue after that queue finishes executing the ith job;
a tenth module, configured to set i=i+1, and return to the seventh module;
an eleventh module, configured to store the number of each job that has been executed, the job name corresponding to the job in the job information to be scheduled, the job global ID corresponding to the job in the job information to be scheduled, the service end name of the service end whose scheduling queue executed the job in the schedulable queue information, and the scheduling queue name.
CN201911296937.7A 2019-12-17 2019-12-17 Parallel task scheduling method and system for super computing center Active CN111061553B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911296937.7A CN111061553B (en) 2019-12-17 2019-12-17 Parallel task scheduling method and system for super computing center

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911296937.7A CN111061553B (en) 2019-12-17 2019-12-17 Parallel task scheduling method and system for super computing center

Publications (2)

Publication Number Publication Date
CN111061553A CN111061553A (en) 2020-04-24
CN111061553B true CN111061553B (en) 2023-10-10

Family

ID=70301117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911296937.7A Active CN111061553B (en) 2019-12-17 2019-12-17 Parallel task scheduling method and system for super computing center

Country Status (1)

Country Link
CN (1) CN111061553B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113626173B (en) * 2021-08-31 2023-12-12 阿里巴巴(中国)有限公司 Scheduling method, scheduling device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001014961A2 (en) * 1999-08-26 2001-03-01 Parabon Computation System and method for the establishment and utilization of networked idle computational processing power
CN101957780A (en) * 2010-08-17 2011-01-26 中国电子科技集团公司第二十八研究所 Resource state information-based grid task scheduling processor and grid task scheduling processing method
CN103207814A (en) * 2012-12-27 2013-07-17 北京仿真中心 Decentralized cross cluster resource management and task scheduling system and scheduling method
CN103761147A (en) * 2014-01-15 2014-04-30 清华大学 Method and system for managing calculation examples in cloud platforms
CN109951558A (en) * 2019-03-27 2019-06-28 北京并行科技股份有限公司 A kind of cloud dispatching method of supercomputer resource, cloud control centre and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110154353A1 (en) * 2009-12-22 2011-06-23 Bmc Software, Inc. Demand-Driven Workload Scheduling Optimization on Shared Computing Resources

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001014961A2 (en) * 1999-08-26 2001-03-01 Parabon Computation System and method for the establishment and utilization of networked idle computational processing power
CN101957780A (en) * 2010-08-17 2011-01-26 中国电子科技集团公司第二十八研究所 Resource state information-based grid task scheduling processor and grid task scheduling processing method
CN103207814A (en) * 2012-12-27 2013-07-17 北京仿真中心 Decentralized cross cluster resource management and task scheduling system and scheduling method
CN103761147A (en) * 2014-01-15 2014-04-30 清华大学 Method and system for managing calculation examples in cloud platforms
CN109951558A (en) * 2019-03-27 2019-06-28 北京并行科技股份有限公司 A kind of cloud dispatching method of supercomputer resource, cloud control centre and system

Also Published As

Publication number Publication date
CN111061553A (en) 2020-04-24

Similar Documents

Publication Publication Date Title
US7810099B2 (en) Optimizing workflow execution against a heterogeneous grid computing topology
Biyabani et al. The integration of deadline and criticalness in hard real-time scheduling
CN110008024B (en) Container scheduling method and device based on delay decision under multidimensional constraint
US8843929B1 (en) Scheduling in computer clusters
CN109120715A (en) Dynamic load balancing method under a kind of cloud environment
Chang et al. Selecting the most fitting resource for task execution
US20080052712A1 (en) Method and system for selecting optimal clusters for batch job submissions
CN106557471A (en) Method for scheduling task and device
JP2003256222A (en) Distribution processing system, job distribution processing method and its program
CN107430526B (en) Method and node for scheduling data processing
US8407709B2 (en) Method and apparatus for batch scheduling, and computer product
Bibal Benifa et al. Performance improvement of Mapreduce for heterogeneous clusters based on efficient locality and replica aware scheduling (ELRAS) strategy
Dong et al. A grid task scheduling algorithm based on QoS priority grouping
CN111061553B (en) Parallel task scheduling method and system for super computing center
CN108665157B (en) Method for realizing balanced scheduling of cloud workflow system process instance
Ghazali et al. A classification of Hadoop job schedulers based on performance optimization approaches
Liu et al. Distributed energy-efficient scheduling for data-intensive applications with deadline constraints on data grids
CN111782627B (en) Task and data cooperative scheduling method for wide-area high-performance computing environment
CN110955527B (en) Method and system for realizing parallel task scheduling based on CPU (Central processing Unit) core number prediction
Xu et al. Expansion slot backfill scheduling for concurrent workflows with deadline on heterogeneous resources
CN112559440A (en) Method and device for realizing serial service performance optimization in multi-small-chip system
CN116360922A (en) Cluster resource scheduling method, device, computer equipment and storage medium
Tang et al. A network load perception based task scheduler for parallel distributed data processing systems
Bindu et al. Perspective study on resource level load balancing in grid computing environments
JPH1139340A (en) Data base retrieval system, multiprocessor system and data base retrieval method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Li Kenli

Inventor after: Liu Chubo

Inventor after: Cao Ronghui

Inventor after: Xiao Xiong

Inventor after: Tang Zhuo

Inventor after: Jiang Bingting

Inventor after: Li Wen

Inventor after: Zhu Jintao

Inventor after: Tang Xiaoyong

Inventor after: Yang Wangdong

Inventor after: Zhou Xu

Inventor before: Tang Zhuo

Inventor before: Liu Chubo

Inventor before: Cao Ronghui

Inventor before: Xiao Xiong

Inventor before: Li Kenli

Inventor before: Jiang Bingting

Inventor before: Li Wen

Inventor before: Zhu Jintao

Inventor before: Tang Xiaoyong

Inventor before: Yang Wangdong

Inventor before: Zhou Xu

GR01 Patent grant