CN108897619B - Multi-level resource flexible configuration method for super computer - Google Patents

Info

Publication number
CN108897619B
Authority
CN
China
Prior art keywords
time
breakpoints
job
expected time
supercomputer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810680674.9A
Other languages
Chinese (zh)
Other versions
CN108897619A (en)
Inventor
孟祥飞
康波
李健增
刘光明
菅晓东
雷秀丽
孙华文
马庆珍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Supercomputer Center In Tianjin
Original Assignee
National Supercomputer Center In Tianjin
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Supercomputer Center In Tianjin
Priority to CN202010258601.8A (patent CN111475297B)
Priority to CN201810680674.9A (patent CN108897619B)
Publication of CN108897619A
Application granted
Publication of CN108897619B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Multi Processors (AREA)

Abstract

The invention relates to a multi-level resource flexible configuration method for a supercomputer, comprising the following steps: allocating P_0 nodes of the supercomputer to a job; calculating the initial expected times [Figure DDA0001710257170000011] at which the M breakpoints {B_1, B_2, ..., B_M} are reached while the N tasks {T_1, T_2, ..., T_N} are executed; calculating the difference Δt_1 [Figure DDA0001710257170000014] between the actual time [Figure DDA0001710257170000012] at which breakpoint B_i, located between task T_j and task T_{j+1}, is reached after task T_j has been executed and the corresponding initial expected time [Figure DDA0001710257170000013]; and, when the condition shown in [Figure DDA0001710257170000015] is satisfied, allocating P_1 computing nodes to the remaining N-j unexecuted tasks {T_{j+1}, T_{j+2}, ..., T_N} and recalculating the first corrected expected times corresponding to the remaining M-i breakpoints {B_{i+1}, B_{i+2}, ..., B_M}.

Description

Multi-level resource flexible configuration method for super computer
Technical Field
The invention relates to a multi-level resource flexible configuration method for a super computer.
Background
A supercomputer is a computer built from many computing nodes that can perform large-scale computation or data processing in parallel; it is also called a parallel computer. Supercomputers are the most capable and fastest computers, with the largest storage capacity. They are used mainly in national high-technology fields and cutting-edge research, and are an important indicator of a country's scientific and technological development and overall national strength.
At present, when a user submits a job to a supercomputer, the user must specify the resources required to run it, such as storage space, node count and core count. Users usually estimate these requirements from experience or from trial runs on a small amount of data, so the estimates are often far off. If too few resources are requested, the job may be terminated by a timeout, overflow or similar fault and fail to produce the desired result; if too many are requested, the user pays extra and valuable supercomputing capacity is wasted. How to specify an appropriate amount of resources for a job when it is submitted and run is therefore a problem in urgent need of a solution.
Disclosure of Invention
To solve this technical problem, the invention provides a multi-level resource flexible configuration method for a supercomputer, comprising the following steps:
step S100, obtaining a job, wherein the job comprises N tasks {T_1, T_2, ..., T_N} and M breakpoints {B_1, B_2, ..., B_M} arranged between the tasks; allocating P_0 nodes of the supercomputer to the job; and calculating the initial expected times [Figure BDA0001710257160000011] at which the M breakpoints {B_1, B_2, ..., B_M} are reached while the N tasks {T_1, T_2, ..., T_N} are executed, wherein N, M and P_0 are all natural numbers and M > N;
step S200, calculating the difference Δt_1 [Figure BDA0001710257160000014] between the actual time [Figure BDA0001710257160000012] at which breakpoint B_i, located between task T_j and task T_{j+1}, is reached after task T_j has been executed and the corresponding initial expected time [Figure BDA0001710257160000013];
step S300, when the condition shown in [Figure BDA0001710257160000015] is satisfied, allocating P_1 computing nodes to the remaining N-j unexecuted tasks {T_{j+1}, T_{j+2}, ..., T_N} and recalculating the first corrected expected times [Figure BDA0001710257160000021] corresponding to the remaining M-i breakpoints {B_{i+1}, B_{i+2}, ..., B_M}, where |Δt_1| is the absolute value of Δt_1 and TH_1 is a set threshold (preferably not exceeding 5).
Detailed Description
The present invention is described in further detail below so that its objects, technical solutions and advantages become clearer. The description is given by way of example, and not limitation, of specific embodiments consistent with the principles of the invention, in sufficient detail to enable those skilled in the art to practice the invention. Other embodiments may be utilized, and the structure of various elements may be changed and/or substituted, without departing from the scope and spirit of the invention. The following detailed description is, therefore, not to be taken in a limiting sense.
One embodiment of the invention provides a multi-level resource flexible configuration method for a supercomputer. The supercomputer is a Tianhe supercomputer, in particular one of the Tianhe series such as TH-1, TH-1A or TH-2. Supercomputers of this series generally receive and execute a job in the form of a script file, which provides at least parameters such as the job submission mode, the computing partition, the number of nodes, the number of cores and the absolute path of the task script file. A reference submission form is "yhbatch -N N1 -p P1 -n n1 xxx.bat", where N1 is the number of nodes (an integer), P1 is the partition name (a string), n1 is the number of cores (an integer) and xxx.bat is the task script file name (a string). Specifically, the configuration method comprises the following steps:
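As a minimal illustration of the reference submission form, the following Python sketch assembles the argument list for a submission of the form "yhbatch -N N1 -p P1 -n n1 xxx.bat". The flag spelling is reconstructed from the parameter descriptions above; the function name, the partition name and the script path are hypothetical and serve only as an example.

```python
from typing import List


def build_yhbatch_command(nodes: int, partition: str, cores: int, script: str) -> List[str]:
    """Assemble the argument list 'yhbatch -N <nodes> -p <partition> -n <cores> <script>'.

    Assumption: the flags -N, -p and -n carry the node count, the partition
    name and the core count, as reconstructed from the description.
    """
    if nodes < 1 or cores < 1:
        raise ValueError("node count and core count must be positive integers")
    if not partition or not script:
        raise ValueError("partition name and script path must be non-empty")
    return ["yhbatch", "-N", str(nodes), "-p", partition, "-n", str(cores), script]


# Hypothetical values: 4 nodes, partition "debug", 96 cores.
command = build_yhbatch_command(4, "debug", 96, "/home/user/job.bat")
```

The list form can be passed directly to a process launcher without shell quoting; nothing in the sketch actually submits a job.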
step S100, obtaining a job through a script file, wherein the job comprises N tasks {T_1, T_2, ..., T_N} and M breakpoints {B_1, B_2, ..., B_M} arranged between the tasks. A task may be any suitable software or program that performs specific processing. Each task has an interface for input data and produces the processed result as output data; the output data of one task serves as the input data of the next, and the result of the job is obtained once the last task has been executed, at which point execution of the job is complete. A breakpoint is placed after one or more of the first N-1 tasks; the job is suspended temporarily at the breakpoint and continues with the next task after its execution progress has been evaluated. After the script file is obtained, P_0 nodes of the supercomputer are allocated to the job. Each node generally comprises several computing cores, for example 4 to 28, and on Tianhe supercomputers computing resources are generally allocated in units of nodes. Then, before the job is executed, the initial expected times [Figure BDA0001710257160000022] at which the M breakpoints {B_1, B_2, ..., B_M} are reached while the N tasks {T_1, T_2, ..., T_N} are executed, and the initial expected job run time [Figure BDA0001710257160000023] required to complete the job, i.e. all N tasks, are calculated, wherein N, M and P_0 are all natural numbers and M > N. The calculated initial expected times [Figure BDA0001710257160000024] and initial expected job run time [Figure BDA0001710257160000025] are stored and used in the following steps to evaluate the execution progress of the job;
step S200, after the job has been submitted to the supercomputer for execution, interrupting processing at breakpoint B_i between task T_j and task T_{j+1}, obtaining the actual running time [Figure BDA0001710257160000031] of the job from the start of execution to breakpoint B_i, and calculating the difference Δt_1 [Figure BDA0001710257160000034] between the actual time [Figure BDA0001710257160000032] and the corresponding initial expected time [Figure BDA0001710257160000033];
step S300, when the condition shown in [Figure BDA0001710257160000035] is satisfied, allocating P_1 computing nodes to the remaining N-j unexecuted tasks {T_{j+1}, T_{j+2}, ..., T_N} and recalculating the first corrected expected times [Figure BDA0001710257160000036] corresponding to the remaining M-i breakpoints {B_{i+1}, B_{i+2}, ..., B_M}, where |Δt_1| is the absolute value of Δt_1 and TH_1 is a set threshold. TH_1 may be any suitable value, typically not more than 10 and preferably not more than 5, for example not more than 4, 3, 2, 1, 0.5, 0.3, 0.2 or 0.1.
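The progress check of steps S200 and S300 can be sketched in Python as follows. This is only an illustration under stated assumptions: the exact trigger condition is given in a formula image that is not reproduced here, so the sketch assumes a strict comparison of the absolute deviation against the threshold, with both expressed in the same unit. The function name is hypothetical.

```python
from typing import Tuple


def check_breakpoint(actual_time: float, expected_time: float, threshold: float) -> Tuple[float, bool]:
    """Return (delta_t, reallocate) for one breakpoint.

    delta_t is the actual time minus the expected time (positive when the
    job is behind schedule). Assumption: reallocation is triggered when
    |delta_t| is strictly greater than the threshold.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    delta_t = actual_time - expected_time
    return delta_t, abs(delta_t) > threshold


# Hypothetical numbers: breakpoint expected at 120 time units, reached at 130.
delta, reallocate = check_breakpoint(130.0, 120.0, threshold=5.0)
```

A positive delta_t with reallocate set corresponds to the case in which more nodes are requested for the remaining tasks; a negative one corresponds to releasing nodes.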
In a preferred embodiment, in step S300, when Δt_1 > 0, P_1 = (1 + w) × P_0, and in the first corrected expected times [Figure BDA0001710257160000037], [Figure BDA0001710257160000038], where i+1 ≤ m ≤ M and [Figure BDA0001710257160000039]. Thus, as i increases, w approaches A_1, so that more resources are provided and the job is completed within the expected time as far as possible.
In a preferred embodiment, in step S300, when Δt_1 ≤ 0, P_1 = (1 - w) × P_0, and in the first corrected expected times [Figure BDA00017102571600000310], [Figure BDA00017102571600000311], where i+1 ≤ m ≤ M and [Figure BDA00017102571600000312]. This releases surplus resources as early as possible, so that the job still completes in time while costs are kept as low as possible.
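The node adjustment P_1 = (1 ± w) × P_0 can be illustrated with a short Python sketch. The formula for w is given in a formula image and is not reproduced here, so w is simply passed in as a parameter. The rounding (up when adding nodes, down when releasing them, never below one node) and the function name are assumptions made for this example, not part of the described method.

```python
import math


def adjusted_node_count(p0: int, delta_t: float, w: float) -> int:
    """Compute P_1 from P_0, the deviation delta_t and the weight w.

    delta_t > 0 (behind schedule):   P_1 = (1 + w) * P_0, rounded up.
    delta_t <= 0 (ahead of schedule): P_1 = (1 - w) * P_0, rounded down,
    but at least one node (assumption).
    """
    if p0 < 1:
        raise ValueError("P_0 must be a positive integer")
    if not 0.0 <= w < 1.0:
        raise ValueError("w is assumed to lie in [0, 1)")
    if delta_t > 0:
        return math.ceil((1.0 + w) * p0)
    return max(1, math.floor((1.0 - w) * p0))


# Hypothetical values: 16 nodes initially, w = 0.25.
more_nodes = adjusted_node_count(16, delta_t=10.0, w=0.25)
fewer_nodes = adjusted_node_count(16, delta_t=-10.0, w=0.25)
```

Because nodes are allocated as whole units, some rounding rule is needed in practice; the one chosen here errs on the side of finishing on time.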
Similarly, when breakpoint B_{i+x} between task T_{j+y} and task T_{j+y+1} is reached after task T_{j+y} has been executed, where 1 ≤ y < N-j and 1 ≤ x < M-i, the difference Δt_x [Figure BDA00017102571600000315] between the actual time [Figure BDA00017102571600000313] and the initial expected time [Figure BDA00017102571600000314] is calculated. When the condition shown in [Figure BDA00017102571600000316] is satisfied, P_x computing nodes are allocated to the remaining N-j-y tasks {T_{j+y+1}, T_{j+y+2}, ..., T_N} and the x-th corrected expected times [Figure BDA00017102571600000317] corresponding to the remaining M-i-x breakpoints {B_{i+x+1}, B_{i+x+2}, ..., B_M} are recalculated, where |Δt_x| is the absolute value of Δt_x and TH_x is a set threshold that may be the same as or different from TH_1. Because the later in the run a deviation is detected, the more resources are needed to correct it, TH_x is preferably smaller than TH_1, so that the resource allocation correction is triggered sensitively and promptly and the job is completed on time.
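The repeated check at later breakpoints, with a threshold that tightens after each correction, might be sketched as follows. This is a deliberately simplified illustration: it assumes a strict comparison |Δt| > TH, takes the list of thresholds as given, and does not recompute the expected times after a correction, because the correction formulas appear only in formula images. All names are hypothetical.

```python
from typing import List, Sequence, Tuple


def find_corrections(
    actual: Sequence[float],
    expected: Sequence[float],
    thresholds: Sequence[float],
) -> List[Tuple[int, float]]:
    """Return (breakpoint number, delta_t) for every breakpoint that triggers a correction.

    thresholds[0] plays the role of TH_1, thresholds[1] of TH_2, and so on;
    the last entry is reused once the list is exhausted.
    Simplification: expected times are not recomputed after a correction.
    """
    if not thresholds:
        raise ValueError("at least one threshold is required")
    corrections: List[Tuple[int, float]] = []
    for number, (t_actual, t_expected) in enumerate(zip(actual, expected), start=1):
        # The threshold in force depends on how many corrections were already made.
        threshold = thresholds[min(len(corrections), len(thresholds) - 1)]
        delta_t = t_actual - t_expected
        if abs(delta_t) > threshold:
            corrections.append((number, delta_t))
    return corrections


# Hypothetical run with four breakpoints and thresholds TH_1 = 5, TH_2 = 2.
triggered = find_corrections([10, 26, 33, 47], [10, 20, 30, 40], [5, 2])
```

With the tighter second threshold, a deviation of 3 at the third breakpoint triggers a correction that the first threshold of 5 would have let pass.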
In some cases, in step S200, the number of nodes P_0 to be allocated for the current run of the job, and the initial expected times [Figure BDA00017102571600000318] at which the M breakpoints {B_1, B_2, ..., B_M} are reached while the N tasks {T_1, T_2, ..., T_N} are executed, are calculated either from the algorithmic complexity of the job and the data volume of the current run, or from the historical run results of the job and the data volume of the current run. For example, the historical run results include the run times [Figure BDA0001710257160000041] of the job under different data volumes {D_1, D_2, ..., D_L} and different node counts {P_1, P_2, ..., P_K}, and the corresponding times [Figure BDA0001710257160000042] at which the M breakpoints {B_1, B_2, ..., B_M} are reached, where [Figure BDA0001710257160000043] and [Figure BDA0001710257160000044] denote, respectively, the run time and the time of reaching breakpoint B_i for data volume D_a and node count P_b. Accordingly, the historical run results are searched using the data volume D_c of the current run and the desired run time t_c; the node count P_b corresponding to the selected [Figure BDA0001710257160000045] is taken as the number of nodes P_0 to be allocated for the current run, and [Figure BDA0001710257160000046] is taken as the initial expected times [Figure BDA0001710257160000047] at which the M breakpoints {B_1, B_2, ..., B_M} are reached, where D_a is the value closest to D_c that is greater than or equal to D_c and, at the same time, [Figure BDA0001710257160000048] is the value closest to t_c that does not exceed t_c.
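The history lookup described above can be illustrated with the following Python sketch. It keeps only records whose data volume is at least D_c and whose run time does not exceed t_c, then picks the record closest to D_c and, among equals, closest to t_c. That tie-breaking order, the record layout and all names are assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class HistoricalRun(NamedTuple):
    data_volume: float                   # D_a
    nodes: int                           # P_b
    run_time: float                      # total run time for (D_a, P_b)
    breakpoint_times: Tuple[float, ...]  # times of reaching B_1..B_M


def select_initial_allocation(
    history: List[HistoricalRun], d_c: float, t_c: float
) -> Optional[HistoricalRun]:
    """Pick the record with D_a >= D_c closest to D_c and run time <= t_c closest to t_c.

    Returns None when no record qualifies (the interpolation case).
    Assumption: closeness in data volume is compared before closeness in run time.
    """
    candidates = [r for r in history if r.data_volume >= d_c and r.run_time <= t_c]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.data_volume - d_c, t_c - r.run_time))


# Hypothetical history with two breakpoints per run.
history = [
    HistoricalRun(100, 8, 50, (10, 30)),
    HistoricalRun(100, 16, 30, (6, 18)),
    HistoricalRun(200, 16, 55, (12, 35)),
    HistoricalRun(80, 8, 40, (8, 24)),
]
best = select_initial_allocation(history, d_c=90, t_c=60)
```

In this example the selected record supplies both the initial node count (its `nodes` field) and the initial expected breakpoint times (its `breakpoint_times` field).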
In other cases, when the number of nodes P_0 to be allocated for the current run of the job and the initial expected times [Figure BDA0001710257160000049] at which the M breakpoints {B_1, B_2, ..., B_M} are reached cannot be found in the historical run results from the data volume and desired run time of the current run, any of various known interpolation methods can be used to obtain the number of nodes P_0 to be allocated for the current run and the initial expected times [Figure BDA00017102571600000410] at which the M breakpoints {B_1, B_2, ..., B_M} are reached.
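As one possible instance of such an interpolation, the sketch below estimates breakpoint times for a data volume that lies between two historical data volumes recorded at the same node count. Linear interpolation in the data volume is an assumption chosen for simplicity; the text only requires some known interpolation method, and the function name is hypothetical.

```python
from typing import List, Sequence


def interpolate_breakpoint_times(
    d_low: float,
    times_low: Sequence[float],
    d_high: float,
    times_high: Sequence[float],
    d_c: float,
) -> List[float]:
    """Linearly interpolate breakpoint times between two historical runs.

    times_low and times_high are the breakpoint times recorded for data
    volumes d_low and d_high at the same node count; d_c must lie between them.
    """
    if not d_low < d_high:
        raise ValueError("d_low must be smaller than d_high")
    if not d_low <= d_c <= d_high:
        raise ValueError("d_c must lie between d_low and d_high")
    if len(times_low) != len(times_high):
        raise ValueError("both runs must record the same number of breakpoints")
    fraction = (d_c - d_low) / (d_high - d_low)
    return [lo + fraction * (hi - lo) for lo, hi in zip(times_low, times_high)]


# Hypothetical runs at data volumes 100 and 200; estimate for 150.
estimated = interpolate_breakpoint_times(100, [10, 30], 200, [20, 50], 150)
```

The same scheme could be applied along the node-count axis to estimate P_0 for a desired run time, provided the historical points bracket the target.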
In some cases, when the P_0 nodes are allocated, memory and storage space are also allocated according to the requirements of the N tasks {T_1, T_2, ..., T_N}; when the P_1 computing nodes are allocated, memory and storage space are allocated according to the requirements of the remaining N-j unexecuted tasks {T_{j+1}, T_{j+2}, ..., T_N}; and when the P_x computing nodes are allocated, memory and storage space are allocated according to the requirements of the remaining N-j-y tasks {T_{j+y+1}, T_{j+y+2}, ..., T_N}. Any known method may be used to allocate memory and storage space. Preferably, the storage space is increased when the occupancy of the allocated storage space exceeds 85%, preferably 75%, more preferably 70%.
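The storage occupancy rule can be expressed as a small Python check. The 85% default follows the text; the function name and the units are hypothetical, and how much storage to add is left open because the text does not specify it.

```python
def should_expand_storage(used: float, allocated: float, limit: float = 0.85) -> bool:
    """Return True when the occupancy of the allocated storage exceeds the limit.

    used and allocated must be in the same unit (for example gigabytes).
    """
    if allocated <= 0:
        raise ValueError("allocated storage must be positive")
    if used < 0:
        raise ValueError("used storage must be non-negative")
    return used / allocated > limit


# Hypothetical figures: 90 of 100 units used.
expand_now = should_expand_storage(90, 100)
```

Lowering the limit to 0.75 or 0.70 triggers expansion earlier, matching the more preferred thresholds in the text.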
With the method of the invention, suitable resources can be allocated to a job as accurately as possible before it is executed, based on historical data. During execution, resources can be dynamically added or reclaimed according to the actual progress of the job, that is, according to whether it is ahead of or behind the expected schedule. This achieves flexible, elastic configuration of supercomputer resources at multiple levels and minimizes the supercomputer resources occupied while fully ensuring that the job is completed on time.
Moreover, other implementations of the invention will be apparent to those skilled in the art from consideration of the specification of the invention disclosed herein. The embodiments and/or aspects of the embodiments can be used in the systems and methods of the present invention alone or in any combination. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

Claims (5)

1. A multi-level resource flexible configuration method for a supercomputer, comprising the following steps:
step S100, obtaining a job, wherein the job comprises N tasks {T_1, T_2, ..., T_N} and M breakpoints {B_1, B_2, ..., B_M} arranged between the tasks; allocating P_0 nodes of the supercomputer to the job; and calculating the initial expected times [Figure FDA0002392199370000011] at which the M breakpoints {B_1, B_2, ..., B_M} are reached while the N tasks {T_1, T_2, ..., T_N} are executed, wherein N, M and P_0 are all natural numbers and M > N;
step S200, calculating the difference Δt_1 [Figure FDA0002392199370000014] between the actual time [Figure FDA0002392199370000012] at which breakpoint B_i, located between task T_j and task T_{j+1}, is reached after task T_j has been executed and the corresponding initial expected time [Figure FDA0002392199370000013];
wherein the number of nodes P_0 to be allocated for the current run of the job, and the initial expected times [Figure FDA0002392199370000015] at which the M breakpoints {B_1, B_2, ..., B_M} are reached while the N tasks {T_1, T_2, ..., T_N} are executed, are calculated either from the algorithmic complexity of the job and the data volume of the current run, or from the historical run results of the job and the data volume of the current run; wherein the historical run results include the run times [Figure FDA0002392199370000016] of the job under different data volumes {D_1, D_2, ..., D_L} and different node counts {P_1, P_2, ..., P_K}, and the corresponding times [Figure FDA0002392199370000017] at which the M breakpoints {B_1, B_2, ..., B_M} are reached, where [Figure FDA0002392199370000018] and [Figure FDA0002392199370000019] denote, respectively, the run time and the time of reaching breakpoint B_i for data volume D_a and node count P_b; accordingly, the historical run results are searched using the data volume D_c of the current run and the desired run time t_c, the node count P_b corresponding to the selected [Figure FDA00023921993700000110] is taken as the number of nodes P_0 to be allocated for the current run, and [Figure FDA00023921993700000111] is taken as the initial expected times [Figure FDA00023921993700000112] at which the M breakpoints {B_1, B_2, ..., B_M} are reached, where D_a is the value closest to D_c that is greater than or equal to D_c and, at the same time, [Figure FDA00023921993700000113] is the value closest to t_c that does not exceed t_c;
step S300, when the condition shown in [Figure FDA00023921993700000114] is satisfied, allocating P_1 computing nodes to the remaining N-j unexecuted tasks {T_{j+1}, T_{j+2}, ..., T_N} and recalculating the first corrected expected times [Figure FDA0002392199370000021] corresponding to the remaining M-i breakpoints {B_{i+1}, B_{i+2}, ..., B_M}, where |Δt_1| is the absolute value of Δt_1 and TH_1 is a set threshold.
2. The multi-level resource flexible configuration method for a supercomputer according to claim 1, wherein, in step S300, TH_1 is a set threshold not exceeding 5.
3. The multi-level resource flexible configuration method for a supercomputer according to claim 1, wherein, when the number of nodes P_0 to be allocated for the current run of the job and the initial expected times [Figure FDA0002392199370000022] at which the M breakpoints {B_1, B_2, ..., B_M} are reached cannot be found in the historical run results from the data volume and desired run time of the current run, an interpolation method is used to obtain the number of nodes P_0 to be allocated for the current run and the initial expected times [Figure FDA0002392199370000023] at which the M breakpoints {B_1, B_2, ..., B_M} are reached.
4. The multi-level resource flexible configuration method for a supercomputer according to any one of claims 1 to 3, wherein, in step S300, when Δt_1 > 0, P_1 = (1 + w) × P_0, and in the first corrected expected times [Figure FDA0002392199370000024], [Figure FDA0002392199370000025], where i+1 ≤ m ≤ M and [Figure FDA0002392199370000026].
5. The multi-level resource flexible configuration method for a supercomputer according to any one of claims 1 to 3, wherein, in step S300, when Δt_1 ≤ 0, P_1 = (1 - w) × P_0, and in the first corrected expected times [Figure FDA0002392199370000027], [Figure FDA0002392199370000028], where i+1 ≤ m ≤ M and [Figure FDA0002392199370000029].
CN201810680674.9A 2018-06-27 2018-06-27 Multi-level resource flexible configuration method for super computer Active CN108897619B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010258601.8A CN111475297B (en) 2018-06-27 2018-06-27 Flexible operation configuration method
CN201810680674.9A CN108897619B (en) 2018-06-27 2018-06-27 Multi-level resource flexible configuration method for super computer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810680674.9A CN108897619B (en) 2018-06-27 2018-06-27 Multi-level resource flexible configuration method for super computer

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202010258601.8A Division CN111475297B (en) 2018-06-27 2018-06-27 Flexible operation configuration method

Publications (2)

Publication Number Publication Date
CN108897619A CN108897619A (en) 2018-11-27
CN108897619B true CN108897619B (en) 2020-05-05

Family

ID=64346784

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810680674.9A Active CN108897619B (en) 2018-06-27 2018-06-27 Multi-level resource flexible configuration method for super computer
CN202010258601.8A Active CN111475297B (en) 2018-06-27 2018-06-27 Flexible operation configuration method

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202010258601.8A Active CN111475297B (en) 2018-06-27 2018-06-27 Flexible operation configuration method

Country Status (1)

Country Link
CN (2) CN108897619B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083449B (en) * 2019-04-08 2020-04-28 清华大学 Method and device for dynamically allocating memory and processor and computing module
CN114020443B (en) * 2022-01-05 2022-03-18 国家超级计算天津中心 Supercomputer resource scheduling method, electronic device and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102906696A (en) * 2010-03-26 2013-01-30 维尔图尔梅特里克斯公司 Fine grain performance resource management of computer systems
CN106254058A (en) * 2015-06-12 2016-12-21 华为技术有限公司 A kind of method and device of the frequency adjusting server
CN107045456A (en) * 2016-02-05 2017-08-15 华为技术有限公司 A kind of resource allocation methods and explorer

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8185422B2 (en) * 2006-07-31 2012-05-22 Accenture Global Services Limited Work allocation model
US9262216B2 (en) * 2012-02-14 2016-02-16 Microsoft Technologies Licensing, LLC Computing cluster with latency control
CN104252391B (en) * 2013-06-28 2017-09-12 国际商业机器公司 Method and apparatus for managing multiple operations in distributed computing system
US9417928B2 (en) * 2014-12-24 2016-08-16 International Business Machines Corporation Energy efficient supercomputer job allocation
CN104536770A (en) * 2015-01-28 2015-04-22 浪潮电子信息产业股份有限公司 Job submitting and restoring method capable of supporting break restoration of concurrent jobs
CN105808334B (en) * 2016-03-04 2016-12-28 山东大学 A kind of short optimization of job system and method for MapReduce based on resource reuse

Also Published As

Publication number Publication date
CN111475297A (en) 2020-07-31
CN111475297B (en) 2023-04-07
CN108897619A (en) 2018-11-27

Similar Documents

Publication Publication Date Title
US10733026B2 (en) Automated workflow selection
CN111381950B (en) Multi-copy-based task scheduling method and system for edge computing environment
CN104765640B (en) A kind of intelligent Service dispatching method
CN108205469B (en) MapReduce-based resource allocation method and server
CN110347515B (en) Resource optimization allocation method suitable for edge computing environment
CN108897619B (en) Multi-level resource flexible configuration method for super computer
CN108427602B (en) Distributed computing task cooperative scheduling method and device
CN104199739A (en) Speculation type Hadoop scheduling method based on load balancing
CN115237580A (en) Intelligent calculation-oriented flow parallel training self-adaptive adjustment system and method
CN111176810B (en) Meteorological hydrology data processing scheduling system based on priority
CN115586961A (en) AI platform computing resource task scheduling method, device and medium
CN113448714B (en) Computing resource control system based on cloud platform
CN108108242B (en) Storage layer intelligent distribution control method based on big data
CN108139929B (en) Task scheduling apparatus and method for scheduling a plurality of tasks
CN111736959B (en) Spark task scheduling method considering data affinity under heterogeneous cluster
CN113452546A (en) Dynamic quality of service management for deep learning training communications
CN117331668A (en) Job scheduling method, device, equipment and storage medium
CN116915869A (en) Cloud edge cooperation-based time delay sensitive intelligent service quick response method
JP2006195985A (en) Method for controlling resource utilization rate and computer system
Banicescu et al. Towards the robustness of dynamic loop scheduling on large-scale heterogeneous distributed systems
CN112052087B (en) Deep learning training system and method for dynamic resource adjustment and migration
CN111178529B (en) Data processing method and device, electronic equipment and readable storage medium
CN112015539A (en) Task allocation method, device and computer storage medium
KR101470695B1 (en) Method and system of biogeography based optimization for grid computing scheduling
CN115280286A (en) Dynamic allocation and reallocation of learning model computing resources

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant