CN108563510A - Architecture-aware optimization method for exascale computing - Google Patents
Architecture-aware optimization method for exascale computing
- Publication number
- CN108563510A
- Authority
- CN
- China
- Prior art keywords
- sensing
- optimizing
- code section
- grades
- independent code
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
Technical field
The invention relates to an architecture-aware optimization method for exascale computing.
Background
With the development of the economy and technology, rising living standards, and the arrival of the information age, the era of exascale supercomputers, which perform on the order of 10^18 operations per second, has arrived. Exascale computing offers enormous computing power and can bring revolutionary changes to computational science and scientific research.
As science advances, numerical models of all kinds keep growing in scale and come closer to real conditions. A more accurate model, however, is also more expensive to solve. Research institutes currently have relatively tight budgets and cannot afford to buy large numbers of servers. As a result, even an institute that has built a more accurate numerical model may be unable to solve it on its own.
If a research institute can port the code of such a model to an exascale supercomputing platform, it no longer needs to purchase many high-performance servers. It only needs to port the computation code to the platform, monitor the progress and results of the computation remotely, and pay for the supercomputing time. That cost is far lower than the cost of buying many high-performance servers.
However, there has so far been no research on how to port the computation code of a numerical model to an exascale supercomputing platform effectively, so that the platform's computing power is used to the fullest. This gap has, to some extent, held back development in the field.
Summary of the invention
The object of the invention is to provide an architecture-aware optimization method for exascale computing that analyzes the structure of the tasks to be run and optimizes the program accordingly.
The architecture-aware optimization method for exascale computing provided by the invention includes the following steps:
A task-flow-aware optimization step, which examines all tasks to be executed and adjusts their processing order, so that the task flow is processed more efficiently;
A program-code-aware optimization step, which distributes the program code to be executed to the best-suited systems for distributed computation and then aggregates the results, so that the program code is processed more efficiently;
A program-algorithm-aware optimization step, which analyzes the program algorithms and allocates the corresponding computing resources, so that the algorithms are processed more efficiently;
A vector-aware optimization step, which selects, for the vectorized computations in the calculation, the compressed storage format that achieves the greatest compression, so that vectorized computation is processed more efficiently.
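The text does not say which storage formats the vector-aware step chooses between, or how. As a rough illustration only, the sketch below assumes the candidates are a dense layout, coordinate (COO) format, and compressed sparse row (CSR) format, and that "greatest compression" means the smallest number of stored array elements. The function names and the cost model are assumptions made for this example and do not come from the patent.

```python
def storage_cost(matrix):
    """Count stored array elements for each candidate format.

    Assumes index entries and values occupy the same width, so the
    counts are comparable. `matrix` is a list of equal-length rows.
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    nnz = sum(1 for row in matrix for value in row if value != 0)
    return {
        "dense": rows * cols,        # every entry stored
        "coo": 3 * nnz,              # row index, column index, value per nonzero
        "csr": 2 * nnz + rows + 1,   # column index and value per nonzero, plus row pointers
    }


def pick_format(matrix):
    """Return the name of the format with the smallest element count."""
    costs = storage_cost(matrix)
    return min(costs, key=costs.get)


# A mostly empty matrix favors a sparse format; a full one stays dense.
sparse = [
    [5, 0, 0, 0],
    [0, 0, 7, 0],
    [0, 0, 0, 0],
    [0, 2, 0, 0],
]
full = [[1, 2], [3, 4]]

print(pick_format(sparse))  # coo
print(pick_format(full))    # dense
```

A real selection would probably also weigh access patterns and vector width, which this element count ignores.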
The task-flow-aware optimization specifically includes the following steps:
(1) Obtain all tasks to be executed and confirm that no task depends on another;
(2) Identify the subtasks of all tasks obtained in step (1) and allocate computing resources to the subtasks together within one time period;
(3) Use a shortest-job-first optimization algorithm to find the subtask with the least computation among all subtasks, and process that subtask first.
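Step (3) can be illustrated with a small shortest-job-first ordering. This is a minimal sketch: it assumes each subtask carries an estimated computation cost (the patent does not say how that estimate is obtained) and that the tasks have already been confirmed to be independent, as step (1) requires. The `Subtask` type and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtask:
    task: str    # name of the parent task
    name: str    # name of the subtask
    cost: float  # estimated amount of computation, arbitrary units


def shortest_job_first(subtasks):
    """Order subtasks by ascending estimated cost.

    sorted() is stable, so subtasks with equal cost keep their
    submission order.
    """
    return sorted(subtasks, key=lambda s: s.cost)


def mean_waiting_time(ordered):
    """Average time a subtask waits before starting on a single worker."""
    waited, clock = 0.0, 0.0
    for s in ordered:
        waited += clock   # this subtask waited for everything before it
        clock += s.cost
    return waited / len(ordered) if ordered else 0.0


pending = [
    Subtask("model_a", "assemble", 8.0),
    Subtask("model_a", "solve", 20.0),
    Subtask("model_b", "preprocess", 3.0),
    Subtask("model_b", "reduce", 5.0),
]
schedule = shortest_job_first(pending)

print([s.name for s in schedule])   # ['preprocess', 'reduce', 'assemble', 'solve']
print(mean_waiting_time(pending))   # 16.75
print(mean_waiting_time(schedule))  # 6.75
```

Running the cheapest subtasks first lowers the average waiting time, which is the usual motivation for this ordering.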
The program-code-aware optimization specifically includes the following steps:
1) Run a set of test code to find the best match between each typical independent code segment and the system hardware;
2) Analyze the program code and identify the independent code segments in it;
3) Match the independent code segments obtained in step 2) against the typical independent code segments to obtain the correspondence between them;
4) Using the correspondence from step 3) and the best hardware match for each typical independent code segment from step 1), assign each independent code segment to the corresponding system hardware, so that the program code is processed more efficiently.
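Steps 1) to 4) amount to a two-stage lookup, sketched below under several assumptions: benchmark results are given as measured run times per hardware type, each code segment is described by a set of feature tags, and segments are matched to typical segments by Jaccard similarity of those tags. None of these choices is specified in the patent. The segment names, tags, and timings are made up for illustration.

```python
def best_hardware(benchmarks):
    """Step 1: for each typical segment, pick the hardware with the lowest run time."""
    return {segment: min(times, key=times.get) for segment, times in benchmarks.items()}


def jaccard(a, b):
    """Similarity of two tag sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_segment(features, typical_features):
    """Step 3: return the typical segment whose tags are most similar."""
    return max(typical_features, key=lambda name: jaccard(features, typical_features[name]))


def assign(segments, typical_features, benchmarks):
    """Step 4: map each independent segment to the hardware of its matched typical segment."""
    best = best_hardware(benchmarks)
    return {
        name: best[match_segment(features, typical_features)]
        for name, features in segments.items()
    }


# Hypothetical benchmark results in seconds (step 1).
benchmarks = {
    "dense_matmul": {"cpu": 4.0, "gpu": 0.5},
    "branchy_scalar": {"cpu": 1.0, "gpu": 6.0},
}
# Hypothetical feature tags describing the typical segments.
typical_features = {
    "dense_matmul": {"regular_loops", "floating_point", "data_parallel"},
    "branchy_scalar": {"branches", "pointer_chasing"},
}
# Independent segments found in the program (step 2), with their tags.
segments = {
    "stencil_update": {"regular_loops", "floating_point"},
    "mesh_refine": {"branches", "pointer_chasing", "recursion"},
}

placement = assign(segments, typical_features, benchmarks)
print(placement)  # {'stencil_update': 'gpu', 'mesh_refine': 'cpu'}
```

Keeping the benchmark table separate from the matching step means the test code only has to be run once per hardware configuration, not once per program.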
The program-algorithm-aware optimization specifically includes the following steps:
A. Before the program algorithms run, divide the computing resources evenly among the different program algorithms;
B. While the program algorithms are running, if a computation has to wait, re-analyze the resource requirements of the program algorithms;
C. Based on the analysis result from step B, reallocate the computing resources using the dynamic resource balancing (DRF) algorithm.
The reallocation of computing resources in step C follows these principles:
R1. No user may obtain more resources than other users;
R2. No user may obtain more resources by misreporting its resource requirements;
R3. All available resources are allocated, without displacing existing allocations;
R4. No user prefers another user's resource allocation.
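The patent names the algorithm only as "DRF". The acronym usually denotes dominant resource fairness, whose published properties resemble R2 to R4, so the sketch below assumes that is what is meant. It uses a simplified progressive-filling loop: among the users whose next task still fits, the one with the smallest dominant share receives one more task. The function names, the two-resource cluster, and the per-task demands are assumptions made for this example.

```python
from fractions import Fraction


def dominant_share(allocated, capacity):
    """Largest fraction of any single resource that a user currently holds."""
    return max(Fraction(a, c) for a, c in zip(allocated, capacity))


def drf_allocate(capacity, demands):
    """Allocate whole tasks by dominant resource fairness.

    capacity: total amount of each resource (positive integers).
    demands:  per-task demand vector of each user (non-negative integers).
    Returns (tasks per user, resources allocated per user).
    """
    for user, demand in demands.items():
        if not any(d > 0 for d in demand):
            raise ValueError(f"user {user!r} must demand at least one resource")
    used = [0] * len(capacity)
    allocated = {user: [0] * len(capacity) for user in demands}
    tasks = {user: 0 for user in demands}
    while True:
        # Users whose next task still fits in the remaining capacity.
        fits = [
            user for user, demand in demands.items()
            if all(u + d <= c for u, d, c in zip(used, demand, capacity))
        ]
        if not fits:
            return tasks, allocated
        # Serve the user with the lowest dominant share next.
        user = min(fits, key=lambda u: dominant_share(allocated[u], capacity))
        for i, d in enumerate(demands[user]):
            used[i] += d
            allocated[user][i] += d
        tasks[user] += 1


# Hypothetical cluster: 9 CPU cores and 18 GB of memory.
capacity = [9, 18]
# User A needs <1 core, 4 GB> per task; user B needs <3 cores, 1 GB> per task.
demands = {"A": [1, 4], "B": [3, 1]}

tasks, allocated = drf_allocate(capacity, demands)
print(tasks)      # {'A': 3, 'B': 2}
print(allocated)  # {'A': [3, 12], 'B': [6, 2]}
```

In this example both users end with a dominant share of 2/3: user A holds 12 of 18 GB and user B holds 6 of 9 cores. The sketch covers step C only; the even split of step A and the wait-triggered re-analysis of step B would wrap around it.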
By combining several awareness-based optimization algorithms, the architecture-aware optimization method for exascale computing provided by the invention can adjust and allocate hardware resources sensibly across the massive computing resources of an exascale system, and thereby improve computing efficiency.
Description of the drawings
Fig. 1 is a schematic flow diagram of the method of the invention.
Detailed description
Fig. 1 shows a schematic flow diagram of the method of the invention. The architecture-aware optimization method for exascale computing provided by the invention includes the following steps:
A task-flow-aware optimization step, which examines all tasks to be executed and adjusts their processing order, so that the task flow is processed more efficiently; it specifically includes the following steps:
(1) Obtain all tasks to be executed and confirm that no task depends on another;
(2) Identify the subtasks of all tasks obtained in step (1) and allocate computing resources to the subtasks together within one time period;
(3) Use a shortest-job-first optimization algorithm to find the subtask with the least computation among all subtasks, and process that subtask first;
A program-code-aware optimization step, which distributes the program code to be executed to the best-suited systems for distributed computation and then aggregates the results, so that the program code is processed more efficiently; it specifically includes the following steps:
1) Run a set of test code to find the best match between each typical independent code segment and the system hardware;
2) Analyze the program code and identify the independent code segments in it;
3) Match the independent code segments obtained in step 2) against the typical independent code segments to obtain the correspondence between them;
4) Using the correspondence from step 3) and the best hardware match for each typical independent code segment from step 1), assign each independent code segment to the corresponding system hardware, so that the program code is processed more efficiently;
A program-algorithm-aware optimization step, which analyzes the program algorithms and allocates the corresponding computing resources, so that the algorithms are processed more efficiently; it specifically includes the following steps:
A. Before the program algorithms run, divide the computing resources evenly among the different program algorithms;
B. While the program algorithms are running, if a computation has to wait, re-analyze the resource requirements of the program algorithms;
C. Based on the analysis result from step B, reallocate the computing resources using the dynamic resource balancing (DRF) algorithm;
A vector-aware optimization step, which selects, for the vectorized computations in the calculation, the compressed storage format that achieves the greatest compression, so that vectorized computation is processed more efficiently.
The reallocation of computing resources in step C follows these principles:
R1. No user may obtain more resources than other users;
R2. No user may obtain more resources by misreporting its resource requirements;
R3. All available resources are allocated, without displacing existing allocations;
R4. No user prefers another user's resource allocation.
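Principle R4 can be checked mechanically once an allocation is known. The sketch below assumes each user runs identical tasks with a fixed per-task demand, and treats a user as preferring another user's allocation when that bundle would let it run strictly more tasks. The helper names and the sample allocations are hypothetical and not part of the patent.

```python
def runnable_tasks(bundle, demand):
    """Number of whole tasks with per-task `demand` that fit inside `bundle`.

    Assumes `demand` has at least one positive entry.
    """
    return min(b // d for b, d in zip(bundle, demand) if d > 0)


def envy_pairs(allocations, demands):
    """Return (user, other) pairs where `user` would do better with `other`'s bundle."""
    pairs = []
    for user, demand in demands.items():
        own = runnable_tasks(allocations[user], demand)
        for other, bundle in allocations.items():
            if other != user and runnable_tasks(bundle, demand) > own:
                pairs.append((user, other))
    return pairs


# Per-task demands as <CPU cores, GB of memory>.
demands = {"A": [1, 4], "B": [3, 1]}

# An allocation in which neither user gains by swapping bundles.
balanced = {"A": [3, 12], "B": [6, 2]}
# A lopsided allocation that favors user B.
lopsided = {"A": [1, 4], "B": [8, 14]}

print(envy_pairs(balanced, demands))  # []
print(envy_pairs(lopsided, demands))  # [('A', 'B')]
```

An empty result means the allocation satisfies R4 for the stated demands. It says nothing about R1 to R3, which need separate checks.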
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810418522.1A CN108563510B (en) | 2018-05-04 | 2018-05-04 | Architecture-aware optimization method for exascale computing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810418522.1A CN108563510B (en) | 2018-05-04 | 2018-05-04 | Architecture-aware optimization method for exascale computing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108563510A (en) | 2018-09-21 |
CN108563510B (en) | 2021-07-13 |
Family
ID=63537648
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810418522.1A Active CN108563510B (en) | 2018-05-04 | 2018-05-04 | Architecture-aware optimization method for exascale computing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108563510B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130080482A1 (en) * | 2003-04-09 | 2013-03-28 | Gary Charles Berkowitz | Virtual Supercomputer |
CN101453398A (en) * | 2007-12-06 | 2009-06-10 | 怀特威盛软件公司 | Novel distributed grid super computer system and method |
US8874879B2 (en) * | 2010-11-11 | 2014-10-28 | Fujitsu Limited | Vector processing circuit, command issuance control method, and processor system |
CN104969207A (en) * | 2012-10-22 | 2015-10-07 | 英特尔公司 | High Performance Interconnect Coherence Protocol |
CN103885839A (en) * | 2014-04-06 | 2014-06-25 | 孙凌宇 | Cloud computing task scheduling method based on multilevel division method and empowerment directed hypergraphs |
CN107147517A (en) * | 2017-03-24 | 2017-09-08 | 上海交通大学 | An Adaptive Computing Resource Allocation Method for Virtual Network Functions |
CN107977270A (en) * | 2017-11-22 | 2018-05-01 | 用友金融信息技术股份有限公司 | Peers distribution method, peers distribution system and computer installation |
Also Published As
Publication number | Publication date |
---|---|
CN108563510B (en) | 2021-07-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6939132B2 (en) | Application profiling job management system, programs, and methods | |
CN110427262B (en) | Gene data analysis method and heterogeneous scheduling platform | |
CN105808328B (en) | The methods, devices and systems of task schedule | |
US11409576B2 (en) | Dynamic distribution of a workload processing pipeline on a computing infrastructure | |
Bansal et al. | Cost performance of QoS Driven task scheduling in cloud computing | |
CN106991006B (en) | Support the cloud workflow task clustering method relied on and the time balances | |
JP2013218700A (en) | Distributed processing system, scheduler node and scheduling method of distributed processing system, and program generation apparatus therefor | |
CN110618865B (en) | Hadoop task scheduling method and device | |
CN109865292B (en) | Game resource construction method and device based on game engine | |
CN108132840B (en) | Resource scheduling method and device in distributed system | |
JP2012118669A (en) | Load distribution processing system and load distribution processing method | |
CN113867907A (en) | CPU resource-based scheduling system and optimization algorithm in engineering field | |
CN104753977A (en) | Seismic processing and interpretation infrastructure cloud resource scheduling method based on fuzzy clustering | |
JP6778130B2 (en) | Virtual computer system and its resource allocation method | |
CN104598304B (en) | Method and apparatus for the scheduling in Job execution | |
CN113168344A (en) | Distributed resource management by increasing cluster diversity | |
WO2017173662A1 (en) | Heterogeneous system based program processing method and device | |
CN115168058A (en) | Thread load balancing method, device, equipment and storage medium | |
CN108563510A (en) | Architecture-aware optimization method for exascale computing |
KR101695238B1 (en) | System and method for job scheduling using multi computing resource | |
Rahmani et al. | Machine learning-driven energy-efficient load balancing for real-time heterogeneous systems | |
CN114356550A (en) | Three-level parallel middleware-oriented automatic computing resource allocation method and system | |
CN115033389A (en) | Energy-saving task resource scheduling method and device for power grid information system | |
Pearce et al. | MPMD framework for offloading load balance computation | |
JP2016173643A (en) | Distributed processing control device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||