CN109144944A - A bandwidth scheduling method for program groups with optimal concurrency performance
- Publication number: CN109144944A (application CN201810858682.8A)
- Authority
- CN
- China
- Prior art keywords
- program
- bandwidth
- program segment
- scheduling
- current program
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/17306—Intercommunication techniques
- G06F15/17318—Parallel communications techniques, e.g. gather, scatter, reduce, broadcast, multicast, all to all
Abstract
The invention discloses a bandwidth scheduling method for program groups with optimal concurrency performance, comprising the steps of: binding a performance monitoring unit (PMU) to each program's execution, dividing the execution into program segments according to the scheduling timeslice length, and storing the PMU data of each segment in a program-segment execution-information database, in one-to-one correspondence with a specific identification code; performing segment identification for each program in the current program group and reading the corresponding PMU data; estimating the bandwidth demand from the PMU data of the current segment and performing bandwidth scheduling; selecting programs for concurrent execution while estimating the bandwidth demand of the next timeslice from the PMU data of the next segment; executing the current segment and storing its execution information; and repeating continuously until the bandwidth scheduling and execution optimization of the current program group are complete. The method requires no user intervention, achieves continuous updating and self-optimization, estimates bandwidth demand accurately and efficiently, and delivers good scheduling results.
Description
Technical field
The invention belongs to the field of computer system technology, and in particular to a bandwidth scheduling method for program groups with optimal concurrency performance.
Background technique
Constrained by power consumption and hardware limits, the simple approach of improving processor performance by raising chip frequency has become increasingly difficult, so multicore and many-core designs have become the development trend of computing systems. Such systems, however, integrate multiple processing cores on one processor, and the cores share a single memory access bus; system performance is constrained by memory access bandwidth, making memory access performance the main bottleneck limiting system performance.
Therefore, improving memory access efficiency and reducing the system's dependence on memory bandwidth can reduce memory access latency and further improve the operating efficiency of the system.
Summary of the invention
To address the above deficiencies of the prior art, the present invention provides a bandwidth scheduling method for program groups with optimal concurrency performance, comprising the steps of:
(1) binding a performance monitoring unit (PMU) to each program's execution, dividing the execution into program segments according to the scheduling timeslice length, and storing the PMU data of each segment in a program-segment execution-information database, in one-to-one correspondence with a specific identification code;
(2) performing segment identification for each program in the current program group and reading the corresponding PMU data;
(3) estimating the bandwidth demand from the PMU data of the current segment and performing bandwidth scheduling;
(4) selecting programs for concurrent execution, while estimating the bandwidth demand of the next timeslice from the PMU data of the next segment;
(5) executing the current segment, then updating the corresponding PMU data in the program-segment execution-information database, matched to the corresponding identification code;
(6) performing bandwidth scheduling for the next segment, and repeating steps (4) and (5) until the bandwidth scheduling and execution optimization of the current program group are complete.
Preferably, in step (4), the total bandwidth demand of the next timeslice is not lower than the average bandwidth demand of the program group.
Preferably, the scheduling timeslice length is 5-200 ms.
Preferably, the bandwidth demand is estimated from the program's solo-execution performance; further, the solo-execution performance is computed from the frequency at which the program segment occurs and the efficiency of its cache and main memory accesses.
Preferably, each program segment has a corresponding program-segment information table for recording the segment's current and historical execution information.
Beneficial effects of the invention:
The bandwidth scheduling method of the invention performs bandwidth scheduling according to historical execution information. After a program finishes executing, its current execution information is automatically updated in the program-segment execution-information database and loaded again in future executions, achieving continuous updating and self-optimization. Because the execution data are obtained through the PMU, no user intervention is needed, and bandwidth demand is estimated and scheduled accurately and efficiently.
Detailed description of the invention
Fig. 1 is a schematic flowchart of the bandwidth scheduling of the invention.
Specific embodiment
The invention is described in detail below with reference to the accompanying drawing and a specific embodiment.
Referring to Fig. 1, the present invention provides a bandwidth scheduling method for program groups with optimal concurrency performance, comprising the steps of:
(1) A performance monitoring unit (PMU) is bound to each program's execution, the execution is divided into program segments with a scheduling timeslice length of 50 ms, and the PMU data of each segment is stored in the program-segment execution-information database, in one-to-one correspondence with a specific identification code.
(2) Segment identification is performed for each program in the current program group, and the corresponding PMU data is read according to the identification code. To improve the validity of segment identification, it can be implemented with basic block vectors: two executions are judged to be the same segment if the Manhattan distance between the basic block vectors formed during segment execution is less than the similarity threshold.
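As an illustrative sketch of this similarity test (the vector values, the threshold of 0.2, and all function names are assumptions for illustration, not taken from the patent):

```python
def manhattan_distance(bbv_a, bbv_b):
    """L1 distance between two basic block vectors of equal dimension."""
    return sum(abs(a - b) for a, b in zip(bbv_a, bbv_b))

def is_same_segment(bbv_a, bbv_b, threshold=0.2):
    """Treat two executions as the same program segment when the
    Manhattan distance of their basic block vectors stays below
    the similarity threshold (threshold value is illustrative)."""
    return manhattan_distance(bbv_a, bbv_b) < threshold

# Basic block vectors: normalized execution counts of each basic block
# observed during one timeslice (values are illustrative).
seg_a = [0.50, 0.30, 0.20]
seg_b = [0.45, 0.35, 0.20]   # close to seg_a -> same segment
seg_c = [0.10, 0.10, 0.80]   # far from seg_a -> different segment
```

A lookup would compare the fresh vector against each stored segment's vector and reuse the matching segment's identification code.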
(3) The bandwidth demand is estimated from the PMU data of the current segment, and bandwidth scheduling is performed. Each program segment has a corresponding program-segment information table (stored in the program-segment execution-information database) that records the segment's current and historical execution information, i.e. PMU data; the estimate is computed from the average of the execution information.
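A minimal sketch of such an estimate, assuming the PMU records last-level-cache misses per timeslice and a 64-byte cache line (the counter name, line size, and miss-to-traffic conversion are illustrative assumptions, not specified by the patent):

```python
def estimate_bandwidth_demand(history, timeslice_s=0.05, line_bytes=64):
    """Estimate a segment's bandwidth demand (bytes/s) as the average
    main-memory traffic over its recorded executions."""
    avg_misses = sum(rec["llc_misses"] for rec in history) / len(history)
    return avg_misses * line_bytes / timeslice_s

# Two historical executions of the same segment (illustrative counts).
history = [{"llc_misses": 1_000}, {"llc_misses": 3_000}]
demand = estimate_bandwidth_demand(history)  # 2_000 * 64 / 0.05 bytes/s
```

Each executed timeslice appends a fresh record to `history`, so the estimate self-corrects as execution information accumulates.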
(4) Programs are selected for concurrent execution, while the bandwidth demand of the next timeslice is estimated from the PMU data of the next segment. To keep performance stable during system operation, the total bandwidth demand of the next timeslice is not lower than the average bandwidth demand of the program group. Specifically, the bandwidth demand is computed from the frequency at which the program segment occurs and the efficiency of its cache and main memory accesses.
(5) The current segment is executed. After execution ends, the corresponding execution information is updated in the corresponding information table, matched to the corresponding identification code, and is loaded again in future executions of the segment, improving the accuracy of the next bandwidth-demand estimate.
(6) Bandwidth scheduling is performed for the next segment, and steps (4) and (5) are repeated until the bandwidth scheduling and execution optimization of the current program group are complete.
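Putting steps (4)-(6) together, a toy driver for one program group might look like the following (the greedy fit, the fallback when nothing fits, and all data shapes are illustrative assumptions; the patent does not fix a selection algorithm):

```python
def run_group(segments, bus_bandwidth):
    """Drive the loop of steps (4)-(6): each timeslice, greedily
    co-schedule the programs whose next segments fit on the bus,
    execute them, and advance until all segments are done.

    segments: {pid: [predicted bandwidth demand of each segment, in order]}
    Returns the schedule as a list of per-timeslice program selections.
    """
    pending = {pid: list(segs) for pid, segs in segments.items() if segs}
    schedule = []
    while pending:
        # Step (4): look ahead at each program's next segment.
        demands = {pid: segs[0] for pid, segs in pending.items()}
        chosen, total = [], 0.0
        for pid in sorted(demands, key=demands.get, reverse=True):
            if total + demands[pid] <= bus_bandwidth:
                chosen.append(pid)
                total += demands[pid]
        if not chosen:                    # nothing fits: run the smallest alone
            chosen = [min(demands, key=demands.get)]
        schedule.append(sorted(chosen))
        for pid in chosen:                # step (5): execute and advance
            pending[pid].pop(0)
            if not pending[pid]:
                del pending[pid]
    return schedule
```

For example, `run_group({"a": [6.0, 1.0], "b": [4.0], "c": [5.0]}, 10.0)` co-runs a and b in the first timeslice (6 + 4 = 10) and a and c in the second.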
To evaluate the scheduling effect of the above method, this embodiment randomly generates 20 program groups of concurrency 5 in the system and records the execution performance of each group over 50 consecutive runs under the above bandwidth scheduling method. The results show that the maximum speedup of the program groups improves by up to 5.5% while main memory traffic declines slightly, effectively improving the operating efficiency of the system.
The above embodiment merely illustrates the technical solution of the invention and is not intended to limit it; any modification or equivalent replacement that does not depart from the spirit and scope of the invention shall fall within the protection scope of the technical solution of the invention.
Claims (6)
1. A bandwidth scheduling method for program groups with optimal concurrency performance, characterized by comprising the steps of:
(1) binding a performance monitoring unit (PMU) to each program's execution, dividing the execution into program segments according to the scheduling timeslice length, and storing the PMU data of each segment in a program-segment execution-information database, in one-to-one correspondence with a specific identification code;
(2) performing segment identification for each program in the current program group and reading the corresponding PMU data;
(3) estimating the bandwidth demand from the PMU data of the current segment and performing bandwidth scheduling;
(4) selecting programs for concurrent execution, while estimating the bandwidth demand of the next timeslice from the PMU data of the next segment;
(5) executing the current segment, then updating the corresponding PMU data in the program-segment execution-information database, matched to the corresponding identification code;
(6) performing bandwidth scheduling for the next segment, and repeating steps (4) and (5) until the bandwidth scheduling and execution optimization of the current program group are complete.
2. The method according to claim 1, characterized in that, in step (4), the total bandwidth demand of the next timeslice is not lower than the average bandwidth demand of the program group.
3. The method according to claim 1, characterized in that the scheduling timeslice length is 5-200 ms.
4. The method according to claim 1, characterized in that the bandwidth demand is estimated from the program's solo-execution performance.
5. The method according to claim 4, characterized in that the solo-execution performance is computed from the frequency at which the program segment occurs and the efficiency of its cache and main memory accesses.
6. The method according to claim 1, characterized in that each program segment has a corresponding program-segment information table for recording the segment's current and historical execution information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810858682.8A | 2018-07-31 | 2018-07-31 | A bandwidth scheduling method for program groups with optimal concurrency performance (CN109144944A) |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109144944A true CN109144944A (en) | 2019-01-04 |
Family
ID=64799018
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05241860A (en) * | 1992-02-26 | 1993-09-21 | Kobe Nippon Denki Software Kk | Time slice optimization system |
US6438747B1 (en) * | 1999-08-20 | 2002-08-20 | Hewlett-Packard Company | Programmatic iteration scheduling for parallel processors |
US20060053416A1 (en) * | 2004-09-09 | 2006-03-09 | Fujitsu Limited | Program section layout method and layout program |
CN101403978A (en) * | 2007-10-01 | 2009-04-08 | 埃森哲环球服务有限公司 | Infrastructure for parallel programming of clusters of machines |
CN106598599A (en) * | 2016-12-15 | 2017-04-26 | 王弘远 | Program execution method and device |
CN107196807A (en) * | 2017-06-20 | 2017-09-22 | 清华大学深圳研究生院 | Network intermediary device and its dispositions method |
Non-Patent Citations (1)
Title |
---|
Xu Di, et al.: "A cross-execution optimization method supporting memory-access-bandwidth-sensitive scheduling", Chinese Journal of Computers (《计算机学报》) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10261806B2 (en) | Adaptive hardware configuration for data analytics | |
US8402223B2 (en) | Cache eviction using memory entry value | |
US7401329B2 (en) | Compiling computer programs to exploit parallelism without exceeding available processing resources | |
US9436512B2 (en) | Energy efficient job scheduling in heterogeneous chip multiprocessors based on dynamic program behavior using prim model | |
CN111294234B (en) | Parallel block chain fragmentation method based on intelligent contract optimization model | |
CN114895773A (en) | Energy consumption optimization method, system and device of heterogeneous multi-core processor and storage medium | |
Kim et al. | Minimizing GPU kernel launch overhead in deep learning inference on mobile GPUs | |
CN110414569B (en) | Clustering implementation method and device | |
CN107704266B (en) | Reduction method applied to solving particle simulation parallel data competition | |
CN115150471A (en) | Data processing method, device, equipment, storage medium and program product | |
CN116112563A (en) | Dual-strategy self-adaptive cache replacement method based on popularity prediction | |
CN105094949A (en) | Method and system for simulation based on instruction calculation model and feedback compensation | |
Liang et al. | Prediction method of energy consumption based on multiple energy-related features in data center | |
US20190197653A1 (en) | Processing unit performance projection using dynamic hardware behaviors | |
CN109144944A (en) | A kind of program groups bandwidth scheduling method that concurrency performance is optimal | |
CN107145453B (en) | A kind of prediction technique, device, readable medium and the equipment of cache invalidation rate | |
Li et al. | GbA: A graph‐based thread partition approach in speculative multithreading | |
US20190197652A1 (en) | On-the-fly scheduling of execution of dynamic hardware behaviors | |
CN115185804A (en) | Server performance prediction method, system, terminal and storage medium | |
Strobel et al. | Combined mpsoc task mapping and memory optimization for low-power | |
CN107577517B (en) | NUMA memory architecture-oriented fine-grained vCPU scheduling method and system | |
CN112052087A (en) | Deep learning training system and method for dynamic resource adjustment and migration | |
CN106203083B (en) | The prediction partition method of device driver and its intelligent hardened system | |
US11836531B2 (en) | Method, device, and program product for managing computing system | |
CN113204478B (en) | Method, device and equipment for operating test unit and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190104 |