CN109144944A - Program group bandwidth scheduling method for optimal concurrency performance - Google Patents

Program group bandwidth scheduling method for optimal concurrency performance

Info

Publication number
CN109144944A
CN109144944A (application number CN201810858682.8A)
Authority
CN
China
Prior art keywords
program
bandwidth
program segment
scheduling
current program
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810858682.8A
Other languages
Chinese (zh)
Inventor
张彩霞
王向东
王新东
肖人苗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan University
Original Assignee
Foshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan University filed Critical Foshan University
Priority to CN201810858682.8A priority Critical patent/CN109144944A/en
Publication of CN109144944A publication Critical patent/CN109144944A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17306Intercommunication techniques
    • G06F15/17318Parallel communications techniques, e.g. gather, scatter, reduce, broadcast, multicast, all to all

Abstract

The invention discloses a program group bandwidth scheduling method for optimal concurrency performance, comprising the steps of: binding a performance counter (PMU) to each program's execution process, dividing the execution process into several program segments according to the scheduling time slice length, and storing the PMU data of each program segment in a program segment execution information database in one-to-one correspondence with a specific identification code; performing program segment identification on each program in the current program group and reading the corresponding PMU data; estimating the bandwidth demand from the PMU data of the current program segment and performing bandwidth scheduling; selecting programs for concurrent execution while estimating the bandwidth demand of the next time slice from the PMU data of the next program segment; executing the current program segment and storing its execution information; and repeating these steps until the bandwidth scheduling and execution optimization of the current program group are complete. The method requires no user intervention, achieves continuous updating and self-optimization, estimates bandwidth demand accurately and efficiently, and yields good scheduling results.

Description

Program group bandwidth scheduling method for optimal concurrency performance
Technical field
The invention belongs to the technical field of computer systems, and more particularly relates to a program group bandwidth scheduling method for optimal concurrency performance.
Background technique
Constrained by power consumption, hardware and other factors, the once-straightforward approach of improving processor performance by raising chip frequency has become increasingly difficult, so multi-core and many-core architectures have become the development trend of computing systems. Such systems, however, integrate multiple processing cores on a single processor, and the cores share one memory access bus; system performance is therefore constrained by memory access bandwidth, and memory access performance becomes the main bottleneck limiting further improvement of system performance.
For this reason, improving memory access efficiency and reducing the system's dependence on memory access bandwidth can reduce memory access latency and further improve the operating efficiency of the system.
Summary of the invention
In view of the above deficiencies of the prior art, the present invention provides a program group bandwidth scheduling method for optimal concurrency performance, comprising the steps of:
(1) binding a performance counter (PMU) to each program's execution process, dividing the execution process into several program segments according to the scheduling time slice length, and storing the PMU data of each program segment in a program segment execution information database in one-to-one correspondence with a specific identification code;
(2) performing program segment identification on each program in the current program group and reading the corresponding PMU data;
(3) estimating the bandwidth demand from the PMU data of the current program segment and performing bandwidth scheduling;
(4) selecting programs for concurrent execution, while estimating the bandwidth demand of the next time slice from the PMU data of the next program segment;
(5) executing the current program segment, then updating the corresponding PMU data in the program segment execution information database and matching it with the corresponding identification code;
(6) performing bandwidth scheduling for the next program segment, and repeating steps (4) and (5) until the bandwidth scheduling and execution optimization of the current program group are complete.
Preferably, in step (4), the total bandwidth demand of the next time slice is not lower than the average bandwidth demand of the program group.
Preferably, the scheduling time slice length is 5-200 ms.
Preferably, the bandwidth demand is estimated from the program's exclusive (solo) execution performance; further, the exclusive execution performance is calculated from the frequency and efficiency of the Cache and main memory accesses occurring in the program segment.
Preferably, each program segment is provided with a corresponding program segment information table for recording the current and historical execution information of the program segment.
Beneficial effects of the present invention:
The bandwidth scheduling method of the invention performs bandwidth scheduling according to historical execution information. After a program segment finishes executing, its current execution information is automatically updated in the program segment execution information database and is loaded again in future executions of the program, so that continuous updating and self-optimization are achieved. The execution data are obtained by the PMU, so no user intervention is required, and accurate and efficient bandwidth demand estimation and scheduling are realized.
Detailed description of the invention
Fig. 1 is a flow diagram of the bandwidth scheduling method of the invention.
Specific embodiment
The present invention is described in detail below with reference to the accompanying drawing and a specific embodiment.
Referring to Fig. 1, the present invention provides a program group bandwidth scheduling method for optimal concurrency performance, comprising the steps of:
(1) A performance counter (PMU) is bound to each program's execution process, the execution process is divided into several program segments using a scheduling time slice length of 50 ms, and the PMU data of each program segment is stored in the program segment execution information database in one-to-one correspondence with a specific identification code.
(2) Program segment identification is performed on each program in the current program group, and the corresponding PMU data is read according to the identification code. To improve the validity of program segment identification, the identification can be realized with basic block vectors, i.e. by judging whether the Manhattan distance between the basic block vectors formed during program segment execution is less than the similarity threshold of the basic block vectors, as illustrated by the sketch below.
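The basic-block-vector comparison described above can be sketched as follows. This is a minimal illustration rather than code from the patent; the normalization, the function names and the value of SIMILARITY_THRESHOLD are assumptions introduced here for clarity.

```python
# Minimal sketch of basic-block-vector (BBV) based program segment identification.
# The BBV counts how often each basic block executes during a segment; two segments
# are treated as the same phase when the Manhattan (L1) distance between their
# normalized BBVs falls below a similarity threshold. Names and the threshold value
# are illustrative assumptions, not values prescribed by the patent.

SIMILARITY_THRESHOLD = 0.2  # assumed threshold on the normalized L1 distance

def normalize(bbv):
    """Scale a basic block vector so its entries sum to 1."""
    total = sum(bbv) or 1
    return [x / total for x in bbv]

def manhattan_distance(bbv_a, bbv_b):
    """L1 distance between two equal-length basic block vectors."""
    return sum(abs(a - b) for a, b in zip(bbv_a, bbv_b))

def identify_segment(current_bbv, known_segments):
    """Return the identification code of the closest known segment, or None.

    known_segments maps identification codes to previously recorded BBVs
    (as stored in the program segment execution information database).
    """
    current = normalize(current_bbv)
    best_code, best_dist = None, float("inf")
    for code, recorded_bbv in known_segments.items():
        dist = manhattan_distance(current, normalize(recorded_bbv))
        if dist < best_dist:
            best_code, best_dist = code, dist
    return best_code if best_dist < SIMILARITY_THRESHOLD else None
```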
(3) The bandwidth demand is estimated from the PMU data of the current program segment, and bandwidth scheduling is performed. Each program segment is provided with a corresponding program segment information table (stored in the program segment execution information database) for recording the current and historical execution information of the program segment, i.e. its PMU data; the estimate is taken as the average of this execution information.
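One possible in-memory layout of such a program segment information table is sketched below. The record fields, class name and averaging scheme are assumptions chosen to mirror the description, not structures defined by the patent.

```python
# Sketch of a program segment execution information database keyed by identification
# code. Each entry keeps the history of PMU samples for one segment; the bandwidth
# estimate used for scheduling is the mean over the recorded history, as in step (3).
# Field names are illustrative assumptions.

from collections import defaultdict
from statistics import mean

class SegmentInfoTable:
    def __init__(self):
        # identification code -> list of PMU samples, each a dict of counter values
        self.history = defaultdict(list)

    def record(self, code, pmu_sample):
        """Append the PMU data of a finished segment execution (step (5))."""
        self.history[code].append(pmu_sample)

    def estimate(self, code, counter):
        """Average of one PMU counter over the segment's recorded executions."""
        samples = self.history.get(code)
        if not samples:
            return None  # no history yet; the scheduler must fall back to a default
        return mean(s[counter] for s in samples)

# Example: record a sample and query the averaged main-memory access count.
table = SegmentInfoTable()
table.record("prog1_seg3", {"cache_accesses": 120_000, "mem_accesses": 8_000})
print(table.estimate("prog1_seg3", "mem_accesses"))
```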
(4) Programs are selected for concurrent execution, while the bandwidth demand of the next time slice is estimated from the PMU data of the next program segment. To keep performance stable during system operation, the total bandwidth demand of the next time slice is not lower than the average bandwidth demand of the program group. Specifically, the bandwidth demand is calculated from the frequency and efficiency of the Cache and main memory accesses occurring in the program segment.
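One way to read this calculation is sketched below: the per-segment demand is derived from the main-memory access count recorded by the PMU over the 50 ms time slice, and the scheduled total is clamped to at least the group's average demand. The cache-line size, the slice length and the formula itself are illustrative assumptions, not values fixed by the patent.

```python
# Sketch of bandwidth demand estimation for the next time slice (step (4)).
# Each main memory access is assumed to transfer one cache line; dividing the
# transferred bytes by the slice length gives an approximate bandwidth demand.
# CACHE_LINE_BYTES, TIME_SLICE_S and the clamping rule are illustrative assumptions.

CACHE_LINE_BYTES = 64
TIME_SLICE_S = 0.050  # 50 ms scheduling time slice, as in the embodiment

def segment_bandwidth_demand(mem_accesses):
    """Approximate bytes/s demanded by one segment during one time slice."""
    return mem_accesses * CACHE_LINE_BYTES / TIME_SLICE_S

def next_slice_total_demand(next_segment_mem_accesses, group_average_demand):
    """Total demand for the next slice, never below the group average (claim 2)."""
    total = sum(segment_bandwidth_demand(m) for m in next_segment_mem_accesses)
    return max(total, group_average_demand)

# Example: three co-scheduled segments with predicted main-memory access counts.
demand = next_slice_total_demand([8_000, 12_500, 3_000], group_average_demand=2.5e9)
print(f"scheduled bandwidth for next slice: {demand / 1e9:.2f} GB/s")
```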
(5) The current program segment is executed. After the program segment finishes executing, its execution information is updated in the corresponding information table and matched with the corresponding identification code, so that it can be loaded again when the program segment is executed in the future, improving the accuracy of the next bandwidth demand estimate.
(6) Bandwidth scheduling is performed for the next program segment, and steps (4) and (5) are repeated until the bandwidth scheduling and execution optimization of the current program group are complete.
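Putting steps (2)-(6) together, the outer control loop might look like the following self-contained sketch over a toy program group. Real PMU readout, segment identification and bandwidth allocation are replaced by trivial stand-ins; all names and numbers are illustrative assumptions rather than parts of the disclosed method.

```python
# Sketch of the outer scheduling loop (steps (2)-(6)) over a toy program group.
# Programs are modelled as lists of (identification code, predicted main-memory
# accesses) pairs; each loop iteration is one scheduling time slice.

TIME_SLICE_S = 0.050          # 50 ms slice, as in the embodiment
CACHE_LINE_BYTES = 64

def demand(mem_accesses):
    return mem_accesses * CACHE_LINE_BYTES / TIME_SLICE_S

def schedule_program_group(programs):
    history = {}                                   # execution information database
    group_avg = 0.0
    step = 0
    while any(programs.values()):                  # some program still has segments left
        # Steps (2)-(3): identify the current segment of each program and estimate its
        # demand from history if available, otherwise from the predicted access count.
        current = {p: segs[0] for p, segs in programs.items() if segs}
        demands = {p: history.get(code, demand(acc)) for p, (code, acc) in current.items()}

        # Step (4): total demand for the next slice is floored at the group average.
        total = max(sum(demands.values()), group_avg)
        group_avg = (group_avg * step + total) / (step + 1)
        step += 1

        # Steps (5)-(6): "execute" the slice, write the observed demand back under the
        # segment's identification code, and advance every program to its next segment.
        for p, (code, acc) in current.items():
            history[code] = demand(acc)            # updated execution information
            programs[p].pop(0)
        print(f"slice {step}: scheduled {total / 1e6:.1f} MB/s for {len(current)} programs")

# Toy group of two programs, each split into two segments.
schedule_program_group({
    "prog_a": [("a_seg0", 8_000), ("a_seg1", 12_000)],
    "prog_b": [("b_seg0", 3_000), ("b_seg1", 9_000)],
})
```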
To evaluate the scheduling effect of the above method, this embodiment randomly generates 20 program groups with a concurrency of 5 in the system and records the execution performance of each program group over 50 consecutive runs under the above bandwidth scheduling method. The results show that the maximum speedup of the program groups improves by up to 5.5% and the main memory access volume declines slightly, effectively improving the operating efficiency of the system.
The above embodiment merely illustrates the technical solution of the present invention and is not intended to limit it; any modification or equivalent replacement that does not depart from the spirit and scope of the present invention shall fall within the protection scope of the technical solution of the present invention.

Claims (6)

1. A program group bandwidth scheduling method for optimal concurrency performance, characterized by comprising the steps of:
(1) binding a performance counter (PMU) to each program's execution process, dividing the execution process into several program segments according to the scheduling time slice length, and storing the PMU data of each program segment in a program segment execution information database in one-to-one correspondence with a specific identification code;
(2) performing program segment identification on each program in the current program group and reading the corresponding PMU data;
(3) estimating the bandwidth demand from the PMU data of the current program segment and performing bandwidth scheduling;
(4) selecting programs for concurrent execution, while estimating the bandwidth demand of the next time slice from the PMU data of the next program segment;
(5) executing the current program segment, then updating the corresponding PMU data in the program segment execution information database and matching it with the corresponding identification code;
(6) performing bandwidth scheduling for the next program segment, and repeating steps (4) and (5) until the bandwidth scheduling and execution optimization of the current program group are complete.
2. The method according to claim 1, characterized in that, in step (4), the total bandwidth demand of the next time slice is not lower than the average bandwidth demand of the program group.
3. The method according to claim 1, characterized in that the scheduling time slice length is 5-200 ms.
4. The method according to claim 1, characterized in that the bandwidth demand is estimated from the program's exclusive execution performance.
5. The method according to claim 4, characterized in that the program's exclusive execution performance is calculated from the frequency and efficiency of the Cache and main memory accesses occurring in the program segment.
6. The method according to claim 1, characterized in that each program segment is provided with a corresponding program segment information table for recording the current and historical execution information of the program segment.
CN201810858682.8A 2018-07-31 2018-07-31 Program group bandwidth scheduling method for optimal concurrency performance Pending CN109144944A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810858682.8A CN109144944A (en) 2018-07-31 2018-07-31 Program group bandwidth scheduling method for optimal concurrency performance


Publications (1)

Publication Number Publication Date
CN109144944A true CN109144944A (en) 2019-01-04

Family

ID=64799018

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810858682.8A Pending CN109144944A (en) 2018-07-31 2018-07-31 A kind of program groups bandwidth scheduling method that concurrency performance is optimal

Country Status (1)

Country Link
CN (1) CN109144944A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05241860A (en) * 1992-02-26 1993-09-21 Kobe Nippon Denki Software Kk Time slice optimization system
US6438747B1 (en) * 1999-08-20 2002-08-20 Hewlett-Packard Company Programmatic iteration scheduling for parallel processors
US20060053416A1 (en) * 2004-09-09 2006-03-09 Fujitsu Limited Program section layout method and layout program
CN101403978A (en) * 2007-10-01 2009-04-08 埃森哲环球服务有限公司 Infrastructure for parallel programming of clusters of machines
CN106598599A (en) * 2016-12-15 2017-04-26 王弘远 Program execution method and device
CN107196807A (en) * 2017-06-20 2017-09-22 清华大学深圳研究生院 Network intermediary device and its dispositions method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐地, et al.: "A cross-execution optimization method supporting memory-access-bandwidth-sensitive scheduling", 《计算机学报》 (Chinese Journal of Computers) *

Similar Documents

Publication Publication Date Title
US10261806B2 (en) Adaptive hardware configuration for data analytics
US8402223B2 (en) Cache eviction using memory entry value
US7401329B2 (en) Compiling computer programs to exploit parallelism without exceeding available processing resources
US9436512B2 (en) Energy efficient job scheduling in heterogeneous chip multiprocessors based on dynamic program behavior using prim model
CN111294234B (en) Parallel block chain fragmentation method based on intelligent contract optimization model
CN114895773A (en) Energy consumption optimization method, system and device of heterogeneous multi-core processor and storage medium
Kim et al. Minimizing GPU kernel launch overhead in deep learning inference on mobile GPUs
CN110414569B (en) Clustering implementation method and device
CN107704266B (en) Reduction method applied to solving particle simulation parallel data competition
CN115150471A (en) Data processing method, device, equipment, storage medium and program product
CN116112563A (en) Dual-strategy self-adaptive cache replacement method based on popularity prediction
CN105094949A (en) Method and system for simulation based on instruction calculation model and feedback compensation
Liang et al. Prediction method of energy consumption based on multiple energy-related features in data center
US20190197653A1 (en) Processing unit performance projection using dynamic hardware behaviors
CN109144944A (en) Program group bandwidth scheduling method for optimal concurrency performance
CN107145453B (en) A kind of prediction technique, device, readable medium and the equipment of cache invalidation rate
Li et al. GbA: A graph‐based thread partition approach in speculative multithreading
US20190197652A1 (en) On-the-fly scheduling of execution of dynamic hardware behaviors
CN115185804A (en) Server performance prediction method, system, terminal and storage medium
Strobel et al. Combined mpsoc task mapping and memory optimization for low-power
CN107577517B (en) NUMA memory architecture-oriented fine-grained vCPU scheduling method and system
CN112052087A (en) Deep learning training system and method for dynamic resource adjustment and migration
CN106203083B (en) The prediction partition method of device driver and its intelligent hardened system
US11836531B2 (en) Method, device, and program product for managing computing system
CN113204478B (en) Method, device and equipment for operating test unit and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190104)