CN105183651B - Viewpoint lifting method for program automatic performance prediction - Google Patents


Info

Publication number
CN105183651B
Authority
CN
China
Prior art keywords
viewpoint
basic block
frequency
loop
parent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510579026.0A
Other languages
Chinese (zh)
Other versions
CN105183651A (en)
Inventor
张伟哲
谢虎成
何慧
韩硕
郝萌
王学惠
鲁刚钊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN201510579026.0A
Publication of CN105183651A
Application granted
Publication of CN105183651B


Landscapes

  • Devices For Executing Special Programs (AREA)

Abstract

A viewpoint lifting method for program automatic performance prediction, belonging to the field of program performance prediction. Existing automatic performance prediction methods have difficulty maximizing predictability while preserving prediction accuracy. The method comprises: Step 1, define a viewpoint V for the execution count of basic block N, so that the basic block frequency is represented by the two-tuple $(E_V, B_{V,N})$; Step 2, perform the lifting operation on the viewpoint V, whose actual run count is $E_V$ in the two-tuple $(E_V, B_{V,N})$; Step 3, determine the frequency $B_{V,N}$ predicted for basic block N at viewpoint V; Step 4, define the frequency of basic block N as $B_N = E_V \times B_{V,N}$; Step 5, obtain the total basic block frequency of viewpoint V over one run [formula omitted in source]. The invention determines a suitable insertion position while preserving accuracy, and improves predictability by incorporating static branch probabilities.

Description

Viewpoint lifting method for program automatic performance prediction
Technical Field
The invention relates to a viewpoint lifting method for automatic performance prediction of a program.
Background
Common performance evaluation methods for programs include dynamic analysis and static analysis. Dynamic analysis actually runs the program at a small input scale and extrapolates to large-scale behavior according to its parallelism; static analysis uses the compiler to analyze the code and extract program characteristics. Combining the two is desirable: dynamic analysis contributes accuracy, while static analysis contributes predictability.
In this research, loop trip counts are derived by static analysis and instrumentation is inserted into the source program; the pruned program is then run to obtain the trip counts, and the predicted time is finally obtained by combining them with the program characteristics. LLVM itself provides EdgeProfiling, which inserts "edge" information at the loop. However, that is a purely dynamic analysis method, which defeats the purpose of prediction, and the predictability it yields is low.
The method must find a suitable insertion position while preserving accuracy. That insertion position is called the viewpoint: it is where the trip count obtained through static analysis is inserted. Raising or lowering the viewpoint freely adjusts the balance between dynamic and static analysis. If the viewpoint is too high, predictability increases but accuracy falls, approaching purely static analysis. If the viewpoint is too low, predictability is lost, approaching the EdgeProfiling provided by LLVM. A method is therefore needed that raises the viewpoint as far as possible, improving predictability while preserving accuracy.
Disclosure of Invention
The invention aims to solve the problem that existing automatic program performance prediction methods have difficulty maximizing predictability while preserving prediction accuracy, and provides a viewpoint lifting method for program automatic performance prediction.
A viewpoint lifting method for program automatic performance prediction, implemented by the following steps:
Step 1: define a viewpoint V for the execution count of basic block N; the basic block frequency is then represented by the two-tuple $(E_V, B_{V,N})$, where $E_V$ is the actual number of times viewpoint V runs and $B_{V,N}$ is the frequency predicted for basic block N at viewpoint V. The viewpoint V is the insertion point of the instructions that compute the execution count of a loop basic block.
Step 2: perform the lifting operation on the viewpoint V, whose actual run count is $E_V$ in the two-tuple $(E_V, B_{V,N})$. The lifted viewpoint V must satisfy a dominance condition [formula omitted in source], where δ denotes the dominance relation; that is, the values of the three directly depended-upon variables %start, %end and %stride can be obtained at viewpoint V.
Step 3: determine $B_{V,N}$, the frequency predicted for basic block N at viewpoint V, in the two-tuple $(E_V, B_{V,N})$.
Step 4: define the frequency $B_N$ of basic block N as the product of the actual run count $E_V$ of viewpoint V and the frequency $B_{V,N}$ predicted for basic block N at viewpoint V, i.e. $B_N = E_V \times B_{V,N}$.
Step 5: each time viewpoint V actually runs, the actual run count $E_V$ is obtained and, correspondingly, the frequency $B_{V,N}$ predicted for basic block N at viewpoint V is inserted at viewpoint V, so that the total basic block frequency of viewpoint V over one run is [formula omitted in source].
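The two-tuple of steps 1 and 4 can be illustrated with a minimal Python sketch. The class name `ViewpointFrequency` and its field names are hypothetical and do not appear in the patent; the sketch only shows how a measured count and a predicted per-run frequency combine into $B_N$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewpointFrequency:
    """Two-tuple (E_V, B_{V,N}) for a basic block N seen from viewpoint V.

    The class and field names are illustrative assumptions.
    """
    runs: int         # E_V: measured number of times viewpoint V executed
    predicted: float  # B_{V,N}: predicted executions of N per execution of V

    def block_frequency(self) -> float:
        # Step 4: B_N = E_V * B_{V,N}
        return self.runs * self.predicted


freq = ViewpointFrequency(runs=10, predicted=25.0)
print(freq.block_frequency())  # 250.0
```

Keeping the measured part and the predicted part separate is what lets the balance between dynamic and static information shift when the viewpoint moves.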
The invention has the beneficial effects that:
Inserting the basic block count into the Preheader improves accuracy, and lifting the viewpoint V to a position where the values of the three directly depended-upon variables %start, %end and %stride can be obtained overcomes the loss of predictability that improving accuracy would otherwise cause. The improved predictability also facilitates later code pruning, which is otherwise hindered because reads and writes of the global counter array are scattered across the basic blocks. The invention thus determines a suitable insertion position while preserving accuracy, and improves predictability by incorporating static branch probabilities. The prediction accuracy reaches 92-95%.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 shows the different ratios of dynamic to static analysis associated with different viewpoints in Example 1 of the present invention;
FIG. 3 is a schematic diagram of loop nesting with different viewpoints in Example 1 of the present invention.
Detailed Description
The first embodiment is as follows:
The viewpoint lifting method for program automatic performance prediction according to this embodiment is implemented by the following steps, as shown in FIG. 1:
Step 1: define a viewpoint V for the execution count of basic block N; the basic block frequency is then represented by the two-tuple $(E_V, B_{V,N})$, where $E_V$ is the actual number of times viewpoint V runs and $B_{V,N}$ is the frequency predicted for basic block N at viewpoint V. The viewpoint V is the insertion point of the instructions that compute the execution count of a loop basic block.
Step 2: perform the lifting operation on the viewpoint V, whose actual run count is $E_V$ in the two-tuple $(E_V, B_{V,N})$. The lifted viewpoint V must satisfy a dominance condition [formula omitted in source], where δ denotes the dominance relation; that is, the values of the three directly depended-upon variables %start, %end and %stride can be obtained at viewpoint V.
Step 3: determine $B_{V,N}$, the frequency predicted for basic block N at viewpoint V, in the two-tuple $(E_V, B_{V,N})$.
Step 4: define the frequency $B_N$ of basic block N as the product of the actual run count $E_V$ of viewpoint V and the frequency $B_{V,N}$ predicted for basic block N at viewpoint V, i.e. $B_N = E_V \times B_{V,N}$.
Step 5: as the program runs, each time viewpoint V actually runs, the actual run count $E_V$ is obtained and, correspondingly, the frequency $B_{V,N}$ predicted for basic block N at viewpoint V is inserted at viewpoint V, so that the total basic block frequency of viewpoint V over one run is [formula omitted in source]. The symbol E is chosen here to denote the dynamic basic block frequency of viewpoint V because its value equals the result of EdgeProfiling.
Compared with LLVM's EdgeProfiling, inserting the basic block count into the Preheader improves accuracy but reduces predictability. The method of the invention, in contrast, increases predictability, which also facilitates later code pruning; pruning is hindered when reads and writes of the global counter array are scattered across the basic blocks. Lifting the viewpoint is therefore necessary.
The second embodiment is as follows:
This embodiment differs from the first embodiment in that the basic block frequency $(E_V, B_{V,N})$ represented by the two-tuple in step 1 takes the following values:
(1) when the basic block itself is selected as the viewpoint, the basic block frequency is represented by the two-tuple $(E_N, 1)$;
(2) for a basic block inside a loop, when the Preheader of the loop is selected as the viewpoint, the basic block frequency is expressed as [formula omitted in source], where %tc is the trip count obtained by static analysis and [symbol omitted in source] is the static branch prediction probability provided by the LLVM compiler framework;
(3) for a non-loop basic block, the basic block frequency of the LLVM compiler framework is followed, and the viewpoint of the non-loop basic block remains the function entry basic block e, i.e. $(E_e, B_{e,N})$.
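The three cases above can be sketched as a small Python helper. The function name `frequency_pair` and its parameters are hypothetical. The source formula for case (2) is not preserved, so the product of the static trip count and the static branch probability used below is an assumption, chosen to agree with the form $(E_1, \%tc \cdot (1 \to 3))$ that appears in Example 1.

```python
def frequency_pair(kind, runs, trip_count=None, branch_prob=1.0, static_freq=None):
    """Return the two-tuple (E_V, B_{V,N}) for one of three viewpoint choices.

    kind: "self"      -> the block is its own viewpoint
          "preheader" -> the loop Preheader is the viewpoint
          "entry"     -> a non-loop block viewed from the function entry block
    All names here are illustrative assumptions, not part of the patent.
    """
    if kind == "self":
        return (runs, 1)                        # case (1): (E_N, 1)
    if kind == "preheader":
        # Case (2), ASSUMED form: static trip count times static branch probability.
        return (runs, trip_count * branch_prob)
    if kind == "entry":
        return (runs, static_freq)              # case (3): (E_e, B_{e,N})
    raise ValueError(f"unknown viewpoint kind: {kind!r}")


print(frequency_pair("preheader", runs=3, trip_count=100, branch_prob=0.5))  # (3, 50.0)
```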
The third embodiment is as follows:
This embodiment differs from the first or second embodiment in that the lifting operation performed in step 2 on the viewpoint V, whose actual run count is $E_V$ in the two-tuple $(E_V, B_{V,N})$, proceeds as follows:
Step 2.1: select the instructions on which the loop's basic block count directly depends; a directly dependent instruction is one of a memory-access instruction, an exchange instruction or an arithmetic instruction executed by the program;
Step 2.2: assign the operands that are used only once in the directly dependent instructions selected in step 2.1 to the target set Targets; the directly dependent instructions are the instructions inserted when LoopTripCount is generated, i.e. the instructions that need to be lifted;
Step 2.3: assign the viewpoint of the parent loop ParentLoop of each operand that is used more than once in the instructions selected in step 2.1 to the Depends set;
Step 2.4: if the Depends set is empty, return to the basic block in which the loop is located and proceed to step 2.5; otherwise, perform the viewpoint search operation to find the viewpoint V;
Step 2.5: traverse the instructions in the target set Targets and insert them before the terminator instruction (Terminator) of the initial viewpoint pos;
Step 2.6: return the initial viewpoint pos, which completes the lifting of the viewpoint V.
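Part of the lifting operation can be sketched in Python on a toy representation. The sketch covers only the split into Targets and Depends and the insertion before the terminator; it does not use LLVM's API. The `Instr` class, its fields, and both function names are hypothetical, and a basic block is modeled as a plain list of strings whose last element is the terminator.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instr:
    """Toy stand-in for an instruction the trip count directly depends on."""
    name: str
    uses: int                               # how many times the value is used
    parent_viewpoint: Optional[str] = None  # viewpoint of its parent loop, if any


def split_dependencies(direct_deps: List[Instr]) -> Tuple[List[Instr], List[str]]:
    """Single-use values go to Targets; for multiply-used values, the
    viewpoint of the parent loop goes to Depends."""
    targets: List[Instr] = []
    depends: List[str] = []
    for instr in direct_deps:
        if instr.uses == 1:
            targets.append(instr)
        elif instr.parent_viewpoint is not None and instr.parent_viewpoint not in depends:
            depends.append(instr.parent_viewpoint)
    return targets, depends


def insert_before_terminator(block: List[str], targets: List[Instr]) -> List[str]:
    """Insert the hoisted instructions before the block's terminator.
    Assumes the block is non-empty and its last element is the terminator."""
    *body, terminator = block
    return body + [t.name for t in targets] + [terminator]


deps = [Instr("%start", 1), Instr("%end", 3, "preheader_outer"), Instr("%stride", 1)]
targets, depends = split_dependencies(deps)
print(insert_before_terminator(["%x = add", "br label %loop"], targets))
```

In this toy input, `%end` is used three times, so its parent-loop viewpoint ends up in Depends and constrains where the final viewpoint can be placed.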
The fourth embodiment is as follows:
This embodiment differs from the third embodiment in that the viewpoint search operation of step 2.4 proceeds as follows:
traverse each basic block N in the Depends set; if the initial viewpoint pos dominates basic block N, reassign pos to basic block N; otherwise, pos keeps its original value.
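The search rule above can be sketched in Python. The sketch assumes dominance information is already available as an immediate-dominator map (`idom`, a hypothetical input that maps each block to its immediate dominator, with the entry block mapped to `None`); the patent does not specify how dominance is computed.

```python
def dominates(a, b, idom):
    """True if block a dominates block b, walking up the immediate-dominator tree.
    Every block dominates itself."""
    node = b
    while node is not None:
        if node == a:
            return True
        node = idom.get(node)
    return False


def find_viewpoint(pos, depends, idom):
    """For each block N in Depends: if pos dominates N, move pos to N;
    otherwise leave pos unchanged."""
    for n in depends:
        if dominates(pos, n, idom):
            pos = n
    return pos


# Toy dominator tree: entry -> a -> b, and entry -> c
idom = {"entry": None, "a": "entry", "b": "a", "c": "entry"}
print(find_viewpoint("entry", ["a", "b"], idom))  # b
print(find_viewpoint("a", ["c"], idom))           # a
```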
The fifth embodiment is as follows:
This embodiment differs from the first, second or fourth embodiment in that, in step 3, the frequency $B_{V,N}$ predicted for basic block N at viewpoint V in the two-tuple $(E_V, B_{V,N})$ is determined as follows:
When the selected viewpoint V is in the same loop layer as the parent loop, the frequency predicted for basic block N at viewpoint V is expressed as $(E_V, (V \to m)\,\%tc)$, where m denotes the loop layer in which the parent loop is located. For example, when the selected viewpoint V is in the same loop layer as the fourth-layer parent loop, the frequency predicted for basic block N at viewpoint V is $(E_V, (V \to 4)\,\%tc)$.
When the selected viewpoint is outside the n-th parent loop, the probability of the viewpoint reaching the Preheader is multiplied by the trip count and then by the probability of the next node reaching its predecessor basic block Preheader, and so on, nesting successively, which gives the frequency predicted for basic block N at viewpoint V: [formula omitted in source]
where $E_i$ denotes the head node of the i-th loop, $P_i$ its predecessor basic block (Preheader), $V_i$ its viewpoint, and $\%tc_i$ its trip count, for $i = 1, 2, \ldots, n-1, n$.
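Because the nested formula is not preserved in the source, the following Python sketch is only one reading of the verbal description: for each loop level, the probability of reaching that level's Preheader is multiplied by that level's trip count, and the levels are chained by multiplication. The function name `nested_frequency` and the list-based inputs are assumptions.

```python
def nested_frequency(reach_probs, trip_counts):
    """Chain (probability of reaching Preheader P_i) * (%tc_i) over nested loops.

    reach_probs[0] is the probability of reaching P_1 from the viewpoint V;
    reach_probs[i] (i > 0) is the probability of reaching P_{i+1} from the
    head node of the enclosing loop. trip_counts[i] is the static trip count
    of loop i+1. This layout is an ASSUMED reading of the description.
    """
    if len(reach_probs) != len(trip_counts):
        raise ValueError("one reach probability is needed per loop level")
    freq = 1.0
    for prob, tc in zip(reach_probs, trip_counts):
        freq *= prob * tc
    return freq


# Two nested loops: outer reached with probability 0.5 and run 10 times,
# inner always reached and run 20 times per outer iteration.
print(nested_frequency([0.5, 1.0], [10, 20]))  # 100.0
```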
The sixth embodiment is as follows:
This embodiment differs from the fifth embodiment in that, when the selected viewpoint V is the same as that of the parent loop, the expression for the parent loop is used to simplify the frequency $B_{V,N}$ predicted for basic block N at viewpoint V to: [formula omitted in source]
where $H_1$ denotes the parent loop, [symbol omitted in source] denotes the viewpoint shared with the parent loop, and [symbol omitted in source] denotes the parent loop frequency used to simplify the calculation.
Example 1:
Modeling of the computation time:
FIG. 2 depicts the basic block frequencies corresponding to different viewpoints; the left and right sides are the dynamic and static extremes, respectively. Loop basic block 4 is to be predicted; seen from the viewpoint of the loop's predecessor basic block 3, the number of iterations of loop basic block 4 is %tc.
If, while maintaining accuracy, the viewpoint is raised to basic block 1, the frequency predicted at block 3 simply needs to be multiplied by the path probability as seen from basic block 1. This gives the prediction:
$(E_3, \%tc) = (E_1, (1 \to 3), \%tc) = (E_1, \%tc \cdot (1 \to 3))$
The lifting is then complete. The next step is to decide how to select basic block 1, the goal being to raise the viewpoint as far as possible while ensuring accuracy. Since %tc depends on three quantities, %start, %end and %stride, the chosen viewpoint D must satisfy [formula omitted in source], where δ denotes the dominance relation; that is, the values of the three variables can be obtained at viewpoint D.
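The rewrite $(E_3, \%tc) = (E_1, \%tc \cdot (1 \to 3))$ can be checked with a small numeric sketch in Python. The function name `lift_viewpoint` and the sample numbers are hypothetical; the path probability $(1 \to 3)$ is passed in as a plain float.

```python
def lift_viewpoint(upper_runs, path_prob, trip_count):
    """Rewrite (E_lower, %tc) as (E_upper, %tc * path_prob), where path_prob
    is the probability of reaching the lower viewpoint from the upper one."""
    if not 0.0 <= path_prob <= 1.0:
        raise ValueError("path probability must lie in [0, 1]")
    return (upper_runs, trip_count * path_prob)


# Block 1 runs 4 times; block 3 is reached from it with probability 0.5;
# the loop has a static trip count of 100.
e1, predicted = lift_viewpoint(upper_runs=4, path_prob=0.5, trip_count=100)
print((e1, predicted))  # (4, 50.0)
print(e1 * predicted)   # 200.0
```

With these numbers block 3 is expected to run 2 times, so the unlifted estimate $2 \times 100$ agrees with the lifted one; the difference is that the lifted form replaces the measured count at block 3 with a static path probability.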
A lifting algorithm for finding the viewpoint is then designed [algorithm listing omitted in source].
Next, $B_{V,N}$ must be determined. The complexity introduced by loop nesting, as shown in FIG. 3, is discussed below.
The basic blocks of the parent loop of loop basic block 2 are {1, 4, 2, 3}, and the basic block frequency is $(E_c, \%tc)$. $E_i$, $P_i$, $V_i$ and $\%tc_i$ denote, respectively, the head node, the predecessor basic block (Preheader), the viewpoint and the trip-count IR of the i-th loop.
When the selected viewpoint $V_i$ is in the same loop layer as viewpoint 4, $(E_V, (V \to 4)\,\%tc)$ can be determined. But when the selected viewpoint D is outside 4, i.e. outside the parent loop, the frequency is expressed in full as: [formula omitted in source]
That is, if the selected viewpoint is outside the n-th parent loop, the probability of the viewpoint reaching the Preheader is multiplied by the trip count and then by the probability of the next node reaching the Preheader, and so on, nesting successively.
A special case allows some simplification: if a loop's viewpoint is the same as that of its parent loop, both Preheaders are the same, so the expression for the parent loop can be reused. For example, let the parent loop of viewpoint 2 be $H_1$, their common viewpoint be [symbol omitted in source], and the parent loop frequency be [symbol omitted in source]; the formula then simplifies to: [formula omitted in source]
The present invention is capable of other embodiments, and its details may be modified in various obvious respects, all without departing from the spirit and scope of the invention.

Claims (4)

1. A viewpoint lifting method for program automatic performance prediction, characterized in that the method is implemented by the following steps:
Step 1: define a viewpoint V for the execution count of basic block N; the basic block frequency is then represented by the two-tuple $(E_V, B_{V,N})$, where $E_V$ is the actual number of times viewpoint V runs and $B_{V,N}$ is the frequency predicted for basic block N at viewpoint V; the viewpoint V is the insertion point of the instructions that compute the execution count of a loop basic block;
Step 2: perform the lifting operation on the viewpoint V, whose actual run count is $E_V$ in the two-tuple $(E_V, B_{V,N})$; the lifted viewpoint V must satisfy a dominance condition [formula omitted in source], where δ denotes the dominance relation, that is, the values of the three directly depended-upon variables %start, %end and %stride can be obtained at viewpoint V;
the lifting operation comprises the following steps:
Step 2.1: select the instructions on which the loop's basic block count directly depends;
Step 2.2: assign the operands that are used only once in the directly dependent instructions selected in step 2.1 to the target set;
Step 2.3: assign the viewpoint of the parent loop of each operand that is used more than once in the instructions selected in step 2.1 to the Depends set;
Step 2.4: if the Depends set is empty, return to the basic block in which the loop is located and proceed to step 2.5; otherwise, perform the viewpoint search operation to find the viewpoint V;
Step 2.5: traverse the instructions in the target set and insert them before the terminator instruction of the initial viewpoint pos;
Step 2.6: return the initial viewpoint pos, which completes the lifting of the viewpoint V;
Step 3: determine $B_{V,N}$, the frequency predicted for basic block N at viewpoint V, in the two-tuple $(E_V, B_{V,N})$; specifically:
when the selected viewpoint V is in the same loop layer as the parent loop, the frequency predicted for basic block N at viewpoint V is expressed as $(E_V, (V \to m)\,\%tc)$, where m denotes the loop layer in which the parent loop is located;
when the selected viewpoint is outside the n-th parent loop, the probability of the viewpoint reaching the Preheader is multiplied by the trip count and then by the probability of the next node reaching its predecessor basic block Preheader, and so on, nesting successively, which gives the frequency predicted for basic block N at viewpoint V: [formula omitted in source]
where $E_i$ denotes the head node of the i-th loop, $P_i$ its predecessor basic block (Preheader), $V_i$ its viewpoint, and $\%tc_i$ its trip count, for $i = 1, 2, \ldots, n-1, n$;
Step 4: define the frequency $B_N$ of basic block N as the product of the actual run count $E_V$ of viewpoint V and the frequency $B_{V,N}$ predicted for basic block N at viewpoint V, i.e. $B_N = E_V \times B_{V,N}$;
Step 5: each time viewpoint V actually runs, the actual run count $E_V$ is obtained and, correspondingly, the frequency $B_{V,N}$ predicted for basic block N at viewpoint V is inserted at viewpoint V, so that the total basic block frequency of viewpoint V over one run is [formula omitted in source].
2. The viewpoint lifting method for program automatic performance prediction according to claim 1, characterized in that the basic block frequency $(E_V, B_{V,N})$ represented by the two-tuple in step 1 takes the following values:
(1) when the basic block itself is selected as the viewpoint, the basic block frequency is represented by the two-tuple $(E_N, 1)$;
(2) for a basic block inside a loop, when the Preheader of the loop is selected as the viewpoint, the basic block frequency is expressed as [formula omitted in source], where %tc is the trip count obtained by static analysis and [symbol omitted in source] is the static branch prediction probability provided by the LLVM compiler framework;
(3) for a non-loop basic block, the basic block frequency of the LLVM compiler framework is followed, and the viewpoint of the non-loop basic block remains the function entry basic block e, i.e. $(E_e, B_{e,N})$.
3. The viewpoint lifting method for program automatic performance prediction according to claim 2, characterized in that the viewpoint search operation of step 2.4 proceeds as follows:
traverse each basic block N in the Depends set; if the initial viewpoint pos dominates basic block N, reassign pos to basic block N; otherwise, pos keeps its original value.
4. The viewpoint lifting method for program automatic performance prediction according to claim 3, characterized in that, when the selected viewpoint V is the same as that of the parent loop, the expression for the parent loop is used to simplify the frequency $B_{V,N}$ predicted for basic block N at viewpoint V to: [formula omitted in source]
where $H_1$ denotes the parent loop, [symbol omitted in source] denotes the viewpoint shared with the parent loop, and [symbol omitted in source] denotes the parent loop frequency used to simplify the calculation.
CN201510579026.0A 2015-09-11 2015-09-11 Viewpoint lifting method for program automatic performance prediction Active CN105183651B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510579026.0A CN105183651B (en) 2015-09-11 2015-09-11 Viewpoint lifting method for program automatic performance prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510579026.0A CN105183651B (en) 2015-09-11 2015-09-11 Viewpoint lifting method for program automatic performance prediction

Publications (2)

Publication Number Publication Date
CN105183651A (en) 2015-12-23
CN105183651B (en) 2018-03-16

Family

ID=54905743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510579026.0A Active CN105183651B (en) 2015-09-11 2015-09-11 Viewpoint lifting method for program automatic performance prediction

Country Status (1)

Country Link
CN (1) CN105183651B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377525B (en) * 2019-07-25 2022-11-15 哈尔滨工业大学 Parallel program performance prediction system based on runtime characteristics and machine learning

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101286132A (en) * 2008-06-02 2008-10-15 北京邮电大学 Test method and system based on software defect mode

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5993575B2 (en) * 2008-12-18 2016-09-14 コーニンクレッカ フィリップス エヌ ヴェKoninklijke Philips N.V. Software bug and performance deficiency report creation system, digital storage medium and method

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101286132A (en) * 2008-06-02 2008-10-15 北京邮电大学 Test method and system based on software defect mode

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
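Research on automatic performance prediction of scientific computing programs based on LLVM (基于LLVM的科学计算程序自动性能预测研究); 谢虎成; CNKI (《中国知网》); 2015-07-15; page 5, section 1.5 to page 53, section 3.6 *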

Also Published As

Publication number Publication date
CN105183651A (en) 2015-12-23


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant