CN105183651A - Viewpoint increase method for automatic performance prediction of program - Google Patents
Viewpoint increase method for automatic performance prediction of program
- Publication number
- CN105183651A (application CN201510579026.0A)
- Authority
- CN
- China
- Prior art keywords
- basic block
- viewpoint
- frequency
- viewpoint V
- parent loop
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Devices For Executing Special Programs (AREA)
Abstract
The invention discloses a viewpoint increase method for automatic program performance prediction and belongs to the field of program performance prediction. Existing methods for automatic program performance prediction have difficulty determining the maximum predictability while guaranteeing prediction precision. The method comprises the following steps: first, a viewpoint V is defined for the execution count of a basic block N, and the basic block frequency is represented by the two-tuple $(E_V, B_{V,N})$; second, the promotion (increase) operation is applied to the actual run count $E_V$ of viewpoint V in the two-tuple; third, the frequency $B_{V,N}$ of basic block N predicted at viewpoint V is determined; fourth, the frequency of basic block N is defined as $B_N = E_V \times B_{V,N}$; fifth, the total basic block frequency in one run of viewpoint V is obtained. The method determines a suitable insertion location while guaranteeing precision, and improves predictability in combination with the static branch probability.
Description
Technical field
The present invention relates to a viewpoint promotion (increase) method for automatic program performance prediction.
Background technology
Common methods for evaluating program performance are dynamic analysis and static analysis. Dynamic analysis actually runs the program at a small input size and degree of parallelism and uses the result to predict large-scale cases; static analysis derives program characteristics from compiler-based code analysis. It is now desirable to combine the two to obtain program performance. Dynamic analysis stands for accuracy, and static analysis stands for predictability.
In existing research, loop trip counts are obtained by static analysis and instrumented into the source program; the pruned program is then run to obtain the trip counts, which are combined with the program characteristics to yield the predicted time. LLVM itself provides EdgeProfiling, which inserts edge-count information at the preheader of a loop. This, however, is a purely dynamic method that does not match the original intent of prediction, and the resulting predictability is low.
A suitable insertion position, that is, an instrumentation position or viewpoint, must be found while guaranteeing precision. A viewpoint is the instrumentation position at which the loop trip count obtained by static analysis is inserted; its advantage is that it unifies static prediction and dynamic prediction. Raising or lowering the viewpoint freely adjusts the ratio of dynamic to static behaviour. If the viewpoint is too high, predictability increases but precision drops, and the method approaches pure static analysis. If the viewpoint is too low, predictability is lost, and the method approaches the EdgeProfiling provided by LLVM. A method is therefore needed that promotes the viewpoint as far as possible to improve predictability while still guaranteeing precision.
Summary of the invention
The object of the invention is to solve the problem that existing automatic program performance prediction methods have difficulty determining the maximum predictability while guaranteeing prediction precision, and to this end a viewpoint promotion method for automatic program performance prediction is proposed.
A viewpoint promotion method for automatic program performance prediction, implemented by the following steps:
Step 1: define a viewpoint V for the execution count of basic block N, and represent the basic block frequency by the two-tuple $(E_V, B_{V,N})$, where $E_V$ is the actual number of times viewpoint V runs and $B_{V,N}$ is the frequency of basic block N predicted at viewpoint V. The viewpoint V is the insertion point of the instructions that compute the execution count of a loop basic block.
Step 2: apply the promotion operation to the actual run count $E_V$ of viewpoint V in the two-tuple $(E_V, B_{V,N})$, such that the promoted viewpoint V satisfies a dominance condition, where δ denotes the dominance relation: the values of the three directly depended-on variables %start, %end and %stride can be obtained at viewpoint V.
Step 3: determine the frequency $B_{V,N}$ of basic block N predicted at viewpoint V in the two-tuple $(E_V, B_{V,N})$.
Step 4: define the frequency $B_N$ of basic block N as the product of the actual run count $E_V$ of viewpoint V and the frequency $B_{V,N}$ of basic block N predicted at viewpoint V, i.e. $B_N = E_V \times B_{V,N}$.
Step 5: each actual run of viewpoint V yields the actual run count $E_V$; correspondingly, the instructions that predict the frequency $B_{V,N}$ of basic block N are inserted at viewpoint V, from which the total basic block frequency in one run of viewpoint V is obtained.
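The relationship between the two-tuple and the block frequency in steps 1 and 4 can be illustrated with a minimal Python sketch. The class name `BlockFrequency`, its field names, and the numbers are hypothetical and chosen only for illustration; the patent does not prescribe a data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockFrequency:
    """Two-tuple (E_V, B_{V,N}) for a basic block N seen from viewpoint V."""
    e_v: int     # measured: how many times viewpoint V actually ran
    b_vn: float  # predicted: executions of N per single run of V

    def total(self) -> float:
        # Step 4: B_N = E_V * B_{V,N}
        return self.e_v * self.b_vn


# Hypothetical numbers: V ran 10 times; N is predicted to run 100 times per run of V.
freq = BlockFrequency(e_v=10, b_vn=100.0)
print(freq.total())
```

Only `e_v` has to be measured at run time; `b_vn` comes from static analysis, which is what makes a higher viewpoint more "static".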
The beneficial effects of the invention are as follows:
The invention inserts the basic block counts at the Preheader to improve precision, and promotes the viewpoint so that the values of the three directly depended-on variables %start, %end and %stride can be obtained at viewpoint V, which compensates for the loss of predictability caused by improving precision. The improved predictability in turn eases later code deletion, which would otherwise be obstructed because reads and writes of the global counter array are scattered across the basic blocks. A suitable insertion position is thus determined while guaranteeing precision, and predictability is improved in combination with the static branch probability. The prediction accuracy reaches 92-95%.
Brief description of the drawings
Fig. 1 is a flow chart of the invention;
Fig. 2 shows, for Embodiment 1, how different viewpoints contain different ratios of dynamic to static behaviour;
Fig. 3 is a schematic diagram of loop nesting with different viewpoints, as used in Embodiment 1.
Embodiment
Embodiment one:
The viewpoint promotion method for automatic program performance prediction of this embodiment, described with reference to Fig. 1, is implemented by the following steps:
Step 1: define a viewpoint V for the execution count of basic block N, and represent the basic block frequency by the two-tuple $(E_V, B_{V,N})$, where $E_V$ is the actual number of times viewpoint V runs and $B_{V,N}$ is the frequency of basic block N predicted at viewpoint V. The viewpoint V is the insertion point of the instructions that compute the execution count of a loop basic block.
Step 2: apply the promotion operation to the actual run count $E_V$ of viewpoint V in the two-tuple $(E_V, B_{V,N})$, such that the promoted viewpoint V satisfies a dominance condition, where δ denotes the dominance relation: the values of the three directly depended-on variables %start, %end and %stride can be obtained at viewpoint V.
Step 3: determine the frequency $B_{V,N}$ of basic block N predicted at viewpoint V in the two-tuple $(E_V, B_{V,N})$.
Step 4: define the frequency $B_N$ of basic block N as the product of the actual run count $E_V$ of viewpoint V and the frequency $B_{V,N}$ of basic block N predicted at viewpoint V, i.e. $B_N = E_V \times B_{V,N}$.
Step 5: as the program runs, each actual run of viewpoint V yields the actual run count $E_V$; correspondingly, the instructions that predict the frequency $B_{V,N}$ of basic block N are inserted at viewpoint V, from which the total basic block frequency in one run of viewpoint V is obtained.
The symbol E is chosen here for the dynamic basic block frequency of the viewpoint because its value equals the result of EdgeProfiling.
The EdgeProfiling method of LLVM inserts the basic block counts at the Preheader, which improves precision but reduces predictability. The method of the invention, by contrast, increases predictability. The benefit is that later code deletion becomes easier: reads and writes of the global counter array scattered across the basic blocks would otherwise obstruct the deletion process, so promoting the viewpoint is necessary.
Embodiment two:
This embodiment differs from Embodiment one in that, in the viewpoint promotion method for automatic program performance prediction, the two-tuple basic block frequency $(E_V, B_{V,N})$ of step 1 takes the following values:
(1) when the basic block itself is chosen as the viewpoint, the two-tuple basic block frequency is $(E_N, 1)$;
(2) for a basic block inside a loop, when the Preheader of the loop is chosen as the viewpoint, the basic block frequency is expressed in terms of %tc, the loop trip count obtained by static analysis, and the static branch-prediction probability provided by the compiler framework LLVM;
(3) for a basic block outside any loop, the basic block frequency of the compiler framework LLVM is retained, and the viewpoint of such a block remains the function entry basic block e, that is: $(E_e, B_{e,N})$.
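The three cases can be sketched as a small Python function. This is only an illustration: the function name, the `kind` labels, and in particular the preheader case, which is assumed here to be the product of the trip count %tc and the static branch probability, are assumptions, since the exact expression for case (2) is not reproduced in this text.

```python
def frequency_tuple(kind, e_count, trip_count=None, branch_prob=None, static_freq=None):
    """Return a two-tuple (E_V, B_{V,N}) for one of the three viewpoint choices.

    kind: "self"      -> the block is its own viewpoint, case (1)
          "preheader" -> loop block viewed from the loop preheader, case (2)
          "entry"     -> non-loop block viewed from the function entry, case (3)
    """
    if kind == "self":
        return (e_count, 1.0)
    if kind == "preheader":
        # Assumption: predicted frequency = trip count * static branch probability.
        return (e_count, trip_count * branch_prob)
    if kind == "entry":
        # static_freq stands in for the block frequency supplied by the compiler.
        return (e_count, static_freq)
    raise ValueError(f"unknown viewpoint kind: {kind}")


print(frequency_tuple("self", 7))
print(frequency_tuple("preheader", 3, trip_count=100, branch_prob=0.5))
print(frequency_tuple("entry", 1, static_freq=2.5))
```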
Embodiment three:
This embodiment differs from Embodiment one or two in that, in the viewpoint promotion method for automatic program performance prediction, the promotion operation applied in step 2 to the actual run count $E_V$ of viewpoint V in the two-tuple $(E_V, B_{V,N})$ proceeds as follows:
Step 2.1: select the instructions on which the basic block count of the loop directly depends; these directly depended-on instructions are memory access instructions, exchange instructions or arithmetic instructions executed by the program.
Step 2.2: among the directly depended-on instructions selected in step 2.1, assign the operands that are used only once to the target set Targets; these are the instructions inserted when LoopTripCount is generated, i.e. the instructions that need to be promoted.
Step 2.3: for the operands that are used multiple times in the instructions selected in step 2.1, assign the viewpoint of their parent loop (ParentLoop) to the set Depends.
Step 2.4: if the set Depends is empty, return the basic block containing the loop and proceed to step 2.5; otherwise, perform the search for viewpoint V.
Step 2.5: traverse the instructions in the target set Targets and insert each of them before the terminator instruction (Terminator) of the initial viewpoint pos.
Step 2.6: return the initial viewpoint pos, which completes the promotion of viewpoint V.
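Steps 2.2 to 2.6 can be sketched in Python on a toy representation. Everything here is hypothetical scaffolding rather than LLVM API: `Operand`, `Block`, `promote`, and the value names are invented for illustration, a block is modelled as a list of instruction names whose last entry is assumed to be the terminator, and the viewpoint search of step 2.4 is passed in as a callback.

```python
from dataclasses import dataclass, field


@dataclass
class Operand:
    name: str              # hypothetical IR value name
    use_count: int         # how many times the operand is used
    parent_viewpoint: str  # viewpoint (block name) of the operand's parent loop


@dataclass
class Block:
    name: str
    instructions: list = field(default_factory=list)  # last item is the terminator


def promote(operands, loop_block, find_viewpoint):
    # Step 2.2: operands used exactly once form the Targets set.
    targets = [op for op in operands if op.use_count == 1]
    # Step 2.3: for operands used several times, record the parent loop's viewpoint.
    depends = [op.parent_viewpoint for op in operands if op.use_count > 1]
    # Step 2.4: empty Depends -> stay at the loop's own block; otherwise search.
    pos = loop_block if not depends else find_viewpoint(loop_block, depends)
    # Step 2.5: insert each target just before the terminator of pos.
    for op in targets:
        pos.instructions.insert(len(pos.instructions) - 1, op.name)
    # Step 2.6: the returned block is the promoted viewpoint.
    return pos


entry = Block("entry", ["%n = load", "br"])
loop = Block("loop.preheader", ["br"])
ops = [Operand("%tc.sub", 1, "entry"), Operand("%tc.div", 1, "entry"), Operand("%n", 3, "entry")]
result = promote(ops, loop, lambda start, deps: entry)
print(result.name, result.instructions)
```

In this run the multiply-used operand `%n` makes Depends non-empty, so the stand-in search moves the trip-count instructions out of the preheader and into the entry block.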
Embodiment four:
This embodiment differs from Embodiment three in that, in the viewpoint promotion method for automatic program performance prediction, the search for viewpoint V in step 2.4 proceeds as follows:
Traverse the basic blocks N in the set Depends; if the initial viewpoint pos dominates basic block N, reassign pos to basic block N; otherwise pos keeps its initial value.
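The search can be sketched in Python with a textbook iterative dominator computation on a small control-flow graph. The function names, the dictionary-based CFG, and the example graph are assumptions made for illustration; a real implementation would presumably query the compiler's dominator tree instead.

```python
def dominators(cfg, entry):
    """Iterative dominator sets; cfg maps each block to its list of successors."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def find_viewpoint(pos, depends, dom):
    """Move pos to each block in Depends that pos dominates; otherwise keep pos."""
    for n in depends:
        if pos in dom[n]:  # pos dominates n
            pos = n
    return pos


# Hypothetical diamond-shaped CFG: entry -> a -> (b | c) -> d
cfg = {"entry": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
dom = dominators(cfg, "entry")
print(find_viewpoint("entry", ["a", "d"], dom))
print(find_viewpoint("b", ["c"], dom))
```

Block `b` does not dominate `c` in this graph, so the second call leaves the viewpoint unchanged.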
Embodiment five:
This embodiment differs from Embodiment one, two or four in that, in the viewpoint promotion method for automatic program performance prediction, the frequency $B_{V,N}$ of basic block N predicted at viewpoint V in the two-tuple $(E_V, B_{V,N})$ is determined in step 3 as follows:
When the chosen viewpoint V and the parent loop are in the same loop level, the frequency $B_{V,N}$ of basic block N predicted at viewpoint V is expressed as $E_V,\ (V \to m)\,\%tc$, where m denotes the loop level of the parent loop. For example, when the chosen viewpoint V and the fourth-level parent loop are in the same loop level, the frequency of basic block N predicted at viewpoint V is $E_V,\ (V \to 4)\,\%tc$.
When the chosen viewpoint lies outside the n-th parent loop, the probability of the viewpoint reaching the Preheader is multiplied by the trip count and then by the probability of the next node reaching its predecessor basic block Preheader, and so on through each level of nesting, which yields the frequency of basic block N predicted at viewpoint V. Here $E_i$ denotes the head node of the i-th loop, $P_i$ its predecessor basic block Preheader, $V_i$ the viewpoint and $\%tc_i$ the trip count, for i = 1, 2, ..., n-1, n.
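One way to read the nested case is as a running product over loop levels, sketched below in Python. The exact formula is not reproduced in this text, so the form used here, a product of (probability of reaching the preheader × trip count) per level, is an assumption based on the verbal description; the function name and the numbers are hypothetical.

```python
def nested_frequency(levels):
    """Predicted executions of the innermost block per run of the viewpoint.

    levels: list of (reach_probability, trip_count) pairs, outermost loop first.
    reach_probability is the static probability of getting from the current
    node to the preheader of that loop; trip_count is that loop's %tc.
    """
    freq = 1.0
    for reach_prob, trip_count in levels:
        freq *= reach_prob * trip_count
    return freq


# Hypothetical two-level nest: the outer preheader is reached with probability 0.5
# and the outer loop runs 10 times; the inner preheader is always reached and the
# inner loop runs 20 times.
print(nested_frequency([(0.5, 10), (1.0, 20)]))
```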
Embodiment six:
This embodiment differs from Embodiment five in that, in the viewpoint promotion method for automatic program performance prediction, when the chosen viewpoint V is the same as that of the parent loop, the frequency $B_{V,N}$ of basic block N predicted at viewpoint V is simplified by reusing the expression of the parent loop. In the simplified expression, $H_1$ denotes the parent loop, and the two remaining terms denote the viewpoint shared by the viewpoint and the parent loop and the parent loop frequency used to simplify the calculation.
Embodiment 1:
Modeling the computation time:
Fig. 2 shows the basic block frequencies corresponding to different viewpoints; its left and right sides are the two extremes, dynamic and static. In every case the block being predicted is loop basic block 4. With the loop's predecessor basic block Preheader 3 as the viewpoint, the trip count of loop basic block 4 is %tc.
To keep precision, the viewpoint is then promoted to basic block 1. At basic block 1, predicting the frequency of block 3 only requires multiplying by the path probability $(1 \to 3)$. The prediction therefore gives:
$(E_3,\ \%tc) = (E_1,\ (1 \to 3),\ \%tc) = (E_1,\ \%tc\,(1 \to 3))$
This completes the promotion. The next step is to determine how basic block 1 is chosen, so that the viewpoint is promoted as far as possible while still guaranteeing precision. Because %tc depends on the three quantities %start, %end and %stride, the chosen viewpoint D must satisfy a dominance condition on these three variables,
where δ denotes the dominance relation: the values of the three variables can be obtained at viewpoint D.
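The reason all three values must be available at the viewpoint can be seen from how a trip count is typically computed. The Python sketch below assumes a simple counted loop of the form `for (i = start; i < end; i += stride)` with a positive stride; the patent does not state the formula for %tc, so this closed form and the function name are assumptions for illustration only.

```python
def trip_count(start, end, stride):
    """Iterations of `for (i = start; i < end; i += stride)`, assuming stride > 0."""
    if stride <= 0:
        raise ValueError("this sketch only handles a positive stride")
    # Ceiling division of (end - start) by stride, clamped at zero.
    return max(0, -(-(end - start) // stride))


print(trip_count(0, 10, 3))   # i takes the values 0, 3, 6, 9
print(trip_count(5, 5, 1))    # the loop body never runs
```

If any one of the three inputs were defined only after the viewpoint, the expression could not be evaluated there, which is why the viewpoint can be promoted only as far as the dominance condition allows.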
The promotion algorithm used in this design to find the viewpoint is as follows:
Next, $B_{V,N}$ must be determined. The complex case of loop nesting shown in Fig. 3 is discussed below.
The basic blocks of the parent loop containing loop basic block 2 are {1, 4, 2, 3}, and its basic block frequency is $(E_c,\ \%tc)$. $E_i$, $P_i$, $V_i$ and $\%tc_i$ denote, respectively, the head node, the predecessor basic block Preheader, the viewpoint and the trip-count IR of the i-th loop.
When the chosen viewpoint $V_i$ is in the same loop level as viewpoint 4, $E_V,\ (V \to 4)\,\%tc$ can be determined. When the chosen viewpoint D lies outside 4, i.e. outside the parent loop, the frequency is given by a complete nested expression.
As a special case, if the viewpoint is the same as that of the parent loop, some simplification is possible. The two share the same prefix, so the expression of the parent loop can be reused. For example, the parent loop of viewpoint 2 is $H_1$; given their common viewpoint and the parent loop frequency, the calculation formula can be simplified accordingly.
The present invention may also have various other embodiments. Without departing from the spirit and essence of the invention, those skilled in the art may make corresponding changes and variations according to the invention, and all such changes and variations shall fall within the protection scope of the claims appended to the invention.
Claims (6)
1. A viewpoint promotion method for automatic program performance prediction, characterized in that the method is implemented by the following steps:
Step 1: define a viewpoint V for the execution count of basic block N, and represent the basic block frequency by the two-tuple $(E_V, B_{V,N})$, where $E_V$ is the actual number of times viewpoint V runs and $B_{V,N}$ is the frequency of basic block N predicted at viewpoint V; the viewpoint V is the insertion point of the instructions that compute the execution count of a loop basic block;
Step 2: apply the promotion operation to the actual run count $E_V$ of viewpoint V in the two-tuple $(E_V, B_{V,N})$, such that the promoted viewpoint V satisfies a dominance condition, where δ denotes the dominance relation: the values of the three directly depended-on variables %start, %end and %stride can be obtained at viewpoint V;
Step 3: determine the frequency $B_{V,N}$ of basic block N predicted at viewpoint V in the two-tuple $(E_V, B_{V,N})$;
Step 4: define the frequency $B_N$ of basic block N as the product of the actual run count $E_V$ of viewpoint V and the frequency $B_{V,N}$ of basic block N predicted at viewpoint V, i.e. $B_N = E_V \times B_{V,N}$;
Step 5: each actual run of viewpoint V yields the actual run count $E_V$; correspondingly, the instructions that predict the frequency $B_{V,N}$ of basic block N are inserted at viewpoint V, from which the total basic block frequency in one run of viewpoint V is obtained.
2. The viewpoint promotion method for automatic program performance prediction according to claim 1, characterized in that the two-tuple basic block frequency $(E_V, B_{V,N})$ of step 1 takes the following values:
(1) when the basic block itself is chosen as the viewpoint, the two-tuple basic block frequency is $(E_N, 1)$;
(2) for a basic block inside a loop, when the Preheader of the loop is chosen as the viewpoint, the basic block frequency is expressed in terms of %tc, the loop trip count obtained by static analysis, and the static branch-prediction probability provided by the compiler framework LLVM;
(3) for a basic block outside any loop, the basic block frequency of the compiler framework LLVM is retained, and the viewpoint of such a block remains the function entry basic block e, that is: $(E_e, B_{e,N})$.
3. The viewpoint promotion method for automatic program performance prediction according to claim 1 or 2, characterized in that the promotion operation applied in step 2 to the actual run count $E_V$ of viewpoint V in the two-tuple $(E_V, B_{V,N})$ proceeds as follows:
Step 2.1: select the instructions on which the basic block count of the loop directly depends;
Step 2.2: among the directly depended-on instructions selected in step 2.1, assign the operands that are used only once to the target set;
Step 2.3: for the operands that are used multiple times in the instructions selected in step 2.1, assign the viewpoint of their parent loop to the set Depends;
Step 2.4: if the set Depends is empty, return the basic block containing the loop and proceed to step 2.5; otherwise, perform the search for viewpoint V;
Step 2.5: traverse the instructions in the target set and insert each of them before the terminator instruction of the initial viewpoint pos;
Step 2.6: return the initial viewpoint pos, which completes the promotion of viewpoint V.
4. The viewpoint promotion method for automatic program performance prediction according to claim 3, characterized in that the search for viewpoint V in step 2.4 proceeds as follows:
Traverse the basic blocks N in the set Depends; if the initial viewpoint pos dominates basic block N, reassign pos to basic block N; otherwise pos keeps its initial value.
5. The viewpoint promotion method for automatic program performance prediction according to claim 1, 2 or 4, characterized in that the frequency $B_{V,N}$ of basic block N predicted at viewpoint V in the two-tuple $(E_V, B_{V,N})$ is determined in step 3 as follows:
When the chosen viewpoint V and the parent loop are in the same loop level, the frequency $B_{V,N}$ of basic block N predicted at viewpoint V is expressed as $E_V,\ (V \to m)\,\%tc$, where m denotes the loop level of the parent loop;
When the chosen viewpoint lies outside the n-th parent loop, the probability of the viewpoint reaching the Preheader is multiplied by the trip count and then by the probability of the next node reaching its predecessor basic block Preheader, and so on through each level of nesting, which yields the frequency of basic block N predicted at viewpoint V, where $E_i$ denotes the head node of the i-th loop, $P_i$ its predecessor basic block Preheader, $V_i$ the viewpoint and $\%tc_i$ the trip count, for i = 1, 2, ..., n-1, n.
6. The viewpoint promotion method for automatic program performance prediction according to claim 5, characterized in that, when the chosen viewpoint V is the same as that of the parent loop, the frequency $B_{V,N}$ of basic block N predicted at viewpoint V is simplified by reusing the expression of the parent loop.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510579026.0A CN105183651B (en) | 2015-09-11 | 2015-09-11 | Viewpoint promotion method for automatic program performance prediction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105183651A (en) | 2015-12-23 |
CN105183651B (en) | 2018-03-16 |
Family
ID=54905743
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510579026.0A Active CN105183651B (en) | 2015-09-11 | 2015-09-11 | Viewpoint promotion method for automatic program performance prediction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105183651B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377525A (en) * | 2019-07-25 | 2019-10-25 | 哈尔滨工业大学 | A parallel program performance prediction system based on runtime features and machine learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101286132A (en) * | 2008-06-02 | 2008-10-15 | 北京邮电大学 | Test method and system based on software defect mode |
WO2010070490A1 (en) * | 2008-12-18 | 2010-06-24 | Koninklijke Philips Electronics, N.V. | Software bug and performance deficiency reporting system |
Non-Patent Citations (1)
Title |
---|
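Xie Hucheng (谢虎成): "Research on automatic performance prediction of scientific computing programs based on LLVM" (基于LLVM的科学计算程序自动性能预测研究), CNKI (《中国知网》) * |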
Also Published As
Publication number | Publication date |
---|---|
CN105183651B (en) | 2018-03-16 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN102298514B (en) | Register mapping techniques for efficient dynamic binary translation | |
Yang et al. | Dense reppoints: Representing visual objects with dense point sets | |
US20170338802A1 (en) | Actually-measured marine environment data assimilation method based on sequence recursive filtering three-dimensional variation | |
CN101841730A (en) | Real-time stereoscopic vision implementation method based on FPGA | |
US10534576B2 (en) | Optimization apparatus and control method thereof | |
CN112598091B (en) | Training model and small sample classification method and device | |
CN104699464A (en) | Dependency mesh based instruction-level parallel scheduling method | |
CN108564221A (en) | A kind of photovoltaic array spacing and the computational methods and computing device at inclination angle | |
CN105426918A (en) | Efficient realization method for normalized correlation image template matching | |
CN105242907A (en) | NEON vectorization conversion method for ARM (Advanced RISC Machine) binary code | |
CN116091574A (en) | 3D target detection method and system based on plane constraint and position constraint | |
CN105183651A (en) | Viewpoint increase method for automatic performance prediction of program | |
Chen et al. | I-SMOOTH: Iteratively smoothing mean-constrained and nonnegative piecewise-constant functions | |
Bražėnas et al. | Parallel algorithms for fitting Markov arrival processes | |
CN105184807A (en) | Automatic efficiency selection method for increasing charted depth | |
CN114548414A (en) | Method, device, storage medium and compiling system for compiling quantum circuit | |
CN104360906A (en) | High-level comprehensive scheduling method based on difference constraint system and iterative model | |
Wang et al. | Real-time hierarchical supervoxel segmentation via a minimum spanning tree | |
CN112131794A (en) | Hydraulic structure multi-effect optimization prediction and visualization method based on LSTM network | |
CN103942095A (en) | Two-dimensional phase position unwrapping method based on heterogeneous accelerating platform | |
CN108038304A (en) | A kind of Lattice Boltzmann Method parallel acceleration method using temporal locality | |
CN109492086A (en) | A kind of answer output method, device, electronic equipment and storage medium | |
CN105467383A (en) | Distance measurement method based on waveform matching in TOF technology | |
CN102314215A (en) | Low power consumption optimization method of decimal multiplier in integrated circuit system | |
KR101623113B1 (en) | Apparatus and method for learning and classification of decision tree |
Legal Events
Code | Title |
---|---|
C06 | Publication |
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |