CN105183651A - Viewpoint increase method for automatic performance prediction of program - Google Patents

Info

Publication number
CN105183651A
Authority
CN
China
Prior art keywords
basic block
viewpoint
frequency
viewpoint V
parent loop
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510579026.0A
Other languages
Chinese (zh)
Other versions
CN105183651B (en)
Inventor
张伟哲
谢虎成
何慧
韩硕
郝萌
王学惠
鲁刚钊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN201510579026.0A
Publication of CN105183651A
Application granted
Publication of CN105183651B
Legal status: Active
Anticipated expiration

Landscapes

  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a viewpoint increase method for automatic program performance prediction and belongs to the field of program performance prediction. Existing methods for automatic program performance prediction have difficulty reaching maximum predictability while guaranteeing prediction precision. The method comprises the following steps: first, a viewpoint V is defined for the execution count of a basic block N, and the basic block frequency is expressed as the 2-tuple $(E_V, B_{V,N})$; second, a raise operation is performed on the actual-run-count component $E_V$ of the 2-tuple, that is, viewpoint V is raised; third, the frequency $B_{V,N}$ of basic block N predicted at viewpoint V is determined; fourth, the frequency of basic block N is defined as $B_N = E_V \times B_{V,N}$; fifth, the total basic block frequency over one run of viewpoint V is obtained. The method determines a suitable insertion position while guaranteeing precision and improves predictability by combining static branch probabilities.

Description

Viewpoint increase method for automatic program performance prediction
Technical field
The present invention relates to a viewpoint increase method for automatic program performance prediction.
Background art
Common methods for evaluating program performance are dynamic analysis and static analysis. Dynamic analysis actually runs the program with a smaller input size and degree of parallelism in order to predict the large-scale case; static analysis obtains program features through compiler-based code analysis. The goal here is to combine dynamic and static analysis to obtain program performance. Dynamic analysis stands for accuracy, and static analysis stands for predictability.
In prior research, loop counts are obtained by static analysis and instrumented into the source program; the pruned program is then run to obtain the loop counts, which are combined with program features to produce the predicted time. LLVM itself provides EdgeProfiling, which inserts "edge" information at the preheader of a loop. This, however, is a purely dynamic analysis method, which does not match the original intent of prediction, and the predictability it yields is low.
A suitable insertion position must be found while precision is guaranteed; the instrumentation position that is found is the viewpoint. A viewpoint is the instrumentation position at which the loop count is obtained by static analysis, and its advantage is that it unifies static prediction and dynamic prediction. Raising or lowering the viewpoint freely adjusts the ratio between the dynamic and static parts. If the viewpoint is too high, predictability increases but precision drops, and the method approaches pure static analysis. If the viewpoint is too low, predictability is lost, and the method approaches the EdgeProfiling provided by LLVM. A method is urgently needed that raises the viewpoint as far as possible, to improve predictability, while precision is guaranteed.
Summary of the invention
The object of the invention is to solve the problem that existing automatic program performance prediction methods have difficulty reaching maximum predictability while guaranteeing prediction precision; to this end, a viewpoint increase method for automatic program performance prediction is proposed.
A viewpoint increase method for automatic program performance prediction, implemented by the following steps:
Step 1: define a viewpoint V for the execution count of basic block N, and express the basic block frequency as the 2-tuple $(E_V, B_{V,N})$, where $E_V$ is the actual run count of viewpoint V and $B_{V,N}$ is the frequency of basic block N predicted at viewpoint V. The viewpoint V is the insertion point of the instructions that compute the execution count of a loop basic block;
Step 2: perform a raise operation on the actual-run-count component $E_V$ of the 2-tuple $(E_V, B_{V,N})$, that is, raise viewpoint V, such that the raised viewpoint V satisfies $$\begin{cases} \%start \;\delta\; V \\ \%end \;\delta\; V \\ \%stride \;\delta\; V \\ V \;\delta\; Preheader \end{cases}$$ where $\delta$ denotes the dominance relation; that is, the values of the three directly depended-on variables %start, %end and %stride are available at viewpoint V;
Step 3: determine the frequency component $B_{V,N}$ of the 2-tuple $(E_V, B_{V,N})$, that is, the frequency of basic block N predicted at viewpoint V;
Step 4: define the frequency $B_N$ of basic block N as the product of the actual run count $E_V$ of viewpoint V and the frequency $B_{V,N}$ of basic block N predicted at viewpoint V, that is, $B_N = E_V \times B_{V,N}$;
Step 5: each actual run of viewpoint V yields the actual run count $E_V$; correspondingly, the instructions that predict the frequency $B_{V,N}$ of basic block N are inserted at viewpoint V, so the total basic block frequency over one run of viewpoint V is $B'_N = \sum_{i=1}^{E_V} E_V \times B_{V,N}$.
The beneficial effects of the invention are:
The invention inserts the basic block count in the Preheader to improve precision, and raises the viewpoint by making the values of the three directly depended-on variables %start, %end and %stride available at viewpoint V, which compensates for the loss of predictability caused by improving precision. The improved predictability in turn eases the later code deletion step, which would otherwise be blocked because reads and writes of the global counter array are scattered across every basic block. A suitable insertion position is thus determined while precision is guaranteed, and predictability is improved by combining static branch probabilities. The prediction accuracy reaches 92% to 95%.
Brief description of the drawings
Fig. 1 is a flowchart of the invention;
Fig. 2 shows how different viewpoints in Embodiment 1 contain different ratios of dynamic and static parts;
Fig. 3 is a schematic diagram of the nested loops with different viewpoints used in Embodiment 1.
Detailed description of the embodiments
Embodiment one:
In the viewpoint increase method for automatic program performance prediction of this embodiment, described with reference to Fig. 1, the method is implemented by the following steps:
Step 1: define a viewpoint V for the execution count of basic block N, and express the basic block frequency as the 2-tuple $(E_V, B_{V,N})$, where $E_V$ is the actual run count of viewpoint V and $B_{V,N}$ is the frequency of basic block N predicted at viewpoint V. The viewpoint V is the insertion point of the instructions that compute the execution count of a loop basic block;
Step 2: perform a raise operation on the actual-run-count component $E_V$ of the 2-tuple $(E_V, B_{V,N})$, that is, raise viewpoint V, such that the raised viewpoint V satisfies $$\begin{cases} \%start \;\delta\; V \\ \%end \;\delta\; V \\ \%stride \;\delta\; V \\ V \;\delta\; Preheader \end{cases}$$ where $\delta$ denotes the dominance relation; that is, the values of the three directly depended-on variables %start, %end and %stride are available at viewpoint V;
Step 3: determine the frequency component $B_{V,N}$ of the 2-tuple $(E_V, B_{V,N})$, that is, the frequency of basic block N predicted at viewpoint V;
Step 4: define the frequency $B_N$ of basic block N as the product of the actual run count $E_V$ of viewpoint V and the frequency $B_{V,N}$ of basic block N predicted at viewpoint V, that is, $B_N = E_V \times B_{V,N}$;
Step 5: as the program runs, each actual run of viewpoint V yields the actual run count $E_V$; correspondingly, the instructions that predict the frequency $B_{V,N}$ of basic block N are inserted at viewpoint V, so the total basic block frequency over one run of viewpoint V is $B'_N = \sum_{i=1}^{E_V} E_V \times B_{V,N}$. Here E is chosen to denote the dynamic basic block frequency of viewpoint V precisely because its value equals the EdgeProfiling result.
LLVM's EdgeProfiling method inserts the basic block count in the Preheader, which improves precision but reduces predictability. The method of the invention, by contrast, increases predictability. The benefit is that code can be deleted more easily at a later stage: with EdgeProfiling, reads and writes of the global counter array are scattered across every basic block, which blocks the deletion process. Raising the viewpoint is therefore necessary.
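The 2-tuple of steps 1 to 4 can be illustrated with a minimal Python sketch. The class name `BlockFrequency`, its field names and the numbers are assumptions made for illustration; they do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockFrequency:
    """The 2-tuple (E_V, B_{V,N}) for basic block N seen from viewpoint V."""
    e_v: int     # E_V: measured run count of viewpoint V (dynamic part)
    b_vn: float  # B_{V,N}: predicted runs of N per run of V (static part)

    def total(self) -> float:
        # Step 4: B_N = E_V * B_{V,N}
        return self.e_v * self.b_vn


# Hypothetical values: V ran 10 times, and each run of V is
# statically predicted to execute block N 100 times.
freq = BlockFrequency(e_v=10, b_vn=100.0)
print(freq.total())  # 1000.0
```

When the basic block itself is the viewpoint, the static part is 1 and the result is just the measured count, which corresponds to the purely dynamic extreme.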
Embodiment two:
This embodiment differs from Embodiment one in that the 2-tuple basic block frequency $(E_V, B_{V,N})$ of step 1 takes the following values:
(1) when the basic block itself is selected as the viewpoint, the basic block frequency is expressed as the 2-tuple $(E_N, 1)$;
(2) for a basic block inside a loop, when the Preheader of the loop is selected as the viewpoint, the basic block frequency is expressed as (formula missing in the source), where %tc denotes the statically analyzed loop count and (symbol missing in the source) is the static branch prediction probability provided by the compiler framework LLVM;
(3) for a basic block outside any loop, the basic block frequency of the compiler framework LLVM is reused, and the viewpoint of such a block remains the function entry basic block e, that is, $(E_e, B_{e,N})$.
Embodiment three:
This embodiment differs from Embodiment one or two in that the raise operation on the actual-run-count component $E_V$ of the 2-tuple $(E_V, B_{V,N})$ in step 2 proceeds as follows:
Step 2.1: select the instructions on which the basic block count of the loop directly depends; these directly depended-on instructions include one of the memory access instructions, exchange instructions and arithmetic instructions executed by the program;
Step 2.2: among the directly depended-on instructions selected in step 2.1, assign the operands that are used only once to the target set Targets; the directly depended-on instructions are the instructions inserted when LoopTripCount is generated, that is, the instructions to be raised;
Step 2.3: among the instructions selected in step 2.1, assign the viewpoints of the parent loop ParentLoop of the operands that are used more than once to the set Depends;
Step 2.4: if the set Depends is empty, return the basic block that contains the loop and proceed to step 2.5; otherwise, perform the operation of finding viewpoint V;
Step 2.5: traverse the instructions in the target set Targets and insert each instruction before the terminator instruction (Terminator) of the initial viewpoint pos;
Step 2.6: return the initial viewpoint pos, which completes the raising of viewpoint V.
Embodiment four:
This embodiment differs from Embodiment three in that the operation of finding viewpoint V in step 2.4 proceeds as follows:
Traverse the basic blocks N in the set Depends; if the initial viewpoint pos dominates basic block N, reassign viewpoint pos to basic block N; otherwise, the initial viewpoint pos keeps its value.
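The search over Depends and the insertion before the terminator can be sketched in Python as follows. This is an illustrative model, not the patent's LLVM implementation: blocks are plain strings, instructions are placeholder strings, dominance is read from a hypothetical immediate-dominator map `idom`, and the function names are invented for the example.

```python
from typing import Dict, List, Optional


def dominates(idom: Dict[str, Optional[str]], a: str, b: str) -> bool:
    """Return True if block a dominates block b (a block dominates itself)."""
    node: Optional[str] = b
    while node is not None:
        if node == a:
            return True
        node = idom.get(node)  # step up to the immediate dominator
    return False


def raise_viewpoint(idom: Dict[str, Optional[str]], initial_pos: str,
                    depends: List[str], targets: List[str],
                    blocks: Dict[str, List[str]]) -> str:
    """Find the viewpoint (step 2.4), then insert the targets (step 2.5)."""
    pos = initial_pos
    for n in depends:
        # Move pos down to n whenever the current pos dominates n.
        if dominates(idom, pos, n):
            pos = n
    body = blocks[pos]
    for inst in targets:
        # The last list element models the terminator; insert just before it.
        body.insert(len(body) - 1, inst)
    return pos


# Hypothetical control flow: entry -> b1 -> b2 -> preheader.
idom = {"entry": None, "b1": "entry", "b2": "b1", "preheader": "b2"}
blocks = {
    "entry": ["br b1"],
    "b1": ["%end = load", "br b2"],
    "b2": ["%stride = load", "br preheader"],
    "preheader": ["br header"],
}
pos = raise_viewpoint(idom, "entry", ["b1", "b2"], ["%tc = tripcount"], blocks)
print(pos, blocks[pos])
```

In this sketch the caller supplies the initial viewpoint; with an empty Depends set the targets are inserted into that initial block unchanged, which mirrors the empty-set branch of step 2.4.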
Embodiment five:
This embodiment differs from Embodiment one, two or four in that the frequency component $B_{V,N}$ of the 2-tuple $(E_V, B_{V,N})$ in step 3, that is, the frequency of basic block N predicted at viewpoint V, is determined as follows:
When the selected viewpoint V and the parent loop are in the same loop level, the basic block frequency is expressed as $(E_V,\ (V \to m)\,\%tc)$, where m denotes the loop level of the parent loop. For example, when the selected viewpoint V is in the same loop level as the fourth-level parent loop, the basic block frequency is $(E_V,\ (V \to 4)\,\%tc)$.
When the selected viewpoint lies outside the n-th parent loop, the probability that the viewpoint reaches the Preheader is multiplied in turn by the loop count and by the probability that the next node reaches the predecessor basic block Preheader, nesting level by level; the frequency of basic block N predicted at viewpoint V is then:
$$(E_C,\ \%tc_c) = \left(E_{V_n},\ (V_n \to P_n)\,\%tc_n\,(H_n \to P_{n-1})\,\%tc_{n-1}\,(H_{n-1} \to P_{n-2})\,\%tc_{n-2} \cdots \%tc_c\right),$$ where
$H_i$ denotes the header node of the i-th loop, $P_i$ the predecessor basic block Preheader, $V_i$ the viewpoint and $\%tc_i$ the loop count, for $i = 1, 2, \ldots, n-1, n$.
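The nested product can be evaluated with a short Python sketch. The function name `nested_frequency` and the sample probabilities and loop counts are assumptions for illustration; each level pairs the probability of reaching that loop's Preheader with the loop's statically analyzed count, listed from the outermost loop inward.

```python
from typing import List, Tuple


def nested_frequency(levels: List[Tuple[float, float]]) -> float:
    """Multiply (path probability, loop count) pairs, outermost loop first."""
    freq = 1.0
    for prob, trip_count in levels:
        # One factor (X -> P_i) * %tc_i of the nested expression.
        freq *= prob * trip_count
    return freq


# Hypothetical three-level nest:
#   (V_n -> P_n) = 0.5,      %tc_n     = 10
#   (H_n -> P_{n-1}) = 1.0,  %tc_{n-1} = 20
#   (H_{n-1} -> P_c) = 0.25, %tc_c     = 4
b_vn = nested_frequency([(0.5, 10), (1.0, 20), (0.25, 4)])
print(b_vn)      # 100.0
print(3 * b_vn)  # total frequency for a measured E_V of 3: 300.0
```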
Embodiment six:
This embodiment differs from Embodiment five in that, when the selected viewpoint V is the same as that of the parent loop, the frequency $B_{V,N}$ of basic block N predicted at viewpoint V is simplified by reusing the representation of the parent loop:
$$(E_4,\ \%tc) = \left(E_{V_m},\ B_{V_m,H_1}\,(H_1 \to 4)\,\%tc\right),$$ where $H_1$ denotes the parent loop, $V_m$ denotes the viewpoint shared by the viewpoint and the parent loop, and $B_{V_m,H_1}$ denotes the parent loop frequency used to simplify the calculation.
Embodiment 1:
Modeling the computation time:
Fig. 2 shows the basic block frequencies that correspond to different viewpoints; the left and right sides are the two extremes, dynamic and static respectively. In every case the block to be predicted is loop basic block 4. With the loop's predecessor basic block Preheader 3 as the viewpoint, the loop count of loop basic block 4 is %tc.
To keep precision while raising the viewpoint to basic block 1, the frequency of block 3 is predicted at basic block 1 simply by multiplying by the path probability. The prediction therefore gives:
$$(E_3,\ \%tc) = (E_1,\ (1 \to 3),\ \%tc) = (E_1,\ \%tc\,(1 \to 3))$$
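A worked instance of this identity may help. The numbers below are hypothetical and chosen only to show that predicting from viewpoint 1 and measuring at viewpoint 3 agree when the path probability is exact.

```latex
% Hypothetical values, not from the source:
% E_1 = 10, path probability (1 -> 3) = 0.5, loop count %tc = 100
\begin{aligned}
\text{viewpoint 1:}\quad B_4 &= E_1 \times \%tc \times (1 \to 3) = 10 \times 100 \times 0.5 = 500 \\
\text{viewpoint 3:}\quad E_3 &= E_1 \times (1 \to 3) = 10 \times 0.5 = 5, \qquad
B_4 = E_3 \times \%tc = 5 \times 100 = 500
\end{aligned}
```

If the static path probability deviates from the real branch behavior, the two results diverge, which is the precision cost of a higher viewpoint.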
This completes the raise. The next step is to decide how to select basic block 1 so that the viewpoint is raised as far as possible while precision is guaranteed. Because %tc depends on the three quantities %start, %end and %stride, the selected viewpoint D must satisfy $\%start \;\delta\; D$, $\%end \;\delta\; D$ and $\%stride \;\delta\; D$, where $\delta$ denotes the dominance relation; that is, the values of the three variables are available at viewpoint D.
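The dependence of %tc on the three variables can be made concrete with a small Python sketch. It assumes a counted loop of the form `for (i = start; i < end; i += stride)` with a positive stride; the patent does not state the loop form, so both the form and the function name `trip_count` are assumptions.

```python
def trip_count(start: int, end: int, stride: int) -> int:
    """Loop count of `for (i = start; i < end; i += stride)`, stride > 0."""
    if stride <= 0:
        raise ValueError("this sketch only handles a positive stride")
    if end <= start:
        return 0  # the loop body never runs
    # Ceiling division of the iteration span by the stride.
    return (end - start + stride - 1) // stride


print(trip_count(0, 10, 1))  # 10
print(trip_count(0, 10, 3))  # 4  (i = 0, 3, 6, 9)
```

Once all three values are available at a block, this computation can be placed there, which is why the viewpoint cannot be raised above the definitions of %start, %end and %stride.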
The raise algorithm that finds the viewpoint in this design is the procedure of steps 2.1 to 2.6 described in Embodiment three.
Next, $B_{V,N}$ must be determined; the more complex case of the nested loops shown in Fig. 3 is discussed below.
The basic blocks of the parent loop that contains loop basic block 2 are {1, 4, 2, 3}, and its basic block frequency is $(E_C,\ \%tc)$. $H_i$, $P_i$, $V_i$ and $\%tc_i$ denote, respectively, the header node, the predecessor basic block Preheader, the viewpoint and the loop count IR of the i-th loop.
When the selected viewpoint $V_i$ is in the same loop level as viewpoint 4, $(E_V,\ (V \to 4)\,\%tc)$ can be determined. But when the selected viewpoint D lies outside 4, that is, outside the parent loop, the complete representation is:
$$(E_C,\ \%tc_c) = \left(E_{V_n},\ (V_n \to P_n)\,\%tc_n\,(H_n \to P_{n-1})\,\%tc_{n-1}\,(H_{n-1} \to P_{n-2})\,\%tc_{n-2} \cdots \%tc_c\right).$$ That is, if the selected viewpoint lies outside the n-th parent loop, the probability that the viewpoint reaches the Preheader is multiplied in turn by the loop count and by the probability that the next node reaches the Preheader, nesting level by level.
As a special case, if the viewpoint is the same as that of the parent loop, some simplification is possible. The two share the same prefix, so the representation of the parent loop can be reused. For example, the parent loop of viewpoint 2 is $H_1$; if their common viewpoint is $V_m$ and the parent loop frequency is $B_{V_m,H_1}$, the formula reduces to: $$(E_4,\ \%tc) = \left(E_{V_m},\ B_{V_m,H_1}\,(H_1 \to 4)\,\%tc\right).$$
The invention may also have various other embodiments. Without departing from the spirit and essence of the invention, those skilled in the art may make corresponding changes and variations according to the invention, and all such changes and variations shall fall within the scope of protection of the appended claims.

Claims (6)

1. A viewpoint increase method for automatic program performance prediction, characterized in that the method is implemented by the following steps:
Step 1: define a viewpoint V for the execution count of basic block N, and express the basic block frequency as the 2-tuple $(E_V, B_{V,N})$, where $E_V$ is the actual run count of viewpoint V and $B_{V,N}$ is the frequency of basic block N predicted at viewpoint V; the viewpoint V is the insertion point of the instructions that compute the execution count of a loop basic block;
Step 2: perform a raise operation on the actual-run-count component $E_V$ of the 2-tuple $(E_V, B_{V,N})$, that is, raise viewpoint V, such that the raised viewpoint V satisfies $$\begin{cases} \%start \;\delta\; V \\ \%end \;\delta\; V \\ \%stride \;\delta\; V \\ V \;\delta\; Preheader \end{cases}$$ where $\delta$ denotes the dominance relation; that is, the values of the three directly depended-on variables %start, %end and %stride are available at viewpoint V;
Step 3: determine the frequency component $B_{V,N}$ of the 2-tuple $(E_V, B_{V,N})$, that is, the frequency of basic block N predicted at viewpoint V;
Step 4: define the frequency $B_N$ of basic block N as the product of the actual run count $E_V$ of viewpoint V and the frequency $B_{V,N}$ of basic block N predicted at viewpoint V, that is, $B_N = E_V \times B_{V,N}$;
Step 5: each actual run of viewpoint V yields the actual run count $E_V$; correspondingly, the instructions that predict the frequency $B_{V,N}$ of basic block N are inserted at viewpoint V, so the total basic block frequency over one run of viewpoint V is $B'_N = \sum_{i=1}^{E_V} E_V \times B_{V,N}$.
2. The viewpoint increase method for automatic program performance prediction according to claim 1, characterized in that the 2-tuple basic block frequency $(E_V, B_{V,N})$ of step 1 takes the following values:
(1) when the basic block itself is selected as the viewpoint, the basic block frequency is expressed as the 2-tuple $(E_N, 1)$;
(2) for a basic block inside a loop, when the Preheader of the loop is selected as the viewpoint, the basic block frequency is expressed as (formula missing in the source), where %tc denotes the statically analyzed loop count and (symbol missing in the source) is the static branch prediction probability provided by the compiler framework LLVM;
(3) for a basic block outside any loop, the basic block frequency of the compiler framework LLVM is reused, and the viewpoint of such a block remains the function entry basic block e, that is, $(E_e, B_{e,N})$.
3. The viewpoint increase method for automatic program performance prediction according to claim 1 or 2, characterized in that the raise operation on the actual-run-count component $E_V$ of the 2-tuple $(E_V, B_{V,N})$ in step 2 proceeds as follows:
Step 2.1: select the instructions on which the basic block count of the loop directly depends;
Step 2.2: among the directly depended-on instructions selected in step 2.1, assign the operands that are used only once to the target set;
Step 2.3: among the instructions selected in step 2.1, assign the viewpoints of the parent loop of the operands that are used more than once to the set Depends;
Step 2.4: if the set Depends is empty, return the basic block that contains the loop and proceed to step 2.5; otherwise, perform the operation of finding viewpoint V;
Step 2.5: traverse the instructions in the target set and insert each instruction before the terminator instruction of the initial viewpoint pos;
Step 2.6: return the initial viewpoint pos, which completes the raising of viewpoint V.
4. The viewpoint increase method for automatic program performance prediction according to claim 3, characterized in that the operation of finding viewpoint V in step 2.4 proceeds as follows:
Traverse the basic blocks N in the set Depends; if the initial viewpoint pos dominates basic block N, reassign viewpoint pos to basic block N; otherwise, the initial viewpoint pos keeps its value.
5. The viewpoint increase method for automatic program performance prediction according to claim 1, 2 or 4, characterized in that the frequency component $B_{V,N}$ of the 2-tuple $(E_V, B_{V,N})$ in step 3, that is, the frequency of basic block N predicted at viewpoint V, is determined as follows:
When the selected viewpoint V and the parent loop are in the same loop level, the basic block frequency is expressed as $(E_V,\ (V \to m)\,\%tc)$, where m denotes the loop level of the parent loop;
When the selected viewpoint lies outside the n-th parent loop, the probability that the viewpoint reaches the Preheader is multiplied in turn by the loop count and by the probability that the next node reaches the predecessor basic block Preheader, nesting level by level; the frequency of basic block N predicted at viewpoint V is then:
$$(E_C,\ \%tc_c) = \left(E_{V_n},\ (V_n \to P_n)\,\%tc_n\,(H_n \to P_{n-1})\,\%tc_{n-1}\,(H_{n-1} \to P_{n-2})\,\%tc_{n-2} \cdots \%tc_c\right),$$ where
$H_i$ denotes the header node of the i-th loop, $P_i$ the predecessor basic block Preheader, $V_i$ the viewpoint and $\%tc_i$ the loop count, for $i = 1, 2, \ldots, n-1, n$.
6. The viewpoint increase method for automatic program performance prediction according to claim 5, characterized in that, when the selected viewpoint V is the same as that of the parent loop, the frequency $B_{V,N}$ of basic block N predicted at viewpoint V is simplified by reusing the representation of the parent loop:
$$(E_4,\ \%tc) = \left(E_{V_m},\ B_{V_m,H_1}\,(H_1 \to 4)\,\%tc\right),$$ where $H_1$ denotes the parent loop, $V_m$ denotes the viewpoint shared by the viewpoint and the parent loop, and $B_{V_m,H_1}$ denotes the parent loop frequency used to simplify the calculation.
CN201510579026.0A 2015-09-11 2015-09-11 Viewpoint increase method for automatic performance prediction of program Active CN105183651B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510579026.0A CN105183651B (en) 2015-09-11 2015-09-11 Viewpoint increase method for automatic performance prediction of program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510579026.0A CN105183651B (en) 2015-09-11 2015-09-11 Viewpoint increase method for automatic performance prediction of program

Publications (2)

Publication Number Publication Date
CN105183651A (en) 2015-12-23
CN105183651B (en) 2018-03-16

Family

ID=54905743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510579026.0A Active CN105183651B (en) 2015-09-11 2015-09-11 Viewpoint increase method for automatic performance prediction of program

Country Status (1)

Country Link
CN (1) CN105183651B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377525A (en) * 2019-07-25 2019-10-25 哈尔滨工业大学 A parallel program performance prediction system based on runtime features and machine learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101286132A (en) * 2008-06-02 2008-10-15 北京邮电大学 Test method and system based on software defect mode
WO2010070490A1 (en) * 2008-12-18 2010-06-24 Koninklijke Philips Electronics, N.V. Software bug and performance deficiency reporting system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
谢虎成: "基于LLVM的科学计算程序自动性能预测研究", 《中国知网》 *

Also Published As

Publication number Publication date
CN105183651B (en) 2018-03-16

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant