WO2015062369A1 - Method and apparatus for optimizing compilation in profiling technology - Google Patents
- Publication number
- WO2015062369A1 (PCT/CN2014/086593)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- program
- dimensional array
- region
- instrumentation
- area
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3404—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for parallel or distributed programming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Prevention of errors by analysis, debugging or testing of software
- G06F11/3604—Analysis of software for verifying properties of programs
- G06F11/3612—Analysis of software for verifying properties of programs by runtime analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3476—Data logging
Definitions
- The embodiments of the present invention relate to network technologies, and in particular, to a method and apparatus for optimizing compilation with profiling technology.
- Profiling is an important method for collecting program runtime information. It takes many forms, including edge profiling, stride profiling, and value profiling, among which edge profiling is the most widely used.
- FIG. 1 is a schematic diagram of basic-block division in the existing edge profiling technique, and FIG. 2 is a schematic diagram of the control flow graph and instrumentation in the existing edge profiling technique.
- As shown in FIG. 1, the source program is first converted into an intermediate representation, and basic blocks such as BB1 and BB2 are divided on the intermediate representation (here the intermediate representation is still shown in C-language form).
- A control flow graph is then constructed from the basic blocks, each edge is numbered, and the main edges of the control flow graph are instrumented (some compilers instrument every edge). That is, function calls are inserted to count the number of times an edge is executed (the edge id is the number of the edge); the instrumentation functions are INSTRU1, INSTRU2, and so on.
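The conventional scheme described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `instru`, `edge_counts`, and `program`, and the edge ids 1 and 2, are hypothetical and do not come from the patent, whose figures use C-like intermediate code.

```python
from collections import Counter

# Hypothetical global profile: edge id -> number of executions.
edge_counts = Counter()

def instru(edge_id):
    """Instrumentation function: record one traversal of the given edge."""
    edge_counts[edge_id] += 1

def program(n):
    """Toy program with two instrumented edges (ids 1 and 2 are made up)."""
    total = 0
    for k in range(n):
        if k % 2 == 0:
            instru(1)  # edge taken when k is even
            total += k
        else:
            instru(2)  # edge taken when k is odd
            total -= k
    return total

result = program(5)
```

Every traversal of an instrumented edge pays for a function call on the same thread that runs the program, which is the overhead the embodiments below try to reduce.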
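- The embodiments of the invention provide a method and apparatus for optimizing profiling compilation, so as to overcome the low execution efficiency of profiling technology in the prior art.
- In a first aspect, an embodiment of the present invention provides a method for optimizing profiling compilation, including:
- dividing a program control flow graph into at least two regions, and replacing the instrumentation functions included in the program corresponding to each region with counting operations; executing the program corresponding to each region on a main thread and, when execution of the program corresponding to a region ends, starting at least one micro-thread to perform the operations of the instrumentation functions included in that program other than the counting operations, so as to record information about the execution of the program corresponding to the region; the micro-thread executes in parallel with the main thread executing the program corresponding to the next region.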
- Replacing the instrumentation functions included in the program corresponding to each region with counting operations includes:
- allocating a two-dimensional array of size N*2, where N is the maximum number of instrumentation functions included in the program corresponding to each region;
- assigning the sequence number of an edge in the program control flow graph to element i*2 of the two-dimensional array, and using element i*2+1 of the two-dimensional array for counting, where i denotes the i-th instrumentation function;
- the value of i is greater than or equal to 0 and less than or equal to N.
- Starting the at least one micro-thread to perform the operations of the instrumentation functions other than the counting operations includes:
- executing the instrumentation function a number of times equal to the value of element i*2+1, where the input parameter of the instrumentation function is element i*2 of the two-dimensional array.
- In a third implementation manner of the first aspect, starting the at least one micro-thread, when execution of the program corresponding to a region ends, to perform the operations of the instrumentation functions other than the counting operations includes:
- releasing the two-dimensional array space of size N*2.
- An embodiment of the present invention provides an apparatus for optimizing profiling compilation, including:
- a pre-processing module, configured to divide a program control flow graph into at least two regions and replace the instrumentation functions included in the program corresponding to each region with counting operations;
- a processing module, configured to execute the program corresponding to each region on a main thread and, when execution of the program corresponding to a region ends, start at least one micro-thread to perform the operations of the instrumentation functions included in that program other than the counting operations, so as to record information about the execution of the program corresponding to the region; the micro-thread executes in parallel with the main thread executing the program corresponding to the next region.
- The pre-processing module is specifically configured to:
- allocate a two-dimensional array of size N*2, where N is the maximum number of instrumentation functions included in the program corresponding to each region;
- assign the sequence number of an edge in the program control flow graph to element i*2 of the two-dimensional array, and use element i*2+1 of the two-dimensional array for counting, where i denotes the i-th instrumentation function;
- the value of i is greater than or equal to 0 and less than or equal to N.
- The processing module is specifically configured to:
- execute the instrumentation function a number of times equal to the value of element i*2+1, where the input parameter of the instrumentation function is element i*2 of the two-dimensional array.
- The processing module is further configured to:
- release the two-dimensional array space of size N*2.
- In the method and apparatus for optimizing profiling compilation of the embodiments of the present invention, the program control flow graph is divided into at least two regions, and the instrumentation functions included in the program corresponding to each region are replaced with counting operations. The program corresponding to each region is executed on the main thread; when execution of the program corresponding to a region ends, at least one micro-thread is started to perform the operations of the instrumentation functions included in that program other than the counting operations. The micro-thread executes in parallel with the main thread executing the program corresponding to the next region. Because part of the work of the instrumentation functions in each region is moved to micro-threads that run in parallel with the main thread, the execution efficiency of profiling compilation is improved.
- The problem of low execution efficiency in the prior art is thereby solved.
- FIG. 1 is a schematic diagram of basic-block division in the existing edge profiling technique;
- FIG. 2 is a schematic diagram of the control flow graph and instrumentation in the existing edge profiling technique;
- FIG. 3 is a flowchart of Embodiment 1 of the method for optimizing profiling compilation according to the present invention;
- FIG. 4 is a schematic diagram of the control flow graph in Embodiment 1;
- FIG. 5 is a schematic diagram of the control flow graph of FIG. 4 after region division;
- FIG. 6 is a schematic diagram of the control flow graph of FIG. 5 after the instrumentation functions are processed;
- FIG. 7A is a first schematic diagram of the execution of the control flow graph of FIG. 6;
- FIG. 7B is a second schematic diagram of the execution of the control flow graph of FIG. 6;
- FIG. 8 is a schematic structural diagram of an embodiment of an apparatus for optimizing profiling compilation according to the present invention;
- FIG. 9 is a schematic structural diagram of an embodiment of a device for optimizing profiling compilation according to the present invention.
- This embodiment is executed by an apparatus for optimizing profiling compilation, which can be implemented in software and/or hardware.
- The solution of this embodiment is applied to profiling compilation.
- The method in this embodiment may include:
- Step 301: Divide the program control flow graph into at least two regions, and replace the instrumentation functions included in the program corresponding to each region with counting operations.
- Edge profiling is taken as the example in the following description.
- The control flow graph shown in FIG. 4 is divided into regions. A region is a connected subgraph of the control flow graph: one or several basic blocks, together with the edges that connect them, can be regarded as one region node.
- Region nodes can be nested.
- There are three types of region node: 1) a loop-domain node, which contains all the basic blocks and connecting edges of a loop (it contains back edges internally); 2) an anomaly-domain node, which is a set of interconnected basic blocks containing an irreducible part of the control flow graph (it contains irreducible loop edges internally); 3) a multi-entry multi-exit node (it contains no loop edges internally), which includes single-entry multi-exit domain nodes.
- The regions are divided as follows. First, the control flow graph is traversed to identify the edges that can form loops, and the loop domains and anomaly domains are constructed: all loop edges in the control flow graph are found, and it is determined which of them are back edges.
- A back edge is an edge whose end node dominates its start node. The remaining loop edges are not back edges and belong to irreducible loops. Each back edge determines a natural loop.
- These natural loops form the loop domains (which may be nested within one another). For the parts of the control flow graph that contain irreducible loop edges, a strongly-connected-component algorithm is used to find the maximal strongly connected components; the strongly connected components that remain are the irreducible parts, and they constitute the anomaly domains.
- After the loop domains and anomaly domains are reduced to region nodes, several constituent units (basic blocks or region nodes) that are connected along a path in the control flow graph are combined into a region.
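The back-edge test used in this step can be illustrated with a small Python sketch. It assumes the standard compiler definition (an edge is a back edge when its target dominates its source) and a simple iterative dominator computation; the function names and the example graphs are hypothetical, and the patent does not say which dominator algorithm it uses.

```python
def dominators(succ, entry):
    """Iteratively compute the dominator set of every node in a CFG.

    succ maps each node to the list of its successors; every node must
    appear as a key.
    """
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for u, vs in succ.items():
        for v in vs:
            pred[v].add(u)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def back_edges(succ, entry):
    """Return the edges (u, v) whose target v dominates their source u."""
    dom = dominators(succ, entry)
    return {(u, v) for u, vs in succ.items() for v in vs if v in dom[u]}

# Hypothetical reducible CFG: B1 -> B2 -> B3 -> B2 (loop), B3 -> B4.
cfg = {"B1": ["B2"], "B2": ["B3"], "B3": ["B2", "B4"], "B4": []}
loops = back_edges(cfg, "B1")

# Hypothetical irreducible CFG: the cycle A <-> B can be entered at A or B.
irreducible = {"E": ["A", "B"], "A": ["B"], "B": ["A"]}
no_back_edges = back_edges(irreducible, "E")
```

In the first graph the edge B3 -> B2 is a back edge and defines a natural loop. In the second graph the cycle has two entries, so neither loop edge is a back edge; this is the kind of structure the text assigns to an anomaly domain.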
- Regions are divided according to the number of instrumentation functions in the nodes.
- The main division rule is the number of instrumentation functions contained in a region.
- The maximum number of instrumentation functions in a region is N, and the minimum is M.
- The values of N and M are determined heuristically, that is, empirical values are used.
- The instrumentation functions should be distributed across the regions as evenly as possible, with neither too many nor too few in any region, and the connections between regions should be as simple as possible.
- In this example, the control flow graph includes 20 basic blocks and 27 edges connecting them, and 8 instrumentation functions are inserted.
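The patent leaves the exact division rule to heuristics. As one possible reading, the Python sketch below greedily groups consecutive units along a path so that no region holds more than N instrumentation functions, and then reports regions that fall below M. The function name, the unit names, and the per-unit counts are all hypothetical, and the sketch assumes that no single unit exceeds N on its own.

```python
def group_regions(counts, n_max):
    """Greedily pack consecutive units into regions of at most n_max
    instrumentation functions. counts is a list of (unit, count) pairs
    in path order; each count is assumed to be <= n_max."""
    regions, current, total = [], [], 0
    for unit, c in counts:
        if current and total + c > n_max:
            regions.append((current, total))   # close the full region
            current, total = [], 0
        current.append(unit)
        total += c
    if current:
        regions.append((current, total))
    return regions

# Hypothetical path of basic blocks and one reduced loop node "L1",
# carrying 8 instrumentation functions in total.
path = [("BB1", 1), ("BB2", 2), ("L1", 2), ("BB7", 1), ("BB8", 2)]
N, M = 3, 2  # assumed empirical bounds
regions = group_regions(path, N)
undersized = [r for r in regions if r[1] < M]
```

With these made-up numbers the path splits into three regions holding 3, 3, and 2 instrumentation functions, and none is below M. A real implementation would also have to weigh how simple the connections between regions are, which this sketch ignores.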
- Replacing the instrumentation functions included in the program corresponding to each region with counting operations includes:
- allocating a two-dimensional array of size N*2, where N is the maximum number of instrumentation functions included in the program corresponding to each region;
- assigning the sequence number of an edge in the program control flow graph to element i*2 of the two-dimensional array, and using element i*2+1 of the two-dimensional array for counting, where i denotes the i-th instrumentation function and the value of i is greater than or equal to 0 and less than or equal to N.
- Specifically, a two-dimensional array space of size N*2 is allocated (it may be on the heap) and named, for example, a_id (where id is the region id); N is the maximum number of instrumentation functions included in the program corresponding to each region, for example, the number of regions in FIG.
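The array layout described above can be sketched as follows. This is an illustration under assumptions: a flat Python list of length N*2 stands in for the two-dimensional array, the index i runs from 0 to N-1 so that every access stays inside the array, and the edge numbers 4 and 7 and the helper `new_region_array` are hypothetical.

```python
N = 3  # assumed maximum number of instrumentation functions per region

def new_region_array(edge_ids):
    """Allocate the N*2 array for one region and store each edge number."""
    a = [0] * (N * 2)
    for i, edge_id in enumerate(edge_ids):
        a[i * 2] = edge_id   # element i*2: edge number in the control flow graph
        a[i * 2 + 1] = 0     # element i*2+1: execution counter
    return a

# Region 1 instruments two hypothetical edges, numbered 4 and 7.
a_1 = new_region_array([4, 7])

# Inside the region, a call such as INSTRU(4) becomes a plain increment.
for _ in range(5):
    a_1[0 * 2 + 1] += 1      # counting operation for i = 0 (edge 4)
a_1[1 * 2 + 1] += 1          # counting operation for i = 1 (edge 7)
```

After the region runs, `a_1` holds the pairs (4, 5) and (7, 1) followed by an unused slot, so the main thread has paid only for array increments rather than function calls.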
- Step 302: Execute the program corresponding to each region on the main thread and, when execution of the program corresponding to a region ends, start at least one micro-thread to perform the operations of the instrumentation functions included in that program other than the counting operations, so as to record information about the execution of the program corresponding to the region; the micro-thread executes in parallel with the main thread executing the program corresponding to the next region.
- That is, the instrumentation functions of each region are gathered together, and their operations other than counting are performed on micro-threads. The number of micro-threads started is chosen according to the number of instrumentation functions. The micro-thread that executes the instrumentation functions of one region runs in parallel with the main thread executing the program corresponding to the next region.
- Micro-threading is a new technology in which software and hardware work together. Its main features are that threads are cheap to start and can be short. Profiling compilation, in turn, is characterized by frequent starts, small amounts of computation, and one-way communication. For example, edge profiling may need to instrument every jump, which suits the low start-up cost of micro-threads.
- The basic job of an instrumentation function is counting, accompanied by some data processing, but the overall amount of computation is small, which suits the short length of micro-threads.
- Profiling observes program execution, so it needs to know the state of the executing program, but the results do not affect the normal execution of the program, and the different stages of profiling do not need to communicate with one another. The program can therefore run on the main thread while the profiling work is split across multiple micro-threads, with no explicit synchronization. Profiling compilation is thus well suited to micro-threading.
- The program corresponding to the control flow graph can run in two ways, which end in different regions.
- Accordingly, as shown in FIG. 7A and FIG. 7B, there are two execution modes.
- In the first execution mode, shown in FIG. 7A, R1, R2, and R3 are executed on the main thread, and the instrumentation functions of each region are executed together on one or more micro-threads. The micro-thread executing the instrumentation functions of region R1 (InstruR1) runs in parallel with the main thread executing region R2; the micro-thread executing the instrumentation functions of region R2 (InstruR2) runs in parallel with the main thread executing region R3; and after the main thread finishes region R3, a micro-thread is started to execute the instrumentation functions of region R3 (InstruR3).
- In the second execution mode, shown in FIG. 7B, R1 and R2 are executed on the main thread, and the instrumentation functions of each region are executed together on one or more micro-threads. The micro-thread executing the instrumentation functions of region R1 (InstruR1) runs in parallel with the main thread executing region R2, and after the main thread finishes region R2, a micro-thread is started to execute the instrumentation functions of region R2 (InstruR2).
- Starting the at least one micro-thread to perform the operations of the instrumentation functions of the program corresponding to the region other than the counting operations includes:
- executing the instrumentation function a number of times equal to the value of element i*2+1, where the input parameter of the instrumentation function is element i*2 of the two-dimensional array.
- Starting the at least one micro-thread to perform the operations of the instrumentation functions of the program corresponding to the region other than the counting operations further includes:
- releasing the two-dimensional array space of size N*2.
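The replay and release steps above, together with the overlap between regions, can be sketched in Python. Ordinary `threading.Thread` objects stand in for micro-threads here, which is only an approximation: micro-threads in the patent rely on hardware support, and Python threads do not give true parallelism for this kind of work. The names `instru`, `replay`, `run_region`, and `profile`, and the sample regions, are hypothetical; for brevity each array is sized to its region rather than to N*2.

```python
import threading
from collections import Counter

profile = Counter()              # hypothetical global profile: edge id -> count
profile_lock = threading.Lock()

def instru(edge_id):
    """Stand-in for the original instrumentation function."""
    with profile_lock:
        profile[edge_id] += 1

def replay(a):
    """Micro-thread body: call instru(a[i*2]) a[i*2+1] times, then release a."""
    for i in range(len(a) // 2):
        for _ in range(a[i * 2 + 1]):
            instru(a[i * 2])
    a.clear()                    # models releasing the array space

def run_region(edge_ids, taken):
    """Main-thread work for one region: only counter increments."""
    a = [0] * (2 * len(edge_ids))
    for i, e in enumerate(edge_ids):
        a[i * 2] = e
    for i in taken:              # 'taken' lists which instrumented edges fired
        a[i * 2 + 1] += 1
    return a

workers = []
for edge_ids, taken in [([1, 2], [0, 0, 1]), ([3], [0, 0, 0, 0])]:
    a = run_region(edge_ids, taken)
    t = threading.Thread(target=replay, args=(a,))
    t.start()                    # runs while the main thread enters the next region
    workers.append(t)
for t in workers:
    t.join()
```

The main thread never calls `instru` itself; it hands each finished region's array to a worker and moves on, and the only join is at the very end, when the final profile is read.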
- In this embodiment, the program control flow graph is divided into at least two regions, and the instrumentation functions included in the program corresponding to each region are replaced with counting operations. The program corresponding to each region is executed on the main thread; when execution of the program corresponding to a region ends, at least one micro-thread is started to perform the operations of the instrumentation functions included in that program other than the counting operations, and the micro-thread executes in parallel with the main thread executing the program corresponding to the next region.
- Because part of the work of the instrumentation functions in each region is moved to micro-threads that run in parallel with the main thread, and because the counting operations of the instrumentation functions are replaced with array operations, which execute more efficiently than counting through function calls, execution efficiency improves. After the micro-thread has executed the instrumentation functions, the array space is released, which improves resource utilization. The execution efficiency of profiling compilation is thereby improved, and the problem of low execution efficiency in the prior art is solved.
- FIG. 8 is a schematic structural diagram of an embodiment of an apparatus for optimizing profiling compilation according to the present invention.
- The apparatus 80 for optimizing profiling compilation of this embodiment may include a pre-processing module 801 and a processing module 802. The pre-processing module 801 is configured to divide the program control flow graph into at least two regions and replace the instrumentation functions included in the program corresponding to each region with counting operations. The processing module 802 is configured to execute the program corresponding to each region on the main thread and, when execution of the program corresponding to a region ends, start at least one micro-thread to perform the operations of the instrumentation functions included in that program other than the counting operations, so as to record information about the execution of the program corresponding to the region; the micro-thread executes in parallel with the main thread executing the program corresponding to the next region.
- The apparatus in this embodiment may be used to implement the technical solution of the method embodiment shown in FIG. 3; its implementation principle and technical effects are similar and are not described again here.
- The pre-processing module 801 is specifically configured to:
- allocate a two-dimensional array of size N*2, where N is the maximum number of instrumentation functions included in the program corresponding to each region;
- assign the sequence number of an edge in the program control flow graph to element i*2 of the two-dimensional array, and use element i*2+1 of the two-dimensional array for counting, where i denotes the i-th instrumentation function;
- the value of i is greater than or equal to 0 and less than or equal to N.
- The processing module 802 is specifically configured to:
- execute the instrumentation function a number of times equal to the value of element i*2+1, where the input parameter of the instrumentation function is element i*2 of the two-dimensional array.
- The processing module 802 is further configured to:
- release the two-dimensional array space of size N*2.
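- FIG. 9 is a schematic structural diagram of an embodiment of a device for optimizing profiling compilation according to the present invention.
- The device 90 for optimizing profiling compilation provided by this embodiment includes a bus 901, a processor 902, and a memory 903.
- The bus 901 connects the processor 902 and the memory 903 and transfers information; the memory 903 stores execution instructions.
- The processor 902 communicates with the memory 903, and the processor 902 runs the code stored in the memory 903 to perform the following operations:
- dividing a program control flow graph into at least two regions, and replacing the instrumentation functions included in the program corresponding to each region with counting operations; executing the program corresponding to each region on a main thread and, when execution of the program corresponding to a region ends, starting at least one micro-thread to perform the operations of the instrumentation functions included in that program other than the counting operations, so as to record information about the execution of the program corresponding to the region; the micro-thread executes in parallel with the main thread executing the program corresponding to the next region.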
- Replacing the instrumentation functions included in the program corresponding to each region with counting operations includes:
- allocating a two-dimensional array of size N*2, where N is the maximum number of instrumentation functions included in the program corresponding to each region;
- assigning the sequence number of an edge in the program control flow graph to element i*2 of the two-dimensional array, and using element i*2+1 of the two-dimensional array for counting, where i denotes the i-th instrumentation function;
- the value of i is greater than or equal to 0 and less than or equal to N.
- Starting the at least one micro-thread to perform the operations of the instrumentation functions of the program corresponding to the region other than the counting operations includes:
- executing the instrumentation function a number of times equal to the value of element i*2+1, where the input parameter of the instrumentation function is element i*2 of the two-dimensional array.
- Starting the at least one micro-thread to perform the operations of the instrumentation functions of the program corresponding to the region other than the counting operations further includes:
- releasing the two-dimensional array space of size N*2.
- The disclosed device and method can be implemented in other ways.
- The device embodiments described above are merely illustrative.
- The division into units is only a division by logical function.
- In actual implementation there may be other ways of dividing them; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- The mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device, or unit, and may be electrical, mechanical, or of another form.
- The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- The functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
- The integrated unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
- An integrated unit implemented in the form of a software functional unit can be stored in a computer-readable storage medium.
- The software functional unit is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods of the embodiments of the present invention.
- The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Devices For Executing Special Programs (AREA)
- Stored Programmes (AREA)
Abstract
Embodiments of the present invention provide a method and apparatus for optimizing compilation in profiling technology. The method consists of: obtaining at least two regions by dividing a program control flow graph, and replacing an instrumentation function included in the program corresponding to each region with a counting operation; and executing the program corresponding to each region on a main thread and, when execution of the program corresponding to a region ends, starting at least one micro-thread to perform an operation of the instrumentation function included in the program corresponding to the region other than the counting operation, so as to record information during execution of the program corresponding to the region; the micro-thread and the main thread executing the program corresponding to the next region are executed in parallel. The embodiments of the present invention improve execution efficiency during compilation in profiling technology, thereby solving the problem of low execution efficiency in the prior art.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310539297.4 | 2013-11-04 | ||
CN201310539297.4A CN104615473B (zh) | 2013-11-04 | 2013-11-04 | 轮廓技术编译的优化方法及装置 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015062369A1 true WO2015062369A1 (fr) | 2015-05-07 |
Family
ID=53003294
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2014/086593 WO2015062369A1 (fr) | 2013-11-04 | 2014-09-16 | Procédé et appareil d'optimisation de compilation dans une technologie de profilage |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN104615473B (fr) |
WO (1) | WO2015062369A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116775127A (zh) * | 2023-05-25 | 2023-09-19 | 哈尔滨工业大学 | 一种基于RetroWrite框架的静态符号执行插桩方法 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112287357B (zh) * | 2020-11-11 | 2022-08-12 | 中国科学院信息工程研究所 | 一种针对嵌入式裸机系统的控制流验证方法与系统 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002077821A2 (fr) * | 2001-03-26 | 2002-10-03 | Intel Corporation | Procede et systeme pour l'etablissement concerte de profil permettant la detection en continu de transition de phase de profil |
US20020194580A1 (en) * | 2001-06-18 | 2002-12-19 | Vinodha Ramasamy | Edge profiling for executable program code having branches through stub code segments |
US20090276766A1 (en) * | 2008-05-01 | 2009-11-05 | Yonghong Song | Runtime profitability control for speculative automatic parallelization |
CN103019852A (zh) * | 2012-11-14 | 2013-04-03 | 北京航空航天大学 | 一种适用于大规模集群的mpi并行程序负载问题三维可视化分析方法 |
CN103051509A (zh) * | 2012-08-03 | 2013-04-17 | 北京航空航天大学 | 一种基于树状架构的初始化方法 |
-
2013
- 2013-11-04 CN CN201310539297.4A patent/CN104615473B/zh active Active
-
2014
- 2014-09-16 WO PCT/CN2014/086593 patent/WO2015062369A1/fr active Application Filing
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116775127A (zh) * | 2023-05-25 | 2023-09-19 | 哈尔滨工业大学 | 一种基于RetroWrite框架的静态符号执行插桩方法 |
CN116775127B (zh) * | 2023-05-25 | 2024-05-28 | 哈尔滨工业大学 | 一种基于RetroWrite框架的静态符号执行插桩方法 |
Also Published As
Publication number | Publication date |
---|---|
CN104615473A (zh) | 2015-05-13 |
CN104615473B (zh) | 2017-11-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6761878B2 (ja) | コンピューティングシステムによって実行されるタスクの制御 | |
US8296746B2 (en) | Optimum code generation method and compiler device for multiprocessor | |
US8863128B2 (en) | System and method for optimizing the evaluation of task dependency graphs | |
US8893080B2 (en) | Parallelization of dataflow actors with local state | |
CN105389158B (zh) | 数据处理系统、编译器、处理器的方法和机器可读介质 | |
JP6763072B2 (ja) | データ処理グラフのコンパイル | |
US9367428B2 (en) | Transparent performance inference of whole software layers and context-sensitive performance debugging | |
JPH01108638A (ja) | 並列化コンパイル方式 | |
US10409570B2 (en) | Feedback directed program stack optimization | |
CN104572260A (zh) | 用于实现事务内存区域提升的代码版本控制的方法和设备 | |
EP3238053A1 (fr) | Technologies pour des bibliothèques informatiques à hautes performances composables de faible niveau | |
Breß et al. | A framework for cost based optimization of hybrid CPU/GPU query plans in database systems | |
WO2015062369A1 (fr) | Procédé et appareil d'optimisation de compilation dans une technologie de profilage | |
JP2009080583A (ja) | 情報処理装置、並列処理最適化方法およびプログラム | |
US20100037214A1 (en) | Method and system for mpi_wait sinking for better computation-communication overlap in mpi applications | |
JP6179524B2 (ja) | 実行制御方法及び情報処理装置 | |
US10042645B2 (en) | Method and apparatus for compiling a program for execution by a plurality of processing units | |
Wiechmann | On improving the performance of pipe-and-filter architectures by adding support for self-adaptive task farms | |
KR101569142B1 (ko) | 프로그램의 최적화 실행을 위한 장치 및 방법 | |
Zhang et al. | Slice Partition and Optimization Compilation Algorithm for Dataflow Multi-core Processor | |
JP2006065682A (ja) | コンパイラプログラム、コンパイル方法およびコンパイラ装置 | |
KR20080094257A (ko) | 컴파일 장치 및 컴파일 방법 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14859114 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 14859114 Country of ref document: EP Kind code of ref document: A1 |