US20140075423A1 - Efficiently solving the "use-def" problem involving label variables - Google Patents
- Publication number
- US20140075423A1 (application US 13/613,211)
- Authority
- US
- United States
- Prior art keywords
- node
- inset
- predecessor
- outset
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/43—Checking; Contextual analysis
- G06F8/433—Dependency analysis; Data or control flow analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
Definitions
- This invention relates to apparatus and methods for performing data-flow analysis for computer programs.
- Data-flow analysis is a technique commonly used to determine values or properties at various points in a computer program.
- a program's control flow graph, made up of nodes and edges between nodes, may be used to determine where a particular value assigned to a variable might propagate.
- the information gathered by the data-flow analysis may be used by compilers to optimize the computer program.
- a label variable is set to some number of values in some number of basic blocks. These are referred to as the “definitions” of the label variable. In other basic blocks, the label variable is the operand of indirect branches. These are referred to as the “uses” of the label variable.
- the problem is to discover which definitions of a label variable reach which uses of the label variable.
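- As a minimal sketch of the terminology above, the following Python snippet collects the "definitions" and "uses" of a label variable from a toy instruction list. The tuple layout, the opcode names `set` and `goto_indirect`, and the block names are hypothetical and chosen only for illustration; they do not come from the patent.

```python
from collections import defaultdict

# Hypothetical toy IR: (block, opcode, label_variable, target_block_or_None).
program = [
    ("B1", "set", "LV", "B30"),             # a definition: LV := address of B30
    ("B10", "set", "LV", "B40"),            # another definition of LV
    ("B20", "goto_indirect", "LV", None),   # a use: indirect branch through LV
]

definitions = defaultdict(set)  # label variable -> {(defining block, target block)}
uses = defaultdict(set)         # label variable -> {blocks that branch through it}

for block, opcode, label_var, target in program:
    if opcode == "set":
        definitions[label_var].add((block, target))
    elif opcode == "goto_indirect":
        uses[label_var].add(block)
```

- The "use-def" question is then which entries of `definitions["LV"]` can actually be live when control arrives at each block in `uses["LV"]`.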
- a compiler may assume that all possible definitions of a label variable reach all actual uses, and reflect this in a program's control flow graph.
- the problem may then be solved using a conventional data-flow algorithm.
- conventional dataflow algorithms define the inset of a node in the control flow graph as a join operation on the outsets of all of the node's predecessors. This means that a single pass of the conventional dataflow algorithm is likely to find an opportunity to reduce the control flow graph, but exercising that opportunity will expose other opportunities which will in turn require further executions of the algorithm. This is because, in a first pass of the algorithm, a definition of some label variable may have reached a use of that label variable on what was proven by the first pass to be an implausible edge. A second pass may further reduce the control flow graph which may in turn expose additional opportunities to reduce the control flow graph. Each of these executions (i.e., passes) is expensive in time.
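- For comparison, a conventional inset computation might be sketched as follows. This is an illustrative baseline only: the `Facts` shape (label variable mapped to the set of blocks it may point to), the function names, and the block names are assumptions, not taken from the patent.

```python
from typing import Dict, Iterable, Set

Facts = Dict[str, Set[str]]  # assumed shape: label variable -> possible target blocks


def join(a: Facts, b: Facts) -> Facts:
    """Per-variable union of two fact maps (one possible join operation)."""
    out = {lv: set(targets) for lv, targets in a.items()}
    for lv, targets in b.items():
        out.setdefault(lv, set()).update(targets)
    return out


def conventional_inset(node: str,
                       preds: Dict[str, Iterable[str]],
                       outset: Dict[str, Facts]) -> Facts:
    """Join the outsets of ALL predecessors, with no reachability check."""
    result: Facts = {}
    for p in preds.get(node, ()):
        result = join(result, outset.get(p, {}))
    return result


# Hypothetical graph: B20 ends in an indirect branch on LV1 and is
# conservatively wired to both B30 and B40.
preds = {"B30": ["B20"], "B40": ["B20"]}
outset = {"B20": {"LV1": {"B40"}}}

inset_b30 = conventional_inset("B30", preds, outset)
```

- Here `inset_b30` receives the facts from B20 even though LV1 cannot point to B30 at that point, which is the kind of implausible edge that forces additional passes.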
- the invention has been developed in response to the present state of the art and, in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available apparatus and methods. Accordingly, the invention has been developed to provide apparatus and methods for efficiently solving the “use-def” problem involving label variables.
- the features and advantages of the invention will become more fully apparent from the following description and appended claims, or may be learned by practice of the invention as set forth hereinafter.
- such a method includes generating a control flow graph for a computer program, wherein the control flow graph includes a plurality of nodes and edges between the nodes.
- the method performs a data-flow analysis on the control flow graph that includes calculating an inset for each node in the control flow graph.
- the inset for each node may be calculated as follows: if a predecessor node directly branches to the node, the method includes an outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, the method includes the outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, the method does not include the outset of the predecessor node in the inset of the node.
- FIG. 1 is a high-level block diagram showing one example of a computing system in which an apparatus and method in accordance with the invention may be implemented;
- FIG. 2 is a process flow diagram showing one embodiment of a method for efficiently solving the “use-def” problem involving label variables;
- FIG. 3 is a process flow diagram of one embodiment of a method for determining whether a predecessor node can “reach” a node;
- FIG. 4 is pseudocode showing one possible way of implementing the methods of FIGS. 2 and 3;
- FIG. 5 shows a first example of how the methods of FIGS. 2 and 3 may be applied to an actual control flow graph; and
- FIG. 6 shows a second example of how the methods of FIGS. 2 and 3 may be applied to an actual control flow graph.
- the present invention may be embodied as an apparatus, system, method, or computer program product.
- the present invention may take the form of a hardware embodiment, a software embodiment (including firmware, resident software, microcode, etc.) configured to operate hardware, or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “module” or “system.”
- the present invention may take the form of a computer-usable storage medium embodied in any tangible medium of expression having computer-usable program code stored therein.
- the computer-usable or computer-readable storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a portable compact disc read-only memory (CDROM), an optical storage device, or a magnetic storage device.
- a computer-usable or computer-readable storage medium may be any medium that can contain, store, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as JavaTM, Smalltalk, C++, or the like, conventional procedural programming languages such as the “C” programming language, scripting languages such as JavaScript, or similar programming languages.
- Computer program code for implementing the invention may also be written in a low-level programming language such as assembly language.
- Embodiments of the invention may be described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus, systems, and computer program products. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer program instructions or code. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- the computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
- the computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- Referring to FIG. 1, one example of a computing system 100 is illustrated.
- the computing system 100 is presented to show one example of an environment where an apparatus and method in accordance with the invention may be implemented.
- the computing system 100 is presented only by way of example and is not intended to be limiting. Indeed, the apparatus and methods disclosed herein may be applicable to a wide variety of different computing systems in addition to the computing system 100 shown. The apparatus and methods disclosed herein may also potentially be distributed across multiple computing systems 100 .
- the computing system 100 includes at least one processor 102 and may include more than one processor 102 .
- the processor 102 may be operably connected to a memory 104 .
- the memory 104 may include one or more non-volatile storage devices such as hard drives 104 a, solid state drives 104 a, CD-ROM drives 104 a, DVD-ROM drives 104 a , tape drives 104 a, or the like.
- the memory 104 may also include non-volatile memory such as a read-only memory 104 b (e.g., ROM, EPROM, EEPROM, and/or Flash ROM) or volatile memory such as a random access memory 104 c (RAM or operational memory).
- a bus 106, or plurality of buses 106, may interconnect the processor 102, memory devices 104, and other devices to enable data and/or instructions to pass therebetween.
- the computing system 100 may include one or more ports 108 .
- Such ports 108 may be embodied as wired ports 108 (e.g., USB ports, serial ports, Firewire ports, SCSI ports, parallel ports, etc.) or wireless ports 108 (e.g., Bluetooth, IrDA, etc.).
- the ports 108 may enable communication with one or more input devices 110 (e.g., keyboards, mice, touchscreens, cameras, microphones, scanners, storage devices, etc.) and output devices 112 (e.g., displays, monitors, speakers, printers, storage devices, etc.).
- the ports 108 may also enable communication with other computing systems 100 .
- the computing system 100 includes a network adapter 114 to connect the computing system 100 to a network 116 , such as a LAN, WAN, or the Internet.
- a network 116 may enable the computing system 100 to connect to one or more servers 118 , workstations 120 , personal computers 120 , mobile computing devices, or other devices.
- the network 116 may also enable the computing system 100 to connect to another network by way of a router 122 or other device 122 .
- a router 122 may allow the computing system 100 to communicate with servers, workstations, personal computers, or other devices located on different networks.
- a process flow diagram showing one embodiment of a method 200 to efficiently solve the “use-def” problem involving label variables is illustrated. More specifically, the method 200 may be used to determine which definitions of label variables reach which uses of the label variables. Such a method 200 may be incorporated into a data-flow algorithm, such as a data-flow algorithm used by a compiler. As previously mentioned, a data-flow algorithm may analyze a computer program's control flow graph to determine values or properties at various points in the computer program. The data-flow algorithm accomplishes this by visiting each node of the control flow graph, possibly multiple times, and performing an operation at each visit.
- the operation includes calculating the aforementioned values or properties at the inset and/or outset of each node (i.e., basic block) in the control flow graph.
- the illustrated method 200 may be used to calculate the inset of the node.
- the method 200 analyzes 202 a first predecessor node p of node n.
- the method 200 determines 204 whether the predecessor node p can “reach” node n.
- a way in which the method 200 may determine 204 whether a predecessor node p reaches a node n will be described in association with FIG. 3 . If a node p can reach node n, the method 200 adds 206 the outset of node p to the inset of node n (since the inset of a node in the control flow graph is calculated as a join operation on the outsets of the node's predecessors). If node p cannot reach node n, the method 200 does not add the outset of node p to the inset of node n.
- the method determines 208 whether the node p is the last predecessor node. If not, the method 200 analyzes 210 the next predecessor node (also referred to as node p) by determining 204 whether the next predecessor node p can reach the node n and, if so, adding 206 the outset of the predecessor node p to the inset of the node n. This process continues for all of the predecessors of the node n. Once the method 200 has analyzed all of the predecessors of the node n in the described manner, the method 200 ends.
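- The loop of method 200 can be sketched in Python as shown below. This is a hedged illustration, not the patent's own code: the `Facts` shape, the function signature, and the block names are assumptions, and the reachability test of step 204 is passed in as a callable so that only the predecessor loop is shown.

```python
from typing import Callable, Dict, Iterable, Set

Facts = Dict[str, Set[str]]  # assumed shape: label variable -> possible target blocks


def compute_inset(n: str,
                  preds: Dict[str, Iterable[str]],
                  outset: Dict[str, Facts],
                  can_reach: Callable[[str, str], bool]) -> Facts:
    """Join the outsets of only those predecessors that can reach n."""
    inset: Facts = {}
    for p in preds.get(n, ()):            # steps 202 and 210: visit each predecessor
        if not can_reach(p, n):           # step 204: skip predecessors that cannot reach n
            continue
        for lv, targets in outset.get(p, {}).items():
            inset.setdefault(lv, set()).update(targets)   # step 206: add outset of p
    return inset


# Hypothetical data: B40 has two predecessors, one of which fails the test.
preds = {"B40": ["B20", "B35"]}
outset = {"B20": {"LV1": {"B40"}}, "B35": {"LV2": {"B60"}}}
blocked = {("B35", "B40")}   # stand-in for the result of the FIG. 3 test

inset_b40 = compute_inset("B40", preds, outset, lambda p, n: (p, n) not in blocked)
```

- Only the facts from B20 end up in `inset_b40`; the outset of B35 is left out because the stand-in test reports that B35 cannot reach B40.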
- a method 204 for determining whether a predecessor node p can “reach” a node n determines 302 whether a predecessor node p ends with an indirect branch (i.e., the predecessor node p indirectly branches to a node n via a label variable). If the predecessor node p does not end with an indirect branch (meaning that it ends with a direct branch), the method 204 determines 306 that node p can reach node n.
- a “direct branch” may include an unconditional direct branch (e.g., GOTO X) or a conditional direct branch (e.g., IF (A < B) GOTO X), whereas an “indirect branch” may include a branch that branches via a label variable (e.g., GOTO LV, where LV is a label variable that may be set, or “defined,” to various values).
- the method 204 determines 304 whether node n is in the definitions of the label variable that controls the indirect branch. If node n is not in the definitions, the method 204 determines 308 that node p cannot reach node n. If node n is in the definitions, the method 204 determines 306 that node p can reach node n.
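- The decision of method 204 can be sketched as a small predicate. The names `indirect_branch_var` (mapping a block that ends in an indirect branch to its controlling label variable) and `outset`, as well as the block names, are assumptions made for this illustration.

```python
from typing import Dict, Set

Facts = Dict[str, Set[str]]  # assumed shape: label variable -> possible target blocks


def can_reach(p: str,
              n: str,
              indirect_branch_var: Dict[str, str],
              outset: Dict[str, Facts]) -> bool:
    """Return True if predecessor p can reach n under the FIG. 3 rule."""
    lv = indirect_branch_var.get(p)       # step 302: does p end with an indirect branch?
    if lv is None:
        return True                       # step 306: a direct branch always reaches n
    definitions = outset.get(p, {}).get(lv, set())
    return n in definitions               # step 304: reach (306) or no reach (308)


# Hypothetical data: B20 branches indirectly through LV1, which holds only B40.
indirect_branch_var = {"B20": "LV1"}
outset = {"B10": {"LV1": {"B40"}}, "B20": {"LV1": {"B40"}}}
```

- With this data, `can_reach("B10", "B20", ...)` is true because B10 ends with a direct branch, `can_reach("B20", "B40", ...)` is true, and `can_reach("B20", "B30", ...)` is false.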
- pseudocode 400 showing one technique for implementing the methods of FIGS. 2 and 3 is illustrated.
- the pseudocode 400 includes a method called “computeInset” which computes the inset for a node n.
- the “Inset” for a node n is set to empty in line 2 .
- the method determines whether each predecessor node p can reach the node n and, where appropriate, adds the outset of the predecessor node p to the inset of the node n.
- Lines 7 through 15 show a method for determining when a predecessor node p can “reach” a node n in the current iteration (i.e., pass) of the data-flow algorithm.
- the method determines whether a predecessor node p ends with an indirect branch controlled by a label variable. If so, the method sets the variable LVx to the value of the label variable at the outset of p that controls the indirect branch. If, at line 10 , the node n is in the definitions of the label variable at the outset of node p, the method returns “true” (indicating that the node p can reach the node n).
- Otherwise, the method returns “false” (indicating that the node p cannot reach the node n). If, at line 8, the method determines that node p does not end with an indirect branch (indicating that it ends with a direct branch), the method returns “true” at line 15 (indicating that the node p can reach the node n).
- Referring to FIG. 5, a first control flow graph 500 showing an exemplary application of the methods of FIGS. 2 and 3 is illustrated.
- the control flow graph 500 includes multiple basic blocks, connected by edges.
- the uninterrupted lines indicate a direct connection between basic blocks (i.e., no intervening basic blocks), whereas the broken lines indicate a path between the basic blocks through the control flow graph, with the possible (but not the necessary) presence of one or more intervening basic blocks.
- the control flow graph 500 includes multiple basic blocks, in particular Block 1 , Block 10 , Block 20 , Block 30 , Block 40 , Block 50 , and Block 60 .
- In Block 1, a first label variable (Lv 1) is set (i.e., defined) to Block 30.
- In Block 10, the first label variable (Lv 1) is set to Block 40 and a second label variable (Lv 2) is set to Block 60.
- Block 20 indirectly branches to either Block 30 or Block 40 depending on the value (i.e., definition) assigned to the first label variable (Lv 1 ).
- Block 50 indirectly branches to Block 60 depending on the value (i.e., definition) assigned to the second label variable (Lv 2 ).
- the methods 200 , 300 described in FIGS. 2 and 3 will include the outset of Block 20 in the inset of Block 40 since Block 40 is in the definition of Lv 1 at the outset of Block 20 .
- the outset of Block 20 will not be included in the inset of Block 40 .
- the outset of Block 20 will not be included in the inset of Block 30 since Lv 1 will not have a definition that includes Block 30 when it reaches Block 20 .
- Once Block 20 has received the definitions for Lv 1 and Lv 2, these definitions will be passed to Block 50 by way of Block 40.
- the outset of Block 50 will be included in the inset of Block 60 since Block 60 will be in the definition of Lv 2 at the outset of Block 50 .
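- The FIG. 5 walk-through can be reproduced with a small fixed-point solver, sketched below. Several things here are assumptions rather than statements from the patent: the edge list is inferred from the prose (the figure itself is not reproduced), intervening blocks on the broken-line paths are omitted, an assignment to a label variable is assumed to replace its earlier definitions, and the simple round-robin iteration is just one possible visiting order.

```python
from typing import Dict, List, Set

Facts = Dict[str, Set[int]]  # assumed shape: label variable -> possible target blocks

# Predecessor lists assumed from the FIG. 5 description.
PREDS: Dict[int, List[int]] = {1: [], 10: [1], 20: [10], 30: [20], 40: [20], 50: [40], 60: [50]}
INDIRECT = {20: "Lv1", 50: "Lv2"}                          # blocks ending in an indirect branch
GEN = {1: {"Lv1": {30}}, 10: {"Lv1": {40}, "Lv2": {60}}}   # definitions made in each block


def can_reach(p: int, n: int, outset: Dict[int, Facts]) -> bool:
    lv = INDIRECT.get(p)
    return lv is None or n in outset[p].get(lv, set())


def compute_inset(n: int, outset: Dict[int, Facts]) -> Facts:
    inset: Facts = {}
    for p in PREDS[n]:
        if can_reach(p, n, outset):
            for lv, targets in outset[p].items():
                inset.setdefault(lv, set()).update(targets)
    return inset


def transfer(n: int, inset: Facts) -> Facts:
    out = {lv: set(targets) for lv, targets in inset.items()}
    for lv, targets in GEN.get(n, {}).items():
        out[lv] = set(targets)   # assumed: an assignment replaces earlier definitions
    return out


def solve() -> Dict[int, Facts]:
    outset: Dict[int, Facts] = {n: {} for n in PREDS}
    changed = True
    while changed:               # visit every node until nothing changes
        changed = False
        for n in sorted(PREDS):
            new_out = transfer(n, compute_inset(n, outset))
            if new_out != outset[n]:
                outset[n] = new_out
                changed = True
    return outset


outset = solve()
reached = {(p, n) for n in PREDS for p in PREDS[n] if can_reach(p, n, outset)}
```

- Under these assumptions, `reached` contains the edges (20, 40) and (50, 60) but not (20, 30), matching the description of FIG. 5.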
- Referring to FIG. 6, a second control flow graph 600 showing an exemplary application of the methods of FIGS. 2 and 3 is illustrated.
- the control flow graph 600 is substantially identical to the control flow graph 500 illustrated in FIG. 5 , except that the first label variable (Lv 1 ) is not defined in Block 10 .
- the outset of Block 20 will be included in the inset of Block 30 since Block 30 is in the definition of Lv 1 at the outset of Block 20 .
- the outset of Block 20 will not be included in the inset of Block 40 since Block 40 is not in the definition of Lv 1 .
- Block 40 will never receive the definition for the second label variable (Lv 2 ) and thus this definition will never be passed on to Block 50 .
- the outset of Block 50 will never be included in the inset of Block 60 since Block 50 will never be able to reach Block 60 .
- the forward edge from Block 50 to Block 60 may be safely deleted from the control flow graph 600 .
- the methods 200 , 300 may be used to prune and simplify a control flow graph to make data-flow analysis more efficient.
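- One way to picture the pruning step is the sketch below, which filters the edge list of FIG. 6 using outsets written out by hand from the narrative above. The edge list, the outsets, and the policy of dropping every indirect edge whose target is missing from the definitions are assumptions for illustration; the text only states explicitly that the edge from Block 50 to Block 60 may be deleted.

```python
from typing import Dict, List, Set, Tuple

Facts = Dict[str, Set[int]]  # assumed shape: label variable -> possible target blocks


def prune_indirect_edges(edges: List[Tuple[int, int]],
                         indirect: Dict[int, str],
                         outset: Dict[int, Facts]) -> List[Tuple[int, int]]:
    """Keep direct edges, and indirect edges whose target is a known definition."""
    kept = []
    for p, n in edges:
        lv = indirect.get(p)
        if lv is None or n in outset.get(p, {}).get(lv, set()):
            kept.append((p, n))
    return kept


# Edges assumed from the FIG. 6 description (same shape as FIG. 5).
edges = [(1, 10), (10, 20), (20, 30), (20, 40), (40, 50), (50, 60)]
indirect = {20: "Lv1", 50: "Lv2"}

# Outsets taken from the narrative: Lv1 still points to Block 30 at Block 20,
# and the definition of Lv2 never arrives at Block 50.
outset = {20: {"Lv1": {30}, "Lv2": {60}}, 50: {}}

kept = prune_indirect_edges(edges, indirect, outset)
```

- In this sketch the edge (50, 60) is dropped and the edge (20, 30) is kept; the direct edge (40, 50) survives because only indirect edges are tested.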
- the methods 200 , 300 may be easily adapted for a backward data-flow analysis.
- the outset of a node may be computed based on the insets of all the node's successors.
- a method in accordance with the invention may determine whether a successor node s can “reach” a node n (which may include determining whether a node n is in the definition of a label variable at the inset of node s). Stated otherwise, node n “takes up” from a successor s if and only if node n reaches node s. Such a determination may be made from a second contemporaneous forward analysis performed along with the backward analysis. That is, certain algorithms may try to solve forward and backward data flow problems at the same time.
- the method may add the inset of node s to the outset of node n. If node s cannot “reach” node n, the method does not add the inset of node s to the outset of node n.
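- A backward variant might look like the following sketch, which reads the passage above as "node n takes up the inset of successor s only if n reaches s". The fact representation (a plain set of names), the function signature, and the `reaches` callable, which stands in for the result of the contemporaneous forward analysis, are all assumptions.

```python
from typing import Callable, Dict, Iterable, Set


def compute_outset(n: str,
                   succs: Dict[str, Iterable[str]],
                   inset: Dict[str, Set[str]],
                   reaches: Callable[[str, str], bool]) -> Set[str]:
    """Join the insets of only those successors that n actually reaches."""
    outset: Set[str] = set()
    for s in succs.get(n, ()):
        if reaches(n, s):                 # answered by the forward analysis
            outset |= inset.get(s, set())
    return outset


# Hypothetical data: B20 has two successors, but only B40 is reached.
succs = {"B20": ["B30", "B40"]}
inset = {"B30": {"x"}, "B40": {"y"}}
forward_reaches = {("B20", "B40")}

outset_b20 = compute_outset("B20", succs, inset, lambda a, b: (a, b) in forward_reaches)
```

- Here `outset_b20` contains only the facts from B40, mirroring the forward rule with predecessors and successors exchanged.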
- the methods 200 , 300 of FIGS. 2 and 3 as well as the claims describing such, are intended to encompass both backward and forward data-flow algorithms under the doctrine of equivalents.
- each block in the block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
- the functions discussed in association with a block may occur in a different order than discussed. For example, two functions occurring in succession may, in fact, be implemented in the reverse order, depending upon the functionality involved.
- each block of the block diagrams, and combinations of blocks in the block diagrams may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Stored Programmes (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
A method for efficiently solving the “use-def” problem involving label variables performs a data-flow analysis on a control flow graph that includes calculating an inset for each node as follows: if a predecessor node directly branches to the node, the method includes an outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, the method includes the outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, the method does not include the outset of the predecessor node in the inset of the node.
Description
- 1. Field of the Invention
- This invention relates to apparatus and methods for performing data-flow analysis for computer programs.
- 2. Background of the Invention
- Data-flow analysis is a technique commonly used to determine values or properties at various points in a computer program. When performing a data-flow analysis, a program's control flow graph, made up of nodes and edges between nodes, may be used to determine where a particular value assigned to a variable might propagate. The information gathered by the data-flow analysis may be used by compilers to optimize the computer program.
- One particular problem a data-flow analysis may attempt to solve is a variation of the “use-def” problem involving label variables. In this problem, a label variable is set to some number of values in some number of basic blocks. These are referred to as the “definitions” of the label variable. In other basic blocks, the label variable is the operand of indirect branches. These are referred to as the “uses” of the label variable. The problem is to discover which definitions of a label variable reach which uses of the label variable.
- Initially, a compiler may assume that all possible definitions of a label variable reach all actual uses, and reflect this in a program's control flow graph. The problem may then be solved using a conventional data-flow algorithm. However, conventional dataflow algorithms define the inset of a node in the control flow graph as a join operation on the outsets of all of the node's predecessors. This means that a single pass of the conventional dataflow algorithm is likely to find an opportunity to reduce the control flow graph, but exercising that opportunity will expose other opportunities which will in turn require further executions of the algorithm. This is because, in a first pass of the algorithm, a definition of some label variable may have reached a use of that label variable on what was proven by the first pass to be an implausible edge. A second pass may further reduce the control flow graph which may in turn expose additional opportunities to reduce the control flow graph. Each of these executions (i.e., passes) is expensive in time.
- In view of the foregoing, what are needed are apparatus and methods to more efficiently solve the “use-def” problem involving label variables.
- The invention has been developed in response to the present state of the art and, in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available apparatus and methods. Accordingly, the invention has been developed to provide apparatus and methods for efficiently solving the “use-def” problem involving label variables. The features and advantages of the invention will become more fully apparent from the following description and appended claims, or may be learned by practice of the invention as set forth hereinafter.
- Consistent with the foregoing, a method for efficiently solving the “use-def” problem involving label variables is disclosed herein. In one embodiment, such a method includes generating a control flow graph for a computer program, wherein the control flow graph includes a plurality of nodes and edges between the nodes. The method performs a data-flow analysis on the control flow graph that includes calculating an inset for each node in the control flow graph. The inset for each node may be calculated as follows: if a predecessor node directly branches to the node, the method includes an outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, the method includes the outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, the method does not include the outset of the predecessor node in the inset of the node.
- A corresponding apparatus and computer program product are also disclosed and claimed herein.
- In order that the advantages of the invention will be readily understood, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered limiting of its scope, the invention will be described and explained with additional specificity and detail through use of the accompanying drawings, in which:
- FIG. 1 is a high-level block diagram showing one example of a computing system in which an apparatus and method in accordance with the invention may be implemented;
- FIG. 2 is a process flow diagram showing one embodiment of a method for efficiently solving the “use-def” problem involving label variables;
- FIG. 3 is a process flow diagram of one embodiment of a method for determining whether a predecessor node can “reach” a node;
- FIG. 4 is pseudocode showing one possible way of implementing the methods of FIGS. 2 and 3;
- FIG. 5 shows a first example of how the methods of FIGS. 2 and 3 may be applied to an actual control flow graph; and
- FIG. 6 shows a second example of how the methods of FIGS. 2 and 3 may be applied to an actual control flow graph.
- It will be readily understood that the components of the present invention, as generally described and illustrated in the Figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the invention, as represented in the Figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of certain examples of presently contemplated embodiments in accordance with the invention. The presently described embodiments will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout.
- As will be appreciated by one skilled in the art, the present invention may be embodied as an apparatus, system, method, or computer program product. Furthermore, the present invention may take the form of a hardware embodiment, a software embodiment (including firmware, resident software, microcode, etc.) configured to operate hardware, or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “module” or “system.” Furthermore, the present invention may take the form of a computer-usable storage medium embodied in any tangible medium of expression having computer-usable program code stored therein.
- Any combination of one or more computer-usable or computer-readable storage medium(s) may be utilized to store the computer program product. The computer-usable or computer-readable storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a portable compact disc read-only memory (CDROM), an optical storage device, or a magnetic storage device. In the context of this document, a computer-usable or computer-readable storage medium may be any medium that can contain, store, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java™, Smalltalk, C++, or the like, conventional procedural programming languages such as the “C” programming language, scripting languages such as JavaScript, or similar programming languages. Computer program code for implementing the invention may also be written in a low-level programming language such as assembly language.
- Embodiments of the invention may be described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus, systems, and computer program products. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer program instructions or code. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- The computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- Referring to FIG. 1, one example of a computing system 100 is illustrated. The computing system 100 is presented to show one example of an environment where an apparatus and method in accordance with the invention may be implemented. The computing system 100 is presented only by way of example and is not intended to be limiting. Indeed, the apparatus and methods disclosed herein may be applicable to a wide variety of different computing systems in addition to the computing system 100 shown. The apparatus and methods disclosed herein may also potentially be distributed across multiple computing systems 100.
- As shown, the computing system 100 includes at least one processor 102 and may include more than one processor 102. The processor 102 may be operably connected to a memory 104. The memory 104 may include one or more non-volatile storage devices such as hard drives 104a, solid state drives 104a, CD-ROM drives 104a, DVD-ROM drives 104a, tape drives 104a, or the like. The memory 104 may also include non-volatile memory such as a read-only memory 104b (e.g., ROM, EPROM, EEPROM, and/or Flash ROM) or volatile memory such as a random access memory 104c (RAM or operational memory). A bus 106, or plurality of buses 106, may interconnect the processor 102, memory devices 104, and other devices to enable data and/or instructions to pass therebetween.
- To enable communication with external systems or devices, the computing system 100 may include one or more ports 108. Such ports 108 may be embodied as wired ports 108 (e.g., USB ports, serial ports, Firewire ports, SCSI ports, parallel ports, etc.) or wireless ports 108 (e.g., Bluetooth, IrDA, etc.). The ports 108 may enable communication with one or more input devices 110 (e.g., keyboards, mice, touchscreens, cameras, microphones, scanners, storage devices, etc.) and output devices 112 (e.g., displays, monitors, speakers, printers, storage devices, etc.). The ports 108 may also enable communication with other computing systems 100.
- In certain embodiments, the computing system 100 includes a network adapter 114 to connect the computing system 100 to a network 116, such as a LAN, WAN, or the Internet. Such a network 116 may enable the computing system 100 to connect to one or more servers 118, workstations 120, personal computers 120, mobile computing devices, or other devices. The network 116 may also enable the computing system 100 to connect to another network by way of a router 122 or other device 122. Such a router 122 may allow the computing system 100 to communicate with servers, workstations, personal computers, or other devices located on different networks.
- As shown in
FIG. 2, a process flow diagram showing one embodiment of a method 200 to efficiently solve the “use-def” problem involving label variables is illustrated. More specifically, the method 200 may be used to determine which definitions of label variables reach which uses of the label variables. Such a method 200 may be incorporated into a data-flow algorithm, such as a data-flow algorithm used by a compiler. As previously mentioned, a data-flow algorithm may analyze a computer program's control flow graph to determine values or properties at various points in the computer program. The data-flow algorithm accomplishes this by visiting each node of the control flow graph, possibly multiple times, and performing an operation at each visit. The operation includes calculating the aforementioned values or properties at the inset and/or outset of each node (i.e., basic block) in the control flow graph. Each time a node is visited by the data-flow algorithm, the illustrated method 200 may be used to calculate the inset of the node.
- As shown, in order to calculate the inset for a node n in the control flow graph, the method 200 analyzes 202 a first predecessor node p of node n. The method 200 determines 204 whether the predecessor node p can “reach” node n. A way in which the method 200 may determine 204 whether a predecessor node p reaches a node n will be described in association with FIG. 3. If node p can reach node n, the method 200 adds 206 the outset of node p to the inset of node n (since the inset of a node in the control flow graph is calculated as a join operation on the outsets of the node's predecessors). If node p cannot reach node n, the method 200 does not add the outset of node p to the inset of node n.
- The method then determines 208 whether the node p is the last predecessor node. If not, the method 200 analyzes 210 the next predecessor node (also referred to as node p) by determining 204 whether the next predecessor node p can reach the node n and, if so, adding 206 the outset of the predecessor node p to the inset of the node n. This process continues for all of the predecessors of the node n. Once the method 200 has analyzed all of the predecessors of the node n in the described manner, the method 200 ends.
- Referring to
FIG. 3, one embodiment of a method 204 for determining whether a predecessor node p can “reach” a node n is illustrated. As shown, the method 204 determines 302 whether a predecessor node p ends with an indirect branch (i.e., the predecessor node p indirectly branches to a node n via a label variable). If the predecessor node p does not end with an indirect branch (meaning that it ends with a direct branch), the method 204 determines 306 that node p can reach node n. For the purposes of this disclosure, a “direct branch” may include an unconditional direct branch (e.g., GOTO X) or a conditional direct branch (e.g., IF (A<B) GOTO X), whereas an “indirect branch” may include a branch that branches via a label variable (e.g., GOTO LV, where LV is a label variable that may be set, or “defined,” to various values).
- If the predecessor node p ends with an indirect branch, the method 204 determines 304 whether node n is in the definitions of the label variable that controls the indirect branch. If node n is not in the definitions, the method 204 determines 308 that node p cannot reach node n. If node n is in the definitions, the method 204 determines 306 that node p can reach node n.
- Referring to
FIG. 4, pseudocode 400 showing one technique for implementing the methods of FIGS. 2 and 3 is illustrated. As shown, the pseudocode 400 includes a method called “computeInset” which computes the inset for a node n. Initially, the “Inset” for a node n is set to empty in line 2. In lines 3 through 5, the method determines whether each predecessor node p can reach the node n and, where appropriate, adds the outset of the predecessor node p to the inset of the node n.
- Lines 7 through 15 show a method for determining when a predecessor node p can “reach” a node n in the current iteration (i.e., pass) of the data-flow algorithm. As shown, in line 8, the method determines whether a predecessor node p ends with an indirect branch controlled by a label variable. If so, the method sets the variable LVx to the value of the label variable at the outset of p that controls the indirect branch. If, at line 10, the node n is in the definitions of the label variable at the outset of node p, the method returns “true” (indicating that the node p can reach the node n). Otherwise, the method returns “false” (indicating that the node p cannot reach the node n). If, at line 8, the method determines that node p does not end with an indirect branch (indicating that it ends with a direct branch), the method returns “true” at line 15 (indicating that the node p can reach the node n).
- Referring to
FIG. 5, a first control flow graph 500 showing an exemplary application of the methods of FIGS. 2 and 3 is illustrated. As shown, the control flow graph 500 includes multiple basic blocks, connected by edges. The uninterrupted lines indicate a direct connection between basic blocks (i.e., no intervening basic blocks), whereas the broken lines indicate a path between the basic blocks through the control flow graph, with the possible (but not the necessary) presence of one or more intervening basic blocks.
- As shown, the control flow graph 500 includes multiple basic blocks, in particular Block 1, Block 10, Block 20, Block 30, Block 40, Block 50, and Block 60. To perform the methods 200, 300 illustrated in FIGS. 2 and 3, definitions and uses of specific label variables may be identified and noted in the control flow graph 500. As shown in FIG. 5, in Block 1, a first label variable (Lv1) is set (i.e., defined) to Block 30. In Block 10, the first label variable (Lv1) is set to Block 40 and a second label variable (Lv2) is set to Block 60. As further shown in FIG. 5, Block 20 indirectly branches to either Block 30 or Block 40 depending on the value (i.e., definition) assigned to the first label variable (Lv1). Similarly, Block 50 indirectly branches to Block 60 depending on the value (i.e., definition) assigned to the second label variable (Lv2).
- The first time a data-flow algorithm visits Block 20, it is possible that neither of the definitions of the first label variable (Lv1), as defined in Blocks 1 and 10, has reached Block 20. Eventually, the definition of the first label variable (Lv1) as established in Block 10 will reach Block 20. The definition of Lv1 in Block 1 will never reach Block 20 since the definition in Block 1 is overwritten in Block 10.
- Assuming that the definition of the first label variable (Lv1) as established in Block 10 has reached Block 20, the methods 200, 300 described in FIGS. 2 and 3 will include the outset of Block 20 in the inset of Block 40 since Block 40 is in the definition of Lv1 at the outset of Block 20. On the other hand, if the definition of Lv1 has not yet reached Block 20, the outset of Block 20 will not be included in the inset of Block 40. Similarly, the outset of Block 20 will not be included in the inset of Block 30 since Lv1 will not have a definition that includes Block 30 when it reaches Block 20.
- If the definitions for Lv1 and Lv2 have not been passed to Block 20, these definitions would likewise not be passed to Block 40. If such is the case, Block 50 would not receive the definition for Lv2 and the outset of Block 50 would not be included in the inset of Block 60.
- However, if Block 20 has received the definitions for Lv1 and Lv2, these definitions will be passed to Block 50 by way of Block 40. In such a case, the outset of Block 50 will be included in the inset of Block 60 since Block 60 will be in the definition of Lv2 at the outset of Block 50.
- Referring to
FIG. 6, a second control flow graph 600 showing an exemplary application of the methods of FIGS. 2 and 3 is illustrated. The control flow graph 600 is substantially identical to the control flow graph 500 illustrated in FIG. 5, except that the first label variable (Lv1) is not defined in Block 10. In this scenario, assuming that the definition of the first label variable (Lv1) has reached Block 20 from Block 1, the outset of Block 20 will be included in the inset of Block 30 since Block 30 is in the definition of Lv1 at the outset of Block 20. By the same token, the outset of Block 20 will not be included in the inset of Block 40 since Block 40 is not in the definition of Lv1.
- Because the outset of Block 20 will never be included in the inset of Block 40 (since Block 40 will never be in the definition of Lv1 at the outset of Block 20), Block 40 will never receive the definition for the second label variable (Lv2) and thus this definition will never be passed on to Block 50. As a result, the outset of Block 50 will never be included in the inset of Block 60 since Block 50 will never be able to reach Block 60. In such a case, the forward edge from Block 50 to Block 60 may be safely deleted from the control flow graph 600. In this way, the methods 200, 300 may be used to prune and simplify a control flow graph to make data-flow analysis more efficient.
- It should be noted that the usual rules regarding the monotonicity of the solution lattice are not violated by the construction of the inset as described in the methods 200, 300. This is because as iterations of the data-flow algorithm proceed, the set of definitions of a label variable at any given use point monotonically increases. That is, once a node is in the set of definitions of a controlling label variable, it stays in that set.
- Although the methods 200, 300 have been described herein in association with a forward data-flow analysis, the methods 200, 300 may be easily adapted for a backward data-flow analysis. In a backward data-flow analysis, the outset of a node may be computed based on the insets of all the node's successors. In such a case, a method in accordance with the invention may determine whether a successor node s can “reach” a node n (which may include determining whether a node n is in the definition of a label variable at the inset of node s). Stated otherwise, node n “takes up” from a successor s if and only if node n reaches node s. Such a determination may be made from a second contemporaneous forward analysis performed along with the backward analysis. That is, certain algorithms may try to solve forward and backward data flow problems at the same time.
- If node s can “reach” node n, the method may add the inset of node s to the outset of node n. If node s cannot “reach” node n, the method does not add the inset of node s to the outset of node n. For the purposes of this description, the methods 200, 300 of FIGS. 2 and 3, as well as the claims describing such, are intended to encompass both backward and forward data-flow algorithms under the doctrine of equivalents.
- The block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer-usable storage media according to various embodiments of the present invention. In this regard, each block in the block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions discussed in association with a block may occur in a different order than discussed. For example, two functions occurring in succession may, in fact, be implemented in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams, and combinations of blocks in the block diagrams, may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
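- The inset computation of FIGS. 2 and 3 and the scenarios of FIGS. 5 and 6 can be sketched in Python as follows. This is an illustrative sketch only, not the pseudocode 400 of FIG. 4. The names `Node`, `gen`, `branch_var`, `transfer`, `solve`, and `build` are hypothetical, the data-flow facts are assumed to be (label variable, target block) pairs, the broken-line paths of the figures are modeled as direct edges, and the round-robin iteration to a fixed point is an assumed driver.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Fact = Tuple[str, str]  # (label variable, target block name); assumed representation


@dataclass
class Node:
    name: str
    gen: Dict[str, str] = field(default_factory=dict)  # label variable -> block it is set to here
    branch_var: Optional[str] = None  # set when the block ends with GOTO <label variable>
    preds: List["Node"] = field(default_factory=list)
    outset: Set[Fact] = field(default_factory=set)


def reach(p: Node, n: Node) -> bool:
    """FIG. 3: a direct branch always reaches; an indirect branch reaches n
    only if n is among the definitions of the controlling label variable
    at the outset of p."""
    if p.branch_var is None:
        return True
    return (p.branch_var, n.name) in p.outset


def compute_inset(n: Node) -> Set[Fact]:
    """FIG. 2: join (here, set union) of the outsets of predecessors that reach n."""
    inset: Set[Fact] = set()
    for p in n.preds:
        if reach(p, n):
            inset |= p.outset
    return inset


def transfer(n: Node, inset: Set[Fact]) -> Set[Fact]:
    """Assumed reaching-definitions transfer: a definition in n overwrites
    incoming definitions of the same label variable."""
    survivors = {(lv, tgt) for (lv, tgt) in inset if lv not in n.gen}
    return survivors | set(n.gen.items())


def solve(nodes: List[Node]) -> Dict[str, Set[Fact]]:
    """Visit every node repeatedly until no outset changes; return the insets."""
    insets: Dict[str, Set[Fact]] = {}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            insets[n.name] = compute_inset(n)
            new_out = transfer(n, insets[n.name])
            if new_out != n.outset:
                n.outset = new_out
                changed = True
    return insets


def build(block10_defs: Dict[str, str]) -> List[Node]:
    """Blocks of FIGS. 5 and 6; only the definitions in Block 10 differ."""
    b = {k: Node(k) for k in ("1", "10", "20", "30", "40", "50", "60")}
    b["1"].gen = {"Lv1": "30"}
    b["10"].gen = dict(block10_defs)
    b["20"].branch_var = "Lv1"
    b["50"].branch_var = "Lv2"
    edges = [("1", "10"), ("10", "20"), ("20", "30"),
             ("20", "40"), ("40", "50"), ("50", "60")]
    for src, dst in edges:
        b[dst].preds.append(b[src])
    return list(b.values())


fig5 = solve(build({"Lv1": "40", "Lv2": "60"}))  # Block 10 redefines Lv1
fig6 = solve(build({"Lv2": "60"}))               # Block 10 leaves Lv1 alone
```

- Under these assumptions, `fig5["30"]` is empty while `fig5["60"]` receives the definitions passed through Blocks 40 and 50. In `fig6`, the insets of Block 40 and Block 60 stay empty, which corresponds to the edge from Block 50 to Block 60 being removable.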
Claims (14)
1-7. (canceled)
8. A computer program product for iteratively altering the composition of an inset in a forward dataflow pass, the computer program product comprising a computer-readable storage medium having computer-usable program code embodied therein, the computer-usable program code comprising:
computer-usable program code to generate a control flow graph for a computer program, the control flow graph comprising a plurality of nodes and edges between the nodes;
computer-usable program code to perform a data-flow analysis on the control flow graph by calculating an inset of each node in the control flow graph, wherein calculating the inset of a node comprises performing the following:
if a predecessor node directly branches to the node, including an outset of the predecessor node in the inset of the node;
if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, including the outset of the predecessor node in the inset of the node; and
if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, not including the outset of the predecessor node in the inset of the node.
9. The computer program product of claim 8, wherein the predecessor node that directly branches to the node does so unconditionally.
10. The computer program product of claim 8, wherein the predecessor node that directly branches to the node does so conditionally.
11. The computer program product of claim 8, wherein performing the data-flow analysis comprises performing a forward data-flow analysis.
12. The computer program product of claim 8, wherein performing the data-flow analysis comprises performing the data-flow analysis in a single pass over the control flow graph.
13. The computer program product of claim 8, wherein performing the data-flow analysis comprises performing the data-flow analysis in multiple passes over the control flow graph.
14. The computer program product of claim 8, further comprising computer-usable program code to, prior to performing the data-flow analysis, identify definitions of label variables and uses of the label variables in the computer program.
15. An apparatus for iteratively altering the composition of an inset in a forward dataflow pass, the apparatus comprising:
at least one processor;
at least one memory device coupled to the at least one processor and storing computer instructions for execution on the at least one processor, the computer instructions enabling the at least one processor to:
generate a control flow graph for a computer program, the control flow graph comprising a plurality of nodes and edges between the nodes;
perform a data-flow analysis on the control flow graph by calculating an inset of each node in the control flow graph, wherein calculating the inset of a node comprises performing the following:
if a predecessor node directly branches to the node, including an outset of the predecessor node in the inset of the node;
if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, including the outset of the predecessor node in the inset of the node; and
if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, not including the outset of the predecessor node in the inset of the node.
16. The apparatus of claim 15, wherein the predecessor node that directly branches to the node does so unconditionally.
17. The apparatus of claim 15, wherein the predecessor node that directly branches to the node does so conditionally.
18. The apparatus of claim 15, wherein performing the data-flow analysis comprises performing a forward data-flow analysis.
19. The apparatus of claim 15, wherein performing the data-flow analysis comprises performing the data-flow analysis in a single pass over the control flow graph.
20. The apparatus of claim 15, wherein the computer instructions further enable the at least one processor to, prior to performing the data-flow analysis, identify definitions of label variables and uses of the label variables in the computer program.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/613,211 US20140075423A1 (en) | 2012-09-13 | 2012-09-13 | Efficiently solving the "use-def" problem involving label variables |
US13/844,307 US8839217B2 (en) | 2012-09-13 | 2013-03-15 | Efficiently solving the “use-def” problem involving label variables |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/613,211 US20140075423A1 (en) | 2012-09-13 | 2012-09-13 | Efficiently solving the "use-def" problem involving label variables |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/844,307 Continuation US8839217B2 (en) | 2012-09-13 | 2013-03-15 | Efficiently solving the “use-def” problem involving label variables |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140075423A1 (en) | 2014-03-13 |
Family
ID=50234752
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/613,211 Abandoned US20140075423A1 (en) | 2012-09-13 | 2012-09-13 | Efficiently solving the "use-def" problem involving label variables |
US13/844,307 Expired - Fee Related US8839217B2 (en) | 2012-09-13 | 2013-03-15 | Efficiently solving the “use-def” problem involving label variables |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/844,307 Expired - Fee Related US8839217B2 (en) | 2012-09-13 | 2013-03-15 | Efficiently solving the “use-def” problem involving label variables |
Country Status (1)
Country | Link |
---|---|
US (2) | US20140075423A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090217248A1 (en) * | 2008-02-12 | 2009-08-27 | Bently William G | Systems and methods for information flow analysis |
US20150248279A1 (en) * | 2013-06-24 | 2015-09-03 | International Business Machines Corporation | Extracting stream graph structure in a computer language by pre-executing a deterministic subset |
US9411564B2 (en) | 2010-08-30 | 2016-08-09 | International Business Machines Corporation | Extraction of functional semantics and isolated dataflow from imperative object oriented languages |
CN108958902A (en) * | 2017-05-25 | 2018-12-07 | 阿里巴巴集团控股有限公司 | Figure calculation method and system |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377993B (en) * | 2019-07-09 | 2020-06-12 | 长江勘测规划设计研究有限责任公司 | Agile configuration method for multi-combination regulation and control calculation of over-standard flood |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6308321B1 (en) | 1998-12-11 | 2001-10-23 | Incert Software Corporation | Method for determining program control flow |
US6922830B1 (en) | 2000-03-10 | 2005-07-26 | International Business Machines Corporation | Skip list data storage during compilation |
US6751792B1 (en) | 2000-10-04 | 2004-06-15 | Sun Microsystems, Inc. | Using value-expression graphs for data-flow optimizations |
JP3956113B2 (en) | 2002-06-13 | 2007-08-08 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Data processing apparatus and program |
US8255891B2 (en) | 2005-12-30 | 2012-08-28 | Intel Corporation | Computer-implemented method and system for improved data flow analysis and optimization |
CA2675680C (en) * | 2009-08-27 | 2013-05-14 | Ibm Canada Limited - Ibm Canada Limitee | Generating object code that uses calculated contents for a variable determined from a predicate |
CA2691851A1 (en) * | 2010-02-04 | 2011-08-04 | Ibm Canada Limited - Ibm Canada Limitee | Control flow analysis using deductive reaching definitions |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090217248A1 (en) * | 2008-02-12 | 2009-08-27 | Bently William G | Systems and methods for information flow analysis |
US9043774B2 (en) * | 2008-02-12 | 2015-05-26 | William G. Bently | Systems and methods for information flow analysis |
US9411564B2 (en) | 2010-08-30 | 2016-08-09 | International Business Machines Corporation | Extraction of functional semantics and isolated dataflow from imperative object oriented languages |
US9424010B2 (en) | 2010-08-30 | 2016-08-23 | International Business Machines Corporation | Extraction of functional semantics and isolated dataflow from imperative object oriented languages |
US20150248279A1 (en) * | 2013-06-24 | 2015-09-03 | International Business Machines Corporation | Extracting stream graph structure in a computer language by pre-executing a deterministic subset |
US9454350B2 (en) * | 2013-06-24 | 2016-09-27 | International Business Machines Corporation | Extracting stream graph structure in a computer language by pre-executing a deterministic subset |
CN108958902A (en) * | 2017-05-25 | 2018-12-07 | 阿里巴巴集团控股有限公司 | Figure calculation method and system |
Also Published As
Publication number | Publication date |
---|---|
US8839217B2 (en) | 2014-09-16 |
US20140075424A1 (en) | 2014-03-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STCB | Information on status: application discontinuation | Free format text: EXPRESSLY ABANDONED -- DURING PUBLICATION PROCESS |