US20140075423A1 - Efficiently solving the "use-def" problem involving label variables - Google Patents

Efficiently solving the "use-def" problem involving label variables

Info

Publication number
US20140075423A1
Authority
US
United States
Prior art keywords
node
inset
predecessor
outset
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/613,211
Inventor
Allan H. Kielstra
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US13/613,211
Priority to US13/844,307 (US8839217B2)
Publication of US20140075423A1
Current legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/43 Checking; Contextual analysis
    • G06F8/433 Dependency analysis; Data or control flow analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation

Definitions

  • This invention relates to apparatus and methods for performing data-flow analysis for computer programs.
  • Data-flow analysis is a technique commonly used to determine values or properties at various points in a computer program.
  • When performing a data-flow analysis, a program's control flow graph, made up of nodes and edges between nodes, may be used to determine where a particular value assigned to a variable might propagate.
  • the information gathered by the data-flow analysis may be used by compilers to optimize the computer program.
  • a label variable is set to some number of values in some number of basic blocks. These are referred to as the “definitions” of the label variable. In other basic blocks, the label variable is the operand of indirect branches. These are referred to as the “uses” of the label variable.
  • the problem is to discover which definitions of a label variable reach which uses of the label variable.
  • a compiler may assume that all possible definitions of a label variable reach all actual uses, and reflect this in a program's control flow graph.
  • the problem may then be solved using a conventional data-flow algorithm.
  • conventional dataflow algorithms define the inset of a node in the control flow graph as a join operation on the outsets of all of the node's predecessors. This means that a single pass of the conventional dataflow algorithm is likely to find an opportunity to reduce the control flow graph, but exercising that opportunity will expose other opportunities which will in turn require further executions of the algorithm. This is because, in a first pass of the algorithm, a definition of some label variable may have reached a use of that label variable on what was proven by the first pass to be an implausible edge. A second pass may further reduce the control flow graph which may in turn expose additional opportunities to reduce the control flow graph. Each of these executions (i.e., passes) is expensive in time.
  • the invention has been developed in response to the present state of the art and, in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available apparatus and methods. Accordingly, the invention has been developed to provide apparatus and methods for efficiently solving the “use-def” problem involving label variables.
  • the features and advantages of the invention will become more fully apparent from the following description and appended claims, or may be learned by practice of the invention as set forth hereinafter.
  • such a method includes generating a control flow graph for a computer program, wherein the control flow graph includes a plurality of nodes and edges between the nodes.
  • the method performs a data-flow analysis on the control flow graph that includes calculating an inset for each node in the control flow graph.
  • the inset for each node may be calculated as follows: if a predecessor node directly branches to the node, the method includes an outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, the method includes the outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, the method does not include the outset of the predecessor node in the inset of the node.
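  • The three cases above can be written compactly and set beside the conventional rule. The notation below is an assumption introduced for illustration (it does not appear in the source): IN and OUT denote the inset and outset, pred(n) the predecessors of n, the square-cup symbol the join operation, and defs the set of blocks that a label variable LV may name in a given outset.

```latex
\[
\mathrm{IN}_{\text{conventional}}(n) = \bigsqcup_{p \in \mathrm{pred}(n)} \mathrm{OUT}(p)
\qquad\text{versus}\qquad
\mathrm{IN}(n) = \bigsqcup_{\substack{p \in \mathrm{pred}(n) \\ \mathrm{reach}(p,\,n)}} \mathrm{OUT}(p)
\]
\[
\mathrm{reach}(p, n) =
\begin{cases}
\text{true} & \text{if } p \text{ branches directly to } n,\\[2pt]
n \in \mathrm{defs}_{\mathrm{OUT}(p)}(LV) & \text{if } p \text{ branches indirectly to } n \text{ via label variable } LV.
\end{cases}
\]
```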
  • FIG. 1 is a high-level block diagram showing one example of a computing system in which an apparatus and method in accordance with the invention may be implemented;
  • FIG. 2 is a process flow diagram showing one embodiment of a method for efficiently solving the “use-def” problem involving label variables;
  • FIG. 3 is a process flow diagram of one embodiment of a method for determining whether a predecessor node can “reach” a node;
  • FIG. 4 is pseudocode showing one possible way of implementing the methods of FIGS. 2 and 3;
  • FIG. 5 shows a first example of how the methods of FIGS. 2 and 3 may be applied to an actual control flow graph; and
  • FIG. 6 shows a second example of how the methods of FIGS. 2 and 3 may be applied to an actual control flow graph.
  • the present invention may be embodied as an apparatus, system, method, or computer program product.
  • the present invention may take the form of a hardware embodiment, a software embodiment (including firmware, resident software, microcode, etc.) configured to operate hardware, or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “module” or “system.”
  • the present invention may take the form of a computer-usable storage medium embodied in any tangible medium of expression having computer-usable program code stored therein.
  • the computer-usable or computer-readable storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a portable compact disc read-only memory (CDROM), an optical storage device, or a magnetic storage device.
  • a computer-usable or computer-readable storage medium may be any medium that can contain, store, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java™, Smalltalk, C++, or the like, conventional procedural programming languages such as the “C” programming language, scripting languages such as JavaScript, or similar programming languages.
  • Computer program code for implementing the invention may also be written in a low-level programming language such as assembly language.
  • Embodiments of the invention may be described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus, systems, and computer program products. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer program instructions or code. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • the computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • Referring to FIG. 1, one example of a computing system 100 is illustrated.
  • The computing system 100 is presented to show one example of an environment where an apparatus and method in accordance with the invention may be implemented.
  • The computing system 100 is presented only by way of example and is not intended to be limiting. Indeed, the apparatus and methods disclosed herein may be applicable to a wide variety of different computing systems in addition to the computing system 100 shown. The apparatus and methods disclosed herein may also potentially be distributed across multiple computing systems 100.
  • The computing system 100 includes at least one processor 102 and may include more than one processor 102.
  • The processor 102 may be operably connected to a memory 104.
  • The memory 104 may include one or more non-volatile storage devices such as hard drives 104a, solid state drives 104a, CD-ROM drives 104a, DVD-ROM drives 104a, tape drives 104a, or the like.
  • The memory 104 may also include non-volatile memory such as a read-only memory 104b (e.g., ROM, EPROM, EEPROM, and/or Flash ROM) or volatile memory such as a random access memory 104c (RAM or operational memory).
  • A bus 106, or plurality of buses 106, may interconnect the processor 102, memory devices 104, and other devices to enable data and/or instructions to pass therebetween.
  • The computing system 100 may include one or more ports 108.
  • Such ports 108 may be embodied as wired ports 108 (e.g., USB ports, serial ports, Firewire ports, SCSI ports, parallel ports, etc.) or wireless ports 108 (e.g., Bluetooth, IrDA, etc.).
  • The ports 108 may enable communication with one or more input devices 110 (e.g., keyboards, mice, touchscreens, cameras, microphones, scanners, storage devices, etc.) and output devices 112 (e.g., displays, monitors, speakers, printers, storage devices, etc.).
  • The ports 108 may also enable communication with other computing systems 100.
  • The computing system 100 includes a network adapter 114 to connect the computing system 100 to a network 116, such as a LAN, WAN, or the Internet.
  • A network 116 may enable the computing system 100 to connect to one or more servers 118, workstations 120, personal computers 120, mobile computing devices, or other devices.
  • The network 116 may also enable the computing system 100 to connect to another network by way of a router 122 or other device 122.
  • A router 122 may allow the computing system 100 to communicate with servers, workstations, personal computers, or other devices located on different networks.
  • Referring to FIG. 2, a process flow diagram showing one embodiment of a method 200 to efficiently solve the “use-def” problem involving label variables is illustrated. More specifically, the method 200 may be used to determine which definitions of label variables reach which uses of the label variables. Such a method 200 may be incorporated into a data-flow algorithm, such as a data-flow algorithm used by a compiler. As previously mentioned, a data-flow algorithm may analyze a computer program's control flow graph to determine values or properties at various points in the computer program. The data-flow algorithm accomplishes this by visiting each node of the control flow graph, possibly multiple times, and performing an operation at each visit.
  • The operation includes calculating the aforementioned values or properties at the inset and/or outset of each node (i.e., basic block) in the control flow graph.
  • the illustrated method 200 may be used to calculate the inset of the node.
  • the method 200 analyzes 202 a first predecessor node p of node n.
  • the method 200 determines 204 whether the predecessor node p can “reach” node n.
  • A way in which the method 200 may determine 204 whether a predecessor node p reaches a node n will be described in association with FIG. 3. If a node p can reach node n, the method 200 adds 206 the outset of node p to the inset of node n (since the inset of a node in the control flow graph is calculated as a join operation on the outsets of the node's predecessors). If node p cannot reach node n, the method 200 does not add the outset of node p to the inset of node n.
  • The method then determines 208 whether the node p is the last predecessor node. If not, the method 200 analyzes 210 the next predecessor node (also referred to as node p) by determining 204 whether the next predecessor node p can reach the node n and, if so, adding 206 the outset of the predecessor node p to the inset of the node n. This process continues for all of the predecessors of the node n. Once the method 200 has analyzed all of the predecessors of the node n in the described manner, the method 200 ends.
  • A method 204 for determining whether a predecessor node p can “reach” a node n first determines 302 whether the predecessor node p ends with an indirect branch (i.e., the predecessor node p indirectly branches to a node n via a label variable). If the predecessor node p does not end with an indirect branch (meaning that it ends with a direct branch), the method 204 determines 306 that node p can reach node n.
  • A “direct branch” may include an unconditional direct branch (e.g., GOTO X) or a conditional direct branch (e.g., IF (A op B) GOTO X, where op is a comparison operator), whereas an “indirect branch” may include a branch that branches via a label variable (e.g., GOTO LV, where LV is a label variable that may be set, or “defined,” to various values).
  • the method 204 determines 304 whether node n is in the definitions of the label variable that controls the indirect branch. If node n is not in the definitions, the method 204 determines 308 that node p cannot reach node n. If node n is in the definitions, the method 204 determines 306 that node p can reach node n.
  • Referring to FIG. 4, pseudocode 400 showing one technique for implementing the methods of FIGS. 2 and 3 is illustrated.
  • The pseudocode 400 includes a method called “computeInset” which computes the inset for a node n.
  • The “Inset” for a node n is set to empty in line 2.
  • The method then determines whether each predecessor node p can reach the node n and, where appropriate, adds the outset of the predecessor node p to the inset of the node n.
  • Lines 7 through 15 show a method for determining when a predecessor node p can “reach” a node n in the current iteration (i.e., pass) of the data-flow algorithm.
  • The method determines whether a predecessor node p ends with an indirect branch controlled by a label variable. If so, the method sets the variable LVx to the value of the label variable at the outset of p that controls the indirect branch. If, at line 10, the node n is in the definitions of the label variable at the outset of node p, the method returns “true” (indicating that the node p can reach the node n).
  • Otherwise, the method returns “false” (indicating that the node p cannot reach the node n). If, at line 8, the method determines that node p does not end with an indirect branch (indicating that it ends with a direct branch), the method returns “true” at line 15 (indicating that the node p can reach the node n).
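  • The logic described for “computeInset” can be sketched in Python as follows. This is a minimal sketch and not the pseudocode 400 of FIG. 4, which is not reproduced here. The names `Facts`, `can_reach`, and `compute_inset`, and the choice to represent an outset as a mapping from each label variable to the set of blocks it may name, are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set

# Assumed representation: label variable name -> set of block names it may hold.
Facts = Dict[str, Set[str]]


def can_reach(branch_var: Optional[str], p_outset: Facts, n: str) -> bool:
    """Return True if predecessor p may transfer control to node n.

    branch_var is the label variable controlling p's final indirect branch,
    or None when p ends with a direct branch.
    """
    if branch_var is None:
        return True  # direct branch: p always reaches n
    # Indirect branch: n must be among the definitions of the label
    # variable at the outset of p.
    return n in p_outset.get(branch_var, set())


def compute_inset(n: str,
                  preds: Dict[str, List[str]],
                  branch_var: Dict[str, str],
                  outset: Dict[str, Facts]) -> Facts:
    """Join (here: per-variable set union) the outsets of reaching predecessors."""
    inset: Facts = {}  # the inset starts out empty
    for p in preds.get(n, []):
        if can_reach(branch_var.get(p), outset[p], n):
            for var, targets in outset[p].items():
                inset.setdefault(var, set()).update(targets)
    return inset
```

  • In this sketch, a predecessor that ends with an indirect branch contributes nothing to the inset of a node that its label variable cannot currently name, which is the difference from a join over all predecessors.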
  • Referring to FIG. 5, a first control flow graph 500 showing an exemplary application of the methods of FIGS. 2 and 3 is illustrated.
  • The control flow graph 500 includes multiple basic blocks connected by edges, in particular Block 1, Block 10, Block 20, Block 30, Block 40, Block 50, and Block 60.
  • The uninterrupted lines indicate a direct connection between basic blocks (i.e., no intervening basic blocks), whereas the broken lines indicate a path between the basic blocks through the control flow graph, with the possible (but not the necessary) presence of one or more intervening basic blocks.
  • In Block 1, a first label variable (Lv1) is set (i.e., defined) to Block 30.
  • In Block 10, the first label variable (Lv1) is set to Block 40 and a second label variable (Lv2) is set to Block 60.
  • Block 20 indirectly branches to either Block 30 or Block 40 depending on the value (i.e., definition) assigned to the first label variable (Lv1).
  • Block 50 indirectly branches to Block 60 depending on the value (i.e., definition) assigned to the second label variable (Lv2).
  • The methods 200, 300 described in FIGS. 2 and 3 will include the outset of Block 20 in the inset of Block 40 since Block 40 is in the definition of Lv1 at the outset of Block 20.
  • the outset of Block 20 will not be included in the inset of Block 40 .
  • The outset of Block 20 will not be included in the inset of Block 30 since Lv1 will not have a definition that includes Block 30 when it reaches Block 20.
  • Because Block 20 has received the definitions for Lv1 and Lv2, these definitions will be passed to Block 50 by way of Block 40.
  • The outset of Block 50 will be included in the inset of Block 60 since Block 60 will be in the definition of Lv2 at the outset of Block 50.
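  • The FIG. 5 walkthrough can be reproduced with a small fixed-point solver. This is a sketch under stated assumptions: the figure itself is not reproduced here, so the edge list below (a straight chain from Block 1 through Block 10 to Block 20, and from Block 40 to Block 50) is a guess at its shape, and the names `PREDS`, `BRANCH_VAR`, `DEFS`, and `solve` are hypothetical. A block's own definition of a label variable is assumed to replace any incoming definition.

```python
from typing import Dict, List, Set, Tuple

Facts = Dict[str, Set[str]]  # label variable -> blocks it may name

# Assumed shape of control flow graph 500 (predecessor lists).
PREDS: Dict[str, List[str]] = {
    "B1": [], "B10": ["B1"], "B20": ["B10"],
    "B30": ["B20"], "B40": ["B20"], "B50": ["B40"], "B60": ["B50"],
}
# Blocks that end with an indirect branch, and the controlling label variable.
BRANCH_VAR: Dict[str, str] = {"B20": "Lv1", "B50": "Lv2"}
# Label-variable definitions made inside each block.
DEFS: Dict[str, Dict[str, str]] = {
    "B1": {"Lv1": "B30"},
    "B10": {"Lv1": "B40", "Lv2": "B60"},
}


def solve(preds, branch_var, defs) -> Tuple[Dict[str, Facts], Dict[str, Facts]]:
    """Iterate to a fixed point, joining only outsets of reaching predecessors."""
    inset: Dict[str, Facts] = {n: {} for n in preds}
    outset: Dict[str, Facts] = {n: {} for n in preds}
    changed = True
    while changed:
        changed = False
        for n in preds:
            new_in: Facts = {}
            for p in preds[n]:
                var = branch_var.get(p)
                # Direct branch, or n is a definition of the label variable.
                if var is None or n in outset[p].get(var, set()):
                    for v, targets in outset[p].items():
                        new_in.setdefault(v, set()).update(targets)
            # Transfer function: local definitions override incoming ones.
            new_out: Facts = {v: set(t) for v, t in new_in.items()}
            for v, target in defs.get(n, {}).items():
                new_out[v] = {target}
            if new_in != inset[n] or new_out != outset[n]:
                inset[n], outset[n] = new_in, new_out
                changed = True
    return inset, outset


inset, outset = solve(PREDS, BRANCH_VAR, DEFS)
```

  • Under these assumptions, the inset of Block 30 stays empty, while Block 40 and Block 60 each receive the definitions Lv1 = Block 40 and Lv2 = Block 60, matching the description above.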
  • Referring to FIG. 6, a second control flow graph 600 showing an exemplary application of the methods of FIGS. 2 and 3 is illustrated.
  • The control flow graph 600 is substantially identical to the control flow graph 500 illustrated in FIG. 5, except that the first label variable (Lv1) is not defined in Block 10.
  • The outset of Block 20 will be included in the inset of Block 30 since Block 30 is in the definition of Lv1 at the outset of Block 20.
  • The outset of Block 20 will not be included in the inset of Block 40 since Block 40 is not in the definition of Lv1.
  • Block 40 will never receive the definition for the second label variable (Lv2) and thus this definition will never be passed on to Block 50.
  • The outset of Block 50 will never be included in the inset of Block 60 since Block 50 will never be able to reach Block 60.
  • The forward edge from Block 50 to Block 60 may be safely deleted from the control flow graph 600.
  • The methods 200, 300 may be used to prune and simplify a control flow graph to make data-flow analysis more efficient.
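  • The pruning step can be illustrated with a short Python sketch that lists the indirect edges no definition supports once the analysis has converged. The function name `dead_indirect_edges` and the data layout are hypothetical, and the outsets below are the FIG. 6 results as described in the text, entered by hand rather than computed.

```python
from typing import Dict, List, Set, Tuple

Facts = Dict[str, Set[str]]  # label variable -> blocks it may name


def dead_indirect_edges(succs: Dict[str, List[str]],
                        branch_var: Dict[str, str],
                        outset: Dict[str, Facts]) -> List[Tuple[str, str]]:
    """Return edges (p, n) where p branches via a label variable that cannot name n."""
    dead: List[Tuple[str, str]] = []
    for p, var in branch_var.items():
        targets = outset[p].get(var, set())
        for n in succs.get(p, []):
            if n not in targets:
                dead.append((p, n))
    return dead


# FIG. 6 as described: Lv1 still names Block 30 at the outset of Block 20,
# and no definition of Lv2 ever arrives at Block 50.
succs = {"B20": ["B30", "B40"], "B50": ["B60"]}
branch_var = {"B20": "Lv1", "B50": "Lv2"}
outset = {"B20": {"Lv1": {"B30"}, "Lv2": {"B60"}}, "B50": {}}

dead = dead_indirect_edges(succs, branch_var, outset)
# dead == [("B20", "B40"), ("B50", "B60")]
```

  • Whether a compiler deletes such edges immediately or only marks them is a design choice the text leaves open; the sketch only identifies the candidates.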
  • The methods 200, 300 may be easily adapted for a backward data-flow analysis.
  • the outset of a node may be computed based on the insets of all the node's successors.
  • a method in accordance with the invention may determine whether a successor node s can “reach” a node n (which may include determining whether a node n is in the definition of a label variable at the inset of node s). Stated otherwise, node n “takes up” from a successor s if and only if node n reaches node s. Such a determination may be made from a second contemporaneous forward analysis performed along with the backward analysis. That is, certain algorithms may try to solve forward and backward data flow problems at the same time.
  • If node s can “reach” node n, the method may add the inset of node s to the outset of node n. If node s cannot “reach” node n, the method does not add the inset of node s to the outset of node n.
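  • A backward variant might look like the following Python sketch, which follows the reading that node n takes up from a successor s only if n reaches s. It assumes the label-variable facts come from a forward analysis run alongside the backward one, and that backward facts are plain sets of strings (for example, names of live variables). The name `compute_outset` and all parameter names are hypothetical.

```python
from typing import Dict, List, Set

Facts = Dict[str, Set[str]]  # label variable -> blocks it may name


def compute_outset(n: str,
                   succs: Dict[str, List[str]],
                   branch_var: Dict[str, str],
                   forward_outset: Dict[str, Facts],
                   backward_inset: Dict[str, Set[str]]) -> Set[str]:
    """Join the backward insets of only those successors that n can reach."""
    var = branch_var.get(n)  # None when n ends with a direct branch
    result: Set[str] = set()
    for s in succs.get(n, []):
        # Forward facts decide whether the edge n -> s is plausible.
        reaches = var is None or s in forward_outset[n].get(var, set())
        if reaches:
            result |= backward_inset[s]
    return result
```

  • For example, if Block 20 branches via Lv1 and the forward facts say Lv1 can only name Block 40, then only the backward inset of Block 40 flows into the outset of Block 20.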
  • The methods 200, 300 of FIGS. 2 and 3, as well as the claims describing such, are intended to encompass both backward and forward data-flow algorithms under the doctrine of equivalents.
  • each block in the block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions discussed in association with a block may occur in a different order than discussed. For example, two functions occurring in succession may, in fact, be implemented in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams, and combinations of blocks in the block diagrams may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

A method for efficiently solving the “use-def” problem involving label variables performs a data-flow analysis on a control flow graph that includes calculating an inset for each node as follows: if a predecessor node directly branches to the node, the method includes an outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, the method includes the outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, the method does not include the outset of the predecessor node in the inset of the node.

Description

    BACKGROUND
  • 1. Field of the Invention
  • This invention relates to apparatus and methods for performing data-flow analysis for computer programs.
  • 2. Background of the Invention
  • Data-flow analysis is a technique commonly used to determine values or properties at various points in a computer program. When performing a data-flow analysis, a program's control flow graph, made up of nodes and edges between nodes, may be used to determine where a particular value assigned to a variable might propagate. The information gathered by the data-flow analysis may be used by compilers to optimize the computer program.
  • One particular problem a data-flow analysis may attempt to solve is a variation of the “use-def” problem involving label variables. In this problem, a label variable is set to some number of values in some number of basic blocks. These are referred to as the “definitions” of the label variable. In other basic blocks, the label variable is the operand of indirect branches. These are referred to as the “uses” of the label variable. The problem is to discover which definitions of a label variable reach which uses of the label variable.
  • Initially, a compiler may assume that all possible definitions of a label variable reach all actual uses, and reflect this in a program's control flow graph. The problem may then be solved using a conventional data-flow algorithm. However, conventional dataflow algorithms define the inset of a node in the control flow graph as a join operation on the outsets of all of the node's predecessors. This means that a single pass of the conventional dataflow algorithm is likely to find an opportunity to reduce the control flow graph, but exercising that opportunity will expose other opportunities which will in turn require further executions of the algorithm. This is because, in a first pass of the algorithm, a definition of some label variable may have reached a use of that label variable on what was proven by the first pass to be an implausible edge. A second pass may further reduce the control flow graph which may in turn expose additional opportunities to reduce the control flow graph. Each of these executions (i.e., passes) is expensive in time.
  • In view of the foregoing, what are needed are apparatus and methods to more efficiently solve the “use-def” problem involving label variables.
  • SUMMARY
  • The invention has been developed in response to the present state of the art and, in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available apparatus and methods. Accordingly, the invention has been developed to provide apparatus and methods for efficiently solving the “use-def” problem involving label variables. The features and advantages of the invention will become more fully apparent from the following description and appended claims, or may be learned by practice of the invention as set forth hereinafter.
  • Consistent with the foregoing, a method for efficiently solving the “use-def” problem involving label variables is disclosed herein. In one embodiment, such a method includes generating a control flow graph for a computer program, wherein the control flow graph includes a plurality of nodes and edges between the nodes. The method performs a data-flow analysis on the control flow graph that includes calculating an inset for each node in the control flow graph. The inset for each node may be calculated as follows: if a predecessor node directly branches to the node, the method includes an outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, the method includes the outset of the predecessor node in the inset of the node; if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, the method does not include the outset of the predecessor node in the inset of the node.
  • A corresponding apparatus and computer program product are also disclosed and claimed herein.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • In order that the advantages of the invention will be readily understood, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered limiting of its scope, the invention will be described and explained with additional specificity and detail through use of the accompanying drawings, in which:
  • FIG. 1 is a high-level block diagram showing one example of a computing system in which an apparatus and method in accordance with the invention may be implemented;
  • FIG. 2 is a process flow diagram showing one embodiment of a method for efficiently solving the “use-def” problem involving label variables;
  • FIG. 3 is a process flow diagram of one embodiment of a method for determining whether a predecessor node can “reach” a node;
  • FIG. 4 is pseudocode showing one possible way of implementing the methods of FIGS. 2 and 3;
  • FIG. 5 shows a first example of how the methods of FIGS. 2 and 3 may be applied to an actual control flow graph; and
  • FIG. 6 shows a second example of how the methods of FIGS. 2 and 3 may be applied to an actual control flow graph.
  • DETAILED DESCRIPTION
  • It will be readily understood that the components of the present invention, as generally described and illustrated in the Figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the invention, as represented in the Figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of certain examples of presently contemplated embodiments in accordance with the invention. The presently described embodiments will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout.
  • As will be appreciated by one skilled in the art, the present invention may be embodied as an apparatus, system, method, or computer program product. Furthermore, the present invention may take the form of a hardware embodiment, a software embodiment (including firmware, resident software, microcode, etc.) configured to operate hardware, or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “module” or “system.” Furthermore, the present invention may take the form of a computer-usable storage medium embodied in any tangible medium of expression having computer-usable program code stored therein.
  • Any combination of one or more computer-usable or computer-readable storage medium(s) may be utilized to store the computer program product. The computer-usable or computer-readable storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a portable compact disc read-only memory (CDROM), an optical storage device, or a magnetic storage device. In the context of this document, a computer-usable or computer-readable storage medium may be any medium that can contain, store, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java™, Smalltalk, C++, or the like, conventional procedural programming languages such as the “C” programming language, scripting languages such as JavaScript, or similar programming languages. Computer program code for implementing the invention may also be written in a low-level programming language such as assembly language.
  • Embodiments of the invention may be described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus, systems, and computer program products. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer program instructions or code. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • The computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • Referring to FIG. 1, one example of a computing system 100 is illustrated. The computing system 100 is presented to show one example of an environment where an apparatus and method in accordance with the invention may be implemented. The computing system 100 is presented only by way of example and is not intended to be limiting. Indeed, the apparatus and methods disclosed herein may be applicable to a wide variety of different computing systems in addition to the computing system 100 shown. The apparatus and methods disclosed herein may also potentially be distributed across multiple computing systems 100.
  • As shown, the computing system 100 includes at least one processor 102 and may include more than one processor 102. The processor 102 may be operably connected to a memory 104. The memory 104 may include one or more non-volatile storage devices such as hard drives 104 a, solid state drives 104 a, CD-ROM drives 104 a, DVD-ROM drives 104 a, tape drives 104 a, or the like. The memory 104 may also include non-volatile memory such as a read-only memory 104 b (e.g., ROM, EPROM, EEPROM, and/or Flash ROM) or volatile memory such as a random access memory 104 c (RAM or operational memory). A bus 106, or plurality of buses 106, may interconnect the processor 102, memory devices 104, and other devices to enable data and/or instructions to pass therebetween.
  • To enable communication with external systems or devices, the computing system 100 may include one or more ports 108. Such ports 108 may be embodied as wired ports 108 (e.g., USB ports, serial ports, Firewire ports, SCSI ports, parallel ports, etc.) or wireless ports 108 (e.g., Bluetooth, IrDA, etc.). The ports 108 may enable communication with one or more input devices 110 (e.g., keyboards, mice, touchscreens, cameras, microphones, scanners, storage devices, etc.) and output devices 112 (e.g., displays, monitors, speakers, printers, storage devices, etc.). The ports 108 may also enable communication with other computing systems 100.
  • In certain embodiments, the computing system 100 includes a network adapter 114 to connect the computing system 100 to a network 116, such as a LAN, WAN, or the Internet. Such a network 116 may enable the computing system 100 to connect to one or more servers 118, workstations 120, personal computers 120, mobile computing devices, or other devices. The network 116 may also enable the computing system 100 to connect to another network by way of a router 122 or other device 122. Such a router 122 may allow the computing system 100 to communicate with servers, workstations, personal computers, or other devices located on different networks.
  • As shown in FIG. 2, a process flow diagram showing one embodiment of a method 200 to efficiently solve the “use-def” problem involving label variables is illustrated. More specifically, the method 200 may be used to determine which definitions of label variables reach which uses of the label variables. Such a method 200 may be incorporated into a data-flow algorithm, such as a data-flow algorithm used by a compiler. As previously mentioned, a data-flow algorithm may analyze a computer program's control flow graph to determine values or properties at various points in the computer program. The data-flow algorithm accomplishes this by visiting each node of the control flow graph, possibly multiple times, and performing an operation at each visit. The operation includes calculating the aforementioned values or properties at the inset and/or outset of each node (i.e., basic block) in the control flow graph. Each time a node is visited by the data-flow algorithm, the illustrated method 200 may be used to calculate the inset of the node.
  • As shown, in order to calculate the inset for a node n in the control flow graph, the method 200 analyzes 202 a first predecessor node p of node n. The method 200 determines 204 whether the predecessor node p can “reach” node n. A way in which the method 200 may determine 204 whether a predecessor node p reaches a node n will be described in association with FIG. 3. If a node p can reach node n, the method 200 adds 206 the outset of node p to the inset of node n (since the inset of a node in the control flow graph is calculated as a join operation on the outsets of the node's predecessors). If node p cannot reach node n, the method 200 does not add the outset of node p to the inset of node n.
  • The method then determines 208 whether the node p is the last predecessor node. If not, the method 200 analyzes 210 the next predecessor node (also referred to as node p) by determining 204 whether the next predecessor node p can reach the node n and, if so, adding 206 the outset of the predecessor node p to the inset of the node n. This process continues for all of the predecessors of the node n. Once the method 200 has analyzed all of the predecessors of the node n in the described manner, the method 200 ends.
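The predecessor loop of method 200 may be sketched as follows. This is a minimal illustration rather than the pseudocode of FIG. 4: the name `compute_inset`, the dictionary-based graph representation, and the use of set union as the join operation are all assumptions made for the example. The reachability test of step 204 is passed in as a caller-supplied function.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Iterable

Node = Hashable
Fact = Hashable


def compute_inset(
    n: Node,
    predecessors: Dict[Node, Iterable[Node]],
    outset: Dict[Node, FrozenSet[Fact]],
    can_reach: Callable[[Node, Node], bool],
) -> FrozenSet[Fact]:
    """Join the outsets of those predecessors of n that can reach n.

    The join is assumed to be set union, as in a reaching-definitions
    style analysis; other analyses may use a different join.
    """
    inset = set()                                # the inset starts out empty
    for p in predecessors.get(n, ()):            # steps 202/210: visit each predecessor
        if can_reach(p, n):                      # step 204: can p reach n?
            inset |= outset.get(p, frozenset())  # step 206: add the outset of p
        # otherwise the outset of p is left out of the inset of n
    return frozenset(inset)
```

For instance, with two predecessors of which only one is deemed to reach `n`, only that predecessor's outset contributes to the result.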
  • Referring to FIG. 3, one embodiment of a method 204 for determining whether a predecessor node p can “reach” a node n is illustrated. As shown, the method 204 determines 302 whether a predecessor node p ends with an indirect branch (i.e., the predecessor node p indirectly branches to a node n via a label variable). If the predecessor node p does not end with an indirect branch (meaning that it ends with a direct branch), the method 204 determines 306 that node p can reach node n. For the purposes of this disclosure, a “direct branch” may include an unconditional direct branch (e.g., GOTO X) or a conditional direct branch (e.g., IF (A<B) GOTO X), whereas an “indirect branch” may include a branch that branches via a label variable (e.g., GOTO LV, where LV is a label variable that may be set, or “defined,” to various values).
  • If the predecessor node p ends with an indirect branch, the method 204 determines 304 whether node n is in the definitions of the label variable that controls the indirect branch. If node n is not in the definitions, the method 204 determines 308 that node p cannot reach node n. If node n is in the definitions, the method 204 determines 306 that node p can reach node n.
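The decision of FIG. 3 may be sketched as a small predicate. The `Block` class, its `indirect_via` field, and the mapping from label variable names to sets of target block names are hypothetical representations chosen for the example; the disclosure does not prescribe any particular data structure.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Block:
    name: str
    # Name of the label variable controlling a final "GOTO LV", or None
    # when the block ends with a direct branch (hypothetical field).
    indirect_via: Optional[str] = None


# Label variable name -> blocks it may designate at the outset of a block.
LabelDefs = Dict[str, FrozenSet[str]]


def can_reach(p: Block, n: Block, label_defs_at_outset_of_p: LabelDefs) -> bool:
    """Return True if predecessor p can reach n in the current pass."""
    if p.indirect_via is None:
        return True  # steps 302 -> 306: p ends with a direct branch
    targets = label_defs_at_outset_of_p.get(p.indirect_via, frozenset())
    return n.name in targets  # step 304 -> 306 if present, 308 otherwise
```

Under this sketch, a label variable with no definitions at the outset of p yields an empty target set, so an indirect branch from p reaches no node until a definition arrives.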
  • Referring to FIG. 4, pseudocode 400 showing one technique for implementing the methods of FIGS. 2 and 3 is illustrated. As shown, the pseudocode 400 includes a method called “computeInset” which computes the inset for a node n. Initially, the “Inset” for a node n is set to empty in line 2. In lines 3 through 5, the method determines whether each predecessor node p can reach the node n and, where appropriate, adds the outset of the predecessor node p to the inset of the node n.
  • Lines 7 through 15 show a method for determining when a predecessor node p can “reach” a node n in the current iteration (i.e., pass) of the data-flow algorithm. As shown, in line 8, the method determines whether a predecessor node p ends with an indirect branch controlled by a label variable. If so, the method sets the variable LVx to the value of the label variable at the outset of p that controls the indirect branch. If, at line 10, the node n is in the definitions of the label variable at the outset of node p, the method returns “true” (indicating that the node p can reach the node n). Otherwise, the method returns “false” (indicating that the node p cannot reach the node n). If, at line 8, the method determines that node p does not end with an indirect branch (indicating that it ends with a direct branch), the method returns “true” at line 15 (indicating that the node p can reach the node n).
  • Referring to FIG. 5, a first control flow graph 500 showing an exemplary application of the methods of FIGS. 2 and 3 is illustrated. As shown, the control flow graph 500 includes multiple basic blocks, connected by edges. The uninterrupted lines indicate a direct connection between basic blocks (i.e., no intervening basic blocks), whereas the broken lines indicate a path between the basic blocks through the control flow graph, with the possible (but not necessary) presence of one or more intervening basic blocks.
  • As shown, the control flow graph 500 includes multiple basic blocks, in particular Block 1, Block 10, Block 20, Block 30, Block 40, Block 50, and Block 60. To perform the methods 200, 300 illustrated in FIGS. 2 and 3, definitions and uses of specific label variables may be identified and noted in the control flow graph 500. As shown in FIG. 5, in Block 1, a first label variable (Lv1) is set (i.e., defined) to Block 30. In Block 10, the first label variable (Lv1) is set to Block 40 and a second label variable (Lv2) is set to Block 60. As further shown in FIG. 5, Block 20 indirectly branches to either Block 30 or Block 40 depending on the value (i.e., definition) assigned to the first label variable (Lv1). Similarly, Block 50 indirectly branches to Block 60 depending on the value (i.e., definition) assigned to the second label variable (Lv2).
  • The first time a data-flow algorithm visits Block 20, it is possible that neither of the definitions of the first label variable (Lv1), as defined in Blocks 1 and 10, has reached Block 20. Eventually, the definition of the first label variable (Lv1) as established in Block 10 will reach Block 20. The definition of Lv1 in Block 1 will never reach Block 20 since the definition in Block 1 is overwritten in Block 10.
  • Assuming that the definition of the first label variable (Lv1) as established in Block 10 has reached Block 20, the methods 200, 300 described in FIGS. 2 and 3 will include the outset of Block 20 in the inset of Block 40 since Block 40 is in the definition of Lv1 at the outset of Block 20. On the other hand, if the definition of Lv1 has not yet reached Block 20, the outset of Block 20 will not be included in the inset of Block 40. Similarly, the outset of Block 20 will not be included in the inset of Block 30 since Lv1 will not have a definition that includes Block 30 when it reaches Block 20.
  • If the definitions for Lv1 and Lv2 have not been passed to Block 20, these definitions would likewise not be passed to Block 40. If such is the case, Block 50 would not receive the definition for Lv2 and the outset of Block 50 would not be included in the inset of Block 60.
  • However, if Block 20 has received the definitions for Lv1 and Lv2, these definitions will be passed to Block 50 by way of Block 40. In such a case, the outset of Block 50 will be included in the inset of Block 60 since Block 60 will be in the definition of Lv2 at the outset of Block 50.
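The FIG. 5 walk-through may be reproduced with a small fixed-point iteration. The sketch below rests on several assumptions that are not part of the disclosure: the broken-line paths of FIG. 5 are modeled as single edges (1→10, 10→20, 40→50), a definition is represented as a (label variable, target block) pair, a block that defines a label variable kills earlier definitions of it, and the function name `solve` is illustrative.

```python
def solve(edges, defs, indirect):
    """Iterate a forward reaching-definitions analysis to a fixed point.

    edges:    list of (predecessor, successor) block numbers
    defs:     block -> set of (label_variable, target_block) defined there
    indirect: block -> label variable controlling its final indirect branch
    """
    nodes = sorted({x for edge in edges for x in edge})
    preds = {n: [p for p, s in edges if s == n] for n in nodes}
    inset = {n: set() for n in nodes}
    outset = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            new_in = set()
            for p in preds[n]:
                lv = indirect.get(p)
                # Direct branch, or n is in the definitions of the
                # controlling label variable at the outset of p.
                if lv is None or (lv, n) in outset[p]:
                    new_in |= outset[p]
            gen = defs.get(n, set())
            killed = {var for var, _ in gen}
            new_out = {d for d in new_in if d[0] not in killed} | gen
            if new_in != inset[n] or new_out != outset[n]:
                inset[n], outset[n] = new_in, new_out
                changed = True
    return inset, outset


# FIG. 5 as described above (edge list is an assumption, see lead-in).
EDGES = [(1, 10), (10, 20), (20, 30), (20, 40), (40, 50), (50, 60)]
DEFS = {1: {("Lv1", 30)}, 10: {("Lv1", 40), ("Lv2", 60)}}
INDIRECT = {20: "Lv1", 50: "Lv2"}

inset, outset = solve(EDGES, DEFS, INDIRECT)
print(sorted(inset[30]))  # [] : Block 30 receives nothing from Block 20
print(sorted(inset[40]))  # [('Lv1', 40), ('Lv2', 60)]
print(sorted(inset[60]))  # [('Lv1', 40), ('Lv2', 60)]
```

Under these assumptions the result matches the description: the outset of Block 20 is included in the inset of Block 40 but not in that of Block 30, and the definition of Lv2 travels through Block 40 to Block 50, so the outset of Block 50 is included in the inset of Block 60.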
  • Referring to FIG. 6, a second control flow graph 600 showing an exemplary application of the methods of FIGS. 2 and 3 is illustrated. The control flow graph 600 is substantially identical to the control flow graph 500 illustrated in FIG. 5, except that the first label variable (Lv1) is not defined in Block 10. In this scenario, assuming that the definition of the first label variable (Lv1) has reached Block 20 from Block 1, the outset of Block 20 will be included in the inset of Block 30 since Block 30 is in the definition of Lv1 at the outset of Block 20. By the same token, the outset of Block 20 will not be included in the inset of Block 40 since Block 40 is not in the definition of Lv1.
  • Because the outset of Block 20 will never be included in the inset of Block 40 (since Block 40 will never be in the definition of Lv1 at the outset of Block 20), Block 40 will never receive the definition for the second label variable (Lv2) and thus this definition will never be passed on to Block 50. As a result, the outset of Block 50 will never be included in the inset of Block 60 since Block 50 will never be able to reach Block 60. In such a case, the forward edge from Block 50 to Block 60 may be safely deleted from the control flow graph 600. In this way, the methods 200, 300 may be used to prune and simplify a control flow graph to make data-flow analysis more efficient.
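The pruning step may be sketched as a pass over the edges once the analysis has converged. The function name `prunable_edges` and the representation of outsets as sets of (label variable, target block) pairs are assumptions for the example, and the outsets shown for FIG. 6 are written out by hand from the description above rather than computed here.

```python
def prunable_edges(edges, indirect, outset):
    """List edges leaving an indirect branch whose target is not among the
    definitions of the controlling label variable at the branch's outset.

    This is only meaningful once the data-flow analysis has reached its
    fixed point; earlier passes may not yet have seen every definition.
    """
    dead = []
    for p, s in edges:
        lv = indirect.get(p)
        if lv is not None and (lv, s) not in outset.get(p, set()):
            dead.append((p, s))
    return dead


# FIG. 6 at the fixed point: Lv1 is defined only in Block 1, Lv2 in Block 10.
EDGES = [(1, 10), (10, 20), (20, 30), (20, 40), (40, 50), (50, 60)]
INDIRECT = {20: "Lv1", 50: "Lv2"}
OUTSET = {
    1: {("Lv1", 30)},
    10: {("Lv1", 30), ("Lv2", 60)},
    20: {("Lv1", 30), ("Lv2", 60)},
    30: {("Lv1", 30), ("Lv2", 60)},
    40: set(),
    50: set(),
    60: set(),
}

print(prunable_edges(EDGES, INDIRECT, OUTSET))  # [(20, 40), (50, 60)]
```

Besides the edge from Block 50 to Block 60 discussed above, the sketch also reports the edge from Block 20 to Block 40, since Block 40 is never in the definition of Lv1 at the outset of Block 20.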
  • It should be noted that the usual rules regarding the monotonicity of the solution lattice are not violated by the construction of the inset as described in the methods 200, 300. This is because as iterations of the data-flow algorithm proceed, the set of definitions of a label variable at any given use point monotonically increases. That is, once a node is in the set of definitions of a controlling label variable, it stays in that set.
  • Although the methods 200, 300 have been described herein in association with a forward data-flow analysis, the methods 200, 300 may be easily adapted for a backward data-flow analysis. In a backward data-flow analysis, the outset of a node may be computed based on the insets of all the node's successors. In such a case, a method in accordance with the invention may determine whether a successor node s can “reach” a node n (which may include determining whether a node n is in the definition of a label variable at the inset of node s). Stated otherwise, node n “takes up” from a successor s if and only if node n reaches node s. Such a determination may be made from a second contemporaneous forward analysis performed along with the backward analysis. That is, certain algorithms may try to solve forward and backward data flow problems at the same time.
  • If node s can “reach” node n, the method may add the inset of node s to the outset of node n. If node s cannot “reach” node n, the method does not add the inset of node s to the outset of node n. For the purposes of this description, the methods 200, 300 of FIGS. 2 and 3, as well as the claims describing such, are intended to encompass both backward and forward data-flow algorithms under the doctrine of equivalents.
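The backward variant may be sketched in the same style. The sketch follows one reading of the passage above, namely that node n takes up the inset of a successor s only when n's own final branch can transfer control to s. The function name, the dictionary arguments, and the `label_targets` mapping (assumed to come from a contemporaneous forward analysis) are all hypothetical.

```python
def compute_outset_backward(n, successors, inset, indirect, label_targets):
    """Join the insets of those successors of n that n can branch to.

    successors:    node -> iterable of successor nodes
    inset:         node -> set of facts at the node's inset
    indirect:      node -> label variable controlling its final indirect branch
    label_targets: node -> blocks that label variable may designate at the
                   node's outset (assumed supplied by a forward analysis)
    """
    outset = set()
    for s in successors.get(n, ()):
        if n not in indirect or s in label_targets.get(n, set()):
            outset |= inset.get(s, set())  # n takes up from successor s
        # otherwise the inset of s is not added to the outset of n
    return outset
```

With an indirect branch whose label variable may designate only one of two successors, only that successor's inset is joined; with a direct branch, all successors contribute.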
  • The block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer-usable storage media according to various embodiments of the present invention. In this regard, each block in the block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions discussed in association with a block may occur in a different order than discussed. For example, two functions occurring in succession may, in fact, be implemented in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams, and combinations of blocks in the block diagrams, may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (14)

1-7. (canceled)
8. A computer program product for iteratively altering the composition of an inset in a forward dataflow pass, the computer program product comprising a computer-readable storage medium having computer-usable program code embodied therein, the computer-usable program code comprising:
computer-usable program code to generate a control flow graph for a computer program, the control flow graph comprising a plurality of nodes and edges between the nodes;
computer-usable program code to perform a data-flow analysis on the control flow graph by calculating an inset of each node in the control flow graph, wherein calculating the inset of a node comprises performing the following:
if a predecessor node directly branches to the node, including an outset of the predecessor node in the inset of the node;
if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, including the outset of the predecessor node in the inset of the node; and
if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, not including the outset of the predecessor node in the inset of the node.
9. The computer program product of claim 8, wherein the predecessor node that directly branches to the node does so unconditionally.
10. The computer program product of claim 8, wherein the predecessor node that directly branches to the node does so conditionally.
11. The computer program product of claim 8, wherein performing the data-flow analysis comprises performing a forward data-flow analysis.
12. The computer program product of claim 8, wherein performing the data-flow analysis comprises performing the data-flow analysis in a single pass over the control flow graph.
13. The computer program product of claim 8, wherein performing the data-flow analysis comprises performing the data-flow analysis in multiple passes over the control flow graph.
14. The computer program product of claim 8, further comprising computer-usable program code to, prior to performing the data-flow analysis, identify definitions of label variables and uses of the label variables in the computer program.
15. An apparatus for iteratively altering the composition of an inset in a forward dataflow pass, the apparatus comprising:
at least one processor;
at least one memory device coupled to the at least one processor and storing computer instructions for execution on the at least one processor, the computer instructions enabling the at least one processor to:
generate a control flow graph for a computer program, the control flow graph comprising a plurality of nodes and edges between the nodes;
perform a data-flow analysis on the control flow graph by calculating an inset of each node in the control flow graph, wherein calculating the inset of a node comprises performing the following:
if a predecessor node directly branches to the node, including an outset of the predecessor node in the inset of the node;
if a predecessor node indirectly branches to the node via a label variable and the node is in definitions of the label variable in the outset of the predecessor node, including the outset of the predecessor node in the inset of the node; and
if a predecessor node indirectly branches to the node via a label variable and the node is not in definitions of the label variable in the outset of the predecessor node, not including the outset of the predecessor node in the inset of the node.
16. The apparatus of claim 15, wherein the predecessor node that directly branches to the node does so unconditionally.
17. The apparatus of claim 15, wherein the predecessor node that directly branches to the node does so conditionally.
18. The apparatus of claim 15, wherein performing the data-flow analysis comprises performing a forward data-flow analysis.
19. The apparatus of claim 15, wherein performing the data-flow analysis comprises performing the data-flow analysis in a single pass over the control flow graph.
20. The apparatus of claim 15, wherein the computer instructions further enable the at least one processor to, prior to performing the data-flow analysis, identify definitions of label variables and uses of the label variables in the computer program.
US13/613,211 2012-09-13 2012-09-13 Efficiently solving the "use-def" problem involving label variables Abandoned US20140075423A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US13/613,211 US20140075423A1 (en) 2012-09-13 2012-09-13 Efficiently solving the "use-def" problem involving label variables
US13/844,307 US8839217B2 (en) 2012-09-13 2013-03-15 Efficiently solving the “use-def” problem involving label variables

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/613,211 US20140075423A1 (en) 2012-09-13 2012-09-13 Efficiently solving the "use-def" problem involving label variables

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/844,307 Continuation US8839217B2 (en) 2012-09-13 2013-03-15 Efficiently solving the “use-def” problem involving label variables

Publications (1)

Publication Number Publication Date
US20140075423A1 (en) 2014-03-13

Family

ID=50234752

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/613,211 Abandoned US20140075423A1 (en) 2012-09-13 2012-09-13 Efficiently solving the "use-def" problem involving label variables
US13/844,307 Expired - Fee Related US8839217B2 (en) 2012-09-13 2013-03-15 Efficiently solving the “use-def” problem involving label variables

Country Status (1)

Country Link
US (2) US20140075423A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377993B (en) * 2019-07-09 2020-06-12 长江勘测规划设计研究有限责任公司 Agile configuration method for multi-combination regulation and control calculation of over-standard flood

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6308321B1 (en) 1998-12-11 2001-10-23 Incert Software Corporation Method for determining program control flow
US6922830B1 (en) 2000-03-10 2005-07-26 International Business Machines Corporation Skip list data storage during compilation
US6751792B1 (en) 2000-10-04 2004-06-15 Sun Microsystems, Inc. Using value-expression graphs for data-flow optimizations
JP3956113B2 (en) 2002-06-13 2007-08-08 インターナショナル・ビジネス・マシーンズ・コーポレーション Data processing apparatus and program
US8255891B2 (en) 2005-12-30 2012-08-28 Intel Corporation Computer-implemented method and system for improved data flow analysis and optimization
CA2675680C (en) * 2009-08-27 2013-05-14 Ibm Canada Limited - Ibm Canada Limitee Generating object code that uses calculated contents for a variable determined from a predicate
CA2691851A1 (en) * 2010-02-04 2011-08-04 Ibm Canada Limited - Ibm Canada Limitee Control flow analysis using deductive reaching definitions

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090217248A1 (en) * 2008-02-12 2009-08-27 Bently William G Systems and methods for information flow analysis
US9043774B2 (en) * 2008-02-12 2015-05-26 William G. Bently Systems and methods for information flow analysis
US9411564B2 (en) 2010-08-30 2016-08-09 International Business Machines Corporation Extraction of functional semantics and isolated dataflow from imperative object oriented languages
US9424010B2 (en) 2010-08-30 2016-08-23 International Business Machines Corporation Extraction of functional semantics and isolated dataflow from imperative object oriented languages
US20150248279A1 (en) * 2013-06-24 2015-09-03 International Business Machines Corporation Extracting stream graph structure in a computer language by pre-executing a deterministic subset
US9454350B2 (en) * 2013-06-24 2016-09-27 International Business Machines Corporation Extracting stream graph structure in a computer language by pre-executing a deterministic subset
CN108958902A (en) * 2017-05-25 2018-12-07 阿里巴巴集团控股有限公司 Figure calculation method and system

Also Published As

Publication number Publication date
US8839217B2 (en) 2014-09-16
US20140075424A1 (en) 2014-03-13

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: EXPRESSLY ABANDONED -- DURING PUBLICATION PROCESS