CA2251369A1

CA2251369A1 - System and method for analyzing dependencies in a computer program

Info

Publication number: CA2251369A1
Application number: CA 2251369
Authority: CA
Inventors: Scott Kerr; Derek Rayside; Michael S. Fulton; Kostas Kontogiannis
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 1998-10-26
Filing date: 1998-10-26
Publication date: 2000-04-26
Also published as: GB2345994A; GB9925011D0

Abstract

A method of analyzing the inter-dependencies of elements in a computer program is provided which allows statically and dynamically bound elements to be represented in a graph representation representing the relationship of the elements to each other. The method provides for techniques for resolving statically determinate dependencies and techniques for statically determing dynamic dependencies. Analysis of the graph representation provides a list of dynamic runtime dependencies present in the program.
Further analysis on the graph representation provides information on transitive closure of sets of elements of the program. This is accomplished by propagating signals through the graph representation and analyzing the subsequent result in the graph representation.

Description

SYSTEM AND METHOD FOR ANALYZING DEPENDENCIES
OF A COMPUTER PROGRAM
FIELD OF THE INVENTION
The invention relates to a system and method for identifying and analyzing dynamic dependencies in computer programs and their elements.
BACKGROUND OF THE INVENTION
In object oriented programming ("OOP") languages, such as Java (a trademark of Sun Microsystems, Inc.) and C++, objects are the fundamental elements of the computer program structure. The flexibility to manipulate and substitute the objects within the computer program are key advantages to object oriented programming.
One characteristic of OOP languages is their ability to reuse code and objects through libraries. In Java, applications rely on the standard Java class library in the Java Development Kit (JDK). However, as the JDK is expanded and revised, when a Java program accesses the JDK, the actual contents of the JDK and the presumed contents of the JDK by the Java program may not match. If the Java program expects and uses a certain feature in an older version of the JDK, and that feature is not present in the current version, the program will not execute when accessing the current JDK.
It is also possible that an old application will not execute as intended when it references a new version of the JDK.
As such, software analysis tools have been developed in Java (and other languages) which analyze computer programs for their static library dependencies. Such tools can analyze the program without executing it. The program JavaPureCheck (a trade-mark of Sun Microsystems, Inc.) by Sun Microsystems is a software dataflow analysis tool which analyses the static symbol table of each class file for strings that reference other files.
However, another feature of Java limits the usefulness of current analysis tools.
Java (and C++) allows dynamic binding of objects at runtime. This feature (also called polymorphism, late binding, or runtime binding) enables an object to bind to objects of types other than those envisaged when the original object was written.
Consider an OOP
program where an ARTIST object draws SHAPE objects. The ARTIST object does not need to know what type of SHAPE object is being drawn until the moment it is actually drawn. As such, it is not readily apparent from a static examination of the ARTIST object what the ARTIST object is going to do. The method of this invention is to determine what the ARTIST object may do by examining the other code in the system to determine what SHAPE objects are available to be drawn (for example, a SQUARE object or a TRIANGLE object).
Dataflow analysis techniques can be used to statically determine some of the dynamic relationships (e.g. what SHAPE objects are available to be drawn), but the techniques are computationally demanding. Further, known dataflow analysis techniques cannot fully evaluate the effect of these dynamic relationships.
It is possible to determine which dynamic relationships are established by a dynamic analysis of the program (observing its execution). This sort of analysis fundamentally differs from the method of this invention as the program must be executed.
Furthermore, this sort of dynamic analysis may produce different results for each execution of the program, depending on the program's inputs, and so is not reliable for certain analyses (such as the ones presented here). For example, if the ARTIST
object draws only SQUARE objects during one execution it would be false to conclude that the ARTIST object may only draw SQUARE objects in the future; the ARTIST object may very well draw TRIANGLE objects or other SHAPE objects at some future point.

Also, as dynamic dependencies of objects in Java are not known until runtime, the current tools, including JavaPureCheck, do not evaluate the effects of these dynamic links.
SUMMARY
For a computer program having a set of program elements, including a dynamic program element, the invention provides a method for determining for the dynamic element a set of static and dynamic dependencies relating to the other program elements.
The method generates a graph representation of the program elements and the dynamic element using nodes and the nodes are associated together according to their static dependencies. Next, program elements directly or indirectly affecting dynamic elements are identified and the respective nodes are associated to nodes of the dynamic elements.
The resulting graph representation encompasses all static and dynamic dependencies existing amongst the program elements and the dynamic elements. The invention also may provide information on transitive closure sets of the graph representation of the computer program produced by the above method of the invention. These sets include, but are not limited to, requirements sets and portability sets.
The invention provides an ability to represent dynamic dependencies in a program statically and to determine constraints for these dynamic dependencies from a static analysis of the program. Further, the invention provides for allowing useful computation such as transitive closure sets given the complexity of the dynamic dependencies.
The invention provides a method for determining a dynamic dependency of a dynamic element in a computer program with a plurality of other elements in the computer program, comprising the steps of generating a graph representation containing nodes representing said dynamic element and said plurality of other elements;
identifying all targeted elements of said plurality of other elements directly or indirectly affecting said dynamic element; and associating the nodes of said targeted elements to said node of the dynamic element by said dynamic dependency. The step of generating a graph representation of the above method may further comprise associating static dependencies amongst said nodes between which static relationships exist.
Further, the step of traversing the nodes of the graph representation and producing an output representing the dependencies characteristics of the graph representation may be provided.
The above method can further provide the step of determining a set of graph traversal algorithms based on the specification of the computer language and wherein the step of generating a graph representation comprises linking said nodes according to the graph traversal algorithms. The step of identifying all targeted elements may also comprise identifying elements that define the interface of the dynamic node.
Further, the step of identifying all targeted elements may comprise identifying elements that contain code implementing the dynamic element.
The computer program of the above method may be programmed in an object-oriented computer language. Also, the graph representation may be selected from the group of graph representations consisting of directed multi-graphs, graphs, tables, linked lists, arrays vectors, trees and hash tables.
The node for the dynamic element may correspond to a Java language method invocation selected from the group of Java language method invocations consisting of invoke virtual, invoke interface and invoke super. The step of generating a graph representation may comprise generating a node for the dynamic element for each type of method invocation used for the dynamic element. And, the dynamic element may be a polymorphic element.
The above method may also further comprise the step of determining a transitive closure set of elements from said graph representation. The step of determining a transitive closure set of elements may comprise the steps of associating a signal value to said nodes; transmitting said signal values through said nodes according to a set of rules; and determining the transitive closure set of elements from said transmitted signal values.
Further, the step of transmitting said signal values may proceed in a breadth-first manner. The transitive closure set of elements may be a reduced subset, optimized subset or a portability set of said dynamic element and said plurality of other elements.
In a computer program having a plurality of elements associated with each other by a plurality of static and dynamic dependencies, a method for determining for a dynamic element in said plurality of elements, a set of static and dynamic dependencies is provided, said method comprising the steps of generating a graph representation containing nodes representing said plurality of elements linked by said plurality of static dependencies; creating a dynamic node in said graph representation for said dynamic element; identifying all targeted elements in said plurality of elements directly or indirectly affecting said dynamic element; and associating each node of said targeted elements to said dynamic node, whereby the resulting graph representation encompasses the static and dynamic dependencies existing amongst said plurality of elements.
The above method may further comprise the step of determining a set of graph traversal algorithms based on the specification of the computer language and wherein the step of generating a graph representation comprises linking said nodes according to the graph traversal algorithms.
The step of identifying all targeted elements may also comprise identifying elements that define the interface of the dynamic node or identifying elements that contain code implementing the dynamic element. The dynamic element may also be a polymorphic element.
The above method may further comprise the step of determining a transitive closure set ofelements from said graph representation. Moreover, the step of determining a transitive closure set of elements may comprise the steps of associating a signal value to said nodes; transmitting said signal values through said nodes according to a set of rules; and determining the transitive closure set of elements from said transmitted signal values. The transitive closure set of elements may be a reduced subset, an optimized subset or a portability set of said plurality of elements.
In a computer program having a plurality of elements including a dynamic element and a static element, a method of establishing a set of transitive closure elements for said plurality of elements, said method comprising the steps of generating a graph representation containing nodes representing said plurality of elements;
identifying all targeted elements of said plurality of elements directly or indirectly affecting said dynamic element; associating the nodes of said targeted elements to said node of the dynamic element by dynamic dependencies; establishing signal propagation parameters for said plurality of elements; transmitting signals through said plurality of elements; and determining said set of transitive closure elements from said transmitted signals. The set of transitive closure elements may be a reduced subset, an optimized subset or a portability set of said plurality of elements.
In a computer program language having a plurality of program method elements associated with each other by a plurality of static and dynamic dependencies invoked by one of a plurality of reference types using one of a plurality of dynamic invocations, a method for determining for a target program method element, a target set of static and dynamic dependencies is provided, said method comprising the steps of generating a graph representation containing nodes representing said plurality of program method elements linked by said plurality of static dependencies; creating a polymorphic node for said target program method element in said graph representation; traversing said graph and identifying a signature node in said graph representation defining an interface of said target program method element; and associating said signature node to said polymorphic node with a signature association, whereby the resulting graph encompasses the static and dynamic dependencies existing amongst said program method elements and said target program method element. The method may further comprise the steps oftraversing said graph and identifying an implemented node directly or indirectly implementing said target program method element; and associating said implemented node to said polymorphic node with an implementation association.
There is also provided a computer system comprising an analyzing computer program operating on a data processing system, said analyzing computer program executing any of the above method steps, Further, there is provided a program storage device readable by a data processing system, tangibly embodying a program of instructions, executable by said data processing system to perform any of the above method steps.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar or corresponding elements and in which:
Figure 1 is a flow chart of the method of the invention;
Figures 2a and 2b are sections of sample Java code;
Figures 3a, 3b and 3c are tables showing relationships of elements in Java;
Figure 4 shows a diagram of a graph representation of element relationships in the sample code;
Figures 5a, 5b and 5c show tables of element relationships in the sample code;

Figures 6, 7 and 8 show diagrams of graph representations of the sample code after the execution of the invention;
Figures 9a, 9b, 9c, and 9d show code for linking polymorphic nodes in the graph representation;
Figures 10, 11 and 12 show pseudo code for traversing algorithms to be used in a graph representation (the Java code for which is in Figures 9b, 9c and 9d);
Figure 13 shows pseudo code for visiting a node in a graph representation to determine if that node is connected by a declaration arc to the desired method (the Java code for which is in Figure 9a);
Figures 14a, 14b, 14c and 14d show code for computing a generalized transitive closure on the graph representation, using a graph and some rules (propagation relations) as input; and Figures 15, 16 and 17 show code defining the rules and particulars for three different transitive disclosures that may be computed on the graph representation.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
While the preferred embodiments of this invention have been described in relation to the Java language and implemented in the Java language, this invention need not be solely implemented using or applied to the Java language. The present invention and the preferred embodiments may equally be developed in or used for other computer languages including other object oriented languages such as C++. Indeed, in a preferred embodiment, the techniques of the present invention have been developed to analyze computer programs written in Java where the dynamic dependencies are caused by polymorphism. However, these techniques may be adapted to other languages with other kinds of dynamic dependencies. For example, the invention could be adapted to deal with function pointers in the C programming language.
Further, the dynamic elements and dependencies hereafter shall be referred to in the terms of the polymorphic elements and dependencies of the Java language.
This invention however is not limited to the polymorphism of Java; this invention may equally be applied to other kinds of dynamic elements and dependencies.
As general background, Java code uses objects. Objects are instances of classes.
By analogy, an object relates to a class as a building relates to a blueprint:
an object is an instance of a class and a building is an instance of a blueprint. Many objects and buildings may be instances of the respective class and blueprint. All objects in Java are manipulated via "handles", which are similar to "pointers" in C that may only be dereferenced. Handles in Java are typed and may point to any descendent object of the handle type.
All Java classes, including interfaces and arrays, ultimately trace their parenthood to the class "java.lang.Object", defined in the Java Development Kit (JDK). In other words, every interface, object and array extends the class "java.lang.Object", where each particular array, interface or object is an instance of "java.lang.Object".
Java allows objects to inherit properties of other objects. This permits code reuse by different objects and dynamic binding of objects to other objects.
Java is also an interpreted language. As such, objects are dynamically bound to each other only at runtime of the program. The source code is converted to "bytecode"
by the compiler, which is fed to the virtual machine interpreter to interpret and execute.
It can be appreciated that programs in other computer languages, such as ADA, may also be fed to a compiler to generate bytecode which then can be analyzed by the invention.
The bytecode contains all fields and method invocations declared by the program.

A method invocation is comprised of an opcode and a handle which specifies the type of object the method may be invoked against.
Anal~ing Dependencies in a Pro rq am Relating specifically to the invention, analysis of Java element constructs and definitions revealed a set of relationships amongst the elements. These relationships were used to define graph traversal algorithms, properties and linking algorithms for Java polymorphic elements used in the method of the invention and which are apparent to those skilled in the art. A similar analysis can be performed for other programming languages to establish comparable graph traversal algorithms, properties and linking algorithms necessary for the invention.
The relationships are tabulated in Figures 3a, 3b and 3c. Generally, the tables show how elements listed in the SourceNode Type field 50 relate to elements in the TargetNodeType field 51 by relationships in the EdgeType field 52. Figure 3a shows a table of relationships for the language constructs of Java. Figure 3b shows a table of relationships for the method elements of Java. Figure 3c shows a table of relationships for polymorphic elements of Java.
Figures 3a, 3b and 3c also show propagation relationship fields 53 for each element. The propagation relationships are further described hereafter. The propagation relationships shown in Figure 3a, 3b and 3c are defined in accordance with a preferred portability analysis also described hereafter and are used for a transitive closure determination of the elements of a program in accordance with the preferred portability analysis of the invention.
In the invention, for a given polymorphic element in a computer program, the method generates a graph representation representing the dependencies existing amongst the polymorphic element and other elements in the computer program.
The graph reflects the dependencies of the elements defined in the source code.
Every node in the graph represents an element; every arc connects two elements, representing the type of dependency between the connected elements. These graphs are also known as "directed multi-graphs".
In embodiments of the invention, graph representations (also referred to as graphs herein) may implemented in various data structures, including graphs, tables, linked lists, arrays, vectors, hash tables, trees and other structures.
Dependencies between elements include inheritance (an element is a specialization of another element), fields (an element has another element), fields and methods (an element declares another element) and various types of usage relationships (an element uses another element). In Java, the polymorphic dependencies are a special type of usage relationship, although this may be different in a language with different constructs.
Java has three types of polymorphic method invocations: invoke virtual, invoke interface and invoke super. Invoke virtual is the typical dispatch method;
almost all methods, except interface methods, are invoked with invoke virtual. Invoke interface is used to invoke a method signature declared in a Java interface and implemented in some class that inherits from the declaring interface. Invoke super is a particular case of invoke special. Invoke special invokes a method requiring special handling, of which there are three cases: private, constructor and super. The super case is polymorphic.
For each invocation type, the invention generates a special polymorphic node in the graph representation and creates polymorphic links to various elements in the graph representation. These polymorphic nodes and their connections facilitate the static analysis of these dynamic dependencies.
The invention operates on a representation ofthe source code (orthe source code itself as the case may be) of the computer program. The computer program does not have to execute for the method to operate.
Figure 1 shows a block diagram describing the method steps employed by the invention. The source code for the method in a preferred embodiment of the invention is Java code. Figures 2a and 2b provide sample Java source code analyzed by the invention.
In Figure 1, the method begins at START 1. At Step 2, the language specification relating to the source code is analyzed to determine graph traversal algorithms for visiting affected nodes in the graph representation being built. It can be appreciated that the algorithms need only visit affected nodes and not the entire graph. It can be further appreciated that after the algorithms are established, they do not have to be redetermined and may be reused by subsequent applications of the invention.
Step 3 translates the source code into an equivalent representation usable by the method. In a preferred embodiment of the invention, Java source code is translated into bytecode. It can be appreciated that other formats of the source code, including the source code itself, may be used by the method and, as can be appreciated, in the case of using the source code itself this translating step is unnecessary. However, in a preferred embodiment of the invention, the bytecode provides a representation of the elements in the source code for the method which can be easily parsed and analyzed.
Next, at Step 4a, the bytecode is analyzed to generate a graph representation containing nodes of all static and polymorphic elements in the code. The creation of nodes for polymorphic elements for the graph is a key feature of the invention, as their inclusion into the graph allows them to be analyzed as part of the otherwise static graph.
In Step 4b, links representing static dependencies existing amongst the static elements are made amongst the nodes of the static elements.

Step 5 links the polymorphic nodes to all static element nodes affecting them with relationship arcs based on the characteristics of the polymorphic elements.
In a preferred embodiment of the invention, the polymorphic nodes are linked to the affected nodes in the graph representation by signature arcs and implementation arcs. These arcs set the dependencies of polymorphic elements to other elements. The signature arc connects the polymorphic element node to a node that defines its interface.
Implementation arcs connect the polymorphic nodes to element nodes containing code implementing the polymorphic element.
The graph now contains all static and dynamic dependencies of the relevant elements, including the dynamically bound elements) represented by the polymorphic node(s).
Step 6 traverses the graph representation and produces an output representing the dependency characteristics of the graph. An output of dependencies can be in Rigi Standard Form (RSF), a simple entity relation text file format defined by Dr.
Hausi Muller, comprising the form: [relation] [source entity] [target entity]. It can be appreciated that other output formats can be used. The output may, for example, be displayed, stored and/or further processed The method is completed at END 7.
Figure 4 shows a diagram of a graph representation embodying relationships of elements in the sample Java source code in Figures 2a and 2b. Figure 4 includes a series of nodes for the elements in the source code connected by three types of directed arcs, namely implements arcs, extends arcs and method declaration arcs. Implements and extends arcs are species of inheritance arcs. These directed arcs are used to show the static relationships between the elements.

In Figure 4, nodes include Primitive 11, ThreeD 18, Draw 17, Shape 12, Point 13, Line 14, Rectangle 15, Triangle 16, Square 22, Cube 60, Pyramid 61, and Prism 62. Point 13, Line 14, Rectangle 15, Triangle 16, Square 22, Cube 60, Pyramid 61, and Prism 62 are concrete classes whereas Shape 12 is an abstract class. Note that Primitive 11 represents a geometric primitive class, as opposed to a Java primitive data type.
Implements arcs 19 are shown between Primitive and each of Shape, Point, and Line. As such, Shape, Point, and Line depend on the declarations of Primitive.
If Primitive declares a field named colour, Shape, Point, and Line will refer to the same field, colour.
If Primitive declares a method named colour, then the concrete classes Point, Line, Rectangle, and Triangle implement that colour method. Implements arcs are also shown between ThreeD and each of Cube, Pyramid, and Prism.
Draw nodes are declared for and connected by method declaration arcs 21 to each of Primitive, Point, Line, Rectangle, Triangle, Cube, Pyramid, and Prism.
Extends arcs 20 are shown between Shape and each of Rectangle and Triangle;
as such the class of Shape abstract is inherited by Rectangle and Triangle.
Moreover, each of Rectangle and Triangle extend Shape. As shown in Figure 4, further extends arcs are defined between the various nodes indicating the inheritance and multiple inheritance relationships of the nodes.
Figures 5a, 5b and 5c show tables of relationships between the nodes of graph 10, expressed through sets of node declarations, inheritance arcs (including implements and extends arcs) and method declaration arcs, respectively. The relationships described in Figures 5a, 5b and 5c are expressed in Rigi Standard Form (RSF) Figures 6, 7 and 8 show Figure 4 after different polymorphic method invocations have been applied to graph 10. A polymorphic node is created to represent each invocation's dependencies in the graph. The invention connects the polymorphic node to the program elements on which they depend, using a different algorithm for each type of polymorphic method invocation (invoke virtual (Figure 6), invoke interface (Figure 7) or invoke super (Figure 8)). Connections between the polymorphic node to the program elements are made with signature arcs and implementation arcs. Where appropriate, same reference numbers are used with the suffix "a" used for references in Figure 6, the suffix "b" used for references in Figure 7 and the suffix "c" for references in Figure 8.
Except as provided, all nodes and relationships are the same as graph 10 in Figure 4.
Figure 6 shows graph 10a after an analysis of an invocation of invokevirtual method 31 for the method Shape.draw(). Polymorphic node 24 for Shape.draw() is inserted into graph 10a with signature arc 25 connecting to draw() node 23 declared by Primitive 11a and implementation arcs 26 connecting to other draw() nodes. As can be appreciated, method Shape.drawQ is just one example of a method for which the invokevirtual method may be invoked. Different methods (optionally, of different classes) may equally be invoked and analyzed as well as more than one method may be invoked and analyzed. Representative Java code for the invocation of method Shape.draw() 23 is shown in Figure 2b at 36. The invokevirtual opcode is contained in the body of the Application.main() method. Graph 10a shows all dynamic dependencies of the polymorphic element Shape.draw() through the attached signature and implementation arcs.
Figure 7 shows graph 10b after an analysis of an invocation of invokeinterface method 32 for the method Primitive.draw(). Polymorphic node 28 is inserted into graph 10b with signature arc 29 connecting to draw() node 27 declared by Primitive 11 b and implementation arcs 30 connecting to otherdraw() nodes. Again, method Primitive.drawQ
is just one example of a method for which the invokeinterface method may be invoked.
Different methods (optionally, of different classes) may equally be invoked and analyzed as well as more than one method may be invoked and analyzed. Representative Java code for the invocation of method Primitive.drawQ is shown in Figure 2b at 37.
The invokeinterface opcode is contained in the body of the Application.mainQ
method.

Figures 6 and 7 illustrate that the method invocation contained in the Application.main() does not always cause the same body of code to execute.
This is the nature of the dynamic dependencies present in Java.
Figure 8 shows the graph representation after an analysis of an invocation of invokesuper33 ofthe invokespecial method group, against the Rectangle.drawQ
method.
The invokespecial method is contained in the body of the Cube.drawQ method. A
polymorphic node for Square.draw()~s 35 is shown with implementation arc 39 connecting to Square.drawQ node 34. Once again, method Rectangle.drawQ is just one example of a method for which the invokesuper method may be invoked. Different methods (optionally, of different classes) may equally be invoked and analyzed as well as more than one method may be invoked and analyzed. Representative Java code for the invocation of method Rectangle.draw() is shown in Figure 2a at 38. The method of the invention determines when the invokespecial opcode represents the invokesuper case.
Also note that a method may be invoked by both the invoke virtual opcode and the invoke super opcode from different points in the program. In this event, two distinct polymorphic nodes would be created, one for each type of invocation. These two polymorphic nodes would be connected to the rest of the graph in different ways. For example, suppose that the application contains an invoke virtual against Square.draw() then a polymorphic node Square.draw()w would be created. Square.draw()w would have implementation arcs connecting to Rectangle.drawQ and Cube.draw(), while Square.draw()~s would only have an implementation arc to Rectangle.drawQ (as shown in Figure 8). This illustrates creating a different node type for each type of polymorphic invocation.
Figures 9a through 9d show sections of Java code implementing the graph insertion and connection routines for invoke interface (Figure 9b), invoke virtual (Figure 9d), invoke super (Figure 9c) and polymorphic nodes (Figure 9a).
PolymorphicNode.java is the parent class of InvokelnterfaceNode.java, InvokeVirtual.java and InvokeSuper.java.
As the parent class, it contains code that is common to all three.
Figures 10 (invoke virtual), 11 (invoke interface) and 12 (invoke super) show the pseudo code for the Java code in Figures 9d, 9b, and 9c respectively. This code describes the graph traversal algorithms that connect the polymorphic node to the method nodes it depends on. Referring to an earlier example, these traversals reveal what SHAPE objects the ARTIST object may draw. These traversal algorithms were developed from a study of the Java language, and are apparent to those skilled in the art.
These particular algorithms have been optimized to visit the fewest nodes necessary, and it can be appreciated that other algorithms may be used.
Figure 13 shows the pseudo code for the Java code contained in Figure 9a. The purpose of this code is to examine ("visit") a particular Classorlnterface node to determine if it declares a method node that the polymorphic node in question is dependent on. If so, the appropriate arc is created between the polymorphic node and the method node it depends upon. This code is common to all three traversal algorithms, the other three algorithms direct this code to visit/examine the appropriate nodes.
Transitive Closure Determinations In another preferred embodiment, an analysis tool can use the graph representation produced by the invention to evaluate two general transitive closure determinations of the graph, the portability analysis and the requirements analysis. For both determinations, the output is a graph, expressed in RSF format. Other output formats may also be used such as a graphical representation, a table or any other structure and the output may, for example, be displayed on a computer screen or further processed by a software application in accordance with the result of the analysis.
For both determinations, a generalized framework for computing these transitive closure determinations is used. In the framework, the analysis propagates a signal through the nodes of the graph representation starting from an initial node of interest. The propagation may proceed in a depth-first, breadth-first or any other manner.
The results of the signal propagations provide information on the transitive closures of the elements of the program represented by nodes in the graph.
At the outset, all nodes of the graph are set to a default value, such as zero. Next, a table representing the initial states of the nodes of the graph is used as the input for the analysis. The table is provided by the user or can be generated by a tool, in either case in accordance with the type of analysis to be performed and the particular code being analyzed. Using the table in a preferred embodiment, signal levels equal to, or greater than zero are assigned to the nodes.
The analysis then iteratively propagates signals at specified nodes to nodes associated with the specified nodes, and nodes associated with them, and so on, until the graph reaches a steady state. The signals being propagated can be modified by the specified node or arcs connected to the specified node; each node and/or arc are assigned particular modifiers or rules which may either accept or reject the signals. In a preferred embodiment of the analysis, signals at nodes may only increase. When a signal reaches a certain node, the modifier either accepts the signal and propagates the signal onward or rejects the signal. Further, in the preferred embodiment, each modifier limits the maximum signal that a node can be raised. These modifiers or rules are described merely as an example and are not intended as a limitation. Other modifiers or rules may equally be employed depending on the result sought from the analysis and/or the computer language of the code being analyzed. Each transitive closure determination has different signal propagation characteristics applicable specifically to it.
Eventually, the nodes in the graph will reach a steady state after the signals are propagated through the graph. The steady state is in essence defined by the rules. The steady state system is the output of the analysis. According to the rules and the content of the table, the output of the analysis can be interpreted by the user or another software application to determine the results shown by the analysis.
Figures 14a through 14d show code for the signal propagation routine described above. The Graph.propagateQ routine of Figure 14a calls Node.sendQ routine of Figure 14d, which calls the ArcBehaviour.transmit() routine of Figure 14b which filters the signal according to rules developed through analysis of the constructs of the Java language.
The filtered signal is passed to Node. receive() routine of Figure 14d, where the receiving node chooses whether to accept the signal or not (only signals with a greater value than the current level are accepted). If the node accepts the signal, the node is placed on the list for the next round of propagation.
Portability Analysis The first transitive closure determination propagates signals through a graph having a set of unusable elements to provide information on elements affected by the unusable elements. This is referred to as "portability analysis". Given a list of unusable elements, the analysis tool traverses the graph and determines a list of elements directly and indirectly related to any of the unusable elements. These related elements also cannot be used. A practical example of this type of analysis would be determine the portability of a Java application from one release of the JDK to another release of the JDK.
Figure 15 shows a set of selected Java routines of a preferred embodiment of the invention providing portability analysis through signal propagation analysis.
There are four different categories of arc signal propagation relationships for Java: Alpha, Beta, Gamma and Delta. Figure 15 shows Gamma. See also Figures 3a, 3b and 3c for a listing of arc signal propagation relationships 53 for certain combinations of SourceNode Type, EdgeType and TargetNode Type.

Within each category, three signal levels exist: FULL (0), PART (1 ), and NONE

(2).
FULL represents a fully supported node; it will function even though other nodes may not.
PART identifies a partially supported node. NONE identifies an unsupported and nonfunctioning node. For other types of analysis, different sets of arc signal processing behaviour are applicable.
For the four categories of arc signal propagation relationships, the following signal level propagation relations were determined:

1 ) ALPHA
FULL -> FULL
PART -> PART
NONE -> NONE
In the ALPHA arc signal propagation relationship, a FULL signal level of a node would propagate as a FULL signal level to a related node. Similarly, a PART signal level would propagate as a PART signal level and a NONE signal level would propagate as a NONE
signal level.
2) BETA
FULL -> FULL
PART -> PART
NONE -> PART
The signal levels in the BETA arc signal propagation relationship propagate like the ALPHA arc signal relationship except that a NONE signal level would propagate as a PART signal level.

3) DELTA
FULL -> FULL
PART -> FULL

NONE -> NONE
Like the BETA arc signal propagation relationship, the signal levels in the DELTA arc signal propagation relationship propagate like the ALPHA arc signal relationship except that a PART signal level would propagate as a FULL signal level.

4) GAM MA
The GAMMA relationship is not a simple integer mapping and is only used for the polymorphic nodes. The polymorphic node is dependent on each node connected to it by an implementation arc. As such, the state of the polymorphic node falls into one of three categories:
1. All nodes on which the polymorphic node depends are functioning (level FULL(0)). The state of the polymorphic node is FULL(0). As such, all of the options are available to the polymorphic node.
2. All nodes on which the polymorphic depends are not functioning (level NONE(2)). The state of the polymorphic node is NONE(2). As such, none of the options are available.
3. Any other case that is not one of the above. The polymorphic node will be level PART(1 ). Some of the options may be available.
The characteristics of GAMMA are analogous to a capacitor, where there must be a buildup of charge (NONE signals in the nodes on which the polymorphic node is partially dependent) before the signal is discharged (the polymorphic node changes its state to NONE and transmits this signal).
Using the signal propagation analysis described above in a preferred embodiment, a table would be provided indicating the appropriate signal levels for the elements (represented by nodes in the graph) of a program. Using the portability between JDK
releases example of above, a Java program may be analyzed to determine that certain elements of the program will be fully supported between releases (FULL), partly supported (PART) and not supported (NONE). Further, the signal propagation analysis described above would be provided the arc signal propagation relationships (modifiers or rules) as described above, namely the ALPHA, BETA, DELTA and GAMMA, by which the signal propagation should proceed. Signal propagation would be initiated and proceed to a steady-state. At steady-state, the results of the analysis can be assessed. Again, using the portability between JDK releases example of above, the user can fully determine what elements of the Java program are supported when moving from one release of the JDK to another release.
Requirements Analysis The second determination provides a "requirements analysis" forthe graph.
Given a list of required elements, the analysis tool traverses the graph and determines a list of other elements directly and indirectly used by the required elements. The other elements must function correctly for the required elements to function correctly. A
practical example of such an analysis is to reduce the Java classes required for a Java application to a reduced subset in order to use the minimized Java application in a small memory device or embedded system.
There are two types of requirements analysis. First, a reduced subset requirements analysis examines the need for complete class files, as opposed to individual fields and methods. Figure 16 shows a set of selected Java routines of a preferred embodiment providing reduced subset analysis. If the class file is required, then it is included in the reduced subset. Like the portability analysis, a set of arc signal propagation relationships (modifiers or rules) as well as a table of initial signal levels are provided to the signal propagation analysis described above in order to perform the reduced subset requirements analysis to determine the required class files.

Second, an optimized subset requirements analysis evaluates the need for individual fields and methods, not complete class files. Figures 17 shows a set of selected Java routines providing optimized subset set analysis. Like the reduced subset analysis, a set of arc signal propagation relationships (modifiers or rules) described below as well as a table of initial signal levels are provided to the signal propagation analysis described above in order to perform the optimized subset requirements analysis to determine the need for individual fields and methods.
Unlike the simple "required" or "not required" analysis of the reduced subset requirements analysis, the optimized subset requirements analysis must consider dynamic dependencies between methods caused by polymorphic invocations. In terms of signal propagations, this introduces an indeterminate state between "required" and "not required" of the reduced subset requirements analysis.
Referring to Figure 17, the InspectorArcBehaviour class identifies these indeterminate states in the fields and methods in a class file. The InspectorArcBehaviour class selects a signal to transmit based on the level of both the sending and receiving nodes. In other words, in addition to evaluating the level of the sending node, it also inspects the receiving node when determining the signal to transmit.
For the optimized subset requirements analysis, two partial states used by the method nodes include PART BY CLASS and PART BY INVOKE. PART BY CLASS
signifies that a method may be required because the class that declares it is required.
PART_BY_I NVOKE signifies that a method may be required because a polymorphic node that depends on it is required (see alpha behaviour in Figure 17). A method is required if both the class that declares it is required and a polymorphic node that depends on it is required; otherwise, the method is not required. So, if the method meets only one of the PART_BY states, the method will not be required (the delta behaviour in Figure 17).
The InspectorArcBehaviour class is used when transmitting a signal to a method to process the partial state of the method node. First, the process determines the current state of the method node by inspecting it. The signal sent to the method node is dependent on the incoming signal and the current state of the node. Two configurations of InspectorArcBehaviourare used. The first configuration acts on the implementation arc.
The other configuration acts on the two arcs for method declarations (arc Constructor and arc_Method).
It can be appreciated that the invention may be embodied using other mapping characteristics with other program construction paradigms.
The invention may be implemented on a stand-alone basis, integrated into an application wherein the invention is a feature such an integrated software development environment or integrated into an application to further process the results of the analysis and/or provide the variable inputs to implement the present invention such as for example a portability analysis or a requirements analysis.
The invention may be implemented as a program storage device readable by a data processing system, tangibly embodying a program of instructions, executable by said data processing system to perform the method steps of the invention. Such a program storage device may include diskettes, optical discs, tapes, CD-ROMS, hard drives, memory including ROM or RAM, computer tapes or other storage media capable of storing a computer program.
The invention may also be implemented in a computer system. In a preferred embodiment, a system is provided comprising a computer program operating on a data processing system, with the computer program embodying the method of the invention and producing an output of the method on a display or output device. Data processing systems include computers, computer networks, embedded systems and other systems capable of executing a computer program. A computer includes a processor and a memory device and optionally, a storage device, a video display and/or an input device.

Computers may equally be in stand-alone form (such as the traditional desktop personal computer) or integrated into another apparatus (such as a cellular telephone).
While this invention has been described in relation to preferred embodiments, it will be understood by those skilled in the art that changes in the details of processes and structures may be made without departing from the spirit and scope of this invention.
Many modifications and variations are possible in light of the above teaching.
Thus, it should be understood that the above described embodiments have been provided byway of example rather than as a limitation and that the specification and drawing are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims

1. A method for determining a dynamic dependency of a dynamic element in a computer program with a plurality of other elements in the computer program, comprising the steps of:
generating a graph representation containing nodes representing said dynamic element and said plurality of other elements;
identifying all targeted elements of said plurality of other elements directly or indirectly affecting said dynamic element; and associating the nodes of said targeted elements to said node of the dynamic element by said dynamic dependency.

2. The method of claim 1, wherein said step of generating a graph representation further comprises associating static dependencies amongst said nodes between which static relationships exist.

3. The method of claim 1 or claim 2, further comprising the step of traversing the nodes of the graph representation and producing an output representing the dependencies characteristics of the graph representation.

4. The method of any one of claims 1 to 3, further comprising the step of determining a set of graph traversal algorithms based on the specification of the computer language and wherein the step of generating a graph representation comprises linking said nodes according to the graph traversal algorithms.

5. The method of any one of claims 1 to 4, wherein said step of identifying all targeted elements comprises identifying elements that define the interface of the dynamic node.

6. The method of any one of claims 1 to 5, wherein said step of identifying all targeted elements comprises identifying elements that contain code implementing the dynamic element.

7. The method of any one of claims 1 to 6, wherein said computer program is programmed in an object-oriented computer language.

8. The method of any one of claims 1 to 7, wherein said graph representation is selected from the group of graph representations consisting of directed multi-graphs, graphs, tables, linked lists, arrays vectors, trees and hash tables.

9. The method of any one of claims 1 to 8, wherein the node for the dynamic element corresponds to a Java language method invocation selected from the group of Java language method invocations consisting of invoke virtual, invoke interface and invoke super.

10. The method of any one of claims 1 to 9, wherein the step of generating a graph representation comprises generating a node for the dynamic element for each type of method invocation used for the dynamic element.

11. The method of any one of claims 1 to 10, wherein the dynamic element is a polymorphic element.

12. The method of any one of claims 1 to 11, further comprising the step of determining a transitive closure set of elements from said graph representation.

13. The method of claim 12, wherein the step of determining a transitive closure set of elements comprises the steps of:
associating a signal value to said nodes;
transmitting said signal values through said nodes according to a set of rules; and determining the transitive closure set of elements from said transmitted signal values.

14. The method of claim 13, wherein the step of transmitting said signal values proceeds in a breadth-first manner.

15. The method of any one of claims 12 to 14, wherein said transitive closure set of elements is a reduced subset of said dynamic element and said plurality of other elements.

16. The method of any one of claims 12 to 14, wherein said transitive closure set of elements is an optimized subset of said dynamic element and said plurality of other elements.

17. The method of any one of claims 12 to 14, wherein said transitive closure set of elements is a portability set of said dynamic element and said plurality of other elements.

18. In a computer program having a plurality of elements associated with each other by a plurality of static and dynamic dependencies, a method for determining for a dynamic element in said plurality of elements, a set of static and dynamic dependencies, said method comprising the steps of:
generating a graph representation containing nodes representing said plurality of elements linked by said plurality of static dependencies;
creating a dynamic node in said graph representation for said dynamic element;
identifying all targeted elements in said plurality of elements directly or indirectly affecting said dynamic element; and associating each node of said targeted elements to said dynamic node, whereby the resulting graph representation encompasses the static and dynamic dependencies existing amongst said plurality of elements.

19. The method of claim 18, further comprising the step of determining a set of graph traversal algorithms based on the specification of the computer language and wherein the step of generating a graph representation comprises linking said nodes according to the graph traversal algorithms.

20. The method of claim 18 or claim 19, wherein said step of identifying all targeted elements comprises identifying elements that define the interface of the dynamic node.

21. The method of any one of claims 18 to 20, wherein said step of identifying all targeted elements comprises identifying elements that contain code implementing the dynamic element.

22. The method of any one of claims 18 to 21, wherein the dynamic element is a polymorphic element.

23. The method of any one of claims 18 to 21, further comprising the step of determining a transitive closure set of elements from said graph representation.

24. The method of claim 23, wherein the step of determining a transitive closure set of elements comprises the steps of:
associating a signal value to said nodes;
transmitting said signal values through said nodes according to a set of rules; and determining the transitive closure set of elements from said transmitted signal values.

25. The method of claim 23 or claim 24, wherein said transitive closure set of elements is a reduced subset of said plurality of elements.

26. The method of claim 23 or claim 24, wherein said transitive closure set of elements is an optimized subset of said plurality of elements.

27. The method of claim 23 or claim 24, wherein said transitive closure set of elements is a portability set of said plurality of elements.

28. In a computer program having a plurality of elements including a dynamic element and a static element, a method of establishing a set of transitive closure elements for said plurality of elements, said method comprising the steps of:
generating a graph representation containing nodes representing said plurality of elements;
identifying all targeted elements of said plurality of elements directly or indirectly affecting said dynamic element;
associating the nodes of said targeted elements to said node of the dynamic element by dynamic dependencies;
establishing signal propagation parameters for said plurality of elements;
transmitting signals through said plurality of elements; and determining said set of transitive closure elements from said transmitted signals.

29. The method of claim 28, wherein said set of transitive closure elements is a reduced subset of said plurality of elements.

30. The method of claim 28, wherein said set of transitive closure elements is an optimized subset of said plurality of elements.

31. The method of claim 28, wherein said set of closure elements is a portability set of said plurality of elements.

32 32. In a computer program language having a plurality of program method elements associated with each other by a plurality of static and dynamic dependencies invoked by one of a plurality of reference types using one of a plurality of dynamic invocations, a method for determining for a target program method element, a target set of static and dynamic dependencies, said method comprising the steps of:
generating a graph representation containing nodes representing said plurality of program method elements linked by said plurality of static dependencies;
creating a polymorphic node for said target program method element in said graph representation;
traversing said graph and identifying a signature node in said graph representation defining an interface of said target program method element; and associating said signature node to said polymorphic node with a signature association, whereby the resulting graph encompasses the static and dynamic dependencies existing amongst said program method elements and said target program method element.

33. The method of claim 32, further comprising the steps of:
traversing said graph and identifying an implemented node directly or indirectly implementing said target program method element; and associating said implemented node to said polymorphic node with an implementation association.

34. A computer system comprising an analyzing computer program operating on a data processing system, said analyzing computer program executing the method steps of any one of claims 1 to 33.

35. A program storage device readable by a data processing system, tangibly embodying a program of instructions, executable by said data processing system to perform the method steps of any one of claims 1 to 33.