GB2356946A

GB2356946A - Computer system user interface design using functional languages

Info

Publication number: GB2356946A
Application number: GB9928266A
Authority: GB
Inventors: Paul Anton Richardson Gardner
Original assignee: Fujitsu Services Ltd
Current assignee: Fujitsu Services Ltd
Priority date: 1999-12-01
Filing date: 1999-12-01
Publication date: 2001-06-06
Anticipated expiration: 2019-12-01
Also published as: GB2356946B; GB9928266D0

Abstract

A computer system user interface, in particular, a graphical user interface (GUI), for an application is designed as a result of consideration of the application itself, in particular one which is written in a functional programming language and consists of object types and the relationships between them. Use of a functional language enables any valid application to be denoted by an evaluation tree, whose leaves denote literals or unbound variables. Static analysis of such an application can produce a maximal evaluation tree denoting the potential evaluation of all application branches. Firstly, it is necessary to define a naming scheme whereby all unbound variables are given an unambiguous name, particularly in recursive constructs. Secondly it is necessary to analyse the application to extract maps of the information it may use, in particular to produce information maps for individual usages of unbound variables. The latter is dependent on the naming scheme. This can be achieved automatically. The resulting maps are then merged to produce one map per variable, which comprises a single "inverted" tree. The user interface, such as a GUI, can then be constructed using this single tree.

Description

2356946 C1418

COMPUTER SYSTEM USER INTERFACE DESIGN

This invention relates to computer systems and in particular to computer system user interface design.

Typically a user interface for a computer system application, for example a graphical user interface (GUI), is designed either completely manually, for example using a Visual Basic form designer, or derived from an aspect of underlying database schemas, for example particular tables. The GUI may be designed first and the application generated subsequently. However, whilst this is acceptable in certain circumstances, when logic is required to be added, for example a GUI box is only required to be filled in certain circumstances, problems with this approach arise.

According to one aspect of the present invention there is provided a method for use in designing a user interface for an application, including the steps of defining the application in a functional language and deducing use of information by the application from the application itself by analysis thereof.

According to another aspect of the present invention there is provided a method for use in the construction of a user interface for an application, the application being defined in a functional language and being in the form of a tree structure, leaves of the tree being comprised by unbound variables of the application, the method including the steps of:

transforming the application whereby to provide each unbound variable with a respective unambiguous name, analysing the transformed application whereby to produce a maximal evaluation tree which denotes the potential evaluation of all application branches and employs the unambiguous names for the leaves thereof, analysing the maximal evaluation tree to produce information maps for individual usages of the unbound variables, sorting the maps into groups in dependence on the unambiguous names, and merging the maps of each said group whereby to produce a corresponding single tree from which a respective user interface can be constructed.

According to a further aspect of the present invention there is provided a method for use in designing a user interface for an application wherein the use of information by an application is automatically deduced from the application itself, the application comprising a plurality of business rules corresponding to a business purpose, being defined in a functional language, and being in the form of a tree structure, leaves of the tree being comprised by unbound variables of the application, the method including the steps of: transforming the application whereby to provide each unbound variable with a respective unambiguous name, analysing the transformed application whereby to produce a maximal evaluation tree which denotes the potential evaluation of all application branches and employs the unambiguous names for the leaves thereof, analysing the maximal evaluation tree to produce information maps for individual usages of the unbound variables, sorting the maps into groups in dependence on the unambiguous names, and merging the maps of each said group whereby to produce a corresponding single tree from which a respective user interface can be constructed.

According to yet another aspect of the present invention there is provided a computerised method for use in constructing a user interface for an application defined in a functional language and having a tree structure, leaves of the tree being comprised by unbound variables of the application, the method including the steps of:

transforming the application whereby to provide each unbound variable with a respective unambiguous name, analysing the transformed application whereby to produce a maximal evaluation tree which denotes the potential evaluation of all application branches and employs the unambiguous names for the leaves thereof, analysing the maximal evaluation tree to produce information maps for individual usages of the unbound variables, sorting the maps into groups in dependence on the unambiguous names, and merging the maps of each said group whereby to produce a corresponding single tree from which a respective user interface can be constructed.

According to yet another aspect of the present invention there is provided a computer system for use in the construction of a user interface for an application, the application being defined in a functional language and being in the form of a tree structure, leaves of the tree being comprised by unbound variables of the application, the computer system including:

means for transforming the application whereby to provide each unbound variable with a respective unambiguous name, means for analysing the transformed application whereby to produce a maximal evaluation tree which denotes the potential evaluation of all application branches and employs the unambiguous names for the leaves thereof, means for analysing the maximal evaluation tree to produce information maps for individual usages of the unbound variables, means for sorting the maps into groups in dependence on the unambiguous names, and means for merging the maps of each said group whereby to produce a corresponding single tree from which a respective user interface can be constructed.

According to a still further aspect of the present invention there is provided a computer readable storage medium having a program thereon, wherein the program is for use in the construction of a user interface for an application, the application being defined in a functional language and being in the form of a tree structure, leaves of the tree being comprised by unbound variables of the application, and wherein the program includes steps for:

transforming the application whereby to provide each unbound variable with a respective unambiguous name, analysing the transformed application whereby to produce a maximal evaluation tree which denotes the potential evaluation of all application branches and employs the unambiguous names for the leaves thereof, analysing the maximal evaluation tree to produce information maps for individual usages of the unbound variables, sorting the maps into groups in dependence on the unambiguous names, and merging the maps of each said group whereby to produce a corresponding single tree from which a respective user interface can be constructed.

Embodiments of the invention will now be described with reference to the accompanying drawings, in which:

Fig 1 illustrates an example name tree expansion,
Fig 2 illustrates further expansion of part of Fig 1,
Fig 3 illustrates an example name recognition tree,
Fig 4 illustrates a deterministic finite automaton derived from the name recognition tree,
Fig 5 illustrates a first expansion with sequential invocations,
Fig 6 illustrates an expansion with nested invocations,
Fig 7 illustrates an example of truncated expansion branches,
Fig 8 illustrates a node layout for higher order functions,
Fig 9 illustrates a basic computer system design, and
Fig 10 is a flow chart showing the basic steps of a method for use in the construction of a user interface for an application.

As mentioned above, typically a user interface for an application is designed completely manually, and the GUI may be designed first and the application generated subsequently. This is perfectly acceptable providing the underlying business rules do not change. If they do change, however, the GUI and possibly the application will need to be altered. An arrangement which enables rule changes to be made easily, without having to manually change the interface or rewrite the application, is thus desirable. Whereas the invention is basically described in terms of designing a graphical user interface for an application, it is not so limited. Rather, it is aimed at interfaces in general, and includes those for optimizing database access. An application can be provided with information via graphical user interfaces, whereby information is input by a human operator, or from a database. Both comprise sources of information for the application.

The present invention proceeds from the realization that the application itself dictates the information requirements and their sequencing. The application is of course written in business terms using business logic, and may for example be used to apply social security benefit rules to claimants, in which case details of the claimants need to be considered, such as names, dates of birth, addresses, children etc, and verified. Analysis of the application can, therefore, elicit the maximal set of information that the application can require, along with the relationships between the pieces of information, as long as both the information model and the application logic are constrained to permit this. It is assumed hereinafter that the applications are written against a well-defined information model, consisting of object types and the relationships between them.

In general terms, the interface design method proposed by the present invention requires that the application which will underlie the interface be written in a functional language and then statically analysed, and this analysis can be carried out automatically. When using a functional language for the application, the application or any valid expression in the application can be denoted by an evaluation tree with literals or unbound variables as its leaves, operators as nodes, operand dependencies as edges, and the root denoting the result. In particular the static analysis of an application written in a functional language can produce a maximal evaluation tree denoting the potential evaluation of all expansion branches. To achieve this, all name bindings need to be substituted with their definitions. There are two aspects to the "naming" involved in this. The first is giving each unbound variable an unambiguous name that identifies its usage; the second is assigning explicit names to expressions within the application. Details of the naming procedures are described hereinafter. The fact that sets of things, such as children, need to be considered means that recursion will be involved, as will be apparent from the following.
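The evaluation-tree view described above can be illustrated with a minimal Python sketch. The class names `Literal`, `Unbound` and `Op`, the helper `unbound_leaves`, and the example condition are assumptions made for illustration only; they are not part of the described language.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Literal:          # leaf: a constant value
    value: object


@dataclass(frozen=True)
class Unbound:          # leaf: an unbound variable
    name: str


@dataclass(frozen=True)
class Op:               # inner node: an operator applied to its operands
    symbol: str
    operands: Tuple["Expr", ...]


Expr = Union[Literal, Unbound, Op]


def unbound_leaves(expr):
    """Collect the names of unbound-variable leaves, left to right."""
    if isinstance(expr, Unbound):
        return [expr.name]
    if isinstance(expr, Op):
        return [n for operand in expr.operands for n in unbound_leaves(operand)]
    return []


# Hypothetical condition: age < 10 or age > 20
tree = Op("or", (Op("<", (Unbound("age"), Literal(10))),
                 Op(">", (Unbound("age"), Literal(20)))))
leaves = unbound_leaves(tree)
print(leaves)
```

The root `Op("or", ...)` denotes the result, and each appearance of the unbound variable is a separate leaf, which is why a naming scheme for individual usages is needed.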

The maximal evaluation tree, achieved with the above naming applied to it, is then analysed to identify the sub-trees that represent information model access. In the absence of recursion, each sub-tree represents a potential traversal of a set of object instance graph edges at runtime by the application. When recursion occurs this is denoted by one or more parts of a path containing sections that may be arbitrarily repeated. Each tree is rooted at an unbound variable which in effect defines the entry point to the information graph.

The sub-trees are then grouped in accordance with their unambiguous names and the groups are merged to produce a respective single inverted tree. An example, for a young person's allowance, where the entry point is "claimant" and involving the checking of the age of the claimant and parental details, may involve four sub-trees:

    claimant.Mother.Address.ZIP
    claimant.Mother.Alive
    claimant.Father.Alive
    claimant.Age

these would be merged into

    claimant
      + Mother
          + Address - ZIP
          + Alive
      + Father - Alive
      + Age

This, it can be seen, bears a close relationship to a GUI that could be used to interact with the application. First a lookup via "claimant" to find the person in question. The mother and the mother's address are then checked, together with the alive/dead status of the mother. A similar check of the father of the claimant is also made, as well as the age of the claimant.
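The merging of the four sub-trees into one inverted tree can be sketched as follows. This is only an illustration of the merge step under the assumption that each sub-tree is written as a dotted path; the function names `merge_paths` and `render` are hypothetical.

```python
def merge_paths(paths):
    """Merge dotted access paths that share an entry point into one nested tree."""
    tree = {}
    for path in paths:
        node = tree
        for segment in path.split("."):
            # Reuse an existing branch if present, otherwise create it.
            node = node.setdefault(segment, {})
    return tree


def render(tree, depth=0):
    """Return the tree as indented lines, one node per line."""
    lines = []
    for name, subtree in tree.items():
        lines.append("  " * depth + name)
        lines.extend(render(subtree, depth + 1))
    return lines


paths = [
    "claimant.Mother.Address.ZIP",
    "claimant.Mother.Alive",
    "claimant.Father.Alive",
    "claimant.Age",
]
merged = merge_paths(paths)
print("\n".join(render(merged)))
```

Shared prefixes such as "claimant.Mother" collapse into a single branch, which is what gives the merged tree its resemblance to a nested form layout.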

It is not believed necessary to give complete details of the GUI construction herein. A fully functional designer would combine the inverted trees, references to the rules from which they had been derived and the information model components that various parts represent, with drag and drop facilities for adding such components to a form. An overview is given in the following which is considered sufficient for understanding how a full implementation may be performed.

Attention is now given to considering various aspects of the method in greater detail. As will be apparent, the applications for which the interfaces are to be designed need to be written against a well-defined information model. A simple model consisting of object types and the relationships between them is assumed.

Object types can have attributes of varying type, such as String, Date and Integer. Relationships are one-to-one, optional and can be traversed in either direction. Traversal of relationships is via attributes of type 'reference' on the object instances involved, whose value denotes the other object instance participating in the relationship if one exists, and is null otherwise.

Instances of such an information model thus consist of a collection of isolated object instances and zero or more graphs of connected object instances. Such a graph has nodes denoting the object instances, and the relationships between the object instances are denoted by the edges between the nodes. Traversal of such graphs is performed by entering the graph via a lookup operation and then traversing the edges that correspond to relationships. For simplicity, only read-access is considered below, that is, the information model instance is considered to have been pre-defined and fixed.

For example, an information model could have two object types, Person and Address. These could be related by the "lives-at/is-lived_in_by" relationship. An instance of this model would then consist of a number of Person and Address instances, some of which could be related to each other to produce pairs of Person/Address instances (the only permissible graph in this case due to the one-to-one simplification).
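A minimal sketch of this two-type model is given below, assuming Python dataclasses stand in for object types and optional attributes stand in for 'reference' attributes. The attribute names `lives_at`, `is_lived_in_by` and `zip_code`, and the helper `relate`, are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Address:
    zip_code: str
    # Reference attribute: the related Person, or None if unrelated.
    is_lived_in_by: Optional["Person"] = None


@dataclass(eq=False)
class Person:
    name: str
    # Reference attribute: the related Address, or None if unrelated.
    lives_at: Optional[Address] = None


def relate(person, address):
    """Establish the one-to-one relationship in both directions."""
    person.lives_at = address
    address.is_lived_in_by = person


alice = Person("Alice")
bob = Person("Bob")          # an isolated instance
home = Address("12345")
relate(alice, home)

# Traversal in either direction returns to the starting instance.
print(alice.lives_at.is_lived_in_by is alice)
```

Here `alice` and `home` form one Person/Address pair, while `bob` remains an isolated object instance whose reference is null.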

A functional programming style is required in order to permit simple reasoning about program semantics, in particular access to the information model. By using a functional language there is provided the ability to denote any valid expression by an evaluation tree with literal or unbound variables as its leaves, operators as nodes, operand dependencies as edges, and the root denoting the result. The language supports the following constructs:

1. Primitive types as commonly found in programming languages, such as integers, floating point values, booleans etc.

2. Constructed types, the only one of which will be considered here is a list, namely a sequence of instances of primitive types, possibly empty. The empty list is denoted by "nil". Elements are added to a list by the "cons" operator, "::". Operators "head" and "tail" are available for accessing list elements.

3. Information model types. Object types defined in the information model are implicitly available in the language environment. For example if a Person object type is defined in the model, then a language type "Person" is defined.

4. Literals. These are used to denote constant values of primitive types, such as true, false, 109.5.

An example list would be 10::true::"wibble"::nil.

5. Operators. Common operators are supported for arithmetic, boolean operations (comparators, logical and, or, not etc) and basic conditional logic such as if-then-else.

Object attributes are accessed via the (infix, binary) "attribute accessor" operator, '.'. For example, person.sex;

6. Naming. An expression can be named and then used by other expressions within the scope of the name via the "let" keyword:

    let <name> := <expression>;

For example,

    void doIt( Person claimant ) {
        let age := ageOf( claimant );
        if age < 10 or age > 20 then ... else ... endif;
    }

To simplify the explanation, named expressions can only be defined at the function definition scope, as in the above example. They cannot be declared within inner expressions.

7. Unbound variables. Traditionally, input, that is the binding of values to variables within an evaluation dynamically, is performed by ad-hoc invocations of IO primitives within an expression. Within functional languages such an approach is not generally considered applicable due to the side-effects destroying the ability to both exploit inherent parallelism and reason about applications.

One often used approach is to explicitly pass an input stream to functions that need it, using some kind of sequencing mechanism to enforce evaluation order (e.g. monadic operators). However, an alternative mechanism is defined here. Rather than considering an input stream as some ordered list of values, input is considered as a set of name/value pairs. The names are declared at the point of usage as "unbound". For example,

    Person findPerson() {
        unbound String customer_id;
        fn( customer_id );
    }

This implies an entry in the input set with name "customer_id". Usage of the name during evaluation will trigger the acquisition of a value for it from the input set. A user of such an evaluation needs to understand the fact that this information may be required and be able to distinguish it from requests for other data.
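The idea of input as a set of name/value pairs, with acquisition triggered by usage, might be sketched as below. The `InputSet` class, its `requested` log and the sample values are assumptions for illustration; the described language declares unbound variables rather than calling an object like this.

```python
class InputSet:
    """Hypothetical input set: name/value pairs consulted only on use."""

    def __init__(self, values):
        self._values = dict(values)
        self.requested = []          # names actually asked for, in order

    def get(self, name):
        # Using a name triggers acquisition of its value from the input set.
        self.requested.append(name)
        return self._values[name]


def find_person(inputs):
    # Stands in for: unbound String customer_id; fn(customer_id);
    customer_id = inputs.get("customer_id")
    return {"id": customer_id}


inputs = InputSet({"customer_id": "C-42", "surname": "Smith"})
person = find_person(inputs)
print(person, inputs.requested)
```

Only "customer_id" is requested even though the input set holds other entries, which reflects that input is identified by name rather than by position in a stream.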

Obviously there is a need to cater for recursion and the existence of different occurrences of the same local name. This is described below.

To simplify the explanation, unbound variables can only be defined at the function definition scope, as in the above example. They cannot be declared within inner expressions.

8. Function Definition. A function is defined by binding a name to an expression and identifying formal parameters.

For example,

    int sumOfSquares( int p1, int p2 ) {
        (p1 * p1) + (p2 * p2);
    }

Function invocation is by quoting the function name along with its actual parameters:

    sumOfSquares( 10, squareRoot(16) );

Parameter binding is via substitution, not value. That is, the above invocation is equivalent to

    let int p1 := 10;
    let int p2 := squareRoot(16);
    (p1 * p1) + (p2 * p2);

not

    sumOfSquares( 10, 4 );

Indeed, in general, all name bindings are performed by substitution. Thus the example above expands to:

    (10 * 10) + (squareRoot(16) * squareRoot(16));

It can be seen that such an approach, if implemented blindly, leads to massive duplication of effort (here the square root will be calculated twice). Indeed, any plausible implementation of this language needs to ensure that common evaluations are only performed once. However, for the purpose of analysis such a logical view of evaluation is beneficial.

Note then that such a naive expansion results in a tree structure, not a graph.
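Binding by substitution can be sketched with expressions held as nested tuples. The tuple encoding and the helper names `substitute` and `count` are assumptions for illustration, not part of the described language.

```python
def substitute(expr, bindings):
    """Replace parameter names in expr by the expressions bound to them."""
    if isinstance(expr, str):
        return bindings.get(expr, expr)      # a name (or operator symbol)
    if isinstance(expr, tuple):
        return tuple(substitute(e, bindings) for e in expr)
    return expr                              # a literal


def count(expr, target):
    """Count occurrences of the sub-expression target within expr."""
    if expr == target:
        return 1
    if isinstance(expr, tuple):
        return sum(count(e, target) for e in expr)
    return 0


# Body of sumOfSquares: (p1 * p1) + (p2 * p2)
body = ("+", ("*", "p1", "p1"), ("*", "p2", "p2"))
argument = ("squareRoot", 16)

expanded = substitute(body, {"p1": 10, "p2": argument})
print(expanded)
print(count(expanded, argument))
```

The argument expression appears twice in the expanded form as two separate sub-trees, so the result is a tree rather than a graph with a shared node.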

Any non-trivial program will require recursion. Assuming that the above squareRoot function was not a language primitive, this would be defined by, for example, a recursive binary chop. Recursion prohibits expression expansion prior to evaluation, as this would never terminate. Rather an "applicative order" approach is required that only performs the expansion when required. However, for analysis it is still possible to perform a restricted expansion that still captures the essence of all potential evaluations.

In order to permit all unbound variables to be given an unambiguous name, particularly in recursive constructs, it is necessary to define a naming scheme. It is also necessary to define a scheme that permits the use of information by an application to be deduced from the application itself, in order to permit potential interaction models with the application to be calculated. This is dependent on the naming scheme as, typically, information access commences from the use of unbounded data to enter an 'information graph'.

The naming scheme involves the transformation of the application to explicitly name unbound variables. Such a variable can be introduced anywhere in an application, as mentioned above, at function definition scope. For example,

    Person findPerson() {
        unbound String customer_id;
        fn( customer_id );
    }

It can be seen that simply naming the variable as "customer_id" is in general insufficient. Firstly, there may be other separate occurrences of a variable with the same name. Secondly, there may be more than one invocation of the function "findPerson", and each such invocation effectively requires a separate "customer_id" variable. One special instance of the latter case is a recursive function that includes an unbound variable. There is no statically determinable limit to the number of unbound variable instances required.

The transformation uniquely identifies such variables as follows:

Firstly, add two implicit formal parameters to each function as follows:

    fn( p1, ... pn )

becomes

    fn( list _context, set _unboundvars, p1, ... pn )

The "_context" parameter consists of a list of identifiers that uniquely identify the function's position in the tree. The "_unboundvars" parameter denotes the set of unbound variables in the evaluation space.

A second modification involves the insertion of the corresponding actual parameters at function invocation points. Each function invocation within a given function is given a unique identifier, 'x' (within that function). Each function invocation then passes x::_context as its first parameter. A function-unique identifier is used, rather than a global one, as this prevents changes to other functions from affecting this function's naming scheme.

    fn( list _context, set _unboundvars, int p1 ) :=
        fx( 1::_context, _unboundvars, 10 ) + fy( 2::_context, _unboundvars, p1 );

At the outermost level in the evaluation the "_context" parameter is set to the empty list, nil.

A variable is then named relative to the "_context" parameter of the function within which it is declared.
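The context-passing transformation can be mimicked directly in Python as a rough sketch. The functions `main` and `wibble` are hypothetical, a tuple stands in for the "_context" list, and the `names` list plays the part of the "_unboundvars" parameter. Both branches of a conditional are visited, as a static expansion would do.

```python
def qualified(name, context):
    """Format a fully qualified name such as w{1::nil} from a context tuple."""
    return name + "{" + "".join(f"{ident}::" for ident in context) + "nil}"


def wibble(context, names, p):
    names.append(qualified("w", context))    # unbound int w, named by context
    return p + 45


def main(context, names):
    names.append(qualified("v", context))    # unbound int v
    # Each invocation prepends its function-unique identifier: x::_context
    wibble((1,) + context, names, 0)         # invocation 1
    wibble((2,) + context, names, 0)         # invocation 2


names = []
main((), names)                              # outermost context is nil
print(names)
```

The two invocations of `wibble` yield distinct names for the same local variable because each call site extends the context with its own identifier.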

The term "name tree" is used to describe the expanded tree that represents all possible namings within an application. Any actual evaluation will involve a (non-proper) subset of this tree of names.

Expansion occurs by substitution of function invocations by their definitions. For example,

    void main() {
        unbound int v;
        if v > 3 then wibble(v) else wibble(v/2) endif;
    }
    int wibble( int p ) {
        unbound int w;
        p + 45 + w;
    }

becomes (with the _context values inserted)

    void main( nil ) {
        unbound int v;
        if v > 3 then
            // wibble( 1::nil, v ):
            unbound int w;
            p + 45 + w;
        else
            // wibble( 2::nil, v/2 ):
            unbound int w;
            p + 45 + w;
        endif;
    }

The syntax "<unbound_variable_name>{ <list_expr> }" is used to denote the fully qualified names of unbound variables, <list_expr> being the value of the "_context" parameter to the containing function.

There are thus three unbound variables in this example:

    v{nil}, w{1::nil}, w{2::nil}

As mentioned above, for any non-trivial program, recursion will be required. In this case it is not possible to fully expand the name tree. The key to performing a meaningful expansion that still captures the intrinsic structure of the evaluation is to note that although the expansion is infinite, it is produced by repeated application of function substitution where the functions come from a finite pool (i.e. the program itself is not infinite). Thus, at some point such a recursive substitution will be found to already have been made along the path between the substitution point and the root.

Consider the following program:

    void main() {
        fn(10);
    }
    int fn( int p ) {
        unbound int v;
        if p <= 1
        then square( v )                    // function call 1
        else if even( p )                   // function call 2
             then 2 * fn( fn( p/2 ) + 1 )   // function calls 3 and 4
             else 3 * fn( p - 1 )           // function call 5
             endif
        endif
    }

The name tree, curtailed by stopping at recursion points, can be pictured as in Fig 1. Nodes 1, 5, 7, 10, 11, 13 and 16 are operators, node 2 is a function call node, node 3 is a scope node, node 4 is an unbound node, nodes 8, 9, 12, 14 and 17 are literals, and nodes 15 and 18 are recursive function call nodes. (Note the introduction of the 'scope [definition]' node to represent the unbound variable in the expansion; other empty scopes are not shown for reasons of clarity.)

With the approach taken for the non-recursive case above this would lead to a single name:

    v{1::nil}

However, this does not capture the recursion. What needs to be captured are all the possible names that can be produced by recursion. To do this note that the other names that can be produced are derived by repetitive application of the name extensions defined by the edges between recursion nodes. In the above example there are two such paths, joining recursion nodes 15 and 18 to the initial invocation of 'fn(10)' (node 2). These paths correspond to name extensions of '3' and '5'. These recursive portions can thus be repeatedly applied to the names of any unbound variables contained within the original 'fn(10)' invocation, giving:

    v{[3::|5::]1::nil}

where '|' is an 'or' operator and '[...]' denotes 0 or more applications of '...'. Thus a name such as v{3::5::5::3::1::nil} is an example of a real fully-qualified name that could be generated at evaluation time. Note that any such real name can be derived by exactly one expansion of a recursive name, that is, there is no ambiguity.

However, such an expansion is not quite sufficient. Note that it has missed the invocation of 'fn' that extends the name by '4', as this is contained within a parameter to an invocation. The expansion as defined currently discards function parameters. To take account of these modifications it is sufficient to expand the parameters at the point of invocation, treating them as if they were operands to the "invocation" function. This is possible as their _context name is fully defined at the point of invocation; expansion to the point(s) of application is not necessary.

This is illustrated in Fig 2. Taking this into account the name becomes

    v{[3::|4::|5::]1::nil}

Thus, whatever level of recursion actually occurs at runtime, the names generated for 'v' will correspond to values generated from the above expression.
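A recursive name of this form reads naturally as a regular expression, with the bracketed portion acting as a repeated alternation. The sketch below assumes Python's standard `re` module; the pattern text and the helper `is_valid_name` are illustrative and not part of the described method.

```python
import re

# v{[3::|4::|5::]1::nil}: zero or more of 3::, 4:: or 5::, then 1::nil
PATTERN = re.compile(r"v\{(?:3::|4::|5::)*1::nil\}")


def is_valid_name(name):
    """True if name is one of the names the recursive name can generate."""
    return PATTERN.fullmatch(name) is not None


print(is_valid_name("v{3::5::5::3::1::nil}"))   # deep recursion via calls 3 and 5
print(is_valid_name("v{1::nil}"))               # no recursion at all
print(is_valid_name("v{2::1::nil}"))            # call 2 never leads to 'v'
```

This correspondence with regular expressions is consistent with the name recognition tree and the automaton derived from it that are shown in Figs 3 and 4.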

Formally the names are generated as follows. Firstly there is invocation naming. For each separate function defined, each invocation of another function from within that function is given an identifier (ID) that is unique within that function.

Name extraction involves construction of a tree to represent the expression. Each node in the tree can be of the following types:

    literal (leaf)
    unbound variable (leaf)
    name definition (non-leaf): denotes a name definition within a scope
    name application (leaf): denotes use of a named expression/parameter
    n-ary operator (non-leaf)
    scope [definition|parameter] (non-leaf): denotes a scope with set of name bindings and the expression within the scope
    invocation target (leaf/non-leaf): denotes a called function, leaf if a recursion point

The parameters to a function are made children of a 'scope [parameter]' node, each wrapped up in a 'name definition' node. Each such scope has exactly one child of type 'invocation target'. Thus, for example:

    void mangle( int p1, int p2 );
    mangle( 10, 20 * 20 );    // call under consideration

produces a 'scope [parameter]' node with three children:

    invocation target [mangle] -> scope [definition]    // definition of 'mangle'
    name definition [p1] -> literal (10)                // parameter 1
    name definition [p2] -> n-ary operator              // parameter 2

The 'scope [definition]' following 'invocation target [mangle]' is to contain any unbound variables/names defined within 'mangle'.

If recursion occurs the corresponding 'invocation target' node and the 'invocation target' node that it recurses to have a relationship established between them. Further expansion of this branch then halts. Note that in general each resulting 'invocation target' leaf will have exactly one link to an 'invocation target' node between itself and the root, whereas a non-leaf 'invocation target' will have 0 or more such links to leaves.

Each 'invocation target' node is identified with the (function-unique) invocation identifier.

Construction of a name is then performed for each resulting unbound variable, which always occur as leaves. The paths between these leaves and the root are considered separately, in the direction of leaf to root. In general these paths will consist of 'n-ary operator', 'scope [parameter|definition]', 'name definition' and 'invocation target' nodes.

The name starts out as that of the unbound variable. This is then extended as traversal to the root occurs. If an 'invocation target' node is encountered the name is extended with the invocation identifier of the node. If it has any links to 'invocation target' leaves that have been constructed due to recursion, a recursive portion of the name is constructed and inserted prior to the invocation identifier. This proceeds as follows.

In general the form of the recursive portion is [P1|P2|...|Pn] where each 'Pi' corresponds to a separate recursion leaf, 'Li'. It is generated by computing the name between 'Li' and the recursion node (excluding the recursion node itself) by traversing the path and noting the invocation identifiers of encountered 'invocation target' nodes, as for the original unbound variable leaves.

Note each such path must contain at least one such node and potentially other recursive 'invocation target' nodes. These are handled in the same way, hence recursive name components can themselves contain recursive components and so on.

To simplify the algorithm an implicit call to 'main' is constructed during initialisation by introducing a function:

    void _main() {
        main();    // label invocation with ID '1'
    }

The result of this is that all names will have an extra component (::1) denoting this call; this can be discarded after processing. Initialisation involves the following:

    current_expression := invocation of 'main' within '_main'
    root_node := new Node()
    set root_node type to be 'scope [definition]'    // outer root scope
    constructTree( current_expression, root_node )
    findNames( root_node )

where "constructTree( Expression E, Node N )" is a function that creates the tree for a given expression, E, by extending node N. Processing depends on the type of E:

    X := new Node()
    N.addChild(X)
    switch( type of expression E ) {
      literal:
        set type of X to denote the literal
      name application, 'N':
        set type of X to denote the application of name 'N'
      name definition, 'N':
        set type of X to denote the definition of name 'N'
        constructTree( expression that name 'N' denotes, X )
      n-ary operator (e.g. if-then-else, ...):
        set type of X to denote the operator
        for each of the 'n' operands
          constructTree( operands[i], X )
      function invocation, fn(p1..pn), id = ID:
        find function definition F that defines fn
        // create caller scope for function call
        set type of X to be 'scope [parameter]'
        // add each supplied parameter to scope
        for each of the supplied parameters {
          C := new Node();
          set type of C to be 'name definition', name = that of this formal parameter in F
          X.addChild(C)
          constructTree( expression denoting this parameter, C )
        }
        // add invocation target to scope node
        create a node Y, type = 'invocation target [fn]', id = ID
        X.addChild(Y)
        // check for this being recursive call
        search back up tree from N to root looking for node of type 'invocation target [fn]'
        if such a node Z is found then
          // link recursive nodes, Y becomes a leaf
          mark Y as recursive to Z and Z as recursive at Y
        else
          // create scope for called function's unbound vars/names
          create a node S, type = 'scope [definition]'
          Y.addChild(S);
          // add each unbound var/named expression in F to the scope
          for each unbound variable defined in F {
            create a node U, type = 'unbound variable', name = that of this unbound variable
            S.addChild(U)
          }
          for each named expression defined in F {
            create a node U, type = 'name definition', name = that of this named expression
            S.addChild(U)
            constructTree( expression name denotes, U )
          }
          // add the target function body to the scope node
          constructTree( expression body of F, S )
        endif
    }

The "findNames( Node N )" function, which looks for unbound variables and computes their names, is as follows:

    if N is of type 'unbound variable V' then
        // must be at least one name component due to 'main' implicit invoke
        set name := 'V' + "{" + findNamesSupport( N.parent, root ) + "::nil}"
        output name
    else
        for each child of N
            findNames( child[i] )
    endif

Where the "findNamesSupport( Node startN, Node endN )" function computes and returns the name produced by the path between nodes startN and endN, where endN is nearer the root than startN. endN itself is not included in the analysis. This function is as follows:

    set node := startN
    set name := ""
    while( node != endN ) {
        if node type is 'invocation target [fn]' then
            boolean insert_separator := (name != "")
            if node not leaf and is recursive to leaves L1..Ln then
                if insert_separator then
                    name += "::"
                    insert_separator := false
                endif
                name += "["
                for each Li in L1..Ln {
                    if not L1 then name += "|" endif
                    name += findNamesSupport( Li, node ) + "::"
                }
                name += "]"
            endif
            if insert_separator then name += "::" endif
            name += node.ID   // recursive components always followed by an id
        endif
        node := node.parent
    }
    return( name )

Explicit Naming

As can be seen from the above discussion, all unbound variables can be unambiguously named.

However, the resulting names are unwieldy and unstable in that a slight change to a rule can change the names. A solution to this is to allow the introduction of explicit name segments into the generated one. A simple mechanism is defined by using the naming operator:

    let <name> := <expression>;

This does two things. Firstly it defines <name> to be equivalent to <expression> such that <name> can be used within other in-scope expressions. Secondly, any invocations of other functions within <expression> have their context extended with <name>. For example:

    Person findPerson() {
        unbound String surname;
    }

    void main() {
        let person1 := findPerson();
        let person2 := findPerson();
        person1 :: person2 :: nil;
    }

    surname{1::<person1>::nil}
    surname{2::<person2>::nil}

It can be seen that surname{<person1>} and surname{<person2>} are sufficient to disambiguate the two unbound occurrences of 'surname', and that these names are relatively stable in the context of change to either 'main' or 'findPerson'.

For the purpose of explanation, as indicated before, a single scope is permitted at the function definition level. Nested scopes are a simple extension, although a nested naming scheme for names (and unbound variables) becomes necessary in the case of redefinition in inner scopes.

The above algorithm already copes with the introduction of the named expressions into the name tree. All that remains is for the findNamesSupport function to be extended to include these in the names generated:

    while( node != endN ) {
        if node type is 'invocation target [fn]' then
            // processing as given above
        elseif node type is 'name definition N' and node.parent type is 'scope [definition]'   // exclude parameter namings
        then
            boolean insert_separator := (name != "")
            if insert_separator then name += "::" endif
            name += "<" + 'N' + ">"
        endif
        ...
    }

Minimal Non-ambiguous Names

Given the set of names generated from an expression it is possible to automatically reduce the names to be as simple as possible but still permit non-ambiguous mapping of these simplified names back to the full versions, in other words minimal non-ambiguous names. In the example given above, it can be seen that 'surname{<person1>}' uniquely identifies the full version 'surname{1::<person1>::nil}'.

Ideally all of the simplified names should consist solely of explicit name components - in this way there is no dependence on the internal naming scheme used during name construction.

However, this is dependent on the appropriate use of explicit names, and hence not guaranteed.

With this goal in mind a tree is built from the set of generated names. Edges denote name segments. Leaves denote unbound variable names. Paths from leaves to the root denote complete names.
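By way of illustration only, the tree construction just described might be sketched in Python as follows. The nested-dictionary representation and the function names `parse_name` and `build_recognition_tree` are assumptions made for this sketch, not part of the described system; recursive components are not handled.

```python
def parse_name(full_name):
    """Split 'var{a::b::nil}' into (variable, components nearest the variable first)."""
    variable, _, rest = full_name.partition("{")
    components = rest.rstrip("}").split("::")
    assert components[-1] == "nil"
    return variable, components[:-1]


def build_recognition_tree(full_names):
    """Nested dicts keyed by name segment; the root is the 'nil' end of each name."""
    root = {}
    for full_name in full_names:
        variable, components = parse_name(full_name)
        node = root
        # Walk from the root end of the name towards the unbound variable.
        for segment in reversed(components):
            node = node.setdefault(segment, {})
        node.setdefault(variable, {})  # leaf: the unbound variable name
    return root


names = ["surname{1::<person1>::nil}", "surname{2::<person2>::nil}"]
tree = build_recognition_tree(names)
print(tree)
```

Each path from a leaf back to the top-level dictionary spells out one complete name, so two names that share a suffix towards the root share the corresponding nodes.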

Consider an extended example:

    Person findPerson() {
        unbound String surname;
        unbound int dateOfBirth;
        let p := lookupPersonBySurname( surname );   // invoke 1
        let q := lookupPersonByDOB( dateOfBirth );   // invoke 2
        if p = null then
            if q = null then
                findPerson()   // invoke 3
            else
                q
            endif
        else
            p
        endif;
    }

    void main() {
        let person1 := findPerson();   // invoke 1
        let person2 := findPerson();   // invoke 2
        person1 :: person2 :: nil;
    }

This produces the names:

    surname{[3::]::1::<person1>::nil}
    dateOfBirth{[3::]::1::<person1>::nil}
    surname{[3::]::2::<person2>::nil}
    dateOfBirth{[3::]::2::<person2>::nil}

Name Recognition

The tree generated from this is as in Fig 3 and is called the "name recognition tree" in that it consolidates the potential names derivable from an evaluation and gives an abstract mechanism for recognising actual generated names. In Fig 3, nodes 28, 29, 33 and 34 are leaves, node 24 is the root and nodes 25-27 and 30-32 are intermediate nodes. The edges between nodes 27 and 28, 29 and between 32 and 33, 34 are unbound names. The edges between nodes 26 and 27, and 31 and 32 correspond to recursion. The edges between 25 and 26, and 30 and 31 are implicit names, and the edges between 24 and 25, and 24 and 30 are explicit names. The reason the tree of Fig 3 is abstract is that it does not detail the recognition of the recursive components. To do this a deterministic finite automaton is generated from the tree. Edges correspond to primitive name components (implicit or explicit names). The start state corresponds to the root. Final states correspond to the leaves. To see that it is deterministic consider the construction of the automaton for the recursive components (the only non-trivial case):

All recursive components, including nested components, are always of the following form (see above):

    x[p|q|...]y

where

    p = [p1::p2::...
    q = [q1::q2::...
    etc.

Here 'y' is an implicit identifier. 'x' is either an unbound variable or an implicit/explicit identifier.

Note that because of the guaranteed existence of 'y' it is not possible for two recursive components to directly follow each other - they must be separated by at least one non-recursive identifier. Because of this it is never necessary to consider more than one recursive component at a time. This generates an automaton as in Fig 4, which is deterministic as, at the nodes where choices are possible, all exit transitions are guaranteed to be unique. In the above example this equates to x, p1, q1, ... z1 all being different. To see why this is the case it is necessary to consider an original naming tree that could have led to the recursive component:

    fn( y::...::nil, p1...pn ) {
        unbound x              // in the case where x is unbound
        // OR
        fy( x::y::...::nil )   // in the case where the unbound var is in or beyond 'fy'
        fp1( p1::y::... )      // 'p' recursion
        fq1( q1::y::... )      // 'q' recursion, in turn leading to 'r' recursion
        fz1( z1::y::... )      // 'z' recursion
    }

    fq1( q1::y::... ) {
        fq2( q2::q1::y::... )
        fr1( r1::q1::y::... )
    }

The algorithm used to allocate 'x, p1, q1, ... z1' guarantees that they are unique within 'fn'.

Likewise for 'q2' and 'r1' in 'fq1'.

Process

There are two aspects of processing which apply separately. Firstly, how to simplify the names by considering the name recognition tree (where recursive components are not expanded).

Secondly, how to deal with naming arising from the recursive components themselves.

The first thing to note is that name qualification beyond that of the unbound variable name is only required when there are non-unique unbound variable names. In the example above there are two such occurrences, for 'surname' and 'dateOfBirth'. However, for other unique values it is sufficient merely to use the unbound name as the simple name.

Name Simplification

For the purpose of simplifying duplicate unbound names the tree can be analysed to find the points at which the names have common nodes. Note that recursive components can be completely ignored for the purpose of this as they may not occur at all (they are defined as zero or more repetitions). Note also that they cannot occur as an edge leading from a branch node in the tree (see earlier discussion), thus ignoring them does not change the structure of the tree.

When attempting to find unambiguous names the common nodes can be used to reduce the search as it is pointless to continue beyond them. However, the following algorithms do not exploit this optimisation.

The aim is to disambiguate using explicit names if possible. Only if this fails are implicit names resorted to.

In the following algorithms the unbound names themselves are treated as explicit names. Also, nodes are treated as carrying names rather than the edges that join them, with an edge's name being applied to the node nearer the leaf to which it is joined. Thus, for example, the leaves have explicit names corresponding to the unbound variable name.

The first pass of the (non-optimised) algorithm only considers explicit names, the search proceeding from the leaves towards the root. It is implemented via two functions.

The first function takes a set of nodes referring to the current set of explicit names under consideration. If within this set there are multiple occurrences of the same name then these are grouped and passed on to the second function. Explicit names that do not occur more than once denote nodes that are non-ambiguously named at this point.

The second function takes a set of nodes and attempts to find the next explicit name on the path between each node and the root. The set of nodes for which such a name is encountered is then passed recursively into the first function if its cardinality is greater than one. Nodes for which no such name exists imply that the original name cannot be unambiguously named via explicit names.

Firstly, initialisation:

    Create an array of 'current nodes', size 'n'. Set each entry to point to leaf[i]
    findMinimal( current_nodes );

Now function 1. To simplify processing, entries in the parameter 'nodes' that have the value 'null' are used to denote entries not to be considered. They arise in function 2 below. Note that recursive name components are ignored.

    void findMinimal( Node nodes[] ) {
        boolean done[] = new array of boolean, same size as nodes
        for each entry i in 'done' {
            done[i] = nodes[i] == null   // null entries are not to be considered
        }
        for each entry i in nodes {
            if ( done[i] ) continue
            done[i] = true
            String explicit_1 = explicit name held in nodes[i]
            Set clash = new Set
            clash.insert( nodes[i].parent )
            for each entry j in nodes, j > i {
                if ( done[j] ) continue
                String explicit_2 = explicit name held in nodes[j]
                if ( explicit_1 == explicit_2 ) {
                    clash.insert( nodes[j].parent )
                }
            }
            if ( clash.size >= 2 ) {
                create a new array of nodes consisting of those in 'clash'
                findMinimalSupport( new_array )
            } else {
                // nodes[i] denotes a non-ambiguous name
            }
        }
    }

    void findMinimalSupport( Node nodes[] ) {
        for each entry i in nodes {
            Node n = nodes[i]
            while ( n != null ) {
                if n contains an explicit name
                    break
                n = n.parent
            }
            nodes[i] = n
        }
        // each entry in nodes that is 'null' denotes an ambiguous name
        if ( number of non-null entries in nodes >= 2 )
            findMinimal( nodes )
    }

The second part of the algorithm is only applied to those names that are still ambiguous. It is applied in turn to the sets of such ambiguous names with the same unbound name. Note that the remaining ambiguous names can be proper subsets of the original equivalent name sets (e.g. say there were originally 4 unbound variables with the same name, after phase one of the algorithm this may have been reduced to, say, just one of the 4 being ambiguous).

As the algorithm continues with only the ambiguous names, this has important implications for the order in which names are resolved at runtime (see later).
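The explicit-name pass described above can be sketched in Python under some stated assumptions: each tree node is modelled by a hypothetical `Node` class carrying an optional explicit name, grouping is done with a dictionary rather than the 'done' array, and a lone node that still has an explicit name further up is treated as resolved (the text only recurses when two or more remain). The tree built at the end mirrors the extended example.

```python
from collections import defaultdict


class Node:
    """Tree node; 'explicit' is the explicit name carried by the node, if any."""
    def __init__(self, parent=None, explicit=None):
        self.parent = parent
        self.explicit = explicit


def find_minimal(entries, resolved, ambiguous):
    # entries: (leaf label, current node, explicit names collected so far)
    groups = defaultdict(list)
    for label, node, path in entries:
        groups[node.explicit].append((label, node, path + [node.explicit]))
    for group in groups.values():
        if len(group) == 1:
            label, _, path = group[0]
            qualifier = "::".join(path[1:])
            resolved[label] = path[0] + ("{" + qualifier + "}" if qualifier else "")
        else:
            find_minimal_support(group, resolved, ambiguous)


def find_minimal_support(group, resolved, ambiguous):
    advanced = []
    for label, node, path in group:
        n = node.parent
        while n is not None and n.explicit is None:
            n = n.parent  # skip implicit and recursive components
        if n is None:
            ambiguous.append(label)  # no further explicit name towards the root
        else:
            advanced.append((label, n, path))
    if advanced:
        # Assumption: a lone survivor is also passed on, and so treated as resolved.
        find_minimal(advanced, resolved, ambiguous)


root = Node()
leaves = []
for person, invoke in (("<person1>", "1"), ("<person2>", "2")):
    explicit_node = Node(root, person)
    implicit_node = Node(explicit_node)  # implicit id: carries no explicit name
    for variable in ("surname", "dateOfBirth"):
        leaves.append((variable + "@" + invoke, Node(implicit_node, variable), []))

resolved, ambiguous = {}, []
find_minimal(leaves, resolved, ambiguous)
print(resolved, ambiguous)
```

Labels placed in `ambiguous` are the ones that would be handed to the second, implicit-name pass.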

The search proceeds from the root towards the leaves, looking for the first name parts (explicit or implicit, not recursive) that can disambiguate the names. The key nodes that are being sought are immediately after branches that are left containing a single unbound variable - that is there are no further branches in the tree beyond them. The paths between these nodes and the root are sufficient to disambiguate the names.

Firstly, initialisation:

    Create an array of 'current nodes', size 'n'. Set each entry to point to the node denoting the first name component of each name.
    findMinimal2( current_nodes );

The first function chops the nodes up into sets with equivalent unbound names and passes these to the second function.

    void findMinimal2( Node nodes[] )
        same as findMinimal above, group into sets with same unbound name.
        if set size is 1 then already non-ambiguous

Otherwise call findMinimal2Support with each set, this time starting at the names' roots, not leaves. For this purpose the names are treated separately, rather than as paths in the tree - thus a node only ever has a single child.

The second function traverses along all of the names looking for a point at which a branch occurs. For each resulting branch, if it contains a single name then this has been disambiguated by the branch and processing of it stops. Otherwise the function calls itself recursively with each branch set. Note that recursive name components are ignored.

    void findMinimal2Support( Node nodes[] ) {
        boolean all_the_same = true
        while ( all_the_same ) {
            all_the_same = false
            boolean done[] = new array of boolean, same size as nodes, all false
            for each entry i in nodes {
                if ( done[i] ) continue
                done[i] = true
                String name_1 = explicit/implicit name held in nodes[i]
                Set clash = new Set
                clash.insert( nodes[i].child )
                for each entry j in nodes, j > i {
                    if ( done[j] ) continue
                    String name_2 = explicit/implicit name held in nodes[j]
                    if ( name_1 == name_2 ) {
                        clash.insert( nodes[j].child )
                    }
                }
                if ( clash.size == length of nodes ) {
                    all_the_same = true
                    break
                } else if ( clash.size > 1 ) {
                    create a new array of nodes consisting of those in 'clash'
                    findMinimal2Support( new_array )
                } else {
                    // nodes[i] denotes an unambiguous name
                }
            }
            for each entry i in nodes {
                nodes[i] = nodes[i].child
            }
        }
    }

Further simplification is in some cases possible, by employing an 'explicit-name-relative' scheme.

This is based on a pattern-matching approach, for example '1::2<name>...' might be all that is required to recognise a particular name, rather than the full expansion '1::2<name>::1::1::1::nil', if '1::2<name>' can never occur in any other implicitly ambiguous name for the same unbound variable name. Indeed, such an approach can generally be applied but will not in general lead to the most useful names. This is not considered further here.
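The root-to-leaf pass can be illustrated with a short Python sketch. It assumes that each still-ambiguous name (all sharing one unbound variable name) has already been reduced to a list of its non-recursive components ordered root first; the labels 'a', 'b', 'c' and the component values are hypothetical data, and the function name `find_minimal2` is reused only for readability.

```python
from collections import defaultdict


def find_minimal2(names, depth=0, result=None):
    """names: label -> list of name components ordered root first.

    Returns label -> shortest root-first prefix that separates it from the
    others, or None when two names are identical and cannot be separated.
    """
    if result is None:
        result = {}
    groups = defaultdict(list)
    for label, components in names.items():
        key = components[depth] if depth < len(components) else None
        groups[key].append(label)
    for key, labels in groups.items():
        if len(labels) == 1:
            # The branch holds a single name: it is disambiguated here.
            result[labels[0]] = names[labels[0]][:depth + 1]
        elif key is None:
            for label in labels:  # names ran out together: still ambiguous
                result[label] = None
        else:
            # Several names share this component: look one step further.
            find_minimal2({l: names[l] for l in labels}, depth + 1, result)
    return result


prefixes = find_minimal2({"a": ["1", "2"], "b": ["1", "3"], "c": ["2"]})
print(prefixes)
```

Here 'c' is separated at the first component, whereas 'a' and 'b' need two components because they share the first.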

Recursive Components

As mentioned above, recursive components may not occur at all in a name as they are defined to consist of zero or more repetitions. This is not to say that a consumer of the name is not interested in, say, the level of recursion, or the particular path that recursion has occurred through. The most straightforward approach here is to extract any explicit names from recursive components and, if there are any, insert these into the simplified name. For example,

    x{1::<turnip>::[2::|4::<veg>::]1::nil}

becomes

    x{<turnip>::[<veg>::]nil}

In this example, if recursion occurs through the '2::' branch then this will not be visible in the simplified name.
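A minimal Python sketch of this extraction is given below. It assumes a name is held as a list of components in which a nested list stands for a recursive component (one inner list per alternative); the helper names `is_explicit`, `simplify` and `render` are illustrative only.

```python
def is_explicit(component):
    """Explicit name segments are written between angle brackets."""
    return component.startswith("<") and component.endswith(">")


def simplify(components):
    """Keep explicit names only; drop alternatives that end up empty."""
    kept = []
    for component in components:
        if isinstance(component, list):  # recursive component
            alternatives = [simplify(alt) for alt in component]
            alternatives = [alt for alt in alternatives if alt]
            if alternatives:
                kept.append(alternatives)
        elif is_explicit(component):
            kept.append(component)
    return kept


def render(variable, components):
    def body(parts):
        text = ""
        for part in parts:
            if isinstance(part, list):
                text += "[" + "|".join(body(alt) for alt in part) + "]"
            else:
                text += part + "::"
        return text
    return variable + "{" + body(components) + "nil}"


full = ["1", "<turnip>", [["2"], ["4", "<veg>"]], "1"]
print(render("x", full))
print(render("x", simplify(full)))
```

The alternative that contains only the implicit '2' disappears entirely, which is why recursion through that branch is not visible in the simplified name.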

A stronger approach can be taken, to insist that all branches must produce a name, even if internal naming was required:

    x{<turnip>::[2::|<veg>::]nil}

If there are duplicate names generated within a recursive component then the same strategy is used to disambiguate them as for unbound duplicate names. Similarly the resulting names can undergo simplification. In this case the algorithms are applied solely to the elements of the recursive component. Thus, a component of the form:

    [p1 | p2 | ... pn]

is treated as 'n' separate names, p1, p2, ... pn.

Resolution of these names is performed in a similar fashion to the top-level ambiguous names above except that:

1) the explicit stage is performed for those names with an explicit name:

    implicit1::implicit2::...<explicit1>::...

with the implicit names to the left of the first explicit being ignored.

2) the implicit stage is performed on all remaining names in one group, i.e. disambiguated against each other without sub-groupings.

Resolution of Names

It is important to note that when resolving an actual name against the simplified names, the direction of search is important, as is the order.

All actual names end in an unbound variable, thus the first step is to discount all potential names that end in a different variable name.

Next a name must be matched in the direction of 'unbound variable towards root' against the explicitly disambiguated names, using explicit components found in the name.

Only when this fails is a name matched in the direction 'root towards unbound variable' against the implicitly disambiguated names.

The order is important as implicitly disambiguated names can be ambiguous with the explicit ones. Consider:

    void fn() {
        unbound int x;
    }

    void main() {
        let fred := fn();
        unbound int x;
    }

This would produce the explicitly unambiguous

    x{<fred>}

and the implicit

    x{1::nil}

The real name generated for 'x' within 'fn' at runtime would be

    x{1::<fred>::1::nil}

which would erroneously be recognised by the implicit resolution if this were performed first.
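The ordering requirement can be illustrated with a small Python sketch. The matching rule used here (a simplified name matches when its components form a prefix of the actual name's components in the relevant direction) is an assumption made for the sketch, as are the function names and the table layout.

```python
def parse(name):
    """Split 'var{a::b::nil}' into (variable, components leaf end first)."""
    variable, _, rest = name.partition("{")
    components = rest.rstrip("}").split("::")
    return variable, components[:-1]  # drop the trailing 'nil'


def resolve(actual, explicit_names, implicit_names):
    variable, components = parse(actual)
    # Stage 1: unbound variable towards root, explicit components only.
    explicit = [c for c in components if c.startswith("<")]
    for simple, pattern in explicit_names.get(variable, []):
        if explicit[:len(pattern)] == pattern:
            return simple
    # Stage 2: only if stage 1 fails, root towards unbound variable.
    root_first = components[::-1]
    for simple, pattern in implicit_names.get(variable, []):
        if root_first[:len(pattern)] == pattern:
            return simple
    return None


explicit_names = {"x": [("x{<fred>}", ["<fred>"])]}
implicit_names = {"x": [("x{1::nil}", ["1"])]}

print(resolve("x{1::<fred>::1::nil}", explicit_names, implicit_names))
print(resolve("x{1::nil}", explicit_names, implicit_names))
```

If the two stages were swapped, the first call would match the implicit pattern '1' at the root end and be reported as x{1::nil}, which is the erroneous recognition described above.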

Higher Order Functions

Higher-order functions are partially evaluated functions that can be progressively further evaluated by the application of parameters. They are supported by a constructed language type that denotes the signatures of such partially applied functions:

    <return_type>([<parameter_type>])

which can be used wherever other types can. For example:

    let int(int) fn := factorial();
    fn(4);

The names of functions can also be used to denote their signatures:

    let factorial fn := factorial();

As the extension to the implicit "_context" parameter is still performed at the original, partial invocation point (this implicit parameter is always supplied by the system and hence it is guaranteed that all functions are always evaluated beyond the point of application of this parameter), and unbound variables occurring within the partially evaluated function are named at this point, the existing algorithms are still valid.

Consider the following:

    int fny( int(int) f, int v ) {
        f(v);   // invoke 1 in fny
    }

    int fnx( int p ) {
        unbound int x;
        let fn := fnx();   // invoke 1 in fnx
        if x then
            p
        else
            p + fny(fn, x)   // invoke 2 in fnx
        endif;
    }

    void main() {
        fnx(1);   // invoke 1 in main
    }

Although the actual recursion occurs within fny, the naming scheme is based on the recursion within fnx (invoke 1). Thus the full name for 'x' becomes:

    x{[1::]1::nil}

The key thing to understand is that unbound variables are named at point of invocation, even if higher-order. Hence, the following example does not produce a recursive name component:

    void fn( fn p ) {
        unbound int x;
        if x then x else p(p) endif
    }

    void main() {
        let f := fn();
        f(f);
    }

This is because the unbound variable 'x' is named at the point of invocation of 'fn()' within main.

As a matter of interest, any value supplied for 'x' other than T will result in non-terminating evaluation.

Information Access Derivation

Information access derivation is the analysis of an application to extract maps of the information it may use. Recall from the earlier discussion that applications are defined to operate on a particular information model. An example information model consists of a single object type:

    Person {
        String FirstName;
        String Surname;
        String Sex;
        int DateOfBirth;
        boolean Alive;
    }

An application can reason about instances of this type, for example, as follows:

    int age( Person p ) {
        System.date() - p.DateOfBirth;
    }

    boolean isEligible( Person p ) {
        if p.Alive then
            if p.Sex = "male" then
                age( p ) >= 16
            else
                age( p ) >= 18
            endif
        else
            false
        endif;
    }

    void main() {
        unbound Person claimant;
        isEligible( claimant );
    }

An analysis of this application yields the fact that the (unbound) person identified as 'claimant' has their "Alive", "Sex" and "DateOfBirth" attributes examined.

This information can then be used during, say, the design of a user interface to support this application. It tells the designer that, in the worst case, they have to cater for the acquisition/display of these three attributes, but that in this case they can forget about the others.
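For the non-recursive case, the kind of analysis that yields this result can be sketched in Python. The tuple-based expression encoding ('lit', 'var', 'attr', 'op', 'if', 'call') and the function `analyse` are assumptions made for this illustration; the sketch fully expands calls and would not terminate on a recursive function.

```python
def analyse(expr, env, functions, accessed):
    """Return the access paths an expression may denote; record every path touched."""
    kind = expr[0]
    if kind == "lit":
        return set()
    if kind == "var":
        return env[expr[1]]
    if kind == "attr":
        paths = {p + (expr[2],) for p in analyse(expr[1], env, functions, accessed)}
        accessed.update(paths)
        return paths
    if kind == "op":  # n-ary operator whose result is a new value
        for operand in expr[1]:
            analyse(operand, env, functions, accessed)
        return set()
    if kind == "if":  # all operands are assumed potentially reachable
        analyse(expr[1], env, functions, accessed)
        return (analyse(expr[2], env, functions, accessed)
                | analyse(expr[3], env, functions, accessed))
    if kind == "call":  # bind actual parameters, then expand the body
        params, body = functions[expr[1]]
        args = [analyse(a, env, functions, accessed) for a in expr[2]]
        return analyse(body, dict(zip(params, args)), functions, accessed)
    raise ValueError(kind)


p = ("var", "p")
functions = {
    "age": (["p"], ("op", [("lit", "System.date()"), ("attr", p, "DateOfBirth")])),
    "isEligible": (["p"], ("if", ("attr", p, "Alive"),
        ("if", ("op", [("attr", p, "Sex"), ("lit", "male")]),
            ("op", [("call", "age", [p]), ("lit", 16)]),
            ("op", [("call", "age", [p]), ("lit", 18)])),
        ("lit", False))),
}

accessed = set()
env = {"claimant": {("claimant",)}}  # the unbound variable is the root of its map
analyse(("call", "isEligible", [("var", "claimant")]), env, functions, accessed)
print(sorted(".".join(path) for path in accessed))
```

The recorded paths are rooted at the unbound variable, so they can be read as the information access tree for 'claimant'.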

In the general case this information map will be more complex, with attributes denoting relations to other object types also being accessed (for example, a link to where the person lives, their Address object). In this case the designer may see the need to handle certain attributes of a person, a link to an address and certain attributes of the person's address. Recursion can also introduce recursive parts of the information map in a similar fashion to the way that it introduced recursive portions into unbound variable names.

It will be shown in the following that analysis of an arbitrary expression (application) can be performed, this leading to zero or more information access trees. Each tree is rooted at an unbound variable which in effect defines the entry point to the information graph. These trees can have recursive portions denoting repetitive relationship traversal between objects.

Use of these trees to guide the (potentially completely automatic) construction of user interfaces to interact with the expression is not covered.

To derive the information maps, evaluation expansion has to be further reaching than that for the naming above, as the application of name bindings (both from function parameters and explicit namings) must be considered. This is because, while the names of unbound variables are defined at point of declaration, how values are potentially used during evaluation is based on how they are manipulated. In the example above the naming of 'claimant' is performed within 'main' - the fact that this value is then used elsewhere is not evident from the naming tree (the naming tree is only constructed to find other unbound variables and name recursion).

In the non-recursive case this is straightforward: the evaluation is completely expanded, with name bindings being satisfied as required, and attribute accesses can be read off along paths between leaves denoting unbound variables and the root.

Recursion, however, introduces complexity as full expansion cannot be performed. Recall this is the term used to identify functions that by some chain of invocations invoke themselves. Thus the following definition contains two (mutually) recursive functions:

example 1:

    void gn( int i ) { fn( i+1 ); }
    void fn( int i ) { gn( i-1 ); }
    void main() { fn(10); }

Note the distinction between functions invoking themselves and using 'values' supplied as parameters. Hence the following is not recursive:

example 2:

    void fn( int i ) { i; }
    void main() { fn(fn(4)); }

whereas the following is:

example 3:

    void gn( int i ) { ... }
    int fn( int j ) { gn( fn( j-1 ) ); }
    void main() { fn(4); }

Note also that for mutual (and higher-occurrence (e.g. tertiary, where 'fn' calls 'gn' calls 'hn' calls 'fn')) recursion, these can be considered as consisting of lower-occurrence recursion, based on what loops actually occur. In example (1) above 'fn' can be considered as the only recursive function, with 'gn' merely contributing to the effects of 'fn's recursion. If, however, the example had been defined with 'main' as follows:

    void main() { fn(4) + gn(3); }

then 'fn' would be considered to be the recursive function whilst expanding 'fn(4)', and 'gn' the recursive function when expanding 'gn(3)'.
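The point that the recursive function depends on which loop is actually entered can be illustrated with a short Python sketch over a call graph. The dictionary encoding of "which functions each body invokes" and the function name `recursive_functions` are assumptions of the sketch; it ignores parameters and higher-order calls.

```python
def recursive_functions(calls, entry):
    """Return the functions that are re-entered on some invocation chain from entry.

    calls maps a function name to the list of functions its body invokes.
    """
    found = set()

    def expand(function, active):
        if function in active:
            found.add(function)  # re-entered within its own call context
            return
        for callee in calls.get(function, []):
            expand(callee, active | {function})

    expand(entry, frozenset())
    return found


example_1 = {"main": ["fn"], "fn": ["gn"], "gn": ["fn"]}
example_1_variant = {"main": ["fn", "gn"], "fn": ["gn"], "gn": ["fn"]}
example_2 = {"main": ["fn", "fn"], "fn": []}

print(recursive_functions(example_1, "main"))
print(recursive_functions(example_1_variant, "main"))
print(recursive_functions(example_2, "main"))
```

In the first graph only 'fn' is reported, because the loop is always entered through 'fn'; once 'main' also invokes 'gn' directly, both are reported.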

The aim is to expand the evaluation sufficiently so that a set of dereferencing/attribute access chains with recursive components can be generated that correspond to a superset of all potential information access. This is a superset as the logical aspects of information access aren't investigated. Consider:

    String fn( Person p ) {
        if p.Alive then
            if not p.Alive then p.Sex else p.Age endif
        else
            p.Age
        endif;
    }

In reality the 'Sex' attribute will never be accessed. However, the expansion scheme defined here assumes that all operands are potentially reachable (indeed, it is not in general possible to reason about the reachability of such operands, the problem is undecidable (consider the case where the evaluation of the if-operand never terminates)). The existence of unreachable operands implies, in general, errors in logic anyway.

The first issue to address is when an expression is recursive, and if so when to terminate expansion of such recursive components. In general a recursive invocation takes the following form:

    type fn( Param p ) {
        fn_i( fn_r( fn_p( p ) ) )
    }

where
    fn_i is some modification of the result of fn_r
    fn_r is a function that directly or indirectly invokes fn (or can be 'fn' itself)
    fn_p is a function that modifies the parameter p before passing to fn_r

Thus, in general, when recursion occurs two forces are at work - modifications made to values passed into the recursive function and modifications made to the result. Whenever a recursive cycle occurs, one of each of these modifications occurs (although either/both modification(s) may be 'null'). Thus information access made by such recursion is generally of the form:

    unbound_var.X1.X2..Xn.[P1.P2..Pn].Y1.Y2..Yn[I1.I2..In].Z1.Z2..Zn

where
    the X accesses are performed prior to the external recursive invocation
    the P accesses are 'inbound' modifications
    the Y accesses are made when the recursion terminates, at a leaf
    the I accesses are 'outbound' modifications
    the Z accesses are performed after the external recursive invocation
and the number of repetitions of [P1.P2..Pn] matches that of [I1.I2..In], and further recursion can occur anywhere within this, including within the Ps and Is.

Consider a first example:

    Object fn( Object O ) {
        if O.A1.A2..An then
            fn( O.P1.P2..Pn ).I1.I2..In
        else
            O.Y1.Y2..Yn
        endif;
    }

    void main() {
        unbound Object unbound_var;
        fn( unbound_var.X1.X2..Xn ).Z1.Z2..Zn;
    }

This generates an access map that exhibits the generic one above:

    unbound_var.X1.X2..Xn.[P1.P2..Pn].Y1.Y2..Yn[I1.I2..In].Z1.Z2..Zn

It also generates the following for the 'if' branch:

    unbound_var.X1.X2..Xn.[P1.P2..Pn].A1.A2..An {[I1.I2..In].Z1.Z2..Zn}

The portion in {} is not applicable as the 'if' operator's 'if' operand does not result in this value, it is only used to decide on the 'then' and 'else' outcome. Hence any modifications made to the result of the if expression are not applied to this operand. For the purposes of this discussion only the 'if-then-else' operator is assumed to exhibit this characteristic.

A function can recurse at more than one point, so in general for a given recursive function there will be multiple [P...] and [I...] two-tuples that can be applied in arbitrary interleavings.

For example:

    fn( Person p ) {
        if p.Alive then true else fn(p.Mother) or fn(p.Father) endif;
    }

Here there are two recursion points. Each such point defines separate [P...] and [I...] portions; in this case the [I...] portions are empty while the [P...] portions are [.Mother] and [.Father].

Note that however many such points exist there is still only one set of potential leaves (in the above, [.Alive] and [true]).
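The arbitrary interleaving of the two [P...] portions can be made concrete with a bounded enumeration in Python. This is only a sketch: the real map is unbounded, the depth limit is an artificial parameter, and the function name `access_chains` is an assumption. It covers only the case shown here, where the [I...] portions are empty.

```python
from itertools import product


def access_chains(root, inbound_mods, leaf_mods, max_depth):
    """List access chains for up to max_depth recursive cycles (empty [I...] only)."""
    chains = []
    for depth in range(max_depth + 1):
        # Every interleaving of the recursion points' [P...] portions.
        for choice in product(inbound_mods, repeat=depth):
            prefix = root + "".join(choice)
            for leaf in leaf_mods:
                chains.append(prefix + leaf)
    return chains


chains = access_chains("p", [".Mother", ".Father"], [".Alive"], 1)
print(chains)
```

Raising the depth limit to 2 adds the four two-step interleavings (Mother.Mother, Mother.Father, Father.Mother, Father.Father), each ending in the same single leaf access.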

Unbound variables that occur within a recursive function will not undergo [P...] modification, unless they are themselves passed into a recursive invocation as a parameter, in which case they undergo [P...] modification but not [X...]. For example:

    fn( Person p ) {
        unbound Address a;
        if p.LivesAt = a then
            fn( a.Landlord )
        else
            fn( p.Partner )
        endif;
    }

Here it can be seen that 'a' undergoes a [.Landlord] [P...] modification followed by [.Partner] [P...] modifications.

Where a function has more than one parameter, and indeed in the case above, the issue of parameter rotation/distribution occurs. This is where, say, a parameter that is passed as the first parameter is recursively passed as the second.

For example:

    fn( Person p1, Person p2 ) {
        if p1.Alive then fn( p2, p1 ) else p1.Salary endif;
    }

It can be seen that any value passed in as 'p1' cannot have the [.Salary] leaf modification applied on the first recursive call, and thus any information map that ignores such parameter shuffling would give incorrect results.

Note also that such parameter shuffling can result in an entry parameter occurring in multiple places in a recursive call. Consider an 'fn' call of the form

    fn( p1 or p2, p1 )

where 'p1' is shuffled to both parameters.
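The effect of parameter shuffling can be traced with a small Python sketch. It records each shuffle as an (origin, destination, modification) tuple and each leaf access as an (origin, modification) pair; the function names `step` and `leaf_accesses` and the placeholder values 'a' and 'b' for the actual arguments are assumptions made for illustration.

```python
def step(bindings, shuffles):
    """Apply one recursive call: move (and modify) values between parameters."""
    following = {}
    for origin, dest, mod in shuffles:
        moved = {path + mod for path in bindings.get(origin, ())}
        following.setdefault(dest, set()).update(moved)
    return following


def leaf_accesses(bindings, leaf_mods):
    """Accesses made at this recursion level by the leaf modifications."""
    return sorted(path + mod
                  for origin, mod in leaf_mods
                  for path in bindings.get(origin, ()))


# fn( p1, p2 ): if p1.Alive then fn( p2, p1 ) else p1.Salary endif
swap = [("p2", "p1", ""), ("p1", "p2", "")]
leaf_mods = [("p1", ".Alive"), ("p1", ".Salary")]

level_0 = {"p1": {"a"}, "p2": {"b"}}   # fn( a, b ) as called externally
level_1 = step(level_0, swap)          # bindings inside the first recursive call

print(leaf_accesses(level_0, leaf_mods))
print(leaf_accesses(level_1, leaf_mods))

# fn( p1 or p2, p1 ): 'p1' is shuffled to both parameters
both = [("p1", "p1", ""), ("p2", "p1", ""), ("p1", "p2", "")]
print(step(level_0, both))
```

At the first recursive call it is 'b', not 'a', that receives the .Salary access, which is the behaviour an analysis ignoring shuffling would get wrong.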

Thus the information that must be extracted from a recursive function to capture all aspects of recursion is defined to be the following:

    One or more groups of [I...] and [P...] modifications denoting differing recursion points, and
    A set of leaf modifications.

Each group consists of zero or more [P...] type modifications and zero or more [I...] modifications, where
    each of the [P...] modifications consists of a triple {origin, dest, mod} where
        origin is either a formal parameter name or an unbound variable local to the function
        dest is a formal parameter name
        mod is a sequence of attribute accesses
    and each of the [I...] modifications is a sequence of attribute accesses.

Each leaf modification consists of a pair {origin, modification} where
    origin is either a formal parameter name or an unbound variable local to the function
    modification is a sequence of attribute accesses

Within any portions of the recursive map, sub-recursion may also occur. For example:

    int gn( Person p ) { gn(p); }
    int fn( Person p ) { gn( fn( p ) ); }

Here the [I...] modification for 'fn' contains 'gn' which itself is recursive. Where such sub-recursion occurs it is represented by having a reference to the recursive map for the sub-recursive function, along with the formal parameter that the modification is being applied to. In the above example this equates to [gn/p].

Here is an (obviously nonsensical) example:

    Object test1_fna( Person p ) {
        unbound q;
        let res := if p.Address.InUK then
                       if q != p then q else p endif
                   else
                       test1_fna( p.Brother )
                   endif;
        res.Idiot;
    }

The information map for test1_fna is

    group 1
        [P...] mods: {p, p, .Brother}
        [I...] mods: .Idiot
    leaf mods:
        {p, .Address.InUK |}, {p, |}, {p, .Idiot}, {q, |}, {q, .Idiot}

(here '|' denotes the fact that no outbound modifications apply beyond this point as the leaves are generated by 'if' operator 'if' operand expansion).

This indicates that, for example, if test1_fna is called with parameter 'x' the following can be accessed:

    x.Brother.Brother.Address.InUK
    x.Brother.Idiot.Idiot
    q.Idiot

There is one further aspect that is not investigated further here, the correspondence between parts of information maps. For example:

    list fn( Person p1, Person p2 ) {
        if p1.Age > p2.Age then p1 :: p2 else fn( p1.Mother, p2.Father ) endif;
    }

Here, whenever p1.Mother is followed, p2.Father is too. Note that this is not actually directly deducible from the fact that these are [P...] modifications, rather this is also based on the expansion termination properties. In general such associations are nontrivial to identify.

Information Access Tree Construction

The following deals in more detail with the construction of the recursive maps, that is information access tree construction.

Recursive Expansion

The first problem to address in the production of recursive information maps is how far it is necessary to expand an expression to permit all of the required information to be captured, i.e. it is concerned with recursive expansion. As noted above it is necessary to perform substitution of values for name bindings. For non-recursive functions this can be performed completely, for recursive ones it must be curtailed.

The first issue is to identify which functions are recursive. This is done by checking for repetitive invocations of the same function between the current expansion point and the root.

However, it is not totally straightforward, as exemplified by the example:

    int fn( int p ) { p; }
    fn(fn(3));

which is not recursive but will, on expansion, result in two calls to 'fn' being made on a path between the literal '3' and the root. To understand what is happening here it is necessary to consider the detail of the expansion. Wherever an invocation is made, an 'invocation' node will be added to the expansion. More importantly, whenever a formal parameter is resolved, a 'formal parameter' node will also be added. Invocation nodes can be considered as entering a call-context whilst formal parameter nodes can be considered as the leaving of a call-context. An invocation is only recursive if it occurs within a call-context of the same function. If the above example is expanded in this way, the result is as indicated in Fig 5.

Note that invocations of 'fn' do not occur within nested contexts, they are sequential. If a recursive function is now considered in the same way:

    int fn( int p ){
        if p == 0 then p // this leaf
        else fn(p-1) endif;
    }
    fn(1);

and the expansion of the tree concerning 'this leaf' only is considered within a recursive call, the result is as in Fig 6.

Here it can be seen that the invocations are indeed nested.

The most straightforward way of obtaining a sufficient expansion for further processing is to continue expansion until 3 nested invocations are encountered. A check is performed whenever an invocation of a function is about to be expanded in the expansion and is as follows:

    int repetition = 0;
    Node current_node = invocation_node; // about to be expanded
    String function = function being expanded (e.g. 'fn')
    while( current_node != null ){
        if ( current_node type is formal parameter resolution ){
            current_node = invocation node that gave rise to this formal parameter node, must be on the path to root
        }else if ( current_node type is invocation ){
            if ( this is an invocation of 'function' ){
                repetition++;
            }
        }
        current_node = current_node.parent;
    }

If at the end of this, 'repetition' is three then expansion is not performed and the function is marked as recursive. The algorithm is looking to see if the current invocation is nested within two others. If a 'formal parameter' node is encountered this represents the 'end' of a set of nodes that are nested within this one, rather than the other way around, and hence this set is skipped.
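By way of illustration only, the nesting check can be sketched in Python as follows. The Node class, its field names ('kind', 'function', 'parent', 'origin') and the function name is_recursive are assumptions made for this sketch and do not appear in the description; 'origin' stands for the invocation node that gave rise to a formal parameter node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node shape: kind is "invoke", "param" or "other".
    kind: str
    function: str = ""
    parent: Optional["Node"] = None
    # For "param" nodes only: the invocation node that gave rise to it.
    origin: Optional["Node"] = None


def is_recursive(invocation_node: Node, limit: int = 3) -> bool:
    """Count nested invocations of the same function on the path to the root."""
    repetition = 0
    current = invocation_node
    while current is not None:
        if current.kind == "param":
            # Leaving a call-context: jump to the invocation that opened it,
            # so the nodes inside that context are skipped.
            current = current.origin
        elif current.kind == "invoke" and current.function == invocation_node.function:
            repetition += 1
        current = current.parent
    return repetition >= limit
```

With this sketch, the inner invocation in fn(fn(3)) gives a count of one, since the outer call-context is skipped via its formal parameter node, whereas the third invocation in a directly nested chain gives a count of three.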

The end result of this consists of truncated expansions containing, amongst other things, branches like those in Fig 7 (although either or both of the formal param nodes (9) may be missing):

Thus each recursive function expands to a tree with a number of legitimate leaves (10) along with a number of leaves denoting the expansion curtailment (11). Each of the legitimate leaves forms the start of a path to the root of the tree that, if it encompasses pairs of nested invocation nodes, contains useful information that can be extracted from each such pair:

For each such pair the path can be split into the following parts:

1) path between the leaf and the first formal parameter node. This denotes the value of the parameter (in this case V) passed to the original invocation. The name of the formal parameter is contained within the formal parameter node (in this case 'p'). The original invocation node is tagged with this value for later use.

2) path between the two parameter nodes. This denotes the modification that has been made to the parameter value by the invocation, in this case 'y'. The name of the formal parameter denoted by the second node indicates how parameter values have been shuffled (in this case there is only one parameter, p). This thus identifies a [P...] modification.

3) The path between the second formal parameter node and the second invocation node denotes a leaf within 'fn'. That is, a leaf modification. The parameter that it applies to is identified by the name of the formal parameter in the formal parameter node.

4) The path between the second and original invocation node denotes the [I...] modification for the recursive invocation.

Points (2) and (4) above apply to the recursion group identified by the second invocation node (in this case fn(y)).

This would appear to supply all the information required, and in fact it nearly does. There are just two more points to note.

Firstly, one or both of the formal parameter nodes may not occur. This arises when the function's leaf modification is made to either a local value passed into a recursive invocation as a parameter, to a local value, or to the result of a recursion. These are exemplified by:

1) outer missing

    fn( Person p ){
        unbound y;
        if p.Alive then fn(y) endif;
    }

2) both missing

    fn( Person p ){
        unbound y;
        if p.Alive then y else fn(p.Dad) endif;
    }

3) inner missing (higher order, see later for more detail)

    (Person->boolean) fnb( Person p ){ // similar defs for fnc and fnd
        p.X;
    }
    (Person->boolean) fna( Person p ){
        if p.Alive then
            let x := fna(p.Dad);
            if x(p) then fnc() else fnd() endif // key is 'x(p)'
        else fnb() endif;
    }

respectively.

Secondly, an artificial expansion of parameter values is required at expansion termination points to detect [P...] modifications that are not exposed by usage as function leaves. Consider the following:

    fn( Person p1, Person p2, Person p3 ){
        if p1.Alive then p1 else fn( p2.Dad, p3.Mum, p1.Goat ) endif;
    }

The expansion and analysis as defined so far would not identify the p3->p2,.Mum and p1->p3,.Goat [P...] modifications as these are not shuffled into a p1 position within the limited expansion. This artificial expansion consists of the creation of virtual function leaves denoting all (except higher order - see later) the supplied parameters (to the current function, not the target function, as mutual/higher-occurrence recursion causes these to differ) at the point of curtailment.

The resulting paths are then analysed as above with the exception that the (virtual) leaf modifications corresponding to these virtual leaves are ignored.

Sub-recursion appears as invocation nodes within the various modifications that target recursive functions. If a matching formal parameter node does not occur within the modification then this is in fact an inner-higher-occurrence (e.g. mutual/tertiary) recursion that can be ignored.

Otherwise the pair denote a sub-recursive portion. The path between these nodes is replaced by a sub-recursive reference to the function, along with the formal parameter name identified within the sub-recursive formal parameter node, giving the role of this sub-recursive segment.

For example:

    boolean gn( Person p1, Person p2 ){
        p2.Alive or gn( p2, p1.Parent );
    }
    boolean fn( Person p1 ){
        unbound Person y;
        fn( if gn( p1, y ) then p1 else y endif );
    }

produces a [P...] modification within 'fn' for 'y' that is of the form

    local->p1, y [gn/p2]

which says that when 'fn' recurses the modification made to p1 is the result of 'gn' operating on 'y' as parameter 'p2'.

Just for interest the whole analysis of 'fn' and 'gn' above gives:

    gn
      group1
        [P...] mods: (p1->p2,.Parent), (p2->p1,null)
        [I...] mods: null
        leaf mods: p2,.Alive
    fn
      group1
        [P...] mods: (local->p1,y [gn/p2] |), (p1->p1,[gn/p1] |), (local->p1,y), (p1->p1,null)
        [I...] mods: null
        leaf mods: none

(again, note the terminators '|' that occur for if-operator if-operands, signifying that further modifications should not be applied).

The outcome of this activity is thus the full derivation of the recursive information maps together with the tagging of the outer invocations of recursive functions with the paths that denote the actual parameters. Together these fully define the information maps for each separate invocation of a recursive function.

Higher Order Functions

Higher order functions themselves do not introduce recursion as all they fundamentally denote is the decoupling of parameter binding from the point of invocation.

Thus higher order functions are merely a notational convenience to permit the construction of what are effectively template-functions - ones that are parameterised by or can return behaviour as well as data.

Partially evaluated functions still conform to strong type checking. Any function with one or more parameters can be partially evaluated, simply by supplying a subset of its required parameters. The type of such an expression, however, is that of a function type based on the function's original signature and the type of the unsupplied parameters. It is not valid to use this in an incorrect context. For example, it is incorrect to write:

    let (int->int) p1 = fn();
    let (int->int) p2 = gn();
    let (int,int->int) x = p1 p2;
    x(4,5);

as the '' operator has type (int,int->int).

If such composition of functions is required this must be performed explicitly:

    let (int->int) p1 = fn();
    let (int->int) p2 = gn();
    let (int,int->int) x = hn(p1,p2);
    x(4,5);

where

    int hn( (int->int) p1, (int->int) p2, int p3, int p4 ){
        p1(p3) p2(p4);
    }

Note that the binding of supplied parameter values to missing ones must be done on a type basis, not a usage basis. Consider:

    (int,int->int) fn( int p1 ){
        fx();
    }
    let x := fn();
    x(1);

In this instance the '1' must not be bound to the first parameter of 'fx', it must be bound to the first parameter of 'fn', even though this is never used.

Although it is not possible to implicitly combine partially evaluated functions, it is possible for parameters to be outstanding for more than one such function at a given point. Consider:

    int gn( Person p2 ){
        p2.Age;
    }
    (Person->int) fn( Turnip t1 ){
        if t1.Rotten then gn() else endif;
    }
    let (Turnip->(Person->int)) x := fn();
    x( root_veg, farmer_jim );

To support this it is necessary to introduce three new concepts:

1) higher-order-receptor nodes, used to denote parameters of higher-order functions that are yet to be supplied

2) higher-order-invocation nodes, used to denote an invocation that is 'missing' one or more parameters

3) higher-order-donor nodes, used to denote the application of parameters to partially evaluated functions.

When a higher-order-invocation is encountered during expansion, higher-order-receptor nodes are created for the missing parameters and expansion continues using these proxies in their place.

When such a higher-order-receptor node is considered for expansion itself, this is left as a leaf pending the supply of an actual value by a donor. The higher-order-invocation node itself is also marked with the number of missing parameters.

When a higher-order-donor node is encountered, the values it is donating to the expression are inserted at the front of a 'list-of-available-donors' and a search is made of the expanded tree corresponding to the partially evaluated function for higher-order-invocation nodes. When such is found the available donors are matched to the missing parameters and turned into 'bindings'.

Each binding maps a higher-order-invocation/formal parameter pair to a donated value, in the correct order. If donor values remain after the removal of those bound at this point, these are retained for further matching. If a further donor node is encountered then its donated values are inserted at the front of the 'list-of-available-donors'.

Any corresponding receptor nodes found during this search cause a check to be made in the current set of bindings and, if a value is found, the receptor node is replaced by the value. The value itself is then processed in the same fashion.

Each resolved receptor node will lie on a path of the layout shown in Fig 8.

Note that this caters for the incremental supply of parameters, along with the potential for such parameters to be unused.

Higher Order Recursion

The explanation above deals with 'normal' recursion and higher order functions separately.

However it does not support the combination of these - recursive functions that either take partially evaluated functions as parameters or return them as results.

Firstly it is now possible to have an expansion path that includes 'formal parameter' nodes for more than one parameter of a recursive function. Consider the following:

    int fnb( (Person->int) f, Person p ){
        if p.Alive then f(p) else fnb( f, p.Parent ) endif;
    }
    int fna( Person p ){
        p.Age;
    }
    void main(){
        let (Person->int) f := fna();
        unbound Person p;
        fnb( f, p );
    }

This produces a path for 'p.Age' that includes the following nodes:

    root -> invoke(fnb) -> invoke(fnb) -> deref(fnb/f) -> deref(fnb/f) -> deref(fnb/p) -> deref(fnb/p)

This is handled by ensuring that this generates two sets of groups of nodes for analysis as before - one set consisting of

    invoke(fnb) invoke(fnb) deref(fnb/f) deref(fnb/f)

and the other of

    invoke(fnb) invoke(fnb) deref(fnb/p) deref(fnb/p)

In general, in a function with 'n' higher-order parameters, there may be up to 'n' of these groups generated, followed optionally by one for a non-higher-order dereference.

The remaining issue with this is the generation of modifications from the dereferencing of higher order parameters. Note that there are only two things that can be done to such parameters - dereferencing and rotation. Thus rotation is the only modification that is meaningful. However, there is a difficulty here exemplified by the following:

    int fnx( Person p1 ){
        p1.X1;
    }
    int fny( Person p1, Person p2 ){
        p1.Y1 + p1.Y2;
    }
    int fnz( Person p1, Person p2, Person p3 ){
        p1.Z1 + p2.Z2 + p3.Z3;
    }
    int fna( Person p,
             (Person->int) fn1,
             (Person,Person->int) fn2,
             (Person,Person,Person->int) fn3 ){
        if p.Alive then
            let (Person,Person,Person->int) new_f := fnz();
            fna( p.Parent, fn2(p), fn3(p), new_f )
        else fn1(p) endif;
    }
    void main(){
        unbound Person p;
        fna( p, fnx(), fny(), fnz() );
    }

The problem is that virtual expansion cannot be used with higher-order parameters as a means to derive the behaviour of the function with respect to the non-fully evaluated parameters. In order to correctly capture this it is necessary to perform a more extensive expansion of the evaluation tree, not just to depth '3' as required in the non-higher-order case. So how far is it necessary to expand to? The answer is (3 + number_of_higher_order_params - 1). This gives each higher order parameter, including locally originating ones such as 'new_f' above, the opportunity to be shifted into any of the positions whereby it can be fully satisfied. In the above example this gives an expansion depth of 5. Use of such further expansion requires the 'l-mods' produced to be qualified with the chain of invocations within which they occur.
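As a small worked illustration of the depth rule, the following Python sketch computes the expansion depth from the number of higher-order parameters. The function name expansion_depth is hypothetical, and treating the zero-parameter case as the plain depth of 3 is an assumption drawn from the non-higher-order discussion above.

```python
def expansion_depth(higher_order_params: int) -> int:
    """Depth to which a recursive function is expanded before curtailment."""
    if higher_order_params == 0:
        # Non-higher-order case: three nested invocations suffice.
        return 3
    # Higher-order case: 3 + number_of_higher_order_params - 1.
    return 3 + higher_order_params - 1
```

For 'fna' above, which has three higher-order parameters, this gives 3 + 3 - 1 = 5.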

Secondly there is the issue of the attribution of modifications to recursive functions that are actually not part of the recursive function itself, rather they are either the result of partial parameters passed into, or returned from, the function. Consider the following:

    Person fnb( Person p1, Person p2 ){
        p1.Age > p2.Age;
    }
    Person fna( Person p1, Person p2, (Person,Person->Person) fn ){
        if p1.Dead then fn( p1, p2 ) else fna( p1.Mum, p2.Dad, fn ) endif;
    }
    void main(){
        unbound Person p1;
        unbound Person p2;
        let f := fnb();
        fna( p1, p2, f );
    }

Although the modifications caused by the application of 'fnb' within 'fna' are not part of fna's specification, the approach taken so far will attribute them to 'fna'. Moreover, if 'fna' is used elsewhere with a different partial function for 'fn' then the modifications caused by this other function will also be attributed to 'fna'.

A similar argument applies to the return of partially evaluated functions.

The approach taken here is to move away from the attempt to create generic descriptions of recursive functions so that, in the case where the recursive function either has one or more higher order parameters or a higher order result, separate information maps are produced on a per-invocation basis (this is logical - such a recursive function's behaviour is not fully defined by reference to its definition only, one has to consider the context of invocation). Although such an approach will still attribute modifications to the leaves of such functions, the information access graph derived is actually correct.

Merging of maps

The discussion so far has concentrated on producing the information maps for individual usages of unbound variables. The resulting maps then need to be merged to produce a map per variable.

This is straightforward - the separate maps are sorted into groups depending on the disambiguated name attached to the unbound variables. Each group's elements are then merged to produce the corresponding tree. For example, and as also quoted above, if there were the following separate maps:

    claimant.Mother.Address.ZIP
    claimant.Mother.Alive
    claimant.Father.Alive
    claimant.Age

this would be merged into

    claimant+Mother+Address-Zip
                   +Alive
            +Father-Alive
            +Age

Where unbound variables occur as parameters to recursive information maps, these are just referred to from the merged tree via recursive map name and parameter binding. So if there was one more usage, say a function that counted the number of known ancestors:

    int countAncestors( Person p ){
        if p.Mother != null then 1 + countAncestors(p.Mother) else 0 endif
        +
        if p.Father != null then 1 + countAncestors(p.Father) else 0 endif;
    }
    void main(){
        unbound Person claimant;
        countAncestors( claimant );
    }

This produces a map along the lines of:

    fn: countAncestors
      group 1:
        p_mod: ( p -> p,.Mother )
      group 2:
        p_mod: ( p -> p,.Father )
      l-mod: ( p,.Mother | ), ( p,.Father | )

and the point at which it was invoked would identify the parameter as being claimant:

    claimant -> [countAncestors/p]

This would be added to the information tree:

    claimant+Mother+Address-Zip
            |      |
            |      +Alive
            +Father-Alive
            |
            +Age
            |
            +[countAncestors/p]

The following gives a simplified algorithmic description of the steps required to produce recursive function information maps. It is not optimised.

Tree Construction

The algorithm outlined above in connection with information access tree construction and modified to expand functions with higher order parameters to '3 + number_of_higher_order_params - 1' depth, along with the mechanism outlined above to handle higher-order parameter bindings, is sufficient to produce an initial expansion, although it does not detail how the resolution of names to values (explicit names and formal parameters) is performed. This is just a straightforward issue of scoping of name/value pairs and resolution of names within scopes, and is not covered further. Suffice to say the resulting tree at this point consists of the following node types of interest:

    invocation nodes (identity of function and unique invocation id)
    formal parameter binding nodes (identity of function and its unique invocation id)
    attribute dereference nodes (these denote object.attribute_name expressions)
    unbound variables (always leaf nodes)
    nodes that denote the 'if' branch of evaluation in 'if-then-else' expressions (marker indicating non-propagation of value)
    expansion truncation nodes (points at which expansion curtailed due to recursion)
    3 types of higher order nodes (just scaffolding for tree construction, not needed further)

Virtual Parameter Expansion

This activity occurs at expansion truncation nodes and is actually performed at the time that truncation occurs.

For a recursive function with 'n' non-higher-order parameters (p1..pn) (the higher-order ones are not considered for virtual expansion) an artificial node is inserted in the tree, of type 'expansion curtailment'. This has 'n+1' children. The first is the invocation node that would have been created for the invocation (which is not expanded further). The remaining 'n' denote each of the 'n' non-higher-order parameters of the function. These are expanded by creating an artificial 'name' node corresponding to that of the relevant parameter, then expanding this.

Extraction of Paths

This part of processing identifies the paths of interest for recursive functions. Each leaf in the tree (not just the unbound variable nodes) is considered in turn (except leaves produced by the invocation nodes that are the first children of 'expansion curtailment' nodes) and the path between it and the root examined. If an invocation node for a recursive function is encountered then the leaf node (which identifies the path) is added to the list of interesting paths.

Path processing

Each interesting path is now considered. This is the part of processing that identifies the various modifications performed by the path.

To facilitate processing the path is transformed from being a parent/child linked set of nodes to an array of nodes with the leaf at index 0.

The first stage is to identify the nodes on the path that participate in the construction of 'outer invoke/inner invoke/[inner param]/[outer param]' sets as discussed above. These are easily spotted as being the nodes that correspond to the invocation of recursive functions or the usage of formal parameters of recursive functions. The output of this is a list of indexes in the path array.

These must now be converted into a list of 'quads'. Each 'quad' corresponds to an 'outer invoke/inner invoke/[inner param]/[outer param]' set, and is a set of four numbers - either indexes into the path array or -1 (signifying a missing formal param node).

To build the quad list two maps are initially built based on the unique invocation IDs embedded in the invocation/param nodes. One way to do this is to use hashtables keyed on these IDs. The 'invocation hashtable' simply maps ID to path array index (a given invocation can only ever occur once). The 'param hashtable' maps ID to a vector of path array indexes (more than one formal parameter from an invocation can occur on a given path). These hashtables are built by scanning the path array from 0 to the end. Entries in the param vectors are added to the end.
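A minimal Python sketch of building the two tables is given below, purely by way of example. It assumes, for the purposes of the sketch only, that each path entry is a (kind, invocation_id) tuple with kind being "invoke", "param" or anything else; the function name build_tables is likewise hypothetical.

```python
def build_tables(path):
    """Scan the path array (leaf at index 0) and build both lookup tables."""
    invocation_table = {}  # invocation id -> index of its invocation node
    param_table = {}       # invocation id -> indexes of its formal param nodes
    for index, (kind, invocation_id) in enumerate(path):
        if kind == "invoke":
            # A given invocation occurs at most once on a path.
            invocation_table[invocation_id] = index
        elif kind == "param":
            # Appending while scanning upwards keeps indexes in increasing order.
            param_table.setdefault(invocation_id, []).append(index)
    return invocation_table, param_table
```

Because the scan runs from index 0 upwards and entries are appended, each param vector is in increasing index order, which the later pairing step relies upon.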

This is now a good time to formally record the parameters of recursive invocations. This is done by iterating through each member of the 'invocation hashtable'. For each ID the corresponding entry (a vector) in the 'param hashtable' is looked up. Each entry in this vector identifies a parameter value, the value being defined by the node immediately prior to the corresponding 'formal param' node. The original invocation node is annotated with a reference to this.

Next, pairs of nested invocation nodes are sought, these forming the basis of each quad. The approach here is the same as that used to spot triples of nested invocations when originally constructing the tree (see above). Each invocation node from the 'invocation hashtable' is considered in turn - if it is found to be nested within another invocation of the same function then the pair of indexes into the path array [outer_index, inner_index] is added to the list of pairs.

Only one immediate nesting is considered. If 'a' contains 'b' contains 'c' then checking 'a' will result in the pair ('a','b') and checking 'b' will result in ('b','c') - that is, the pair ('a','c') is not considered. However, further analysis is performed to identify the true level of nesting of the invocation and this is tagged against the invocation nodes.

Each pair is then considered separately and gives rise to 1 or more quads. The corresponding param vectors are extracted from the 'param hashtable' by using the invoke pair ids. If both of these have no entries then this is treated separately, a single quad being produced with -1 for both param indexes. Otherwise it is necessary to, where required, insert -1 indexes for either the outer or inner param entries where they are missing. Note that the way the param entries were built above ensures they are in increasing index order. Basically observe the following: an outer with no inner implies a missing inner. Two outers in a row imply a missing inner. An inner before an outer implies a missing outer. The easiest way to explain the algorithm is via pseudo-code, given a 'pair':

    Vector outer_params = param_hashtable.get( pair[0] );
    Vector inner_params = param_hashtable.get( pair[1] );
    int outer_len = outer_params.size(); // assume 0 size if no params
    int inner_len = inner_params.size(); // likewise
    int outer_pos = 0;
    int inner_pos = 0;
    while( outer_pos < outer_len || inner_pos < inner_len ){
        int outer_ind = outer_pos >= outer_len ? -1 : outer_params[ outer_pos ];
        int inner_ind = inner_pos >= inner_len ? -1 : inner_params[ inner_pos ];
        boolean done = false;
        if ( outer_ind != -1 ){
            if ( inner_ind == -1 ){
                // outer with no inner -> unmatched outer
                done = true;
            }else{
                int next_outer_ind = (outer_pos+1) >= outer_len ? -1 : outer_params[ outer_pos+1 ];
                if ( next_outer_ind != -1 && next_outer_ind < inner_ind ){
                    // unmatched outer param, two outers in a row
                    inner_ind = -1;
                    done = true;
                }
            }
        }
        if ( done ){
            outer_pos++;
        }else{
            inner_pos++;
            if ( outer_ind != -1 ){
                if ( outer_ind > inner_ind ){
                    outer_ind = -1;
                }else{
                    outer_pos++;
                }
            }
        }
        quads.add( new Quad( pair[0], pair[1], inner_ind, outer_ind ) );
    }

Thus at the end of this process there is a list of zero or more quads, each denoting points in the path that have an impact on one or more recursive functions' information maps.
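The pairing logic may be easier to follow as a runnable sketch. The Python below is one possible reading of the pseudo-code, not a definitive implementation: the name build_quads, the representation of a quad as a plain tuple (outer_invoke, inner_invoke, inner_param, outer_param), and the explicit handling of the 'both empty' case inside the same function are all assumptions of this sketch.

```python
def build_quads(outer_invoke, inner_invoke, outer_params, inner_params):
    """Pair outer/inner formal param indexes, using -1 where one is missing."""
    if not outer_params and not inner_params:
        # Treated separately: a single quad with both param indexes missing.
        return [(outer_invoke, inner_invoke, -1, -1)]
    quads = []
    outer_pos = inner_pos = 0
    while outer_pos < len(outer_params) or inner_pos < len(inner_params):
        outer_ind = outer_params[outer_pos] if outer_pos < len(outer_params) else -1
        inner_ind = inner_params[inner_pos] if inner_pos < len(inner_params) else -1
        done = False
        if outer_ind != -1:
            if inner_ind == -1:
                done = True  # outer with no inner -> unmatched outer
            else:
                has_next = outer_pos + 1 < len(outer_params)
                next_outer = outer_params[outer_pos + 1] if has_next else -1
                if next_outer != -1 and next_outer < inner_ind:
                    inner_ind = -1  # two outers in a row -> missing inner
                    done = True
        if done:
            outer_pos += 1
        else:
            inner_pos += 1
            if outer_ind != -1:
                if outer_ind > inner_ind:
                    outer_ind = -1  # inner before outer -> missing outer
                else:
                    outer_pos += 1
        quads.append((outer_invoke, inner_invoke, inner_ind, outer_ind))
    return quads
```

For instance, with outer param indexes [1, 3] and a single inner param index [5], the first outer is emitted unmatched and the second is paired with the inner.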

Each quad is then analysed further to derive the associated modifications that it represents.

There are four sub-cases to analyse:

1) Both the outer and inner param indexes are -1. This quad represents a value local to the recursive function being used locally. Thus the original and target parameter names for parameter rotation are set to 'local'. p_mods are not generated by this case, only l-mods and i-mods.

The 'p_mod' is set to null.

The 'l-mod' is set to the path between the leaf and the inner invoke. If the l-mod contains an 'expansion curtailment' node then it represents a virtual leaf and is thus discarded.

The 'i-mod' is set to the path between the inner invoke and the outer invoke.

2) The outer param index is -1, the inner is not. This quad represents a value local to the recursive function that is then passed into a recursive invocation. The original parameter name is set to 'local', the target parameter name is set to that of the inner parameter node. If the target parameter is higher order then no mods are generated. Otherwise:

The 'p_mod' is set to the path between the leaf and inner param.

The 'l-mod' is set to the path between inner param and inner invoke. If the l-mod contains an 'expansion curtailment' node then it represents a virtual leaf and is thus discarded.

The 'i-mod' is set to the path between the inner invoke and the outer invoke.

3) The inner param index is -1, the outer is not. This can only occur under limited circumstances, when a recursive invocation returns a higher order function to which a parameter is then applied, and actually identifies a leaf usage. The original and target parameter names are set to that of the outer param, qualified to indicate that this represents leaf usage. If the target parameter is higher order then no mods are generated. Otherwise:

The 'p_mod' is set to null.

The 'l-mod' is set to the path between the outer param and the inner invoke. If the l-mod contains an 'expansion curtailment' node then it represents a virtual leaf and is thus discarded.

The 'i-mod' is set to the path between the inner invoke and the outer invoke.

4) Both param indexes are not -1. If the target parameter is higher order then no mods are generated. Otherwise the question has to be asked: is this really a single quad or does it really represent a combination of cases (2) and (3) above? It is possible, given the above algorithm, for this mistake to have been made. To answer this it is necessary to consider the circumstance under which such confusion occurs. This is when a locally originating partial function is passed into a recursive call, where it is returned as a result of the function, then this has a parameter applied to it. This situation is recognised by examining the occurrences of higher-order-node triples, 'ho-donor', 'ho-invoke' and 'ho-receptor'. If such a triple exists with: 1) the 'ho-donor' between outer_invoke and inner_invoke 2) the 'ho-receptor' between outer_param and inner_param 3) the 'ho-invoke' between inner_invoke and ho_receptor, then the quad must be split into two. Thus:

a) split required

Treat as two separate quads:

    ( outer_invoke, inner_invoke, inner_param, -1 )
    ( outer_invoke, inner_invoke, -1, outer_param )

b) split not required

The 'p_mod' is set to the path between the outer param and the inner param.

The 'l-mod' is set to the path between the inner param and inner invoke. If the l-mod contains an 'expansion curtailment' node then it represents a virtual leaf and is thus discarded.

The 'i-mod' is set to the path between the inner invoke and the outer invoke.
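The four sub-cases can be summarised by the following Python sketch, which maps a quad to the path-array spans of its three modifications. It is a simplification under stated assumptions: the function name quad_spans and the (start, end) span representation are hypothetical, the leaf is taken to be index 0 of the path array, and the higher-order checks, the case (4) split test and the discarding of virtual leaves are deliberately left out.

```python
def quad_spans(outer_invoke, inner_invoke, inner_param, outer_param):
    """Return (p_mod, l_mod, i_mod) as (start, end) index pairs, None if absent."""
    leaf = 0  # the path array has the leaf at index 0
    i_mod = (inner_invoke, outer_invoke)
    if outer_param == -1 and inner_param == -1:
        # Case 1: local value used locally.
        return None, (leaf, inner_invoke), i_mod
    if outer_param == -1:
        # Case 2: local value passed into a recursive invocation.
        return (leaf, inner_param), (inner_param, inner_invoke), i_mod
    if inner_param == -1:
        # Case 3: leaf usage via a returned higher order function.
        return None, (outer_param, inner_invoke), i_mod
    # Case 4 (no split): parameter rotated from outer to inner position.
    return (outer_param, inner_param), (inner_param, inner_invoke), i_mod
```

In every case the i-mod is the span between the inner and outer invoke; only the p_mod and l-mod spans vary with which param indexes are present.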

The result of each quad thus identifies an 'impact' of the form:

1) a recursive invocation pair, the outer invoke giving the actual instance of the invocation of a recursive function, the inner invoke identifying one of the recursion points.

2) two formal parameter names, or 'local', giving the parameter rotation details of this invoke - origin and target parameter names. The names may be qualified to indicate generation due to case (3) above.

3) a 'p_mod' detailing the modifications made to the origin parameter by this invoke.

4) an 'l-mod' detailing a leaf modification. The first param in this identifies the parameter to which the leaf modification is made. The nesting level of this modification is given by the nesting level of the outer invoke node.

5) an 'i-mod' that details modifications made to the result of the invocation.

Any modification that contains one or more 'if' operator markers is truncated after the first one as it signifies that the modification terminates here.

Each modification is then turned into a human-readable string by turning unbound variable nodes and attribute dereference nodes into strings and concatenating them, scanning from the root-end of the modification towards the leaf. When doing this, if an 'invoke' node is encountered that identifies a recursive function then, if the corresponding 'formal param' node is found further towards the leaf, the section between these nodes represents sub-recursion. If the formal param identified here is higher-order then the section is ignored. Otherwise the section is replaced with a reference to sub-recursion of the form [function_name:formal_param], where the 'formal_param' is that of the formal param node encountered.

Impact Amalgamation

The impacts can now be collected together on a recursive-function basis. For each different recursive function, impacts that relate to it are collected into a group. If the function has higher order parameters or returns, this group is subdivided into a sub-group for each individual outer invocation, otherwise a generic description is produced. The members of each group/sub-group are then merged to produce a complete description. Duplicate modifications are removed.
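A hedged Python sketch of the grouping step follows. It assumes, for illustration only, that each impact has been reduced to a (function, outer_invocation_id, modification) tuple with the modification already rendered as a string; the name amalgamate and the parameter per_invocation_functions (the set of functions with higher order parameters or returns) are inventions of this sketch.

```python
from collections import defaultdict


def amalgamate(impacts, per_invocation_functions=frozenset()):
    """Group impacts per recursive function, dropping duplicate modifications."""
    groups = defaultdict(list)
    for function, outer_invocation_id, modification in impacts:
        if function in per_invocation_functions:
            # Higher order parameters/returns: one sub-group per outer invocation.
            key = (function, outer_invocation_id)
        else:
            # Otherwise a single generic description for the function.
            key = (function, None)
        if modification not in groups[key]:
            groups[key].append(modification)
    return dict(groups)
```

Keying the generic case on (function, None) keeps both kinds of description in one table, which suits the later lookup of an individual description first and a generic one otherwise.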

Information Map Production

The tools required to assemble the information maps have now been described. The first step is to produce the individual information maps generated by leaves/recursive nodes in the tree. The expansion tree is searched from the root towards the leaves.

If a node representing the invocation of a recursive function is encountered then the recursive description is looked up. This is done by checking for an individual description, in the case of there being higher order parameters/higher order return, or a generic description otherwise. The actual 'n' parameters tagged at this node (including higher-order ones), along with the description found and the path between the node and the root, are all used to form 'n' separate information maps of the form:

with the 'actual parameter' being processed in the same way as the non-recursive part of the tree.

If a leaf of type 'unbound variable' is found then an information map of the form:

<root to leaf mods> <unbound variable> is created.

When this is complete all information maps that do not end in an unbound variable are discarded.

The resulting set of information maps is now collected as indicated above to produce the final result.
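The collection of separate maps into one tree per unbound variable can be illustrated by the following Python sketch. It assumes each map is written as a dotted string such as 'claimant.Mother.Alive' and represents the merged tree as nested dictionaries; the function name merge_maps and this representation are choices made for the sketch, and references to recursive maps are not handled.

```python
def merge_maps(maps):
    """Merge dotted information maps into one nested tree per unbound variable."""
    tree = {}
    for information_map in maps:
        node = tree
        for part in information_map.split("."):
            # Reuse an existing branch where one exists, else create it.
            node = node.setdefault(part, {})
    return tree
```

Applied to the four 'claimant' maps quoted earlier, this yields a single tree rooted at 'claimant' with 'Mother', 'Father' and 'Age' branches.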

The following is an overview which is considered sufficient to understand how a full implementation may be performed. Each distinct inverted tree is the start point for a connected GUI, for example, dialog that commences with a search based on the unbound variables and then potentially follows the edges of the graph during evaluation to display/obtain/verify information.

If the unbound variable names have recursive components, this signifies that the search may be entered multiple times. If part of the tree contains recursive components, then the GUI needs to be able to handle repeated traversal of such components.

The GUI handles requests of three types: "Set" relates to information provided to the GUL No action is required for it. "Get" is a request to obtain the value of a particular piece of information. "Verify" is a request for the GUI to verify that a particular piece of information is correct/as desired.

During an evaluation all interaction with the GUI is in the context of the inverted trees deduced during the analysis phase. This means that they can be automatically matched onto the GUI elements that have been attached to the trees during the GUI design process. Actual unbound variable names are matched against those identified in the inverted trees, allowing for repetition of recursive components if they exist.
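A minimal Python sketch of such request handling is given below. It assumes that run-time names carry their recursive component in braces (for example `Surname{1::1}.NINO`) and that matching against the inverted tree can be done by stripping that component. The `Request`, `tree_key` and `Gui` names, and the use of scripted answers in place of real user input, are assumptions made for illustration.

```python
import re
from enum import Enum

class Request(Enum):
    SET = "set"
    GET = "get"
    VERIFY = "verify"

def tree_key(name: str) -> str:
    """Strip invocation components such as '{1::1}' so that repeated
    (recursive) uses of a variable match the same inverted-tree entry."""
    return re.sub(r"\{[^}]*\}", "", name)

class Gui:
    def __init__(self, elements, answers):
        self.elements = elements   # inverted-tree key -> attached GUI element
        self.answers = answers     # scripted user input, keyed the same way

    def handle(self, kind: Request, name: str, value=None):
        key = tree_key(name)
        if key not in self.elements:
            raise KeyError(f"no GUI element attached for {name!r}")
        if kind is Request.SET:
            return None                        # no action is required
        if kind is Request.GET:
            return self.answers[key]           # obtain a value from the user
        if kind is Request.VERIFY:
            return self.answers[key] == "OK"   # user confirms or rejects
        raise ValueError(kind)

gui = Gui(
    elements={"Surname": "search dialog", "Surname.NINO": "person page"},
    answers={"Surname": "Smith", "Surname.NINO": "OK"},
)
print(gui.handle(Request.GET, "Surname{1}"))
print(gui.handle(Request.VERIFY, "Surname{1::1}.NINO"))
```

Because the recursive component is ignored during matching, a first and a repeated request for the same variable are routed to the same GUI element.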

For the purposes of illustration consider the following simplified example. It tries to locate an individual by searching on their surname. If no matches are found the search is repeated. If more than one match is found, each matching person is presented by the GUI in turn to see if it is the desired entry. When such a person is found their current address is verified.

    Person findPerson() {
        unbound String Surname;
        // the 'select' operator looks up objects by keyword
        let list people := select( Person, "Surname=" + Surname );
        if length( people ) == 0 then
            findPerson()                                    // invoke id 1
        else
            let Person person := selectPerson( people );    // invoke id 2
            if person == null then
                findPerson()                                // invoke id 3
            else
                person
            endif
        endif;
    }

    Person selectPerson( list people ) {
        if people == nil then
            null
        else
            let Person person := head( people );
            if verify( person.NINO :: person.FirstName :: nil ) then
                person
            else
                selectPerson( tail( people ) )
            endif
        endif;
    }

    void main() {
        let Person person := findPerson();                  // invoke id 1
        let Address address := person.Lives-At;
        verify( address.Line1 :: address.Line2 :: address.Postcode :: nil );
    }

Note that error reporting is not handled. Also assume that the 'verify' operator performs a parallel verification of the elements of the list it is supplied.

Analysis results in the following inverted tree:

    Surname{[1:::3::]1} ---> .NINO
                         |   .FirstName
                         +   .Lives-At ------> .Line1
                                           |   .Line2
                                           +   .PostCode

This suggests a GUI along the following lines:

A search dialog containing a field for a person's surname, potentially invoked more than once to perform alternative searches.

A person information page containing areas for a person's NINO and FirstName. This has a link to a panel to display their address.

At run time the application could behave as follows.

The evaluation will result in a prompt being made for a surname: get[Surname{1}]. This would be matched by the GUI onto the search dialog as it matches the 'Surname' name.

The GUI would then display the search dialog and acquire a value for 'Surname', returning it to the evaluation.

Evaluation would proceed and try to locate the people with the given surname. Assume that none are found, causing recursion via 'invoke id 1'. This would result in another prompt for a surname: get[Surname{1::1}]. Again this matches the search dialog and the GUI obtains and returns another surname.

This time assume that a person object is located. The evaluation proceeds to the verify within selectPerson, resulting in verify[Surname{1::1}.NINO] and likewise for FirstName. The GUI matches this to the person information panel and automatically generates a "verify this" prompt to the user.

Assuming that the user responds 'OK' to this, the evaluation returns to 'main', where verify[Surname{1::1}.Lives-At.Line1] and the corresponding requests for Line2 and PostCode are generated. Again, the GUI matches these requests against the appropriate form and prompts the user.
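The run-time behaviour walked through above can be imitated with the following Python sketch. It is a loose transliteration of the example, not the mechanism of the invention: the `PEOPLE` data, the `ScriptedGui` class and the way names are built by joining invocation ids with `::` are assumptions made for illustration, and the naming scheme is simplified to show only the invocations of findPerson.

```python
# Hypothetical data standing in for the Person objects found by 'select'.
PEOPLE = [
    {"Surname": "Smith", "NINO": "AB123456C", "FirstName": "Ann"},
]

class ScriptedGui:
    """Stands in for the GUI: replays surnames and records every request."""
    def __init__(self, surnames):
        self.surnames = iter(surnames)
        self.log = []

    def get(self, name):
        self.log.append(f"get[{name}]")
        return next(self.surnames)

    def verify(self, names):
        self.log.extend(f"verify[{n}]" for n in names)
        return True  # the user answers 'OK' every time

def find_person(gui, path):
    var = "Surname{" + "::".join(path) + "}"
    surname = gui.get(var)
    people = [p for p in PEOPLE if p["Surname"] == surname]
    if not people:
        return find_person(gui, path + ["1"])      # invoke id 1
    person = select_person(gui, var, people)       # invoke id 2
    if person is None:
        return find_person(gui, path + ["3"])      # invoke id 3
    return person, var

def select_person(gui, var, people):
    if not people:
        return None
    if gui.verify([f"{var}.NINO", f"{var}.FirstName"]):
        return people[0]
    return select_person(gui, var, people[1:])

def main(gui):
    person, var = find_person(gui, ["1"])          # invoke id 1
    prefix = f"{var}.Lives-At"
    gui.verify([f"{prefix}.Line1", f"{prefix}.Line2", f"{prefix}.PostCode"])

gui = ScriptedGui(["Jones", "Smith"])  # first search fails, second succeeds
main(gui)
print("\n".join(gui.log))
```

With the scripted input shown, the first search finds nobody and recurses, so the recorded requests follow the same order as the walkthrough: two gets, then the verifies for the person, then the verifies for the address.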

In general terms, the invention relates to a mechanism for the automated extraction of an application's use of information, both in terms of what it potentially uses and the relationship between pieces of information used. In this connection, it defines a naming scheme whereby the application's various uses of input data can be identified.

The invention may be implemented on any known computer system capable of supporting object-oriented processing. The computer system may be as schematically illustrated in Fig 9, where the computer system 91 includes a user interface 92, a database 93 and an application transformation/analysis mechanism 94. Mechanism 94 may be comprised by a special purpose unit or conventional computer processing hardware running special purpose software, comprising a program stored on a computer-readable storage medium, for carrying out the transformation and analysis etc, as well as the sorting and merging steps, namely the various steps indicated in Fig 10, other than the initial definition of the application in a functional language. As will be appreciated from the above description, transformation of the application to uniquely identify each unbound variable may comprise firstly adding two implicit parameters to each function, and secondly insertion of the parameters at function invocation points. Optionally, explicit names can be assigned to expressions within an application. In the recursive case, it is not possible to fully expand the evaluation tree, but it is only necessary to expand it as far as the third nested invocation of a function, in order to permit capture of all the required information, as discussed above, in other than higher order cases.


Claims

CLAIMS:

1. A method for use in designing a user interface for an application, including the steps of defining the application in a functional language and deducing use of information by the application from the application itself by analysis thereof.

2. A method as claimed in claim 1 and including the step of providing unbound variables of the application with respective unambiguous names prior to said analysis step.

3. A method as claimed in claim 2 and including the step of assigning explicit names to expressions within the application after said name provision step.

4. A method as claimed in any one of the preceding claims and including the step of determining whether or not the application includes recursive functions, and wherein the analysis includes expanding the application to form a maximal evaluation tree in the absence of recursive functions, or in the presence of recursive functions expanding the application to the maximum extent possible.

5. A method for use in the construction of a user interface for an application, the application being defined in a functional language and being in the form of a tree structure, leaves of the tree being comprised by unbound variables of the application, the method including the steps of transforming the application whereby to provide each unbound variable with a respective unambiguous name, analysing the transformed application whereby to produce a maximal evaluation tree which denotes the potential evaluation of all application branches and employs the unambiguous names for the leaves thereof, analysing the maximal evaluation tree to produce information maps for individual usages of the unbound variables, sorting the maps into groups in dependence on the unambiguous names, and merging the maps of each said group whereby to produce a corresponding single tree from which a respective user interface can be constructed.

6. A method as claimed in claim 5, and including the step of assigning explicit names to expressions of the application after the provision of the unambiguous names.

7. A method as claimed in claim 5 or claim 6 and including the step of determining whether or not the application includes recursive functions, and wherein the analysing step includes expanding the application to form a maximal evaluation tree in the absence of recursive functions, or in the presence of recursive functions expanding the application to the maximum extent possible.

8. A method for use in designing a user interface for an application wherein the use of information by an application is automatically deduced from the application itself, the application comprising a plurality of business rules corresponding to a business purpose, being defined in a functional language and being in the form of a tree structure, leaves of the tree being comprised by unbound variables of the application, the method including the steps of:

transforming the application whereby to provide each unbound variable with a respective unambiguous name, analysing the transformed application whereby to produce a maximal evaluation tree which denotes the potential evaluation of all application branches and employs the unambiguous names for the leaves thereof, analysing the maximal evaluation tree to produce information maps for individual usages of the unbound variables, sorting the maps into groups in dependence on the unambiguous names, and merging the maps of each said group whereby to produce a corresponding single tree from which a respective user interface can be constructed.

9. A computerised method for use in constructing a user interface for an application defined in a functional language and having a tree structure, leaves of the tree being comprised by unbound variables of the application, the method including the steps of:

10. A computer system for use in the construction of a user interface for an application, the application being defined in a functional language and being in the form of a tree structure, leaves of the tree being comprised by unbound variables of the application, the computer system including:

means for transforming the application whereby to provide each unbound variable with a respective unambiguous name, means for analysing the transformed application whereby to produce a maximal evaluation tree which denotes the potential evaluation of all application branches and employs the unambiguous names for the leaves thereof, means for analysing the maximal evaluation tree to produce information maps for individual usages of the unbound variables, means for sorting the maps into groups in dependence on the unambiguous names, and means for merging the maps of each said group whereby to produce a corresponding single tree from which a respective user interface can be constructed.

11. A computer readable storage medium having a program thereon, wherein the program is for use in the construction of a user interface for an application, the application being defined in a functional language and being in the form of a tree structure, leaves of the tree being comprised by unbound variables of the application, and wherein the program includes steps for:

12. A method, for use in the construction of a user interface for an application, substantially as herein described with reference to and as illustrated in the accompanying drawings.

13. A computer system for use in the construction of a user interface for an application, substantially as herein described with reference to and as illustrated in the accompanying drawings.