RECURSIVE DYNAMIC ACCESS TO A DATA MODEL HAVING A HIERARCHICAL TREE STRUCTURE
The present invention relates to computer processes for accessing large volumes of data, which may be geographically distributed, using a large distributed data model. Any type of data may be accessed, but the invention is particularly suited for accessing pointers to sources of information.
As computer technology advances, multi-media computers need to access an ever increasing number of data elements, such as pointers to sources of information, for example URLs and files accessed via the Internet, an intranet or locally, or television channels accessed via a tuner. For example, as the Internet is used, a vast number of websites may be browsed on a regular basis by a user who wants to be able to relocate a given web page quickly and simply. Currently this is difficult for the user, and the various techniques used by Internet browsers to store URLs for reference to web pages do not allow convenient access. Often it is a matter of navigating through a chain of web pages via successive hyperlinks to locate a page of interest, which is clumsy and slow for the user. Furthermore, there is a need to store and access under a single platform pointers to different types of information source, for example information data files such as GIF image files, which may operate under a different platform, or TV channels, as their number explodes with digital broadcasting. In fact, efficient access to large numbers of data elements is a problem for many types of data element, especially where they are distributed geographically on different computers. It would be desirable to provide a tool for organising and accessing large numbers of data elements, such as pointers to sources of information, in a simple and interactive manner. It would also be desirable to provide a tool which is capable of providing efficient access to different types of information source.
According to a first aspect of the present invention, there is provided a process for accessing a large number of data elements in a computing system having an input device and a display device, the process comprising:
accessing a data file storing data which represents a group of nodes arranged in a tree structure with successive levels each consisting of sets of child nodes linked to a common node in the preceding level and has data elements at respective nodes; receiving inputs by a user from an input device successively selecting a chain of linked nodes in successive levels; accessing the successive nodes of the chain in response to the inputs; and in response to an input selecting a node having a data element, accessing that data element.
Accordingly the present invention implements a data model in which a group of nodes are arranged in a hierarchical tree structure, wherein each node at any given level in the tree structure is linked to a set of further nodes at the next level in the tree structure. This data model allows for a logical design of the data structure. Access through the tree structure of nodes is provided. When the selected node has a data element, that data element is accessed. This provides simple and intuitive access to a vast number of data elements.
The data model is advantageous because it lends itself to recursive access in which each level is accessed by the same process performed recursively. This is powerful because a single process suffices for any number of levels, thereby reducing the complexity of a computer program implementing the invention.
According to another aspect of the present invention there is provided a process for accessing a large number of data elements in a computing system having an input device and a display device, the process comprising:
dynamically accessing respective ones of a plurality of data files each storing data which represents a group of nodes arranged in a tree structure with successive levels each consisting of sets of child nodes linked to a common node in the preceding level and has data elements at respective nodes and pointers to other data files at respective nodes; receiving inputs by a user from an input device successively selecting a chain of linked nodes in successive levels; accessing the successive nodes of the chain in response to the inputs; in response to an input selecting a node having a pointer to another data file, dynamically accessing that other data file; and in response to an input selecting a node having a data element, accessing that data element.
By using different groups of data each employing a data model having a hierarchical tree structure of nodes, it is possible to create an overall tree structure from plural groups. This allows the groups to be accessed dynamically, that is by accessing only one group at a time. Access to other groups may be provided by pointers. Thus a huge data distribution may be provided without the need to access all of it at once, for example by retrieval over a network. This prevents the program from becoming unwieldy and slow.
The invention is particularly suited to storing pointers to sources of information as the data elements. For example, these may be pointers to URLs in a computer network, in which case pages corresponding to the URLs may be accessed. In this way, the present invention can provide a powerful tool for organising and accessing large numbers of websites.
It is convenient to implement the present invention as a Java applet which runs under the Internet browser. In this way, the display of labels may be provided as part of a web page and the Internet browser may be controlled to access the web pages. In addition, the pointers may include pointers to other sources of information such as files including image or other multimedia information operating under a different platform. In this case, access is made by activating that platform and retrieving the relevant data file. Thus the present invention provides access to a wide variety of types of information in a single unified manner.
Preferably, access is provided by successively displaying labels for the sets of nodes linked to each selected node in the chain.
Preferably, the labels are displayed on the display device in a two dimensional array. This allows for easy recognition of the position of nodes by the user which facilitates searching and thereby increases the power of the system for organising the information. It has been found that a square array is particularly convenient for this purpose as it is logical to the user. A three-by-three array simplifies the amount of information presented at any one time whilst still providing access to a large number of nodes, that is 9^n nodes in a tree with n levels, which amounts to over half a million nodes with just six levels.
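The figure quoted for six levels can be checked with simple arithmetic. The following is a minimal sketch, assuming every node at every level has exactly nine children and counting only the nodes at the deepest level; the function name is illustrative and does not come from the described embodiment.

```python
def nodes_at_level(levels: int, fanout: int = 9) -> int:
    """Number of nodes at depth `levels` when every node has `fanout` children."""
    return fanout ** levels


# Six levels of a three-by-three array give 9**6 nodes.
print(nodes_at_level(6))  # 531441, i.e. over half a million
```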
The present invention may also provide a graphical representation of the current position in the hierarchy. This allows the user to visualise the position of the node to which he has navigated. The graphical representation may consist of an element for each level of the hierarchy illustrating the position of the label for the selected node in that level relative to the positions of the displayed labels of the other nodes of that level. The use of such elements allows the current position to be easily recognised even when a vast number of pointers are stored within a single hierarchy.
Preferred embodiments of the present invention will now be described by way of non-limitative example with reference to the drawings, in which:
Fig. 1 illustrates the data model of a group of nodes employed by the preferred embodiment.
Fig. 2 illustrates the data model employed for four groups of nodes.
Fig. 3 illustrates a personal computer on which the present invention may be implemented.
Fig. 4 illustrates an implementation of the present invention within an Internet browser.
Fig. 5 illustrates the format of the display provided by the present invention.
Fig. 6 illustrates an example of the graphical representation of the current position in the hierarchy to which the user has navigated.
Fig. 7 illustrates the program used to generate data files encoding the hierarchy of nodes.
The data model employed is as follows. As illustrated in Fig. 1, the data is organised into groups of "nodes" 15. Each node 15 represents a node in a hierarchical tree structure within the group stemming from a root node 15a. Relative keys associated with each node imply the relative position of each node within the hierarchy for the arrays. Within the tree structure, nodes 15 at any given level may be linked to a set of child nodes at the next level in the hierarchy, as illustrated by links 16 in Fig. 1.
As illustrated by the numbers inside the nodes 15 in Fig. 1, the relative key for each given node consists of a sequence of numbers specifying the respective links 16 of the chain linking that given node 15 to the root node 15a. Consequently the relative key for each node is unique within the group. Preferably, the decimal numbering system shown is used for the relative keys, which allows up to 9 child nodes; however, other numbering systems could be used to accommodate a greater number of child nodes 15.
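By way of illustration only, the relative keys described above can be handled as short digit strings. The sketch below assumes that each link is numbered with a single decimal digit from 1 to 9 and that the root node is represented by the empty string; both conventions, and all function names, are assumptions made for this example rather than details taken from the embodiment.

```python
DIGITS = "123456789"  # one decimal digit per link, so at most 9 children per node


def parse_key(key: str) -> list[int]:
    """Split a relative key such as "152" into its link numbers [1, 5, 2]."""
    if any(ch not in DIGITS for ch in key):
        raise ValueError(f"invalid relative key: {key!r}")
    return [int(ch) for ch in key]


def level(key: str) -> int:
    """Depth of the node below the root (the root itself is level 0)."""
    return len(key)


def parent_key(key: str) -> str:
    """Key of the parent node, obtained by dropping the last link."""
    if not key:
        raise ValueError("the root node has no parent")
    return key[:-1]


def child_keys(key: str) -> list[str]:
    """Keys of the (up to nine) child nodes at the next level."""
    return [key + d for d in DIGITS]
```

Because a key spells out the whole chain of links from the root, the parent and children of a node follow from the key alone, without any explicit pointers being stored.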
The nodes may have group pointers to the root node of another group. The pointer links between groups provided by group pointers establish an overall tree hierarchy structure. An example is illustrated in Fig. 2, where nodes of group 17 have pointers to three groups 18, 19 and 20 and nodes of group 19 have pointers to group 21. Fig. 2 shows a tree structure formed by plural groups of nodes, but pointers may point to parent nodes and nodes might not contain an explicit pointer (as shown). Where there is no explicit group pointer, the set of linked nodes at the next implicit level is assumed.
The data elements to be organised are stored under the respective nodes 15. The data elements may be any type of data from simple textual or numeric data to pointers to sources of information, such as URLs or the names and locations of files, which may be local to the computing system or external and accessible through a computer network.
For the purposes of distributing the data, groups are represented by respective data files, which may be encoded for security purposes. Thus, the term "group" simply relates to a group of nodes 15 which may be transferred together as a single data file between media. The purpose of groups is to assist distribution and dynamic access, since individual groups may be handled separately. For example the data files may store for each node:
(1) a relative key unique in the group and defining the position of the node within the group;
(2) a label which may be characters or a graphic;
(3) for predetermined nodes, a data element such as a pointer to a source of information;
(4) for other predetermined nodes, a pointer to another data file;
(5) optionally, security information;
(6) optionally, additional descriptive information.
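The six items listed above can be pictured as one record per node. The following is a minimal sketch only: the field names and the tab-separated line format are hypothetical, since the actual encoding of the data files is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeRecord:
    key: str                          # (1) relative key, unique within the group
    label: str                        # (2) text label, or a reference to a graphic
    element: Optional[str] = None     # (3) data element, e.g. a URL (some nodes only)
    group_file: Optional[str] = None  # (4) pointer to another data file (other nodes)
    security: Optional[str] = None    # (5) optional security information
    description: str = ""             # (6) optional additional descriptive information


def parse_line(line: str) -> NodeRecord:
    """Parse one node from a hypothetical tab-separated line; empty fields mean 'absent'."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (6 - len(fields))  # pad any missing trailing fields
    key, label, element, group_file, security, description = fields[:6]
    return NodeRecord(key, label, element or None, group_file or None,
                      security or None, description)
```

For instance, `parse_line("12\tNews\thttp://example.com/news")` would yield a node at key `12` carrying a URL as its data element and no pointer to another data file.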
The data model described above provides an opportunity for dynamic access. Groups are stored on "hosts" which may be distributed geographically. Groups will be accessed by "clients" which may be distributed geographically. Access is dynamic in that a client may load only those groups to which access is currently required or those groups to which access is likely to be required. To prevent overloading the capacity of the client, unused groups may be discarded and reloaded as required. Groups may be of any convenient size. The size will be determined by the speed of data transfer and the available space on the client.
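One possible way for a client to hold only the groups it currently needs is a small cache that discards the least recently used group when it is full. This is a sketch under stated assumptions: the least-recently-used policy, the class name `GroupCache` and the `fetch` callable (standing in for retrieval from a host) are illustrative choices, not features recited above.

```python
from collections import OrderedDict
from typing import Callable


class GroupCache:
    """Keeps at most `capacity` groups loaded; reloads discarded groups on demand."""

    def __init__(self, fetch: Callable[[str], dict], capacity: int = 4):
        self._fetch = fetch            # hypothetical loader, e.g. a download from a host
        self._capacity = capacity
        self._groups = OrderedDict()   # group name -> group data, oldest first

    def get(self, name: str) -> dict:
        if name in self._groups:
            self._groups.move_to_end(name)        # mark as most recently used
        else:
            self._groups[name] = self._fetch(name)
            if len(self._groups) > self._capacity:
                self._groups.popitem(last=False)  # discard the least recently used group
        return self._groups[name]
```

The capacity would in practice be chosen from the available space on the client and the speed of data transfer, as noted above.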
Preferably, the data model is accessed using recursion. In particular, the program providing access performs the same process recursively each time a node in the current level is selected to access the linked set of nodes in the next level. Thus the same procedure or sub-routine is repeated in a nested fashion for each level through which the user navigates.
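The recursive access described above can be sketched as a single function that calls itself once per level. In this illustration the tree is a dictionary from relative key to node data, the root has the empty key, and the `choose` callable stands in for the user's selection from the input device; these are all assumptions made for the example.

```python
from typing import Callable, Optional


def navigate(tree: dict, key: str,
             choose: Callable[[list], Optional[str]]) -> Optional[str]:
    """Descend from `key` until a node with a data element is selected."""
    node = tree[key]
    if node.get("element") is not None:
        return node["element"]                 # reached a data element: access it
    children = [key + str(d) for d in range(1, 10) if key + str(d) in tree]
    if not children:
        return None                            # nothing below this node
    selected = choose(children)                # stand-in for the user's input
    if selected is None:
        return None                            # user abandoned the navigation
    return navigate(tree, selected, choose)    # same process for the next level


tree = {
    "": {"label": "root"},
    "1": {"label": "News"},
    "2": {"label": "Sport"},
    "11": {"label": "World", "element": "http://example.com/world"},
}
picks = iter(["1", "11"])                      # simulated selections at two levels
result = navigate(tree, "", lambda children: next(picks))
print(result)
```

The same function body serves every level, which is the property that makes the recursion attractive; only the current key changes between calls.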
The data model described above lends itself to recursive access since every node points to the next level in the tree structure implicitly according to the key or explicitly according to the group pointer. As recursion is used, care must be taken to avoid overflow of the stack of data which is stored from the nodes in previous levels which have been recursively accessed to navigate to the current node. Stack management can consist of limiting the number of accessible levels in the hierarchy, or of setting a level at which overflow is approached, at which point the stack could be dumped to magnetic media.
The present invention is preferably implemented on a conventional personal computer 1. As illustrated in Fig. 3, the computer 1 has a processor 1a and conventional types of memory 1b and, connected via conventional interfaces 2, a monitor 2 as a display device and a mouse 3a and keyboard 3b as an input device 3. Any input device 3 could be provided, such as a touch screen, remote control or keypad.
In the preferred form of the present invention the data elements are pointers to sources of information. The sources of information to which the pointer may point include an URL in a computer network, an information data file accessed through a computer network or locally, or a television channel accessed through a tuner. In this case it is preferable to implement the invention in the programming language Java. This allows the programs to be included as a Java applet as part of a web page and to run under a conventional Internet browser 5, such as Microsoft's Internet Explorer. In this case, as illustrated in Fig. 4, the Internet browser 5 responds to the input device 3, controls the display device 2 and accesses a network 6, which may be the Internet or an intranet, in a conventional manner. The Internet browser 5 can also access information data files such as image data or other multi-media information stored on computer 1 by causing computer 1 to operate the platform corresponding to the information data file 7 and then retrieving that file 7.
Java is suitable for implementing the present invention because it is a language which handles recursion well. It is also convenient to use Java to implement access to URLs in a browser, and the client applet also runs in a browser. In Java or other languages, the program may form part of an Internet web-site. This is convenient for distribution since the program can be made available on the Internet or on an intranet. However, a client program could be implemented in any language having the capability to access the data required. Thus executable code written in any language suitable for the platform could be used.
In fact, a suite of programs can each use the data model described above. The main program for a client is the program for browsing data files 8, for convenience referred to as matrix files 8. There is also a program for creating and editing the matrix files 8 which is a stand alone application. There are also various tools for URL verification, data auditing and changing the format of data, including encoding and joining groups or files, and transferring data between database format and files. First, the program for browsing matrix files will be described.
The browser program performs the following process. Initially one or more matrix files 8 are retrieved to the client computer 1 from a host computer over the computer network 6 and stored in the memory 1b of the computer 1. Then the nodes 15 of the data stored in the matrix file 8 are accessed in response to inputs from the user. This is done by performing the following process recursively for each level of the tree structure through which the user navigates, starting with the root node 15a of a matrix file 8.
The display device 2 is operated to indicate the set of nodes linked to the current node by a display in the format illustrated in Fig. 5, including nine display fields 9 arranged in a two-dimensional three-by-three rectangular array, as well as a text field 10. The labels for the set of nodes linked to the current node are displayed in the respective display fields 9.
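A simple way to place the labels of up to nine child nodes into the three-by-three array is to map each link number to a row and column. The sketch below assumes a row-major ordering in which link 1 is the top left field and link 9 the bottom right; the ordering and the function names are assumptions, as the mapping between link numbers and display fields is not stated here.

```python
def grid_position(digit: int) -> tuple[int, int]:
    """Map a link number 1..9 to (row, column) in a 3x3 array, row-major (assumed)."""
    if not 1 <= digit <= 9:
        raise ValueError(f"link number out of range: {digit}")
    return divmod(digit - 1, 3)


def layout(labels: dict[int, str]) -> list[list[str]]:
    """Arrange the labels of the child nodes into a 3x3 array of display fields."""
    grid = [["" for _ in range(3)] for _ in range(3)]
    for digit, label in labels.items():
        row, col = grid_position(digit)
        grid[row][col] = label
    return grid


for row in layout({1: "News", 5: "Sport", 9: "Music"}):
    print(row)
```

Fields for which no child node exists are simply left blank in this sketch.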
Inputs from the input device 3 are received. A pointer graphic 11 may be moved in a conventional manner by manipulation of the mouse 3a. When the pointer graphic 11 points to any given display field 9, the additional textual information for the node corresponding to that display field 9 is displayed in the text field 10. This allows additional information about the node to be displayed to assist in navigation through the hierarchy without disrupting the simple and clear arrangement of display fields 9.
The input device 3 may be used to select the node corresponding to any display field 9 in a conventional manner by using the mouse 3a to move the pointer graphic 11 to point to a given display field 9 and operating the mouse button.
By selecting a node the process is recursively repeated for the selected node. In this way the user navigates down a chain of linked nodes by repeatedly selecting a node in each recursion. Recursion is powerful because it allows the same piece of code to perform the access to each level. In the recursion the data for each set of nodes through which the user has navigated is stored in the temporary memory (e.g. RAM) of the memory 1b as a stack. Thus stack management is performed to prevent overflow as described above. However, stack overflow has not been found to be a problem in tests to a depth of 500 levels, and the tree structure can be terminated by providing links to external data at a realistic depth such as eight levels. In any other implementation, if stack dumping were employed, the number of levels in the tree structure would be limited only by the space available on the client magnetic media.
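The simpler of the two stack management options, limiting the number of accessible levels, might look as follows. This is an illustrative sketch: the class and exception names are hypothetical, the default of eight levels merely echoes the depth mentioned above, and dumping the stack to magnetic media is not shown.

```python
class DepthLimitExceeded(Exception):
    """Raised when navigation would go deeper than the permitted number of levels."""


class NavigationStack:
    """Holds the keys of the nodes visited on the way to the current node."""

    def __init__(self, max_depth: int = 8):
        self.max_depth = max_depth
        self._frames = []

    def push(self, key: str) -> None:
        if len(self._frames) >= self.max_depth:
            raise DepthLimitExceeded(f"cannot descend below level {self.max_depth}")
        self._frames.append(key)

    def pop(self) -> str:
        """Return to the previous level, yielding the key that was left."""
        return self._frames.pop()

    @property
    def depth(self) -> int:
        return len(self._frames)
```

A browsing program could call `push` each time the user selects a node and `pop` each time the user steps back, so that the amount of stored data stays bounded.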
By displaying the labels in display fields 9 in a two-dimensional array, in particular in a rectangular array, the user finds it easier to understand his current position within the tree structure of nodes and hence to visualise the tree structure. In this way, the present invention can provide a powerful interactive tool.
To assist the user in appreciating the position of the current node to which he has navigated, a graphical representation 12 of the current position is displayed. The graphical representation 12 consists of an element 13 for each level of the hierarchy through which the user has navigated. Each element illustrates the array of nine display fields and, within that array, the position of the label for the node selected by the user. For example, Fig. 6 shows the graphical representation 12 displayed when the user has navigated through four levels by selecting in successive levels the top left node, the middle node twice and then the bottom right node. This graphical representation allows the user to visualise his current position within the hierarchy of nodes. This is particularly powerful in combination with the use of the two-dimensional array of display fields.
In response to the selection of a node for which the pointer points to a source of information, that source of information is accessed. If the source of information is an URL, the Internet browser 5 is caused to retrieve and display the page corresponding to that URL. If the pointer points to an information data file operating under a different platform, the platform is activated and the file retrieved. On accessing the source of information the browsing program may terminate or it may continue, for example in a separate window on the display device 2.
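The graphical representation 12 of the current position, with one element 13 per level, can be approximated in plain text. The sketch below assumes the same row-major numbering of the nine fields as a digit string key (1 is top left, 5 is the middle, 9 is bottom right), so that the four selections given in the example for Fig. 6 correspond to the hypothetical key "1559"; the character rendering is purely illustrative.

```python
def render_element(digit: int) -> list[str]:
    """One 3x3 element: '#' marks the selected field, '.' marks the others."""
    return ["".join("#" if r * 3 + c + 1 == digit else "." for c in range(3))
            for r in range(3)]


def render_path(key: str) -> str:
    """Place one element per level side by side, from the first level to the current one."""
    elements = [render_element(int(ch)) for ch in key]
    return "\n".join("  ".join(el[r] for el in elements) for r in range(3))


# Top left, middle, middle, bottom right (under the assumed numbering).
print(render_path("1559"))
```

Under these assumptions the output is three lines of text in which the marked field moves from the top left of the first element to the bottom right of the fourth.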
When the selected node has a pointer to another matrix file 8, dynamic access to the nodes of that further matrix file 8 is provided by retrieving the matrix file 8, if necessary through the network 6 from the host computer. Subsequently the recursive process of accessing nodes is continued from the further data file.
The browser 5 can also modify the matrix file 8 based on the frequency of selection of a particular node. When a node is commonly selected, for example more than ten times, the matrix file 8 is edited so that the commonly selected node becomes a node in the first layer. The node previously in the first layer and all the linked nodes below are each moved down a level. This may be achieved simply by editing the relative key for the relevant nodes.
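Moving a node together with everything below it amounts to rewriting a key prefix. The following sketch shows only that primitive, together with a selection counter; it assumes digit string keys and, for simplicity, that a free first-level slot is available for the promoted node. How the node previously occupying the first layer is pushed down is not shown, and all names and the `free_slot` parameter are hypothetical.

```python
from collections import Counter

PROMOTE_AFTER = 10  # "more than ten times", as in the example given above


def rekey_subtree(nodes: dict, old_prefix: str, new_prefix: str) -> dict:
    """Return a copy of `nodes` with the subtree at `old_prefix` moved to `new_prefix`."""
    if any(k.startswith(new_prefix) for k in nodes if not k.startswith(old_prefix)):
        raise ValueError(f"target position {new_prefix!r} is already occupied")
    moved = {}
    for key, record in nodes.items():
        if key.startswith(old_prefix):
            moved[new_prefix + key[len(old_prefix):]] = record  # rewrite the prefix
        else:
            moved[key] = record
    return moved


def select(nodes: dict, counts: Counter, key: str, free_slot: str) -> dict:
    """Count a selection and promote the node once it has been selected often enough."""
    counts[key] += 1
    if counts[key] > PROMOTE_AFTER and len(key) > 1:
        return rekey_subtree(nodes, key, free_slot)
    return nodes


nodes = {"2": {"label": "Sport"}, "23": {"label": "Tennis"}, "231": {"label": "Scores"}}
print(sorted(rekey_subtree(nodes, "23", "7")))
```

Because each key encodes the full chain of links, no other data needs to change when a subtree is moved in this way.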
Another program generates matrix files 8. The generator program 14 is illustrated in Fig. 7. The generator program 14 allows recursive access using display device 2 and input device 3 in exactly the same manner as the browser 5 described above, by displaying labels for a set of nodes and allowing movement through the levels by selection of nodes. Initially, the labels are blank. The input device 3 may be used to generate or modify the label for each node by pointing to the relevant display field 9 and typing a textual label or referring to a graphic. The textual information in text field 10 may be edited in a similar manner. Similarly the pointers may be inserted. The generator program 14 may use a relational database 15 which is itself of a conventional nature and stores a database relating URLs to tags which classify the pages corresponding to the URLs by content or other criteria. For example the relational database 15 may be created by the Microsoft program SQL. The generator program 14 interrogates the relational database and extracts URLs for insertion as pointers at nodes selected by the user.
Particular nodes may be dragged from the currently displayed set of nodes in a given level into a workspace provided on the display device 2. These may be reinserted on a node in the same or different level in the same group of nodes or in a further group of nodes by dragging back from the workspace. In this way different groups of nodes may be joined together or split apart. Different groups may be related by inserting at a node of one group a pointer to another group.
During creation of the tree structure, information defining each group of nodes is stored in a temporary storage area of the memory 1b of the computer 1. Upon completion, the data for each group of nodes is encoded and stored as a respective matrix file 8.
The present invention could be applied to store pointers to television channels in order to function as an electronic program guide. In this case, as an alternative to the Java applet implementation described above, it would be preferred to implement the present invention as a C++ program.
The present invention has been described above as a process implemented in a computer system by running a program, for example a Java applet. In another implementation, the present invention may be provided as an article of manufacture in the form of a storage medium readable by a computer and encoding the computer program process. In this case, the storage medium might be a magnetic diskette or an optical disc or other portable storage medium. Similarly, it might consist of the memory within the computing system storing the program. The present invention may also have a machine implementation consisting of modules for performing the logical operations of the computer implemented steps of the process.