WO2003042865A2 - Taxonomy management - Google Patents

Publication number
WO2003042865A2
Authority
WO
WIPO (PCT)
Prior art keywords
taxonomy
data
information
managing
objects
Prior art date
Application number
PCT/GB2002/005097
Other languages
French (fr)
Other versions
WO2003042865A3 (en
Inventor
Richard Osbaldeston
David Parker Bastable
Original Assignee
Wordmap Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wordmap Limited
Publication of WO2003042865A2
Publication of WO2003042865A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Definitions

  • the invention relates to the management of data.
  • the invention finds particular application in the creation and editing of taxonomies, and preferably hierarchical arrangements of data. Examples of the invention described below relate to the creation and editing of hierarchical databases of terms for use in searching.
  • a taxonomy preferably comprises a classification system that divides a subject area hierarchically into progressively smaller subdivisions.
  • Taxonomies have been used for many years to classify many forms of knowledge, for example products and services in telephone directories, and books in library subject areas. Using taxonomies, people can organise knowledge into clearly defined categories and can give users intelligent interfaces.
  • aspects of the present invention seek to improve and simplify the creation and/or editing of hierarchical data structures.
  • An aspect of the invention provides an apparatus for managing a taxonomy, the apparatus comprising means (preferably a memory store) for storing a plurality of objects, and means
  • a memory store for storing associated information which is associated with an object of the hierarchy, the associated information including information relating to the
  • each object of the taxonomy has associated information.
  • each object of the taxonomy has additional data.
  • At least a part of the associated information and/or the additional data may be associated with a single object, or may be associated with a group of objects.
  • the group of objects may have a hierarchical relationship between each other. Different parts of the associated information and/or additional data may be associated with different objects or groups of objects.
  • the associated information and/or the additional data includes information for use in a search string.
  • the associated information and/or the additional data may include a search string.
  • the associated information and/or the additional data includes search location information.
  • the search location information may relate to a search engine, a database or other data store which may be located locally, or remotely, for example on the Internet.
  • the search location information may include a URL.
  • the apparatus includes means (preferably a processor and a memory store) for adding an object to the taxonomy, and/or removing an object from the taxonomy.
  • the apparatus includes means (preferably a processor and a memory store) for adding and/or editing additional information and/or additional data associated with an object.
  • the added or edited information and/or data may be associated with a single object or with a group of objects.
  • the apparatus includes means (preferably a processor and a memory store) for adding a link object, the link object being linked to an object of the taxonomy.
  • the apparatus is adapted to effect the display of both the linked object and of the object in a user interface.
  • the linked object and the object are treated as one object in the taxonomy.
  • a further aspect of the invention provides a taxonomy comprising a plurality of objects, wherein associated information is associated with an object of the taxonomy, the associated information including information relating to the hierarchical relationship of the object to another object, and further including additional data relating to the object.
  • a further aspect of the invention provides a method of managing a taxonomy including a plurality of objects, the method including the step of associating information with an object of the taxonomy, the associated information including information relating to the hierarchical relationship of the object to another object, and further including additional data relating to the object.
  • the apparatus is adapted to generate a display of the taxonomy.
  • the apparatus is adapted to generate a display of the taxonomy showing hierarchical relationships between the objects.
  • the apparatus is preferably adapted to use the associated information of the objects to construct a hierarchical representation of the taxonomy.
  • each node of the taxonomy relates to an object of the taxonomy.
  • the associated information is associated with the relevant node of the taxonomy.
  • a further aspect of the invention provides an apparatus for managing a first taxonomy, the apparatus including means for transferring information from a set of data to the first taxonomy.
  • the set of data comprises a second taxonomy.
  • the or each taxonomy comprises a plurality of hierarchically related objects.
  • the apparatus is adapted to transfer information regarding an object of the set of data to the first taxonomy.
  • the information may include information such as the associated information and additional data as described herein.
  • the apparatus is arranged such that transfer of information regarding the object from the second taxonomy effects transfer of information regarding a group of objects of the second taxonomy to the first taxonomy.
  • a group of objects can be transferred from the set of data to the first taxonomy.
  • the transfer may comprise copying information from the set of data to the taxonomy, or deletion from the set of data and insertion into the taxonomy ("cut and paste").
  • the apparatus includes means (preferably a processor) for determining whether the set of data includes data which is similar to data in the first taxonomy.
  • the similar data may be identical to the data in the first taxonomy.
  • the apparatus looks for terms which are duplicated in the imported data and the taxonomy.
  • the apparatus is adapted to remove duplicate data, and/or to prevent duplicate data being included in the first taxonomy.
  • the apparatus includes means (preferably a processor) for preventing duplicate data in the first taxonomy.
  • the apparatus may be used to merge taxonomies.
  • the apparatus includes means (preferably a processor) for merging the associated information and/or the additional data of an object of the taxonomy with data of the set of data.
  • the objects can be merged.
  • the contents of each WordSet in each tree can also be merged, with differing data retained.
  • a further aspect of the invention provides an apparatus for managing a taxonomy including a plurality of objects, the apparatus being adapted to create a link between a first object of the taxonomy and a second object.
  • the second object may be, for example, a further object of the same taxonomy, or may be an object of a different taxonomy.
  • the transfer of information into the taxonomy may, for example, be effected using a keyboard, but is preferably effected using an electronic pointer device, for example a computer mouse.
  • the information can be moved to (or from) the taxonomy using "drag and drop". This feature may be provided independently.
  • the apparatus is adapted to display the first taxonomy and the set of data in a user interface.
  • the taxonomy and the set of data are displayed side by side, preferably in separate panes in a GUI.
  • the apparatus includes means for generating a hierarchical representation of the taxonomy.
  • the apparatus includes means for displaying the objects of a taxonomy in their hierarchical relationship.
  • the display includes a branching hierarchical representation of a taxonomy.
  • a further aspect of the invention provides a method of creating a user interface in an apparatus for managing a taxonomy, the method including generating a hierarchical representation of the taxonomy.
  • the method further includes generating a representation of a set of data.
  • the method includes generating a display including the representation of the taxonomy adjacent the representation of the set of data.
  • the set of data may comprise a further taxonomy.
  • a further aspect of the invention provides a user interface for an apparatus for managing a first taxonomy, the user interface including a hierarchical representation of the first taxonomy.
  • the user interface further includes a representation of a set of data.
  • the representation of the set of data may comprise a hierarchical representation of a taxonomy, which may be the same or a different taxonomy from the first taxonomy.
  • items of a representation can be moved from one representation to another representation.
  • the movement of items effects changes in the taxonomy.
  • the invention also provides an apparatus for managing a taxonomy, the apparatus including means (preferably a processor) for generating a sub-taxonomy comprising a part of the taxonomy.
  • the apparatus preferably includes means for selecting the highest node of the taxonomy to be used in generating the sub-taxonomy.
  • the invention also provides a method of managing a taxonomy, the method comprising the step of generating a sub-taxonomy comprising a part of the taxonomy.
  • a further aspect of the invention comprises a method of generating a search query, the method comprising receiving an input, comparing the input with an object of a taxonomy (preferably a taxonomy as described herein), identifying an object related to the input, retrieving information associated with the identified object, and using the information to generate the search query.
  • the retrieved information may comprise a search string which may be used directly, or may comprise a set of terms which are linked, for example using Boolean operators, to form a search query.
  • the retrieved information may include information regarding the location for the search, and may include a URL.
  • the retrieved information includes one or more items of the associated information and/or the additional data described herein.
  • the method further comprises the step of transmitting the search query, for example to a search engine, database or other data store.
  • the method includes identifying two objects related to the input, and retrieving information relating to the two objects.
  • This feature may be provided independently.
  • the retrieved information can be used to resolve any ambiguity in the meaning of the input.
  • a further aspect of the invention comprises an apparatus for generating a search query, the apparatus comprising means (preferably a processor) for receiving an input, means (preferably a processor and a memory store) for comparing the input with an object of a taxonomy (preferably a taxonomy as described herein), means (preferably a processor) for identifying an object related to the input, means (preferably a processor) for retrieving information associated with the identified object, and means (preferably a processor) for generating the search query.
  • the invention further provides a search query generated by a method and/or using an apparatus as described herein.
  • a taxonomy is a hierarchy of subject headings used to classify information. Taxonomies have been used for many years to classify, for example, plants and animals, books, manufactured goods and census returns.
  • Metadata is a set of information that describes the resource in order to make it easier for others to retrieve it. Metadata typically includes information about the author of the document, the date of publication, publisher, format etc. The main purpose of metadata is to allow resources to be found more easily by information users.
  • An important part of the metadata record is the "subject" field(s). This field allows an author, a manager of content such as a librarian or web site manager, or any other individual to record the key subjects to which the resource is related, so that those searching for information on that subject will find the resource in question.
  • the keywords in this subject field are taken from an agreed taxonomy, so that all those using the system are in agreement about how a given subject should be defined.
  • an agreed taxonomy such as Dewey Decimal or the UNSPSC (Universal Standard Products and Services Classification) is almost always used.
  • PCT/GB00/03652 describes a single taxonomy for use in classification and information navigation.
  • a single taxonomy that provides for all of these groups risks becoming unmanageably large. It might be difficult for users to navigate the taxonomy and to locate items of interest. There is also a risk that users would become overwhelmed by irrelevant taxonomy entries.
  • a further aspect of the invention provides an apparatus for managing a taxonomy, including means (for example a memory) for storing data relating to the taxonomy, the apparatus including control means (for example a processor and associated memory) for controlling access to the data.
  • the data may comprise, for example, the objects themselves of the taxonomy, the information relating to the hierarchical relationship between the objects, and/or information associated with the objects.
  • control means is adapted to provide access only to a portion of the data relating to the taxonomy.
  • control means is adapted to prevent access to data relating to the taxonomy.
  • This feature may be used by a user wishing only to view and/or edit a part of the taxonomy; by controlling the data which can be viewed and/or edited, the user's task may be simplified.
  • the control means may also, or alternatively, provide an important security feature by preventing access to sensitive or confidential information, and also to control the ability of users to modify the data relating to the taxonomy.
  • the control means provides a "filter" which regulates what can be viewed by a particular user and/or what can be modified by a user.
  • the control means may therefore comprise a security layer between the user and the taxonomy data.
  • control means is adapted to prevent the reading of data relating to the taxonomy and/or to prevent the modification of data relating to the taxonomy.
  • modification may, for example, comprise deleting or adding objects to the taxonomy, changing hierarchical relationships between objects in the hierarchy and/or adding or editing information associated with the objects of the hierarchy.
  • control means is adapted to prevent saving modifications to the data relating to the taxonomy.
  • the control means may allow the user to make modifications to the data, but not to make permanent changes to the stored data of the taxonomy.
  • control means provides a security layer which lies between the user interface and the stored information.
  • different access can be provided to the different types of data in the system.
  • the apparatus includes a first storage means (for example a memory) for storing data relating to the taxonomy which is adapted for use for viewing the taxonomy, and a second storage means (for example a memory which may be a part of the memory of the first storage means) for storing data relating to the taxonomy which is adapted for use for editing the data.
  • control means is adapted to provide different access to the two data stores.
  • the storage means may be provided remotely from the control means.
  • the apparatus includes means (for example a processor and associated memory) for controlling access to data relating to an object of the taxonomy. Different access criteria can therefore be provided for each node of the taxonomy.
  • the apparatus comprises means (preferably a memory) for storing a list of users, and means for storing access privilege information associated with a user.
  • the access privilege information defines the access the user has to the data, for example the objects which can be viewed and/or modified by the user.
  • the apparatus further includes means (preferably a processor with associated memory) for receiving a request from a user relating to the data relating to the taxonomy, means (preferably a processor with associated memory) for retrieving the access privilege information for that user, and means (preferably a processor with associated memory) for using the access privilege information to determine whether or not to carry out the request.
  • the apparatus further includes means (preferably a processor with associated memory) for defining a group of users and for storing access privilege information associated with the group of users.
  • Several users may have the same access privileges to the data, and therefore by defining groups of users, the management of the access privileges for the users can be simplified.
  • a user may belong to more than one group.
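The per-user and per-group access privileges described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the class name `AccessControl`, its methods, and the action names "view" and "modify" are assumptions chosen for illustration. The sketch only shows the idea of checking a request against privileges held by the user directly or through any group the user belongs to.

```python
from dataclasses import dataclass, field


@dataclass
class AccessControl:
    """Hypothetical security layer between a user interface and taxonomy data."""

    # user name -> set of (object_id, action) pairs granted directly
    user_grants: dict = field(default_factory=dict)
    # group name -> set of (object_id, action) pairs granted to the group
    group_grants: dict = field(default_factory=dict)
    # user name -> set of group names (a user may belong to several groups)
    memberships: dict = field(default_factory=dict)

    def grant_user(self, user, object_id, action):
        self.user_grants.setdefault(user, set()).add((object_id, action))

    def grant_group(self, group, object_id, action):
        self.group_grants.setdefault(group, set()).add((object_id, action))

    def add_to_group(self, user, group):
        self.memberships.setdefault(user, set()).add(group)

    def is_allowed(self, user, object_id, action):
        """Return True if the user, or any of the user's groups, holds the privilege."""
        wanted = (object_id, action)
        if wanted in self.user_grants.get(user, set()):
            return True
        return any(
            wanted in self.group_grants.get(group, set())
            for group in self.memberships.get(user, set())
        )


# Illustrative users, groups and objects (assumed names, not from the source).
acl = AccessControl()
acl.grant_group("editors", "Cars", "modify")
acl.grant_user("alice", "Cars", "view")
acl.add_to_group("bob", "editors")
```

A request handler would call `is_allowed` before reading or changing an object and refuse the request when it returns `False`. Granting privileges to a group rather than to each user keeps the number of grants small when several users share the same access.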
  • a further aspect of the invention provides an apparatus for managing a taxonomy, including means for displaying the taxonomy, wherein the apparatus is adapted to display only a part of the information of the taxonomy.
  • the apparatus is adapted to prevent the display of a part of the taxonomy.
  • a further aspect of the invention provides apparatus for editing a taxonomy, including control means (preferably a processor with associated memory) for preventing modification of a part of the data relating to the taxonomy.
  • a further aspect of the invention provides apparatus for managing access to a taxonomy, including means (preferably a processor with associated memory) for generating a data set comprising a portion of the information of the taxonomy.
  • the data set comprises a branch of the hierarchy of the taxonomy.
  • the sub-taxonomy may comprise, for example, one subject area included in the original taxonomy from which the data set is derived.
  • a further aspect of the invention provides apparatus for managing a plurality of taxonomies, the apparatus being adapted to provide a link between an object of a first taxonomy and an object of the second taxonomy.
  • the link may provide a hierarchical relationship between objects of the two taxonomies. In this way, a user may view the objects of the two taxonomies in a single hierarchical structure.
  • the two taxonomies can be treated differently.
  • the arrangement may be that only one of the taxonomies may be modified, the other being "read only" (apart from the creation of links between the taxonomies).
  • a further aspect of the invention provides a method of managing a taxonomy, including storing data relating to the taxonomy, and controlling access to the data.
  • a further aspect of the invention provides a method of managing a taxonomy, including displaying only a part of the information of the taxonomy.
  • a further aspect of the invention provides a method of editing a taxonomy, including preventing modification of a part of the data relating to the taxonomy.
  • a further aspect of the invention provides a method of managing access to a taxonomy, including generating a data set comprising a portion of the information of the taxonomy.
  • a further aspect of the invention provides a method of managing a plurality of taxonomies, including the step of providing a link between an object of a first taxonomy and an object of the second taxonomy.
  • a further aspect of the invention provides a method of managing a first taxonomy, the method including transferring information from a set of data to the first taxonomy.
  • a further aspect of the invention provides a method of managing a taxonomy including a plurality of objects, the method including creating a link between a first object of the taxonomy and a second object.
  • a further aspect of the invention provides an apparatus for merging a set of data with a taxonomy, the apparatus including means (preferably a processor and associated memory) for determining whether the set of data includes data which is similar to data in the taxonomy, and for deleting the similar data from the set of data or from the taxonomy.
  • Preferred examples provide a system which shows "views" of a taxonomy to different users and groups in an organisation, and allows those users and groups to manage these views.
  • the invention also provides a computer program and a computer program product for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein, and a computer readable medium having stored thereon a program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein.
  • the invention also provides a signal embodying a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein, a method of transmitting such a signal, and a computer product having an operating system which supports a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein.
  • Figure 1 shows a display generated by the taxonomy management toolset, illustrating two windows
  • Figure 2 shows a further display generated by the toolset
  • Figure 3 shows a further display generated by the toolset
  • Figure 4 shows a further display generated by the toolset
  • Figure 5 shows a further display generated by the toolset
  • Figure 6 shows a further display generated by the toolset
  • Figure 7 shows a further display generated by the toolset
  • Figure 8 shows a further display generated by the toolset
  • Figure 9 shows a further display generated by the toolset
  • Figure 10 shows a search interface
  • Figure 11 shows a further example of a display of the search interface
  • Figure 12 shows a further example of a display of the search interface
  • Figure 13 illustrates a user selecting and opening a wordset
  • Figure 14 illustrates a user granting privileges to modify or add to wordset data in a selected wordset
  • Figure 15 illustrates a user granting relation privileges to users and groups
  • the Taxonomy Management Toolset (the toolset) is a system designed to assist those involved in the creation and management of taxonomies.
  • Figure 1 shows a display generated by the toolset, illustrating two windows.
  • One window is the editor window 10, in which the taxonomy is built and edited.
  • In the resource window 12 are displayed the data from which the taxonomy in the editor window 10 is built or amended.
  • the system presents the taxonomic data as two branching hierarchies (the Trees 14, 16), in side by side windows 10, 12 in a visual interface.
  • Data can be manipulated by dragging and dropping objects from the resource tree 16 in the resource window 12 to the editor tree 14 in the editor window 10. This can be carried out in a variety of operations.
  • the toolset can thus enable the user to compile, maintain, and update data in the editor tree, and to create relationships within the data, as is described in more detail below.
  • Each point in the editor tree 14 is preferably treated as an object, called a "WordSet", and stores various data.
  • This data can include one or more of the following:
  • the data architecture thus can contain and allow the management of data about classification (principally about hierarchical relationships) and lexical data (principally concerning language).
  • An important function of a preferred example of the system is to store search statements for each object in the taxonomy, so that in an automated process, each object can be used to form a query to one or more search engines, databases or other data stores.
  • the search statement may contain synonyms or other terms known to qualify a statement to a search engine or database in such a way as to clarify the query.
  • the system can contain a series of processes by which such query statements may be made to search engines or databases.
  • the system can accept a dataset in the form of one or more of:
  • the system may be arranged to accept text, or data in another format.
  • the imported terms may, for example, be held in a plain text file, a comma separated values file or an XML file in which the attributes commonly stored and managed in the system are defined.
  • the imported terms may also be held in the preferred DTD format in an XML document and imported in that way. This is the preferred method for importing complex data, for example data including information other than parent-child and synonym relations.
  • complex data files held in other formats can be converted into the required format.
  • the system preferably generates:
    • WordSets for each object
  • the system parses the XML document. For each WordSet element in the XML document, the relevant data is inserted into database tables.
  • the database stores hierarchical (parent-child) relationships between objects, whether these are intrinsic in the data when it is imported or are later created by a user of the system.
  • the terms which are superordinate to the imported terms, if any are available, are preferably stored for use in an information search procedure, for example using a technique known as "query expansion".
  • Associated terms are then preferably used in the default search string for the imported term.
  • the search string for the term "Ford" would include "Cars" + "Ford". The search string can later be modified to store other terms to perform the same function.
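The default search string built from superordinate terms can be sketched in a few lines of Python. This is an illustration under assumptions, not the system's code: the function names, the `parent_of` mapping, and the choice to include every ancestor (rather than only the immediate parent) are all hypothetical.

```python
def default_search_terms(term, parent_of):
    """Return the ancestors of `term` (topmost first) followed by the term itself.

    `parent_of` maps each term to its superordinate term; top-level terms
    are simply absent from the mapping.
    """
    chain = [term]
    seen = {term}
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        if parent in seen:  # guard against accidental cycles in the data
            break
        chain.append(parent)
        seen.add(parent)
    return list(reversed(chain))


def default_search_string(term, parent_of):
    """Join the quoted terms with ' + ', mirroring the "Cars" + "Ford" example."""
    return " + ".join(f'"{t}"' for t in default_search_terms(term, parent_of))


# The only relationship taken from the text: "Ford" sits under "Cars".
parent_of = {"Ford": "Cars"}
search_string = default_search_string("Ford", parent_of)
```

With this mapping, `search_string` is `"Cars" + "Ford"`. A user-edited search string would replace this default for the object concerned.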
  • the toolset preferably allows users to modify the data by dragging and dropping objects in the tree from one pane in the visual interface (the resource window 12) to the other (the editor window 10).
  • a popup menu 30 (shown in Figure 3) preferably appears at the point in the tree to which the object is being dragged, and this menu can allow the user to choose between the various operations the system supports. In the case shown in Figure 3, the object is to be moved to that location in the editor tree 14.
  • a user may create a copy of an object from one location in the hierarchical structure to another.
  • the object "Ford" could be dragged from its place under the superordinate "Cars" to a new place under the superordinate "Automotive manufacturers". This action preferably creates a new object named "Ford" in the second location.
  • This operation may also be performed for groups of objects, so that, for example, all car manufacturers could also be listed under the superordinate 'Automotive trade'.
  • the editor hierarchy could be displayed in both the editor window 10 and in the resource window 12 and thus terms in the created hierarchy itself can be manipulated.
  • a user may change the location of an object by dragging it to a new place in the hierarchy.
  • the object 'Ford' may be moved from under the superordinate 'Large manufacturers' to a place under the superordinate 'Fortune 500 Companies'.
  • This operation may also be performed for a parent and its descendants.
  • a user may merge two subsets of objects in the hierarchy.
  • the system preferably removes duplicates; it also preferably merges the WordSet contents of the two merged objects.
  • the two subsets 'Van manufacturers' and 'Truck Manufacturers' may be merged under a single superordinate 'Commercial vehicle manufacturers'.
  • the object 'Ford', which appeared in both of the previous subsets, is merged into a single object.
  • One of the two WordSets contains the synonym 'Iveco', and this is retained in the merged object.
  • This operation would usually be performed for groups of objects, but may be performed for single objects where the objective is to merge the contents of the WordSets of those two objects.
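The merge operation described above can be sketched as follows. This is a simplified illustration, not the system's implementation: the `WordSet` fields, the rule that two objects are duplicates when their names match exactly, and the names 'Renault' and 'Volvo' are assumptions added for the example. Children of duplicated objects are not merged recursively in this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class WordSet:
    """Hypothetical, much reduced WordSet: a lead word, synonyms and children."""

    name: str
    synonyms: set = field(default_factory=set)
    children: list = field(default_factory=list)


def merge_subsets(new_name, first, second):
    """Merge the children of two subsets under one new superordinate.

    Objects with the same name are treated as duplicates and collapsed into a
    single object whose synonyms are the union of both, so differing data is
    retained.
    """
    merged = {}
    for source in (first, second):
        for child in source.children:
            if child.name in merged:
                merged[child.name].synonyms |= child.synonyms
            else:
                merged[child.name] = WordSet(
                    child.name, set(child.synonyms), list(child.children)
                )
    return WordSet(new_name, children=list(merged.values()))


# 'Ford' appears in both subsets; only one copy carries the synonym 'Iveco'.
vans = WordSet("Van manufacturers", children=[WordSet("Ford", {"Iveco"}), WordSet("Renault")])
trucks = WordSet("Truck Manufacturers", children=[WordSet("Ford"), WordSet("Volvo")])
commercial = merge_subsets("Commercial vehicle manufacturers", vans, trucks)
```

After the call, `commercial` has three children ('Ford', 'Renault', 'Volvo'), and the single 'Ford' object keeps the synonym 'Iveco'. The original subsets are left untouched because the merged objects are copies.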
  • the user may create a link between two objects in different parts of the hierarchy, called a symbolic link.
  • This process entails the user identifying a place in the hierarchy at which an object identical to the one under consideration also occurs, denoting an identical or very similar concept.
  • the term 'bowling' 40 shown in Figure 4 may appear under both the superordinate terms 'Sports' 42 and 'Recreation' 44.
  • a symbolic link may be created between the master object and a marker placed in the new location. The object is thus managed in one location, but seen by the user and in the interface to which the system outputs data, in more than one location.
  • the symbolic link 46 is, for this example, displayed in the visual interface with a chain link icon.
  • the user may create a 'Related link', in which a marker object is created in the new location denoting a concept that is not identical but is nonetheless related to the original object.
  • the term 'Skiing', which appears under 'Sports', may be related to the term 'snow boots', which appears under the superordinate 'Shopping'.
  • a 'Related link' can ensure that the related terms are displayed in the interface although they are not adjacent to one another in the hierarchy.
  • the user may create a link between an object in one taxonomy and an object in another taxonomy.
  • the user may create a new WordSet at any place in the hierarchy.
  • the WordSet is added as a child beneath the superordinate at which the user is working.
  • the WordSet can contain a variety of records, which the system stores, as follows:
  • the user may open and amend the WordSet record that the system has provided when the object has been imported or created.
  • Figures 6 to 9 show the features of the object "Myanmar" which can be created or edited.
  • the "Edit WordSet" window shows various features of "Myanmar" 60 including the terms for the search string, which here include the words "Asia" and "Asian", which are related to the parent of "Myanmar".
  • Synonyms may also be included in the search string.
  • One example of the Boolean logic used is: "term" OR "synonym" AND "Associated term 1", and so on. In some cases, the logic may be different, for example to include ORed associated terms.
  • the user may add and remove data to or from any of the fields described above, and to others that are added from time to time.
  • the user may add data to an object in the taxonomy which is the superordinate of other objects.
  • the user may specify then that the data or 'feature' may apply only to that object, or may apply to that object and to all the children of that object. This is known as an 'inheritable feature'.
  • the manager of the system may specify which attribute types will be made available to the system's users, in a feature known as 'user definable attributes'.
  • the attributes chosen by the manager are then shown to the user within the WordSet by means of a drop down menu, from which the user may select an attribute.
  • the user may choose to view the complete taxonomy in one of its foreign language variants by selecting from a menu.
  • a user may search for an object which carries a given term, either as lead word or as a synonym.
  • a list of results is then displayed, from which the user may select.
  • a user may delete an object in the hierarchy, and the result of the deletion is that the superordinate node is removed and its children promoted to its level.
  • a user may delete a node so that all of its children are also deleted.
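The two deletion behaviours described above differ only in what happens to the children of the deleted node. A minimal sketch, assuming a simple in-memory tree; the `Node` class and both function names are hypothetical:

```python
class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

def delete_promote(parent, victim):
    """Remove `victim` and promote its children into its former position."""
    i = parent.children.index(victim)
    parent.children[i:i + 1] = victim.children

def delete_subtree(parent, victim):
    """Remove `victim` together with everything beneath it."""
    parent.children.remove(victim)

def names(node):
    """Names of the direct children of `node`."""
    return [child.name for child in node.children]
```

For a parent 'Sports' with children 'Ball games' (itself holding 'Football' and 'Tennis') and 'Skiing', the first function leaves 'Football', 'Tennis' and 'Skiing' directly under 'Sports', while the second leaves only 'Skiing'.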
  • the system can require changes to the data to be confirmed before they are accepted into the database, in a process termed 'Committing changes'.
  • the system can be set so that a user can commit his or her own changes, or it can be set so that changes can only be committed by a super-user with greater privileges. Until changes are committed, any uncommitted additions to the data are, in this example, displayed in blue, while uncommitted deletions are displayed in grey.
  • An open book icon can show where changes have been made lower down the hierarchy. This enables the user to track changes easily, which is useful if more than one user is working on the tree at any given time. As a safety mechanism, any uncommitted changes can be undone using the undo function. This can allow the user a degree of flexibility to make experimental changes to the data before they are committed to the database. This is a valuable preferred feature when the user is editing complex data structures.
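The commit and undo behaviour can be sketched as a list of pending changes held apart from the committed data. This is an assumption about one possible design, not the patented implementation; the class `PendingChanges` and its method names are invented for the example.

```python
class PendingChanges:
    """Hold uncommitted edits to a set of terms until they are committed."""

    def __init__(self, committed):
        self.committed = set(committed)
        self.pending = []  # list of ("add" | "delete", term) in edit order

    def add(self, term):
        self.pending.append(("add", term))

    def delete(self, term):
        self.pending.append(("delete", term))

    def undo(self):
        """Discard the most recent uncommitted change, if any."""
        if self.pending:
            self.pending.pop()

    def view(self):
        """Return {term: status}, which an editor could map to display colours."""
        status = {term: "committed" for term in self.committed}
        for op, term in self.pending:
            status[term] = ("uncommitted addition" if op == "add"
                            else "uncommitted deletion")
        return status

    def commit(self, user_can_commit):
        """Apply pending changes; refuse if the user lacks the privilege."""
        if not user_can_commit:
            raise PermissionError("changes must be committed by a super-user")
        for op, term in self.pending:
            if op == "add":
                self.committed.add(term)
            else:
                self.committed.discard(term)
        self.pending.clear()
```

Keeping pending edits separate means that undoing an experimental change never touches the committed data.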
  • Taxonomy deployment described in this example involves converting the editorial database into a flattened, run-time form optimized for fast query access.
  • the conversion process preferably also involves the generation of additional tables to support the conflation of user-entered search terms.
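As a rough illustration of flattening, the sketch below turns a nested taxonomy into one row per node and builds a lookup table keyed on a lower-cased term. The nested-dict layout, the row format and the use of lower-casing as a stand-in for term conflation are all assumptions made for this example.

```python
def flatten(tree, path=()):
    """Yield one (term, path, depth) row per node of a nested-dict taxonomy."""
    for term, children in tree.items():
        full_path = path + (term,)
        yield term, " > ".join(full_path), len(full_path) - 1
        yield from flatten(children, full_path)

taxonomy = {
    "Sports": {"Skiing": {}, "bowling": {}},
    "Shopping": {"snow boots": {}},
}
rows = list(flatten(taxonomy))

# Additional table: lower-cased term -> path, a very simple form of conflation.
lookup = {term.lower(): path for term, path, _ in rows}
```

A flat row set of this kind can be queried without walking the hierarchy at run time.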
  • the taxonomy may be presented in a user search and navigation interface which resembles a directory.
  • a call is preferably made to the runtime database in which the taxonomy is stored.
  • the call instigates a process in which an expanded query made up of data taken from fields in the taxonomy is formed.
  • the query may be formed by combining the synonyms found in the WordSet to which the link refers.
  • the query may be augmented with the associated terms stored in the WordSet, or with other items of data found in the WordSet, such as foreign language variants, numerical codes or other information of any kind stored within the taxonomy.
  • the system adds this information to the query by using Boolean operators.
  • a formed query for the term which is displayed in the interface as 'Internet filtering' may be augmented by the superordinate 'Software' and the synonyms 'safe surfing' and 'parental control'.
  • the system combines these items in a query using Boolean operators as follows: "internet filtering" OR "safe surfing" OR "parental control" AND "Software".
  • the data taken from the taxonomy may be combined into a query using other forms of Boolean logic, may be formed into the longer queries accepted by some information retrieval engines, may use a sample piece of text instead of a Boolean query, or may take any other information from the taxonomy and convert it for query input to a database or search engine.
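The Boolean combination in the 'Internet filtering' example can be reproduced with a few lines of Python. The function name `build_query` and its parameters are hypothetical; the sketch simply rebuilds the string given in the text.

```python
def build_query(lead_word, synonyms=(), superordinate=None):
    """Combine a WordSet's terms as: "lead" OR "syn1" OR ... AND "superordinate"."""
    terms = [lead_word, *synonyms]
    query = " OR ".join(f'"{term}"' for term in terms)
    if superordinate:
        query += f' AND "{superordinate}"'
    return query

query = build_query(
    "internet filtering",
    ["safe surfing", "parental control"],
    "Software",
)
```

How a given search engine groups OR and AND is engine-specific, so a real deployment might need explicit parentheses in the formatted query.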
  • the formed query can preferably be transmitted to a variety of search engines and databases, which may be remote systems on the Internet or local intranet systems.
  • the system stores the query formats required by each search engine, and formats the queries accordingly using the data taken from the taxonomy.
  • the system may store a URL (Uniform Resource Locator), which indicates the destination of the query.
  • the URL indicating the destination of the query may be the same for all objects in the taxonomy, i.e. may be a universal destination for all queries. There may be more than one destination for the query so that the queries using data from the taxonomy may query several search engines and databases simultaneously.
  • the URL indicating the destination of the query may be stored locally in the taxonomy and may refer to one object in the taxonomy only, to diverse single objects in the taxonomy, to groups of objects in the taxonomy, or to entire branches in the taxonomy together with sub-branches.
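One way to picture destinations stored on branches is to search upwards from an object for the nearest stored URL list, falling back to a universal destination. The nearest-ancestor rule, the function name and the example URLs are assumptions for illustration only.

```python
def resolve_destinations(node, parents, local_urls, universal_urls):
    """Return the query destinations that apply to `node`.

    Walk from the node up through its ancestors and use the first locally
    stored URL list; otherwise fall back to the universal destinations.
    """
    current = node
    while current is not None:
        if current in local_urls:
            return local_urls[current]
        current = parents.get(current)
    return universal_urls

parents = {"Skiing": "Sports", "Sports": None,
           "snow boots": "Shopping", "Shopping": None}
local_urls = {"Sports": ["https://sports.example/search"]}  # hypothetical URL
universal_urls = ["https://search.example/q"]               # hypothetical URL
```

Returning a list allows a single object to send its query to several search engines or databases at once.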
  • the taxonomy can therefore be used to store search sources, for example as well as lexical and classification information.
  • the system can manage the display of search results generated by the query.
  • the returned results are preferably first 'normalized' by the application of engine-specific parsing rules, and then merged and formatted to create the presented result list.
  • the system can also use the taxonomy to detect and assist the resolution of ambiguous user-entered search terms.
  • a user-entered search term is recognized as being ambiguous if it is found to occur in more than one context within the taxonomy.
  • the system uses information from the taxonomy to build a disambiguation page from which the user may select the intended interpretation from a list of contexts.
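Ambiguity detection of this kind amounts to counting the contexts in which a term occurs. A minimal sketch, assuming each taxonomy entry is available as a path of names; the paths shown and both function names are invented for the example.

```python
from collections import defaultdict

def build_context_index(paths):
    """Map each lower-cased term to every context (parent path) it occurs in."""
    index = defaultdict(list)
    for path in paths:
        index[path[-1].lower()].append(" > ".join(path[:-1]))
    return index

def contexts_for(term, index):
    """Return the contexts of `term`; more than one means the term is ambiguous."""
    return index.get(term.lower(), [])

paths = [
    ("Science", "Astronomy"),
    ("Hobbies", "Amateur", "Astronomy"),
    ("Sports", "Skiing"),
]
index = build_context_index(paths)
```

When `contexts_for` returns more than one entry, the list can be offered to the user as the choices on a disambiguation page.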
  • the system can make calls to the taxonomy data to provide other, related terms to the user's query term for display in the interface. These terms include, but are not limited to:
  • the system can also display symbolic links and related links in the search interface.
  • a symbolic link is a form of cross-referencing from one part of the database to another. Related links can be created to link together relevant pieces of information that exist elsewhere in the database. A "see also" link might be shown.
  • the system can support a range of search features between the user of the system's navigation interface and the taxonomy data, for example:
  • Symbolic links and related links can be displayed in the search interface as siblings, children, or uncles.
  • Figure 10 shows a search interface.
  • the interface shows a search box 100 into which the user can enter one or more terms before clicking on the "search" button 102 to initiate the search.
  • the user can search by category 104.
  • Figure 10 shows the search term "astronomy" has been entered. Astronomy appears more than once in the taxonomy and Figure 11 shows a disambiguation page which appears. The user clicks on the category of interest from the list shown 106.
  • the user is interested in amateur astronomy 108.
  • Figure 12 shows the resulting screen.
  • the position in the taxonomy is shown 110, as well as related terms 112.
  • Terms of the search string 114 are also shown, as well as the results of the search 116.
  • a further optional feature relates to the generation of a search site.
  • This feature allows the user to generate a navigation site based on the user's chosen parent node. The user highlights the chosen node in the taxonomy. He then chooses "convert taxonomy" from a pull-down menu.
  • the system sends a message (for example an e-mail) indicating that the generation is complete and advising as to the location of the navigation.
  • the message may include a URL of a website for the navigation interface.
  • the navigation preferably appears the same as a "normal" interface, but is based on a reduced taxonomy.
  • <FEATURE TYPE="Original Code" VALUE="203210"></FEATURE>
  • <FEATURE TYPE="Description" VALUE=""></FEATURE>
  • <FEATURE TYPE="Wordmap Unique Code" VALUE="WUC-155980"></FEATURE>
  • the following describes an example of an XML DTD to represent a taxonomy or classification scheme. Its principal purpose is to allow data to be imported into the Taxonomy Management toolset described above. This may be useful to users who wish to export data from a variety of other proprietary formats into a single format.
  • a taxonomy as used in the example of the system described above is a principally hierarchical data structure that has the following characteristics:
  • Each node may have a name, or leadword, plus several or no synonyms.
  • Any node may have an arbitrary number of features (attribute value pairs) associated with it.
  • Symbolic and inter-taxonomy links allow nodes to be represented multiply in a taxonomy.
  • a taxonomy is a graph rather than a tree.
  • Each node in the taxonomy of this example is a first class data object that may have attributes known as "features". These are attribute value pairs, where values are arbitrary Unicode strings that may be interpreted as required by the processing application and inherited from nodes higher up in the taxonomy.
  • the set of features that can be attributed to a given node is defined by the source of the node.
  • nodes that could be placed in different parts of the taxonomy are assigned one parent only. However, they may occur elsewhere as symbolic links. These are 'child' nodes that are not physical children, but references to other nodes.
  • Multilinguality: The synonyms and leadwords in a taxonomy are automatically assigned an ISO standard language code or locale, for example "en", "fr", "de" etc. If it is wished to create a multilingual dataset, the simplest way to do this is to duplicate the unique codes, together with translated leadwords and synonyms. These spreadsheets may then be merged with the primary language dataset, using the unique code to identify the relevant WordSets.
  • the DTD used is as follows:
  • the Wordmap element acts as the container element for the representation of the taxonomy.
  • the definition states that a Wordmap element is not valid unless it contains at least one of each of the WORDSET and SOURCE elements. Indeed, a document can only represent a taxonomy when at least one Wordset, which must have a Source, exists; this is true if we deem the simplest of taxonomies to consist of a single node without children.
  • the Source element is used to represent information about the sources from which a Wordset (or a Wordset member) is derived. Each source must be given an ID that is used by the Wordset or Wordset member elements to identify it.
  • the MASTER_FLAG attribute specifies whether the source should be considered for use as a Master Taxonomy in the Taxonomy Editing Toolset; the value should be either 0 (not a master taxonomy) or 1 (a master taxonomy). If a Master taxonomy already exists in the receiving database, this value is overridden on import and the taxonomy is classed as a satellite (non-master) taxonomy.
  • the source element contains 0 or more feature_type elements that define the feature types that are valid for the containing source.
  • the concept of whether a taxonomy is a master taxonomy or not in Wordmap relates to whether the taxonomy can be edited in the taxonomy toolset or not, and whether the taxonomy can be used as the inter-taxonomy link index taxonomy.
  • the FEATURE_TYPE element is used to represent the different types of feature that are available to store information about a wordset. Features can be used to model a host of information about a wordset. Common usages are to represent statuses, definitions and codes related to the wordset.
  • the EDITABLE attribute represents whether the feature value for a given feature type should be editable within the Taxonomy Editing Toolset ( 1 ) or not (0).
  • the INHERITABLE attribute specifies whether the value of a feature of a given feature_type should be inherited by the owning wordset's children, overriding any value specified by the child wordsets (1); inherited by the owning wordset's children only if the child wordset does not have its own value for the feature (2); inherited by the owning wordset's children depending on whether the value of the inherited feature equals that specified for the INT_DEP_NAL of the feature_type (3); or not inherited (0).
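The INHERITABLE modes 0, 1 and 2 can be expressed as a small resolution function. This is a sketch under stated assumptions: the function name and the use of `None` for "no value" are invented here, and mode 3 is deliberately left out because its dependency rule is not fully recoverable from the text.

```python
def effective_value(own_value, parent_value, inheritable):
    """Resolve a feature value for a child wordset.

    inheritable: 0 = not inherited,
                 1 = parent value overrides the child's own value,
                 2 = parent value used only when the child has no value.
    Mode 3 (value-dependent inheritance) is not modelled in this sketch.
    """
    if inheritable == 1 and parent_value is not None:
        return parent_value
    if inheritable == 2 and own_value is None:
        return parent_value
    if inheritable in (0, 1, 2):
        return own_value
    raise ValueError(f"unsupported INHERITABLE mode: {inheritable}")
```

For a multi-level hierarchy, `parent_value` would itself be the parent's resolved value, so the function would be applied from the root downwards.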
  • the TYPE attribute can be used to specify simple type categorisation for the feature type.
  • the supported TYPES for a feature are DATE - for storing dates, NUMBER - for storing numbers and CHAR - for storing any data other than DATEs and NUMBERs that can be stored in a character string.
  • the CARDINALITY attribute can be used to specify whether the feature_type can have multiple instances within a wordset (M) or is constrained to a single instance (S).
  • the WORDSET element consists of at least one PHRASE element and zero or more of each of the FEATURE, ASSOCIATEDTERM and CHILD elements. Indeed, a wordset can only exist if it has at least one member and wordset members are represented in the Wordmap DTD as PHRASE elements.
  • the ID attribute is used by CHILD elements to refer to the wordset as a child.
  • the CURRSOURCEREF and ORIGSOURCEREF attributes refer to the ID of the current source and original source of the wordset respectively, i.e. the source of the taxonomy under which the wordset currently resides, and the original source of the wordset.
  • the TYPE of a wordset indicates whether it is the root node of a taxonomy (ROOT), a node in the taxonomy that stores data (DATA), or a node that is present only for representing structure (INDEX).
  • the PHRASE element is used to represent a single wordset member that can be either the leadword for the wordset or any of its synonyms/language variants.
  • Each phrase has a position within the wordset that can be used to specify the relevance/importance of that phrase and a language code that should conform to the ISO 639 and 3166 standards for specification of country and language variants.
  • the FEATURE element represents the value of a feature of a specific type for a wordset.
  • the type should identify one of those feature types contained within the XML document itself.
  • the ASSOCIATEDTERM element can optionally be used to specify phrases that are associated with the wordset but are not deemed to be synonymous with the wordset. This element type is commonly used to produce better searchstrings for query expansion within the Navigation system and may be consumed within feature support in future versions of the product set.
  • the FLAG attribute specifies whether the string specified in the TERM attribute should be used in an expanded query string.
  • the CHILD element is used to model parent-child relationships between wordsets within the XML document.
  • the owning WORDSET element can contain many CHILD elements each with a variety of values for the RELTYPE attribute.
  • the ID attribute refers to the
  • the RELTYPE specifies the type of relationship being represented - PHY - Physical, SYM - Symbolic, REL - Related, ITL - Inter-Taxonomy
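To make the element descriptions above more concrete, the sketch below parses a small hand-written document with Python's standard `xml.etree.ElementTree` module. The element names WORDSET, SOURCE, PHRASE and CHILD and the attributes ID, TYPE, CURRSOURCEREF, ORIGSOURCEREF, MASTER_FLAG and RELTYPE come from the description; the root element spelling `WORDMAP`, the `TEXT` attribute on PHRASE and all ID values are assumptions, since the DTD itself is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; WORDMAP and PHRASE/@TEXT are hypothetical spellings.
SAMPLE = """
<WORDMAP>
  <SOURCE ID="1" MASTER_FLAG="1"/>
  <WORDSET ID="10" TYPE="ROOT" CURRSOURCEREF="1" ORIGSOURCEREF="1">
    <PHRASE TEXT="Sports"/>
    <CHILD ID="11" RELTYPE="PHY"/>
    <CHILD ID="12" RELTYPE="SYM"/>
  </WORDSET>
  <WORDSET ID="11" TYPE="DATA" CURRSOURCEREF="1" ORIGSOURCEREF="1">
    <PHRASE TEXT="Skiing"/>
  </WORDSET>
  <WORDSET ID="12" TYPE="DATA" CURRSOURCEREF="1" ORIGSOURCEREF="1">
    <PHRASE TEXT="bowling"/>
  </WORDSET>
</WORDMAP>
"""

RELTYPES = {"PHY": "Physical", "SYM": "Symbolic",
            "REL": "Related", "ITL": "Inter-Taxonomy"}

def read_children(xml_text):
    """Return {parent lead word: [(child lead word, relationship), ...]}."""
    root = ET.fromstring(xml_text)
    wordsets = list(root.iter("WORDSET"))
    # First PHRASE of each wordset is treated as its lead word (an assumption).
    lead = {ws.get("ID"): ws.find("PHRASE").get("TEXT") for ws in wordsets}
    return {
        lead[ws.get("ID")]: [
            (lead[child.get("ID")], RELTYPES[child.get("RELTYPE")])
            for child in ws.findall("CHILD")
        ]
        for ws in wordsets
    }

children = read_children(SAMPLE)
```

Because CHILD elements refer to wordsets by ID, the same wordset can appear under several parents with different relationship types without being duplicated in the document.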
  • taxonomies can be viewed and managed at a number of different levels in preferred examples.
  • a central administrator(s) is able to select single nodes or branches of a taxonomy for publication, by highlighting the branch he wishes to publish and selecting "Generate navigation" from a drop down menu.
  • the "Generate navigation" option stores a flattened version of the relevant branch of the taxonomy in a server.
  • the navigation interface makes calls to this server data.
  • a single taxonomy can be split by an administrator into several sections, each of which can be published to a different user interface.
  • the preferred system can provide a second way for users to manage multiple taxonomies by allowing the storage, management and publication of "satellite" taxonomies.
  • This approach may be recommended when an organisation has taxonomies which are proprietary, and which it can therefore modify, and taxonomies belonging to a third party, which it cannot modify.
  • a manufacturer might have an internal directory of proprietary products and a directory of external products used in their assembly, which is based on the UNSPSC and administered by the relevant manufacturing trade body.
  • the former is known as the "master" taxonomy and can be modified, whereas the satellite taxonomy cannot be modified.
  • the user can create "inter-taxonomy links" between the trade taxonomy and the proprietary taxonomy. These links are colour coded in the management interface to distinguish them from intra-taxonomy links.
  • inter-taxonomy links are preserved when the taxonomy data is exported to the taxonomy server in a flattened version. Consequently, end users in a navigation interface can also navigate the satellite taxonomy(ies).
  • a number of satellite taxonomies can be stored, and these are selected for viewing and management in the right hand pane via a drop down menu.
  • a satellite taxonomy can also be selected as the master taxonomy, in which case it will be shown in the left hand pane.
  • the master taxonomy can be modified, whereas apart from the creation of links, the modification of a satellite taxonomy is not permitted.
  • Preferred examples allow different "views" of a taxonomy to be seen by different individuals or groups using an information system.
  • the system preferably stores taxonomies in two different formats:
  • the system provides a means of accessing taxonomies stored in these two formats.
  • a "security layer" controls which objects are visible.
  • the security layer also controls which objects may be moved, modified or deleted.
  • Acme Limited operates a taxonomy system. Among its users are Mrs Money in Finance and Mr Wrench in Manufacturing. Also using the system through Acme's web site is Mr Customer. The whole system is administered by Mr Mainframe.
  • Mr Mainframe has created four groups: World, Company, Finance and Manufacturing. He has granted full privileges over Finance to Mrs Money and over Manufacturing to Mr Wrench.
  • Mrs Money has created a number of Wordsets for Company viewing in the finance section, such as Pensions, Expenses and Tax advice. She has granted herself full privileges over these wordsets, so that she can modify both their contents and their position in the taxonomy. She has granted the group Company, of which all employees are members, half privileges, so that they can see but not modify these Wordsets. The group World, of which Mr Customer is a member, has no privileges in these wordsets.
  • Mrs Money also owns more sensitive wordsets such as payroll and cash position. These can only be viewed by the group finance, of which she is a member. The managing director has also been granted membership of this group.
  • the group finance is also a member of the group Company, and the group Company is a member of the group World.
  • Mrs Money can therefore see Acme's web site and information intended for employees.
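The nested group memberships in the Acme example can be resolved with a simple graph walk. The sketch below is illustrative: the function names are invented, and Mr Wrench's direct memberships are an assumption inferred from the statements that he works in Manufacturing and that all employees belong to Company.

```python
def effective_groups(principal, member_of):
    """Return every group `principal` belongs to, directly or through nesting."""
    seen = set()
    stack = [principal]
    while stack:
        for group in member_of.get(stack.pop(), ()):
            if group not in seen:
                seen.add(group)
                stack.append(group)
    return seen

member_of = {
    "Mrs Money": ["Finance"],
    "Mr Wrench": ["Manufacturing", "Company"],  # assumed direct memberships
    "Mr Customer": ["World"],
    "Finance": ["Company"],
    "Company": ["World"],
}

def can_view(principal, visible_to):
    """True if any of the principal's groups is allowed to view the wordset."""
    return bool(effective_groups(principal, member_of) & set(visible_to))
```

With these memberships, Mrs Money reaches Company and World through Finance, which is why she can see both employee information and the public web site, while Mr Customer sees only what is visible to World.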
  • Mr Wrench in manufacturing has taken a similar approach to that adopted by Mrs Money, and much technical information is only made available to members of the group manufacturing.
  • Mr Wrench is about to create three further groups: Cylinders, Gaskets and Pistons. Specialists in these areas will be granted privileges that will further refine the view of the taxonomy they see and therefore the information they retrieve.
  • a list of users of the system and of groups to which those users belong is compiled, or imported from another source.
  • the invention provides an interface in which users and groups can be maintained; alternatively, the system is synchronised with commonly used directory (LDAP) products.
  • An administrator has various privileges over users and groups: Compilation of a list of named users of the system
  • users and groups are assigned privileges relating to taxonomy objects and the relationships between taxonomy objects. Therefore, each system user can see or modify taxonomy objects according to his privileges. Privileges can be linked to the membership of a certain group, so that, for example, a financial controller is able to view taxonomy branches relating to payroll by virtue of her membership of a group or her individual privileges.
  • the taxonomy object stores data which controls its behaviour in end-user interfaces, such as synonyms, foreign language variants, status etc. Users of the system can have two levels of privilege defined at Wordset level:
  • the ability of users in both interfaces to see taxonomy objects is controlled by the level of privileges assigned at relationship level, i.e. the relationship between the taxonomy object and its parent.
  • a single taxonomy object can have many parents and therefore many locations.
  • the Wordset "Turkey" may have multiple parents such as "European countries", "NATO members" and "Mediterranean countries".
  • Read: The user sees the object, but is not permitted to modify the relationship, i.e. move or delete.
  • Read/write: The user can see and modify the relationship.
  • Figure 13 shows a screen in which a user chooses a wordset 100 and can call up a menu with an option 102 for editing settings for that wordset.
  • Figure 14 shows a view of the "edit wordset" screen in which the privileges for different users 104, or groups of users for example "accounts" 106 can be set.
  • Figure 15 shows a screen in which privileges can be set for viewing and changing links.

Abstract

An apparatus for managing a taxonomy is described. The apparatus stores a plurality of objects and associated information associated with an object of a hierarchy. The associated information includes information relating to the hierarchical relationship of the object to another object, and further includes additional data relating to the object. The associated information and/or the additional data may include a search string and/or search location information for use in carrying out a search relating to the object. In preferred examples, the taxonomy is displayed in a window (10) as a hierarchy (14).

Description

TAXONOMY MANAGEMENT
The invention relates to the management of data. The invention finds particular application in the creation and editing of taxonomies; and preferably hierarchical arrangements of data. Examples of the invention described below relate to the creation and editing of hierarchical databases of terms for use in searching.
A taxonomy preferably comprises a classification system that divides a subject area hierarchically into progressively smaller subdivisions.
Taxonomies have been used for many years to classify many forms of knowledge, for example products and services in telephone directories, and books in library subject areas. Using taxonomies, people can organise knowledge into clearly defined categories and can give users intelligent interfaces.
Aspects of the present invention seek to improve and simplify the creation and/or editing of hierarchical data structures.
An aspect of the invention provides an apparatus for managing a taxonomy, the apparatus comprising means (preferably a memory store) for storing a plurality of objects, and means (preferably a memory store) for storing associated information which is associated with an object of the hierarchy, the associated information including information relating to the hierarchical relationship of the object to another object, and further including additional data relating to the object.
Preferably each object of the taxonomy has associated information. Preferably, each object of the taxonomy has additional data.
At least a part of the associated information and/or the additional data may be associated with a single object, or may be associated with a group of objects. The group of objects may have a hierarchical relationship between each other. Different parts of the associated information and/or additional data may be associated with different objects or groups of objects.
Preferably the associated information and/or the additional data includes information for use in a search string. The associated information and/or the additional data may include a search string.
Preferably the associated information and/or the additional data includes search location information. The search location information may relate to a search engine, a database or other data store which may be located locally, or remotely, for example on the Internet. The search location information may include a URL.
Preferably the apparatus includes means (preferably a processor and a memory store) for adding an object to the taxonomy, and/or removing an object from the taxonomy.
Preferably the apparatus includes means (preferably a processor and a memory store) for adding and/or editing additional information and/or additional data associated with an object. The added or edited information and/or data may be associated with a single object or with a group of objects.
Preferably the apparatus includes means (preferably a processor and a memory store) for adding a link object, the link object being linked to an object of the taxonomy. Preferably the apparatus is adapted to effect the display of both the linked object and of the object in a user interface. Preferably, other than being displayed as two objects, the linked object and the object are treated as one object in the taxonomy.
A further aspect of the invention provides a taxonomy comprising a plurality of objects, wherein associated information is associated with an object of the taxonomy, the associated information including information relating to the hierarchical relationship of the object to another object, and further including additional data relating to the object. A further aspect of the invention provides a method of managing a taxonomy including a plurality of objects, the method including the step of associating information with an object of the taxonomy, the associated information including information relating to the hierarchical relationship of the object to another object, and further including additional data relating to the object.
Preferably the apparatus is adapted to generate a display of the taxonomy. Preferably the apparatus is adapted to generate a display of the taxonomy showing hierarchical relationships between the objects. The apparatus is preferably adapted to use the associated information of the objects to construct a hierarchical representation of the taxonomy. Preferably each node of the taxonomy relates to an object of the taxonomy. Preferably the associated information is associated with the relevant node of the taxonomy.
A further aspect of the invention provides an apparatus for managing a first taxonomy, the apparatus including means for transferring information from a set of data to the first taxonomy. Preferably the set of data comprises a second taxonomy. Preferably, the or each taxonomy comprises a plurality of hierarchically related objects. Preferably the apparatus is adapted to transfer information regarding an object of the set of data to the first taxonomy. The information may include information such as the associated information and additional data as described herein. Preferably the apparatus is arranged such that transfer of information regarding the object from the second taxonomy effects transfer of information regarding a group of objects of the second taxonomy to the first taxonomy. Preferably a group of objects can be transferred from the set of data to the first taxonomy.
The transfer may comprise copying information from the set of data to the taxonomy, or deletion from the set of data and insertion into the taxonomy ("cut and paste").
Preferably the apparatus includes means (preferably a processor) for determining whether the set of data includes data which is similar to data in the first taxonomy. The similar data may be identical to the data in the first taxonomy. Preferably the apparatus looks for terms which are duplicated in the imported data and the taxonomy. Preferably the apparatus is adapted to remove duplicate data, and/or to prevent duplicate data being included in the first taxonomy. Preferably the apparatus includes means (preferably a processor) for preventing duplicate data in the first taxonomy.
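Duplicate prevention during import can be sketched as a comparison on a normalised key. The normalisation rule used here (case-folding and collapsing whitespace) is an assumption standing in for whatever similarity test a real system applies, and both function names are hypothetical.

```python
def normalise(term):
    """Crude comparison key: case-folded with collapsed whitespace."""
    return " ".join(term.casefold().split())

def import_terms(taxonomy_terms, incoming_terms):
    """Append incoming terms to the taxonomy, skipping any that already occur.

    Returns (merged list, list of rejected duplicates).
    """
    seen = {normalise(term) for term in taxonomy_terms}
    merged, duplicates = list(taxonomy_terms), []
    for term in incoming_terms:
        key = normalise(term)
        if key in seen:
            duplicates.append(term)
        else:
            seen.add(key)
            merged.append(term)
    return merged, duplicates
```

Returning the rejected duplicates separately lets an editor review them, for example to merge their synonyms into the existing object rather than discarding them.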
Thus the apparatus may be used to merge taxonomies.
Preferably the apparatus includes means (preferably a processor) for merging the associated information and/or the additional data of an object of the taxonomy with data of the set of data.
For example, where the set of data includes objects having associated information and/or additional data, the objects can be merged. The contents of each WordSet in each tree can also be merged, with differing data retained.
A further aspect of the invention provides an apparatus for managing a taxonomy including a plurality of objects, the apparatus being adapted to create a link between a first object of the taxonomy and a second object. The second object may be, for example, a further object of the same taxonomy, or may be an object of a different taxonomy.
The transfer of information into the taxonomy may, for example, be effected using a keyboard, but is preferably effected using an electronic pointer device, for example a computer mouse. In preferred embodiments, the information can be moved to (or from) the taxonomy using "drag and drop". This feature may be provided independently.
Preferably the apparatus is adapted to display the first taxonomy and the set of data in a user interface. In a preferred example, the taxonomy and the set of data are displayed side by side, preferably in separate panes in a GUI.
Preferably the apparatus includes means for generating a hierarchical representation of the taxonomy.
Preferably the apparatus includes means for displaying the objects of a taxonomy in their hierarchical relationship. Preferably the display includes a branching hierarchical representation of a taxonomy.
A further aspect of the invention provides a method of creating a user interface in an apparatus for managing a taxonomy, the method including generating a hierarchical representation of the taxonomy. Preferably the method further includes generating a representation of a set of data. Preferably the method includes generating a display including the representation of the taxonomy adjacent the representation of the set of data. The set of data may comprise a further taxonomy.
A further aspect of the invention provides a user interface for an apparatus for managing a first taxonomy, the user interface including a hierarchical representation of the first taxonomy. Preferably the user interface further includes a representation of a set of data. The representation of the set of data may comprise a hierarchical representation of a taxonomy, which may be the same or a different taxonomy from the first taxonomy.
Preferably, items of a representation can be moved from one representation to another representation. Preferably the movement of items effects changes in the taxonomy.
The invention also provides an apparatus for managing a taxonomy, the apparatus including means (preferably a processor) for generating a sub-taxonomy comprising a part of the taxonomy. The apparatus preferably includes means for selecting the highest node of the taxonomy to be used in generating the sub-taxonomy.
The invention also provides a method of managing a taxonomy, the method comprising the step of generating a sub-taxonomy comprising a part of the taxonomy.
A further aspect of the invention comprises a method of generating a search query, the method comprising receiving an input, comparing the input with an object of a taxonomy (preferably a taxonomy as described herein), identifying an object related to the input, retrieving information associated with the identified object, and using the information to generate the search query.
The retrieved information may comprise a search string which may be used directly, or may comprise a set of terms which are linked, for example using Boolean operators, to form a search query. The retrieved information may include information regarding the location for the search, and may include a URL.
Preferably the retrieved information includes one or more items of the associated information and/or the additional data described herein.
Preferably the method further comprises the step of transmitting the search query, for example to a search engine, database or other data store.
Preferably the method includes identifying two objects related to the input, and retrieving information relating to the two objects. This feature may be provided independently. The retrieved information can be used to resolve any ambiguity in the meaning of the input.
A further aspect of the invention comprises an apparatus for generating a search query, the apparatus comprising means (preferably a processor) for receiving an input, means (preferably a processor and a memory store) for comparing the input with an object of a taxonomy (preferably a taxonomy as described herein), means (preferably a processor) for identifying an object related to the input, means (preferably a processor) for retrieving information associated with the identified object, and means (preferably a processor) for generating the search query.
The invention further provides a search query generated by a method and/or using an apparatus as described herein.
A taxonomy is a hierarchy of subject headings used to classify information. Taxonomies have been used for many years to classify, for example, plants and animals, books, manufactured goods and census returns.
More recently, taxonomies have been used by large organisations to classify digital resources. Typically, a digital resource, such as an electronic document, is "tagged" with a metadata record. Metadata is a set of information that describes the resource in order to make it easier for others to retrieve it. Metadata typically includes information about the author of the document, the date of publication, publisher, format etc. The main purpose of metadata is to allow resources to be found more easily by information users.
An important part of the metadata record is the "subject" field(s). This field allows an author, a manager of content such as a librarian or web site manager or any other individual to record the key subjects to which the resource is related, so that those searching for information on that subject will find the resource in question.
Preferably, the keywords in this subject field are taken from an agreed taxonomy, so that all those using the system are in agreement about how a given subject should be defined. In settings in which information has been classified for a long time, such as libraries, an agreed taxonomy such as Dewey Decimal or the UNSPSC (Universal Standard Products and Services Classification) is almost always used.
International Patent Application No. PCT/GB00/03652 describes a single taxonomy for use in classification and information navigation.
Many issues arise when an organisation decides to develop and use a taxonomy of its activities and assets. These include:
A large organisation typically has many departments and fields of interest, and each of these groups usually has its own way of organising and describing information. An effective taxonomy therefore needs to reflect a wide range of terminology usages;
A single taxonomy that provides for all of these groups risks becoming unmanageably large. It might be difficult for users to navigate the taxonomy and to locate items of interest. There is also a risk that users would become overwhelmed by irrelevant taxonomy entries.
Security issues arise: it is likely that the organisation would wish to prevent certain user groups seeing sensitive or confidential information.
On the other hand, if the taxonomy is excessively simple and small, it will not reflect the level of detail each of these groups requires.
A further aspect of the invention provides an apparatus for managing a taxonomy, including means (for example a memory) for storing data relating to the taxonomy, the apparatus including control means (for example a processor and associated memory) for controlling access to the data.
The data may comprise, for example, the objects themselves of the taxonomy, the information relating to the hierarchical relationship between the objects, and/or information associated with the objects.
Preferably the control means is adapted to provide access only to a portion of the data relating to the taxonomy. Preferably the control means is adapted to prevent access to data relating to the taxonomy.
This feature may be used by a user wishing only to view and/or edit a part of the taxonomy; by controlling the data which can be viewed and/or edited, the user's task may be simplified. However, the control means may also, or alternatively, provide an important security feature by preventing access to sensitive or confidential information, and may also control the ability of users to modify the data relating to the taxonomy.
In preferred examples, the control means provides a "filter" which regulates what can be viewed by a particular user and/or what can be modified by a user. The control means may therefore comprise a security layer between the user and the taxonomy data.
Preferably the control means is adapted to prevent the reading of data relating to the taxonomy and/or to prevent the modification of data relating to the taxonomy. Such modification may, for example, comprise deleting or adding objects to the taxonomy, changing hierarchical relationships between objects in the hierarchy and/or adding or editing information associated with the objects of the hierarchy.
Preferably the control means is adapted to prevent saving modifications to the data relating to the taxonomy. The control means may allow the user to make modifications to the data, but not to make permanent changes to the stored data of the taxonomy.
Preferably the control means provides a security layer which lies between the user interface and the stored information. Preferably different access can be provided to the different types of data in the system.
Preferably the apparatus includes a first storage means (for example a memory) for storing data relating to the taxonomy which is adapted for use for viewing the taxonomy, and a second storage means (for example a memory which may be a part of the memory of the first storage means) for storing data relating to the taxonomy which is adapted for use for editing the data.
Preferably, the control means is adapted to provide different access to the two data stores. The storage means may be provided remotely from the control means.
Preferably the apparatus includes means (for example a processor and associated memory) for controlling access to data relating to an object of the taxonomy. Different access criteria can therefore be provided for each node of the taxonomy. Preferably the apparatus comprises means (preferably a memory) for storing a list of users, and means for storing access privilege information associated with a user. Preferably the access privilege information defines the access the user has to the data, for example the objects which can be viewed and/or modified by the user.
Preferably the apparatus further includes means (preferably a processor with associated memory) for receiving a request from a user relating to the data relating to the taxonomy, means (preferably a processor with associated memory) for retrieving the access privilege information for that user, and means (preferably a processor with associated memory) for using the access privilege information to determine whether or not to carry out the request.
Preferably the apparatus further includes means (preferably a processor with associated memory) for defining a group of users and for storing access privilege information associated with the group of users. Several users may have the same access privileges to the data, and therefore by defining groups of users, the management of the access privileges for the users can be simplified. Preferably, a user may belong to more than one group.
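The request check described above, using per-user and per-group access privilege information, can be sketched as follows. This is a minimal illustration only: the names `Privileges`, `is_allowed`, the sample users and the node identifiers are assumptions and do not come from the specification.

```python
from dataclasses import dataclass, field

@dataclass
class Privileges:
    can_view: set = field(default_factory=set)    # node ids the holder may read
    can_modify: set = field(default_factory=set)  # node ids the holder may change

# Hypothetical stores: privileges per user, privileges per group, group membership.
user_privs = {"alice": Privileges(can_view={"cars"})}
group_privs = {"editors": Privileges(can_view={"cars", "ford"}, can_modify={"ford"})}
memberships = {"alice": ["editors"], "bob": []}   # a user may belong to several groups

def is_allowed(user, action, node_id):
    """Return True if the user holds the privilege directly or through any group."""
    holders = [user_privs.get(user, Privileges())]
    holders += [group_privs[g] for g in memberships.get(user, [])]
    attr = "can_view" if action == "view" else "can_modify"
    return any(node_id in getattr(p, attr) for p in holders)

print(is_allowed("alice", "modify", "ford"))  # granted through the "editors" group
print(is_allowed("bob", "view", "cars"))      # no privileges recorded for bob
```

Granting privileges to a group rather than to each user keeps the privilege tables small when many users share the same access.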
A further aspect of the invention provides an apparatus for managing a taxonomy, including means for displaying the taxonomy, wherein the apparatus is adapted to display only a part of the information of the taxonomy.
Preferably the apparatus is adapted to prevent the display of a part of the taxonomy.
A further aspect of the invention provides apparatus for editing a taxonomy, including control means (preferably a processor with associated memory) for preventing modification of a part of the data relating to the taxonomy.
A further aspect of the invention provides apparatus for managing access to a taxonomy, including means (preferably a processor with associated memory) for generating a data set comprising a portion of the information of the taxonomy. In this way, a smaller subset of the taxonomy can be created which is more manageable for the user to view and, if necessary, edit.
Preferably the data set comprises a branch of the hierarchy of the taxonomy. Thus the sub-taxonomy may comprise, for example, one subject area included in the original taxonomy from which the data set is derived.
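Generating a data set from one branch of the hierarchy might look like the sketch below. It assumes the taxonomy is held as a simple parent-to-children mapping, which is an illustrative choice rather than the storage format of the described system; `sub_taxonomy` is a hypothetical name.

```python
def sub_taxonomy(children, root):
    """Copy the branch rooted at `root` out of a parent -> children mapping."""
    branch = {}
    stack = [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        branch[node] = list(kids)   # copy so the sub-taxonomy can be edited separately
        stack.extend(kids)
    return branch

taxonomy = {
    "Business": ["Automotive manufacturers", "Software"],
    "Automotive manufacturers": ["Ford", "General Motors"],
    "Software": ["Internet filtering"],
}
autos = sub_taxonomy(taxonomy, "Automotive manufacturers")
print(sorted(autos))
```

Here the selected highest node is "Automotive manufacturers", so the "Software" branch is left out of the resulting data set.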
A further aspect of the invention provides apparatus for managing a plurality of taxonomies, the apparatus being adapted to provide a link between an object of a first taxonomy and an object of the second taxonomy. The link may provide a hierarchical relationship between objects of the two taxonomies. In this way, a user may view the objects of the two taxonomies in a single hierarchical structure.
By keeping the data of the two taxonomies separate, rather than for example importing the data of one taxonomy into the other, the two taxonomies can be treated differently. For example, the arrangement may be that only one of the taxonomies may be modified, the other being "read only" (apart from the creation of links between the taxonomies).
A further aspect of the invention provides a method of managing a taxonomy, including storing data relating to the taxonomy, and controlling access to the data.
A further aspect of the invention provides a method of managing a taxonomy, including displaying only a part of the information of the taxonomy.
A further aspect of the invention provides a method of editing a taxonomy, including preventing modification of a part of the data relating to the taxonomy.
A further aspect of the invention provides a method of managing access to a taxonomy, including generating a data set comprising a portion of the information of the taxonomy.
A further aspect of the invention provides a method of managing a plurality of taxonomies, including the step of providing a link between an object of a first taxonomy and an object of the second taxonomy.
A further aspect of the invention provides a method of managing a first taxonomy, the method including transferring information from a set of data to the first taxonomy.
A further aspect of the invention provides a method of managing a taxonomy including a plurality of objects, the method including creating a link between a first object of the taxonomy and a second object.
A further aspect of the invention provides an apparatus for merging a set of data with a taxonomy, the apparatus including means (preferably a processor and associated memory) for determining whether the set of data includes data which is similar to data in the taxonomy, and for deleting the similar data from the set of data or from the taxonomy.
Preferred examples provide a system which shows "views" of a taxonomy to different users and groups in an organisation, and allows those users and groups to manage these views.
The invention also provides a computer program and a computer program product for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein, and a computer readable medium having stored thereon a program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein.
The invention also provides a signal embodying a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein, a method of transmitting such a signal, and a computer product having an operating system which supports a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein.
The invention extends to methods and/or apparatus substantially as herein described with reference to the accompanying drawings.
Any feature in one aspect of the invention may be applied to other aspects of the invention, in any appropriate combination. In particular, method aspects may be applied to apparatus aspects, and vice versa.
Furthermore, features implemented in hardware may generally be implemented in software, and vice versa. Any reference to software and hardware features herein should be construed accordingly.
Preferred features of the present invention will now be described, purely by way of example, with reference to the accompanying drawings, in which:
Figure 1 shows a display generated by the taxonomy management toolset, illustrating two windows;
Figure 2 shows a further display generated by the toolset;
Figure 3 shows a further display generated by the toolset;
Figure 4 shows a further display generated by the toolset;
Figure 5 shows a further display generated by the toolset;
Figure 6 shows a further display generated by the toolset;
Figure 7 shows a further display generated by the toolset;
Figure 8 shows a further display generated by the toolset;
Figure 9 shows a further display generated by the toolset;
Figure 10 shows a search interface;
Figure 11 shows a further example of a display of the search interface;
Figure 12 shows a further example of a display of the search interface;
Figure 13 illustrates a user selecting and opening a wordset;
Figure 14 illustrates a user granting privileges to modify or add to wordset data in a selected wordset; and
Figure 15 illustrates a user granting relation privileges to users and groups.
The Taxonomy Management Toolset (the toolset) is a system designed to assist those involved in the creation and management of taxonomies.
Figure 1 shows a display generated by the toolset, illustrating two windows. One window is the editor window 10, in which the taxonomy is built and edited. In the resource window 12 are displayed data from which the taxonomy in the editor window 10 is built or amended. The system presents the taxonomic data as two branching hierarchies (the Trees 14, 16), in side by side windows 10, 12 in a visual interface.
Data can be manipulated by dragging and dropping objects from the resource tree 16 in the resource window 12 to the editor tree 14 in the editor window 10. This can be carried out in a variety of operations. The toolset can thus enable the user to compile, maintain, and update data in the editor tree, and to create relationships within the data, as is described in more detail below.
Each point in the editor tree 14 is preferably treated as an object, called a "WordSet", and stores various data. This data can include one or more of the following:
• a set of synonyms denoting a particular concept
• the foreign language equivalents of these synonyms
• numerical and other coding information
• information relating to the origin of the WordSet and its history
A more detailed outline of the preferred form of the data is given below. The data architecture thus can contain and allow the management of data about classification (principally about hierarchical relationships) and lexical data (principally concerning language).
An important function of a preferred example of the system is to store search statements for each object in the taxonomy, so that in an automated process, each object can be used to form a query to one or more search engines, databases or other data stores. The search statement may contain synonyms or other terms known to qualify a statement to a search engine or database in such a way as to clarify the query.
The system can contain a series of processes by which such query statements may be made to search engines or databases.
Functions in detail
Importing data
In a preferred example, the system can accept a dataset in the form of one or more of:
• a flat list of terms;
• a list of terms in hierarchical order by parent-child relationships;
• a list of terms in hierarchical order which also includes synonymic and other items of data as defined by the data architecture.
The system may be arranged to accept text, or data in another format.
The imported terms may, for example, be held in a plain text file, a comma separated values file or an XML file in which the attributes commonly stored and managed in the system are defined.
The imported terms may also be held in the preferred DTD format in an XML document and imported in that way. This is the preferred method for importing complex data, for example data including information other than parent-child and synonym relations. In some examples, complex data files held in other formats can be converted into the required format.
On import of the file, the system preferably generates:
• WordSets for each object
• A visual representation of the objects in hierarchical form
When an XML file is imported, the system parses the XML document. For each WordSet element in the XML document, the relevant data is inserted into database tables.
The database stores hierarchical (parent-child) relationships between objects, whether these are intrinsic in the data when it is imported or are later created by a user of the system.
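The XML import step can be illustrated with the sketch below, which parses a small document and writes one row per WordSet and per term. The element and attribute names (`Taxonomy`, `WordSet`, `Term`, `id`, `parent`) and the two-table layout are assumptions made for the example; the preferred DTD and the actual database schema are not given in this section.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input; the real DTD is not reproduced here.
SAMPLE = """
<Taxonomy>
  <WordSet id="1"><Term>Cars</Term></WordSet>
  <WordSet id="2" parent="1"><Term>Ford</Term><Term>Ford Motor Company</Term></WordSet>
</Taxonomy>
"""

def import_xml(xml_text, conn):
    """Insert each WordSet element, its parent link and its terms into tables."""
    conn.execute("CREATE TABLE wordset (id INTEGER PRIMARY KEY, parent_id INTEGER)")
    conn.execute("CREATE TABLE term (wordset_id INTEGER, position INTEGER, text TEXT)")
    root = ET.fromstring(xml_text)
    for ws in root.iter("WordSet"):
        ws_id = int(ws.get("id"))
        parent = ws.get("parent")                      # absent for top-level objects
        conn.execute("INSERT INTO wordset VALUES (?, ?)",
                     (ws_id, int(parent) if parent else None))
        for pos, term in enumerate(ws.findall("Term"), start=1):
            conn.execute("INSERT INTO term VALUES (?, ?, ?)", (ws_id, pos, term.text))

conn = sqlite3.connect(":memory:")
import_xml(SAMPLE, conn)
print(conn.execute("SELECT id, parent_id FROM wordset ORDER BY id").fetchall())
```

Storing the parent identifier on each row is one simple way of recording the parent-child relationships, whether they come from the imported file or are created later by a user.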
Within each WordSet, the terms which are superordinate to the imported terms, if any are available, are preferably stored for use in an information search procedure, for example using a technique known as "query expansion". Thus, for example, if a list of cars were imported, where "Cars" is the superordinate term of "Ford", it (the term "Cars") is stored in a field named "associated terms" for each of the imported terms. Associated terms are then preferably used in the default search string for the imported term. For example, the search string for the term "Ford" would include "Cars" + "Ford". The search string can later be modified to store other terms to perform the same function.
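A minimal sketch of how associated terms could be filled in on import and then used in the default search string is given below. The record layout and the function names are assumptions; the " + " separator simply mirrors the "Cars" + "Ford" notation used above, and the operator actually sent would depend on the target search engine.

```python
def import_pairs(pairs):
    """Build a record per imported term, storing its parent as an associated term."""
    records = {}
    for parent, child in pairs:
        record = records.setdefault(child, {"associated_terms": []})
        if parent is not None:
            record["associated_terms"].append(parent)
    return records

def default_search_string(term, records):
    """Join the associated terms and the term itself, each quoted."""
    terms = records[term]["associated_terms"] + [term]
    return " + ".join(f'"{t}"' for t in terms)

records = import_pairs([(None, "Cars"), ("Cars", "Ford")])
print(default_search_string("Ford", records))
```

Because the associated terms are stored in their own field, an editor can later replace "Cars" with other qualifying terms without touching the hierarchy.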
The toolset preferably allows users to modify the data by dragging and dropping objects in the tree from one pane in the visual interface (the resource window 12) to the other (the editor window 10). A popup menu 30 (shown in Figure 3) preferably appears at the point in the tree to which the object is being dragged, and this menu can allow the user to choose between the various operations the system supports. In the case shown in Figure 3, the object is to be moved to that location in the editor tree 14.
In an operation, a user may create a copy of an object from one location in the hierarchical structure to another. Thus, the object "Ford" could be dragged from its place under the superordinate "Cars" to a new place under the superordinate "Automotive manufacturers". This action preferably creates a new object named "Ford" in the second location.
This operation may also be performed for groups of objects, so that for example, all car manufacturers could also be listed under the superordinate 'Automotive trade'.
Thus it can be seen that the editor hierarchy could be displayed in both the editor window 10 and in the resource window 12 and thus terms in the created hierarchy itself can be manipulated.
In a further operation, a user may change the location of an object by dragging it to a new place in the hierarchy. Thus, the object 'Ford' may be moved from under the superordinate 'Large manufacturers' to a place under the superordinate 'Fortune 500 Companies'.
This operation may also be performed for a parent and its descendants.
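The difference between the copy and move operations can be sketched on a simple in-memory tree. The `Node` class and the `move` and `copy_to` helpers are hypothetical and stand in for whatever the drag-and-drop handler calls; the sketch assumes the target is not inside the branch being copied or moved.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

def move(node, new_parent):
    """Relocate a node; its descendants travel with it."""
    node.parent.children.remove(node)
    new_parent.add(node)

def copy_to(node, new_parent):
    """Create a new, independent object (with descendants) at the second location."""
    clone = new_parent.add(Node(node.name))
    for child in list(node.children):
        copy_to(child, clone)
    return clone

root = Node("Business")
large = root.add(Node("Large manufacturers"))
fortune = root.add(Node("Fortune 500 Companies"))
ford = large.add(Node("Ford"))
ford.add(Node("Products"))

move(ford, fortune)               # 'Ford' now sits under 'Fortune 500 Companies'
duplicate = copy_to(ford, large)  # a second 'Ford' object under 'Large manufacturers'
```

After a copy the two objects are edited separately, which is what distinguishes this operation from the symbolic link described later.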
In a further operation, a user may merge two subsets of objects in the hierarchy. In the case of a merge operation, the system preferably removes duplicates; it also preferably merges the WordSet contents of the two merged objects. Thus, the two subsets 'Van manufacturers' and 'Truck Manufacturers' may be merged under a single superordinate 'Commercial vehicle manufacturers'. As a result of the merge operation, the object 'Ford', which appeared in both of the previous subsets, is merged into a single object. One of the two WordSets contains the synonym 'Iveco', and this is retained in the merged object.
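One way to picture the merge is sketched below: objects are keyed by lead word, duplicates collapse into one object, and their synonym lists are combined. Matching duplicates by lead word is an assumption made for the example, and the entries other than 'Ford' and 'Iveco' are illustrative only.

```python
def merge_subsets(a, b):
    """Merge two {lead_word: [synonyms]} subsets, combining duplicate objects."""
    merged = {}
    for subset in (a, b):
        for lead, synonyms in subset.items():
            target = merged.setdefault(lead, [])
            for synonym in synonyms:
                if synonym not in target:   # keep each synonym once, first-seen order
                    target.append(synonym)
    return merged

vans = {"Ford": ["Ford"], "Maker A": ["Maker A"]}             # illustrative data
trucks = {"Ford": ["Ford", "Iveco"], "Maker B": ["Maker B"]}  # illustrative data
commercial = merge_subsets(vans, trucks)
print(commercial["Ford"])
```

The single merged 'Ford' object keeps the synonym that only one of the source WordSets contained.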
This operation would usually be performed for groups of objects, but may be performed for single objects where the objective is to merge the contents of the WordSets of those two objects.
In a further operation, the user may create a link between two objects in different parts of the hierarchy, called a symbolic link. This process entails the user identifying a place in the hierarchy at which an object identical to the one under consideration also occurs, denoting an identical or very similar concept. Thus, the term 'bowling' 40 shown in Figure 4 may appear under both the superordinate terms 'Sports' 42 and 'Recreation' 44. In this instance, a symbolic link may be created between the master object and a marker placed in the new location. The object is thus managed in one location, but seen by the user and in the interface to which the system outputs data, in more than one location.
The symbolic link 46 is, for this example, displayed in the visual interface with a chain link icon.
In a further operation, the user may create a 'Related link', in which a marker object is created in the new location denoting a concept that is not identical but is nonetheless related to the original object. Thus, the term 'Skiing' which appears under 'Sports' may be related to the term 'snow boots' which appears under the superordinate 'Shopping'. A 'Related link' can ensure that the related terms are displayed in the interface although they are not adjacent to one another in the hierarchy.
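Both kinds of link can be modelled as a marker that points at an object managed elsewhere. The sketch below is a simplified illustration; the `Marker` class and its `kind` field are assumptions and not the data model of the described system.

```python
class WordSet:
    def __init__(self, lead_word):
        self.lead_word = lead_word

class Marker:
    """Placeholder that points at a WordSet stored elsewhere in the tree."""
    def __init__(self, target, kind):
        self.target = target
        self.kind = kind                  # "symbolic" (same concept) or "related"

    @property
    def lead_word(self):
        return self.target.lead_word      # always read through to the linked object

bowling = WordSet("bowling")
tree = {
    "Sports": [bowling],                         # the master object lives here
    "Recreation": [Marker(bowling, "symbolic")],
}

bowling.lead_word = "Ten-pin bowling"            # edit the master once...
print(tree["Recreation"][0].lead_word)           # ...and the marker shows the change
```

Because the marker holds no data of its own, the object is maintained in one place while appearing in several.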
In a further operation, the user may create a link between an object in one taxonomy and an object in another taxonomy.
In a further operation, the user may create a new WordSet at any place in the hierarchy. The WordSet is added as a child beneath the superordinate at which the user is working. The WordSet can contain a variety of records, which the system stores, as follows:
• The synonyms of the term used to denote the object;
• A means of controlling which of the synonyms is displayed in the interface at any time by changing its order in the list, the synonym in first place on the list being displayed in the interface and being known as the 'Lead word';
• A means of storing foreign language variants of the term and its synonyms;
• A means of controlling which of the foreign language synonyms is displayed in the foreign language version of the interface at any time;
• A means of labelling the foreign language versions so that they may be recognized as such by ISO standard abbreviations;
• A means of adding attributes to WordSets, such as one or more of:
  o The date and time of import
  o The date and time of modification
  o The definition
  o The notes made by an editor
  o Special instructions for handling the WordSet
  o The source to which the individual WordSet should be sent to search
  o The numerical code assigned to the WordSet at the time of its import
  o Another numerical code, assigned by a user of the system
  o Miscellaneous attributes of the WordSet
  o 'Associated terms', which are the terms used in the formation of the query string.
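The records listed above could be held in a structure along the following lines. This is a sketch under stated assumptions: the field names are invented for illustration, the lead word is taken to be whichever synonym is first in its list, and the sample synonyms and French variant are example data.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class WordSet:
    synonyms: list                                        # position 0 is the lead word
    foreign: dict = field(default_factory=dict)           # ISO language code -> ordered synonyms
    associated_terms: list = field(default_factory=list)  # used to form the query string
    attributes: dict = field(default_factory=dict)        # definition, notes, codes, ...
    imported_at: datetime = field(default_factory=datetime.now)

    def lead_word(self, lang=None):
        """Return the displayed term, optionally for a foreign language version."""
        words = self.foreign.get(lang, self.synonyms) if lang else self.synonyms
        return words[0]

    def promote(self, word):
        """Make `word` the lead word by moving it to first place in the list."""
        self.synonyms.remove(word)
        self.synonyms.insert(0, word)

ws = WordSet(["nappy", "diaper"], foreign={"fr": ["couche"]})
ws.promote("diaper")
print(ws.lead_word(), ws.lead_word("fr"))
```

Controlling display purely by list order means no separate "preferred term" flag has to be kept in step with the synonyms.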
In a further operation, the user may open and amend the WordSet record that the system has provided when the object has been imported or created. Figures 6 to 9 show the features of the object "Myanmar" which can be created or edited.
In Figure 6, the term "Myanmar" 60 is highlighted and the "Edit WordSet" window 62 can be displayed. In Figure 6, the "Edit WordSet" window 62 shows English-language synonyms. The term "Myanmar" is in position 1 and thus it is that term which is displayed in the interface.
In Figure 7, the "Edit WordSet" window shows various features of "Myanmar" 60 including the terms for the search string, which here include the words "Asia" and "Asian", which are related to the parent of "Myanmar".
Synonyms may also be included in the search string. One example of the Boolean logic used is: "term" OR "synonym" AND "Associated term 1", and so on. In some cases, the logic may be different, for example to include ORed associated terms.
In Figure 8, the "Edit WordSet" window shows various attributes of the object "Myanmar" and in Figure 9, it shows terms associated with "Myanmar".
The user may add and remove data to or from any of the fields described above, and to others that are added from time to time.
The user may add data to an object in the taxonomy which is the superordinate of other objects. The user may specify then that the data or 'feature' may apply only to that object, or may apply to that object and to all the children of that object. This is known as an 'inheritable feature'.
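An inheritable feature can be resolved at read time by walking from an object up through its ancestors, as in the sketch below. The mappings, the `(value, inheritable)` tuple layout and the sample feature names are assumptions chosen for the example.

```python
def effective_features(node, parent_of, features):
    """Collect a node's own features plus inheritable ones from its ancestors."""
    result = {}
    current, is_self = node, True
    while current is not None:
        for name, (value, inheritable) in features.get(current, {}).items():
            # A nearer definition wins over one found higher up the tree.
            if (is_self or inheritable) and name not in result:
                result[name] = value
        current, is_self = parent_of.get(current), False
    return result

parent_of = {"Ford": "Cars", "Cars": None}
features = {
    "Cars": {
        "search_source": ("automotive database", True),   # applies to all children
        "editor_note": ("top-level heading", False),       # applies to 'Cars' only
    }
}
print(effective_features("Ford", parent_of, features))
```

Resolving inheritance on read means a change made at the superordinate is seen by every child without copying data into each WordSet.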
At the time that the taxonomy dataset is formed, the manager of the system may specify which attribute types will be made available to the system's users, in a feature known as 'user definable attributes' . The attributes chosen by the manager are then shown to the user within the WordSet by means of a drop down menu, from which the user may select an attribute.
In a further operation, the user may choose to view the complete taxonomy in one of its foreign language variants by selecting from a menu.
A user may search for an object which carries a given term, either as lead word or as a synonym. A list of results is then displayed, from which the user may select.
A user may delete an object in the hierarchy, and the result of the deletion is that the superordinate node is removed and its children promoted to its level.
A user may delete a node so that all of its children are also deleted.
The system can require changes to the data to be confirmed before they are accepted into the database, in a process termed 'Committing changes'. The system can be set so that a user can commit his or her own changes, or it can be set so that changes can only be committed by a super-user with greater privileges. Until changes are committed, any uncommitted additions to the data are, in this example, displayed in blue, while uncommitted deletions are displayed in grey. An open book icon can show where changes have been made lower down the hierarchy. This enables the user to track changes easily, which is useful if more than one user is working on the tree at any given time. As a safety mechanism, any uncommitted changes can be undone using the undo function. This can allow the user a degree of flexibility to make experimental changes to the data before they are committed to the database. This is a valuable preferred feature when the user is editing complex data structures.
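The commit and undo behaviour can be sketched with a list of pending changes held apart from the committed data. The `ChangeLog` class is hypothetical, and the "black" colour for unchanged objects is an assumption; only blue for uncommitted additions and grey for uncommitted deletions come from the description.

```python
class ChangeLog:
    def __init__(self, committed):
        self.committed = set(committed)   # objects already in the database
        self.pending = []                 # ("add" | "delete", object), in edit order

    def add(self, obj):
        self.pending.append(("add", obj))

    def delete(self, obj):
        self.pending.append(("delete", obj))

    def undo(self):
        """Discard the most recent uncommitted change, if any."""
        if self.pending:
            self.pending.pop()

    def commit(self, user_can_commit):
        if not user_can_commit:
            raise PermissionError("changes must be committed by a super-user")
        for op, obj in self.pending:
            if op == "add":
                self.committed.add(obj)
            else:
                self.committed.discard(obj)
        self.pending.clear()

    def colour(self, obj):
        """Display colour: blue for pending additions, grey for pending deletions."""
        for op, pending_obj in reversed(self.pending):
            if pending_obj == obj:
                return "blue" if op == "add" else "grey"
        return "black"                    # assumed default for unchanged objects

log = ChangeLog({"Ford"})
log.add("General Motors")
log.delete("Ford")
print(log.colour("General Motors"), log.colour("Ford"))
log.undo()                                # the deletion of 'Ford' is withdrawn
log.commit(user_can_commit=True)
```

Keeping pending edits out of the stored data until commit is what makes experimental changes safe to try and easy to reverse.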
Taxonomy Deployment
Taxonomy deployment described in this example involves converting the editorial database into a flattened, run-time form optimized for fast query access. In addition to creating the flattened taxonomy representation, the conversion process preferably also involves the generation of additional tables to support the conflation of user-entered search terms.
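The flattening step might be pictured as follows: each object becomes one row carrying its full path, and a second lookup table maps a normalised form of each term to the rows it matches. The row layout and the use of lower-casing as the only conflation rule are assumptions made for the sketch.

```python
def flatten(children, root):
    """Return one row per object: (object, full path from the root, depth)."""
    rows = []

    def walk(node, path):
        path = path + [node]
        rows.append((node, " > ".join(path), len(path) - 1))
        for child in children.get(node, []):
            walk(child, path)

    walk(root, [])
    return rows

children = {
    "Business": ["Automotive Manufacturers"],
    "Automotive Manufacturers": ["Ford"],
}
rows = flatten(children, "Business")

# Additional table supporting conflation of user-entered terms (lower-case only here).
conflation = {}
for name, path, _depth in rows:
    conflation.setdefault(name.lower(), []).append(path)

print(rows[-1])
```

With the path precomputed per row, a run-time query needs no recursive walk of the editorial hierarchy.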
The taxonomy may be presented in a user search and navigation interface which resembles a directory.
When a user clicks on a hypertext link in the navigation interface, a call is preferably made to the runtime database in which the taxonomy is stored. The call instigates a process in which an expanded query made up of data taken from fields in the taxonomy is formed. The query may be formed by combining the synonyms found in the WordSet to which the link refers. The query may be augmented with the associated terms stored in the WordSet, or with other items of data found in the WordSet, such as foreign language variants, numerical codes or other information of any kind stored within the taxonomy. The system adds this information to the query by using Boolean operators. Thus, a formed query for the term which is displayed in the interface as 'Internet filtering' may be augmented by the superordinate 'Software' and the synonyms 'safe surfing' and 'parental control'. The system combines these items in a query using Boolean operators as follows: "internet filtering" OR "safe surfing" OR "parental control" AND "Software". However, the data taken from the taxonomy may be combined into a query using other forms of Boolean logic, or may be formed into a query by using longer queries accepted by information retrieval engines, or may use a sample piece of text instead of a Boolean query, or may take any other information from the taxonomy and convert it for query input to a database or search engine.
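Forming the expanded query from a WordSet can be sketched as below. The function name is hypothetical, and the parentheses around the ORed synonyms are an addition made here to make operator precedence explicit; the query quoted above is written without them.

```python
def build_query(synonyms, associated_terms):
    """OR the synonyms together, then AND the group with each associated term."""
    ored = " OR ".join(f'"{s}"' for s in synonyms)
    clauses = [f"({ored})" if len(synonyms) > 1 else ored]
    clauses += [f'"{t}"' for t in associated_terms]
    return " AND ".join(clauses)

query = build_query(
    ["internet filtering", "safe surfing", "parental control"],
    ["Software"],
)
print(query)
```

A different target engine could be served by swapping the join strings, which is one reason to build the query from stored fields rather than store a finished string only.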
The formed query can preferably be transmitted to a variety of search engines and databases, which may be remote systems on the Internet or local intranet systems. The system stores the query formats required by each search engine, and formats the queries accordingly using the data taken from the taxonomy. The system may store a URL (Uniform Resource Locator), which indicates the destination of the query.
The URL indicating the destination of the query may be the same for all objects in the taxonomy, i.e. may be a universal destination for all queries. There may be more than one destination for the query so that the queries using data from the taxonomy may query several search engines and databases simultaneously.
The URL indicating the destination of the query may be stored locally in the taxonomy and may refer to one object in the taxonomy only, to diverse single objects in the taxonomy, to groups of objects in the taxonomy, or to entire branches in the taxonomy together with sub-branches. The taxonomy can therefore be used to store search sources, for example as well as lexical and classification information.
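Choosing the destination for a query could work as in the sketch below, where a URL stored on an object or on an enclosing branch takes precedence over a universal default. All URLs, the mapping layout and the function names are placeholders invented for the example.

```python
from urllib.parse import quote_plus

UNIVERSAL = ["https://search.example.com/?q="]   # hypothetical default destination

def destinations(node, parent_of, local_urls):
    """Use the URL list stored on the node or its nearest ancestor, else the default."""
    current = node
    while current is not None:
        if current in local_urls:
            return local_urls[current]
        current = parent_of.get(current)
    return UNIVERSAL

def request_urls(node, query, parent_of, local_urls):
    """Build one request URL per destination (several engines may be queried)."""
    return [base + quote_plus(query) for base in destinations(node, parent_of, local_urls)]

parent_of = {
    "Ford": "Automotive manufacturers",
    "Automotive manufacturers": "Business",
    "Software": "Business",
    "Business": None,
}
local_urls = {"Automotive manufacturers": ["https://autos.example.org/find?q="]}

print(request_urls("Ford", "Ford USA", parent_of, local_urls))
print(request_urls("Software", "safe surfing", parent_of, local_urls))
```

Storing a destination on a branch lets every object beneath it share the same search source without repeating the URL in each WordSet.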
The system can manage the display of search results generated by the query. The returned results are preferably first 'normalized' by the application of engine-specific parsing rules, and then merged and formatted to create the presented result list.
The system can also use the taxonomy to detect and assist the resolution of ambiguous user-entered search terms. A user-entered search term is recognized as being ambiguous if it is found to occur in more than one context within the taxonomy. In this case, the system uses information from the taxonomy to build a disambiguation page from which the user may select the intended interpretation from a list of contexts.
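Detecting an ambiguous term amounts to counting the contexts in which it occurs. The sketch below assumes each object is available as a path of labels from the root and that matching is a case-insensitive comparison with the final label; the sample categories are illustrative.

```python
def contexts(term, paths):
    """Return every taxonomy path whose final object matches the term."""
    wanted = term.lower()
    return [path for path in paths if path[-1].lower() == wanted]

def disambiguation_page(term, paths):
    """Return the list of contexts to offer, or None if the term is unambiguous."""
    found = contexts(term, paths)
    if len(found) <= 1:
        return None
    return [" > ".join(path) for path in found]

paths = [                                   # illustrative taxonomy paths
    ("Science", "Astronomy"),
    ("Hobbies", "Astronomy"),
    ("Science", "Physics"),
]
print(disambiguation_page("astronomy", paths))
```

Showing the full path for each occurrence gives the user enough context to pick the intended meaning.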
The system can make calls to the taxonomy data to provide other, related terms to the user's query term for display in the interface. These terms include, but are not limited to:
• The location of the term in the taxonomy, shown as a string (for example Business > Automotive Manufacturers > Ford)
• The other terms in the taxonomy which share the same superordinate (siblings), for example General Motors, Daimler Chrysler
• The children of the user's query term, for example for Ford: products, locations, and so on.
• The other locations in the taxonomy at which the user's query term occurs (for example Fortune 500 companies > Ford) ('uncles').
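The four kinds of related term listed above might be derived from the taxonomy data roughly as follows. The node table is a hypothetical reconstruction of the Ford example used in the text; the identifiers and the placement of "Fortune 500 companies" as a top-level node are assumptions.

```python
# Hypothetical taxonomy: node id -> (term, parent id).
NODES = {
    "n1": ("Business", None),
    "n2": ("Automotive Manufacturers", "n1"),
    "n3": ("Ford", "n2"),
    "n4": ("General Motors", "n2"),
    "n5": ("Daimler Chrysler", "n2"),
    "n6": ("products", "n3"),
    "n7": ("locations", "n3"),
    "n8": ("Fortune 500 companies", None),
    "n9": ("Ford", "n8"),
}


def location(node_id):
    """Location of the node in the taxonomy, shown as a string."""
    parts = []
    while node_id is not None:
        term, parent = NODES[node_id]
        parts.append(term)
        node_id = parent
    return " > ".join(reversed(parts))


def siblings(node_id):
    """Other terms that share the same superordinate."""
    parent = NODES[node_id][1]
    return [t for i, (t, p) in NODES.items() if p == parent and i != node_id]


def children(node_id):
    """Terms directly below the node."""
    return [t for t, p in NODES.values() if p == node_id]


def other_locations(node_id):
    """Other places in the taxonomy where the same term occurs ('uncles')."""
    term = NODES[node_id][0]
    return [location(i) for i, (t, _) in NODES.items() if t == term and i != node_id]
```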
The system can also display symbolic links and related links in the search interface. A symbolic link is a form of cross-referencing from one part of the database to another. Related links can be created to link together relevant pieces of information that exist elsewhere in the database. A "see also" link might be shown.
The system can support a range of search features between the user of the system's navigation interface and the taxonomy data, for example:
• 'Searching the string'
If a user enters two terms, and those terms are not from the same object in the taxonomy but are found in separate objects in the same branch of the tree, a match is made. For example, a search query 'Ford USA' would retrieve a string 'Business > Fortune 500 Companies > Ford > Operations > USA locations' and would also retrieve a string 'Business > USA > Automotive manufacturers > Ford', even though in each case 'Ford' and 'USA' are not stored in the same WordSet record.
• Substitution of synonyms in the interface
If a user searches for a synonym from a WordSet, it can be arranged such that it is always the user's variant that is displayed in the interface instead of any other synonym from the WordSet. For example, a search on 'nappy' retrieves several objects which use the term; a search on 'diaper' retrieves the same objects, but displayed with the 'diaper' variant. 'Nappy' and 'diaper' are synonyms taken from the same WordSet.
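The two search features above can be sketched in a few lines. This is an illustrative approximation only: matching is done on whole words of the location string, the third path and the WordSet dictionary layout are invented for the example, and the fallback to the first phrase is an assumption.

```python
import re

# Location strings; the first two come from the example in the text,
# the third is a hypothetical non-matching branch.
PATHS = [
    "Business > Fortune 500 Companies > Ford > Operations > USA locations",
    "Business > USA > Automotive manufacturers > Ford",
    "Business > Automotive manufacturers > General Motors",
]


def words(text):
    """Lower-cased set of the words in a string."""
    return set(re.findall(r"\w+", text.casefold()))


def search_string(query):
    """Return every branch whose location string contains all query words."""
    wanted = words(query)
    return [path for path in PATHS if wanted <= words(path)]


def display_label(wordset, user_term):
    """Show the user's own variant if it is a member of the WordSet."""
    for phrase in wordset["phrases"]:
        if phrase.casefold() == user_term.casefold():
            return phrase
    return wordset["phrases"][0]  # assumed fallback: first phrase as leadword


NAPPY = {"id": "ws1", "phrases": ["nappy", "diaper"]}  # hypothetical WordSet
```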
Symbolic links and related links can be displayed in the search interface as siblings, children, or uncles.
Figure 10 shows a search interface. The interface shows a search box 100 into which the user can enter one or more terms before clicking on the "search" button 102 to initiate the search. Alternatively, the user can search by category 104.
Figure 10 shows the search term "astronomy" has been entered. Astronomy appears more than once in the taxonomy and Figure 11 shows a disambiguation page which appears. The user clicks on the category of interest from the list shown 106.
In this case, the user is interested in amateur astronomy 108.
Figure 12 shows the resulting screen. The position in the taxonomy is shown 110, as well as related terms 112. Terms of the search string 114 are also shown, as well as the results of the search 116.
Generate Navigation
A further optional feature relates to the generation of a search site. This feature allows the user to generate a navigation site based on the user's chosen parent node. The user highlights the chosen node in the taxonomy. He then chooses "convert taxonomy" from a pull-down menu. When the "navigation" has been generated, the system sends a message (for example an e-mail) indicating that the generation is complete and advising as to the location of the navigation. For example, the message may include a URL of a website for the navigation interface.
The navigation preferably appears the same as a "normal" interface, but is based on a reduced taxonomy.
For example, if a navigation were being created for an English department of a University, the chosen node might be "english". The created navigation would then have "english" as its top node and only include the children and grandchildren, and so on, of "english". In that way, non-relevant terms can be excluded from the navigation.
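The reduction to a chosen parent node can be sketched as a subtree extraction. The department names below extend the "english" example and are hypothetical, as is the dictionary representation of the taxonomy.

```python
# Hypothetical taxonomy: node -> list of child nodes.
CHILDREN = {
    "university": ["english", "physics"],
    "english": ["poetry", "drama"],
    "poetry": ["sonnets"],
    "physics": ["optics"],
}


def reduced_taxonomy(root):
    """Return {node: children} for the subtree whose top node is root."""
    subtree = {}
    stack = [root]
    while stack:
        node = stack.pop()
        kids = CHILDREN.get(node, [])
        subtree[node] = list(kids)
        stack.extend(kids)
    return subtree


navigation = reduced_taxonomy("english")
```

Everything outside the chosen branch, here "university", "physics" and "optics", is simply never visited, so it cannot appear in the generated navigation.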
XML Document
The following is an example of an extract of an XML document for use in the system described:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE WORDMAP SYSTEM "F:\lTTECH\DATA\XML\DTD\wordmap.dtd">
<WORDMAP>
<WORDSET ID = "WM203207" CURRSOURCEREF = "1" ORIGSOURCEREF = "1" TYPE = "ROOT">
<PHRASE POSITION = "1" LANGUAGE = "en" ORIGSOURCEREF = "1" ORIGINALWSID = "203207">astronomy</PHRASE>
<FEATURE TYPE = "Searchstring" VALUE = "&quot;Science&quot;" ></FEATURE>
<FEATURE TYPE = "Original Code" VALUE = "203207" ></FEATURE>
<FEATURE TYPE = "Description" VALUE = "" ></FEATURE>
<FEATURE TYPE = "Wordmap Unique Code" VALUE = "WUC-155978" ></FEATURE>
<ASSOCIATEDTERM TERM = "Science" FLAG = "1"/>
<CHILD ID = "WM203208" RELTYPE = "PHY"/>
<CHILD ID = "WM203248" RELTYPE = "PHY"/>
<CHILD ID = "WM203255" RELTYPE = "PHY"/>
<CHILD ID = "WM203262" RELTYPE = "PHY"/>
<CHILD ID = "WM203264" RELTYPE = "PHY"/>
<CHILD ID = "WM203271" RELTYPE = "PHY"/>
<CHILD ID = "WM203283" RELTYPE = "PHY"/>
<CHILD ID = "WM203295" RELTYPE = "PHY"/>
<CHILD ID = "WM203314" RELTYPE = "PHY"/>
<CHILD ID = "WM203365" RELTYPE = "PHY"/>
<CHILD ID = "WM363276" RELTYPE = "PHY"/>
<CHILD ID = "WM608366" RELTYPE = "PHY"/>
<CHILD ID = "WM695589" RELTYPE = "PHY"/>
<CHILD ID = "WM210403" RELTYPE = "SYM" ALIAS = "Space_Exploration"/>
</WORDSET>
<WORDSET ID = "WM203208" CURRSOURCEREF = "1" ORIGSOURCEREF = "1" TYPE = "DATA">
<PHRASE POSITION = "2" LANGUAGE = "en" ORIGSOURCEREF = "1" ORIGINALWSID = "203208">Amateur</PHRASE>
<PHRASE POSITION = "1" LANGUAGE = "en" ORIGSOURCEREF = "" ORIGINALWSID = "0">amateur astronomy</PHRASE>
<FEATURE TYPE = "Searchstring" VALUE = "&quot;Astronomy&quot;" ></FEATURE>
<FEATURE TYPE = "Original Code" VALUE = "203208" ></FEATURE>
<FEATURE TYPE = "Description" VALUE = "" ></FEATURE>
<FEATURE TYPE = "Wordmap Unique Code" VALUE = "WUC-155979" ></FEATURE>
<ASSOCIATEDTERM TERM = "Astronomy" FLAG = "1"/>
<CHILD ID = "WM203210" RELTYPE = "PHY"/>
<CHILD ID = "WM203211" RELTYPE = "PHY"/>
<CHILD ID = "WM203213" RELTYPE = "PHY"/>
<CHILD ID = "WM203216" RELTYPE = "PHY"/>
<CHILD ID = "WM203223" RELTYPE = "PHY"/>
<CHILD ID = "WM203224" RELTYPE = "PHY"/>
<CHILD ID = "WM203226" RELTYPE = "PHY"/>
<CHILD ID = "WM203227" RELTYPE = "PHY"/>
<CHILD ID = "WM203228" RELTYPE = "PHY"/>
<CHILD ID = "WM608365" RELTYPE = "PHY"/>
<CHILD ID = "WM610387" RELTYPE = "PHY"/>
</WORDSET>
<WORDSET ID = "WM203210" CURRSOURCEREF = "1" ORIGSOURCEREF = "1" TYPE = "DATA">
<PHRASE POSITION = "1" LANGUAGE = "en" ORIGSOURCEREF = "1" ORIGINALWSID = "203210">telescope making</PHRASE>
<FEATURE TYPE = "Searchstring" VALUE = "&quot;Amateur&quot; &quot;amateur_astronomy&quot;" ></FEATURE>
<FEATURE TYPE = "Original Code" VALUE = "203210" ></FEATURE>
<FEATURE TYPE = "Description" VALUE = "" ></FEATURE>
<FEATURE TYPE = "Wordmap Unique Code" VALUE = "WUC-155980" ></FEATURE>
<ASSOCIATEDTERM TERM = "amateur_astronomy" FLAG = "1"/>
<ASSOCIATEDTERM TERM = "Amateur" FLAG = "1"/>
</WORDSET>
<WORDSET ID = "WM203211" CURRSOURCEREF = "1" ORIGSOURCEREF = "1" TYPE = "DATA">
<PHRASE POSITION = "1" LANGUAGE = "en" ORIGSOURCEREF = "1" ORIGINALWSID = "203211">astrophotography</PHRASE>
<FEATURE TYPE = "Searchstring" VALUE = "&quot;Amateur&quot; &quot;amateur_astronomy&quot;" ></FEATURE>
<FEATURE TYPE = "Original Code" VALUE = "203211" ></FEATURE>
<FEATURE TYPE = "Description" VALUE = "" ></FEATURE>
<FEATURE TYPE = "Wordmap Unique Code" VALUE = "WUC-155981" ></FEATURE>
<ASSOCIATEDTERM TERM = "amateur_astronomy" FLAG = "1"/>
<ASSOCIATEDTERM TERM = "Amateur" FLAG = "1"/>
</WORDSET>
<WORDSET ID = "WM203213" CURRSOURCEREF = "1" ORIGSOURCEREF = "1" TYPE = "DATA">
<PHRASE POSITION = "1" LANGUAGE = "en" ORIGSOURCEREF = "1" ORIGINALWSID = "203213">binocular astronomy</PHRASE>
<FEATURE TYPE = "Searchstring" VALUE = "&quot;Amateur&quot; &quot;amateur_astronomy&quot;" ></FEATURE>
<FEATURE TYPE = "Original Code" VALUE = "203213" ></FEATURE>
<FEATURE TYPE = "Description" VALUE = "" ></FEATURE>
<FEATURE TYPE = "Wordmap Unique Code" VALUE = "WUC-155982" ></FEATURE>
<ASSOCIATEDTERM TERM = "amateur_astronomy" FLAG = "1"/>
<ASSOCIATEDTERM TERM = "Amateur" FLAG = "1"/>
</WORDSET>
...continues
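A document of this shape can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a shortened, self-contained copy of the extract; the `load_wordsets` helper and the dictionary layout it returns are assumptions made for illustration, and sorting phrases by POSITION reflects one reading of that attribute.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the extract above (DOCTYPE omitted, fewer children).
SAMPLE = """<WORDMAP>
<WORDSET ID="WM203207" CURRSOURCEREF="1" ORIGSOURCEREF="1" TYPE="ROOT">
<PHRASE POSITION="1" LANGUAGE="en" ORIGSOURCEREF="1" ORIGINALWSID="203207">astronomy</PHRASE>
<FEATURE TYPE="Searchstring" VALUE="&quot;Science&quot;"></FEATURE>
<ASSOCIATEDTERM TERM="Science" FLAG="1"/>
<CHILD ID="WM203208" RELTYPE="PHY"/>
<CHILD ID="WM210403" RELTYPE="SYM" ALIAS="Space_Exploration"/>
</WORDSET>
<WORDSET ID="WM203208" CURRSOURCEREF="1" ORIGSOURCEREF="1" TYPE="DATA">
<PHRASE POSITION="2" LANGUAGE="en" ORIGSOURCEREF="1" ORIGINALWSID="203208">Amateur</PHRASE>
<PHRASE POSITION="1" LANGUAGE="en" ORIGSOURCEREF="" ORIGINALWSID="0">amateur astronomy</PHRASE>
</WORDSET>
</WORDMAP>"""


def load_wordsets(xml_text):
    """Parse WORDSET elements into plain dictionaries keyed by ID."""
    root = ET.fromstring(xml_text)
    wordsets = {}
    for ws in root.findall("WORDSET"):
        phrases = sorted(ws.findall("PHRASE"), key=lambda p: int(p.get("POSITION")))
        wordsets[ws.get("ID")] = {
            "type": ws.get("TYPE"),
            "phrases": [p.text for p in phrases],
            "features": {f.get("TYPE"): f.get("VALUE") for f in ws.findall("FEATURE")},
            "children": [(c.get("ID"), c.get("RELTYPE")) for c in ws.findall("CHILD")],
        }
    return wordsets


wordsets = load_wordsets(SAMPLE)
```

Note that the parser resolves `&quot;` to a literal double quote, so the Searchstring feature value arrives as a quoted phrase.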
The XML-DTD for a Taxonomy Definition
The following describes an example of an XML DTD to represent a taxonomy or classification scheme. Its principal purpose is to allow data to be imported into the Taxonomy Management toolset described above. This may be useful to users who wish to export data from a variety of other proprietary formats into a single format.
Background Discussion
From the computational point of view, a taxonomy as used in the example of the system described above is a principally hierarchical data structure that has the following characteristics:
• Each node has one parent
• Each node may have a name, or leadword, plus several or no synonyms
• Any node may have an arbitrary number of features (attribute value pairs) associated with it.
• Symbolic and inter-taxonomy links allow nodes to be represented multiply in a taxonomy. Thus, computationally, a taxonomy is a graph rather than a tree.
Objects, Attributes and Inheritance
Each node in the taxonomy of this example is a first class data object that may have attributes known as "features". These are attribute value pairs, where values are arbitrary Unicode strings that may be interpreted as required by the processing application and inherited from nodes higher up in the taxonomy. The set of features that can be attributed to a given node is defined by the source of the node.
Symbolic Links
As described in the example above, nodes that could be placed in different parts of the taxonomy are assigned one parent only. However, they may occur elsewhere as symbolic links. These are 'child' nodes that are not physical children, but references to other nodes.
Multilinguality
The synonyms and leadwords in a taxonomy are automatically assigned an ISO standard language code or locale, for example "en", "fr", "de" etc. If it is wished to create a multilingual dataset, the simplest way to do this is to duplicate the unique codes, together with translated leadwords and synonyms. These spreadsheets may then be merged with the primary language dataset, using the unique code to identify the relevant WordSets.
The DTD used is as follows:
<!-- DTD for Wordmap Data -->
<!-- $Revision: 4 $ -->
<!ELEMENT WORDMAP (WORDSET+, SOURCE+)>
<!ELEMENT SOURCE (FEATURE_TYPE*)>
<!ATTLIST SOURCE
ID CDATA #REQUIRED
NAME CDATA #REQUIRED
VERSION CDATA #REQUIRED
DESCRIPTION CDATA #IMPLIED
MASTER_FLAG (0 | 1) #REQUIRED
>
<!ELEMENT FEATURE_TYPE EMPTY>
<!ATTLIST FEATURE_TYPE
NAME CDATA #REQUIRED
EDITABLE (0 | 1) #IMPLIED
INHERITABLE (0 | 1 | 2 | 3) #IMPLIED
TYPE (CHAR | DATE | NUMBER) #IMPLIED
CARDINALITY (S | M) #IMPLIED
INT_DEP_VAL CDATA #IMPLIED
>
<!ELEMENT WORDSET (PHRASE+, FEATURE*, ASSOCIATEDTERM*, CHILD*)>
<!ATTLIST WORDSET
ID ID #REQUIRED
CURRSOURCEREF CDATA #REQUIRED
ORIGSOURCEREF CDATA #REQUIRED
TYPE (DATA | ROOT | INDEX) #REQUIRED
>
<!ELEMENT PHRASE (#PCDATA)>
<!ATTLIST PHRASE
POSITION CDATA #REQUIRED
LANGUAGE CDATA #REQUIRED
ORIGSOURCEREF CDATA #REQUIRED
ORIGINALWSID CDATA #IMPLIED
>
<!ELEMENT FEATURE EMPTY>
<!ATTLIST FEATURE
TYPE CDATA #REQUIRED
VALUE CDATA #REQUIRED
>
<!ELEMENT ASSOCIATEDTERM EMPTY>
<!ATTLIST ASSOCIATEDTERM
TERM CDATA #REQUIRED
FLAG CDATA #REQUIRED
>
<!ELEMENT CHILD EMPTY>
<!ATTLIST CHILD
ID IDREF #REQUIRED
RELTYPE (PHY | SYM | REL | ITL) #REQUIRED
ALIAS CDATA #IMPLIED
>
The following section breaks down the DTD into clauses, and explains their purpose.
<!-- DTD for Wordmap Data -->
<!-- $Revision: 4 $ -->
<!ELEMENT WORDMAP (WORDSET+, SOURCE+)>
The Wordmap element acts as the container element for the representation of the taxonomy. The definition states that a Wordmap element is not valid unless it consists of at least one of each of the WORDSET and SOURCE elements. Indeed, a document can only represent a taxonomy when at least one Wordset, which must have a Source, exists; this is true if we deem the simplest of taxonomies to consist of a single node without children.
<!ELEMENT SOURCE (FEATURE_TYPE*)>
<!ATTLIST SOURCE
ID CDATA #REQUIRED
NAME CDATA #REQUIRED
VERSION CDATA #REQUIRED
DESCRIPTION CDATA #IMPLIED
MASTER_FLAG (0 | 1) #REQUIRED
>
The Source element is used to represent information about the sources from which a Wordset (or a Wordset member) is derived. Each source must be given an ID that is used by the Wordset or Wordset member elements to identify it. The MASTER_FLAG attribute specifies whether the source should be considered for use as a Master Taxonomy in the Taxonomy Editing Toolset; the value should be either 0 (not a master taxonomy) or 1 (a master taxonomy). If a Master taxonomy already exists in the receiving database, this value is overridden on import and the taxonomy is classed as a satellite, non-master, taxonomy. The source element contains 0 or more feature_type elements that define the feature types that are valid for the containing source. The concept of whether a taxonomy is a master taxonomy or not in Wordmap relates to whether the taxonomy can be edited in the taxonomy toolset or not, and whether the taxonomy can be used as the inter-taxonomy link index taxonomy.
<!ELEMENT FEATURE_TYPE EMPTY>
<!ATTLIST FEATURE_TYPE
NAME CDATA #REQUIRED
EDITABLE (0 | 1) #REQUIRED
INHERITABLE (0 | 1 | 2 | 3) #IMPLIED
TYPE (CHAR | DATE | NUMBER) #IMPLIED
CARDINALITY (S | M) #IMPLIED
INT_DEP_VAL CDATA #IMPLIED
>
The FEATURE_TYPE element is used to represent the different types of feature that are available to store information about a wordset. Features can be used to model a host of information about a wordset. Common usages are to represent statuses, definitions and codes related to the wordset. The EDITABLE attribute represents whether the feature value for a given feature type should be editable within the Taxonomy Editing Toolset (1) or not (0). The INHERITABLE attribute specifies whether the value of a feature of a given feature_type should be inherited by the owning wordset's children, overriding any value specified by the child wordsets (1); inherited by the owning wordset's children only if the child wordset does not have its own value for the feature (2); inherited by the owning wordset's children depending on whether the value of the inherited feature equals that specified for the INT_DEP_VAL of the feature_type (3); or not inherited (0). The TYPE attribute can be used to specify simple type categorisation for the feature type. Currently the supported TYPEs for a feature are DATE, for storing dates; NUMBER, for storing numbers; and CHAR, for storing any data other than DATEs and NUMBERs that can be stored in a character string. The CARDINALITY attribute can be used to specify whether the feature_type can have multiple instances within a wordset (M) or is constrained to a single instance (S).
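The four INHERITABLE settings can be read as a small resolution rule applied to each child wordset. The sketch below is one possible interpretation, not the toolset's implementation: in particular, mode 3 is assumed to mean "inherit only when the parent's value equals INT_DEP_VAL and the child has no value of its own", which the text does not state explicitly, and the function name is hypothetical.

```python
def effective_value(mode, parent_value, child_value, int_dep_val=None):
    """Resolve one feature value for a child wordset.

    mode follows INHERITABLE: 0 not inherited, 1 parent overrides child,
    2 parent fills gaps only, 3 conditional on INT_DEP_VAL (assumed reading).
    """
    if parent_value is None or mode == 0:
        return child_value
    if mode == 1:
        return parent_value
    if mode == 2:
        return child_value if child_value is not None else parent_value
    if mode == 3:
        # Assumption: inherit only when the parent's value matches INT_DEP_VAL
        # and the child has not set its own value.
        if parent_value == int_dep_val and child_value is None:
            return parent_value
        return child_value
    raise ValueError(f"unknown INHERITABLE mode: {mode}")
```

For instance, a "status" feature with mode 1 would force a parent's value onto every descendant, whereas mode 2 would let individual wordsets keep their own status.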
<!ELEMENT WORDSET (PHRASE+, FEATURE*, ASSOCIATEDTERM*, CHILD*)>
<!ATTLIST WORDSET
ID ID #REQUIRED
CURRSOURCEREF CDATA #REQUIRED
ORIGSOURCEREF CDATA #REQUIRED
TYPE (DATA | ROOT | INDEX) #REQUIRED
>
The WORDSET element consists of at least one PHRASE element and zero or more of each of the FEATURE, ASSOCIATEDTERM and CHILD elements. Indeed, a wordset can only exist if it has at least one member and wordset members are represented in the Wordmap DTD as PHRASE elements.
The ID attribute is used by CHILD elements to refer to the wordset as a child. The CURRSOURCEREF and ORIGSOURCEREF attributes refer to the ID of the current source and original source of the wordset respectively, i.e. the source of the taxonomy under which the wordset currently resides, and the original source of the wordset. The TYPE of a wordset indicates whether it is the root node of a taxonomy (ROOT), a node in the taxonomy that stores data (DATA), or a node that is present only for representing structure (INDEX).
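The WORDSET constraints just described can be checked by hand when a validating parser is not available (Python's `xml.etree.ElementTree`, used here, does not validate against a DTD). The following is a rough sketch; the `wordset_problems` helper and the two test elements are hypothetical and cover only the rules stated above.

```python
import xml.etree.ElementTree as ET

WORDSET_TYPES = {"DATA", "ROOT", "INDEX"}
REQUIRED_ATTRS = ("ID", "CURRSOURCEREF", "ORIGSOURCEREF", "TYPE")


def wordset_problems(ws):
    """Return a list of human-readable problems found in one WORDSET element."""
    problems = []
    for attr in REQUIRED_ATTRS:
        if ws.get(attr) is None:
            problems.append(f"missing attribute {attr}")
    ws_type = ws.get("TYPE")
    if ws_type is not None and ws_type not in WORDSET_TYPES:
        problems.append(f"invalid TYPE {ws_type!r}")
    if ws.find("PHRASE") is None:
        problems.append("no PHRASE element")
    return problems


good = ET.fromstring(
    '<WORDSET ID="WM1" CURRSOURCEREF="1" ORIGSOURCEREF="1" TYPE="ROOT">'
    '<PHRASE POSITION="1" LANGUAGE="en" ORIGSOURCEREF="1">astronomy</PHRASE>'
    "</WORDSET>"
)
bad = ET.fromstring('<WORDSET ID="WM2" CURRSOURCEREF="1" TYPE="LEAF"/>')
```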
<!ELEMENT PHRASE (#PCDATA)>
<!ATTLIST PHRASE
POSITION CDATA #REQUIRED
LANGUAGE CDATA #REQUIRED
ORIGSOURCEREF CDATA #REQUIRED
ORIGINALWSID CDATA #IMPLIED
>
The PHRASE element is used to represent a single wordset member that can be either the leadword for the wordset or any of its synonyms/language variants. Each phrase has a position within the wordset that can be used to specify the relevance/importance of that phrase and a language code that should conform to the ISO 639 and 3166 standards for specification of country and language variants.
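As a small illustration of how POSITION and LANGUAGE might be used together, the sketch below picks a display phrase for a requested language. Treating the lowest POSITION as the leadword is an assumption, and the French phrase is an invented example, not data from the extract.

```python
def leadword(phrases, language):
    """Pick the phrase with the lowest POSITION for the requested language.

    phrases is a list of (position, language, text) tuples.
    """
    candidates = [p for p in phrases if p[1] == language]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p[0])[2]


# Members of one wordset; the "fr" entry is hypothetical.
PHRASES = [
    (2, "en", "Amateur"),
    (1, "en", "amateur astronomy"),
    (1, "fr", "astronomie amateur"),
]
```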
<!ELEMENT FEATURE EMPTY>
<!ATTLIST FEATURE
TYPE CDATA #REQUIRED
VALUE CDATA #REQUIRED
>
The FEATURE element represents the value of a feature of a specific type for a wordset. The type should identify one of those feature types contained within the XML document itself.
<!ELEMENT ASSOCIATEDTERM EMPTY>
<!ATTLIST ASSOCIATEDTERM
TERM CDATA #REQUIRED
FLAG CDATA #REQUIRED
>
The ASSOCIATEDTERM element can optionally be used to specify phrases that are associated with the wordset but are not deemed to be synonymous with the wordset. This element type is commonly used to produce better searchstrings for query expansion within the Navigation system and may be consumed within feature support in future versions of the product set. The FLAG attribute specifies whether the string specified in the TERM attribute should be used in an expanded query string.
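The role of the FLAG attribute in query expansion might look like the following. This is a hedged sketch: the quoting style mirrors the Searchstring values in the extract, but the `expanded_query` function, and the idea that a FLAG of "0" excludes a term, are assumptions (only the value "1" appears in the extract).

```python
def expanded_query(leadword, associated_terms):
    """Build an expanded query string from a leadword and flagged terms.

    associated_terms is a list of (term, flag) pairs; only terms whose
    flag is "1" are added (assumed meaning of FLAG).
    """
    parts = [f'"{leadword}"']
    parts += [f'"{term}"' for term, flag in associated_terms if flag == "1"]
    return " ".join(parts)
```

For example, `expanded_query("telescope making", [("amateur_astronomy", "1"), ("Amateur", "0")])` would keep only the first associated term alongside the leadword.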
<!ELEMENT CHILD EMPTY>
<!ATTLIST CHILD
ID IDREF #REQUIRED
RELTYPE (PHY | SYM | REL | ITL) #REQUIRED
ALIAS CDATA #IMPLIED
>
The CHILD element is used to model parent-child relationships between wordsets within the XML document. The owning WORDSET element can contain many CHILD elements each with a variety of values for the RELTYPE attribute. The ID attribute refers to the ID of the child wordset. The RELTYPE specifies the type of relationship being represented: PHY (Physical), SYM (Symbolic), REL (Related) or ITL (Inter-Taxonomy Link).
In summary, the discussion above has described an XML DTD that may be used to represent taxonomies. This is principally designed for simplicity to allow ready import into the Wordmap taxonomy management toolset. However, the format is sufficiently general to allow data exchange between taxonomies.
Multiple taxonomies can be viewed and managed at a number of different levels in preferred examples.
Taxonomy publication
This is a simple way of publishing multiple taxonomies. A central administrator(s) is able to select single nodes or branches of a taxonomy for publication, by highlighting the branch he wishes to publish and selecting "Generate navigation" from a drop down menu.
The "Generate navigation" option stores a flattened version of the relevant branch of the taxonomy in a server. The navigation interface makes calls to this server data.
In this model, therefore, a single taxonomy can be split by an administrator into several sections, each of which can be published to a different user interface.
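One way to picture the "flattened version" of a branch is as a list of rows, one per node, each carrying its parent, depth and full path. The row layout and the sample taxonomy below are assumptions for illustration; the format actually stored on the server is not specified in the text.

```python
# Hypothetical taxonomy: node -> list of child nodes.
CHILDREN = {
    "Acme": ["Finance", "Manufacturing"],
    "Finance": ["Pensions", "Expenses"],
    "Manufacturing": ["Cylinders"],
}


def flatten(branch_root):
    """Flatten one branch into rows in depth-first order."""
    rows = []

    def visit(node, parent, path):
        path = path + [node]
        rows.append({
            "node": node,
            "parent": parent,
            "depth": len(path) - 1,
            "path": " > ".join(path),
        })
        for child in CHILDREN.get(node, []):
            visit(child, node, path)

    visit(branch_root, None, [])
    return rows


published = flatten("Finance")
```

Publishing "Finance" and "Manufacturing" separately in this way would give each user interface its own self-contained section of the single taxonomy.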
Satellite taxonomies
The preferred system can provide a second way for users to manage multiple taxonomies by allowing the storage, management and publication of "satellite" taxonomies. This approach may be recommended when an organisation has taxonomies which are proprietary, and which it can therefore modify, and taxonomies belonging to a third party, which it cannot modify. For example, a manufacturer might have an internal directory of proprietary products and a directory of external products used in their assembly, which is based on the UNSPSC and administered by the relevant manufacturing trade body. The former is known as the "master" taxonomy and can be modified, whereas the satellite taxonomy cannot be modified. The user can create "inter-taxonomy links" between the trade taxonomy and the proprietary taxonomy. These links are colour coded in the management interface to distinguish them from intra-taxonomy links.
The children of linked terms can be opened and explored in the management interface "in place", so that linking to an external taxonomy exposes the children of the linked node to users.
These "inter-taxonomy links" are preserved when the taxonomy data is exported to the taxonomy server in a flattened version. Consequently, end users in a navigation interface can also navigate the satellite taxonomy(ies).
A number of satellite taxonomies can be stored, and these are selected for viewing and management in the right hand pane via a drop down menu. At any time, a satellite taxonomy can also be selected as the master taxonomy, in which case it will be shown in the left hand pane.
The master taxonomy can be modified, whereas apart from the creation of links, the modification of a satellite taxonomy is not permitted.
Preferred examples allow different "views" of a taxonomy to be seen by different individuals or groups using an information system.
The system preferably stores taxonomies in two different formats:
• Editorial format, optimised for management;
• Published or run-time format, optimised for use in navigation, in a variety of enterprise systems in which navigation is provided, e.g. portal, content management, CRM, ERP.
The system provides a means of accessing taxonomies stored in these two formats. In both of these interfaces, a "security layer" controls which objects are visible. In the editorial view, the security layer also controls which objects may be moved, modified or deleted.
Worked example
Acme Limited operates a taxonomy system. Among its users are Mrs Money in Finance and Mr Wrench in Manufacturing. Also using the system through Acme's web site is Mr Customer. The whole system is administered by Mr Mainframe.
Mr Mainframe has created four groups: World, Company, Finance and Manufacturing. He has granted full privileges over Finance to Mrs Money and over Manufacturing to Mr Wrench.
Mrs Money has created a number of Wordsets for Company viewing in the finance section, such as Pensions, Expenses and Tax advice. She has granted herself full privileges over these wordsets, so that she can modify both their contents and their position in the taxonomy. She has granted the group Company, of which all employees are members, half privileges, so that they can see but not modify these Wordsets. The group World, of which Mr Customer is a member, has no privileges in these wordsets.
Mrs Money also owns more sensitive wordsets such as payroll and cash position. These can only be viewed by the group finance, of which she is a member. The managing director has also been granted membership of this group.
The group finance is also a member of the group Company, and the group Company is a member of the group World. Mrs Money can therefore see Acme's web site and information intended for employees. Mr Wrench in manufacturing has taken a similar approach to that adopted by Mrs Money, and much technical information is only made available to members of the group manufacturing. Mr Wrench is about to create three further groups: Cylinders, Gaskets and Pistons. Specialists in these areas will be granted privileges that will further refine the view of the taxonomy they see and therefore the information they retrieve.
Table 1: Ownership of taxonomy objects
[Table 1 appears as an image in the original publication and is not reproduced here.]
Table 2: Complete taxonomy
Welcome to Acme
Pensions
Expenses
Tax advice
Payroll
Cash position
Cylinders
Gaskets
Pistons
Table 3: Taxonomy views of individuals
Mr Customer's view:
Welcome to Acme
Mrs Money's view:
Welcome to Acme
Pensions
Expenses
Tax advice
Payroll
Cash position
Mr Wrench's view:
Welcome to Acme
Pensions
Expenses
Tax advice
Cylinders
Gaskets
Pistons
This may be achieved in the following manner.
Firstly, a list of users of the system and of groups to which those users belong is compiled, or imported from another source. The invention provides an interface in which users and groups can be maintained; or, the system is synchronised with commonly used directory (LDAP) products.
An administrator has various privileges over users and groups:
Compilation of a list of named users of the system
Granting membership of groups to users
Granting membership of groups to other groups
Assigning taxonomy object level privileges to users and groups
Assigning relationship level privileges to users and groups
Secondly, users and groups are assigned privileges relating to taxonomy objects and the relationships between taxonomy objects. Therefore, each system user can see or modify taxonomy objects according to his privileges. Privileges can be linked to the membership of a certain group, so that, for example, a financial controller is able to view taxonomy branches relating to payroll by virtue of her membership of a group or her individual privileges.
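Because groups can be members of other groups, a user's effective group set has to be resolved transitively. The sketch below uses the names from the worked example; placing Manufacturing inside Company is an assumption made by analogy with Finance, and the `MEMBER_OF` structure is hypothetical.

```python
# Direct memberships: principal (user or group) -> groups it belongs to.
MEMBER_OF = {
    "Mrs Money": {"Finance"},
    "Mr Wrench": {"Manufacturing"},
    "Mr Customer": {"World"},
    "Finance": {"Company"},
    "Manufacturing": {"Company"},  # assumed by analogy with Finance
    "Company": {"World"},
}


def effective_groups(principal):
    """Return every group reached by following memberships transitively."""
    seen = set()
    stack = list(MEMBER_OF.get(principal, ()))
    while stack:
        group = stack.pop()
        if group not in seen:
            seen.add(group)
            stack.extend(MEMBER_OF.get(group, ()))
    return seen
```

This is why Mrs Money, although only a direct member of Finance, also sees material intended for Company and World.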
Taxonomy object privileges
The taxonomy object, or "Wordset", stores data which controls its behaviour in end-user interfaces, such as synonyms, foreign language variants, status etc. Users of the system can have two levels of privilege defined at Wordset level:
Half privileges, by which they are permitted to modify the data in the Wordset but not to commit the changes they have made
Full privileges, which allow them both to edit the data and to commit the changes they have made.
Taxonomy relationship privileges
The ability of users in both interfaces to see taxonomy objects is controlled by the level of privileges assigned at relationship level, ie the relationship between the taxonomy object and its parent.
A single taxonomy object can have many parents and therefore many locations. For example, the Wordset "Turkey" may have multiple parents such as "European Countries", "NATO members" and "Mediterranean Countries".
There are three relationship privilege levels:
None: At this level, the user does not see that the object in question is in this location.
Read: The user sees the object, but is not permitted to modify the relationship, i.e. move or delete it.
Read/write: The user can see and modify the relationship.
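The three relationship privilege levels can be used to compute each user's view of the taxonomy. The sketch below reproduces the views of the worked example under stated assumptions: the grants are reconstructed from Table 3 (Table 1 is not available), a user's level is taken to be the highest level granted to any of their groups, and all names in the code are hypothetical.

```python
from enum import IntEnum


class Rel(IntEnum):
    NONE = 0
    READ = 1
    READ_WRITE = 2


# Groups each user belongs to, directly or through nested groups.
USER_GROUPS = {
    "Mr Customer": {"World"},
    "Mrs Money": {"Finance", "Company", "World"},
    "Mr Wrench": {"Manufacturing", "Company", "World"},
}

FINANCE_PUBLIC = {"Company": Rel.READ, "Finance": Rel.READ_WRITE}
FINANCE_ONLY = {"Finance": Rel.READ_WRITE}
MANUFACTURING_ONLY = {"Manufacturing": Rel.READ_WRITE}

# Relationship-level grants per object (assumed, reconstructed from Table 3).
GRANTS = {
    "Welcome to Acme": {"World": Rel.READ},
    "Pensions": FINANCE_PUBLIC,
    "Expenses": FINANCE_PUBLIC,
    "Tax advice": FINANCE_PUBLIC,
    "Payroll": FINANCE_ONLY,
    "Cash position": FINANCE_ONLY,
    "Cylinders": MANUFACTURING_ONLY,
    "Gaskets": MANUFACTURING_ONLY,
    "Pistons": MANUFACTURING_ONLY,
}


def level(user, obj):
    """Highest relationship privilege any of the user's groups holds."""
    grants = GRANTS.get(obj, {})
    return max((grants.get(g, Rel.NONE) for g in USER_GROUPS[user]), default=Rel.NONE)


def view(user):
    """Objects the user can see (Read or Read/write)."""
    return [obj for obj in GRANTS if level(user, obj) >= Rel.READ]


def can_move(user, obj):
    """Only Read/write allows the relationship to be modified."""
    return level(user, obj) == Rel.READ_WRITE
```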
Figure 13 shows a screen in which a user chooses a wordset 100 and can call up a menu with an option 102 for editing settings for that wordset. Figure 14 shows a view of the "edit wordset" screen in which the privileges for different users 104, or groups of users, for example "accounts" 106, can be set. Figure 15 shows a screen in which privileges can be set for viewing and changing links.
The precise details of the implementation of the various functions described above, and their distribution between hardware and software, are a matter of choice for the implementor and will not be described in detail.
The modules and other components have been described in terms of the features and functions provided by each component, together with optional and preferable features. With the information given and specifications provided, actual implementation of these features and the precise details are left to the implementor. As an example, certain modules could be implemented in software.
The above modules and components are merely illustrative, and the invention may be implemented in a variety of ways, and, in particular, some components may be combined with others which perform similar functions, or some may be omitted in simplified implementations. Hardware and software implementations of each of the functions may be freely mixed, both between components and within a single component. It will be readily understood that the functions performed by the hardware, the computer software, and such like are performed on or using electrical and like signals.
It will be understood that the present invention has been described above purely by way of example, and modification of detail can be made within the scope of the invention.
Each feature disclosed in the description, and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination.

Claims

1. Apparatus for managing a taxonomy, including means for storing data relating to the taxonomy, the apparatus including control means for controlling access to the data.
2. Apparatus according to claim 1, wherein the control means is adapted to provide access only to a portion of the data relating to the taxonomy.
3. Apparatus according to claim 1 or claim 2, wherein the control means is adapted to prevent access to data relating to the taxonomy.
4. Apparatus according to any preceding claim, wherein the control means is adapted to prevent the reading of data relating to the taxonomy.
5. Apparatus according to any preceding claim, wherein the control means is adapted to prevent the modification of data relating to the taxonomy.
6. Apparatus according to any preceding claim, wherein the control means is adapted to prevent saving modifications to the data relating to the taxonomy.
7. Apparatus according to any preceding claim, including a first storage means for storing data relating to the taxonomy which is adapted for use for viewing the taxonomy, and a second storage means for storing data relating to the taxonomy which is adapted for use for editing the data.
8. Apparatus according to any preceding claim, further comprising means for controlling access to data relating to an object of the taxonomy.
9. Apparatus according to any preceding claim, comprising means for storing a list of users, and means for storing access privilege information associated with a user.
10. Apparatus according to claim 9, further including means for receiving a request from a user relating to the data relating to the taxonomy, means for retrieving the access privilege information for that user, and means for using the access privilege information to determine whether or not to carry out the request.
11. Apparatus according to claim 9 or claim 10, further including means for defining a group of users and for storing access privilege information associated with the group of users.
12. Apparatus for managing a taxonomy, including means for displaying the taxonomy, wherein the apparatus is adapted to display only a part of the information of the taxonomy.
13. Apparatus for editing a taxonomy, including control means for preventing modification of a part of the data relating to the taxonomy.
14. Apparatus for managing access to a taxonomy, including means for generating a data set comprising a portion of the information of the taxonomy.
15. Apparatus according to claim 14, wherein the data set comprises a branch of the hierarchy of the taxonomy.
16. Apparatus for managing a plurality of taxonomies, the apparatus being adapted to provide a link between an object of a first taxonomy and an object of the second taxonomy.
17. A method of managing a taxonomy, including storing data relating to the taxonomy, and controlling access to the data.
18. A method of managing a taxonomy, including displaying only a part of the information of the taxonomy.
19. A method of editing a taxonomy, including preventing modification of a part of the data relating to the taxonomy.
20. A method of managing access to a taxonomy, including generating a data set comprising a portion of the information of the taxonomy.
21. A method of managing a plurality of taxonomies, including the step of providing a link between an object of a first taxonomy and an object of the second taxonomy.
22. Apparatus for managing a taxonomy, the apparatus comprising means for storing a plurality of objects, and means for storing associated information which is associated with an object of the hierarchy, the associated information including information relating to the hierarchical relationship of the object to another object, and further including additional data relating to the object.
23. Apparatus according to any preceding claim, wherein each object of the taxonomy has associated information and/or additional data.
24. Apparatus according to claim 22 or claim 23, wherein at least a part of the associated information and/or the additional data is associated with a group of objects.
25. Apparatus according to any of claims 22 to 24, wherein the associated information and/or the additional data includes a search string.
26. Apparatus according to any of claims 22 to 25, wherein the associated information and/or the additional data includes search location information.
27. Apparatus according to any of claims 22 to 26, including means for adding an object to the taxonomy, and/or removing an object from the taxonomy and/or for adding and/or editing additional information and/or additional data associated with an object.
28. Apparatus according to any of claims 22 to 27, the apparatus including means for adding a link object, the link object being linked to an object of the taxonomy.
29. Apparatus according to any of claims 22 to 28, wherein the apparatus is adapted to use the associated information of the objects to construct a hierarchical representation of the taxonomy.
30. A taxonomy comprising a plurality of objects, wherein associated information is associated with an object of the taxonomy, the associated information including information relating to the hierarchical relationship of the object to another object, and further including additional data relating to the object.
31. A method of managing a taxonomy including a plurality of objects, the method including the step of associating information with an object of the taxonomy, the associated information including information relating to the hierarchical relationship of the object to another object, and further including additional data relating to the object.
32. An apparatus for managing a first taxonomy, the apparatus including means for transferring information from a set of data to the first taxonomy.
33. An apparatus according to claim 32, wherein the taxonomy comprises a plurality of hierarchically related objects.
34. Apparatus according to claim 32 or claim 33, including means for determining whether the set of data includes data which is similar to data in the first taxonomy, and means for preventing duplicate data in the first taxonomy.
35. Apparatus according to any of claims 32 to 34, including means for merging the associated information and/or the additional data of an object of the taxonomy with data of the set of data.
36. An apparatus for managing a taxonomy including a plurality of objects, the apparatus being adapted to create a link between a first object of the taxonomy and a second object.
37. A method of merging a set of data with a taxonomy, the method including determining whether the set of data includes data which is similar to data in the taxonomy, and deleting the similar data from the set of data or from the taxonomy.
38. An apparatus according to any of claims 32 to 37, including means for generating a hierarchical representation of the taxonomy.
39. A method of creating a user interface in an apparatus for managing a taxonomy, the method including generating a hierarchical representation of the taxonomy.
40. A user interface for an apparatus for managing a first taxonomy, the user interface including a hierarchical representation of the first taxonomy.
41. A user interface according to claim 40, including a representation of a set of data wherein items of a representation can be moved from one representation to another representation.
42. An apparatus for managing a taxonomy, the apparatus including means for generating a sub-taxonomy comprising a part of the taxonomy.
43. A method of managing a taxonomy, the method comprising the step of generating a sub-taxonomy comprising a part of the taxonomy.
44. A method of generating a search query, the method comprising receiving an input, comparing the input with an object of a taxonomy, identifying an object related to the input, retrieving information associated with the identified object, and using the information to generate the search query.
45. A method according to claim 44, including identifying two objects related to the input, and retrieving information relating to the two objects.
46. An apparatus for generating a search query, the apparatus comprising means for receiving an input, means for comparing the input with an object of a taxonomy, means for identifying an object related to the input, means for retrieving information associated with the identified object, and means for generating the search query.
47. A search query generated by a method according to claim 44 or claim 45 and/or using an apparatus according to claim 46.
48. A method of managing a first taxonomy, the method including transferring information from a set of data to the first taxonomy.
49. A method of managing a taxonomy including a plurality of objects, the method including creating a link between a first object of the taxonomy and a second object.
50. An apparatus for merging a set of data with a taxonomy, the apparatus including means for determining whether the set of data includes data which is similar to data in the taxonomy, and for deleting the similar data from the set of data or from the taxonomy.
51. A method being substantially as herein described having reference to any of the accompanying drawings.
52. An apparatus being substantially as herein described having reference to and as illustrated by any of the accompanying drawings.
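
By way of illustration only, the arrangement recited in claims 22, 25, 26, 29 and 44 might be sketched as follows. This is a minimal sketch under stated assumptions and does not form part of the claims or represent the applicant's implementation: the names TaxonomyObject, render and generate_query, the field names, and the exact-name matching rule are all hypothetical. Each object stores its hierarchical relationship (a parent reference) together with additional data (a search string and search locations); the hierarchy is then reconstructed from the stored relationships, and a search query is generated by comparing an input with the objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaxonomyObject:
    """One object of the taxonomy (all field names are hypothetical)."""
    name: str
    parent: Optional[str] = None   # hierarchical relationship to another object
    search_string: str = ""        # additional data: a search string
    locations: List[str] = field(default_factory=list)  # search location information


def render(objects: Dict[str, TaxonomyObject]) -> List[str]:
    """Construct an indented hierarchical representation from the parent links."""
    children: Dict[Optional[str], List[str]] = {}
    for obj in objects.values():
        children.setdefault(obj.parent, []).append(obj.name)

    lines: List[str] = []

    def walk(name: str, depth: int) -> None:
        lines.append("  " * depth + name)
        for child in sorted(children.get(name, [])):
            walk(child, depth + 1)

    # Objects without a parent are treated as roots of the hierarchy.
    for root in sorted(children.get(None, [])):
        walk(root, 0)
    return lines


def generate_query(user_input: str,
                   objects: Dict[str, TaxonomyObject]) -> Optional[dict]:
    """Compare the input with object names; build a query from the stored data.

    Assumption: matching is a case-insensitive exact comparison of names.
    Returns None when no object is related to the input.
    """
    wanted = user_input.strip().lower()
    for obj in objects.values():
        if obj.name.lower() == wanted:
            # Fall back to the object's name if no search string is stored.
            return {"q": obj.search_string or obj.name,
                    "locations": list(obj.locations)}
    return None


# Example taxonomy (illustrative data only).
example = {
    o.name: o
    for o in [
        TaxonomyObject("Animals"),
        TaxonomyObject("Cats", parent="Animals",
                       search_string="cat OR feline",
                       locations=["example.org"]),
        TaxonomyObject("Dogs", parent="Animals"),
    ]
}
```

Storing the relationship on each object, rather than in a separate tree structure, means the hierarchical representation can be rebuilt at any time from the objects alone, which is one reading of claim 29.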
PCT/GB2002/005097 2001-11-13 2002-11-12 Taxonomy management WO2003042865A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0127248.3 2001-11-13
GBGB0127248.3A GB0127248D0 (en) 2001-11-13 2001-11-13 Taxonomy management

Publications (2)

Publication Number Publication Date
WO2003042865A2 (en) 2003-05-22
WO2003042865A3 (en) 2004-06-24

Family

ID=9925703

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2002/005097 WO2003042865A2 (en) 2001-11-13 2002-11-12 Taxonomy management

Country Status (2)

Country Link
GB (1) GB0127248D0 (en)
WO (1) WO2003042865A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001022251A2 (en) * 1999-09-24 2001-03-29 Wordmap Limited Apparatus for and method of searching

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FARQUHAR A ET AL: "Collaborative Ontology Construction for Information Integration" TECHNICAL REPORT KSL-95-63, KNOWLEDGE SYSTEMS LABORATORY, STANFORD UNIVERSITY, [Online] 1995, XP002268717 CA, USA Retrieved from the Internet: URL:ftp://ftp.ksl.stanford.edu/pub/KSL_Reports/KSL-95-63.ps.gz [retrieved on 2004-01-30] *
NATALYA G. KEBERLE, VLADIM A. ERMOLAYEV: "An Approach to Dynamic Ontology Modification in Mediator Service-Oriented Information Systems" TECHNICAL REPORT, ZAPOROZHYE STATE UNIVERSITY, [Online] pages 1-12, XP002268719 Zaporozhye, Ukraine Retrieved from the Internet: URL:http://eva.zsu.zaporizhzhe.ua/eva_personal/PS/ISTA2001-ZSU.pdf [retrieved on 2004-01-27] & NATALYA G. KEBERLE, VLADIM A. ERMOLAYEV: "An Approach to Dynamic Ontology Modification in Mediator Service-Oriented Information Systems" PROC. OF INTL. CONF. INFORMATION SYSTEMS TECHNOLOGY AND ITS APPLICATIONS'2001, 13 June 2001 (2001-06-13) - 15 June 2001 (2001-06-15), pages 247-249, Kharkiv, Ukraine *
ROBERT MACGREGOR, RAMESH S. PATIL: "Tools for Assembling and Managing Scalable Knowledge Bases" INFORMATION SCIENCES INSTITUTE, UNIVERSITY OF SOUTHERN CALIFORNIA, [Online] 1997, pages 1-15, XP002268718 Los Angeles, USA Retrieved from the Internet: URL:http://www.isi.edu/isd/OntoLoom/hpkb/OntoLoom.html [retrieved on 2004-01-27] *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130346422A1 (en) * 2002-06-12 2013-12-26 Global Connect Technology Data storage, retrieval, manipulation and display tools enabling multiple hierarchical points of view
EP1657652A1 (en) * 2004-11-12 2006-05-17 Sap Ag Process-oriented classification
EP1868118A2 (en) * 2004-11-12 2007-12-19 Sap Ag Process-oriented classification
EP1868118A3 (en) * 2004-11-12 2008-07-23 Sap Ag Process-oriented classification
US8131694B2 (en) 2004-11-12 2012-03-06 Sap Ag Process-oriented classification
EP1975814A2 (en) * 2007-03-28 2008-10-01 Kabushiki Kaisha Toshiba Information retrieval apparatus and method
EP1975814A3 (en) * 2007-03-28 2009-07-01 Kabushiki Kaisha Toshiba Information retrieval apparatus and method
US7987184B2 (en) 2007-03-28 2011-07-26 Kabushiki Kaisha Toshiba Information retrieval apparatus and method
US9953062B2 (en) 2014-08-18 2018-04-24 Lexisnexis, A Division Of Reed Elsevier Inc. Systems and methods for providing for display hierarchical views of content organization nodes associated with captured content and for determining organizational identifiers for captured content

Also Published As

Publication number Publication date
GB0127248D0 (en) 2002-01-02
WO2003042865A3 (en) 2004-06-24

Similar Documents

Publication Publication Date Title
Elmasri et al. Fundamentals of Database Systems 7th ed.
US5842212A (en) Data modeling and computer access record memory
US6078925A (en) Computer program product for database relational extenders
US5603025A (en) Methods for hypertext reporting in a relational database management system
US5953726A (en) Method and apparatus for maintaining multiple inheritance concept hierarchies
US7797336B2 (en) System, method, and computer program product for knowledge management
US8117535B2 (en) System and method for creating dynamic folder hierarchies
US6768986B2 (en) Mapping of an RDBMS schema onto a multidimensional data model
US5778378A (en) Object oriented information retrieval framework mechanism
Elmasri Fundamentals of database systems seventh edition
US20090055362A1 (en) System and computer program product for performing an inexact query transformation in a heterogeneous environment
US6915303B2 (en) Code generator system for digital libraries
ZA200503578B (en) Adaptively interfacing with a data repository
Abramowicz et al. Filtering the Web to feed data warehouses
EP1166218A2 (en) Intellectual property asset manager (ipam) for context processing of data objects
Netz et al. Integration of data mining and relational databases
US20020089551A1 (en) Method and apparatus for displaying a thought network from a thought's perspective
MXPA05006260A (en) Systems and methods for extensions and inheritance for units of information manageable by a hardware/software interface system.
WO2003042865A2 (en) Taxonomy management
EP1014283A1 (en) Intranet-based cataloguing and publishing system and method
Hoogeveen et al. Integration of information retrieval and database management in support of multimedia police work
Watson Beginning C# 2005 databases
EP1304630A2 (en) Report generating system
Blakeley et al. Enabling component databases with OLE DB
Фаловський et al. Basics of database design and using

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP