WO2002089005A2 - Data retrieval method and system - Google Patents

Data retrieval method and system

Publication number
WO2002089005A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
classifications
selectable
membership
objects
Prior art date
Application number
PCT/GB2002/001968
Other languages
French (fr)
Other versions
WO2002089005A3 (en)
Inventor
Christopher Alan Mcmahon
Rose Crossland
Original Assignee
University Of Bristol
Priority date
Filing date
Publication date
Application filed by University Of Bristol
Publication of WO2002089005A2
Publication of WO2002089005A3

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/9032: Query formulation
    • G06F16/90324: Query formulation using system suggestions
    • G06F16/90328: Query formulation using system suggestions using search space presentation or visualization, e.g. category or range presentation and selection

Definitions

  • the present invention relates to a data retrieval method and a system for implementing the same. Particularly, but not exclusively the present invention is suited for use in the browsing and searching of data, otherwise referred to herein as information entities or data objects, in internet and intranet environments.
  • Free text (i.e. keyword) search tools, e.g. the 'Find files' feature in Microsoft™ Explorer and the core functionality provided by the Internet search engines Alta Vista™, Google™, etc.
  • Browsing tools that allow the exploration of information entities that are hierarchically categorised or filed (e.g. the 'Folder tree' in
  • Keyword search tools in the form of query interfaces tend to be variants of standard text dialog boxes, possibly modified to allow the inclusion of keyword operators (e.g. AND, OR, NOT, etc.). Users of keyword search engines are continually faced with the problem of finding some compromise between:
  • Hierarchical browsing search tools can be browsed by users who can 'drill' down to find information entities of interest - analogous to browsing bookshelves in a library. This provides an intuitive way of searching for general information related to a topic.
  • problems can be faced by those unfamiliar with the structure of the hierarchies.
  • problems can quickly result if users are continually required to 'drill' down into multiple categories that could plausibly contain information of interest (it is notoriously difficult to consistently classify documents into single positions in a hierarchy).
  • the present invention seeks to address the limitations encountered with such conventional approaches to information entity browsing and searching.
  • the present invention provides a data retrieval method for locating one or more data objects from a plurality of data objects through the selection of one or more data classifications from a plurality of data classifications, the method comprising the steps of: generating a first matrix representing at least associations between data objects listed in a first dimension and data classifications listed in a second dimension; generating a first table representing the membership of data classifications that are selectable from the plurality of data classifications; generating a second table representing the membership of data objects that are associated with the selectable data classifications; updating the second table by means of Boolean operands in response to a selection of one or more selectable data classifications; and outputting one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
  • the present invention provides data retrieval software for locating one or more data objects from a plurality of data objects based on a user selection of one or more data classifications from a plurality of data classifications, the software comprising: a database comprising a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; and a set of instructions for performing the following functions: updating the second table by means of Boolean operands in response to a user selection of one or more selectable data classifications; and outputting one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
  • the present invention provides data retrieval software suitable for use on a remote server for locating one or more data objects from a plurality of data objects based on a user selection made from a remote terminal of one or more data classifications from a plurality of data classifications, the software comprising: a database comprising a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; and a set of instructions for performing the following functions: updating the second table by means of Boolean operands in response to a user selection received via an input / output interface from a remote terminal of one or more selectable data classifications; and outputting via the input / output interface to the remote terminal one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
  • the present invention provides a data retrieval system for locating one or more data objects from a plurality of data objects based on a user selection of one or more data classifications from a plurality of data classifications, the system comprising: a memory containing a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; an input device for receiving a user selection of one or more selectable data classifications; a processor for changing the contents of the second table by means of Boolean operands in response to the user selection received by the input device; and an output for outputting one or more data objects identified in the changed second table as associated with selected data classifications or data classifications which remain selectable.
  • the present invention provides a computer-readable medium embodying a database and instructions for execution by a processor for locating one or more data objects from a plurality of data objects based on a user selection of one or more data classifications from a plurality of data classifications, the computer-readable medium comprising: a database comprising a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; and a set of instructions for performing the following functions: updating the second table by means of Boolean operands in response to a user selection of one or more selectable data classifications; and outputting one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
  • the various aspects of the present invention ensure that the selection of one or more selectable data classifications always results in one or more data objects being output.
  • the present invention provides a data retrieval method and system for a plurality of data objects, wherein each data object is associated with one or more data classifications, each one of the data classifications being associated with at least one other data classification.
  • a data object may be directly associated with a data classification or it may be indirectly associated with a data classification through direct association with a data classification which is itself associated with the data classification.
  • Data retrieval progresses in a series of steps in which one or more data classifications being associated with one or more data objects are selected to generate a sub-set of data objects and in which associations between data classifications and data objects outside of the generated sub-set are ignored in subsequent steps.
  • the associations between the data objects and the data classifications are recorded as a matrix of binary values for ease of data extraction.
  • the associations between data classifications may be represented to the user as one or more hierarchies.
  • the data retrieval method and system of the present invention ensure that, in searching the available data, a user is never in a position of performing a search that returns no results.
  • the present invention ensures that the user is always presented only with selectable options for further pruning of the search results for which there are positive, i.e. non-zero, results.
  • the problems associated with conventional browsing tools are mitigated by permitting many-to-many relationships between data classifications, which makes it possible for data objects to be classified under more than one data classification.
  • the rationale behind this is that users will not necessarily be able to find a data object if it is only 'filed' in one location in a hierarchy.
  • as data objects are organised and classified (and hence subsequently retrieved) on the basis of multiple characteristics or contents, it is easier to locate data objects and conduct complex searches using multiple search criteria (e.g. a user might search for data objects that are formal technical reports, about the subjects X and Y, written after a specified date).
  • Such a complex search only requires the selection of multiple branches in the hierarchy.
  • the intricacies underlying the implementation of such a search remain invisible to the users, which is not the case in conventional keyword searching where users are required to formulate the necessary complex search queries.
  • reference herein to an information entity is intended as reference to a data object such as, but not limited to, a document, a file or a database record.
  • reference herein to a concept or concept hierarchy is intended as reference to a data classification and data classification structure, respectively, and which is elsewhere referred to as categories.
  • the term concept hierarchy, or data classification structure is also intended to include any form of linked network of concepts, or data classifications, e.g. a directed acyclic graph.
  • each information entity is associated with one or more concepts.
  • a concept is a category used for the purposes of categorising the information entities, and is preferably represented to the user by a descriptive string.
  • the information entity 'The Hilton' might be associated with the concepts 'London' and 'Hotel'.
  • the concepts are organised into concept hierarchies, wherein each concept has zero, one or many ancestor concepts and zero, one or many descendant concepts.
  • the concept 'London' may have parent concepts 'Capital Cities' and 'England', and child concepts 'Central', 'North', 'South', 'East' and 'West'.
  • the parent concept 'England' may also have its own parent concept, e.g. 'Europe', which is also an ancestor concept of 'London'.
  • the child concept 'Central' may also have its own child concepts, e.g. 'West End' and 'The City', which are also descendant concepts of 'London'.
  • the concepts are organised into parent-child hierarchies representing taxonomies of concepts in particular subject domains.
  • the concepts may be organised into one or more concept hierarchies, and each information entity may be associated with one or more concepts in the same or different concept hierarchies.
  • a plurality of selectable concepts, in concept hierarchies relevant to a given set of information entities, are displayed to a user and the user selects those concepts of interest.
  • Each concept is either associated with at least one information entity or at least one of its descendant concepts is associated with at least one information entity.
  • the user is selecting a set of one or more information entities associated either with the selected concepts or with their descendant concepts.
  • each information entity may be associated with one or more concepts.
  • Each concept may be associated with zero, one or more information entities.
  • a concept associated with zero information entities must have at least one descendant concept associated with at least one information entity in order for the concept to be selectable by the user.
  • the user is selecting the set of information entities that is the union of the sets of information entities associated with each selected concept. This union shall be referred to herein as a filtered set of information entities, and this set will always be smaller than or equal to the initial set of information entities, but will never be null.
  • a second set of concepts may be generated from the filtered set of information entities, by computing the union of all of the concepts associated with the filtered set of information entities which shall be referred to herein as a filtered set of concepts.
  • This filtered set is then used to generate a display for the user of the concepts that may be selected by the user to refine the search. Search refinement is achieved by selecting from the filtered set of concepts one or more further concepts. In so doing, the user is selecting the set of information entities that is the intersection of the set of information entities associated with all of the newly selected concepts and the set of information entities associated with previously selected concepts. This is a new filtered set of information entities, and this set will again always be smaller than or equal to the initial filtered set of information entities, but will never be null.
  • a new filtered set of concepts may again be generated from this filtered set of information entities by computing the union of all of the concepts associated with the new filtered set of information entities.
  • the display of selectable concepts is then updated to show this new filtered set of concepts, and the user may then select one or more further concepts to further refine the search. Clearly, this method is repeated as required.
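
The selection and refinement steps described above can be sketched with ordinary sets. This is a minimal illustration, not the patented implementation: the sample data and the names `ASSOCIATIONS`, `refine` and `filtered_concepts` are hypothetical, descendant concepts are ignored for brevity, and it assumes that several concepts chosen in the same step are combined by union before being intersected with the current filtered set.

```python
# Hypothetical sample data: information entity -> concepts it is associated with.
ASSOCIATIONS = {
    "The Hilton": {"London", "Hotel"},
    "The Ritz": {"London", "Hotel"},
    "Le Meurice": {"Paris", "Hotel"},
    "Tate Modern": {"London", "Gallery"},
}


def refine(filtered_entities, selected_concepts):
    """Return the new filtered set of information entities.

    Entities matching any of the newly selected concepts are collected
    (union), but only from the current filtered set (intersection).
    """
    matched = set()
    for concept in selected_concepts:
        matched |= {e for e in filtered_entities if concept in ASSOCIATIONS[e]}
    return matched


def filtered_concepts(filtered_entities):
    """Union of all concepts associated with the filtered entities."""
    concepts = set()
    for entity in filtered_entities:
        concepts |= ASSOCIATIONS[entity]
    return concepts


all_entities = set(ASSOCIATIONS)
step1 = refine(all_entities, {"London"})  # first selection
offered = filtered_concepts(step1)        # concepts still selectable
step2 = refine(step1, {"Hotel"})          # refinement
```

Because `offered` is derived from the entities that survive each step, any concept shown to the user matches at least one remaining entity, which is why a refinement cannot produce an empty result.
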
  • the effect, from the point of view of the user, is that he or she is selecting concepts of interest from a hierarchical display of concepts.
  • the user is assisted in the selection by having displayed to him or her with each concept the number of information entities associated with the concept (in "Match exact concept" mode) or the number of information entities associated with the concept and all of its descendant concepts (in "Match concept and descendants" mode).
  • the current filtered information entity list is also displayed. The entire list may be displayed or the list may be displayed in segments, the size of which may be adjusted according to a user preference.
  • the hierarchical display is pruned, the number of information entities associated with each concept ("Match exact concept" mode) and its descendant concepts ("Match concept and descendants" mode) is updated, and the filtered information entity list is updated to reflect the user's selection. In this way the user is guided by the graphical display in the permissible selections that remain, and the possibility of ending up with no search results is prevented.
  • the data retrieval system comprises a database which includes a concept-information entity matrix, a used concept row, a used entity column, a match exact array, a match descendants array, a ConceptRep table and an EntityRep table.
  • the concept-information entity matrix has at least two-dimensions and stores at least data representing the associations between the information entities and the concepts.
  • Each row in the matrix represents an information entity and each column represents a concept.
  • Each element of the matrix is preferably represented as a bit and is set to 1 if the corresponding information entity and concept are associated; otherwise, the bit is set to 0.
  • the used concept row is a table representing the concepts which are members of the filtered set of concepts, i.e. those concepts which are selectable.
  • the table is a one-dimensional array having bit elements corresponding to each concept.
  • if an element of the used concept row is set to 1 then the corresponding concept belongs to, or is a member of, the filtered set of concepts.
  • initially, all concepts of the concept hierarchies are members of the filtered set of concepts.
  • where the used concept row is a one-dimensional binary array, all elements of the array are initially set to 1.
  • the used entity column is a table representing the information entities which are members of the filtered set of information entities, i.e. those information entities which are associated with the selected concepts.
  • the table is a one-dimensional array having bit elements corresponding to each information entity.
  • if an element of the used entity column is set to 1 then the corresponding information entity belongs to, or is a member of, the filtered set of information entities.
  • initially, all information entities are members of the filtered set of information entities.
  • where the used entity column is a one-dimensional binary array,
  • all elements of the array are initially set to 1.
  • although the used concept row and used entity column are stored as separate one-dimensional arrays, it will be appreciated that either or both of the used concept row and used entity column could form part of the concept-information entity matrix.
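
As a rough sketch of these structures, the matrix, the used concept row and the used entity column can be held as lists of 0/1 values and updated with bitwise AND and OR. The sample entities and the helper names `build_matrix` and `select_concept` are hypothetical, and a production system would more likely pack the bits than use Python lists.

```python
entities = ["The Hilton", "Le Meurice", "Tate Modern"]
concepts = ["London", "Paris", "Hotel", "Gallery"]
# Hypothetical (entity, concept) associations.
pairs = [
    ("The Hilton", "London"), ("The Hilton", "Hotel"),
    ("Le Meurice", "Paris"), ("Le Meurice", "Hotel"),
    ("Tate Modern", "London"), ("Tate Modern", "Gallery"),
]


def build_matrix(entities, concepts, pairs):
    """One row per information entity, one column per concept, 1 = associated."""
    row = {name: i for i, name in enumerate(entities)}
    col = {name: j for j, name in enumerate(concepts)}
    matrix = [[0] * len(concepts) for _ in entities]
    for entity, concept in pairs:
        matrix[row[entity]][col[concept]] = 1
    return matrix


def select_concept(matrix, used_entity_column, concept_index):
    """Update both membership arrays after one concept is selected."""
    # AND each entity's membership bit with its association bit.
    new_entities = [used & r[concept_index]
                    for used, r in zip(used_entity_column, matrix)]
    # OR together the rows of the entities that remain.
    new_concepts = [0] * len(matrix[0])
    for used, r in zip(new_entities, matrix):
        if used:
            new_concepts = [a | b for a, b in zip(new_concepts, r)]
    return new_entities, new_concepts


matrix = build_matrix(entities, concepts, pairs)
used_concept_row = [1] * len(concepts)    # every concept selectable at the start
used_entity_column = [1] * len(entities)  # every entity in the filtered set
used_entity_column, used_concept_row = select_concept(
    matrix, used_entity_column, concepts.index("London"))
```
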
  • Both the match exact and match descendants arrays are one-dimensional arrays having integer elements corresponding to each concept.
  • Each element of the match exact array stores the number of information entities which are associated with that particular concept.
  • Each element of the match descendants array stores the number of information entities which are associated with the particular concept or with the descendant concepts of the particular concept.
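
The two count arrays can be illustrated as follows. This sketch assumes that an entity reachable through more than one path in the hierarchy is counted once, which the text does not state explicitly; the hierarchy, the entities and the names `CHILDREN`, `DIRECT` and `entities_under` are hypothetical.

```python
# Hypothetical hierarchy: concept -> direct child concepts (acyclic).
CHILDREN = {
    "Europe": ["England"],
    "Capital Cities": ["London"],
    "England": ["London"],
    "London": [],
}
# Hypothetical direct associations: concept -> information entities.
DIRECT = {
    "Europe": set(),
    "Capital Cities": set(),
    "England": {"Stonehenge"},
    "London": {"The Hilton", "Tate Modern"},
}


def entities_under(concept):
    """Entities associated with the concept or any of its descendants."""
    found = set(DIRECT[concept])
    for child in CHILDREN[concept]:
        found |= entities_under(child)
    return found


# "Match exact concept" counts: direct associations only.
match_exact = {c: len(DIRECT[c]) for c in DIRECT}
# "Match concept and descendants" counts: distinct entities in the subtree.
match_descendants = {c: len(entities_under(c)) for c in DIRECT}
```
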
  • the ConceptRep table is a table of objects, such as a one-dimensional array of objects, wherein each object of the table stores information concerning a concept of the concept hierarchies.
  • Each object has at least three elements: Name, Parents and Children.
  • the first element, ConceptRep.Name, is a string and stores the name of the concept as displayed to the user in the hierarchical display.
  • ConceptRep.Parents and ConceptRep.Children respectively store which concepts are parent concepts and child concepts of the particular concept represented by the object. Parent and child concepts respectively refer only to the direct ancestor and descendant concepts of the particular concept. In order to determine all the descendant concepts of a particular concept, the child concepts are first obtained from the ConceptRep.Children element. Children of the child concepts are then obtained from their corresponding ConceptRep.Children elements, and so on.
  • the ConceptRep.Parents and ConceptRep.Children elements each comprise a list of references to the ConceptRep objects of the parent and child concepts respectively. This list may comprise a one-dimensional array of bits corresponding to the total number of concepts. Each bit of the ConceptRep.Parents array and the ConceptRep.Children array would then be set to 1 if the corresponding concept is respectively a parent or child concept of the particular concept. However, as the total number of concepts may be large, storing details of the parent and child concepts in this manner for each concept can be inefficient. An alternative method is to store the list of references as a linked list.
  • the ConceptRep.Parents and ConceptRep.Children elements would then comprise a series of linked components. Each component would comprise two pointers.
  • the first pointer would point to a ConceptRep object corresponding to a parent or child concept. If a concept has no parent or child concepts then the ConceptRep.Parents.ConceptPointer or ConceptRep.Children.ConceptPointer is set to be null. The second pointer in each linked component, called the NextComponentPointer, would point to the next component in the ConceptRep.Parents or ConceptRep.Children list. If the concept has no further parent or child concepts then the
  • the EntityRep table is a table of objects, such as a one-dimensional array of objects, wherein each object of the table stores information concerning an information entity. Each object might store a number of details relating to the information entity and is not limited strictly to a string of descriptive text. For example, if the information entity is a hotel, then the EntityRep object might store the address and telephone number of the hotel, the URL link to the hotel's own website, and links to documents which review the hotel and its facilities.
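
The ConceptRep and EntityRep objects, and the walk through the Children references that yields all descendants, might look roughly like this. It is a sketch under stated assumptions: Python lists stand in for the linked list of pointer components, and the field names, the `link` and `descendants` helpers and the example URL are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity; objects reference each other
class ConceptRep:
    name: str
    parents: list = field(default_factory=list, repr=False)
    children: list = field(default_factory=list, repr=False)


@dataclass
class EntityRep:
    name: str
    details: dict = field(default_factory=dict)  # e.g. address, URL, reviews


def link(parent, child):
    """Record a direct parent-child relationship in both objects."""
    parent.children.append(child)
    child.parents.append(parent)


def descendants(concept):
    """Collect children, then children of children, and so on."""
    found, seen, stack = [], set(), list(concept.children)
    while stack:
        node = stack.pop()
        if id(node) in seen:  # a concept may be reachable by several paths
            continue
        seen.add(id(node))
        found.append(node)
        stack.extend(node.children)
    return found


england, london, central = ConceptRep("England"), ConceptRep("London"), ConceptRep("Central")
west_end, the_city = ConceptRep("West End"), ConceptRep("The City")
link(england, london)
link(london, central)
link(central, west_end)
link(central, the_city)

# Hypothetical entity record; the URL is a placeholder.
hilton = EntityRep("The Hilton", {"url": "https://example.com/the-hilton"})

london_descendants = {c.name for c in descendants(london)}
england_descendants = {c.name for c in descendants(england)}
```
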
  • the elements of each ConceptRep object are stored together.
  • each type of element may be stored separately.
  • the names of all the concepts as displayed to the user, i.e. the ConceptRep.Name elements, may be stored separately as a one-dimensional array of strings.
  • the contents of the match exact array, match descendant array and used concept row may also be stored as elements of the ConceptRep objects.
  • the contents of the used entity column may be included as an element of the EntityRep objects.
  • Each concept hierarchy has a root, or top, concept.
  • the identity of each root concept is normally stored in a table, such as an array or linked list.
  • the ConceptRep.Name of each root concept is then displayed at the top of the hierarchical display.
  • the child concepts of each root concept are then determined from the ConceptRep.Children element corresponding to each root concept.
  • the ConceptRep.Name of each of the child concepts is then displayed as a child of the root concept. Child concepts of these child concepts are then determined and so on until the complete hierarchical tree is displayed to the user.
  • the root concepts generally, though not necessarily, have no parent concepts. Rather than storing the identity of the root concepts in a separate table, the root concepts may alternatively be determined by examining the ConceptRep.Parents element of each object in the ConceptRep table.
  • The root concepts are then those concepts in which the ConceptRep.Parents.ConceptPointer is null. The determination of root concepts in this manner may be facilitated by storing the root concepts at the top of the ConceptRep table such that the search for root concepts is ended once a ConceptRep object having a non-null ConceptRep.Parents.ConceptPointer is read.
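
The early-terminating search for root concepts can be sketched as below, assuming the root concepts have been stored at the top of the table. The table layout (a list of dictionaries) and the name `find_roots` are hypothetical; an empty parents list plays the role of a null ConceptPointer.

```python
# Hypothetical ConceptRep table with the root concepts stored first.
concept_table = [
    {"name": "Europe", "parents": []},
    {"name": "Capital Cities", "parents": []},
    {"name": "England", "parents": ["Europe"]},
    {"name": "London", "parents": ["England", "Capital Cities"]},
]


def find_roots(table):
    """Return root concept names, stopping at the first concept with a parent."""
    roots = []
    for concept in table:
        if concept["parents"]:
            break  # roots are stored first, so no further roots can follow
        roots.append(concept["name"])
    return roots


roots = find_roots(concept_table)
```
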
  • the concept hierarchies may be displayed in an expand-collapse mode.
  • each concept having descendant concepts will have an option to expand or collapse the parent-child hierarchical branch associated with that particular concept.
  • When a particular concept is expanded, only the child concepts are presented to the user. The child concepts must then themselves be expanded before the complete hierarchical branch is displayed.
  • no descendants of the concept are shown.
  • the EntityRep objects are used to display the details of the information entities to the user. Only the details of the information entities which are associated with the currently selected concepts are displayed. If the used entity column bit for a particular information entity is set to 1, then the details of the information entity are displayed according to the information stored in the EntityRep object.
  • the element of the used concept row corresponding to the visited concept is assigned a value of 0. If the element of the match exact array corresponding to the visited concept is 0 and the concept does have one or more children with a value of 1 stored in the corresponding element of the used concept row, i.e. children belonging to the filtered set of concepts, and after all such children of the concept have been visited the elements of the used concept row corresponding to those children all have values of 0 then the element of the used concept row corresponding to the visited concept is assigned a value of 0.
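
The pruning rule for the used concept row can be sketched as a depth-first visit. This is an interpretation rather than the source's own procedure: it assumes that a concept with a match exact count of 0 and no children in the filtered set is also cleared, since the condition for that case is cut off in the text. The hierarchy, the counts and the name `prune` are hypothetical.

```python
def prune(concept, children, match_exact, used):
    """Visit a concept and clear its used bit if nothing beneath it matches.

    Returns the concept's bit in the used concept row after the visit.
    """
    if not used[concept]:
        return 0
    any_child_kept = 0
    for child in children[concept]:
        if used[child]:
            # Every child in the filtered set is visited before deciding.
            any_child_kept |= prune(child, children, match_exact, used)
    if match_exact[concept] == 0 and not any_child_kept:
        used[concept] = 0
    return used[concept]


# Hypothetical hierarchy and match exact counts for the current filtered set.
children = {
    "Europe": ["England", "France"],
    "England": ["London"],
    "France": ["Paris"],
    "London": [],
    "Paris": [],
}
match_exact = {"Europe": 0, "England": 0, "France": 0, "London": 2, "Paris": 0}
used = {name: 1 for name in children}

prune("Europe", children, match_exact, used)
```

Here 'France' and 'Paris' drop out of the display because no remaining entity is associated with them, while 'Europe' and 'England' stay selectable through 'London'.
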
  • ConceptRep.Children elements. If a child concept is identified in the used concept row as a member of the filtered set of concepts, i.e. if the corresponding bit is 1, then the ConceptRep.Name of that child concept is displayed as a child of the root concept. If the child concept is not identified in the used concept row as a member of the filtered set of concepts then the concept is not included in the hierarchical display. For those child concepts which are identified in the used concept row, their child concepts are established and the process continues until a complete hierarchical tree of concepts belonging to the used concept row is displayed. Alongside each ConceptRep.Name, the number stored in the match exact or match descendants array for that particular concept is displayed depending on whether data retrieval is operating in "Match exact concept" or "Match concept and descendants" mode; and
  • one or more concepts may be deselected causing the filtered set of information entities to be updated.
  • the selected concepts may be stored as a set of selected concepts in a table, such as an array or linked list.
  • the used concept row is reset such that all concepts become members of the filtered set of concepts.
  • where the used concept row is a one-dimensional binary array, all elements of the array are set to 1.
  • the information entities may be listed according to the number of concepts that are associated with each of the listed information entities.
  • the information entities and concepts may be arranged within the matrix according to a weighting scheme.
  • An example weighting scheme could include ordering the information entities within the matrix in order of importance.
  • each element of the matrix may store weighted values according to the strength of association between the information entities and the concepts.
  • the information entities may then be listed according to the strength of association with the selected concepts.
  • the weighting of associations between information entities and concepts may be stored in a third dimension of the matrix, or in a separate matrix.
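
One way to list entities by strength of association is to score each entity in the filtered set against the selected concepts. The source does not say how strengths are combined, so summing them is an assumption here, as are the sample weights and the names `WEIGHTS` and `rank`.

```python
# Hypothetical weighted associations: entity -> {concept: strength}.
WEIGHTS = {
    "The Hilton": {"London": 0.9, "Hotel": 1.0},
    "The Ritz": {"London": 0.5, "Hotel": 0.8},
    "Tate Modern": {"London": 1.0, "Gallery": 1.0},
}


def rank(filtered_entities, selected_concepts):
    """Order entities by their summed strength over the selected concepts."""
    def score(entity):
        return sum(WEIGHTS[entity].get(c, 0.0) for c in selected_concepts)
    return sorted(filtered_entities, key=score, reverse=True)


ranked = rank({"The Ritz", "The Hilton"}, ["London", "Hotel"])
```
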
  • the method may be implemented by means of computer software which may be run on a computer system wherein the software includes a database comprising at least the concept-information entity matrix, the used concept row and the used entity column, and a series of instructions for performing the data retrieval method which can be stored in memory of the computer system.
  • the software includes a database for storing the ConceptRep and EntityRep objects.
  • the software may be implemented on a remote server in which case the software includes instructions for interfacing with a remote terminal such that a user selection made from the remote terminal can be received by the server and the list of information entities generated by the data retrieval method can be outputted to the remote terminal.
  • the data retrieval system may be a computer system such as a conventional personal computer configured to perform the data retrieval method as described.
  • the data retrieval system may be implemented for electronic data exchange in which one or both of the database and the set of instructions is stored on a remote server that is capable of communication with a separate terminal.
  • the described method of constructing the matrix and arrays of the data retrieval system may be limited in the number of information entities that may be indexed and in the number of concepts that may be used by, firstly, limitations in the size of array that may be constructed and manipulated by the available computing equipment, and, secondly, by declining performance of the method as the size of the array increases. Performance limitations may be overcome by caching the results of searches using concepts below the top level of the concept hierarchy, and by pre-computing the results of these searches using computer idle time where possible.
  • the data retrieval method described above may be implemented in a system using an object database, and with index creation and manipulation and user interface implemented as a Java application. It is envisaged that a typical application will implement the database on a computing server. Retrieval of data is preferably carried out as an internet or intranet application with the creation and manipulation of the matrix and arrays also taking place on a remote server. User presentation of results will take place as a client application communicating with the remote server using conventional internet protocols.
  • the data retrieval method and system described above are particularly suited for use in client-server e-commerce or business applications where users and customers are required to make enquiries of a database via a Web-enabled client (e.g. a standard Web browser) using multiple 'search' criteria.
  • Examples include holiday booking sites (discussed in greater detail below), a second-hand automobile locating site, an on-line bookshop or any product locating site.
  • the functionality is preferably provided on the server side.
  • a Web-based holiday booking site on which users are required to make multiple selections from hierarchically organised menus to search a database of available holidays (e.g. by selecting the departure / arrival destination, country, resort, price, date, etc.).
  • a user of a Web-based holiday booking site may wish to book a holiday in country V, and in resort W, hotel X on date Y and for a price-range Z.
  • the user is asked to select a particular destination country followed by a resort, hotel and date.
  • a conventional product / service search engine it is possible (provided that a suitable database structure has been adopted and the Web page designed adequately) to link the options that are presented to users in the resort selection process according to the selected country since only a restricted number of resorts are applicable for a given selected country (i.e. allowing users to select resorts that do not exist in a particular country would not be desirable).
  • the data retrieval method and system of the present invention presents selections and options to users on the basis of the contents of the database, as opposed to its underlying structure. Therefore in the given example, the data retrieval system would only provide users with options that are guaranteed to be successful. Thus if the user had selected a given country, resort and hotel, they would then only be presented with dates that would result in a successful query because only dates which were associated with entities (holidays) in the selected hotel would be offered in the filtered concept set. The user would not be able to select the date Y as this would represent an invalid selection.
  • One of the key advantages of the data retrieval method and system of the present invention is that the user is free to choose the selection criteria in any arbitrary sequence which is in contrast to conventional systems where users are required to follow a pre-defined selection sequence.
  • this option can be made and only hotels with vacancies for that date will be presented to the user.
  • the user would not be presented with the option of selecting the resort W as there would be no relevant relationship stored in the matrix and arrays.
  • the choice of the resort is the user's key selection criteria then the user would be able to use this as the initial selection criteria in the same manner.
  • the data retrieval tool of the present invention may be implemented as a medical diagnosis tool (or more generally as a fault finding and identification tool).
  • the hierarchical structure might contain multiple branches that cover concepts such as condition, symptom and treatment.
  • the user of such a tool is able to make any number of multiple selections from branches in the hierarchy.
  • the user is only presented with the results (i.e. the condition, symptom and treatment) that are relevant to those selections that have already been made. For example, if a given symptom is not applicable to a given condition then the user will not be presented with that condition since it does not represent a valid result.
  • This approach contrasts with more conventional expert systems that are widely used in such applications.
  • This approach therefore ensures that a user will always be presented with a result that matches their query.
  • users are guaranteed to be presented with one or more results in response to a query (i.e. the system does not allow the browsing of parts of a classification hierarchy that do not contain any documents).
  • the data retrieval tool also permits the formulation of 'complex' searches that are not possible using standard keyword or browsing tools. For example, suppose that a user is searching for information entities in a company document management system and wants to find out the names of the universities that are collaborating with their company in a particular research project. Using conventional search techniques the user would have to formulate complex search queries that would combine keywords that are related to both the research project and to universities in general. Using the data retrieval tool of the present invention, they need only select the concept representing the research project, and then note which children of the Universities concept remain in the filtered concept set, and hence visible in the concept hierarchy.
  • the data retrieval method and system described above may be implemented as a search tool component and built into any type of document or content management software application, or more generally any software that requires users to search or browse for information that has been pre-classified or organised into a hierarchical structure. In this mode of operation, it may form part of a stand-alone piece of software, or be provided as a separate system on the client or server side in a thin / thick client application. It will of course be immediately apparent that the data retrieval method and system described herein may be implemented using many different platforms and is applicable to a broad range of applications where interrelated information is stored and queried. As such, the data retrieval method and system is not limited to the particular example given above but may be altered without departing from the scope of the invention as claimed.

Abstract

A data retrieval method is disclosed for locating one or more data objects from a plurality of data objects based on a selection of one or more data classifications from a plurality of data classifications. The method comprises the step of generating a matrix representing associations between data objects and data classifications, generating a first table representing data classifications that are selectable, generating a second table representing data objects that are associated with the selectable data classifications; updating the second table by means of Boolean operands in response to a selection of the selectable data classifications, and outputting one or more data objects identified in the updated second table. Data retrieval software, a data retrieval system and a computer-readable medium embodying the described method are also disclosed.

Description

DATA RETRIEVAL METHOD AND SYSTEM
The present invention relates to a data retrieval method and a system for implementing the same. Particularly, but not exclusively the present invention is suited for use in the browsing and searching of data, otherwise referred to herein as information entities or data objects, in internet and intranet environments.
Conventional information search tools can be broadly considered to fall into one of two distinct categories:
• Free text (i.e. keyword) search tools (e.g. the 'Find files' feature in Microsoft™ Explorer and the core functionality provided by the Internet search engines Alta Vista™, Google™, etc.)
• Browsing tools that allow the exploration of information entities that are hierarchically categorised or filed (e.g. the 'Folder tree' in Microsoft™ Explorer and the search tool directories Yahoo!™, LookSmart™ and the Netscape™ Open Directory Project).
Each of these approaches to searching has a number of limitations. Keyword search tools in the form of query interfaces tend to be variants of standard text dialog boxes, possibly modified to allow the inclusion of keyword operators (e.g. AND, OR, NOT, etc.). Users of keyword search engines are continually faced with the problem of finding some compromise between:
• overly broad searches that return an overwhelming number of 'hits' that cannot possibly be properly assessed (advanced searches using Boolean logic can help to refine broad searches, however these are difficult and time consuming to prepare) and
• overly narrow searches that either fail to return any relevant hits, or miss out many of those that are actually relevant.
A further problem with some implementations of keyword search techniques is that results cannot easily be refined, and instead have to be carried out from scratch each time. This makes the process of successfully searching for information using such tools one of trial and error.
With hierarchical browsing search tools hierarchical categories can be browsed by users who can 'drill' down to find information entities of interest - analogous to browsing bookshelves in a library. This provides an intuitive way of searching for general information related to a topic. However, problems can be faced by those unfamiliar with the structure of the hierarchies. In addition, since documents often are only placed in a single location in a hierarchy, frustration can quickly result if users are continually required to 'drill' down into multiple categories that could plausibly contain information of interest (it is notoriously difficult to consistently classify documents into single positions in a hierarchy).
In US 5,924,090 a data retrieval tool is described which seeks to combine the advantages offered by keyword and hierarchy browsing strategies. This data retrieval tool is initiated by a keyword search which then organises the pages it finds into custom folders with the folders representing the nodes in a hierarchy. The folders are created 'on the fly', which means that for every search the hierarchy of categories drilled down through will reflect the initial information requirement input by the user, as represented by their choice of initial keyword search terms. The custom folders may be organised into sub-folders, and only folders that contain documents are displayed to the user. However, with the data retrieval tool described in US 5,924,090 there is no possibility of exploring multiple branches of the hierarchy created as a result of the initial keywords input by the user or of cross-referencing the results of searches in multiple hierarchy positions.
The present invention seeks to address the limitations encountered with such conventional approaches to information entity browsing and searching.
In a first aspect, the present invention provides a data retrieval method for locating one or more data objects from a plurality of data objects through the selection of one or more data classifications from a plurality of data classifications, the method comprising the steps of: generating a first matrix representing at least associations between data objects listed in a first dimension and data classifications listed in a second dimension; generating a first table representing the membership of data classifications that are selectable from the plurality of data classifications; generating a second table representing the membership of data objects that are associated with the selectable data classifications; updating the second table by means of Boolean operands in response to a selection of one or more selectable data classifications; and outputting one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
In a second aspect, the present invention provides data retrieval software for locating one or more data objects from a plurality of data objects based on a user selection of one or more data classifications from a plurality of data classifications, the software comprising: a database comprising a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; and a set of instructions for performing the following functions: updating the second table by means of Boolean operands in response to a user selection of one or more selectable data classifications; and outputting one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
In a third aspect, the present invention provides data retrieval software suitable for use on a remote server for locating one or more data objects from a plurality of data objects based on a user selection made from a remote terminal of one or more data classifications from a plurality of data classifications, the software comprising: a database comprising a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; and a set of instructions for performing the following functions: updating the second table by means of Boolean operands in response to a user selection received via an input / output interface from a remote terminal of one or more selectable data classifications; and outputting via the input / output interface to the remote terminal one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
In a fourth aspect, the present invention provides a data retrieval system for locating one or more data objects from a plurality of data objects based on a user selection of one or more data classifications from a plurality of data classifications, the system comprising: a memory containing a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; an input device for receiving a user selection of one or more selectable data classifications; a processor for changing the contents of the second table by means of Boolean operands in response to the user selection received by the input device; and an output for outputting one or more data objects identified in the changed second table as associated with selected data classifications or data classifications which remain selectable.
In a fifth aspect, the present invention provides a computer-readable medium embodying a database and instructions for execution by a processor for locating one or more data objects from a plurality of data objects based on a user selection of one or more data classifications from a plurality of data classifications, the computer-readable medium comprising: a database comprising a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; and a set of instructions for performing the following functions: updating the second table by means of Boolean operands in response to a user selection of one or more selectable data classifications; and outputting one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
The various aspects of the present invention ensure that the selection of one or more selectable data classifications always results in one or more data objects being output.
The present invention provides a data retrieval method and system for a plurality of data objects, wherein each data object is associated with one or more data classifications, each one of the data classifications being associated with at least one other data classification. A data object may be directly associated with a data classification or it may be indirectly associated with a data classification through direct association with a data classification which is itself associated with the data classification. Data retrieval progresses in a series of steps in which one or more data classifications being associated with one or more data objects are selected to generate a sub-set of data objects and in which associations between data classifications and data objects outside of the generated sub-set are ignored in subsequent steps.
Ideally, the associations between the data objects and the data classifications are recorded as a matrix of binary values for ease of data extraction. Moreover, the associations between data classifications may be represented to the user as one or more hierarchies.
Thus, the data retrieval method and system of the present invention ensures that in searching the available data a user is never in a position of performing a search that returns no results. The present invention ensures that the user is always presented only with selectable options for further pruning of the search results for which there are positive, i.e. non-zero, results.
With the present invention the problems associated with conventional browsing tools are mitigated through permitting many:many relationships between data classifications which makes it possible for data objects to be classified under more than one data classification. The rationale behind this is that users will not necessarily be able to find a data object if it is only 'filed' in one location in a hierarchy. However, if data objects are organised and classified (and hence subsequently retrieved) on the basis of multiple characteristics or contents, it is easier to locate data objects and conduct complex searches using multiple search criteria (e.g. a user might search for data objects that are formal technical reports, about the subjects X and Y, written after a specified date). Such a complex search only requires the selection of multiple branches in the hierarchy. The intricacies underlying the implementation of such a search remain invisible to the users, which is not the case in conventional keyword searching where users are required to formulate the necessary complex search queries.
In the context of this document it is to be understood that reference herein to an information entity is intended as reference to a data object such as, but not limited to, a document, a file or a database record. Furthermore, reference herein to a concept or concept hierarchy is intended as reference to a data classification and data classification structure, respectively; concepts are elsewhere referred to as categories. The term concept hierarchy, or data classification structure, is also intended to include any form of linked network of concepts, or data classifications, e.g. a directed acyclic graph.
An embodiment of the present invention will now be described by way of example with reference to the accompanying Figure 1 which illustrates a matrix employed in the data retrieval method and system of the present invention.
With the data retrieval tool of the present invention, each information entity is associated with one or more concepts. A concept is a category used for the purposes of categorising the information entities, and is preferably represented to the user by a descriptive string. For example, the information entity 'The Hilton' might be associated with the concepts 'London' and 'Hotel'.
The concepts are organised into concept hierarchies, wherein each concept has zero, one or many ancestor concepts and zero, one or many descendant concepts. For example, the concept 'London' may have parent concepts 'Capital Cities' and 'England', and child concepts 'Central', 'North', 'South', 'East' and 'West'. The parent concept 'England' may also have its own parent concept, e.g. 'Europe', which is also an ancestor concept of 'London'. Similarly, the child concept 'Central' may also have its own child concepts, e.g. 'West End' and 'The City', which are also descendant concepts of 'London'. In this way, the concepts are organised into parent-child hierarchies representing taxonomies of concepts in particular subject domains. The concepts may be organised into one or more concept hierarchies, and each information entity may be associated with one or more concepts in the same or different concept hierarchies.
To retrieve data using the retrieval method of the present invention, a plurality of selectable concepts, in concept hierarchies relevant to a given set of information entities, are displayed to a user and the user selects those concepts of interest. Each concept is either associated with at least one information entity or at least one of its descendant concepts is associated with at least one information entity. By selecting one or more concepts from one or more concept hierarchies, the user is selecting a set of one or more information entities associated either with the selected concepts or with their descendant concepts. As mentioned above, each information entity may be associated with one or more concepts. Each concept may be associated with zero, one or more information entities. When a concept is not associated with any information entity then one of its descendant concepts must be associated with at least one information entity in order for the concept to be selectable by the user.
By selecting one or more concepts from the concept hierarchies representing the complete or starting set of concepts, the user is selecting the set of information entities that is the union of the sets of information entities associated with each selected concept. This union shall be referred to herein as a filtered set of information entities, and this set will always be smaller than or equal to the initial set of information entities, but will never be null. A second set of concepts may be generated from the filtered set of information entities, by computing the union of all of the concepts associated with the filtered set of information entities which shall be referred to herein as a filtered set of concepts. This filtered set is then used to generate a display for the user of the concepts that may be selected by the user to refine the search. Search refinement is achieved by selecting from the filtered set of concepts one or more further concepts. In so doing, the user is selecting the set of information entities that is the intersection of the set of information entities associated with all of the newly selected concepts and the set of information entities associated with previously selected concepts. This is a new filtered set of information entities, and this set will again always be smaller than or equal to the initial filtered set of information entities, but will never be null. A new filtered set of concepts may again be generated from this filtered set of information entities by computing the union of all of the concepts associated with the new filtered set of information entities. The display of selectable concepts is then updated to show this new filtered set of concepts, and the user may then select one or more further concepts to further refine the search. Clearly, this method is repeated as required.
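The selection and refinement procedure described above can be illustrated with ordinary set operations. The following is a minimal sketch only, not the implementation described in this document: the sample entities, the `ASSOCIATIONS` mapping and the function names `entities_for` and `select` are hypothetical, and descendant concepts are ignored for brevity.

```python
# Hypothetical sample data: each information entity maps to its concepts.
ASSOCIATIONS = {
    "The Hilton": {"London", "Hotel"},
    "The Ritz": {"London", "Hotel"},
    "Le Meurice": {"Paris", "Hotel"},
    "Tate Modern": {"London", "Museum"},
}


def entities_for(concept):
    """Return the set of entities associated with a single concept."""
    return {e for e, concepts in ASSOCIATIONS.items() if concept in concepts}


def select(selected, filtered_entities=None):
    """Apply one selection step; return (filtered entities, filtered concepts)."""
    # Union of the entity sets of every concept selected in this step.
    chosen = set().union(*(entities_for(c) for c in selected))
    # A refinement step intersects with the previously filtered entity set.
    if filtered_entities is not None:
        chosen &= filtered_entities
    # Filtered concepts: union of the concepts of the surviving entities.
    concepts = set().union(*(ASSOCIATIONS[e] for e in chosen))
    return chosen, concepts


# First selection, then a refinement drawn from the filtered concept set.
entities, concepts = select({"London"})
entities, concepts = select({"Hotel"}, entities)
```

Because each refinement is chosen from the filtered set of concepts, every concept on offer is associated with at least one surviving entity, which is why the filtered set of entities is never empty in the method described.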
The effect, from the point of view of the user, is that he or she is selecting concepts of interest from a hierarchical display of concepts. The user is assisted in the selection by having displayed to him or her with each concept the number of information entities associated with the concept (in "Match exact concept" mode) or the number of information entities associated with the concept and all of its descendant concepts (in "Match concept and descendants" mode). The current filtered information entity list is also displayed. The entire list may be displayed or the list may be displayed in segments, the size of which may be adjusted according to a user preference. As concepts are selected, the hierarchical display is pruned, the number of information entities associated with each concept ("Match exact concept" mode) and its descendant concepts ("Match concept and descendants" mode) is updated, and the filtered information entity list is updated to reflect the user's selection. In this way the user is guided by the graphical display in the permissible selections that remain, and the possibility of ending up with no search results is prevented.
As illustrated in Figure 1, the data retrieval system comprises a database which includes a concept-information entity matrix, a used concept row, a used entity column, a match exact array, a match descendants array, a ConceptRep table and an EntityRep table.
The concept-information entity matrix has at least two dimensions and stores at least data representing the associations between the information entities and the concepts. Each row in the matrix represents an information entity and each column represents a concept. Each element of the matrix is preferably represented as a bit and is set to 1 if the corresponding information entity and concept are associated; otherwise, the bit is set to 0.
The used concept row is a table representing the concepts which are members of the filtered set of concepts, i.e. those concepts which are selectable. Preferably, the table is a one-dimensional array having bit elements corresponding to each concept. Thus if an element of the used concept row is set to 1 then the corresponding concept belongs to, or is a member of, the filtered set of concepts. Initially, all concepts of the concept hierarchies are members of the filtered set of concepts. Thus in the embodiment in which the used concept row is a one-dimensional binary array, all elements of the array are set to 1.
The used entity column is a table representing the information entities which are members of the filtered set of information entities, i.e. those information entities which are associated with the selected concepts. Preferably, the table is a one-dimensional array having bit elements corresponding to each information entity. Thus if an element of the used entity column is set to 1 then the corresponding information entity belongs to, or is a member of, the filtered set of information entities. Initially, all information entities are members of the filtered set of information entities. Thus in the embodiment in which the used entity column is a one-dimensional binary array, all elements of the array are set to 1.
Whilst in a preferred embodiment, the used concept row and used entity column are stored as separate one-dimensional arrays, it will be appreciated that either or both of the used concept row and used entity column could form part of the concept-information entity matrix.
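The matrix, used concept row and used entity column described above can be sketched as plain lists of 0/1 values. This is an illustrative sketch under stated assumptions: the sample data and the names `ENTITIES`, `CONCEPTS`, `PAIRS` and `build_matrix` are hypothetical, and a real system would probably pack the bits more compactly.

```python
# Hypothetical sample data; the names are illustrative only.
ENTITIES = ["The Hilton", "Le Meurice", "Tate Modern"]
CONCEPTS = ["London", "Paris", "Hotel", "Museum"]
PAIRS = [
    ("The Hilton", "London"), ("The Hilton", "Hotel"),
    ("Le Meurice", "Paris"), ("Le Meurice", "Hotel"),
    ("Tate Modern", "London"), ("Tate Modern", "Museum"),
]


def build_matrix(entities, concepts, pairs):
    """Build a matrix with one row per entity and one column per concept."""
    row_of = {name: i for i, name in enumerate(entities)}
    col_of = {name: j for j, name in enumerate(concepts)}
    matrix = [[0] * len(concepts) for _ in entities]
    for entity, concept in pairs:
        # A 1 marks an association between this entity and this concept.
        matrix[row_of[entity]][col_of[concept]] = 1
    return matrix


matrix = build_matrix(ENTITIES, CONCEPTS, PAIRS)
# Initially every concept is selectable and every entity is in the filtered set.
used_concept_row = [1] * len(CONCEPTS)
used_entity_column = [1] * len(ENTITIES)
```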
Both the match exact and match descendants arrays are one-dimensional arrays having integer elements corresponding to each concept. Each element of the match exact array stores the number of information entities which are associated with that particular concept. Each element of the match descendants array stores the number of information entities which are associated with the particular concept or with the descendant concepts of the particular concept.
The ConceptRep table is a table of objects, such as a one-dimensional array of objects, wherein each object of the table stores information concerning a concept of the concept hierarchies. Each object has at least three elements: Name, Parents and Children. The first element, ConceptRep.Name, is a string and stores the name of the concept as displayed to the user in the hierarchical display. The second and third elements, ConceptRep.Parents and ConceptRep.Children, respectively store which concepts are parent concepts and child concepts of the particular concept represented by the object. Parent and child concepts respectively refer only to the direct ancestor and descendant concepts of the particular concept. In order to determine all the descendant concepts of a particular concept, the child concepts are first obtained from the ConceptRep.Children. Children of the child concepts are then obtained from their corresponding ConceptRep.Children, and so on.
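The two count arrays, together with the child-by-child descent through ConceptRep.Children described above, can be sketched as follows. The sample matrix, the `CHILDREN` mapping and the function names are hypothetical. The sketch also assumes that an entity associated with both a concept and one of its descendants is counted once in the match descendants array, which the text does not state explicitly.

```python
# Hypothetical concepts by column: 0 England, 1 London, 2 Central, 3 North.
MATRIX = [
    [0, 0, 1, 0],  # entity 0: Central
    [0, 1, 0, 0],  # entity 1: London
    [0, 0, 0, 1],  # entity 2: North
    [0, 1, 1, 0],  # entity 3: London and Central
]
# Direct children only, indexed by concept column.
CHILDREN = {0: [1], 1: [2, 3], 2: [], 3: []}


def descendants(concept, children):
    """Collect every descendant by repeatedly following the child lists."""
    found, seen = [], set()
    stack = list(children[concept])
    while stack:
        current = stack.pop()
        if current not in seen:  # a concept may be reachable via two parents
            seen.add(current)
            found.append(current)
            stack.extend(children[current])
    return found


def match_exact(matrix):
    """Count, per concept, the entities associated with that exact concept."""
    return [sum(column) for column in zip(*matrix)]


def match_descendants(matrix, children):
    """Count, per concept, the entities associated with it or a descendant."""
    counts = []
    for concept in range(len(matrix[0])):
        columns = [concept] + descendants(concept, children)
        counts.append(sum(1 for row in matrix if any(row[c] for c in columns)))
    return counts
```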
The ConceptRep.Parents and ConceptRep.Children elements each comprise a list of references to the ConceptRep objects of the parent and child concepts respectively. This list may comprise a one-dimensional array of bits corresponding to the total number of concepts. Each bit of the ConceptRep.Parents array and the ConceptRep.Children array would then be set to 1 if the corresponding concept is respectively a parent or child concept of the particular concept. However, as the total number of concepts may be large, storing details of the parent and child concepts in this manner for each concept can be inefficient. An alternative method is to store the list of references as a linked list. The ConceptRep.Parents and ConceptRep.Children elements would then comprise a series of linked components. Each component would comprise two pointers. The first pointer, called the ConceptPointer, would point to a ConceptRep object corresponding to a parent or child concept. If a concept has no parent or child concepts then the ConceptRep.Parents.ConceptPointer or ConceptRep.Children.ConceptPointer is set to be null. The second pointer in each linked component, called the NextComponentPointer, would point to the next component in the ConceptRep.Parents or ConceptRep.Children list. If the concept has no further parent or child concepts then the NextComponentPointer is set to be null. The parent or child concepts of a particular concept are therefore obtained by starting at the first component of the ConceptRep.Parents or ConceptRep.Children list and moving along the list via the NextComponentPointer until a null is read.
The EntityRep table is a table of objects, such as a one-dimensional array of objects, wherein each object of the table stores information concerning an information entity. Each object might store a number of details relating to the information entity and is not limited strictly to a string of descriptive text. For example, if the information entity is a hotel, then the EntityRep object might store the address and telephone number of the hotel, the URL link to the hotel's own website, and links to documents which review the hotel and its facilities.
In the embodiment described above, the individual elements of each ConceptRep object are stored together. However, it will be appreciated that each type of element may be stored separately. For example, the names of all the concepts as displayed to the user, i.e. the ConceptRep.Name elements, may be stored separately as a one-dimensional array of strings. It will also be appreciated that the contents of the match exact array, match descendant array and used concept row may also be stored as elements of the ConceptRep objects. Likewise, the contents of the used entity column may be included as an element of the EntityRep objects.
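The linked-list form of ConceptRep.Parents and ConceptRep.Children described above can be sketched as follows. The class and function names (`Component`, `append`, `link`, `walk`) are hypothetical, and Python's `None` stands in for the null pointer; this is one possible reading of the description rather than a definitive implementation.

```python
class Component:
    """One link in a Parents or Children list."""

    def __init__(self, concept_pointer=None, next_component_pointer=None):
        self.concept_pointer = concept_pointer
        self.next_component_pointer = next_component_pointer


class ConceptRep:
    """Stores the name of a concept and its parent and child lists."""

    def __init__(self, name):
        self.name = name
        # An empty list is a single component whose ConceptPointer is None.
        self.parents = Component()
        self.children = Component()


def append(head, concept):
    """Add a concept to the end of the list that starts at head."""
    if head.concept_pointer is None:
        head.concept_pointer = concept
        return
    node = head
    while node.next_component_pointer is not None:
        node = node.next_component_pointer
    node.next_component_pointer = Component(concept)


def link(parent, child):
    """Record a parent-child relationship in both directions."""
    append(parent.children, child)
    append(child.parents, parent)


def walk(head):
    """Yield each concept in the list, stopping when a null is read."""
    node = head
    while node is not None and node.concept_pointer is not None:
        yield node.concept_pointer
        node = node.next_component_pointer


london = ConceptRep("London")
link(ConceptRep("Capital Cities"), london)
link(ConceptRep("England"), london)
parent_names = [concept.name for concept in walk(london.parents)]
```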
The information stored in the ConceptRep objects is used to build the hierarchical display presented to the user. Each concept hierarchy has a root, or top, concept. The identity of each root concept is normally stored in a table, such as an array or linked list. The ConceptRep.Name of each root concept is then displayed at the top of the hierarchical display. The child concepts of each root concept are then determined from the ConceptRep.Children element corresponding to each root concept. The ConceptRep.Name of each of the child concepts is then displayed as a child of the root concept. Child concepts of these child concepts are then determined and so on until the complete hierarchical tree is displayed to the user.
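The construction of the hierarchical display described above amounts to a recursive walk from each root concept. The sketch below is illustrative only: the `CHILDREN` mapping, the `ROOTS` list and the `render` function are hypothetical, and indentation stands in for whatever graphical tree the user interface provides.

```python
# Hypothetical hierarchy: concept name -> names of its direct children.
CHILDREN = {
    "Europe": ["England"],
    "England": ["London"],
    "Capital Cities": ["London"],
    "London": ["Central", "North"],
    "Central": [],
    "North": [],
}
ROOTS = ["Europe", "Capital Cities"]


def render(roots, children):
    """Return the display lines, indenting each level by two spaces."""
    lines = []

    def visit(name, depth):
        lines.append("  " * depth + name)
        for child in children[name]:
            visit(child, depth + 1)

    for root in roots:
        visit(root, 0)
    return lines


print("\n".join(render(ROOTS, CHILDREN)))
```

In this sketch a concept with more than one parent, such as 'London', appears once under each parent.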
The root concepts generally, though not necessarily, have no parent concepts. Rather than storing the identity of the root concepts in a separate table, the root concepts may alternatively be determined by examining the ConceptRep.Parents element of each object in the ConceptRep table. The root concepts are then those concepts in which the ConceptRep.Parents.ConceptPointer is null. The determination of root concepts in this manner may be facilitated by storing the root concepts at the top of the ConceptRep table such that the search for root concepts is ended once a ConceptRep object having a non-null ConceptRep.Parents.ConceptPointer is read.
Alongside each ConceptRep.Name, the number stored in the match exact or match descendants array for that particular concept is displayed; the number displayed depends upon which mode has been selected by the user, "Match exact concept" or "Match concept and descendants".
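The early-terminating search for root concepts described above can be sketched as a simple scan. The table layout and the name `find_roots` are hypothetical; an empty parent list stands in for a null ConceptRep.Parents.ConceptPointer, and the sketch assumes, as the text suggests, that the root concepts are stored at the top of the table.

```python
# Hypothetical ConceptRep table with the root concepts stored first.
CONCEPT_TABLE = [
    {"name": "Europe", "parents": []},
    {"name": "Capital Cities", "parents": []},
    {"name": "England", "parents": ["Europe"]},
    {"name": "London", "parents": ["Capital Cities", "England"]},
]


def find_roots(table):
    """Return the names of the leading entries that have no parents."""
    roots = []
    for entry in table:
        if entry["parents"]:
            # First concept with a parent: every root has already been read.
            break
        roots.append(entry["name"])
    return roots


roots = find_roots(CONCEPT_TABLE)
```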
As the hierarchical display may be large, the concept hierarchies may be displayed in an expand-collapse mode. In this mode of operation, each concept having descendant concepts will have an option to expand or collapse the parent-child hierarchical branch associated with that particular concept. When a particular concept is expanded, only the child concepts are presented to the user. The child concepts must then themselves be expanded before the complete hierarchical branch is displayed. When a concept is collapsed, no descendants of the concept are shown.
The EntityRep objects are used to display the details of the information entities to the user. Only the details of the information entities which are associated with the currently selected concepts are displayed. If the used entity column bit for a particular information entity is set to 1, then the details of the information entity are displayed according to the information stored in the EntityRep object.
When one or more concepts are simultaneously selected from the concept hierarchies, the following series of steps is performed:
(1) the contents of all columns in the concept-information entity matrix corresponding to the selected concepts and the descendant concepts of the selected concepts are OR'ed together to form a single column. The contents of this single column, which represents all the information entities which are associated with the selected concepts and their descendant concepts, is then AND'ed with the contents of the used entity column, with the result being stored as the new used entity column, i.e. for a single selected concept, used entity column[Y] = { matrix[selected concept][Y] OR matrix[first descendant][Y] OR matrix[second descendant][Y] OR ... matrix[Nth descendant][Y] } AND used entity column[Y];
(2) for each concept which is identified in the used concept row as a member of the filtered set of concepts, the contents of the column of the concept-information entity matrix corresponding to that concept is AND'ed with the contents of the used entity column. If all contents of the resultant column are equal to zero, the corresponding element of the match exact array is assigned a value of 0, otherwise it is temporarily assigned a value of 1;
(3) the concept hierarchy corresponding to concepts which are identified in the used concept row as members of the filtered set of concepts is recursively walked in a depth first traversal beginning at each root concept. For each concept visited, if the corresponding element of the match exact array is 0 and the concept has no children with a value of 1 stored in the corresponding element of the used concept row, i.e. no children belonging to the filtered set of concepts, then the element of the used concept row corresponding to the visited concept is assigned a value of 0. If the element of the match exact array corresponding to the visited concept is 0 and the concept does have one or more children with a value of 1 stored in the corresponding element of the used concept row, i.e. children belonging to the filtered set of concepts, and after all such children of the concept have been visited the elements of the used concept row corresponding to those children all have values of 0, then the element of the used concept row corresponding to the visited concept is assigned a value of 0;
(4) for each concept of the concept hierarchies, the contents of the column of the concept-information entity matrix corresponding to that concept is AND'ed with the contents of the used entity column. All elements of the resulting single column, which are either 0 or 1, are then summed and the number is stored in the element of the match exact array corresponding to that concept, i.e. match exact[X] = { matrix[X][1] AND used entity column[1] } + { matrix[X][2] AND used entity column[2] } + ... { matrix[X][N] AND used entity column[N] };
(5) for each concept of the concept hierarchies, the contents of all columns of the concept-information entity matrix corresponding to that concept and all of its descendant concepts are OR'ed together. All elements of the resulting single column, which are either 0 or 1, are then summed and the number is stored in the element of the match descendant array corresponding to that concept;
(6) the hierarchical display is updated. The ConceptRep.Name of each root concept is displayed at the top of the hierarchical display. The child concepts of each root concept are then determined from the ConceptRep.Children elements. If a child concept is identified in the used concept row as a member of the filtered set of concepts, i.e. if the corresponding bit is 1, then the ConceptRep.Name of that child concept is displayed as a child of the root concept. If the child concept is not identified in the used concept row as a member of the filtered set of concepts then the concept is not included in the hierarchical display. For those child concepts which are identified in the used concept row, their child concepts are established and the process continues until a complete hierarchical tree of concepts belonging to the used concept row is displayed. Alongside each ConceptRep.Name, the number stored in the match exact or match descendants array for that particular concept is displayed depending on whether data retrieval is operating in "Match exact concept" or "Match concept and descendants" mode; and
(7) details of the information entities identified in the used entity column as members of the filtered set of information entities are displayed using the information stored in the corresponding EntityRep objects, i.e. if the used entity column element for a particular information entity is 1 then details of the information entity are displayed using the information stored in the corresponding EntityRep object.
In the particular embodiment described above, the match exact and match descendant values are evaluated for all concepts (steps 4 and 5), regardless of whether or not the concept forms part of the hierarchical display. As an alternative, the match exact and match descendant values may be evaluated only for the concepts indicated in the used concept row. For concept hierarchies in which each concept is associated with at least one information entity, steps 2 and 3 described above may be replaced by a single step in which all the rows in the concept-information entity matrix corresponding to the information entities identified in the used entity column are OR'ed together and stored as the used concept row, i.e. used concept row[X] = matrix[X][Y] AND used entity column[Y], OR'ed over all information entities Y.
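Steps 1 to 5 of the selection procedure above can be sketched in Python as follows. This is an illustrative sketch only: the matrix is held as a list of 0/1 lists indexed as matrix[concept][entity], the hierarchy as a `children` mapping, and all function and variable names are hypothetical. One deliberate assumption is flagged in the code: in step 5 the OR'ed column is also AND'ed with the used entity column, by analogy with step 4, which the text of step 5 does not state.

```python
def descendants(children, concept):
    """Return the concept together with all of its descendant concepts."""
    found, stack = [], [concept]
    while stack:
        node = stack.pop()
        if node not in found:
            found.append(node)
            stack.extend(children[node])
    return found


def or_columns(matrix, concepts):
    """OR together the matrix columns of the given concepts."""
    return [int(any(matrix[c][y] for c in concepts)) for y in range(len(matrix[0]))]


def select(matrix, children, roots, used_concept, used_entity, selected):
    n_concepts = len(matrix)
    # Step 1: OR the columns of the selected concepts and their descendants,
    # then AND the result with the used entity column.
    group = [d for s in selected for d in descendants(children, s)]
    used_entity = [a & b for a, b in zip(or_columns(matrix, group), used_entity)]
    # Step 2: temporary 0/1 flag: does any remaining entity match the concept directly?
    direct = [int(used_concept[c] and any(m & u for m, u in zip(matrix[c], used_entity)))
              for c in range(n_concepts)]
    # Step 3: depth-first walk; drop concepts with no direct match and no surviving child.
    used_concept = list(used_concept)

    def walk(c):
        kept = [walk(k) for k in children[c] if used_concept[k]]
        if not direct[c] and not any(kept):
            used_concept[c] = 0
        return used_concept[c]

    for r in roots:
        if used_concept[r]:
            walk(r)
    # Step 4: count the remaining entities directly associated with each concept.
    match_exact = [sum(m & u for m, u in zip(matrix[c], used_entity))
                   for c in range(n_concepts)]
    # Step 5: count entities for each concept and its descendants.
    # ASSUMPTION: the AND with used_entity is added here; the source text omits it.
    match_desc = [sum(a & b for a, b in
                      zip(or_columns(matrix, descendants(children, c)), used_entity))
                  for c in range(n_concepts)]
    return used_concept, used_entity, match_exact, match_desc


# Hypothetical data. Concepts: 0 Country, 1 France, 2 Spain, 3 Date, 4 July, 5 August.
# Entities: 0 France/July, 1 France/August, 2 Spain/August.
matrix = [[0, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 1, 1]]
children = {0: [1, 2], 1: [], 2: [], 3: [4, 5], 4: [], 5: []}
result = select(matrix, children, roots=[0, 3],
                used_concept=[1] * 6, used_entity=[1] * 3, selected=[2])
```

Selecting Spain in this example leaves only the Spain/August entity in the used entity column, and France and July drop out of the used concept row because no remaining entity is associated with them.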
In addition to the selection of one or more concepts, one or more concepts may be deselected, causing the filtered set of information entities to be updated. There are various methods by which this might be done. For example, when one or more concepts are selected, the selected concepts may be stored as a set of selected concepts in a table, such as an array or linked list. When one or more of the selected concepts are deselected, the deselected concepts are removed from the table, and the used concept row is reset such that all concepts become members of the filtered set of concepts. In the embodiment in which the used concept row is a one-dimensional binary array, all elements of the array are set to 1. For each of the remaining concepts identified in the table as selected, the contents of all columns of the concept-information entity matrix corresponding to the selected concept and its descendant concepts are then OR'ed together to form a single column. This is repeated for all concepts identified in the table as selected, and the resulting single columns, one for each selected concept, are then AND'ed together and the result is stored in the used entity column. Finally, steps 2 to 7 of the data retrieval method as described above are carried out. It will be appreciated that other methods for updating the filtered entity list and filtered concept list upon concept deselection exist. In particular, as the concept that is deselected may correspond to the last selected concept, the contents of the used entity column and used concept row prior to the selection of the concept may be stored and then restored upon the deselection of the concept.
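The first deselection method described above, in which the used entity column is rebuilt from the concepts that remain selected, might be sketched as follows. The function names and the list-based matrix[concept][entity] layout are illustrative assumptions. The sketch also assumes that when no concepts remain selected every information entity becomes a member of the filtered set, which the text does not state explicitly.

```python
def with_descendants(children, concept):
    """Return the set containing the concept and all of its descendants."""
    found, stack = set(), [concept]
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(children[node])
    return found


def recompute_used_entity(matrix, children, selected):
    """Rebuild the used entity column from the concepts that remain selected."""
    n_entities = len(matrix[0])
    # ASSUMPTION: with nothing selected, every entity is a member of the filtered set.
    used_entity = [1] * n_entities
    for s in selected:
        group = with_descendants(children, s)
        # OR the columns of the selected concept and its descendants ...
        column = [int(any(matrix[c][y] for c in group)) for y in range(n_entities)]
        # ... then AND the per-concept columns together.
        used_entity = [a & b for a, b in zip(used_entity, column)]
    return used_entity


# Hypothetical data. Concepts: 0 Country, 1 France, 2 Spain, 3 Date, 4 July, 5 August.
matrix = [[0, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 1, 1]]
children = {0: [1, 2], 1: [], 2: [], 3: [4, 5], 4: [], 5: []}

selected = [2, 5]                      # Spain and August selected
before = recompute_used_entity(matrix, children, selected)
selected.remove(2)                     # Spain is deselected
after = recompute_used_entity(matrix, children, selected)
```

After the used entity column is rebuilt in this way, steps 2 to 7 would be run again with the used concept row reset to all ones.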
Whilst in the particular embodiment described above there is generally no order to the information entities listed in the filtered information entity list, it will be appreciated that various methods exist for instilling some form of order to the list of information entities. For example, the information entities may be listed according to the number of concepts that are associated with each of the listed information entities. Alternatively, the information entities and concepts may be arranged within the matrix according to a weighting scheme. An example weighting scheme could include ordering the information entities within the matrix in order of importance. Rather than storing the association data as binary values, each element of the matrix may store weighted values according to the strength of association between the information entities and the concepts. The information entities may then be listed according to the strength of association with the selected concepts. Alternatively, the weighting of associations between information entities and concepts may be stored in a third dimension of the matrix, or in a separate matrix.
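As an illustration of ordering by strength of association, the following Python sketch ranks the filtered information entities using a weighted matrix. The text does not specify how weights for several selected concepts are combined; summing them is an assumption made here, as are the function name `rank_entities` and the integer weights.

```python
def rank_entities(weights, used_entity, selected):
    """List filtered entities by descending total weight over the selected concepts."""
    scores = {
        y: sum(weights[c][y] for c in selected)   # ASSUMPTION: weights are summed
        for y, used in enumerate(used_entity)
        if used
    }
    # Ties are broken by entity index so that the ordering is deterministic.
    return sorted(scores, key=lambda y: (-scores[y], y))


# Hypothetical weights[concept][entity]; larger means a stronger association.
weights = [
    [2, 9, 5],
    [7, 1, 0],
]
used_entity = [1, 1, 0]   # entity 2 is not in the filtered set

both = rank_entities(weights, used_entity, selected=[0, 1])
second_only = rank_entities(weights, used_entity, selected=[1])
```

With both concepts selected, entity 1 (total weight 10) is listed before entity 0 (total weight 9); with only the second concept selected the order is reversed.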
Whilst reference herein has been made to a data retrieval method and system, it is to be understood that the method may be implemented by means of computer software which may be run on a computer system wherein the software includes a database comprising at least the concept-information entity matrix, the used concept row and the used entity column, and a series of instructions for performing the data retrieval method which can be stored in memory of the computer system. Preferably the software includes a database for storing the ConceptRep and EntityRep objects. Furthermore, the software may be implemented on a remote server in which case the software includes instructions for interfacing with a remote terminal such that a user selection made from the remote terminal can be received by the server and the list of information entities generated by the data retrieval method can be outputted to the remote terminal.
The data retrieval system may be a computer system such as a conventional personal computer configured to perform the data retrieval method as described. Alternatively, the data retrieval system may be implemented for electronic data exchange in which one or both of the database and the set of instructions is stored on a remote server that is capable of communication with a separate terminal.
The described method of constructing the matrix and arrays of the data retrieval system may be limited in the number of information entities that may be indexed and in the number of concepts that may be used by, firstly, limitations in the size of array that may be constructed and manipulated by the available computing equipment, and, secondly, by declining performance of the method as the size of the array increases. Performance limitations may be overcome by caching the results of searches using concepts below the top level of the concept hierarchy, and by pre-computing the results of these searches using computer idle time where possible.
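One possible reading of the caching and pre-computation suggestion is sketched below in Python: the OR'ed column for each concept and its descendants is computed once and then reused. The use of `functools.lru_cache`, the factory function `make_descendant_column` and the data layout are assumptions for illustration; the text does not prescribe a caching mechanism.

```python
from functools import lru_cache


def make_descendant_column(matrix, children):
    """Return a cached function giving the OR'ed column for a concept and its descendants."""

    @lru_cache(maxsize=None)
    def column(concept):
        bits = tuple(matrix[concept])
        for child in children[concept]:
            # The child's result is reused from the cache if it was already computed.
            bits = tuple(a | b for a, b in zip(bits, column(child)))
        return bits

    return column


# Hypothetical data. Concepts: 0 Country, 1 France, 2 Spain, 3 Date, 4 July, 5 August.
matrix = [[0, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 1, 1]]
children = {0: [1, 2], 1: [], 2: [], 3: [4, 5], 4: [], 5: []}

column = make_descendant_column(matrix, children)
# Pre-computation pass, for example run during idle time.
precomputed = [column(c) for c in range(len(matrix))]
```

Once the pre-computation pass has run, later selections can look up the descendant column for any concept without repeating the OR operations.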
The data retrieval method described above may be implemented in a system using an object database, and with index creation and manipulation and user interface implemented as a Java application. It is envisaged that a typical application will implement the database on a computing server. Retrieval of data is preferably carried out as an internet or intranet application with the creation and manipulation of the matrix and arrays also taking place on a remote server. User presentation of results will take place as a client application communicating with the remote server using conventional internet protocols.
The data retrieval method and system described above is particularly suited for use in client-server e-commerce or business applications where users and customers are required to make enquiries of a database via a Web-enabled client (e.g. a standard Web browser) using multiple 'search' criteria. Examples include holiday booking sites (one of which is discussed in greater detail below), a second-hand automobile locating site, an on-line bookshop or any product locating site. In these application scenarios the functionality is preferably provided on the server side. In order to more fully explore the functionality of the data retrieval method and system of the present invention, consider the example of a Web-based holiday booking site on which users are required to make multiple selections from hierarchically organised menus to search a database of available holidays (e.g. by selecting the departure / arrival destination, country, resort, price, date, etc.).
A user of a Web-based holiday booking site may wish to book a holiday in country V, and in resort W, hotel X on date Y and for a price-range Z. Suppose that the user is asked to select a particular destination country followed by a resort, hotel and date. For a conventional product / service search engine it is possible (provided that a suitable database structure has been adopted and the Web page designed adequately) to link the options that are presented to users in the resort selection process according to the selected country since only a restricted number of resorts are applicable for a given selected country (i.e. allowing users to select resorts that do not exist in a particular country would not be desirable). In a similar manner it would be possible to link the hotel options according to the resort selected by the user, and so restrict the options presented to the user to those that are logical and reflect the structure of the underlying database. The choice of date made by the user would only be restricted by the design of the database or the Web page (e.g. by preventing the selection of dates that are prior to the current date). These user selections would then form the basis of a query that would be submitted to the underlying database. However, given this scenario it is often the case that the result of this search / query provides the user with a null response. For example the selected hotel may be fully booked on the selected date. Thus, such a conventional search engine would permit a query that will produce a null result. This can be extremely frustrating for a user where each query must be started from scratch.
In contrast, the data retrieval method and system of the present invention presents selections and options to users on the basis of the contents of the database, as opposed to its underlying structure. Therefore in the given example, the data retrieval system would only provide users with options that are guaranteed to be successful. Thus if the user had selected a given country, resort and hotel, they would then only be presented with dates that would result in a successful query because only dates which were associated with entities (holidays) in the selected hotel would be offered in the filtered concept set. The user would not be able to select the date Y as this would represent an invalid selection.
One of the key advantages of the data retrieval method and system of the present invention is that the user is free to choose the selection criteria in any arbitrary sequence, in contrast to conventional systems where users are required to follow a pre-defined selection sequence. Thus, with the data retrieval method of the present invention, if the most important selection criterion for a particular user is the date, this selection can be made first and only hotels with vacancies for that date will be presented to the user. In the example given above, suppose all the hotels in the given resort W are booked. With the data retrieval method of the present invention, the user would not be presented with the option of selecting the resort W as there would be no relevant relationship stored in the matrix and arrays. Similarly, if the choice of the resort is the user's key selection criterion then the user would be able to use this as the initial selection criterion in the same manner.
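The order independence described above can be illustrated with a small Python sketch of the holiday example. For brevity the concepts are treated as a flat list with no hierarchy, and the data, function names and resort labels are hypothetical.

```python
def refine(matrix, used_entity, concept):
    """AND the column of the selected concept with the used entity column."""
    return [m & u for m, u in zip(matrix[concept], used_entity)]


def selectable(matrix, used_entity):
    """Concepts that still have at least one associated entity in the filtered set."""
    return [c for c, row in enumerate(matrix)
            if any(m & u for m, u in zip(row, used_entity))]


# Concepts: 0 resort W, 1 resort V, 2 July, 3 August.
# Entities (available holidays): 0 W in July, 1 V in July, 2 V in August.
matrix = [
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
]
everything = [1, 1, 1]

# Date first, then resort.
after_august = refine(matrix, everything, 3)
offered_after_august = selectable(matrix, after_august)
date_then_resort = refine(matrix, after_august, 1)

# Resort first, then date.
after_v = refine(matrix, everything, 1)
resort_then_date = refine(matrix, after_v, 3)
```

Once August is chosen, resort W is no longer offered because no available holiday links it to that date, and both selection orders arrive at the same filtered set of holidays.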
In an alternative application, the data retrieval tool of the present invention may be implemented as a medical diagnosis tool (or more generally as a fault finding and identification tool). In a simple illustrative example, the hierarchical structure might contain multiple branches that cover concepts such as condition, symptom and treatment. The user of such a tool is able to make any number of multiple selections from branches in the hierarchy. However, at any given time the user is only presented with the results (i.e. the condition, symptom and treatment) that are relevant to those selections that have already been made. For example, if a given symptom is not applicable to a given condition then the user will not be presented with that condition since it does not represent a valid result. This approach contrasts with more conventional expert systems that are widely used in such applications. In general, expert systems are highly prescriptive in the order in which options such as symptoms, conditions and treatments are presented to users. With the present invention users are freed from such constraints. This data retrieval approach for the applications given above or for other applications where interrelated data objects are searched, allows for the gradual refinement of a search by letting users progressively apply more stringent requirements (i.e. constraints) that the searched for object must meet for it to be judged to be relevant. This approach gives continuous feedback to the user, since numbers of documents relevant to a given query are dynamically calculated on the basis of the previous selections made by the user. Thus, if a user only wants objects or documents about subject X, the concepts for further selection are recalculated to include only those associated with documents that are related to subject X. This approach therefore ensures that a user will always be presented with a result that matches their query. 
Using this data retrieval approach users are guaranteed to be presented with one or more results in response to a query (i.e. the system does not allow the browsing of parts of a classification hierarchy that do not contain any documents).
The data retrieval tool also permits the formulation of 'complex' searches that are not possible using standard keyword or browsing tools. For example, suppose that a user is searching for information entities in a company document management system and wants to find out the names of the universities that are collaborating with their company in a particular research project. Using conventional search techniques the user would have to formulate complex search queries that would combine keywords that are related to both the research project and to universities in general. Using the data retrieval tool of the present invention, they need only select the concept representing the research project, and then note which children of the Universities concept remain in the filtered concept set, and hence visible in the concept hierarchy. The data retrieval method and system described above may be implemented as a search tool component and built into any type of document or content management software application, or more generally any software that requires users to search or browse for information that has been pre-classified or organised into a hierarchical structure. In this mode of operation, it may form part of a stand-alone piece of software, or be provided as a separate system on the client or server side in a thin / thick client application. It will of course be immediately apparent that the data retrieval method and system described herein may be implemented using many different platforms and is applicable to a broad range of applications where interrelated information is stored and queried. As such, the data retrieval method and system is not limited to the particular example given above but may be altered without departing from the scope of the invention as claimed.

Claims

1. A data retrieval method for locating one or more data objects from a plurality of data objects through the selection of one or more data classifications from a plurality of data classifications, the method comprising the steps of: generating a first matrix representing at least associations between data objects listed in a first dimension and data classifications listed in a second dimension; generating a first table representing the membership of data classifications that are selectable from the plurality of data classifications; generating a second table representing the membership of data objects that are associated with the selectable data classifications; updating the second table by means of Boolean operands in response to a selection of one or more selectable data classifications; and outputting one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
2. The data retrieval method as claimed in claim 1, wherein in accordance with the method exclusion of all data objects from the membership of data objects during the data classification selection is prevented.
3. The data retrieval method as claimed in either of claims 1 or 2, wherein each data classification identified in the first table as a selectable data classification is associated with one or more data objects.
4. The data retrieval method as claimed in claim 3, wherein the step of updating the second table comprises the steps of: selecting all rows of the first matrix corresponding to the selected data classifications; applying the logical operator OR to all association data in the selected rows in the second dimension to temporarily form a third table representing the membership of data objects that are associated with the selected data classifications; applying the logical operator AND to the membership data of the second table and the corresponding membership data of the third table; and storing the outcome in the second table in replacement of the previous membership data.
5. The data retrieval method as claimed in any one of the preceding claims, wherein the method further comprises the steps of: updating the first table by means of Boolean operands in response to the updating of the second table to represent the membership of data classifications that remain selectable, and outputting the one or more data classifications identified in the updated first table as selectable data classifications.
6. The data retrieval method as claimed in claim 5, wherein the step of updating the first table comprises the steps of: selecting all rows in the first matrix corresponding to the data objects identified in the second table as associated with selected data classifications; applying the logical operator OR to all membership data of the selected rows in the first dimension; and storing the result representing the membership of data classifications that remain selectable in the first table in replacement of the previous membership data.
7. The data retrieval method as claimed in either of claims 1 or 2, wherein at least one data classification is associated with at least one other data classification and the method further comprises the step of generating a second matrix representing associations between data classifications; and each data classification or one or more of its associated data classifications identified in the first table as selectable data classifications are associated with one or more data objects.
8. The data retrieval method as claimed in claim 7, wherein the step of updating the second table comprises the steps of: selecting all rows of the first matrix corresponding to the selected data classifications and the data classifications associated with the selected data classifications as determined from the second matrix; applying the logical operator OR to all association data in the selected rows in the second dimension to form a third table representing the membership of data objects that are associated with the selected data classifications and the data classifications associated with the selected data classifications; applying the logical operator AND to the membership data of the second table and the corresponding membership data of the third table; and storing the outcome in the second table in replacement of the previous membership data.
9. The data retrieval method as claimed in either of claims 7 or 8, wherein the method further comprises the steps of: updating the first table by means of Boolean operands in response to the updating of the second table to represent the membership of data classifications that remain selectable, and outputting the one or more data classifications identified in the updated first table as selectable data classifications.
10. The data retrieval method as claimed in claim 9, wherein the step of outputting the one or more data classifications identified in the updated first table as selectable includes outputting one or more data classifications associated with the data classifications identified in the first table as selectable as selectable data classifications.
11. The data retrieval method as claimed in either of claims 9 or 10, wherein the step of updating the first table comprises the steps of: for a row of the first matrix corresponding to a selectable data classification, applying the logical operator AND to the association data of the row and the corresponding membership data of the updated second table to generate a direct association data row representative of whether any data object in the membership of data objects is directly associated with the data classification of the row, and repeating this step for each row of the first matrix corresponding to a selectable data classification to generate a set of association data rows; applying the logical operator OR to the content of each direct association data row and determining for each data classification whether the outcome is non-zero; and for each data classification, where the outcome is zero and the outcome is zero for all data classifications associated with said data classification, updating the first table such that said data classification is no longer selectable.
12. The data retrieval method as claimed in any one of claims 5, 6, 9, 10 or 11, wherein the steps of updating the first and second table, outputting one or more data objects identified in the updated second table as associated with selected data classifications, and outputting one or more data classifications identified in the updated first table as selectable data classifications are repeated at least once through the further selection of one or more selectable data classifications.
13. The data retrieval method as claimed in any one of the preceding claims, wherein the associations between the data objects and the data classifications are stored within the matrix as binary values.
14. The data retrieval method as claimed in any one of claims 1 to 12, wherein the associations between the data objects and the data classifications are stored within the matrix as weighted values.
15. The data retrieval method as claimed in any one of the preceding claims, wherein the data objects are ordered within the matrix according to a weighting scheme.
16. Data retrieval software for locating one or more data objects from a plurality of data objects based on a user selection of one or more data classifications from a plurality of data classifications, the software comprising: a database comprising a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; and a set of instructions for performing the following functions: updating the second table by means of Boolean operands in response to a user selection of one or more selectable data classifications; and outputting one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
17. Data retrieval software suitable for use on a remote server for locating one or more data objects from a plurality of data objects based on a user selection made from a remote terminal of one or more data classifications from a plurality of data classifications, the software comprising: a database comprising a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; and a set of instructions for performing the following functions: updating the second table by means of Boolean operands in response to a user selection received via an input / output interface from a remote terminal of one or more selectable data classifications; and outputting via the input / output interface to the remote terminal one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
18. The data retrieval software as claimed in either of claims 16 or 17, wherein exclusion of all data objects from the membership of data objects during the data classification selection is prevented.
19. The data retrieval software as claimed in any one of claims 16 to 18, wherein each data classification identified in the first table as a selectable data classification is associated with one or more data objects.
20. The data retrieval software as claimed in claim 19, wherein the set of instructions for updating the second table comprises the steps of: selecting all rows of the first matrix corresponding to the selected data classifications; applying the logical operator OR to all association data in the selected rows in the second dimension to temporarily form a third table representing the membership of data objects that are associated with the selected data classifications; applying the logical operator AND to the membership data of the second table and the corresponding membership data of the third table; and storing the outcome in the second table in replacement of the previous membership data.
21. The data retrieval software as claimed in any one of claims 16 to 20, wherein the software further comprises a set of instructions for: updating the first table by means of Boolean operands in response to the updating of the second table to represent the membership of data classifications that remain selectable, and outputting the one or more data classifications identified in the updated first table as selectable data classifications.
22. The data retrieval software as claimed in claim 21, wherein the set of instructions for updating the first table comprises the steps of: selecting all rows in the first matrix corresponding to the data objects identified in the second table as associated with selected data classifications; applying the logical operator OR to all membership data of the selected rows in the first dimension; and storing the result representing the membership of data classifications that remain selectable in the first table in replacement of the previous membership data.
23. The data retrieval software as claimed in any one of claims 16 to 18, wherein at least one data classification is associated with at least one other data classification and the database further comprises a second matrix representing associations between data classifications; and each data classification or one or more of its associated data classifications identified in the first table as selectable data classifications are associated with one or more data objects.
24. The data retrieval software as claimed in claim 23, wherein the set of instructions for updating the second table comprises the steps of: selecting all rows of the first matrix corresponding to the selected data classifications and the data classifications associated with the selected data classifications as determined from the second matrix; applying the logical operator OR to all association data in the selected rows in the second dimension to form a third table representing the membership of data objects that are associated with the selected data classifications and the data classifications associated with the selected data classifications; applying the logical operator AND to the membership data of the second table and the corresponding membership data of the third table; and storing the outcome in the second table in replacement of the previous membership data.
25. The data retrieval software as claimed in either of claims 23 or 24, wherein the software further comprises the set of instructions for: updating the first table by means of Boolean operands in response to the updating of the second table to represent the membership of data classifications that remain selectable, and outputting the one or more data classifications identified in the updated first table as selectable data classifications.
26. The data retrieval software as claimed in claim 25, wherein the set of instructions for outputting the one or more data classifications identified in the updated first table as selectable includes outputting one or more data classifications associated with the data classifications identified in the first table as selectable as selectable data classifications.
27. The data retrieval software as claimed in either of claims 25 or 26, wherein the set of instructions for updating the first table comprises the steps of: for a row of the first matrix corresponding to a selectable data classification, applying the logical operator AND to the association data of the row and the corresponding membership data of the updated second table to generate a direct association data row representative of whether any data object in the membership of data objects is directly associated with the data classification of the row, and repeating this step for each row of the first matrix corresponding to a selectable data classification to generate a set of association data rows; applying the logical operator OR to the content of each direct association data row and determining for each data classification whether the outcome is non-zero; and for each data classification, where the outcome is zero and the outcome is zero for all data classifications associated with said data classification, updating the first table such that said data classification is no longer selectable.
28. The data retrieval software as claimed in any one of claims 21, 22, 25, 26 or 27, wherein the set of instructions for updating the first and second table, outputting one or more data objects identified in the updated second table as associated with selected data classifications, and outputting one or more data classifications identified in the updated first table as selectable data classifications are repeated at least once through the further selection of one or more selectable data classifications.
29. The data retrieval software as claimed in any one of claims 16 to 28, wherein the association data between the data objects and the data classifications are stored within the matrix as binary values.
30. The data retrieval software as claimed in any one of claims 16 to 28, wherein the association data between the data objects and the data classifications are stored within the matrix as weighted values.
31. The data retrieval software as claimed in any one of claims 16 to 20, wherein the data objects are ordered within the matrix according to a weighting scheme.
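The table updates recited in claims 22 and 28 (together with the second-table update worded in claim 35) can be illustrated with a minimal Python sketch. The matrix orientation, the function names and the sample data are assumptions made for illustration only; the claims do not prescribe a particular storage layout or programming language.

```python
# Illustrative layout (an assumption): first_matrix[c][o] is 1 when data
# classification c is associated with data object o, otherwise 0.

def update_second_table(first_matrix, second_table, selected):
    """Narrow the object membership after a selection.

    ORs the rows of the selected classifications into a temporary third
    table, then ANDs that table with the current second table.
    """
    third_table = [0] * len(second_table)
    for c in selected:
        third_table = [t | a for t, a in zip(third_table, first_matrix[c])]
    return [s & t for s, t in zip(second_table, third_table)]


def update_first_table(first_matrix, second_table):
    """Recompute which classifications remain selectable.

    A classification stays selectable when at least one object still in the
    second table is associated with it (an OR over the remaining objects).
    """
    return [
        int(any(a & s for a, s in zip(row, second_table)))
        for row in first_matrix
    ]


# Hypothetical data: 3 classifications x 4 objects.
first_matrix = [
    [1, 1, 0, 0],  # classification 0
    [0, 1, 1, 0],  # classification 1
    [0, 0, 0, 1],  # classification 2
]
second_table = [1, 1, 1, 1]  # every object is initially a member

history = []
for selection in ([0], [1]):  # two successive user selections
    second_table = update_second_table(first_matrix, second_table, selection)
    first_table = update_first_table(first_matrix, second_table)
    history.append((second_table, first_table))
```

In this sketch, classification 2 stops being selectable after the first selection because no remaining object carries it. Provided the user can only choose classifications flagged in the first table, the AND step cannot empty the second table, which is one way to read the requirement that exclusion of all data objects is prevented.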
32. Data retrieval system for locating one or more data objects from a plurality of data objects based on a user selection of one or more data classifications from a plurality of data classifications, the system comprising: a memory containing a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; an input device for receiving a user selection of one or more selectable data classifications; a processor for changing the contents of the second table by means of Boolean operands in response to the user selection received by the input device; and an output for outputting one or more data objects identified in the changed second table as associated with selected data classifications or data classifications which remain selectable.
33. The data retrieval system as claimed in claim 32, wherein exclusion of all data objects from the membership of data objects during the data classification selection is prevented.
34. The data retrieval system as claimed in either of claims 32 or 33, wherein each data classification identified in the first table as a selectable data classification is associated with one or more data objects.
35. The data retrieval system as claimed in claim 34, wherein the processor changes the contents of the second table by: selecting all rows of the first matrix corresponding to the selected data classifications; applying the logical operator OR to all association data in the selected rows in the second dimension to temporarily form a third table representing the membership of data objects that are associated with the selected data classifications; applying the logical operator AND to the membership data of the second table and the corresponding membership data of the third table; and storing the outcome in the second table in replacement of the previous membership data.
36. The data retrieval system as claimed in any one of claims 32 to 35, wherein the processor further changes the contents of the first table by means of Boolean operands in response to the changes made to the contents of the second table to represent the membership of data classifications that remain selectable, and the output outputs the one or more data classifications identified in the changed first table as selectable data classifications.
37. The data retrieval system as claimed in claim 36, wherein the processor changes the contents of the first table by: selecting all rows in the first matrix corresponding to the data objects identified in the second table as associated with selected data classifications; applying the logical operator OR to all membership data of the selected rows in the first dimension; and storing the result representing the membership of data classifications that remain selectable in the first table in replacement of the previous membership data.
38. The data retrieval system as claimed in either of claims 32 or 33, wherein at least one data classification is associated with at least one other data classification and the memory further contains a second matrix storing association data between data classifications; and each data classification or one or more of its associated data classifications identified in the first table as selectable data classifications are associated with one or more data objects.
39. The data retrieval system as claimed in claim 38, wherein the processor changes the contents of the second table by: selecting all rows of the first matrix corresponding to the selected data classifications and the data classifications associated with the selected data classifications as determined from the second matrix; applying the logical operator OR to all association data in the selected rows in the second dimension to form a third table representing the membership of data objects that are associated with the selected data classifications and the data classifications associated with the selected data classifications; applying the logical operator AND to the membership data of the second table and the corresponding membership data of the third table; and storing the outcome in the second table in replacement of the previous membership data.
40. The data retrieval system as claimed in either of claims 38 or 39, wherein the processor further changes the contents of the first table by means of Boolean operands in response to the changes made to the contents of the second table to represent the membership of data classifications that remain selectable, and the output outputs the one or more data classifications identified in the changed first table as selectable data classifications.
41. The data retrieval system as claimed in claim 40, wherein the output further outputs one or more data classifications associated with the data classifications identified in the first table as selectable as selectable data classifications.
42. The data retrieval system as claimed in either of claims 40 or 41, wherein the processor changes the contents of the first table by: for a row of the first matrix corresponding to a selectable data classification, applying the logical operator AND to the association data of the row and the corresponding membership data of the updated second table to generate a direct association data row representative of whether any data object in the membership of data objects is directly associated with the data classification of the row, and repeating this step for each row of the first matrix corresponding to a selectable data classification to generate a set of association data rows; applying the logical operator OR to the content of each direct association data row and determining for each data classification whether the outcome is non-zero; and for each data classification, where the outcome is zero and the outcome is zero for all data classifications associated with said data classification, updating the first table such that said data classification is no longer selectable.
43. The data retrieval system as claimed in any one of claims 36, 37, 40, 41 or 42, wherein the changes made to the contents of the first and second tables by the processor, the outputting of one or more data objects identified in the changed second table as associated with selected data classifications by the output, and the outputting of one or more data classifications identified in the changed first table as selectable data classifications by the output are repeated at least once through the further user selection of one or more selectable data classifications.
44. The data retrieval system as claimed in any one of claims 32 to 43, wherein the association data between the data objects and the data classifications are stored within the matrix as binary values.
45. The data retrieval system as claimed in any one of claims 32 to 43, wherein the association data between the data objects and the data classifications are stored within the matrix as weighted values.
46. The data retrieval system as claimed in any one of claims 32 to 45, wherein the data objects are ordered within the matrix according to a weighting scheme.
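Where classifications are themselves associated with one another (claims 38 to 42), the same Boolean steps can be extended with a second matrix. The following Python sketch is one possible reading: it assumes that second_matrix[c][d] is 1 when classification d is associated with classification c (for example, as a sub-classification), and that each row already lists every associated classification. These names, the orientation and the sample data are hypothetical and are not taken from the claims.

```python
def with_associated(second_matrix, classes):
    """Return the given classifications plus those associated with them."""
    expanded = set(classes)
    for c in classes:
        expanded.update(d for d, bit in enumerate(second_matrix[c]) if bit)
    return expanded


def narrow_objects(first_matrix, second_matrix, second_table, selected):
    """OR the rows of the selected and associated classifications into a
    third table, then AND it with the current second table."""
    third_table = [0] * len(second_table)
    for c in with_associated(second_matrix, selected):
        third_table = [t | a for t, a in zip(third_table, first_matrix[c])]
    return [s & t for s, t in zip(second_table, third_table)]


def prune_classifications(first_matrix, second_matrix, first_table, second_table):
    """Clear a classification only when neither it nor any classification
    associated with it is directly tied to a remaining object."""
    # direct[c] is 1 when some remaining object is directly associated with c
    # (row AND second table, then OR over the row contents).
    direct = [
        int(any(a & s for a, s in zip(row, second_table)))
        for row in first_matrix
    ]
    updated = list(first_table)
    for c, selectable in enumerate(first_table):
        if selectable and not any(
            direct[d] for d in with_associated(second_matrix, [c])
        ):
            updated[c] = 0
    return updated


# Hypothetical data: classification 0 has sub-classifications 1 and 2 and no
# directly associated objects; classification 3 stands alone.
second_matrix = [
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
first_matrix = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 1],
]
objects_1 = narrow_objects(first_matrix, second_matrix, [1, 1, 1, 1], [0])
classes_1 = prune_classifications(first_matrix, second_matrix, [1, 1, 1, 1], objects_1)
objects_2 = narrow_objects(first_matrix, second_matrix, objects_1, [2])
classes_2 = prune_classifications(first_matrix, second_matrix, classes_1, objects_2)
```

Selecting classification 0 retrieves the objects of its associated classifications 1 and 2 even though it has no objects of its own. After the further selection of classification 2, classification 0 remains selectable because an associated classification still has a remaining object, whereas classifications 1 and 3 do not.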
47. A computer-readable medium embodying a database and instructions for execution by a processor for locating one or more data objects from a plurality of data objects based on a user selection of one or more data classifications from a plurality of data classifications, the computer-readable medium comprising: a database comprising a first matrix storing at least association data between data objects listed in a first dimension and data classifications listed in a second dimension, a first table storing membership data of data classifications that are selectable from the plurality of data classifications and a second table storing membership data of data objects that are associated with the selectable data classifications; and a set of instructions for performing the following functions: updating the second table by means of Boolean operands in response to a user selection of one or more selectable data classifications; and outputting one or more data objects identified in the updated second table as associated with selected data classifications or data classifications which remain selectable.
48. The computer-readable medium as claimed in claim 47, wherein exclusion of all data objects from the membership of data objects during the data classification selection is prevented.
49. The computer-readable medium as claimed in either of claims 47 or 48, wherein each data classification identified in the first table as a selectable data classification is associated with one or more data objects.
50. The computer-readable medium as claimed in claim 49, wherein the set of instructions for updating the second table comprises the steps of: selecting all rows of the first matrix corresponding to the selected data classifications; applying the logical operator OR to all association data in the selected rows in the second dimension to temporarily form a third table representing the membership of data objects that are associated with the selected data classifications; applying the logical operator AND to the membership data of the second table and the corresponding membership data of the third table; and storing the outcome in the second table in replacement of the previous membership data.
51. The computer-readable medium as claimed in any one of claims 47 to 50, wherein the medium further comprises a set of instructions for: updating the first table by means of Boolean operands in response to the updating of the second table to represent the membership of data classifications that remain selectable, and outputting the one or more data classifications identified in the updated first table as selectable data classifications.
52. The computer-readable medium as claimed in claim 51, wherein the set of instructions for updating the first table comprises the steps of: selecting all rows in the first matrix corresponding to the data objects identified in the second table as associated with selected data classifications; applying the logical operator OR to all membership data of the selected rows in the first dimension; and storing the result representing the membership of data classifications that remain selectable in the first table in replacement of the previous membership data.
53. The computer-readable medium as claimed in either one of claims 47 or 48, wherein at least one data classification is associated with at least one other data classification and the database further comprises a second matrix representing associations between data classifications; and each data classification or one or more of its associated data classifications identified in the first table as selectable data classifications are associated with one or more data objects.
54. The computer-readable medium as claimed in claim 53, wherein the set of instructions for updating the second table comprises the steps of: selecting all rows of the first matrix corresponding to the selected data classifications and the data classifications associated with the selected data classifications as determined from the second matrix; applying the logical operator OR to all association data in the selected rows in the second dimension to form a third table representing the membership of data objects that are associated with the selected data classifications and the data classifications associated with the selected data classifications; applying the logical operator AND to the membership data of the second table and the corresponding membership data of the third table; and storing the outcome in the second table in replacement of the previous membership data.
55. The computer-readable medium as claimed in either of claims 53 or 54, wherein the software further comprises the set of instructions for: updating the first table by means of Boolean operands in response to the updating of the second table to represent the membership of data classifications that remain selectable, and outputting the one or more data classifications identified in the updated first table as selectable data classifications.
56. The computer-readable medium as claimed in claim 55, wherein the set of instructions for outputting the one or more data classifications identified in the updated first table as selectable includes outputting one or more data classifications associated with the data classifications identified in the first table as selectable as selectable data classifications.
57. The computer-readable medium as claimed in either of claims 55 or 56, wherein the set of instructions for updating the first table comprises the steps of: for a row of the first matrix corresponding to a selectable data classification, applying the logical operator AND to the association data of the row and the corresponding membership data of the updated second table to generate a direct association data row representative of whether any data object in the membership of data objects is directly associated with the data classification of the row, and repeating this step for each row of the first matrix corresponding to a selectable data classification to generate a set of association data rows; applying the logical operator OR to the content of each direct association data row and determining for each data classification whether the outcome is non-zero; and for each data classification, where the outcome is zero and the outcome is zero for all data classifications associated with said data classification, updating the first table such that said data classification is no longer selectable.
58. The computer-readable medium as claimed in any one of claims 51, 52, 55, 56 or 57, wherein the set of instructions for updating the first and second table, outputting one or more data objects identified in the updated second table as associated with selected data classifications, and outputting one or more data classifications identified in the updated first table as selectable data classifications are repeated at least once through the further selection of one or more selectable data classifications.
59. The computer-readable medium as claimed in any one of claims 47 to 58, wherein the association data between the data objects and the data classifications are stored within the matrix as binary values.
60. The computer-readable medium as claimed in any one of claims 47 to 58, wherein the association data between the data objects and the data classifications are stored within the matrix as weighted values.
61. The computer-readable medium as claimed in any one of claims 47 to 60, wherein the data objects are ordered within the matrix according to a weighting scheme.
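The claims leave the weighting scheme open. As one hedged illustration of how weighted association values might be used to order retrieved data objects, the Python sketch below ranks the objects remaining in the second table by the sum of their weights over the selected classifications. The scoring rule, the integer weights and the name rank_objects are assumptions introduced here, not features recited in the claims.

```python
def rank_objects(weighted_matrix, second_table, selected):
    """Order the remaining objects by total weight over the selected
    classifications, highest first; ties fall back to object index."""
    scores = {
        o: sum(weighted_matrix[c][o] for c in selected)
        for o, keep in enumerate(second_table)
        if keep
    }
    return sorted(scores, key=lambda o: (-scores[o], o))


# Hypothetical weights: weighted_matrix[c][o] > 0 means classification c is
# associated with object o, and a larger value means a stronger association.
weighted_matrix = [
    [3, 1, 0],
    [1, 2, 2],
]
second_table = [1, 1, 0]  # object 2 has already been excluded

ranking_both = rank_objects(weighted_matrix, second_table, [0, 1])
ranking_second = rank_objects(weighted_matrix, second_table, [1])
```

A binary membership row for the Boolean steps can still be derived from such a matrix by treating any non-zero weight as 1.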
PCT/GB2002/001968 2001-04-30 2002-04-30 Data retrieval method and system WO2002089005A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0110571.7 2001-04-30
GBGB0110571.7A GB0110571D0 (en) 2001-04-30 2001-04-30 Data retrieval method and system

Publications (2)

Publication Number Publication Date
WO2002089005A2 WO2002089005A2 (en) 2002-11-07
WO2002089005A3 WO2002089005A3 (en) 2003-07-10

Family

ID=9913752

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2002/001968 WO2002089005A2 (en) 2001-04-30 2002-04-30 Data retrieval method and system

Country Status (2)

Country Link
GB (1) GB0110571D0 (en)
WO (1) WO2002089005A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5983219A (en) * 1994-10-14 1999-11-09 Saggara Systems, Inc. Method and system for executing a guided parametric search
WO2001003009A1 (en) * 1999-07-02 2001-01-11 Toptier, Israel, Ltd. Relation path viability prediction

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHEN S-M ET AL: "FUZZY QUERY PROCESSING FOR DOCUMENT RETRIEVAL BASED ON EXTENDED FUZZY CONCEPT NETWORKS" IEEE TRANSACTIONS ON SYSTEMS, MAN AND CYBERNETICS. PART B: CYBERNETICS, IEEE SERVICE CENTER, US, vol. 29, no. 1, February 1999 (1999-02), pages 96-104, XP000831582 ISSN: 1083-4419 *
HEARST M A ET AL: "CAT-A-CONE: AN INTERACTIVE INTERFACE FOR SPECIFYING SEARCHES AND VIEWING RETRIEVAL RESULTS USING A LARGE CATEGORY HIERARCHY" PROCEEDINGS OF THE 20TH ANNUAL INTERNATIONAL ACM-SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL. PHILADELPHIA, PA, JULY 27 - 31, 1997, ANNUAL INTERNATIONAL ACM-SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEV, 27 July 1997 (1997-07-27), pages 246-255, XP000782010 ISBN: 0-89791-836-3 *
MING-CHUAN WU ET AL: "Encoded bitmap indexing for data warehouses" DATA ENGINEERING, 1998. PROCEEDINGS., 14TH INTERNATIONAL CONFERENCE ON ORLANDO, FL, USA 23-27 FEB. 1998, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 23 February 1998 (1998-02-23), pages 220-230, XP010268321 ISBN: 0-8186-8289-2 *

Also Published As

Publication number Publication date
GB0110571D0 (en) 2001-06-20
WO2002089005A3 (en) 2003-07-10

Similar Documents

Publication Publication Date Title
US6067552A (en) User interface system and method for browsing a hypertext database
US6101503A (en) Active markup--a system and method for navigating through text collections
US6334131B2 (en) Method for cataloging, filtering, and relevance ranking frame-based hierarchical information structures
US6434556B1 (en) Visualization of Internet search information
US6321228B1 (en) Internet search system for retrieving selected results from a previous search
US6574625B1 (en) Real-time bookmarks
US7984065B2 (en) Portable browsing interface for information retrieval
US7941428B2 (en) Method for enhancing search results
US6732088B1 (en) Collaborative searching by query induction
KR100337810B1 (en) Search dedicated website and search method on Internet
EP1014282A1 (en) Search channels between queries for use in an information retrieval system
US20010049674A1 (en) Methods and systems for enabling efficient employment recruiting
US20030061209A1 (en) Computer user interface tool for navigation of data stored in directed graphs
US20080082578A1 (en) Displaying search results on a one or two dimensional graph
Head et al. World wide web navigation aid
WO2001067209A2 (en) Method and apparatus for performing a research task by interchangeably utilizing a multitude of search methodologies
JP2006012197A (en) Method and system of database query and information delivery
CA2411184A1 (en) Method and apparatus for data collection and knowledge management
WO2001016807A1 (en) An internet search system for tracking and ranking selected records from a previous search
WO2001040988A1 (en) Web map tool
WO2004055625A2 (en) Database for allowing multiple customized views
JPH10334120A (en) Browser for internet, address specifying method for browser for internet and storage medium
KR20000030486A (en) Internet information search system for special searching a regional information and method for searching the internet information using the same
Nicholson Indexing and abstracting on the World Wide Web: an examination of six Web databases
US20060059126A1 (en) System and method for network searching

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ CZ DE DE DK DK DM DZ EC EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP