WO1999027432A9

WO1999027432A9 - System and method for integrating heterogeneous information

Info

Publication number: WO1999027432A9
Application number: PCT/US1998/024711
Authority: WO
Inventors: Vishal Sikka; Digvijay Sikka; Thomas Soares; Sukesh Patel
Original assignee: Patternrx Inc
Priority date: 1997-11-21
Filing date: 1998-11-20
Publication date: 2000-03-02
Also published as: WO1999027432A3; WO1999027432A2

Abstract

A computer-implemented method for querying multiple different types of information, each type of information having a different evaluator, includes receiving a query (102) comprising an identification of at least two evaluators, at least one relationship between the evaluators, and a method of combining results from the evaluators; parsing (104) the query to create (108) an evaluation sequence comprising an ordered sequence of invocations of the evaluators; invoking (110) the evaluators in the evaluation sequence; and combining (112) results from the evaluators according to the method of combining results from the evaluators specified in the query.

Description

SYSTEM AND METHOD FOR INTEGRATING HETEROGENEOUS INFORMATION

Field of the Invention

The present invention relates generally to information processing, and more particularly, to a system and method for integrating heterogeneous information.

Identification of Copyright

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

Description of the Background Art

The key technological challenge in integrating heterogeneous information is that different types of information are analyzed using fundamentally different evaluators. Evaluators perform functions such as searching, prediction, collaborative filtering, data mining, and the like, and include such tools as search engines, neural networks, fuzzy logic-based decision makers, and a variety of other systems well known to those of skill in the art of information processing.

For unstructured text, e.g. news and the like, evaluators based on information retrieval or search-based mechanisms work best. For structured database data, evaluators based on data access mechanisms work best. Evaluation mechanisms such as searching, prediction, collaborative filtering, and data mining each have semantic differences. Structured information is semantically organized and stored in databases, such as relational databases and object oriented databases. Structured information is accessed by semantic retrieval mechanisms with explicit syntax and semantics, such as SQL, OSQL, and ODBC. The entire database industry, including decision support, data access, warehousing, mining, and middleware, is built on the ability to store and extract information stored in structured data sources.

Unstructured information, on the other hand, is mostly composed of free form, natural language text, e.g. news articles, documents, messages and web pages. The mechanisms to analyze such information are keyword or concept based search and retrieval. Search engines return either too much irrelevant information or too little relevant information.

Qualitative information is more intuitive in nature. Human responses and experiences such as intuition, market conditions, investment style, personality, preferences or ratings are examples of qualitative information. Techniques for evaluating such information are based on collaborative filtering, qualitative data analysis, and the like.

Quantitative information is evaluated based on precise analytical and mathematical models and expert knowledge. Examples include econometric models in the finance industry for measuring risk and performance as well as predictive models. Examples of the latter include systems for predicting frauds and purchase patterns. In many cases this type of information is proprietary and is used within research departments of organizations.

Unfortunately, conventional systems typically allow querying over only a single type of data. For example, SQL, a standard database query language, provides database access, but not access to unstructured information, qualitative information, or the like. Thus, multiple querying systems are required, increasing complexity and cost.

Likewise, current systems perform only a single type of analysis in a single query. For example, typical analysis mechanisms such as text searches, data access, collaborative filtering, prediction, and the like are performed separately. Thus, multiple queries are required to achieve multiple forms of analysis, decreasing performance and increasing complexity and cost.

Another disadvantage is that current systems are static. In other words, such systems generally cannot be dynamically extended to incorporate new types of analysis via new evaluators without a substantial and expensive overhaul.

What is needed, then, is a system and method for integrating heterogeneous information in which querying may be performed over multiple types of data in a single query. What is also needed is a system and method for integrating heterogeneous information that is dynamically extensible to incorporate new types of analysis.

SUMMARY OF THE INVENTION

The present invention overcomes the aforementioned problems by providing a Universal Analysis Language (UAL) and a Universal Analysis Model (UAM), which are used by the method and system described herein to query over multiple different types of information using multiple different mechanisms for analyzing and evaluating information.

The information to be queried can be located in multiple sources, for example databases, news wires, text files, and online (Web) pages, and can be of multiple types, e.g. unstructured text, structured database data, quantitative (numeric) data or qualitative data. The mechanisms to analyze and evaluate such information (hereafter referred to as "evaluators") can be equally diverse, e.g. text search methods to evaluate text data, data analysis methods for structured data, collaborative filtering for qualitative data, prediction for quantitative data, etc. The present invention allows users to query multiple different types of information, using multiple different evaluation mechanisms, in a single framework for decision making.

In accordance with one aspect of the present invention, a computer-implemented method for querying multiple different types of information, each type of information having a different evaluator, includes receiving a query comprising an identification of at least two evaluators, at least one relationship between the evaluators, and a method of combining results from the evaluators; parsing the query to create an evaluation sequence comprising an ordered sequence of invocations of the evaluators; invoking the evaluators in the evaluation sequence; and combining results from the evaluators according to the method of combining results from the evaluators specified in the query.

In another aspect of the invention, a system for querying multiple different types of information, each type of information having a different evaluator, includes a parser for receiving a query comprising an identification of at least two evaluators, at least one relationship between the evaluators, and a method of combining results from the evaluators; means, coupled to the parser, for creating an evaluation sequence comprising an ordered sequence of invocations of the evaluators; means, coupled to the creating means, for invoking the evaluators in the evaluation sequence; and means, coupled to the invoking means, for combining results from the evaluators according to the method of combining results from the evaluators specified in the query.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a flow chart of a method for integrating heterogeneous information using a universal analysis language (UAL) in accordance with an embodiment of the present invention.

Figure 2 is an illustration of a plurality of truth functions in accordance with an embodiment of the present invention.

Figure 3 is a screen view of a query in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Universal Analysis Model

The Universal Analysis Model (UAM) is a means to integrate the above-described diverse evaluators within a common framework. The UAM achieves this integration by combining the results of analyzing various bodies of information within a single framework. In addition, the UAM must ensure that this integration is semantic, in that the results of various evaluators are combined correctly. By correctness, it is meant the logical correctness (or validity) of a query over disparate, heterogeneous information using different evaluators. A key aspect of the present invention is the Universal Analysis Language, including a model, i.e. a formal interpretation structure, for that language. The Universal Analysis Language universalizes the analysis of information from diverse evaluators.

The declarative syntax of the language has two key characteristics. First, it enables expressing the relationships between various evaluators as well as combining the results obtained from these evaluators. The language employs multi-valued logic, a generalization of fuzzy logic, and an interpreter. It is based on a generalization of dynamically interpreted infinite-valued logic. With this language one can express the relationship between various evaluators and their relationship to the validity of decision criteria given the underlying information. In addition, this language enables one to handle inconsistency and incompleteness in the underlying information, as well as find incomplete or inconsistent information.

Second, the language is dynamically extensible. It includes a mechanism to add as well as execute new evaluation mechanisms (either native evaluators or third-party evaluators) on the fly. Dynamic extensibility is supported by the architecture in a variety of ways. In one embodiment, a DLL-based execution mechanism called a "Module Manager" can be dynamically extended (i.e. a new evaluator can be added without recompiling or even restarting the system). This approach is primarily used in, for example, a Microsoft Windows™ environment. In another embodiment, a platform-independent, CORBA-based execution mechanism is used that relies on an Evaluator Invocation Repository (based on CORBA's DII, the Dynamic Invocation Interface).

The key idea of the language is to express the relationship between the results of various evaluators, and their relationship to the truth of decision criteria in a user query. In one embodiment, the two primary methods for combining criteria are: (i) Statistical, based on a weighted average, and (ii) Intuitive Logic, based on an extension of multi-valued logic (similar to a generalized version of fuzzy logic) with preferences for individual criteria.
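The two combination methods can be illustrated with a minimal Python sketch. The description does not give the exact operators, so this sketch assumes an importance-weighted mean for the statistical method and the common min/max/complement connectives of fuzzy logic for the logical method. All function names are hypothetical.

```python
from typing import Sequence


def weighted_average(truths: Sequence[float], weights: Sequence[float]) -> float:
    """Statistical combination: importance-weighted mean of truth values."""
    if len(truths) != len(weights) or not truths:
        raise ValueError("need exactly one weight per truth value")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(t * w for t, w in zip(truths, weights)) / total


def fuzzy_and(*truths: float) -> float:
    """Assumed conjunction: the weakest criterion bounds the result."""
    return min(truths)


def fuzzy_or(*truths: float) -> float:
    """Assumed disjunction: the strongest criterion decides the result."""
    return max(truths)


def fuzzy_not(truth: float) -> float:
    """Assumed negation: complement on the [0, 1] scale."""
    return 1.0 - truth
```

Under these assumptions, a candidate that satisfies one criterion to degree 0.9 and another to degree 0.4 scores 0.4 under `fuzzy_and`, while `weighted_average` lets a higher importance on the first criterion pull the combined score toward 0.9.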

The above-described system provides several distinct benefits and advantages. First is the ability to query over multiple types of data. For example, the underlying data can be text, structured, qualitative, or quantitative. Second, multiple types of analyses can be combined in a single query. For instance, the underlying analysis mechanisms can be text search, data access, collaborative filtering, prediction, or other analyses. Third is dynamic extensibility. New forms of analysis based on new evaluators can be dynamically added to the system without significant cost or disruption. Fourth, importance values can be assigned to individual decision criteria. Each criterion in a decision can have an associated importance value. Fifth is the ability to use partially specified decision criteria. Information in a criterion can be specified imprecisely. Terms such as "High", "Low", and "Aggressive" may be used instead of precise, domain-specific quantities.

System Implementation

The system and methods disclosed herein may be implemented using a general-purpose computer, such as an IBM PC or compatible machine. For example, embodiments of the present system may be implemented using the C++ and Java programming languages and executed on such a computer.

In one embodiment, the implementation infrastructure consists of an object oriented programming language, a lexical analyzer, and a parser. Typical language interpretation systems take one or more files as input. The contents of the input files are logically concatenated and made available to the lexical analyzer. The lexical analyzer analyzes the input stream of characters and, based on user-defined regular expressions, outputs a sequence of tokens. The sequence of tokens is then made available to the parser. The parser uses the grammar defined by the programmer to analyze the input stream of tokens and build a parse tree called an abstract syntax tree. The abstract syntax tree is then traversed to perform various operations depending on the needs of the language designer.
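The lexical analysis stage described above can be sketched in Python with the standard `re` module. This is only an illustration of regular-expression-driven tokenizing: the token names and the reduced pattern set are assumptions and cover only a small subset of what a UAL lexer would need.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# (kind, pattern) pairs tried in order; FLOAT precedes INTEGER so that
# "0.75" is not split into "0", ".", "75". Names are illustrative only.
TOKEN_SPEC = [
    ("SKIP", r"[ \t\n]+"),
    ("FLOAT", r"[0-9]+\.[0-9]+"),
    ("INTEGER", r"[0-9]+"),
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("OP", r"[(){}:,*@.]"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from the character stream, skipping whitespace."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())
```

A parser would then consume this token sequence and build the abstract syntax tree; that stage is omitted here.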

Object oriented programming (OOP) technology allows encapsulation of both data and algorithms into object classes. OOP allows programmers to associate methods (functions or procedures) and variables (which may themselves be instances of objects) with object classes. Object classes can inherit both behavior (i.e., methods) and data from parent super-classes. Classes that inherit properties from parent classes are called sub-classes. In typical OOP languages, a sub-class can be used in any context where its parent class can be used. Object instances are created just as basic primitive types like characters or integers are created in conventional languages. Object instances can be named and used just like any other variables. OOP languages provide special syntax to refer to object instance variables and methods.

In one embodiment of the invention, there is defined a set of Object Classes and Special Functions for defining, using, and combining Criteria in useful ways. For example, one embodiment includes the following components:

1. Parser: The parser takes text definitions for one or more Criteria as input. The input is parsed to create a set of Criteria Definition Objects - one for each input text definition.

2. Object Manager: The object manager serves as a repository for criteria definition objects, values (such as literals and definitions), and the expressions used to evaluate values for individual Criteria.

3. Module Manager: The module manager allows retrieval of external function objects given only the name of the function as input. This component also allows for functions and object classes to be dynamically added to the system.

4. Virtual Table: The Virtual Table provides all the available candidate data. This component unifies access to candidate information that may come from various data sources and data types.

5. Special Functions/Classes: These classes were developed for the end user who will define Criteria. These functions and classes make the process of developing Criteria more efficient and easier.

Parser

The parser accepts a stream of Criteria Definitions in text format and builds a set of Criteria Definition Objects. One aspect of the present invention includes a special language called Criteria-Script (UAL) that is tailored to make it extremely easy and efficient to define and use Criteria.

Table 1 provides an overview of the principal classes that are used to build the UAL parser.

To facilitate the definition and use of Criteria, a domain specific language called UAL was developed. UAL has the following important features:

1. UAL allows Criteria to be easily defined. Criteria can return fuzzy-truth values or any real (float) valued result.

2. UAL provides conversion operators for mapping between real values and fuzzy truth-values.

3. UAL provides special support for dealing with fuzzy truth-values and fuzzy expressions (expressions using logical/arithmetic operators that return a fuzzy truth-value).

4. UAL provides special support to access and manipulate information from the Virtual Table mechanisms (see below).

5. UAL provides an efficient mechanism to add functions at run time that may then be immediately used to define new Criteria.

6. UAL is easy to use and has simple intuitive semantics that are tailored to defining and combining Criteria results in a safe and efficient manner.

Object Manager

The object manager (OM) was developed to provide convenient storage and access to data structures (i.e. objects, literal constants, names, etc.) that are necessary to support the implementation mechanisms for defining and evaluating Criteria. The object manager's functionality is available through an application programming interface (API) (i.e., behavior that can be invoked from external functions, objects, procedures, etc.). Table 2 outlines the key methods supported by the object manager.

Module Manager

The Module Manager allows functions to be dynamically added and retrieved to support the execution (i.e., interpretation) of the expressions contained in Criteria. The following classes and functions are used to support the capabilities of the module manager. Table 3 outlines the key sub-components needed to implement the module manager.
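The idea of adding and retrieving functions by name at run time can be sketched as a small Python registry. This is not the DLL- or CORBA-based mechanism of the embodiments. The class and method names are hypothetical, and `importlib.import_module` stands in for loading an external library without restarting the program.

```python
import importlib
from typing import Callable, Dict


class ModuleManager:
    """Hypothetical name-to-function registry that can grow at run time."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable] = {}

    def register(self, name: str, function: Callable) -> None:
        """Make a callable available under the given name."""
        self._functions[name] = function

    def load(self, name: str, module_name: str, attribute: str) -> None:
        """Import a module by name at run time and register one of its callables."""
        module = importlib.import_module(module_name)
        self.register(name, getattr(module, attribute))

    def lookup(self, name: str) -> Callable:
        """Retrieve a function object given only its name."""
        try:
            return self._functions[name]
        except KeyError:
            raise LookupError(f"no function registered as {name!r}") from None
```

For example, `manager.load("Sqrt", "math", "sqrt")` makes the standard-library square root available to later `lookup("Sqrt")` calls, and `register` can add a locally defined function such as an average.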

Virtual Table

The Virtual Table (VT) is designed to hold all candidate-related information. Because candidate information can come from multiple sources and be of multiple types, the VT was designed as a table whose cells are object classes. The VT is accessed by a candidate identifier that must be unique: each row of the VT corresponds to one and only one candidate. The implementation of the VT is supported by the classes listed and described in Table 4.

The VT provides a consistent, well-defined interface for the language execution mechanism to access data and information in flexible ways.
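A minimal Python sketch of such a table follows, assuming a dictionary keyed by candidate identifier whose cells can hold any object. The class name, method names, and the sample values in the usage note are hypothetical and are not taken from Table 4.

```python
from typing import Any, Dict, List


class VirtualTable:
    """One row per unique candidate identifier; cells may hold any object."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Any]] = {}

    def merge(self, candidate_id: str, fields: Dict[str, Any]) -> None:
        """Add fields from one data source to the candidate's single row."""
        self._rows.setdefault(candidate_id, {}).update(fields)

    def field(self, candidate_id: str, name: str) -> Any:
        """Return one cell, e.g. the 'One Year Return' of a candidate."""
        return self._rows[candidate_id][name]

    def column(self, name: str) -> List[Any]:
        """Return the named field for every candidate that has it."""
        return [row[name] for row in self._rows.values() if name in row]

    def candidates(self) -> List[str]:
        """Return all candidate identifiers in insertion order."""
        return list(self._rows)
```

Calling `merge("KAUFX", {"One Year Return": 12.0})` from a database reader and `merge("KAUFX", {"Notes": "..."})` from a text source leaves a single KAUFX row that holds both a number and a document (values made up for illustration).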

Special Classes/Functions

In one embodiment of the present invention, two types of special Object Classes and Functions are provided:

1. A set of functions to make it easier to write and use Criteria definitions.

2. A set of functions and classes to make the implementation of the system easier and more efficient.

Table 5 lists the functions designed and developed for defining and using Criteria:

Table 6 lists additional classes and functions that helped to make the implementation easier and more efficient.

Method of Operation

Referring now to Figure 1, there is shown a flow chart of a method for integrating heterogeneous information using a universal analysis language (UAL).

The method begins by creating 102 a requirement or query. A requirement is an expression in UAL. The requirement may be created using an editor or other means.

Thereafter, the method continues by parsing 104 the requirement by means of the parser.

After the requirement is parsed, the method continues by checking 106 or validating the results of the UAL parsing of the requirement. If the results are not correct, i.e. the requirement does not conform to the UAL grammar, the method is complete. Otherwise, the method continues by creating 108 an ordered sequence of evaluations, using the available evaluators, necessary to execute the requirement.

After the ordered sequence is created, the method continues by invoking 110 the evaluators to execute the evaluation sequence as determined in step 108. Thereafter, the results obtained from the evaluators are combined 112 using the appropriate UAL logic to produce the results of the requirement. The result is an ordering of decision candidates for the given requirement.
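The flow of Figure 1 can be sketched end to end in Python. This is a simplified illustration under stated assumptions: the toy requirement syntax "A AND B" is a stand-in for UAL, evaluators are plain functions returning a truth value in [0, 1], and AND is taken as the minimum. All names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Candidate = Dict[str, float]
Evaluator = Callable[[Candidate], float]  # returns a truth value in [0, 1]


def parse(requirement: str, evaluators: Dict[str, Evaluator]) -> Optional[List[str]]:
    """Steps 104 and 106: parse a toy 'A AND B' requirement and validate it.

    The 'A AND B' syntax is a stand-in for illustration, not the UAL grammar.
    """
    names = [part.strip() for part in requirement.split(" AND ")]
    if any(name not in evaluators for name in names):
        return None
    return names


def execute(
    requirement: str,
    evaluators: Dict[str, Evaluator],
    candidates: Dict[str, Candidate],
) -> List[Tuple[str, float]]:
    """Steps 104 to 112: return candidates ordered by combined truth value."""
    names = parse(requirement, evaluators)
    if names is None:
        return []  # step 106 failed: the method is complete
    sequence = [evaluators[name] for name in names]  # step 108
    ranked = []
    for candidate_id, candidate in candidates.items():
        results = [evaluate(candidate) for evaluate in sequence]  # step 110
        ranked.append((candidate_id, min(results)))  # step 112: AND as min (assumed)
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

The returned list corresponds to the ordering of decision candidates. An invalid requirement yields an empty result, mirroring the early exit after step 106.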

Operational Examples

Example Query

The following example illustrates how a sample query is executed. The query is represented in the system as a logical combination of criteria. The intent of the query is for a user to find mutual funds based on their criteria. In this case the criteria (with their values) are:

1. One Year Return, High

2. Risk Adjusted Return, Very High

3. Analysis, Long Term Capital Appreciation

4. Rating by people like me, High

5. Similar To, KAUFX

The first step is to determine an optimal evaluation sequence for these criteria based on the meta information available. The first criterion, for instance, uses the structured data access evaluator. The data relevant to One Year Return for each candidate is retrieved from the appropriate source and the evaluator compares this information to the definition of High. The structured data access evaluator accesses and evaluates structured database information. It takes a multivalued concept and returns the extent of the match between the concept and the actual information. "High" may be defined as "> 15%", in which case this comparison is trivial. Alternatively, it may be defined using truth functions such as those shown in Figure 2.

Alternatively, High may mean a better One Year Return than the average One Year Return of all the Funds known. Alternatively, it may mean better performance than, say, the S&P 500 index. The internal definition of the concept can be arbitrarily complex. This evaluator analyzes the extent to which the given, actual information matches the desired value.

For the criterion "long term capital appreciation", the specified evaluator may be a string similarity evaluator. This evaluator takes the argument string and the source document for the candidate, which may be an analyst report, and returns an evaluation. This evaluation is converted into a truth value for this criterion by looking up the rules for using this evaluator in the Universal Analysis Model. The criterion "Rating by people like me" employs a collaborative filter, such as Firefly or NetPerceptions' GroupLens. Similarly, depending on the execution sequence for the remaining criteria, each evaluation mechanism is invoked with the appropriate parameters.

Following this, the results from the evaluators are combined in accordance with their relationship as specified in the UAM, and the combined results, i.e. the ranked set of funds matching the query, are returned to the user. These results are documentable, so that a user is able to drill down into the rationale for the decision recommendation, all the way down to the actual information from the source.

The sequence for executing a query is: (i) Determine an optimal execution sequence, (ii) Run the evaluators on the appropriate criteria, (iii) Use the various evaluators to convert the evaluators' results into the truth of the criteria, and (iv) Combine the results into a single value for each candidate, in this case mutual funds, given the entire query.
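The string similarity evaluator and its conversion into a truth value can be sketched in Python. The description does not say how similarity is computed, so this sketch assumes a word-level comparison using the standard library's `difflib.SequenceMatcher`. The thresholds 0.05 and 0.1 are borrowed from Criterion Definition Example 3, and the linear ramp between them is an assumption.

```python
from difflib import SequenceMatcher


def string_similarity(phrase: str, document: str) -> float:
    """Word-level similarity in [0, 1]; an assumed stand-in for the evaluator."""
    phrase_words = phrase.lower().split()
    document_words = document.lower().split()
    return SequenceMatcher(None, phrase_words, document_words).ratio()


def ramp(x: float, zero_at: float, one_at: float) -> float:
    """Map a raw score to a truth value: 0 below zero_at, 1 above one_at."""
    if x >= one_at:
        return 1.0
    if x <= zero_at:
        return 0.0
    return (x - zero_at) / (one_at - zero_at)


def seeks_capital_appreciation(notes: str) -> float:
    """Truth of the criterion for one candidate's analyst notes."""
    similarity = string_similarity("seeks capital appreciation", notes)
    return ramp(similarity, zero_at=0.05, one_at=0.1)
```

The design point is the separation of concerns: the evaluator produces a raw score in its own units, and a separate rule converts that score into a truth value that can be combined with the results of other evaluators.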

To keep the explanation simple, two assumptions are made in this query which the UAL does not require:

1. None of the criteria has any preference or importance associated with it.

2. Each criterion is combined with an implicit AND connective.

Figure 3 illustrates a screen view of a query.

Criterion Definition Example 1

The example below defines a Criterion for One Year Return. The Criterion defines four values. All the values are evaluated with respect to a user's input. Thus a "good" one year return is between 0.75 and 1.25 times the input desired return, with a drop-off of 10% of the desired value at the two extremes. Thus if the user enters "10" as the desired value, a good one year return is between 7.5% and 12.5%. Anything less than 6.5% is definitely not good, and one year returns between 6.5% and 7.5% are good to some degree, "more good" as they approach 7.5%. In a similar manner, the definition includes other Criterion values such as "VeryGood", "Great", and "Extraordinary".

OneYearReturn (@desiredReturn : Float) : Fuzzy
{
    value Good ( )
    {
        Pi (candidate.'One Year Return',
            @desiredReturn * 0.75,
            @desiredReturn * 1.25,
            @desiredReturn * 0.1,
            @desiredReturn * 0.1)
    }
    value VeryGood ( )
    {
        Pi (candidate.'One Year Return',
            @desiredReturn * 1.35,
            @desiredReturn * 1.75,
            @desiredReturn * 0.1,
            @desiredReturn * 0.1)
    }
    value Great ( )
    {
        Pi (candidate.'One Year Return',
            @desiredReturn * 1.85,
            @desiredReturn * 2.5,
            @desiredReturn * 0.1,
            @desiredReturn * 0.1)
    }
    value Extraordinary ( )
    {
        candidate.'One Year Return' > @desiredReturn * 2.5
    }
}
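The behavior described for Pi in Example 1 is consistent with a trapezoidal truth function. The Python sketch below assumes the argument order (value, lower bound, upper bound, left drop-off width, right drop-off width) and assumes that the drop-off is linear; neither is stated explicitly in the description.

```python
def pi(x: float, low: float, high: float, left: float, right: float) -> float:
    """Trapezoidal truth function (assumed shape).

    Returns 1.0 on [low, high], falls linearly to 0.0 over a band of width
    `left` below `low` and of width `right` above `high`, and is 0.0 elsewhere.
    """
    if low <= x <= high:
        return 1.0
    if x < low:
        if left <= 0 or x <= low - left:
            return 0.0
        return (x - (low - left)) / left
    if right <= 0 or x >= high + right:
        return 0.0
    return ((high + right) - x) / right


def good_one_year_return(actual: float, desired: float) -> float:
    """The 'Good' value of Example 1 for a desired return, e.g. 10 (percent)."""
    return pi(actual, desired * 0.75, desired * 1.25, desired * 0.1, desired * 0.1)
```

With a desired return of 10, a fund returning 10% is fully "Good" (truth 1.0), a fund returning 7% is "Good" to degree 0.5 because it lies halfway through the 6.5% to 7.5% band, and a fund returning 6% is not "Good" at all (truth 0.0).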

Criterion Definition Example 2

In the following example, there is defined a Criterion that is selective in an inverse manner. It defines a Mediocre fund to be one whose "OneYearReturn" (the definition of Example 1 above) is "Good" or "Very Good" but not "Great" or "Extraordinary".

MediocreFund ( ) : Fuzzy
{
    (OneYearReturn for (Average (working set.'One Year Return')) is Good
     OR OneYearReturn for (Average (working set.'One Year Return')) is VeryGood)
    AND NOT
    (OneYearReturn for (Average (working set.'One Year Return')) is Great
     OR OneYearReturn for (Average (working set.'One Year Return')) is Extraordinary)
}

Criterion Definition Example 3

The following function uses text analysis to evaluate a Criterion.

StringSimilarity returns a floating-point number that describes the degree to which the two input strings are similar. The result of StringSimilarity is fed to Pi to return a fuzzy result.

SeeksCapitalAppreciation ( ) : Fuzzy
{
    Pi (StringSim (candidate.Notes, "seeks capital appreciation"),
        0.1, 1.0, 0.05, 0.0)
}

Universal Analysis Language (UAL) Grammar

The following is a BNF grammar for the Universal Analysis Language (UAL), according to one embodiment of the present invention.

//
// MACRO - contains definitions of common types
//
%macro
{squote}        '\'';
{float}         '[0-9]+\.[0-9]+(e[\-+][0-9]+)?';
{truthValue}    '[10]\.[0-9]+[Tt]';
{integer}       '[0-9]+';
{simpleId}      '[A-Za-z_][A-Za-z_0-9]*';
{complexId}     '{squote}[A-Za-z_0-9][A-Za-z_0-9 ]*{squote}';

//
// EXPRESSION - contains REGEXP definitions
//
%expression Main
'[ \t\n]+'                      %ignore;
'{simpleId}|{complexId}'        ID;
'\"[^\n\"]*\"'                  STRING_LITERAL;
'\('                            L_PAREN, '(';
'\)'                            R_PAREN, ')';
'\{'                            L_BRACE, '{';
'\}'                            R_BRACE, '}';
'\['                            L_BRACKET, '[';
'\]'                            R_BRACKET, ']';
':'                             COLON, ':';
','                             COMMA, ',';
'\|'                            VERT_LINE, '|';
'&'                             AMPERSAND, '&';
'\*'                            STAR, '*';
'\/'                            SLASH, '/';
'\+'                            PLUS, '+';
'>'                             GT, '>';
'<'                             LT, '<';
'>='                            GE, '>=';
'<='                            LE, '<=';
'!='                            NE, '!=';
'='                             EQUAL, '=';
'-'                             MINUS, '-';
'%'                             PERCENT, '%';
'#'                             POUND, '#';
'!'                             EXCL, '!';
'\$'                            DOLLAR, '$';
'~'                             TILDA, '~';
'@'                             AT, '@';
'\.'                            PERIOD, '.';
{float}                         FLOAT_LITERAL;
{integer}                       INTEGER_LITERAL;
{truthValue}                    TRUTH_LITERAL;
'[Aa][Nn][Dd]'                  AND;
'[Oo][Rr]'                      OR;
'[Nn][Oo][Tt]'                  NOT;
'[Ff][Oo][Rr]'                  FOR;
'[Aa][Ll][Ll]'                  ALL;
'[Ii][Nn]'                      IN;
'[Ww][Hh][Ee][Rr][Ee]'          WHERE;
'value'                         VALUE;
'candidate'                     CANDIDATE;
'working set'                   WORKING_SET;
'universal set'                 UNIVERSAL_SET;
'row'                           ROW;
'with importance'               WITH_IMPORTANCE;

//
// PRECEDENCE
//
%prec
// Logical Ops (Lowest priority)
1, '&', %left;
1, '|', %left;
2, WITH_IMPORTANCE, %left;
// Logical NOT (higher than other logical ops)
3, NOT, %right;
3, '!', %right;
// Boolean comparison ops
4, '!=', %left;
4, '<', %left;
4, '>', %left;
4, '<=', %left;
4, '>=', %left;
4, '=', %left;
// Additive Ops
5, '+', %left;
5, '-', %left;
// Multiplicative Ops
6, '*', %left;
6, '/', %left;

//
// PRODUCTION - grammar productions with start symbol = start
//
%production start
Start1              start -> criteria_definition;
Start2              start -> start criteria_definition;
//
// Criteria Definition
//
CritDef             criteria_definition -> identifier '(' typed_param_list ')' ':' type_identifier criteria_body;
//
// Parameter Declaration List
//
TypedParamList      typed_param_list -> parameter_decl ',' typed_param_list;
TypedParamList1     typed_param_list -> parameter_decl;
TypedParamList0     typed_param_list -> ;
ParameterDec        parameter_decl -> '@' identifier ':' type_identifier;
//
// Criteria Body
//
CriteriaBody1       criteria_body -> '{' value_definition_list '}';
CriteriaBody2       criteria_body -> '{' expression '}';

//
// Value Definition List
//
ValueDefList        value_definition_list -> value_def value_definition_list;
ValueDefList1       value_definition_list -> value_def;
ValueDef            value_def -> VALUE identifier '(' typed_param_list ')' '{' expression '}';
//
// Expression
//
Expression1         expression -> term;
ExpressionAnd       expression -> expression and term;
ExpressionOr        expression -> expression or term;
//
// Term
//
TermPlus            term -> term '+' term;
TermMinus           term -> term '-' term;
TermTimes           term -> term '*' term;
TermDivideBy        term -> term '/' term;
TermWIMP            term -> term WITH_IMPORTANCE term;
TermGT              term -> term '>' term;
TermLT              term -> term '<' term;
TermNE              term -> term '!=' term;
TermEQ              term -> term '=' term;
TermLE              term -> term '<=' term;
TermGE              term -> term '>=' term;
TermNot             term -> not term;
Term                term -> literal;

//
// Literal
//
//LiteralNot        literal -> not literal;
LiteralFormula      literal -> formula;
LiteralExp          literal -> '(' expression ')';
//
// Formula
//
FormulaCrit         formula -> criteria_usage;
FormulaFunc         formula -> function_usage;
FormulaConst        formula -> constant_value;
FormulaSelect       formula -> field_expr;
FormulaParam        formula -> '@' identifier;
//
// Criteria Usage
//
CritUse1            criteria_usage -> identifier;
CritUse2            criteria_usage -> identifier FOR '(' argument_list ')';
CritUse3            criteria_usage -> identifier IS identifier;
CritUse4            criteria_usage -> identifier IS identifier '(' argument_list ')';
CritUse5            criteria_usage -> identifier FOR '(' argument_list ')' IS identifier;
CritUse6            criteria_usage -> identifier FOR '(' argument_list ')' IS identifier '(' argument_list ')';
//
// Field Expression - an expression that selects/returns/operates on fields
//
FieldExprSel        field_expr -> field_base field_selection;
FieldExprFunc       field_expr -> field_base field_func;
FieldExprIndex      field_expr -> field_base field_index;
FieldBaseCand       field_base -> CANDIDATE;
FieldBaseID         field_base -> '#' ID;
FieldBaseWS         field_base -> WORKING_SET;
FieldBaseUS         field_base -> UNIVERSAL_SET;
FieldBaseFunc       field_base -> function_usage;
FieldBaseRow        field_base -> ROW;
FieldBaseParam      field_base -> '~' ID;
FieldBaseExpr       field_base -> field_expr;
FieldBaseFilter     field_base -> '(' ALL '~' ID IN WORKING_SET WHERE '(' expression ')' ')';
//
// Field Terms
//
FieldTermSel        field_selection -> '.' identifier;
FieldTermFunc1      field_func -> '(' argument_list ')';
FieldTermIndex1     field_index -> '[' expression ']';

//
// Function Usage
//
FuncUsage           function_usage -> identifier '(' argument_list ')';
//
// Argument List
//
ArgList             argument_list -> argument_list ',' expression;
ArgList1            argument_list -> expression;
ArgListEmpty        argument_list -> ;
//
// Binary Operators
//
//BOPTimes          binary_op -> '*';
//BOPDivide         binary_op -> '/';
//BOPPlus           binary_op -> '+';
//BOPMinus          binary_op -> '-';
//BOPWithImp        binary_op -> WITH_IMPORTANCE;
//BOPGT             binary_op -> '>';
//BOPLT             binary_op -> '<';
//BOPNE             binary_op -> '!=';
// More binary ops to be defined
//
// Type Identifiers
//
TypeIdentifier      type_identifier -> ID; // Need to add more flexible type ids
//
// Constant Value
//
ConstFloat          constant_value -> FLOAT_LITERAL;
ConstInt            constant_value -> INTEGER_LITERAL;
ConstString         constant_value -> STRING_LITERAL;
ConstTruth          constant_value -> TRUTH_LITERAL;
//
// Miscellaneous Productions
//
Identifier          identifier -> ID;
And1                and -> '&';
And2                and -> AND;
Or1                 or -> '|';
Or2                 or -> OR;
Not1                not -> '!';
Not2                not -> NOT;

The above description is included to illustrate the operation of the preferred embodiments and is not meant to limit the scope of the invention. The scope of the invention is to be limited only by the following claims. From the above discussion, many variations will be apparent to one skilled in the art that would yet be encompassed by the spirit and scope of the present invention.

What is claimed is:

Claims

1. A computer-implemented method for querying multiple different types of information, each type of information having a different evaluator, the method comprising: receiving a query comprising an identification of at least two evaluators, at least one relationship between the evaluators, and a method of combining results from the evaluators; parsing the query to create an evaluation sequence comprising an ordered sequence of invocations of the evaluators; invoking the evaluators in the evaluation sequence; and combining results from the evaluators according to the method of combining results from the evaluators specified in the query.

2. The method of claim 1, wherein the parsing step includes the substep of: validating the query against a grammar; wherein the invoking and combining steps are performed only if the query is successfully validated.

3. The method of claim 1, wherein the types of information are selected from the group consisting of structured information, unstructured information, qualitative information, and quantitative information.

4. The method of claim 1, wherein the evaluators are configured to perform a function from the group consisting of searching, prediction, collaborative filtering, and data mining.

5. A system for querying multiple different types of information, each type of information having a different evaluator, the system comprising: a parser for receiving a query comprising an identification of at least two evaluators, at least one relationship between the evaluators, and a method of combining results from the evaluators; means, coupled to the parser, for creating an evaluation sequence comprising an ordered sequence of invocations of the evaluators; means, coupled to the creating means, for invoking the evaluators in the evaluation sequence; and means, coupled to the invoking means, for combining results from the evaluators according to the method of combining results from the evaluators specified in the query.

6. The system of claim 5, wherein the types of information are selected from the group consisting of structured information, unstructured information, qualitative information, and quantitative information.

7. The system of claim 5, wherein the evaluators are configured to perform a function from the group consisting of searching, prediction, collaborative filtering, and data mining.

8. A computer-readable medium having computer-readable program code modules embodied therein for querying multiple different types of information, each type of information having a different evaluator, the computer-readable medium comprising: computer-readable program code modules configured to receive a query comprising an identification of at least two evaluators, at least one relationship between the evaluators, and a method of combining results from the evaluators; computer-readable program code modules configured to parse the query to create an evaluation sequence comprising an ordered sequence of invocations of the evaluators; computer-readable program code modules configured to invoke the evaluators in the evaluation sequence; and computer-readable program code modules configured to combine results from the evaluators according to the method of combining results from the evaluators specified in the query.