WO2004077213A2 - System and method for parsing queries for objects irrespective of server functionality - Google Patents

System and method for parsing queries for objects irrespective of server functionality Download PDF

Info

Publication number
WO2004077213A2
WO2004077213A2 PCT/IN2004/000024 IN2004000024W WO2004077213A2 WO 2004077213 A2 WO2004077213 A2 WO 2004077213A2 IN 2004000024 W IN2004000024 W IN 2004000024W WO 2004077213 A2 WO2004077213 A2 WO 2004077213A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
repository
instruction
server
parser
Prior art date
Application number
PCT/IN2004/000024
Other languages
French (fr)
Other versions
WO2004077213A3 (en
Inventor
Vinayak K. Rao
Original Assignee
Vaman Technologies (R & D) Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vaman Technologies (R & D) Limited filed Critical Vaman Technologies (R & D) Limited
Publication of WO2004077213A2 publication Critical patent/WO2004077213A2/en
Publication of WO2004077213A3 publication Critical patent/WO2004077213A3/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9032Query formulation

Definitions

  • the present invention provides a software-implemented process, system, and method for use in a computing environment.
  • Lexical Analysis which involves converting a character string into a token string, by identifying each character, one-at-a-time, using a set of rules as a standard feature of all existent parsing methodologies.
  • the present invention adopts a unique way of interpreting an object and breaking the syntax information into an object interface based on operation specified in the syntax for the object with various options and values.
  • the parser decomposes a query statement into Operation / Object, Options, Values.
  • the general classification for any command is either of SQL or OQL or XQL nature.
  • the operations embedded in these command strings can be broadly sub classified as Data Definition Language (DDL) or Data Manipulation Language (DML) or Data Control Language (DCL) etc.
  • DDL Data Definition Language
  • DML Data Manipulation Language
  • DCL Data Control Language
  • FSM Finite State Machine
  • Fig .1 is flow diagram illustrating the manner in which the current invention decomposing a query statement into Objects, Operations, Options and Values
  • Fig. 2 is a block diagram of a parser illustrating the various functional block.
  • Fig. 3 is a flow diagram illustrating the process in which the parser carries the parsing using the various parser architectural blocks.
  • Fig. 4, 5, 6 and 7 are the graphical user interface depicting the three different operations such as the CREATE as depicted in figure 4 and 7, ALTER as depicted in figure 5 and DROP as depicted in figure 6 on an Object "ROLE" is defined with various Options and Values.
  • Fig 1 there is shown a manner in which the present invention decomposes a query statement into Objects, Operations, Options and Values.
  • these language implementation varies, which are proprietary as per the functional scope of the object and features supported by functionality. This gives rise to disparate syntaxes, which do not give portability across languages and vendors.
  • the present invention devised a way of interpreting any object or sub object in any functional server across vendors and operating systems by analyzing the object in four different parameters.
  • the preferred embodiment of the present invention defines and interprets any object as a virtual interface, which has all features and syntactical functionalities independent of vendor implementation or OS features. The present invention then analyzes and accumulated these words used for different operations on the object with various options. A lot of similarities occur because of ANSI SQL standards but a lot differ too as some operational and functional features are termed with different marketing/vendor jargons. The present invention then needs to map these dissimilar words but similar functionalities and derive upon a substitution mechanism and its related linked words.
  • the problem still remains of isolating server interpretation of these tokens as these tokens may go on varying with same vendors but subsequent versioning of their products.
  • the present invention designed a generic object interface linking features and function and isolating terminologies used by vendors.
  • the parser creates an interface object rather than a token table and updates this interface object as per vendor terminologies.
  • the interface object thus finally has all supported features and functionalities across architectures of implementation irrespective of vendor specific deviations related to object functioning with specific options and values specified for specific operations.
  • Indexes 3 Isolates all operations which can be performed on the object such as Data Definition Language
  • DDL Data Manipulation Language
  • DML Data Manipulation Language
  • Size option for any object can accept floating data types (i.e. number with decimal accuracy) but may require a translation if specified as 10.2M into bytes because M / G/ T can be interpreted as Megabyte, Gigabyte, Terabyte.
  • the dictionary design and content predicts and associates only those objects upon which these operations can be performed 102.
  • Subsequent parsing tries to isolate the object in the query by analyzing reserved words in dictionary like Table, Index, Database etc. 1030nce the operation and object is isolated 104 the parser shrinks the current dictionary words context to only those relevant words with respect to the object & operation 105.
  • the parser then analyzes the remaining words in the query to isolate options 106 relevant to the object and prepares the interface object linking features of the objects with options specified and associated related values along with options 107.
  • the parser transforms these expressions and translates to the relevant data type as dictated by dictionary definition, (for example size requires number in bytes and the query string can accept a floating point).
  • dictionary definition for example size requires number in bytes and the query string can accept a floating point.
  • options, which are required minimum for the specified operation to function are missing in the query defaults are assumed and the interface buffer is updated.
  • the next step involves isolating object, which can be follows as shown by diagram. The isolation of object type and operation stated earlier restricts again the dictionary to a small set of options / properties supported by the object and operation stated and creates an anticipated dictionary because options nature of data (string / date / time float etc) are dictated by the object schema itself.
  • Validation of the values is carried out depending upon the anticipated value and value from the dictionary 108. After the values are found to be valid the object interface is updated with defaults 109. Depending on the nature of the default values, the isolation of the object 106 or the isolation of the word 101 is carried out.
  • a block diagram of the parser configured to include a Lexical Analyzer 200, Syntax Analyzer and Validator 202, Vendor Independent Syntax Translator 204, Expression Analyzer and Evaluator 206, Interface Translator 208, Cache Manager 210, an Object Schema 212, a Message or Error 214.
  • the Parser undertakes the process of analyzing a request, checking its syntactical and logical validity for a perfect command and preparing the requisite buffer (typically a token table / tree) for further processing, which the target system can understand and work upon.
  • a parser validates queries received by the server and checks the syntax of the query or command given by the Client, irrespective of the source of request (ODBC / web / mail clients). If the query received is found valid, then the parser module generates the appropriate interface object for it and sends a buffer to the Server for further processing. If the query is found invalid, the parser returns an error code and error string to the caller.
  • the Vendor Independent Syntax Translator (VIST) 204 is configured to check for vendor specific syntax and sends for syntax analysis. VIST uses the dictionary to derive and instantiate "object interface", which has "elements" based on options for the specified object relating to all features for the object irrespective of any vendor supporting that object in its functional server.
  • the Lexical Analyzer 200 is configured to convert an input stream of characters into stream of words or tokens.
  • the Syntax Analyzer 202 is configured to check SQL statement for syntactical validation that the SQL confirms to the rules of SQL standard and all keywords or reserved words and operators are valid and correctly used.
  • the Syntax Analyzer 202 is configured to generate an Object Interface as per options and values specified in query. It functions to convert this object interface to a contiguous buffer if needed by remote clients.
  • the Expression Evaluator 206 is configured to check for the Expressions, Conditions and Functions and add it in Expression Tree.
  • the Cache Manager 210 is configured to reduce the work of DML statements by caching needed information in the parse dictionary tree. So the execution of DML statements becomes faster. Any previous literals or identifiers which are not a part of standard dictionary words but maybe specific to application object names are cached and hence the subsequent parsing of the same words does not yield an identifier but a specific application object.
  • the Object Schema 212 or the Dictionary plays a vital role in parsing any query since it contains all reserved word, symbol and functional details.
  • the data dictionary has two part - one which has been designed and mapped and is static and other dynamic which is derived run time upon usage of the dictionary during parsing,
  • the static designed part has "object interfaces" defined and designed based on following observation: Each functional server was decomposed into its components and similar / dissimilar properties of each component outweighed with its frequency of usability, features and functionality associated with syntactical and semantic requirements, resource needs, associated algorithms or functional process needs, persistence requirements, dependency on regional or environmental settings, stress requirements for concurrent or distributed needs etc...
  • an interface element was prepared for each object of a database, which fulfilled every vendor functional need across versions of their products and operating systems available.
  • This interface served in multiple ways: Firstly it isolated the server from any syntactical issues which very vendor, version or OS specific. Secondly it served a guideline for parsing needs to update these interface elements as per commands or options in queries fired by users. Hence a very independent development process of server or parser could be accomplished because as 'n' when newer vendor compatibility had to be added the server functionality remained unaffected.
  • Four probabilistic of combinations were possible based on two permutation variables - functionality and syntax: Similar functionality but dissimilar syntaxes: (SFDS) - Basically this was the most common combination possible for generic and indispensable objects like table, index etc...
  • the parser uses a concept of dynamic dictionary, which updates the dictionary words real time with names of the objects (which had classified earlier as identifiers) so that subsequent parsing directly returns the object information and saves a lot of validation & searches for the server using the parser.
  • the Dynamic Dictionary optimizes the parsing technique for increasing speed of the Server. This technique is used for storing the header information for a particular table i.e. validates the query and stores required information into a tree. Now when the parser encounters the same query, it searches for the information in the dynamic dictionary and it simply refills the buffer and sends it to the Server.
  • the interface object is prepared by the Interface translator arranging object options and values as specified in the request operation.
  • the Error Handler 214 is configured to check if any error occurs during parsing. If such errors occur, it gives an error message to the caller along with an error code.
  • Fig. 3 is a flow diagram illustrating the process in which the parser carries the parsing using the various parser architectural blocks.
  • the Lexical Analyzer 200 isolates the query data 302 verifying each character and creating a delimited words, which are verified for existence in dictionary.
  • the query is checked for grammar that is for symmetrical objects such as formula formats and the open and close braces.
  • the purpose for verifying it character by character is to isolate reserved characters or frequency dependent character occurrences such as an expression where braces have to be properly terminated or even the identifier data that are encapsulated in vendor specific delimiters.
  • the date is specified as #12/11/1970#, string data as "DATA" where # or " are the delimiters.
  • the Lexical Analyzer 200 is not successful in isolating the query, the query is given 304 to the Error Handler 214.
  • the Syntax and Semantic Analyzer 202 carries out the verification of words 306 wherein the sequences of words, parameters before or after the word and their presentation specified are isolated and their grammar validated. Words, which are not found in the dictionary, are isolated as identifiers, which need to be validated later.
  • the Expression checking is carried out 308.
  • the query has an expression such as terminator or arithmetic operations, logical operations can be take care of.
  • the expression is then checked for accuracy 310.
  • the query is given 304 to Error Handler 214.
  • the query does not have an expression, it is checked for object isolation 312.
  • the separated object is further operated upon 314.
  • the Object environment is checked 316 for environmental objects such as Spool or Cursor etc.
  • the above process describes the VIST block of the parser architecture.
  • a validity check for Operations supported on the objects is carried out 318. In case the operation is invalid it is reported to the error handler 304.
  • the checking for valid option as per operation and output is carried out 320.
  • the event of an invalid option as per operation and output it is reported to the error handler 304.
  • the result of the operation is called projection that is output.
  • 'select * ' is a valid operation only on the table or view in a database and hence it cannot be performed on a database.
  • the Object Attribute Translator (OAT) translates object Data as per the features supported by the vendor 322.
  • the OAT maps jargons with functionality. For example suppose for every object different vendors support different features.
  • a Matrix consisting of options and corresponding values is created after object attribute translation 324. This generated Option-Value matrix is used to fill interface.
  • the Dynamic Cache or dictionary updates the words in real time with names of the objects 326 so that subsequent parsing directly returns the object information and hence saves a lot of validation and searches for the server using the parser.
  • the value is checked for any required translation 328 consider size option for any object which can accept floating data types but may require a translation if size is specified as 10.2M that is 10.2*1024*1024 bytes since M or G or T can be interpreted as Megabyte, Gigabyte or Terabyte.
  • the validation for value is carried out 330.
  • the value translation is required it is carried out as may be the required by the value332.
  • the Value Validator carries out the validation of the values example path specified should be a valid path. In the event of an invalid value it is handed to the Error Handler 304. In the event of valid values the object data is translated as per the schema 334. The translation of the Object Data is per the object types that is view, table and as per rights available to the user that is user scope.
  • the Interface Translator does the translation of the vendor specific object to the interface 336. After the Interface Translator is made the Object Interface Buffer Converter picks up say from SQL and puts it in buffer 338. Finally the Object interface is created 340.
  • FIG. 4 depicts the graphical user interface wherein the three different operations such as the CREATE as depicted in figure 4 and 7, ALTER as depicted in figure 5 and DROP as depicted in figure 6 on an Object "ROLE" is defined with various Options and Values.
  • the Figures 4, 5, 6 and 7 depicts the options and values defined while creating ROLE object and corresponding mapping of a value of an option based on one of the attributes i e. datatype of the option.
  • the figure depicts string option values, which is selected from a predefined set of values. Also the figure shows the rule engine that the rule scripts dictate and are saved in the Lookup Table i.e. the server

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates generally to a system and method of a parsing mechanism of one or a plurality of objects, irrespective of the server features and functionality. More particularly, the present invention relates to a system and method of interpreting accessing mechanisms (typically SQL or scripts) of objects and classifying them into paradigms of operations, objects, options and values.

Description

TITLE OF INVENTION
SYSTEM AND METHOD FOR PARSING QUERIES FOR OBJECTS IRRESPECTIVE OF SERVER FUNCTIONALITY
BACKGROUND OF THE INVENTION In most applications written for a server, the major sub-objects of the server technologies used are Tables and Indexes. These objects formed the basic building blocks for any server objects and every other sub-object, which is a part of server entity is in some way related or archived in this format. For example, every part of any server metadata, which has various views or procedures or triggers or sequences etc are finally saved and manipulated as server records. However, the way syntaxes are defined to access these objects or sub-objects in servers due to differences in the object or sub-object features of different vendors cause applications to be non-portable and vendor specific. Hence, applications written for one server vendor cannot work for the other.
Organizations increasingly require access to data residing in multiple sources in multiple vendor servers throughout the enterprise. However, the diversity of languages and the lack of a standardized server protocol for interoperability have made this difficult and costly. Different servers are not able to understand the meaning or the context of the data that is being exchanged between them. Currently, when data has to be transferred from one server to another, it has to be carried out manually because server languages are incompatible or server architectures are incompatible. As a result of this, companies are required to buy applications, which does not help them economically integrate servers in order to effectively utilize their IT resources.
Further, all modern Web sites integrate with live data from databases. These databases may include product specifications, customer data, office addresses, and other valuable information. Currently, this integration is made possible with the help of middleware technologies such as ColdFusion, Active
Server Pages (ASP), CGI scripts written in Perl or Java. However there exists no method or system whereby queries from such scripts can be received by a RDBMS, without the use of middleware. There exists a need for a system and method by which all web-related scripts can be integrated with a server, without the need for an intermediate conversion, thereby reducing costs and increasing the functionality and efficiency of the server.
A need exists for a system and method by which queries from a Client can be validated independent of the Server functionality, features, syntaxes and thereby reducing overheads and increasing the efficiency of the server. Accordingly, a need exists for a system and method by which queries received from diverse Server languages can be validated and processed by the data server, without the need to manually transfer data from one server to another server. Further, for an application to work seamlessly, what is required is a system and method to analyze each server object or Sub object the application used and in the manner of programming syntaxes known to the application developer. In addition, a need exists for a system and method by which queries from a Client can be validated independent of the Server functionality, features, syntaxes and thereby reducing overheads and increasing the efficiency of the server. Lastly, there exists a need for a system and method by which all web-related scripts can be integrated with a server, without the need for an intermediate conversion, thereby reducing costs and increasing the functionality and efficiency of the server.
SUMMARY OF THE INVENTION
To meet the foregoing needs of parsing queries for objects irrespective of server functionalities and in accordance with the purpose of the invention as broadly described herein, the present invention provides a software-implemented process, system, and method for use in a computing environment.
One of the major hurdles is application and data migration is syntax differences adapted by various vendors for implementing the same features and functionality. Currently object syntaxes are parsed by a process known as Lexical Analysis, which involves converting a character string into a token string, by identifying each character, one-at-a-time, using a set of rules as a standard feature of all existent parsing methodologies. The present invention adopts a unique way of interpreting an object and breaking the syntax information into an object interface based on operation specified in the syntax for the object with various options and values. The parser decomposes a query statement into Operation / Object, Options, Values. The general classification for any command is either of SQL or OQL or XQL nature. The operations embedded in these command strings can be broadly sub classified as Data Definition Language (DDL) or Data Manipulation Language (DML) or Data Control Language (DCL) etc. Once the operation is isolated the set of reserved words, which can pertain to an object, are the only scope on which subsequent words in the command can follow. Hence a large part of the dictionary, which may have to be traversed or searched, shrinks to a finite set based on operation supported by those specific objects. The basic mechanism of interpretation and using any object requires a medium of communication and a common language for communication between the object and object user.
The entire design is based on state machines and modules comprising of various events communication via messages that is it is event driven using Finite State Machine (FSM) concept, the functionality is broken down into a series of events scheduled by kernel.
BRIEF DESCRIPTION OF THE DRAWINGS The various objects and advantages of the present invention will become apparent to those of ordinary skill in the relevant art after reviewing the following detailed description and accompanying drawings, wherein:
Fig .1 is flow diagram illustrating the manner in which the current invention decomposing a query statement into Objects, Operations, Options and Values
Fig. 2 is a block diagram of a parser illustrating the various functional block.
Fig. 3 is a flow diagram illustrating the process in which the parser carries the parsing using the various parser architectural blocks. Fig. 4, 5, 6 and 7 are the graphical user interface depicting the three different operations such as the CREATE as depicted in figure 4 and 7, ALTER as depicted in figure 5 and DROP as depicted in figure 6 on an Object "ROLE" is defined with various Options and Values.
DETAILED DESCRIPTION OF THE INVENTION
While the present invention is susceptible to embodiment in various forms, there is shown in the drawings and will hereinafter be described a presently preferred embodiment with the understanding that the present disclosure is to be considered an exemplification of the invention and is not intended to limit the invention to the specific embodiment illustrated.
In the present disclosure, the words "a" or "an" are to be taken to include both the singular and the plural. Conversely, any reference to plural items shall, where appropriate, include the singular.
Referring now to the drawing particularly in Fig 1 there is shown a manner in which the present invention decomposes a query statement into Objects, Operations, Options and Values. Across vendors these language implementation varies, which are proprietary as per the functional scope of the object and features supported by functionality. This gives rise to disparate syntaxes, which do not give portability across languages and vendors. Hence the present invention devised a way of interpreting any object or sub object in any functional server across vendors and operating systems by analyzing the object in four different parameters.
1 ) Operation(s) performed on the object.
2) Object(s) on which operations are performed.
3) Option(s) specified in the language to perform operations on the objects. 4) Value(s) given to the options during the course of operation. The preferred embodiment of the present invention defines and interprets any object as a virtual interface, which has all features and syntactical functionalities independent of vendor implementation or OS features. The present invention then analyzes and accumulated these words used for different operations on the object with various options. A lot of similarities occur because of ANSI SQL standards but a lot differ too as some operational and functional features are termed with different marketing/vendor jargons. The present invention then needs to map these dissimilar words but similar functionalities and derive upon a substitution mechanism and its related linked words. The problem still remains of isolating server interpretation of these tokens as these tokens may go on varying with same vendors but subsequent versioning of their products. Hence the present invention designed a generic object interface linking features and function and isolating terminologies used by vendors. The parser creates an interface object rather than a token table and updates this interface object as per vendor terminologies. The interface object thus finally has all supported features and functionalities across architectures of implementation irrespective of vendor specific deviations related to object functioning with specific options and values specified for specific operations.
To derive a generic object parser for a server the present invention follows these steps:
1 ) Decomposes the collective server functionality into objects and sub objects.
2) Relates functionality or features, which are interdependent and arrive upon a hierarchy of objects and sub objects in a parent child relation. For example if table is an object then, columns form the sub object, constraints for the columns form the sub objects of columns. Also Indexes can work only on columns.
Table l_ Columns r" Constraints
Indexes 3) Isolates all operations which can be performed on the object such as Data Definition Language
(DDL) and Data Manipulation Language (DML) (CREATE, ALTER, DROP, SELECT, INSERT, UPDATE, DELETE) 4) Isolates options than can be given as per specific operations and their syntactical and semantic sequences. 5) Isolates the values for options during operations and derive upon their basic data types required for server functionality with or without translation. Ex: Size option for any object can accept floating data types (i.e. number with decimal accuracy) but may require a translation if specified as 10.2M into bytes because M / G/ T can be interpreted as Megabyte, Gigabyte, Terabyte. 6) Derives the minimum, maximum or default values with expected presentation logic so that these values cannot crash the server assumption and in case options are specified without any values defaults can be assumed. For example the size parameter cannot have negative values hence a minimum size for an object is decided as per the object functionality requirements and even the extreme values (maximum) is sealed so that the wrong input from user is blocked or validated against any absurd parameter. The presentation logic dictates any discrepancies, which may arise like dd/mm/yyyy, or mm/dd/yyyy can have valid interchangeable values under 13 but can have drastic failures for business logic.
Constraints that had to be taken care while designing modules of parser:
For implementing the above design an object dictionary had to be first derived with these interpretation taking care that as words in dictionary increases with more and more vendor compatibility the parsing time should remain constant irrespective of complexities of various syntax interpretations. Hence a unique approach of dynamic shrinking dictionary is adapted which upon isolating operation and objects resolves the entire dictionary scope to a very limited number of words and enhances parsing speed. This can be summarized as shown in Fig.1. As soon as a query is received from a client 100, the operation(s) involved in the query is isolated 101. Usually the first word is the operation itself with more to follow down the context. (For example Select, Insert, Update, Delete, Create, Alter, Drop ). The dictionary design and content predicts and associates only those objects upon which these operations can be performed 102. Subsequent parsing tries to isolate the object in the query by analyzing reserved words in dictionary like Table, Index, Database etc. 1030nce the operation and object is isolated 104 the parser shrinks the current dictionary words context to only those relevant words with respect to the object & operation 105. The parser then analyzes the remaining words in the query to isolate options 106 relevant to the object and prepares the interface object linking features of the objects with options specified and associated related values along with options 107. In case values specified for an option requires translation or requires being resolved like an expression the parser transforms these expressions and translates to the relevant data type as dictated by dictionary definition, (for example size requires number in bytes and the query string can accept a floating point). In case options, which are required minimum for the specified operation to function are missing in the query defaults are assumed and the interface buffer is updated. The next step involves isolating object, which can be follows as shown by diagram. The isolation of object type and operation stated earlier restricts again the dictionary to a small set of options / properties supported by the object and operation stated and creates an anticipated dictionary because options nature of data (string / date / time float etc) are dictated by the object schema itself. Validation of the values is carried out depending upon the anticipated value and value from the dictionary 108. After the values are found to be valid the object interface is updated with defaults 109. Depending on the nature of the default values, the isolation of the object 106 or the isolation of the word 101 is carried out.
Designing interfaces for objects, which were common across vendor architectures (ex Table object) required first to analyze following parameters
• like functionalities with similar syntaxes
• like functionalities with dissimilar syntaxes vendor specific object functionality dependent on server architecture vendor specific object functionality independent on server architecture
OS specific features used for some object feature enhancements
Syntax similarity but differed functionality
Syntax clashes with similar or dissimilar functionalities
Other objects or sub objects affected by adapting a common path wherever contentions arise
Hence the object schema finally derived after understanding all these implementation constraints behaves like a universal object having all the functional features and properties supported by various vendor architectures irrespective of varied syntactical accessing mechanisms.
Referring now to the drawing particularly in Fig 2, there is shown a block diagram of the parser configured to include a Lexical Analyzer 200, Syntax Analyzer and Validator 202, Vendor Independent Syntax Translator 204, Expression Analyzer and Evaluator 206, Interface Translator 208, Cache Manager 210, an Object Schema 212, a Message or Error 214.
The Parser undertakes the process of analyzing a request, checking its syntactical and logical validity for a perfect command and preparing the requisite buffer (typically a token table / tree) for further processing, which the target system can understand and work upon.
A parser validates queries received by the server and checks the syntax of the query or command given by the Client, irrespective of the source of request (ODBC / web / mail clients). If the query received is found valid, then the parser module generates the appropriate interface object for it and sends a buffer to the Server for further processing. If the query is found invalid, the parser returns an error code and error string to the caller. The Vendor Independent Syntax Translator (VIST) 204 is configured to check for vendor specific syntax and sends for syntax analysis. VIST uses the dictionary to derive and instantiate "object interface", which has "elements" based on options for the specified object relating to all features for the object irrespective of any vendor supporting that object in its functional server.
The Lexical Analyzer 200 is configured to convert an input stream of characters into stream of words or tokens.
The Syntax Analyzer 202 is configured to check SQL statement for syntactical validation that the SQL confirms to the rules of SQL standard and all keywords or reserved words and operators are valid and correctly used. The Syntax Analyzer 202 is configured to generate an Object Interface as per options and values specified in query. It functions to convert this object interface to a contiguous buffer if needed by remote clients.
The Expression Evaluator 206 is configured to check for the Expressions, Conditions and Functions and add it in Expression Tree.
The Cache Manager 210 is configured to reduce the work of DML statements by caching needed information in the parse dictionary tree. So the execution of DML statements becomes faster. Any previous literals or identifiers which are not a part of standard dictionary words but maybe specific to application object names are cached and hence the subsequent parsing of the same words does not yield an identifier but a specific application object.
The Object Schema 212 or the Dictionary plays a vital role in parsing any query since it contains all reserved word, symbol and functional details. The data dictionary has two part - one which has been designed and mapped and is static and other dynamic which is derived run time upon usage of the dictionary during parsing, The static designed part has "object interfaces" defined and designed based on following observation: Each functional server was decomposed into its components and similar / dissimilar properties of each component outweighed with its frequency of usability, features and functionality associated with syntactical and semantic requirements, resource needs, associated algorithms or functional process needs, persistence requirements, dependency on regional or environmental settings, stress requirements for concurrent or distributed needs etc... Accordingly an interface element was prepared for each object of a database, which fulfilled every vendor functional need across versions of their products and operating systems available. This interface served in multiple ways: Firstly it isolated the server from any syntactical issues which very vendor, version or OS specific. Secondly it served a guideline for parsing needs to update these interface elements as per commands or options in queries fired by users. Hence a very independent development process of server or parser could be accomplished because as 'n' when newer vendor compatibility had to be added the server functionality remained unaffected. Four probabilistic of combinations were possible based on two permutation variables - functionality and syntax: Similar functionality but dissimilar syntaxes: (SFDS) - Basically this was the most common combination possible for generic and indispensable objects like table, index etc... The terminologies used but various vendors were varying but internal functionality supported in spite of syntactical changes were same. Similar functionality and similar syntaxes: (SFSS) - This combination was the least of the issues to be tackled as percentage compliance of standards ANSI SQL 92/99 adhered by the respective product vendor led to this possibility. Dissimilar functionality and dissimilar syntaxes: (DFDS) - These objects / features were either vendor specific added to enhance performance or was a vertical specific need. This lead to exclusivity of application compliance to a vendor or product version and was the primary culprit for vendor lockin. This combination also led to death of customers of products when the providing vendor eloped or vanished. Also this was the main cause of data migration problem as exclusivity of a feature not catering to any standard led to portability issue and effort loss. Dissimilar functionality and similar syntaxes: (DFSS) - This was the trickiest part of implementation and the primary reason to bridge and create homogeneity in heterogeneity. Once the static part of the dictionary is defined each word contained in the query given to the parser is checked for existence in the dictionary. If a word is found in the dictionary then it is replaced with a value stored in the dictionary. During the course of parsing words, which fail to match dictionary contents are classified as identifiers. These identifiers maybe user defined objects of sub objects whose occurrence recurs very frequently as the use of the object accessing real time with more and more clients accessing the object concurrently. Hence the parser uses a concept of dynamic dictionary, which updates the dictionary words real time with names of the objects (which had classified earlier as identifiers) so that subsequent parsing directly returns the object information and saves a lot of validation & searches for the server using the parser. The Dynamic Dictionary optimizes the parsing technique for increasing speed of the Server. This technique is used for storing the header information for a particular table i.e. validates the query and stores required information into a tree. Now when the parser encounters the same query, it searches for the information in the dynamic dictionary and it simply refills the buffer and sends it to the Server.
After the complete lexical and syntax or semantic analysis is complete the Operation and Object is isolated and based on the object schema definition the interface object is prepared by the Interface translator arranging object options and values as specified in the request operation.
The Error Handler 214 is configured to check if any error occurs during parsing. If such errors occur, it gives an error message to the caller along with an error code.
Fig. 3 is a flow diagram illustrating the process in which the parser carries the parsing using the various parser architectural blocks. As soon as a command or a query is received for parsing 300 the Lexical Analyzer 200 isolates the query data 302 verifying each character and creating a delimited words, which are verified for existence in dictionary. In other words the query is checked for grammar that is for symmetrical objects such as formula formats and the open and close braces. The purpose for verifying it character by character is to isolate reserved characters or frequency dependent character occurrences such as an expression where braces have to be properly terminated or even the identifier data that are encapsulated in vendor specific delimiters. For example the date is specified as #12/11/1970#, string data as "DATA" where # or " are the delimiters. In the event the Lexical Analyzer 200 is not successful in isolating the query, the query is given 304 to the Error Handler 214. In the event after the lexical analysis is successfully completed, the Syntax and Semantic Analyzer 202 carries out the verification of words 306 wherein the sequences of words, parameters before or after the word and their presentation specified are isolated and their grammar validated. Words, which are not found in the dictionary, are isolated as identifiers, which need to be validated later.
In the event after the successful completion of the Syntax or Semantic Analysis, the Expression checking is carried out 308. In the event the query has an expression such as terminator or arithmetic operations, logical operations can be take care of. The expression is then checked for accuracy 310. In the event of an incorrect expression the query is given 304 to Error Handler 214. In the event correct expressions are present, the checking is carried out for object isolation 312. For example consider a query 'select * from tablename or view with expression like A=10 AND B*2=5' where A and B are columns of the table abc. The value of the expression cannot be obtained till the value of B that is a column of the table abc is known. Hence isolation of object such as table or view is done. Similarly, in the event that the query does not have an expression, it is checked for object isolation 312. After the isolation of object 312 the separated object is further operated upon 314. In the event that the object cannot be isolated, the Object environment is checked 316 for environmental objects such as Spool or Cursor etc. The above process describes the VIST block of the parser architecture. In the next step, a validity check for Operations supported on the objects is carried out 318. In case the operation is invalid it is reported to the error handler 304. In the event the operation for the object is valid then the checking for valid option as per operation and output is carried out 320. In the event of an invalid option as per operation and output it is reported to the error handler 304. The result of the operation is called projection that is output. For example 'select * ' is a valid operation only on the table or view in a database and hence it cannot be performed on a database. In the event of a valid option, the Object Attribute Translator (OAT) translates object Data as per the features supported by the vendor 322. The OAT maps jargons with functionality. For example suppose for every object different vendors support different features.
A Matrix consisting of options and corresponding values is created after object attribute translation 324. This generated Option-Value matrix is used to fill interface. The Dynamic Cache or dictionary updates the words in real time with names of the objects 326 so that subsequent parsing directly returns the object information and hence saves a lot of validation and searches for the server using the parser. Next the value is checked for any required translation 328 consider size option for any object which can accept floating data types but may require a translation if size is specified as 10.2M that is 10.2*1024*1024 bytes since M or G or T can be interpreted as Megabyte, Gigabyte or Terabyte. In the event the value does not need translation, the validation for value is carried out 330. In the event the value translation is required it is carried out as may be the required by the value332.
The Value Validator carries out the validation of the values example path specified should be a valid path. In the event of an invalid value it is handed to the Error Handler 304. In the event of valid values the object data is translated as per the schema 334. The translation of the Object Data is per the object types that is view, table and as per rights available to the user that is user scope. The Interface Translator does the translation of the vendor specific object to the interface 336. After the Interface Translator is made the Object Interface Buffer Converter picks up say from SQL and puts it in buffer 338. Finally the Object interface is created 340.
The accompanying figures 4, 5, 6 and 7 depicts the graphical user interface wherein the three different operations such as the CREATE as depicted in figure 4 and 7, ALTER as depicted in figure 5 and DROP as depicted in figure 6 on an Object "ROLE" is defined with various Options and Values. The Figures 4, 5, 6 and 7 depicts the options and values defined while creating ROLE object and corresponding mapping of a value of an option based on one of the attributes i e. datatype of the option.
The figure depicts string option values, which is selected from a predefined set of values. Also the figure shows the rule engine that the rule scripts dictate and are saved in the Lookup Table i.e. the server

Claims

object repository outside the server or parser code. Hence the user can add or modify rules based on newer vendor syntaxes or features. As long as the object interface does not change the parser rule engine can take care of any syntactical and semantic issues hence more and more vendor compliance can be enforced and VIST can extend current vendor syntax compliance to any level.What is claimed is:
1. A method for instruction and data parsing, sourced from disparate functional servers having different syntax, formats and features, comprising:
creating a repository of symbols, words and rules based on lexical, semantic and syntactical parameters of predetermined server languages;
creating an object interface for pre-determined objects based on syntactical parameters such thai said object interface has all elements to be compatible with the features of said object;
providing for a plurality of input means for receiving said instruction and said data in predetermined format;
analyzing said instruction and said data lexically thereby providing the first set of results, based on the relevant information in said repository for the required server languages;
analyzing said instruction and said data semantically thereby providing the first set of results, based on the corresponding information in said repository for the required server languages;
analyzing said instruction and said data syntactically thereby providing the first set of results, based on the corresponding information in said repository for the required server languages; and mapping the third set of instructions to each element of the said object interface;
whereby providing for functionally interchanging across objects and data.
2. The method according to claim 1 , where said invention can parse and validate queries from a client independent of the server functionality, features, syntaxes, irrespective of the diversity of the language (such as SQL, OQL, XQL, scripts such as ASP, JSP) and protocol(s) used.
3. The method according to claim 1 , wherein the repository is a multi-dimensional matrix.
4. The method according to claim 1 , wherein the method provides for optimial parsing using the third set of results and replacing the relevant data where said instruction has been parsed earlier, whereby avoiding the need to re-parse the instruction.
5. The method according to claim 1 , wherein the third set of result is used for analyzing the expression.
6. The method according to claim 1 , wherein the predetermined server languages include C, C++,
ASP, SQL etc
7. The method according to claim 1 , where said parser provides for a mechanism to parse and interpret said instructions and data irrespective of type and nature of client request by using OOOV concept for decomposing the query, this method of decomposing comprises of the following steps:
Associate an operation with feature and functionality;
Associate an object; Associate options to call back function; and Associate value.
8. The method according to claim 1 , wherein said output generated an object interface definition that acts as a standard interface for these objects.
9. The method according to claim 1 , where said invention is operable in any environment because the object addressability lies outside said parser, wherein the said environment could be using any hardware or software.
10. The method according to claim 1 , where said Object Interface of said invention can work with and accommodate new vendor syntax and semantics because the Rule dictionary lies outside the parser.
11. The method according to claim 1 , where said Object Interface of the invention can work with any programming or scripting language.
12. The method according to claim 1 , where said invention allows customization of said parser in live environment.
13. The method according to claim 1 , where said invention is operable simultaneously in legacy as well as non-legacy environments.
14. The method according to claim 1 , where said invention provides for error handling.
15. The system for parsing queries, comprising:
a means to create a repository of symbols, words and rules based on lexical, semantic and syntactical parameters of predetermined server languages; a means to create an object interface for pre-determined objects based on syntactical parameters such that said object interface has all elements to be compatible with the features of said object;
an input means to provide for receiving said instruction and data in predetermined format;
a means to analyze said instruction and data lexically thereby provide for first set of results, based on the relevant information in said repository for the required server languages;
a means to analyze said instruction and data semantically thereby provide for second set of results, based on the corresponding information in said repository for the required server languages;
a means to analyze said instruction and data syntactically thereby provide the third set of results, based on the corresponding information in said repository for the required server languages; and
a means to map said third set of instructions to each element of the said object interface.
16. A system according to claim 15, further comprising of a repository means.
17. A system according to claim 15, further comprising of an error handling means.
18. The system according to claim 15, the system provides for means of seamless migration without any need to even recompile the application source.
19. The system according to claim 15, the system analyzes resource requirement patterns and functional mechanism associated with said data.
20. The system for parsing, comprising:
a input means to receive request in any format;
a memory means to store any metadata, heuristic, rules, object etc;
a means for lexical analysis coupled with a means to provide vendor independent syntax translation;
a means for semantic analysis coupled with a means to provide vendor independent syntax translation;
a means for syntactical analysis coupled with a means to provide vendor independent syntax translation; and
a means to generate and translate the object interface.
whereby said parser provides for mapping said features (options and values) specified in a query for a specific object and operation to a heuristics map, which quantifies the resources usage for each option and value specified in the query.
PCT/IN2004/000024 2003-01-30 2004-01-29 System and method for parsing queries for objects irrespective of server functionality WO2004077213A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN119MU2003 2003-01-30
IN119/MUM/2003 2003-01-30

Publications (2)

Publication Number Publication Date
WO2004077213A2 true WO2004077213A2 (en) 2004-09-10
WO2004077213A3 WO2004077213A3 (en) 2005-05-19

Family

ID=32922933

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2004/000024 WO2004077213A2 (en) 2003-01-30 2004-01-29 System and method for parsing queries for objects irrespective of server functionality

Country Status (1)

Country Link
WO (1) WO2004077213A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001008028A2 (en) * 1999-07-23 2001-02-01 Koninklijke Philips Electronics Nv Timing shell automation for hardware macro-cell modeling
US6434629B1 (en) * 1988-05-23 2002-08-13 Hewlett-Packard Co. Computing system which implements recording and playback of semantic commands

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6434629B1 (en) * 1988-05-23 2002-08-13 Hewlett-Packard Co. Computing system which implements recording and playback of semantic commands
WO2001008028A2 (en) * 1999-07-23 2001-02-01 Koninklijke Philips Electronics Nv Timing shell automation for hardware macro-cell modeling

Also Published As

Publication number Publication date
WO2004077213A3 (en) 2005-05-19

Similar Documents

Publication Publication Date Title
US11907247B2 (en) Metadata hub for metadata models of database objects
US8086998B2 (en) transforming meta object facility specifications into relational data definition language structures and JAVA classes
US7512614B2 (en) System and method for data ETL in a data warehouse environment
US7921416B2 (en) Formal language and translator for parallel processing of data
US11762852B2 (en) Metadata-based translation of natural language queries into database queries
US8447744B2 (en) Extensibility platform using data cartridges
US8458164B2 (en) Query model tool and method for visually grouping and ungrouping predicates
WO2019237333A1 (en) Converting database language statements between dialects
CN112235311B (en) OVSDB client code automatic generation method, system, device and medium
US11995075B2 (en) System and method for efficient transliteration of machine interpretable languages
CN111694846B (en) Separation mode distributed storage process implementation method based on Type 2JDBC driver
CN114764558A (en) SQL dialect conversion method, device, system and storage medium
WO2004077213A2 (en) System and method for parsing queries for objects irrespective of server functionality
CN114089980A (en) Programming processing method, device, interpreter and nonvolatile storage medium
US11100286B2 (en) Methods and systems for implied graph patterns in property chains
WO2021081228A1 (en) Enhanced natural language processing with semantic shortcuts
Ruschitzka Heterogeneous data translations based on environment grammars
CN116756727B (en) Data authority control method and device, electronic equipment and storage medium
EP4266205A1 (en) Logical pointers supporting reuse of text translations
US20240134833A1 (en) Relational database system
EP4155968A1 (en) Identification and import of metadata for extensions to database artefacts
EP3617900A1 (en) Methods and systems for implied graph patterns in property chains
CN113626423A (en) Log management method, device and system of service database
WO2004077217A2 (en) System and method of object query analysis, optimization and execution irrespective of server functionality
Cleve et al. D6. 1 Cross-Database Data Migration Techniques Analysis

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase