WO2003012690A1 - Adaptable database runtime kernel - Google Patents

Adaptable database runtime kernel Download PDF

Info

Publication number
WO2003012690A1
WO2003012690A1 PCT/NO2002/000274
Authority
WO
WIPO (PCT)
Prior art keywords
data
module
database
code
storage
Prior art date
Application number
PCT/NO2002/000274
Other languages
French (fr)
Inventor
Jan-Thore BJØRNEMYR
Bjørn-Harald SJØGREN
Original Assignee
Berg-Jacobsen Holding As
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Berg-Jacobsen Holding As filed Critical Berg-Jacobsen Holding As
Priority to EP02746220A priority Critical patent/EP1421517A1/en
Publication of WO2003012690A1 publication Critical patent/WO2003012690A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases

Definitions

  • the invention relates in general to database technology, and more specifically to a computer system for providing access to a database upon a request from an application computer program.
  • the invention relates to such a system which comprises a data model represented in a data dictionary, an application program interface, a data storage, and a database kernel providing access from the application program interface to the data storage.
  • the invention also relates to a method for improving such a database system, a database kernel in such a database system and a code generator in such a database system.
  • meta-information is often referred to as the data model.
  • the meta-information may be viewed as structural information, and a description of how the real information has to be structured.
  • Fig. 1 is a schematic block diagram illustrating the major components in a prior art database system.
  • the database runtime environment consists of these parts:
  • Data storage 120 which is the physical storage of data, i.e. data files.
  • - Database kernel 110 serving as a database runtime and storage engine, which is the actual data interpreter.
  • On-line Data dictionary 130 that holds the meta-information.
  • External application programs 150 access the database system via an application program interface (not indicated) to the kernel 110.
  • Database definition or schema/data model 160 (not part of the database) is interpreted by the schema compiler and stored in the data dictionary 130 to allow the database kernel 110 to act in accordance to the data model.
  • the data storage 120 is the place where the actual data are kept. Often it uses the operating system's standard file system, but it is not unusual to have special file access in order to speed up data transfer to and from disk. Usually the data storage is platform dependent, which means that data stored on one platform can't be transferred to another platform without data conversion.
  • Meta-information is, as mentioned above, information about the information. It describes all tables, columns, fields, etc., which may be mentally viewed as the table layout.
  • the data dictionary also holds information about data types for the columns and domain restrictions for these columns.
  • NAME is a character string consisting of the characters 'a'-'z', 'A'-'Z', '.', ' ' and '-', and INCOME is a positive integer.
  • CURRENCY is one of: 'USD', 'NOK', 'EUR', 'JPY', 'GBP', 'SEK', 'DKK'.
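Domain restrictions of this kind can be illustrated with ordinary validation code. The sketch below is not from the patent; the function names are assumptions chosen for illustration.

```c
#include <string.h>

/* Illustrative validation of the NAME, INCOME and CURRENCY domain
   restrictions described above (function names are hypothetical). */

static int valid_name_char(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           c == '.' || c == ' ' || c == '-';
}

/* NAME: non-empty string of 'a'-'z', 'A'-'Z', '.', ' ' and '-'. */
int valid_name(const char *s) {
    if (*s == '\0') return 0;
    for (; *s != '\0'; s++)
        if (!valid_name_char(*s)) return 0;
    return 1;
}

/* INCOME: a positive integer. */
int valid_income(long income) {
    return income > 0;
}

/* CURRENCY: one of a fixed set of codes. */
int valid_currency(const char *code) {
    static const char *codes[] =
        { "USD", "NOK", "EUR", "JPY", "GBP", "SEK", "DKK" };
    for (size_t i = 0; i < sizeof codes / sizeof codes[0]; i++)
        if (strcmp(code, codes[i]) == 0) return 1;
    return 0;
}
```

In the invention, checks like these are compiled into the generated runtime module rather than interpreted from the dictionary at request time.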
  • the data dictionary also holds information about the different constraint types as primary keys, foreign keys, subset constraints, exclude constraints and other constraints.
  • the dictionary and database runtime should be able to handle all types of constraints (in addition to those mentioned above), but current market leading systems only handle primary keys, unique keys, foreign keys and mandatory columns.
  • the data dictionary 130 also contains information about external views. This means information about how the information is to be presented to a specific application. One application may wish to view just the 'Name' and 'Address' columns of the table 'Person', while another application wishes to view the 'Project', 'Assignment' and 'Person' tables as a compound table 'Project-Assignment'. Many data dictionaries also contain user definitions and user authorization information.
  • the database runtime module & storage engine 110 is the heart of the database system. It handles data retrieval requests and update requests. In order to fulfill any request, the runtime system has to consult the data dictionary 130. First, the runtime module 110 has to consult the dictionary 130 in order to validate the application's request. It has to check that the requested data is known by the application 150 (or should be known), and do a mapping to the underlying data model. It then has to consult the data dictionary 130 in order to determine how the actual request to the storage engine should be expressed. When the result from the storage engine is returned, the database runtime once more has to consult the dictionary 130 in order to do a transformation of the retrieved data to fit the application's expectations with respect to naming and structure.
  • the database runtime module 110 has to run a consistency check of the database based on the rules stored in the data dictionary. For each rule found in the data dictionary 130, the runtime module 110 has to analyze the rule and run a consistency check. This is very complicated and time consuming. All market-leading database vendors therefore do this kind of consistency control after every single update, to prevent a large backlog of consistency controls from building up. Most importantly, unless the controls are carried out immediately, the functionality required to do the control after the entire transaction has been carried out will be very complex.
  • because of everything the runtime module 110 has to handle (consistency checking, authorization, etc.), the module is oversized, which involves an efficiency problem.
  • the dictionary compiler or schema compiler 140 checks the data model for consistency. If the data model is consistent it stores the data model information in the data dictionary.
  • the database runtime module 110 has to dynamically validate the request, and dynamically create an execution plan. In order to do that, the system has to send a lot of enquiries to the data dictionary 130 and interpret the results. This has to be done for every single request. All the enquiries to the data dictionary 130 create a lot of overhead and decrease performance significantly. It also requires the database runtime module 110 to be constructed such that all types of data models can be handled. As a result, there is a lot more program code than necessary for most data models, and the runtime module 110 is oversized for most applications.
  • DBMS (Database Management Systems)
  • Current DBMSs normally solve a lot of problems that are not required by the applications. That is, an application rarely makes use of all the offered functionality at the same time. Different applications require different functionality. The result is that a DBMS offers a lot more functionality than requested, while at the same time much required functionality is not offered. Compared to the offered functionality, the footprints of current DBMSs are fairly reasonable, but compared to the required functionality they are oversized. For desktop use this is no problem, but for handheld applications and embedded systems it is a major problem. Handheld and embedded devices have very limited resources, so small-footprint databases are vital. It is a further object of the present invention to ease database and application programming, particularly to ease constraint handling and thereby reduce application complexity.
  • a computer system for providing access to a database upon a request from an application computer program comprising - a data model represented in a data dictionary,
  • said database kernel comprises a database runtime module for providing access from the application program interface to the database kernel, said runtime module (210) being dynamically changeable, dependent on the data model (260) represented in the data dictionary, and - a storage engine module for providing access from the runtime module to the data storage, said storage engine module being invariable, independent of said data model.
  • the system further comprises an automatic code generator which is adapted to generate an executable program code for the runtime module based upon the data model represented in the data dictionary.
  • the program code for the runtime module is preferably generated by the code generator subsequent to any amendments in the data model.
  • the code generator preferably comprises a source code generating module and a compiler module, said code generating module being adapted to generate a source code to be processed by the compiler module, thus generating said executable program code.
  • the code generating module is further preferably adapted for receiving structural data provided from the data dictionary, for receiving syntactical data provided from a template, and for processing said structural data in accordance with rules defined by the syntactical data, thus producing source code adapted for input to the compiler module.
  • the code generating module is advantageously adapted to generate descriptive code or procedural code, dependent on the syntactical data provided from the template.
  • the storage engine is preferably adapted to offer the runtime module a platform independent access to the data stored in the data storage.
  • the storage engine is also adapted to store and retrieve data elements of a first data structure, comprising an unordered set of data, and data elements of a second data structure, comprising an ordered set of data.
  • the storage engine is also preferably adapted to provide an identifier in each data element of said first data structure, and to provide data elements of said second structure wherein each entry has a set of such associated identifiers.
  • a method for improving a database system for providing access to a database upon a request from an application computer program comprising
  • a database kernel providing access from the application program interface to the data storage
  • the method being characterized by the steps of providing in the database kernel a database runtime module for providing access from the application program interface to the database kernel, said runtime module being dynamically changeable, dependent on the data model represented in the data dictionary, and providing in the database kernel a storage engine module for providing access from the runtime module to the data storage, said storage engine module being invariable, independent of said data model.
  • the invention also relates to a database kernel in a computer system for providing access to a database upon a request from an application computer program, said system comprising
  • said database kernel providing access from the application program interface to the data storage
  • said database kernel comprising a database runtime module for providing access from the application program interface to the database kernel, said runtime module being dynamically changeable, dependent on the data model represented in the data dictionary, and a storage engine module for providing access from the runtime module to the data storage, said storage engine module being invariable, independent of said data model.
  • the invention relates to an automatic code generator in a computer system for providing access to a database upon a request from an application computer program, said system comprising - a data model represented in a data dictionary,
  • a database kernel providing access from the application program interface to the data storage
  • said database kernel comprising a database runtime module for providing access from the application program interface to the database kernel, said runtime module being dynamically changeable, dependent on the data model represented in the data dictionary, and a storage engine module for providing access from the runtime module to the data storage, said storage engine module being invariable, independent of said data model, said code generator being adapted to generate an executable program code for the runtime module based upon said data model.
  • Fig. 1 is a schematic block diagram illustrating a database system according to prior art
  • Fig. 2 is a schematic block diagram illustrating a database system according to the invention
  • Fig. 3 is a data structure diagram illustrating three types of data structures
  • Fig. 4 is a sequence diagram illustrating an example of an update sequence
  • Fig. 5 is a diagram illustrating a data processing hierarchy
  • Fig. 6 is a block diagram illustrating the use of various application interfaces in a database system according to the invention
  • Fig. 7 is a block diagram illustrating the principles of a code generator
  • Fig. 8 is a diagram illustrating a data dictionary structure.
  • Fig. 2 is a schematic block diagram illustrating a database system according to the invention.
  • the system comprises a database runtime module 210, a storage engine denoted a RISK module (Reduced Instruction Set Kernel) 215, a data storage 220, an offline data dictionary 230, and a code generator 240 which also has the capability to act as a schema compiler, as indicated by 140 in Fig. 1.
  • Applications 250 and Schema/Data model 260 are external elements which do not form part of the system.
  • the code generator first compiles the schema, similar to the prior art schema compiler 140, and stores the structural information in the offline data dictionary 230. Then, based on the stored information in the offline data dictionary, a new database runtime module 210 is generated.
  • the newly created source code must be compiled; upon successful compilation a new database runtime module is created.
  • although the database runtime module is generated anew and is unique for each data model, the application programming interface (the part of the database runtime module 210 seen by the application) is kept unchanged.
  • the arrow between the code generator 240 and the database runtime module 210 is a one-way arrow indicating that it is not possible for the database runtime module 210 to access the data dictionary 230.
  • the database kernel (constituted by the runtime module 210 and the storage engine/RISK module 215) is based on the data model. More specifically, the runtime module part 210 of the kernel is based on the data model, whereas the storage engine 215 is not.
  • when an application 250 wants to carry out a database operation, it will send the request to the database runtime module 210, which can immediately pass the right request to the storage engine 215, denoted a RISK module (Reduced Instruction Set Kernel).
  • the information returned to the runtime module 210 by the RISK module 215 will be immediately understood by the runtime module 210 and transferred back to the application 250.
  • the code generator 240 is introduced to automatically create the best possible runtime environment for a given data model (the data model stored in the data dictionary 230).
  • the use of a code generator in this setting tailors the database kernel to suit a specific need, while at the same time maximum flexibility is ensured due to the data dictionary 230.
  • in principle, almost any code generator may be used as the code generator 240, provided the supporting data dictionary 230 has sufficient structures to express a general data model structure and all types of constraints.
  • the code generator 240 should be neutral with respect to the format or language in which the code is produced. It is therefore advisable to use a generator that works on templates separated from the dictionary 230 and from the generator itself.
  • the code generator 240 may have integrated template(s) and dictionary, or alternatively separate templates and dictionary.
  • the code generator 240 may be perfectly capable of generating C, C++, Pascal, Java, as well as SQL schemas, OQL schemas and Word documents, provided templates for these productions are developed.
  • the neutrality has significantly eased the process of tailoring the template that forms the basis for the produced code. The neutrality is not imperative, but a non-template-based generator will certainly lead to a more complicated development process.
  • Fig. 7 is a block diagram further illustrating the principles of a code generator.
  • the code generator 240 cooperates with the data dictionary 230 and the templates 720.
  • the code generator 240 produces documents 710 based on the data dictionary 230, which describes data model and the expected output.
  • the templates 720 describe the syntactical rules and constructions that the code must be in accordance with.
  • the code produced by the code generator 240 may both be descriptive and procedural.
  • the produced documents 710 may be of different types, such as eDB 711, Oracle 712, Sybase 713, VC++ 714, Delphi 715, Web objects 716 and more. Code of both the descriptive and the procedural kind may be produced.
  • the templates 720 describe the code production process.
  • a template can be regarded as a Microsoft Word Template, but in addition it has to have control flow mechanisms that manage the actual production process.
  • An example of a template is:
  • the template is actually an integrated part of the code generator, but it may be viewed as a separate part that is input to or included in the code generator depending on what kind of code the generator is configured to produce.
    WA_Length = CompressIndInWA(Itree);
    Error = KernelObtainPosition(Itree, First, WA, 0, 4);
    if (Error != NoErr) goto Exit;
    ExpandIndFromWA(Itree);
    while (!MatchIndex(Itree)) {
        WA_Length = CompressIndInWA(Itree);
    }
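The template-driven production process can be sketched in miniature as follows. The structural data (table and column descriptions) would come from the data dictionary 230, while a template fixes the syntax of the output; the type and function names below, and the hard-coded "template", are illustrative assumptions, not the patent's actual template language.

```c
#include <stdio.h>
#include <string.h>

/* Structural data for one column, as the dictionary might supply it. */
struct column { const char *name; const char *ctype; };

/* Expand a struct-definition "template" for one table into 'out':
   a header line, one line per column, and a closing line. */
void generate_struct(char *out, size_t cap, const char *table,
                     const struct column *cols, int ncols) {
    size_t n = (size_t)snprintf(out, cap, "struct %s {\n", table);
    for (int i = 0; i < ncols; i++)
        n += (size_t)snprintf(out + n, cap - n, "    %s %s;\n",
                              cols[i].ctype, cols[i].name);
    snprintf(out + n, cap - n, "};\n");
}
```

Swapping the format strings for a different template would yield, say, an SQL schema from the same structural data, which is the neutrality the text describes.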
  • The offline data dictionary 230
  • the offline data dictionary 230 contains the same kind of information as an online data dictionary in traditional database systems. It also plays the same role as a placeholder for structural information.
  • the data dictionary is basically a description of the data model. It has elements like Domains (or data types), tables and columns. It knows about constraints such as foreign keys, etc.
  • the data dictionary is structured as a flat ASCII file describing all elements necessary. An example of such a structure is schematically illustrated in fig. 8.
  • For each table object, TableName is stored as an attribute. For each column object, ColumnName, Type, Mandatory, etc. are stored as attributes.
  • The database runtime module 210
  • the database runtime module 210 comprises a set of record definitions and functions tailored to the actual data model as described in the database schema 160 and in the data dictionary 230. For each table there is a struct definition (or record definition) and a set of basic operations. These operations include functions for insert, delete, update and retrieve.
  • the insert function knows the format of each attribute, which attributes are mandatory, which integrity rules are involved, etc., and also how the information is formatted and where the information resides in the permanent storage.
  • the insert function knows all about the involved constraints and integrity rules that apply to the actual object type handled by this function. Thus the insert function does not have to consult any data dictionary in order to figure out which measures must be taken in order to validate the data.
  • in exactly the same manner as the insert function, the delete function knows all about the involved integrity rules, and how to interpret the data.
  • the update function knows all about the involved integrity rules. In fact, update combines the insert and delete functionality.
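A hedged sketch of what such generated, per-table code might look like — the 'Person' table, field layout and error codes below are invented for illustration, not taken from the patent:

```c
/* Hypothetical generated code for a 'Person' table: a record
   definition plus an insert function with the integrity rules
   compiled in, so no data dictionary lookup is needed at runtime. */

typedef enum { NoErr, ErrMandatory, ErrDomain } Error;

typedef struct {
    char name[64];   /* mandatory NAME column */
    long income;     /* INCOME: must be positive */
} Person;

/* A real kernel would also hand the record to the storage engine;
   only the inlined integrity checks are shown here. */
Error Person_insert(const Person *p) {
    if (p->name[0] == '\0')
        return ErrMandatory;   /* mandatory column missing */
    if (p->income <= 0)
        return ErrDomain;      /* domain restriction violated */
    return NoErr;
}
```

Because the checks are generated from the data model, a model without, say, an INCOME column would simply produce a function without that branch.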
  • the obtain function comes in a large number of varieties. These varieties fall into three categories: Direct, Relative and Position. As the obtain function knows the data structure, this function does not need any dictionary either.
  • ObtainDirect enables the application to do a specified search in the database. That means a search for a data entry with a specific value or a combination of such values.
  • the obtain-relative function may either be positional (item next or prior to current), based on different value (next-different or prior-different) or based on same value (next or prior with same value).
  • Example eDB_OBT_Relative(Person, Next) // Returns next person after Bill eDB_OBT_Relative(Person, NextDifferent) // Returns next person with another name eDB_OBT_Relative(Person, NextEqual) // Returns next person with same name.
  • eDB_OBT_Relative(Person, Prior) // Returns next person prior to Bill eDB_OBT_Relative(Person, PriorDifferent) // Returns prior person with another name eDB_OBT_Relative(Person, PriorEqual) // Returns prior person with same name.
  • ObtainPosition enables the application to directly access first or last data item.
  • eDB_OBT_Relative(Person, Last) // Returns last person, according to an index.
  • an index name can be added to specify which ordering of data to use when navigating in the database.
  • the RISK module 215 is what is generally recognized as a storage engine.
  • the RISK module offers the runtime module a platform independent access to the data.
  • the RISK module (storage engine) has undergone some dramatic changes.
  • the navigational tool includes functions like 'GetFirst', 'GetLast', 'GetGreaterThan', 'GetLessThan', 'Next', 'Prior', 'NextDifferent', 'PriorDifferent'.
  • the RISK module 215 has to have different structures that maintain an ordering of the data. Basically, the RISK module offers three types of structures: BASE, INDEX and PROJECTION. See the data storage section below for a further description of these structures.
  • Data storage 220 provides for the physical storage of data, i.e. data files.
  • Fig. 3 is a data structure diagram illustrating three types of data structures that may be handled by the RISK module 215 and stored in the data storage 220; BASE 310, INDEX 320 and PROJECTION 330.
  • Basically, there are two types of storage structures handled by the RISK module. The first is an unordered set of data, which is the BASE structure 310, and the second is an ordered set of data, which is the INDEX structure 320.
  • the PROJECTION structure is just a simplified INDEX structure.
  • the BASE structure 310 is the basic structure that holds the actual information. The information is stored linearly as it is entered into the database. Each data entry may have a variable size (length) and a unique rowid.
  • INDEX structure 320 is the structure used to index the base structure 310. Note that neither the RISK module nor the data storage itself knows about the logical (and actual) connections between these structures.
  • each data entry is a variable-sized data area. For each data entry, the length and a set of rowids referencing where the actual value occurs are stored. Note that this is a fairly compact way of storing the data, while at the same time ensuring high performance.
  • PROJECTION Structure 330 is a structure to hold information about the BASE (or INDEX) structure.
  • neither the RISK module nor the data storage itself knows about this interconnection.
  • the data is a variable-sized data entry, but instead of storing each individual rowid, just the number of rowids is stored.
  • the internal structure of data entries of these structures 310, 320, 330 is not known by the RISK module 215.
  • the data entries are seen as complete, indivisible items. In fact these data entries can have a very complicated structure, but this internal structure is only known by the database runtime module 210.
  • Every BASE structure 310 data entry has a unique identifier named Rowid.
  • the Rowid is created by the RISK module 215.
  • Data entries in the INDEX structure 320 do not have unique Rowids; instead, each data entry has a set of associated Rowids.
  • Data entries in the PROJECTION structure 330 do not have rowids at all, just a number that tells how many associated rowids a corresponding data entry in an index structure 320 would have had.
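The three structures and their identifier scheme can be summarized with illustrative C layouts. The field names are assumptions for the sketch; as the text notes, the engine itself treats each data entry as an opaque, variable-length area.

```c
#include <stdint.h>

typedef uint32_t Rowid;

/* BASE 310: unordered entries, each with a unique rowid
   assigned by the RISK module. */
typedef struct {
    Rowid    rowid;
    uint32_t length;   /* variable entry size */
    uint8_t *data;     /* opaque to the storage engine */
} BaseEntry;

/* INDEX 320: ordered entries; each value carries the set of
   rowids where that value occurs in the BASE structure. */
typedef struct {
    uint32_t length;
    uint8_t *data;
    uint32_t nrowids;
    Rowid   *rowids;
} IndexEntry;

/* PROJECTION 330: like INDEX, but only the number of rowids a
   corresponding index entry would have had is stored. */
typedef struct {
    uint32_t length;
    uint8_t *data;
    uint32_t nrowids;  /* count only, no rowid list */
} ProjectionEntry;
```

Dropping the rowid list in PROJECTION is what makes it the "simplified INDEX structure" described above.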
  • since the RISK module 215 does not contain any dictionary-like information, it is not capable of maintaining any inter-consistency between the different structures. It is the runtime module 210 that maintains the internal consistency between the different structures, as that module is the only one that has knowledge of the data model 260.
  • Fig. 4 is a sequence diagram illustrating an example of an update sequence.
  • the sequence comprises four stages: 1.
  • An integrity check 430, wherein the runtime module 210 and the RISK module 215 search for conflicts/violations.
  • every computer program is a set of loops that repeats a set of tasks a number of times. These loops are often nested within each other.
  • Fig. 5. is a schematic diagram illustrating the data processing hierarchy in a database system.
  • the outer loops 510 are controlled by the application 250 (for instance retrieval of a list of employees).
  • the loops in between 520 are controlled by the database runtime system 210 (for instance, collection of each of the attributes for the employee, department information, etc.; this may involve several loops into several tables to retrieve all necessary information).
  • the inner loops 530 are controlled by the storage engine / RISK module 215 (for instance, this is a collection of each of the disk blocks required to gather the necessary information, and to split these blocks into single data items).
  • the RISK module 215 also serializes the innermost loops. Instead of starting the looping process all over each time it gains control, it simply continues from where it was when it last gave control back to the runtime module 210. This obviously reduces looping depth and reduces the total number of instructions needed to complete a task.
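The serialized inner loop can be sketched as a cursor that keeps its position between calls; the API below is an assumption for illustration, not the RISK module's actual interface.

```c
/* Sketch of a resumable inner loop: instead of restarting the scan
   each time it gains control, the cursor continues from the saved
   position, reducing looping depth and total instruction count. */

typedef struct {
    const int *items;  /* data entries, simplified to ints */
    int count;
    int pos;           /* position saved between calls */
} Cursor;

void cursor_init(Cursor *c, const int *items, int count) {
    c->items = items;
    c->count = count;
    c->pos = 0;
}

/* Return the next item, or -1 when the scan is exhausted. */
int cursor_next(Cursor *c) {
    if (c->pos >= c->count)
        return -1;
    return c->items[c->pos++];
}
```

Each call to cursor_next picks up exactly where the previous one stopped, which is the behavior the text attributes to the RISK module.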
  • the database structure according to the invention gives: a non-interpreting database runtime; a minimal-sized runtime module tailored to the actual problem area; a database runtime that runs the most efficient code; full flexibility with respect to external interfaces; maintainability as in normal systems; a runtime module that is able to handle all kinds of constraints; and a portable and scalable database system.
  • the database approach according to the invention does not contain any online data dictionary. All data dictionary information is embedded in the database runtime module. This means that there is no need for data dictionary enquiries. Requests like 'Describe table person' can be fulfilled immediately without any dictionary enquiries. All consistency checking code is generated as part of the code, so there is no need for interpreting consistency information.
  • the runtime module does not contain any superfluous code. If there are no float data types in use, no code to handle float data types will be generated as part of the database runtime. The same applies to consistency checking as well. This ensures code that is minimal for the actual data model (and data dictionary) and will further ensure a minimal-sized database system.
  • a traditional data dictionary normally consists of external view information (i.e. SQL Views) that may be used to tailor an interface to a specific application.
  • traditional database systems only support one kind of database interface (Relational, Network oriented, Hierarchical, ObjectRelational or ObjectOriented). They never support a mix of these.
  • a particular development tool for developing solutions according to the invention has the knowledge of external views (also with respect to different database technologies), so it can generate all interfaces needed, tailored to the actual database.
  • Fig. 6 is a block diagram illustrating the use of various application program interfaces (APIs) in a database system according to the invention. This figure shows that more applications 250a, 250b, 250c can access the same database.
  • the code generator 240 can not only generate the runtime module, but also (by use of particular templates) different application program interfaces 205a, 205b, 205c on top of the database runtime module 210. Each of these interfaces 205a, 205b, 205c can be used simultaneously to allow access to the database from different applications 250a, 250b, 250c, respectively.
  • a booking application and a flight control application can both access the schedule in the database using an SQL interface, while an XML-based flight publication application concurrently accesses the schedule.
  • the inventive approach allows the same type of system development as traditional database system development.
  • the application developers focus on just the same aspects; which are developing business logic, user interfaces and interfaces to other systems.
  • the database API (Application Program Interface)
  • a runtime module that is able to handle all kinds of constraints.
  • the runtime module may also be generated using another programming language, for instance Java, C++, Basic or Pascal.
  • the database system according to the invention is well suited for handheld devices with limited computing power.

Abstract

A computer system for providing access to a database upon a request from an application computer program (250), comprising a data model (260) represented in a data dictionary (230), an application program interface, a data storage (220), a database kernel (210), (215) providing access from the application program interface to the data storage (220). The database kernel (210), (215) comprises a runtime module (210) for providing access from the application program interface to the database kernel and a storage engine module (215) for providing access from the runtime module (210) to the data storage (220). The runtime module (210) is dynamically changeable, dependent on the data model (260) represented in the data dictionary (230), while the storage engine module (215) is invariable and independent of the data model. An automatic code generator (240) is arranged to generate an executable program code for the runtime module (210), based upon the data model (260) represented in the data dictionary (230).

Description

ADAPTABLE DATABASE RUNTIME KERNEL
TECHNICAL FIELD
The invention relates in general to database technology, and more specifically to a computer system for providing access to a database upon a request from an application computer program. Particularly, the invention relates to such a system which comprises a data model represented in a data dictionary, an application program interface, a data storage, and a database kernel providing access from the application program interface to the data storage.
The invention also relates to a method for improving such a database system, a database kernel in such a database system and a code generator in such a database system.
BACKGROUND OF THE INVENTION
Current database systems require extensive computing power, and they usually offer functionality not required by most applications. There is an increasing need for database systems wherein the computing power requirements are reduced, particularly in handheld applications and embedded systems having limited resources. There is also a need for facilitating database and application programming, particularly to ease constraint handling and thereby reduce application complexity.
RELATED BACKGROUND ART
Current database technologies basically handle two types of information: the real information itself, and information about the information, so-called meta-information. The meta-information is often referred to as the data model. The meta-information may be viewed as structural information, a description of how the real information has to be structured.
Fig. 1 is a schematic block diagram illustrating the major components in a prior art database system. In this system, the database runtime environment consists of these parts:
- Data storage 120, which is the physical storage of data, i.e. data files.
- Database kernel 110, serving as a database runtime and storage engine, which is the actual data interpreter.
- On-line data dictionary 130, which holds the meta-information.
- Dictionary compiler (or schema compiler/interpreter) 140.
External application programs 150 (not part of the database) access the database system via an application program interface (not indicated) to the kernel 110. The database definition or schema/data model 160 (not part of the database) is interpreted by the schema compiler and stored in the data dictionary 130 to allow the database kernel 110 to act in accordance with the data model.
The data storage 120 is the place where the actual data are kept. Often it uses the operating system's standard file system, but it is not unusual to have special file access in order to speed up data transfer to and from disk. Usually the data storage is platform dependent, which means that data stored on one platform can't be transferred to another platform without data conversion.
In the data dictionary 130 all meta-information is stored. Meta-information is, as mentioned above, information about the information. It describes all tables, columns, fields etc., which may be mentally viewed as the table layout. The data dictionary also holds information about data types for the columns and domain restrictions for these columns. (NAME is a character string consisting of the characters 'a'-'z', 'A'-'Z' and '.', ' ', '-'; INCOME is a positive integer; and CURRENCY is one of: 'USD', 'NOK', 'EUR', 'JPY', 'GBP', 'SEK', 'DKK'.) The data dictionary also holds information about the different constraint types, such as primary keys, foreign keys, subset constraints, exclude constraints and other constraints. In a "perfect" database system, the dictionary and database runtime should be able to handle all types of constraints (in addition to those mentioned above), but current market leading systems only handle primary keys, unique keys, foreign keys and mandatory columns.
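The domain restrictions described above can be illustrated as plain validation functions. The sketch below is only an illustration of the idea: the function names are hypothetical, and the character set and currency list are taken from the example in the text, not from any actual product.

```c
#include <string.h>
#include <stddef.h>

/* Illustrative sketch: a character allowed in the NAME domain. */
static int is_valid_name_char(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           c == '.' || c == ' ' || c == '-';
}

/* Returns 1 if the string satisfies the NAME domain restriction. */
int check_NAME(const char *s) {
    for (; *s; s++)
        if (!is_valid_name_char(*s))
            return 0;
    return 1;
}

/* Returns 1 if the string is one of the allowed CURRENCY codes. */
int check_CURRENCY(const char *s) {
    static const char *codes[] = {"USD", "NOK", "EUR", "JPY", "GBP", "SEK", "DKK"};
    for (size_t i = 0; i < sizeof codes / sizeof codes[0]; i++)
        if (strcmp(s, codes[i]) == 0)
            return 1;
    return 0;
}
```

In a real system such checks would of course be driven by the dictionary contents rather than hard-coded, but the dictionary entry ultimately implies checks of this kind.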
The data dictionary 130 also contains information about external views, i.e. information about how the information is to be presented to a specific application. One application may wish to view just the 'Name' and 'Address' columns of the table 'Person', while another application wishes to view the 'Project', 'Assignment' and 'Person' tables as a compound table 'Project-Assignment'. Many data dictionaries also contain user definitions and user authorization information.
The database runtime module & storage engine 110 is the heart of the database system. It handles data retrieval requests and update requests. In order to fulfill any request the runtime system has to consult the data dictionary 130. First the runtime module 110 has to consult the dictionary 130 in order to validate the application's request. It has to check that the requested data is known by the application 150 (or should be known), and do a mapping to the underlying data model. It then has to consult the data dictionary 130 in order to determine how the actual request to the storage engine should be expressed. When the result from the storage engine is returned, the database runtime once more has to consult the dictionary 130 in order to transform the retrieved data to fit the application's expectations with respect to naming and structure.
If the application 150, instead of requesting data, wants to update the database (insert new data, or delete or update existing data), the database runtime module 110 has to run a consistency check of the database based on the rules stored in the data dictionary. For each rule found in the data dictionary 130, the runtime module 110 has to analyze the rule and run a consistency check. This is very complicated and time consuming. All market leading database vendors therefore do this kind of consistency control after every single update, to prevent a large backlog of consistency controls from building up. But most important: unless the controls are carried out immediately, the functionality required to do the control after the entire transaction has been carried out would be very complex.
As a result of all the tasks the runtime module 110 has to handle (consistency check, authorization, etc.), the module is oversized, which involves an efficiency problem.
The dictionary compiler or schema compiler 140 checks the data model for consistency. If the data model is consistent, it stores the data model information in the data dictionary.
If a customer has only got a runtime version of the database system, the dictionary compiler is left out. The ability to make changes to the data model is then effectively removed (Create Table, Alter Table and Drop Table will not work). It is worth mentioning that in current SQL databases, the schema compiler is accessible from most applications (including user developed applications).
Further, according to prior art, when an application issues an update or retrieval request to the database system, the database runtime module 110 has to dynamically validate the request and dynamically create an execution plan. In order to do that, the system has to send a lot of enquiries to the data dictionary 130 and interpret the results. This has to be done for every single request. All the enquiries to the data dictionary 130 create a lot of overhead and decrease performance significantly. It also requires the database runtime module 110 to be constructed such that all types of data models can be handled. As a result there is a lot more program code than necessary for most data models, and the runtime module 110 is oversized for most applications. An interesting observation: in order to handle all kinds of data models, including a complete set of integrity enforcement rules and proper transaction handling in conjunction with the constraints, the complexity of such a database runtime module 110 would explode and the performance would drop catastrophically. Therefore, current database runtime modules handle just a small portion of possible data models, they offer only a limited set of constraint mechanisms, they offer a limited transaction model, and finally they suffer from poor performance and oversized executables.
OBJECTS OF THE INVENTION
It is an object of the present invention to improve performance in database systems.
Most of the current leading databases require extensive computing power. Database management systems (DBMS) have traditionally been one of the key driving forces behind the struggle to deliver more and more computing power. Although multimedia applications seem to require even more computing power, DBMS will be a major consumer of computational power for years ahead. The most efficient DBMS is far from fulfilling the requirements of the most demanding applications, both with respect to response time and with respect to the amount of information. Use of traditional databases in real-time applications, as storage for huge amounts of measured values (weather appliances, seismic information, network traffic information etc.) or in data warehouses, has not been practical due to lack of computing power (or too demanding DBMS). Such appliances have therefore often been programmed using special purpose databases or no databases at all.
It is another object of the present invention to reduce database footprints in database systems. Current DBMS normally solve a lot of problems that are not required by the applications. That is, an application rarely makes use of all the offered functionality at the same time. Different applications require different functionality. The result is that a DBMS offers a lot more functionality than requested, while at the same time much of the functionality required is not offered. Compared to the offered functionality, the footprints of current DBMS are fairly reasonable, but compared to the required functionality they are oversized. For desktop use this is no problem, but for handheld applications and embedded systems it is a major problem. Handheld devices and embedded devices have very limited resources; therefore small footprint databases are vital.
It is a further object of the present invention to ease database and application programming, particularly to ease constraint handling and thereby reduce application complexity.
Most current DBMS have the ability to define integrity constraints. Enforcement of these constraints is an integral part of the DBMS kernel. Current database systems lack the ability to postpone the integrity control until the complete set of updates has been carried out. All current DBMS insist on doing the integrity control immediately after every single update to the database. This creates a tremendous challenge for the application developer, both to let the application do updates in logical chunks and to create work-arounds to overcome the shortcomings of current database systems. In most cases the application programmer has to compromise with the constraint handling of the database system and program these constraints as part of the application. This approach creates another challenge, which is to implement all these application-enforced constraints in every application that may access the database, which is virtually impossible.
SUMMARY OF THE INVENTION
According to a first aspect of the present invention, the above objects are achieved by a computer system for providing access to a database upon a request from an application computer program, comprising
- a data model represented in a data dictionary,
- an application program interface,
- a data storage,
- a database kernel, providing access from the application program interface to the data storage, wherein said database kernel comprises
- a database runtime module for providing access from the application program interface to the database kernel, said runtime module (210) being dynamically changeable, dependent on the data model (260) represented in the data dictionary, and
- a storage engine module for providing access from the runtime module to the data storage, said storage engine module being invariable, independent of said data model.
According to a preferred embodiment, the system further comprises an automatic code generator which is adapted to generate an executable program code for the runtime module based upon the data model represented in the data dictionary. The program code for the runtime module is preferably generated by the code generator subsequent to any amendments in the data model.
The code generator preferably comprises a source code generating module and a compiler module, said code generating module being adapted to generate a source code to be processed by the compiler module, thus generating said executable program code.
The code generating module is further preferably adapted for receiving structural data provided from the data dictionary, for receiving syntactical data provided from a template, and for processing said structural data in accordance with rules defined by the syntactical data, thus producing source code adapted for input to the compiler module.
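The interplay between structural data and syntactical rules can be made concrete with a tiny sketch: the structural data (a table name and its columns) stands in for what the dictionary provides, while the format strings stand in for a template rule. All names and types here are hypothetical, invented only for illustration; the actual generator described in this document is far more general.

```c
#include <stdio.h>
#include <stddef.h>

/* Structural data, as it might be delivered from the data dictionary. */
typedef struct {
    const char *name;   /* column name          */
    const char *ctype;  /* target C type        */
    int length;         /* declared length      */
} Column;

/* Apply a hard-coded "template rule" to the structural data, emitting
   C source text for one table into buf. Returns number of bytes written. */
int generate_struct(char *buf, size_t bufsize,
                    const char *table, const Column *cols, int ncols) {
    int n = snprintf(buf, bufsize, "typedef struct {\n");
    for (int i = 0; i < ncols; i++)
        n += snprintf(buf + n, bufsize - n, "    %s %s[%d];\n",
                      cols[i].ctype, cols[i].name, cols[i].length + 1);
    n += snprintf(buf + n, bufsize - n, "} S_%s;\n", table);
    return n;
}
```

With the columns of a hypothetical Person table as input, the function emits a struct declaration of the same shape as the generated code shown later in this document.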
The code generating module is advantageously adapted to generate descriptive code or procedural code, dependent on the syntactical data provided from the template.
The storage engine is preferably adapted to offer the runtime module platform-independent access to the data stored in the data storage.
Preferably, the storage engine is also adapted to store and retrieve data elements of a first data structure, comprising an unordered set of data, and data elements of a second data structure, comprising an ordered set of data.
The storage engine is also preferably adapted to provide an identifier in each data element of said first data structure, and to provide data elements of said second structure wherein each entry has a set of such associated identifiers.
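A minimal sketch of the two structures just described: an unordered record store in which each element carries an engine-assigned identifier, and an ordered index whose entries each refer back to such an identifier. The fixed sizes, names, and the linear lookup are illustrative assumptions, not the actual RISK module implementation.

```c
#include <string.h>

#define MAX_ROWS 100

typedef struct { long id; char data[32]; } Record;     /* unordered set  */
typedef struct { char key[32]; long id; } IndexEntry;  /* ordered set    */

static Record store[MAX_ROWS];       static int nrecords = 0;
static IndexEntry index_[MAX_ROWS];  static int nindex = 0;

/* Store a record in the unordered set; the engine assigns the identifier. */
long store_record(const char *data) {
    long id = nrecords + 1;
    store[nrecords].id = id;
    strncpy(store[nrecords].data, data, sizeof store[nrecords].data - 1);
    nrecords++;
    return id;
}

/* Insert into the ordered set, keeping entries sorted on key;
   each entry holds the identifier of the record it refers to. */
void index_insert(const char *key, long id) {
    int i = nindex;
    while (i > 0 && strcmp(index_[i - 1].key, key) > 0) {
        index_[i] = index_[i - 1];
        i--;
    }
    strncpy(index_[i].key, key, sizeof index_[i].key - 1);
    index_[i].key[sizeof index_[i].key - 1] = '\0';
    index_[i].id = id;
    nindex++;
}

/* Look up a key in the ordered set and return the associated identifier. */
long index_lookup(const char *key) {
    for (int i = 0; i < nindex; i++)
        if (strcmp(index_[i].key, key) == 0)
            return index_[i].id;
    return -1;
}
```

The point of the split is that the record store never needs to know about ordering, and the index never needs to know about record contents; they meet only in the identifier.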
According to a second aspect of the present invention, the above objects are provided by a method for improving a database system for providing access to a database upon a request from an application computer program, said system comprising
- a data model represented in a data dictionary,
- an application program interface,
- a data storage,
- a database kernel, providing access from the application program interface to the data storage, the method being characterized by the steps of providing in the database kernel a database runtime module for providing access from the application program interface to the database kernel, said runtime module being dynamically changeable, dependent on the data model represented in the data dictionary, and providing in the database kernel a storage engine module for providing access from the runtime module to the data storage, said storage engine module being invariable, independent of said data model.
The invention also relates to a database kernel in a computer system for providing access to a database upon a request from an application computer program, said system comprising
- a data model represented in a data dictionary,
- an application program interface,
- a data storage, said database kernel providing access from the application program interface to the data storage, said database kernel comprising a database runtime module for providing access from the application program interface to the database kernel, said runtime module being dynamically changeable, dependent on the data model represented in the data dictionary, and a storage engine module for providing access from the runtime module to the data storage, said storage engine module being invariable, independent of said data model.
Finally, the invention relates to an automatic code generator in a computer system for providing access to a database upon a request from an application computer program, said system comprising
- a data model represented in a data dictionary,
- an application program interface,
- a data storage,
- a database kernel, providing access from the application program interface to the data storage, said database kernel comprising a database runtime module for providing access from the application program interface to the database kernel, said runtime module being dynamically changeable, dependent on the data model represented in the data dictionary, and a storage engine module for providing access from the runtime module to the data storage, said storage engine module being invariable, independent of said data model, said code generator being adapted to generate an executable program code for the runtime module based upon said data model.
BRIEF DESCRIPTION OF THE DRAWINGS
The features and advantages of the present invention will become more apparent when referring to the presently preferred embodiment described in the following specification. In the figures,
Fig. 1 is a schematic block diagram illustrating a database system according to prior art,
Fig. 2 is a schematic block diagram illustrating a database system according to the invention,
Fig. 3 is a data structure diagram illustrating three types of data structures,
Fig. 4 is a sequence diagram illustrating an example of an update sequence,
Fig. 5 is a diagram illustrating a data processing hierarchy,
Fig. 6 is a block diagram illustrating the use of various application interfaces in a database system according to the invention,
Fig. 7 is a block diagram illustrating the principles of a code generator, and
Fig. 8 is a diagram illustrating a data dictionary structure.
DETAILED DESCRIPTION OF THE INVENTION
Fig. 2 is a schematic block diagram illustrating a database system according to the invention. The system comprises a database runtime module 210, a storage engine denoted a RISK module (Reduced Instruction Set Kernel) 215, a data storage 220, an offline data dictionary 230, and a code generator 240 which also has the capability to act as a schema compiler, as indicated by 140 in Fig. 1. Applications 250 and the schema/data model 260 are external elements which do not form part of the system. When a new data model is introduced, the code generator first compiles the schema, similar to the prior art schema compiler 140, and stores the structural information in the offline data dictionary 230. Then, based on the stored information in the offline data dictionary, a new database runtime module 210 is generated. The newly created source code must be compiled; upon successful compilation a new database runtime module is created. Although the database runtime module is generated and unique for each data model, the application programming interface (the part of the database runtime module 210 seen by the application) is kept unchanged.
The arrow between the code generator 240 and the database runtime module 210 is a one-way arrow, indicating that it is not possible for the database runtime module 210 to access the data dictionary 230. According to the invention, and in contrast to the use of a database system with an online data dictionary, the database kernel (constituted by the runtime module 210 and the storage engine/RISK module 215) is based on the data model. More specifically, the runtime module part 210 of the kernel is based on the data model, whereas the storage engine 215 is not. As an application 250 wants to carry out a database operation, it will send the question to the database runtime module 210, which immediately can pass the right question to the storage engine 215, denoted a RISK module (Reduced Instruction Set Kernel). The information returned to the runtime module 210 by the RISK module 215 will be immediately understood by the runtime module 210 and transferred back to the application 250.
The code generator 240
The code generator 240 is introduced to automatically create the best possible runtime environment for a given data model (the data model stored in the data dictionary 230). The use of a code generator in this setting tailors the database kernel to suit a specific need, while at the same time maximum flexibility is ensured due to the data dictionary 230.
In principle almost any code generator may be used as the code generator 240, provided the supporting data dictionary 230 has sufficient structures to express a general data model structure and all types of constraints. The code generator 240 should be neutral with respect to the format or language of the produced code. It is therefore advisable to use a generator that works on templates separated from the dictionary 230 and from the generator itself.
The code generator 240 may have integrated template(s) and dictionary, or alternatively separate templates and dictionary. The code generator 240 may be perfectly capable of generating C, C++, Pascal and Java, as well as SQL schemas, OQL schemas and Word documents, provided templates for these productions are developed. The neutrality significantly eases the process of tailoring the template that is the basis for the produced code. The neutrality is not imperative, but a non-template based generator will certainly create a more complicated development process.
Fig. 7 is a block diagram further illustrating the principles of a code generator.
As illustrated, the code generator 240 cooperates with the data dictionary 230 and the templates 720. The code generator 240 produces documents 710 based on the data dictionary 230, which describes the data model and the expected output. The templates 720 describe the syntactical rules and constructions that the code must be in accordance with. The code produced by the code generator 240 may be both descriptive and procedural. As shown in Fig. 7, the produced documents 710 may be of different types, such as eDB 711, Oracle 712, Sybase 713, VC++ 714, Delphi 715, Web objects 716 and more. Both of these code examples may be produced:
Select Name, Address, Income, Department from Employee;
SetRecordType(Employee);
GetFirstRecord(WA);
while (!LastRecord(WA)) {
    GetNextRecord(WA);
    printf("Name = %s, Address = %s, Income = %s, Department = %s\n",
           WA.Name, WA.Address, WA.Income, WA.Department);
}
Templates 720
The templates 720 describe the code production process. A template can be regarded as a Microsoft Word template, but in addition it has to have control flow mechanisms that manage the actual production process. An example of a template is:
// This part of the code produces struct definitions for each table found in the data dictionary:
typedef struct {
    char Name[32+1];
    int Type;
    int Length;
    int Fraction;
    int Nullable;
} ColumnType;

char *currentTable;
ColumnType *currentColumn;

currentTable = GetFirstTableNameFromDictionary();
while (currentTable) {
    printf("/* For %s data definition */\ntypedef struct {\n", currentTable);
    currentColumn = GetFirstColumn(currentTable);
    while (currentColumn) {
        printf("Boolean NU_%s;\n", currentColumn->Name);
        switch (currentColumn->Type) {
        case TEXT:
            printf("char %s[%d];\n", currentColumn->Name, currentColumn->Length);
            break;
        case INTEGER:
            printf("char %s[%d];\n", currentColumn->Name, currentColumn->Length);
            break;
        <Part removed to improve readability>
        }
        currentColumn = GetNextColumn(currentTable);
    }
    printf("} S_%s;\n\n\n", currentTable);
    currentTable = GetNextTableNameFromDictionary();
};

// This part of the template produces the Obtain function that recognizes the different tables:
printf("short eDB_OBT(String31 RecordTypeName, String31 IndexName) {\n\n");
printf("    short Btree;\n");
printf("    short Itree;\n");
printf("    short Error = RuntimeError;\n");
printf("    long Rownum;\n");
printf("    short WA_Length;\n");
printf("    short *CuUnPtr;\n");
printf("    short *CuPrTyPtr;\n");
printf("    Boolean *CuViPtr;\n");
printf("    Boolean *CuExPtr;\n");
printf("    QueryType Keytype = ExactIndex;\n");
printf("    Boolean MainEstablished = false;\n");
printf("    Btree = TreeNumberOf(RecordTypeName);\n");
printf("    if (Btree == NoTree)\n");
printf("        return UnknownRecordType;\n");

<Part removed to improve readability>

currentTable = GetFirstTableNameFromDictionary();
while (currentTable) {
    printf("    switch (Btree) {");
    printf("    case TR_%s:\n", currentTable);
    printf("        StDB_%s.Rownum = Rownum;\n", currentTable);
    printf("        eDB_%s.Rownum = Rownum;\n", currentTable);
    printf("        Itree = TreeNumberOf(\"I01_%s\");\n", currentTable);
    printf("        Error = CopyOldRAToOldIA(Btree, Itree);\n");
    printf("        if (Error != NoErr)\n");
    printf("            {CopyOldRAToCurrentRA(Btree); return Error;}\n");
    <Part removed to improve readability>
    currentTable = GetNextTableNameFromDictionary();
};
As one can see, this is really a program that produces certain code in a certain language. In this example the template is actually an integrated part of the code generator, but it may be viewed as a separate part that is input to or included in the code generator, depending on what kind of code the generator is configured to produce.
The Code Produced
For each table, each column and each foreign key construction referred to in the dictionary, the code generator will produce a set of instructions, as indicated in the previous section. Basically, a C struct is produced for each table, together with a set of functions to operate on these structures. Finally, it uses RISK functions to store the data and to retrieve already stored data.
The code produced looks similar to this (this is an example of actual output, but it has been reduced slightly to improve readability):
$MicroExportHeader$example\eDBSta.h

/* For Department data definition */
typedef struct {
    Boolean NU_Budget_for;
    char Budget_for[13];
    Boolean NU_DeptName_for;
    char DeptName_for[31];
    long BC_Logstatus;
} S_Department;

/* For Employee data definition */
typedef struct {
    Boolean NU_MobilePhone_for;
    char MobilePhone_for[16];
    Boolean NU_EmpName_for;
    char EmpName_for[21];
    Boolean NU_Budget_has;
    char Budget_has[13];
    long BC_Logstatus;
} S_Employee;

<Code removed>

short eDB_OBT(String31 RecordTypeName, String31 IndexName) {
    short Btree;
    short Itree;
    short Error = RuntimeError;
    long Rownum;
    short WA_Length;
    short *CuUnPtr;
    short *CuPrTyPtr;
    Boolean *CuViPtr;
    Boolean *CuExPtr;
    QueryType Keytype = ExactIndex;
    Boolean MainEstablished = false;

    Btree = TreeNumberOf(RecordTypeName);
    if (Btree == NoTree)
        return UnknownRecordType;
    Itree = IndexNumberOf(RecordTypeName, IndexName);
    if (Itree == NoTree)
        return UnknownIndexType;
    if (Itree == UniversalIndex)
        return IllegalIndex;
    // Call Before
    ConstraintViolation.DiagNo = 0;
    strcpy((char*) ConstraintViolation.Message, " ");
    CopyCurrentRAToOldRA(Btree);
    if (Itree == QueryIndex) {
        EstablishCurrentPtrs(Btree, &CuUnPtr, &CuPrTyPtr, &CuViPtr, &CuExPtr);
        if (*CuUnPtr == 0) {
            Itree = BestIndex(Btree, &Keytype);
            if (Keytype == NoQuery)
                return EmptyKey;
            EstablishCurrentPtrs(Btree, &CuUnPtr, &CuPrTyPtr, &CuViPtr, &CuExPtr);
        }
        Error = MoveQueryToIA(Btree, *CuUnPtr);
        WA_Length = CompressIndInWA(*CuUnPtr);
        Itree = *CuUnPtr;
        *CuExPtr = false;
        switch (*CuPrTyPtr) {
        case ExactIndex:
        case MajorIndex:
            Error = KernelObtainDirect(Itree, First, WA, 0, WA_Length - 4);
            break;
        case ExactAndNo:
        case MajorAndNo:
            Error = KernelObtainDirect(Itree, First, WA, 0, WA_Length - 4);
            MainEstablished = true;
            while (Error == NoErr) {
                Error = ExpandIndFromWA(Itree);
                if (!MatchIndex(Itree))
                    {Error = NotFound; goto Exit;}
                Rownum = FetchRecordID(Itree);
                memset(WA, 0, WASize);
                StoreLongInBytes(WA, 0, Rownum);
                Error = KernelObtainDirect(Btree, First, WA, 0, 4);
                if (Error != NoErr)
                    {Error = RuntimeError; goto Exit;}
                ExpandFromWA(Btree);
                if (MatchQuery(Btree))
                    goto FetchMain;
                WA_Length = CompressIndInWA(Itree);
                Error = KernelObtainRelative(Itree, Next, WA, 0, WA_Length - 4);
            } // loop until match
            goto Exit;
        case NoIndex:
        case AllScan:
            memset(WA, 0, WASize);
            Error = KernelObtainPosition(Btree, First, WA, 0, 4);
            if (Error != NoErr)
                goto Exit;
            ExpandFromWA(Btree);
            MainEstablished = true;
            while (!MatchQuery(Btree)) {
                Error = KernelObtainRelative(Btree, Next, WA, 0, 4);
                if (Error != NoErr)
                    goto Exit;
                ExpandFromWA(Btree);
            }
            goto FetchMain;
        case IndexScan:
            Error = KernelObtainPosition(Itree, First, WA, 0, 4);
            if (Error != NoErr)
                goto Exit;
            ExpandIndFromWA(Itree);
            while (!MatchIndex(Itree)) {
                WA_Length = CompressIndInWA(Itree);
                Error = KernelObtainRelative(Itree, Next, WA, 0, WA_Length - 4);
                if (Error != NoErr)
                    goto Exit;
                ExpandIndFromWA(Itree);
            }
            goto FetchMain;
        }
    } else {
        CLEAR_Current_Query(Btree);
        Error = MoveRecordBufferToIA(Btree, Itree);
        if ((Error != NoErr) && (Error != MissingMandatory) && (Error != ValueError))
            goto Exit;
        WA_Length = CompressIndInWA(Itree);
        if (WA_Length != 0)
            Error = KernelObtainDirect(Itree, First, WA, 0, WA_Length - 4);
        else
            {Error = EmptyKey; goto Exit;}
    }
    if (Error != NoErr)
        goto Exit;

FetchMain:
    if (!MainEstablished) {
        Error = ExpandIndFromWA(Itree);
        Rownum = FetchRecordID(Itree);
        memset(WA, 0, WASize);
        StoreLongInBytes(WA, 0, Rownum);
        Error = KernelObtainDirect(Btree, First, WA, 0, 4);
        if (Error != NoErr)
            goto Exit;
        ExpandFromWA(Btree);
    }
    Error = MoveRAToRecordBuffer(Btree);

Exit:
    if (Error != NoErr) {
        CopyOldRAToCurrentRA(Btree);
        return Error;
    }
    return NoErr;
} /*eDB_OBT*/
The offline data dictionary 230
The offline data dictionary 230 contains the same kind of information as an online data dictionary in traditional database systems. It also plays the same role as a placeholder for structural information.
It comprises the basic table or object structure including columns/attributes. It also includes the concept of domains, and a wide variety of integrity rules.
The data dictionary is basically a description of the data model. It has elements like domains (or data types), tables and columns. It knows about constraints such as foreign keys etc. The data dictionary is structured as a flat ASCII file describing all necessary elements. An example of such a structure is schematically illustrated in Fig. 8.
For each table object, TableName is stored as an attribute. For each column object, ColumnName, Type, Mandatory etc. are stored as attributes.
In addition to the structure illustrated above and with reference to Fig. 8, similar structures for Indexes, PrimaryKeys, ForeignKeys, etc. are implemented in the dictionary.
The actual layout of the data dictionary looks like this:

WinlAST X9 Semantical 1
MODEL 2001-07-30 15:01:49 2001-07-30 15:02:07 1 X9
DOMAIN Budget FIXED 10 2
DOMAIN DeptName CHARACTER 30 0
DOMAIN EmpName CHARACTER 20 0
DOMAIN MobilePhone CHARACTER 15 0
RECORD Department 100
ELEMENT Budget_for 1 NOTNULL
ELEMENT DeptName_for 2 NOTNULL
RECORD Employee 100
ELEMENT MobilePhone_for 4 NOTNULL
ELEMENT EmpName_for 3 NOTNULL
ELEMENT Budget_has 1 NOTNULL
UNIQUE I01_Department Department
ELEMENT Budget_for 1
UNIQUE I2_Department Department
ELEMENT DeptName_for 2
UNIQUE I01_Employee Employee
ELEMENT MobilePhone_for 4
UNIQUE I2_Employee Employee
ELEMENT EmpName_for 3
SUBSET Department_Employee Department
ELEMENT Budget_for 1
MEMBER Employee
ELEMENT Budget_has 1
RENAME 2001-07-30 15:02:07 1
EXTERNALS 2001-07-30 15:02:07
HISTORY 1
END DICTIONARY
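A line of the flat ASCII dictionary shown above can be picked apart with ordinary text parsing. The grammar assumed in this sketch (keyword, name, type, length, fraction, as in the DOMAIN lines) is inferred from the sample layout and is only an illustration, not the actual dictionary reader.

```c
#include <stdio.h>

/* Hypothetical in-memory form of one DOMAIN line from the dictionary file. */
typedef struct {
    char name[32];
    char type[16];
    int length;
    int fraction;
} Domain;

/* Parse a line of the form "DOMAIN <name> <type> <length> <fraction>".
   Returns 1 on success, 0 if the line is not a DOMAIN line. */
int parse_domain_line(const char *line, Domain *d) {
    return sscanf(line, "DOMAIN %31s %15s %d %d",
                  d->name, d->type, &d->length, &d->fraction) == 4;
}
```

For example, the sample line "DOMAIN Budget FIXED 10 2" yields a domain named Budget of type FIXED with length 10 and fraction 2, while a RECORD line is rejected.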
The database runtime module 210
The database runtime module 210 comprises a set of record definitions and functions tailored to the actual data model as described in the database schema 260 and in the data dictionary 230. For each table there is a struct definition (or record definition) and a set of basic operations. These operations include functions for insert, delete, update and retrieve.
Example
For a table Person there will be a struct definition as
typedef struct Person {
    char name[21];
    char address[31];
    double income;
} Person;
and functions:
eDB_INSERT_Person();
eDB_DELETE_Person();
eDB_UPDATE_Person();
eDB_OBTAIN_Person(AccessPath);
All these functions are wrapped in common functions:
eDB_INSERT(<table name>)
eDB_DELETE(<table name>)
eDB_UPDATE(<table name>)
eDB_OBT_<modifier>(<table name>, <AccessPath>)
Common for all these functions is the built-in knowledge about the data model. For instance, the insert function knows the format of each attribute, which attributes are mandatory, which integrity rules are involved etc., and also how the information is formatted and where the information resides in the permanent storage.
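The wrapper-plus-generated-function pattern can be illustrated with a self-contained mock: a common eDB_INSERT that dispatches on the table name to a per-table function with the checks inlined. The error codes, the single Person table and its checks are invented for this sketch; the real generated code is shown in pseudocode in the following sections.

```c
#include <string.h>

/* Illustrative error codes (assumptions for this sketch). */
enum { NoError = 0, UnknownTable = 1, ValueError = 2 };

/* Mock record buffer for a hypothetical Person table. */
typedef struct { char name[21]; double income; } S_Person;
static S_Person person_buffer;

/* Generated per-table function: the format and domain checks are
   inlined, so no dictionary lookup is needed at runtime. */
static int eDB_INSERT_Person(void) {
    if (person_buffer.name[0] == '\0') return ValueError; /* mandatory  */
    if (person_buffer.income < 0)      return ValueError; /* domain     */
    return NoError;
}

/* Common wrapper: dispatch on the table name. */
int eDB_INSERT(const char *table_name) {
    if (strcmp(table_name, "Person") == 0)
        return eDB_INSERT_Person();
    return UnknownTable;
}
```

The application sees only the stable wrapper interface; when the data model changes, only the generated per-table functions behind it are regenerated.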
The eDB_INSERT function
The insert function knows all about the constraints and integrity rules that apply to the actual object type handled by this function. Thus the insert function does not have to consult any data dictionary in order to figure out which measures must be taken in order to validate the data.
At pseudo level the code looks like this:

eDB_INSERT(table_name) {
    switch (table_name) {
    default: do_error;
    case table1:
        // Check format
        check format of column1 of table1; if not ok do_error;
        check format of column2 of table1; if not ok do_error;
        check format of columnN of table1; if not ok do_error;
        // Check mandatory columns
        check if mandatory_col1 of table1 is present; if not do_error;
        check if mandatory_col2 of table1 is present; if not do_error;
        check if mandatory_colK of table1 is present; if not do_error;
        // Check each individual value
        check value of col1; if not ok do_error;
        check value of col2; if not ok do_error;
        // Check foreign key
        // Check joint total
        // Check other constraints
        CompressData();
        dbID = KernelStoreRecord(); if not ok do_error;
        idxVal = CreateIndexValue(Index1);
        KernelStoreIndex(Index1, idxVal, dbID); // Check for unique key violations
        idxVal = CreateIndexValue(Index2);
        KernelStoreIndex(Index2, idxVal, dbID); // Check for unique key violations
        idxVal = CreateIndexValue(IndexL);
        KernelStoreIndex(IndexL, idxVal, dbID); // Check for unique key violations
        // If this point is reached the insert went OK
        return NoError;
    case table2:
        ...
    }
}
As one can see, all the necessary code to ensure the complete set of controls is present. Because the structural information is already within the code, the code will execute very efficiently.
The delete function

In exactly the same manner as the insert function, the delete function knows all about the involved integrity rules, and how to interpret the data.
eDB_DELETE(table_name)
{
    switch (table_name)
    {
    default:
        do_error;
    case table1:
        KernelDeleteRecord(RecordType1);
        idxVal = CreateIndexValue(RecordType1, Index1);
        KernelObtain(Index1, idxVal);
        KernelDeleteIndex(Index1, idxVal);  // Check for foreign key violations
        idxVal = CreateIndexValue(RecordType1, Index2);
        KernelObtain(Index2, idxVal);
        KernelDeleteIndex(Index2, idxVal);  // Check for foreign key violations
        ...
        idxVal = CreateIndexValue(RecordType1, IndexL);
        KernelObtain(IndexL, idxVal);
        KernelDeleteIndex(IndexL, idxVal);  // Check for foreign key violations
        // If this point is reached the delete went OK
        return NoError;
    case table2:
        ...
    }
}

The update function
In exactly the same manner as the insert function, the update function knows all about the involved integrity rules. In fact, update combines the insert and delete functionality.
The obtain function
The obtain function comes in a large number of varieties. These varieties fall into three categories: Direct, Relative and Position. As the obtain function knows the data structure, this function does not need any dictionary either.
Common to all obtain functions is that the result set consists of 0 or 1 records. These functions are thus record oriented and not set oriented.
ObtainDirect enables the application to do a specified search in the database, that is, a search for a data entry with a specific value or a combination of such values.

Example

Person.name = 'Bill';
eDB_OBT_Direct(Person);
ObtainRelative enables the application to navigate from the current item forwards or backwards following an index. The obtain-relative function may either be positional (the item next to or prior to the current one), based on a different value (next-different or prior-different) or based on the same value (next or prior with the same value).
Example

eDB_OBT_Relative(Person, Next)           // Returns the next person after Bill
eDB_OBT_Relative(Person, NextDifferent)  // Returns the next person with another name
eDB_OBT_Relative(Person, NextEqual)      // Returns the next person with the same name
eDB_OBT_Relative(Person, Prior)          // Returns the person prior to Bill
eDB_OBT_Relative(Person, PriorDifferent) // Returns the prior person with another name
eDB_OBT_Relative(Person, PriorEqual)     // Returns the prior person with the same name
ObtainPosition enables the application to directly access the first or the last data item.
Example

eDB_OBT_Position(Person, First)  // Returns the first person, according to an index
eDB_OBT_Position(Person, Last)   // Returns the last person, according to an index
In addition to the above parameters, an index name can be added to specify which ordering of the data to use when navigating in the database.

The RISK module 215
The RISK module 215 is what is generally recognized as a storage engine. The RISK module offers the runtime module platform independent access to the data. In order to offer the runtime module high performance access to the data, the RISK module (storage engine) has undergone some dramatic changes. The knowledge about, and interpretation of, data structures was removed from the storage engine. Basically, the storage engine just stores a chunk of unformatted data, and can retrieve the same chunk very efficiently. It has been extended with a set of navigational functions to let the runtime module search for and retrieve just the desired information very fast. The navigational tools include functions like 'GetFirst', 'GetLast', 'GetGreaterThan', 'GetLessThan', 'Next', 'Prior', 'NextDifferent', 'PriorDifferent'.
In order to be able to retrieve entries using, for instance, 'GetGreaterThan', the RISK module 215 has to have different structures that maintain an ordering of the data. Basically, the RISK module offers three types of structures: BASE, INDEX and PROJECTION. See the data storage section below for a further description of these structures.
The data storage 220
Data storage 220 provides for the physical storage of data, i.e. data files. Fig. 3 is a data structure diagram illustrating three types of data structures that may be handled by the RISK module 215 and stored in the data storage 220: BASE 310, INDEX 320 and PROJECTION 330.
Basically, there are two types of storage structures handled by the RISK module. The first is the BASE structure 310, an unordered set of data; the second is the INDEX structure 320, an ordered set of data. The PROJECTION structure 330 is just a simplified INDEX structure.
BASE structure 310 is the basic structure that holds the actual information. The information is stored linearly, in the order it is entered into the database. Each data entry may have a variable size (length) and a unique rowid.

INDEX structure 320 is the structure used to index the BASE structure 310. Note that neither the RISK module nor the data storage itself knows about the logical (and actual) connections between these structures. As in the BASE structure 310, each data entry is a variable sized data area. For each data entry, the length and a set of rowids referencing where the actual value occurs are stored. Note that this is a fairly compact way of storing the data, while at the same time ensuring high performance.

PROJECTION structure 330 is a structure that holds information about a BASE (or INDEX) structure. It is simply a bookkeeping function that counts how many entries of, for instance, the value 'YES' there are in a database table. As with the INDEX structure 320, neither the RISK module nor the data storage itself knows about this interconnection. As in the INDEX structure, each data entry is a variable sized entry, but instead of storing each individual rowid, just the number of rowids is stored.
The internal structure of the data entries in these structures 310, 320, 330 is not known by the RISK module 215. The data entries are seen as complete, indivisible items. In fact, these data entries can have a very complicated structure, but this internal structure is only known by the database runtime module 210.
Every BASE structure 310 data entry has a unique identifier named Rowid. The Rowid is created by the RISK module 215. Data entries in the INDEX structure 320 do not have unique Rowids; instead, each data entry has a set of associated Rowids. Data entries in the PROJECTION structure 330 do not have rowids at all, just a number that tells how many associated rowids a corresponding data entry in an INDEX structure 320 would have had.
As the RISK module 215 does not contain any dictionary-type information, it is not capable of maintaining any consistency between the different structures. It is the runtime module 210 that maintains the internal consistency between the different structures, as that module is the only module that has knowledge of the data model 260.
Work flow
Fig. 4 is a sequence diagram illustrating an example of an update sequence. The sequence comprises four stages:

1. An update request 410 from the application 250, via the application program interface, to the runtime module 210, further to the RISK module 215 and to the data storage 220.
2. An update 420 of the actual data stored in the data storage 220.
3. An integrity check 430, wherein the runtime module 210 and the RISK module 215 search for conflicts/violations.
4. A return 440, wherein the runtime module passes the result back to the application 250.
As appears from fig. 4, there is no data dictionary to be enquired. All necessary structural information is present in the generated database runtime module 210.

A discussion of why this approach is more efficient
Basically, every computer program is a set of loops that repeats a set of tasks a number of times. These loops often lie nested within each other.
Fig. 5 is a schematic diagram illustrating the data processing hierarchy in a database system.
The outer loops 510 are controlled by the application 250 (for instance retrieval of a list of employees).
The loops in between 520 are controlled by the database runtime system 210 (for instance, collection of each of the attributes for the employee, department information etc.; this may involve several loops into several tables to retrieve all the necessary information).
The inner loops 530 are controlled by the storage engine / RISK module 215 (for instance, collection of each of the disk blocks required to gather the necessary information, and splitting these blocks into single data items).

Instinctively, it seems reasonable to add as much functionality as possible to the innermost loop, to stop the looping as early as possible. If the kernel has enough information to stop looping, it is tempting to believe that it has enough knowledge to process the data as well. It is therefore very tempting to add too much functionality to the innermost loop and thereby slow down the entire system. According to the invention, all processing of data is moved from the innermost loops 530 to the loops 520 controlled by the runtime module 210. The RISK module 215 has the sole responsibility of finding the data as efficiently as possible, but it never processes the data.
Bearing in mind that the majority of database processing is looping, the RISK module 215 also serializes the innermost loops. Instead of starting the looping process all over each time it gains control, it simply continues from where it was when it last gave control back to the runtime module 210. This obviously reduces looping depth and reduces the total number of instructions needed to complete a task.
Advantages of adaptable databases
The database structure according to the invention gives:
- a non-interpreting database runtime,
- a minimal sized runtime module tailored to the actual problem area,
- a database runtime that runs the most efficient code,
- full flexibility with respect to external interfaces,
- maintainability as for normal systems,
- a runtime module that is able to handle all kinds of constraints,
- a portable and scalable database system.
A non-interpreting database runtime.
The database approach according to the invention does not contain any online data dictionary. All data dictionary information is embedded in the database runtime module. This means that there is no need for data dictionary enquiries. Requests like 'Describe table Person' can be fulfilled immediately without any dictionary enquiries. All consistency checking code is generated as part of the runtime module, so there is no need for interpreting consistency information.
A minimal sized runtime module tailored to the actual problem area.
The runtime module does not contain any superfluous code. If there are no float data types in use, no code to handle float data types will be generated as part of the database runtime. The same applies to consistency checking. This ensures code that is minimal for the actual data model (and data dictionary) and further ensures a minimal sized database system.
A database runtime that runs the most efficient code.
As the database system does not contain any unnecessary code, it will outperform the market leading database systems. As the code generator knows the exact table layout, all integrity rules and where to apply them, optimal algorithms for data retrieval and manipulation can be chosen by the code generator.
Full flexibility with respect to external interfaces.
A traditional data dictionary normally contains external view information (i.e. SQL Views) that may be used to tailor an interface to a specific application. On the other hand, traditional database systems only support one kind of database interface (Relational, Network oriented, Hierarchical, ObjectRelational or ObjectOriented). They never support a mix of these.
A particular development tool for developing solutions according to the invention has the knowledge of external views (also with respect to different database technologies), so it can generate all interfaces needed, tailored to the actual database.
Fig. 6 is a block diagram illustrating the use of various application program interfaces (APIs) in a database system according to the invention. This figure shows that several applications 250a, 250b, 250c can access the same database. The code generator 240 can not only generate the runtime module, but also (by use of particular templates) different application program interfaces 205a, 205b, 205c on top of the database runtime module 210. Each of these interfaces 205a, 205b, 205c can be used simultaneously to allow access to the database from different applications 250a, 250b, 250c, respectively. For instance, a booking application and a flight control application can both access the schedule in the database using an SQL interface, while an XML based flight publication application concurrently accesses the schedule.
Maintainable as normal systems
The inventive approach allows the same type of system development as traditional database system development. The application developers focus on just the same aspects: developing business logic, user interfaces and interfaces to other systems. The database API (Application Program Interface) looks like ordinary database APIs and offers the same functionality.
If changes in the data model 260 occur, a new runtime module is generated (in a traditional system the database schema (or model) has to be compiled and stored in a dictionary). On the surface this looks the same, but the difference inside the database system is essential.
A runtime module that is able to handle all kind of constraints.
As the development tool knows the concept of conceptual integrity rules, the code that is produced is able to handle all types of constraints. In fact, it is able to handle both dynamic and static integrity rules, and also rules that work both ways (equal and exclude constraints).
A portable and scalable database system
As the generator produces ANSI C code, the C code itself ensures portability. As the actual dictionary is separated from the runtime, the runtime module may also be generated using another programming language, such as Java, C++, Basic or Pascal.
Because of the above-mentioned facts, the database system according to the invention is well suited for hand-held devices with limited computing power.
In the foregoing, the invention has been described for purposes of clarity of understanding. The specified embodiment should therefore be interpreted as illustrative and not restrictive. It will be apparent to those skilled in the art that various changes and modifications may be practiced within the scope of the invention, which is defined by the following claims and their equivalents.

Claims
1. A computer system for providing access to a database upon a request from an application computer program (250), comprising
- a data model (260) represented in a data dictionary (230),
- an application program interface,
- a data storage (220),
- a database kernel (210, 215), providing access from the application program interface to the data storage (220),
characterized in that said database kernel (210, 215) comprises
- a database runtime module (210) for providing access from the application program interface to the database kernel, said runtime module (210) being dynamically changeable, dependent on the data model (260) represented in the data dictionary (230), and
- a storage engine module (215) for providing access from the runtime module (210) to the data storage (220), said storage engine module (215) being invariable, independent of said data model (260).
2. A computer system according to claim 1, further comprising an automatic code generator (240), said code generator being adapted to generate an executable program code for the runtime module (210) based upon the data model (260) represented in the data dictionary (230).
3. A computer system according to claim 2, wherein the program code for the runtime module (210) is generated by the code generator (240) subsequent to any amendments in the data model (260).
4. A computer system according to claim 2, wherein said code generator (240) comprises a source code generating module and a compiler module, said code generating module being adapted to generate a source code to be processed by the compiler module, thus generating said executable program code.
5. A computer system according to claim 4, wherein said code generating module is adapted for
- receiving structural data provided from the data dictionary (230),
- receiving syntactical data provided from a template (720), and
- processing said structural data in accordance with rules defined by the syntactical data, thus producing source code adapted for input to the compiler module.
6. A computer system according to claim 5, wherein said code generating module is adapted to generate descriptive code or procedural code, dependent on the syntactical data provided from the template (720).
7. A computer system according to claim 1, wherein said storage engine (215) is adapted to offer the runtime module (210) a platform independent access to the data stored in the data storage (220).
8. A computer system according to claim 1, wherein said storage engine is adapted to store and retrieve data elements of a first data structure (BASE), comprising an unordered set of data, and data elements of a second data structure (INDEX), comprising an ordered set of data.
9. A computer system according to claim 8, wherein said storage engine is adapted to provide an identifier (Rowid) in each data element (BASE) of said first data structure, and to provide data elements (INDEX) of said second structure wherein each entry has a set of such associated identifiers (Rowid).
10. A method for improving a database system for providing access to a database upon a request from an application computer program (250), said system comprising
- a data model (260) represented in a data dictionary (230),
- an application program interface,
- a data storage (220),
- a database kernel (210, 215), providing access from the application program interface to the data storage (220),
the method being characterized by the steps of
- providing in the database kernel (210, 215) a database runtime module (210) for providing access from the application program interface to the database kernel, said runtime module (210) being dynamically changeable, dependent on the data model (260) represented in the data dictionary (230), and
- providing in the database kernel a storage engine module (215) for providing access from the runtime module (210) to the data storage (220), said storage engine module (215) being invariable, independent of said data model (260).
11. A method according to claim 10, further comprising
- providing in the database system an automatic code generator (240), said code generator being adapted to generate an executable program code for the runtime module (210) based upon the data model (260) represented in the data dictionary (230).
12. A method according to claim 11, wherein the program code for the runtime module (210) is generated by the code generator (240) subsequent to any amendments in the data model (260).
13. A method according to claim 11, wherein said code generator (240) comprises a source code generating module and a compiler module, said code generating module being adapted to generate a source code to be processed by the compiler module, thus generating said executable program code.
14. A method according to claim 13, wherein said code generating module is adapted for
- receiving structural data provided from the data dictionary (230),
- receiving syntactical data provided from a template (720), and
- processing said structural data in accordance with rules defined by the syntactical data, thus producing source code adapted for input to the compiler module.
15. A method according to claim 14, wherein said code generating module is adapted to generate descriptive code or procedural code, dependent on the syntactical data provided from the template (720).
16. A method according to claim 10, wherein said storage engine (215) is adapted to offer the runtime module (210) a platform independent access to the data stored in the data storage (220).
17. A method according to claim 16, wherein said storage engine is adapted to store and retrieve data elements of a first data structure (BASE), comprising an unordered set of data, and data elements of a second data structure (INDEX), comprising an ordered set of data.
18. A method according to claim 17, wherein said storage engine is adapted to provide an identifier (Rowid) in each data element (BASE) of said first data structure, and to provide data elements (INDEX) of said second structure wherein each entry has a set of such associated identifiers (Rowid).
19. In a computer system for providing access to a database upon a request from an application computer program (250), said system comprising
- a data model (260) represented in a data dictionary (230),
- an application program interface, and
- a data storage (220),
a database kernel (210, 215), providing access from the application program interface to the data storage (220), said database kernel (210, 215) comprising
- a database runtime module (210) for providing access from the application program interface to the database kernel, said runtime module (210) being dynamically changeable, dependent on the data model (260) represented in the data dictionary (230), and
- a storage engine module (215) for providing access from the runtime module (210) to the data storage (220), said storage engine module (215) being invariable, independent of said data model (260).
20. In a computer system for providing access to a database upon a request from an application computer program (250), said system comprising
- a data model (260) represented in a data dictionary (230),
- an application program interface,
- a data storage (220), and
- a database kernel (210, 215), providing access from the application program interface to the data storage (220),
said database kernel (210, 215) comprising
- a database runtime module (210) for providing access from the application program interface to the database kernel, said runtime module (210) being dynamically changeable, dependent on the data model (260) represented in the data dictionary (230),
- a storage engine module (215) for providing access from the runtime module (210) to the data storage (220), said storage engine module (215) being invariable, independent of said data model (260), and
- an automatic code generator, said code generator being adapted to generate an executable program code for the runtime module (210) based upon said data model (260).
PCT/NO2002/000274 2001-08-01 2002-07-31 Adaptable database runtime kernel WO2003012690A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP02746220A EP1421517A1 (en) 2001-08-01 2002-07-31 Adaptable database runtime kernel

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NO20013784A NO314426B1 (en) 2001-08-01 2001-08-01 Adaptable database runtime kernel
NO20013784 2001-08-01

Publications (1)

Publication Number Publication Date
WO2003012690A1 (en)

Family

ID=19912712

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NO2002/000274 WO2003012690A1 (en) 2001-08-01 2002-07-31 Adaptable database runtime kernel

Country Status (3)

Country Link
EP (1) EP1421517A1 (en)
NO (1) NO314426B1 (en)
WO (1) WO2003012690A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106919615A (en) * 2015-12-28 2017-07-04 航天信息股份有限公司 Data access method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5819090A (en) * 1994-03-04 1998-10-06 Ast Research, Inc. Application control module for common user access interface
US5835910A (en) * 1994-11-10 1998-11-10 Cadis, Inc. Method and system for comparing attributes in an object-oriented management system
WO1999054833A2 (en) * 1998-04-20 1999-10-28 Recursion Dynamics Inc. Dynamically configurable data storage and processing system optimized for performing database operations
WO2000052571A1 (en) * 1999-03-02 2000-09-08 Acta Technology, Inc. Specification to abap code converter
US6266666B1 (en) * 1997-09-08 2001-07-24 Sybase, Inc. Component transaction server for developing and deploying transaction- intensive business applications

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
IBM TECHNICAL DISCLOSURE BULLETIN, vol. 21, no. 6, November 1978 (1978-11-01), pages 2532 - 2533

Also Published As

Publication number Publication date
NO314426B1 (en) 2003-03-17
EP1421517A1 (en) 2004-05-26
NO20013784L (en) 2003-02-03
NO20013784D0 (en) 2001-08-01


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG UZ VN YU ZA ZM ZW

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BY BZ CA CH CN CO CR CU CZ DE DM DZ EC EE ES FI GB GD GE GH HR HU ID IL IN IS JP KE KG KP KR LC LK LR LS LT LU LV MA MD MG MN MW MX MZ NO NZ OM PH PL PT RU SD SE SG SI SK SL TJ TM TN TR TZ UA UG UZ VN YU ZA ZM

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ UG ZM ZW AM AZ BY KG KZ RU TJ TM AT BE BG CH CY CZ DK EE ES FI FR GB GR IE IT LU MC PT SE SK TR BF BJ CF CG CI GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2002746220

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2002746220

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP