Organic dictionary method for storage and unlocking data
Database structure
The invention relates in general to a method for building a database and relates furthermore to a database structure which can be implemented smoothly by means of said method. The invention is specifically directed to a non-relational database structure.
STATE OF THE ART
A method as described in the first paragraph is known from EP-0351786. This prior art publication describes especially a data storage apparatus for managing data of individual names, telephone numbers and the like. The apparatus comprises a number of first storage means each storing a table of data items about the same subject. For instance, the first table contains names of companies, the second table contains names of employees, the third table contains telephone numbers etc. Each entry in said tables is characterised by its address in the respective table. The apparatus comprises furthermore a second storage means divided into records each assigned to one individual. Each record contains for the respective individual a combination of addresses of the tables in the first storage means. In this way each individual is linked to a company, to a telephone number, etc.
This prior art apparatus only stores data records in a space efficient manner. The width of each table can be restricted dependent on the largest entry and each table only comprises unique entries. For instance, the name of a company employing a large number of employees is entered only once. In comparison with a database with records which for each individual hold the company name as a separate entry a significant reduction of memory space can be realised. This prior art apparatus does not store data about relations between the various subjects. The database in this prior art apparatus is certainly not a relational database.
According to the state of the art, building a relational database requires analysing the data to be stored in the database, predefining the various fixed table relations between the data and the fields within the fixed tables, making a structural and functional design and finally generating the necessary software.
Possible problems with existing databases could arise in relation to:
- adding new data when the data fields are not defined,
- adding or changing table relations within the pre-defined data model,
- formulating new queries which are not pre-defined,
- adding additional fields within the pre-defined tables, etc.,
- converting or importing data from other databases with different data model structures,
- exporting data at the lowest aggregation level,
- increasing pollution of the data because there is no check that the data are unique.
THE INVENTION
The above-mentioned disadvantages are overcome by the database structure according to the invention, which database structure comprises:
a) a facts dictionary containing facts, each fact being an isolated indivisible unique piece of information and each fact being entered into the facts dictionary as a combination of a unique fact identification code, a fact description and a fact value,
b) an identity dictionary containing identities, each identity being an isolated combination of facts which together form a unique unit and each identity being entered into the identity dictionary as a combination of a unique identity identification code, an identity description and an indication of those facts which together define the identity,
c) a dynamic and non-predefined descriptive relation dictionary, each relation being a dynamic connection between identities, between an identity and a relation, or between relations, and each relation being entered into the dictionary as a combination of a unique relation identification code, a relation description and an indication of the identities or relations involved in the relation.
In case an identity is defined on the basis of only one fact, a relation could be defined between an identity and a fact or even between two facts.
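By way of illustration only, the three dictionaries a) to c) could be modelled as in the following minimal sketch. The class names, the field names and the use of plain integers as identification codes are assumptions made for this sketch, and the values "A" and "B" are placeholders; none of this is prescribed by the structure itself.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    fact_id: int        # unique fact identification code
    description: str    # fact description
    value: str          # fact value


@dataclass(frozen=True)
class Identity:
    identity_id: int            # unique identity identification code
    description: str            # identity description
    fact_ids: tuple[int, ...]   # facts which together define the identity


@dataclass(frozen=True)
class Relation:
    relation_id: int              # unique relation identification code
    description: str              # relation description
    member_ids: tuple[int, ...]   # identities involved in the relation


# Each dictionary maps an identification code to its entry (placeholder data).
facts = {1: Fact(1, "person name", "A"), 2: Fact(2, "firm name", "B")}
identities = {1: Identity(1, "person", (1,)), 2: Identity(2, "firm", (2,))}
relations = {1: Relation(1, "business relation", (1, 2))}


def members_of(relation_id: int) -> list[list[str]]:
    """Resolve a relation to the fact values of each identity involved."""
    return [
        [facts[f].value for f in identities[i].fact_ids]
        for i in relations[relation_id].member_ids
    ]
```

In this sketch a relation is resolved in two hops, from relation to identities and from identities to facts, so that no fact value is ever copied into the relation dictionary.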
Although the relation dictionary comprises for each entry a description of the respective relation, the original cards provide some more information which will be lost or hard to trace if no other measures are taken. In general two identities are involved in a relation and each identity plays a specific role in said relation. For instance, a person A has a business relation with a firm B. More specifically, the role of person A is that of employee and the role of firm B is that of employer. However, in another business relation between person A and firm B the person A could be the owner of the firm B, in which case the firm B has the property role. In general, different relations may exist between two identities, in which the identities may play different roles.
To include role information into the database it is preferred that in the relation dictionary for each of the identities involved it is indicated which role they play in the respective relation. In other words, a relation is entered into the relation dictionary as a combination of a unique relation identification code, a relation description, an indication of the first identity involved, a description of the role of said first identity, an indication of the second identity involved and a description of the role of said second identity.
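A relation entry carrying role information could be sketched as follows, using the person A and firm B of the preceding paragraph. The class name, the field names and the integer identification codes are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RoleRelation:
    relation_id: int       # unique relation identification code
    description: str       # relation description
    first_identity: int    # indication of the first identity involved
    first_role: str        # role of the first identity
    second_identity: int   # indication of the second identity involved
    second_role: str       # role of the second identity


PERSON_A, FIRM_B = 1, 2  # hypothetical identity identification codes

relations = [
    RoleRelation(1, "business relation", PERSON_A, "employee", FIRM_B, "employer"),
    RoleRelation(2, "business relation", PERSON_A, "owner", FIRM_B, "property"),
]


def roles_of(identity_id: int) -> list[str]:
    """Collect every role the given identity plays across all relations."""
    roles = []
    for r in relations:
        if r.first_identity == identity_id:
            roles.append(r.first_role)
        if r.second_identity == identity_id:
            roles.append(r.second_role)
    return roles
```

The two entries share the same pair of identities and the same relation description; only the roles distinguish the employment relation from the ownership relation.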
In the above-described embodiments a lot of descriptions are included in the various dictionaries. These descriptions may appear more than once, and in a practical case a specific description may appear many times. That requires a lot of memory space and is not very efficient. To avoid said disadvantage it is preferred that the structure furthermore comprises d) a fact description dictionary whereby each fact description is entered into the dictionary as a combination of a unique fact description identification code and a unique name of the respective fact. After concentrating the actual descriptions in a specific dictionary, the fact dictionary can be made more manageable by replacing each fact description by the fact description identification code which in combination with the respective fact description is entered in the fact description dictionary.
To avoid the above-mentioned disadvantage it is furthermore preferred that the structure furthermore comprises e) an identity description dictionary whereby each identity description is entered into the dictionary as a combination of a unique identity description identification code and a typifying unique indication of the identity description as entered in the identity description dictionary. After concentrating the actual descriptions in a specific dictionary, the identity dictionary can be made more manageable by replacing each identity description by the identity description identification code which in combination with the respective identity description is stored in the identity description dictionary.
To avoid the above-mentioned disadvantage it is also preferred that the structure furthermore comprises f) a relation description dictionary whereby each relation description is entered into the dictionary as a combination of a unique relation description identification code and a typifying unique indication of the type of relation. After concentrating the actual descriptions in a specific dictionary, the relation dictionary can be made more manageable by replacing each relation description by the relation description identification code which in combination with the corresponding relation description is entered into the relation description dictionary.
That may still leave some role descriptions in the relation dictionary. It is therefore preferred that the structure furthermore comprises: g) a role description dictionary whereby each role description is entered into the dictionary as a combination of a unique role description identification code and a typifying unique identification of the role description.
After concentrating the actual descriptions in a specific dictionary the relation dictionary can be made more manageable by replacing each role description by the role description identification code which in combination with the corresponding role description is entered into the role description dictionary.
EXAMPLES
The invention will be explained first with reference to a few examples whereafter a more general description of the database structure will be provided.
EXAMPLE 1
A database will be built from data retrieved from a number of business cards. Three of those cards look like:
Piet Peters
The Explorid Group
Director R&D
Reeuwijkse Poort 208
NL-2811 MZ Reeuwijk
The Netherlands
T +31 0 182 308 151
F +31 0 182 308 150
M +31 0 651 501 036

Mario de Nries
The Explorid Group
Director
Reeuwijkse poort 208
2811 MZ Reeuwijk
The Netherlands
T +31 0 182 308 151
F +31 0 182 308 150
M +31 0 651 501 038

Kees Jansma
Nederlandsch Octrooibureau
Partner
Scheveningseweg 92
NL-2517 KZ Den Haag
The Netherlands
T +31 0 703 527 500
F +31 0 703 527 528
First of all the various facts have to be identified, whereby a fact is considered to be an isolated indivisible piece of information. To start with, the first card above provides the first fact, "Piet", which can be described as the first name of a person. The second fact is "Peters", which can be described as the surname of a person. Each word or combination of words could be a fact. Each fact can be indicated by a fact description. There may be more facts which have the same description, e.g. the fact "Mario" (the name on the second card) has the same description as the first fact, i.e. "first name of a person". On the other hand each fact is uniquely defined by a unique fact identification. Even if there is a further card in the collection which bears the name Piet Peters, it receives a unique identification code so that persons with the same name can be distinguished.
From the above-indicated cards the following facts can be identified:
The cards provide furthermore information about a number of identities whereby an identity is considered to be an isolated dynamic combination of facts which define a unique unit. The two facts "Piet" and "Peters" together define a person. The two facts "Kees" and "Jansma" define another person.
By combining two or more facts the following identities in this example can be identified:
Identity
Person
Business
Address
Website
In general, identities which stand completely alone are very rare, certainly in larger data collections. It is far more common that an identity has at least one relation with another identity. In this example, e.g., Piet Peters has a working relation with The Explorid Group.
By considering combinations of identities the following relations can be recognised:
Relation
Working
Address
According to the above scheme three dictionaries can be filled with data from the cards, a Facts dictionary, an Identities dictionary and a Relations dictionary.
Each entry in the Facts dictionary comprises a fact identification code (which could be a simple succession number as in this example), a fact description and a fact value:
Each entry in the Identity dictionary comprises an identity identification code (which could be a simple succession number as in this example), an identity description and an indication of the facts which together determine the identity.
Each entry in the Relations dictionary comprises a relation identification code (which could be a simple succession number as in this example), a relation description and an indication of the identities which are involved in the relation.
Facts Dictionary
Identities Dictionary
Relations Dictionary
In the above-described Identities Dictionary the facts are mentioned in two columns, identity facts and other facts.
In general a fact can be considered as an isolated, indivisible unique piece of information, e.g. the sex of a person, the birth date of a person, the colour of a car, the text of a mailing, the street name, the time an organic relation exists etc. Facts can be divided into a) identity facts or static facts and b) dynamic facts. Whether a fact is an identity fact or a dynamic fact depends on the use of the fact in the definition of an identity. A fact can be both, but not for the same identity, e.g. the price can be an identity fact for the identity product, but it can be a dynamic fact for the identity transaction.
An identity fact can be defined as a fact of which the value doesn't change during the lifetime of an identity and which is used for identifying the identity, e.g. sex, birth date, street name, communication date, transaction time etc.
A dynamic fact can be defined as a fact of which the value can change during the lifetime of an identity or an organic relation, e.g. the colour of a car, different answers to the same question from a questionnaire, the telephone number, the time that a person is reachable at an address.
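The distinction between identity facts and dynamic facts can be sketched as follows for the first card of this example. The function and class names are assumptions, as is the fact description "mobile number"; the succession numbers produced here are not meant to reproduce the numbering of the dictionaries shown above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

facts: dict[int, tuple[str, str]] = {}   # fact id -> (fact description, fact value)


def add_fact(description: str, value: str) -> int:
    """Enter a fact under the next succession number and return its id."""
    fact_id = len(facts) + 1
    facts[fact_id] = (description, value)
    return fact_id


@dataclass
class IdentityEntry:
    description: str
    identity_facts: tuple[int, ...]                       # fixed for the identity's lifetime
    other_facts: list[int] = field(default_factory=list)  # dynamic facts, may change


first_name = add_fact("first name", "Piet")
surname = add_fact("surname", "Peters")
mobile = add_fact("mobile number", "+31 0 651 501 036")

identities = {1: IdentityEntry("Person", (first_name, surname), [mobile])}


def replace_dynamic_fact(identity_id: int, old_fact: int, description: str, value: str) -> int:
    """Swap one dynamic fact for a newly entered fact; identity facts stay untouched."""
    entry = identities[identity_id]
    new_fact = add_fact(description, value)
    entry.other_facts[entry.other_facts.index(old_fact)] = new_fact
    return new_fact
```

Keeping the identity facts in an immutable tuple and the dynamic facts in a list is one way of expressing that only the latter may change; the superseded fact remains in the facts dictionary in this sketch.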
EXAMPLE 2
This example is based on the same collection of cards as used in example 1. A disadvantage of the relations dictionary is that the dictionary only indicates the identities which are involved in the relation without indicating the specific role they play in the relation. For example, the relations dictionary in the above example mentions as relation 1 a working relation between identity 1 (defined by fact 1 and fact 2 = the person Piet Peters) and identity 2 (defined by fact 4 = the business The Explorid Group). This relation becomes much clearer if it is indicated that the role of identity 1 is "employee" and the role of identity 2 is "employer". The addition of a role indicator has no influence on either the Facts Dictionary or the Identities dictionary. These two dictionaries are identical to the ones in example 1. However, by adding role information the Relations dictionary changes into:
Relations dictionary
Identities may exist in various different contexts. To make that clear we add another card to the collection, which card is issued by the soccer club "Reeuwijk United". The membership card of Piet Peters reads:
Soccer club "Reeuwijk United"
Sponsored by The Explorid Group
Member Piet Peters
Platteweg 37
2814 RG Reeuwijk
Membership nr A675
Year of birth 1965
By adding this card to the collection some new facts are introduced in the Facts dictionary:
Facts Dictionary
The Identities dictionary could be extended just by adding information about the soccer club (fact Id 32), the address of Piet Peters (Fact Id 33, 34, 35, 8) and further personal data about Piet Peters (36, 37), but in that case a lot of information would be lost. To avoid that, one or more contexts are defined for each identity, in which the respective identity can be defined.
For instance, the identity Piet Peters first of all exists in:
- a personal context which is built on pure personal facts (1, 2, 33, 34, 35, 37),
- a working context which is built on business facts (3, 12, 14),
- a sporting context which is built on sporting facts (36).
Taking various different contexts into account the Identities Dictionary will become:
The identity facts in the third column above in fact define an inherent context without which the respective identity cannot exist. The dynamic facts are subdivided into one or more further contexts. Each context is mentioned in a column "context" followed by a column "facts" in which the corresponding facts are indicated.
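The context columns described above can be sketched as a mapping from context name to fact ids. The fact ids below are the ones quoted in the text for Piet Peters; the dictionary layout and the function names are assumptions made for illustration.

```python
# Identity entry for Piet Peters; the fact ids are the ones quoted in the text.
piet_peters = {
    "description": "Person",
    "identity_facts": (1, 2),   # inherent context, defines the identity
    "contexts": {
        "personal": (1, 2, 33, 34, 35, 37),
        "working": (3, 12, 14),
        "sporting": (36,),
    },
}


def contexts_of_fact(identity: dict, fact_id: int) -> list[str]:
    """Name every context of the identity in which the given fact occurs."""
    return [name for name, ids in identity["contexts"].items() if fact_id in ids]


def facts_in_context(identity: dict, context: str) -> tuple[int, ...]:
    """Return the fact ids recorded for one context (empty if the context is unknown)."""
    return identity["contexts"].get(context, ())
```

A query for the working context then returns only the business facts, so the membership number recorded in the sporting context does not leak into a business view of the same person.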
Because some new relations are introduced such as Piet Peters being a member of the soccer club and the business The Explorid Group being a sponsor of the soccer club, a number of rows has to be added to the Relations Dictionary:
Relations dictionary
EXAMPLE 4
In the rather simple examples of a database according to the invention, described above as examples 1 and 2, a lot of data items are stored more than once. See for instance the various description columns. This requires a lot of memory space. A solution for said disadvantage is to add further dictionaries for storing the various different descriptions. That is done in this fourth example, wherein five further dictionaries are added in which the fact descriptions, identity descriptions, relation descriptions, role descriptions and context descriptions are stored. In the original three dictionaries these descriptions are replaced by relatively short description identification codes. The eight resulting dictionaries are:
Identity Description Dictionary
Relation Description Dictionary
Role Description Dictionary
Context Description Dictionary
Identity Dictionary
Relations Dictionary
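The replacement of descriptions by description identification codes in this example can be sketched as follows for the fact descriptions. The class name, the method names and the choice of sequential integer codes are assumptions; the same pattern would apply to the identity, relation, role and context descriptions.

```python
from __future__ import annotations


class DescriptionDictionary:
    """Stores each distinct description once and hands out a short code for it."""

    def __init__(self) -> None:
        self._code_by_text: dict[str, int] = {}
        self._text_by_code: dict[int, str] = {}

    def code_for(self, text: str) -> int:
        """Return the existing code for a description, or assign the next one."""
        if text not in self._code_by_text:
            code = len(self._code_by_text) + 1
            self._code_by_text[text] = code
            self._text_by_code[code] = text
        return self._code_by_text[text]

    def text_for(self, code: int) -> str:
        """Look up the full description behind a code."""
        return self._text_by_code[code]


fact_descriptions = DescriptionDictionary()

# Facts dictionary: fact id -> (fact description code, fact value).
raw_facts = [("first name", "Piet"), ("surname", "Peters"), ("first name", "Mario")]
facts = {
    fact_id: (fact_descriptions.code_for(description), value)
    for fact_id, (description, value) in enumerate(raw_facts, start=1)
}
```

The description "first name" is stored once in the description dictionary, while the facts "Piet" and "Mario" both carry only its code.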
EXAMPLE 5 Multilanguage dictionary
We would like to build an organic database for storing translations of words in different languages in combination with a descriptive meaning of the words in English. Suppose we have the following words with translations and meanings:
The whole procedure will be subdivided in three steps:
- a first step in which the various kinds of facts are recognised,
- a second step in which the descriptions are made and in which the definition organic dictionaries are filled and
- a third step in which the data organic dictionaries are filled.
Step 1.
The following facts can be identified:
The following identities can be identified:
Between the identities the following organic relations can be identified:
Step 2.
The definition organic dictionaries will be filled as follows:
Fact Description Dictionary
Identity Description Dictionary
Role Description Dictionary
The data organic dictionaries will be filled as follows:
Facts Dictionary
Identities Dictionary
Relations Dictionary
Diseases gene database
We would like to build an organic database for storing diseases and the genes that are related to these diseases. Let's suppose that we have the following diseases and genes:
Asthma
Crohn's disease
We use the same three-step method as in the preceding example.
Step 1.
The following facts can be identified:
The following contexts and identities can be identified:
Between the identities the following organic relations can be identified:
The definition organic dictionaries will be filled as follows:
Context Description Dictionary
Identity Description Dictionary
Relation Description Dictionary
Role Description Dictionary
Step 3.
The data organic dictionaries will be filled as follows:
Facts Dictionary
Identities Dictionary
Relations Dictionary
In the following part of the description a general definition of the Organic dictionary structure according to the invention will be provided.
The organic dictionary model
The organic dictionary model contains two types of organic dictionaries: a first number of definition organic dictionaries which contain definitions and descriptions, and a second number of data organic dictionaries which contain the data itself. In general five description or definition dictionaries and three data dictionaries are sufficient to store all the data. However, if these numbers are not suitable, it is possible to create more dictionaries. In fact, there is no limit to the number of dictionaries.
The attached drawing shows a very schematic view of a database structure according to the invention. The various dictionaries, which will be described in detail in the following paragraphs, are indicated by the following reference numbers:
1 Fact description dictionary
2 Context description dictionary
3 Identity description dictionary
4 Role description dictionary
5 Relation description dictionary
6 Facts dictionary
7 Identities dictionary
8 Relations dictionary
It is important to understand that the lines between the various dictionaries do not indicate "relations" as in prior art "relational databases" but indicate how the entities are developed. The horizontal lines make clear that identities are defined as one fact or a combination of facts and that there could be relations between identities (or between relations and even between facts if an identity is based on only one fact). The vertical lines indicate that the description dictionaries comprise descriptions of the data dictionaries connected thereto by the respective vertical lines.
In the following each of the dictionaries in a general embodiment with three data dictionaries and five description dictionaries will be described using general terms:
Fact Description Dictionary
A definition organic dictionary which contains the descriptions of the facts.
Facts Dictionary
A data organic dictionary which contains the values of the facts.
Context Description Dictionary
A definition organic dictionary which contains the descriptions of the contexts in which the identities occur.
Identity Description Dictionary
A definition organic dictionary which contains the descriptions and definitions of the identities.
Relation Description Dictionary
A definition organic dictionary which contains the descriptions of the types of organic relations.
Role Description Dictionary
A definition organic dictionary which contains the descriptions of the roles that an identity can have in an organic relation.
Relations Dictionary
A data organic dictionary which contains the organic relations between the identities. Relations are possible between identities, between relations, and between an identity and a relation. Relations can have their own dynamic facts.
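A Relations Dictionary whose entries may point either at identities or at other relations, and which may carry their own dynamic facts, could be sketched as follows. All names, identification codes, roles and the "supervision" relation are hypothetical and serve only to illustrate the three kinds of relation mentioned above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Literal

Kind = Literal["identity", "relation"]


@dataclass
class Participant:
    kind: Kind        # does target_id refer to an identity or to a relation?
    target_id: int
    role: str


@dataclass
class Relation:
    description: str
    participants: tuple[Participant, Participant]
    dynamic_facts: list[int] = field(default_factory=list)  # fact ids owned by the relation


relations: dict[int, Relation] = {}


def add_relation(description, first, second, dynamic_facts=()):
    """Enter a relation after checking that any referenced relation already exists."""
    for p in (first, second):
        if p.kind == "relation" and p.target_id not in relations:
            raise KeyError(f"unknown relation {p.target_id}")
    relation_id = len(relations) + 1
    relations[relation_id] = Relation(description, (first, second), list(dynamic_facts))
    return relation_id


# Relation between two identities (hypothetical ids and roles).
working = add_relation(
    "working",
    Participant("identity", 1, "employee"),
    Participant("identity", 2, "employer"),
)
# Relation between an identity and a relation: a third identity relates to
# the working relation itself and the new relation owns a dynamic fact.
supervision = add_relation(
    "supervision",
    Participant("identity", 3, "supervisor"),
    Participant("relation", working, "supervised relation"),
    dynamic_facts=[40],
)
```

Tagging each participant with its kind keeps identity codes and relation codes in separate number spaces, which is one possible design; a shared code space would work equally well.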
GLOSSARY
Finally, a list of terms which are important for a correct understanding of the invention is presented below, together with their meaning.
Attribute
The item of an organic dictionary element that defines the type of the values stored in that element; see also organic dictionary.
Context
An interrelated condition in which an identity exists or occurs, e.g. the identity person occurs in the personal context and the identity business occurs in the business context. More than one identity can occur in a context, e.g. the identities invoice and transaction occur in the financial context and the identities zip code and address occur in the location context. An identity can occur in more than one context, e.g. the identity product can occur in the context product and in the context transaction. A context can be freely defined, but it must contain at least one identity.
Context method
A dynamic method by which all kinds of new contexts can be defined for understanding the information. With this method different contexts can be created which describe how identities, roles, relations and dynamic facts behave and what they mean. The method makes it possible to understand information contextually and to understand how identities are related to each other contextually. Any context can be created as long as the context contains at least one identity which can be uniquely identified. For example, a personal context can be created which contains all information about a person; this person is a unique identity and can be understood from the personal context. Likewise a financial context can be created which contains, for example, an identity transaction, so that transaction behaviour can be understood from a transaction context. Because there are organic relations between a personal context and a financial context, a person can be understood as a unique identity through a personal context and the same person can be understood through a financial context.
Data organic dictionary
An organic dictionary which contains data; see also organic dictionary.
Definition organic dictionary
An organic dictionary which contains definitions and descriptions; see also organic dictionary.
Dynamic fact
A fact of which the value can change during the lifetime of an identity or an organic relation, e.g. the colour of a car, different answers to the same question from a questionnaire, the telephone number, the time that a person is reachable at an address.
Element
The combinations of the values with the same index from all the entries of an organic dictionary see also organic dictionary.
Entry
The item of an organic dictionary element that identifies the values stored in that element; see also organic dictionary.
Fact
An isolated, indivisible unique piece of information, e.g. the gender of a person, the birth date of a person, the colour of a car, the text of a mailing, the street name, the time an organic relation exists etc. Whether a fact is an identity fact or a dynamic fact depends on the use of the fact in the definition of an identity. A fact can be both, but not for the same identity, e.g. the price can be an identity fact for the identity product, but it can be a dynamic fact for the identity transaction.
Identity
An isolated dynamic combination of facts that forms a unique identifiable combination of facts, e.g. the identity person can be a combination of the facts birth date, gender, surname and first name; the identity business can be a combination of the facts chamber of commerce number, tax number and name; the identity communication can be a combination of the facts communication date, medium and name, etc. An identity can be freely defined from facts, but the combination of facts must lead to a unique and identifiable combination of facts. Whether a piece of information is a fact or an identity is somewhat arbitrary and depends on its use. A text can be defined as a fact, but if the words of a text are defined as facts, the text itself can be defined as an identity. The difference between a fact and an identity is that a fact can only be used for identifying an identity or for describing a property of an identity or an organic relation, while an identity can have organic relations with other identities. So if a piece of information has organic relations then it is an identity; otherwise it can be either.
Identity fact
A fact of which the value doesn't change during the lifetime of an identity and which is used for identifying the identity, e.g. gender, birth date, street name, communication date, transaction time etc.
Item
A part of an element of an organic dictionary; see also organic dictionary.
Organic
The way of storing data according to the context method.
Organic dictionary
A dynamic data structure which adapts itself automatically and is built from elements used for storing data. Each element in the dictionary is a list of three items: the entry, the value and the attribute. The dictionary can contain an unlimited number of elements; the entry contains the identification of the values, the value can contain an unlimited list of data and the attribute contains the type of the value.
Organic relation
An organic relation arises from the data and is not pre-defined as in a relational database; it is also a dynamic description of a relationship between identities in which each identity has its own role. An organic relation is defined by the data itself and is therefore not pre-defined in the first place. Examples are the address of a person, where the role of the address is residence and the role of the person is resident, and the company of a person, where the role of the company is employer and the role of the person is employee. The same identities can have several different organic relations. For example, two brothers who both study at the same university, play sports at the same sports club and live next door to each other have four relations. First, they are brothers, so their role is brother; second, they study at the same university, so their role is colleague; third, they play sports at the same sports club, so their role is club member; and fourth, they live next door to each other, so their role is neighbour.
Role
Behaviour of an identity in an organic relation, e.g. a father, a working place, a buyer, an employer, an employee etc.
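The glossary example of the two brothers, who have four different organic relations with each other, can be sketched as follows. The roles are the ones named in the entry "Organic relation"; the relation descriptions, the identification codes and the function name are assumptions made for illustration.

```python
from __future__ import annotations

from collections import defaultdict

BROTHER_1, BROTHER_2 = 1, 2   # hypothetical identity identification codes

# (relation description, first identity, its role, second identity, its role)
relations = [
    ("family", BROTHER_1, "brother", BROTHER_2, "brother"),
    ("study", BROTHER_1, "colleague", BROTHER_2, "colleague"),
    ("sport", BROTHER_1, "club member", BROTHER_2, "club member"),
    ("residence", BROTHER_1, "neighbour", BROTHER_2, "neighbour"),
]

by_pair = defaultdict(list)
for description, first, first_role, second, second_role in relations:
    # A frozenset key makes the lookup independent of the order of the identities.
    by_pair[frozenset((first, second))].append((description, first_role, second_role))


def relations_between(a: int, b: int) -> list[tuple[str, str, str]]:
    """List every organic relation recorded between two identities."""
    return by_pair.get(frozenset((a, b)), [])
```

Because nothing in the structure limits the number of entries per pair of identities, a fifth relation between the brothers could be added later without changing any definition.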