EP2519899A1 - Method and arrangement for data storage - Google Patents
Method and arrangement for data storage
- Publication number
- EP2519899A1 EP09852855A
- Authority
- EP
- European Patent Office
- Prior art keywords
- sub
- identifier
- database
- tables
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/289—Object oriented databases
Definitions
- the invention relates generally to a method and arrangement for providing efficient storage and retrieval of data in a database.
- the stored data may relate to any type of object information that can be arranged logically in different classes or categories of a hierarchical structure.
- the term “object” is used throughout this description, although other terms could also be used such as items, assets, entities, etc.
- attributes are typically relevant for each object, and data referring to such attributes is stored in the database available for retrieval in response to data queries.
- the attributes typically form columns in one or more tables in the database.
- the term “attribute” represents any subject, aspect or characteristic that may be relevant for describing an object.
- An attribute may be either generic or more specific.
- media items may be divided into the classes “Moving Images” and “Still Images”, while “Moving Images” may in turn be divided into the classes “Movies”, “Sports”, “Documentaries”, and so forth.
- the different classes can form a hierarchical structure which can be represented by tables in the database.
- Each class may be associated with one or more attributes relevant for that class.
- Fig. 1 illustrates schematically a hierarchical structure of different classes in which data for objects can be logically organized. On top of the hierarchy is a “Root class” with attributes denoted “Info 1”, “Info 2”, “Info 3”...
- these attributes refer to general or “abstract” data relevant for all objects in the database, e.g. title, description, language, input date, etc., using the example of media items.
- An object identity “Object ID” is also specified in a field of this class to identify each object, which can be used as a primary key for data searches.
- the Root class may be called the “Asset class”.
- the next level comprises two sub-classes A and B with attributes denoted “Info A1”, “Info A2”, “Info A3”... and “Info B1”, “Info B2”, “Info B3”...
- each sub-class is relevant only for objects that fit into that sub-class.
- sub-class A may refer to Moving Images and the attributes therein may refer to encoding, duration, color, etc.
- sub-class B referring to Still Images may have other relevant attributes such as width, height, resolution, format, etc.
- sub-classes AA and AB of sub-class A may refer to Movies and Sports, respectively, each having one or more sub-class specific attributes. All sub-classes A, B, AA, BB... have the Object ID field to identify the objects fitting into the respective sub-class.
- a single table is used for all sub-classes and attributes in the entire class hierarchy, and the table thus comprises columns or the like for all attributes used in the database.
- the table would contain columns for all attributes Info 1-3, Info A1-A3, Info B1-B3, Info AA1-AA3, Info AB1-AB3... in all sub-classes, thus typically resulting in a great number of columns in the table.
- One table is used for each sub-class and contains all attributes relevant for objects that fit into that sub-class, including attributes in all higher classes to which that sub-class is connected.
- a table for sub-class AA would contain columns for attributes Info AA1-AA3, Info A1-A3, Info 1-3...
- a table for sub-class AB would contain columns for attributes Info AB1-AB3, Info A1-A3, Info 1-3.
- a method for handling object-related data in a hierarchical database comprising a root table and a plurality of predefined sub-tables organized in multiple hierarchical levels below said root table, each table comprising one or more attributes under which object data can be stored.
- a table identifier TID is assigned to each sub-table, and the table identifier TID is generated by multiplying a series of prime numbers associated with parent tables of said sub-table.
- an object identifier OID is assigned to each object in the database such that the table identifiers TID:s of a path of hierarchically connected sub-tables in different consecutive levels are encoded into the object identifier OID.
- the data on the object is stored in the connected sub-tables in the table path, where appropriate.
- searching for data can be limited to those tables that are valid for the searched object and where data on the object is expected to be found by encoding the valid table path into the object identifier OID, such that the processing delays and load for searching the database can be minimised.
- an arrangement is provided that is configured to handle object-related data in the above hierarchical database.
- the database arrangement comprises a configuring function adapted to assign a table identifier TID to each sub-table, generated by multiplying a series of prime numbers associated with parent tables of said sub-table.
- the database arrangement also comprises a storing function adapted to assign an object identifier OID to each object in the database, such that the table identifiers TID:s of a path of hierarchically connected sub-tables in different consecutive levels are encoded into the object identifier OID, and to store data on said object in the connected sub-tables in the table path.
- the table identifier TID is generated for a sub-table by multiplying the parent table identifier TID of that sub-table's parent table in an adjacent level with a next available prime number of said parent table.
- a table path can readily be encoded in a single number used as object identifier OID, by means of factorization of prime numbers.
- data relating to an object is stored in the database by determining a table path with an identified sub-table relevant for said object and parent tables hierarchically connected to said sub-table, assigning an object identifier OID to the object computed from the table identifier TID of the determined relevant sub-table, and storing the data and the object identifier OID in the tables of said path.
- the table identifier TID is encoded in the object identifier OID while all parent table identifiers TID:s can be determined from that table identifier TID by prime number factorization thereof.
- when receiving a data query referring to an object identifier OID of a requested object, object-related data can be retrieved from the database by computing from the received object identifier OID a table identifier TID of a sub-table relevant for the requested object.
- the table path encoded into the received object identifier OID is then determined by computing parent table identifiers TID of successive levels until the root table is reached.
- Object data is then fetched from the tables in the determined table path, and the fetched data is finally returned in response to the data query.
- a new sub-table is added to the database by computing a table identifier TID for the new sub-table by multiplying the table identifier TID of its parent table by the next available prime number. If the computed table identifier TID is not below 2^k, where k is an integer selected to set 2^k as a maximum value of the table identifier TID in the database hierarchy, k is incremented by 1 and the object identifiers OID:s of objects stored in the database are updated according to the new k.
- the new sub-table can be added to the database using the computed table identifier TID.
- Fig. 1 illustrates an exemplary hierarchical structure with different classes of object information.
- Fig. 2 is a schematic block diagram illustrating how a data query can be handled by a logical search function, according to one possible embodiment.
- Fig. 3 is a schematic representation of an exemplary database structure, according to another possible embodiment.
- Fig. 4 is a flow chart with steps performed by a logical storing function to store object data in a database, according to further exemplary embodiments.
- Fig. 5 is a flow chart with steps performed by a logical search function to search for object data in a database, according to further exemplary embodiments.
- Fig. 6 is a flow chart with steps performed by a logical configuring function to add a new table in a database, according to further exemplary embodiments.
- Fig. 7 is a block diagram illustrating in more detail an arrangement with logical storing, search and configuring functions connected to a database, according to further exemplary embodiments.
- a solution is provided to enable efficient storage and rapid retrieval of object data from a database with hierarchically arranged tables associated with different logical classes or categories.
- Each table comprises one or more attributes arranged as columns or equivalent, under which data can be stored for different objects.
- the database can be configured basically according to approach 3) above, i.e. using just one table for each sub-class which contains attributes of that sub-class only, where duplication of attributes and stored data in multiple tables is not necessary.
- the database is thus logically organized in a hierarchical structure of tables, e.g. according to the scheme shown in Fig. 1, including a root table at a top level and multiple sub-tables in successively lower layers in the hierarchy.
- a "parent table” is a table to which one or more subordinate sub-tables are connected in the hierarchical structure. Thus, each sub-table is connected to a parent table.
- an object identifier "OID" is assigned to each object stored in the database where identities of all interconnected sub-tables relevant for a particular object are effectively encoded into the OID of that object.
- the tables that are relevant for searching can be determined solely from the OID of a requested object while no further tables need to be searched, which will basically avoid any unnecessary searching.
- no table(s) relevant to a searched object will be missed in the search
- TIDs are numbers that can be encoded into the object identifiers, OID:s, as follows.
- Each TID is generated by multiplying a series of prime numbers, such that the TID of a sub-table connected to a parent table is generated by multiplying the TID of the parent table with a "next available prime number", i.e. a prime number that has not been used yet to generate any other sub-table connected to that parent table, which will be described in more detail and by means of examples below.
- k is an integer selected to set 2^k as a maximum value of the TID in the entire database hierarchy, and n is a sequence number for objects in the determined relevant sub-table.
- the search function 200 is configured to search for object data in a hierarchical database 202 upon receiving a search query or request for object data, as follows.
- the database 202 is basically configured as described above.
- Mod is a modulus operation determining the residual integer after dividing OID with 2^k.
- the search function 200 extracts all lower TID:s in a path of interconnected sub-tables by identifying the prime numbers that generate the first TID, illustrated as a block 206. This operation is possible since each sub-table TID has been generated from its parent table TID, that is by multiplying the TID of the parent table with a "next available prime number", as described above. Basically, the TID determined by (2) in block 204 above is prime number factorized in block 206.
- a data search is accordingly performed for the object in database 202, illustrated as a block 208, and this search can thus be limited to the tables in the determined path only, since no data of that object can be expected to occur in any other tables outside the path.
- the determined table path is thus valid for the object as encoded in its OID.
- the search results i.e. data found, can be returned to the querying party, illustrated as a block 210.
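The passages above refer to equations (1) and (2) without reproducing them. One form that is consistent with the surrounding definitions (2^k bounding the TID, n being the per-table sequence number, and Mod recovering the TID from the OID) is sketched below. The exact layout of equation (1) and the sample values are assumptions made for illustration, not text taken from the publication.

```latex
\begin{align}
\mathrm{OID} &= n \cdot 2^{k} + \mathrm{TID}, \qquad 0 < \mathrm{TID} < 2^{k} \tag{1}\\
\mathrm{TID} &= \mathrm{OID} \bmod 2^{k} \tag{2}
\end{align}
% Worked example with assumed values k = 8, n = 5, TID = 30:
%   OID = 5 * 256 + 30 = 1310, and 1310 mod 256 = 30.
% Factorizing 30 = 2 * 3 * 5 and repeatedly dividing by the greatest
% prime factor yields the table path 30 -> 6 -> 2 (and then the root table).
```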
- Fig. 3 illustrates a TID scheme for a database organized with a root table at a top level 300 and multiple sub-tables in successively lower layers 302, 304, 306,... in the hierarchy.
- TID:s of the tables are given within the shown spots and the prime numbers used for generating underlying TID:s are given at respective connection arrows.
- four sub-tables are connected to the parent table 2, having TID:s generated by multiplying 2 with a series of prime numbers, i.e. 3, 5, 7 and 11, resulting in TID:s 6, 10, 14 and 22, respectively.
- the same has been made for four sub-tables connected to the parent table 3, having TID:s generated by multiplying 3 with a series of prime numbers, i.e. 5, 7, 11 and 13, resulting in TID:s 15, 21, 33 and 39, respectively, and so forth.
- each TID is generated by multiplying the TID of the parent table with a next available prime number, i.e. a prime number that has not been used for any other sub-table connected to that parent table.
- each prime number used for generating an underlying TID is used just once for each parent table, hence the term "next available prime number", ensuring that the TID:s will be unique for all tables in the database.
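The Fig. 3 numbering can be reproduced with a short Python sketch. The function names are hypothetical. The rule that candidate primes start above the parent's greatest prime factor is an assumption inferred from the example values (children of 3 start at 5, not 2) and from the later step that recovers a parent by dividing by the greatest prime factor.

```python
from itertools import count


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small table identifiers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def greatest_prime_factor(n: int) -> int:
    """Largest prime dividing n, for n >= 2."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return n if n > 1 else largest


def next_available_prime(parent_tid: int, sibling_tids: list[int]) -> int:
    """Smallest prime not yet used under this parent.

    Assumption: only primes above the parent's greatest prime factor are
    considered, so the child's greatest prime factor identifies the last step.
    """
    floor = greatest_prime_factor(parent_tid) if parent_tid > 1 else 1
    used = {tid // parent_tid for tid in sibling_tids}
    for candidate in count(floor + 1):
        if is_prime(candidate) and candidate not in used:
            return candidate


def child_tids(parent_tid: int, how_many: int) -> list[int]:
    """Generate TIDs for `how_many` new sub-tables of one parent table."""
    tids: list[int] = []
    for _ in range(how_many):
        tids.append(parent_tid * next_available_prime(parent_tid, tids))
    return tids


print(child_tids(2, 4))  # TIDs of four sub-tables under parent table 2
print(child_tids(3, 4))  # TIDs of four sub-tables under parent table 3
```

Under these assumptions the two calls give the same TID:s as quoted for parent tables 2 and 3 in Fig. 3.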
- in a first step 400, data related to the object is received for storage in the database.
- the sub-table most relevant for the object and its assigned TID are basically received at the data storing function.
- Identifying the most relevant sub-table for the object is a logical operation that depends on the nature of the object and the definitions of the predefined sub-tables in the database. The operation of identifying the relevant sub-table can be made more or less manually by an administrator, or automatically according to a preprogrammed logic, e.g. based on the type of object and/or the data to be stored. The details of this operation are however outside the scope of this embodiment.
- a valid table path with interconnected parent tables in successively higher layers in the hierarchy is determined from the relevant sub-table's TID in a following step 404.
- a TID for a parent table is also denoted parent table identifier "TID".
- the parent tables can be determined such that each parent table identifier TID is computed, one by one, by dividing a current table identifier TID with its greatest prime factor.
- in a further step 406, an OID is assigned to the object and the OID is computed by means of equation (1) above, using the TID of the most relevant sub-table as input, which was identified in step 402 above.
- k is an integer selected to set 2^k as a maximum value of the TID in the entire database hierarchy.
- n is a sequence number for objects in the determined relevant sub-table.
- the sequence number n is set for the object as a unique object number in the table, e.g. a next available number not used for any other previous object in that table.
- the object data received in step 400 and the OID computed and assigned in step 406 are stored in the tables of the table path determined in step 404, where appropriate. It should be noted that it is not necessary to fill all the attributes in the tables of the path with data in this step, depending on what data is available for storage.
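The storing steps 400 to 408 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: equation (1) is taken to have the form OID = n * 2^k + TID, the root table is given TID 1, sequence numbers start at 1, and the in-memory dictionaries and the names `table_path`, `make_oid` and `store` are all hypothetical.

```python
def greatest_prime_factor(n: int) -> int:
    """Largest prime dividing n, for n >= 2."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return n if n > 1 else largest


def table_path(tid: int) -> list[int]:
    """Step 404: TIDs from the relevant sub-table up to the root (assumed TID 1)."""
    path = [tid]
    while tid > 1:
        tid //= greatest_prime_factor(tid)
        path.append(tid)
    return path


def make_oid(tid: int, n: int, k: int) -> int:
    """Step 406: assumed form of equation (1), OID = n * 2^k + TID."""
    if not 0 < tid < 2 ** k:
        raise ValueError("TID must be positive and below 2^k")
    return n * 2 ** k + tid


def store(tables: dict, counters: dict, tid: int, k: int, data: dict) -> int:
    """Store one object; `data` maps a TID to the attribute values for that table."""
    n = counters.get(tid, 1)          # next unused sequence number in this table
    counters[tid] = n + 1
    oid = make_oid(tid, n, k)
    for t in table_path(tid):         # write the OID into every table on the path
        tables.setdefault(t, {})[oid] = data.get(t, {})  # attributes may stay empty
    return oid


tables, counters = {}, {}
oid = store(tables, counters, 6, 8, {1: {"title": "x"}, 6: {"duration": 90}})
print(oid, sorted(tables))
```

Keeping the sequence counter per TID is one simple way to guarantee that the pair (n, TID), and therefore the OID, is unique.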
- in a first step 500, a data query is received from a querying party, the query referring to an OID of an object.
- in a step 504, the greatest prime factor of the computed TID is determined and the next TID of the parent table connected to the first sub-table is computed therefrom by dividing the above computed TID with its greatest prime factor. It is then determined if the root table has been reached yet, in a step 506. If not, the process returns to step 504 where the next TID is computed by in turn dividing the above computed TID with its greatest prime factor. Steps 504 and 506 are thus repeated to find the parent tables, one by one, until the root table has been reached.
- When it is found that the root table has finally been reached in step 506, determination of the valid table path has been completed and object data can be searched and fetched from the tables in the path according to the computed TID:s, in a next step 508. It is not necessary to search any other tables than those in the determined table path, as explained above. Then, the fetched data of the object is returned to the querying party, in a final shown step 510.
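The search flow of steps 500 to 510 can be sketched in the same way. The code assumes that equation (2) is TID = OID mod 2^k and that the root table has TID 1. The dictionary layout and the names `tables_to_search` and `fetch` are hypothetical.

```python
def greatest_prime_factor(n: int) -> int:
    """Largest prime dividing n, for n >= 2."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return n if n > 1 else largest


def tables_to_search(oid: int, k: int) -> list[int]:
    """Recover the TID from the OID, then walk up to the root table."""
    tid = oid % (2 ** k)                     # assumed form of equation (2)
    path = [tid]
    while tid > 1:                           # steps 504/506: repeat until the root
        tid //= greatest_prime_factor(tid)   # parent = TID / greatest prime factor
        path.append(tid)
    return path


def fetch(tables: dict, oid: int, k: int) -> dict:
    """Step 508: read only the tables on the path, root first."""
    result: dict = {}
    for tid in reversed(tables_to_search(oid, k)):
        result.update(tables.get(tid, {}).get(oid, {}))
    return result


# Sample data: table 3 is not on the path of OID 1310 and is never read.
tables = {
    1: {1310: {"title": "Final"}},
    2: {1310: {"duration": 90}},
    6: {1310: {}},
    30: {1310: {"league": "A"}},
    3: {1310: {"width": 1}},
}
print(tables_to_search(1310, 8))
print(fetch(tables, 1310, 8))
```

Only the tables whose TID:s appear on the decoded path are touched, which is the point of encoding the path into the OID.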
- a new table may be added to the table hierarchy, e.g. when objects having a new characteristic in some respect, not fitting with the existing table definitions, are to be stored in the database, or when a particular existing table is populated with very large amounts of stored data and needs to be subdivided for facilitating the searches therein.
- information on the new table is generally received at the database configuring function, which has been defined e.g. by setting a name for the table and defining attributes, the details of which are somewhat outside the scope of this solution. It has also been determined where in the table hierarchy the new table shall be situated, i.e. its parent table has been determined.
- a TID is generated for the new table, by multiplying the next available prime number with the TID of the parent table, in the manner described above. It is then checked, in a following step 604, whether the generated TID does not amount to the parameter 2^k which has been set as a maximum value of the TID in the entire database hierarchy.
- each OID is computed according to equation (1).
- if the generated TID does not equal or exceed the parameter 2^k in step 604, the current value of k can be maintained and no updating of OID:s is necessary, thus excluding step 606.
- the new table can be added to the database, in a last shown step 608.
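The table-adding flow of steps 600 to 608 can be sketched as below. The names `rekey` and `add_sub_table` are hypothetical, the OID layout OID = n * 2^k + TID is the assumed form of equation (1), and the caller is assumed to supply the next available prime. The text speaks of incrementing k by 1; repeating the increment until the new TID fits is an assumption made here so that large TIDs are also handled.

```python
def rekey(oid: int, old_k: int, new_k: int) -> int:
    """Re-encode one OID for a new k, keeping its sequence number and TID."""
    n, tid = divmod(oid, 2 ** old_k)
    return n * 2 ** new_k + tid


def add_sub_table(parent_tid: int, prime: int, k: int, oids: list[int]):
    """Return (new_tid, new_k, updated_oids) for a new sub-table.

    `prime` is the next available prime number of the parent table.
    """
    new_tid = parent_tid * prime          # step 602: generate the TID
    new_k = k
    while new_tid >= 2 ** new_k:          # step 604: TID must stay below 2^k
        new_k += 1                        # step 606: enlarge k when needed
    if new_k != k:
        oids = [rekey(oid, k, new_k) for oid in oids]
    return new_tid, new_k, list(oids)     # step 608: table can now be added


print(add_sub_table(2, 13, 8, [262]))         # fits: k and OIDs unchanged
print(add_sub_table(22, 13, 8, [262, 1310]))  # too large: k grows, OIDs re-encoded
```

Because every stored OID depends on k, enlarging k forces an update of all OID:s, which is why step 606 is skipped whenever the new TID still fits.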
- the database arrangement 700 is configured to handle object-related data in a hierarchical database 702 comprising a root table and a plurality of predefined sub-tables organized in multiple hierarchical levels below the root table.
- Each table comprises one or more attributes under which object data can be stored.
- the database arrangement 700 may be used to accomplish any of the above-described procedures and embodiments.
- the various functions therein are called “modules” in this description, although they could also be seen as units, blocks, elements or components.
- the shown database arrangement comprises a storing function 704, a search function 706 and a configuring function 708.
- the configuring function 708 is adapted to assign a table identifier TID to each sub-table, generated by multiplying a series of prime numbers associated with parent tables of that sub-table.
- the storing function 704 is adapted to assign an object identifier OID to each object in the database such that the table identifiers TID:s of a path of hierarchically connected sub-tables in different consecutive levels are encoded into the object identifier OID, and to store data on the object in the connected sub-tables in the table path.
- the storing function 704 includes a determining module 704a adapted to determine a table path with an identified sub-table and parent tables hierarchically connected to that sub-table, an assigning module 704b adapted to assign an object identifier OID to the object computed from the table identifier TID of the determined relevant sub-table, and a storing module 704c adapted to store the data and the object identifier OID in the tables of the path.
- the shown database arrangement also comprises a search function 706 adapted to determine which tables to search when a data query is received referring to an object identifier OID of a requested object, based on the received object identifier OID.
- the search function 706 includes a computing module 706a adapted to compute from the received object identifier OID a table identifier TID of a sub-table that is the most relevant for the requested object.
- the search function 706 further includes a determining module 706b adapted to determine a table path encoded into the received object identifier OID, by computing parent table identifiers TID of successive levels until the root table is reached, and a retrieving module 706c adapted to fetch object data from the tables in the determined table path, and to return the fetched data in response to the data query.
- the determining module 706b may be further adapted to compute each parent table identifier TID by dividing a current table identifier TID with its greatest prime factor.
- the configuring function 708 includes a computing module 708a adapted to compute a table identifier TID for a new sub-table to be added to the database, by multiplying the table identifier TID of its parent table by a next available prime number of the parent table.
- the configuring function 708 also includes an updating module 708b.
- the updating module 708b is adapted to increment k by 1 and update the object identifiers OID:s of objects stored in the database according to the new k.
- the configuring function 708 also includes an adding module 708c adapted to add the new sub-table to the database using the computed table identifier TID.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SE2009/051511 WO2011081580A1 (en) | 2009-12-29 | 2009-12-29 | Method and arrangement for data storage |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2519899A1 (en) | 2012-11-07 |
EP2519899A4 (en) | 2016-11-09 |
Family
ID=44226699
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP09852855.7A Withdrawn EP2519899A4 (en) | 2009-12-29 | 2009-12-29 | Method and arrangement for data storage |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP2519899A4 (en) |
JP (1) | JP2013516017A (en) |
WO (1) | WO2011081580A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102955843B (en) * | 2012-09-20 | 2015-07-22 | 北大方正集团有限公司 | Method for realizing multi-key finding of key value database |
CN112395293B (en) * | 2020-11-27 | 2024-03-01 | 浙江诺诺网络科技有限公司 | Database and table dividing method, database and table dividing device, database and table dividing equipment and storage medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040122825A1 (en) * | 2002-12-20 | 2004-06-24 | International Business Machines Corporation | Method, system, and program product for managing hierarchical structure data items in a database |
GB2422924A (en) * | 2005-02-04 | 2006-08-09 | Sony Comp Entertainment Europe | Determining derived relationships in a hierarchical structure |
US7487143B2 (en) * | 2005-11-17 | 2009-02-03 | International Business Machines Corporation | Method for nested categorization using factorization |
GB0623059D0 (en) * | 2006-11-18 | 2006-12-27 | Etgar Ran | Database system and method |
2009
- 2009-12-29 WO PCT/SE2009/051511 patent/WO2011081580A1/en active Application Filing
- 2009-12-29 JP JP2012547050A patent/JP2013516017A/en not_active Withdrawn
- 2009-12-29 EP EP09852855.7A patent/EP2519899A4/en not_active Withdrawn
Non-Patent Citations (1)
Title |
---|
See references of WO2011081580A1 * |
Also Published As
Publication number | Publication date |
---|---|
EP2519899A4 (en) | 2016-11-09 |
JP2013516017A (en) | 2013-05-09 |
WO2011081580A1 (en) | 2011-07-07 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20120613 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR |
RIN1 | Information on inventor provided before grant (corrected) | Inventor name: BJOERK, JONAS; Inventor name: LIDSTROEM, MATTIAS; Inventor name: MORITZ, SIMON |
DAX | Request for extension of the European patent (deleted) | |
RA4 | Supplementary search report drawn up and despatched (corrected) | Effective date: 20161007 |
RIC1 | Information provided on IPC code assigned before grant | Ipc: G06F 17/30 20060101AFI20161003BHEP |
GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: GRANT OF PATENT IS INTENDED |
INTG | Intention to grant announced | Effective date: 20171130 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
18D | Application deemed to be withdrawn | Effective date: 20180411 |