AU2018102082A4 - A specific type of dimension extending standard functional capabilities beyond slicing and dicing. Applicable for cases when it's required to change behaviour of calculations without redeployment of OLAP solution or full reprocessing. - Google Patents

A specific type of dimension extending standard functional capabilities beyond slicing and dicing. Applicable for cases when it's required to change behaviour of calculations without redeployment of OLAP solution or full reprocessing. Download PDF

Info

Publication number
AU2018102082A4
AU2018102082A4 AU2018102082A AU2018102082A AU2018102082A4 AU 2018102082 A4 AU2018102082 A4 AU 2018102082A4 AU 2018102082 A AU2018102082 A AU 2018102082A AU 2018102082 A AU2018102082 A AU 2018102082A AU 2018102082 A4 AU2018102082 A4 AU 2018102082A4
Authority
AU
Australia
Prior art keywords
dimension
data
calculations
dicing
slicing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2018102082A
Inventor
Artem Fedorenko
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to AU2018102082A priority Critical patent/AU2018102082A4/en
Application granted granted Critical
Publication of AU2018102082A4 publication Critical patent/AU2018102082A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates generally to retrieving data by various reporting tools from computer database designed as data warehouse, more specifically, to a technique for retrieving data by operator involving, by specific design of a data warehouse, a mechanism for dynamic modification of calculations.

Description

TITLE OF INVENTION
SPECIFIC TYPE OF DIMENSION IN DATA WAREHOUSE ARCHITECTURE
TECHNICAL FIELD [001] The present invention relates generally to retrieving data by various reporting tools from computer database designed as data warehouse, more specifically, to a technique for retrieving data by operator involving, by specific design of a data warehouse, a mechanism for dynamic modification of calculations.
BACKGROUND ART [002] The invention comes from the real problem of optimisation of a large data warehouse populated with data produced by wind power generators. The first complication is to apply a mathematical formula to a large volume of data and get fast result. The second complication lies in the fact that before deployment the formula is yet not known and it's required to provide an analyst the ability to input new formulas after deployment. The ability to add new formulas is mainly required to perform what-if analysis, i.e. to compare formulas between each other in order to find the best fit to the business case (see CLAUSES [009], [010]). To optimise the cost of development and support, the solution must be developed on software platform supplied by widely known vendor. This dictates the requirement that the data model structure has to match average capabilities of the major platforms. Also it's beneficial to avoid any features in the solution requiring customisation in form of third-party components or software extension.
[003] To the best of the applicant's knowledge, at the moment of application, there were no websites or books, considering management of metadata or instructions in data warehouse as a standalone method, or recognising among other dimension types a separate type main characteristic of which is management of metadata or instructions.
2018102082 18 Dec 2018 [004] The Kimball Group, which is an indisputable authority in IT industry, on the company's website (see CLAUSE [024]) and in the book The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modelling lists the next types of dimensions: Degenerate dimension, Renormalised Flattened dimension, Calendar Date dimension, Junk dimension, Outrigger dimension, Multivalued dimension, Aggregated Facts dimension, Behaviour Tags dimension, Dynamic Value Banding dimension, Text Comments dimension, Time Zones dimension, Measure Type dimension, Step dimension, Hot Swappable dimension, Abstract Generic dimension, Audit dimension, Late Arriving dimension. All the listed dimensions differ by the method of storing data and there is no single one among them that operates instructions or formulas. All the listed dimension types back up only the slicing-and-dicing functionality. Including the Dynamic Value Banding dimension, that, despite having word dynamic in the title, is updated at the data warehouse population step (see CLAUSE [025]).
[005] Performed comprehensive web-search discovered several more dimension types defined by other practitioners: Parent-Child dimension, Shrunk dimension, Static dimension, and rarely used: Data Mining dimension, Virtual dimension, Demographic dimension, Primary dimension, Secondary dimension, Tertiary dimension, Informational dimension, Triage dimension, and Non-conforming dimension from the General Ledger. Again these dimension types also present various ways to store data, and not metadata, and enable only the slicing-and-dicing functionality.
SUMMARY OF THE INVENTION [006] The invention is a specific type of dimension that stores instructions. In the basic case these instructions can be mathematical formulas or static set of values (like Active/Budget, Debit/Credit, Thousands7Millions7Billions7Default, Decimal7Hexadecimal, Linear7Logarithmic, static coefficients, etc.). In more complicated cases instructions can be presented by regular expressions, string parsing functions, infrastructure commands, a query language, or custom metalanguage.
TECHNICAL PROBLEM
2018102082 18 Dec 2018 [007] The technical realisation has two complications: large amount of data in a data warehouse and a need to provide an analyst the ability to add and modify formulas after solution is deployed. Also in resulting solution the metadata storage must act as a regular dimension or any other object of general OLAP structure that can be used by any third-party visualisation tool as filter or slicer without customisation.
[008] In the real-life project related to the present invention wind power generators send 2-3 times per second to the data storage values of about 40 parameters (main gear vibration, main gear temperature, brakes level, speed, wind direction and strength, inside and outside temperature, etc.). This data stream forms huge volume of data and dictates use of only high-speed processing techniques, like for example OLAP.
[009] The purpose of the formulas in such a business case is to deliver a performance indicator allowing to foresee potential equipment failures. Indicator's continuous analysis by event processing engine provides prediction of expected safe range for it for several seconds ahead. As soon as the value leaves predicted range the system applies brakes, impacted generator stops and stands idle until manual investigation. The formulas in the most of cases are not complex and typically represents a sum over the parameters with applied weights.
[010] Ability to input new formulas into OLAP solution is required to enable what-if analysis. By using data from the cases when generators were stopped uselessly it's possible to find parameters' weights giving better predictor.
[Oil] Special requirements are placed on the speed of calculation: if analysts input hundreds of new formulas, then the indicator must be calculated for all of them as quick as possible in order to make the analysts' work efficient. This imposes restrictions on the structure storing the formulas. Also because of this requirement new formulas can't be applied to the data at the data warehouse population step and have to be applied at query time, so need be a changeable part of OLAP structure.
2018102082 18 Dec 2018
SOLUTION TO PROBLEM [012] The proposed solution is an innovative approach to the data warehouse design consisting of specifically designed storage for metadata and specific linking between the metadata storage and a fact table realised in calculations. The proposed solution allows to apply simple mathematical formulas to the data without string parsing or any other slow operation that is critically important for large volumes of data.
[013] Typically, an OLAP solution consists of three components: metadata (tables' structure, relationships between tables, security settings, etc.), calculations based on static formulas, and data. Among the three components only data is variable after deployment, however in some business cases it's required to keep calculations flexible as well. The solution is to store formulas or, in general case, a computer code, as data in special tables. This kind of tables has all characteristics of dimension and matches its definition: dimensions are composed of individual, non-overlapping elements and allow filtering, grouping, and labelling the fact data (see CLAUSE [026]).
[014] In the proposed solution linking between fact and the special tables is realised in calculations. Early data warehousing tools supported only one kind of linking between dimension and fact tables that represents relationship between dimensions' data members: one-to-one, one-to-many, and many-to-many. However, while the industry progressed, there were introduced new types of relationships more affecting behaviour of calculations rather than innovating data modelling. For example in Microsoft Power BI technology were introduced virtual, inactive, and bi-directional cross filtering relationships. These variants represent the same relationship between data members of attributes but each variant signals query engine to process or present data differently. Similar way in the proposed invention the dimension does not represent traditional relationship between dimension and fact tables but links them indirectly by using code in the static calculations. Modern trends introduce new ways to link tables in OLAP model based more on method of data processing or presentation, and the proposed invention is a case of table linking based on manipulation of multidimensional engine behaviour by instructions stored as data in special tables.
2018102082 18 Dec 2018 [015] Instructions stored in the presented type of dimension can be of different kind: mathematical formulas, regular expressions, string parsing functions, infrastructure commands, query language scripts, custom data filters or filter sets, notifying phrases or error messages with conditions when they should appear, translations, switches guiding which measure to use for calculation or which dimension to use for lookup, data binning ranges, or custom metalanguage code (commands or operators). The proposed invention stores coefficients of the formulas of a design specific to the particular business case.
[016] The proposed invention supports standard slicing and dicing functionality. The drawings demonstrate the proposed structure acts as filter and as slicer.
BRIEF DESCRIPTION OF THE DRAWINGS [017] All drawings presented relate to a single sample case. The metadata is presented by sample mathematical formula consisting of three coefficients and two mathematical operations. The example is simplified for clarity, in the real-life projects number of coefficients is defined by complexity of equipment.
DETAILED DESCRIPTION [018] FIG. 1 shows fact table with data produced by monitored equipment. Sample data consists of values of three parameters and reflects work of two generators during three days.
[019] FIG. 2 shows list of three formulas entered by operator for what-if analysis.
[020] FIG. 3 demonstrates that the dimension of instructions can be utilised by standard filter components of the most visualisation tools available on the market. On the screenshot Power BI filter component lists members of the dimension based on the special table designed to store formulas' coefficients. The dropdown list contains three formulas' IDs as three formulas are used in the example project.
2018102082 18 Dec 2018 [021] FIG. 4 illustrates how the performance indicator's values are calculated in the case shown on FIG. 3 when is chosen first formula.
[022] FIG. 5 illustrates how the performance indicator's values are calculated in the case in the filter on FIG. 3 is chosen second formula.
[023] FIG. 6 displays a screenshot of a basic Power BI report demonstrating the proposed structure acting as slicer. The matrix component correctly displays dynamically calculated values on intersections of rows representing dates and columns representing formulas.
CITATION LIST
NON-PATENT LITERATURE [024] Literature 1: Lists of dimension modelling techniques by the Kimball Group:
https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball- techniques/dimensional-modeling-techniques/ [025] Literature 2: Dynamic Value Banding by the Kimball Group:
https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball- techniques/dimensional-modeling-techniques/dynamic-value-banding/ [026] Literature 3: Wikipedia article Dimension containing dimension definition: https://en.wikipedia.org/wiki/Dimension_(data_warehouse)

Claims (1)

1. The present invention relates generally to retrieving data by various reporting tools from computer database designed as data warehouse, more specifically, to a technique for retrieving data by operator involving, by specific design of a data warehouse, a mechanism for dynamic modification of calculations.
AU2018102082A 2018-12-18 2018-12-18 A specific type of dimension extending standard functional capabilities beyond slicing and dicing. Applicable for cases when it's required to change behaviour of calculations without redeployment of OLAP solution or full reprocessing. Ceased AU2018102082A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2018102082A AU2018102082A4 (en) 2018-12-18 2018-12-18 A specific type of dimension extending standard functional capabilities beyond slicing and dicing. Applicable for cases when it's required to change behaviour of calculations without redeployment of OLAP solution or full reprocessing.

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2018102082A AU2018102082A4 (en) 2018-12-18 2018-12-18 A specific type of dimension extending standard functional capabilities beyond slicing and dicing. Applicable for cases when it's required to change behaviour of calculations without redeployment of OLAP solution or full reprocessing.

Publications (1)

Publication Number Publication Date
AU2018102082A4 true AU2018102082A4 (en) 2019-02-07

Family

ID=65235622

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2018102082A Ceased AU2018102082A4 (en) 2018-12-18 2018-12-18 A specific type of dimension extending standard functional capabilities beyond slicing and dicing. Applicable for cases when it's required to change behaviour of calculations without redeployment of OLAP solution or full reprocessing.

Country Status (1)

Country Link
AU (1) AU2018102082A4 (en)

Similar Documents

Publication Publication Date Title
US9904694B2 (en) NoSQL relational database (RDB) data movement
US9244964B2 (en) Determining a cause of an incident based on text analytics of documents
US20080082529A1 (en) Comparing Taxonomies
US8843483B2 (en) Method and system for interactive search result filter
US20100228752A1 (en) Multi-condition filtering of an interactive summary table
Drabas et al. Learning PySpark
Boehmke Data wrangling with R
US20140279677A1 (en) Ontology-driven construction of semantic business intelligence models
EP2725503B1 (en) Generic semantic layer for in-memory database reporting
US10474675B2 (en) Explain tool for optimizing SPARQL queries
CN105975440A (en) Matrix decomposition parallelization method based on graph calculation model
CN103348329A (en) Generating test data
US11238082B2 (en) Text analysis of unstructured data
CN103077192B (en) A kind of data processing method and system thereof
US11720597B2 (en) Generating an OLAP model from a spreadsheet
US20130086011A1 (en) Associative Memory Visual Evaluation Tool
US20160124932A1 (en) Data processing device and method
US20100293450A1 (en) System and method for simulating discrete financial forecast calculations
Luke Writing the visible page: a multimodal approach to graphic devices in literary fiction
US10437793B2 (en) BI cloud services data modeling denormalized table introspection algorithm
AU2018102082A4 (en) A specific type of dimension extending standard functional capabilities beyond slicing and dicing. Applicable for cases when it's required to change behaviour of calculations without redeployment of OLAP solution or full reprocessing.
GB2466427A (en) Storing of changes to baseline table in a database
Al Essa et al. Data Mining and Warehousing
GarcĂ­a et al. QlikView: Advanced Data Visualization: Discover deeper insights with Qlikview by building your own rich analytical applications from scratch
US11481539B1 (en) Systems and methods for improved building of interactive workbooks within business reporting, analysis, and management software

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry