EP1747517A1 - Method and apparatus for automatically creating a data warehouse and olap cube - Google Patents
Method and apparatus for automatically creating a data warehouse and olap cubeInfo
- Publication number
- EP1747517A1 EP1747517A1 EP05738895A EP05738895A EP1747517A1 EP 1747517 A1 EP1747517 A1 EP 1747517A1 EP 05738895 A EP05738895 A EP 05738895A EP 05738895 A EP05738895 A EP 05738895A EP 1747517 A1 EP1747517 A1 EP 1747517A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- data
- data warehouse
- ledgers
- stored procedures
- analysis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
- G06F16/972—Access to data in other repository systems, e.g. legacy data or dynamic Web page generation
Definitions
- This invention relates generally to systems and methods for analyzing and extracting data from Enterprise Resource Planning (ERP) or other, business software systems, and replicating that data over computer networks into other systems, specifically data warehouses and/or data marts and on-line analytical processing (OLAP) systems, for the purpose of effectively analyzing and creating reports using that data.
- ERP Enterprise Resource Planning
- OLAP on-line analytical processing
- one implementation of the present invention is a system and method for automatically analyzing the form and structure of the data within an ERP system and determining the appropriate metadata and data to be replicated and aggregated from that system to a separate data server, in order to construct a data warehouse and an associated OLAP cube without significant human intervention.
- the structure of the resulting data warehouse and OLAP cube permits cross ledger analysis and reporting against both summary and detail transactional data from the source ERP system simultaneously.
- ERP Enterprise Resource Planning
- GL General Ledger
- subledgers for example, accounts receivable, accounts payable, inventory, fixed assets, and so on.
- ERP systems often (although not always) use a relational database system to store their data.
- an ERP system is mainly a transactional system, i.e.
- Transactional systems are structured and organized so that these types of transactions can be performed as rapidly and reliably as possible with the large volumes of data that they require.
- the essential structure and organization of transaction' systems make them poorly suited for analyzing the data they hold and reporting against it in certain ways, such as are typically required to effectively and efficiently manage a business.
- ERP systems are "mission-critical", in that they are vital to running the organization's business and so their performance cannot be impacted by also using them for other tasks (such as analysis and reporting).
- many organizations separate their analysis / reporting systems from their transaction systems by replicating key transaction data from their ERP systems (and possibly also other databases) and storing them in a data warehouse or data mart.
- a data warehouse typically contains a summary snapshot of an organization's transactional data at a single point in time, and this data is organized in such a way as to make it well suited for management analysis and decision making.
- This snapshot of data in the data warehouse is created by manual or automatic replication of a subset of the data in one or more external transaction systems (such as an ERP system).
- the data is often summarized across certain dimensions (operational facets) of the data, such as time, product, customer, location, etc., to provide summary information in a form which more accurately reflects the structure of the organization than the data in an ERP system.
- Data marts are essentially structurally identical to data warehouses, except they are usually smaller and only contain data relevant to a narrow segment or specific functional area of the organization. Both data warehouses and data marts' typically use a relational database system for data storage. Data marts are most useful for "stovepipe" analysis (i.e. for a specific subject area not linked to other business functions), and are ideal in such situations because they are faster and easier to create than a data warehouse; however as a result they are not as well suited for cross-functional analysis and reporting.
- data warehouses are still not ideally suitable for easily creating queries to answer the kind of real business questions used in management decision making (often called business intelligence). This is because a significant requirement of the end users asking such business questions is the ability to create their own ad-hoc analytical queries and reports using common business terms; however data warehouses are built on relational databases and so utilize very technical concepts such as tables, fields and views. In addition, data warehouses usually require the use of a complex query language, e.g. SQL (Structured Query Language) to access the data. None of these concepts are at all intuitive to non-technical end users.
- SQL Structured Query Language
- a business may sell a wide variety of products in different regions over some period of time.
- sales data might be stored as individual data records, each representing a sale to a customer on a particular date (time period), and linked to records in other tables containing the customer's address (geographic location), the details of the products sold (product), and other information.
- time period a date
- product the details of the products sold
- other information the same data would be stored in a large central fact table, containing a series of records containing the aggregated value of all sales to each specific customer for each specific product in each specific region over each time period (for example individual months), together with summary totals for each grouping of customer, product, region, and time period.
- Each record in the fact table also contains a set of key values, each of which references a separate dimension table containing the descriptive business terms associated with the key values (e.g. the product names or geographic regions). These descriptive business terms are sometimes referred to as members or member names.
- the fields in the fact table which contain numerical values to be aggregated are referred to as measures. Measures are the central values of a cube that can be analyzed by the dimensions and hold the figures that the end user is primarily interested in. Some common measures are sales amount, cost amount, sales quantity.
- the structure of a data mart with a central fact table joined to separate dimension tables is referred to as a star or snowflake schema.
- each dimension is represented in the OLAP system's data structures (often referred to as a cube), together with the elements of that dimension.
- each dimension may contain a series of hierarchical levels; for example, in a time dimension that might be years, quarters, and months. All these dimensions, member names and hierarchies are collectively known as "metadata", and are typically stored within the OLAP system so that they can be referenced much more intuitively for business analysis and reporting purposes by non-technical users.
- OLAP systems are therefore commonly used to facilitate the efficient retrieval and analysis of the information in data warehouses and data marts. These OLAP systems permit the rapid and effective creation of ad-hoc management analyses and complex summary reports from a number of different perspectives, which would be impossible, inefficient and/or too cumbersome with transactional systems.
- OLAP systems make it impossible for a naive end user to accidentally (or maliciously) create a "runaway query", i.e. one which generates a complex, multi- table join that takes an enormous amount of time and system resources to complete (degrading the performance of the entire system).
- SQL Structured Query Language
- SQL Structured Query Language
- Some relational databases permit the creation of stored procedures (essentially predefined SQL programs which can be saved in the database). Just as with procedure calls in any other programming language, stored procedures can be passed arguments and return values, so they can be used to perform complex programmatic tasks.
- Data warehouses and OLAP cubes typically contain only summary and aggregate data, making certain kinds of detail level analysis and reporting impossible.
- many systems include the ability to drillthrough from the summary data values into the individual transaction records (or some subset of them) which make up those values in order to facilitate detailed level analysis; however, such functionality is usually very inefficient (requiring linking across different systems) and is very hard to provide in a generalized and useful way across all the data in the warehouse without making the data set unmanageably large.
- the present invention is based on the insight that the technical bias against generating data warehouses (e.g. with a star or snowflake schema structure) and large OLAP hyper-cube analytic structures ('OLAP cubes') from multiple source ledgers is not well founded.
- the norm has been to generate a single physical OLAP cube for each individual ledger/sub-ledger — hence one for the General Ledger, a further one for sales order processing etc.
- This prior art approach makes cross functional analysis of multiple source ledgers difficult since it requires the creation of a further level of physical or virtual cubes across multiple physical cubes and complex querying. Building such as network of interconnected small cubes is not only expensive to customize and maintain but is also not very intuitive or flexible for the business end-user.
- a 'physical' OLAP cube stands in contrast to a virtual cube (physical cubes hold their own data; virtual cubes do not and are merely views onto other cubes).
- the term 'data warehouse' includes within its scope a data mart.
- the step of constructing the data warehouse or the step of generating the associated OLAP cube is achieved without significant human intervention using software.
- This data warehouse that can be constructed using software is a generic warehouse that is capable of being customised when installed by an end-user.
- the OLAP cube that is generated using software is a generic OLAP cube that is capable of being customised without technical knowledge of the software or data structures. This enables a software package to be used without detailed (or indeed any) guidance from costly consultants: previously, even constructing multiple physical cubes could require substantial expert consultant involvement in custom designing the warehouse and cube. But with the present invention, an OLAP cube that enables cross-functional analysis of multiple source ledgers or modules can be automatically generated — i.e. without significant human intervention at all.
- the OLAP cube is based on a central fact table in the data warehouse that also holds some summary information.
- the central fact table is created using SQL joins from multiple input ledgers or modules to generate additional, detail rows in the central fact table.
- the SQL joins can be made between source ledgers or modules, such as: (a) • General Ledger and inventory/ stock (b) General Ledger and sales purchase ledger (c) General Ledger and invoice/credit (sales order processing).
- Another feature is that orders and other entries that do not have a financial impact to the general Ledger can be placed into the central fact table.
- Fields in data warehouse tables can be dynamically updateable, in that these fields can change in accordance with the underlying ERP accounting or business structure and data when the warehouse is updated with a new snapshot of ERP system data.
- the stored procedures can be defined in an XML file; the XML file includes SQL scripts and script templates. These stored procedures are executed in the data warehouse against the source ERP database to determine the various metadata structures required to create or update an OLAP cube to facilitate analysis and reporting of the data. More specifically, the stored procedures can execute queries to determine and identify one or more of the following metadata in the data warehouse: (a) The dimension names (b) The fields containing the member names for each dimension (c) The hierarchies of the members in each dimension
- An application of the invention can use a file containing instructions for creating multiple stored procedures which, when executed, initiate and control the process whereby the relevant and appropriate data is extracted from an ERP system into the data warehouse and the OLAP cube.
- the stored procedures initially exist in the file in a generic state, but their structure is changed dynamically at runtime, according to the responses given by a user during installation of the application.
- the application determines dynamically (by executing appropriate queries) and utilizes during the installation process, one or more of the following: (a) The ERP system product version; ' (b) The table/field naming convention used; (c) Any other naming differences between product versions; (d) The number of levels of totaling within the account hierarchy (and advanced ( dimensions); (e) The granularity of financial periods used, e.g. years/months vs. weeks & quarters. (f) Internal programming and option switches within the XML file; (g) External switches in an external license key file.
- the stored procedures can take as parameters a list of new fields to be added to the data warehouse and the stored procedures then loop through this list of new fields and add them in to the correct tables in a star schema in the data warehouse.
- An application adds new dimensions (or measures of the dimensions) to the OLAP cube based on these new fields. This application can be written to loop multiple times through a series of scripts to create multiple sets of stored procedures, altering each set of stored procedures slighdy on the basis of the structures and data in the ERP system.
- the application When a periodic update is automatically triggered, the application will execute certain stored procedures which will scan through an ERP ledger data and recreate the data warehouse, and refresh the data in the OLAP cube.
- a validation process can also be incorporated: in order to ensure that all the ledger information has been combined correctly, there are stored procedures which check the resulting account balances in the data warehouse or mart against the equivalent balances in the ERP system. The results of the error checking can be reported over the Internet to an implementation or support team via a Diagnostic Support Web Service.
- the present invention enables the following types of reporting or analysis to be performed: ⁇ (a) Product/Customer/Salesperson analysis against the Chart of Accounts (b) Aged Credit and Aged Debt comparison to obtain overall health of company (c) Complete Item/Customer profit margin calculated including adjustments, cost of materials and indirect costs — i.e. true margin as compared to estimated margin. (d) Dashboard style information on each customer including outstanding orders, debt, invoice history, suppliers and warehouses used to fulfill orders. (e) Full product analysis including orders, stock, previous sales, cost adjustments etc (f) Discount analysis by product/ customer/salesperson. (g) Status of orders, including quotes (h) Asset management (i) Cash flow planning and reporting and an almost unlimited list of static or ad hoc reports and analyses which help an end- user business control and manage business performance.
- the present invention is predicated on there being a single data warehouse and associated single, physical OLAP cube as opposed to the large proliferation of cubes required in the prior art.
- the present invention allows just a single data warehouse and associated single, physical OLAP cube to be generated.
- An example of one type of information would be financial information.
- Another type would be CRM (customer relationship management).
- Another type would be Supply Chain Analysis.
- An implementation of the present invention would therefore have a single data warehouse and associated physical OLAP cube for all financial source ledgers, sub- ledgers or modules (the prior art would require a large number).
- that implementation would have another, single data warehouse and associated physical OLAP cube for all CRM source ledgers, sub-ledgers or modules; and another single data warehouse and associated physical OLAP cube for all Supply Chain Analysis source ledgers, sub-ledgers or modules.
- Other types of information can be catered for with further single data warehouses and associated single physical OLAP cubes.
- FIG. 1 is a schematic depiction of the system architecture for an exemplary implementation of the invention
- FIG. 2 depicts a data flow diagram for an exemplary implementation of the invention, showing data being extracted from the ERP system into the data warehouse and OLAP cube, and thence being accessed by end users;
- FIG. 3 shows steps 1 and 2 of the process, i.e. creating staging tables for the ERP data, then using the data in those tables to build the data warehouse;
- FIG. 4 is a schematic depiction of how the information contained in an example sales invoice would be represented in the ERP ledgers, and eventually, the data warehouse;
- FIGS. 5 - 9 show a specific simplified example of the data in the individual tables and ledgers from the example sales invoice presented in FIG. 4, above; in particular, FIG. 5 shows the sales invoice itself; FIG. 6 shows the information stored in the GL, FIG. 7 the information stored in the invoice detail ledger, and FIG 8 the information stored in the inventory table; finally, FIG. 9 illustrates this information as stored in the data warehouse fact table;
- FIG. 10 schematically depicts the process of creating the fact table from the data warehouse by using intermediate tables
- FIG. 11 shows a greatly simplified example of the dimension tables in the data warehouse being used to create the dimension hierarchies in the OLAP cube
- FIGS. 12 - 15 depict various example screenshots from a particular implementation of the invention, showing the user interface screens for entering certain configuration parameters;
- FIG. 12 shows the screen to enter the ERP server and database name, user name, and password
- FIG. 13 shows the screen to enter the OLAP server name and database name
- FIG. 14 shows the selection of dimensions found in the ERP to include in the OLAP cube
- FIG. 15 illustrates configuring the day(s) and time to execute automatic periodic updates.
- an implementation of the present invention is a system and method for automatically analyzing the form and structure of the data within an ERP system and determining the appropriate metadata and data to be replicated and aggregated from that system to a separate data server, in order to construct a data warehouse and an associated OLAP cube without significant human intervention. It will be clear to those skilled in the art that the system and methods employed in the preferred embodiment of the invention are unique and are a significant improvement over the prior art.
- the system is configured on a standard PC data server connected to a computer network, the data warehouse is constructed in Microsoft ® SQL ServerTM, and the OLAP cube is constructed using Microsoft Analysis ServicesTM.
- the ERP system holding the metadata and data for the process is Microsoft Business Solutions— Navision ® on a separate data server.
- FIG. 1 describes an example of the overall system architecture of such an implementation.
- data is extracted from the various ledgers in the ERP system (i.e. GL, Accounts Payable, Accounts Receivable, etc.) via an ERP adapter (a programmatic interface for accessing the ERP data store) by a Data Warehouse & OLAP Builder program.
- ERP adapter a programmatic interface for accessing the ERP data store
- OLAP Cube This data is then used to construct a data warehouse, and subsequently an OLAP Cube.
- End user reporting and analysis tools (such as report writers, portals, spreadsheets, and so on) can then be used to extract, analyze and report against the data warehouse and/or the OLAP Cube.
- FIG. 2 shows a summary data flow diagram for this exemplary implementation, in which ERP system data is extracted from the ERP data store on an external server into the data warehouse on another server via computer networks, under control of the Builder program. Some subset of this data is then used to construct the metadata and data in the OLAP Cube. Finally, multiple end users using multiple client computers can access the information in the data warehouse and OLAP Cube via the various reporting and analysis tools described above. 1.
- the Extraction Process to a Staging Database for a proprietary based ERP system
- an initial process is performed to first extract the data into a SQL database to facilitate subsequent processing of that data. This is to ensure that the source data model remains consistent and so that SQL commands can be used for all subsequent access of the source data.
- a Microsoft SQL Server environment for storing its data this extraction step in the process is not required.
- the relevant ledgers in the ERP system which are copied are:
- FIG. 3 Illustrates this step in the process.
- Data in the ERP system ledgers is first extracted (if necessary) into a set of staging tables in the relational database, which essentially mirror the structure of the ERP ledgers. These staging tables are then used as a basis for creating the data warehouse (step 2, below).
- ERP enterprise resource planning
- financial or accounting ' software systems such as Microsoft Great Plains ® , Microsoft Axapta ® , Oracle Financials ® , Lawson Financial Suite ® , and many others.
- the system includes an application whose function is to construct the data warehouse and OLAP cube.
- This application known as the Builder, takes as input an Extended Markup Language (XML) file.
- XML Extended Markup Language
- This file contains instructions for creating over 200 stored procedures in the database which, when executed, initiate and control the process whereby at each stage the relevant and appropriate data is extracted from the ERP system into the staging tables, the data warehouse and the OLAP cube, as described herein.
- the XML input file can be automatically downloaded from a Web Service every time that it needs to be read ensuring that the most up to date version is always in place.
- the Builder is executed initially to install and configure the product, and also upon certain other significant transitional events (for example, after a version upgrade of the ERP system).
- the Builder When the Builder is executed it causes customized stored procedures to be created in the database. These stored procedures initially exist within the XML file in a generic state but the content of the stored procedures is changed dynamically at runtime, to customize them for each implementation. There are two types of such customizations which occur in the stored procedures, these are static and dynamic. Static changes (for example, the substitution of server names, company names, etc.) occur throughout all of the stored procedures during the installation process, and become hard-coded when the stored procedures are inserted into the database.
- Dynamic changes (changes that reflect the specific structure of the ERP, such as account totals or Advanced Dimensions in Microsoft Navision) mosdy occur during execution of the stored procedures, and are managed using database cursors which adapt to changes in the data, and so change the execution path of the stored procedures. These dynamic changes are written in such a way that the behavior of the stored procedures changes at execution time.
- many of the stored procedures include a string parameter which is replaced at execution time with a string built according to the ERP data structures.
- the Builder is also written to perform customization based on certain types of native structures from the ERP system into the data warehouse and OLAP cube.
- native structures provided within the ERP include Navision Advanced Dimensions
- Advanced Dimensions are custom structures which can be created within a Navision implementation.
- the Navision database is interrogated to provide a list of any Advanced Dimensions which are defined.
- a dimension table and the associated keys within the fact table are created in the data warehouse (and optionally also the OLAP cube).
- the end user has an opportunity to individually either include or exclude each Advanced Dimension from the cube. In the event that a new Advanced Dimension is created .subsequently, the new Advanced Dimension will be automatically included in the data warehouse as described above; however, it will not also be included in the cube unless explicitly specified by the user.
- the Builder is a programmable process in that the XML instructions are dynamically adapted to certain facets of the ERP installation.
- the Builder has a defined execution path that is dependent on the responses from the user during installation, and also on the results of executed queries performed during the process.
- Certain other parameters are predefined and are controlled by values stored in an external license key file (which is provided during initial installation), for example:
- the source (ERP) server name The source (ERP) database name
- the SQL server name The SQL database name
- the OLAP server name The OLAP database name Whether the ERP system utilizes a proprietary or relational database A switch to use single or multiple companies
- FIGS. 12 - 15 depict various screenshots from a particular implementation of the invention, showing the Builder user interface screens for entering some of the above parameters. For example, the ERP server name, database name, user name, and password (FIG. 12); the OLAP server name and database name (FIG. 13); which extra dimensions to include in the OLAP cube (FIG. 14); and the day and time of the automatic periodic update will occur (FIG. 15).
- the ERP server name, database name, user name, and password FIG. 12
- the OLAP server name and database name FIG. 13
- which extra dimensions to include in the OLAP cube FIG. 14
- the day and time of the automatic periodic update will occur
- the same XML file may be used (unchanged) for different installations. Additional modules or companies can be added by simply by executing a different subset of the stored procedures as specified by internal switches in the XML.
- the Builder is written to loop multiple times through a series of scripts to create multiple sets of stored procedures (for example, once for each company in the ERP system), altering each set of stored procedures slightly on the basis of the structures and data in the ERP system. These sets of stored procedures, when executed, will perform all the operations necessary to populate the appropriate tables and fields in the data warehouse (as described in steps 1 & 2, above).
- a separate section of the XML file contains a script to generate and reprocess an OLAP cube. This script is modified based on the appropriate metadata and data from the underlying data warehouse (see step 3, above), in order to create a cube which reflects the elements and structure of the source data.
- the Builder Once the Builder has finished executing the create procedure SQL statements it then goes on to read the OLAP cube definition XML and generates the OLAP cube — in this exemplary implementation by using Microsoft Analysis Services Decision Support Objects (DSO).
- DSO Microsoft Analysis Services Decision Support Objects
- the XML file also contains all the scripts needed to create the periodic update (see step 6, below) and the penultimate step in the update process is the execution of a program script which processes the OLAP cube.
- the Builder is executed to install the system initially. It may be executed again to upgrade the system or to modify it in response to upgrades of the external ERP system.
- This example implementation of the Builder demonstrates that this aspect of the invention is a unique and significant improvement over the prior art, in that the Builder dynamically generates stored procedures based on the structure of the ERP system and various other parameters. Also, the self-modifying nature of the stored procedures as previously described facilitates intelligent adaptation of the data warehouse and cube structure to the structure of the ERP system source data. Specifically this exemplary implementation of the invention does not require the use of Microsoft SQL Server Data Transformation Services.
- additional stored procedures are executed against the data warehouse to determine the various metadata structures required for an OLAP cube to facilitate analysis and reporting of the data.
- These stored procedures execute queries to determine and identify the following metadata in the data warehouse:
- FIG. 11 The hierarchies of the members in each dimension This process is represented (in highly simplified form) by FIG. 11.
- the Builder program when first executed, will create the metadata structures required to define an OLAP cube. When the cube is created (and subsequently refreshed), these structures are updated with the actual dimension, member, and hierarchy information retrieved by these queries from the data warehouse.
- a significant improvement in the present invention over the prior art is the inclusion of data from many ERP system ledgers in a single data warehouse and cube (instead of creating multiple "stovepipe" data marts).
- This process is described as a Hyperjoin.
- Such a Hyperjoin is made possible by including references to other ledgers in the data warehouse, as described above. Therefore, instead of analysis and reporting being performed on many individual OLAP cubes it can be performed against one cube (in a typical exemplary implementation the invention can replace 20 or more individual cubes with a single cube of 30-60+ dimensions).
- This exceptional functionality permits analysis and reporting across all areas of a business without the need to perform complex joins between multiple different data marts or OLAP cubes, thus avoiding the need for users to possess significant technical or accounting knowledge.
- reporting and analysis functionality Some examples of the types of reporting and analysis that can be produced from the invention are: • Product/ Customer/Salesperson analysis against the Chart of Accounts • Aged Credit and Aged Debt comparison to obtain overall health of company • Complete Item/Customer profit calculated including adjustments, cost of materials and indirect costs • Dashboard style information on each customer including outstanding orders, debt, invoice history, suppliers and warehouses used to fulfill orders. • Full product analysis including orders, stock, previous sales, cost adjustments etc • Discount analysis by product/customer/salesperson/etc • Status of orders, including quotes • Asset Management • Cash Flow planning and reporting
- stored procedures are either executed to extract the data stored in the temporary staging tables (if they were necessarily created in the previous step of the process), or directly against the ERP system tables (in the event that the ERP system already utilizes a Microsoft SQL Server database).
- This data is then used to build and populate a data warehouse in the form of a snowflake schema (i.e., a central fact table and a series of linked dimensional tables with additional detail tables linked to them).
- the data warehouse will contain both summary and detailed data ' from the source ERP system representing a comprehensive snapshot of the data in the ERP system at the time of the original data extract.
- this invention describes a number of significant improvements over the prior art which permit and facilitate certain valuable kinds of business analysis and reporting as described herein.
- the data is manipulated by performing SQL joins to create detail rows in the fact table, which also hold summary information.
- a single sales invoice might have four lines, and would be represented by a single row in the GL (and the Accounts Receivable ledger also).
- the GL summary line would be joined to the 4 invoice lines and create four rows in the resulting fact table, including in each row a combination of analysis information from the GL row and the invoice line. This is to facilitate subsequent analysis and reporting against both the summary and detailed data when it is transferred to the data warehouse, and represents a significant improvement over the prior art.
- FIGS. 4 - 9 show a simplified example illustrating this process.
- FIG. 4 is an overview of the data flow in this example from a sales invoice to the data warehouse. Items of data from the sales invoice (see FIG. 5), such as the purchase order number (PO #) and the invoice date are stored as a single row in the GL (see FIG. 6) Other data items from the sales invoice, such as the order quantity (Qty), the item code (SKU), and the unit price are stored as multiple rows in the invoice detail ledger (see FIG. 7) Costs and cost adjustments relating to the products sold are held in the inventory (see FIG. 8). The data from the GL and invoice detail and inventory is combined to create multiple rows in the data warehouse fact table (FIG. 9), incorporating both general information and detail lines from the invoice. Note that the cost adjustments appear against the original date of the invoice allowing for margin analysis of invoices by period (even though the adjustments may have occurred in subsequent periods).
- the building of the data warehouse dimension tables is performed by extracting the relevant metadata for each dimension in turn and copying it directly into the dimension tables in the data warehouse, with the exception of posting account data, financial period data, and certain custom data (for example, Navision Advanced Dimensions) which are dealt with by a more sophisticated process.
- any totaling structure information stored within the ERP data for accounts, financial periods and advanced dimensions is used to create appropriate hierarchies. For example, in the Accounts dimension within the ERP system, posting accounts may be grouped into total assets, total liabilities, total revenue, and so on; these same groupings are used to create hierarchies for these dimensions in both the data warehouse and the OLAP cube.
- the data warehouse fact table is created by extracting transactional data from the ERP system or the staging tables (as appropriate) into a set of intermediate tables which are then manipulated via a multi-stage process before arriving at the completed fact table.
- the invoice and credit memo records for both Accounts Receivable and Accounts Payable ledgers are inserted into a single table to create an intermediate table representing line details for sales/purchasing invoices/credit memos (referred to as the In ⁇ oiceCredit able).
- the Accounts Receivable/Payable ledgers contain lines which are not necessarily generated by an invoice or credit and yet which have useful analysis fields (e.g. Customer No). In addition, certain other subledger analysis fields are only held at an Accounts Receivable/Payable level and not at an invoice/ credit level. Because of this, an additional intermediate table (the ExfraSubLedge ⁇ is created to allow this information to be included.
- the GL, InvoiceCredit and ExtraSubLedger tables are then joined together to form the basis for the fact table.
- the GL to InvoiceCredit join there is a one to many relationship, so where a match exists the GL information is expanded by the addition of extra rows to include the detailed information from the other tables.
- the GL to ExtraSubLedger join there is a one to one join and the additional analysis fields stored at subledger level are combined with the GL lines (via a SQL Join) to allow further analysis in the resultant fact table.
- FIG. 10 illustrates the entire process of creating and joining these various intermediate tables to create the fact table.
- a journal from a subledger would usually be represented in the GL by two records (each balancing the other out).
- the journal information would be stored in the ExtraSubLedger table.
- the join between the ExtraSubledger and the GL in this exemplary implementation would not increase the number of records, so the fact table would also contain two records.
- this join allows the capture of subledger analysis fields such as Salesperson and Customer in the relevant records of the fact table, thus providing a significant benefit over the prior art in terms of the potential information that is made available for analyses and reports.
- Many other tables holding transactions are also joined to the GL, InvoiceCredit and ExtraSubLedger at this point to provide further analysis fields.
- the most notable of these are the inventory tables, fixed asset ledgers, and bank account ledgers. For example, the cost of sales lines and stock lines in the GL can be joined to the inventory to produce full product margin analysis and stock analysis.
- the fact table created in the above example of the invention does not simply consist of a list of keys linked to the dimension tables and a set of the relevant measures; it is significantly different to conventional implementations using the prior art in that it retains some detailed analysis/descriptive information.
- Unfulfilled orders i.e. sales/purchase orders and quotes that have been input into the ERP system but not yet invoiced (and so not yet posted to the financial ledgers);
- a periodic update when automatically triggered it will execute certain stored procedures which will scan through the ERP ledger data and recreate the staging tables and the data warehouse, and refresh the data in the OLAP cube in a manner similar to that described in steps 1 through 3, above (except that the data warehouse and/or cube may be refreshed rather than rebuilt from scratch).
- the process of updating the data warehouse may be either a total rebuild (i.e. deleting the contents of the data warehouse tables and recreating them) or an incremental update (i.e. leaving the data warehouse table contents intact and simply including information from the ERP system which is new or changed since the last periodic update, as required).
- the incremental update can either update information across the whole data warehouse, or can just be an update of the General Ledger transactions - this facilitates Month End journal reporting.
- the stored procedures may uncover new it ⁇ ms which appear in the ERP transactional data but not in the data and/or metadata in the data warehouse and OLAP cube. These new items usually fall into one of the following categories:
- a new data value has been added, e.g. a new customer: The new customer will automatically get added to the associated grouping in the Customer dimension without any user intervention. The new customer will appear in the dimension after the next periodic update.
- a new data value that might change the structure e.g. the Accounting Period table has rows added for weekly reporting (in addition to years & months)
- the dimension table is rebuilt with each periodic update, so it will be automatically rebuilt with the new structure.
- the new structure information will appear in the OLAP cube after the next periodic update (e.g. the user will see the weeks as well as years and months in their time dimension).
- the final step in the automatic update process is to call a Diagnostic Support Web Service and request a set of SQL queries. This final step is always executed, even if the preceding steps fail.
- the queries supplied by the Web Service are run against the data warehouse to check for errors, or to capture general diagnostics. Once the queries have been executed the results are posted back to the Web Service via the Internet and stored in a database owned by the implementation/ support team. These postings can trigger email notifications to alert the implementation/support team of any potential problems. Examples of the queries executed are: Checking for discrepancies in account balances, checking the update process ran through successfully, reporting the time which the update process took to complete, etc.
- the XML input file as described in Step 2 contains stored procedures that are created in the data warehouse to accommodate future customizations.
- an ERP system may have been customized to include an extra analysis field such as Product Range.
- a standard data warehouse implementation in the field of the invention would require highly trained technical staff with a working knowledge of the data warehouse and ERP structures and platforms in order to add this field for analysis by the end user.
- One aspect of the present invention is a set of stored procedures that take as parameters a list of new fields to be added to the data warehouse. The stored procedures then loop through this list of fields and add them in to the correct tables in the star schema in the data warehouse.
- a further tool known as the Architect, adds new dimensions to the OLAP cube based on these fields. Thus no technical knowledge or consultancy is required to add many kinds of new fields to the data warehouse implementation (i.e. those fields from the source that exist on tables already utilized within the data warehouse structure).
- the Architect tool also enables other changes to be made to the OLAP cube without the need for expertise with Microsoft SQL Server Analysis Services.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GBGB0409674.9A GB0409674D0 (en) | 2004-04-30 | 2004-04-30 | Method and apparatus for automatically creating a cross-functional data warehouse and OLAP cube |
PCT/GB2005/001645 WO2005106711A1 (en) | 2004-04-30 | 2005-04-29 | Method and apparatus for automatically creating a data warehouse and olap cube |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1747517A1 true EP1747517A1 (en) | 2007-01-31 |
Family
ID=32408318
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP05738895A Ceased EP1747517A1 (en) | 2004-04-30 | 2005-04-29 | Method and apparatus for automatically creating a data warehouse and olap cube |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1747517A1 (en) |
GB (2) | GB0409674D0 (en) |
WO (1) | WO2005106711A1 (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007095959A1 (en) * | 2006-02-24 | 2007-08-30 | Timextender A/S | Method for generating data warehouses and olap cubes |
EP2116961A4 (en) | 2006-12-26 | 2011-09-21 | Fujitsu Ltd | Detailed data aggregation device, detailed data aggregation program, and detailed data aggregation method |
US8082239B2 (en) | 2008-02-04 | 2011-12-20 | Microsoft Corporation | Defining sub-cube scope based upon a query |
US9830366B2 (en) | 2008-03-22 | 2017-11-28 | Thomson Reuters Global Resources | Online analytic processing cube with time stamping |
US9110970B2 (en) | 2008-07-25 | 2015-08-18 | International Business Machines Corporation | Destructuring and restructuring relational data |
US8943087B2 (en) | 2008-07-25 | 2015-01-27 | International Business Machines Corporation | Processing data from diverse databases |
US8972463B2 (en) | 2008-07-25 | 2015-03-03 | International Business Machines Corporation | Method and apparatus for functional integration of metadata |
US9286370B2 (en) | 2010-02-24 | 2016-03-15 | International Business Machines Corporation | Viewing a dimensional cube as a virtual data source |
EE05651B1 (en) | 2010-03-19 | 2013-04-15 | Abile Mobile O� | A method and system for real-time augmentation in Dashboard Networks |
US10120927B2 (en) | 2015-06-03 | 2018-11-06 | International Business Machines Corporation | Technology for generating a model in response to user selection of data |
US10345991B2 (en) | 2015-06-16 | 2019-07-09 | International Business Machines Corporation | Adjusting appearance of icons in an electronic device |
US11093495B2 (en) | 2019-06-25 | 2021-08-17 | International Business Machines Corporation | SQL processing engine for blockchain ledger |
CN112860659B (en) * | 2021-01-18 | 2023-09-01 | 北京奇艺世纪科技有限公司 | Data warehouse construction method, device, equipment and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020029207A1 (en) * | 2000-02-28 | 2002-03-07 | Hyperroll, Inc. | Data aggregation server for managing a multi-dimensional database and database management system having data aggregation server integrated therein |
US6684207B1 (en) * | 2000-08-01 | 2004-01-27 | Oracle International Corp. | System and method for online analytical processing |
US7756822B2 (en) * | 2003-12-01 | 2010-07-13 | Sap Ag | Operational reporting architecture |
-
2004
- 2004-04-30 GB GBGB0409674.9A patent/GB0409674D0/en not_active Ceased
-
2005
- 2005-04-29 WO PCT/GB2005/001645 patent/WO2005106711A1/en active Application Filing
- 2005-04-29 GB GB0508852A patent/GB2413665A/en not_active Withdrawn
- 2005-04-29 EP EP05738895A patent/EP1747517A1/en not_active Ceased
Non-Patent Citations (1)
Title |
---|
See references of WO2005106711A1 * |
Also Published As
Publication number | Publication date |
---|---|
GB2413665A (en) | 2005-11-02 |
GB0508852D0 (en) | 2005-06-08 |
GB0409674D0 (en) | 2004-06-02 |
WO2005106711A1 (en) | 2005-11-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9684703B2 (en) | Method and apparatus for automatically creating a data warehouse and OLAP cube | |
WO2005106711A1 (en) | Method and apparatus for automatically creating a data warehouse and olap cube | |
US20230222109A1 (en) | Method and apparatus for converting heterogeneous databases into standardized homogeneous databases | |
US7870016B2 (en) | Report management system | |
Ballard et al. | Dimensional Modeling: In a Business Intelligence Environment | |
Stackowiak et al. | Oracle data warehousing & business intelligence Solutions | |
US9740992B2 (en) | Data warehouse system | |
EP2551773B1 (en) | Data audit module for application software | |
EP1843259A2 (en) | Packaged warehouse solution system | |
Gupta | An introduction to data warehousing | |
CN108701154B (en) | Data source system agnostic fact category partitioning information repository and methods for inserting and retrieving data using the same | |
US20080288448A1 (en) | Method, apparatus, and system for providing business intelligence | |
WO2016118940A1 (en) | Systems and methods for automatically generating application software | |
US20040122699A1 (en) | Method and system for integrating workflow management with business intelligence | |
US7865461B1 (en) | System and method for cleansing enterprise data | |
EP1225528A2 (en) | Data warehouse system | |
Hancock et al. | Practical Business Intelligence with SQL Server 2005 | |
Gonzales | IBM Data Warehousing: With IBM Business Intelligence Tools | |
Ralston | PowerPivot for business intelligence using Excel and SharePoint | |
Prasad | Data warehouse development Tools | |
Lachev | Applied Microsoft Analysis Services 2005: And Microsoft Business Intelligence Platform | |
Khan | Data warehousing 101: Concepts and implementation | |
Ballard et al. | Data Warehousing with the Informix Dynamic Server | |
Marques | PRESENTING BUSINESS INSIGHTS ON ADVANCED PRICING AGREEMENTS USING A BUSINESS INTELLIGENCE FRAMEWORK | |
Vacca | Design and implementation of a Business Intelligence pipeline for Poste Italiane |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20061130 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU MC NL PL PT RO SE SI SK TR |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: PRECISIONPOINT SOFTWARE LIMITED |
|
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: GEARY, NIGEL Inventor name: JARVIS, BEVERLEY Inventor name: MEW, CHRIS Inventor name: GORE, HELEN |
|
DAX | Request for extension of the european patent (deleted) | ||
17Q | First examination report despatched |
Effective date: 20071008 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R003 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
|
18R | Application refused |
Effective date: 20110309 |