CN114676201A - Data mart framework - Google Patents

Publication number
CN114676201A
Authority
CN
China
Prior art keywords
data
framework
mart
report
platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210192987.6A
Other languages
Chinese (zh)
Inventor
吴霜
张志遵
朱瑞星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shenzhi Information Technology Co ltd
Original Assignee
Shanghai Shenzhi Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shenzhi Information Technology Co ltd
Priority to CN202210192987.6A
Publication of CN114676201A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data mart framework in the field of data marts. The framework is connected to at least one data platform and comprises: a data integration module that monitors the data platform and, when source data in the data platform changes, integrates the changed source data; a data processing module, connected to the data integration module, that synchronizes the integrated source data into a data table; and a report display module, connected to the data processing module, that presets a report and, when the report is issued, displays its content from the updated data table, the data table and the report being in one-to-one correspondence. The invention uses a producer-consumer pattern to monitor data in real time: each data change triggers the producer once, and the consumer then carries out data integration and processing. When the report is issued, its content is displayed directly from the data table, and because the data table and the report correspond one-to-one, the report provides a reliable basis for decision-making.

Description

Data mart framework
Technical Field
The invention relates to the field of data marts, and in particular to a data mart framework.
Background
A data mart, also called a data market, is a repository that collects data from operational and other data sources to serve a particular group of professionals, and is designed to support Decision Support System (DSS) functionality. In a data warehouse, each data unit is associated with a particular time. A data warehouse contains atomic-level data and lightly aggregated data; it is a subject-oriented, integrated, non-volatile (stable), time-variant collection of data that supports decision-making in business management.
In terms of scope, the data in a data mart is extracted from an enterprise-wide database, a data warehouse, or a more specialized data warehouse. However, data extracted by existing data marts is messy and stored indiscriminately, is difficult to manage, and requires substantial labor to maintain. Moreover, the data can be processed only when a report needs to be issued, so report issuing is inefficient. A data mart framework that addresses these problems is therefore urgently needed for practical use.
Disclosure of Invention
To solve the above technical problems, the present invention provides a data mart framework.
The technical problems are solved by the following technical scheme:
A data mart framework having access to at least one data platform, the data mart framework comprising:
a data integration module for monitoring the data platform and, when source data in the data platform changes, integrating the changed source data;
a data processing module, connected to the data integration module, for synchronizing the integrated source data into a data table;
and a report display module, connected to the data processing module, for presetting a report and, when the report is issued, displaying its content from the updated data table, wherein the data table and the report are in one-to-one correspondence.
In the data mart framework of the present invention, the at least one data platform includes a database for storing the source data;
the data integration module and the at least one data platform monitor the database in real time and integrate its data through a producer-consumer pattern.
In the data mart framework of the present invention, the at least one data platform comprises:
a producer processing unit for triggering the producer once each time the source data changes and generating a producer record;
and the data integration module comprises: a consumer processing unit for consuming according to the producer record.
In the data mart framework of the present invention, the data integration module records the producer records in order in a log.
In the data mart framework of the present invention, the data processing module comprises:
a timing processing unit for performing, according to a preset timed task, a full synchronous update of the source data that has changed in the data platform, based on the producer records.
In the data mart framework of the present invention, the data processing module synchronizes the changed source data to the corresponding fields in the data table by means of field relation mapping.
In the data mart framework of the present invention, the data processing module is the Kettle tool, which is stored in the data mart framework as a plug-in.
In the data mart framework of the present invention, the framework provides a report interface through which different reports are accessed.
In the data mart framework of the present invention, the framework provides a data resource interface through which the at least one data platform is accessed.
In the data mart framework of the present invention, the data platform is a medical data platform.
The invention has the beneficial effects that:
The invention builds the data mart framework with a Python program and uses a producer-consumer pattern to monitor data in the accessed data platform in real time. Each data change triggers the producer once, and the consumer then carries out data integration and processing. During integration and processing, the source data is pulled into the data table. When the report is issued, its content is displayed directly from the data table, and because the data table and the report correspond one-to-one, the report provides a reliable basis for decision-making.
Drawings
FIG. 1 is a block diagram of a data mart framework in accordance with a preferred embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a transformation process of a table corresponding to data according to a preferred embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
The invention is further described with reference to the following drawings and specific examples, which are not intended to be limiting.
The following explains a standard term used in the embodiments of the present invention:
The Kettle tool: an ETL tool that can be deployed directly on a Linux system. It is deployed through the kitchen.sh script: the kitchen.sh script is written directly into a Dockerfile or a yml file and runs in the background by means of a run command or a compose up command. A Dockerfile is a text file used to build an image; its content consists of the instructions and descriptions required to build the image.
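As an illustration of running a job through kitchen.sh in the background, the following sketch assembles a command line and launches it from Python without blocking. The install path, job file, and parameter name are hypothetical, and the `-file`, `-level`, and `-param:` options should be checked against the installed version of the tool; the patent itself does not specify how the script is invoked.

```python
import subprocess


def build_kitchen_command(kitchen_sh, job_file, params=None, level="Basic"):
    """Build the argument list for one job run via kitchen.sh."""
    cmd = [kitchen_sh, f"-file={job_file}", f"-level={level}"]
    for name, value in (params or {}).items():
        # Named parameters are passed to the job as -param:NAME=VALUE.
        cmd.append(f"-param:{name}={value}")
    return cmd


def run_job_in_background(cmd, log_path):
    """Start the job without waiting for it and append its output to a log file."""
    with open(log_path, "ab") as log_file:
        # The child process keeps its own copy of the file handle.
        return subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)


# Hypothetical paths and parameter, for illustration only.
command = build_kitchen_command(
    "/opt/data-integration/kitchen.sh",
    "/jobs/sync_table_b.kjb",
    params={"SOURCE_DB": "platform_1"},
)
print(" ".join(command))
```

Calling `run_job_in_background(command, "/tmp/kitchen.log")` would start the job; it is left out of the example because it requires an actual installation.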
An embodiment of the invention provides a data mart framework in the field of data marts. As shown in FIG. 1 and FIG. 2, the data mart framework has access to at least one data platform 1 and comprises:
a data integration module 3 for monitoring the data platform 1 and, when source data in the data platform 1 changes, integrating the changed source data;
a data processing module 4, connected to the data integration module 3, for synchronizing the integrated source data into a data table B;
and a report display module 5, connected to the data processing module 4, for presetting a report C and, when the report C is issued, displaying its content from the updated data table B, the data table B and the report C being in one-to-one correspondence.
Specifically, the embodiment builds the data mart framework with a Python program and uses a producer-consumer pattern to monitor data in the accessed data platform 1 in real time. Each data change triggers the producer once, and the Kettle tool then consumes the change to carry out data integration and processing. During integration and processing, the data is captured directly and the source data is pulled straight into the data table B. When the report C is issued, its content is displayed directly from the data table B, so the report needs no additional SQL or program processing; because the data table B and the report C correspond one-to-one, the report provides a reliable basis for decision-making.
Further, the framework has an import function for data report templates. After a template is imported, the framework directly generates a new data table B whose fields correspond one-to-one to the fields in the report C.
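A minimal sketch of the template import, assuming a report template is simply a list of field names and types (the patent does not define the template format). It creates a data table with exactly one column per report field in an in-memory SQLite database, then reads the report straight from that table. The table and field names are hypothetical.

```python
import sqlite3


def create_table_from_template(conn, table_name, report_fields):
    """Create the data table with exactly one column per report field."""
    # Identifiers come from the trusted template, not from user input.
    columns = ", ".join(f'"{name}" {sql_type}' for name, sql_type in report_fields)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({columns})')


def render_report(conn, table_name, report_fields):
    """Read the report directly from the table, with no joins or post-processing."""
    names = [name for name, _ in report_fields]
    column_list = ", ".join(f'"{name}"' for name in names)
    rows = conn.execute(f'SELECT {column_list} FROM "{table_name}"').fetchall()
    return [dict(zip(names, row)) for row in rows]


# Hypothetical report template: (field name, SQL type) pairs.
template = [("patient_id", "TEXT"), ("visit_date", "TEXT"), ("total_fee", "REAL")]

conn = sqlite3.connect(":memory:")
create_table_from_template(conn, "report_c_table", template)
conn.execute(
    "INSERT INTO report_c_table VALUES (?, ?, ?)", ("p001", "2022-02-28", 120.5)
)
report = render_report(conn, "report_c_table", template)
```

Because the table mirrors the report field for field, issuing the report is a plain read of the table.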
The data mart framework of this embodiment optimizes how data for the report C is captured. In the prior art, data is captured only when the report C needs to be issued. In the invention, because the data table B and the report C correspond one-to-one, data processing is completed before the report C is issued, which greatly reduces report processing time and improves report issuing efficiency.
In a preferred embodiment, the at least one data platform 1 includes a database 11 for storing the source data;
the data integration module 3 and the at least one data platform 1 monitor the database 11 in real time and integrate its data through a producer-consumer pattern.
In a preferred embodiment, the at least one data platform 1 comprises:
a producer processing unit 12 for triggering the producer once each time the source data changes and generating a producer record;
and the data integration module 3 comprises: a consumer processing unit 31 for consuming according to the producer record.
Specifically, in this embodiment, each time the data platform 1 generates new data, it triggers the producer once, generates a producer record, and sends the record to the Python program, which uses the Kettle tool to consume the record automatically and synchronize the data.
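The producer-consumer flow described above can be sketched with a thread-safe queue. This is an illustration under assumptions, not the patent's implementation: the `ProducerRecord` fields and the function names are hypothetical, and the consumer simply collects the records where a real one would trigger the synchronization.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class ProducerRecord:
    table: str
    key: str
    operation: str  # e.g. "insert" or "update"


def on_source_change(records, table, key, operation):
    """Producer side: called once per change on the data platform."""
    records.put(ProducerRecord(table, key, operation))


def consume(records, handle):
    """Consumer side: process records in arrival order until told to stop."""
    while True:
        record = records.get()
        if record is None:  # sentinel: no more records
            break
        handle(record)


records = queue.Queue()
synced = []  # stands in for the real synchronization step
worker = threading.Thread(target=consume, args=(records, synced.append))
worker.start()

on_source_change(records, "table_a", "row-1", "insert")
on_source_change(records, "table_a", "row-1", "update")
records.put(None)
worker.join()
```

The queue decouples the two sides: the data platform only announces that something changed, and the consumer decides when and how to process it.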
In a preferred embodiment, the data integration module 3 records the producer records in order in a log.
Specifically, in this embodiment, the producer records are ordered and are recorded in a log, with no plug-in required. The log lets users trace the data of the data platform 1 accurately and effectively later on, and provides a data basis for tracing the cause of any subsequent data anomalies in the data platform 1.
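One way to keep producer records ordered and traceable is an append-only log in which every entry carries a sequence number. The sketch below assumes a JSON-lines file; the class name, the file layout, and the record fields are hypothetical, since the patent does not describe the log format.

```python
import json
import os
import tempfile


class ProducerLog:
    """Append-only log; each line is one producer record with a sequence number."""

    def __init__(self, path):
        self.path = path
        # Resume numbering after any entries already in the file.
        self.next_seq = sum(1 for _ in self.read()) + 1

    def append(self, record):
        entry = {"seq": self.next_seq, **record}
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
        self.next_seq += 1
        return entry["seq"]

    def read(self, after_seq=0):
        """Yield entries in write order, skipping those at or below after_seq."""
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                if entry["seq"] > after_seq:
                    yield entry


log_path = os.path.join(tempfile.mkdtemp(), "producer.log")
log = ProducerLog(log_path)
log.append({"table": "table_a", "key": "row-1", "op": "insert"})
log.append({"table": "table_a", "key": "row-2", "op": "insert"})
trace = [entry["seq"] for entry in log.read()]
```

Reading with `after_seq` set to the last processed sequence number returns only the records that still need attention, which is also what makes later tracing straightforward.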
In a preferred embodiment, the data processing module 4 comprises:
a timing processing unit 41 for performing, according to a preset timed task, a full synchronous update of the source data that has changed in the data platform 1, based on the producer records.
Specifically, in this embodiment, to avoid the risk of data loss, the invention obtains producer records through the producer-consumer pattern and uses them to trigger data updates. Because the producer records are stored in order, data synchronization can be carried out in a timed, centralized, and ordered manner according to a preset timed task. For example, the timed task may be set to process all accumulated producer records together at 8 p.m. every night as a timed full synchronous update. Such regular maintenance, every night or at some other interval, guards against missed consumption and against data obtained through other channels not being updated in time, and ensures that the data is complete and reliable when the report C is issued.
Furthermore, when the data volume is large, the data may be synchronized by incremental update instead.
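To make the difference between the two synchronization modes concrete, the sketch below models the source and target tables as dictionaries keyed by row. `full_sync` copies everything, `incremental_sync` applies only the producer records newer than the last processed sequence number, and `next_run` computes the next 8 p.m. run time. All names and the record layout are assumptions made for illustration.

```python
from datetime import datetime, time, timedelta


def next_run(now, at=time(20, 0)):
    """Return the next scheduled run time (8 p.m. by default)."""
    candidate = datetime.combine(now.date(), at)
    return candidate if candidate > now else candidate + timedelta(days=1)


def full_sync(source_rows, target):
    """Replace the target table contents with a fresh copy of the source."""
    target.clear()
    target.update(source_rows)


def incremental_sync(source_rows, target, records, last_seq):
    """Apply only records newer than last_seq and return the new position."""
    for record in records:
        if record["seq"] <= last_seq:
            continue
        key = record["key"]
        if key in source_rows:
            target[key] = source_rows[key]
        else:
            target.pop(key, None)  # row was deleted at the source
        last_seq = record["seq"]
    return last_seq


source = {"row-1": {"fee": 10}, "row-2": {"fee": 20}}
table_b = {}
full_sync(source, table_b)

source["row-2"] = {"fee": 25}  # a later change at the source
records = [{"seq": 1, "key": "row-1"}, {"seq": 2, "key": "row-2"}]
position = incremental_sync(source, table_b, records, last_seq=1)
```

The full update is simple and self-correcting, which suits a nightly maintenance run; the incremental update touches only changed rows, which matters once the tables are large.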
In a preferred embodiment, the data processing module 4 synchronizes the changed source data to the corresponding fields in the data table B by means of field relation mapping.
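Field relation mapping can be pictured as a lookup from (source table, source field) to a field of the data table B. The sketch below is a simplified stand-in for what an ETL tool would do; the table names, field names, and sample values are all hypothetical.

```python
FIELD_MAP = {
    # (source table, source field) -> field in data table B (hypothetical names)
    ("patients", "id"): "patient_id",
    ("patients", "full_name"): "patient_name",
    ("visits", "fee"): "total_fee",
}


def map_row(source_table, source_row, field_map=FIELD_MAP):
    """Translate one changed source row into the columns of data table B."""
    return {
        target: source_row[field]
        for (table, field), target in field_map.items()
        if table == source_table and field in source_row
    }


def merge_into_table_b(table_b, key, *mapped_rows):
    """Merge mapped fields from several source tables into one row of table B."""
    row = table_b.setdefault(key, {})
    for mapped in mapped_rows:
        row.update(mapped)
    return row


table_b = {}
merge_into_table_b(
    table_b,
    "p001",
    map_row("patients", {"id": "p001", "full_name": "Example Patient", "ward": "3A"}),
    map_row("visits", {"fee": 120.5}),
)
```

Source fields with no entry in the mapping, such as `ward` here, are simply not carried over, so the data table B holds only what its report needs.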
In a preferred embodiment, the data processing module 4 is the Kettle tool, which is stored in the data mart framework as a plug-in.
Specifically, in this embodiment, the Kettle tool processes the data: it joins tables directly and synchronizes the data into the corresponding fields of the data table B by field relation mapping. After processing, data from database tables A in multiple databases, or from multiple database tables A in one database, can be merged into a single data table B in the framework. Because the Kettle tool is integrated into the data mart framework as a plug-in, it can later be replaced as needed by another existing tool, or by an updated version of an existing tool, to perform the data processing function, which makes the framework more flexible.
Furthermore, by combining the Kettle tool with the producer-consumer pattern, the embodiment manages data in a reasonable and efficient way at low maintenance cost.
In a preferred embodiment, the data mart framework provides a report interface 6 through which different reports C are accessed.
Specifically, in this embodiment, the data mart framework has one or more report interfaces 6 to match different reports C. The report C accessed through a report interface 6 is changed in the Python program, so that different reports can be displayed.
In a preferred embodiment, the data mart framework provides a data resource interface 2 through which the at least one data platform 1 is accessed.
Specifically, in this embodiment, the data mart framework has one or more data resource interfaces 2 to match different data platforms 1. The data platform 1 accessed through a data resource interface 2 is changed in the Python program, so that data from different data platforms 1 can be summarized and displayed.
Preferably, the data platform 1 accessed by the data mart framework may be a medical data platform, a financial data platform, or a resource platform in another field. Because platforms are accessed through interfaces, the data mart framework of this embodiment is highly adaptable and can be applied in a wide range of fields to summarize data from each field and then generate reports automatically on a regular schedule.
In the data mart framework of this embodiment, the data resource interface 2 and the report interface 6 are assigned manually in the Python program, and the Kettle tool is stored in the framework as a plug-in. Afterwards, only the database name, the data table B name, and the report name in the framework need to be modified, or a data resource interface 2 (producer) and a report interface 6 added, for the framework to provide the report query function automatically, so the data mart framework is general-purpose.
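The configuration described above can be sketched as plain data: a list of data resource interfaces and a list of report interfaces, each report tied to exactly one data table. The class names, attributes, and sample values are hypothetical and only illustrate how a new platform or report could be added by editing names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataResourceInterface:
    platform: str  # e.g. a medical or financial data platform
    database: str  # database that stores the source data


@dataclass
class ReportInterface:
    report: str      # report name
    data_table: str  # the one data table that backs this report


@dataclass
class MartConfig:
    resources: List[DataResourceInterface] = field(default_factory=list)
    reports: List[ReportInterface] = field(default_factory=list)

    def table_for(self, report_name):
        """Return the data table that corresponds to the named report."""
        for interface in self.reports:
            if interface.report == report_name:
                return interface.data_table
        raise KeyError(report_name)


config = MartConfig(
    resources=[DataResourceInterface("medical_platform", "his_db")],
    reports=[ReportInterface("monthly_visits", "monthly_visits_table")],
)

# Adding a platform and a report only requires new names.
config.resources.append(DataResourceInterface("finance_platform", "ledger_db"))
config.reports.append(ReportInterface("monthly_revenue", "monthly_revenue_table"))
table = config.table_for("monthly_revenue")
```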
The advantages or beneficial effects of the above technical scheme are as follows: the invention builds the data mart framework with a Python program and uses a producer-consumer pattern to monitor data in the accessed data platform in real time. Each data change triggers the producer once, and the consumer then carries out data integration and processing. During integration and processing, the source data is pulled into the data table. When the report is issued, its content is displayed directly from the data table, and because the data table and the report correspond one-to-one, the report provides a reliable basis for decision-making.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made without departing from the spirit and scope of the invention.

Claims (10)

1. A data mart framework having access to at least one data platform, the data mart framework comprising:
a data integration module for monitoring the data platform and, when source data in the data platform changes, integrating the changed source data;
a data processing module, connected to the data integration module, for synchronizing the integrated source data into a data table;
and a report display module, connected to the data processing module, for presetting a report and, when the report is issued, displaying its content from the updated data table, wherein the data table and the report are in one-to-one correspondence.
2. The data mart framework of claim 1, wherein the at least one data platform includes a database for storing the source data;
the data integration module and the at least one data platform monitor the database in real time and integrate its data through a producer-consumer pattern.
3. The data mart framework of claim 1, wherein the at least one data platform comprises:
a producer processing unit for triggering the producer once each time the source data changes and generating a producer record;
and the data integration module comprises: a consumer processing unit for consuming according to the producer record.
4. The data mart framework of claim 3, wherein the data integration module records the producer records in order in a log.
5. The data mart framework of claim 3, wherein the data processing module comprises:
a timing processing unit for performing, according to a preset timed task, a full synchronous update of the source data that has changed in the data platform, based on the producer records.
6. The data mart framework of claim 1, wherein the data processing module synchronizes the changed source data to the corresponding fields in the data table by means of field relation mapping.
7. The data mart framework of claim 1, wherein the data processing module is a Kettle tool, the Kettle tool being stored in the data mart framework as a plug-in.
8. The data mart framework of claim 1, wherein the data mart framework provides a report interface through which different reports are accessed.
9. The data mart framework of claim 1, wherein the data mart framework provides a data resource interface through which the at least one data platform is accessed.
10. The data mart framework of claim 1, wherein the data platform is a medical data platform.
CN202210192987.6A 2022-02-28 2022-02-28 Data mart framework Pending CN114676201A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210192987.6A CN114676201A (en) 2022-02-28 2022-02-28 Data mart framework

Publications (1)

Publication Number Publication Date
CN114676201A (en) 2022-06-28

Family

ID=82073198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210192987.6A Pending CN114676201A (en) 2022-02-28 2022-02-28 Data mart framework

Country Status (1)

Country Link
CN (1) CN114676201A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116226172A (en) * 2023-05-08 2023-06-06 深圳市新国都数字科技有限公司 Statistical analysis file analysis method, device and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682096A (en) * 2016-12-01 2017-05-17 北京奇虎科技有限公司 Method and device for log data management
CN107193967A (en) * 2017-05-25 2017-09-22 南开大学 A kind of multi-source heterogeneous industry field big data handles full link solution
CN107679708A (en) * 2017-09-13 2018-02-09 浙江帝杰曼信息科技股份有限公司 A kind of management of housing fund cloud platform system
CN109597850A (en) * 2018-11-22 2019-04-09 四川省烟草公司成都市公司 Tobacco integrated information data mart modeling stores platform and data processing method
CN113485781A (en) * 2021-07-23 2021-10-08 中国建设银行股份有限公司 Report generation method and device, electronic equipment and computer readable medium
CN113642301A (en) * 2021-08-09 2021-11-12 京东科技控股股份有限公司 Report generation method, device and system
CN113779092A (en) * 2021-09-17 2021-12-10 平安科技(深圳)有限公司 Real-time data display method, device, equipment and medium based on data warehouse

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination