CN113111103A - Intelligent comprehensive big data fusion processing platform - Google Patents
- Publication number
- CN113111103A
- Authority
- CN
- China
- Prior art keywords
- data
- module
- management
- metadata
- sharing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/26—Visual data mining; Browsing structured data
Abstract
The invention discloses an intelligent comprehensive big data fusion processing platform comprising a data management system and a data value system. The data management system comprises: a metadata management module for uniformly managing and storing metadata; a data resource management module for comprehensively managing global data; a data service management module for the display and application management of the external data service directory; and a data quality audit module for checking the data quality of system data. The data value system comprises: a data access module for accessing various kinds of data; a data sharing module for sharing data; and a data processing module for acquiring and processing data. By constructing an integrated intelligent big data fusion platform that combines big data governance and management, the invention meets the requirements of intelligent data management, intelligent data auditing, visual data management, high-performance data access and extensible data exchange and sharing.
Description
Technical Field
The invention relates to the field of big data processing, in particular to an intelligent comprehensive big data fusion processing platform.
Background
With the advent of the cloud era, big data has attracted increasing attention. Analyst teams hold that the term big data usually describes the large amount of unstructured and semi-structured data created by a company, data that would take excessive time and money to load into a relational database for analysis. Big data analytics is often tied to cloud computing. Big data requires special techniques to process large amounts of data efficiently within a tolerable elapsed time. The strategic significance of big data technology lies not in possessing huge amounts of data but in the specialized processing of the data that carries meaning. In other words, if big data is compared to an industry, the key to making that industry profitable is to improve the capability to process data and to add value to the data through processing.
Patent application publication No. CN201810366975.4 states that big data is a data set that cannot be captured, managed and processed with conventional software tools within a given time range; it is a massive, fast-growing and diversified information asset that yields stronger decision-making power, insight and process optimization capability only under new processing modes.
Existing big data processing platforms generally handle data from a single source only and offer a single function. Their data processing workflow is cumbersome and costly to maintain, which hinders large-scale use and maintenance of such platforms.
Disclosure of Invention
Accordingly, the invention aims to solve the problems of prior-art big data processing platforms: a single data source, a cumbersome data processing workflow and a high maintenance cost for data processing.
To achieve this purpose, the invention provides an intelligent comprehensive big data fusion processing platform comprising a data management system and a data value system;
the data management system comprises:
a metadata management module for uniformly managing and storing metadata and forming data assets from a metadata perspective;
a data resource management module for comprehensively managing global data and interfacing with the basic information resource library and the subject libraries to achieve unified management of data assets;
a data service management module for the display and application management of the external data service directory and for informing users of the state of data exchange and sharing;
a data quality audit module for checking the data quality of system data, continuously monitoring fluctuations in data quality, analyzing the proportions of data quality rules, periodically generating key data quality reports for each system and keeping track of each system's data quality;
the data value system comprises:
a data access module for accessing various kinds of data, including data access through plug-in programs;
a data sharing module for sharing data by means of files, interfaces and plug-in programs;
and a data processing module for acquiring and processing data through a Web-ETL tool and a data grabbing tool.
The metadata management module provides metadata query, automatic collection, import, export, table creation, authorization and version comparison; the metadata includes business metadata, technical metadata and management metadata.
The data resource management module provides a display and monitoring interface for data analysis, use and operation.
The data service management module comprises a data service query submodule, a data service release submodule, a data service auditing submodule and a data service monitoring submodule.
The data quality audit module comprises a data quality rule formulation submodule for intelligently generating data quality rules according to data standards.
The data access module further comprises a full/incremental interface.
The data sharing module comprises:
a file and interface data sharing submodule for exchanging shared data by file and by interface;
a full/incremental file generation submodule for supporting full/incremental generation of data files;
a data calling interface generation submodule for generating data calling interfaces;
and a plug-in program sharing submodule for extending the sharing modes through plug-in programs.
The data processing module comprises a graphical operation interface and a programming-free application panel for realizing data processing, data modeling and data scheduling.
The intelligent comprehensive big data fusion processing platform further comprises a metadata lineage analysis module, a metadata impact analysis module and a metadata association analysis module for displaying data flow direction and data associations.
The beneficial effects of this application are as follows. Systematic management of data assets is achieved across data standards, data quality, data security, metadata management and the data life cycle. Data asset operation and application support are realized around data circulation and data services. An integrated intelligent big data fusion platform is constructed that combines digital government business knowledge with big data technology, big data governance with management, data access with data exchange and sharing, and data security with visual data development tools, meeting the requirements of intelligent data management, intelligent data auditing, visual data management, high-performance data access and extensible data exchange and sharing.
Drawings
To illustrate the embodiments of the present invention and the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. Those skilled in the art can obtain other drawings from the structures shown in these drawings without creative effort.
FIG. 1 is a framework diagram of the fusion processing platform of the present invention.
The implementation, functional features and advantages of the present invention are further explained with reference to the accompanying drawings.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, embodiments of the present invention will now be described with reference to the accompanying drawings.
As shown in FIG. 1, this embodiment provides an intelligent comprehensive big data fusion processing platform that includes a data management system and a data value system. Combined with the integrated data quality audit module and data processing module, it provides data access capability, data exchange and sharing capability, data asset management capability and data management capability.
1. The data management system comprises:
A metadata management module: the module can be extended in user-defined ways to meet diverse metadata requirements. Metadata serves as the internal driver of data management and is managed in a unified way, with functions including metadata query, automatic collection, import, export, table creation, authorization and version comparison. Data assets are formed from a metadata perspective, achieving unified management and storage of business metadata, technical metadata and management metadata. The platform also provides metadata lineage analysis, impact analysis and association analysis, and visually displays data flow direction and associations.
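The patent does not say how lineage and impact analysis are computed. One common way to sketch them is as reachability over a directed graph of data flows: lineage follows edges upstream, impact follows them downstream. The class name `LineageGraph` and the table names below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph of data flows; an edge src -> dst means dst is derived from src."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)

    def add_flow(self, src, dst):
        self._downstream[src].add(dst)
        self._upstream[dst].add(src)

    @staticmethod
    def _reach(start, edges):
        # Breadth-first search that collects every node reachable from start.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def lineage(self, table):
        """Every upstream source the table depends on."""
        return self._reach(table, self._upstream)

    def impact(self, table):
        """Every downstream table affected when this table changes."""
        return self._reach(table, self._downstream)


# Hypothetical table names, for illustration only.
graph = LineageGraph()
graph.add_flow("ods.citizen", "dwd.citizen_clean")
graph.add_flow("ods.address", "dwd.citizen_clean")
graph.add_flow("dwd.citizen_clean", "ads.population_report")

print(sorted(graph.lineage("ads.population_report")))
print(sorted(graph.impact("ods.address")))
```

Keeping both an upstream and a downstream index lets the same traversal answer both questions without reversing the graph on each query.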
A data resource management module: manages global data comprehensively and interfaces with the basic information resource library and each subject library to achieve unified management of data assets. It gives users a channel for grasping and understanding the integrated data resources, serves as a standard window for promoting data integration and future data exchange and fusion, and provides a display and monitoring interface for data analysis, use and operation.
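As a rough illustration of unified asset management, the sketch below registers the tables of several libraries in one catalog and produces a per-library count that a monitoring view could display. The names `AssetCatalog`, `attach_library` and the sample libraries are assumptions made for this example and do not appear in the patent.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    name: str      # fully qualified name, "<library>.<table>"
    library: str   # e.g. the basic information library or a subject library
    owner: str


class AssetCatalog:
    """Hypothetical single catalog spanning all connected libraries."""

    def __init__(self):
        self._assets = {}

    def attach_library(self, library, tables):
        """Register every table of a library; tables maps table name -> owner."""
        for table, owner in tables.items():
            key = f"{library}.{table}"
            self._assets[key] = DataAsset(key, library, owner)

    def search(self, keyword):
        return sorted(name for name in self._assets if keyword in name)

    def summary(self):
        """Asset count per library, suitable for a monitoring display."""
        return dict(Counter(asset.library for asset in self._assets.values()))


catalog = AssetCatalog()
catalog.attach_library("basic", {"person": "civil_affairs", "enterprise": "market_bureau"})
catalog.attach_library("subject_health", {"person_visits": "health_bureau"})

print(catalog.search("person"))
print(catalog.summary())
```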
A data service management module: handles the display and application management of the external data service directory and keeps track of the state of data exchange and sharing. It provides data service query, data service release, data service audit, data service monitoring and similar functions.
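The four functions named above (release, audit, query, monitoring) can be read as a small service lifecycle. The following is a minimal sketch under that reading; the status values, the class `ServiceDirectory` and its methods are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class DataService:
    name: str
    description: str
    status: str = "pending_audit"  # becomes "published" or "rejected" after audit
    call_count: int = 0


class ServiceDirectory:
    """Hypothetical in-memory directory of externally offered data services."""

    def __init__(self):
        self._services = {}

    def release(self, name, description):
        """Submit a service for release; it stays hidden until audited."""
        self._services[name] = DataService(name, description)

    def audit(self, name, approved):
        svc = self._services[name]
        if svc.status != "pending_audit":
            raise ValueError(f"{name} has already been audited")
        svc.status = "published" if approved else "rejected"

    def query(self, keyword):
        """Only published services are visible in the directory."""
        kw = keyword.lower()
        return sorted(s.name for s in self._services.values()
                      if s.status == "published" and kw in s.description.lower())

    def record_call(self, name):
        svc = self._services[name]
        if svc.status != "published":
            raise PermissionError(f"{name} is not published")
        svc.call_count += 1

    def monitor(self):
        """Call counts of published services."""
        return {s.name: s.call_count for s in self._services.values()
                if s.status == "published"}


directory = ServiceDirectory()
directory.release("population_stats", "Population statistics by district")
directory.release("draft_feed", "Unreviewed population feed")
directory.audit("population_stats", approved=True)
directory.audit("draft_feed", approved=False)
directory.record_call("population_stats")

print(directory.query("population"))
print(directory.monitor())
```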
A data quality audit module: intelligently generates data quality rules according to data standards and exposes the data quality problems of each system by formulating and executing data quality checks. It continuously monitors each system's data quality fluctuations and analyzes the proportions of data quality rules, periodically generates key data quality reports for each system, and keeps track of each system's data quality.
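The patent leaves open what a data standard looks like and how rules are derived from it. The sketch below assumes a standard expressed as per-field constraints and turns each constraint into a check, then reports the pass rate per rule, which is one possible reading of the rule proportion analysis. The dictionary layout, field names and rule names are all assumptions.

```python
import re

# Hypothetical data standard: per-field constraints.
DATA_STANDARD = {
    "id_number": {"required": True, "pattern": r"\d{18}"},
    "age": {"min": 0, "max": 150},
}


def generate_rules(standard):
    """Turn each constraint into a (rule_name, field, check) tuple."""
    rules = []
    for field_name, spec in standard.items():
        if spec.get("required"):
            rules.append((f"{field_name}:not_null", field_name,
                          lambda v: v is not None))
        if "pattern" in spec:
            regex = re.compile(spec["pattern"])
            # Missing values are left to the not_null rule.
            rules.append((f"{field_name}:format", field_name,
                          lambda v, regex=regex: v is None or regex.fullmatch(str(v)) is not None))
        if "min" in spec or "max" in spec:
            lo = spec.get("min", float("-inf"))
            hi = spec.get("max", float("inf"))
            rules.append((f"{field_name}:range", field_name,
                          lambda v, lo=lo, hi=hi: v is None or lo <= v <= hi))
    return rules


def quality_report(records, rules):
    """Fraction of records that pass each rule."""
    report = {}
    for name, field_name, check in rules:
        passed = sum(1 for record in records if check(record.get(field_name)))
        report[name] = passed / len(records) if records else 1.0
    return report


records = [
    {"id_number": "110101199001011234", "age": 34},
    {"id_number": None, "age": 200},
    {"id_number": "123", "age": -1},
    {"id_number": "110101198505054321", "age": None},
]
report = quality_report(records, generate_rules(DATA_STANDARD))
print(report)
```

Running the same report on a schedule and comparing successive results would give the fluctuation monitoring the paragraph describes.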
2. The data value system comprises:
A data access module: supports multiple data access modes, including: (database) full/incremental interfaces, batch access, and access from multiple tables into the same data table; (file) parsing files into storage, or retaining files without loading them into storage; (interface) calling interfaces and loading the results into storage, or saving interface information for pass-through access. The platform also supports extending the access modes through plug-in (third-party) programs, so that data sources can be adapted and extended quickly without affecting the platform's functions, meeting the needs of practical application scenarios.
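A minimal sketch of two ideas from this paragraph: full versus incremental access, modeled here with a watermark on an `updated_at` column, and plug-in extension, modeled with a registry of reader functions. The registry, the decorator and the field names are assumptions; the patent does not specify a plug-in mechanism.

```python
import csv
import io

ACCESS_PLUGINS = {}  # hypothetical registry: access mode name -> reader function


def access_plugin(kind):
    """Decorator that registers a reader under an access mode name."""
    def register(func):
        ACCESS_PLUGINS[kind] = func
        return func
    return register


@access_plugin("database")
def read_database(source, watermark=None):
    """Full load when watermark is None, otherwise only rows updated after it."""
    rows = source["rows"]
    if watermark is None:
        return list(rows)
    return [row for row in rows if row["updated_at"] > watermark]


@access_plugin("csv_text")
def read_csv_text(source, watermark=None):
    """Stand-in for a third-party plug-in: parses CSV text into rows."""
    return list(csv.DictReader(io.StringIO(source["text"])))


def ingest(kind, source, watermark=None):
    if kind not in ACCESS_PLUGINS:
        raise KeyError(f"no access plug-in registered for {kind!r}")
    return ACCESS_PLUGINS[kind](source, watermark)


db = {"rows": [
    {"id": 1, "updated_at": "2021-04-01"},
    {"id": 2, "updated_at": "2021-04-06"},
]}
full_rows = ingest("database", db)
new_rows = ingest("database", db, watermark="2021-04-01")
csv_rows = ingest("csv_text", {"text": "id,name\n1,a\n"})
print(len(full_rows), new_rows, csv_rows)
```

Because new readers are added by registration only, `ingest` never changes when a source type is added, which matches the stated goal of extending data sources without affecting platform functions.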
A data sharing module: supports exchanging shared data by file and by interface; supports full/incremental generation of data files; supports generation of data calling interfaces; and supports extending the sharing modes through plug-in (third-party) programs, so that data can be shared after aggregation and calculation.
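The sketch below illustrates full/incremental file generation and a generated data calling interface in the simplest form that comes to mind. It assumes each row carries a `version` number and uses JSON Lines as the file format; both choices, and the function names, are assumptions rather than anything the patent prescribes.

```python
import json
import tempfile
from pathlib import Path


def export_file(rows, path, since=None):
    """Write rows as JSON Lines: all rows when since is None, else rows with version > since."""
    selected = [row for row in rows if since is None or row["version"] > since]
    with Path(path).open("w", encoding="utf-8") as fh:
        for row in selected:
            fh.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(selected)


def make_query_interface(rows, key):
    """Generate a calling interface: a function that looks rows up by the given key."""
    index = {row[key]: row for row in rows}

    def call(value):
        return index.get(value)

    return call


ROWS = [
    {"id": "a", "version": 1},
    {"id": "b", "version": 2},
    {"id": "c", "version": 3},
]

with tempfile.TemporaryDirectory() as tmp:
    full_count = export_file(ROWS, Path(tmp) / "full.jsonl")
    incr_count = export_file(ROWS, Path(tmp) / "incr.jsonl", since=1)
    incr_lines = (Path(tmp) / "incr.jsonl").read_text(encoding="utf-8").splitlines()

get_by_id = make_query_interface(ROWS, "id")
print(full_count, incr_count, get_by_id("b"))
```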
A data processing module: the platform provides an integrated Web-ETL tool and a data grabbing tool, supports a graphical operation interface and programming-free use, and offers an object-oriented mode of operation so that data acquisition and processing flows are completed in one place. Processing, modeling and scheduling can be configured together in a single panel workspace, which improves the continuity of users' work with the data management tools and reduces the maintenance cost of data processing.
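A graphical ETL designer typically produces a declarative description of steps and their dependencies, which a scheduler then runs in dependency order. The sketch below assumes that model and uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) for ordering; the step names, the `PIPELINE` layout and the sample data are hypothetical.

```python
from graphlib import TopologicalSorter


def extract(ctx):
    ctx["raw"] = [" Alice ", "bob", "", "alice"]


def clean(ctx):
    ctx["clean"] = [s.strip().lower() for s in ctx["raw"] if s.strip()]


def dedupe(ctx):
    ctx["unique"] = sorted(set(ctx["clean"]))


def load(ctx):
    ctx["loaded"] = len(ctx["unique"])


# Declarative pipeline, as a visual designer might emit it:
# each step names the function to run and the steps it must follow.
PIPELINE = {
    "load": {"run": load, "after": ["dedupe"]},
    "dedupe": {"run": dedupe, "after": ["clean"]},
    "clean": {"run": clean, "after": ["extract"]},
    "extract": {"run": extract, "after": []},
}


def run_pipeline(pipeline):
    """Run every step after its prerequisites; return the order used and the final context."""
    graph = {name: step["after"] for name, step in pipeline.items()}
    order = list(TopologicalSorter(graph).static_order())
    ctx = {}
    for name in order:
        pipeline[name]["run"](ctx)
    return order, ctx


order, ctx = run_pipeline(PIPELINE)
print(order, ctx["unique"], ctx["loaded"])
```

The steps are listed out of order on purpose: the scheduler, not the author of the configuration, is responsible for sequencing.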
The foregoing describes preferred embodiments of the invention. The invention is not limited to the forms disclosed herein, and this description should not be read as excluding other embodiments. The invention may be used in various other combinations, modifications and environments, and may be altered within the scope of the concept disclosed herein, either through the teachings above or through the skill or knowledge of the relevant art. Modifications and variations made by those skilled in the art without departing from the spirit and scope of the invention fall within the protection of the appended claims.
Claims (9)
1. An intelligent comprehensive big data fusion processing platform, characterized by comprising a data management system and a data value system;
the data management system comprises:
a metadata management module for uniformly managing and storing metadata and forming data assets from a metadata perspective;
a data resource management module for comprehensively managing global data and interfacing with the basic information resource library and the subject libraries to achieve unified management of data assets;
a data service management module for the display and application management of the external data service directory and for informing users of the state of data exchange and sharing;
a data quality audit module for checking the data quality of system data, continuously monitoring fluctuations in data quality, analyzing the proportions of data quality rules, periodically generating key data quality reports for each system and keeping track of each system's data quality;
the data value system comprises:
a data access module for accessing various kinds of data, including data access through plug-in programs;
a data sharing module for sharing data by means of files, interfaces and plug-in programs;
and a data processing module for acquiring and processing data through a Web-ETL tool and a data grabbing tool.
2. The intelligent comprehensive big data fusion processing platform of claim 1, wherein the metadata management module provides metadata query, automatic collection, import, export, table creation, authorization and version comparison; the metadata includes business metadata, technical metadata and management metadata.
3. The intelligent comprehensive big data fusion processing platform of claim 1, wherein the data resource management module provides a display and monitoring interface for data analysis, use and operation.
4. The intelligent comprehensive big data fusion processing platform of claim 1, wherein the data service management module comprises a data service query submodule, a data service release submodule, a data service audit submodule and a data service monitoring submodule.
5. The intelligent comprehensive big data fusion processing platform of claim 1, wherein the data quality audit module comprises a data quality rule formulation submodule for intelligently generating data quality rules according to data standards.
6. The intelligent comprehensive big data fusion processing platform of claim 1, wherein the data access module further comprises a full/incremental interface.
7. The intelligent comprehensive big data fusion processing platform of claim 1, wherein the data sharing module comprises:
a file and interface data sharing submodule for exchanging shared data by file and by interface;
a full/incremental file generation submodule for supporting full/incremental generation of data files;
a data calling interface generation submodule for generating data calling interfaces;
and a plug-in program sharing submodule for extending the sharing modes through plug-in programs.
8. The intelligent comprehensive big data fusion processing platform of claim 1, wherein the data processing module comprises a graphical operation interface and a programming-free application panel for realizing data processing, data modeling and data scheduling.
9. The intelligent comprehensive big data fusion processing platform of claim 1, further comprising a metadata lineage analysis module, a metadata impact analysis module and a metadata association analysis module for displaying data flow direction and data associations.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110366842.9A CN113111103A (en) | 2021-04-06 | 2021-04-06 | Intelligent comprehensive big data fusion processing platform |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113111103A (en) | 2021-07-13 |
Family
ID=76713990
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110366842.9A Pending CN113111103A (en) | 2021-04-06 | 2021-04-06 | Intelligent comprehensive big data fusion processing platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113111103A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112396404A (en) * | 2020-11-27 | 2021-02-23 | 广州光点信息科技有限公司 | Data center system |
Non-Patent Citations (1)
Title |
---|
NITHIN VIJAYENDRA et al.: "A Web-based ETL Tool for Data Integration Process", 2013 6TH INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTIONS (HSI), pages 434 - 438 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113626648A (en) * | 2021-07-30 | 2021-11-09 | 联通(广东)产业互联网有限公司 | Water conservancy data processing system, method and storage medium |
CN113626648B (en) * | 2021-07-30 | 2024-01-26 | 联通(广东)产业互联网有限公司 | Water conservancy data processing system, method and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210713 |