CN108197182A - A kind of data atlas analysis system and method - Google Patents
- Publication number
- CN108197182A (application number CN201711424043.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- metadata
- unit
- atlas
- association information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
- G06F16/2448—Query languages for particular applications; for extensibility, e.g. user defined types
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/288—Entity relationship models
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Automatic Analysis And Handling Materials Therefor (AREA)
Abstract
The invention discloses a data atlas analysis system, comprising: a data processing module, which includes a metadata acquisition unit and a data flow analysis unit connected to the metadata acquisition unit; a data storage module, which includes a data atlas storage unit connected to the data flow analysis unit; and a data display module, which includes a data atlas display unit connected to the data atlas storage unit. The metadata acquisition unit collects metadata from each production database and analytical database. The data flow analysis unit analyzes ETL scripts using the parsing tool JSQLParser and, with reference to the metadata, obtains table-level and field-level association information between the production databases and analytical databases. The association information is stored in the data atlas storage unit, and the data atlas display unit presents it graphically. This technical solution makes it possible to store and display the association relationships between metadata.
Description
Technical field
The present invention relates to the field of network technology, and in particular to a data atlas analysis system and method.
Background
A data atlas analysis system describes the entire course of data processing, including the origin of the data and all subsequent operations performed on it. On big data platforms, distributed data sharing is increasingly prominent, and knowing where data comes from is particularly important for analyzing data, unifying data definitions, managing model changes, assessing the credibility of data, and ensuring data quality. However, no such data atlas analysis system exists in the prior art.
Summary of the invention
In view of the above problem in the related art, the present invention proposes a data atlas analysis system and method capable of storing and displaying the association relationships between metadata.
The technical solution of the present invention is realized as follows:
According to one aspect of the invention, a data atlas analysis system is provided, comprising: a data processing module, which includes a metadata acquisition unit and a data flow analysis unit connected to the metadata acquisition unit; a data storage module, which includes a data atlas storage unit connected to the data flow analysis unit; and a data display module, which includes a data atlas display unit connected to the data atlas storage unit;
wherein the metadata acquisition unit collects metadata from each production database and analytical database; the data flow analysis unit analyzes ETL scripts using the parsing tool JSQLParser and, with reference to the metadata, obtains table-level and field-level association information between the production databases and analytical databases; the association information is stored in the data atlas storage unit; and the data atlas display unit presents the association information graphically.
According to an embodiment of the invention, the metadata acquisition unit is further configured to schedule the metadata collection process through the workflow management platform Airflow.
According to an embodiment of the invention, the data atlas display unit builds a data dictionary and a data atlas display page using the Spring Boot framework, the Neo4j database, and ECharts, in order to present the association information graphically.
According to an embodiment of the invention, the data atlas storage unit includes a Neo4j database, and the association information is stored in the Neo4j database.
According to an embodiment of the invention, the data storage module further includes a metadata storage unit connected to the metadata acquisition unit, and the metadata acquisition unit integrates the collected metadata into the metadata storage unit.
According to an embodiment of the invention, the data display module further includes a metadata management unit, which provides a query and display interface for viewing and querying metadata.
According to another aspect of the invention, a data atlas analysis method is provided, comprising:
S1, collecting metadata from each production database and analytical database;
S2, analyzing ETL scripts using the parsing tool JSQLParser and, with reference to the metadata, obtaining table-level and field-level association information between the production databases and analytical databases;
S3, presenting the association information graphically.
According to an embodiment of the invention, step S1 further includes: scheduling the metadata collection process through the workflow management platform Airflow.
According to an embodiment of the invention, step S3 specifically includes: building a data dictionary and a data atlas display page using the Spring Boot framework, the Neo4j database, and ECharts, in order to present the association information graphically.
The present invention uses the parsing tool JSQLParser to analyze ETL scripts and thereby resolve the association relationships between the metadata of production databases and analytical databases, and it can store and display these relationships. Each module can work independently and can be configured freely for each target system, so the functions are modular and highly configurable. In addition, by extending the open source software JSQLParser, the system can handle structured query statements in a variety of standard and non-standard dialects, which gives it high extensibility.
Brief description of the drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a block diagram of a data atlas analysis system according to an embodiment of the present invention;
Fig. 2 is an exemplary flowchart of the data flow analysis unit of Fig. 1 analyzing ETL scripts;
Fig. 3 is a flowchart of a data atlas analysis method according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are obviously only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention fall within the scope of protection of the present invention.
As shown in Fig. 1, a data atlas analysis system 100 according to an embodiment of the present invention includes: a data processing module 110, which includes a metadata acquisition unit 112 and a data flow analysis unit 114 connected to the metadata acquisition unit 112; a data storage module 120, which includes a data atlas storage unit 124 connected to the data flow analysis unit 114; and a data display module 130, which includes a data atlas display unit 134 connected to the data atlas storage unit 124.
The metadata acquisition unit 112 collects metadata from each production database 210 and analytical database 220. The data flow analysis unit 114 analyzes the ETL scripts 230 using the parsing tool JSQLParser and, with reference to the metadata, obtains table-level and field-level association information between the production databases 210 and analytical databases 220. The association information is stored in the data atlas storage unit 124, and the data atlas display unit 134 presents the association information graphically.
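The table-level part of this analysis can be illustrated with a minimal Python sketch. It is not the JSQLParser-based implementation described here: a pair of regular expressions stands in for the parser, only simple `INSERT INTO ... SELECT ... FROM/JOIN ...` statements are handled, and the function name `table_lineage` and the sample table names are hypothetical.

```python
import re

# Hypothetical stand-in for a real SQL parser such as JSQLParser:
# handles only simple "INSERT INTO <target> SELECT ... FROM/JOIN <source>" statements.
INSERT_RE = re.compile(r"insert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def table_lineage(etl_script: str) -> set:
    """Return the set of (source_table, target_table) edges found in an ETL script."""
    edges = set()
    for statement in etl_script.split(";"):
        target = INSERT_RE.search(statement)
        if target is None:
            continue  # not a load statement; ignored in this sketch
        for source in SOURCE_RE.findall(statement):
            edges.add((source.lower(), target.group(1).lower()))
    return edges


script = """
INSERT INTO dw.orders_daily
SELECT o.order_date, SUM(o.amount)
FROM prod.orders o JOIN prod.customers c ON o.customer_id = c.id
GROUP BY o.order_date;
"""
edges = table_lineage(script)
```

Each edge records that data flows from a production table into an analytical table, which is the kind of association information the storage unit would hold. A real parser is needed for subqueries, CTEs, and dialect-specific syntax.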
The above technical solution uses the parsing tool JSQLParser to analyze the ETL scripts 230 and thereby resolve the association relationships between the metadata of the production databases 210 and the analytical databases 220, and it can store and display these relationships. Each module can work independently and can be configured freely for each target system, so the functions are modular and highly configurable. In addition, by extending the open source software JSQLParser, the system can handle structured query statements in a variety of standard and non-standard dialects, which gives it high extensibility.
With continued reference to Fig. 1, the metadata acquisition unit 112 is further configured to schedule the metadata collection process through the workflow management platform Airflow.
Preferably, the data atlas display unit 134 builds a data dictionary and a data atlas display page using the Spring Boot framework, the Neo4j database, and ECharts, in order to present the association information graphically. With the high-performance graph database Neo4j, the Spring Boot framework, and an interface built with ECharts, the attributes and relationships of each table and column can be displayed flexibly.
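As an illustration of what the display page might receive, the following Python sketch builds an ECharts option object for a force-layout `graph` series from a list of lineage edges. The patent does not specify the payload format, so the function name `graph_option` and the choice of fields are assumptions; in the described system this JSON would presumably be produced by the Spring Boot backend rather than by Python.

```python
import json


def graph_option(edges):
    """Build an ECharts option dict with one force-layout graph series."""
    # Every table that appears in any edge becomes one node.
    names = sorted({name for edge in edges for name in edge})
    return {
        "series": [
            {
                "type": "graph",
                "layout": "force",
                "roam": True,
                "label": {"show": True},
                "data": [{"name": name} for name in names],
                "links": [
                    {"source": source, "target": target}
                    for source, target in sorted(edges)
                ],
            }
        ]
    }


option = graph_option([("prod.orders", "dw.orders_daily")])
payload = json.dumps(option)  # JSON a backend endpoint might return to the page
```

On the page, the decoded object would be handed to the chart instance as its option; nodes are tables and links are data flows between them.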
In the embodiment shown in Fig. 1, the data atlas storage unit 124 includes a Neo4j database, and the association information obtained by the data flow analysis unit 114 is stored in the Neo4j database.
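One way to picture this storage step is to turn each lineage edge into a parameterised Cypher `MERGE` statement. The sketch below only prepares the statement text and parameters; it does not connect to a database. The node label `Table`, the relationship type `FLOWS_TO`, and the helper name `to_cypher_batch` are hypothetical, since the patent does not describe the graph schema.

```python
# Hypothetical graph schema: (:Table {name})-[:FLOWS_TO]->(:Table {name}).
# MERGE creates a node or relationship only if it does not already exist,
# so re-running the import does not duplicate edges.
LINEAGE_CYPHER = (
    "MERGE (s:Table {name: $source}) "
    "MERGE (t:Table {name: $target}) "
    "MERGE (s)-[:FLOWS_TO]->(t)"
)


def to_cypher_batch(edges):
    """Pair the parameterised statement with one parameter map per edge."""
    return [
        (LINEAGE_CYPHER, {"source": source, "target": target})
        for source, target in sorted(edges)
    ]


batch = to_cypher_batch({("prod.orders", "dw.orders_daily")})
```

Each `(statement, parameters)` pair could then be executed through whichever Neo4j client the deployment uses.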
The data storage module 120 further includes a metadata storage unit 122 connected to the metadata acquisition unit 112, and the metadata acquisition unit 112 integrates the collected metadata into the metadata storage unit 122.
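The collect-and-integrate flow can be sketched in Python with the standard-library `sqlite3` module. This is an assumption-laden stand-in: in-memory SQLite databases play the roles of both a source database and the metadata store, and the table `column_metadata` and the functions `collect_metadata` and `integrate` are hypothetical names.

```python
import sqlite3


def collect_metadata(source: sqlite3.Connection, source_name: str) -> list:
    """Read (database, table, column, type) rows from one source database."""
    rows = []
    tables = source.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, col_type, *_rest in source.execute(
            f'PRAGMA table_info("{table}")'
        ):
            rows.append((source_name, table, column, col_type))
    return rows


def integrate(store: sqlite3.Connection, rows: list) -> None:
    """Write collected rows into the central metadata table."""
    store.execute(
        "CREATE TABLE IF NOT EXISTS column_metadata ("
        "db TEXT, tbl TEXT, col TEXT, col_type TEXT, "
        "PRIMARY KEY (db, tbl, col))"
    )
    store.executemany(
        "INSERT OR REPLACE INTO column_metadata VALUES (?, ?, ?, ?)", rows
    )
    store.commit()


production = sqlite3.connect(":memory:")
production.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
metadata_store = sqlite3.connect(":memory:")
collected = collect_metadata(production, "prod")
integrate(metadata_store, collected)
```

Because the insert is keyed on database, table, and column, running the collection again on a schedule refreshes existing rows instead of duplicating them.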
The data display module 130 further includes a metadata management unit 132, which provides a query and display interface for viewing and querying metadata.
The operation of each unit of the data atlas analysis system 100 of the present invention is described below with reference to Fig. 1.
The metadata acquisition unit 112 supports extracting metadata from various types of production databases 210 and analytical databases 220 and integrating it into the MySQL database of the metadata storage unit 122; the whole process can be scheduled by the workflow management platform Airflow.
Referring to Fig. 2, the data flow analysis unit 114 supports a variety of SQL statements by extending the open source software JSQLParser, and it can be further extended according to the syntax of different databases. By parsing the SQL statements and querying the basic metadata, the data flow analysis unit 114 obtains the data source and calculation expression of each table and column.
The metadata management unit 132 provides a complete query and display interface, which can be used to view the basic information of metadata and to look up metadata that matches given search conditions.
As for the data atlas storage unit 124 and the data atlas display unit 134: the data atlas analysis system 100 can extract all metadata from each production database 210 and analytical database 220 and import it into the Neo4j database through loadNeo4j. The data atlas display unit 134 builds the data atlas interface (for example, a web page) using the Spring Boot framework, the Neo4j database, and ECharts, and presents the graphical association information to the user.
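The field-level step, in which each target column is traced back to its source columns and calculation expression with the help of the collected metadata, can be sketched as follows. This is a simplified Python illustration, not the JSQLParser extension itself: it assumes a single `INSERT INTO t (cols) SELECT exprs FROM s` statement with no commas inside expressions, and the names `field_lineage` and `metadata` are hypothetical.

```python
import re

STATEMENT_RE = re.compile(
    r"insert\s+into\s+(?P<target>[\w.]+)\s*\((?P<cols>[^)]*)\)\s*"
    r"select\s+(?P<exprs>.*?)\s+from\s+(?P<source>[\w.]+)",
    re.IGNORECASE | re.DOTALL,
)


def field_lineage(sql, metadata):
    """Map each target column to its expression and the source columns it uses.

    `metadata` maps a table name to the set of its column names; it is used to
    decide which identifiers in an expression are real source columns.
    """
    match = STATEMENT_RE.search(sql)
    if match is None:
        return {}
    target_cols = [c.strip() for c in match["cols"].split(",")]
    # Naive split: assumes no commas inside function calls.
    exprs = [e.strip() for e in match["exprs"].split(",")]
    known = metadata.get(match["source"], set())
    result = {}
    for col, expr in zip(target_cols, exprs):
        used = sorted(set(re.findall(r"[A-Za-z_]\w*", expr)) & known)
        result[f"{match['target']}.{col}"] = {
            "expression": expr,
            "sources": [f"{match['source']}.{name}" for name in used],
        }
    return result


sql = (
    "INSERT INTO dw.sales (day, revenue) "
    "SELECT order_date, price * quantity FROM prod.orders"
)
metadata = {"prod.orders": {"order_date", "price", "quantity", "customer_id"}}
lineage = field_lineage(sql, metadata)
```

Here `dw.sales.revenue` is traced to `prod.orders.price` and `prod.orders.quantity` through the expression `price * quantity`. Consulting the metadata is what separates column names from other identifiers in an expression, such as function names.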
In summary, the advantages of the data atlas analysis system of the present invention are:
1. Modular functions and high configurability. Each module of the system can work independently and can be configured freely for each target system;
2. High extensibility. By extending the open source software JSQLParser, the system can handle structured query statements in a variety of standard and non-standard dialects;
3. Flexible data atlas display. With the high-performance graph database Neo4j, the Spring Boot framework, and an interface built with ECharts, the attributes and relationships of each table and column are displayed flexibly.
As shown in Fig. 3, according to an embodiment of the present invention, a data atlas analysis method 30 is also provided, comprising the following steps:
S32, collecting metadata from each production database and analytical database;
S34, analyzing ETL scripts using the parsing tool JSQLParser and, with reference to the metadata, obtaining table-level and field-level association information between the production databases and analytical databases;
S36, presenting the association information graphically.
In one embodiment, step S32 may further include: scheduling the metadata collection process through the workflow management platform Airflow.
In one embodiment, step S36 specifically includes: building a data dictionary and a data atlas display page using the Spring Boot framework, the Neo4j database, and ECharts, in order to present the association information graphically.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (9)
1. A data atlas analysis system, characterized by comprising:
a data processing module, the data processing module including a metadata acquisition unit and a data flow analysis unit connected to the metadata acquisition unit;
a data storage module, the data storage module including a data atlas storage unit connected to the data flow analysis unit; and
a data display module, the data display module including a data atlas display unit connected to the data atlas storage unit;
wherein the metadata acquisition unit is configured to collect metadata from each production database and analytical database; the data flow analysis unit analyzes ETL scripts using the parsing tool JSQLParser and, with reference to the metadata, obtains table-level and field-level association information between the production databases and analytical databases; the association information is stored in the data atlas storage unit; and the data atlas display unit is configured to present the association information graphically.
2. The data atlas analysis system according to claim 1, characterized in that the metadata acquisition unit is further configured to schedule the metadata collection process through the workflow management platform Airflow.
3. The data atlas analysis system according to claim 1, characterized in that the data atlas display unit builds a data dictionary and a data atlas display page using the Spring Boot framework, the Neo4j database, and ECharts, in order to present the association information graphically.
4. The data atlas analysis system according to claim 1, characterized in that the data atlas storage unit includes a Neo4j database, and the association information is stored in the Neo4j database.
5. The data atlas analysis system according to claim 1, characterized in that the data storage module further includes a metadata storage unit connected to the metadata acquisition unit, wherein the metadata acquisition unit integrates the collected metadata into the metadata storage unit.
6. The data atlas analysis system according to claim 5, characterized in that the data display module further includes a metadata management unit, the metadata management unit being configured to provide a query and display interface for viewing and querying the metadata.
7. A data atlas analysis method, characterized by comprising:
S1, collecting metadata from each production database and analytical database;
S2, analyzing ETL scripts using the parsing tool JSQLParser and, with reference to the metadata, obtaining table-level and field-level association information between the production databases and analytical databases;
S3, presenting the association information graphically.
8. The data atlas analysis method according to claim 7, characterized in that step S1 further includes: scheduling the metadata collection process through the workflow management platform Airflow.
9. The data atlas analysis method according to claim 7, characterized in that step S3 specifically includes: building a data dictionary and a data atlas display page using the Spring Boot framework, the Neo4j database, and ECharts, in order to present the association information graphically.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711424043.2A CN108197182A (en) | 2017-12-25 | 2017-12-25 | A kind of data atlas analysis system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711424043.2A CN108197182A (en) | 2017-12-25 | 2017-12-25 | A kind of data atlas analysis system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108197182A true CN108197182A (en) | 2018-06-22 |
Family
ID=62583865
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711424043.2A Pending CN108197182A (en) | 2017-12-25 | 2017-12-25 | A kind of data atlas analysis system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108197182A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109446263A (en) * | 2018-11-02 | 2019-03-08 | 成都四方伟业软件股份有限公司 | A kind of data relationship correlating method and device |
CN109739894A (en) * | 2019-01-04 | 2019-05-10 | 深圳前海微众银行股份有限公司 | Supplement method, apparatus, equipment and the storage medium of metadata description |
CN109840267A (en) * | 2019-03-01 | 2019-06-04 | 成都品果科技有限公司 | A kind of ETL process system and method |
CN110019252A (en) * | 2019-04-16 | 2019-07-16 | 成都四方伟业软件股份有限公司 | The method, apparatus and electronic equipment of information processing |
CN111078695A (en) * | 2019-11-29 | 2020-04-28 | 东软集团股份有限公司 | Method and device for calculating metadata association relation in enterprise |
CN112100266A (en) * | 2020-11-05 | 2020-12-18 | 成都中科大旗软件股份有限公司 | Big data map analysis method and system |
CN112685405A (en) * | 2020-12-21 | 2021-04-20 | 福建新大陆软件工程有限公司 | Data management method, system, equipment and medium based on knowledge graph |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080168420A1 (en) * | 2006-03-17 | 2008-07-10 | The Mitre Corporation | Semantic system for integrating software components |
CN101770479A (en) * | 2008-12-31 | 2010-07-07 | 北京亿阳信通软件研究院有限公司 | Association relationship query method and device |
CN104216888A (en) * | 2013-05-30 | 2014-12-17 | 中国电信股份有限公司 | Data processing task relation setting method and system |
CN107273079A (en) * | 2017-05-18 | 2017-10-20 | 网易(杭州)网络有限公司 | Related information is shown, collection of illustrative plates processing method, device, medium, equipment and system |
Non-Patent Citations (1)
Title |
---|
李克学: "某银行元数据解析处理系统", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109446263A (en) * | 2018-11-02 | 2019-03-08 | 成都四方伟业软件股份有限公司 | A kind of data relationship correlating method and device |
CN109739894A (en) * | 2019-01-04 | 2019-05-10 | 深圳前海微众银行股份有限公司 | Supplement method, apparatus, equipment and the storage medium of metadata description |
CN109739894B (en) * | 2019-01-04 | 2022-12-09 | 深圳前海微众银行股份有限公司 | Method, device, equipment and storage medium for supplementing metadata description |
CN109840267A (en) * | 2019-03-01 | 2019-06-04 | 成都品果科技有限公司 | A kind of ETL process system and method |
CN109840267B (en) * | 2019-03-01 | 2023-04-21 | 成都品果科技有限公司 | Data ETL system and method |
CN110019252A (en) * | 2019-04-16 | 2019-07-16 | 成都四方伟业软件股份有限公司 | The method, apparatus and electronic equipment of information processing |
CN111078695A (en) * | 2019-11-29 | 2020-04-28 | 东软集团股份有限公司 | Method and device for calculating metadata association relation in enterprise |
CN111078695B (en) * | 2019-11-29 | 2023-11-21 | 东软集团股份有限公司 | Method and device for calculating association relation of metadata in enterprise |
CN112100266A (en) * | 2020-11-05 | 2020-12-18 | 成都中科大旗软件股份有限公司 | Big data map analysis method and system |
CN112100266B (en) * | 2020-11-05 | 2021-02-09 | 成都中科大旗软件股份有限公司 | Big data map analysis method and system |
CN112685405A (en) * | 2020-12-21 | 2021-04-20 | 福建新大陆软件工程有限公司 | Data management method, system, equipment and medium based on knowledge graph |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108197182A (en) | A kind of data atlas analysis system and method | |
CN104899295B (en) | A kind of heterogeneous data source data relation analysis method | |
CN111694858A (en) | Data blood margin analysis method, device, equipment and computer readable storage medium | |
WO2010030392A3 (en) | Interpersonal spacetime interaction system | |
CN102880709B (en) | Data warehouse management system and data warehouse management method | |
CN103646086B (en) | Junk file cleaning method and device | |
CN105488231B (en) | A kind of big data processing method divided based on adaptive table dimension | |
WO2003036426A3 (en) | System and method for managing spending | |
CA2675216A1 (en) | Method and system for information discovery and text analysis | |
Monaco et al. | A super lithium-rich red-clump star in the open cluster Trumpler 5 | |
CN106897285B (en) | Data element extraction and analysis system and data element extraction and analysis method | |
CN110851667A (en) | Integrated analysis method and tool for multi-source large data | |
CN101556666A (en) | Method, device and auditing system for establishing auditing model | |
CN109684402A (en) | One kind being based on big data platform metadata genetic connection implementation method | |
CN105095436A (en) | Automatic modeling method for data of data sources | |
KR102345410B1 (en) | Big data intelligent collecting method and device | |
CN109360106A (en) | Portrait construction method, system, medium and computer system | |
WO2017001887A1 (en) | Data processing system and data processing method | |
US8793272B2 (en) | Query transformation | |
CN107357919A (en) | User behaviors log inquiry system and method | |
US20130086058A1 (en) | Synonym Groups | |
CN109446263A (en) | A kind of data relationship correlating method and device | |
CN101520864A (en) | Method for realizing subject support items in accounting management process | |
US20140351200A1 (en) | Pivot analysis method using condition group | |
Kleinn et al. | The National Forest Inventory in Germany: responding to forest related information needs |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180622 |