CN108197182A - A kind of data atlas analysis system and method - Google Patents

A kind of data atlas analysis system and method

Info

Publication number
CN108197182A
Authority
CN
China
Prior art keywords
data
metadata
unit
atlas
association information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711424043.2A
Other languages
Chinese (zh)
Inventor
阙子扬
曹博睿
掌建军
赵卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yun Yun Polytron Technologies Inc
Original Assignee
Yun Yun Polytron Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-12-25
Filing date
2017-12-25
Publication date
2018-06-22
Application filed by Yun Yun Polytron Technologies Inc
Priority to CN201711424043.2A
Publication of CN108197182A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/2448 Query languages for particular applications; for extensibility, e.g. user defined types
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Automatic Analysis And Handling Materials Therefor (AREA)

Abstract

The invention discloses a data atlas analysis system, comprising: a data processing module, which includes a metadata acquisition unit and a data flow analysis unit connected to the metadata acquisition unit; a data storage module, which includes a data atlas storage unit connected to the data flow analysis unit; and a data display module, which includes a data atlas display unit connected to the data atlas storage unit. The metadata acquisition unit collects metadata from each production database and analytical database. The data flow analysis unit analyzes ETL scripts using the parsing tool JSQLParser and, in combination with the metadata, obtains table-level and field-level association information between the production databases and analytical databases. The association information is stored in the data atlas storage unit, and the data atlas display unit presents the association information graphically. This technical solution can store and display the association relationships between metadata.

Description

A kind of data atlas analysis system and method
Technical field
The present invention relates to the field of network technology, and in particular to a data atlas analysis system and method.
Background
A data atlas analysis system describes the entire process of handling data, including the origin of the data and all subsequent operations that process it. On big data platforms, distributed data sharing is increasingly prominent, and knowing the source of data is especially important for analyzing data, unifying data definitions, managing model changes, assessing the credibility of data, and ensuring data quality. However, no such data atlas analysis system exists in the prior art.
Summary of the invention
In view of the problems in the related art, the present invention proposes a data atlas analysis system and method that can store and display the association relationships between metadata.
The technical solution of the present invention is realized as follows:
According to one aspect of the invention, a data atlas analysis system is provided, comprising: a data processing module, which includes a metadata acquisition unit and a data flow analysis unit connected to the metadata acquisition unit; a data storage module, which includes a data atlas storage unit connected to the data flow analysis unit; and a data display module, which includes a data atlas display unit connected to the data atlas storage unit;
Wherein, the metadata acquisition unit collects metadata from each production database and analytical database; the data flow analysis unit analyzes ETL scripts using the parsing tool JSQLParser and, in combination with the metadata, obtains table-level and field-level association information between the production databases and analytical databases; the association information is stored in the data atlas storage unit; and the data atlas display unit presents the association information graphically.
According to an embodiment of the invention, the metadata acquisition unit is further configured to schedule the metadata collection flow through the workflow management platform Airflow.
According to an embodiment of the invention, the data atlas display unit builds a data dictionary and a data atlas display page using the Spring Boot framework, a Neo4j database, and ECharts to present the association information graphically.
According to an embodiment of the invention, the data atlas storage unit includes a Neo4j database, and the association information is stored in the Neo4j database.
According to an embodiment of the invention, the data storage module further includes a metadata storage unit connected to the metadata acquisition unit, wherein the metadata acquisition unit integrates the collected metadata into the metadata storage unit.
According to an embodiment of the invention, the data display module further includes a metadata management unit, which provides a query and display interface for viewing and querying the metadata.
According to another aspect of the invention, a data atlas analysis method is provided, comprising:
S1, collecting metadata from each production database and analytical database;
S2, analyzing ETL scripts using the parsing tool JSQLParser and, in combination with the metadata, obtaining table-level and field-level association information between the production databases and analytical databases;
S3, presenting the association information graphically.
According to an embodiment of the invention, step S1 further includes: scheduling the metadata collection flow through the workflow management platform Airflow.
According to an embodiment of the invention, step S3 specifically includes: building a data dictionary and a data atlas display page using the Spring Boot framework, a Neo4j database, and ECharts to present the association information graphically.
The present invention uses the parsing tool JSQLParser to analyze ETL scripts and resolve the association relationships between the metadata of the production databases and the analytical databases, and can store and display the association relationships between the data. Each module can work independently and can be freely configured for each system it serves, so the functionality is modular and highly configurable. In addition, by extending the open-source software JSQLParser, the invention can be applied to structured statements of various standards as well as non-standard ones, which gives it high extensibility.
Description of the drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a block diagram of a data atlas analysis system according to an embodiment of the invention;
Fig. 2 is an exemplary flowchart of the data flow analysis unit in Fig. 1 analyzing ETL scripts;
Fig. 3 is a flowchart of a data atlas analysis method according to an embodiment of the invention.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments fall within the scope of protection of the invention.
As shown in Fig. 1, a data atlas analysis system 100 according to an embodiment of the invention comprises: a data processing module 110, which includes a metadata acquisition unit 112 and a data flow analysis unit 114 connected to the metadata acquisition unit 112; a data storage module 120, which includes a data atlas storage unit 124 connected to the data flow analysis unit 114; and a data display module 130, which includes a data atlas display unit 134 connected to the data atlas storage unit 124.
Wherein, the metadata acquisition unit 112 collects metadata from each production database 210 and analytical database 220; the data flow analysis unit 114 analyzes ETL scripts 230 using the parsing tool JSQLParser and, in combination with the metadata, obtains table-level and field-level association information between the production databases 210 and analytical databases 220; the association information is stored in the data atlas storage unit 124; and the data atlas display unit 134 presents the association information graphically.
This technical solution uses the parsing tool JSQLParser to analyze the ETL scripts 230 and resolve the association relationships between the metadata of the production databases 210 and the analytical databases 220, and can store and display the association relationships between the data. Each module can work independently and can be freely configured for each system it serves, so the functionality is modular and highly configurable. In addition, by extending the open-source software JSQLParser, the solution can be applied to structured statements of various standards as well as non-standard ones, which gives it high extensibility.
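The idea of deriving table-level and field-level association information from an ETL statement can be illustrated with a minimal sketch. The patent names JSQLParser, a Java library, as the parsing tool; the Python below substitutes a deliberately simple regular expression and is not the patented implementation. It assumes a single-source statement of the form `INSERT INTO target (columns) SELECT expressions FROM source`, and it does not consult collected metadata, which a fuller implementation would presumably need in order to resolve cases such as `SELECT *`. The function name `extract_lineage` and the sample table names are hypothetical.

```python
import re

# Hypothetical, simplified stand-in for a real SQL parser such as JSQLParser.
# Assumption: the ETL statement has the single-source form
#   INSERT INTO target (col, ...) SELECT expr, ... FROM source
_INSERT_SELECT = re.compile(
    r"INSERT\s+INTO\s+(?P<target>[\w.]+)\s*\((?P<cols>[^)]*)\)\s*"
    r"SELECT\s+(?P<exprs>.+?)\s+FROM\s+(?P<source>[\w.]+)",
    re.IGNORECASE | re.DOTALL,
)


def _split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def extract_lineage(etl_sql):
    """Return table-level and field-level associations, or None if unsupported."""
    match = _INSERT_SELECT.search(etl_sql)
    if match is None:
        return None
    columns = _split_top_level(match.group("cols"))
    expressions = _split_top_level(match.group("exprs"))
    if len(columns) != len(expressions):
        return None
    return {
        "source": match.group("source"),
        "target": match.group("target"),
        # Each pair is (source expression, target column).
        "fields": list(zip(expressions, columns)),
    }


lineage = extract_lineage(
    "INSERT INTO dw.order_fact (order_id, total) "
    "SELECT id, ROUND(amount * 1.1, 2) FROM prod.orders"
)
```

Here `lineage["source"]` and `lineage["target"]` carry the table-level association, while `lineage["fields"]` pairs each target column with the expression that feeds it. Joins, subqueries, and dialect-specific syntax are outside the scope of this sketch and are the kind of cases a real parser is meant to handle.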
With continued reference to Fig. 1, the metadata acquisition unit 112 is further configured to schedule the metadata collection flow through the workflow management platform Airflow.
Preferably, the data atlas display unit 134 builds a data dictionary and a data atlas display page using the Spring Boot framework, a Neo4j database, and ECharts to present the association information graphically. With the high-performance graph database Neo4j and an interface built with Spring Boot and ECharts, the attributes and relationships of each table and column can be displayed flexibly.
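As a rough illustration of how association information might be handed to ECharts, the sketch below builds the option object for an ECharts `graph` series from table-level pairs. The patent does not describe the page's data format, so the function name `build_graph_option`, the input shape (a list of source/target pairs), and the chosen display settings are all assumptions; in the described system this step would sit behind a Spring Boot service rather than in Python.

```python
import json


def build_graph_option(associations):
    """Build an ECharts 'graph' series option from (source, target) table pairs.

    Hypothetical helper: the input shape is an assumption, not the patent's format.
    """
    # Every table that appears on either side of an association becomes a node.
    names = sorted({name for pair in associations for name in pair})
    return {
        "series": [
            {
                "type": "graph",
                "layout": "force",
                "roam": True,
                "label": {"show": True},
                # Draw an arrowhead at the target end to show data flow direction.
                "edgeSymbol": ["none", "arrow"],
                "data": [{"name": name} for name in names],
                "links": [
                    {"source": source, "target": target}
                    for source, target in associations
                ],
            }
        ]
    }


option = build_graph_option(
    [("prod.orders", "dw.order_fact"), ("prod.users", "dw.order_fact")]
)
option_json = json.dumps(option)  # what a page could pass to chart.setOption(...)
```

Serializing to JSON keeps the server side independent of the page: the browser only needs to parse the payload and pass it to the chart instance.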
In the embodiment shown in Fig. 1, the data atlas storage unit 124 includes a Neo4j database, and the association information obtained by the data flow analysis unit 114 is stored in the Neo4j database.
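One way such association information could be written to a graph database is as parameterized Cypher `MERGE` statements, sketched below. The patent does not specify a graph schema, so the labels `Table` and `Column`, the relationship types `FEEDS` and `HAS_COLUMN`, and the function `to_cypher` are hypothetical. The code only builds query strings and parameter dictionaries; it does not connect to a database.

```python
def to_cypher(source, target, fields):
    """Return (query, parameters) pairs for one table-level association and
    its field-level mappings. The graph schema used here is an assumption."""
    statements = [
        (
            # MERGE creates the nodes and the edge only if they do not exist yet,
            # so re-running the same ETL analysis does not duplicate them.
            "MERGE (s:Table {name: $source}) "
            "MERGE (t:Table {name: $target}) "
            "MERGE (s)-[:FEEDS]->(t)",
            {"source": source, "target": target},
        )
    ]
    for expression, column in fields:
        statements.append(
            (
                "MERGE (t:Table {name: $target}) "
                "MERGE (c:Column {table: $target, name: $column}) "
                "MERGE (t)-[:HAS_COLUMN]->(c) "
                "SET c.expression = $expression",
                {"target": target, "column": column, "expression": expression},
            )
        )
    return statements


statements = to_cypher(
    "prod.orders",
    "dw.order_fact",
    [("id", "order_id"), ("ROUND(amount * 1.1, 2)", "total")],
)
```

Passing values as parameters rather than formatting them into the query text avoids quoting problems when an expression contains quotes or braces. Each pair could then be executed with a Neo4j client, for example by passing the query and its parameters to a session's `run` method in the official Python driver.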
The data storage module 120 further includes a metadata storage unit 122 connected to the metadata acquisition unit 112, wherein the metadata acquisition unit 112 integrates the collected metadata into the metadata storage unit 122.
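The collect-then-integrate step can be sketched with Python's built-in `sqlite3` module standing in for both the source database and the metadata store. This is only an illustration under stated assumptions: the catalog queries (`sqlite_master`, `PRAGMA table_info`) are specific to SQLite, other database types expose their catalogs differently, and the store table `column_metadata` and both function names are hypothetical.

```python
import sqlite3


def collect_metadata(conn, database_name):
    """Read table and column metadata from one SQLite database."""
    rows = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, column_type, *_rest in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            rows.append((database_name, table, column, column_type))
    return rows


def integrate(store, rows):
    """Append collected rows to a single metadata table (hypothetical schema)."""
    store.execute(
        "CREATE TABLE IF NOT EXISTS column_metadata ("
        "database_name TEXT, table_name TEXT, column_name TEXT, column_type TEXT)"
    )
    store.executemany("INSERT INTO column_metadata VALUES (?, ?, ?, ?)", rows)
    store.commit()


# Usage: one in-memory source database and one in-memory metadata store.
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
store = sqlite3.connect(":memory:")
integrate(store, collect_metadata(prod, "prod"))
```

Tagging every row with the name of the database it came from lets metadata from several production and analytical databases share one store without name clashes.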
The data display module 130 further includes a metadata management unit 132, which provides a query and display interface for viewing and querying the metadata.
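Querying metadata by search conditions can be pictured as a filter over metadata records. The patent does not define the record structure or the matching rules, so the `ColumnMetadata` fields, the `search` function, and the case-insensitive substring matching below are all assumptions made for the sake of a small runnable sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    """Hypothetical metadata record; the fields are illustrative only."""
    database: str
    table: str
    column: str
    column_type: str


def search(records, **conditions):
    """Return records whose named fields contain every given substring
    (case-insensitive). With no conditions, all records are returned."""
    def matches(record):
        return all(
            str(value).lower() in getattr(record, field).lower()
            for field, value in conditions.items()
        )
    return [record for record in records if matches(record)]


records = [
    ColumnMetadata("prod", "orders", "id", "INTEGER"),
    ColumnMetadata("prod", "orders", "amount", "REAL"),
    ColumnMetadata("dw", "order_fact", "total", "REAL"),
]
real_columns_in_prod = search(records, database="prod", column_type="real")
```

In this example `real_columns_in_prod` contains only the `amount` column of `prod.orders`, because it is the only record satisfying both conditions.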
The operation flow of each unit of the data atlas analysis system 100 is described below with reference to Fig. 1. The metadata acquisition unit 112 supports extracting metadata from various types of production databases 210 and analytical databases 220 and integrating it into the MySQL database of the metadata storage unit 122; the whole flow can be scheduled by the workflow management platform Airflow. With reference to Fig. 2, the data flow analysis unit 114 supports various SQL statements by extending the open-source software JSQLParser, and can be extended according to the grammar of different databases; by parsing the SQL statements and querying the basic metadata, the data flow analysis unit 114 obtains the data source and calculation expression of each table and column. The metadata management unit 132 provides a complete query and display interface, which can be used to view the basic information of the metadata and to find the metadata that satisfies given search conditions. As for the data atlas storage unit 124 and the data atlas display unit 134: the data atlas analysis system 100 can extract all metadata from each production database 210 and analytical database 220 and import it into the Neo4j database through loadNeo4j; the data atlas display unit 134 then presents the graphical association information to the user through a data atlas interface (for example, a web page) built with the Spring Boot framework, the Neo4j database, and ECharts.
In summary, the data atlas analysis system of the invention has the following advantages:
1. Modular functionality and high configurability. Each module of the system can work independently and can be freely configured for each system it serves;
2. High extensibility. By extending the open-source software JSQLParser, the system can be applied to structured statements of various standards as well as non-standard ones;
3. Flexible data atlas display. With the high-performance graph database Neo4j and an interface built with the Spring Boot framework and ECharts, the attributes and relationships of each table and column are displayed flexibly.
As shown in Fig. 3, according to an embodiment of the invention, a data atlas analysis method 30 is also provided, comprising the following steps:
S32, collecting metadata from each production database and analytical database;
S34, analyzing ETL scripts using the parsing tool JSQLParser and, in combination with the metadata, obtaining table-level and field-level association information between the production databases and analytical databases;
S36, presenting the association information graphically.
In one embodiment, step S32 may further include: scheduling the metadata collection flow through the workflow management platform Airflow.
In one embodiment, step S36 specifically includes: building a data dictionary and a data atlas display page using the Spring Boot framework, a Neo4j database, and ECharts to present the association information graphically.
The above are merely preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (9)

1. A data atlas analysis system, characterized by comprising:
a data processing module, the data processing module including a metadata acquisition unit and a data flow analysis unit connected to the metadata acquisition unit;
a data storage module, the data storage module including a data atlas storage unit connected to the data flow analysis unit;
a data display module, the data display module including a data atlas display unit connected to the data atlas storage unit;
wherein the metadata acquisition unit is configured to collect metadata from each production database and analytical database; the data flow analysis unit analyzes ETL scripts using the parsing tool JSQLParser and, in combination with the metadata, obtains table-level and field-level association information between the production databases and analytical databases; the association information is stored in the data atlas storage unit; and the data atlas display unit is configured to present the association information graphically.
2. The data atlas analysis system according to claim 1, characterized in that the metadata acquisition unit is further configured to schedule the metadata collection flow through the workflow management platform Airflow.
3. The data atlas analysis system according to claim 1, characterized in that the data atlas display unit builds a data dictionary and a data atlas display page using the Spring Boot framework, a Neo4j database, and ECharts to present the association information graphically.
4. The data atlas analysis system according to claim 1, characterized in that the data atlas storage unit includes a Neo4j database, and the association information is stored in the Neo4j database.
5. The data atlas analysis system according to claim 1, characterized in that the data storage module further includes a metadata storage unit connected to the metadata acquisition unit, wherein the metadata acquisition unit integrates the collected metadata into the metadata storage unit.
6. The data atlas analysis system according to claim 5, characterized in that the data display module further includes a metadata management unit, the metadata management unit being configured to provide a query and display interface for viewing and querying the metadata.
7. A data atlas analysis method, characterized by comprising:
S1, collecting metadata from each production database and analytical database;
S2, analyzing ETL scripts using the parsing tool JSQLParser and, in combination with the metadata, obtaining table-level and field-level association information between the production databases and analytical databases;
S3, presenting the association information graphically.
8. The data atlas analysis method according to claim 7, characterized in that step S1 further includes: scheduling the metadata collection flow through the workflow management platform Airflow.
9. The data atlas analysis method according to claim 7, characterized in that step S3 specifically includes: building a data dictionary and a data atlas display page using the Spring Boot framework, a Neo4j database, and ECharts to present the association information graphically.
CN201711424043.2A 2017-12-25 2017-12-25 A kind of data atlas analysis system and method Pending CN108197182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711424043.2A CN108197182A (en) 2017-12-25 2017-12-25 A kind of data atlas analysis system and method

Publications (1)

Publication Number Publication Date
CN108197182A (en) 2018-06-22

Family

ID=62583865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711424043.2A Pending CN108197182A (en) 2017-12-25 2017-12-25 A kind of data atlas analysis system and method

Country Status (1)

Country Link
CN (1) CN108197182A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109446263A (en) * 2018-11-02 2019-03-08 成都四方伟业软件股份有限公司 A kind of data relationship correlating method and device
CN109739894A (en) * 2019-01-04 2019-05-10 深圳前海微众银行股份有限公司 Supplement method, apparatus, equipment and the storage medium of metadata description
CN109840267A (en) * 2019-03-01 2019-06-04 成都品果科技有限公司 A kind of ETL process system and method
CN110019252A (en) * 2019-04-16 2019-07-16 成都四方伟业软件股份有限公司 The method, apparatus and electronic equipment of information processing
CN111078695A (en) * 2019-11-29 2020-04-28 东软集团股份有限公司 Method and device for calculating metadata association relation in enterprise
CN112100266A (en) * 2020-11-05 2020-12-18 成都中科大旗软件股份有限公司 Big data map analysis method and system
CN112685405A (en) * 2020-12-21 2021-04-20 福建新大陆软件工程有限公司 Data management method, system, equipment and medium based on knowledge graph

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080168420A1 (en) * 2006-03-17 2008-07-10 The Mitre Corporation Semantic system for integrating software components
CN101770479A (en) * 2008-12-31 2010-07-07 北京亿阳信通软件研究院有限公司 Association relationship query method and device
CN104216888A (en) * 2013-05-30 2014-12-17 中国电信股份有限公司 Data processing task relation setting method and system
CN107273079A (en) * 2017-05-18 2017-10-20 网易(杭州)网络有限公司 Related information is shown, collection of illustrative plates processing method, device, medium, equipment and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
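李克学 (Li Kexue): "Metadata parsing and processing system for a bank", China Master's Theses Full-text Database, Information Science and Technology series *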

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109446263A (en) * 2018-11-02 2019-03-08 成都四方伟业软件股份有限公司 A kind of data relationship correlating method and device
CN109739894A (en) * 2019-01-04 2019-05-10 深圳前海微众银行股份有限公司 Supplement method, apparatus, equipment and the storage medium of metadata description
CN109739894B (en) * 2019-01-04 2022-12-09 深圳前海微众银行股份有限公司 Method, device, equipment and storage medium for supplementing metadata description
CN109840267A (en) * 2019-03-01 2019-06-04 成都品果科技有限公司 A kind of ETL process system and method
CN109840267B (en) * 2019-03-01 2023-04-21 成都品果科技有限公司 Data ETL system and method
CN110019252A (en) * 2019-04-16 2019-07-16 成都四方伟业软件股份有限公司 The method, apparatus and electronic equipment of information processing
CN111078695A (en) * 2019-11-29 2020-04-28 东软集团股份有限公司 Method and device for calculating metadata association relation in enterprise
CN111078695B (en) * 2019-11-29 2023-11-21 东软集团股份有限公司 Method and device for calculating association relation of metadata in enterprise
CN112100266A (en) * 2020-11-05 2020-12-18 成都中科大旗软件股份有限公司 Big data map analysis method and system
CN112100266B (en) * 2020-11-05 2021-02-09 成都中科大旗软件股份有限公司 Big data map analysis method and system
CN112685405A (en) * 2020-12-21 2021-04-20 福建新大陆软件工程有限公司 Data management method, system, equipment and medium based on knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180622