CN104142980A - Big data-based metadata model management system and method - Google Patents

Info

Publication number
CN104142980A
Authority
CN
China
Prior art keywords
metadata
data
data source
extraction
metadata schema
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410336111.XA
Other languages
Chinese (zh)
Other versions
CN104142980B (en)
Inventor
闵圣捷
谢朝阳
童晓渝
王慧
赵斌
靳永超
邹云
丁星
武静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CLP SECTION HUAYUN INFORMATION TECHNOLOGY Co Ltd
Original Assignee
CLP SECTION HUAYUN INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CLP SECTION HUAYUN INFORMATION TECHNOLOGY Co Ltd
Priority to CN201410336111.XA
Publication of CN104142980A
Application granted
Publication of CN104142980B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Abstract

The invention provides a big data-based metadata model management system and method. The management method comprises the following steps: 1, determining the structure type of a big data source; 2, extracting metadata from a structured data source and proceeding to step 4; 3, extracting metadata from an unstructured data source and proceeding to step 4; 4, defining the relationships among the extracted metadata, forming the corresponding metadata model, and proceeding to step 5; 5, storing the formed metadata model in a database in graph form, and proceeding to step 6; 6, publishing the metadata according to the defined metadata model and business requirements so that external systems can use it. The invention manages data of different types, can build a unified metadata system over heterogeneous data sources, and provides functions for storing, managing, and using that system.

Description

Big data-based metadata model management system and management method
Technical Field
The present invention relates to a metadata model management system and management method in the field of telecommunication technology, and in particular to a metadata model management system and management method based on big data.
Background
People use the term big data to describe and define the massive volume of data produced in the age of information explosion, and to name the related technological developments and innovations. Data is growing rapidly and shapes the future development of enterprises. Although enterprises may not yet recognize the hidden problems that explosive data growth brings, over time they will increasingly recognize the importance of data.
The big data era poses new challenges to humanity's ability to manage data. As the Internet of Things and mobile terminals continuously generate massive volumes of data of many types, managing these different types of data has become a difficult problem. The big data-based metadata model management method of the present invention is designed for this environment and addresses the problem of managing different types of big data.
Summary of the Invention
In view of the defects in the prior art, the object of the present invention is to provide a big data-based metadata model management system and management method that manage data of different types, can build a unified metadata system over heterogeneous data sources, and provide functions for storing, managing, and using that system.
According to one aspect of the present invention, a big data-based metadata model management method is provided, characterized in that it comprises the following steps: step 1, determining the structure type of the big data source, that is, whether it is a structured data source or an unstructured data source; if it is a structured data source, performing step 2; if it is an unstructured data source, performing step 3; step 2, extracting metadata from the structured data source, then performing step 4; step 3, extracting metadata from the unstructured data source, then performing step 4; step 4, defining the relationships among the extracted metadata and forming the corresponding metadata model, then performing step 5; step 5, storing the formed metadata model in a database in graph form, then performing step 6; step 6, publishing metadata according to the defined metadata model and business requirements, so that external systems can use the metadata.
Preferably, the structured data source comprises relational databases and document formats, and the unstructured data source comprises NoSQL databases.
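As a minimal sketch of this classification, the snippet below maps a data source label to one of the two categories. The function name `classify_source`, the `SourceType` enum, and the lookup sets are hypothetical; the relational and document-format examples follow those named in the detailed description, while the NoSQL product names are assumptions added purely for illustration.

```python
from enum import Enum


class SourceType(Enum):
    STRUCTURED = "structured"
    UNSTRUCTURED = "unstructured"


# Hypothetical lookup tables. The relational and document examples follow the
# description; the NoSQL product names are assumptions for illustration only.
RELATIONAL = {"oracle", "mysql", "db2"}
DOCUMENT_FORMATS = {"csv", "xlsx"}
NOSQL = {"mongodb", "hbase", "cassandra"}


def classify_source(kind: str) -> SourceType:
    """Return the structure type of a data source given its kind label."""
    key = kind.strip().lower()
    if key in RELATIONAL or key in DOCUMENT_FORMATS:
        return SourceType.STRUCTURED
    if key in NOSQL:
        return SourceType.UNSTRUCTURED
    raise ValueError(f"unknown data source kind: {kind!r}")
```

Raising an error for unknown labels, rather than guessing a category, keeps a misclassified source from silently taking the wrong extraction path.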
Preferably, in step 2 and step 3, user-defined metadata is extracted manually and the metadata format is converted into a format that conforms to the JSON data standard.
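The conversion to JSON might look like the following sketch. The patent does not specify a JSON layout, so the key names (`source`, `entity`, `fields`) and the helper `to_json_metadata` are assumptions chosen for illustration.

```python
import json


def to_json_metadata(source_name: str, entity: str, fields: dict) -> str:
    """Serialize manually collected, user-defined metadata as a JSON document.

    The key names ("source", "entity", "fields") are hypothetical; the text
    only requires that the result conform to the JSON data standard.
    """
    record = {
        "source": source_name,
        "entity": entity,
        "fields": [{"name": n, "type": t} for n, t in fields.items()],
    }
    # ensure_ascii=False keeps non-ASCII field names readable in the output.
    return json.dumps(record, ensure_ascii=False, sort_keys=True)
```

For example, `to_json_metadata("crm_db", "customer", {"id": "int"})` yields a JSON object describing one entity and its single field.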
Preferably, step 5 first parses the JSON data format of the metadata model into a graph representation consisting of nodes and node relationships, and then stores the nodes and node relationships in a graph database.
The present invention also provides a big data-based metadata model management system, characterized in that it comprises:
A judging module, for determining the structure type of the big data source;
An extraction module, for extracting metadata from a structured data source or an unstructured data source;
A model definition and formation module, for defining the relationships among the extracted metadata and forming the corresponding metadata model;
A storage module, for storing the metadata model produced by the model definition and formation module in a database;
A publishing module, for publishing metadata.
Compared with the prior art, the present invention has the following beneficial effects. First, according to business requirements, the invention directly extracts, fuses, shares, and merges metadata information across databases of different types and in different geographic locations, and performs heterogeneous processing for metadata modeling; this heterogeneous processing provides effective management over both structured and unstructured data sources. Second, the invention provides a basic unified data standard for mining and analyzing massive data, and lays a foundation for building an industry semantic base. Third, the invention provides users with a complete set of metadata management functions. Fourth, the invention provides fast, efficient, and accurate storage of metadata and metadata models for big data processing. Fifth, storing the metadata model in graph form yields fast queries and a clear presentation, which clearly shows how the metadata model was built and extended. Sixth, the invention establishes a unified, stable metadata warehouse for big data processing.
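To illustrate why storing the model as nodes and relationships suits relationship lookups (the fifth effect above), here is a small in-memory sketch. It is not the patent's storage engine and makes no performance claim: `MetadataGraph` and its methods are hypothetical names, and a real deployment would use a graph database rather than a Python dictionary.

```python
from collections import defaultdict, deque


class MetadataGraph:
    """Toy adjacency-list graph of metadata items (illustrative only)."""

    def __init__(self):
        self._edges = defaultdict(list)

    def relate(self, src: str, relation: str, dst: str) -> None:
        # Record a directed, labeled relationship between two metadata items.
        self._edges[src].append((relation, dst))

    def reachable(self, start: str) -> list:
        # Breadth-first traversal: every item linked to `start`, nearest first.
        seen = {start}
        order = []
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for _relation, dst in self._edges.get(node, []):
                if dst not in seen:
                    seen.add(dst)
                    order.append(dst)
                    queue.append(dst)
        return order
```

Following a chain of relationships is a direct traversal here, whereas a relational layout would typically need one join per hop.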
Brief Description of the Drawings
Other features, objects, and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
Fig. 1 is a flowchart of the big data-based metadata model management method of the present invention.
Fig. 2 is a block diagram of the big data-based metadata model management system of the present invention.
Detailed Description
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit it in any form. It should be noted that those skilled in the art can make variations and improvements without departing from the inventive concept, all of which fall within the protection scope of the present invention.
As shown in Fig. 1, the big data-based metadata model management method of the present invention comprises the following steps:
Step 1, determining the structure type of the big data source, that is, whether it is a structured data source or an unstructured data source; if it is structured, performing step 2; if it is unstructured, performing step 3. Structured data sources comprise relational databases and document formats: relational databases such as ORACLE, MYSQL, and DB2, and document formats such as CSV and XLSX. Unstructured data sources comprise NoSQL databases (NoSQL refers generally to non-relational databases). Step 1 is carried out by the judging module. A structured data source is characterized by data that is logically represented in two-dimensional table structures, and a semantic type standard for such data sources is formulated from this characteristic. An unstructured data source is characterized by content such as documents, pictures, forms, images, and audio, and a semantic type standard for such data sources is formulated from those characteristics.
Step 2, extracting metadata from the structured data source, then performing step 4. Step 2 is carried out by the extraction module on the structured data source.
Step 3, extracting metadata from the unstructured data source, then performing step 4. Step 3 is carried out by the extraction module on the unstructured data source.
Step 4, defining the relationships among the extracted metadata and forming the corresponding metadata model, then performing step 5. Specifically, metadata modeling defines the various relationships among the different extracted metadata items, with different relationships established for different businesses, so that the metadata items and their relationships together form the corresponding metadata model. Step 4 is carried out by the model definition and formation module.
Step 5, storing the formed metadata model in a database in graph form, then performing step 6. Step 5 is carried out by the storage module.
Step 6, publishing metadata according to the defined metadata model and business requirements, so that external systems can use the metadata. Step 6 is carried out by the publishing module.
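The control flow of the six steps can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the dictionary layout of `source`, the function names, and the use of a plain list as a stand-in for the graph database are all hypothetical.

```python
def extract_structured(source: dict) -> list:
    # Step 2: metadata of a structured source, represented here by column names.
    return [{"name": c, "origin": source["name"]} for c in source["columns"]]


def extract_unstructured(source: dict) -> list:
    # Step 3: metadata of an unstructured source, represented here by user tags.
    return [{"name": t, "origin": source["name"]} for t in source["tags"]]


def build_model(metadata: list, relations: list) -> dict:
    # Step 4: combine metadata items with business-defined relationships.
    return {"items": metadata, "relations": relations}


def run_pipeline(source: dict, relations: list, store: list) -> dict:
    # Step 1: branch on the structure type of the data source.
    if source["structured"]:
        metadata = extract_structured(source)
    else:
        metadata = extract_unstructured(source)
    model = build_model(metadata, relations)
    store.append(model)          # Step 5: stand-in for graph-database storage.
    return {"published": model}  # Step 6: stand-in for publishing to consumers.
```

Both extraction branches return the same item shape, which is what lets steps 4 to 6 stay independent of the source type.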
In step 2 and step 3, user-defined metadata is extracted manually and the metadata format is converted into a format that conforms to the JSON (JavaScript Object Notation, a lightweight data interchange format) data standard. The benefit of this standard is that it defines the semantics of the metadata and so avoids semantic conflicts. Step 5 first parses the JSON data format of the metadata model into a graph representation consisting of nodes and node relationships, and then stores the nodes and node relationships in a graph database. Metadata is a kind of binary information; it is descriptive information about data and information resources.
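The parsing half of step 5 could be sketched as below, assuming a JSON layout with an `entities` array and a `relations` array. That layout, the key names, and the function `json_to_graph` are assumptions made for illustration; the patent does not define the JSON structure or name a specific graph database, so the final write to the database is left out.

```python
import json


def json_to_graph(model_json: str):
    """Parse a JSON metadata model into node and relationship lists.

    Assumed (hypothetical) layout:
      {"entities": [{"name": ..., "properties": {...}}, ...],
       "relations": [{"from": ..., "type": ..., "to": ...}, ...]}
    """
    model = json.loads(model_json)
    nodes = [
        {"id": e["name"], "properties": e.get("properties", {})}
        for e in model["entities"]
    ]
    known = {n["id"] for n in nodes}
    relationships = []
    for r in model.get("relations", []):
        # Reject relationships whose endpoints are not declared entities.
        if r["from"] not in known or r["to"] not in known:
            raise ValueError(f"relation refers to unknown entity: {r}")
        relationships.append((r["from"], r["type"], r["to"]))
    return nodes, relationships
```

The returned lists map directly onto what a graph store expects, one record per node and one per relationship, so the storage step reduces to iterating over both.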
As shown in Fig. 2, the big data-based metadata model management system of the present invention comprises:
A judging module, for determining the structure type of the big data source;
An extraction module, for extracting metadata from a structured data source or an unstructured data source;
A model definition and formation module, for defining the relationships among the extracted metadata and forming the corresponding metadata model;
A storage module, for storing the metadata model produced by the model definition and formation module in a database;
A publishing module, for publishing metadata.
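One way to picture the five modules is as interchangeable components wired into a single system object. The sketch below is an assumption about how such a composition might look; the class `MetadataSystem`, its field names, and the callable signatures are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MetadataSystem:
    judge: Callable[[dict], str]          # judging module
    extract: Callable[[dict, str], list]  # extraction module
    define_model: Callable[[list], dict]  # model definition and formation module
    store: Callable[[dict], None]         # storage module
    publish: Callable[[dict], dict]       # publishing module

    def process(self, source: dict) -> dict:
        # Each module sees only the output of the module before it.
        source_type = self.judge(source)
        metadata = self.extract(source, source_type)
        model = self.define_model(metadata)
        self.store(model)
        return self.publish(model)
```

Passing the modules in as callables means, for instance, that the storage module could be swapped between an in-memory list and a graph database client without touching the other four.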
In summary, the present invention manages data of different types and can build a unified metadata system over heterogeneous data sources; this metadata system covers the extraction, modeling, storage, querying, and management of heterogeneous metadata.
Specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to these specific embodiments; those skilled in the art can make various variations or modifications within the scope of the claims without affecting the substance of the invention.

Claims (5)

1. A big data-based metadata model management method, characterized in that it comprises the following steps:
Step 1, determining the structure type of the big data source, that is, whether it is a structured data source or an unstructured data source; if it is a structured data source, performing step 2; if it is an unstructured data source, performing step 3;
Step 2, extracting metadata from the structured data source, then performing step 4;
Step 3, extracting metadata from the unstructured data source, then performing step 4;
Step 4, defining the relationships among the extracted metadata and forming the corresponding metadata model, then performing step 5;
Step 5, storing the formed metadata model in a database in graph form, then performing step 6;
Step 6, publishing metadata according to the defined metadata model and business requirements, so that external systems can use the metadata.
2. The big data-based metadata model management method according to claim 1, characterized in that the structured data source comprises relational databases and document formats, and the unstructured data source comprises NoSQL databases.
3. The big data-based metadata model management method according to claim 1, characterized in that, in step 2 and step 3, user-defined metadata is extracted manually and the metadata format is converted into a format that conforms to the JSON data standard.
4. The big data-based metadata model management method according to claim 3, characterized in that step 5 first parses the JSON data format of the metadata model into a graph representation consisting of nodes and node relationships, and then stores the nodes and node relationships in a graph database.
5. A big data-based metadata model management system, characterized in that it comprises:
A judging module, for determining the structure type of the big data source;
An extraction module, for extracting metadata from a structured data source or an unstructured data source;
A model definition and formation module, for defining the relationships among the extracted metadata and forming the corresponding metadata model;
A storage module, for storing the metadata model produced by the model definition and formation module in a database;
A publishing module, for publishing metadata.
CN201410336111.XA 2014-07-15 2014-07-15 Metadata schema management system and management method based on big data Active CN104142980B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410336111.XA CN104142980B (en) 2014-07-15 2014-07-15 Metadata schema management system and management method based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410336111.XA CN104142980B (en) 2014-07-15 2014-07-15 Metadata schema management system and management method based on big data

Publications (2)

Publication Number Publication Date
CN104142980A true CN104142980A (en) 2014-11-12
CN104142980B CN104142980B (en) 2017-11-17

Family

ID=51852154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410336111.XA Active CN104142980B (en) 2014-07-15 2014-07-15 Metadata schema management system and management method based on big data

Country Status (1)

Country Link
CN (1) CN104142980B (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103886004A (en) * 2013-11-29 2014-06-25 北京吉威数源信息技术有限公司 Material data modeling processing method
CN104580474A (en) * 2015-01-13 2015-04-29 深圳市融创天下科技有限公司 Urban operation sign big data visualization multi-screen interaction display platform and method
CN105574086A (en) * 2015-12-10 2016-05-11 天津海量信息技术有限公司 Artificial intelligence extraction method of internet unstructured data fields
CN105701181A (en) * 2016-01-06 2016-06-22 中电科华云信息技术有限公司 Dynamic heterogeneous metadata acquisition method and system
CN105912636A (en) * 2016-04-08 2016-08-31 金蝶软件(中国)有限公司 Map/Reduce based ETL data processing method and device
CN106557569A (en) * 2016-11-14 2017-04-05 用友网络科技股份有限公司 Introduction method and gatherer based on the non-structured document of meta-model
CN106886535A (en) * 2015-12-16 2017-06-23 大唐软件技术股份有限公司 A kind of data pick-up method and apparatus for being adapted to multiple data sources
CN107291875A (en) * 2017-06-19 2017-10-24 华中科技大学 A kind of metadata organization management method and system based on metadata graph
CN107633181A (en) * 2017-09-12 2018-01-26 复旦大学 The data model and its operation system of data-oriented opening and shares
CN108320066A (en) * 2017-01-18 2018-07-24 重庆邮电大学 A kind of Explore of Unified Management Ideas for realizing different production lines based on metadata
CN108733727A (en) * 2017-04-25 2018-11-02 华为技术有限公司 A kind of inquiry processing method, data source registration method and query engine
CN109242259A (en) * 2018-08-10 2019-01-18 华迪计算机集团有限公司 A kind of data integrating method and system based on basic data resources bank
CN109542960A (en) * 2018-10-18 2019-03-29 国网内蒙古东部电力有限公司信息通信分公司 A kind of data analysis domain system
CN109710602A (en) * 2018-12-26 2019-05-03 中科曙光国际信息产业有限公司 Data model detection method and device
CN109739893A (en) * 2018-12-28 2019-05-10 上海连尚网络科技有限公司 A kind of metadata management method, equipment and computer-readable medium
CN109857822A (en) * 2018-12-29 2019-06-07 国家开发银行 Meta-model conversion method and management system based on chart database
CN109871417A (en) * 2018-12-29 2019-06-11 国家开发银行 The metadata visualization map constructing method and system of knowledge based map
CN110209380A (en) * 2019-05-30 2019-09-06 上海直真君智科技有限公司 A kind of unified dynamic metadata processing method towards big data isomery model
CN112115183A (en) * 2020-09-18 2020-12-22 广州锦行网络科技有限公司 Honeypot system threat information analysis method based on graph
US11494611B2 (en) 2019-07-31 2022-11-08 International Business Machines Corporation Metadata-based scientific data characterization driven by a knowledge database at scale
US11703404B2 (en) 2019-06-17 2023-07-18 Colorado State University Research Foundation Device for automated crop root sampling

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188887B (en) * 2018-09-26 2022-11-08 第四范式(北京)技术有限公司 Data management method and device for machine learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070233680A1 (en) * 2006-03-31 2007-10-04 Microsoft Corporation Auto-generating reports based on metadata
CN101908176A (en) * 2010-08-02 2010-12-08 国电南瑞科技股份有限公司 Method for modeling on basis of power information data and applying metadata management
CN103246753A (en) * 2013-05-30 2013-08-14 安徽皖通科技股份有限公司 Method for generating entity metadata model according to database structure

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070233680A1 (en) * 2006-03-31 2007-10-04 Microsoft Corporation Auto-generating reports based on metadata
CN101908176A (en) * 2010-08-02 2010-12-08 国电南瑞科技股份有限公司 Method for modeling on basis of power information data and applying metadata management
CN103246753A (en) * 2013-05-30 2013-08-14 安徽皖通科技股份有限公司 Method for generating entity metadata model according to database structure

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐小天 等: "基于 JSON 的电力企业业务系统非结构化数据抽取方法", 《华北电力技术》 *

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103886004A (en) * 2013-11-29 2014-06-25 北京吉威数源信息技术有限公司 Material data modeling processing method
CN103886004B (en) * 2013-11-29 2017-06-09 北京吉威时代软件股份有限公司 A kind of data type data modeling processing method
CN104580474A (en) * 2015-01-13 2015-04-29 深圳市融创天下科技有限公司 Urban operation sign big data visualization multi-screen interaction display platform and method
CN105574086A (en) * 2015-12-10 2016-05-11 天津海量信息技术有限公司 Artificial intelligence extraction method of internet unstructured data fields
CN106886535A (en) * 2015-12-16 2017-06-23 大唐软件技术股份有限公司 A kind of data pick-up method and apparatus for being adapted to multiple data sources
CN105701181A (en) * 2016-01-06 2016-06-22 中电科华云信息技术有限公司 Dynamic heterogeneous metadata acquisition method and system
CN105912636B (en) * 2016-04-08 2020-04-07 金蝶软件(中国)有限公司 Map/Reduce-based ETL data processing method and device
CN105912636A (en) * 2016-04-08 2016-08-31 金蝶软件(中国)有限公司 Map/Reduce based ETL data processing method and device
CN106557569B (en) * 2016-11-14 2020-07-03 用友网络科技股份有限公司 Method and device for importing unstructured document based on meta-model
CN106557569A (en) * 2016-11-14 2017-04-05 用友网络科技股份有限公司 Introduction method and gatherer based on the non-structured document of meta-model
CN108320066A (en) * 2017-01-18 2018-07-24 重庆邮电大学 A kind of Explore of Unified Management Ideas for realizing different production lines based on metadata
CN108733727A (en) * 2017-04-25 2018-11-02 华为技术有限公司 A kind of inquiry processing method, data source registration method and query engine
US11907213B2 (en) 2017-04-25 2024-02-20 Huawei Technologies Co., Ltd. Query processing method, data source registration method, and query engine
US11366808B2 (en) 2017-04-25 2022-06-21 Huawei Technologies Co., Ltd. Query processing method, data source registration method, and query engine
CN108733727B (en) * 2017-04-25 2021-11-30 华为技术有限公司 Query processing method, data source registration method and query engine
CN107291875B (en) * 2017-06-19 2019-12-06 华中科技大学 Metadata organization management method and system based on metadata graph
CN107291875A (en) * 2017-06-19 2017-10-24 华中科技大学 A kind of metadata organization management method and system based on metadata graph
CN107633181B (en) * 2017-09-12 2021-01-26 复旦大学 Data model realization method facing data open sharing and operation system thereof
CN107633181A (en) * 2017-09-12 2018-01-26 复旦大学 The data model and its operation system of data-oriented opening and shares
CN109242259B (en) * 2018-08-10 2020-12-11 华迪计算机集团有限公司 Data integration method and system based on basic data resource library
CN109242259A (en) * 2018-08-10 2019-01-18 华迪计算机集团有限公司 A kind of data integrating method and system based on basic data resources bank
CN109542960A (en) * 2018-10-18 2019-03-29 国网内蒙古东部电力有限公司信息通信分公司 A kind of data analysis domain system
CN109710602A (en) * 2018-12-26 2019-05-03 中科曙光国际信息产业有限公司 Data model detection method and device
CN109739893A (en) * 2018-12-28 2019-05-10 上海连尚网络科技有限公司 A kind of metadata management method, equipment and computer-readable medium
CN109871417A (en) * 2018-12-29 2019-06-11 国家开发银行 The metadata visualization map constructing method and system of knowledge based map
CN109857822A (en) * 2018-12-29 2019-06-07 国家开发银行 Meta-model conversion method and management system based on chart database
CN110209380A (en) * 2019-05-30 2019-09-06 上海直真君智科技有限公司 A kind of unified dynamic metadata processing method towards big data isomery model
US11703404B2 (en) 2019-06-17 2023-07-18 Colorado State University Research Foundation Device for automated crop root sampling
US11494611B2 (en) 2019-07-31 2022-11-08 International Business Machines Corporation Metadata-based scientific data characterization driven by a knowledge database at scale
CN112115183A (en) * 2020-09-18 2020-12-22 广州锦行网络科技有限公司 Honeypot system threat information analysis method based on graph

Also Published As

Publication number Publication date
CN104142980B (en) 2017-11-17

Similar Documents

Publication Publication Date Title
CN104142980A (en) Big data-based metadata model management system and method
CN110941612B (en) Autonomous data lake construction system and method based on associated data
US9400835B2 (en) Weighting metric for visual search of entity-relationship databases
CN111026874A (en) Data processing method and server of knowledge graph
CN112364046B (en) Knowledge graph-based main data management method in heterogeneous environment
US9485306B2 (en) Methods, apparatuses, and computer program products for facilitating a data interchange protocol
WO2021032146A1 (en) Metadata management method and apparatus, device, and storage medium
CN103116574B (en) From the method for natural language text excavation applications process body
CN110990467B (en) BIM model format conversion method and conversion system
US20150293947A1 (en) Validating relationships between entities in a data model
US11449477B2 (en) Systems and methods for context-independent database search paths
CN110275962B (en) Method and apparatus for outputting information
CN108305306B (en) Animation data organization method based on sketch interaction
Singh et al. Big data-a review
Gopalakrishnan et al. Big Data in building information modeling research: survey and exploratory text mining
CN113609100B (en) Data storage method, data query device and electronic equipment
Kim et al. Customer preference analysis based on SNS data
US8694918B2 (en) Conveying hierarchical elements of a user interface
CN113326345A (en) Knowledge graph analysis and application method, platform and equipment based on dynamic ontology
CN105912723A (en) Storage method of custom field
CN111813555B (en) Super-fusion infrastructure layered resource management system based on internet technology
KR20230142799A (en) Diagram of child nodes with multiple parent nodes
CN113468340A (en) Construction system and construction method of industrial knowledge map
CN106557564A (en) A kind of object data analysis method and device
CN109684329A (en) A kind of method for managing resource based on data center apparatus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant