CN104142980A - Big data-based metadata model management system and method - Google Patents
- Publication number
- CN104142980A (application CN201410336111.XA)
- Authority
- CN
- China
- Prior art keywords
- metadata
- data
- data source
- extraction
- metadata schema
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/907—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
Abstract
The invention provides a big data-based metadata model management system and method. The method comprises the following steps: 1, determining whether a big data source is structured or unstructured; 2, extracting metadata from a structured data source, then proceeding to step 4; 3, extracting metadata from an unstructured data source, then proceeding to step 4; 4, defining the relationships among the extracted metadata and forming the corresponding metadata model, then proceeding to step 5; 5, storing the metadata model in a database in graph form, then proceeding to step 6; 6, publishing the metadata according to business requirements, based on the defined metadata model, so that external systems can use it. The invention manages data of different types, builds a unified metadata system over heterogeneous data sources, and provides functions for storing, managing and using that system.
Description
Technical field
The present invention relates to a metadata model management system and management method in the field of telecommunications technology, and in particular to a metadata model management system and management method based on big data.
Background
The term big data is used to describe and define the massive volumes of data produced in the age of the information explosion, and to name the associated technical developments and innovations. Data is growing rapidly and shapes the future development of an enterprise. Although enterprises may not yet recognize the problems that explosive data growth will bring, over time they will increasingly recognize how important data is to them.
The big data era poses new challenges to our ability to control data. The Internet of Things and mobile terminals continuously generate massive amounts of data of many types, and managing these different types of data has become a difficult problem. The big data-based metadata model management method of the present invention is designed for this environment and addresses the problem of managing big data of different types.
Summary of the invention
In view of the defects in the prior art, the object of the present invention is to provide a big data-based metadata model management system and management method that manages data of different types, builds a unified metadata system over heterogeneous data sources, and provides functions for storing, managing and using that system.
According to one aspect of the present invention, a big data-based metadata model management method is provided, characterized in that it comprises the following steps:
Step 1: determine the structural type of the big data source, that is, whether it is a structured or an unstructured data source; if structured, perform step 2; if unstructured, perform step 3.
Step 2: extract metadata from the structured data source, then perform step 4.
Step 3: extract metadata from the unstructured data source, then perform step 4.
Step 4: define the relationships among the extracted metadata and form the corresponding metadata model, then perform step 5.
Step 5: store the resulting metadata model in a database in graph form, then perform step 6.
Step 6: based on the defined metadata model, publish the metadata according to business requirements so that external systems can use it.
Preferably, the structured data sources include relational databases and document formats, and the unstructured data sources include NoSQL databases.
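The determination in step 1 can be illustrated with a minimal Python sketch. It assumes each data source is identified by a short type label; the lookup sets, the `SourceKind` enum and the `classify_source` function are hypothetical names introduced only for illustration, and the NoSQL engine names are assumed examples that the patent does not list.

```python
from enum import Enum


class SourceKind(Enum):
    STRUCTURED = "structured"
    UNSTRUCTURED = "unstructured"


# Illustrative lookup sets, not an exhaustive registry.
RELATIONAL_ENGINES = {"oracle", "mysql", "db2"}
DOCUMENT_FORMATS = {"csv", "xlsx"}
NOSQL_ENGINES = {"mongodb", "cassandra", "hbase"}  # assumed examples of NoSQL stores


def classify_source(descriptor: str) -> SourceKind:
    """Return the structural kind of a data source from a short type label."""
    label = descriptor.strip().lower()
    if label in RELATIONAL_ENGINES or label in DOCUMENT_FORMATS:
        return SourceKind.STRUCTURED  # routed to step 2
    if label in NOSQL_ENGINES:
        return SourceKind.UNSTRUCTURED  # routed to step 3
    raise ValueError(f"unknown data source type: {descriptor!r}")
```

Raising on an unknown label, rather than guessing, keeps an unrecognized source from being routed to the wrong extraction step.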
Preferably, steps 2 and 3 extract user-defined metadata manually and convert the metadata into a format that conforms to the JSON data standard.
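As a rough illustration of extraction from structured sources followed by conversion to JSON, the sketch below reads column metadata from a relational table and from a CSV header. It is an assumption-laden stand-in: an in-memory SQLite database plays the role of the relational source, the choice of recorded attributes is hard-coded in place of the user's manual definition, and the JSON layout (`entity`, `source_type`, `fields`) is hypothetical rather than taken from the patent.

```python
import csv
import io
import json
import sqlite3


def extract_table_metadata(conn: sqlite3.Connection, table: str) -> dict:
    """Describe one table's columns as a JSON-serializable dict."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return {
        "entity": table,
        "source_type": "relational",
        "fields": [
            {"name": name, "type": col_type, "primary_key": bool(pk)}
            for _cid, name, col_type, _notnull, _default, pk in rows
        ],
    }


def extract_csv_metadata(name: str, text: str) -> dict:
    """Describe a CSV document by the column names in its header row."""
    header = next(csv.reader(io.StringIO(text)))
    return {
        "entity": name,
        "source_type": "document",
        "fields": [{"name": column} for column in header],
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
customer_json = json.dumps(extract_table_metadata(conn, "customer"), sort_keys=True)
orders_json = json.dumps(
    extract_csv_metadata("orders", "order_id,customer_id\n1,7\n"), sort_keys=True
)
```

Emitting both sources in the same JSON shape is what lets later steps treat relational tables and document files uniformly.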
Preferably, step 5 first parses the JSON representation of the metadata model into a graph representation of nodes and node relationships, and then stores the nodes and node relationships in a graph database.
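The parse-then-store behaviour of step 5 might look like the following sketch. The patent does not name a graph database or a JSON layout, so `GraphStore` is a hypothetical in-memory stand-in and the `entities` / `relationships` keys are assumptions made for the example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class GraphStore:
    """In-memory stand-in for a graph database (hypothetical, for illustration)."""

    nodes: dict = field(default_factory=dict)  # node id -> properties
    edges: list = field(default_factory=list)  # (source id, relation, target id)

    def add_node(self, node_id: str, props: dict) -> None:
        self.nodes[node_id] = props

    def add_edge(self, source: str, relation: str, target: str) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges.append((source, relation, target))


def store_model(model_json: str, store: GraphStore) -> None:
    """Parse a JSON metadata model into nodes and relationships, then store them."""
    model = json.loads(model_json)
    for entity in model["entities"]:
        store.add_node(entity["name"], {"fields": entity.get("fields", [])})
    for rel in model["relationships"]:
        store.add_edge(rel["from"], rel["type"], rel["to"])


MODEL_JSON = json.dumps(
    {
        "entities": [
            {"name": "customer", "fields": ["id", "name"]},
            {"name": "orders", "fields": ["order_id", "customer_id"]},
        ],
        "relationships": [{"from": "orders", "type": "BELONGS_TO", "to": "customer"}],
    }
)
store = GraphStore()
store_model(MODEL_JSON, store)
```

Nodes are written before relationships so that every relationship can be checked against existing endpoints.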
The present invention also provides a big data-based metadata model management system, characterized in that it comprises:
A determination module, for determining the structural type of the big data source;
An extraction module, for extracting metadata from a structured or an unstructured data source;
A model definition and formation module, for defining the relationships among the extracted metadata and forming the corresponding metadata model;
A storage module, for storing the metadata model produced by the model definition and formation module in a database;
A publishing module, for publishing the metadata.
Compared with the prior art, the present invention has the following beneficial effects. First, according to business requirements, the invention directly extracts, merges and shares metadata across databases of different types and in different geographic locations, and performs heterogeneous metadata modeling; this heterogeneous processing manages both structured and unstructured data sources effectively. Second, the invention provides a basic, uniform data standard for mining and analyzing massive data, and lays the foundation for building an industry semantic base. Third, the invention provides users with a complete set of metadata management functions. Fourth, the invention gives big data processing fast, efficient and accurate storage of metadata and metadata models. Fifth, storing the metadata model in graph form allows fast queries and a clear presentation that shows how the metadata model was built and how it was extended. Sixth, the invention establishes a unified, stable metadata warehouse for big data processing.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
Fig. 1 is a flowchart of the big data-based metadata model management method of the present invention.
Fig. 2 is a block diagram of the big data-based metadata model management system of the present invention.
Detailed description
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit it in any form. It should be noted that those skilled in the art can make a number of variations and improvements without departing from the inventive concept, all of which fall within the protection scope of the present invention.
As shown in Fig. 1, the big data-based metadata model management method of the present invention comprises the following steps:
Step 1: determine the structural type of the big data source, that is, whether it is a structured or an unstructured data source; if structured, perform step 2; if unstructured, perform step 3. Structured data sources include relational databases, such as ORACLE, MYSQL and DB2, and document formats, such as CSV and XLSX. Unstructured data sources include NoSQL databases (a general term for non-relational databases). Step 1 is carried out by the determination module. For structured data sources, the data source semantic type standard is formulated from the defining characteristic of structured data, namely that the data is logically expressed through two-dimensional table structures. For unstructured data sources, it is formulated from the characteristics of unstructured data such as documents, pictures, forms, images and audio.
Step 2: extract metadata from the structured data source, then perform step 4. Step 2 is carried out by the extraction module on the structured data source.
Step 3: extract metadata from the unstructured data source, then perform step 4. Step 3 is carried out by the extraction module on the unstructured data source.
Step 4: define the relationships among the extracted metadata and form the corresponding metadata model, then perform step 5. Specifically, metadata modeling defines the various relationships between the different extracted metadata items, with different relationships established for different businesses; the metadata items together with their relationships then form the corresponding metadata model. Step 4 is carried out by the model definition and formation module.
Step 5: store the resulting metadata model in a database in graph form, then perform step 6. Step 5 is carried out by the storage module.
Step 6: based on the defined metadata model, publish the metadata according to business requirements so that external systems can use it. Step 6 is carried out by the publishing module.
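Steps 4 and 6 can be sketched together as follows: relationships are defined per business, combined with the extracted metadata into a model, and a business-specific view is then published. This is only one possible reading; the `business` tag on each relationship, the function names `build_model` and `publish`, and the sample entities are all assumptions introduced for the example.

```python
import json


def build_model(entities: list, relationships: list) -> dict:
    """Step 4: combine extracted metadata with business-defined relationships."""
    known = {entity["name"] for entity in entities}
    for rel in relationships:
        if rel["from"] not in known or rel["to"] not in known:
            raise ValueError(f"relationship refers to unknown entity: {rel}")
    return {"entities": list(entities), "relationships": list(relationships)}


def publish(model: dict, business: str) -> str:
    """Step 6: return the JSON view of the model needed by one business."""
    rels = [r for r in model["relationships"] if r["business"] == business]
    used = {r["from"] for r in rels} | {r["to"] for r in rels}
    ents = [e for e in model["entities"] if e["name"] in used]
    return json.dumps({"entities": ents, "relationships": rels}, sort_keys=True)


entities = [{"name": "customer"}, {"name": "orders"}, {"name": "device"}]
relationships = [
    {"from": "orders", "to": "customer", "type": "BELONGS_TO", "business": "billing"},
    {"from": "device", "to": "customer", "type": "OWNED_BY", "business": "network"},
]
model = build_model(entities, relationships)
billing_view = publish(model, "billing")
```

Here the external billing system receives only the `customer` and `orders` metadata and the relationship between them, while the `device` metadata stays out of its view.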
In steps 2 and 3, user-defined metadata is extracted manually and converted into a format that conforms to the JSON (JavaScript Object Notation, a lightweight data interchange format) data standard. The benefit of this data standard is that it defines the semantic criteria of the metadata and avoids semantic conflicts. Step 5 first parses the JSON representation of the metadata model into a graph representation of nodes and node relationships, and then stores the nodes and node relationships in a graph database. Metadata is a kind of binary information that describes data and information resources.
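For an unstructured source, one way to arrive at JSON-conformant metadata is to walk a sample document and record each field path with its value type. The sketch below assumes the NoSQL source returns nested dictionaries and lists; `describe_document`, `extract_document_metadata`, the `[]` path suffix for list items and the output layout are hypothetical conventions, and Python type names stand in for whatever type vocabulary a real deployment would define.

```python
import json


def describe_document(doc, prefix=""):
    """Yield (field path, type name) pairs for a nested document."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from describe_document(value, path)
    elif isinstance(doc, list):
        for item in doc:
            # All items of a list share one path, marked with "[]".
            yield from describe_document(item, f"{prefix}[]")
    else:
        yield prefix, type(doc).__name__


def extract_document_metadata(collection: str, sample: dict) -> dict:
    """Summarize one sample document as a JSON-serializable metadata record."""
    fields = {}
    for path, type_name in describe_document(sample):
        fields.setdefault(path, type_name)  # keep the first type seen per path
    return {
        "entity": collection,
        "source_type": "nosql",
        "fields": [{"name": path, "type": t} for path, t in fields.items()],
    }


sample = {"user": {"id": 7, "tags": ["a", "b"]}, "active": True}
events_json = json.dumps(extract_document_metadata("events", sample), sort_keys=True)
```

Using the same `entity` / `fields` layout for every source type is one way to give the metadata a single semantic standard, which is the benefit the description attributes to the JSON format.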
As shown in Fig. 2, the big data-based metadata model management system of the present invention comprises:
A determination module, for determining the structural type of the big data source;
An extraction module, for extracting metadata from a structured or an unstructured data source;
A model definition and formation module, for defining the relationships among the extracted metadata and forming the corresponding metadata model;
A storage module, for storing the metadata model produced by the model definition and formation module in a database;
A publishing module, for publishing the metadata.
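How the five modules hand work to one another can be sketched as a small composition. This is a structural illustration under assumptions, not the patented implementation: each module is reduced to a plain callable, the class name `MetadataModelSystem` and its attribute names are hypothetical, and the lambdas below are trivial placeholders for real module logic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MetadataModelSystem:
    """Wires the five modules together in the order given by the method steps."""

    determine: Callable[[dict], str]      # step 1: structured or unstructured
    extract: Callable[[dict, str], dict]  # steps 2 and 3: metadata extraction
    define_model: Callable[[list], dict]  # step 4: relationships and model
    store: Callable[[dict], None]         # step 5: persist the model
    publish: Callable[[dict], str]        # step 6: expose metadata externally

    def run(self, sources: list) -> str:
        extracted = [self.extract(src, self.determine(src)) for src in sources]
        model = self.define_model(extracted)
        self.store(model)
        return self.publish(model)


saved = []  # stands in for the database behind the storage module
system = MetadataModelSystem(
    determine=lambda src: "structured" if src["type"] in {"mysql", "csv"} else "unstructured",
    extract=lambda src, kind: {"entity": src["name"], "kind": kind},
    define_model=lambda entities: {"entities": entities, "relationships": []},
    store=saved.append,
    publish=lambda model: ", ".join(e["entity"] for e in model["entities"]),
)
result = system.run(
    [{"name": "customer", "type": "mysql"}, {"name": "events", "type": "mongodb"}]
)
```

Keeping each module behind a callable means a structured and an unstructured extractor can be swapped in without changing the surrounding flow.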
In summary, the present invention manages data of different types and builds a unified metadata system over heterogeneous data sources; this metadata system covers the extraction, modeling, storage, querying and management of heterogeneous metadata.
Specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to these specific embodiments; those skilled in the art can make various variations or modifications within the scope of the claims without affecting the substance of the invention.
Claims (5)
1. A big data-based metadata model management method, characterized in that it comprises the following steps:
Step 1: determine the structural type of the big data source, that is, whether it is a structured or an unstructured data source; if structured, perform step 2; if unstructured, perform step 3;
Step 2: extract metadata from the structured data source, then perform step 4;
Step 3: extract metadata from the unstructured data source, then perform step 4;
Step 4: define the relationships among the extracted metadata and form the corresponding metadata model, then perform step 5;
Step 5: store the resulting metadata model in a database in graph form, then perform step 6;
Step 6: based on the defined metadata model, publish the metadata according to business requirements so that external systems can use it.
2. The big data-based metadata model management method according to claim 1, characterized in that the structured data sources include relational databases and document formats, and the unstructured data sources include NoSQL databases.
3. The big data-based metadata model management method according to claim 1, characterized in that steps 2 and 3 extract user-defined metadata manually and convert the metadata into a format that conforms to the JSON data standard.
4. The big data-based metadata model management method according to claim 3, characterized in that step 5 first parses the JSON representation of the metadata model into a graph representation of nodes and node relationships, and then stores the nodes and node relationships in a graph database.
5. A big data-based metadata model management system, characterized in that it comprises:
A determination module, for determining the structural type of the big data source;
An extraction module, for extracting metadata from a structured or an unstructured data source;
A model definition and formation module, for defining the relationships among the extracted metadata and forming the corresponding metadata model;
A storage module, for storing the metadata model produced by the model definition and formation module in a database;
A publishing module, for publishing the metadata.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410336111.XA CN104142980B (en) | 2014-07-15 | 2014-07-15 | Metadata schema management system and management method based on big data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104142980A (en) | 2014-11-12 |
CN104142980B CN104142980B (en) | 2017-11-17 |
Family
ID=51852154
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410336111.XA Active CN104142980B (en) | 2014-07-15 | 2014-07-15 | Metadata schema management system and management method based on big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104142980B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110188887B (en) * | 2018-09-26 | 2022-11-08 | 第四范式(北京)技术有限公司 | Data management method and device for machine learning |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070233680A1 (en) * | 2006-03-31 | 2007-10-04 | Microsoft Corporation | Auto-generating reports based on metadata |
CN101908176A (en) * | 2010-08-02 | 2010-12-08 | 国电南瑞科技股份有限公司 | Method for modeling on basis of power information data and applying metadata management |
CN103246753A (en) * | 2013-05-30 | 2013-08-14 | 安徽皖通科技股份有限公司 | Method for generating entity metadata model according to database structure |
Non-Patent Citations (1)
Title |
---|
徐小天 等: "基于 JSON 的电力企业业务系统非结构化数据抽取方法", 《华北电力技术》 * |
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103886004A (en) * | 2013-11-29 | 2014-06-25 | 北京吉威数源信息技术有限公司 | Material data modeling processing method |
CN103886004B (en) * | 2013-11-29 | 2017-06-09 | 北京吉威时代软件股份有限公司 | A kind of data type data modeling processing method |
CN104580474A (en) * | 2015-01-13 | 2015-04-29 | 深圳市融创天下科技有限公司 | Urban operation sign big data visualization multi-screen interaction display platform and method |
CN105574086A (en) * | 2015-12-10 | 2016-05-11 | 天津海量信息技术有限公司 | Artificial intelligence extraction method of internet unstructured data fields |
CN106886535A (en) * | 2015-12-16 | 2017-06-23 | 大唐软件技术股份有限公司 | A kind of data pick-up method and apparatus for being adapted to multiple data sources |
CN105701181A (en) * | 2016-01-06 | 2016-06-22 | 中电科华云信息技术有限公司 | Dynamic heterogeneous metadata acquisition method and system |
CN105912636B (en) * | 2016-04-08 | 2020-04-07 | 金蝶软件(中国)有限公司 | Map/Reduce-based ETL data processing method and device |
CN105912636A (en) * | 2016-04-08 | 2016-08-31 | 金蝶软件(中国)有限公司 | Map/Reduce based ETL data processing method and device |
CN106557569B (en) * | 2016-11-14 | 2020-07-03 | 用友网络科技股份有限公司 | Method and device for importing unstructured document based on meta-model |
CN106557569A (en) * | 2016-11-14 | 2017-04-05 | 用友网络科技股份有限公司 | Introduction method and gatherer based on the non-structured document of meta-model |
CN108320066A (en) * | 2017-01-18 | 2018-07-24 | 重庆邮电大学 | A kind of Explore of Unified Management Ideas for realizing different production lines based on metadata |
CN108733727A (en) * | 2017-04-25 | 2018-11-02 | 华为技术有限公司 | A kind of inquiry processing method, data source registration method and query engine |
US11907213B2 (en) | 2017-04-25 | 2024-02-20 | Huawei Technologies Co., Ltd. | Query processing method, data source registration method, and query engine |
US11366808B2 (en) | 2017-04-25 | 2022-06-21 | Huawei Technologies Co., Ltd. | Query processing method, data source registration method, and query engine |
CN108733727B (en) * | 2017-04-25 | 2021-11-30 | 华为技术有限公司 | Query processing method, data source registration method and query engine |
CN107291875B (en) * | 2017-06-19 | 2019-12-06 | 华中科技大学 | Metadata organization management method and system based on metadata graph |
CN107291875A (en) * | 2017-06-19 | 2017-10-24 | 华中科技大学 | A kind of metadata organization management method and system based on metadata graph |
CN107633181B (en) * | 2017-09-12 | 2021-01-26 | 复旦大学 | Data model realization method facing data open sharing and operation system thereof |
CN107633181A (en) * | 2017-09-12 | 2018-01-26 | 复旦大学 | The data model and its operation system of data-oriented opening and shares |
CN109242259B (en) * | 2018-08-10 | 2020-12-11 | 华迪计算机集团有限公司 | Data integration method and system based on basic data resource library |
CN109242259A (en) * | 2018-08-10 | 2019-01-18 | 华迪计算机集团有限公司 | A kind of data integrating method and system based on basic data resources bank |
CN109542960A (en) * | 2018-10-18 | 2019-03-29 | 国网内蒙古东部电力有限公司信息通信分公司 | A kind of data analysis domain system |
CN109710602A (en) * | 2018-12-26 | 2019-05-03 | 中科曙光国际信息产业有限公司 | Data model detection method and device |
CN109739893A (en) * | 2018-12-28 | 2019-05-10 | 上海连尚网络科技有限公司 | A kind of metadata management method, equipment and computer-readable medium |
CN109871417A (en) * | 2018-12-29 | 2019-06-11 | 国家开发银行 | The metadata visualization map constructing method and system of knowledge based map |
CN109857822A (en) * | 2018-12-29 | 2019-06-07 | 国家开发银行 | Meta-model conversion method and management system based on chart database |
CN110209380A (en) * | 2019-05-30 | 2019-09-06 | 上海直真君智科技有限公司 | A kind of unified dynamic metadata processing method towards big data isomery model |
US11703404B2 (en) | 2019-06-17 | 2023-07-18 | Colorado State University Research Foundation | Device for automated crop root sampling |
US11494611B2 (en) | 2019-07-31 | 2022-11-08 | International Business Machines Corporation | Metadata-based scientific data characterization driven by a knowledge database at scale |
CN112115183A (en) * | 2020-09-18 | 2020-12-22 | 广州锦行网络科技有限公司 | Honeypot system threat information analysis method based on graph |
Also Published As
Publication number | Publication date |
---|---|
CN104142980B (en) | 2017-11-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||