CN104142980B - Metadata schema management system and management method based on big data - Google Patents
- Publication number
- CN104142980B (application CN201410336111.XA; also published as CN104142980A)
- Authority
- CN
- China
- Prior art keywords
- metadata
- data
- data source
- schema
- metadata schema
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/907—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
Abstract
The invention provides a metadata schema management system and management method based on big data. The management method comprises the following steps: Step 1, determine the structure type of the big data source; Step 2, extract metadata from a structured data source, then perform Step 4; Step 3, extract metadata from an unstructured data source, then perform Step 4; Step 4, define the relations among the extracted metadata and form the corresponding metadata schema, then perform Step 5; Step 5, store the resulting metadata schema in a database in graph form, then perform Step 6; Step 6, publish the metadata according to the defined metadata schema and the business requirements, so that external systems can use it. The invention manages different types of data: a unified metadata system can be built on heterogeneous data sources, with storage, management, and usage functions provided for that system.
Description
Technical field
The present invention relates to a metadata schema management system and management method in the field of telecommunication technology, and in particular to a metadata schema management system and management method based on big data.
Background
The term "big data" is used to describe and define the massive volumes of data produced in the age of information explosion, and to name the associated technological development and innovation. Data is expanding rapidly and shapes the future development of enterprises. Although an enterprise may not yet recognize the hidden problems that explosive data growth brings, over time people will become increasingly aware of how important data is to the enterprise.
The big data era poses new challenges to humanity's ability to manage data. As the Internet of Things and mobile terminals continuously produce massive amounts of data of increasingly varied types, managing these different types of data becomes a difficult problem. The big-data-based metadata schema management method of the present invention is designed for this environment and addresses the problem of managing different types of big data.
Summary of the invention
In view of the defects of the prior art, the object of the present invention is to provide a metadata schema management system and management method based on big data that manages different types of data, can build a unified metadata system on heterogeneous data sources, and provides storage, management, and usage functions for that system.
According to one aspect of the present invention, a metadata schema management method based on big data is provided, characterized in that it comprises the following steps: Step 1, determine the structure type of the big data source, that is, determine whether it is a structured or an unstructured data source; if it is structured, perform Step 2, and if it is unstructured, perform Step 3; Step 2, extract metadata from the structured data source, then perform Step 4; Step 3, extract metadata from the unstructured data source, then perform Step 4; Step 4, define the relations among the extracted metadata and form the corresponding metadata schema, then perform Step 5; Step 5, store the resulting metadata schema in a database in graph form, then perform Step 6; Step 6, publish the metadata according to the defined metadata schema and the business requirements, so that external systems can use it.
Preferably, structured data sources include relational databases and file formats, and unstructured data sources include NoSQL databases.
Preferably, in Step 2 and Step 3 user-defined metadata is extracted manually, and the metadata is converted into a format that conforms to the JSON data standard.
Preferably, Step 5 first parses the JSON representation of the metadata schema, converts it into a graph representation made up of nodes and node relationships, and then stores the nodes and node relationships in a graph database.
The present invention also provides a metadata schema management system based on big data, characterized in that it comprises:
a judgment module, for determining the structure type of the big data source;
an extraction module, for extracting metadata from a structured data source or an unstructured data source;
a model definition and formation module, for defining the relations among the extracted metadata and forming the corresponding metadata schema;
a storage module, for storing the metadata schema produced by the model definition and formation module in a database;
a release module, for publishing the metadata.
Compared with the prior art, the present invention has the following beneficial effects. First, according to business requirements, the invention directly extracts, merges, and shares metadata information across databases of different types and in different geographic locations, and performs heterogeneous processing for metadata modeling; this heterogeneous processing manages structured and unstructured data sources effectively. Second, the invention provides a basic unified data standard for mining and analyzing massive data, and lays the foundation for building an industry semantic base. Third, the invention provides users with a complete set of metadata management functions. Fourth, the invention provides fast, efficient, and accurate storage of metadata and metadata schemas for big data processing. Fifth, storing the metadata schema in graph form gives fast queries and a clear presentation, which shows how the metadata schema was built and how it was extended. Sixth, the invention establishes a unified, stable metadata warehouse for big data processing.
Brief description of the drawings
Other features, objects, and advantages of the invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the drawings:
Fig. 1 is a flow chart of the big-data-based metadata schema management method of the invention.
Fig. 2 is a block diagram of the big-data-based metadata schema management system of the invention.
Detailed description
The present invention is described in detail below with reference to a specific embodiment. The following embodiment will help those skilled in the art to understand the invention further, but does not limit the invention in any way. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and that these all fall within the scope of protection of the invention.
As shown in Fig. 1, the big-data-based metadata schema management method of the invention comprises the following steps:
Step 1, determine the structure type of the big data source, that is, determine whether it is a structured or an unstructured data source; if it is structured, perform Step 2, and if it is unstructured, perform Step 3. Structured data sources include relational databases and file formats: relational databases such as ORACLE, MYSQL, and DB2, and file formats such as CSV and XLSX. Unstructured data sources include NOSQL (non-relational) databases. Step 1 is carried out by the judgment module, which determines the structure type of the big data source. For structured data sources, the semantic type standard is formulated from the defining characteristic of structured data, namely that the data is logically represented as a two-dimensional table structure. For unstructured data sources, the semantic type standard is formulated from the characteristic forms of unstructured data: documents, pictures, forms, images, audio, and so on.
Step 2, extract metadata from the structured data source, then perform Step 4. Step 2 is carried out by the extraction module, which extracts metadata from the structured data source.
Step 3, extract metadata from the unstructured data source, then perform Step 4. Step 3 is carried out by the extraction module, which extracts metadata from the unstructured data source.
Step 4, define the relations among the extracted metadata and form the corresponding metadata schema, then perform Step 5. Specifically, Step 4 uses metadata modeling to define the various relations among the different extracted metadata; different relations are established for different lines of business, and the corresponding metadata schema is formed from these metadata and their relations. Step 4 is carried out by the model definition and formation module.
Step 5, store the resulting metadata schema in a database in graph form, then perform Step 6. Step 5 is carried out by the storage module.
Step 6, publish the metadata according to the defined metadata schema and the business requirements, so that external systems can use it. Step 6 is carried out by the release module.
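Steps 1 to 3 can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the function names, the `wanted` list standing in for the user's manual selection, and the shape of the output are all hypothetical, and the structured branch handles CSV text only (a relational source such as MYSQL would need its own extractor).

```python
from __future__ import annotations

import csv
import io
import json

# Source kinds named in the text; grouping them into two sets is an illustrative assumption.
STRUCTURED_KINDS = {"ORACLE", "MYSQL", "DB2", "CSV", "XLSX"}
UNSTRUCTURED_KINDS = {"NOSQL"}


def judge_source_type(kind: str) -> str:
    """Step 1: return 'structured' or 'unstructured' for a declared source kind."""
    name = kind.strip().upper()
    if name in STRUCTURED_KINDS:
        return "structured"
    if name in UNSTRUCTURED_KINDS:
        return "unstructured"
    raise ValueError(f"unknown data source kind: {kind!r}")


def extract_structured(csv_text: str, wanted: list[str]) -> list[dict]:
    """Step 2 (CSV only): keep the user-selected columns of the header row as metadata."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [{"name": col, "origin": "column"} for col in header if col in wanted]


def extract_unstructured(document: dict, wanted: list[str]) -> list[dict]:
    """Step 3: keep the user-selected top-level keys of a NoSQL-style document."""
    return [{"name": key, "origin": "key"} for key in document if key in wanted]


def extract_metadata(kind: str, payload, wanted: list[str]) -> str:
    """Steps 1-3: dispatch on the source type and return the metadata as JSON text."""
    if judge_source_type(kind) == "structured":
        items = extract_structured(payload, wanted)
    else:
        items = extract_unstructured(payload, wanted)
    return json.dumps({"source": kind, "metadata": items})
```

For example, `extract_metadata("CSV", "id,name,age\n1,a,2\n", ["id", "age"])` would return JSON text describing the `id` and `age` columns, while a `"NOSQL"` source would be routed to the key-based extractor instead.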
In Step 2 and Step 3, user-defined metadata is extracted manually and converted into a format that conforms to the JSON (JavaScript Object Notation, a lightweight data-interchange format) data standard. The benefit of this standard is that it defines the semantic criteria for metadata and so avoids semantic conflicts. Step 5 first parses the JSON representation of the metadata schema, converts it into a graph representation made up of nodes and node relationships, and then stores the nodes and node relationships in a graph database. Metadata is a kind of binary information: descriptive information about data and information resources.
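The parse-then-store sequence of Step 5 might look roughly like the sketch below. The patent does not specify a JSON layout or a particular graph database, so the `metadata` and `relations` keys, the sample entity names, and the `InMemoryGraph` class (a stand-in for a real graph database) are all assumptions made for illustration.

```python
import json

# Hypothetical JSON layout for a metadata schema; the patent does not define one.
SCHEMA_JSON = """
{
  "metadata": [{"name": "customer"}, {"name": "order"}],
  "relations": [{"from": "customer", "to": "order", "type": "PLACES"}]
}
"""


class InMemoryGraph:
    """Stand-in for a graph database: nodes keyed by name, relationships as triples."""

    def __init__(self) -> None:
        self.nodes = {}
        self.relationships = []

    def add_node(self, name: str, properties: dict) -> None:
        self.nodes[name] = properties

    def add_relationship(self, start: str, rel_type: str, end: str) -> None:
        # Reject relationships whose endpoints were never stored as nodes.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError(f"unknown node in relationship {start!r} -> {end!r}")
        self.relationships.append((start, rel_type, end))


def store_schema(schema_json: str, graph: InMemoryGraph) -> None:
    """Step 5: parse the JSON schema, then store nodes first and relationships second."""
    schema = json.loads(schema_json)
    for item in schema["metadata"]:
        graph.add_node(item["name"], item)
    for rel in schema["relations"]:
        graph.add_relationship(rel["from"], rel["type"], rel["to"])


graph = InMemoryGraph()
store_schema(SCHEMA_JSON, graph)
```

Storing nodes before relationships lets the store reject a relationship that points at metadata which was never defined, which is one simple way to keep the stored schema consistent.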
As shown in Fig. 2, the big-data-based metadata schema management system of the invention comprises:
a judgment module, for determining the structure type of the big data source;
an extraction module, for extracting metadata from a structured data source or an unstructured data source;
a model definition and formation module, for defining the relations among the extracted metadata and forming the corresponding metadata schema;
a storage module, for storing the metadata schema produced by the model definition and formation module in a database;
a release module, for publishing the metadata.
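As a rough illustration of how the model definition and formation module and the release module could cooperate, the sketch below builds a schema from extracted metadata plus user-defined relations, then publishes only the part a given business need asks for. The function names, the relation triples, and the idea of expressing a business need as a set of metadata names are assumptions, not details taken from the patent.

```python
from __future__ import annotations

import json


def define_schema(metadata: list[dict], relations: list[tuple[str, str, str]]) -> dict:
    """Step 4: combine extracted metadata with user-defined relations into a schema."""
    names = {item["name"] for item in metadata}
    for start, _rel_type, end in relations:
        if start not in names or end not in names:
            raise ValueError(f"relation refers to unknown metadata: {start!r}, {end!r}")
    return {
        "metadata": metadata,
        "relations": [{"from": s, "type": t, "to": e} for s, t, e in relations],
    }


def publish(schema: dict, business_need: set[str]) -> str:
    """Step 6: release only the metadata and relations that a business need covers."""
    metadata = [m for m in schema["metadata"] if m["name"] in business_need]
    relations = [
        r for r in schema["relations"]
        if r["from"] in business_need and r["to"] in business_need
    ]
    return json.dumps({"metadata": metadata, "relations": relations})


schema = define_schema(
    [{"name": "customer"}, {"name": "order"}, {"name": "invoice"}],
    [("customer", "PLACES", "order"), ("order", "BILLED_BY", "invoice")],
)
published = json.loads(publish(schema, {"customer", "order"}))
```

Here the published result contains the `customer` and `order` metadata and the single relation between them; the `invoice` metadata and its relation are left out because the business need does not cover them.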
In summary, the present invention manages different types of data and can build a unified metadata system on heterogeneous data sources; this metadata system covers the extraction, modeling, storage, querying, and management of heterogeneous metadata.
A specific embodiment of the present invention has been described above. It should be understood that the invention is not limited to this particular embodiment; those skilled in the art can make various variations or modifications within the scope of the claims without affecting the substance of the invention.
Claims (1)
1. A metadata schema management method based on big data, characterized in that it comprises the following steps:
Step 1, determine the structure type of the big data source, that is, determine whether it is a structured or an unstructured data source; if it is structured, perform Step 2, and if it is unstructured, perform Step 3;
Step 2, extract metadata from the structured data source, then perform Step 4;
Step 3, extract metadata from the unstructured data source, then perform Step 4;
Step 4, define the relations among the extracted metadata and form the corresponding metadata schema, then perform Step 5;
Step 5, store the resulting metadata schema in a database in graph form, then perform Step 6;
Step 6, publish the metadata according to the defined metadata schema and the business requirements, so that external systems can use it;
wherein structured data sources include relational databases and file formats, and unstructured data sources include NOSQL databases;
in Step 2 and Step 3, user-defined metadata is extracted manually and converted into a format that conforms to the JSON data standard;
and Step 5 first parses the JSON representation of the metadata schema, converts it into a graph representation made up of nodes and node relationships, and then stores the nodes and node relationships in a graph database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410336111.XA CN104142980B (en) | 2014-07-15 | 2014-07-15 | Metadata schema management system and management method based on big data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410336111.XA CN104142980B (en) | 2014-07-15 | 2014-07-15 | Metadata schema management system and management method based on big data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104142980A CN104142980A (en) | 2014-11-12 |
CN104142980B true CN104142980B (en) | 2017-11-17 |
Family
ID=51852154
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410336111.XA Active CN104142980B (en) | 2014-07-15 | 2014-07-15 | Metadata schema management system and management method based on big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104142980B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110188887A (en) * | 2018-09-26 | 2019-08-30 | 第四范式(北京)技术有限公司 | Data management method and device for machine learning |
Families Citing this family (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103886004B (en) * | 2013-11-29 | 2017-06-09 | 北京吉威时代软件股份有限公司 | Data type data modeling processing method |
CN104580474A (en) * | 2015-01-13 | 2015-04-29 | 深圳市融创天下科技有限公司 | Urban operation sign big data visualization multi-screen interaction display platform and method |
CN105574086A (en) * | 2015-12-10 | 2016-05-11 | 天津海量信息技术有限公司 | Artificial intelligence extraction method of internet unstructured data fields |
CN106886535A (en) * | 2015-12-16 | 2017-06-23 | 大唐软件技术股份有限公司 | Data extraction method and apparatus adapted to multiple data sources |
CN105701181A (en) * | 2016-01-06 | 2016-06-22 | 中电科华云信息技术有限公司 | Dynamic heterogeneous metadata acquisition method and system |
CN105912636B (en) * | 2016-04-08 | 2020-04-07 | 金蝶软件(中国)有限公司 | Map/Reduce-based ETL data processing method and device |
CN106557569B (en) * | 2016-11-14 | 2020-07-03 | 用友网络科技股份有限公司 | Method and device for importing unstructured document based on meta-model |
CN108320066A (en) * | 2017-01-18 | 2018-07-24 | 重庆邮电大学 | A kind of Explore of Unified Management Ideas for realizing different production lines based on metadata |
CN114490630A (en) | 2017-04-25 | 2022-05-13 | 华为技术有限公司 | Query processing method, data source registration method and query engine |
CN107291875B (en) * | 2017-06-19 | 2019-12-06 | 华中科技大学 | Metadata organization management method and system based on metadata graph |
CN107633181B (en) * | 2017-09-12 | 2021-01-26 | 复旦大学 | Data model realization method facing data open sharing and operation system thereof |
CN109242259B (en) * | 2018-08-10 | 2020-12-11 | 华迪计算机集团有限公司 | Data integration method and system based on basic data resource library |
CN109542960B (en) * | 2018-10-18 | 2023-03-14 | 国网内蒙古东部电力有限公司信息通信分公司 | Data analysis domain system |
CN109710602A (en) * | 2018-12-26 | 2019-05-03 | 中科曙光国际信息产业有限公司 | Data model detection method and device |
CN109739893B (en) * | 2018-12-28 | 2022-04-22 | 上海尚往网络科技有限公司 | Metadata management method, equipment and computer readable medium |
CN109871417A (en) * | 2018-12-29 | 2019-06-11 | 国家开发银行 | The metadata visualization map constructing method and system of knowledge based map |
CN109857822A (en) * | 2018-12-29 | 2019-06-07 | 国家开发银行 | Meta-model conversion method and management system based on chart database |
CN110209380B (en) * | 2019-05-30 | 2020-11-03 | 上海直真君智科技有限公司 | Unified dynamic metadata processing method oriented to big data heterogeneous model |
US11703404B2 (en) | 2019-06-17 | 2023-07-18 | Colorado State University Research Foundation | Device for automated crop root sampling |
US11494611B2 (en) | 2019-07-31 | 2022-11-08 | International Business Machines Corporation | Metadata-based scientific data characterization driven by a knowledge database at scale |
CN112115183B (en) * | 2020-09-18 | 2021-09-21 | 广州锦行网络科技有限公司 | Honeypot system threat information analysis method based on graph |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101908176A (en) * | 2010-08-02 | 2010-12-08 | 国电南瑞科技股份有限公司 | Method for modeling on basis of power information data and applying metadata management |
CN103246753A (en) * | 2013-05-30 | 2013-08-14 | 安徽皖通科技股份有限公司 | Method for generating entity metadata model according to database structure |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070233680A1 (en) * | 2006-03-31 | 2007-10-04 | Microsoft Corporation | Auto-generating reports based on metadata |
Non-Patent Citations (1)
Title |
---|
JSON-based unstructured data extraction method for electric power enterprise business systems; Xu Xiaotian et al.; 《华北电力技术》; 20131130 (2013, No. 11); pp. 32-35 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110188887A (en) * | 2018-09-26 | 2019-08-30 | 第四范式(北京)技术有限公司 | Data management method and device for machine learning |
CN110188887B (en) * | 2018-09-26 | 2022-11-08 | 第四范式(北京)技术有限公司 | Data management method and device for machine learning |
Also Published As
Publication number | Publication date |
---|---|
CN104142980A (en) | 2014-11-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |