CN103942245A - Data extracting method based on metadata - Google Patents

Publication number
CN103942245A
Authority
CN
China
Prior art keywords
data
model
data pick
metadata
definition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410055786.7A
Other languages
Chinese (zh)
Inventor
胡顺杰
王刚
张立勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Co Ltd
Original Assignee
Inspur Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Software Co Ltd filed Critical Inspur Software Co Ltd
Priority to CN201410055786.7A priority Critical patent/CN103942245A/en
Publication of CN103942245A publication Critical patent/CN103942245A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a metadata-based data extraction method in the field of data extraction. The method establishes a data extraction model on the basis of a common metadata model of the business models, and extracts and formulates business data from the business models. Compared with the prior art, the method is based on the data elements of industry standard specifications: a metadata model is derived by refining and sorting out the business models, and business data is mapped to the metadata. The metadata is then classified by business and mapped to the established data extraction model, forming a metadata-based data extraction model that allows business data to be extracted flexibly. The method has good value for popularization and application.

Description

Data extraction method based on metadata
Technical field
The present invention relates to the field of data extraction, and specifically to a metadata-based data extraction method.
Background art
Every line of business in the health industry has business models that are large and complex, and the corresponding data models have complex table structures and a large number of fields.
Most existing data extraction models are designed specifically for each line of business or for one independent business model. Such designs are not only complex but also adapt poorly to change. When industry standard specifications change, or when forms differ between regions, the required changes are tedious and complicated, bring a heavy maintenance workload, and make the model hard to extend.
Summary of the invention
The technical task of the present invention is to address the above deficiencies of the prior art by providing a metadata-based data extraction method.
The technical task is achieved as follows: the metadata-based data extraction method is characterized in that a data extraction model is established on the basis of a common metadata model of the business models, and business data is extracted and formulated from the business models.
The metadata model is extracted from the business models, and an association between the business models and the metadata model is established.
The data extraction model comprises a model definition, data extraction item definitions, an update source definition for each data item, and classified data extraction logic processing. By associating data extraction items with metadata, and metadata with business data, a relationship among the three is established, so that business data is extracted through the metadata-based data extraction model.
The method is implemented through the data extraction model definition, the data extraction item definition, the data update source definition and the classified data extraction logic processing:
The data extraction model definition defines the framework of the data extraction model, classifying and aggregating the data to be extracted from different aspects, dimensions and viewpoints. Each model definition includes the basic attributes of the model (internal code (ISN), name, description), the processing mode of the model, a jump flag indicating whether jump fields exist, and the name of the table that stores detail records;
The data extraction item definition includes the extraction attributes of each item: the processing mode, data type, length and precision of the field value corresponding to the item, and the processing type;
The data update source definition defines the data source of each data extraction item and determines when, and from which metadata, the data of each item is updated. It includes the source algorithm definition used for the calculation, the metadata identifier and the data source definition (DSD);
The classified data extraction logic processing comprises cumulative-class data extraction processing, basic-information-class data extraction processing and update-summary-class data extraction processing.
Compared with the prior art, the method of the present invention is based on the data elements of industry standard specifications. By refining and sorting out the business models, a metadata model is derived and business data is mapped to the metadata. The metadata is classified by business and mapped to the established data extraction model, forming a metadata-based data extraction model that allows business data to be extracted flexibly. The method has the following notable advantages:
(1) The extraction model is built on the underlying metadata model, so changes to the business models do not require extensive changes and maintenance.
(2) Because the data extraction model corresponds to metadata, data extraction can be versioned according to the metadata version, which helps manage versions of the extraction model.
(3) The data update source definition effectively handles the update mechanism for extracted data.
(4) Different types of processing logic handle, in a targeted way, the operations that different extraction models perform when business data changes. The models are unified, which makes them easy to manage and extend.
Brief description of the drawings
Fig. 1 is a diagram of the data extraction model in the method of the invention;
Fig. 2 is a sample data extraction model in the embodiment;
Fig. 3 is a sample data extraction item definition in the embodiment;
Fig. 4 is a sample data update source definition in the embodiment;
Fig. 5 is a simplified class diagram of the data extraction model in the embodiment.
Detailed description of the embodiments
The metadata-based data extraction method of the present invention is described in detail below with reference to the accompanying drawings and a specific embodiment.
Embodiment:
The metadata-based data extraction method of the present invention comprises the data extraction model definition, the data extraction item definition, the data update source definition and the classified data extraction logic processing.
These are described further below:
(1) Data extraction model definition
According to business requirements, the framework of the data extraction model is defined, and the data to be extracted is classified and aggregated from different aspects, dimensions and viewpoints. Each model definition includes the basic attributes of the model (internal code (ISN), name, description), the processing mode of the model, a jump flag indicating whether jump fields exist, and the name of the table that stores detail records. Together these definitions determine how a given data extraction model is processed.
Attribute          Description
Processing mode    Cumulative class, basic information class, or update-summary class.
Jump flag          Defines whether the model has jump fields. A jump field requires detail records to be generated for the field.
Detail table name  Configures the table that stores the jump-field records.
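The attributes above can be pictured as a small record type. The following is a minimal Python sketch, not the implementation described in the patent: the class and field names, the sample values, and the validation rule (a model with jump fields must name a detail table) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ProcessingMode(Enum):
    """The three processing modes named in the attribute table."""
    CUMULATIVE = "cumulative"
    BASIC_INFO = "basic_info"
    UPDATE_SUMMARY = "update_summary"


@dataclass(frozen=True)
class ExtractionModelDefinition:
    # Basic attributes (names are hypothetical).
    internal_code: str
    name: str
    description: str
    # How the model is processed.
    processing_mode: ProcessingMode
    # Jump flag: whether the model has jump fields.
    has_jump_fields: bool = False
    # Table that stores the detail records generated for jump fields.
    detail_table_name: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed rule: jump fields produce detail records, so a storage
        # table has to be configured for them.
        if self.has_jump_fields and not self.detail_table_name:
            raise ValueError("a model with jump fields needs a detail table name")


# Hypothetical example model.
model = ExtractionModelDefinition(
    internal_code="M001",
    name="outpatient_visits",
    description="Example model for illustration only",
    processing_mode=ProcessingMode.CUMULATIVE,
    has_jump_fields=True,
    detail_table_name="extract_detail_outpatient",
)
```

Keeping the definition immutable reflects the idea that a model definition is configuration, while the extracted data changes at run time.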
(2) Data extraction item definition
Each data extraction model, or each class of model, is made up of a number of data items to be extracted. Each data extraction item includes the extraction attributes of the item: the processing mode, data type, length and precision of the corresponding field value, and the processing type. The definitions of these items determine how each detail item of a data extraction model is processed.
Attribute        Description
Processing mode  Special field type (MERGE: a merge-type field; JUMP: a jump-type field).
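One way to read the item definition is as a per-field rule for shaping a raw value. The sketch below assumes two data types ("string" and "decimal"), a NORMAL handling mode for ordinary fields, and truncation and rounding rules; none of these are stated in the patent, and all names are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP
from enum import Enum
from typing import Any, Optional


class FieldHandling(Enum):
    NORMAL = "NORMAL"  # assumed default for ordinary fields
    MERGE = "MERGE"    # merge-type field
    JUMP = "JUMP"      # jump-type field


@dataclass(frozen=True)
class ExtractionItemDefinition:
    field_name: str
    data_type: str                   # "string" or "decimal" in this sketch
    length: Optional[int] = None     # maximum length for strings
    precision: Optional[int] = None  # decimal places for decimals
    handling: FieldHandling = FieldHandling.NORMAL

    def coerce(self, raw: Any) -> Any:
        """Shape a raw field value according to type, length and precision."""
        if self.data_type == "decimal":
            # Build the quantum 10**-precision, e.g. Decimal("0.01").
            quantum = Decimal(1).scaleb(-(self.precision or 0))
            return Decimal(str(raw)).quantize(quantum, rounding=ROUND_HALF_UP)
        text = str(raw)
        return text if self.length is None else text[: self.length]


fee_item = ExtractionItemDefinition("fee", "decimal", precision=2)
name_item = ExtractionItemDefinition("name", "string", length=3)
```

With these definitions, `fee_item.coerce("12.345")` rounds to two decimal places and `name_item.coerce("abcdef")` keeps only the first three characters.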
(3) Data update source definition
The data update source definition defines the data source of each data extraction item and determines when, and from which metadata, the data of each item is updated. It includes the source algorithm definition used for the calculation, the metadata identifier and the data source definition (DSD).
Attribute  Description
Algorithm  The processing algorithm is invoked dynamically according to the defined processing class, so that data extraction algorithms can be extended dynamically.
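The dynamic invocation described for the Algorithm attribute can be approximated with a name-to-function registry. This is a swapped-in technique chosen for brevity: the patent speaks of calling an algorithm through a defined processing class, and does not describe a registry. The algorithm names ("sum", "latest") and all identifiers are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Algorithm = Callable[[List[float]], float]

# Registry of available algorithms, keyed by the name used in a definition.
_ALGORITHMS: Dict[str, Algorithm] = {}


def register_algorithm(name: str) -> Callable[[Algorithm], Algorithm]:
    """Decorator that makes an algorithm available under a configured name."""
    def decorator(func: Algorithm) -> Algorithm:
        _ALGORITHMS[name] = func
        return func
    return decorator


@register_algorithm("sum")
def _sum(values: List[float]) -> float:
    return float(sum(values))


@register_algorithm("latest")
def _latest(values: List[float]) -> float:
    return float(values[-1])


@dataclass(frozen=True)
class UpdateSourceDefinition:
    item_name: str    # data extraction item being updated
    metadata_id: str  # metadata the value is taken from
    data_source: str  # where the metadata's values come from
    algorithm: str    # name of the registered algorithm

    def compute(self, values: List[float]) -> float:
        # Look the algorithm up at call time, so new algorithms can be
        # registered without touching this class.
        try:
            func = _ALGORITHMS[self.algorithm]
        except KeyError:
            raise ValueError(f"unknown algorithm: {self.algorithm}") from None
        return func(values)


total_fee = UpdateSourceDefinition("total_fee", "MD_FEE", "billing", "sum")
```

Adding a new algorithm then only requires registering another function; existing definitions and the dispatch code stay unchanged.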
(4) Classified data extraction logic processing
Based on our analysis and classification of the business extraction models, data extraction models are processed as one of three types: cumulative class, basic information class and update-summary class.
Further details are as follows:
A. Cumulative-class data extraction logic:
a) Obtain the data from the business tables that the data extraction model definition associates with the metadata.
b) If the data already exists and the business data state is neither deleted nor updated, abandon processing of the current business record. In all other cases, continue to the next step.
c) If the business data is in the deleted or updated state, delete the previously extracted records. Processing ends here for business data in the deleted state; business data not in the deleted state continues to the next step.
d) If the business data is in the new or updated state, continue analyzing it and determine, among the business tables used to generate extracted data, the name of the table with the most records.
e) Loop over the data of the table obtained by that name, extracting multiple records according to the value definitions of the extraction model.
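Steps a) to e) can be sketched as a single function over in-memory data. This is an illustrative reading under stated assumptions, not the patent's implementation: states are plain strings, business tables and the extraction store are dictionaries, and `process_cumulative` and `extract_row` are hypothetical names. Step a) is represented by the `tables` argument.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def process_cumulative(
    record_id: str,
    state: str,                       # "new", "updated" or "deleted"
    tables: Dict[str, List[Row]],     # a) business table name -> rows
    extracted: Dict[str, List[Row]],  # record id -> extracted rows
    extract_row: Callable[[Row], Row],
) -> str:
    # b) Already extracted and neither deleted nor updated: skip the record.
    if record_id in extracted and state not in ("deleted", "updated"):
        return "skipped"
    # c) Deleted or updated: remove the previously extracted records.
    if state in ("deleted", "updated"):
        extracted.pop(record_id, None)
        if state == "deleted":
            return "deleted"
    # Assumption: any state other than new/updated is ignored.
    if state not in ("new", "updated") or not tables:
        return "ignored"
    # d) Pick the business table with the most rows.
    driving_table = max(tables, key=lambda name: len(tables[name]))
    # e) Loop over that table and extract one record per row.
    extracted[record_id] = [extract_row(row) for row in tables[driving_table]]
    return "extracted"


# Hypothetical usage with two business tables.
store: Dict[str, List[Row]] = {}
visit_tables = {
    "visit": [{"fee": 1}],
    "visit_item": [{"fee": 2}, {"fee": 3}],
}
first = process_cumulative("r1", "new", visit_tables, store, lambda r: {"fee": r["fee"]})
```

Here `visit_item` has the most rows, so it drives the loop and two extracted records are stored for `r1`.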
B. Basic-information-class data extraction logic:
a) Obtain the basic information data that has already been generated, according to the definition data of the data extraction model.
b) If the operation being performed is a deletion, delete the existing extraction record directly. Processing ends after the deletion; in other cases, continue.
c) Extract the fields one by one according to the data extraction item definitions, obtaining the data of the business fields that correspond to the metadata. Depending on whether the extracted data already exists, perform an update or an insert.
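The basic-information logic amounts to a delete-or-upsert. The sketch below assumes an in-memory dictionary as the store of extracted records and plain strings for the operation; `process_basic_info` and its parameters are hypothetical names.

```python
from typing import Dict, List

Row = Dict[str, object]


def process_basic_info(
    key: str,
    operation: str,         # "delete", or anything else for insert/update
    business_row: Row,      # current business data for this key
    item_fields: List[str], # fields named by the extraction item definitions
    store: Dict[str, Row],  # a) already generated basic information records
) -> str:
    # b) Deletion: drop the existing extraction record and stop.
    if operation == "delete":
        store.pop(key, None)
        return "deleted"
    # c) Extract the configured fields one by one.
    values = {field: business_row.get(field) for field in item_fields}
    # Update when a record already exists, insert otherwise.
    if key in store:
        store[key].update(values)
        return "updated"
    store[key] = values
    return "inserted"


patients: Dict[str, Row] = {}
r1 = process_basic_info("p1", "save", {"name": "A", "age": 30, "x": 1}, ["name", "age"], patients)
r2 = process_basic_info("p1", "save", {"name": "A", "age": 31}, ["name", "age"], patients)
```

The first call inserts a record containing only the configured fields; the second call finds the record and updates it in place.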
C. Update-summary-class data extraction logic:
a) Obtain the update-summary business table data according to the definition data of the data extraction model.
b) Check the business data state. If it is deleted or updated, delete the records corresponding to the business table from the data extraction detail table. Processing is then complete for business data in the deleted state; other states continue with the processing below.
For update-summary-class data extraction models, deletion applies only to the data in the detail record table. The data in the update-summary table is not changed, which may leave some dirty data in the update-summary table.
When the business data is newly added or updated, the fields used for data extraction are extracted from the business data one by one. For fields of the jump type, additional detail records of the extracted data must also be generated.
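The update-summary logic, including the stale-data caveat, can be sketched as follows. The data shapes are assumptions: the summary table is modelled as one dictionary of field values, and the detail table as a list of rows tagged with the business record id. `process_update_summary` is a hypothetical name.

```python
from typing import Dict, List, Set

Row = Dict[str, object]


def process_update_summary(
    record_id: str,
    state: str,              # "new", "updated" or "deleted"
    business_row: Row,
    item_fields: List[str],  # fields used for data extraction
    jump_fields: Set[str],   # fields of the jump type
    summary: Row,            # update-summary table (one row in this sketch)
    detail: List[Row],       # detail record table
) -> None:
    # b) Deleted or updated: remove this record's rows from the detail table.
    if state in ("deleted", "updated"):
        detail[:] = [d for d in detail if d["record_id"] != record_id]
        if state == "deleted":
            # The summary is deliberately left untouched, so it may now
            # hold values from the deleted record (dirty data).
            return
    if state not in ("new", "updated"):
        return
    # New or updated: extract the configured fields one by one.
    for field in item_fields:
        value = business_row.get(field)
        summary[field] = value
        # Jump-type fields additionally generate a detail record.
        if field in jump_fields:
            detail.append({"record_id": record_id, "field": field, "value": value})


summary_row: Row = {}
detail_rows: List[Row] = []
process_update_summary(
    "r1", "new", {"diagnosis": "D1", "doctor": "X"},
    ["diagnosis", "doctor"], {"diagnosis"}, summary_row, detail_rows,
)
```

After a later call with state "deleted" for `r1`, the detail rows are gone but `summary_row` still holds the old values, which is the dirty-data case noted above.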

Claims (4)

1. A metadata-based data extraction method, characterized in that a data extraction model is established on the basis of a common metadata model of the business models, and business data is extracted and formulated from the business models.
2. The metadata-based data extraction method according to claim 1, characterized in that the metadata model is extracted from the business models.
3. The metadata-based data extraction method according to claim 2, characterized in that the data extraction model comprises a model definition, data extraction item definitions, an update source definition for each data item, and classified data extraction logic processing; by associating data extraction items with metadata, and metadata with business data, a relationship among the three is established, so that business data is extracted through the metadata-based data extraction model.
4. The metadata-based data extraction method according to claim 3, characterized in that it comprises the data extraction model definition, the data extraction item definition, the data update source definition and the classified data extraction logic processing:
The data extraction model definition defines the framework of the data extraction model, classifying and aggregating the data to be extracted from different aspects, dimensions and viewpoints; each model definition includes the basic attributes of the model (internal code (ISN), name, description), the processing mode of the model, a jump flag indicating whether jump fields exist, and the name of the table that stores detail records;
The data extraction item definition includes the extraction attributes of each item: the processing mode, data type, length and precision of the field value corresponding to the item, and the processing type;
The data update source definition defines the data source of each data extraction item and determines when, and from which metadata, the data of each item is updated; it includes the source algorithm definition used for the calculation, the metadata identifier and the data source definition (DSD);
The classified data extraction logic processing comprises cumulative-class data extraction processing, basic-information-class data extraction processing and update-summary-class data extraction processing.
CN201410055786.7A 2014-02-19 2014-02-19 Data extracting method based on metadata Pending CN103942245A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410055786.7A CN103942245A (en) 2014-02-19 2014-02-19 Data extracting method based on metadata

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410055786.7A CN103942245A (en) 2014-02-19 2014-02-19 Data extracting method based on metadata

Publications (1)

Publication Number Publication Date
CN103942245A true CN103942245A (en) 2014-07-23

Family

ID=51189913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410055786.7A Pending CN103942245A (en) 2014-02-19 2014-02-19 Data extracting method based on metadata

Country Status (1)

Country Link
CN (1) CN103942245A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778236A (en) * 2015-04-02 2015-07-15 上海烟草集团有限责任公司 ETL (Extract-Transform-Load) realization method and system based on metadata
CN105989162A (en) * 2015-03-04 2016-10-05 银联商务有限公司 Online data extraction method and apparatus
CN106021294A (en) * 2016-04-30 2016-10-12 华南理工大学 Urban rail transit line net access data interface processing method
CN106921614A (en) * 2015-12-24 2017-07-04 北京国双科技有限公司 Business data processing method and device
CN108255953A (en) * 2017-12-20 2018-07-06 浪潮软件集团有限公司 Data processing method and processing device
CN108280147A (en) * 2018-01-02 2018-07-13 浪潮软件集团有限公司 Data management method and device
WO2019019621A1 (en) * 2017-07-25 2019-01-31 平安科技(深圳)有限公司 Service processing method, device, server and storage medium
CN110851559A (en) * 2019-10-14 2020-02-28 中科曙光南京研究院有限公司 Automatic data element identification method and identification system
CN111159191A (en) * 2019-12-30 2020-05-15 深圳博沃智慧科技有限公司 Data processing method, device and interface

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050216443A1 (en) * 2000-07-06 2005-09-29 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
CN101364240A (en) * 2008-10-14 2009-02-11 杭州华三通信技术有限公司 Metadata management method and device
US20110295794A1 (en) * 2010-05-28 2011-12-01 Oracle International Corporation System and method for supporting data warehouse metadata extension using an extender
CN102902750A (en) * 2012-09-20 2013-01-30 浪潮齐鲁软件产业有限公司 Universal data extraction and conversion method
CN102938731A (en) * 2012-11-22 2013-02-20 北京锐易特软件技术有限公司 Exchange and integration device and method based on proxy cache adaptation model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周茂伟等: "基于元数据的ETL工具设计与实现", 《科学技术与工程》 *


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140723