CN104699860A - Data processing and storage method for sharing-type master data - Google Patents

Publication number
CN104699860A
Authority
CN
China
Prior art keywords
data
master
system
sharing
master data
Prior art date
Application number
CN201510163449.4A
Other languages
Chinese (zh)
Inventor
朱焰冰
Original Assignee
成都卡莱博尔信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 成都卡莱博尔信息技术有限公司
Priority to CN201510163449.4A
Publication of CN104699860A

Abstract

The invention discloses a data processing and storage method for sharing-type master data. The method comprises the following steps: extraction, in which the master data of each information domain are extracted from the business systems or files; cleaning, in which the extracted master data are cleaned and dirty data are removed; mapping, in which a mapping relation table between the master data system and the business systems is established, together with the mapping relation between the master data of the business systems and the master data of the master data management system; conversion, in which the interface data of the business systems are converted so that their data types conform to the data standards of the master data management system; and loading, in which the extracted, cleaned and converted data are loaded into the master database model. The method achieves data sharing and avoids redundant and inconsistent data among the systems.

Description

A data processing and storage method for shared master data

Technical Field

[0002] The present invention relates to the technical field of data storage, and in particular to a data processing and storage method for shared master data.

Background

[0003] Enterprises today run specialized business systems of various sizes that were built at different times, and these systems are mostly deployed in a mesh structure. Each part of the organization uses its own system, and information cannot be shared between the systems, which creates "information silos". As enterprises' research and business activities change, the economy and society transform, and information technology develops, data is exchanged between databases more and more frequently. Business applications often require complex data exchange, especially between different systems and lines of business. Such exchange must work across platforms and across systems, and must also accommodate changes in business data structures and interaction among multiple services. When the data held by these systems is consolidated, it proves to be redundant and inconsistent across systems, so existing data storage approaches are clearly unsuitable.

Summary of the Invention

[0004] To solve the above technical problem, the present invention provides a data processing and storage method for shared master data.

[0005] The technical solution adopted by the present invention to solve the above problem is:

A data processing and storage method for shared master data, characterized in that it comprises:

A. Extraction: extract the master data of each information domain from the business systems or files;

B. Cleaning: clean the extracted master data and remove dirty data;

C. Mapping: establish a data mapping relation table between the master data system and each business system, and establish the mapping relation between the master data of the business systems and the master data of the master data management system;

D. Conversion: convert the interface data of each business system so that its data types conform to the data standards of the master data management system;

E. Loading: load the extracted, cleaned and converted data into the master database model.
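The five steps can be sketched as a small in-memory pipeline. This is only an illustration under stated assumptions, not the patented implementation: the record fields (`code`, `name`), the rule that records with a blank code or name are dirty, the matching of records by normalized name, and the `M0001`-style master identifiers are all hypothetical choices made for the example.

```python
Record = dict[str, str]


def extract(sources: dict[str, list[Record]]) -> list[Record]:
    """Step A: collect records from every business system, tagging each with its origin."""
    return [{**rec, "source": name} for name, recs in sources.items() for rec in recs]


def clean(records: list[Record]) -> tuple[list[Record], list[Record]]:
    """Step B: separate usable records from dirty ones (assumed rule: blank code or name)."""
    good, dirty = [], []
    for rec in records:
        usable = rec.get("code", "").strip() and rec.get("name", "").strip()
        (good if usable else dirty).append(rec)
    return good, dirty


def build_mapping(records: list[Record]) -> dict[tuple[str, str], str]:
    """Step C: map (source system, local code) to a master id (assumed: match on normalized name)."""
    master_ids: dict[str, str] = {}
    mapping: dict[tuple[str, str], str] = {}
    for rec in records:
        key = rec["name"].strip().lower()
        master_id = master_ids.setdefault(key, f"M{len(master_ids) + 1:04d}")
        mapping[(rec["source"], rec["code"])] = master_id
    return mapping


def convert(rec: Record, mapping: dict[tuple[str, str], str]) -> Record:
    """Step D: reshape a source record into the master layout."""
    return {"master_id": mapping[(rec["source"], rec["code"])], "name": rec["name"].strip()}


def load(records: list[Record], master: dict[str, str]) -> None:
    """Step E: write converted records into the master store; the first record seen wins."""
    for rec in records:
        master.setdefault(rec["master_id"], rec["name"])


def run_pipeline(sources: dict[str, list[Record]]):
    """Run steps A to E and return the master store, the mapping table and the dirty records."""
    good, dirty = clean(extract(sources))
    mapping = build_mapping(good)
    master: dict[str, str] = {}
    load([convert(rec, mapping) for rec in good], master)
    return master, mapping, dirty
```

In this sketch two systems that hold the same entity under different local codes end up pointing at one master identifier, which is the sharing effect the method aims for.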

[0006] Dirty data is data in a source system that falls outside the permitted range, is meaningless to the actual business, or has an illegal format, as well as non-standard codes and ambiguous business logic in the source system. Dirty data mainly results from loosely designed source systems and typically shows up as wrong data formats, inconsistent data, duplicated or erroneous data, unreasonable business logic, and violations of business rules. Examples include unverified identities, unverified dates, and unverified fields. The data processing and storage method of the present invention addresses the situation in which the information held by existing systems is mutually independent and cannot be shared or exchanged. Information is first extracted from each system, and dirty data is then removed. Mapping makes the data of the master data system correspond to the data of each business system. The interface data is then converted so that the data types are consistent, and finally the data is loaded into the master data model. This achieves data sharing and avoids redundant and inconsistent data among the systems.
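The kinds of dirty data listed above (out-of-range values, illegal formats, duplicates) can be expressed as simple checks. The sketch below is a hypothetical illustration: the field names `id`, `age` and `hired`, the age range 0 to 150, and the `YYYY-MM-DD` date format are assumptions, not rules stated in the patent.

```python
import re
from datetime import date

# Assumed date format for the example: YYYY-MM-DD.
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def find_problems(rec: dict, seen_ids: set) -> list[str]:
    """Return the reasons a record counts as dirty; an empty list means it is clean."""
    problems = []
    if not rec.get("id"):
        problems.append("missing id")
    elif rec["id"] in seen_ids:
        problems.append("duplicate id")
    age = rec.get("age")
    if not isinstance(age, int) or not 0 <= age <= 150:
        problems.append("age out of range")
    hired = rec.get("hired", "")
    if not DATE_RE.match(hired):
        problems.append("bad date format")
    else:
        try:
            date.fromisoformat(hired)  # rejects impossible dates such as 2015-02-30
        except ValueError:
            problems.append("invalid date")
    return problems


def split_dirty(records: list[dict]):
    """Split records into clean ones and (record, reasons) pairs for dirty ones."""
    seen_ids: set = set()
    clean, dirty = [], []
    for rec in records:
        problems = find_problems(rec, seen_ids)
        if problems:
            dirty.append((rec, problems))
        else:
            seen_ids.add(rec["id"])
            clean.append(rec)
    return clean, dirty
```

Returning the reasons alongside each rejected record keeps the cause of rejection available for later review.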

[0007] Preferably, to ensure data security, the method further comprises backup: backing up the application framework data and the master data in the master data management platform database.
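As a minimal sketch of the backup step, the example below copies a whole database with the `Connection.backup` method of Python's standard `sqlite3` module. The patent does not name a database product, so the use of SQLite and the table names `app_framework` and `master_data` are assumptions made purely for illustration.

```python
import sqlite3


def backup_platform_db(source: sqlite3.Connection, target: sqlite3.Connection) -> None:
    """Copy every table of the source database (framework data and master data) to the target."""
    source.commit()        # make sure pending writes are part of the copy
    source.backup(target)  # standard-library online backup (Python 3.7+)


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory database with hypothetical framework and master tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE app_framework (key TEXT PRIMARY KEY, value TEXT)")
    conn.execute("CREATE TABLE master_data (master_id TEXT PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO app_framework VALUES ('schema_version', '1')")
    conn.executemany(
        "INSERT INTO master_data VALUES (?, ?)",
        [("M0001", "Acme"), ("M0002", "Globex")],
    )
    return conn
```

In practice the target would be a file on separate storage rather than a second in-memory database.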

[0008] Further, the cleaning step also includes storing the removed dirty data so that it can be analyzed later.
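One way to keep removed records available for analysis is a separate quarantine table that stores the raw record together with the reason it was rejected. The sketch below assumes SQLite and a hypothetical `dirty_data` table; neither is specified in the patent.

```python
import json
import sqlite3


def store_dirty(conn: sqlite3.Connection, source_system: str, record: dict, reason: str) -> None:
    """Save a rejected record as JSON, with its origin and the reason for rejection."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dirty_data (source_system TEXT, payload TEXT, reason TEXT)"
    )
    conn.execute(
        "INSERT INTO dirty_data VALUES (?, ?, ?)",
        (source_system, json.dumps(record, ensure_ascii=False), reason),
    )


def dirty_counts_by_reason(conn: sqlite3.Connection) -> dict[str, int]:
    """Summarize the quarantine table: how many records were rejected for each reason."""
    return dict(conn.execute("SELECT reason, COUNT(*) FROM dirty_data GROUP BY reason"))
```

A per-reason count of this kind shows which source-system defects produce the most dirty data.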

[0009] Preferably, to facilitate data management, a timestamp is added when the data of the master data system is mapped to the data of each business system.
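A timestamped mapping relation table might look like the following sketch. The class and field names (`MappingTable`, `MappingEntry`, `mapped_at`) are hypothetical, and using UTC timestamps is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class MappingEntry:
    source_system: str
    source_code: str
    master_id: str
    mapped_at: datetime  # when this mapping was established


class MappingTable:
    """Maps (business system, local code) to a master id and records when each mapping was made."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str], MappingEntry] = {}

    def add(self, source_system: str, source_code: str, master_id: str,
            now: Optional[datetime] = None) -> MappingEntry:
        entry = MappingEntry(source_system, source_code, master_id,
                             now or datetime.now(timezone.utc))
        self._entries[(source_system, source_code)] = entry
        return entry

    def lookup(self, source_system: str, source_code: str) -> str:
        return self._entries[(source_system, source_code)].master_id

    def mapped_since(self, cutoff: datetime) -> list[MappingEntry]:
        """Return the mappings created at or after the cutoff, e.g. for an incremental sync."""
        return [e for e in self._entries.values() if e.mapped_at >= cutoff]
```

The timestamp makes it possible to ask which mappings changed since the last synchronization, which is one plausible reading of "facilitating data management".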

[0010] Preferably, to facilitate data management, a timestamp is added when the extracted, cleaned and converted data is loaded into the master database model.
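Stamping each row at load time can be sketched as follows. The example assumes SQLite, a hypothetical `master_data` table with a `loaded_at` column, and ISO 8601 UTC timestamps; none of these details come from the patent.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Iterable, Optional


def load_master(conn: sqlite3.Connection, rows: Iterable[tuple[str, str]],
                now: Optional[datetime] = None) -> int:
    """Insert or replace (master_id, name) rows, stamping each with the load time."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    rows = list(rows)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS master_data ("
        "master_id TEXT PRIMARY KEY, name TEXT NOT NULL, loaded_at TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO master_data (master_id, name, loaded_at) VALUES (?, ?, ?)",
        [(master_id, name, stamp) for master_id, name in rows],
    )
    conn.commit()
    return len(rows)
```

Because a reloaded row receives a new `loaded_at` value, the column shows when each master record was last refreshed.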

[0011] In summary, the beneficial effects of the present invention are:

The method of the present invention extracts information from each system and then removes dirty data. Mapping makes the data of the master data system correspond to the data of each business system. The interface data is then converted so that the data types are consistent, and finally the data is loaded into the master data model. This achieves data sharing and avoids redundant and inconsistent data among the systems.

Detailed Description

[0012] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

[0013] Embodiment 1:

A data processing and storage method for shared master data, characterized in that it comprises:

A. Extraction: extract the master data of each information domain from the business systems or files;

B. Cleaning: clean the extracted master data and remove dirty data;

C. Mapping: establish a data mapping relation table between the master data system and each business system, and establish the mapping relation between the master data of the business systems and the master data of the master data management system;

D. Conversion: convert the interface data of each business system so that its data types conform to the data standards of the master data management system;

E. Loading: load the extracted, cleaned and converted data into the master database model.

[0014] The data processing and storage method of the present invention addresses the situation in which the information held by existing systems is mutually independent and cannot be shared or exchanged. Information is first extracted from each system, and dirty data is then removed. Mapping makes the data of the master data system correspond to the data of each business system. The interface data is then converted so that the data types are consistent, and finally the data is loaded into the master data model. This achieves data sharing and avoids redundant and inconsistent data among the systems.
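The conversion step, which brings interface data into line with the master data standard, can be illustrated with a small type-coercion table. The standard shown here (`customer_id` as text, `credit_limit` as a decimal, `registered` as a date) and the accepted date formats are hypothetical; the patent does not define a concrete data standard.

```python
from datetime import date, datetime
from decimal import Decimal

# Hypothetical master data standard: field name -> required type.
MASTER_STANDARD = {"customer_id": str, "credit_limit": Decimal, "registered": date}

# Date layouts assumed to occur in the source systems.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")


def to_date(value) -> date:
    """Coerce a date, datetime or date string into a plain date."""
    if isinstance(value, datetime):  # check datetime first: it is a subclass of date
        return value.date()
    if isinstance(value, date):
        return value
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(str(value), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")


CONVERTERS = {
    str: lambda v: str(v).strip(),
    Decimal: lambda v: Decimal(str(v)),  # go through str to avoid float artifacts
    date: to_date,
}


def convert_record(raw: dict) -> dict:
    """Return a copy of the record whose fields have the types required by the standard."""
    return {field: CONVERTERS[target](raw[field]) for field, target in MASTER_STANDARD.items()}
```

Records whose values cannot be coerced raise an error in this sketch; a fuller pipeline would more likely route them to the dirty-data store.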

[0015] Embodiment 2:

To improve data security, this embodiment refines the above embodiment by further comprising backup: backing up the application framework data and the master data in the master data management platform database.

[0016] The cleaning step further includes storing the removed dirty data.

[0017] Embodiment 3:

To facilitate data management, this embodiment refines the above embodiments: a timestamp is added when the data of the master data system is mapped to the data of each business system.

[0018] A timestamp is also added when the extracted, cleaned and converted data is loaded into the master database model.

[0019] As described above, the present invention can be well implemented.

Claims (5)

1. A data processing and storage method for shared master data, characterized in that it comprises: extraction, extracting the master data of each information domain from the business systems or files; cleaning, cleaning the extracted master data and removing dirty data; mapping, establishing a data mapping relation table between the master data system and each business system, and establishing the mapping relation between the master data of the business systems and the master data of the master data management system; conversion, converting the interface data of each business system so that its data types conform to the data standards of the master data management system; and loading, loading the extracted, cleaned and converted data into the master database model.
2. The data processing and storage method for shared master data according to claim 1, characterized in that it further comprises backup: backing up the application framework data and the master data in the master data management platform database.
3. The data processing and storage method for shared master data according to claim 1 or 2, characterized in that the cleaning step further includes storing the removed dirty data.
4. The data processing and storage method for shared master data according to claim 1, characterized in that a timestamp is added when the data of the master data system is mapped to the data of each business system.
5. The data processing and storage method for shared master data according to claim 1, characterized in that a timestamp is added when the extracted, cleaned and converted data is loaded into the master database model.
CN201510163449.4A 2015-04-09 2015-04-09 Data processing and storage method for sharing-type master data CN104699860A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510163449.4A CN104699860A (en) 2015-04-09 2015-04-09 Data processing and storage method for sharing-type master data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510163449.4A CN104699860A (en) 2015-04-09 2015-04-09 Data processing and storage method for sharing-type master data

Publications (1)

Publication Number Publication Date
CN104699860A true CN104699860A (en) 2015-06-10

Family

ID=53346980

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510163449.4A CN104699860A (en) 2015-04-09 2015-04-09 Data processing and storage method for sharing-type master data

Country Status (1)

Country Link
CN (1) CN104699860A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104993958A (en) * 2015-06-29 2015-10-21 北京京东尚科信息技术有限公司 Method and system for generating user master data
CN106126629A (en) * 2016-06-22 2016-11-16 武汉斗鱼网络科技有限公司 Master data management method and system based on live broadcast industry
CN108052645A (en) * 2017-12-26 2018-05-18 重庆信联达软件有限公司 Enterprise internal data standardization management method
CN108121809A (en) * 2017-12-26 2018-06-05 重庆信联达软件有限公司 Method for achieving enterprise internal-data standardization
CN108197192A (en) * 2017-12-26 2018-06-22 重庆信联达软件有限公司 Main-data system used for realizing enterprise internal-data standardization

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070198593A1 (en) * 2005-11-28 2007-08-23 Anand Prahlad Systems and methods for classifying and transferring information in a storage network
US20140059024A1 (en) * 2012-08-27 2014-02-27 Ss8 Networks, Inc. System and method of storage, recovery, and management of data intercepted on a communication network
CN103853843A (en) * 2014-03-20 2014-06-11 浪潮集团山东通用软件有限公司 Method for realizing data concentration across security domains based on main data mapping

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
WD01