CN110263052A - Automatic synchronization technology innovation method based on big data Hadoop platform ODS - Google Patents

Info

Publication number
CN110263052A
Authority
CN
China
Prior art keywords
ods
synchronous
task
big data
change
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910552169.0A
Other languages
Chinese (zh)
Other versions
CN110263052B (en)
Inventor
王德敏
张程
史梦丽
裴宝山
祁洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanyin Faba Consumer Finance Co.,Ltd.
Original Assignee
Suning Consumption Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suning Consumption Finance Co Ltd
Priority to CN201910552169.0A
Publication of CN110263052A
Application granted
Publication of CN110263052B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an automated synchronization method for an ODS (operational data store) on a big data Hadoop platform, comprising the following steps: (1) the ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request; (2) based on the request, the system decides whether to perform table structure synchronization, and whether to perform it immediately or defer it; (3) the ODS system performs the automated synchronization. The invention automates adding ODS data sources, modifying them, and synchronizing their data, which avoids data problems caused by manual operating errors. Automation makes the ODS development process more efficient and development operations more standardized and easier to manage and maintain, and it maps version changes directly onto their technical implementation.

Description

Automatic synchronization technology innovation method based on big data Hadoop platform ODS
Technical field
The invention relates to the field of automated ODS synchronization, and in particular to an automated synchronization method for an ODS on a big data Hadoop platform.
Background
As the company's business grows, the onboarding of multiple partner channels, the expansion of diversified business models, and functional optimization of the core systems cause source data to change frequently, for example through new table structures, modifications, and system switchovers. As a downstream system, the big data warehouse must ensure that business data is synchronized completely and brought online so that data analysis for business departments can be supported quickly, which is a major challenge. Traditional ODS maintenance costs a great deal of work and labor: data sources are synchronized manually through many tedious steps, and a mistake in a single step can cause a data incident, so the stability and reliability of the underlying data cannot be guaranteed. A safe, reliable, and efficient ODS synchronization method is therefore urgently needed.
Summary of the invention
The technical problem to be solved by the invention is to provide an automated synchronization method for an ODS on a big data Hadoop platform.
To solve this technical problem, the technical solution of the invention is an automated synchronization method for an ODS on a big data Hadoop platform, comprising the following steps:
(1) the ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request;
(2) based on the request, the system decides whether to perform table structure synchronization, and whether to perform it immediately or defer it;
(3) the ODS system performs the automated synchronization.
Further, automatically detecting structure changes in source system tables and issuing a table structure synchronization request in step (1) mainly comprises the following steps:
1. On the monthly release day, IT developers report the tables affected by changes to the system through a web page and label the source system tables; the labels are: regular table, intermediate table, temporary table, backup table;
2. Inside the ODS system, a daily scheduled job automatically compares the table structures of the ETL system and the source system, automatically captures tables that differ, and automatically requests table structure synchronization;
3. Business staff, developers, or ETL staff who find, based on business needs, that a table structure is out of sync issue a synchronization request to the ODS system through a web page operation.
Further, the structure changes of source system tables detected automatically in step (1) include newly added tables, deleted tables, field length changes, and field type modifications.
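The daily comparison and the four change types named above can be illustrated with a minimal sketch. It is not the patented implementation: the in-memory snapshot format, the `Change` record, and the `diff_schemas` function are hypothetical names introduced here, and a real system would read both schemas from the source database and the Hive metastore.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical snapshot format: {table_name: {column_name: (type, length)}}
Schema = Dict[str, Dict[str, Tuple[str, Optional[int]]]]


@dataclass(frozen=True)
class Change:
    kind: str  # "table_added" | "table_dropped" | "type_changed" | "length_changed"
    table: str
    column: Optional[str] = None


def diff_schemas(source: Schema, ods: Schema) -> List[Change]:
    """Return the structure changes needed to bring `ods` in line with `source`."""
    changes: List[Change] = []
    for table in sorted(source.keys() - ods.keys()):
        changes.append(Change("table_added", table))
    for table in sorted(ods.keys() - source.keys()):
        changes.append(Change("table_dropped", table))
    for table in sorted(source.keys() & ods.keys()):
        # Only columns present on both sides are compared, because the text
        # lists length and type changes but not added or removed columns.
        for column in sorted(source[table].keys() & ods[table].keys()):
            src_type, src_len = source[table][column]
            ods_type, ods_len = ods[table][column]
            if src_type != ods_type:
                changes.append(Change("type_changed", table, column))
            elif src_len != ods_len:
                changes.append(Change("length_changed", table, column))
    return changes
```

Each returned `Change` would correspond to one table structure synchronization request in step (1).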
Further, the method in step (2) for deciding whether to perform table structure synchronization immediately or defer it is as follows: the ODS system engine automatically selects a suitable time to apply the change according to a preset table structure change policy, the change policy being:
1. The synchronization mechanism is controlled by table label rules, e.g., regular tables are synchronized normally, intermediate tables and temporary tables are not synchronized, and backup tables are synchronized with a delay, the delay being configured in the back end;
2. The synchronization mechanism is controlled by a table whitelist/blacklist policy, e.g., whitelisted tables are synchronized normally, blacklisted tables are not synchronized, and graylisted tables are synchronized with a delay, the delay being configured in the back end;
3. The synchronization mechanism is controlled by setting the relative priority of ETL scheduling and ODS synchronization: if, before ODS synchronization, a corresponding task instance is found to have been generated or a task is running, table structure synchronization is deferred, a dependency on the task instance is established, and table structure synchronization is performed after the task instance finishes;
4. If a corresponding task instance is found to be generated during ODS synchronization, the task instance is suspended; once the ODS synchronization step finishes, the task instance is triggered again and continues executing;
5. If no corresponding task instance is found to be generated before or after the ODS synchronization process, ODS synchronization executes normally.
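Policy items 1 and 2 above amount to a lookup from a table's label or list membership to one of three outcomes. The sketch below shows one way to express that; the names `Action`, `LABEL_RULES`, `LIST_RULES`, and `decide` are hypothetical, and letting list membership override the label is an assumption, since the text does not say how the two rules combine.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    SYNC_NOW = "sync_now"
    SYNC_DELAYED = "sync_delayed"  # the delay itself is configured in the back end
    SKIP = "skip"


# Policy item 1: rules keyed by table label.
LABEL_RULES = {
    "regular": Action.SYNC_NOW,
    "intermediate": Action.SKIP,
    "temporary": Action.SKIP,
    "backup": Action.SYNC_DELAYED,
}

# Policy item 2: rules keyed by list membership.
LIST_RULES = {
    "white": Action.SYNC_NOW,
    "black": Action.SKIP,
    "gray": Action.SYNC_DELAYED,
}


def decide(label: str, list_name: Optional[str] = None) -> Action:
    """Pick a synchronization action for one table.

    Assumption: when the table is on a list, the list rule takes precedence
    over the label rule.
    """
    if list_name is not None:
        return LIST_RULES[list_name]
    return LABEL_RULES[label]
```

For example, `decide("backup")` yields `Action.SYNC_DELAYED`, while `decide("regular", "black")` yields `Action.SKIP` under the precedence assumed here.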
Further, performing the automated synchronization in step (3) comprises the following steps:
1. The ODS system onboards a new table;
2. The ODS system performs table structure synchronization.
Further, the specific steps by which the ODS system onboards a new table in sub-step 1 include:
A. Configure and create the Hive table, and register it with the metadata management platform;
B. Create the ETL task and configure its HQL script, and create or modify the task flow;
C. Create events, and configure the task dependencies within the task flow and the event dependencies between task flows.
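Step A above, creating the Hive table, could be automated by generating the DDL from the source table's columns. The following is a minimal sketch under stated assumptions: the type mapping `HIVE_TYPES`, the `dt` partition column, the ORC storage format, and the function name `hive_create_table` are all illustrative choices, not details given in the patent.

```python
from typing import List, Tuple

# Hypothetical mapping from source column types to Hive types;
# anything unmapped falls back to STRING.
HIVE_TYPES = {
    "int": "INT",
    "bigint": "BIGINT",
    "varchar": "STRING",
    "char": "STRING",
    "text": "STRING",
    "date": "DATE",
    "datetime": "TIMESTAMP",
}


def hive_create_table(
    database: str,
    table: str,
    columns: List[Tuple[str, str]],
    partition_column: str = "dt",
) -> str:
    """Build a CREATE TABLE statement for a new ODS table."""
    cols = ",\n".join(
        f"  `{name}` {HIVE_TYPES.get(src_type.lower(), 'STRING')}"
        for name, src_type in columns
    )
    return (
        f"CREATE TABLE IF NOT EXISTS {database}.{table} (\n{cols}\n)\n"
        f"PARTITIONED BY (`{partition_column}` STRING)\n"
        "STORED AS ORC"
    )
```

The generated statement would then be executed against Hive, and the new table registered with the metadata management platform, by whatever client the deployment uses.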
Further, the specific steps when the system performs table structure synchronization in sub-step 2 include:
A. Connect to the production system and automatically synchronize the ODS table structure;
B. Modify the HQL script of the ETL task;
C. Synchronize the data.
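For step A above, field type and field length changes can be applied to an existing Hive table with `ALTER TABLE ... CHANGE COLUMN`. The sketch below only generates the statements; the function name `hive_alter_statements` and its input format are hypothetical, and how the patented system actually applies changes is not described in the text.

```python
from typing import Dict, List


def hive_alter_statements(
    database: str, table: str, new_types: Dict[str, str]
) -> List[str]:
    """Build one ALTER TABLE statement per column whose type or length changed.

    `new_types` maps a column name to its new Hive type, e.g. "VARCHAR(200)".
    The column keeps its name, so the old and new names are identical.
    """
    return [
        f"ALTER TABLE {database}.{table} "
        f"CHANGE COLUMN `{column}` `{column}` {new_types[column]}"
        for column in sorted(new_types)
    ]
```

In Hive, changing a column this way updates table metadata rather than rewriting stored data, which is consistent with the text listing data synchronization as a separate step C.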
Compared with the prior art, the beneficial effects of the invention are as follows:
The automated synchronization method of the invention for an ODS on a big data Hadoop platform automates adding ODS data sources, modifying them, and synchronizing their data, which avoids data problems caused by manual operating errors. Automation makes the ODS development process more efficient and development operations more standardized and easier to manage and maintain, and it maps version changes directly onto their technical implementation.
Brief description of the drawings
To explain the technical solutions in the embodiments of the invention more clearly, the drawings needed for the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the steps of the automated synchronization method of the invention for an ODS on a big data Hadoop platform.
Fig. 2 is a flow chart of the ODS system of the invention onboarding a new table.
Fig. 3 is a flow chart of the ODS system of the invention performing table structure synchronization.
Detailed description of the embodiments
The technical solution of the invention is described clearly and completely below through specific embodiments.
The invention provides an automated synchronization method for an ODS on a big data Hadoop platform which, as shown in Fig. 1, comprises the following steps:
(1) The ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request. The detected structure changes include newly added tables, deleted tables, field length changes, and field type modifications. Detecting the changes and issuing the request mainly comprises the following steps:
1. On the monthly release day, IT developers report the tables affected by changes to the system through a web page and label the source system tables; the labels are: regular table, intermediate table, temporary table, backup table;
2. Inside the ODS system, a daily scheduled job automatically compares the table structures of the ETL system and the source system, automatically captures tables that differ, and automatically requests table structure synchronization;
3. Business staff, developers, or ETL staff who find, based on business needs, that a table structure is out of sync issue a synchronization request to the ODS system through a web page operation.
(2) Based on the request, the system decides whether to perform table structure synchronization, and whether to perform it immediately or defer it. The decision is made as follows: the ODS system engine automatically selects a suitable time to apply the change according to a preset table structure change policy, the change policy being:
1. The synchronization mechanism is controlled by table label rules, e.g., regular tables are synchronized normally, intermediate tables and temporary tables are not synchronized, and backup tables are synchronized with a delay, the delay being configured in the back end;
2. The synchronization mechanism is controlled by a table whitelist/blacklist policy, e.g., whitelisted tables are synchronized normally, blacklisted tables are not synchronized, and graylisted tables are synchronized with a delay, the delay being configured in the back end;
3. The synchronization mechanism is controlled by setting the relative priority of ETL scheduling and ODS synchronization: if, before ODS synchronization, a corresponding task instance is found to have been generated or a task is running, table structure synchronization is deferred, a dependency on the task instance is established, and table structure synchronization is performed after the task instance finishes;
4. If a corresponding task instance is found to be generated during ODS synchronization, the task instance is suspended; once the ODS synchronization step finishes, the task instance is triggered again and continues executing;
5. If no corresponding task instance is found to be generated before or after the ODS synchronization process, ODS synchronization executes normally.
(3) The ODS system performs the automated synchronization, which comprises the following steps:
1. The ODS system onboards a new table, as shown in Fig. 2; the specific steps include:
A. Configure and create the Hive table, and register it with the metadata management platform;
B. Create the ETL task and configure its HQL script, and create or modify the task flow;
C. Create events, and configure the task dependencies within the task flow and the event dependencies between task flows.
2. The ODS system performs table structure synchronization, as shown in Fig. 3; the specific steps include:
A. Connect to the production system and automatically synchronize the ODS table structure;
B. Modify the HQL script of the ETL task;
C. Synchronize the data.
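The coordination between ETL task instances and table structure synchronization in change-policy items 3 to 5 of step (2) can be sketched as three small operations. This is an illustrative model only: the `TaskInstance` record, its state names, and the three functions are hypothetical, and a real scheduler would persist instance state and establish the dependency inside its own engine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskInstance:
    table: str
    state: str  # "pending" | "running" | "suspended" | "done"


def blocking_instances(table: str, instances: List[TaskInstance]) -> List[TaskInstance]:
    """Item 3: instances the sync must wait for before it starts.

    An empty result corresponds to item 5: the sync runs normally.
    """
    return [
        t for t in instances
        if t.table == table and t.state in ("pending", "running")
    ]


def suspend_new_instance(instance: TaskInstance) -> None:
    """Item 4, first half: an instance created during the sync is suspended."""
    instance.state = "suspended"


def resume_after_sync(table: str, instances: List[TaskInstance]) -> List[TaskInstance]:
    """Item 4, second half: re-trigger suspended instances once the sync ends."""
    resumed = []
    for t in instances:
        if t.table == table and t.state == "suspended":
            t.state = "pending"
            resumed.append(t)
    return resumed
```

A caller would defer the synchronization while `blocking_instances` is non-empty, run it once the list is empty, and call `resume_after_sync` when the synchronization step completes.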
The embodiments described above merely describe preferred embodiments of the invention and do not limit its concept or scope. Without departing from the design concept of the invention, all variations and modifications made to its technical solution by those of ordinary skill in the art fall within the protection scope of the invention; the technical content for which protection is claimed is fully set out in the claims.

Claims (7)

1. An automated synchronization method for an ODS on a big data Hadoop platform, characterized in that it comprises the following steps:
(1) the ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request;
(2) based on the request, the system decides whether to perform table structure synchronization, and whether to perform it immediately or defer it;
(3) the ODS system performs the automated synchronization.
2. The automated synchronization method for an ODS on a big data Hadoop platform according to claim 1, characterized in that automatically detecting structure changes in source system tables and issuing a table structure synchronization request in step (1) mainly comprises the following steps:
1. On the monthly release day, IT developers report the tables affected by changes to the system through a web page and label the source system tables; the labels are: regular table, intermediate table, temporary table, backup table;
2. Inside the ODS system, a daily scheduled job automatically compares the table structures of the ETL system and the source system, automatically captures tables that differ, and automatically requests table structure synchronization;
3. Business staff, developers, or ETL staff who find, based on business needs, that a table structure is out of sync issue a synchronization request to the ODS system through a web page operation.
3. The automated synchronization method for an ODS on a big data Hadoop platform according to claim 1, characterized in that the structure changes of source system tables detected automatically in step (1) include newly added tables, deleted tables, field length changes, and field type modifications.
4. The automated synchronization method for an ODS on a big data Hadoop platform according to claim 1, characterized in that the method in step (2) for deciding whether to perform table structure synchronization immediately or defer it is: the ODS system engine automatically selects a suitable time to apply the change according to a preset table structure change policy, the change policy being:
1. The synchronization mechanism is controlled by table label rules, e.g., regular tables are synchronized normally, intermediate tables and temporary tables are not synchronized, and backup tables are synchronized with a delay, the delay being configured in the back end;
2. The synchronization mechanism is controlled by a table whitelist/blacklist policy, e.g., whitelisted tables are synchronized normally, blacklisted tables are not synchronized, and graylisted tables are synchronized with a delay, the delay being configured in the back end;
3. The synchronization mechanism is controlled by setting the relative priority of ETL scheduling and ODS synchronization: if, before ODS synchronization, a corresponding task instance is found to have been generated or a task is running, table structure synchronization is deferred, a dependency on the task instance is established, and table structure synchronization is performed after the task instance finishes;
4. If a corresponding task instance is found to be generated during ODS synchronization, the task instance is suspended; once the ODS synchronization step finishes, the task instance is triggered again and continues executing;
5. If no corresponding task instance is found to be generated before or after the ODS synchronization process, ODS synchronization executes normally.
5. The automated synchronization method for an ODS on a big data Hadoop platform according to claim 1, characterized in that performing the automated synchronization in step (3) comprises the following steps:
1. The ODS system onboards a new table;
2. The ODS system performs table structure synchronization.
6. The automated synchronization method for an ODS on a big data Hadoop platform according to claim 5, characterized in that the specific steps by which the ODS system onboards a new table in sub-step 1 include:
A. Configure and create the Hive table, and register it with the metadata management platform;
B. Create the ETL task and configure its HQL script, and create or modify the task flow;
C. Create events, and configure the task dependencies within the task flow and the event dependencies between task flows.
7. The automated synchronization method for an ODS on a big data Hadoop platform according to claim 5, characterized in that the specific steps when the system performs table structure synchronization in sub-step 2 include:
A. Connect to the production system and automatically synchronize the ODS table structure;
B. Modify the HQL script of the ETL task;
C. Synchronize the data.
CN201910552169.0A 2019-06-25 2019-06-25 Automatic synchronization technology innovation method based on big data Hadoop platform ODS Active CN110263052B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910552169.0A CN110263052B (en) 2019-06-25 2019-06-25 Automatic synchronization technology innovation method based on big data Hadoop platform ODS

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910552169.0A CN110263052B (en) 2019-06-25 2019-06-25 Automatic synchronization technology innovation method based on big data Hadoop platform ODS

Publications (2)

Publication Number Publication Date
CN110263052A (en) 2019-09-20
CN110263052B (en) 2021-07-20

Family

ID=67921073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910552169.0A Active CN110263052B (en) 2019-06-25 2019-06-25 Automatic synchronization technology innovation method based on big data Hadoop platform ODS

Country Status (1)

Country Link
CN (1) CN110263052B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110040727A1 (en) * 2009-08-11 2011-02-17 At&T Intellectual Property I, L.P. Minimizing staleness in real-time data warehouses
CN104937582A (en) * 2013-02-27 2015-09-23 惠普发展公司,有限责任合伙企业 Data synchronization
CN103699580A (en) * 2013-12-03 2014-04-02 中铁程科技有限责任公司 Database synchronization method and database synchronization device
US20160335305A1 (en) * 2015-05-14 2016-11-17 Walleye Software, LLC Computer data system data source refreshing using an update propagation graph
US20190037019A1 (en) * 2015-05-27 2019-01-31 University Of Utah Research Foundation Agent for healthcare data application delivery
CN106599061A (en) * 2016-11-16 2017-04-26 成都九洲电子信息系统股份有限公司 SQLite-based embedded database synchronization method
CN108470228A (en) * 2017-02-22 2018-08-31 国网能源研究院 Financial data auditing method and audit system
CN109101622A (en) * 2018-08-10 2018-12-28 北京奇虎科技有限公司 Method of data synchronization, calculates equipment and computer storage medium at device
CN109271444A (en) * 2018-08-10 2019-01-25 武汉达梦数据库有限公司 A kind of table level bi-directional synchronization method and system based on trigger
CN109189764A (en) * 2018-09-20 2019-01-11 北京桃花岛信息技术有限公司 A kind of colleges and universities' data warehouse layered design method based on Hive
CN109885581A (en) * 2019-03-14 2019-06-14 苏州达家迎信息技术有限公司 Synchronous method, device, equipment and the storage medium of database

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
LI XIN-YU et al.: "Research of data synchronization for P2P-based collaborative design systems", 《COMPUTER ENGINEERING》 *
QU, WEIPING et al.: "On-Demand Snapshot Maintenance in Data Warehouses Using Incremental ETL Pipeline", 《TRANSACTIONS ON LARGE-SCALE DATA- AND KNOWLEDGE-CENTERED SYSTEMS XXXII》 *
WENG, NIANLONG et al.: "Scalable Parallel Join for Huge Tables", 《2013 IEEE INTERNATIONAL CONGRESS ON BIG DATA》 *
江城: "Design and Implementation of an RFID-Based Vehicle Access Control System", 《China Master's Theses Full-text Database (Electronic Journal)》 *
罗朝宇 et al.: "Analysis of the Construction of the Inner Mongolia Electric Power Data Center", 《Inner Mongolia Electric Power》 *

Also Published As

Publication number Publication date
CN110263052B (en) 2021-07-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: No.88, Huaihai Road, Qinhuai District, Nanjing, Jiangsu 210001

Patentee after: Nanyin Faba Consumer Finance Co.,Ltd.

Address before: No.88, Huaihai Road, Qinhuai District, Nanjing, Jiangsu 210001

Patentee before: SUNING CONSUMER FINANCE Co.,Ltd.
