CN110263052B - Automatic synchronization technology innovation method based on big data Hadoop platform ODS

Publication number
CN110263052B
Authority
CN
China
Prior art keywords
ods
synchronization
table structure
task
automatically
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910552169.0A
Other languages
Chinese (zh)
Other versions
CN110263052A (en)
Inventor
王德敏
张程
史梦丽
裴宝山
祁洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanyin Faba Consumer Finance Co.,Ltd.
Original Assignee
Suning Consumer Finance Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suning Consumer Finance Co ltd
Priority to CN201910552169.0A
Publication of CN110263052A
Application granted
Publication of CN110263052B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an automated synchronization method for an ODS (operational data store) based on a big data Hadoop platform. The method comprises the following steps: (1) the ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request; (2) based on whether a table structure synchronization request has been issued, the system decides whether to perform table structure synchronization, and whether to perform it immediately or defer it; (3) the ODS system carries out the automated synchronization. The invention automates the addition and modification of ODS data sources as well as data synchronization, and avoids data problems caused by manual errors. Automation makes the ODS development process more efficient and development operations more standardized, simplifies management and maintenance, and maps version changes directly to their technical implementation.

Description

Automatic synchronization technology innovation method based on big data Hadoop platform ODS
Technical Field
The invention relates to the technical field of automated ODS (operational data store) synchronization, and in particular to an automated ODS synchronization method based on a big data Hadoop platform.
Background
As the company's business grows, the onboarding of various cooperation channels, the expansion of diversified business models, and the functional optimization of the core system cause frequent changes to source data, such as newly added tables, table structure modifications, and system switchovers. As a downstream system, the big data warehouse must ensure that complete business data is brought online in step with these changes so that it can quickly support data analysis by business departments, which is a considerable challenge. The traditional ODS maintenance approach carries a heavy workload and high labor cost: data sources are synchronized manually through many tedious steps, a mistaken modification in any one step can cause a data incident, and the stability and reliability of the underlying data cannot be guaranteed. Against this background, a safe, reliable, and efficient ODS synchronization method is urgently needed.
Disclosure of Invention
The invention aims to provide an automated ODS synchronization method based on a big data Hadoop platform.
To solve the above technical problem, the technical solution of the invention is an automated ODS synchronization method based on a big data Hadoop platform, comprising the following steps:
(1) the ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request;
(2) based on whether a table structure synchronization request has been issued, the system decides whether to perform table structure synchronization, and whether to perform it immediately or defer it;
(3) the ODS system carries out automated synchronization.
Further, automatically detecting structure changes in source system tables and issuing the table structure synchronization request in step (1) mainly covers the following cases:
① on the monthly release day, IT development staff report the tables affected by the release to the system through a web page and tag each source system table; the tag categories are: regular table, intermediate table, temporary table, and backup table;
② every day, the ODS system automatically compares the table structures of the ETL system with those of the source system, automatically captures the tables whose structures differ, and automatically requests table structure synchronization;
③ if business or development staff, or ETL staff, find from business requirements that table structures are out of sync, they submit a synchronization request to the ODS system through the web page.
Further, the structure changes of source system tables that are automatically detected in step (1) include added tables, deleted tables, field length changes, and field type modifications.
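By way of non-limiting illustration, the daily comparison and the four change kinds listed above can be sketched as follows. This is a minimal sketch under stated assumptions: the `Column` model, the `Schema` mapping, the `diff_schemas` function, and the change-kind labels are hypothetical names introduced here and are not part of the disclosed system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column model: a data type plus a declared length."""
    dtype: str
    length: int


# Hypothetical snapshot format: table name -> {column name -> Column}.
Schema = dict[str, dict[str, Column]]


def diff_schemas(source: Schema, ods: Schema) -> list[tuple[str, str, str]]:
    """Return (change_kind, table, column) tuples for the four change kinds."""
    changes = []
    # Tables present only in the source system are newly added tables.
    for table in sorted(source.keys() - ods.keys()):
        changes.append(("added_table", table, ""))
    # Tables present only on the ODS side were deleted from the source system.
    for table in sorted(ods.keys() - source.keys()):
        changes.append(("deleted_table", table, ""))
    # For tables on both sides, compare the columns they have in common.
    for table in sorted(source.keys() & ods.keys()):
        for name in sorted(source[table]):
            ods_col = ods[table].get(name)
            if ods_col is None:
                continue  # added columns are outside the four listed kinds
            src_col = source[table][name]
            if src_col.dtype != ods_col.dtype:
                changes.append(("field_type_modified", table, name))
            elif src_col.length != ods_col.length:
                changes.append(("field_length_changed", table, name))
    return changes
```

Each returned tuple could then be turned into one table structure synchronization request; how requests are queued is not specified in the text and is left open here.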
Further, in step (2), whether to perform table structure synchronization immediately or defer it is determined as follows: the ODS system engine automatically selects a suitable time to apply the change according to preset table structure change policies, which are as follows:
① the synchronization mechanism is controlled by rules on table tags: for example, regular tables are synchronized normally, intermediate tables and temporary tables are not synchronized, and backup tables are synchronized after a delay, with the delay time configured in the back end;
② the synchronization mechanism is controlled by whitelist and blacklist policies on tables: whitelisted tables are synchronized normally, blacklisted tables are not synchronized, and graylisted tables are synchronized after a delay, with the delay time configured in the back end;
③ the synchronization mechanism is controlled by setting the priority of ETL scheduling relative to ODS synchronization: when, before ODS synchronization, a corresponding task instance is found to have been generated or a task is found to be executing, table structure synchronization is deferred, a dependency on the task instance is established, and table structure synchronization is performed after the task instance finishes;
④ when a corresponding task instance is found to be generated during ODS synchronization, the task instance is suspended and is triggered to resume after ODS synchronization finishes;
⑤ if no corresponding task instance is generated before or after the ODS synchronization process, ODS synchronization is performed normally.
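By way of non-limiting illustration, the tag rules and list rules of policies ① and ② can be sketched as simple lookup tables. This is a sketch under stated assumptions: the `Action` enum, the rule dictionaries, and the `decide` function are hypothetical, and the text does not say which rule wins when a tag and a list entry disagree, so letting the list entry take precedence is an assumption made only for this example.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    SYNC_NOW = "sync_now"  # synchronize normally
    SKIP = "skip"          # do not synchronize
    DELAY = "delay"        # synchronize after a configured delay


# Policy 1: table tag -> action.
TAG_RULES = {
    "regular": Action.SYNC_NOW,
    "intermediate": Action.SKIP,
    "temporary": Action.SKIP,
    "backup": Action.DELAY,
}

# Policy 2: list membership -> action.
LIST_RULES = {
    "white": Action.SYNC_NOW,
    "black": Action.SKIP,
    "gray": Action.DELAY,
}


def decide(tag: str, list_name: Optional[str] = None) -> Action:
    """Pick an action for one table.

    Assumption: an explicit list entry overrides the tag rule.
    """
    if list_name is not None:
        return LIST_RULES[list_name]
    return TAG_RULES[tag]
```

For example, `decide("backup")` yields `Action.DELAY`, while `decide("regular", "black")` yields `Action.SKIP` under the assumed precedence. The delay duration itself would come from back-end configuration and is not modeled here.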
Further, the automated synchronization in step (3) comprises the following:
① the ODS system performs new table access;
② the ODS system performs table structure synchronization.
Further, the specific steps by which the ODS system performs new table access in step ① include:
A. creating and configuring a new Hive table and adding it to the metadata management platform;
B. creating an ETL task, configuring the data extraction HQL, and creating or modifying the task flow;
C. creating a new event, and configuring the task dependencies within the task flow and the event dependencies between task flows.
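By way of non-limiting illustration, steps A and B could generate their Hive statements from a column list as sketched below. This is a sketch under stated assumptions: the function names, the `dt` load-date partition, the ORC storage format, and the staging table are choices made for the example only; the text does not specify partitioning, file format, or where the extracted data lands.

```python
def build_create_table(db: str, table: str, columns: list[tuple[str, str]]) -> str:
    """Render a Hive CREATE TABLE statement for a new ODS table.

    Assumptions: one string partition column `dt` and ORC storage.
    """
    cols = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {db}.{table} (\n  {cols}\n)\n"
        "PARTITIONED BY (dt STRING)\n"
        "STORED AS ORC"
    )


def build_extract_hql(db: str, table: str, staging: str,
                      columns: list[tuple[str, str]]) -> str:
    """Render the data extraction HQL that loads one partition.

    `staging` is a hypothetical table already holding the extracted rows;
    `${hiveconf:dt}` is resolved by Hive variable substitution at run time.
    """
    names = ", ".join(f"`{name}`" for name, _ in columns)
    return (
        f"INSERT OVERWRITE TABLE {db}.{table} PARTITION (dt='${{hiveconf:dt}}')\n"
        f"SELECT {names} FROM {staging}"
    )
```

Generating both statements from the same column list keeps the table definition and the extraction script consistent, which is the kind of manual mismatch the automated approach is meant to avoid.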
Further, the specific steps by which the ODS system performs table structure synchronization in step ② include:
A. accessing the production system and automatically synchronizing the ODS table structure;
B. modifying the HQL data extraction script of the ETL task;
C. synchronizing the data.
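By way of non-limiting illustration, step A could derive Hive `ALTER TABLE` statements from the difference between the current ODS columns and the source columns. This is a sketch under stated assumptions: `build_alter_statements` is a hypothetical helper, source types are assumed to be already mapped to Hive types, and handling of newly added columns is included as an assumption because the text names only field length changes and field type modifications.

```python
def build_alter_statements(table: str,
                           ods_cols: dict[str, str],
                           source_cols: dict[str, str]) -> list[str]:
    """Return Hive ALTER TABLE statements that bring the ODS table in line.

    Both arguments map column name -> Hive type string, for example
    {"name": "VARCHAR(64)"}. A length change shows up as a different type
    string, so it is handled the same way as a type modification.
    """
    statements = []
    for name, dtype in source_cols.items():
        if name not in ods_cols:
            # Assumption: new source columns are appended to the ODS table.
            statements.append(
                f"ALTER TABLE {table} ADD COLUMNS (`{name}` {dtype})"
            )
        elif ods_cols[name] != dtype:
            # Keep the column name, change only its type or length.
            statements.append(
                f"ALTER TABLE {table} CHANGE COLUMN `{name}` `{name}` {dtype}"
            )
    return statements
```

Whether Hive accepts a given type change depends on the types involved and on the table's storage format, so a real deployment would need to validate each statement before running it.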
Compared with the prior art, the invention has the following beneficial effects:
The invention discloses an automated ODS (operational data store) synchronization method based on a big data Hadoop platform, which automates the addition and modification of ODS data sources as well as data synchronization, and avoids data problems caused by manual errors. Automation makes the ODS development process more efficient and development operations more standardized, simplifies management and maintenance, and maps version changes directly to their technical implementation.
Drawings
To illustrate the technical solutions in the embodiments of the invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described here show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of the steps of the automated ODS synchronization method based on a big data Hadoop platform.
FIG. 2 is a flowchart of the ODS system performing new table access according to the invention.
FIG. 3 is a flowchart of the ODS system performing table structure synchronization according to the invention.
Detailed Description
The technical solution of the invention is described clearly and completely in the following detailed description.
The invention provides an automated ODS synchronization method based on a big data Hadoop platform which, as shown in FIG. 1, comprises the following steps:
(1) The ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request; the automatically detected structure changes include added tables, deleted tables, field length changes, and field type modifications. Detecting the changes and issuing the request mainly covers the following cases:
① on the monthly release day, IT development staff report the tables affected by the release to the system through a web page and tag each source system table; the tag categories are: regular table, intermediate table, temporary table, and backup table;
② every day, the ODS system automatically compares the table structures of the ETL system with those of the source system, automatically captures the tables whose structures differ, and automatically requests table structure synchronization;
③ if business or development staff, or ETL staff, find from business requirements that table structures are out of sync, they submit a synchronization request to the ODS system through the web page.
(2) Based on whether a table structure synchronization request has been issued, the system decides whether to perform table structure synchronization, and whether to perform it immediately or defer it, as follows: the ODS system engine automatically selects a suitable time to apply the change according to preset table structure change policies, which are as follows:
① the synchronization mechanism is controlled by rules on table tags: for example, regular tables are synchronized normally, intermediate tables and temporary tables are not synchronized, and backup tables are synchronized after a delay, with the delay time configured in the back end;
② the synchronization mechanism is controlled by whitelist and blacklist policies on tables: whitelisted tables are synchronized normally, blacklisted tables are not synchronized, and graylisted tables are synchronized after a delay, with the delay time configured in the back end;
③ the synchronization mechanism is controlled by setting the priority of ETL scheduling relative to ODS synchronization: when, before ODS synchronization, a corresponding task instance is found to have been generated or a task is found to be executing, table structure synchronization is deferred, a dependency on the task instance is established, and table structure synchronization is performed after the task instance finishes;
④ when a corresponding task instance is found to be generated during ODS synchronization, the task instance is suspended and is triggered to resume after ODS synchronization finishes;
⑤ if no corresponding task instance is generated before or after the ODS synchronization process, ODS synchronization is performed normally.
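By way of non-limiting illustration, the coordination between ETL task instances and ODS synchronization in policies ③ to ⑤ can be sketched as a function that returns the ordered actions for one table. This is a sketch under stated assumptions: the `InstanceTiming` enum, the `plan_sync` function, and the action labels are hypothetical, and how the scheduler actually suspends or resumes a task instance is not described in the text.

```python
from enum import Enum
from typing import Optional


class InstanceTiming(Enum):
    """When a conflicting ETL task instance was observed (hypothetical model)."""
    BEFORE_SYNC = "before_sync"  # generated or already running before sync starts
    DURING_SYNC = "during_sync"  # generated while sync is in progress


def plan_sync(timing: Optional[InstanceTiming]) -> list[str]:
    """Return the ordered actions for one table structure synchronization.

    `None` means no corresponding task instance was observed.
    """
    if timing is InstanceTiming.BEFORE_SYNC:
        # Policy 3: defer the sync and make it depend on the task instance.
        return ["wait_for_task_instance", "sync_table_structure"]
    if timing is InstanceTiming.DURING_SYNC:
        # Policy 4: pause the task, let the sync finish, then resume the task.
        return [
            "suspend_task_instance",
            "finish_table_structure_sync",
            "resume_task_instance",
        ]
    # Policy 5: nothing conflicts, so the sync runs normally.
    return ["sync_table_structure"]
```

In each case the extraction task and the structure change never run at the same time on the same table, which is the purpose of ordering them.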
(3) The ODS system carries out automated synchronization, which comprises the following:
① the ODS system performs new table access, as shown in FIG. 2, with the following specific steps:
A. creating and configuring a new Hive table and adding it to the metadata management platform;
B. creating an ETL task, configuring the data extraction HQL, and creating or modifying the task flow;
C. creating a new event, and configuring the task dependencies within the task flow and the event dependencies between task flows.
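By way of non-limiting illustration, the task dependencies configured in step C determine a valid execution order, which can be computed with a topological sort. This is a sketch under stated assumptions: the task names and the `execution_order` helper are hypothetical, and Python's standard `graphlib` module stands in for whatever scheduler the ETL system actually uses.

```python
from graphlib import TopologicalSorter

# Hypothetical task flow: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "extract_loan": {"create_hive_table"},
    "load_ods_loan": {"extract_loan"},
    "publish_event": {"load_ods_loan"},
}


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the tasks in an order that respects every dependency.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

Calling `execution_order(DEPENDENCIES)` places `create_hive_table` first and `publish_event` last, since each task in this chain depends on the one before it. Event dependencies between task flows could be modeled the same way, with flows as nodes.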
② the ODS system performs table structure synchronization, as shown in FIG. 3, with the following specific steps:
A. accessing the production system and automatically synchronizing the ODS table structure;
B. modifying the HQL data extraction script of the ETL task;
C. synchronizing the data.
The above embodiments merely describe preferred embodiments of the invention and do not limit its concept or scope. Any modifications and improvements that those skilled in the art make to the technical solutions of the invention without departing from its design concept fall within the protection scope of the invention. The technical content for which protection is claimed is set out in full in the claims.

Claims (6)

1. An automated ODS synchronization method based on a big data Hadoop platform, characterized by comprising the following steps:
(1) the ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request;
(2) based on whether a table structure synchronization request has been issued, the system decides whether to perform table structure synchronization, and whether to perform it immediately or defer it;
(3) the ODS system carries out automated synchronization;
wherein, in step (2), whether to perform table structure synchronization immediately or defer it is determined as follows: the ODS system engine automatically selects a suitable time to apply the change according to preset table structure change policies, which are as follows:
① the synchronization mechanism is controlled by rules on table tags: for example, regular tables are synchronized normally, intermediate tables and temporary tables are not synchronized, and backup tables are synchronized after a delay, with the delay time configured in the back end;
② the synchronization mechanism is controlled by whitelist and blacklist policies on tables: whitelisted tables are synchronized normally, blacklisted tables are not synchronized, and graylisted tables are synchronized after a delay, with the delay time configured in the back end;
③ the synchronization mechanism is controlled by setting the priority of ETL scheduling relative to ODS synchronization: when, before ODS synchronization, a corresponding task instance is found to have been generated or a task is found to be executing, table structure synchronization is deferred, a dependency on the task instance is established, and table structure synchronization is performed after the task instance finishes;
④ when a corresponding task instance is found to be generated during ODS synchronization, the task instance is suspended and is triggered to resume after ODS synchronization finishes;
⑤ if no corresponding task instance is generated before or after the ODS synchronization process, ODS synchronization is performed normally.
2. The automated ODS synchronization method based on a big data Hadoop platform according to claim 1, characterized in that automatically detecting structure changes in source system tables and issuing the table structure synchronization request in step (1) mainly covers the following cases:
① on the monthly release day, IT development staff report the tables affected by the release to the system through a web page and tag each source system table; the tag categories are: regular table, intermediate table, temporary table, and backup table;
② every day, the ODS system automatically compares the table structures of the ETL system with those of the source system, automatically captures the tables whose structures differ, and automatically requests table structure synchronization;
③ if business or development staff, or ETL staff, find from business requirements that table structures are out of sync, they submit a synchronization request to the ODS system through the web page.
3. The automated ODS synchronization method based on a big data Hadoop platform according to claim 1, characterized in that the structure changes of source system tables that are automatically detected in step (1) include added tables, deleted tables, field length changes, and field type modifications.
4. The automated ODS synchronization method based on a big data Hadoop platform according to claim 1, characterized in that carrying out automated synchronization in step (3) comprises the following:
① the ODS system performs new table access;
② the ODS system performs table structure synchronization.
5. The automated ODS synchronization method based on a big data Hadoop platform according to claim 4, characterized in that the specific steps by which the ODS system performs new table access in step ① include:
A. creating and configuring a new Hive table and adding it to the metadata management platform;
B. creating an ETL task, configuring the data extraction HQL, and creating or modifying the task flow;
C. creating a new event, and configuring the task dependencies within the task flow and the event dependencies between task flows.
6. The automated ODS synchronization method based on a big data Hadoop platform according to claim 4, characterized in that the specific steps by which the ODS system performs table structure synchronization in step ② include:
A. accessing the production system and automatically synchronizing the ODS table structure;
B. modifying the HQL data extraction script of the ETL task;
C. synchronizing the data.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910552169.0A CN110263052B (en) 2019-06-25 2019-06-25 Automatic synchronization technology innovation method based on big data Hadoop platform ODS

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910552169.0A CN110263052B (en) 2019-06-25 2019-06-25 Automatic synchronization technology innovation method based on big data Hadoop platform ODS

Publications (2)

Publication Number Publication Date
CN110263052A (en) 2019-09-20
CN110263052B (en) 2021-07-20

Family

ID=67921073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910552169.0A Active CN110263052B (en) 2019-06-25 2019-06-25 Automatic synchronization technology innovation method based on big data Hadoop platform ODS

Country Status (1)

Country Link
CN (1) CN110263052B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103699580A (en) * 2013-12-03 2014-04-02 中铁程科技有限责任公司 Database synchronization method and database synchronization device
CN106599061A (en) * 2016-11-16 2017-04-26 成都九洲电子信息系统股份有限公司 SQLite-based embedded database synchronization method
CN108470228A (en) * 2017-02-22 2018-08-31 国网能源研究院 Financial data auditing method and audit system
CN109189764A (en) * 2018-09-20 2019-01-11 北京桃花岛信息技术有限公司 A kind of colleges and universities' data warehouse layered design method based on Hive
CN109885581A (en) * 2019-03-14 2019-06-14 苏州达家迎信息技术有限公司 Synchronous method, device, equipment and the storage medium of database

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8856071B2 (en) * 2009-08-11 2014-10-07 At&T Intellectual Property I, L.P. Minimizing staleness in real-time data warehouses
BR112015018368B1 (en) * 2013-02-27 2022-08-02 Hewlett-Packard Development Company, L.P. METHOD, SYSTEM AND COMPUTER-READABLE MEDIUM FOR SYNCHRONIZING DATA
WO2016183550A1 (en) * 2015-05-14 2016-11-17 Walleye Software, LLC Dynamic table index mapping
US20160350482A1 (en) * 2015-05-27 2016-12-01 University Of Utah Research Foundation Agent for healthcare data application delivery
CN109271444A (en) * 2018-08-10 2019-01-25 武汉达梦数据库有限公司 A kind of table level bi-directional synchronization method and system based on trigger
CN109101622A (en) * 2018-08-10 2018-12-28 北京奇虎科技有限公司 Method of data synchronization, calculates equipment and computer storage medium at device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
On-Demand Snapshot Maintenance in Data Warehouses Using Incremental ETL Pipeline;Qu,Weiping等;《TRANSACTIONS ON LARGE-SCALE DATA- AND KNOWLEDGE-CENTERED SYSTEMS XXXII》;20171231;全文 *
Scalable Parallel Join for Huge Tables;Weng,Nianlong等;《2013 IEEE INTERNATIONAL CONGRESS ON BIG DATA》;20131231;全文 *
内蒙古电力数据中心的建设分析;罗朝宇等;《内蒙古电力技术》;20130630(第03期);全文 *

Also Published As

Publication number Publication date
CN110263052A (en) 2019-09-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: No.88, Huaihai Road, Qinhuai District, Nanjing, Jiangsu 210001

Patentee after: Nanyin Faba Consumer Finance Co.,Ltd.

Address before: No.88, Huaihai Road, Qinhuai District, Nanjing, Jiangsu 210001

Patentee before: SUNING CONSUMER FINANCE Co.,Ltd.