CN110263052B - Automatic synchronization technology innovation method based on big data Hadoop platform ODS - Google Patents
Automatic synchronization technology innovation method based on big data Hadoop platform ODS
- Publication number
- CN110263052B (application number CN201910552169.0A)
- Authority
- CN
- China
- Prior art keywords
- ods
- synchronization
- table structure
- task
- automatically
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Computer Security & Cryptography (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a method for automatically synchronizing an ODS (operational data store) on a big data Hadoop platform, comprising the following steps: (1) the ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request; (2) based on whether the system has requested table structure synchronization, it is decided whether to synchronize the table structure, and whether to do so immediately or after a delay; (3) the ODS system carries out the automated synchronization. The invention automates the addition and modification of ODS data sources and the synchronization of their data, and avoids data problems caused by manual error. Automation makes the ODS development process more efficient, standardizes development operations, simplifies management and maintenance, and maps version changes directly to their technical implementation.
Description
Technical Field
The invention relates to the technical field of automatic ODS (operational data store) synchronization, and in particular to a method for automatically synchronizing an ODS on a big data Hadoop platform.
Background
As the company's business grows, the onboarding of cooperation channels, the expansion of diversified business models, and functional optimization of the core system cause frequent changes to source data, such as new or modified table structures and system switchovers. As a downstream system, the big data warehouse must ensure that complete business data is brought online in step with these changes so that it can quickly support the data analysis of business departments, which is a significant challenge. Traditional ODS maintenance consumes a large amount of effort and labor cost: data sources are synchronized manually through many tedious steps, a mistaken change in any single step can cause a data incident, and the stability and reliability of the underlying data cannot be guaranteed. Against this background, a safe, reliable, and efficient ODS synchronization method is urgently needed.
Disclosure of Invention
The invention aims to provide a method for automatically synchronizing an ODS (operational data store) on a big data Hadoop platform.
To solve the above technical problem, the technical solution of the invention is an automatic ODS synchronization method based on a big data Hadoop platform, comprising the following steps:
(1) the ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request;
(2) based on whether the system has requested table structure synchronization, it is decided whether to synchronize the table structure, and whether to do so immediately or after a delay;
(3) the ODS system carries out the automated synchronization.
Further, automatically detecting structure changes in source system tables and issuing a table structure synchronization request in step (1) mainly includes the following:
(i) on the monthly release day (version day), IT research and development personnel report the tables affected by changes to the system through a web page and tag each source system table as one of: conventional table, intermediate table, temporary table, or backup table;
(ii) every day, the ODS system automatically compares the table structures of the ETL system and the source system, captures the tables whose structures differ, and automatically requests table structure synchronization;
(iii) if business or development personnel, or ETL personnel, find on the basis of business requirements that table structures are out of sync, they send a synchronization request to the ODS system through a web page.
Further, the source system table structure changes detected automatically in step (1) include added tables, deleted tables, field length changes, and field type modifications.
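The three request sources and four change types described above can be modeled as a small data structure. The following Python sketch is illustrative only: the class names, the `merge_requests` helper, and the idea of collapsing duplicate requests for the same table and change are assumptions, not part of the patent text.

```python
from dataclasses import dataclass
from enum import Enum


class RequestSource(Enum):
    VERSION_DAY_REPORT = "version_day_report"  # reported by IT R&D on the monthly version day
    DAILY_COMPARISON = "daily_comparison"      # raised by the automatic daily comparison
    MANUAL_PAGE = "manual_page"                # raised by business, development, or ETL staff


class ChangeType(Enum):
    TABLE_ADDED = "table_added"
    TABLE_DELETED = "table_deleted"
    FIELD_LENGTH_CHANGED = "field_length_changed"
    FIELD_TYPE_MODIFIED = "field_type_modified"


@dataclass(frozen=True)
class SyncRequest:
    table: str
    change: ChangeType
    source: RequestSource


def merge_requests(requests):
    """Keep one request per (table, change) pair, preserving first-seen order.

    Deduplication is an assumption of this sketch; the patent does not say how
    overlapping requests from different sources are handled.
    """
    seen = {}
    for req in requests:
        seen.setdefault((req.table, req.change), req)
    return list(seen.values())


# Hypothetical table names, for illustration only.
requests = [
    SyncRequest("loan_order", ChangeType.FIELD_LENGTH_CHANGED, RequestSource.VERSION_DAY_REPORT),
    SyncRequest("loan_order", ChangeType.FIELD_LENGTH_CHANGED, RequestSource.DAILY_COMPARISON),
    SyncRequest("channel_info", ChangeType.TABLE_ADDED, RequestSource.MANUAL_PAGE),
]
merged = merge_requests(requests)
```

Recording the source alongside each request keeps an audit trail of whether a change was reported in advance or only caught by the daily comparison.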
Further, the method for determining in step (2) whether to synchronize the table structure immediately or after a delay is as follows: the ODS system engine automatically selects a suitable time to apply the change according to a preset table structure change policy, the change policy being:
(i) synchronization is controlled by table tag rules: for example, conventional tables are synchronized normally, intermediate tables and temporary tables are not synchronized, and backup tables are synchronized after a delay, with the delay time configured in the background;
(ii) synchronization is controlled by whitelist and blacklist policies for tables: whitelisted tables are synchronized normally, blacklisted tables are not synchronized, and graylisted tables are synchronized after a delay, with the delay time configured in the background;
(iii) synchronization is controlled by setting the relative priority of ETL scheduling and ODS synchronization: when a corresponding task instance is found to have been generated, or a task is found to be executing, before ODS synchronization, table structure synchronization is deferred, a dependency on the task instance is established, and the table structure is synchronized after the task instance finishes;
(iv) when a corresponding task instance is found to be generated during ODS synchronization, the task instance is suspended and is triggered to resume after ODS synchronization finishes;
(v) if no corresponding task instance is generated before or after the ODS synchronization process, ODS synchronization executes normally.
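The last three rules of the change policy (a task instance that exists before synchronization, one that appears during synchronization, and no instance at all) amount to a small decision function. The Python sketch below is a minimal illustration under that reading; `Action`, `coordinate`, and the two boolean inputs are hypothetical names, and a real scheduler would obtain this state from its own task-instance records.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"            # no related task instance: synchronize normally
    DEFER_SYNC = "defer_sync"      # instance exists before sync: sync waits for it to finish
    SUSPEND_TASK = "suspend_task"  # instance appears during sync: task waits for sync to finish


def coordinate(sync_running: bool, task_instance_exists: bool) -> Action:
    """Decide how ODS synchronization and an ETL task instance on the same table interact."""
    if not task_instance_exists:
        return Action.PROCEED
    if sync_running:
        # The instance was generated while synchronization was already in progress.
        return Action.SUSPEND_TASK
    # The instance was generated, or is executing, before synchronization started.
    return Action.DEFER_SYNC
```

In both conflict cases the outcome is a one-way dependency, so the table structure is never altered while a task that reads or writes the table is running.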
Further, the automated synchronization in step (3) comprises the following:
(i) the ODS system performs access to a new table;
(ii) the ODS system performs table structure synchronization.
Further, the specific steps by which the ODS system performs access to a new table in step (i) include:
A. configuring the newly created Hive table and adding it to the metadata management platform;
B. creating an ETL task, configuring the data-extraction HQL, and creating or modifying a task flow;
C. creating an event, and configuring the task dependencies within the task flow and the event dependencies between task flows.
Further, the specific steps by which the system performs table structure synchronization in step (ii) include:
A. accessing the production system and automatically synchronizing the ODS table structure;
B. modifying the data-extraction HQL script of the ETL task;
C. synchronizing the data.
Compared with the prior art, the invention has the following beneficial effects:
The invention discloses a method for automatically synchronizing an ODS (operational data store) on a big data Hadoop platform. It automates the addition and modification of ODS data sources and the synchronization of their data, and avoids data problems caused by manual error. Automation makes the ODS development process more efficient, standardizes development operations, simplifies management and maintenance, and maps version changes directly to their technical implementation.
Drawings
To illustrate the technical solutions in the embodiments of the invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described here show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of the steps of the automatic ODS synchronization method based on a big data Hadoop platform.
FIG. 2 is a flowchart of the ODS system of the invention performing access to a new table.
FIG. 3 is a flowchart of the ODS system of the invention performing table structure synchronization.
Detailed Description
The technical solution of the invention is described clearly and completely in the following detailed description.
The invention provides a method for automatically synchronizing an ODS (operational data store) on a big data Hadoop platform which, as shown in FIG. 1, comprises the following steps:
(1) The ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request. The structure changes detected automatically include added tables, deleted tables, field length changes, and field type modifications. Detecting the changes and issuing the request mainly includes the following:
(i) on the monthly release day (version day), IT research and development personnel report the tables affected by changes to the system through a web page and tag each source system table as one of: conventional table, intermediate table, temporary table, or backup table;
(ii) every day, the ODS system automatically compares the table structures of the ETL system and the source system, captures the tables whose structures differ, and automatically requests table structure synchronization;
(iii) if business or development personnel, or ETL personnel, find on the basis of business requirements that table structures are out of sync, they send a synchronization request to the ODS system through a web page.
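The daily automatic comparison can be pictured as a diff between two schema snapshots. The Python sketch below is only an illustration: it assumes each side has already been read into a mapping of the form `{table: {column: (type, length)}}`, which the patent does not specify, and the table and column names are hypothetical. It reports exactly the four change types named above.

```python
def diff_schemas(source, ods):
    """Compare two {table: {column: (type, length)}} mappings and list the differences.

    Returns (table, column, change) tuples; column is None for table-level changes.
    """
    changes = []
    for table in sorted(source.keys() - ods.keys()):
        changes.append((table, None, "table_added"))
    for table in sorted(ods.keys() - source.keys()):
        changes.append((table, None, "table_deleted"))
    for table in sorted(source.keys() & ods.keys()):
        src_cols, ods_cols = source[table], ods[table]
        for col in sorted(src_cols.keys() & ods_cols.keys()):
            src_type, src_len = src_cols[col]
            ods_type, ods_len = ods_cols[col]
            if src_type != ods_type:
                changes.append((table, col, "field_type_modified"))
            elif src_len != ods_len:
                changes.append((table, col, "field_length_changed"))
    return changes


# Hypothetical snapshots of the source system and the ODS side.
source = {
    "loan_order": {"id": ("bigint", None), "remark": ("varchar", 200), "amount": ("decimal", 18)},
    "channel_info": {"id": ("bigint", None)},
}
ods = {
    "loan_order": {"id": ("bigint", None), "remark": ("varchar", 100), "amount": ("double", None)},
    "old_log": {"id": ("bigint", None)},
}
changes = diff_schemas(source, ods)
```

Each tuple in `changes` would then become one table structure synchronization request.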
(2) Based on whether the system has requested table structure synchronization, it is decided whether to synchronize the table structure, and whether to do so immediately or after a delay. The method is as follows: the ODS system engine automatically selects a suitable time to apply the change according to a preset table structure change policy, the change policy being:
(i) synchronization is controlled by table tag rules: for example, conventional tables are synchronized normally, intermediate tables and temporary tables are not synchronized, and backup tables are synchronized after a delay, with the delay time configured in the background;
(ii) synchronization is controlled by whitelist and blacklist policies for tables: whitelisted tables are synchronized normally, blacklisted tables are not synchronized, and graylisted tables are synchronized after a delay, with the delay time configured in the background;
(iii) synchronization is controlled by setting the relative priority of ETL scheduling and ODS synchronization: when a corresponding task instance is found to have been generated, or a task is found to be executing, before ODS synchronization, table structure synchronization is deferred, a dependency on the task instance is established, and the table structure is synchronized after the task instance finishes;
(iv) when a corresponding task instance is found to be generated during ODS synchronization, the task instance is suspended and is triggered to resume after ODS synchronization finishes;
(v) if no corresponding task instance is generated before or after the ODS synchronization process, ODS synchronization executes normally.
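The table-tag rule and the whitelist, blacklist, and graylist rule both map a table to one of three outcomes: synchronize, skip, or delay. A minimal Python sketch of that mapping follows. The names `Decision`, `TAG_RULES`, `LIST_RULES`, and `decide` are hypothetical, and letting an explicit list entry override the tag is an assumption of this sketch, since the patent does not state how the two rules are combined.

```python
from enum import Enum


class Decision(Enum):
    SYNC = "sync"    # synchronize normally
    SKIP = "skip"    # do not synchronize
    DELAY = "delay"  # synchronize after the delay configured in the background


# Tag rule: outcome per table tag, following the example given in the policy.
TAG_RULES = {
    "conventional": Decision.SYNC,
    "intermediate": Decision.SKIP,
    "temporary": Decision.SKIP,
    "backup": Decision.DELAY,
}

# List rule: outcome per list membership.
LIST_RULES = {
    "white": Decision.SYNC,
    "black": Decision.SKIP,
    "gray": Decision.DELAY,
}


def decide(tag, list_name=None):
    """Return the decision for a table.

    Assumption: an explicit list entry takes precedence over the tag.
    """
    if list_name is not None:
        return LIST_RULES[list_name]
    return TAG_RULES[tag]
```

For instance, `decide("backup")` yields a delayed synchronization, while `decide("conventional", "black")` skips the table under the precedence assumed here.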
(3) The ODS system carries out the automated synchronization, which includes the following:
(i) the ODS system performs access to a new table, as shown in FIG. 2, with the following specific steps:
A. configuring the newly created Hive table and adding it to the metadata management platform;
B. creating an ETL task, configuring the data-extraction HQL, and creating or modifying a task flow;
C. creating an event, and configuring the task dependencies within the task flow and the event dependencies between task flows.
(ii) the ODS system performs table structure synchronization, as shown in FIG. 3, with the following specific steps:
A. accessing the production system and automatically synchronizing the ODS table structure;
B. modifying the data-extraction HQL script of the ETL task;
C. synchronizing the data.
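Steps A and B of new-table access can be illustrated by generating the two HQL statements involved. The Python sketch below is an assumption-laden example, not the patented implementation: the `ods` and `stg` database names, the `dt` partition column, the ORC storage format, and the `${biz_date}` scheduler placeholder are all hypothetical choices.

```python
def build_create_table(table, columns, partition_col="dt"):
    """Build Hive DDL for a new ODS table from (name, hive_type) column pairs."""
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS ods.{table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({partition_col} STRING)\n"
        "STORED AS ORC"
    )


def build_extract_hql(table, columns, staging_db="stg", partition_col="dt"):
    """Build the data-extraction HQL that loads one partition of the ODS table."""
    cols = ", ".join(name for name, _ in columns)
    return (
        f"INSERT OVERWRITE TABLE ods.{table} "
        f"PARTITION ({partition_col}='${{biz_date}}')\n"
        f"SELECT {cols} FROM {staging_db}.{table}"
    )


# Hypothetical table definition, for illustration only.
columns = [("id", "BIGINT"), ("name", "STRING")]
ddl = build_create_table("channel_info", columns)
hql = build_extract_hql("channel_info", columns)
```

Generating both statements from the same column list keeps the table definition and its extraction script consistent, which is the kind of manual mismatch the method aims to avoid.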
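For field length changes and field type modifications, synchronizing the ODS table structure could come down to issuing Hive `ALTER TABLE ... CHANGE COLUMN` statements. The Python sketch below shows one way to generate them; the function name, the `ods` database default, and the example table are hypothetical, and Hive may reject some type changes depending on its configuration.

```python
def build_alter_statements(table, changes, database="ods"):
    """Build one ALTER TABLE statement per changed column.

    changes: list of (column, new_hive_type) pairs for columns whose type
    or length differs from the source system.
    """
    return [
        f"ALTER TABLE {database}.{table} CHANGE COLUMN {col} {col} {new_type}"
        for col, new_type in changes
    ]


# Hypothetical changes detected for one table.
statements = build_alter_statements(
    "loan_order",
    [("remark", "VARCHAR(200)"), ("amount", "DECIMAL(18,2)")],
)
```

The column name is repeated because `CHANGE COLUMN` takes both the old and the new name; here only the type changes, so the two are identical.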
The embodiments above merely describe preferred embodiments of the invention and do not limit its concept or scope. Modifications and improvements that those skilled in the art make to the technical solutions of the invention without departing from its design concept fall within the protection scope of the invention. The technical content for which protection is claimed is fully set out in the claims.
Claims (6)
1. An automatic ODS synchronization method based on a big data Hadoop platform, characterized by comprising the following steps:
(1) the ODS system automatically detects structure changes in source system tables and issues a table structure synchronization request;
(2) based on whether the system has requested table structure synchronization, it is decided whether to synchronize the table structure, and whether to do so immediately or after a delay;
(3) the ODS system carries out the automated synchronization;
wherein the method for determining in step (2) whether to synchronize the table structure immediately or after a delay is as follows: the ODS system engine automatically selects a suitable time to apply the change according to a preset table structure change policy, the change policy being:
(i) synchronization is controlled by table tag rules: for example, conventional tables are synchronized normally, intermediate tables and temporary tables are not synchronized, and backup tables are synchronized after a delay, with the delay time configured in the background;
(ii) synchronization is controlled by whitelist and blacklist policies for tables: whitelisted tables are synchronized normally, blacklisted tables are not synchronized, and graylisted tables are synchronized after a delay, with the delay time configured in the background;
(iii) synchronization is controlled by setting the relative priority of ETL scheduling and ODS synchronization: when a corresponding task instance is found to have been generated, or a task is found to be executing, before ODS synchronization, table structure synchronization is deferred, a dependency on the task instance is established, and the table structure is synchronized after the task instance finishes;
(iv) when a corresponding task instance is found to be generated during ODS synchronization, the task instance is suspended and is triggered to resume after ODS synchronization finishes;
(v) if no corresponding task instance is generated before or after the ODS synchronization process, ODS synchronization executes normally.
2. The automatic ODS synchronization method based on a big data Hadoop platform according to claim 1, characterized in that automatically detecting structure changes in source system tables and issuing a table structure synchronization request in step (1) mainly includes the following:
(i) on the monthly release day (version day), IT research and development personnel report the tables affected by changes to the system through a web page and tag each source system table as one of: conventional table, intermediate table, temporary table, or backup table;
(ii) every day, the ODS system automatically compares the table structures of the ETL system and the source system, captures the tables whose structures differ, and automatically requests table structure synchronization;
(iii) if business or development personnel, or ETL personnel, find on the basis of business requirements that table structures are out of sync, they send a synchronization request to the ODS system through a web page.
3. The automatic ODS synchronization method based on a big data Hadoop platform according to claim 1, characterized in that the source system table structure changes detected automatically in step (1) include added tables, deleted tables, field length changes, and field type modifications.
4. The automatic ODS synchronization method based on a big data Hadoop platform according to claim 1, characterized in that carrying out the automated synchronization in step (3) comprises the following:
(i) the ODS system performs access to a new table;
(ii) the ODS system performs table structure synchronization.
5. The automatic ODS synchronization method based on a big data Hadoop platform according to claim 4, characterized in that the specific steps by which the ODS system performs access to a new table in step (i) include:
A. configuring the newly created Hive table and adding it to the metadata management platform;
B. creating an ETL task, configuring the data-extraction HQL, and creating or modifying a task flow;
C. creating an event, and configuring the task dependencies within the task flow and the event dependencies between task flows.
6. The automatic ODS synchronization method based on a big data Hadoop platform according to claim 4, characterized in that the specific steps by which the system performs table structure synchronization in step (ii) include:
A. accessing the production system and automatically synchronizing the ODS table structure;
B. modifying the data-extraction HQL script of the ETL task;
C. synchronizing the data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910552169.0A CN110263052B (en) | 2019-06-25 | 2019-06-25 | Automatic synchronization technology innovation method based on big data Hadoop platform ODS |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910552169.0A CN110263052B (en) | 2019-06-25 | 2019-06-25 | Automatic synchronization technology innovation method based on big data Hadoop platform ODS |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110263052A CN110263052A (en) | 2019-09-20 |
CN110263052B true CN110263052B (en) | 2021-07-20 |
Family
ID=67921073
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910552169.0A Active CN110263052B (en) | 2019-06-25 | 2019-06-25 | Automatic synchronization technology innovation method based on big data Hadoop platform ODS |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110263052B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103699580A (en) * | 2013-12-03 | 2014-04-02 | 中铁程科技有限责任公司 | Database synchronization method and database synchronization device |
CN106599061A (en) * | 2016-11-16 | 2017-04-26 | 成都九洲电子信息系统股份有限公司 | SQLite-based embedded database synchronization method |
CN108470228A (en) * | 2017-02-22 | 2018-08-31 | 国网能源研究院 | Financial data auditing method and audit system |
CN109189764A (en) * | 2018-09-20 | 2019-01-11 | 北京桃花岛信息技术有限公司 | A kind of colleges and universities' data warehouse layered design method based on Hive |
CN109885581A (en) * | 2019-03-14 | 2019-06-14 | 苏州达家迎信息技术有限公司 | Synchronous method, device, equipment and the storage medium of database |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8856071B2 (en) * | 2009-08-11 | 2014-10-07 | At&T Intellectual Property I, L.P. | Minimizing staleness in real-time data warehouses |
BR112015018368B1 (en) * | 2013-02-27 | 2022-08-02 | Hewlett-Packard Development Company, L.P. | METHOD, SYSTEM AND COMPUTER-READABLE MEDIUM FOR SYNCHRONIZING DATA |
WO2016183550A1 (en) * | 2015-05-14 | 2016-11-17 | Walleye Software, LLC | Dynamic table index mapping |
US20160350482A1 (en) * | 2015-05-27 | 2016-12-01 | University Of Utah Research Foundation | Agent for healthcare data application delivery |
CN109271444A (en) * | 2018-08-10 | 2019-01-25 | 武汉达梦数据库有限公司 | A kind of table level bi-directional synchronization method and system based on trigger |
CN109101622A (en) * | 2018-08-10 | 2018-12-28 | 北京奇虎科技有限公司 | Method of data synchronization, calculates equipment and computer storage medium at device |
Non-Patent Citations (3)
Title |
---|
On-Demand Snapshot Maintenance in Data Warehouses Using Incremental ETL Pipeline;Qu,Weiping等;《TRANSACTIONS ON LARGE-SCALE DATA- AND KNOWLEDGE-CENTERED SYSTEMS XXXII》;20171231;全文 * |
Scalable Parallel Join for Huge Tables;Weng,Nianlong等;《2013 IEEE INTERNATIONAL CONGRESS ON BIG DATA》;20131231;全文 * |
内蒙古电力数据中心的建设分析;罗朝宇等;《内蒙古电力技术》;20130630(第03期);全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN110263052A (en) | 2019-09-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11379428B2 (en) | Synchronization of client machines with a content management system repository | |
CN108536761B (en) | Report data query method and server | |
AU2017409830B2 (en) | Multi-task scheduling method and system, application server and computer-readable storage medium | |
CN109271435B (en) | Data extraction method and system supporting breakpoint continuous transmission | |
CN102638566B (en) | BLOG system running method based on cloud storage | |
CN111324610A (en) | Data synchronization method and device | |
CN115374102A (en) | Data processing method and system | |
CN108009258A (en) | It is a kind of can Configuration Online data collection and analysis platform | |
CN109885642B (en) | Hierarchical storage method and device for full-text retrieval | |
CN104580532A (en) | Cross-platform application system | |
WO2023155819A1 (en) | Application deployment method and system | |
CN114281757A (en) | Database migration method and system and computer readable storage medium | |
CN111177173A (en) | System and method for realizing data synchronization optimization processing under big data environment | |
CN114385956A (en) | Method for communicating among multiple tabs of browser and updating state | |
CN110263052B (en) | Automatic synchronization technology innovation method based on big data Hadoop platform ODS | |
CN102122302A (en) | Centralized processing system and method for documents | |
CN112817915A (en) | Automatic multi-product document uniform publishing and displaying method | |
CN116974689A (en) | Cluster container scheduling method, device, equipment and computer readable storage medium | |
JP2021140430A (en) | Database migration method, database migration system, and database migration program | |
CN114116158A (en) | Task scheduling method and system based on SD-WAN system | |
CN115455121A (en) | Real-time reliable data synchronous transmission method, equipment and medium | |
CN114064678A (en) | Event data processing method and device and terminal equipment | |
CN110532000B (en) | Kbroker distributed operating system for operation publishing and operation publishing system | |
CN112217849B (en) | Task scheduling method, system and computer equipment in SD-WAN system | |
CN110245148B (en) | Data storage method, device, system and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | Address after: No.88, Huaihai Road, Qinhuai District, Nanjing, Jiangsu 210001; Patentee after: Nanyin Faba Consumer Finance Co.,Ltd.; Address before: No.88, Huaihai Road, Qinhuai District, Nanjing, Jiangsu 210001; Patentee before: SUNING CONSUMER FINANCE Co.,Ltd. |