CN113704213A - Sqlldr2 and ogg data synchronization-based implementation method - Google Patents
Sqlldr2 and ogg data synchronization-based implementation method Download PDFInfo
- Publication number
- CN113704213A CN113704213A CN202110965086.1A CN202110965086A CN113704213A CN 113704213 A CN113704213 A CN 113704213A CN 202110965086 A CN202110965086 A CN 202110965086A CN 113704213 A CN113704213 A CN 113704213A
- Authority
- CN
- China
- Prior art keywords
- data
- ogg
- sqlldr2
- redo log
- log
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 39
- 230000008676 import Effects 0.000 claims abstract description 8
- 230000004048 modification Effects 0.000 claims description 7
- 238000012986 modification Methods 0.000 claims description 7
- 230000006870 function Effects 0.000 claims description 5
- 238000011010 flushing procedure Methods 0.000 claims description 4
- 230000010076 replication Effects 0.000 claims description 4
- 230000001360 synchronised effect Effects 0.000 abstract description 5
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 239000012634 fragment Substances 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/178—Techniques for file synchronisation in file systems
- G06F16/1794—Details of file format conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of data processing, and discloses a realization method based on sqlldr2 and ogg data synchronization, wherein an sqlldr2 client is used for exporting Oracle database data, exporting the data in a txt form and transmitting the data to an HDFS distributed file system, an oggogg synchronization tool source end reads a redo log in real time and analyzes the redo log into an sql statement and transmits the sql statement to a target end in a binary file manner, and an ogg synchronization tool process of the target end reads files and imports the files into the HDFS distributed file system. According to the invention, the required data is formatted and output to the txt file by using the sqlldr2 tool according to different requirements, and the data of the specific table and the specific field are appointed to be collected by adopting the ogg tool, so that the Oracle data can be synchronously backed up and stored in the HDFS distributed file system, the problems that the data is frequently imported and exported, the data is difficult to import across platforms, and the table required to be synchronized cannot be newly added or deleted according to the requirements are solved, the automatic synchronization of the incremental data is realized, and the workload of each manual synchronization is reduced.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a method for realizing data synchronization based on sqlldr2 and ogg.
Background
Sqlldr 2: sqlldr2 is a client tool that can export data in the oracle database into txt files according to a specific format.
Ogg: the source end reads redolog in real time, analyzes the redolog into sql statements and transmits the sql statements to the target end in a binary file mode.
The functions realized by the two are as follows: the sqlldr2 is used for initializing oracle-hadoop data, and the ogg is used for real-time synchronization of incremental data, so that the integrity of initial data and the timeliness of the incremental data are guaranteed.
However, the existing implementation method for synchronizing sqlldr2 and ogg data has the problems of frequent data import and export, difficult data cross-platform import and incapability of adding or deleting a table to be synchronized according to requirements. Therefore, those skilled in the art provide an implementation method based on sqlldr2 and ogg data synchronization to solve the problems in the background art.
Disclosure of Invention
The invention aims to provide an implementation method based on sqlldr2 and ogg data synchronization, so as to solve the problems in the background technology.
In order to achieve the purpose, the invention provides the following technical scheme: an implementation method based on sqlldr2 and ogg data synchronization is characterized by comprising an Oracle database, an sqlldr2 client, an ogg synchronization tool and an HDFS distributed file system;
the sqlldr2 client is used for exporting Oracle database data, exporting the data in a txt form and transmitting the data to an HDFS distributed file system;
the ogg synchronization tool is used for synchronously exporting the log documents of the Oracle database, a redo log in the Oracle database loads the modified log documents into a memory from a disk, an archive log mode archives and backups the modified log documents, when data are exported, a source end of the ogg synchronization tool reads the redo log in real time, analyzes the redo log into sql statements and transmits the sql statements to a target end in a binary file mode, and a process of the target end of the ogg synchronization tool reads the files and imports the files into an HDFS distributed file system.
As a still further scheme of the invention: the redo log consists of two parts, one is a redo log buffer of a log cache in the memory, the other is a redo log file in the disk, when the log document data record is modified each time, the modified content is firstly written into the redo log buffer, and then the modification in the memory is refreshed back to the redo log file after waiting for a proper time, and the whole process is as follows: and if the data is in the memory, directly modifying, otherwise, loading the data into the memory from the disk, generating a redo log after the modification is finished, writing the redo log into a redo log buffer, recording the modified value, and flushing the content in the redo log file back to the redo log buffer according to the selected strategy.
As a still further scheme of the invention: when the archive log mode is operated, all transaction redo logs are stored, the Oracle database stops all new operations before the redo log file is copied, and the Oracle database does not cover the old transaction log file before the old transaction log file is copied.
As a still further scheme of the invention: the ogg synchronization tool starts all processes in the form of start at the commands of the source end and the target end, and the starting sequence is completed according to the source mgr, the target mgr, the source extract, the source pump and the target replay;
the configuration of the source end of the ogg synchronization tool comprises the following steps:
a. configuring global variables of the ogg;
b. configuration manager mgr;
c. adding a copy table;
d. configuring an extract process;
e. configuring a pump process;
f. configuring a define file;
sending the generated ogg-test file to a destination file;
the configuration of the target end of the ogg synchronization tool comprises the following steps:
a. starting HDFS distributed file system service;
b. configuration manager mgr;
c. configuring checkpoint;
d. prop, hdfs;
e. the trail file is added to the replication process.
As a still further scheme of the invention: the sqlldr2 client side transmits data in an Oracle database to an HDFS distributed file system by adopting a Hadoop distributed computing platform, and the specific flow is as follows: the sqlldr2 client exports and converts data in an Oracle database into a txt format, transmits a txt format data task to a Hadoop distributed computing platform, breaks single task data through a MapReduce function in the Hadoop distributed computing platform, sends the broken task to a plurality of nodes, and then loads (Reduce) the broken task to an HDFS distributed file system in a single data set form.
Compared with the prior art, the invention has the beneficial effects that:
according to the invention, the required data is formatted and output to the txt file by using the sqlldr2 tool according to different requirements, and the data of the specific table and the specific field is appointed to be collected by adopting the ogg tool, so that synchronous conversion and transmission can be carried out on Oracle data, synchronous backup and storage are carried out on the Oracle data in the HDFS distributed file system, the problems that the data is frequently imported and exported, the data is difficult to import across platforms, and the table which needs to be synchronized cannot be newly added or deleted according to the requirements are effectively solved, automatic synchronization of incremental data is realized, and the workload of each manual synchronization is reduced.
Drawings
Fig. 1 is a schematic structural diagram of an implementation method based on sqlldr2 and ogg data synchronization.
Detailed Description
Referring to fig. 1, in the embodiment of the present invention, an implementation method based on sqlldr2 and ogg data synchronization includes an Oracle database, an sqlldr2 client, an ogg synchronization tool, and an HDFS distributed file system;
the sqlldr2 client is used for exporting Oracle database data, exporting the data in a txt form and transmitting the data to an HDFS distributed file system;
the ogg synchronization tool is used for synchronously exporting the log documents of the Oracle database, a redo log in the Oracle database loads the modified log documents into a memory from a disk, an archive log mode archives and backups the modified log documents, when data are exported, a source end of the ogg synchronization tool reads the redo log in real time, analyzes the redo log into sql statements and transmits the sql statements to a target end in a binary file mode, and a process of the target end of the ogg synchronization tool reads the files and imports the files into an HDFS distributed file system.
The redo log consists of two parts, one is a redo log buffer of a log cache in the memory, the other is a redo log file in the disk, when the log document data record is modified each time, the modified content is firstly written into the redo log buffer, and then the modification in the memory is refreshed back to the redo log file after waiting for a proper time, and the whole process is as follows: and if the data is in the memory, directly modifying, otherwise, loading the data into the memory from the disk, generating a redo log after the modification is finished, writing the redo log into a redo log buffer, recording the modified value, and flushing the content in the redo log file back to the redo log buffer according to the selected strategy.
When the archive log mode is operated, all transaction redo logs are stored, the Oracle database stops all new operations before the redo log file is copied, and the Oracle database does not cover the old transaction log file before the old transaction log file is copied.
The ogg synchronization tool starts all processes in the form of start at the commands of the source end and the target end, and the starting sequence is completed according to the source mgr, the target mgr, the source extract, the source pump and the target replay;
the configuration of the source end of the ogg synchronization tool comprises the following steps:
a. configuring global variables of the ogg;
b. configuration manager mgr;
c. adding a copy table;
d. configuring an extract process;
e. configuring a pump process;
f. configuring a define file;
sending the generated ogg-test file to a destination file;
the configuration of the target end of the ogg synchronization tool comprises the following steps:
a. starting HDFS distributed file system service;
b. configuration manager mgr;
c. configuring checkpoint;
d. prop, hdfs;
e. the trail file is added to the replication process.
The sqlldr2 client side transmits data in an Oracle database to an HDFS distributed file system by adopting a Hadoop distributed computing platform, and the specific flow is as follows: the sqlldr2 client exports and converts data in an Oracle database into a txt format, transmits a txt format data task to a Hadoop distributed computing platform, breaks single task data through a MapReduce function in the Hadoop distributed computing platform, sends the broken task to a plurality of nodes, and then loads (Reduce) the broken task to an HDFS distributed file system in a single data set form.
An implementation method based on sqlldr2 and ogg data synchronization comprises the following work flows:
the method comprises the following steps that S1, an sqlldr2 client side exports and converts data in an Oracle database and writes the data into a txt file, the txt file is imported for data initialization for the first time, then the sqlldr2 client side transmits a txt format data task to a Hadoop distributed computing platform, a MapReduce function in the Hadoop distributed computing platform breaks single task data, fragment tasks are sent to multiple nodes, and then the single task data are loaded into an HDFS distributed file system in a single data set mode;
s2, synchronizing, when modifying the log document data record in the Oracle database each time, if the data is in the memory, directly modifying, otherwise, loading the data from the disk into the memory, after the modification is completed, generating a redo log, writing the redo log into a redo log buffer, recording the modified value, flushing the content in the redo log file back into the redo log buffer according to the selected strategy, after the modification, operating in an archive log mode, and storing all the transaction redo logs;
and S3, reading the modified log file and generating a message by the source process of the ogg synchronization tool, and reading the file and importing the file into the HDFS distributed file system by the target process of the ogg synchronization tool.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art should be considered to be within the technical scope of the present invention, and the technical solutions and the inventive concepts thereof according to the present invention are equivalent to or changed within the technical scope of the present invention.
Claims (5)
1. An implementation method based on sqlldr2 and ogg data synchronization is characterized by comprising an Oracle database, an sqlldr2 client, an ogg synchronization tool and an HDFS distributed file system;
the sqlldr2 client is used for exporting Oracle database data, exporting the data in a txt form and transmitting the data to an HDFS distributed file system;
the ogg synchronization tool is used for synchronously exporting the log documents of the Oracle database, a redo log in the Oracle database loads the modified log documents into a memory from a disk, an archive log mode archives and backups the modified log documents, when data are exported, a source end of the ogg synchronization tool reads the redo log in real time, analyzes the redo log into sql statements and transmits the sql statements to a target end in a binary file mode, and a process of the target end of the ogg synchronization tool reads the files and imports the files into an HDFS distributed file system.
2. The method for realizing data synchronization based on sqlldr2 and ogg according to claim 1, wherein the redo log is composed of two parts, one is a redo log buffer of a log cache in a memory, and the other is a redo log file of a log file in a disk, when a log document data record is modified each time, the modified content is written into the redo log buffer first, and then the modified content in the memory is flushed back to the redo log file after waiting for a proper time, and the whole process is as follows: and if the data is in the memory, directly modifying, otherwise, loading the data into the memory from the disk, generating a redo log after the modification is completed, writing the redo log into a redo log buffer, recording the modified value, and flushing the content in the redo log file back to the redo log buffer according to the selected strategy.
3. The method for realizing data synchronization based on sqlldr2 and ogg according to claim 1, wherein when the archivelog model is running, all the transaction redo logs are saved, before the redo log file replication is completed, the Oracle database stops all new operations, and before the old transaction records are completed, the Oracle database does not overwrite the old transaction records.
4. The method for realizing data synchronization based on sqlldr2 and ogg according to claim 1, wherein the ogg synchronization tool starts all processes in the form of start at the command of source end and target end, and the starting sequence is completed according to source mgr-target mgr-source extract-source pump-target replay;
the configuration of the source end of the ogg synchronization tool comprises the following steps:
a. configuring global variables of the ogg;
b. configuration manager mgr;
c. adding a copy table;
d. configuring an extract process;
e. configuring a pump process;
f. configuring a define file;
sending the generated ogg-test file to a destination file;
the configuration of the target end of the ogg synchronization tool comprises the following steps:
a. starting HDFS distributed file system service;
b. configuration manager mgr;
c. configuring checkpoint;
d. prop, hdfs;
e. the trail file is added to the replication process.
5. The method for achieving data synchronization based on sqlldr2 and ogg according to claim 1, wherein the sqlldr2 client side transmits data in an Oracle database to an HDFS distributed file system by using a Hadoop distributed computing platform, and the specific process is as follows: the sqlldr2 client exports and converts data in an Oracle database into a txt format, transmits a txt format data task to a Hadoop distributed computing platform, breaks single task data through a MapReduce function in the Hadoop distributed computing platform, sends the broken task to a plurality of nodes, and then loads the broken task to an HDFS distributed file system in a single data set mode.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110965086.1A CN113704213A (en) | 2021-08-20 | 2021-08-20 | Sqlldr2 and ogg data synchronization-based implementation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110965086.1A CN113704213A (en) | 2021-08-20 | 2021-08-20 | Sqlldr2 and ogg data synchronization-based implementation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113704213A true CN113704213A (en) | 2021-11-26 |
Family
ID=78653801
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110965086.1A Pending CN113704213A (en) | 2021-08-20 | 2021-08-20 | Sqlldr2 and ogg data synchronization-based implementation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113704213A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114048178A (en) * | 2021-11-29 | 2022-02-15 | 众和空间(北京)科技有限责任公司 | Dual-mode storage and synchronization method of data |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120030172A1 (en) * | 2010-07-27 | 2012-02-02 | Oracle International Corporation | Mysql database heterogeneous log based replication |
CN102752372A (en) * | 2012-06-18 | 2012-10-24 | 天津神舟通用数据技术有限公司 | File based database synchronization method |
US20150193466A1 (en) * | 2014-01-06 | 2015-07-09 | Siegfried Luft | Architecture and method for cloud provider selection and projection |
CN106503104A (en) * | 2016-10-17 | 2017-03-15 | 山东浪潮商用系统有限公司 | Oracle database data copy method under a kind of tertiary-structure network pattern |
CN107943979A (en) * | 2017-11-29 | 2018-04-20 | 山东鲁能软件技术有限公司 | The quasi real time synchronous method and device of data between a kind of database |
CN109376196A (en) * | 2018-09-25 | 2019-02-22 | 浪潮软件集团有限公司 | Method and device for batch synchronization of redo logs |
CN110727548A (en) * | 2019-09-29 | 2020-01-24 | 上海英方软件股份有限公司 | Continuous data protection method and device based on database DML synchronization |
CN111427898A (en) * | 2020-03-16 | 2020-07-17 | 上海爱数信息技术股份有限公司 | Continuous data protection system and method based on analysis of Oracle log |
CN111723160A (en) * | 2020-08-24 | 2020-09-29 | 国网浙江省电力有限公司 | Multi-source heterogeneous incremental data synchronization method and system |
CN111797337A (en) * | 2020-07-14 | 2020-10-20 | 上海乾臻信息科技有限公司 | Method and device for generating export page |
CN112181902A (en) * | 2020-11-30 | 2021-01-05 | 阿里云计算有限公司 | Database storage method and device and electronic equipment |
CN112527801A (en) * | 2020-12-21 | 2021-03-19 | 中国人民银行清算总中心 | Data synchronization method and system between relational database and big data system |
-
2021
- 2021-08-20 CN CN202110965086.1A patent/CN113704213A/en active Pending
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120030172A1 (en) * | 2010-07-27 | 2012-02-02 | Oracle International Corporation | Mysql database heterogeneous log based replication |
CN102752372A (en) * | 2012-06-18 | 2012-10-24 | 天津神舟通用数据技术有限公司 | File based database synchronization method |
US20150193466A1 (en) * | 2014-01-06 | 2015-07-09 | Siegfried Luft | Architecture and method for cloud provider selection and projection |
CN106503104A (en) * | 2016-10-17 | 2017-03-15 | 山东浪潮商用系统有限公司 | Oracle database data copy method under a kind of tertiary-structure network pattern |
CN107943979A (en) * | 2017-11-29 | 2018-04-20 | 山东鲁能软件技术有限公司 | The quasi real time synchronous method and device of data between a kind of database |
CN109376196A (en) * | 2018-09-25 | 2019-02-22 | 浪潮软件集团有限公司 | Method and device for batch synchronization of redo logs |
CN110727548A (en) * | 2019-09-29 | 2020-01-24 | 上海英方软件股份有限公司 | Continuous data protection method and device based on database DML synchronization |
CN111427898A (en) * | 2020-03-16 | 2020-07-17 | 上海爱数信息技术股份有限公司 | Continuous data protection system and method based on analysis of Oracle log |
CN111797337A (en) * | 2020-07-14 | 2020-10-20 | 上海乾臻信息科技有限公司 | Method and device for generating export page |
CN111723160A (en) * | 2020-08-24 | 2020-09-29 | 国网浙江省电力有限公司 | Multi-source heterogeneous incremental data synchronization method and system |
CN112181902A (en) * | 2020-11-30 | 2021-01-05 | 阿里云计算有限公司 | Database storage method and device and electronic equipment |
CN112527801A (en) * | 2020-12-21 | 2021-03-19 | 中国人民银行清算总中心 | Data synchronization method and system between relational database and big data system |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114048178A (en) * | 2021-11-29 | 2022-02-15 | 众和空间(北京)科技有限责任公司 | Dual-mode storage and synchronization method of data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104850598B (en) | A kind of real-time data base back-up restoring method | |
CN105868343B (en) | Database migration method and system | |
CN101266606B (en) | On-line data migration method based on Oracle database | |
US20110106768A1 (en) | Backup using metadata virtual hard drive and differential virtual hard drive | |
EP2194467A1 (en) | Extend CRUD to support lifecycle management and business continuity | |
CN100498796C (en) | Logic log generation method, database backup/ restoration method and system | |
US20050278388A1 (en) | Apparatus, system, and method for reliably updating a data group in a data replication environment | |
CN102339321A (en) | Network file system with version control and method using same | |
CN109299072A (en) | A kind of method and device that database migrates online | |
CN114036119A (en) | Data synchronization method based on button and database log | |
CN103530290A (en) | Method and system for data migration among databases | |
US8051054B2 (en) | Method and system for data processing with database update for the same | |
CN102541694A (en) | Method and device for database backup | |
CN103914359A (en) | Data recovery method and device | |
CN110597821B (en) | Method and device for changing database table structure | |
CN108614876B (en) | Redis database-based system and data processing method | |
CN110727548A (en) | Continuous data protection method and device based on database DML synchronization | |
CN111930850A (en) | Data verification method and device, computer equipment and storage medium | |
CN111367994A (en) | Method and system for synchronously backing up incremental data of database | |
CN113704213A (en) | Sqlldr2 and ogg data synchronization-based implementation method | |
US8255367B2 (en) | File management apparatus, method, and program | |
CN103176868A (en) | File status backup method | |
CN115658391A (en) | Backup recovery method of WAL mechanism based on QianBase MPP database | |
US20230306014A1 (en) | Transactionally consistent database exports | |
WO2017067397A1 (en) | Data recovery method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | ||
CB03 | Change of inventor or designer information |
Inventor after: Kang Zhuang Inventor before: Chen Dianyin Inventor before: Zhang Dequan |
|
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20211126 |