CN110555055A - data mining method for redo log file of Oracle database - Google Patents
data mining method for redo log file of Oracle database Download PDFInfo
- Publication number
- CN110555055A CN110555055A CN201910654000.6A CN201910654000A CN110555055A CN 110555055 A CN110555055 A CN 110555055A CN 201910654000 A CN201910654000 A CN 201910654000A CN 110555055 A CN110555055 A CN 110555055A
- Authority
- CN
- China
- Prior art keywords
- log
- file
- redo
- redo log
- log file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 38
- 238000007418 data mining Methods 0.000 title claims abstract description 11
- 238000005065 mining Methods 0.000 claims description 30
- 238000009412 basement excavation Methods 0.000 claims description 3
- 230000009191 jumping Effects 0.000 claims description 3
- 238000011084 recovery Methods 0.000 abstract description 4
- 230000005540 biological transmission Effects 0.000 abstract description 3
- 239000012634 fragment Substances 0.000 abstract description 3
- 230000015572 biosynthetic process Effects 0.000 abstract description 2
- 230000010076 replication Effects 0.000 abstract description 2
- 238000003786 synthesis reaction Methods 0.000 abstract description 2
- 238000012795 verification Methods 0.000 abstract description 2
- 238000004364 calculation method Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 239000002699 waste material Substances 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1471—Saving, restoring, recovering or retrying involving logging of persistent data for recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/03—Data mining
Abstract
the invention discloses a data mining method for redo log files of an Oracle database. And the backup terminal synthesis process synthesizes the log fragments into a log file, finally, the application process analyzes and registers the log, and writes the log file into the disaster recovery database after the log file passes the verification and registration so as to realize the data replication of the database. The invention supports heterogeneous operating system and database management system, which is convenient for the data synchronization of the client between different database management systems and operating system platforms; compared with the transmission of the whole log file, the data volume transmitted by the method is greatly reduced, and the method can control the number of the sub-processes, so that the resources occupied by the sub-processes are less.
Description
Technical Field
the invention relates to the field of data backup and recovery of a database, in particular to a data mining method for redo log files of an Oracle database.
background
The primary purpose of the redo log file is that the redo log file can be dispatched to the farm in the event of an instance or media failure, or can be used as a way to maintain a standby database (standby database) to accomplish failover. If the host on which the database is located loses power, causing the instance to fail, Oracle will use the online redo log to restore the system to the point before the power failure. If the disk drive containing the data file has a permanent failure, Oracle will use the archive redo log and the online redo log to restore the backup of the disk drive to the appropriate point in time. In addition, if you inadvertently delete a table, or delete some important information, and commit the operation, a backup can be restored and Oracle can use the online and archive redo log files to restore it to the point before the incident occurred.
Disclosure of Invention
the invention provides a data mining method for redoing log files of an Oracle database aiming at the problems of data backup of the database and data recovery after the database fails, which comprises the following specific steps:
S1: acquiring a full path name of a database redo log, and judging whether the redo log is stored in a file system or on an Automatic Storage Manager (ASM) according to the full path name;
S2: if the log file exists on the ASM, acquiring the data block size information of the redo log file from the ASM; otherwise, accessing a file system, and acquiring data block size information of the redo log file from the file system, wherein the data block size information refers to the number of bytes of one data block;
S3: calculating the number of data blocks according to the size information of the data blocks of the current redo log file, starting a plurality of sub-processes, performing data mining of an independent application process on each data block, and setting the maximum number of processes if the performance factor of a processor is considered so as to avoid the excessive number of processes;
s4: database ID information of 24 th-27 th bytes of a file head of the current redo log file and thread information of 176 th-177 th bytes are mined, whether the database ID information is consistent with the corresponding information of the database or not is checked, and whether the obtained current redo log belongs to the database or not is judged; if not, returning an error, and if the error is consistent, entering the next step;
s5: reading and recording a log sequence number log sequence of 8 th-11 th bytes of a file header of the current redo log file, wherein the log sequence number represents the sequence of the current redo log file;
s6: mining a data block of recorded content after a file header of the current redo log file, mining 1M data each time, returning error information if the acquisition fails, and terminating the current mining;
S7: comparing the log sequence number log sequence in the 'file header' of the redo log file with the log sequence number log sequence in the data block of the recorded content, if the log sequence number log sequence is not consistent with the log sequence number log sequence, indicating that the redo recorded content recorded in the current redo log file is not the redo recorded content in the database, returning error information and terminating the current log mining;
s8: if the two log serial numbers are consistent, recording the current excavation position, and writing the excavated data block content into a newly-established temporary log record file, wherein the temporary file name is named according to the service system id, the online log group number, the log serial number and the current data block number;
S9: checking whether the redo log file finishes the mining work of all the data blocks, if not, jumping to S5, and continuing to mine the next data block in the redo log file;
S10: if the redo log file finishes mining, recording the number of data blocks of the current redo log file, and storing the number of the data blocks into 24-27 bytes of a file header of the redo log file;
s11: checking the redo log file, and writing the checking result into 14-15 bytes of a redo log file header;
S12: changing the online log type in the bytes 236 and 239 of the 'file head' of the redo log file into an archive log type;
s13: and finishing the mining of the current redo log file or performing the mining of the next redo log file.
in a preferred mode, the redo log file is divided into a plurality of data blocks, the first two data blocks of the redo log are respectively a file header and a log header, the file header and the log header record attribute information of the redo log file, and each data block records redo records in the database operation process from the third data block.
the invention has the beneficial effects that:
(1) supporting heterogeneous operating systems and database management systems, facilitating data synchronization of clients between different database management systems and operating system platforms, which is the core advantage;
(2) Compared with the transmission of the whole log file, the data volume transmitted by the method is greatly reduced, and the whole log file 1/4 or less is common;
(3) the method avoids the influence of the update of the copied database log on the copying process to the maximum extent, ensures zero loss of data, can realize limited copying of partial objects, meets the requirements of data distribution and data integration, and reduces the pressure of a source database;
(4) the method can control the number of the sub-processes, and occupies less system resources.
Drawings
FIG. 1 is a flow chart of a data method;
FIG. 2 is the format and specific information of the "header" of the redo log.
Detailed Description
the mining method for the redo log of the database is specifically described with reference to fig. 1, and a specific embodiment thereof is as follows;
The log change is tracked by rapidly analyzing the log, the log is mined at the first time of change to obtain data change fragments, and the log change data is cached in a local file system and then transmitted to the standby terminal through a transmission process. And the backup terminal synthesis process synthesizes the log fragments into a log file, finally, the application process analyzes and registers the log, and writes the log file into the disaster recovery database after the log file passes the verification and registration so as to realize the data replication of the database. The redo log file is divided into a plurality of data blocks, the first two data blocks of the redo log are respectively a file header and a log header, the file header and the log header record attribute information of the redo log file, and from the third data block, each data block records a redo record of the database operation process, as shown in fig. 2, the format and specific information of the file header of the redo log are shown. The specific data mining method comprises the following steps:
s1: acquiring a full path name of a database redo log, and judging whether the redo log is stored in a file system or on an Automatic Storage Manager (ASM) according to the full path name;
s2: if the log file exists on the ASM, acquiring data block size information of the redo log file from a table v $ ASM _ disk in the ASM; otherwise, accessing a file system, and acquiring data block size information of the redo log file from v $ log and v $ logfile in the file system, wherein the data block size information refers to the number of bytes of one data block; the table v $ ASM _ disk and the table v $ log, v $ logfile are tables inherent in the file system and the ASM, are used for storing basic information of the database, and include size information of the data block;
S3: calculating the number of data blocks according to the size information of the data blocks of the current redo log file, starting a plurality of sub-processes, performing data mining of an independent application process on each data block, and setting the maximum number of processes if the performance factor of a processor is considered so as to avoid the excessive number of processes;
S4: database ID information of 24 th-27 th bytes of a file head and thread information of 176 th-177 th bytes in the current redo log file are mined, whether the database ID information is consistent with the corresponding information of the database or not is checked, and whether the obtained current redo log belongs to the database or not is judged; if not, returning an error, and if the error is consistent, entering the next step;
S5: reading and recording a log sequence number log sequence of 8 th-11 th bytes of a file header of the current redo log file, wherein the log sequence number represents the sequence of the current redo log file;
S6: mining a data block of recorded content after a file header of the current redo log file, mining 1M data each time, returning error information if the acquisition fails, and terminating the current mining; generally, each data block has 512 bytes, and mining 1M is exactly the size of two data blocks, if the byte number of the mining data is increased, the mining speed is too low, if the database is changed again at the moment, the mined data is inconsistent with the actual redo record, and if the byte number of the mining data is reduced, the waste of computer operation resources is caused, so the mining speed is set to 1M;
S7: comparing the log sequence number log sequence in the 'file header' of the redo log file with the log sequence number log sequence in the data block of the recorded content, if the log sequence number log sequence is not consistent with the log sequence number log sequence, indicating that the redo recorded content recorded in the current redo log file is not the redo recorded content in the database, returning error information and terminating the current log mining;
S8: if the two log serial numbers are consistent, recording the current excavation position, and writing the excavated data block content into a newly-established temporary log record file, wherein the temporary file name is named according to the service system id, the online log group number, the log serial number and the current data block number;
s9: checking whether the redo log file finishes the mining work of all the data blocks, if not, jumping to S5, and continuing to mine the next data block in the redo log file;
s10: if the redo log file finishes mining, recording the number of data blocks of the current redo log file, and storing the number of the data blocks into 24-27 bytes of a file header of the redo log file;
S11: calculating the checksum of the redo log file, and writing the check result into 14-15 bytes of a file header of the redo log; the algorithm for the checksum is as follows:
(1) dividing the log header into a plurality of 64-byte blocks; the definition of 64 bytes is because the following calculation is convenient, and each block is divided into 4 segments of 16 bytes, 4 × 16 ═ 64;
(2) Firstly, dividing a first block into four 16-byte segments, and carrying out 16-byte XOR on the 1 st segment and the 2 nd segment to obtain a value a; carrying out 16-byte XOR on the 3 rd section and the 4 th section to obtain a value b; carrying out XOR on a and b to obtain new c, wherein the initial value of c is 0, and carrying out XOR on c and b to obtain new c;
(3) Circularly executing the operation on the large 64-byte section, and carrying out exclusive OR on the large 64-byte section and the value c obtained in the previous round each time, wherein c is a global variable and is not an initial value taken each time;
(4) waiting for all the calculation to obtain a final c value, and dividing 16 bytes of c into 4 segments of 4 bytes, r1, r2, r3 and r 4; carrying out bitwise XOR on the four values to obtain a value r 0;
(5) shifting r0 to the right by 16 bits to obtain g, and performing XOR on r0 and g to obtain m;
(6) performing bitwise AND on m and 0xFFFF to obtain a final checksum;
s12: changing the online log type in the bytes 236 and 239 of the 'file head' of the redo log file into an archive log type;
S13: and finishing the mining of the current redo log file or performing the mining of the next redo log file.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art should be able to cover the technical solutions and the inventive concepts of the present invention within the technical scope of the present invention.
Claims (2)
1. A data mining method for redo log files of an Oracle database is characterized by comprising the following steps:
s1: acquiring a full path name of a database redo log, and judging whether the redo log is stored in a file system or on an Automatic Storage Manager (ASM) according to the full path name;
s2: if the log file exists on the ASM, acquiring the data block size information of the redo log file from the ASM; otherwise, accessing a file system, and acquiring data block size information of the redo log file from the file system, wherein the data block size information refers to the number of bytes of one data block;
s3: calculating the number of data blocks according to the size information of the data blocks of the current redo log file, starting a plurality of sub-processes, performing data mining of an independent application process on each data block, and setting the maximum number of processes if the performance factor of a processor is considered so as to avoid the excessive number of processes;
s4: database ID information of 24 th-27 th bytes of a file head of the current redo log file and thread information of 176 th-177 th bytes are mined, whether the database ID information is consistent with the corresponding information of the database or not is checked, and whether the obtained current redo log belongs to the database or not is judged; if not, returning an error, and if the error is consistent, entering the next step;
s5: reading and recording a log sequence number log sequence of 8 th-11 th bytes of a file header of the current redo log file, wherein the log sequence number represents the sequence of the current redo log file;
S6: mining a data block of recorded content after a file header of the current redo log file, mining 1M data each time, returning error information if the acquisition fails, and terminating the current mining;
s7: comparing the log sequence number log sequence in the 'file header' of the redo log file with the log sequence number log sequence in the data block of the recorded content, if the log sequence number log sequence is not consistent with the log sequence number log sequence, indicating that the redo recorded content recorded in the current redo log file is not the redo recorded content in the database, returning error information and terminating the current log mining;
s8: if the two log serial numbers are consistent, recording the current excavation position, and writing the excavated data block content into a newly-established temporary log record file, wherein the temporary file name is named according to the service system id, the online log group number, the log serial number and the current data block number;
s9: checking whether the redo log file finishes the mining work of all the data blocks, if not, jumping to S5, and continuing to mine the next data block in the redo log file;
S10: if the redo log file finishes mining, recording the number of data blocks of the current redo log file, and storing the number of the data blocks into 24-27 bytes of a file header of the redo log file;
S11: checking the redo log file, and writing the checking result into 14-15 bytes of a redo log file header;
S12: changing the online log type in the bytes 236 and 239 of the 'file head' of the redo log file into an archive log type;
s13: and finishing the mining of the current redo log file or performing the mining of the next redo log file.
2. the data mining method for the Oracle database redo log file according to claim 1, wherein the redo log file is divided into a plurality of data blocks, the first two data blocks of the redo log are respectively a "file header" and a "log header", the "file header" and the "log header" record the attribute information of the redo log file, and each data block records a redo record of the database operation process from the third data block.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910654000.6A CN110555055A (en) | 2019-07-19 | 2019-07-19 | data mining method for redo log file of Oracle database |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910654000.6A CN110555055A (en) | 2019-07-19 | 2019-07-19 | data mining method for redo log file of Oracle database |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110555055A true CN110555055A (en) | 2019-12-10 |
Family
ID=68736477
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910654000.6A Pending CN110555055A (en) | 2019-07-19 | 2019-07-19 | data mining method for redo log file of Oracle database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110555055A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110309128A (en) * | 2019-07-05 | 2019-10-08 | 广东铭太信息科技有限公司 | Oracle backup file automatic leading-in device and its implementation, the method that the importing of backup file is carried out using the device |
CN111190771A (en) * | 2019-12-27 | 2020-05-22 | 柏科数据技术(深圳)股份有限公司 | Block tracking technology for disaster recovery |
CN111611107A (en) * | 2020-05-21 | 2020-09-01 | 云和恩墨(北京)信息技术有限公司 | Method and device for acquiring database logs |
CN112965949A (en) * | 2021-03-03 | 2021-06-15 | 上海英方软件股份有限公司 | Method for remotely reading ORACLE log file |
CN113515573A (en) * | 2021-05-08 | 2021-10-19 | 上海英方软件股份有限公司 | Method and system for one-to-many negotiation replication breakpoint of ORACLE database |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060117074A1 (en) * | 2004-11-30 | 2006-06-01 | Ezzat Ahmed K | Method and apparatus for database cluster recovery |
CN102629250A (en) * | 2012-02-28 | 2012-08-08 | 杭州丰城信息技术有限公司 | Recovery method of redo log files for main memory database |
CN105022676A (en) * | 2014-04-22 | 2015-11-04 | 大唐软件技术股份有限公司 | Recovery method and device of main memory database redo log files |
CN105677876A (en) * | 2016-01-12 | 2016-06-15 | 国家电网公司 | Method for log mining based on physical level database |
-
2019
- 2019-07-19 CN CN201910654000.6A patent/CN110555055A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060117074A1 (en) * | 2004-11-30 | 2006-06-01 | Ezzat Ahmed K | Method and apparatus for database cluster recovery |
CN102629250A (en) * | 2012-02-28 | 2012-08-08 | 杭州丰城信息技术有限公司 | Recovery method of redo log files for main memory database |
CN105022676A (en) * | 2014-04-22 | 2015-11-04 | 大唐软件技术股份有限公司 | Recovery method and device of main memory database redo log files |
CN105677876A (en) * | 2016-01-12 | 2016-06-15 | 国家电网公司 | Method for log mining based on physical level database |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110309128A (en) * | 2019-07-05 | 2019-10-08 | 广东铭太信息科技有限公司 | Oracle backup file automatic leading-in device and its implementation, the method that the importing of backup file is carried out using the device |
CN110309128B (en) * | 2019-07-05 | 2020-07-17 | 广东铭太信息科技有限公司 | Oracle backup file automatic importing device, implementation method thereof and method for importing backup file by using device |
CN111190771A (en) * | 2019-12-27 | 2020-05-22 | 柏科数据技术(深圳)股份有限公司 | Block tracking technology for disaster recovery |
CN111611107A (en) * | 2020-05-21 | 2020-09-01 | 云和恩墨(北京)信息技术有限公司 | Method and device for acquiring database logs |
CN112965949A (en) * | 2021-03-03 | 2021-06-15 | 上海英方软件股份有限公司 | Method for remotely reading ORACLE log file |
CN113515573A (en) * | 2021-05-08 | 2021-10-19 | 上海英方软件股份有限公司 | Method and system for one-to-many negotiation replication breakpoint of ORACLE database |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110555055A (en) | data mining method for redo log file of Oracle database | |
US20200409791A1 (en) | Serial Storage Node Processing of Data Functions | |
KR101863406B1 (en) | Nonvolatile media journaling of verified data sets | |
US7421617B2 (en) | Systems and methods for optimizing restoration of stored data | |
US7360113B2 (en) | Protocol for communicating data block copies in an error recovery environment | |
CN106776130B (en) | Log recovery method, storage device and storage node | |
US20160085630A1 (en) | Hash collision recovery in a deduplication vault | |
US20100070478A1 (en) | Retrieval and recovery of data chunks from alternate data stores in a deduplicating system | |
JP2005301497A (en) | Storage management system, restoration method and its program | |
US20080162599A1 (en) | Optimizing backup and recovery utilizing change tracking | |
US7487385B2 (en) | Apparatus and method for recovering destroyed data volumes | |
JPH04230512A (en) | Method and apparatus for updating record for dasd array | |
US7818524B2 (en) | Data migration systems and methods for independent storage device expansion and adaptation | |
US7020805B2 (en) | Efficient mechanisms for detecting phantom write errors | |
US10503620B1 (en) | Parity log with delta bitmap | |
US11429498B2 (en) | System and methods of efficiently resyncing failed components without bitmap in an erasure-coded distributed object with log-structured disk layout | |
CN105302665A (en) | Improved copy-on-write snapshot method and system | |
EP2669804A1 (en) | Storage system | |
US10990324B2 (en) | Storage node processing of predefined data functions | |
CN113885809B (en) | Data management system and method | |
CN113868028A (en) | Method for replaying log on data node, data node and system | |
US8151069B1 (en) | Multiprotection for snapsnots | |
CN113377569A (en) | Method, apparatus and computer program product for recovering data | |
WO2022105442A1 (en) | Erasure code-based data reconstruction method and appratus, device, and storage medium | |
US20120158652A1 (en) | System and method for ensuring consistency in raid storage array metadata |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20191210 |
|
RJ01 | Rejection of invention patent application after publication |