CN110555055A

CN110555055A - data mining method for redo log file of Oracle database

Info

Publication number: CN110555055A
Application number: CN201910654000.6A
Authority: CN
Inventors: 杨万清; 栾敬钊; 张葆刚; 赵云; 于颖; 李若斌; 娄道国; 王跃东; 姜学朴; 牛明珠; 李君秋
Original assignee: State Grid Corp of China SGCC; State Grid Liaoning Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; State Grid Liaoning Electric Power Co Ltd
Priority date: 2019-07-19
Filing date: 2019-07-19
Publication date: 2019-12-10

Abstract

the invention discloses a data mining method for redo log files of an Oracle database. And the backup terminal synthesis process synthesizes the log fragments into a log file, finally, the application process analyzes and registers the log, and writes the log file into the disaster recovery database after the log file passes the verification and registration so as to realize the data replication of the database. The invention supports heterogeneous operating system and database management system, which is convenient for the data synchronization of the client between different database management systems and operating system platforms; compared with the transmission of the whole log file, the data volume transmitted by the method is greatly reduced, and the method can control the number of the sub-processes, so that the resources occupied by the sub-processes are less.

Description

data mining method for redo log file of Oracle database

Technical Field

the invention relates to the field of data backup and recovery of a database, in particular to a data mining method for redo log files of an Oracle database.

background

The primary purpose of the redo log file is that the redo log file can be dispatched to the farm in the event of an instance or media failure, or can be used as a way to maintain a standby database (standby database) to accomplish failover. If the host on which the database is located loses power, causing the instance to fail, Oracle will use the online redo log to restore the system to the point before the power failure. If the disk drive containing the data file has a permanent failure, Oracle will use the archive redo log and the online redo log to restore the backup of the disk drive to the appropriate point in time. In addition, if you inadvertently delete a table, or delete some important information, and commit the operation, a backup can be restored and Oracle can use the online and archive redo log files to restore it to the point before the incident occurred.

Disclosure of Invention

the invention provides a data mining method for redoing log files of an Oracle database aiming at the problems of data backup of the database and data recovery after the database fails, which comprises the following specific steps:

S1: acquiring a full path name of a database redo log, and judging whether the redo log is stored in a file system or on an Automatic Storage Manager (ASM) according to the full path name;

S2: if the log file exists on the ASM, acquiring the data block size information of the redo log file from the ASM; otherwise, accessing a file system, and acquiring data block size information of the redo log file from the file system, wherein the data block size information refers to the number of bytes of one data block;

S3: calculating the number of data blocks according to the size information of the data blocks of the current redo log file, starting a plurality of sub-processes, performing data mining of an independent application process on each data block, and setting the maximum number of processes if the performance factor of a processor is considered so as to avoid the excessive number of processes;

s4: database ID information of 24 th-27 th bytes of a file head of the current redo log file and thread information of 176 th-177 th bytes are mined, whether the database ID information is consistent with the corresponding information of the database or not is checked, and whether the obtained current redo log belongs to the database or not is judged; if not, returning an error, and if the error is consistent, entering the next step;

s5: reading and recording a log sequence number log sequence of 8 th-11 th bytes of a file header of the current redo log file, wherein the log sequence number represents the sequence of the current redo log file;

s6: mining a data block of recorded content after a file header of the current redo log file, mining 1M data each time, returning error information if the acquisition fails, and terminating the current mining;

S7: comparing the log sequence number log sequence in the 'file header' of the redo log file with the log sequence number log sequence in the data block of the recorded content, if the log sequence number log sequence is not consistent with the log sequence number log sequence, indicating that the redo recorded content recorded in the current redo log file is not the redo recorded content in the database, returning error information and terminating the current log mining;

s8: if the two log serial numbers are consistent, recording the current excavation position, and writing the excavated data block content into a newly-established temporary log record file, wherein the temporary file name is named according to the service system id, the online log group number, the log serial number and the current data block number;

S9: checking whether the redo log file finishes the mining work of all the data blocks, if not, jumping to S5, and continuing to mine the next data block in the redo log file;

S10: if the redo log file finishes mining, recording the number of data blocks of the current redo log file, and storing the number of the data blocks into 24-27 bytes of a file header of the redo log file;

s11: checking the redo log file, and writing the checking result into 14-15 bytes of a redo log file header;

S12: changing the online log type in the bytes 236 and 239 of the 'file head' of the redo log file into an archive log type;

s13: and finishing the mining of the current redo log file or performing the mining of the next redo log file.

in a preferred mode, the redo log file is divided into a plurality of data blocks, the first two data blocks of the redo log are respectively a file header and a log header, the file header and the log header record attribute information of the redo log file, and each data block records redo records in the database operation process from the third data block.

the invention has the beneficial effects that:

(1) supporting heterogeneous operating systems and database management systems, facilitating data synchronization of clients between different database management systems and operating system platforms, which is the core advantage;

(2) Compared with the transmission of the whole log file, the data volume transmitted by the method is greatly reduced, and the whole log file 1/4 or less is common;

(3) the method avoids the influence of the update of the copied database log on the copying process to the maximum extent, ensures zero loss of data, can realize limited copying of partial objects, meets the requirements of data distribution and data integration, and reduces the pressure of a source database;

(4) the method can control the number of the sub-processes, and occupies less system resources.

Drawings

FIG. 1 is a flow chart of a data method;

FIG. 2 is the format and specific information of the "header" of the redo log.

Detailed Description

the mining method for the redo log of the database is specifically described with reference to fig. 1, and a specific embodiment thereof is as follows;

The log change is tracked by rapidly analyzing the log, the log is mined at the first time of change to obtain data change fragments, and the log change data is cached in a local file system and then transmitted to the standby terminal through a transmission process. And the backup terminal synthesis process synthesizes the log fragments into a log file, finally, the application process analyzes and registers the log, and writes the log file into the disaster recovery database after the log file passes the verification and registration so as to realize the data replication of the database. The redo log file is divided into a plurality of data blocks, the first two data blocks of the redo log are respectively a file header and a log header, the file header and the log header record attribute information of the redo log file, and from the third data block, each data block records a redo record of the database operation process, as shown in fig. 2, the format and specific information of the file header of the redo log are shown. The specific data mining method comprises the following steps:

s2: if the log file exists on the ASM, acquiring data block size information of the redo log file from a table v $ ASM _ disk in the ASM; otherwise, accessing a file system, and acquiring data block size information of the redo log file from v $ log and v $ logfile in the file system, wherein the data block size information refers to the number of bytes of one data block; the table v $ ASM _ disk and the table v $ log, v $ logfile are tables inherent in the file system and the ASM, are used for storing basic information of the database, and include size information of the data block;

S4: database ID information of 24 th-27 th bytes of a file head and thread information of 176 th-177 th bytes in the current redo log file are mined, whether the database ID information is consistent with the corresponding information of the database or not is checked, and whether the obtained current redo log belongs to the database or not is judged; if not, returning an error, and if the error is consistent, entering the next step;

S6: mining a data block of recorded content after a file header of the current redo log file, mining 1M data each time, returning error information if the acquisition fails, and terminating the current mining; generally, each data block has 512 bytes, and mining 1M is exactly the size of two data blocks, if the byte number of the mining data is increased, the mining speed is too low, if the database is changed again at the moment, the mined data is inconsistent with the actual redo record, and if the byte number of the mining data is reduced, the waste of computer operation resources is caused, so the mining speed is set to 1M;

S11: calculating the checksum of the redo log file, and writing the check result into 14-15 bytes of a file header of the redo log; the algorithm for the checksum is as follows:

(1) dividing the log header into a plurality of 64-byte blocks; the definition of 64 bytes is because the following calculation is convenient, and each block is divided into 4 segments of 16 bytes, 4 × 16 ═ 64;

(2) Firstly, dividing a first block into four 16-byte segments, and carrying out 16-byte XOR on the 1 st segment and the 2 nd segment to obtain a value a; carrying out 16-byte XOR on the 3 rd section and the 4 th section to obtain a value b; carrying out XOR on a and b to obtain new c, wherein the initial value of c is 0, and carrying out XOR on c and b to obtain new c;

(3) Circularly executing the operation on the large 64-byte section, and carrying out exclusive OR on the large 64-byte section and the value c obtained in the previous round each time, wherein c is a global variable and is not an initial value taken each time;

(4) waiting for all the calculation to obtain a final c value, and dividing 16 bytes of c into 4 segments of 4 bytes, r1, r2, r3 and r 4; carrying out bitwise XOR on the four values to obtain a value r 0;

(5) shifting r0 to the right by 16 bits to obtain g, and performing XOR on r0 and g to obtain m;

(6) performing bitwise AND on m and 0xFFFF to obtain a final checksum;

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art should be able to cover the technical solutions and the inventive concepts of the present invention within the technical scope of the present invention.

Claims

1. A data mining method for redo log files of an Oracle database is characterized by comprising the following steps:

2. the data mining method for the Oracle database redo log file according to claim 1, wherein the redo log file is divided into a plurality of data blocks, the first two data blocks of the redo log are respectively a "file header" and a "log header", the "file header" and the "log header" record the attribute information of the redo log file, and each data block records a redo record of the database operation process from the third data block.