CN115658391A - Backup recovery method of WAL mechanism based on QianBase MPP database - Google Patents
- Publication number
- CN115658391A (application number CN202211507825.3A)
- Authority
- CN
- China
- Prior art keywords
- wal
- backup
- file
- qianbase
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a backup and recovery method based on the WAL mechanism of a QianBase MPP database. The transaction data of each successfully executed transaction in the QianBase MPP database is written into a WAL file, and each node stores its own WAL file together with WAL files of other nodes; the transaction data of each successfully executed transaction carries a timestamp. During recovery, the metadata of a first WAL backup file is read and the WAL file is copied to write back data until the backup data is recovered; before the metadata of the first WAL backup file is read, the current WAL file of the node in the QianBase MPP database is backed up to obtain a second WAL backup file. The WAL mechanism ensures the consistency and accuracy of the backup and recovery system: each node in the QianBase MPP database holds its own complete WAL file as well as WAL files of other nodes, so that the normal operation of the database and the integrity of the data can still be ensured when a node fails.
Description
Technical Field
The invention relates to a data backup and recovery method, in particular to a backup and recovery method of a WAL mechanism based on a QianBase MPP database.
Background
Database data security has always been a major concern of vendors and customers. A computer database faces various threats during operation that put its data security to the test; backup and recovery technology stores the database data offline at regular intervals so that data is not lost when the database or the computer is damaged. Current mainstream database backup and recovery schemes are mainly those of MySQL, Oracle, GaussDB, SQL Server and the like. Most of them either back up the data directly to generate SQL files, or apply different backup modes to the logs and files in the database to perform full backup and differential backup respectively. However, in a distributed database scenario such as QianBase MPP, these schemes cannot guarantee service continuity in a multi-machine environment, cannot handle a failed backup or recovery of the distributed database, and cannot recover to an arbitrary point in time. It is therefore necessary to study and improve data backup and recovery methods for the distributed database scenario.
Disclosure of Invention
One of the objectives of the present invention is to provide a backup recovery method for a WAL mechanism based on a QianBase MPP database, so as to solve the technical problems that the existing data backup recovery scheme cannot ensure the continuity of services in a multi-computer environment, and the normal use of the database is affected after the database backup or recovery fails.
In order to solve the technical problems, the invention adopts the following technical scheme:
the invention provides a backup recovery method of a WAL mechanism based on a QianBase MPP database, which comprises the following steps.
Step A, writing the transaction data of each successfully executed transaction in the QianBase MPP database into a WAL file, wherein each node in the QianBase MPP database stores its own WAL file and the WAL files of other nodes; the transaction data of each successfully executed transaction carries a timestamp.
Step B, backing up the current WAL file of the node in the QianBase MPP database to obtain a second WAL backup file; then reading the metadata of a first WAL backup file, and copying the WAL file to write back data until the backup data is recovered.
Preferably, a further technical scheme is as follows: when data recovery from the first WAL backup file fails, the metadata of the second WAL backup file is read, and the WAL file is copied to write back data until the backup data is recovered.
A further technical scheme is as follows: after a backup is started, the WAL file of the current node is closed, the WAL file is transferred to a backup directory, and the metadata of the WAL file is recorded.
A further technical scheme is as follows: in the method, the WAL of the same node in the QianBase MPP database is stored on at least two nodes and is updated synchronously.
The present invention also provides a computer-readable storage medium having stored thereon instructions which, when executed by a computer, cause the computer to perform the above-described method.
Compared with the prior art, the invention has the following beneficial effects: the consistency and the accuracy of a backup and recovery system are ensured through a WAL mechanism, each node in the QianBase MPP database has a WAL file which is complete per se and WAL files of other nodes, and when the node fails, the normal operation of the database and the integrity of data can still be ensured.
Drawings
FIG. 1 is a schematic block diagram illustrating the structure of the QianBase MPP database in one embodiment of the present invention.
FIG. 2 is a logic flow diagram illustrating one embodiment of the present invention.
Detailed Description
The QianBase MPP mentioned in the present invention is a relational distributed database applied to data warehouses, with outstanding advantages in data storage, high concurrency, high availability, linear scalability, response speed, ease of use, cost performance and the like. Specifically, as shown in fig. 1, the database architecture comprises three layers, namely a client service layer, an SQL database service layer and a storage engine layer. The first layer is the client service layer, where the application resides. The application may be written by a user or implemented through a third-party ISV tool or solution. The QianBase database service layer is accessed through a standard ODBC/JDBC interface using a Windows or Linux client driver provided by QianBase. QianBase supports Type 2 JDBC, Type 4 JDBC and ado. The appropriate driver type can be selected according to the particular requirements (response time, number of connections, security requirements, and other factors). The second layer is the SQL database service layer. This layer includes all the QianBase services, encapsulating all the services that manage QianBase objects and efficiently execute SQL database requests. These services include connection management, SQL statement compilation and creation of optimal execution plans, SQL execution (serial and parallel), transaction management and workload management. The third layer is the storage engine layer, which includes the standard Hadoop services used by QianBase (HDFS and Zookeeper). QianBase objects are stored in native Hadoop database structures, including HBase, cached text files and key-value sequence files. QianBase handles SQL requests coming from applications and transparently translates them into the native interface calls required by the underlying data format.
QianBase provides a relational schema abstraction over HBase, so QianBase can support traditional relational database objects (tables, views, secondary indexes) using the familiar DDL/DML syntax (object naming, column definition, and data type support). In addition, QianBase also supports native HBase and Hive tables as external tables of QianBase.
The WAL (Write-Ahead Log) is the log in which the RegionServer of HBase records the contents of operations while processing data insertions and deletions; when a node goes down, data can be recovered from this log.
The invention is further elucidated with reference to the drawings.
Referring to fig. 1 and 2, one embodiment of the present invention is a backup recovery method for WAL mechanism based on QianBase MPP database, which includes the following steps.
S1, writing the transaction data of each successfully executed transaction in the QianBase MPP database into a WAL file. Importantly, each node in the QianBase MPP database stores its own WAL file as well as WAL files of other nodes; that is, the WAL files of any one node are stored on at least two nodes and updated synchronously. In addition, the transaction data of each successfully executed transaction carries a timestamp, through which the data of the current node can be restored to any previous point in time.
S2, reading the metadata of the first WAL backup file, and copying the WAL file to write back data until the backup data is recovered. Note that, before the metadata of the first WAL backup file is read in this step, the current WAL file of the node in the QianBase MPP database is first backed up to obtain a second WAL backup file. Thus, if the recovery fails, the database can be rolled back to its state before the recovery action.
When data recovery from the first WAL backup file fails, the metadata of the second WAL backup file is read, and the WAL file is then copied to write back data until the backup data is recovered. For the backup itself, once a backup is started, the WAL file of the current node is closed, the WAL file is then transferred to the backup directory, and the metadata of the WAL file is recorded.
Based on the above description of the method of the present invention, more specifically: First, storing the WAL: in the distributed database QianBase MPP, as shown in fig. 1, each transaction is first written into the WAL, which ensures that no transaction inconsistency arises when the database fails or the computer loses power. In the distributed environment, each node stores its own WAL and also cross-stores part of the WALs of other nodes, so that a complete WAL is still available when some nodes are down.
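The cross-storage of WALs described above can be illustrated with a minimal Python sketch. It is not the QianBase implementation: the directory layout (`<node_dir>/<owner>.wal`), the JSON-lines record format and the names `append_transaction` and `read_wal` are assumptions made only for illustration. Each committed transaction is stamped with a timestamp and appended to every copy of the owning node's WAL, and at least two copies are required.

```python
import json
import time
from pathlib import Path


def append_transaction(node_dirs, owner, txn_id, payload, ts=None):
    """Append one committed transaction to every stored copy of the owner's WAL.

    node_dirs: directories standing in for the nodes that hold a copy of
    the owner's WAL (hypothetical layout: <node_dir>/<owner>.wal).
    """
    if len(node_dirs) < 2:
        raise ValueError("the WAL of a node must be stored on at least two nodes")
    record = {
        "txn_id": txn_id,
        "ts": time.time() if ts is None else ts,  # commit timestamp
        "payload": payload,
    }
    line = json.dumps(record, sort_keys=True) + "\n"
    # Write the same record to each copy so the copies stay in step.
    for node_dir in node_dirs:
        path = Path(node_dir) / f"{owner}.wal"
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as wal:
            wal.write(line)
            wal.flush()
    return record


def read_wal(node_dir, owner):
    """Return all records of the owner's WAL as stored in node_dir."""
    path = Path(node_dir) / f"{owner}.wal"
    with path.open(encoding="utf-8") as wal:
        return [json.loads(line) for line in wal if line.strip()]
```

In this sketch, losing one directory still leaves a complete copy of the WAL in the other, which is the property the cross-storage is meant to provide.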
Each piece of transaction data carries an accurate timestamp: each transaction is assigned an accurate timestamp when it succeeds, which ensures that no discrepancy in the amount of data arises during recovery that would leave the final data inconsistent.
The size of the WAL file is adjustable: the size of the WAL file can be adjusted according to the actual situation; otherwise an oversized single file degrades the performance of data backup and movement and, in severe cases, affects the operating efficiency of the database.
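One common way to keep a single WAL file from growing too large is size-based rotation. The following Python sketch shows the idea under stated assumptions: the class name `RotatingWal`, the segment naming scheme `wal-NNNNNN.log` and the `max_bytes` parameter are hypothetical and are not taken from QianBase or HBase.

```python
from pathlib import Path


class RotatingWal:
    """Append-only log that starts a new segment once a size limit is reached."""

    def __init__(self, directory, max_bytes):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.max_bytes = max_bytes  # adjustable upper bound per segment
        self.index = 0

    def _segment(self):
        return self.directory / f"wal-{self.index:06d}.log"

    def append(self, line):
        """Append one line and return the segment file it was written to."""
        data = (line + "\n").encode("utf-8")
        seg = self._segment()
        size = seg.stat().st_size if seg.exists() else 0
        # Roll over before the write would push the segment past the limit,
        # unless the segment is still empty (an oversized record gets its own file).
        if size > 0 and size + len(data) > self.max_bytes:
            self.index += 1
            seg = self._segment()
        with seg.open("ab") as fh:
            fh.write(data)
        return seg
```

A smaller `max_bytes` yields more, smaller files that are cheaper to move one at a time; a larger value yields fewer files. The right trade-off depends on the deployment.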
Transferring the WAL file during backup: when a backup is performed, as shown in fig. 2, only the WAL file that is currently open needs to be closed and its information recorded; the closed WAL files are transferred directly to the backup directory, which has no impact on the services of the database. If there is concern that copying the data will take up too much CPU and I/O of the physical machine, an upper limit on the resources used by the backup can also be set, so that the backup proceeds as unobtrusively as possible.
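The backup step can be sketched in Python as follows. This is only an illustration: the function name `backup_closed_wals`, the `*.wal` naming and the contents of the metadata (file name, size and SHA-256 digest in a `manifest.json`) are assumptions, since the text does not specify which metadata is recorded. The sketch also assumes the active WAL has already been closed by the caller.

```python
import hashlib
import json
import shutil
from pathlib import Path


def backup_closed_wals(wal_dir, backup_dir):
    """Move every closed WAL file into backup_dir and write a metadata manifest.

    Assumes the caller has already closed the active WAL, so that no writer
    still holds any *.wal file in wal_dir.
    """
    wal_dir, backup_dir = Path(wal_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    manifest = []
    for wal in sorted(wal_dir.glob("*.wal")):
        data = wal.read_bytes()
        # Record the metadata before the file leaves the WAL directory.
        manifest.append({
            "name": wal.name,
            "size": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        })
        shutil.move(str(wal), str(backup_dir / wal.name))
    manifest_path = backup_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

Because the files are moved rather than rewritten, the running database only has to open a fresh WAL file; the digest in the manifest lets a later restore check that a backed-up file is intact.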
WAL files are not deleted directly during recovery: when the database needs to be restored, it may still be running, and each node still holds a large number of WAL files. As shown in fig. 2, the backed-up WAL files need to be copied back to the node during restoration; before that, the WAL files currently on the node are backed up, which ensures that the database can be rolled back to its state before the restoration action if the restoration fails.
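The recover-with-rollback behaviour can be sketched as below. This is a minimal Python illustration, not the patented implementation: `restore_with_rollback`, the directory arguments and the `apply_wal` callback (which stands in for replaying one WAL file into the database and raises on error) are all hypothetical names. The safety copy taken in step 1 plays the role of the second WAL backup file.

```python
import shutil
from pathlib import Path


def _replace_dir(src, dst):
    """Make dst an exact copy of src, discarding whatever dst held before."""
    if Path(dst).exists():
        shutil.rmtree(dst)
    shutil.copytree(src, dst)


def restore_with_rollback(wal_dir, first_backup, second_backup, apply_wal):
    """Restore wal_dir from first_backup; roll back to the pre-restore state on failure.

    apply_wal is a hypothetical callback that replays one WAL file and raises on error.
    Returns True if the restore succeeded, False if it was rolled back.
    """
    wal_dir = Path(wal_dir)
    # 1. Safety copy of the node's current WAL files (the "second WAL backup").
    _replace_dir(wal_dir, second_backup)
    try:
        # 2. Copy the backed-up WAL files back and replay them in name order.
        _replace_dir(first_backup, wal_dir)
        for wal in sorted(wal_dir.glob("*.wal")):
            apply_wal(wal)
        return True
    except Exception:
        # 3. Restore failed: put the pre-restore WAL files back.
        _replace_dir(second_backup, wal_dir)
        return False
```

The sketch only restores the WAL files on rollback; a real system would also have to replay them to bring the data back to the pre-restore state.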
Because current data backup and recovery schemes copy the data itself or copy data blocks, it is difficult for them to ensure the integrity and consistency of data in the distributed database QianBase MPP. The method provided by the present invention effectively solves this problem and has the following characteristics:
First, data consistency is guaranteed in a distributed environment: in the distributed database QianBase MPP, every piece of business data is written in full into the WAL and marked with a time point and with whether it succeeded or failed. When a backup is initiated, the WAL files on disk are simply closed and then copied to the specified backup directory. The backed-up data therefore contains more than the data that strictly needs to be backed up; during recovery, the data is filtered by time point, which ensures data consistency, and the running services are not affected during the backup process.
Second, handling a failed backup or recovery: when a backup or recovery operation is executed, unexpected conditions mean that its success cannot be guaranteed, but a failure must still have no negative effect on the operations of the currently running database. Because the backup operation works only on WAL files, a failed backup does not affect running business operations. Before a recovery operation is executed, a backup is performed first, and the recovery is started only after that backup succeeds; if the recovery fails, the database is restored from that backup, which ensures the consistency of the business data.
Third, point-in-time (PIT) recovery of the distributed database: with backup and recovery based on the WAL mechanism, each WAL file records the timestamp of every piece of business data, and recovering to an arbitrary point in time amounts to filtering out the data in the WAL files that falls within the required time period, which ensures the consistency of the recovered data at any point in time.
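The time-point filtering behind PIT recovery can be shown with a short Python sketch. The record format is an assumption made for illustration: one JSON object per line with a numeric `ts` field and a `status` field whose value `"ok"` marks a successful transaction; the function name `records_up_to` is likewise hypothetical.

```python
import json


def records_up_to(wal_lines, target_ts):
    """Return the successful records with timestamp <= target_ts, oldest first."""
    selected = []
    for line in wal_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Keep only successful transactions that fall inside the time window.
        if record.get("status") == "ok" and record["ts"] <= target_ts:
            selected.append(record)
    selected.sort(key=lambda r: r["ts"])
    return selected
```

Replaying the returned records in order would bring the data to its state at `target_ts`; records after that time, and failed transactions, are left out even though the backed-up WAL files contain them.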
Based on the general form of the computer software product, a further embodiment of the present invention provides a computer-readable storage medium, in which instructions are stored, and when the instructions are executed by a computer, the computer is enabled to execute the backup and restore method based on the WAL mechanism of the QianBase MPP database in the foregoing embodiment.
The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (EPROM), a register, an optical fiber, a CD-ROM, an optical storage device, a magnetic storage device, any suitable combination of the foregoing, or any other form of computer readable storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an Application Specific Integrated Circuit (ASIC). In embodiments of the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In addition to the foregoing, it should be noted that reference throughout this specification to "one embodiment," "another embodiment," "an embodiment," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment described generally throughout this application. The appearances of the same phrase in various places in the specification are not necessarily all referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with any embodiment, it is submitted that it is within the scope of the invention to effect such feature, structure, or characteristic in connection with other embodiments.
Although the invention has been described herein with reference to a number of illustrative embodiments thereof, it should be understood that numerous other modifications and embodiments can be devised by those skilled in the art that will fall within the spirit and scope of the principles of this disclosure. More specifically, various variations and modifications are possible in the component parts and/or arrangements of the subject combination arrangement within the scope of the disclosure, the drawings and the appended claims. In addition to variations and modifications in the component parts and/or arrangements, other uses will also be apparent to those skilled in the art.
Claims (5)
1. A backup recovery method of a WAL mechanism based on a QianBase MPP database is characterized by comprising the following steps:
writing transaction data of each transaction successfully executed in a QianBase MPP database into a WAL file, wherein each node in the QianBase MPP database stores the WAL file of the node and the WAL files of other nodes; each transaction data of the successfully executed transaction has a time stamp;
reading metadata of a first WAL backup file, and copying the WAL file to write back data until the backup data is recovered; before reading the metadata of the WAL backup file, firstly backing up the current WAL file of the node in the QianBase MPP database to obtain a second WAL backup file.
2. The QianBase MPP database-based backup recovery method for the WAL mechanism according to claim 1, characterized in that: when the data recovery of the first WAL backup file fails, reading the metadata of the second WAL backup file, and copying the WAL file to write back the data until the backup data is recovered.
3. The QianBase MPP database-based backup recovery method for WAL mechanism according to claim 1 or 2, characterized in that: after starting backup, closing the WAL file of the current node, transferring the WAL file to a backup directory, and recording the metadata of the WAL file.
4. The QianBase MPP database-based backup recovery method for the WAL mechanism according to claim 1, characterized in that: in the method, the WAL of the same node in the QianBase MPP database is stored in at least two nodes and is synchronously updated.
5. A computer-readable storage medium, characterized in that: the computer-readable storage medium has stored therein instructions that, when executed by a computer, cause the computer to perform the method of any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211507825.3A CN115658391A (en) | 2022-11-29 | 2022-11-29 | Backup recovery method of WAL mechanism based on QianBase MPP database |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211507825.3A CN115658391A (en) | 2022-11-29 | 2022-11-29 | Backup recovery method of WAL mechanism based on QianBase MPP database |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115658391A (en) | 2023-01-31 |
Family
ID=85020142
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211507825.3A Pending CN115658391A (en) | 2022-11-29 | 2022-11-29 | Backup recovery method of WAL mechanism based on QianBase MPP database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115658391A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115858252A (en) * | 2023-02-21 | 2023-03-28 | 浙江智臾科技有限公司 | Data recovery method, device and storage medium |
CN115858252B (en) * | 2023-02-21 | 2023-06-02 | 浙江智臾科技有限公司 | Data recovery method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||