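WO2021018020A1 - Data processing method, apparatus, electronic device, and computer storage medium - Google Patents

Data processing method, apparatus, electronic device, and computer storage medium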

Info

Publication number
WO2021018020A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
database
backup
database instance
distributed
Prior art date
Application number
PCT/CN2020/104015
Other languages
English (en)
French (fr)
Inventor
吴迪
郭鹏
楼江航
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2021018020A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • The embodiments of the present invention relate to the field of computer technology, and in particular to a data processing method, device, electronic device, and computer storage medium.
  • Database backup and recovery are important guarantees of user data security. As a user's business scale grows, the amount of data stored in the database and the storage load also grow exponentially.
  • To handle this growth, database sharding technology (for example, MySQL Sharding) is provided.
  • Database sharding (splitting into sub-databases and sub-tables) is a technology that splits a database and/or data table and distributes the storage according to a sharding algorithm. In the usage scenario of a sharded distributed database, database backup and recovery operations are even more important.
  • In view of this, embodiments of the present invention provide a data processing solution to solve some or all of the above-mentioned problems.
  • According to a first aspect, a data processing method is provided, which includes: receiving a backup request for a distributed database, and instructing each database instance corresponding to the backup request to perform a full data backup; after determining that each database instance has completed the full data backup, locking the distributed transactions used to update data across database instances; obtaining information about the log files that record the incremental data of each database instance within a set time period; and unlocking the distributed transactions, and generating a data backup set of the distributed database according to the backup result of each database instance and the log file information.
  • According to a second aspect, a data processing method is provided, which includes: receiving a restore request for a distributed database, and determining the data backup set indicated by the restore request, the data backup set being one generated by the data processing method of the first aspect; and, according to the data backup set, instructing each corresponding database instance to perform a full recovery operation.
  • According to a third aspect, a data processing device is provided, which includes: a full backup module, configured to receive a backup request for a distributed database and instruct each database instance corresponding to the backup request to perform a full data backup; a locking module, configured to lock the distributed transactions used to update data across database instances after determining that each database instance has completed the full data backup; a first acquisition module, configured to obtain information about the log files that record the incremental data of each database instance within a set time period; and an unlocking module, configured to unlock the distributed transactions and generate the data backup set of the distributed database according to the backup result of each database instance and the log file information.
  • According to a fourth aspect, a data processing device is provided, which includes: a backup set determining module, configured to receive a restore request for a distributed database and determine the data backup set indicated by the restore request, the data backup set being one generated by the data processing device of the third aspect; and a full recovery module, configured to instruct each corresponding database instance to perform a full recovery operation based on the data backup set.
  • According to a fifth aspect, an electronic device is provided, including: a processor, a memory, a communication interface, and a communication bus.
  • The processor, the memory, and the communication interface communicate with each other through the communication bus.
  • The memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the data processing method described in the first aspect or the second aspect.
  • According to a sixth aspect, a computer storage medium is provided, having a computer program stored thereon; when the program is executed by a processor, the data processing method described in the first aspect or the second aspect is implemented.
  • In the data processing solution provided by the embodiments, each database instance corresponding to the distributed database is instructed to perform a full data backup according to the backup request; after the full data backups complete, the distributed transactions are locked and the log file information is obtained.
  • Locking distributed transactions prevents them from causing data inconsistencies between database instances, which would make the global data inconsistent, and so ensures that the distributed database is in a globally consistent data state.
  • After the log file information is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and a data backup set is generated from the backup result of each database instance and the log file information. The database is thus backed up with global data consistency and with minimal impact on the user's business.
  • Fig. 1 is a flowchart of the steps of a data processing method according to the first embodiment of the present invention
  • Fig. 2a is a sequence diagram of a distributed database for database backup according to scenario 1 of the present invention
  • Figure 2b is a usage scenario diagram of the data processing solution according to the present invention.
  • Fig. 3 is a flowchart of steps of a data processing method according to the second embodiment of the present invention.
  • Fig. 5 is a flowchart of steps of a data processing method according to the fourth embodiment of the present invention.
  • Fig. 6 is a flowchart of steps of a data processing method according to the fifth embodiment of the present invention.
  • Fig. 7 is a flowchart of steps of a data processing method according to the sixth embodiment of the present invention.
  • Fig. 8 is a flowchart of steps of a data processing method according to the seventh embodiment of the present invention.
  • FIG. 9 is a sequence diagram of a distributed database for database recovery according to scenario 2 of the present invention.
  • Fig. 10 is a structural block diagram of a data processing device according to the eighth embodiment of the present invention.
  • FIG. 11 is a structural block diagram of a data processing device according to the ninth embodiment of the present invention.
  • Fig. 12 is a structural block diagram of a data processing device according to the tenth embodiment of the present invention.
  • Figure 13 is a structural block diagram of a data processing device according to the eleventh embodiment of the present invention.
  • Fig. 14 is a schematic structural diagram of an electronic device according to the twelfth embodiment of the present invention.
  • Step S102: Receive a backup request for the distributed database, and instruct each database instance corresponding to the backup request to perform a full data backup.
  • The distributed database may be a database that uses sharding (sub-database and sub-table) technology, such as MySQL Sharding.
  • This kind of distributed database can split a data table holding a large amount of data into multiple sub-tables and distribute them across multiple database instances (such as MySQL instances).
  • The sub-tables in each MySQL instance contain only part of the data of the data table, which spreads the data storage and computation load across multiple MySQL instances and solves the single-machine performance bottleneck.
  • A distributed database includes a data table and its corresponding index table. Both the data table and the index table are split into two sub-tables and stored on two database instances (such as MySQL instance A and MySQL instance B shown in Figure 2a).
  • Each database instance of the distributed database is instructed to perform a full data backup according to the backup request; for example, MySQL instance A and MySQL instance B each perform a full data backup.
  • During the database backup process, transactional and non-transactional SQL that involves only a single database instance can execute normally, and the resulting data updates can be restored from the log file of each database instance. Because each database instance is backed up as a standalone instance, data consistency is ensured without significant intrusion on the user's business during the backup, and the user's business continues to run normally.
  • Step S104: After determining that each database instance has completed the full data backup, lock the distributed transactions used to update data across database instances.
  • A distributed transaction is a transaction that updates data across database instances, where the data update can be a data change, a data deletion, a data insertion, and so on.
  • Locking cross-instance distributed transactions prevents new distributed transactions from executing, which avoids cross-instance data updates and ensures the global data consistency of the database after the full data backup completes.
  • Step S106: Obtain information about the log files that record the incremental data of each database instance within a set time period.
  • Each database instance corresponds to a log file, and the log file records the incremental data of that database instance within the set time period.
  • The log file can record incremental data within the set time period by recording, for the corresponding database instance, the location of each data change and the data after the change.
  • The log file can also record the data before the change at each data change location, so that it can be used for verification, error correction, or other purposes.
  • Those skilled in the art can determine the specific start time and end time of the set time period as needed. For example, the start time is the time when the corresponding database instance starts the full backup operation, and the end time is the time when the distributed transactions are locked; the embodiment of the present invention does not limit this.
  • the log files in different types of database instances may be different.
  • For MySQL instances, for example, the log files can be binlog files.
  • The data format of the binlog file is row format (that is, the binlog_format parameter value is row).
  • the binlog file of this data format can record the SQL statements of each insert record, delete record, and change record executed by the database instance, and the value of the affected data row before and after the change.
  • log files may be other types of log files, which is not limited in this embodiment.
  • The distributed transactions are locked to ensure that the distributed database is in a globally consistent state, and the log file information is acquired while the distributed transactions are locked to ensure that the data recorded by the log files is also globally consistent.
  • Because the amount of log file information to be obtained is very small, the acquisition completes in a short time (such as 1 second), so the distributed transactions are locked only briefly. Global data consistency is thus guaranteed during the database backup with almost no intrusion on the user's business.
  • The log file information can be any appropriate information.
  • For example, the log file information can be the location information of the log file, which can include the database instance information (such as the database instance ID) corresponding to the log file, the name of the log file, and the offset of the log file.
  • the log file includes data change location information of the incremental data and information after the data change.
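The log file information described above (instance information, log file name, offset) could be represented as a small record. The following is a minimal sketch; the class name `LogPosition`, its field names, and the JSON encoding are assumptions for illustration and are not specified by the source.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LogPosition:
    """Location information for one instance's log file (hypothetical names)."""
    instance_id: str  # database instance information, e.g. an instance ID
    log_name: str     # name of the log file, e.g. a binlog file name
    offset: int       # offset within the log file when the lock was held


def encode_positions(positions):
    """Serialize positions so they can be stored in a data backup set."""
    return json.dumps([asdict(p) for p in positions], sort_keys=True)


def decode_positions(text):
    """Rebuild the position records from their serialized form."""
    return [LogPosition(**item) for item in json.loads(text)]
```

Only this small record is captured while distributed transactions are locked, which is why the lock can be short.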
  • Step S108: Unlock the distributed transactions, and generate a data backup set of the distributed database according to the backup result of each database instance and the log file information.
  • The data backup set generated in this way can be used for subsequent database restoration.
  • Through this embodiment, each database instance corresponding to the distributed database is instructed to perform a full data backup according to the backup request; after the full data backups complete, the distributed transactions are locked and the log file information is obtained.
  • Locking distributed transactions prevents them from causing data inconsistencies between database instances, which would make the global data inconsistent, and so ensures that the distributed database is in a globally consistent data state.
  • After the log file information is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and a data backup set is generated from the backup result of each database instance and the log file information. The database is thus backed up with global data consistency and with minimal impact on the user's business.
  • the data processing method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to: servers, mobile terminals (such as tablet computers, mobile phones, etc.), and PCs.
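Steps S102 to S108 can be sketched end to end with in-memory stand-ins. This is an illustrative sketch only: `FakeInstance`, `Middleware`, and the dictionary layout of the backup set are hypothetical, and a real system would talk to actual database instances rather than Python dictionaries.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class FakeInstance:
    """In-memory stand-in for one database instance (hypothetical)."""
    instance_id: str
    rows: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # appended row changes

    def write(self, key, value):
        self.rows[key] = value
        self.log.append((key, value))

    def full_backup(self):
        # Copy the current rows as the content data backup result.
        return dict(self.rows)

    def log_position(self):
        return {"instance_id": self.instance_id,
                "log_name": f"{self.instance_id}-log",
                "offset": len(self.log)}


class Middleware:
    def __init__(self, instances):
        self.instances = instances
        # Held while commits of distributed transactions are blocked.
        self.dtx_lock = threading.Lock()

    def distributed_write(self, updates):
        # A cross-instance update holds the lock for the whole commit.
        with self.dtx_lock:
            for instance_id, (key, value) in updates.items():
                self.instances[instance_id].write(key, value)

    def backup(self):
        # S102: full data backup of every instance.
        full = {i: inst.full_backup() for i, inst in self.instances.items()}
        # S104: lock distributed transactions.
        with self.dtx_lock:
            # S106: record log positions while the lock is held.
            positions = [inst.log_position() for inst in self.instances.values()]
        # S108: the lock is released here; assemble the data backup set.
        return {"full": full, "log_positions": positions}
```

In this sketch the lock is held only while the small position records are read, mirroring the short lock duration the embodiment describes.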
  • Referring to FIG. 3, a flowchart of the steps of a data processing method according to the second embodiment of the present invention is shown.
  • the data processing method of this embodiment includes the aforementioned steps S102 to S108.
  • step S104 includes the following sub-steps:
  • Sub-step S1041: After determining that each database instance has completed the full data backup, determine whether all cross-instance distributed transactions that are currently executing have finished committing.
  • In this embodiment, a commit is considered finished once the transaction has either been formally committed or been rolled back.
  • If all commits have finished, sub-step S1042 is executed; otherwise, no action may be taken, or other appropriate actions may be executed.
  • Sub-step S1042: If the commits have finished, generate a blocking instruction that instructs locking of distributed transactions, so as to lock the distributed transactions used to update data across database instances.
  • Generating a blocking instruction that locks distributed transactions prevents the distributed database from executing new distributed transactions that would destroy the global consistency of the data, and also means that the log file information obtained while the distributed transactions are locked is globally consistent.
  • The blocking instruction uses seconds as the unit of blocking duration, for example 1 second, 2 seconds, 5 seconds, or 10 seconds, so that the blocking time is short and intrusion on user services is reduced as much as possible.
  • As in the first embodiment, locking distributed transactions after the full data backups complete, recording the log file information while they are locked, and then unlocking them yields a globally consistent data backup set with minimal impact on the user's business.
  • In addition, the blocking instruction uses seconds as the unit of blocking duration, which reduces intrusion on user services as much as possible.
  • the data processing method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to: servers, mobile terminals (such as tablet computers, mobile phones, etc.), and PCs.
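Sub-steps S1041 and S1042 amount to waiting for in-progress distributed commits to finish and then blocking new ones for a bounded time. The sketch below shows one way to express that with a condition variable; the class `DistributedCommitGate` and its method names are hypothetical, and the timeouts stand in for the seconds-level blocking duration.

```python
import threading


class DistributedCommitGate:
    """Blocks commits of cross-instance transactions while a backup holds the lock."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0    # distributed commits currently executing
        self._blocked = False  # True while the blocking instruction is in effect

    def begin_commit(self, timeout=None):
        """Call before a distributed commit. Returns False if still blocked at timeout."""
        with self._cond:
            allowed = self._cond.wait_for(lambda: not self._blocked, timeout)
            if allowed:
                self._in_flight += 1
            return allowed

    def end_commit(self):
        """Call once the commit or rollback has finished."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def lock(self, timeout=None):
        """S1041 then S1042: wait for in-flight commits, then block new ones."""
        with self._cond:
            drained = self._cond.wait_for(lambda: self._in_flight == 0, timeout)
            if drained:
                self._blocked = True
            return drained

    def unlock(self):
        """Allow distributed commits again."""
        with self._cond:
            self._blocked = False
            self._cond.notify_all()
```

Single-instance SQL never passes through this gate in the sketch, so it is unaffected by the lock.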
  • Referring to FIG. 4, a flowchart of the steps of a data processing method according to the third embodiment of the present invention is shown.
  • the data processing method of this embodiment includes the aforementioned steps S102 to S108.
  • the step S104 may be implemented in the manner in the foregoing Embodiment 1 or Embodiment 2, or may be implemented in other manners.
  • step S108 includes the following sub-steps:
  • Sub-step S1081 Generate an unlock instruction that allows the distributed transaction to be submitted to instruct to unlock the distributed transaction.
  • an unlocking instruction allowing the submission of distributed transactions is generated, so that the user services can normally submit distributed transactions.
  • Sub-step S1082 Generate a data backup set of the distributed database based on the metadata of each database instance backed up when each database instance performs full data backup, the content data backup result of each database instance, and the information of the log file.
  • a data backup set is generated based on the metadata, content data backup results, and log file information of each database instance.
  • the metadata includes, but is not limited to, configuration information of each database instance, account information used to access the database instance, and so on.
  • the content data backup result includes the data in the data table stored in the corresponding database instance.
  • The log file records incremental data as data change location information together with the data after each change.
  • The log file information may be location information, which includes the database instance information corresponding to the log file, the name of the log file, the offset of the log file, and the like.
  • During subsequent data recovery, the log file can be obtained according to the log file name included in the log file information, and the incremental data can be determined according to the offset of the log file together with the data change location information and post-change data recorded in the log file.
  • The database instance corresponding to the log file is then incrementally restored based on the incremental data.
  • As in the first embodiment, locking distributed transactions after the full data backups complete, recording the log file information while they are locked, and then unlocking them yields a globally consistent data backup set with minimal impact on the user's business.
  • In addition, the unlocking instruction is generated as soon as the log file information has been obtained, so that the user's business can commit distributed transactions normally and intrusion on the user's business is minimized.
  • Generating a data backup set based on metadata, backup results, and log file information can facilitate the management of database backup data and facilitate subsequent database recovery.
  • the data processing method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to: servers, mobile terminals (such as tablet computers, mobile phones, etc.), and PCs.
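Sub-step S1082 combines three inputs into one data backup set: metadata, per-instance content data backup results, and log file information. The following is a minimal sketch of that assembly; the function name `build_backup_set` and the dictionary layout are assumptions, since the source does not define a storage format.

```python
def build_backup_set(metadata, content_results, log_positions):
    """Combine metadata, content backups, and log positions into one backup set.

    metadata:        dict of configuration and account information
    content_results: dict mapping instance ID to its content data backup result
    log_positions:   list of dicts with instance_id, log_name, and offset
    """
    by_instance = {p["instance_id"]: p for p in log_positions}
    if set(by_instance) != set(content_results):
        # Every backed-up instance needs exactly one recorded log position.
        raise ValueError("content backups and log positions cover different instances")
    return {
        "metadata": metadata,
        "instances": {
            instance_id: {
                "content_backup": content_results[instance_id],
                "log_position": by_instance[instance_id],
            }
            for instance_id in content_results
        },
    }
```

Keeping the log position next to each content backup makes the later incremental recovery a per-instance lookup.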
  • The execution order of the aforementioned steps is not limited by the step numbers. Those skilled in the art can configure the execution order as needed: the steps can all be executed sequentially, all in parallel, or partly sequentially and partly in parallel.
  • Referring to FIG. 2a, a sequence diagram of a distributed database performing database backup is shown.
  • Referring to FIG. 2b, a usage scenario diagram of the data processing solution is shown.
  • The distributed database includes middleware 200 and multiple database instances (that is, 300_1, 300_2 to 300_N in this embodiment). The middleware 200 and the database instances 300_1 to 300_N communicate through the network.
  • the middleware 200 (such as DRDS proxy) in the distributed database is used as the execution subject to describe the data processing method provided in the embodiment of the present invention.
  • The middleware 200 is a service process added between the user business end 100 and the database instances 300_1 to 300_N. It mainly provides the user business end 100 with the routing capability of a distributed database: SQL (Structured Query Language) statements from the user business end 100 are routed to the required database instance according to the sharding algorithm of the distributed database, so that the user business end 100 can conveniently manage and operate multiple database instances.
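The routing role of the middleware can be illustrated with a simple hash-based rule. The source does not specify the sharding algorithm, so the CRC32-modulo rule below is purely an assumption, and the function names `route` and `is_cross_instance` are hypothetical.

```python
import zlib


def route(shard_key, instance_ids):
    """Pick the database instance for a shard key (assumed hash-modulo rule)."""
    index = zlib.crc32(shard_key.encode("utf-8")) % len(instance_ids)
    return instance_ids[index]


def is_cross_instance(shard_keys, instance_ids):
    """True if an update touching these keys spans more than one instance.

    Such an update is a distributed transaction in the sense used here and is
    the only kind of commit that the backup lock needs to block.
    """
    return len({route(key, instance_ids) for key in shard_keys}) > 1
```

A statement whose keys all route to one instance stays a single-instance transaction and is unaffected by the lock.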
  • In this usage scenario, the middleware 200 performs database backup as follows:
  • Step A1: The user business end 100 triggers, through the middleware console (for example, the DRDS console), the middleware to perform a database backup operation.
  • Step B1 The middleware 200 (DRDS proxy) backs up the metadata of each database instance 300_1 to 300_N.
  • the metadata may include account information used by the middleware 200 to access the database instances 300_1 to 300_N (RDS), configuration information of sub-databases and tables of the database instances 300_1 to 300_N, etc.
  • Step C1 The middleware 200 (DRDS proxy) triggers the full backup operation of all lower-level database instances 300_1 to 300_N (such as MySQL instances).
  • Step D1: The middleware 200 (DRDS proxy) checks the backup status of the lower-level database instances 300_1 to 300_N (such as MySQL instances) until the full backup of all the database instances 300_1 to 300_N is completed.
  • Step E1: The middleware 200 (DRDS proxy) locks distributed transactions and blocks the commit of all currently uncommitted distributed transactions. Note that before locking, the middleware waits for all distributed transactions whose commit is already in progress to complete, and only then performs the lock, to prevent the global data from becoming inconsistent.
  • Step F1 The middleware 200 (DRDS proxy) records the location information of the current log files (such as binlog files) of each database instance 300_1 to 300_N (such as a MySQL instance).
  • the location information includes database instance information (for example, serverId), the name of the log file, and the offset of the log file (that is, binlog offset).
  • the log file includes data change location information for recording the incremental data and information after the data change.
  • the binlog file is a binary format file used to record the update operations on the data in the database. For example, record the data location affected by SQL of each data update operation and the data before and after the change. Using the binlog file can reliably record the data change information generated during the database backup process, thereby ensuring the reliability of subsequent database recovery based on the binlog file.
  • Step G1 The middleware 200 (DRDS proxy) unlocks distributed transactions and allows all distributed transactions to be committed.
  • Step H1 The database backup is completed, and a corresponding data backup set is generated according to the backup result of the full backup operation, the information of the log file, and the metadata.
  • In this usage scenario, the database is backed up using the distributed transaction locking mechanism (that is, the LOCK mechanism) together with the recorded binlog position information, which minimizes the impact on user services.
  • Distributed transactions are locked for only seconds, and only cross-database transaction commit operations are blocked; other SQL execution is unaffected. The backup therefore has little impact on the user's business: no errors are reported because distributed transactions are locked, and non-transactional SQL and single-instance transactional SQL continue to execute.
  • Referring to FIG. 5, a flowchart of the steps of a data processing method according to the fourth embodiment of the present invention is shown.
  • Step S502 Receive a restoration request for the distributed database, and determine a data backup set indicated by the restoration request.
  • the middleware of the distributed database (such as DRDS proxy) is used as the execution subject to describe the data processing method provided in the embodiment of the present invention.
  • middleware is used to facilitate user business management and operation of database instances in distributed databases.
  • the recovery request is used to instruct the distributed database to perform database recovery based on a certain data backup set.
  • the recovery request may indicate the data backup set used by the name, identification, or storage address of the data backup set.
  • the data backup set is a data backup set generated according to any one of the data processing methods in the foregoing first to third embodiments.
  • the data backup set at least includes content data backup results of the full backup operation performed by each database instance.
  • Step S504 According to the data backup set, instruct each corresponding database instance to perform a full recovery operation.
  • the middleware may determine the database instance involved in the recovery request according to the data backup set indicated by the recovery request, and then instruct each involved database instance to perform a full recovery operation.
  • In a first feasible approach, each database instance restores the backup result in full to the original database instance.
  • In a second feasible approach, step S504 can be implemented as: obtaining the content data backup result of each backed-up database instance from the data backup set, and instructing each corresponding database instance to restore the content data backup result to a newly created recovery database instance.
  • In this approach, a new database instance is created for each database instance as its recovery instance, and the backup result is restored to that recovery instance. Unlike the first approach, this does not affect the database instances that the user's business is using during the recovery process.
  • It therefore reduces the intrusion on user services during database recovery.
  • The full recovery operation completes the database recovery.
  • In this embodiment, data restoration uses the data backup set generated by the data processing method of any one of the first to third embodiments, which ensures the accuracy of database restoration and guarantees the global data consistency of the distributed database after restoration.
  • the data processing method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to: servers, mobile terminals (such as tablet computers, mobile phones, etc.), and PCs.
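Restoring into newly created recovery instances, rather than into the originals, can be sketched as follows. The function `full_restore`, the `create_instance` callback, and the backup set layout are hypothetical; a plain dictionary stands in for a database instance.

```python
def full_restore(backup_set, create_instance):
    """Restore each instance's content backup into a newly created recovery instance.

    create_instance(instance_id) is a caller-supplied factory (an assumption of
    this sketch) that returns an empty dict standing in for a new instance.
    """
    restored = {}
    for instance_id, entry in backup_set["instances"].items():
        target = create_instance(instance_id)   # new recovery instance
        target.update(entry["content_backup"])  # full recovery of the content data
        restored[instance_id] = target
    return restored
```

Because the originals are never written to, the instances serving the user's business are untouched while recovery runs.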
  • Referring to FIG. 6, a flowchart of the steps of a data processing method according to the fifth embodiment of the present invention is shown.
  • the data processing method of this embodiment includes the aforementioned steps S502 to S504.
  • the method further includes the following steps:
  • Step S506 Obtain the information of the log file that records the incremental data of each database instance in a set time period from the data backup set.
  • step S506 and step S508 are executed when the database is restored.
  • the log file includes the data change location information used to record the incremental data and the information after the data change.
  • the information of the log file includes: database instance information corresponding to the log file, the name of the log file, and the offset of the log file.
  • the information of the log file can also be set to include any other appropriate content as needed.
  • the log file may be a binlog file.
  • the parameter that sets the data format of the binlog file (i.e. binlog_format) is row, which means that for each data update operation the binlog records the data rows affected by the SQL statement and the values of those rows before and after the change.
  • Step S508: Perform an incremental recovery operation on the database instance after the full recovery operation, according to the information of the log file.
  • step S508 includes the following sub-steps:
  • Sub-step S5081: According to the information of the log file, determine the database instance to be incrementally restored and the incremental data of that database instance within the set time period.
  • the log file achieves the purpose of recording incremental data by recording data change location information and post-data change information of the corresponding database instance within a set time period.
  • other methods may be used to record incremental data.
  • the process of determining the database instance and incremental data to be incrementally restored may be: determining the database instance to be incrementally restored according to the database instance name in the information in the log file. According to the name of the log file and the offset of the log file in the information of the log file, determine the data change location information and the post data change information of the database instance to be incrementally restored within the set time period. Furthermore, the location of the incremental data in the database instance is determined based on the data change location information, and the content of the incremental data is determined based on the information after the data change.
  • Sub-step S5082: Perform incremental recovery on the determined database instance according to the incremental data.
  • the data at the determined position of the incremental data is updated to the post-change information.
  • the data generated by the data update operation during the database backup process can be restored, thereby ensuring global data consistency.
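Sub-steps S5081 and S5082 can be illustrated with a minimal sketch. It assumes each change record carries a table name and primary key (standing in for the data change location information) and the row image after the change (the post-change information); the names RowChange and apply_changes are hypothetical and do not come from the patent or from any MySQL API. Because each change overwrites a position with its final image, replaying the same changes a second time leaves the data unchanged.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Row = Dict[str, object]


@dataclass(frozen=True)
class RowChange:
    """Hypothetical record: where a change happened and the row image after it."""
    table: str
    primary_key: int
    after: Optional[Row]  # None means the row was deleted


def apply_changes(tables: Dict[str, Dict[int, Row]], changes: List[RowChange]) -> None:
    """Overwrite each changed position with its post-change image."""
    for change in changes:
        rows = tables.setdefault(change.table, {})
        if change.after is None:
            rows.pop(change.primary_key, None)  # deleting a missing row is a no-op
        else:
            rows[change.primary_key] = dict(change.after)


# State of a recovery instance right after the full recovery operation.
restored = {"orders": {1: {"id": 1, "status": "new"}}}
changes = [
    RowChange("orders", 1, {"id": 1, "status": "paid"}),
    RowChange("orders", 2, {"id": 2, "status": "new"}),
    RowChange("orders", 2, None),
]

apply_changes(restored, changes)
once = {table: dict(rows) for table, rows in restored.items()}
apply_changes(restored, changes)  # replaying the same changes is harmless
```

The second call models the case where part of the log is applied more than once during recovery.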
  • the data backup set generated by the data processing method described in any one of the foregoing embodiments 1 to 3 is used for data restoration, which ensures the accuracy of database restoration and guarantees the global data consistency of the restored distributed database.
  • the data processing method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to: servers, mobile terminals (such as tablet computers, mobile phones, etc.), and PCs.
  • Referring to FIG. 7, a flowchart of the steps of a data processing method according to the sixth embodiment of the present invention is shown.
  • the data processing method of this embodiment includes the aforementioned steps S502 to S504.
  • the method may also include or not include the aforementioned steps S506 to S508.
  • the method further includes the following steps:
  • Step S510: Obtain the backed-up metadata of each database instance from the data backup set, create a new middleware instance based on the metadata, and restore the metadata to the new middleware instance.
  • middleware such as DRDS proxy
  • metadata of each database instance of the distributed database is stored on the middleware.
  • the metadata includes but is not limited to: the account information used by the middleware to access the database instance (RDS), the database and table configuration information of the database instance, etc.
  • the data backup set also includes metadata of each database instance.
  • the middleware obtains the backup metadata from the data backup set and restores it to the new middleware instance created.
  • a person skilled in the art can use any appropriate method to restore the metadata to the new middleware instance, for example, a copy method, which is not limited in this embodiment.
  • a new middleware instance corresponding to the restored new distributed database can be created, which provides routing functions for user services more conveniently, so that the newly restored distributed database can be easily managed and operated through the new middleware instance.
  • the user business can still manage and operate the original distributed database normally, making the database recovery process less intrusive to the user business.
  • the data backup set generated by the data processing method described in any one of the foregoing embodiments 1 to 3 is used for data restoration, which ensures the accuracy of database restoration and guarantees the global data consistency of the restored distributed database.
  • creating a new middleware instance helps to reduce the intrusion of the database restoration process to the user's business.
  • the data processing method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to: servers, mobile terminals (such as tablet computers, mobile phones, etc.), and PCs.
  • Referring to FIG. 8, a flowchart of the steps of a data processing method according to the seventh embodiment of the present invention is shown.
  • the data processing method of this embodiment includes the aforementioned steps S502 to S504.
  • the method may also include or not include the aforementioned steps S506 to S510.
  • if step S510 is included and new recovery database instances are created in step S504, the method further includes the following steps:
  • Step S512: Mount each newly created recovery database instance to the new middleware instance.
  • the data backup set generated by the data processing method described in any one of the foregoing embodiments 1 to 3 is used for data restoration, which ensures the accuracy of database restoration and guarantees the global data consistency of the restored distributed database.
  • the data processing method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to: servers, mobile terminals (such as tablet computers, mobile phones, etc.), and PCs.
  • the execution order of the aforementioned steps is not limited by the step numbers. Those skilled in the art can configure the order of execution as needed: the steps can all be executed sequentially, all in parallel, or partly sequentially and partly in parallel.
  • the distributed database includes middleware 200 and multiple database instances (i.e. 300_1, 300_2 to 300_N); the middleware 200 and the database instances 300_1 to 300_N communicate through the network.
  • the middleware 200 (such as DRDS proxy) in the distributed database is used as the execution subject to describe the data processing method provided in the embodiment of the present invention.
  • the middleware 200 is a service process added between the user business end 100 and the database instances 300_1 to 300_N. It mainly provides the user business end 100 with the routing capability of a distributed database: SQL (Structured Query Language) statements from the user business end 100 are routed to the required database instance according to the sharding algorithm of the distributed database, so that the user business end 100 can conveniently manage and operate multiple database instances.
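The routing role of the middleware can be sketched as follows. The text does not specify which sharding algorithm is used, so the hash-modulo rule below is purely an assumption for illustration, and the function name route and the instance labels are hypothetical.

```python
import zlib
from typing import List


def route(shard_key: str, instances: List[str]) -> str:
    """Pick the database instance for a shard key (assumed hash-modulo rule)."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    index = zlib.crc32(shard_key.encode("utf-8")) % len(instances)
    return instances[index]


INSTANCES = ["300_1", "300_2", "300_3"]
target = route("user:42", INSTANCES)
```

A statement whose shard key maps to a single instance touches only that instance; a statement spanning several keys may touch several instances, which is the cross-instance case that the backup procedure has to guard against.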
  • in this usage scenario, the database recovery process performed by the middleware is as follows:
  • Step A2: The user business end 100 selects a valid data backup set on the console of the middleware 200 (such as DRDS proxy) and triggers a restore request.
  • Step B2: The middleware 200 (DRDS proxy) creates a new middleware instance according to the restore request, and synchronizes the relevant metadata to the new middleware instance.
  • Step C2: The middleware 200 (DRDS proxy) triggers a full recovery operation, based on the data backup set, on all lower-level database instances 300_1 to 300_N (MySQL A and MySQL B as shown in FIG. 9), so that each database instance creates a new recovery database instance (MySQL C and MySQL D as shown in FIG. 9) and restores the backup results in the data backup set to the new recovery database instance.
  • Step D2: The middleware 200 (DRDS proxy) checks the recovery status of each database instance 300_1 to 300_N until the full recovery operation of all database instances is completed.
  • Step E2: The middleware 200 (DRDS proxy) mounts the new recovery database instances to the new middleware instance.
  • Step F2: According to the location information of the log file (such as the binlog file) of each database instance recorded in the data backup set, the middleware 200 (DRDS proxy) applies the binlog that each original database instance produced between the start of the database backup and the locking of distributed transactions to the corresponding recovery database instance, filling in the incremental data for that period. Because the binlog records changes to data rows, idempotence and the correctness of the recovered data are guaranteed even if part of the binlog is applied to a recovery database instance more than once.
  • Step G2: The middleware 200 (DRDS proxy) completes the database recovery.
  • the Point-In-Position recovery mechanism is implemented through the full recovery operation of the database instances and the incremental recovery operation based on the log file (binlog), ensuring global data consistency in the distributed database scenario.
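Steps B2 to G2 can be walked through with a small in-memory simulation. It is a sketch only: Instance, Middleware, BackupSet, and restore are hypothetical names, rows are reduced to a key-to-value mapping, and the incremental data is modeled as post-change row values rather than a real binlog.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instance:
    name: str
    rows: Dict[int, str] = field(default_factory=dict)


@dataclass
class Middleware:
    metadata: Dict[str, str]
    mounted: List[Instance] = field(default_factory=list)


@dataclass
class BackupSet:
    metadata: Dict[str, str]
    full: Dict[str, Dict[int, str]]        # instance name -> rows in the full backup
    increments: Dict[str, Dict[int, str]]  # instance name -> post-change row values


def restore(backup: BackupSet) -> Middleware:
    # Step B2: create a new middleware instance and copy the metadata to it.
    new_mw = Middleware(metadata=dict(backup.metadata))
    # Step C2: create one recovery instance per backed-up instance (full recovery).
    recovered = [Instance(f"{name}-restored", dict(rows))
                 for name, rows in backup.full.items()]
    # Step D2: in this sketch the full recovery completes synchronously.
    # Step E2: mount the recovery instances on the new middleware instance.
    new_mw.mounted.extend(recovered)
    # Step F2: apply the incremental data recorded for each original instance.
    for inst, name in zip(recovered, backup.full):
        inst.rows.update(backup.increments.get(name, {}))
    # Step G2: recovery is complete.
    return new_mw


backup = BackupSet(
    metadata={"account": "drds_user"},
    full={"mysql_a": {1: "a"}, "mysql_b": {2: "b"}},
    increments={"mysql_a": {1: "a2", 3: "c"}},
)
mw = restore(backup)
```

The original instances are never modified here, which mirrors the point that the user business can keep operating the original distributed database during recovery.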
  • Referring to FIG. 10, a structural block diagram of a data processing device according to the eighth embodiment of the present invention is shown.
  • the data processing device of this embodiment includes: a full backup module 1002, configured to receive a backup request for a distributed database and instruct, according to the backup request, each corresponding database instance to perform a full data backup; a locking module 1004, configured to lock, after determining that each database instance has completed the full data backup, the distributed transactions used for data updates across database instances; a first obtaining module 1006, configured to obtain the information of the log file that records the incremental data of each database instance within a set time period; and an unlocking module 1008, configured to unlock the distributed transactions and generate a data backup set of the distributed database according to the backup result of each database instance and the information of the log file.
  • each database instance corresponding to the distributed database is instructed to perform full data backup according to the backup request, and after the full data backup is completed, the distributed transaction is locked and the log file information is obtained.
  • Locking distributed transactions prevents data inconsistencies between database instances caused by distributed transactions, leading to inconsistent global data, and ensures that distributed databases have a consistent state of global data.
  • after the log file information is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and a data backup set is generated based on the backup results of each database instance and the information of the log file, achieving a database backup that guarantees global data consistency while minimizing the impact on the user's business.
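The ordering of the backup steps (full backup, lock, read log positions, unlock, assemble the backup set) can be sketched as below. All class and function names are hypothetical, the log positions are plain (file name, offset) pairs, and the gate is only a flag; a real implementation would have to block new cross-instance transactions in the middleware itself.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple


@dataclass
class Instance:
    """Hypothetical stand-in for one database instance."""
    rows: Dict[int, str]
    log_name: str    # current log file name
    log_offset: int  # current write offset in that file


class TxnGate:
    """Hypothetical switch that blocks new cross-instance transactions."""

    def __init__(self) -> None:
        self.locked = False
        self.history: List[str] = []

    @contextmanager
    def locked_window(self) -> Iterator[None]:
        self.locked = True
        self.history.append("lock")
        try:
            yield
        finally:
            # Always unlock, even if reading a log position fails.
            self.locked = False
            self.history.append("unlock")


def back_up(instances: Dict[str, Instance], gate: TxnGate) -> dict:
    # Full backup per instance; distributed transactions are still allowed here.
    full = {name: dict(inst.rows) for name, inst in instances.items()}
    # Lock, record only the small log positions, then unlock.
    with gate.locked_window():
        positions: Dict[str, Tuple[str, int]] = {
            name: (inst.log_name, inst.log_offset)
            for name, inst in instances.items()
        }
    # Combine both parts into the data backup set.
    return {"full": full, "log_positions": positions}


gate = TxnGate()
backup_set = back_up(
    {
        "mysql_a": Instance({1: "a"}, "binlog.000007", 154),
        "mysql_b": Instance({2: "b"}, "binlog.000003", 980),
    },
    gate,
)
```

Only the position lookup happens inside the locked window, which is why the lock can be held briefly.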
  • Referring to FIG. 11, a structural block diagram of a data processing device according to the ninth embodiment of the present invention is shown.
  • the data processing device of this embodiment includes: a full backup module 1102, configured to receive a backup request for a distributed database and instruct, according to the backup request, each corresponding database instance to perform a full data backup; a locking module 1104, configured to lock, after determining that each database instance has completed the full data backup, the distributed transactions used for data updates across database instances; a first obtaining module 1106, configured to obtain the information of the log file that records the incremental data of each database instance within a set time period; and an unlocking module 1108, configured to unlock the distributed transactions and generate a data backup set of the distributed database according to the backup result of each database instance and the information of the log file.
  • the locking module 1104 includes: a transaction determining module 11041, configured to determine, after determining that each database instance has completed the full data backup, whether all executing cross-database-instance distributed transactions have been committed;
  • an instruction generation module 11042, configured to generate, if the commits are completed, a blocking instruction indicating that distributed transactions are to be locked, so as to lock the distributed transactions used for data updates across database instances; wherein the blocking instruction uses seconds as the unit of blocking duration.
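The interplay of modules 11041 and 11042 can be sketched as a single decision function. BlockInstruction and make_block_instruction are hypothetical names, and the default of one second is an assumption taken only from the text's remark that the lock can be very short.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class BlockInstruction:
    """Hypothetical blocking instruction; the duration is expressed in seconds."""
    block_seconds: int


def make_block_instruction(in_flight: Set[str],
                           block_seconds: int = 1) -> Optional[BlockInstruction]:
    """Return a blocking instruction only once no distributed transaction is running."""
    if block_seconds <= 0:
        raise ValueError("block_seconds must be a positive number of seconds")
    if in_flight:
        # Some cross-instance transactions have not committed yet: do not lock.
        return None
    return BlockInstruction(block_seconds)


pending = make_block_instruction({"xid-1"})
ready = make_block_instruction(set())
```

A caller would retry while the result is None and issue the lock once an instruction is returned.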
  • the log file includes the data change location information and post-change information used to record the incremental data; the information of the log file includes: the database instance information corresponding to the log file, the name of the log file, and the offset of the log file.
  • the unlocking module 1108 includes: an unlocking instruction generating module 11081, configured to generate an unlocking instruction that allows distributed transactions to be committed, so as to instruct the unlocking of the distributed transactions; and a backup set generating module 11082, configured to generate the data backup set of the distributed database according to the metadata and content data backup result of each database instance backed up during the full data backup and the information of the log file.
  • the data processing device in this embodiment is used to implement the corresponding data processing methods in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, which will not be repeated here.
  • Referring to FIG. 12, a structural block diagram of a data processing device according to the tenth embodiment of the present invention is shown.
  • the data processing device of this embodiment includes: a backup set determining module 1202, configured to receive a restore request for a distributed database and determine the data backup set indicated by the restore request, the data backup set being a data backup set generated by the above data processing device; and a full recovery module 1204, configured to instruct, according to the data backup set, each corresponding database instance to perform a full recovery operation.
  • the data backup set generated by the data processing method described in any one of the foregoing embodiments 1 to 3 is used for data restoration, which ensures the accuracy of database restoration and guarantees the global data consistency of the restored distributed database.
  • Referring to FIG. 13, a structural block diagram of a data processing device according to the eleventh embodiment of the present invention is shown.
  • the data processing device of this embodiment includes: a backup set determining module 1302, configured to receive a restore request for a distributed database and determine the data backup set indicated by the restore request, the data backup set being a data backup set generated by the above data processing device; and a full recovery module 1304, configured to instruct, according to the data backup set, each corresponding database instance to perform a full recovery operation.
  • the device further includes: an information obtaining module 1306, configured to obtain, from the data backup set, the information of the log file that records the incremental data of each database instance within a set time period; and an incremental recovery module 1308, configured to perform an incremental recovery operation on the database instance after the full recovery operation, according to the information of the log file.
  • the device further includes: a middleware creation module 1310, configured to obtain the backed-up metadata of each database instance from the data backup set, create a new middleware instance based on the metadata, and restore the metadata to the new middleware instance.
  • the full recovery module 1304 is configured to obtain the content data backup result of each backed-up database instance from the data backup set, and instruct each corresponding database instance to restore the content data backup result to a newly created recovery database instance.
  • the device further includes: a mounting module 1312, configured to mount each newly created recovery database instance to the new middleware instance.
  • the incremental recovery module 1308 includes: an instance determination module 13081, configured to determine, according to the information of the log file, the database instance to be incrementally restored and the incremental data of that database instance within the set time period; and an incremental execution module 13082, configured to perform incremental recovery on the determined database instance according to the incremental data.
  • the data processing device in this embodiment is used to implement the corresponding data processing methods in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, and will not be repeated here.
  • Referring to FIG. 14, a schematic structural diagram of an electronic device according to the twelfth embodiment of the present invention is shown.
  • the specific embodiment of the present invention does not limit the specific implementation of the electronic device.
  • the electronic device may include: a processor 1402, a communication interface 1404, a memory 1406, and a communication bus 1408.
  • the processor 1402, the communication interface 1404, and the memory 1406 communicate with each other through the communication bus 1408.
  • the communication interface 1404 is used to communicate with other electronic devices such as terminal devices or servers.
  • the processor 1402 is configured to execute a program 1410, and specifically can execute relevant steps in the foregoing data processing method embodiment.
  • the program 1410 may include program code, and the program code includes computer operation instructions.
  • the processor 1402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • the one or more processors included in the electronic device may be processors of the same type, such as one or more CPUs; or processors of different types, such as one or more CPUs and one or more ASICs.
  • the memory 1406 is used to store the program 1410.
  • the memory 1406 may include a high-speed RAM memory, and may also include a non-volatile memory, for example, at least one disk memory.
  • the program 1410 can specifically be used to cause the processor 1402 to perform the following operations: receive a backup request for a distributed database, and instruct, according to the backup request, each corresponding database instance to perform a full data backup; after determining that each database instance has completed the full data backup, lock the distributed transactions used for data updates across database instances; obtain the information of the log file that records the incremental data of each database instance within a set time period; unlock the distributed transactions, and generate a data backup set of the distributed database according to the backup result of each database instance and the information of the log file.
  • the program 1410 is further configured to cause the processor 1402, when locking the distributed transactions used for data updates across database instances after determining that each database instance has completed the full data backup, to: determine, after determining that each database instance has completed the full data backup, whether all executing cross-database-instance distributed transactions have been committed; and, if the commits are completed, generate a blocking instruction indicating that distributed transactions are to be locked, so as to lock the distributed transactions used for data updates across database instances; wherein the blocking instruction uses seconds as the unit of blocking duration.
  • the log file includes the data change location information and post-change information used to record the incremental data; the information of the log file includes: the database instance information corresponding to the log file, the name of the log file, and the offset of the log file.
  • the program 1410 is further configured to cause the processor 1402, when unlocking the distributed transactions and generating the data backup set of the distributed database according to the backup result of each database instance and the information of the log file, to: generate an unlocking instruction that allows distributed transactions to be committed, so as to instruct the unlocking of the distributed transactions; and generate the data backup set of the distributed database according to the metadata and content data backup result of each database instance backed up during the full data backup and the information of the log file.
  • the program 1410 may specifically be used to cause the processor 1402 to perform the following operations: receive a restore request for a distributed database, and determine the data backup set indicated by the restore request, where the data backup set is a data backup set generated according to the foregoing data processing method; and instruct, according to the data backup set, each corresponding database instance to perform a full recovery operation.
  • the program 1410 is further configured to cause the processor 1402 to obtain, from the data backup set, the information of the log file that records the incremental data of each database instance within a set time period, and to perform an incremental recovery operation on the database instance after the full recovery operation according to the information of the log file.
  • the program 1410 is further configured to cause the processor 1402 to obtain the backed-up metadata of each database instance from the data backup set, create a new middleware instance based on the metadata, and restore the metadata to the new middleware instance.
  • the program 1410 is further configured to cause the processor 1402, when instructing each corresponding database instance to perform a full recovery operation according to the data backup set, to obtain the content data backup result of each backed-up database instance from the data backup set, and instruct each corresponding database instance to restore the content data backup result to a newly created recovery database instance.
  • the program 1410 is further configured to cause the processor 1402 to mount each newly created recovery database instance to the new middleware instance.
  • the program 1410 is further configured to cause the processor 1402, when performing an incremental recovery operation on the database instance after the full recovery operation according to the information of the log file, to: determine, according to the information of the log file, the database instance to be incrementally restored and the incremental data of that database instance within the set time period; and perform incremental recovery on the determined database instance according to the incremental data.
  • each database instance corresponding to the distributed database is instructed to perform a full data backup according to the backup request, and after the full data backup is completed, the distributed transaction is locked and the log file information is obtained.
  • Locking distributed transactions prevents data inconsistencies between database instances caused by distributed transactions, leading to inconsistent global data, and ensures that distributed databases have a consistent state of global data.
  • after the log file information is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and a data backup set is generated based on the backup results of each database instance and the information of the log file, achieving a database backup that guarantees global data consistency while minimizing the impact on the user's business.
  • the data backup set generated by the data processing method described in any one of the foregoing embodiments 1 to 3 is used for data restoration, which ensures the accuracy of database restoration and guarantees the global data consistency of the restored distributed database.
  • each component/step described in the embodiments of the present invention can be split into more components/steps, or two or more components/steps or partial operations of components/steps can be combined into new components/steps, to achieve the purpose of the embodiments of the present invention.
  • the above method according to the embodiments of the present invention can be implemented in hardware or firmware, or implemented as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or implemented as computer code that is downloaded over a network, originally stored in a remote recording medium or a non-transitory machine-readable medium, and to be stored in a local recording medium, so that the method described here can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (such as an ASIC or FPGA).
  • a computer, processor, microprocessor controller, or programmable hardware includes storage components (for example, RAM, ROM, flash memory, etc.) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the data processing method described here is implemented.
  • the execution of the code converts the general-purpose computer into a special-purpose computer for executing the data processing method shown here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

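A data processing method and device, an electronic device, and a computer storage medium. The data processing method comprises: receiving a backup request for a distributed database, and instructing, according to the backup request, each corresponding database instance to perform a full data backup (S102); after determining that each database instance has completed the full data backup, locking the distributed transactions used for data updates across database instances (S104); obtaining the information of the log file that records the incremental data of each database instance within a set time period (S106); and unlocking the distributed transactions, and generating a data backup set of the distributed database according to the backup result of each database instance and the information of the log file (S108). The data processing method minimizes intrusion into user business when backing up a database.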

Description

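Data processing method and device, electronic device, and computer storage medium
This application claims priority to Chinese patent application No. 201910682744.9, filed on July 26, 2019 and entitled "Data processing method and device, electronic device, and computer storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present invention relate to the field of computer technology, and in particular to a data processing method and device, an electronic device, and a computer storage medium.
Background
Database backup and recovery is an important safeguard for user data security. As the scale of user business grows, the volume of data stored in databases and the storage load grow exponentially. To overcome the performance limits of a stand-alone database, a database sharding technique (MySQL Sharding), which splits databases and tables, has been provided.
Database sharding is a technique that splits databases and/or data tables according to a sharding algorithm and stores them in a distributed manner. In usage scenarios involving a sharded distributed database, database backup and recovery operations are even more important.
At present, two main data backup and recovery schemes are used for distributed databases in the above usage scenarios:
1. Back up each database instance (such as a MySQL instance) separately, and later use the backed-up data for recovery. This approach can only guarantee the data consistency of a single database instance (i.e. a physical shard); it cannot guarantee global data consistency.
2. During backup, prohibit writes to the database globally, then back up each database instance (such as a MySQL instance) separately, and later use the backed-up data for recovery. Although this approach guarantees global data consistency, it is highly intrusive to user business: the database cannot provide a data write service for a period of time, which affects users.
Summary of the Invention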
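In view of this, embodiments of the present invention provide a data processing scheme to solve some or all of the above problems.
According to a first aspect of the embodiments of the present invention, a data processing method is provided, comprising: receiving a backup request for a distributed database, and instructing, according to the backup request, each corresponding database instance to perform a full data backup; after determining that each database instance has completed the full data backup, locking the distributed transactions used for data updates across database instances; obtaining the information of the log file that records the incremental data of each database instance within a set time period; and unlocking the distributed transactions, and generating a data backup set of the distributed database according to the backup result of each database instance and the information of the log file.
According to a second aspect of the embodiments of the present invention, a data processing method is provided, comprising: receiving a restore request for a distributed database, and determining the data backup set indicated by the restore request, the data backup set being a data backup set generated according to the data processing method of the first aspect; and instructing, according to the data backup set, each corresponding database instance to perform a full recovery operation.
According to a third aspect of the embodiments of the present invention, a data processing device is provided, comprising: a full backup module, configured to receive a backup request for a distributed database and instruct, according to the backup request, each corresponding database instance to perform a full data backup; a locking module, configured to lock, after determining that each database instance has completed the full data backup, the distributed transactions used for data updates across database instances; a first obtaining module, configured to obtain the information of the log file that records the incremental data of each database instance within a set time period; and an unlocking module, configured to unlock the distributed transactions and generate a data backup set of the distributed database according to the backup result of each database instance and the information of the log file.
According to a fourth aspect of the embodiments of the present invention, a data processing device is provided, comprising: a backup set determining module, configured to receive a restore request for a distributed database and determine the data backup set indicated by the restore request, the data backup set being a data backup set generated by the data processing device of the third aspect; and a full recovery module, configured to instruct, according to the data backup set, each corresponding database instance to perform a full recovery operation.
According to a fifth aspect of the embodiments of the present invention, an electronic device is provided, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus; the memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the data processing method of the first aspect or the second aspect.
According to a sixth aspect of the embodiments of the present invention, a computer storage medium is provided, on which a computer program is stored; when executed by a processor, the program implements the data processing method of the first aspect or the second aspect.
According to the data processing scheme provided by the embodiments of the present invention, each database instance corresponding to the distributed database is instructed to perform a full data backup according to the backup request, and after the full data backup is completed, distributed transactions are locked and the information of the log file is obtained. Locking distributed transactions prevents them from causing data inconsistency between database instances, and hence global data inconsistency, and ensures that the distributed database is in a globally consistent state. After the information of the log file is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and a data backup set is generated according to the backup result of each database instance and the information of the log file, achieving a database backup that guarantees global data consistency while minimizing the impact on user business.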
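Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in the embodiments of the present invention; those of ordinary skill in the art can obtain other drawings based on these drawings.
FIG. 1 is a flowchart of the steps of a data processing method according to the first embodiment of the present invention;
FIG. 2a is a sequence diagram of database backup performed by a distributed database according to usage scenario 1 of the present invention;
FIG. 2b is a usage scenario diagram of the data processing scheme according to the present invention;
FIG. 3 is a flowchart of the steps of a data processing method according to the second embodiment of the present invention;
FIG. 4 is a flowchart of the steps of a data processing method according to the third embodiment of the present invention;
FIG. 5 is a flowchart of the steps of a data processing method according to the fourth embodiment of the present invention;
FIG. 6 is a flowchart of the steps of a data processing method according to the fifth embodiment of the present invention;
FIG. 7 is a flowchart of the steps of a data processing method according to the sixth embodiment of the present invention;
FIG. 8 is a flowchart of the steps of a data processing method according to the seventh embodiment of the present invention;
FIG. 9 is a sequence diagram of database recovery performed by a distributed database according to usage scenario 2 of the present invention;
FIG. 10 is a structural block diagram of a data processing device according to the eighth embodiment of the present invention;
FIG. 11 is a structural block diagram of a data processing device according to the ninth embodiment of the present invention;
FIG. 12 is a structural block diagram of a data processing device according to the tenth embodiment of the present invention;
FIG. 13 is a structural block diagram of a data processing device according to the eleventh embodiment of the present invention;
FIG. 14 is a schematic structural diagram of an electronic device according to the twelfth embodiment of the present invention.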
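Detailed Description
To enable a person skilled in the art to better understand the technical solutions in the embodiments of the present invention, the technical solutions are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein shall fall within the protection scope of the embodiments of the present invention.
The specific implementation of the embodiments of the present invention is further described below with reference to the accompanying drawings.
Embodiment 1
Referring to FIG. 1, a flowchart of the steps of a data processing method according to Embodiment 1 of the present invention is shown.
The data processing method of this embodiment includes the following steps:
Step S102: Receive a backup request for a distributed database, and instruct, according to the backup request, each corresponding database instance to perform a full data backup.
In this embodiment, the distributed database may be a database that uses database and table sharding (for example, MySQL sharding). Such a distributed database can split a large data table into multiple table shards and distribute them across multiple database instances (for example, MySQL instances). Each MySQL instance then holds only part of the table's data, so that the storage and computation load is spread over multiple MySQL instances, which addresses the performance bottleneck of a single machine.
When a distributed database is backed up, global data consistency must be guaranteed, that is, the data of the database instances must be mutually consistent. However, data update operations that span database instances (such as data deletion, data modification, and data insertion) can easily cause global data inconsistency.
For example, a distributed database includes a data table and a corresponding index table. Both tables are split into two table shards and stored on two database instances (for example, MySQL instance A and MySQL instance B shown in FIG. 2a). When new data is inserted, it must be inserted into both the data table and the index table. If this insert has to be executed by both MySQL instance A and MySQL instance B, global data consistency is guaranteed only when both instances execute it successfully; if either one fails, the data table and the index table become inconsistent.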
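The failure mode in this example can be reproduced with a minimal Python sketch. The `Instance` class, its `fail` flag, and the two helper functions are hypothetical names introduced only for illustration; they model the two MySQL instances as in-memory dictionaries and are not part of the described method.

```python
class Instance:
    """In-memory stand-in for one database instance (hypothetical)."""

    def __init__(self, name, fail=False):
        self.name = name
        self.rows = {}
        self.fail = fail  # simulate a failed write on this instance

    def insert(self, key, value):
        if self.fail:
            raise RuntimeError(f"{self.name}: insert failed")
        self.rows[key] = value


def naive_cross_instance_insert(data_inst, index_inst, key, value):
    """Write the data row and its index entry with no coordination."""
    data_inst.insert(key, value)
    index_inst.insert(value, key)  # the index maps value -> key


def is_globally_consistent(data_inst, index_inst):
    """True when every data row has an index entry and vice versa."""
    forward = set(data_inst.rows.items())
    backward = {(k, v) for v, k in index_inst.rows.items()}
    return forward == backward


a = Instance("MySQL-A")             # holds the data-table shard
b = Instance("MySQL-B", fail=True)  # holds the index-table shard
try:
    naive_cross_instance_insert(a, b, 1, "alice")
except RuntimeError:
    pass
consistent = is_globally_consistent(a, b)
```

Here instance A keeps the row while instance B never receives the index entry, so `consistent` is `False`. A backup taken at that moment would capture exactly this mismatch.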
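To avoid this and ensure data consistency, the prior art usually prohibits writes globally, that is, user services are not allowed to update data in the database while the backup is in progress. This approach is highly intrusive to user services and affects their use of the distributed database.
To guarantee global data consistency while reducing the intrusion on user services, in this embodiment, when a backup request is received, each database instance of the distributed database is first instructed, according to the backup request, to perform a full data backup; for example, MySQL instance A and MySQL instance B each perform a full data backup. During the full data backup, transactions that involve only a single database instance and non-transactional SQL can execute normally, and the resulting data updates can be recovered from the log file of each database instance. Because each database instance performs a standalone backup, data consistency is guaranteed without the backup being significantly intrusive to user services, which can continue to run normally.
Step S104: After determining that each database instance has completed the full data backup, lock distributed transactions that perform data updates across database instances.
In this embodiment, a distributed transaction is a transaction that needs to update data across database instances, where, as described above, a data update may be a data modification, a data deletion, a data insertion, or the like. Locking cross-instance distributed transactions after determining that each database instance has completed the full data backup prevents new distributed transactions from executing, which avoids cross-instance data updates and guarantees the global data consistency of the database after the full data backups are completed.
For different types of databases, a person skilled in the art may lock distributed transactions in a manner appropriate to the database type, which is not limited in this embodiment.
Step S106: Obtain information of log files that record incremental data of each database instance within a set time period.
Each database instance has a corresponding log file, which records the incremental data of that instance within the set time period. For example, in this embodiment, the log file records the incremental data by recording the data change position information and the post-change data of the corresponding database instance within the set time period. The log file may of course also record the pre-change data at each data change position within the set time period, for use in verification, error correction, or other purposes. A person skilled in the art may determine the specific start time and end time of the set time period as required. For example, the start time is the time at which the corresponding database instance starts the full backup operation, and the end time is the time at which the distributed transactions are locked, which is not limited in the embodiments of the present invention.
Log files may differ between types of database instances. Taking a MySQL instance as an example, the log file may be a binlog file whose data format is the row format (that is, the binlog_format parameter is set to row). A binlog file in this format can record, for every SQL statement executed by the database instance that inserts, deletes, or modifies records, information such as the values of the affected data rows before and after the change.
In other types of database instances, the log file may of course be another type of log file, which is not limited in this embodiment.
Obtaining the information of the log files ensures that the log file of each database instance can be obtained when the database needs to be restored, so that data can be restored from the log files.
In this embodiment, locking the distributed transactions puts the distributed database in a globally consistent state, and obtaining the information of the log files while the distributed transactions are locked ensures that the data recorded by the log files is also globally consistent. Because the amount of log file information that needs to be obtained during the lock is very small, it can be obtained in a short time (for example, 1 second), so the distributed transactions are locked only briefly and user services are hardly affected. The database backup thus guarantees global data consistency with almost no intrusion on user services.
The information of a log file may be any appropriate information, depending on requirements. For example, it may be understood as position information of the log file, which may include information about the database instance corresponding to the log file (such as a database instance ID), the name of the log file, and the offset in the log file. The log file itself contains the data change position information and the post-change data of the incremental data.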
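One possible in-memory representation of this position information is sketched below in Python. The class name `LogPosition`, its field names, and the `sort_key` helper are assumptions made for illustration. The ordering helper further assumes MySQL-style binlog names that end in a numeric sequence suffix, such as "mysql-bin.000012".

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LogPosition:
    instance_id: str  # database instance information, e.g. an instance ID
    log_file: str     # name of the log file, e.g. a binlog file name
    offset: int       # offset within that log file

    def sort_key(self):
        # Assumption: the file name ends in ".<sequence number>",
        # so (sequence number, offset) orders positions of one instance.
        sequence = int(self.log_file.rsplit(".", 1)[1])
        return (sequence, self.offset)


pos = LogPosition("rds-a", "mysql-bin.000012", 4096)
earlier = LogPosition("rds-a", "mysql-bin.000011", 99999)
record = asdict(pos)  # plain dict, convenient for storing in a backup set
```

Because the record holds only three small fields per instance, capturing it while distributed transactions are locked is cheap, which is consistent with the short lock duration described above.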
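Step S108: Unlock the distributed transactions, and generate a data backup set of the distributed database according to the backup result of each database instance and the information of the log files.
After the information of the log files is obtained, the distributed transactions are unlocked as soon as possible to reduce the impact on database operation and user services, so that distributed transactions can again be committed normally.
In addition, the data backup set of the distributed database is generated according to the backup result of each database instance and the information of the log files. The data backup set can be used for subsequent database restoration.
In this embodiment, each database instance corresponding to the distributed database is instructed, according to the backup request, to perform a full data backup, and after the full data backups are completed, distributed transactions are locked and the information of the log files is obtained. Locking the distributed transactions prevents them from causing data inconsistency between database instances, and hence global data inconsistency, so that the distributed database is in a globally consistent state. After the information of the log files is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and the data backup set is generated according to the backup result of each database instance and the information of the log files. A database backup that guarantees global data consistency is thereby achieved while minimizing the impact on user services.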
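The order of steps S102 to S108 can be sketched in Python as follows. `FakeInstance`, `TxnLock`, and `backup_distributed_db` are hypothetical stand-ins, not interfaces of any real database or middleware; the sketch only illustrates that the lock is held solely while log positions are read.

```python
class FakeInstance:
    """Stand-in for one database instance (hypothetical interface)."""

    def __init__(self, instance_id, log_file, offset):
        self.instance_id = instance_id
        self._log_file = log_file
        self._offset = offset

    def full_backup(self):
        return f"backup-of-{self.instance_id}"

    def current_log_position(self):
        return {"instance_id": self.instance_id,
                "log_file": self._log_file,
                "offset": self._offset}


class TxnLock:
    """Records lock state; a real system would block distributed commits."""

    def __init__(self):
        self.locked = False
        self.history = []

    def lock(self):
        self.locked = True
        self.history.append("lock")

    def unlock(self):
        self.locked = False
        self.history.append("unlock")


def backup_distributed_db(instances, txn_lock):
    # S102: each instance performs its own full backup; no lock is held.
    results = {i.instance_id: i.full_backup() for i in instances}
    # S104: lock distributed transactions once all full backups are done.
    txn_lock.lock()
    try:
        # S106: read each instance's log position while the lock is held.
        positions = [i.current_log_position() for i in instances]
    finally:
        # S108: unlock as early as possible, then assemble the backup set.
        txn_lock.unlock()
    return {"backup_results": results, "log_positions": positions}


lock = TxnLock()
backup_set = backup_distributed_db(
    [FakeInstance("A", "mysql-bin.000003", 120),
     FakeInstance("B", "mysql-bin.000007", 964)],
    lock,
)
```

The `try`/`finally` is a design choice of this sketch: it releases the lock even if reading a position fails, so a backup error does not leave distributed commits blocked.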
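The data processing method of this embodiment may be performed by any appropriate electronic device with data processing capability, including but not limited to a server, a mobile terminal (such as a tablet computer or a mobile phone), and a PC.
Embodiment 2
Referring to FIG. 3, a flowchart of the steps of a data processing method according to Embodiment 2 of the present invention is shown.
The data processing method of this embodiment includes the foregoing steps S102 to S108.
Step S104 includes the following sub-steps:
Sub-step S1041: After determining that each database instance has completed the full data backup, determine whether all in-progress distributed transactions across database instances have completed their commits.
Because a distributed transaction that is still committing may make global data consistency impossible to guarantee, after it is determined that each database instance has completed the full data backup and before the distributed transactions are locked, it is determined whether all in-progress distributed transactions have completed their commits.
It should be noted that, in this embodiment, a transaction counts as having completed its commit whether it was actually committed or rolled back.
If it is determined that all in-progress distributed transactions have completed their commits, sub-step S1042 is performed; otherwise, no action may be taken, or another appropriate action may be performed.
Sub-step S1042: If the commits are complete, generate a blocking instruction that instructs locking of distributed transactions, so as to lock distributed transactions that perform data updates across database instances.
If all in-progress distributed transactions have completed their commits, the data of the distributed database is globally consistent at that moment. Generating the blocking instruction at this point prevents the distributed database from executing new distributed transactions that would break global data consistency, and also ensures that the log file information obtained while the distributed transactions are locked reflects globally consistent data.
To reduce the intrusion on user services, the blocking instruction uses seconds as the unit of blocking duration, for example 1 second, 2 seconds, 5 seconds, or 10 seconds. The blocking time is therefore short, which keeps the intrusion on user services as low as possible.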
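Sub-steps S1041 and S1042 can be sketched as a small commit gate built on Python's `threading.Condition`. The class `DistributedCommitGate` and its method names are hypothetical; this is one possible way to drain in-flight commits and then block new ones, with all timeouts expressed in seconds, and not necessarily the mechanism used by any particular middleware.

```python
import threading


class DistributedCommitGate:
    """Hypothetical gate that distributed commits must pass through."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._blocked = False

    def begin_commit(self, timeout=None):
        """Enter the commit phase; waits (up to timeout seconds) while blocked."""
        with self._cond:
            if not self._cond.wait_for(lambda: not self._blocked, timeout):
                return False
            self._in_flight += 1
            return True

    def end_commit(self):
        """Leave the commit phase after a commit or a rollback."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def block(self, drain_timeout):
        """S1041: wait until no commit is in flight; S1042: then block new ones."""
        with self._cond:
            drained = self._cond.wait_for(
                lambda: self._in_flight == 0, drain_timeout)
            if drained:
                # Check and set happen under one lock, so no commit can
                # slip in between the drain check and the block.
                self._blocked = True
            return drained

    def unblock(self):
        with self._cond:
            self._blocked = False
            self._cond.notify_all()


gate = DistributedCommitGate()
started = gate.begin_commit()                      # one commit in flight
blocked_too_early = gate.block(drain_timeout=0.01)
gate.end_commit()                                  # commit (or rollback) done
blocked = gate.block(drain_timeout=0.01)
rejected = gate.begin_commit(timeout=0.01)         # held back while blocked
gate.unblock()
resumed = gate.begin_commit(timeout=0.01)
```

In this sketch a blocked commit simply waits and returns `False` on timeout; a caller could retry instead of raising an error, which matches the goal of not surfacing errors to user services during the short lock.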
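In this embodiment, each database instance corresponding to the distributed database is instructed, according to the backup request, to perform a full data backup, and after the full data backups are completed, distributed transactions are locked and the information of the log files is obtained. Locking the distributed transactions prevents them from causing data inconsistency between database instances, and hence global data inconsistency, so that the distributed database is in a globally consistent state. After the information of the log files is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and the data backup set is generated according to the backup result of each database instance and the information of the log files. A database backup that guarantees global data consistency is thereby achieved while minimizing the impact on user services.
In addition, because the blocking instruction uses seconds as the unit of blocking duration, the intrusion on user services is kept as low as possible.
The data processing method of this embodiment may be performed by any appropriate electronic device with data processing capability, including but not limited to a server, a mobile terminal (such as a tablet computer or a mobile phone), and a PC.
Embodiment 3
Referring to FIG. 4, a flowchart of the steps of a data processing method according to Embodiment 3 of the present invention is shown.
The data processing method of this embodiment includes the foregoing steps S102 to S108.
Step S104 may be implemented as in Embodiment 1 or Embodiment 2, or in another manner.
In this embodiment, step S108 includes the following sub-steps:
Sub-step S1081: Generate an unlock instruction that allows distributed transactions to commit, so as to instruct release of the lock on distributed transactions.
To minimize the intrusion on user services, the unlock instruction that allows distributed transactions to commit is generated as soon as the information of the log files is obtained, so that user services can commit distributed transactions normally.
A person skilled in the art may generate the unlock instruction in any appropriate manner as required, which is not limited in this embodiment.
Sub-step S1082: Generate the data backup set of the distributed database according to the metadata of each database instance backed up when the instances performed the full data backup, the content data backup result of each database instance, and the information of the log files.
To facilitate management of the backup data and subsequent database restoration, the data backup set is generated from the metadata of each database instance, the content data backup results, and the information of the log files.
The metadata includes but is not limited to the configuration information of each database instance and the account information used to access the database instances.
A content data backup result includes the data in the data tables stored in the corresponding database instance.
A log file contains data change position information and post-change data, which together record the incremental data. The information of a log file may be position information, which includes information about the database instance corresponding to the log file, the name of the log file, the offset in the log file, and the like. With this information, during subsequent data restoration the log file can be obtained by the log file name contained in the information. The incremental data is then determined from the offset together with the data change position information and the post-change data in the log file, and the database instance corresponding to the log file is incrementally restored according to the incremental data.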
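The composition of a data backup set can be sketched as a JSON-serializable manifest. The function `build_backup_set`, the key names, and the sample values (account name, backup file names, offsets) are all hypothetical and serve only to show the three parts being combined and checked for completeness.

```python
import json


def build_backup_set(metadata, content_backups, log_positions):
    """Combine metadata, content backup results and log positions."""
    covered = {p["instance_id"] for p in log_positions}
    missing = set(content_backups) - covered
    if missing:
        # Without a log position an instance could not be restored
        # incrementally, so the set is rejected as incomplete.
        raise ValueError(f"no log position for instances: {sorted(missing)}")
    return {
        "metadata": metadata,
        "content_backups": content_backups,
        "log_positions": log_positions,
    }


backup_set = build_backup_set(
    metadata={"accounts": {"A": "drds_user", "B": "drds_user"},
              "sharding": {"orders": ["A", "B"]}},
    content_backups={"A": "a.full", "B": "b.full"},
    log_positions=[
        {"instance_id": "A", "log_file": "mysql-bin.000003", "offset": 120},
        {"instance_id": "B", "log_file": "mysql-bin.000007", "offset": 964},
    ],
)
serialized = json.dumps(backup_set, sort_keys=True)
```

Storing the manifest as plain JSON is an assumption of this sketch; any format that keeps the three parts together under one identifier would serve the same purpose.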
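In this embodiment, each database instance corresponding to the distributed database is instructed, according to the backup request, to perform a full data backup, and after the full data backups are completed, distributed transactions are locked and the information of the log files is obtained. Locking the distributed transactions prevents them from causing data inconsistency between database instances, and hence global data inconsistency, so that the distributed database is in a globally consistent state. After the information of the log files is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and the data backup set is generated according to the backup result of each database instance and the information of the log files. A database backup that guarantees global data consistency is thereby achieved while minimizing the impact on user services.
In addition, generating the unlock instruction after the information of the log files is obtained allows user services to commit distributed transactions normally, which minimizes the intrusion on user services. Generating the data backup set from the metadata, the backup results, and the information of the log files makes the backup data of the database easier to manage and facilitates subsequent database restoration.
The data processing method of this embodiment may be performed by any appropriate electronic device with data processing capability, including but not limited to a server, a mobile terminal (such as a tablet computer or a mobile phone), and a PC.
It should be noted that the execution order of the foregoing steps is not limited by the step numbers. A person skilled in the art may configure the execution order as required: the steps may all be performed sequentially, all in parallel, or partly sequentially and partly in parallel.
Use Scenario 1:
Referring to FIG. 2a, a sequence diagram of a database backup of a distributed database is shown. Referring to FIG. 2b, a use scenario diagram of a data processing solution is shown.
In this use scenario, the data processing solution is applied to a distributed database that includes middleware 200 and multiple database instances (that is, database instances 300_1, 300_2 through 300_N), where the middleware 200 and the database instances 300_1 to 300_N communicate over a network. The data processing method provided in the embodiments of the present invention is described with the middleware 200 of the distributed database (for example, a DRDS proxy) as the executing entity. The middleware 200 is a service process inserted between the user service end 100 and the database instances 300_1 to 300_N. Its main role is to provide the user service end 100 with the routing capability of the distributed database: SQL (Structured Query Language) statements from the user service end 100 are routed to the required database instances according to the database and table sharding algorithm of the distributed database, so that the user service end 100 can conveniently manage and operate multiple database instances.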
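The routing role of the middleware can be illustrated with a short Python sketch. The source does not specify the sharding algorithm, so the hash-modulo rule below, the function names `route` and `instances_touched`, and the instance labels are assumptions chosen for illustration only.

```python
import zlib


def route(shard_key, instances):
    """Pick the instance that owns shard_key using a stable hash."""
    # crc32 gives the same value in every process, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    index = zlib.crc32(str(shard_key).encode("utf-8")) % len(instances)
    return instances[index]


def instances_touched(shard_keys, instances):
    """Set of instances that a statement touching these keys is routed to."""
    return {route(k, instances) for k in shard_keys}


instances = ["300_1", "300_2", "300_3"]
owner = route("order-42", instances)
same_key_twice = instances_touched(["order-42", "order-42"], instances)
```

A statement whose keys all map to one instance stays a single-instance operation, while keys that map to several instances make it a cross-instance operation of the kind the backup procedure has to lock.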
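Specifically, in this use scenario, the database backup performed through the middleware 200 proceeds as follows:
Step A1: The user service end 100 triggers the middleware to perform a database backup operation through the middleware console (for example, the DRDS console).
Step B1: The middleware 200 (DRDS proxy) backs up the metadata of the database instances 300_1 to 300_N. The metadata may include the account information that the middleware 200 uses to access the database instances 300_1 to 300_N (RDS), the database and table sharding configuration of the database instances 300_1 to 300_N, and the like.
Step C1: The middleware 200 (DRDS proxy) triggers a full backup operation on all underlying database instances 300_1 to 300_N (for example, MySQL instances).
Step D1: The middleware 200 (DRDS proxy) checks the backup status of the underlying database instances 300_1 to 300_N (for example, MySQL instances) until the full backups of all database instances 300_1 to 300_N are complete.
Step E1: The middleware 200 (DRDS proxy) locks distributed transactions, blocking the commit of all currently uncommitted distributed transactions. It should be noted that, before locking, the middleware waits for all distributed transactions that are in the process of committing to finish, and only then applies the lock, so that global data consistency is not broken.
Step F1: The middleware 200 (DRDS proxy) records the current position information of the log file (for example, the binlog file) of each database instance 300_1 to 300_N (for example, MySQL instances). The position information includes database instance information (for example, serverId), the name of the log file, and the offset in the log file (that is, the binlog offset). The log file contains the data change position information and the post-change data that record the incremental data.
A binlog file is a binary-format file that records update operations on the data in a database, for example the data positions affected by the SQL of each data update operation and the data before and after the change. Using binlog files makes it possible to reliably record the data changes produced during the database backup, which in turn ensures the reliability of subsequent database restoration from the binlog files.
Step G1: The middleware 200 (DRDS proxy) unlocks distributed transactions, allowing all distributed transactions to commit.
Step H1: The database backup is complete, and the corresponding data backup set is generated from the backup results of the full backup operations, the information of the log files, and the metadata.
In the foregoing backup process, the database is backed up by means of a distributed transaction locking mechanism (a LOCK mechanism) and by recording binlog position information, which minimizes the impact on user services.
The distributed transaction lock lasts on the order of seconds, and only commit operations of cross-database transactions are blocked; other SQL executes unaffected. The impact on user services during the backup is therefore minimal, no errors are raised because distributed transactions are locked, and non-transactional SQL and single-instance transactional SQL execute normally.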
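The scope of the lock described here, where only cross-instance commits are held back, can be expressed as a small predicate. `is_distributed` and `may_commit` are hypothetical helper names; the sketch assumes the middleware knows which instances each transaction has touched.

```python
def is_distributed(txn_instances):
    """A transaction is distributed when it touched more than one instance."""
    return len(set(txn_instances)) > 1


def may_commit(txn_instances, distributed_locked):
    """Only cross-instance commits are held back while the lock is set."""
    return not (distributed_locked and is_distributed(txn_instances))


single_during_lock = may_commit(["A"], distributed_locked=True)
cross_during_lock = may_commit(["A", "B"], distributed_locked=True)
cross_after_unlock = may_commit(["A", "B"], distributed_locked=False)
```

Single-instance transactions pass regardless of the lock state, which is why their effects have to be captured through each instance's own log file rather than through the lock.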
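Embodiment 4
Referring to FIG. 5, a flowchart of the steps of a data processing method according to Embodiment 4 of the present invention is shown.
The data processing method of this embodiment includes the following steps:
Step S502: Receive a restore request for a distributed database, and determine the data backup set indicated by the restore request.
In this embodiment, the data processing method provided in the embodiments of the present invention is described with the middleware of the distributed database (for example, a DRDS proxy) as the executing entity. As described in Use Scenario 1, the middleware makes it convenient for user services to manage and operate the database instances in the distributed database.
The restore request instructs restoration of the distributed database from a particular data backup set. The restore request may identify the data backup set to use by its name, identifier, storage address, or the like.
In this embodiment, the data backup set is a data backup set generated by the data processing method of any one of Embodiments 1 to 3. The data backup set includes at least the content data backup results of the full backup operations performed by the database instances.
Step S504: Instruct, according to the data backup set, each corresponding database instance to perform a full restore operation.
The middleware may determine the database instances involved in the restore request from the data backup set indicated by the request, and then instruct each of those database instances to perform a full restore operation.
In a first feasible manner, each database instance fully restores the backup result into the original database instance.
In a second feasible manner, each database instance fully restores the backup result into a new restoration database instance. In this manner, step S504 may be implemented as: obtaining the content data backup results of the backed-up database instances from the data backup set, and instructing each corresponding database instance to restore its content data backup result into a newly created restoration database instance.
Because each database instance creates a new database instance as its restoration database instance and restores the backup result into it, the problem in the first feasible manner, namely that restoration affects user services' use of the original database instances, is avoided, and the intrusion on user services during database restoration is reduced.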
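The second feasible manner can be sketched as a planning step that pairs every backed-up instance with a new restoration instance. The function `plan_full_restore`, the naming callback, and the backup file names are hypothetical; the sketch only shows that the original instances are read from and never written to.

```python
def plan_full_restore(backup_set, new_instance_name):
    """Map every backed-up instance to a fresh restoration instance."""
    plan = []
    for instance_id, backup in sorted(backup_set["content_backups"].items()):
        plan.append({
            "source_instance": instance_id,
            "restore_instance": new_instance_name(instance_id),
            "content_backup": backup,
        })
    return plan


backup_set = {"content_backups": {"B": "b.full", "A": "a.full"}}
plan = plan_full_restore(backup_set, lambda i: f"{i}-restore")
```

Each plan entry names a distinct target, so user services can keep using instances A and B while the content backups are loaded into the newly created ones.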
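For a distributed database whose database instances performed no data update operations during the database backup, the database restoration is complete once the full restore operation is done.
In this embodiment, database restoration uses a data backup set generated by the data processing method of any one of Embodiments 1 to 3, which ensures the accuracy of the restoration and guarantees the global data consistency of the restored distributed database.
The data processing method of this embodiment may be performed by any appropriate electronic device with data processing capability, including but not limited to a server, a mobile terminal (such as a tablet computer or a mobile phone), and a PC.
Embodiment 5
Referring to FIG. 6, a flowchart of the steps of a data processing method according to Embodiment 5 of the present invention is shown.
The data processing method of this embodiment includes the foregoing steps S502 to S504.
The method further includes the following steps:
Step S506: Obtain, from the data backup set, information of log files that record incremental data of each database instance within a set time period.
It should be noted that this step is optional. Steps S506 and S508 are performed during database restoration when the data backup set contains the information of the log files and database instances performed data update operations during the database backup.
Because some or all database instances performed data update operations during the database backup, the data of some update operations may not have been captured in the backup results, which could break global data consistency. To avoid this, after the full restore operation is performed on each database instance, the information of the log files that record the incremental data within the set time period is obtained from the data backup set, so that the data updated during the database backup can be restored according to that information, which ensures global data consistency.
In this embodiment, a log file contains the data change position information and the post-change data that record the incremental data. The information of a log file includes information about the database instance corresponding to the log file, the name of the log file, and the offset in the log file. In other embodiments, a person skilled in the art may of course configure the information of a log file to contain any appropriate content as required.
For example, the log file may be a binlog file whose data format parameter (that is, binlog_format) is row, meaning that it records the data rows affected by the SQL of each data update operation together with the values of those rows before and after the change.
Step S508: Perform, according to the information of the log files, an incremental restore operation on the database instances on which the full restore operation has been performed.
In a specific implementation, step S508 includes the following sub-steps:
Sub-step S5081: Determine, according to the information of the log files, the database instances to be incrementally restored and the incremental data of the determined database instances within the set time period.
For example, in this embodiment, a log file records the incremental data by recording the data change position information and the post-change data of the corresponding database instance within the set time period. In other embodiments, the incremental data may of course be recorded in other manners.
In this case, the database instances to be incrementally restored and the incremental data may be determined as follows. The database instance to be incrementally restored is determined from the database instance name in the information of the log file. The data change position information and the post-change data of that database instance within the set time period are determined from the name of the log file and the offset in the log file contained in the information. The location of the incremental data in the database instance is then determined from the data change position information, and the content of the incremental data is determined from the post-change data.
Sub-step S5082: Perform incremental restoration on the determined database instances according to the incremental data.
For example, the data at the determined location of the incremental data is updated to the post-change data. In this way, the data produced by data update operations during the database backup can be restored, which ensures global data consistency.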
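Sub-steps S5081 and S5082 can be sketched in Python as selecting the log events inside the set time period and writing their post-change values. The event layout (`offset`, `key`, `after`), the function name `incremental_restore`, and the half-open window from `start_offset` up to but excluding `end_offset` are assumptions of this sketch; the source does not fix how the window bounds are encoded.

```python
def incremental_restore(table, log_events, start_offset, end_offset):
    """Apply post-change values of events with start_offset <= offset < end_offset."""
    for event in sorted(log_events, key=lambda e: e["offset"]):
        if not (start_offset <= event["offset"] < end_offset):
            continue  # outside the set time period
        if event["after"] is None:
            # A delete has no post-change value: remove the row.
            table.pop(event["key"], None)
        else:
            # Insert or update: write the post-change value at that location.
            table[event["key"]] = event["after"]
    return table


restored = {1: "a", 2: "b"}  # state after the full restore
events = [
    {"offset": 100, "key": 1, "after": "a2"},    # update row 1
    {"offset": 200, "key": 3, "after": "c"},     # insert row 3
    {"offset": 300, "key": 2, "after": None},    # delete row 2
    {"offset": 400, "key": 1, "after": "late"},  # after the lock position
]
incremental_restore(restored, events, start_offset=100, end_offset=400)
```

The last event lies at the recorded end position and is therefore skipped, so changes made after the distributed transactions were locked do not leak into the restored state.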
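In this embodiment, database restoration uses a data backup set generated by the data processing method of any one of Embodiments 1 to 3, which ensures the accuracy of the restoration and guarantees the global data consistency of the restored distributed database.
In addition, for database instances that performed data update operations during the database backup, performing an incremental restore operation, according to the information of the log files, on the database instances after the full restore operation fully ensures the global consistency of the database data.
The data processing method of this embodiment may be performed by any appropriate electronic device with data processing capability, including but not limited to a server, a mobile terminal (such as a tablet computer or a mobile phone), and a PC.
Embodiment 6
Referring to FIG. 7, a flowchart of the steps of a data processing method according to Embodiment 6 of the present invention is shown.
The data processing method of this embodiment includes the foregoing steps S502 to S504.
The method may or may not further include the foregoing steps S506 to S508. In this embodiment, the method further includes the following step:
Step S510: Obtain the backed-up metadata of each database instance from the data backup set, create a new middleware instance according to the metadata, and restore the metadata to the new middleware instance.
In a distributed database, the middleware (for example, a DRDS proxy) needs to provide user services with the routing function of the distributed database, so the metadata of each database instance of the distributed database is stored on the middleware. The metadata includes but is not limited to the account information that the middleware uses to access the database instances (RDS) and the database and table sharding configuration of the database instances.
In this embodiment, the data backup set further includes the metadata of each database instance. During database restoration, the middleware obtains the backed-up metadata from the data backup set and restores it to the newly created middleware instance. A person skilled in the art may restore the metadata to the new middleware instance in any appropriate manner, for example by copying, which is not limited in this embodiment.
Creating a new middleware instance for the newly restored distributed database makes it easier to provide user services with the routing function, so that they can conveniently manage and operate the newly restored distributed database through the new middleware instance. Moreover, during restoration, user services can still manage and operate the original distributed database normally, so the restoration is only slightly intrusive to user services.
In this embodiment, database restoration uses a data backup set generated by the data processing method of any one of Embodiments 1 to 3, which ensures the accuracy of the restoration and guarantees the global data consistency of the restored distributed database.
In addition, creating a new middleware instance when the metadata is restored helps reduce the intrusion of the restoration on user services.
The data processing method of this embodiment may be performed by any appropriate electronic device with data processing capability, including but not limited to a server, a mobile terminal (such as a tablet computer or a mobile phone), and a PC.
Embodiment 7
Referring to FIG. 8, a flowchart of the steps of a data processing method according to Embodiment 7 of the present invention is shown.
The data processing method of this embodiment includes the foregoing steps S502 to S504.
The method may or may not further include the foregoing steps S506 to S510. When step S510 is included and step S504 has created new restoration database instances, the method further includes the following step:
Step S512: Mount each newly created restoration database instance to the new middleware instance.
To make it convenient for user services to manage the database instances of the restored distributed database through the middleware, when restoration database instances and a new middleware instance have been created, the restoration database instances are mounted to the new middleware instance so that the two are associated.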
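Mounting can be pictured as rewriting the restored routing metadata so that every shard points at a restoration instance instead of the original one. The function `mount_restore_instances`, the metadata layout, and the A/B to C/D mapping (named after the MySQL instances of FIG. 9) are assumptions made for this Python sketch.

```python
def mount_restore_instances(metadata, instance_mapping):
    """Return metadata for the new middleware instance whose shards
    point at the restoration instances."""
    new_sharding = {
        table: [instance_mapping[i] for i in shard_instances]
        for table, shard_instances in metadata["sharding"].items()
    }
    # Build a new dict so the original middleware's metadata is untouched.
    return {**metadata, "sharding": new_sharding}


original = {"accounts": {"user": "drds_user"},
            "sharding": {"orders": ["A", "B"]}}
mounted = mount_restore_instances(original, {"A": "C", "B": "D"})
```

Because the original metadata is left unchanged, the existing middleware keeps routing to instances A and B while the new middleware instance routes to C and D.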
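In this embodiment, database restoration uses a data backup set generated by the data processing method of any one of Embodiments 1 to 3, which ensures the accuracy of the restoration and guarantees the global data consistency of the restored distributed database.
The data processing method of this embodiment may be performed by any appropriate electronic device with data processing capability, including but not limited to a server, a mobile terminal (such as a tablet computer or a mobile phone), and a PC.
It should be noted that the execution order of the foregoing steps is not limited by the step numbers. A person skilled in the art may configure the execution order as required: the steps may all be performed sequentially, all in parallel, or partly sequentially and partly in parallel.
Use Scenario 2:
Referring to FIG. 9, a sequence diagram of a database restoration of a distributed database is shown.
In this use scenario, the data processing solution is applied to a distributed database that includes middleware 200 and multiple database instances (that is, database instances 300_1, 300_2 through 300_N), where the middleware 200 and the database instances 300_1 to 300_N communicate over a network. The data processing method provided in the embodiments of the present invention is described with the middleware 200 of the distributed database (for example, a DRDS proxy) as the executing entity. The middleware 200 is a service process inserted between the user service end 100 and the database instances 300_1 to 300_N. Its main role is to provide the user service end 100 with the routing capability of the distributed database: SQL (Structured Query Language) statements from the user service end 100 are routed to the required database instances according to the database and table sharding algorithm of the distributed database, so that the user service end 100 can conveniently manage and operate multiple database instances.
Specifically, in this use scenario, the database restoration performed through the middleware proceeds as follows:
Step A2: On the console of the middleware 200 (for example, a DRDS proxy), the user service end 100 selects a valid data backup set and triggers a restore request.
Step B2: The middleware 200 (DRDS proxy) creates a new middleware instance according to the restore request and synchronizes the relevant metadata to the new middleware instance.
Step C2: The middleware 200 (DRDS proxy) triggers a full restore operation, based on the data backup set, on all underlying database instances 300_1 to 300_N (MySQL A and MySQL B shown in FIG. 9), so that each database instance creates a new restoration database instance (MySQL C and MySQL D shown in FIG. 9) and restores the backup result in the data backup set to the new restoration database instance.
Step D2: The middleware 200 (DRDS proxy) checks the restore status of the database instances 300_1 to 300_N until the full restore operations of all database instances are complete.
Step E2: The middleware 200 (DRDS proxy) mounts the new restoration database instances under the new middleware instance.
Step F2: According to the position information of the log file (for example, the binlog file) of each database instance recorded in the data backup set, the middleware 200 (DRDS proxy) applies the binlog of the original database instance, covering the period from the start of the database backup to the locking of distributed transactions, to the corresponding restoration database instance, so as to fill in the incremental data of that period. Because the binlog records changes to data rows, idempotence and the correctness of the restored data are preserved even if part of the binlog is applied to the restoration database instance more than once.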
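The idempotence mentioned in step F2 can be checked with a small Python sketch. The event layout and the function `apply_row_events` are hypothetical; the sketch models row-format log entries as "set this row to this post-change value" and compares replaying onto a clean pre-backup state with replaying onto a snapshot that already contains some of the changes.

```python
def apply_row_events(table, events):
    """Replay row changes in log order by writing post-change values."""
    for event in events:
        if event["after"] is None:
            table.pop(event["key"], None)  # delete
        else:
            table[event["key"]] = event["after"]  # insert or update
    return table


pre_backup = {1: "a"}
events = [
    {"key": 1, "after": "a2"},
    {"key": 2, "after": "b"},
    {"key": 1, "after": "a3"},
]

# Replay onto the state from before the backup started.
from_clean = apply_row_events(dict(pre_backup), events)

# A full backup taken mid-window already contains the first two changes;
# replaying the whole window re-applies them and still converges.
snapshot = apply_row_events(dict(pre_backup), events[:2])
replayed = apply_row_events(snapshot, events)
```

Both paths end in the same state because each entry overwrites the row with an absolute value. A log that recorded relative statements, such as incrementing a counter, would not have this property, which is one reason the row format matters here.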
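Step G2: The middleware 200 (DRDS proxy) completes the database restoration.
In the foregoing restoration process, a Point-In-Position recovery mechanism is implemented based on the binlog position information in the data backup set. The full restore operation of the database instances combined with the incremental restore operation from the log files (binlog) guarantees global data consistency in the distributed database scenario.
Embodiment 8
Referring to FIG. 10, a structural block diagram of a data processing apparatus according to Embodiment 8 of the present invention is shown.
The data processing apparatus of this embodiment includes: a full backup module 1002, configured to receive a backup request for a distributed database and instruct, according to the backup request, each corresponding database instance to perform a full data backup; a locking module 1004, configured to lock, after determining that each database instance has completed the full data backup, distributed transactions that perform data updates across database instances; a first obtaining module 1006, configured to obtain information of log files that record incremental data of each database instance within a set time period; and an unlocking module 1008, configured to unlock the distributed transactions and generate a data backup set of the distributed database according to the backup result of each database instance and the information of the log files.
In this embodiment, each database instance corresponding to the distributed database is instructed, according to the backup request, to perform a full data backup, and after the full data backups are completed, distributed transactions are locked and the information of the log files is obtained. Locking the distributed transactions prevents them from causing data inconsistency between database instances, and hence global data inconsistency, so that the distributed database is in a globally consistent state. After the information of the log files is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and the data backup set is generated according to the backup result of each database instance and the information of the log files. A database backup that guarantees global data consistency is thereby achieved while minimizing the impact on user services.
Embodiment 9
Referring to FIG. 11, a structural block diagram of a data processing apparatus according to Embodiment 9 of the present invention is shown.
The data processing apparatus of this embodiment includes: a full backup module 1102, configured to receive a backup request for a distributed database and instruct, according to the backup request, each corresponding database instance to perform a full data backup; a locking module 1104, configured to lock, after determining that each database instance has completed the full data backup, distributed transactions that perform data updates across database instances; a first obtaining module 1106, configured to obtain information of log files that record incremental data of each database instance within a set time period; and an unlocking module 1108, configured to unlock the distributed transactions and generate a data backup set of the distributed database according to the backup result of each database instance and the information of the log files.
Optionally, the locking module 1104 includes: a transaction determining module 11041, configured to determine, after determining that each database instance has completed the full data backup, whether all in-progress distributed transactions across database instances have completed their commits; and a blocking instruction generating module 11042, configured to generate, if the commits are complete, a blocking instruction that instructs locking of distributed transactions, so as to lock distributed transactions that perform data updates across database instances, where the blocking instruction uses seconds as the unit of blocking duration.
Optionally, the log file contains the data change position information and the post-change data that record the incremental data; and the information of the log file includes information about the database instance corresponding to the log file, the name of the log file, and the offset in the log file.
Optionally, the unlocking module 1108 includes: an unlock instruction generating module 11081, configured to generate an unlock instruction that allows distributed transactions to commit, so as to instruct release of the lock on distributed transactions; and a backup set generating module 11082, configured to generate the data backup set of the distributed database according to the metadata of each database instance backed up when the instances performed the full data backup, the content data backup result of each database instance, and the information of the log files.
The data processing apparatus of this embodiment is used to implement the corresponding data processing methods in the foregoing method embodiments and has the beneficial effects of the corresponding method embodiments, which are not described again here.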
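Embodiment 10
Referring to FIG. 12, a structural block diagram of a data processing apparatus according to Embodiment 10 of the present invention is shown.
The data processing apparatus of this embodiment includes: a backup set determining module 1202, configured to receive a restore request for a distributed database and determine the data backup set indicated by the restore request, the data backup set being a data backup set generated by the foregoing data processing apparatus; and a full restore module 1204, configured to instruct, according to the data backup set, each corresponding database instance to perform a full restore operation.
In this embodiment, database restoration uses a data backup set generated by the data processing method of any one of Embodiments 1 to 3, which ensures the accuracy of the restoration and guarantees the global data consistency of the restored distributed database.
Embodiment 11
Referring to FIG. 13, a structural block diagram of a data processing apparatus according to Embodiment 11 of the present invention is shown.
The data processing apparatus of this embodiment includes: a backup set determining module 1302, configured to receive a restore request for a distributed database and determine the data backup set indicated by the restore request, the data backup set being a data backup set generated by the foregoing data processing apparatus; and a full restore module 1304, configured to instruct, according to the data backup set, each corresponding database instance to perform a full restore operation.
Optionally, the apparatus further includes: an information obtaining module 1306, configured to obtain, from the data backup set, information of log files that record incremental data of each database instance within a set time period; and an incremental restore module 1308, configured to perform, according to the information of the log files, an incremental restore operation on the database instances on which the full restore operation has been performed.
Optionally, the apparatus further includes: a middleware creating module 1310, configured to obtain the backed-up metadata of each database instance from the data backup set, create a new middleware instance according to the metadata, and restore the metadata to the new middleware instance.
Optionally, the full restore module 1304 is configured to obtain the content data backup results of the backed-up database instances from the data backup set, and instruct each corresponding database instance to restore its content data backup result into a newly created restoration database instance.
Optionally, the apparatus further includes: a mounting module 1312, configured to mount each newly created restoration database instance to the new middleware instance.
Optionally, the incremental restore module 1308 includes: an instance determining module 13081, configured to determine, according to the information of the log files, the database instances to be incrementally restored and the incremental data of the determined database instances within the set time period; and an incremental execution module 13082, configured to perform incremental restoration on the determined database instances according to the incremental data.
The data processing apparatus of this embodiment is used to implement the corresponding data processing methods in the foregoing method embodiments and has the beneficial effects of the corresponding method embodiments, which are not described again here.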
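Embodiment 12
Referring to FIG. 14, a schematic structural diagram of an electronic device according to Embodiment 12 of the present invention is shown. The specific embodiments of the present invention do not limit the specific implementation of the electronic device.
As shown in FIG. 14, the electronic device may include a processor 1402, a communications interface 1404, a memory 1406, and a communication bus 1408.
Specifically:
The processor 1402, the communications interface 1404, and the memory 1406 communicate with one another through the communication bus 1408.
The communications interface 1404 is configured to communicate with other electronic devices such as terminal devices or servers.
The processor 1402 is configured to execute a program 1410, and specifically may perform the relevant steps in the foregoing data processing method embodiments.
Specifically, the program 1410 may include program code, and the program code includes computer operation instructions.
The processor 1402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The one or more processors included in the electronic device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 1406 is configured to store the program 1410. The memory 1406 may include high-speed RAM and may further include non-volatile memory, for example at least one disk memory.
The program 1410 may specifically be used to cause the processor 1402 to perform the following operations: receiving a backup request for a distributed database, and instructing, according to the backup request, each corresponding database instance to perform a full data backup; after determining that each database instance has completed the full data backup, locking distributed transactions that perform data updates across database instances; obtaining information of log files that record incremental data of each database instance within a set time period; and unlocking the distributed transactions, and generating a data backup set of the distributed database according to the backup result of each database instance and the information of the log files.
In an optional implementation, the program 1410 is further used to cause the processor 1402, when locking distributed transactions that perform data updates across database instances after determining that each database instance has completed the full data backup, to: determine, after determining that each database instance has completed the full data backup, whether all in-progress distributed transactions across database instances have completed their commits; and if the commits are complete, generate a blocking instruction that instructs locking of distributed transactions, so as to lock distributed transactions that perform data updates across database instances, where the blocking instruction uses seconds as the unit of blocking duration.
In an optional implementation, the log file contains the data change position information and the post-change data that record the incremental data; and the information of the log file includes information about the database instance corresponding to the log file, the name of the log file, and the offset in the log file.
In an optional implementation, the program 1410 is further used to cause the processor 1402, when unlocking the distributed transactions and generating the data backup set of the distributed database according to the backup result of each backed-up database instance and the information of the log files, to: generate an unlock instruction that allows distributed transactions to commit, so as to instruct release of the lock on distributed transactions; and generate the data backup set of the distributed database according to the metadata of each database instance backed up when the instances performed the full data backup, the content data backup result of each database instance, and the information of the log files.
Alternatively,
the program 1410 may specifically be used to cause the processor 1402 to perform the following operations: receiving a restore request for a distributed database, and determining the data backup set indicated by the restore request, the data backup set being a data backup set generated by the foregoing data processing method; and instructing, according to the data backup set, each corresponding database instance to perform a full restore operation.
In an optional implementation, the program 1410 is further used to cause the processor 1402 to: obtain, from the data backup set, information of log files that record incremental data of each database instance within a set time period; and perform, according to the information of the log files, an incremental restore operation on the database instances on which the full restore operation has been performed.
In an optional implementation, the program 1410 is further used to cause the processor 1402 to obtain the backed-up metadata of each database instance from the data backup set, create a new middleware instance according to the metadata, and restore the metadata to the new middleware instance.
In an optional implementation, the program 1410 is further used to cause the processor 1402, when instructing, according to the data backup set, each corresponding database instance to perform a full restore operation, to obtain the content data backup results of the backed-up database instances from the data backup set and instruct each corresponding database instance to restore its content data backup result into a newly created restoration database instance.
In an optional implementation, the program 1410 is further used to cause the processor 1402 to mount each newly created restoration database instance to the new middleware instance.
In an optional implementation, the program 1410 is further used to cause the processor 1402, when performing, according to the information of the log files, an incremental restore operation on the database instances on which the full restore operation has been performed, to: determine, according to the information of the log files, the database instances to be incrementally restored and the incremental data of the determined database instances within the set time period; and perform incremental restoration on the determined database instances according to the incremental data.
For the specific implementation of each step in the program 1410, refer to the corresponding descriptions of the corresponding steps and units in the foregoing data processing method embodiments; details are not repeated here. A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above may be found in the corresponding process descriptions in the foregoing method embodiments and are not repeated here.
With the electronic device of this embodiment, each database instance corresponding to the distributed database is instructed, according to the backup request, to perform a full data backup, and after the full data backups are completed, distributed transactions are locked and the information of the log files is obtained. Locking the distributed transactions prevents them from causing data inconsistency between database instances, and hence global data inconsistency, so that the distributed database is in a globally consistent state. After the information of the log files is obtained, the distributed transactions are unlocked so that the distributed database can run normally, and the data backup set is generated according to the backup result of each database instance and the information of the log files. A database backup that guarantees global data consistency is thereby achieved while minimizing the impact on user services.
Alternatively, with the electronic device of this embodiment, database restoration uses a data backup set generated by the data processing method of any one of Embodiments 1 to 3, which ensures the accuracy of the restoration and guarantees the global data consistency of the restored distributed database.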
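It should be pointed out that, depending on implementation needs, each component or step described in the embodiments of the present invention may be split into more components or steps, and two or more components or steps, or partial operations of components or steps, may be combined into new components or steps to achieve the objectives of the embodiments of the present invention.
The foregoing methods according to the embodiments of the present invention may be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or as computer code that is originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded over a network, and stored in a local recording medium, so that the methods described herein can be processed by such software stored on a recording medium using a general-purpose computer, a special-purpose processor, or programmable or dedicated hardware (such as an ASIC or FPGA). It can be understood that a computer, processor, microprocessor controller, or programmable hardware includes a storage component (for example, RAM, ROM, or flash memory) that can store or receive software or computer code, and that the data processing methods described herein are implemented when the software or computer code is accessed and executed by the computer, processor, or hardware. In addition, when a general-purpose computer accesses code for implementing the data processing methods shown herein, execution of the code converts the general-purpose computer into a special-purpose computer for performing those methods.
A person of ordinary skill in the art may appreciate that the units and method steps of the examples described with reference to the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the embodiments of the present invention.
The foregoing implementations are only intended to illustrate the embodiments of the present invention and do not limit them. A person of ordinary skill in the relevant technical field may make various changes and variations without departing from the spirit and scope of the embodiments of the present invention. All equivalent technical solutions therefore also fall within the scope of the embodiments of the present invention, and the scope of patent protection of the embodiments of the present invention shall be defined by the claims.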

Claims (14)

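  1. A data processing method, comprising:
    receiving a backup request for a distributed database, and instructing, according to the backup request, each corresponding database instance to perform a full data backup;
    after determining that each database instance has completed the full data backup, locking distributed transactions that perform data updates across database instances;
    obtaining information of log files that record incremental data of each database instance within a set time period; and
    unlocking the distributed transactions, and generating a data backup set of the distributed database according to the backup result of each database instance and the information of the log files.
  2. The method according to claim 1, wherein the locking, after determining that each database instance has completed the full data backup, distributed transactions that perform data updates across database instances comprises:
    after determining that each database instance has completed the full data backup, determining whether all in-progress distributed transactions across database instances have completed their commits; and
    if the commits are complete, generating a blocking instruction that instructs locking of distributed transactions, so as to lock distributed transactions that perform data updates across database instances, wherein the blocking instruction uses seconds as the unit of blocking duration.
  3. The method according to claim 1, wherein the log file contains the data change position information and the post-change data that record the incremental data; and the information of the log file comprises information about the database instance corresponding to the log file, the name of the log file, and the offset in the log file.
  4. The method according to claim 1, wherein the unlocking the distributed transactions, and generating a data backup set of the distributed database according to the backup result of each backed-up database instance and the information of the log files comprises:
    generating an unlock instruction that allows distributed transactions to commit, so as to instruct release of the lock on distributed transactions; and
    generating the data backup set of the distributed database according to the metadata of each database instance backed up when the instances performed the full data backup, the content data backup result of each database instance, and the information of the log files.
  5. A data processing method, comprising:
    receiving a restore request for a distributed database, and determining the data backup set indicated by the restore request, the data backup set being a data backup set generated by the data processing method according to any one of claims 1 to 4; and
    instructing, according to the data backup set, each corresponding database instance to perform a full restore operation.
  6. The method according to claim 5, wherein the method further comprises:
    obtaining, from the data backup set, information of log files that record incremental data of each database instance within a set time period; and
    performing, according to the information of the log files, an incremental restore operation on the database instances on which the full restore operation has been performed.
  7. The method according to claim 5, wherein the method further comprises:
    obtaining the backed-up metadata of each database instance from the data backup set, creating a new middleware instance according to the metadata, and restoring the metadata to the new middleware instance.
  8. The method according to claim 7, wherein the instructing, according to the data backup set, each corresponding database instance to perform a full restore operation comprises:
    obtaining the content data backup results of the backed-up database instances from the data backup set, and instructing each corresponding database instance to restore its content data backup result into a newly created restoration database instance.
  9. The method according to claim 8, wherein the method further comprises:
    mounting each newly created restoration database instance to the new middleware instance.
  10. The method according to claim 6, wherein the performing, according to the information of the log files, an incremental restore operation on the database instances on which the full restore operation has been performed comprises:
    determining, according to the information of the log files, the database instances to be incrementally restored and the incremental data of the determined database instances within the set time period; and
    performing incremental restoration on the determined database instances according to the incremental data.
  11. A data processing apparatus, comprising:
    a full backup module, configured to receive a backup request for a distributed database and instruct, according to the backup request, each corresponding database instance to perform a full data backup;
    a locking module, configured to lock, after determining that each database instance has completed the full data backup, distributed transactions that perform data updates across database instances;
    a first obtaining module, configured to obtain information of log files that record incremental data of each database instance within a set time period; and
    an unlocking module, configured to unlock the distributed transactions and generate a data backup set of the distributed database according to the backup result of each database instance and the information of the log files.
  12. A data processing apparatus, comprising:
    a backup set determining module, configured to receive a restore request for a distributed database and determine the data backup set indicated by the restore request, the data backup set being a data backup set generated by the data processing apparatus according to claim 11; and
    a full restore module, configured to instruct, according to the data backup set, each corresponding database instance to perform a full restore operation.
  13. An electronic device, comprising a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus; and
    the memory is configured to store at least one executable instruction, the executable instruction causing the processor to perform operations corresponding to the data processing method according to any one of claims 1 to 4, or operations corresponding to the data processing method according to any one of claims 5 to 10.
  14. A computer storage medium, storing a computer program that, when executed by a processor, implements the data processing method according to any one of claims 1 to 4, or the data processing method according to any one of claims 5 to 10.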
PCT/CN2020/104015 2019-07-26 2020-07-24 Data processing method and apparatus, electronic device, and computer storage medium WO2021018020A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910682744.9 2019-07-26
CN201910682744.9A CN112306743B (zh) 2019-07-26 2019-07-26 Data processing method and apparatus, electronic device, and computer storage medium

Publications (1)

Publication Number Publication Date
WO2021018020A1 true WO2021018020A1 (zh) 2021-02-04

Family

ID=74228351

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/104015 WO2021018020A1 (zh) 2019-07-26 2020-07-24 Data processing method and apparatus, electronic device, and computer storage medium

Country Status (2)

Country Link
CN (1) CN112306743B (zh)
WO (1) WO2021018020A1 (zh)

Also Published As

Publication number Publication date
CN112306743B (zh) 2023-11-21
CN112306743A (zh) 2021-02-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20847398

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20847398

Country of ref document: EP

Kind code of ref document: A1