CN114546730A - Data recovery processing method, device and system - Google Patents

Info

Publication number
CN114546730A
Authority
CN
China
Prior art keywords
recovery
recovery verification
verification
database
execution path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210197321.XA
Other languages
Chinese (zh)
Inventor
张心怡
周小淞
王可心
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd
Priority to CN202210197321.XA
Publication of CN114546730A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2308Concurrency control
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2365Ensuring data consistency and integrity

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data recovery processing method and system. After a pending execution path for serial recovery verification of a plurality of database backup files is preliminarily determined, at least one first recovery verification job that can be executed concurrently and a second recovery verification job that cannot be executed concurrently are identified among the recovery verification jobs for those files. Critical path optimization is then performed on the recovery verification jobs based on the pending execution path and a preset scheduling optimization condition, to determine an optimal execution path that meets the condition. A scheduling device then controls at least one selected recovery verification server to execute the recovery verification jobs concurrently, split across paths according to the optimal execution path. This ensures that recovery verification of the plurality of database backup files can be completed within a preset recovery verification duration and improves the resource utilization of the recovery verification server.

Description

Data recovery processing method, device and system
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method, an apparatus, and a system for data recovery processing.
Background
In the big data era, data disaster recovery, i.e., data disaster backup technology, has become an important concern. It backs up system data to other databases so that data is not lost when an operation error or a system failure causes an abnormality.
To ensure that backup data is usable, in actual services a large amount of backup data is usually sent to a recovery target machine, where its validity is verified by recovery one item at a time in a preset order. However, the data volumes of different backups often differ. If recovery verification of one large backup takes too long, recovery verification of the other backups cannot be completed within the expected time, so neither the availability of the backup data nor the effectiveness of the backup task can be ensured.
Disclosure of Invention
In view of this, the present application provides a data recovery processing method, including:
acquiring a plurality of database backup files to be recovery-verified, and a pending execution path for performing serial recovery verification on the plurality of database backup files;
determining at least one first recovery verification job and a second recovery verification job for performing recovery verification on the plurality of database backup files; the first recovery verification job can be executed concurrently; the second recovery verification job cannot be executed concurrently;
performing critical path optimization on the first recovery verification job and the second recovery verification job based on the pending execution path and a preset scheduling optimization condition to obtain an optimal execution path meeting the preset scheduling optimization condition;
and controlling a recovery verification server to execute the first recovery verification job and the second recovery verification job according to the optimal execution path.
Optionally, the determining at least one first recovery verification job and a second recovery verification job for performing recovery verification on the plurality of database backup files includes:
acquiring historical recovery verification information of the databases corresponding to the plurality of database backup files, and the available resources of the recovery verification server;
based on the historical recovery verification information and the available resources, obtaining a first prediction probability that the recovery verification server can complete the recovery verification of the plurality of database backup files within a preset recovery verification duration according to the pending execution path;
when the first prediction probability is determined to be less than a first probability threshold, generating an optimized scheduling instruction for the pending execution path;
in response to the optimized scheduling instruction, performing resource competition analysis on the recovery verification process of the plurality of database backup files;
and determining, based on the different resource competition analysis results, the corresponding first recovery verification job and second recovery verification job from among the plurality of recovery verification jobs for performing recovery verification on the plurality of database backup files.
Optionally, the performing resource competition analysis on the recovery verification processes of the multiple database backup files, and determining, based on different resource competition analysis results, a corresponding first recovery verification job from multiple recovery verification jobs for performing recovery verification on the multiple database backup files includes:
determining a database server source for each of the plurality of database backup files;
if the same database server source corresponds to a plurality of database backup files, determining the corresponding recovery verification processes as first recovery verification jobs;
decomposing each first recovery verification job into a plurality of recovery verification sub-jobs based on the execution entities involved in the recovery verification process of the database backup file, and determining the recovery verification sub-jobs that can be executed concurrently among different first recovery verification jobs; and/or,
if a plurality of database server sources each correspond to one database backup file, determining the recovery verification process of each such database backup file as a first recovery verification job.
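The grouping by database server source described above can be illustrated with a minimal Python sketch. The `BackupFile` type, the file names, and the server names are hypothetical; the application does not specify any data structures, so this only shows how files might be separated into sources with several backup files (candidates for decomposition into sub-jobs) and sources with a single backup file.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class BackupFile(NamedTuple):
    name: str
    source_server: str  # database server the backup was taken from (hypothetical field)


def split_by_source(
    files: List[BackupFile],
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Return (shared, single): sources with several backup files, and sources with one."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for f in files:
        groups[f.source_server].append(f.name)
    shared = {s: names for s, names in groups.items() if len(names) > 1}
    single = {s: names for s, names in groups.items() if len(names) == 1}
    return shared, single


files = [
    BackupFile("a1.bak", "db-a"),
    BackupFile("b1.bak", "db-b"),
    BackupFile("a2.bak", "db-a"),
]
shared, single = split_by_source(files)
print(shared)  # sources whose jobs may be decomposed into sub-jobs
print(single)  # sources whose jobs can run concurrently as whole jobs
```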
Optionally, the performing, based on the pending execution path and a preset scheduling optimization condition, a critical path optimization on the first recovery validation job and the second recovery validation job to obtain an optimal execution path meeting the preset scheduling optimization condition includes:
performing execution path scheduling on a plurality of first recovery verification jobs corresponding to the same database server source;
determining, based on the pending execution path and the predicted recovery verification durations of the different recovery verification jobs, a candidate execution path for the first recovery verification jobs and the second recovery verification job according to the scheduling optimization mode corresponding to the obtained execution path scheduling result;
and, when the predicted execution result of controlling a recovery verification server to execute the first recovery verification jobs and the second recovery verification job according to the candidate execution path meets the preset scheduling optimization condition, determining the candidate execution path as the optimal execution path.
Optionally, the predicted execution result of controlling a recovery verification server to execute the first recovery verification job and the second recovery verification job according to the candidate execution path meeting the preset scheduling optimization condition includes:
acquiring a second prediction probability that the recovery verification jobs corresponding to the plurality of database backup files are completed within the preset recovery verification duration according to the candidate execution path;
when the second prediction probability is determined to be smaller than a second probability threshold, continuing to optimize the candidate execution path until the newly acquired second prediction probability is equal to or greater than the second probability threshold;
and/or,
when it is determined that, among the recovery verification jobs corresponding to the plurality of database backup files, the range of jobs that cannot be executed within the preset recovery verification duration has not been reduced, continuing to optimize the candidate execution path until that range is reduced;
and/or,
when it is determined that, in the candidate execution path, a recovery verification job that cannot be executed within the preset recovery verification duration has not moved backward in the execution order relative to the pending execution path, continuing to optimize the candidate execution path until its position in the execution order moves backward.
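As a minimal sketch of the second condition above, the following Python code compares the jobs that would overrun the window under the serial pending execution path with those that would overrun under a candidate set of concurrent paths. It assumes that each job has a single predicted duration, that the paths run fully in parallel, and that "the range of jobs being reduced" means that fewer jobs overrun; the job names, durations, and window are hypothetical.

```python
from typing import Dict, List, Set

Path = List[str]  # job names in execution order on one path


def overdue_jobs(paths: List[Path], duration: Dict[str, int], window: int) -> Set[str]:
    """Jobs whose predicted finish time exceeds the window; paths run concurrently."""
    late: Set[str] = set()
    for path in paths:
        elapsed = 0
        for job in path:
            elapsed += duration[job]
            if elapsed > window:
                late.add(job)
    return late


def reduces_overdue(
    pending: Path, candidate: List[Path], duration: Dict[str, int], window: int
) -> bool:
    """True if the candidate leaves fewer overdue jobs than the serial pending path."""
    before = overdue_jobs([pending], duration, window)
    after = overdue_jobs(candidate, duration, window)
    return len(after) < len(before)


duration = {"A": 10, "B": 8, "C": 6, "D": 4}  # predicted hours, hypothetical
window = 16                                    # preset recovery verification duration
pending = ["A", "B", "C", "D"]
candidate = [["A", "C"], ["B", "D"]]

print(sorted(overdue_jobs([pending], duration, window)))  # ['B', 'C', 'D']
print(sorted(overdue_jobs(candidate, duration, window)))  # []
print(reduces_overdue(pending, candidate, duration, window))  # True
```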
Optionally, the performing, based on the pending execution path and a preset scheduling optimization condition, a critical path optimization on the first recovery validation job and the second recovery validation job to obtain an optimal execution path meeting the preset scheduling optimization condition includes:
scheduling execution paths of a plurality of first recovery verification jobs corresponding to the same database server source to obtain at least one first concurrent execution path;
scheduling execution paths of the first recovery verification jobs corresponding to different database server sources to obtain at least one second concurrent execution path;
performing critical path scheduling on the first recovery verification jobs and the second recovery verification job according to a scheduling optimization strategy, based on the pending execution path, the first concurrent execution path and the second concurrent execution path, to obtain an optimal execution path meeting a preset scheduling optimization condition;
the preset scheduling optimization condition comprises that under the condition that the recovery verification of the database backup files is completed within a preset recovery verification time, the recovery verification time spent on the recovery verification of the database backup files and the consumed resource amount are reduced.
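The application does not disclose a concrete scheduling algorithm for meeting this condition. Purely as an illustration, the sketch below substitutes a well-known greedy heuristic (longest processing time first) and searches for the smallest number of concurrent paths whose longest path still fits within the window. It assumes independent jobs with known predicted durations; the job names and numbers are hypothetical, and the heuristic is not guaranteed to find the true minimum.

```python
import heapq
from typing import Dict, List, Optional


def lpt_paths(duration: Dict[str, int], n_paths: int) -> List[List[str]]:
    """Assign jobs, longest first, to whichever path currently has the least load."""
    heap = [(0, i) for i in range(n_paths)]  # (current load, path index)
    paths: List[List[str]] = [[] for _ in range(n_paths)]
    for job in sorted(duration, key=lambda j: (-duration[j], j)):
        load, i = heapq.heappop(heap)
        paths[i].append(job)
        heapq.heappush(heap, (load + duration[job], i))
    return paths


def makespan(paths: List[List[str]], duration: Dict[str, int]) -> int:
    """Finish time of the slowest path when all paths run concurrently."""
    return max((sum(duration[j] for j in p) for p in paths), default=0)


def fewest_paths_within(duration: Dict[str, int], window: int) -> Optional[List[List[str]]]:
    """Smallest path count (under this heuristic) that finishes within the window."""
    if not duration:
        return []
    for n in range(1, len(duration) + 1):
        paths = lpt_paths(duration, n)
        if makespan(paths, duration) <= window:
            return paths
    return None  # even one job per path does not fit


duration = {"A": 10, "B": 8, "C": 6, "D": 4}  # predicted hours, hypothetical
plan = fewest_paths_within(duration, window=16)
print(plan)                      # [['A', 'D'], ['B', 'C']]
print(makespan(plan, duration))  # 14
```

Using as few paths as possible stands in here for "reducing the consumed resource amount"; a real scheduler would also need to account for the non-concurrent second job and for per-server resource limits.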
Optionally, the obtaining a pending execution path for performing serial recovery verification on the plurality of database backup files includes:
acquiring priority configuration data corresponding to the plurality of database backup files; the priority configuration data comprises one or more combinations of historical recovery verification information, database attributes, database backup modes and recovery verification prediction duration of a database corresponding to the database backup files;
determining a recovery verification priority for serial recovery verification of the plurality of database backup files based on the priority configuration data;
and obtaining the pending execution path for the recovery verification of the plurality of database backup files according to the recovery verification priority.
Optionally, the determining, based on the priority configuration data, a recovery verification priority for performing serial recovery verification on the plurality of database backup files includes:
if the database backup mode comprises a full backup mode and an incremental backup mode, configuring the recovery verification priority of a database backup file obtained in the full backup mode to be higher than that of a database backup file obtained in the incremental backup mode;
and/or,
when it is determined, based on the historical recovery verification information, that the corresponding database backup file did not complete, or did not successfully complete, recovery verification within the immediately preceding preset recovery verification duration, raising the recovery verification priority of that database backup file.
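The two priority rules above can be sketched as a simple scoring function. The field names and the weights (1 for a full backup, 2 for a missed or failed previous verification) are assumptions made for illustration; the application does not state how the rules are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BackupFile:
    name: str
    mode: str                   # "full" or "incremental"
    last_verification_ok: bool  # False if the previous window's verification failed or did not finish


def priority(f: BackupFile) -> int:
    """Higher score means earlier in the pending execution path (weights are hypothetical)."""
    score = 0
    if f.mode == "full":
        score += 1  # full backups rank above incremental backups
    if not f.last_verification_ok:
        score += 2  # raise priority after a missed or failed verification
    return score


def pending_execution_path(files: List[BackupFile]) -> List[str]:
    """Serial order of recovery verification, highest priority first."""
    return [f.name for f in sorted(files, key=priority, reverse=True)]


files = [
    BackupFile("inc_ok", "incremental", True),
    BackupFile("full_ok", "full", True),
    BackupFile("inc_failed", "incremental", False),
    BackupFile("full_failed", "full", False),
]
print(pending_execution_path(files))
# ['full_failed', 'inc_failed', 'full_ok', 'inc_ok']
```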
The present application further proposes a data recovery processing apparatus, the apparatus including:
the information acquisition module is used for acquiring a plurality of database backup files to be recovery-verified, and a pending execution path for performing serial recovery verification on the plurality of database backup files;
the recovery verification job determining module is used for determining at least one first recovery verification job and a second recovery verification job for performing recovery verification on the plurality of database backup files; the first recovery verification job can be executed concurrently; the second recovery verification job cannot be executed concurrently;
the optimal execution path obtaining module is used for performing, based on the pending execution path and a preset scheduling optimization condition, critical path optimization on the first recovery verification job and the second recovery verification job to obtain an optimal execution path that meets the preset scheduling optimization condition;
and the recovery verification processing module is used for controlling a recovery verification server to execute the first recovery verification job and the second recovery verification job according to the optimal execution path.
The present application further proposes a data recovery processing system, the system including: a database server, a backup server, a recovery validation server, and a scheduling device comprising at least one communication interface, at least one memory, and at least one processor, wherein:
the memory is used for storing a program for implementing the data recovery processing method;
the processor is used for loading and executing the program stored in the memory, so as to implement the data recovery processing method.
The application thus provides a data recovery processing method, device and system. After a pending execution path for serial recovery verification of a plurality of database backup files is preliminarily determined, at least one first recovery verification job that can be executed concurrently and a second recovery verification job that cannot be executed concurrently are identified among the recovery verification jobs for those files. Critical path optimization is then performed on the recovery verification jobs based on the pending execution path and a preset scheduling optimization condition, to determine an optimal execution path that meets the condition. A scheduling device then controls at least one selected recovery verification server to execute the recovery verification jobs concurrently, split across paths according to the optimal execution path. This ensures that recovery verification of the plurality of database backup files can be completed within a preset recovery verification duration and improves the resource utilization of the recovery verification server.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic architecture diagram of an alternative example of a data recovery processing system in an application environment suitable for the data recovery processing method proposed in the present application;
Fig. 2 is a schematic flowchart of an alternative example of the data recovery processing method proposed in the present application;
Fig. 3 is a schematic flowchart of another alternative example of the data recovery processing method proposed in the present application;
Fig. 4 is a schematic flowchart of another alternative example of the data recovery processing method proposed in the present application;
Fig. 5 is a schematic flowchart of yet another alternative example of the data recovery processing method proposed in the present application;
Fig. 6 is a schematic structural diagram of an alternative example of the data recovery processing apparatus proposed in the present application;
Fig. 7 is a schematic structural diagram of yet another alternative example of the data recovery processing apparatus proposed in the present application;
Fig. 8 is a schematic hardware structure diagram of an alternative example of a scheduling device suitable for the data recovery processing method proposed in the present application.
Detailed Description
With regard to the problem described in the Background section, one way to improve the recovery verification efficiency of a plurality of database backup files is to configure a corresponding number of recovery verification servers and perform recovery verification on different database backup files simultaneously. This ensures that the plurality of database backup files can be verified within a preset recovery verification duration, and thus ensures their availability and validity. However, this approach wastes a large amount of recovery verification server resources and is not preferable.
To address this, the present application aims to complete recovery verification of the backed-up database backup files within the preset recovery verification duration while making the fullest possible use of the resources of one or more recovery verification servers, so that the servers work normally and their resources are not wasted. The application takes into account that the steps for restoring backup files of different databases differ in complexity, and that the whole recovery verification process often involves several execution entities. For example, to restore a backup file of a relational database such as MySQL or PostgreSQL, the backup file must first be restored to a local disk, after which its recovery verification is completed against the corresponding database server.
Accordingly, in the recovery verification of the plurality of database backup files, the application can reasonably allocate the concurrently executed jobs, determining how many recovery verification servers are needed for concurrent execution, which database backup files each path contains, and their execution order. Even for a plurality of database backup files from the same database server, the whole recovery verification process can be decomposed, and a critical path optimization algorithm is used to optimize the allocation of the decomposed steps of different database backup files that can be executed in parallel in time. This optimizes the resource scheduling of the recovery verification servers, so that recovery verification of the obtained database backup files is completed in full and with guaranteed quality within the preset recovery verification duration, which reduces resource waste, improves the execution efficiency of recovery verification jobs, resolves job backlog, and ensures the availability and validity of the database backup files.
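The benefit of decomposing recovery verification into steps can be illustrated with a minimal Python sketch. It assumes that each backup file has exactly two stages (restore to local disk, then verification against the database server), that each stage can handle one file at a time, and that files are processed in list order; the durations are hypothetical and this is not the application's critical path algorithm.

```python
from typing import List, Tuple

# (restore_to_disk_minutes, verify_on_db_server_minutes) per backup file; values are hypothetical.
Job = Tuple[int, int]


def serial_duration(jobs: List[Job]) -> int:
    """Total time when each file is fully restored and verified before the next one starts."""
    return sum(restore + verify for restore, verify in jobs)


def pipelined_duration(jobs: List[Job]) -> int:
    """Total time when restoring the next file overlaps with verifying the previous one."""
    restore_done = 0  # time at which the disk-restore stage becomes free
    verify_done = 0   # time at which the verification stage becomes free
    for restore, verify in jobs:
        restore_done += restore
        # Verification starts once the file is on disk and the verifier is idle.
        verify_done = max(verify_done, restore_done) + verify
    return verify_done


jobs = [(30, 20), (10, 40), (20, 10)]
print(serial_duration(jobs))     # 130
print(pipelined_duration(jobs))  # 100
```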
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, a schematic diagram of an architecture of an alternative example of a data recovery processing system in any application environment applicable to the data recovery processing method provided in the present application is shown, where the application environment type is not limited in the present application, and as shown in fig. 1, the system may include: the system comprises a database server 11, a backup server 12, a recovery verification server 13 and a scheduling device 14, wherein the number of each of the components in the system is usually at least one, and the components can be in communication connection with each other through a wired communication network or a wireless communication network.
The database server 11 may be a server that provides a data storage service, may be an independent physical server, may be a service cluster configured by a plurality of physical servers, or may be a cloud server. In practical applications, the database server 11 is generally divided into a plurality of types, such as a relational database, a non-relational database, and the like, and the database server for storing the business data of the enterprise can be flexibly selected according to different business requirements.
One or more databases (i.e., database instances) may be configured in each database server 11 to enable flexible storage of each type of data. For data security, a disaster recovery system is usually configured for the database server 11. That is, according to a preset backup period or backup rule, the backup server 12 may periodically back up the data of each database and store the resulting database backup file, so that the file can later be restored as service requirements dictate to obtain the database instance as it was at backup time.
In practical applications, for a plurality of database servers 11, a corresponding number of backup servers may execute backup jobs in parallel to obtain the database backup file of each database included in each database server. Then, to ensure that later recovery operations on a database backup file will work, recovery validity verification is usually performed on the obtained database backup file. In this process, a recovery verification server (i.e., a recovery target machine) can be used to perform the restoration, and during recovery verification the corresponding database server can be used to verify the key data content of the database backup file. The implementation process is not described in detail.
In this application, there may be a plurality of backup servers 12, or a plurality of backup processes/threads may be created on one backup server, to execute backup jobs according to preset backup rules, back up the database instances in the corresponding database server, and obtain the database backup files of those database instances. The number of recovery verification servers participating in the data recovery processing, and the number of processes/threads each recovery verification server uses to execute recovery verification, may be determined from the optimal scheduling scheme obtained by analyzing the critical path of the concurrently executable backup recovery jobs in the recovery verification of the database backup files obtained this time, which is not described in detail herein.
The scheduling device 14 may be a computer device configured to analyze the obtained plurality of database backup files, determine an optimal scheduling scheme, and control the recovery verification of the database backup files obtained in this round according to that scheme. It may be a terminal device or a server. The terminal device may include, but is not limited to, a smartphone, a tablet computer, a wearable device, an intelligent transportation device, an intelligent medical device, a robot, a desktop computer, and the like; the server may be a physical server or a cloud server. The application does not limit the product type of the scheduling device 14, which may be determined as the case may be.
During recovery verification of each database backup file, the database backup file stored by the backup server may be sent to the local electronic device of the administrator requesting the recovery verification, and the local electronic device then restores the database backup file to the backup verification database. The local electronic device may be the scheduling device 14, or may be another electronic device different from the scheduling device 14, which is not limited in this application and may be determined as the case may be.
It should be understood that the data recovery processing system structure shown in fig. 1 does not constitute a limitation of the data recovery processing system in the embodiments of the present application. In practical applications, the system may include more or fewer devices than those shown in fig. 1, or a combination of devices, which are not listed here.
Referring to fig. 2, a schematic flow chart of an alternative example of the data recovery processing method proposed in the present application, where the method may be executed by the scheduling device, and as shown in fig. 2, the method may include:
Step S21, acquiring a plurality of database backup files to be recovery-verified, and a pending execution path for serial recovery verification of the plurality of database backup files;
As described in the corresponding part of the above embodiment, the backup job for the database instances in each database server is usually executed according to a certain period or time rule. For example, a backup job may be executed on a database instance once every 24 hours, in which case 24 hours may be used as the preset recovery verification duration. The duration is not limited to 24 hours: the time interval between two adjacent backup jobs, that is, the preset recovery verification duration, may be determined according to service requirements.
When recovery verification must be performed on the plurality of database backup files obtained by the backup jobs executed within each preset recovery verification duration, a backlog of jobs that cannot be completed may arise. To address this, the recovery verification priority of each of the database backup files may be determined first, and a corresponding recovery verification priority mark configured for the recovery verification job of each file. Recovery verification jobs with a high recovery verification priority may then be executed first based on these marks. In other words, a pending execution path for serial recovery verification of the plurality of database backup files is determined, so that recovery verification of the files is executed sequentially according to the pending execution path.
The recovery verification priority of each of the database backup files may be obtained, that is, the pending execution path may be determined, by deciding a marking policy for the recovery verification priority from one or a combination of the following: historical recovery verification information (for example, whether the last recovery verification of the corresponding database's backup file was completed, and the time it took); database attributes (for example, database type); database backup mode (for example, a full backup mode that backs up all data of the database, or an incremental backup mode that backs up part of the data); and predicted recovery verification duration (the time predicted for recovery verification of the database backup file this time). The marking policy determines the pending execution path for serial recovery verification of the database backup files obtained this time, that is, the order in which their recovery verification is executed. The implementation process is not described in detail in this application.
It should be noted that, regarding the marking policy according to which the pending execution path for performing serial recovery validation on a plurality of database backup files is determined, including but not limited to the above description, the marking policy may be determined according to actual service requirements, and details of this application are not described in detail.
In some embodiments, the pending execution path is determined, and continuously adjusted, with the aim of maximizing the number of recovery verification jobs executed within the preset recovery verification duration. The application can therefore predict whether the recovery verification of the multiple database backup files can be completed within the preset recovery verification duration according to the pending execution path. If it can be completed with high probability, that is, the obtained first prediction probability reaches a first probability threshold (a critical prediction probability indicating that recovery verification of the multiple database backup files can be completed within the preset recovery verification duration; this application does not limit its value, which can be determined as appropriate), the pending execution path can be used as the target scheduling scheme, and the scheduling device can directly control the multiple database backup files to undergo recovery verification in sequence according to the pending execution path.
Otherwise, if the prediction described above determines that the first prediction probability does not reach the first probability threshold, the scheduling scheme for the recovery verification of the multiple database backup files needs further optimization. In this case, the subsequent steps may be executed to determine a target scheduling scheme that completes the recovery verification of the multiple database backup files within the preset recovery verification duration while making full use of the resources of the recovery verification server.
Of course, in still other embodiments, after the multiple database backup files are obtained, the optimal scheduling scheme for them may also be determined by combining the technical solution described above with the parallel execution of recovery jobs, that is, a backup recovery task scheduling manner optimized on the basis of the critical path of concurrent jobs, so as to improve recovery verification efficiency.
Step S22, determining at least one first recovery verification job and a second recovery verification job for performing recovery verification on the plurality of database backup files;
In this embodiment of the present application, the recovery verification process for each database backup file may be treated as one recovery verification job. A first recovery verification job refers to a recovery verification job that can be executed concurrently; a second recovery verification job refers to one that cannot be executed concurrently. The present application does not limit how first and second recovery verification jobs are determined; reference may be made to, but is not limited to, the description of the corresponding parts of the following embodiments.
In some embodiments, among the multiple recovery verification jobs corresponding to the multiple database backup files obtained this time, some may be first recovery verification jobs, all may be first recovery verification jobs (that is, there is no second recovery verification job), or none may be a first recovery verification job. When all of the recovery verification jobs are second recovery verification jobs, the recovery verification of the multiple database backup files can be controlled directly according to the pending execution path.
For first recovery verification jobs, the database server from which each job's database backup file comes may be considered. As described in the following embodiments, directly executing in full concurrency the recovery verification processes of multiple database backup files that come from the same database server does not help improve recovery verification efficiency. In this case, the recovery verification process of such database backup files may be decomposed into multiple recovery verification sub-jobs. Because different recovery verification sub-jobs take different amounts of time, and the same type of sub-job also takes different amounts of time for different database backup files, the concurrent execution policy among the multiple database backup files is not limited to starting all jobs at the same time. For example, while the second recovery verification sub-job of one database backup file is being executed, the first recovery verification sub-job of the recovery verification job of the next database backup file may be started. This maximizes the number of recovery verification jobs executed within the preset recovery verification duration and ensures that the resources of the recovery verification server are fully used. The process of determining this scheduling scheme is not described in detail in this application.
Step S23, based on the pending execution path and the preset scheduling optimization condition, performing critical path optimization on the first recovery verification job and the second recovery verification job to obtain an optimal execution path meeting the preset scheduling optimization condition;
Step S24, according to the optimal execution path, controlling the recovery verification server to execute the first recovery verification jobs and the second recovery verification jobs.
As described above, the preset scheduling optimization condition may include: while completing the recovery verification of the multiple database backup files (i.e., the database backup files obtained by executing backup jobs within the preset recovery verification duration), reducing the recovery verification duration and the amount of resources consumed by that recovery verification. This addresses the technical problems that arise when the multiple database backup files are verified serially, or when they are sent to a corresponding number of recovery verification servers and verified in parallel.
In combination with the above description of the technical solution, the present application schedules the concurrent execution of the first recovery verification jobs in consideration of the predicted recovery verification durations of the different recovery verification jobs and the available resources of the recovery verification server. That is, the first recovery verification job queue (execution path) of each concurrent recovery verification path is determined; and, for first recovery verification jobs that can be decomposed, the different start times of the recovery verification sub-jobs executed concurrently between different first recovery verification jobs are determined for each candidate execution order. This yields multiple policies, i.e., multiple candidate execution paths for the recovery verification jobs.
Then, for each execution path (i.e., scheduling policy), the predicted recovery verification duration required to execute the multiple recovery verification jobs (including the first and second recovery verification jobs), the amount of recovery verification server resources that would be consumed, and the like may be analyzed, and an optimal execution path may be determined on that basis. The present application does not limit the critical path optimization method used to find the optimal solution (i.e., the optimal execution path) for the execution order of the recovery verification jobs that meets the preset scheduling optimization condition.
After the optimal execution path for the recovery verification jobs corresponding to the multiple database backup files obtained this time is determined as described above, the scheduling device may, according to the optimal execution path, control the multiple database backup files to be restored to the corresponding recovery verification servers, which perform recovery verification on the received database backup files in sequence. The implementation process is not described in detail in the present application.
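The comparison of candidate execution paths by predicted duration and resource consumption can be illustrated with a minimal sketch. The application does not fix a selection rule, so the rule below (prefer paths that fit the preset duration, then the shortest, then the cheapest) is an assumption, as are the names `CandidatePath`, `pick_optimal_path`, and all the numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePath:
    # All fields are hypothetical illustration values, not from the application.
    name: str
    predicted_minutes: float  # predicted recovery verification duration
    resource_units: float     # predicted resource consumption on the server


def pick_optimal_path(candidates, window_minutes):
    """Assumed rule: keep paths that fit the window, then take the
    shortest one, breaking ties by lower resource consumption."""
    feasible = [c for c in candidates if c.predicted_minutes <= window_minutes]
    pool = feasible or list(candidates)  # fall back if nothing fits
    return min(pool, key=lambda c: (c.predicted_minutes, c.resource_units))


paths = [
    CandidatePath("serial", 300.0, 4.0),
    CandidatePath("two_processes", 170.0, 7.0),
    CandidatePath("three_processes", 170.0, 9.0),
]
best = pick_optimal_path(paths, window_minutes=240.0)
print(best.name)
```

A weighted sum of duration and resource consumption would be an equally plausible rule; the lexicographic key is used here only because it is easy to read.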
In summary, after the pending execution path for serial recovery verification of the database backup files to be restored is preliminarily determined, the embodiment of the present application, in order to improve recovery verification efficiency and the resource utilization of the recovery verification server, first determines, among the multiple recovery verification jobs for the database backup files, at least one first recovery verification job that can be executed concurrently and the second recovery verification job that cannot. It then performs critical path optimization on the recovery verification jobs based on the pending execution path and the preset scheduling optimization condition, and determines the optimal execution path that satisfies the preset scheduling optimization condition, that is, the concurrent execution combinations, execution order, and so on of the recovery verification jobs. The scheduling device then controls the selected at least one recovery verification server to execute the multiple recovery verification jobs concurrently along separate paths according to the optimal execution path. This keeps the recovery verification of the database backup files within the preset recovery verification duration, makes full use of the resources of the recovery verification server, and avoids resource waste.
Referring to fig. 3, a schematic flow chart of yet another optional example of the data recovery processing method proposed in the present application is shown. This embodiment describes an optional detailed implementation of the data recovery processing method described above, although the method is not limited to this implementation. The embodiment details how the pending execution path is obtained; the other processing steps are not described in detail here. As shown in fig. 3, the detailed implementation may include, but is not limited to:
Step S31, obtaining a plurality of database backup files to be restored and verified;
Step S32, obtaining the priority configuration data corresponding to each of the plurality of database backup files;
In the embodiment of the present application, the priority configuration data may include, but is not limited to, one or a combination of: the historical recovery verification information, database attributes, and database backup manner of the database corresponding to the database backup file, and the predicted recovery verification duration. The content of the priority configuration data may be determined according to service requirements, which is not limited in the present application.
The priority configuration data may be used to indicate a recovery verification priority of the corresponding database backup file, that is, a recovery verification execution sequence in a recovery verification process of the multiple database backup files acquired this time may be determined according to a recovery verification priority configuration policy of the database backup files.
Step S33, determining a recovery verification priority for performing serial recovery verification on the plurality of database backup files based on the priority configuration data;
Step S34, according to the recovery verification priority, obtaining the pending execution path for recovery verification of the multiple database backup files;
In this embodiment of the present application, the database backup manners included in the priority configuration data generally comprise a full backup manner and an incremental backup manner. In actual service applications, the recovery verification priority of database backup files obtained by full backup may be configured to be higher than that of database backup files obtained by incremental backup. Thus, when determining the recovery verification priorities of the obtained database backup files, the backup manner of each database backup file may be determined first, which establishes whether the file belongs to the higher or the lower recovery verification priority, and the execution order of the recovery verification of the multiple database backup files is determined accordingly.
Optionally, if the priority configuration data includes the historical recovery verification information, that information can indicate that the recovery verification of the corresponding database backup file was not completed, or not successfully completed, within the last preset recovery verification duration. To ensure that the recovery verification of that database backup file is reliably completed this time, its recovery verification priority may be raised. The number of priority levels by which it is raised may be determined as appropriate, which is not limited in this application.
In still other embodiments, while adjusting the recovery verification priority of each database backup file, the predicted recovery verification duration of each database backup file may be obtained. The predicted durations of the multiple database backup files are accumulated in the execution order produced by the current adjustment to obtain a total predicted recovery verification duration, and it is determined whether this total reaches the preset recovery verification duration. In this way, the number of recovery verification jobs that can be executed within the preset recovery verification duration is maximized.
In addition, according to actual service requirements, it can be determined which databases' backup files need recovery verification first and which can be verified later. That is, the database attributes that have a higher recovery verification priority (or whose priority should be raised) and those that have a lower priority (or whose priority should be lowered) are determined, so that the recovery verification priorities of multiple database backup files can subsequently be configured.
In practical applications of the present application, the recovery verification priority of each database backup file may be determined by combining one or more of the priority configuration manners listed above. The priority mark configured for that recovery verification priority is associated with the recovery verification job of the database backup file, so that the recovery verification execution order of each database backup file can subsequently be determined from the priority mark.
After the recovery verification priority for serial recovery verification of the multiple database backup files is determined, the recovery verification execution order corresponding to that priority may be used as the pending execution path for recovery verification of the multiple database backup files. It should be noted that the method for obtaining the pending execution path includes, but is not limited to, the implementations listed above, and may be adjusted flexibly according to different service requirements; further examples are not described in this application.
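The accumulation check described above can be sketched as follows. This is only an illustration: the function names and the durations are hypothetical, and comparing a priority order against a shortest-first order is an assumption used to show how the ordering changes the number of jobs that fit in the preset duration.

```python
from itertools import accumulate


def jobs_within_window(ordered_minutes, window_minutes):
    """Count how many jobs, executed serially in the given order,
    finish before the preset recovery verification duration elapses."""
    finish_times = accumulate(ordered_minutes)  # running total per job
    return sum(1 for t in finish_times if t <= window_minutes)


def total_predicted_minutes(ordered_minutes):
    """Total predicted recovery verification duration of the whole path."""
    return sum(ordered_minutes)


# Hypothetical predicted durations (minutes) in the current priority order.
priority_order = [120, 30, 45, 60]
window = 180

print(total_predicted_minutes(priority_order))               # 255, exceeds 180
print(jobs_within_window(priority_order, window))            # 2
print(jobs_within_window(sorted(priority_order), window))    # 3
```

In this example the total exceeds the window either way, but moving the long job to the end lets one more job complete, which is the kind of adjustment the paragraph above describes.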
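As a minimal sketch of how priority configuration data could be turned into a pending execution path, the example below scores each backup file and sorts by score. The weights, the tie-breaking rule, and the names `BackupFile`, `priority_score`, and `pending_execution_path` are assumptions made for illustration; the application leaves the marking policy open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupFile:
    # Hypothetical record of the priority configuration data for one file.
    name: str
    backup_manner: str           # "full" or "incremental"
    last_verification_ok: bool   # from historical recovery verification info
    predicted_minutes: float     # predicted recovery verification duration


def priority_score(f):
    """Assumed weights: full backups outrank incremental ones, and a file
    whose last verification did not complete is raised by one level."""
    score = 0
    if f.backup_manner == "full":
        score += 2
    if not f.last_verification_ok:
        score += 1
    return score


def pending_execution_path(files):
    """Serial order: higher score first, shorter predicted job first on ties."""
    return sorted(files, key=lambda f: (-priority_score(f), f.predicted_minutes))


files = [
    BackupFile("a", "incremental", True, 20.0),
    BackupFile("b", "full", True, 90.0),
    BackupFile("c", "incremental", False, 30.0),
    BackupFile("d", "full", False, 60.0),
]
print([f.name for f in pending_execution_path(files)])  # ['d', 'b', 'c', 'a']
```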
Step S35, acquiring a first prediction probability that the recovery verification server can complete the recovery verification of the plurality of database backup files within a preset recovery verification duration according to the pending execution path;
Step S36, determining that the first prediction probability is smaller than a first probability threshold, and generating an optimized scheduling instruction for the pending execution path;
In conjunction with the description of the corresponding portions of the above embodiments, after the pending execution path for serial recovery verification of the multiple database backup files is preliminarily determined, that is, after the recovery verification priority queue of the multiple database backup files is determined, the predicted execution time (i.e., the predicted recovery verification duration) or predicted execution time interval of the recovery verification job of each database backup file can be estimated. The predicted execution times of the recovery verification jobs are accumulated in the execution order of the pending execution path to determine whether the total reaches the preset recovery verification duration, that is, to predict whether the recovery verification of the multiple database backup files can be completed within the preset recovery verification duration. The present application does not limit the prediction method; it may combine information such as the historical recovery verification information of each database and the currently available resources of the recovery verification server, and is not described in detail here.
After the first prediction probability that the recovery verification of the multiple database backup files can be completed within the preset recovery verification duration according to the pending execution path is obtained in this way, it can be compared with the preset first probability threshold to determine whether the recovery verification can be completed within the preset recovery verification duration. The present application does not limit the value of the first probability threshold or how it is obtained; both can be determined as appropriate.
If the analysis determines that the first prediction probability is smaller than the first probability threshold, it can be considered that the recovery verification of the multiple database backup files cannot be completed within the preset recovery verification duration according to the pending execution path. In this case, an optimized scheduling instruction can be generated to indicate that the pending execution path should be further optimized. The present application does not limit how the optimized scheduling instruction is generated or what it contains.
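The application does not specify how the first prediction probability is computed. One possible sketch, under the assumption that each job's duration is independent and roughly normal with the mean and spread seen in its history, is shown below; the model, the function names, the sample histories, and the 0.9 threshold are all hypothetical.

```python
import math
from statistics import mean, pstdev


def completion_probability(history_minutes, window_minutes):
    """Estimate P(total serial duration <= window), assuming independent,
    normally distributed job durations (an illustrative assumption)."""
    mu = sum(mean(h) for h in history_minutes)
    variance = sum(pstdev(h) ** 2 for h in history_minutes)
    if variance == 0:
        return 1.0 if mu <= window_minutes else 0.0
    z = (window_minutes - mu) / math.sqrt(variance)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # normal CDF


def needs_optimized_scheduling(first_probability, first_threshold):
    """True when an optimized scheduling instruction should be generated."""
    return first_probability < first_threshold


# Hypothetical past verification durations (minutes), one list per database.
history = [[50, 60, 70], [100, 120, 140], [30, 30, 30]]
p_wide = completion_probability(history, window_minutes=240)
p_tight = completion_probability(history, window_minutes=200)
print(needs_optimized_scheduling(p_wide, 0.9))   # False
print(needs_optimized_scheduling(p_tight, 0.9))  # True
```

Available server resources, which the text also mentions as an input, are left out of this sketch for brevity.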
Step S37, determining at least one first recovery verification job and a second recovery verification job for performing recovery verification on the plurality of database backup files in response to the optimized scheduling instruction;
Step S38, based on the pending execution path and the preset scheduling optimization condition, performing critical path optimization on the first recovery verification job and the second recovery verification job to obtain an optimal execution path meeting the preset scheduling optimization condition;
Step S39, according to the optimal execution path, controlling the recovery verification server to execute the first recovery verification jobs and the second recovery verification jobs.
Regarding the optimized scheduling method described in steps S37 to S39, reference may be made to the description of the corresponding parts of the above embodiments, which is not repeated here.
Referring to fig. 4, which is a schematic flow chart of yet another optional example of the data recovery processing method proposed in the present application, this embodiment may be a description of yet another optional detailed implementation method of the data recovery processing method described above, and may perform detailed description on the scheduling optimization implementation method proposed in the above embodiment, but is not limited to the detailed implementation method described in this embodiment, as shown in fig. 4, the method may include:
Step S41, acquiring a plurality of database backup files to be restored and verified, and a pending execution path for serial recovery verification of the plurality of database backup files;
Step S42, obtaining the historical recovery verification information of the database corresponding to each of the plurality of database backup files, and the available resources of the recovery verification server;
In the embodiment of the application, the available resources of the recovery verification server may be determined based on the task processes/threads created for the recovery verification jobs to be executed. The more task processes run in parallel on the same recovery verification server, the more severe the resource competition, which may slow the recovery verification progress of those task processes. Therefore, when determining the parallel recovery verification of multiple database backup files, the number of task processes the recovery verification server needs to create is determined in light of its available resources, and the recovery verification job queue of each task process is then determined.
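A rough sketch of this sizing step is given below. The application does not say how available resources map to a process count, so the per-process CPU and memory budgets, the round-robin queue assignment, and both function names are assumptions for illustration only.

```python
def task_process_count(free_cpu_cores, free_memory_gb,
                       cores_per_process, memory_gb_per_process, job_count):
    """Assumed rule: create as many task processes as the scarcer resource
    allows, never more than there are jobs, and at least one."""
    by_cpu = free_cpu_cores // cores_per_process
    by_memory = int(free_memory_gb // memory_gb_per_process)
    return max(1, min(by_cpu, by_memory, job_count))


def build_queues(jobs_in_path_order, process_count):
    """Deal jobs to task processes round-robin, preserving path order
    inside each queue (one simple choice among many)."""
    queues = [[] for _ in range(process_count)]
    for index, job in enumerate(jobs_in_path_order):
        queues[index % process_count].append(job)
    return queues


jobs = ["a", "b", "c", "d", "e"]  # hypothetical jobs in pending-path order
count = task_process_count(8, 20.0, 2, 6.0, len(jobs))
print(count)                      # 3 (memory is the limiting resource)
print(build_queues(jobs, count))  # [['a', 'd'], ['b', 'e'], ['c']]
```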
Step S43, based on the historical recovery verification information and the available resources, obtaining a first prediction probability that the recovery verification server can complete the recovery verification of the plurality of database backup files within a preset recovery verification duration according to the pending execution path;
Step S44, determining that the first prediction probability is smaller than a first probability threshold, and generating an optimized scheduling instruction for the pending execution path;
In combination with the above description of the historical recovery verification information, the predicted recovery verification duration of the acquired database backup files can be estimated. From this, either the prediction probability that the recovery verification task cannot be completed within the preset recovery verification duration, or the first prediction probability that the recovery verification task is completed within the preset recovery verification duration along the pending execution path (namely the preliminarily set recovery verification priority queue), can be calculated. If the first prediction probability is less than the first probability threshold, it can be considered that serial recovery verification of the multiple database backup files directly along the pending execution path cannot be completed within the preset recovery verification duration. At least some of the database backup files then need to be verified in parallel, which shortens the recovery verification duration while making full use of the resources of the selected recovery verification server.
Step S45, responding to the optimized scheduling instruction, and performing resource competition analysis on the recovery verification process of the plurality of database backup files;
Step S46, based on different resource competition analysis results, determining the corresponding first recovery verification jobs and second recovery verification job from among the plurality of recovery verification jobs for performing recovery verification on the plurality of database backup files;
In practical applications, each database server is usually configured with one database instance, and a backup operation on that instance produces the corresponding database backup file. Recovery verification of database backup files from different database servers usually involves no resource contention and can be executed in parallel; that is, the recovery verification of database backup files from different nodes (i.e., database servers) runs as parallel jobs, and each such recovery verification job can be marked as a first recovery verification job.
Some database servers, however, may store multiple databases. Executing the multiple recovery verification processes directly and concurrently does not help improve recovery efficiency, while executing them directly in sequence may fail to complete the recovery verification within the preset recovery verification duration. To improve the execution efficiency of the recovery verification, the recovery verification process of such database backup files may be decomposed, and the recovery verification sub-jobs corresponding to different database backup files of the same database server are executed in parallel. The recovery verification job of each of the multiple database backup files from the same database server may therefore also be determined as a first recovery verification job.
Step S47, performing execution path scheduling on a plurality of first recovery verification jobs corresponding to the same database server source;
The recovery process of a database backup file generally comprises sending the database backup file to the local system for decompression and restoring the decompressed file to a recovery verification server. In MySQL, for example, the backup is exported to an SQL file by the mysqldump command, and the database is restored from that SQL file during the recovery process. To improve recovery verification efficiency, the whole recovery verification process of each database backup file may be decomposed; for example, the corresponding recovery verification job may be decomposed into a plurality of recovery verification sub-jobs according to the executing entities involved.
In some embodiments, the present application may determine a database server source of each of the plurality of database backup files, determine a corresponding recovery verification process as a first recovery verification job if the same database server source corresponds to the plurality of database backup files, and then may decompose the first recovery verification job into a plurality of recovery verification sub-jobs based on an execution entity involved in the recovery verification process of the database backup files, and determine recovery verification sub-jobs that can be executed concurrently between different first recovery verification jobs.
Based on this, when the concurrent job of any database backup file from the same database server is decomposed to obtain a plurality of corresponding recovery verification sub-jobs, the process of transmitting the database backup file from the backup server to the local device may be determined as the first sub-job, and the concurrently executable job execution time period may be recorded as t 1; determining the process of transmitting the database backup file from the local device to the recovery verification server as a second sub-job, and recording the concurrently executable job execution time period as t 2; determining the key content verification process of the database server on the database backup file as a third sub-job, and recording the job execution time period as t 3; the reset process to the restoration authentication environment of the restoration authentication server is determined as the fourth sub-job, and this concurrently executable job execution time period is denoted as t 4.
Therefore, in the recovery verification process of each database backup file, the execution time periods of the t2 and t3 jobs are executed by different servers, which do not affect each other, and can be executed in parallel, or sub jobs in different time periods of the same recovery verification job, such as sub jobs in the t1 time period and the t2 time period executed by the same local device, can also be executed in parallel. Therefore, the recovery verification sub-jobs corresponding to t1 and t2 belonging to different database-backed files may be executed concurrently, and the recovery verification sub-jobs corresponding to t2 and t3 may also be executed concurrently.
For example, one recovery verification job on one task process enters the sub-job of stage t2, and the sub-job of stage t1 of the next recovery verification job (determined according to the execution sequence of the pending execution path) can be started simultaneously on another task process; when this one recovery verification job enters the sub job of the time period t3, i.e., when the time period t3 is entered, the time period t2 of the next recovery verification job can be synchronously executed on the other task process. That is, during data validation of one recovery validation job, other database instances on another database server or data recovery jobs under the database may be executed synchronously. According to this scheduling manner, concurrent execution paths of different recovery validation sub-jobs of multiple database backup files in the same database server can be implemented, including but not limited to the concurrent scheduling scheme listed above in this embodiment.
It should be noted that, for the above-mentioned concurrently executable first recovery verification job capable of being decomposed, multiple recovery verification jobs belonging to the same first recovery verification job are executed in a decomposed order, that is, the recovery verification sub-jobs corresponding to t1, t2, t3, and t4 of each first recovery verification job are executed in series in one task process, but when entering the recovery verification sub-job corresponding to a certain time period, another recovery verification sub-job of the next recovery verification job may be executed in another task process in synchronization.
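The staggered start of sub-jobs can be illustrated with a small timing sketch. It assumes that each of the four stages (t1 to t4) is handled by an entity that serves one backup file at a time, so stage k of a job starts once the same job has finished stage k-1 and the previous job has finished stage k. That single-resource-per-stage model, the function names, and the durations are assumptions, not taken from the application.

```python
def serial_makespan(jobs):
    """Total time when every job runs t1..t4 to completion before the next."""
    return sum(sum(stages) for stages in jobs)


def pipelined_makespan(jobs):
    """Total time when sub-jobs of successive jobs overlap.

    finish[k] holds the time at which stage k was last released, so a
    stage starts at max(own previous stage done, same stage of the
    previous job done). This is the classic flow-shop recurrence.
    """
    if not jobs:
        return 0
    finish = [0] * len(jobs[0])
    for stages in jobs:
        previous_stage_end = 0
        for k, duration in enumerate(stages):
            start = max(previous_stage_end, finish[k])
            previous_stage_end = start + duration
            finish[k] = previous_stage_end
    return finish[-1]


# Hypothetical (t1, t2, t3, t4) durations in minutes, in pending-path order.
jobs = [(10, 20, 30, 5), (15, 10, 25, 5), (5, 15, 20, 5)]
print(serial_makespan(jobs))     # 165
print(pipelined_makespan(jobs))  # 110
```

With these sample numbers the overlap saves 55 minutes; the saving depends on how balanced the stage durations are, since the longest stage (here t3) dominates the pipeline.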
According to the scheduling basis described above, execution path scheduling may be performed on each recovery verification sub-job included in a plurality of first recovery verification jobs corresponding to the same database server source, so as to determine a plurality of execution paths, where each execution path may include a queue of the recovery verification job on each of a plurality of task processes and an execution order thereof. In the process of executing the route scheduling, for the first recovery verification operation which can be executed concurrently, the critical route scheduling with different recovery verification priorities can be performed.
In addition, if a plurality of database server sources respectively correspond to a database backup file, the recovery verification process of the database backup file is determined as a first recovery verification job, and the first recovery verification job can be executed as a whole without further decomposition and concurrent scheduling. Such first recovery verification jobs may be distributed to different task processes such that such first recovery verification jobs located in different task processes are executed in parallel.
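One way such whole jobs might be spread over task processes is sketched below. The application does not prescribe a distribution rule; this sketch assumes a greedy longest-first assignment to the least-loaded process, and the names `distribute_jobs` and the backup file names are hypothetical.

```python
import heapq


def distribute_jobs(predicted_durations, num_processes):
    """Assign whole (non-decomposed) jobs to task processes.

    Assumption: jobs are taken longest first and each is given to the
    process with the smallest accumulated predicted duration.
    """
    heap = [(0, p) for p in range(num_processes)]  # (load, process id)
    heapq.heapify(heap)
    queues = {p: [] for p in range(num_processes)}
    ordered = sorted(predicted_durations.items(), key=lambda kv: -kv[1])
    for job, duration in ordered:
        load, process = heapq.heappop(heap)
        queues[process].append(job)
        heapq.heappush(heap, (load + duration, process))
    return queues


# Hypothetical predicted durations (minutes), one backup file per server source.
queues = distribute_jobs({"srvA.bak": 30, "srvB.bak": 20, "srvC.bak": 10}, 2)
print(queues)
```

The queues on different task processes can then run in parallel, each executing its jobs as a whole.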
Step S48, based on the pending execution path and the recovery verification predicted duration of different recovery jobs, determining candidate execution paths for the first recovery verification job and the second recovery verification job according to the scheduling optimization mode corresponding to the obtained execution path scheduling result;
in this embodiment of the present application, the scheduling optimization manner may include, but is not limited to, an exhaustive method, a steepest descent method, a heuristic algorithm, and the like, and a suitable scheduling optimization manner may be selected according to the execution path scheduling result to determine candidate execution paths for the recovery verification jobs corresponding to the plurality of database backup files.
In a possible implementation, if the number of execution path schedules obtained is less than a first number threshold (that is, the solution space for the concurrent recovery verification jobs is small), the exhaustive method may be used as the scheduling optimization manner. If the number of execution path schedules obtained is equal to or greater than the first number threshold (that is, the solution space is large), and the first prediction probability that the recovery verification jobs corresponding to the plurality of database backup files complete within the preset recovery verification duration according to the pending execution path is greater than a first probability threshold (for example, empirical values indicate that the jobs will very likely complete within the preset recovery verification duration), the steepest descent method may be used as the scheduling optimization manner to optimize the execution path of the recovery verification jobs. If the number of execution path schedules obtained is equal to or greater than the first number threshold and the first prediction probability is less than or equal to the first probability threshold, a heuristic algorithm is used as the scheduling optimization manner; that is, the heuristic algorithm computes the critical path of the concurrent recovery verification jobs, and the range of recovery verification jobs that cannot be completed within the preset recovery verification duration can be marked. The present application does not limit the implementation.
How the candidate execution path is determined under each of the scheduling optimization manners described above follows from the operating principle of the corresponding algorithm and is not described in detail in this application.
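The selection rule among the three scheduling optimization manners can be summarized in a short sketch. The function name, parameter names, and threshold values are hypothetical; only the three-way branching follows the description above.

```python
def select_optimization_mode(num_schedules, first_probability,
                             number_threshold, probability_threshold):
    """Pick a scheduling optimization manner.

    - small solution space                      -> exhaustive method
    - large space, probability above threshold  -> steepest descent method
    - large space, probability at or below it   -> heuristic algorithm
    """
    if num_schedules < number_threshold:
        return "exhaustive"
    if first_probability > probability_threshold:
        return "steepest_descent"
    return "heuristic"


# Hypothetical thresholds: 100 schedules and a probability of 0.8.
print(select_optimization_mode(10, 0.2, 100, 0.8))
print(select_optimization_mode(500, 0.9, 100, 0.8))
print(select_optimization_mode(500, 0.8, 100, 0.8))
```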
Step S49, verifying that the predicted execution results of the first recovery verification job and the second recovery verification job executed by the recovery verification server meet the preset scheduling optimization conditions according to the candidate execution path, and determining the candidate execution path as the optimal execution path;
step S410, according to the optimal execution path, controlling the recovery verification server to execute a plurality of first recovery verification jobs and second recovery verification jobs.
While optimizing the critical path as described above, the embodiment of the present application may check whether the execution path obtained after each optimization satisfies the preset scheduling optimization condition, for example: whether the second prediction probability that the recovery verification jobs corresponding to the plurality of database backup files complete within the preset recovery verification duration according to the candidate execution path is greater than a second probability threshold; and/or whether the range of jobs that cannot be executed within the preset recovery verification duration has been reduced; and/or whether the recovery verification jobs that cannot be executed within the preset recovery verification duration have moved backward in the candidate execution path relative to their execution order in the pending execution path. If one or more of these conditions are met, the optimization result can be adopted as the optimal execution path, and the recovery verification jobs are executed according to the optimal execution path.
Based on this, the above step S49 may include, but is not limited to, the following steps:
acquiring a second prediction probability that the recovery verification jobs corresponding to the plurality of database backup files complete within the preset recovery verification duration according to the candidate execution path, determining that the second prediction probability is smaller than a second probability threshold, and continuing to optimize the candidate execution path according to the method described above until the newly acquired second prediction probability is equal to or larger than the second probability threshold; and/or determining that, among the recovery verification jobs corresponding to the plurality of database backup files, the range of jobs that cannot be executed within the preset recovery verification duration is not reduced, and continuing to optimize the candidate execution path according to the method described above until that range is reduced; and/or determining that the recovery verification jobs that cannot be executed within the preset recovery verification duration have not moved backward in the candidate execution path relative to their execution order in the pending execution path, and continuing to optimize the candidate execution path according to the method described above until those jobs move backward in the candidate execution path relative to their execution order in the pending execution path.
It should be noted that the content of the preset scheduling optimization condition includes, but is not limited to, one or more of the conditions listed above in combination, and may be determined according to service requirements.
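The three example conditions can be expressed as a small check, sketched below. How the conditions are combined is left to service requirements in the text; this sketch assumes that satisfying any one of them is enough. All names are hypothetical, and the execution paths are modeled simply as ordered lists of job identifiers.

```python
def meets_optimization_conditions(second_probability, probability_threshold,
                                  pending_path, candidate_path,
                                  infeasible_before, infeasible_after):
    """Check the example scheduling optimization conditions (any-of, by assumption).

    infeasible_before / infeasible_after: sets of jobs that cannot finish
    within the preset duration under the pending / candidate path.
    """
    probability_ok = second_probability >= probability_threshold
    range_reduced = len(infeasible_after) < len(infeasible_before)
    # Every still-infeasible job sits later in the candidate path than before.
    moved_backward = bool(infeasible_after) and all(
        candidate_path.index(job) > pending_path.index(job)
        for job in infeasible_after
    )
    return probability_ok or range_reduced or moved_backward


pending = ["a", "b", "c"]
accepted = meets_optimization_conditions(0.5, 0.9, pending, ["a", "c", "b"], {"b"}, {"b"})
rejected = meets_optimization_conditions(0.5, 0.9, pending, ["a", "b", "c"], {"b"}, {"b"})
print(accepted, rejected)
```

A caller would keep optimizing the candidate execution path while this check returns `False`.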
Referring to fig. 5, which is a schematic flow diagram of another optional example of the data recovery processing method provided in the present application, this embodiment may be a description of another optional detailed implementation method of the data recovery processing method described above, and as shown in fig. 5, another optional implementation manner of determining an optimal execution path provided in this embodiment may include:
step S51, acquiring a plurality of database backup files to be restored and verified, and an undetermined execution path for serial restoration and verification of the plurality of database backup files;
step S52, performing resource competition analysis on the recovery verification process of the plurality of database backup files;
step S53, determining, based on different resource competition analysis results, a corresponding first recovery verification job and a corresponding second recovery verification job from a plurality of recovery verification jobs for performing recovery verification on the plurality of database backup files;
step S54, scheduling execution paths of a plurality of first recovery verification jobs corresponding to the same database server source to obtain at least one first concurrent execution path;
step S55, scheduling execution paths of the first recovery verification jobs corresponding to different database server sources to obtain at least one second concurrent execution path;
with reference to the description of the corresponding parts of the above embodiments, for a plurality of first recovery verification jobs corresponding to the same database server source, each of the first recovery verification jobs may be decomposed in parallel to obtain a plurality of recovery verification sub-jobs, and then critical path scheduling is performed on the recovery verification sub-jobs to obtain one or more first concurrent execution paths of the plurality of first recovery verification jobs, that is, one or more first recovery verification job queues of this type. Similarly, for the scheduling of the execution paths of the first recovery verification jobs corresponding to different database server sources, the predicted recovery verification duration and the available resources may be combined to obtain one or more second concurrent execution paths of the first recovery verification jobs, that is, one or more first recovery verification job queues, and the implementation process is not described in detail in this application.
Step S56, based on the undetermined execution path, the first concurrent execution path and the second concurrent execution path, performing critical path scheduling on the first recovery verification job and the second recovery verification job according to a scheduling optimization strategy to obtain an optimal execution path meeting preset scheduling optimization conditions;
in practical application of this embodiment, a scheduling optimization strategy may be determined by combining the scheduling optimization manners described above, so as to combine the various execution paths determined above and obtain a plurality of critical paths of the recovery verification jobs for the plurality of database backup files. An optimal execution path is then determined from the predicted execution results of executing the recovery verification jobs along each critical path, such as the predicted recovery verification duration and the recovery verification server resources consumed, so that, provided the recovery verification of the plurality of database backup files is completed within the preset recovery verification duration, the recovery verification duration and the amount of resources consumed are reduced. The implementation is not described in detail in this application.
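As an illustration of choosing among several critical paths, the sketch below filters out paths whose predicted duration exceeds the preset recovery verification duration and then prefers shorter duration, breaking ties by lower resource consumption. The ordering of the two criteria is an assumption, and `CandidatePath`, `pick_optimal_path`, and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePath:
    name: str
    predicted_duration: float  # predicted recovery verification duration
    resource_cost: float       # estimated server resources consumed


def pick_optimal_path(paths, preset_duration):
    """Return the feasible path with the least duration, then least resources."""
    feasible = [p for p in paths if p.predicted_duration <= preset_duration]
    if not feasible:
        return None  # no path finishes within the preset duration
    return min(feasible, key=lambda p: (p.predicted_duration, p.resource_cost))


paths = [
    CandidatePath("p1", 50, 8),
    CandidatePath("p2", 40, 10),
    CandidatePath("p3", 40, 6),
    CandidatePath("p4", 70, 4),
]
best = pick_optimal_path(paths, preset_duration=60)
print(best)
```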
Step S57 of controlling the recovery verification server to execute a plurality of first recovery verification jobs and second recovery verification jobs according to the optimal execution path;
step S58, acquiring job log data generated by the recovery verification server executing the first recovery verification job and the second recovery verification job;
step S59, determining, based on the job log data, that a recovery verification job was not successfully executed within the preset recovery verification duration, and adjusting the optimal scheduling scheme;
step S510, based on the adjusted optimal scheduling scheme, performing, within the next preset recovery verification duration, recovery verification on the database backup files of the databases corresponding to the plurality of database backup files.
After the plurality of recovery verification jobs have been executed according to the optimal execution path determined in the foregoing embodiments, that is, after the recovery verification task is completed, the execution status of the task may be reported to the administrator. Specifically, the job log data generated by executing each of the determined first recovery verification jobs and second recovery verification jobs is recorded, and this job log data represents the execution status of the corresponding recovery verification job. If it is determined from the job log data that a recovery verification job was not successfully executed within the preset recovery verification duration, the optimal scheduling scheme may be adjusted, so that recovery verification of the plurality of database backup files is performed within the next preset recovery verification duration; that is, once the list of unsuccessfully executed recovery verification jobs is determined, the administrator may adjust the recovery verification job scheduling scheme for the next preset recovery verification duration according to actual requirements. The method can be repeated each time before the recovery verification jobs are started. By optimizing the critical path of concurrently executable jobs during recovery verification of the plurality of database backup files, the resource scheduling of backup file recovery is optimized, service quality is ensured, job efficiency is improved, and the recovery jobs of the plurality of database backup files can be completed within the preset recovery verification duration.
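A minimal sketch of this feedback loop is given below. The log format (a list of records with `job` and `status` fields) and the adjustment rule (moving unsuccessful jobs to the front of the next window) are assumptions made for illustration; the application leaves the actual adjustment to the administrator.

```python
def unsuccessful_jobs(job_log):
    """List jobs whose log record does not report success (assumed log format)."""
    return [record["job"] for record in job_log if record["status"] != "success"]


def adjust_order(previous_order, failed):
    """Move unsuccessful jobs to the front of the next window, keeping relative order."""
    failed_set = set(failed)
    front = [job for job in previous_order if job in failed_set]
    rest = [job for job in previous_order if job not in failed_set]
    return front + rest


# Hypothetical job log from one recovery verification window.
log = [
    {"job": "db1", "status": "success"},
    {"job": "db2", "status": "timeout"},
    {"job": "db3", "status": "success"},
    {"job": "db4", "status": "not_started"},
]
failed = unsuccessful_jobs(log)
next_order = adjust_order(["db1", "db2", "db3", "db4"], failed)
print(failed, next_order)
```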
In some embodiments provided by the application, a general scheduling model may be constructed in advance for the method of obtaining the optimal execution path (i.e., the optimal scheduling scheme) described in each of the embodiments above. For different services, after information such as a plurality of database backup files and their respective priority configuration data is obtained according to the method described above, this information may be input directly into the scheduling model, which outputs the optimal execution path of the recovery verification jobs for the current plurality of database backup files. The method of constructing the scheduling model is not limited and can be determined in combination with the execution path optimization method described in each step.
In some embodiments, in some actual services, the number of recovery verification jobs that can be completed within the preset recovery verification duration is maximized in order to ensure the data security and availability of more databases. Before the recovery verification jobs start, the size of each database to be recovered and the expected execution time of its recovery verification job (namely, the recovery verification predicted duration) are estimated; the expected time for executing the recovery verification job of a database with a capacity larger than x GB (namely, a first data volume threshold, whose value is not limited by the application) is calculated, and the expected time for recovering n databases with a capacity smaller than x GB in a single execution is calculated. The n databases with a capacity smaller than x GB can be bundled and processed together as a batch, which improves processing performance. For example, the OpenStack database contains many infrequently used databases that are small, some with compressed files under 1 MB, so batch database recovery can effectively improve recovery verification efficiency.
Based on this, the data volume and the recovery verification predicted duration of each of the plurality of database backup files can be determined, so that the recovery verification process of a plurality of first database backup files whose data volume is smaller than the first data volume threshold (namely, the database backup files of the n databases with a capacity smaller than x GB) can be determined as one recovery verification job. Scheduling control can then be performed so that this job executes in parallel with other recovery verification jobs according to the scheduling method described above, so that the recovery verification jobs of small databases are executed in batches and concurrently. The implementation process is not described in detail in this application.
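The bundling of small databases into a single recovery verification job can be sketched as follows. The function name, the sample database names, and the sizes are hypothetical; the threshold plays the role of the first data volume threshold x GB.

```python
def bundle_small_backups(sizes_gb, threshold_gb):
    """Group backups below the threshold into one batch job.

    Returns a list of jobs; each job is a list of backup names. Backups at
    or above the threshold each form their own job.
    """
    small = sorted(name for name, size in sizes_gb.items() if size < threshold_gb)
    large = sorted(name for name, size in sizes_gb.items() if size >= threshold_gb)
    jobs = [[name] for name in large]
    if small:
        jobs.append(small)  # one bundled job for all small databases
    return jobs


# Hypothetical sizes in GB with a 1 GB threshold.
jobs = bundle_small_backups({"nova": 12.0, "keystone": 0.001, "glance": 0.0005}, 1.0)
print(jobs)
```

The bundled job can then be scheduled like any other recovery verification job.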
In addition, because the strategy for batch recovery verification of database backup files goes hand in hand with the backup job plan that produces those files, the database backup files of small databases can be exported in batches, as one batch database backup job, when the database backup files of a plurality of databases in the database server are exported. Accordingly, the recovery verification jobs for the batch of database backup files can also be executed in batches and concurrently in the manner described above, which improves recovery processing efficiency.
Referring to fig. 6, which is a schematic structural diagram of an alternative example of the data recovery processing apparatus proposed in the present application, the apparatus may include:
the information acquisition module 61 is configured to acquire a plurality of database backup files to be restored and verified, and an undetermined execution path for performing serial restoration and verification on the plurality of database backup files;
a recovery verification job determination module 62, configured to determine at least one first recovery verification job and a second recovery verification job for performing recovery verification on the plurality of database backup files; the first recovery verification job can be executed concurrently; the second recovery verification job cannot be executed concurrently;
an optimal execution path obtaining module 63, configured to perform, based on the undetermined execution path and a preset scheduling optimization condition, critical path optimization on the first recovery verification job and the second recovery verification job, and obtain an optimal execution path that meets the preset scheduling optimization condition;
and a recovery verification processing module 64, configured to control the recovery verification server to execute the plurality of first recovery verification jobs and the plurality of second recovery verification jobs according to the optimal execution path.
In some embodiments, as shown in fig. 7, the recovery verification job determination module 62 may include:
a first information obtaining unit 621, configured to obtain historical recovery verification information of the database corresponding to each of the plurality of database backup files, and available resources of the recovery verification server;
a first prediction probability obtaining unit 622, configured to obtain, based on the historical recovery verification information and the available resources, a first prediction probability that the recovery verification server can complete recovery verification on the plurality of database backup files within a preset recovery verification duration according to the pending execution path;
an optimized scheduling instruction determining unit 623, configured to determine that the first prediction probability is smaller than a first probability threshold, and generate an optimized scheduling instruction for the pending execution path;
a resource competition analysis unit 624, configured to perform resource competition analysis on the recovery verification process of the multiple database backup files in response to the optimized scheduling instruction;
a recovery verification job determining unit 625, configured to determine, based on different resource competition analysis results, a corresponding first recovery verification job and a corresponding second recovery verification job from multiple recovery verification jobs for performing recovery verification on the multiple database backup files.
Optionally, the resource competition analysis unit 624 may include:
a database server source determining unit, configured to determine a database server source of each of the plurality of database backup files;
the recovery verification job determination unit 625 may include:
a first determining unit, configured to determine, if the same database server source corresponds to multiple database backup files, a corresponding recovery verification process as a first recovery verification job;
a second determining unit, configured to decompose the first recovery verification job into multiple recovery verification sub-jobs based on an execution entity involved in a recovery verification process of the database backup file, and determine recovery verification sub-jobs that can be executed concurrently between different first recovery verification jobs; and/or
a third determining unit, configured to determine the recovery verification process of the database backup file as the first recovery verification job if a plurality of database server sources each correspond to one database backup file.
In still other embodiments of the present disclosure, as shown in fig. 7, the optimal execution path obtaining module 63 may include:
an execution path scheduling unit 631, configured to perform execution path scheduling on a plurality of first recovery verification operations corresponding to the same database server source;
a candidate execution path determining unit 632, configured to determine, based on the pending execution path and the predicted recovery verification durations of different recovery jobs, candidate execution paths for the first recovery verification job and the second recovery verification job according to a scheduling optimization manner corresponding to the obtained execution path scheduling result;
an optimal execution path determining unit 633, configured to verify that the predicted execution results of the first recovery verification job and the second recovery verification job executed by the recovery verification server satisfy a preset scheduling optimization condition according to the candidate execution path, and determine the candidate execution path as an optimal execution path.
Optionally, the optimal execution path determining unit 633 may include:
a second prediction probability obtaining unit, configured to obtain a second prediction probability that the recovery verification operation corresponding to the multiple database backup files is completed within a preset recovery verification duration according to the candidate execution path;
the first optimization unit is used for determining that the second prediction probability is smaller than a second probability threshold value, and continuing to optimize the candidate execution path until the newly acquired second prediction probability is equal to or larger than the second probability threshold value;
and/or the second optimization unit is configured to determine that, in the recovery verification jobs corresponding to the plurality of database backup files, a job range that cannot be executed within the preset recovery verification duration is not reduced, and continue to optimize the candidate execution path until the job range that cannot be executed within the preset recovery verification duration is reduced;
and/or a third optimization unit, configured to determine that, in the candidate execution path, a recovery verification job that cannot be executed within a preset recovery verification duration does not move backward with respect to an execution sequence in the pending execution path, and continue to optimize the candidate execution path until the execution sequence moves backward.
In still other embodiments of the present application, the optimal execution path obtaining module 63 may also include:
a first scheduling unit, configured to schedule execution paths of a plurality of first recovery verification jobs corresponding to the same database server source to obtain at least one first concurrent execution path;
a second scheduling unit, configured to schedule execution paths of the first recovery verification jobs corresponding to different database server sources, respectively, to obtain at least one second concurrent execution path;
a third scheduling unit, configured to perform, based on the pending execution path, the first concurrent execution path, and the second concurrent execution path, according to a scheduling optimization strategy, critical path scheduling on the first recovery verification job and the second recovery verification job, so as to obtain an optimal execution path that meets a preset scheduling optimization condition;
the preset scheduling optimization condition comprises that under the condition that the recovery verification of the database backup files is completed within a preset recovery verification time, the recovery verification time spent on the recovery verification of the database backup files and the consumed resource amount are reduced.
In still other embodiments, as shown in fig. 7, the information obtaining module 61 may include:
a backup file obtaining unit 611, configured to obtain a plurality of database backup files to be restored and verified;
a priority configuration data obtaining unit 612, configured to obtain priority configuration data corresponding to each of the multiple database backup files;
the priority configuration data comprises one or more combinations of historical recovery verification information, database attributes, database backup modes and recovery verification prediction duration of a database corresponding to the database backup files;
a restoration verification priority determining unit 613, configured to determine a restoration verification priority for performing serial restoration verification on the plurality of database backup files based on the priority configuration data;
and an undetermined execution path obtaining unit 614, configured to obtain an undetermined execution path for performing recovery verification on the multiple database backup files according to the recovery verification priority.
Optionally, the recovery verification priority determining unit 613 may include:
a first configuration unit, configured to configure, if the database backup mode comprises a full backup mode and an incremental backup mode, the recovery processing priority of the database backup file obtained in the full backup mode and the recovery processing priority of the database backup file obtained in the incremental backup mode; and/or
a second configuration unit, configured to determine, based on the historical recovery verification information, that the corresponding database backup file did not complete, or did not successfully complete, recovery verification within the immediately preceding preset recovery verification duration, and to raise the recovery verification priority of that database backup file.
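The priority rules handled by the two configuration units could be sketched as below. The text does not state which backup mode receives the higher priority, so ranking full backups above incremental ones is purely an assumption of this sketch, as are the numeric weights, the field names, and the function names.

```python
def recovery_priority(backup_mode, failed_last_window):
    """Compute a recovery verification priority (higher runs earlier).

    Assumptions: full backups rank above incremental ones, and a file that
    did not complete verification in the previous window gets a larger boost.
    """
    priority = 0
    if backup_mode == "full":
        priority += 1
    if failed_last_window:
        priority += 2
    return priority


def pending_execution_path(backups):
    """Order backups for serial recovery verification by descending priority."""
    ranked = sorted(
        backups,
        key=lambda b: -recovery_priority(b["mode"], b["failed_last_window"]),
    )
    return [b["name"] for b in ranked]


# Hypothetical backup files and history.
backups = [
    {"name": "db1_incr", "mode": "incremental", "failed_last_window": False},
    {"name": "db1_full", "mode": "full", "failed_last_window": False},
    {"name": "db2_incr", "mode": "incremental", "failed_last_window": True},
]
path = pending_execution_path(backups)
print(path)
```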
It should be noted that, various modules, units, and the like in the embodiments of the foregoing apparatuses may be stored in the memory as program modules, and the processor executes the program modules stored in the memory to implement corresponding functions, and for the functions implemented by the program modules and their combinations and the achieved technical effects, reference may be made to the description of corresponding parts in the embodiments of the foregoing methods, which is not described in detail in this embodiment.
The present application also provides a computer-readable storage medium, on which a computer program can be stored, which can be called and loaded by a processor to implement the steps of the data recovery processing method described in the above embodiments.
Referring to fig. 8, which is a schematic diagram illustrating a hardware structure of an alternative example of a scheduling apparatus suitable for the data recovery processing method provided in the present application, the scheduling apparatus may include: at least one communication interface 81, at least one memory 82, and at least one processor 83, wherein:
the communication interface 81 may include an interface of a communication module for implementing data interaction by using a wireless communication network, and the communication module may include, but is not limited to, a WIFI module, a 5G/6G (fifth generation mobile communication network/sixth generation mobile communication network) module, a GPRS module, and the like, so that the scheduling device implements communication connection with other devices in the data recovery processing system, and the implementation method is not described in detail in this application. In addition, the communication interface 81 may further include a communication interface for implementing data interaction between internal components of the scheduling device, such as a USB interface, a serial/parallel interface, a multimedia interface, and the like.
The memory 82 may be used to store a program for implementing the data recovery processing method described in the above-described method embodiments; the processor 83 may load and execute the program stored in the memory to implement the steps of the data recovery processing method described in the above corresponding method embodiment, and the specific implementation process may refer to the description of the corresponding parts in the above embodiment, which is not described again.
In practical applications, the communication interface 81, the memory 82 and the processor 83 may be connected to a communication bus, and data interaction between each other and other structural components of the computer device is realized through the communication bus, which may be specifically determined according to practical requirements, and is not described in detail in this application.
In the embodiment of the present application, the memory 82 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device or other non-volatile solid-state storage device. The processor 83 may be a Central Processing Unit (CPU), an Application-Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), or another programmable logic device. The structures and types of the memory 82 and the processor 83 are not limited in the present application, and can be flexibly adjusted according to actual requirements.
It should be understood that the structure of the scheduling apparatus shown in fig. 8 does not constitute a limitation of the scheduling apparatus in the embodiment of the present application, and in practical applications, the scheduling apparatus may include more components than those shown in fig. 8, or some components may be combined. If the scheduling device is a terminal device, the scheduling device may further include at least one input component such as a touch sensing unit that senses a touch event on the touch display panel, a keyboard, a mouse, a camera, a sound pickup, and the like; at least one output component such as a display, speaker, vibration mechanism, light, etc.; an antenna; a sensor module; power modules, etc., which are not listed herein.
Finally, it should be noted that, in the embodiments, relational terms such as first, second and the like may be used solely to distinguish one operation, unit or module from another operation, unit or module without necessarily requiring or implying any actual such relationship or order between such units, operations or modules. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method or system that comprises the element.
The embodiments in the present description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the apparatus, the scheduling device, and the system disclosed in the embodiments correspond to the method disclosed in the embodiments, their description is relatively brief, and the relevant details can be found in the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of data recovery processing, the method comprising:
acquiring a plurality of database backup files to be recovery-verified, and a pending execution path for performing serial recovery verification on the plurality of database backup files;
determining at least one first recovery verification job and a second recovery verification job for performing recovery verification on the plurality of database backup files; the first recovery verification job can be executed concurrently; the second recovery verification job cannot be executed concurrently;
performing critical path optimization on the first recovery verification job and the second recovery verification job based on the pending execution path and a preset scheduling optimization condition to obtain an optimal execution path meeting the preset scheduling optimization condition;
and controlling a recovery verification server to execute the at least one first recovery verification job and the second recovery verification job according to the optimal execution path.
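As an illustration of the method of claim 1, the following minimal Python sketch compares the duration of a serial pending execution path with a path in which the concurrently executable jobs are spread across several workers. It is a simplification under stated assumptions, not the claimed critical path optimization itself: the `Job` fields, the worker count, and the longest-first greedy placement are all hypothetical choices made for this example.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    minutes: int       # predicted recovery verification duration (assumed input)
    concurrent: bool   # True: "first" job (concurrent); False: "second" job


def serial_makespan(jobs):
    """Total duration when every job runs one after another."""
    return sum(job.minutes for job in jobs)


def optimized_makespan(jobs, workers):
    """Duration when concurrent jobs share `workers` slots.

    Assumption: non-concurrent jobs run alone, one at a time, and
    concurrent jobs are placed longest-first on the least-loaded worker.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    exclusive = sum(job.minutes for job in jobs if not job.concurrent)
    loads = [0] * workers
    heapq.heapify(loads)
    parallel = sorted(
        (job for job in jobs if job.concurrent),
        key=lambda job: job.minutes,
        reverse=True,
    )
    for job in parallel:
        least_loaded = heapq.heappop(loads)
        heapq.heappush(loads, least_loaded + job.minutes)
    return exclusive + max(loads)
```

Under this model, three concurrent jobs of 30, 20, and 10 minutes plus one non-concurrent job of 15 minutes take 75 minutes serially and 45 minutes with two workers.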
2. The method of claim 1, wherein the determining at least one first recovery verification job and a second recovery verification job for performing recovery verification on the plurality of database backup files comprises:
acquiring historical recovery verification information of the databases respectively corresponding to the plurality of database backup files, and available resources of the recovery verification server;
obtaining, based on the historical recovery verification information and the available resources, a first prediction probability that the recovery verification server can complete the recovery verification of the plurality of database backup files within a preset recovery verification duration according to the pending execution path;
determining that the first prediction probability is less than a first probability threshold, and generating an optimized scheduling instruction for the pending execution path;
in response to the optimized scheduling instruction, performing resource competition analysis on the recovery verification processes of the plurality of database backup files;
and determining, based on different resource competition analysis results, the corresponding first recovery verification job and second recovery verification job from a plurality of recovery verification jobs for performing recovery verification on the plurality of database backup files.
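The threshold check of claim 2 can be sketched as follows. This is only an illustrative estimate, assuming the historical recovery verification information is available as a list of past serial runs with per-database durations in minutes; the claim does not specify how the first prediction probability is computed, and this sketch ignores the available resources of the recovery verification server. The function names and the default threshold are hypothetical.

```python
def completion_probability(history, window_minutes):
    """Fraction of past serial runs whose total duration fit in the window.

    `history` is a list of runs; each run is a list of per-database
    durations in minutes (an assumed data layout).
    """
    if not history:
        return 0.0
    fits = sum(1 for run in history if sum(run) <= window_minutes)
    return fits / len(history)


def needs_optimization(history, window_minutes, threshold=0.9):
    """True when the predicted probability falls below the threshold,
    i.e. when an optimized scheduling instruction would be generated."""
    return completion_probability(history, window_minutes) < threshold
```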
3. The method of claim 2, wherein the performing resource competition analysis on the recovery verification processes of the plurality of database backup files, and determining a corresponding first recovery verification job from a plurality of recovery verification jobs for performing recovery verification on the plurality of database backup files based on different resource competition analysis results, comprises:
determining a database server source for each of the plurality of database backup files;
if the same database server source corresponds to a plurality of database backup files, determining a corresponding recovery verification process as a first recovery verification job;
decomposing the first recovery verification job into a plurality of recovery verification sub-jobs based on the execution entities involved in the recovery verification process of the database backup file, and determining the recovery verification sub-jobs that can be executed concurrently among different first recovery verification jobs; and/or,
if a plurality of database server sources each correspond to one database backup file, determining the recovery verification process of the database backup file as a first recovery verification job.
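The grouping step of claim 3 can be illustrated with a short Python sketch. It assumes each backup file is given as a `(file_name, server_source)` pair, which is a hypothetical representation, and it only separates sources with several backup files from sources with exactly one; the decomposition into sub-jobs is not shown.

```python
from collections import defaultdict


def group_by_source(backups):
    """Map each database server source to the backup files it produced."""
    groups = defaultdict(list)
    for file_name, source in backups:
        groups[source].append(file_name)
    return dict(groups)


def split_sources(backups):
    """Return (shared, single): sources with several backup files,
    and sources with exactly one backup file."""
    groups = group_by_source(backups)
    shared = {s: files for s, files in groups.items() if len(files) > 1}
    single = {s: files for s, files in groups.items() if len(files) == 1}
    return shared, single
```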
4. The method according to any one of claims 1 to 3, wherein the performing critical path optimization on the first recovery verification job and the second recovery verification job based on the pending execution path and a preset scheduling optimization condition to obtain an optimal execution path that meets the preset scheduling optimization condition comprises:
performing execution path scheduling on a plurality of first recovery verification jobs corresponding to the same database server source;
determining candidate execution paths for the first recovery verification job and the second recovery verification job, based on the pending execution path and the predicted recovery verification durations of the different recovery verification jobs, according to a scheduling optimization mode corresponding to the obtained execution path scheduling result;
and verifying that the predicted execution results of the first recovery verification job and the second recovery verification job executed by the recovery verification server according to the candidate execution path meet the preset scheduling optimization condition, and determining the candidate execution path as the optimal execution path.
5. The method of claim 4, wherein the verifying that the predicted execution results of the first recovery verification job and the second recovery verification job executed by the recovery verification server according to the candidate execution path satisfy a preset scheduling optimization condition comprises:
acquiring a second prediction probability that the recovery verification jobs corresponding to the plurality of database backup files are completed within a preset recovery verification duration according to the candidate execution path;
determining that the second prediction probability is smaller than a second probability threshold, and continuing to optimize the candidate execution path until the newly acquired second prediction probability is equal to or larger than the second probability threshold;
and/or,
determining that, among the recovery verification jobs corresponding to the plurality of database backup files, the range of jobs that cannot be executed within the preset recovery verification duration is not reduced, and continuing to optimize the candidate execution path until the range of jobs that cannot be executed within the preset recovery verification duration is reduced;
and/or,
determining that, in the candidate execution path, the recovery verification job that cannot be executed within the preset recovery verification duration has not moved backward relative to its execution order in the pending execution path, and continuing to optimize the candidate execution path until its execution order moves backward.
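One of the acceptance conditions in claim 5, that the range of jobs which cannot be executed within the preset duration is reduced, can be sketched for a purely serial path as below. The job names, the predicted durations, and the simple cumulative-time model are assumptions for illustration; the other two conditions (the probability threshold and the execution-order shift) are not covered.

```python
def unfinished_jobs(order, durations, window):
    """Jobs that would end after the window closes on a serial path.

    `order` is a list of job names, `durations` maps each name to its
    predicted duration, and `window` is the preset duration (same unit).
    """
    elapsed = 0
    missed = []
    for name in order:
        elapsed += durations[name]
        if elapsed > window:
            missed.append(name)
    return missed


def range_is_reduced(pending, candidate, durations, window):
    """True when the candidate path leaves fewer jobs unfinished
    than the pending path does."""
    before = unfinished_jobs(pending, durations, window)
    after = unfinished_jobs(candidate, durations, window)
    return len(after) < len(before)
```

In this model, a candidate path that is not accepted would be passed back to the optimizer for another round.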
6. The method according to any one of claims 1 to 3, wherein the performing critical path optimization on the first recovery verification job and the second recovery verification job based on the pending execution path and a preset scheduling optimization condition to obtain an optimal execution path that meets the preset scheduling optimization condition comprises:
scheduling execution paths of a plurality of first recovery verification jobs corresponding to the same database server source to obtain at least one first concurrent execution path;
scheduling execution paths of the first recovery verification jobs respectively corresponding to different database server sources to obtain at least one second concurrent execution path;
performing critical path scheduling on the first recovery verification job and the second recovery verification job according to a scheduling optimization strategy based on the pending execution path, the first concurrent execution path and the second concurrent execution path to obtain an optimal execution path meeting a preset scheduling optimization condition;
wherein the preset scheduling optimization condition comprises: on the condition that the recovery verification of the plurality of database backup files is completed within a preset recovery verification duration, reducing the time spent on, and the amount of resources consumed by, the recovery verification of the plurality of database backup files.
7. The method of any one of claims 1 to 3, wherein acquiring the pending execution path for performing serial recovery verification on the plurality of database backup files comprises:
acquiring priority configuration data corresponding to the plurality of database backup files; the priority configuration data comprises one or a combination of historical recovery verification information, database attributes, database backup modes, and predicted recovery verification duration of the database corresponding to each database backup file;
determining a recovery verification priority for serial recovery verification of the plurality of database backup files based on the priority configuration data;
and obtaining the pending execution path for the recovery verification of the plurality of database backup files according to the recovery verification priority.
8. The method of claim 7, wherein the determining a recovery verification priority for serial recovery verification of the plurality of database backup files based on the priority configuration data comprises:
if the database backup modes comprise a full backup mode and an incremental backup mode, configuring the recovery verification priority of a database backup file obtained in the full backup mode to be higher than the recovery verification priority of a database backup file obtained in the incremental backup mode;
and/or,
determining, based on the historical recovery verification information, that the corresponding database backup file did not complete, or did not successfully complete, recovery verification within the immediately preceding preset recovery verification duration, and raising the recovery verification priority of the database backup file.
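The two priority rules of claim 8 can be combined into a pending execution path with a simple score, as in the sketch below. The dictionary keys (`name`, `mode`, `last_verified_ok`) and the numeric weights are hypothetical; the claim only states that full backups rank above incremental ones and that files which failed the previous verification window are raised in priority, not how the two rules are weighed against each other.

```python
def recovery_priority(backup):
    """Higher score means earlier recovery verification.

    Assumed weights: +1 for a full backup, +2 if the file did not
    complete verification successfully in the previous window.
    """
    score = 0
    if backup["mode"] == "full":
        score += 1
    if not backup["last_verified_ok"]:
        score += 2
    return score


def pending_execution_path(backups):
    """Serial order of backup names, highest priority first.

    Python's sort is stable, so files with equal scores keep
    their original relative order.
    """
    ordered = sorted(backups, key=recovery_priority, reverse=True)
    return [backup["name"] for backup in ordered]
```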
9. A data recovery processing apparatus, the apparatus comprising:
an information acquisition module, configured to acquire a plurality of database backup files to be recovery-verified, and a pending execution path for performing serial recovery verification on the plurality of database backup files;
a recovery verification job determining module, configured to determine at least one first recovery verification job and a second recovery verification job for performing recovery verification on the plurality of database backup files; the first recovery verification job can be executed concurrently; the second recovery verification job cannot be executed concurrently;
an optimal execution path obtaining module, configured to perform critical path optimization on the first recovery verification job and the second recovery verification job based on the pending execution path and a preset scheduling optimization condition, and obtain an optimal execution path that meets the preset scheduling optimization condition;
and a recovery verification processing module, configured to control a recovery verification server to execute the at least one first recovery verification job and the second recovery verification job according to the optimal execution path.
10. A data recovery processing system, the system comprising: a database server, a backup server, a recovery validation server, and a scheduling device comprising at least one communication interface, at least one memory, and at least one processor, wherein:
the memory is used for storing a program for implementing the data recovery processing method according to any one of claims 1 to 8;
the processor is used for loading and executing the program stored in the memory to implement the data recovery processing method according to any one of claims 1 to 8.
CN202210197321.XA 2022-03-01 2022-03-01 Data recovery processing method, device and system Pending CN114546730A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210197321.XA CN114546730A (en) 2022-03-01 2022-03-01 Data recovery processing method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210197321.XA CN114546730A (en) 2022-03-01 2022-03-01 Data recovery processing method, device and system

Publications (1)

Publication Number Publication Date
CN114546730A true CN114546730A (en) 2022-05-27

Family

ID=81661113

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210197321.XA Pending CN114546730A (en) 2022-03-01 2022-03-01 Data recovery processing method, device and system

Country Status (1)

Country Link
CN (1) CN114546730A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117931830A (en) * 2024-03-22 2024-04-26 平凯星辰(北京)科技有限公司 Data recovery method, device, electronic equipment, storage medium and program product

Similar Documents

Publication Publication Date Title
US10248671B2 (en) Dynamic migration script management
US10860441B2 (en) Method and system for data backup and restoration in cluster system
CN108632365B (en) Service resource adjusting method, related device and equipment
US8943353B2 (en) Assigning nodes to jobs based on reliability factors
US8091087B2 (en) Scheduling of new job within a start time range based on calculated current load and predicted load value of the new job on media resources
US9477460B2 (en) Non-transitory computer-readable storage medium for selective application of update programs dependent upon a load of a virtual machine and related apparatus and method
US20140040573A1 (en) Determining a number of storage devices to backup objects in view of quality of service considerations
CN111143133B (en) Virtual machine backup method and backup virtual machine recovery method
KR20120040707A (en) Fault tolerant batch processing
CN109634730A (en) Method for scheduling task, device, computer equipment and storage medium
CN111190753A (en) Distributed task processing method and device, storage medium and computer equipment
Faragardi et al. Optimal task allocation for maximizing reliability in distributed real-time systems
JP2010231694A (en) System, method and program for supporting job schedule change
CN114637511A (en) Code testing system, method, device, electronic equipment and readable storage medium
Garg et al. Fault tolerant task scheduling on computational grid using checkpointing under transient faults
CN114546730A (en) Data recovery processing method, device and system
CN115756783A (en) Cross-subsystem space task dependent scheduling method and system
CN111666138A (en) Timed task processing method, device and system, computer equipment and storage medium
CN102541542B (en) The content of storage and issue content storage apparatus
CN116483546B (en) Distributed training task scheduling method, device, equipment and storage medium
CN105827744A (en) Data processing method of cloud storage platform
CN113342893A (en) Node synchronization method and device based on block chain, storage medium and server
US9639636B1 (en) Algorithmically driven selection of parallelization technique for running model simulation
US7607132B2 (en) Process scheduling system and method
KR102177440B1 (en) Method and Device for Processing Big Data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination