WO2023005075A1 - Disaster recovery method and system for data, terminal device and computer storage medium - Google Patents

Disaster recovery method and system for data, terminal device and computer storage medium Download PDF

Info

Publication number
WO2023005075A1
WO2023005075A1 · PCT/CN2021/132314 · CN2021132314W
Authority
WO
WIPO (PCT)
Prior art keywords
data
disaster recovery
task
database
preset
Prior art date
Application number
PCT/CN2021/132314
Other languages
French (fr)
Chinese (zh)
Inventor
周可
崖飞虎
范筝
乔一航
邸帅
卢道和
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2023005075A1 publication Critical patent/WO2023005075A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1464Management of the backup or restore process for networked environments

Definitions

  • the present application relates to the technical field of financial technology (Fintech), and in particular to a data disaster recovery method, system, terminal equipment, and computer storage medium.
  • the primary and backup clusters run in two different computer rooms, each running an independent account system and using an independent operations management and control system.
  • the big data remote disaster recovery solution considers data disaster recovery only on the offline side; the basic components involved are mainly Hadoop (Apache Hadoop, an open-source software framework that supports data-intensive distributed applications, released under the Apache 2.0 license), Hive (Apache Hive, a data warehouse tool built on Hadoop), and the big data platform job scheduling system.
  • the existing disaster recovery strategy for big data clusters is to use cross-computer-room data synchronization tools to synchronize the main cluster's daily changed data to the disaster recovery cluster, so that the system can switch to the disaster recovery cluster when the main cluster is unavailable.
  • in the existing big data cluster disaster recovery solution, after switching to the disaster recovery environment, the entire process of importing, processing, and exporting business data to the business system must be rerun in the disaster recovery environment to complete the disaster recovery switchover; this makes the switchover time-consuming, so it cannot be completed quickly and efficiently.
  • the main purpose of this application is to provide a data disaster recovery method, system, terminal device, and computer storage medium, aiming to realize rapid and fine-grained disaster recovery switching when the main cluster can no longer provide services, thereby improving disaster recovery efficiency.
  • the present application provides a data disaster recovery method, the data disaster recovery method is applied to data disaster recovery equipment, the data disaster recovery method includes:
  • detecting the synchronization state of the task parameters, so as to determine, according to the synchronization state, the target node to be re-executed among the task nodes, and triggering a disaster recovery mechanism to execute the target node.
  • the present application also provides a data disaster recovery system
  • the data disaster recovery system includes:
  • a connection module configured to establish a communication connection with the disaster recovery database of the preset main cluster
  • a workflow reading module configured to read the workflow executed by the preset main cluster through the communication connection
  • an acquisition module configured to acquire task parameters of each task node in the workflow according to a preset relationship chain model, wherein the relationship chain model is constructed based on the lineage relationships between data and data processing tasks;
  • the recovery module is configured to detect the synchronization state of the task parameters, determine the target node to be re-executed among the task nodes according to the synchronization state, and trigger a disaster recovery mechanism to execute the target node.
  • each functional module of the data disaster recovery system of the present application implements, at runtime, the steps of the above data disaster recovery method.
  • the present application also provides a terminal device, which includes: a memory, a processor, and a data disaster recovery program stored in the memory and runnable on the processor; when the data disaster recovery program is executed by the processor, the steps of the above data disaster recovery method are implemented.
  • the present application also provides a computer storage medium, on which a data disaster recovery program is stored; when the data disaster recovery program is executed by a processor, the steps of the above data disaster recovery method are implemented.
  • the present application also provides a computer program product, the computer program product includes a computer program, and when the computer program is executed by a processor, the steps of the above-mentioned data disaster recovery method are implemented.
  • this application provides a data disaster recovery method, system, terminal device, computer storage medium, and computer program product. Through the method, the data disaster recovery device establishes a communication connection with the disaster recovery database of the preset main cluster; reads, through the communication connection, the workflow being executed by the preset main cluster; obtains the task parameters of each task node in the workflow according to a preset relationship chain model, where the relationship chain model is constructed based on the lineage relationships between data and data processing tasks; and detects the synchronization state of the task parameters, so as to determine, according to the synchronization state, the target node to be re-executed among the task nodes, and triggers the disaster recovery mechanism to execute the target node.
  • when a disaster occurs in the main cluster and it cannot continue to provide services, a disaster recovery switchover is required and the disaster recovery cluster replaces the main cluster. The data disaster recovery device under the disaster recovery cluster establishes a communication connection with the disaster recovery database of the main cluster, and through that connection reads the workflow the preset main cluster was executing when the disaster occurred.
  • using the relationship chain model constructed from the lineage relationships between data and data processing tasks, the device obtains the task parameters of each task node in the workflow; finally, it detects the synchronization status of each node's task parameters, determines from that status the target nodes to be re-executed, and, once they are determined, triggers the disaster recovery mechanism to re-execute them.
  • this application uses a complete relationship chain model constructed in advance from the lineage relationships between data and data processing tasks, combined with the synchronization status of the task parameters of the task nodes in the workflow.
  • the disaster recovery operation of switching to the disaster recovery cluster when a disaster occurs in the main cluster does not need to rerun all business data tasks in the disaster recovery environment; only the task nodes identified for re-execution, determined from the relationship chain model combined with the synchronization state, are rerun. This achieves rapid disaster recovery switchover and fast recovery of the nodes to be re-executed, realizing fast and fine-grained disaster recovery switching and thereby improving disaster recovery efficiency.
  • FIG. 1 is a schematic diagram of the device structure of the terminal-device hardware operating environment involved in the solution of an embodiment of the present application;
  • FIG. 2 is a schematic flow chart of an embodiment of the data disaster recovery method of the present application;
  • FIG. 3 shows the lineage data acquisition and processing flow involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 4 shows the first lineage relationship, between data processing execution tasks and data, involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 5 shows the second lineage relationship, between data processing execution tasks and data processing tasks, involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 6 is an example data processing workflow involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 7 shows the processing flow of the second lineage relationship involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 8 shows the relationship between data processing tasks and task execution IDs involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 9 shows the lineage relationship between data and data processing tasks involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 10 shows the data synchronization flow involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 11 shows the disaster recovery processing flow involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 12 is a schematic diagram of a disaster recovery scenario involved in an embodiment of the data disaster recovery method of the present application;
  • FIG. 13 is a schematic diagram of the functional modules of an embodiment of the data disaster recovery system of the present application.
  • FIG. 1 is a schematic diagram of a device structure of a hardware operating environment of a terminal device involved in the solution of an embodiment of the present application.
  • the terminal device in the embodiment of this application may be a data disaster recovery device configured under a disaster recovery cluster to perform disaster recovery when a disaster occurs in the main cluster and it cannot continue to provide services.
  • the data disaster recovery device may be a smartphone, a PC (Personal Computer), a tablet computer, a portable computer, and so on.
  • the terminal device may include: a processor 1001 , such as a CPU, a communication bus 1002 , a user interface 1003 , a network interface 1004 , and a memory 1005 .
  • the communication bus 1002 is used to realize connection and communication between these components.
  • the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include standard wired and wireless interfaces.
  • the network interface 1004 may include a standard wired interface and a wireless interface (such as a Wi-Fi interface).
  • the memory 1005 can be a high-speed RAM memory, or a non-volatile memory, such as disk storage.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
  • the structure of the terminal device shown in FIG. 1 does not constitute a limitation on the terminal device; it may include more or fewer components than shown, combine some components, or use a different component arrangement.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a data disaster recovery program.
  • the network interface 1004 is mainly used to connect to the background server and perform data communication with the background server;
  • the user interface 1003 is mainly used to connect to the client and perform data communication with the client; and the processor 1001 can be used to invoke the data disaster recovery program stored in the memory 1005 and perform the operations described in the following embodiments of the data disaster recovery method of this application.
  • the active and standby clusters (or called the main cluster and the disaster recovery cluster) run in two different computer rooms respectively, and each runs an independent account system, and uses an independent operation and maintenance management and control system.
  • in terms of cluster delivery, the primary and standby clusters are delivered separately.
  • the typical data processing flow of the big data platform is as follows:
  • Apache Sqoop is an open-source tool, mainly used for data transfer between Hadoop (Hive) and traditional databases (MySQL, PostgreSQL, Oracle, etc.)
  • the big data cluster disaster recovery strategy is to synchronize the daily changing data of the main cluster to the disaster recovery cluster through cross-computer room data synchronization tools, and switch to the disaster recovery cluster when the main cluster is unavailable.
  • in the existing big data cluster disaster recovery solution, after switching to the disaster recovery environment, the entire process of importing, processing, and exporting business data to the business system must be rerun, which takes a long time, so the disaster recovery switchover cannot be completed quickly and efficiently.
  • FIG. 2 is a schematic flow chart of the first embodiment of the data disaster recovery and recovery method of the present application.
  • the method is applied to data disaster recovery equipment configured for disaster recovery (for convenience, referred to below as the disaster recovery device); the data disaster recovery method of this application includes:
  • Step S10 establishing a communication connection with the disaster recovery database of the preset main cluster
  • the disaster recovery device first establishes a communication connection with the disaster recovery database of the preset primary cluster where a disaster occurs.
  • the preset main cluster is the cluster hosting the big data platform that is carrying out the data extraction, data processing, and data export procedures in the big data remote disaster recovery scenario.
  • the disaster recovery database of the preset main cluster is an off-site backup of the database of the scheduling system of the preset main cluster.
  • step S10 may include:
  • Step S101: when the service of the preset main cluster is unavailable, establish a communication connection with the disaster recovery database of the preset main cluster.
  • the disaster recovery process performed by the disaster recovery device takes place when a disaster occurs in the preset main cluster and it can no longer provide data processing services.
  • when a disaster occurs in the preset main cluster currently performing data processing, the disaster recovery device immediately establishes a communication connection with that cluster's disaster recovery database.
  • the preset main cluster currently performing data processing is in IDC1
  • the standby disaster recovery cluster is in IDC2 in a different place.
  • a disaster occurs in the preset main cluster in IDC1, making the services it provides unavailable (that is, the data processing procedure cannot be completed), or leaving it entirely unable to continue providing services.
  • the disaster recovery device in the disaster recovery cluster in IDC2, at a different site, then establishes a communication connection with the disaster recovery database of the preset main cluster.
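As a sketch, the connection step (Step S10) might look like the following; the function name, the retry policy, and the use of sqlite3 in place of the replicated scheduler database are illustrative assumptions, not details from the patent:

```python
import sqlite3
import time

def connect_to_dr_database(dsn, retries=3, delay=0.1):
    """Open a connection to the disaster recovery (DR) database.

    In the patent's scenario this would target the off-site replica of the
    main cluster's scheduler database in IDC2; sqlite3 stands in here so
    the sketch stays runnable.
    """
    last_err = None
    for _ in range(retries):
        try:
            return sqlite3.connect(dsn)
        except sqlite3.Error as err:  # retry transient connection failures
            last_err = err
            time.sleep(delay)
    raise ConnectionError(f"DR database unreachable: {last_err}")

conn = connect_to_dr_database(":memory:")
```

A real deployment would use the driver for the replica's actual DBMS; the retry loop reflects the fact that the switchover happens while the primary site is failing.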
  • Step S20 reading the workflow executed by the preset main cluster through the communication connection
  • after the disaster recovery device establishes a communication connection with the disaster recovery database, it immediately reads, over that connection, the workflow the preset main cluster was executing when the disaster occurred.
  • the workflow being executed by the main scheduling system (Scheduler) of the main cluster is queried from the disaster recovery database through the communication connection, and the query result is returned in the form of a list.
  • the preset main cluster may be executing one or more workflows when the disaster occurs, or it may not be executing any data processing at that moment, in which case there is no in-progress workflow. Therefore, when the disaster recovery device queries for workflows in the "executing" state and returns the result as a list, the number of such workflows in the list may be 0 or N, where N is greater than or equal to 1.
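Reading the in-progress workflows as a list of 0..N entries can be sketched as below; the table and column names are assumptions (the patent does not specify the scheduler's schema), and sqlite3 again stands in for the replicated scheduler database:

```python
import sqlite3

def running_workflows(conn):
    """Return the workflows the main scheduler was executing at disaster
    time, as a (possibly empty) list."""
    rows = conn.execute(
        "SELECT workflow_id FROM workflow WHERE status = 'RUNNING'"
    ).fetchall()
    return [r[0] for r in rows]

# Stand-in for the replicated scheduler database, for runnability.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workflow (workflow_id TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO workflow VALUES (?, ?)",
    [("wf_daily_etl", "RUNNING"), ("wf_report", "FINISHED")],
)
print(running_workflows(conn))  # ['wf_daily_etl']
```

An empty list simply means no workflow was in flight, in which case (as the text notes later) no task node needs re-execution.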
  • Step S30: obtaining task parameters of each task node in the workflow according to a preset relationship chain model, wherein the relationship chain model is constructed based on the lineage relationships between data and data processing tasks;
  • after the disaster recovery device reads the workflow being executed by the preset main cluster at disaster time, it uses the relationship chain model, constructed from the lineage relationships between data and data processing tasks, to obtain the task parameters of each task node in the workflow.
  • the disaster recovery device may also construct the relationship chain model from the lineage relationships between data and data processing tasks before a disaster occurs in the main cluster. That way, when a disaster does occur in the preset main cluster, the device can directly load the model to obtain the task parameters of each task node in the workflow the preset main cluster was executing at disaster time.
  • specifically, the disaster recovery device constructs in advance, from the lineage relationships between data and data processing tasks shown in Figure 9, a relationship chain model used to determine the task nodes that need to be re-executed during the disaster recovery switchover. The device then uses this model to determine all the task nodes of each workflow and the task parameters of each task node.
  • the task parameters include the input data and output data of the task node; the above step S30 may include:
  • Step S301 determining each task node of the workflow
  • after the disaster recovery device acquires the workflows being executed by the preset main cluster at disaster time, it determines all the task nodes of the one or more workflows.
  • for example, if the list the disaster recovery device obtains by querying the disaster recovery database of the preset main cluster contains only one workflow, the device further determines all task nodes in that workflow using a standard breadth-first traversal.
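The breadth-first enumeration of a workflow's task nodes can be sketched as follows; the DAG shape and node names are hypothetical:

```python
from collections import deque

def all_task_nodes(dag, roots):
    """Breadth-first traversal of a workflow DAG: `dag` maps a task node to
    its downstream nodes; returns every node reachable from the roots, in
    BFS order."""
    seen, queue = set(roots), deque(roots)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in dag.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

# Hypothetical workflow: Job1 feeds Job2 and Job3; Job2 also feeds Job3.
dag = {"Job1": ["Job2", "Job3"], "Job2": ["Job3"], "Job3": []}
print(all_task_nodes(dag, ["Job1"]))  # ['Job1', 'Job2', 'Job3']
```

The `seen` set guards against visiting a node twice when it has multiple upstream dependencies, which is common in data processing DAGs.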
  • Step S302 constructing a query statement according to each of the task nodes, and indexing the respective input data and output data of each of the task nodes from the relationship chain model according to the query statement.
  • based on the determined task nodes, the disaster recovery device constructs corresponding query statements and uses them to index, from the relationship chain model constructed from the lineage relationships between data and data processing tasks, the input data and output data of each task node.
  • the relationship chain model constructed from the lineage relationships between data and data processing tasks is the lineage map between data and data processing tasks shown in Figure 9, and the model can be stored in the graph database configured under the disaster recovery cluster. Therefore, the query statement the disaster recovery device constructs from a task node can be a graph data query statement.
  • for example, the disaster recovery device (FindGap) obtains, from the disaster recovery database of the preset main cluster where the disaster occurred, the list of workflows that the main cluster's scheduling system (Scheduler) was executing, and gathers all task nodes of each workflow breadth-first. It then calls a preset query template for indexing relational data in the lineage graph, such as an SQL (Structured Query Language) statement, and fills each task node in turn into the template as the input condition, thereby constructing a graph data query statement for that node. The device immediately executes the statement against the relationship chain model stored in the graph database (Graph DB), which records the lineage between data processing tasks and data, and parses out the direct input data and output data of each task node.
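The per-node lookup can be sketched as below; the patent stores the lineage in a graph database, but an edge table with the same information is used here (task, direction, table), so the names and schema are assumptions:

```python
import sqlite3

# Stand-in lineage store: in the patent this is a Graph DB holding the
# relationship chain model; a plain edge table keeps the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lineage (task TEXT, direction TEXT, tbl TEXT)")
conn.executemany("INSERT INTO lineage VALUES (?, ?, ?)", [
    ("Job1", "input", "Table1"), ("Job1", "output", "Table2"),
    ("Job2", "input", "Table2"), ("Job2", "output", "Table5"),
])

def io_tables(conn, task_node):
    """Fill the task node into a query template and index its direct
    input and output data from the lineage model."""
    rows = conn.execute(
        "SELECT direction, tbl FROM lineage WHERE task = ?", (task_node,)
    )
    io = {"input": [], "output": []}
    for direction, tbl in rows:
        io[direction].append(tbl)
    return io

print(io_tables(conn, "Job1"))  # {'input': ['Table1'], 'output': ['Table2']}
```

The parameterized `?` placeholder mirrors the patent's idea of a query template into which each task node is substituted in turn.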
  • Step S40: detecting the synchronization state of the task parameters, so as to determine, according to the synchronization state, the target node to be re-executed among the task nodes, and triggering the disaster recovery mechanism to execute the target node.
  • after the disaster recovery device obtains the task parameters of each task node in the workflow being executed by the preset main cluster at disaster time, it further detects the synchronization status of those task parameters, determines from the detected status the target nodes that must be re-executed to complete the disaster recovery switchover, and finally triggers the preset disaster recovery mechanism to re-execute them.
  • based on the lineage between data processing tasks and data recorded in the relationship chain model stored in the graph database (Graph DB), the disaster recovery device (FindGap) parses out each task node's direct input and output data, then calls the synchronization service configured under the disaster recovery cluster to check how far that data has been synchronized (see the disaster recovery processing flow in Figure 11).
  • FindGap then traverses the DAG (Directed Acyclic Graph) of the workflow each task node belongs to, again breadth-first (see the example workflow in Figure 6), and from the synchronization status of each node's input and output data determines the target nodes that must be re-executed among all task nodes of the workflow. It then triggers the preset disaster recovery mechanism so that the scheduling system under the disaster recovery cluster, Scheduler (Backup), schedules the target nodes for re-execution and feeds the status results back to the disaster recovery device after re-execution.
  • based on the status results, the disaster recovery device determines that the disaster recovery switchover is complete.
  • if, based on the synchronization status, the disaster recovery device determines that none of the task nodes in the workflow needs to be re-executed, it does not need to trigger the disaster recovery mechanism, and the switchover completes directly.
  • for example, suppose the input and output data of task nodes Job1, Job2, and Job3 are Table1, Table2, Table3, Table4, Table5, and Table6, and the disaster recovery device detects that Table1, Table2, Table3, and Table4 have completed disaster recovery synchronization and passed consistency verification. The device then determines that the only target nodes in the workflow that need to be re-executed in the disaster recovery cluster are Job2 and Job3, so it triggers the disaster recovery mechanism to have the scheduling system schedule only Job2 and Job3 for re-execution, speeding up disaster recovery.
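The selection rule can be sketched as follows. The assignment of Table1..Table6 to the three jobs is an assumption (the source does not say which tables belong to which job), chosen so the sketch reproduces the example's outcome:

```python
def nodes_to_rerun(job_io, synced):
    """A node can be skipped only if all of its input AND output tables have
    already reached the disaster recovery cluster and passed verification;
    otherwise it is a target node that must be re-executed."""
    return [
        job for job, io in job_io.items()
        if not set(io["input"] + io["output"]) <= synced
    ]

# Hypothetical mapping mirroring the Job1..Job3 / Table1..Table6 example:
# Table1-4 are synchronized and verified, Table5-6 are not.
job_io = {
    "Job1": {"input": ["Table1"], "output": ["Table2"]},
    "Job2": {"input": ["Table3"], "output": ["Table5"]},
    "Job3": {"input": ["Table4", "Table5"], "output": ["Table6"]},
}
synced = {"Table1", "Table2", "Table3", "Table4"}
print(nodes_to_rerun(job_io, synced))  # ['Job2', 'Job3']
```

Requiring the outputs (not just inputs) to be synchronized is what lets already-completed work like Job1 be skipped entirely.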
  • in this embodiment, the disaster recovery device first establishes a communication connection with the disaster recovery database of the preset main cluster where the disaster occurred; based on that connection, it immediately reads the workflow the main cluster was executing at disaster time; using the relationship chain model constructed from the lineage relationships between data and data processing tasks, it further obtains the task parameters of each task node in that workflow; finally, it detects the synchronization status of each node's task parameters, determines from the detected status the target nodes that must be re-executed to complete the disaster recovery switchover, and triggers the preset disaster recovery mechanism to re-execute them.
  • this application uses a complete relationship chain model constructed in advance from the lineage relationships between data and data processing tasks, combined with the synchronization status of the task parameters of the task nodes in the workflow.
  • the disaster recovery operation of switching to the disaster recovery cluster when a disaster occurs in the main cluster can thus achieve rapid switchover and fast recovery of the task nodes to be re-executed, realizing fast and fine-grained disaster recovery switching and thereby improving disaster recovery efficiency.
  • the data disaster recovery method of this application may further include:
  • Step S50: constructing a relationship chain model based on the lineage relationships between data and data processing tasks.
  • before establishing a communication connection with the disaster recovery database of the preset main cluster where a disaster occurs, the disaster recovery device builds a relationship chain model based on the data processing tasks being executed by the preset main cluster and the lineage relationships between the data.
  • step S50 may include:
  • Step S501: acquiring lineage data from the preset main cluster to establish a first lineage relationship between data processing tasks and data;
  • in the process of building the relationship chain model, the disaster recovery device first obtains lineage data from the preset main cluster that is executing data processing tasks, to establish the first lineage relationship between the data processing tasks executed by the main cluster's scheduling system and the data.
  • specifically, a lineage acquisition hook configured under the preset main cluster (a hook being a mechanism that intercepts events as they occur) parses lineage data from the data components (such as relational databases) and writes that data, as lineage logs, into the file system of the data integration tool. The scheduling system of the preset main cluster then periodically triggers the lineage data integration task, so that the data integration tool reads the lineage logs from the file system to obtain the lineage data and writes it to the big data platform (Hive or Spark).
  • the scheduling system of the preset main cluster periodically triggers the data processing task, so that the big data platform (Hive or Spark) processes and integrates the written lineage data to form the first lineage relationship between data processing tasks and data, as shown in Figure 4, and writes the first lineage relationship into the graph database system as lineage graph data.
  • the graph database system actively reports the write status of the lineage graph data to the scheduling system of the preset main cluster, so that the scheduling system can confirm that the first lineage data has been constructed.
  • the lineage acquisition hook under the preset main cluster implements a corresponding Lineage Hook mechanism for each data system and data transmission tool. Each time a data system executes a SQL statement, the hook captures the raw lineage data, encapsulates it into a lineage log, and writes it to the log system of the data integration tool.
• a Hive Lineage Hook is used for the Hive data system (asynchronously capturing the SQL statements executed by Hive and calling a self-implemented Hive execution-behavior analysis API to obtain each statement's input data information, output data information, and associated task information), a Spark-SQL Lineage Hook is used for the Spark data system (asynchronously obtaining the SQL statements executed by Spark-SQL and calling a self-implemented Spark SQL execution-behavior analysis API to obtain each statement's input data information, output data information, and associated task information), and a Sqoop Lineage Hook (asynchronously capturing Sqoop commands and analyzing their parameters to obtain each command's input data, output data, and associated task information) is used to capture lineage data.
• the Lineage Hooks corresponding to Hive and Spark-SQL are used to obtain the lineage between data tables inside the big data platform;
• the Sqoop Lineage Hook is used to capture the lineage between tables on the big data platform and tables in traditional relational databases.
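As a rough illustration of what such a Lineage Hook captures, the sketch below parses a SQL statement with simple regular expressions to recover input and output tables and packages them into a lineage-log record. A real hook would read the engine's own execution plan (as the Hive and Spark-SQL hooks above do via their analysis APIs); the function and field names here are assumptions for illustration, not part of the original design.

```python
import json
import re
import time

def extract_lineage(sql: str, job_id: str) -> dict:
    """Very rough lineage extraction from one SQL statement.

    A real Lineage Hook reads the engine's execution plan instead of using
    regexes; this sketch only shows the shape of the captured record.
    """
    outputs = re.findall(r"insert\s+(?:overwrite|into)\s+table\s+(\w+)", sql, re.I)
    inputs = re.findall(r"(?:from|join)\s+(\w+)", sql, re.I)
    return {
        "job_id": job_id,
        "inputs": sorted(set(inputs) - set(outputs)),
        "outputs": sorted(set(outputs)),
        "captured_at": int(time.time()),
    }

def write_lineage_log(record: dict, log_path: str) -> None:
    # Append one JSON line per executed statement, as the hooks do when
    # writing to the data integration tool's log system.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

record = extract_lineage(
    "INSERT OVERWRITE TABLE table4 SELECT a.k FROM table1 a JOIN table2 b ON a.k = b.k",
    job_id="job1",
)
```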
• Step S502: parsing the JSON (JavaScript Object Notation) file to establish a second lineage relationship between the data processing execution task and the data processing task;
• after the disaster recovery device establishes the first lineage relationship between the data and the data processing execution tasks executed by the scheduling system of the preset main cluster, it further establishes the second lineage relationship between the data processing execution task and the data processing task by parsing the JSON file.
• the disaster recovery device reads the workflow JSON file of each data processing task in the scheduling system under the preset main cluster through a preset task lineage analysis program, and then parses the JSON file to obtain the second lineage relationship, shown in Figure 5, between the data processing execution task (Executed Job) and the data processing task.
• before the disaster recovery device parses the workflow JSON file in the scheduling system through the task lineage analysis program, the big data task scheduling system first reads the task-relationship JSON file and the task execution records from the data integration tool, and the data integration tool writes the lineage data obtained by parsing them directly to the big data platform (Hive or Spark). Then, the scheduling system of the preset main cluster periodically triggers the data processing task, so that the big data platform processes and integrates the written lineage data to form the second lineage relationship, shown in Figure 5, between the data processing execution task (Executed Job) and the data processing task, and writes it into the graph database system as lineage graph data. Finally, the graph database system actively reports the write status of the lineage graph data to the scheduling system of the preset main cluster, so that the scheduling system can confirm that the construction of the second lineage relationship is complete.
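The second lineage relationship can be sketched as a parse of the workflow JSON. The JSON schema below is hypothetical (the text does not specify it), but it shows how (Executed Job, Job) edges might be extracted from the task-relationship file.

```python
import json

# Hypothetical workflow JSON; the patent does not specify the actual schema.
workflow_json = json.dumps({
    "workflow": "daily_etl",
    "nodes": [
        {"job": "job1", "executed_job": "job1_run_20210730"},
        {"job": "job2", "executed_job": "job2_run_20210730"},
    ],
})

def second_lineage(raw: str) -> list:
    """Return (executed_job, job) edges: the second lineage relationship
    between data processing execution tasks and data processing tasks."""
    doc = json.loads(raw)
    return [(n["executed_job"], n["job"]) for n in doc["nodes"]]

edges = second_lineage(workflow_json)
```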
• Step S503: fusing the first lineage relationship and the second lineage relationship to determine the lineage relationship between the data and the data processing tasks, so as to construct a relationship chain model.
• after the disaster recovery device establishes the first lineage relationship between the data processing execution task and the data, and the second lineage relationship between the data processing execution task and the data processing task, it fuses the two relationships to determine the lineage relationship between the data processing tasks and the data, so as to construct a relationship chain model.
• after the disaster recovery device constructs the first lineage relationship between the data processing execution task and the data, as shown in Figure 4, and the second lineage relationship between the data processing execution task and the data processing task, as shown in Figure 5, it periodically triggers the data fusion processing task through the scheduling system of the preset main cluster, and analyzes the respective relationship graphs of the first and second lineage relationships. The data processing execution task (Executed Job) in the first lineage relationship is replaced with its corresponding data processing task (Job), so that the two relationship graphs are fused into the lineage relationship between the data and the data processing tasks shown in Figure 9.
• the disaster recovery device builds a relationship chain model in the form of graph data based on the lineage relationship between the data and the data processing tasks.
• based on the relationship chain model, the disaster recovery device can determine the input data and output data of each task node in the workflow. As shown in Figure 9, the input data of task node Job1 is Table1 and Table2, and its output data is Table4. In this way, when performing a disaster recovery switchover, the scheduling system in the disaster recovery cluster can use the relationship chain model to determine from which task node the workflow should be rerun.
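A minimal stand-in for the fused relationship chain model might look as follows; the dict-based storage replaces the graph database purely for illustration, with node and table names taken from the Figure 9 example.

```python
# Minimal in-memory stand-in for the graph database that holds the fused
# relationship chain model. Node and table names follow the Figure 9 example;
# the dict-based storage is an assumption for illustration only.
relationship_chain = {
    "Job1": {"inputs": ["Table1", "Table2"], "outputs": ["Table4"]},
    "Job2": {"inputs": ["Table3"], "outputs": ["Table5"]},
    "Job3": {"inputs": ["Table4", "Table5"], "outputs": ["Table6"]},
}

def task_io(job: str):
    """Index a task node's input and output data from the model."""
    node = relationship_chain[job]
    return node["inputs"], node["outputs"]

inputs, outputs = task_io("Job1")
```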
• the disaster recovery device first obtains lineage data from the preset master cluster that is executing data processing tasks, to establish the first lineage relationship between the data and the data processing execution tasks executed by the scheduling system of the preset main cluster; after establishing the first lineage relationship, it further establishes the second lineage relationship between the data processing execution task and the data processing task by parsing the JSON file; having established both the first and the second lineage relationships, it fuses them to determine the lineage relationship between the data processing tasks and the data, thus constructing the relationship chain model.
• in the process of disaster recovery switching, the disaster recovery device can use the relationship chain model constructed in advance from the lineage relationships between data and data processing tasks, combined with the synchronization status of the task parameters of the task nodes in the workflow, to carry out the disaster recovery operation of switching to the disaster recovery cluster when a disaster occurs in the main cluster. This enables a fast switchover and fast recovery of the task nodes to be re-executed, achieving fast and fine-grained disaster recovery switching and thereby improving disaster recovery efficiency.
  • the data disaster recovery method of this application may further include:
  • Step S60 executing a preset data synchronization task to make the database of the preset primary cluster synchronize data to the disaster recovery database.
• before the disaster recovery device establishes a communication connection with the disaster recovery database of the preset primary cluster where a disaster occurs, it first performs a data synchronization task, so that while the database of the preset primary cluster provides services and performs data processing tasks, its data is synchronized to the disaster recovery database in the disaster recovery cluster for a subsequent rapid disaster recovery switchover.
• the data synchronization task is generated based on staff configuration to perform data synchronization management. It should be understood that, depending on the design requirements of the actual application, the configuration method and specific content of the data synchronization task may differ across feasible implementations; the data disaster recovery method of this application does not limit the specific content of the data synchronization task.
  • step S60 may include:
  • Step S601 receiving the data synchronization task, and reading the metadata to be synchronized pointed to by the data synchronization task from the database of the preset master cluster;
• in the process of synchronizing data from the database of the preset main cluster to the disaster recovery database, the disaster recovery device first receives the data synchronization task generated based on the staff configuration, and then analyzes the task to determine the metadata to be synchronized, which needs to be read from the database of the preset main cluster and synchronized to the disaster recovery database.
  • Step S602 executing the data synchronization task to pull the metadata to be synchronized into the disaster recovery database for storage;
• after the disaster recovery device determines that the metadata to be synchronized needs to be read from the database of the preset primary cluster and synchronized to the disaster recovery database, it executes the data synchronization task to pull the metadata to be synchronized into the corresponding storage path in the disaster recovery database.
  • step S602 may include:
  • Step S6021 obtaining the first storage path of the metadata to be synchronized in the database of the preset main cluster
• when the disaster recovery device synchronizes the metadata to be synchronized in the database of the preset primary cluster to the disaster recovery database, upon determining the metadata to be synchronized by parsing the data synchronization task, it also obtains the first storage path of that metadata in the database of the preset master cluster.
• specifically, the disaster recovery device can query the resource manager YARN (Yet Another Resource Negotiator, also known as Apache Hadoop YARN, the Hadoop resource manager) of the preset master cluster for the data information managed for the data to be synchronized (specifically, the data of a Hive table), such as data size, storage time, update time, and storage path, so as to obtain the first storage path of the metadata to be synchronized in the database of the preset primary cluster.
  • Step S6022 determining a second storage path corresponding to the first storage path in the disaster recovery database
• after the disaster recovery device obtains the first storage path of the metadata to be synchronized in the database of the preset primary cluster, it can determine, based on the first storage path, the corresponding second storage path in the disaster recovery database.
• specifically, the disaster recovery device can rely on a pre-built association between the database of the preset main cluster and the disaster recovery database for synchronizing data. The association can be established in advance as a relation table; using the first storage path as a key in this table, the device looks up and determines, in the disaster recovery database, the second storage path corresponding to the first storage path, under which the metadata to be synchronized will be stored.
• alternatively, after the disaster recovery device acquires the first storage path of the metadata to be synchronized in the database of the preset primary cluster, it can also, based on the current free storage space of the disaster recovery database, immediately generate a storage path, establish a correspondence between that path and the first storage path, and then use the generated path as the second storage path in the disaster recovery database for the metadata to be synchronized under the first storage path.
  • Step S6023 storing the metadata to be synchronized in the disaster recovery database according to the second storage path.
• after the disaster recovery device determines the second storage path corresponding to the first storage path in the disaster recovery database, it can pull the data stored under the first storage path in the database of the preset main cluster and store it under the second storage path in the disaster recovery database.
• specifically, after the disaster recovery device determines the second storage path corresponding to the first storage path in the disaster recovery database, it can first pull the metadata to be synchronized from the database of the preset primary cluster according to the first storage path, and then hand the metadata to the resource manager YARN of the disaster recovery database, so that YARN stores the metadata according to the second storage path.
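Steps S6021 to S6023 can be sketched as a path-mapping lookup plus copy. The relation table and the fallback path generation below are illustrative assumptions; in the text, the actual transfer is performed by a Distcp job submitted to YARN.

```python
# Relation table mapping first (primary cluster) storage paths to second
# (disaster recovery) storage paths; the concrete paths are illustrative.
path_relation = {
    "/idc1/warehouse/db1/table_a": "/idc2/warehouse/db1/table_a",
    "/idc1/warehouse/db1/table_b": "/idc2/warehouse/db1/table_b",
}

def second_path_for(first_path: str) -> str:
    """Look up the second storage path for a first storage path; if no
    mapping exists yet, generate one and record the correspondence,
    mirroring the on-the-fly alternative described in the text."""
    if first_path not in path_relation:
        path_relation[first_path] = first_path.replace("/idc1/", "/idc2/", 1)
    return path_relation[first_path]

dest = second_path_for("/idc1/warehouse/db1/table_c")
```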
  • Step S603 monitoring the execution status of the data synchronization task and performing consistency verification on the data stored in the database of the preset primary cluster and the disaster recovery database respectively.
• the disaster recovery device continuously monitors the execution status of the data synchronization task, so that when each data synchronization task is completed it can further verify the consistency of the data stored in the database of the preset main cluster and in the disaster recovery database, ensuring that the metadata to be synchronized is completely synchronized to the disaster recovery database.
• specifically, the disaster recovery device generates a data synchronization task from the configuration data entered by the staff. Then, based on the scheduling system (Transports), the disaster recovery device schedules the data synchronization task for execution, reading metadata via multiple threads from the database (MySQL) that stores metadata in the preset main cluster (IDC1), and writing the metadata read, as the metadata to be synchronized pointed to by the data synchronization task, into the disaster recovery database of the disaster recovery cluster (IDC2) that stores metadata.
• the disaster recovery device further submits the disaster recovery data synchronization task (a Distcp job) to YARN on the disaster recovery cluster (IDC2) based on the scheduling system (Transports), and executes the Distcp job as scheduled, so that the data of the Hive tables that need to be synchronized in the preset master cluster (IDC1) is pulled to the storage paths (HDFS directories) corresponding to those Hive tables in the disaster recovery cluster (IDC2).
• the disaster recovery device also monitors the execution status of the scheduled data synchronization task by polling through the scheduling system (Transports), and collects the synchronization statistics stored on the database src-HDFS of the preset primary cluster (IDC1) and on the disaster recovery database dest-HDFS in the disaster recovery cluster, so as to verify whether the data on the two sides, the preset primary cluster (IDC1) and the disaster recovery cluster (IDC2), are consistent.
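The consistency verification of step S603 can be sketched as a comparison of per-side storage statistics. The particular statistics compared here (file count and total bytes) are an assumption; a production check might also compare checksums.

```python
def collect_stats(listing: dict) -> tuple:
    """listing maps file path -> size in bytes; return (file count, total bytes)."""
    return len(listing), sum(listing.values())

def consistent(src: dict, dest: dict) -> bool:
    # The two sides are considered consistent when their collected
    # synchronization statistics match.
    return collect_stats(src) == collect_stats(dest)

src_hdfs = {"/t/part-0": 1024, "/t/part-1": 2048}    # src-HDFS, IDC1
dest_hdfs = {"/t/part-0": 1024, "/t/part-1": 2048}   # dest-HDFS, IDC2
ok = consistent(src_hdfs, dest_hdfs)
```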
• before the disaster recovery device executes the data synchronization task to make the database of the preset primary cluster synchronize data to the disaster recovery database, it also predefines the data disaster recovery synchronization rules. Therefore, when the disaster recovery device executes the data synchronization task based on the scheduling system to synchronize data between the preset primary cluster and the disaster recovery cluster, it does so according to the data disaster recovery synchronization rules.
• specifically, the disaster recovery device defines the clusters, databases, and data tables of the preset primary cluster and the disaster recovery cluster that need data synchronization, together with the time and strategy for data synchronization, to form the data disaster recovery synchronization rules.
  • the data disaster recovery synchronization rules defined by the disaster recovery device are shown in the following table:
  • the source cluster and the target cluster are respectively the preset primary cluster and disaster recovery cluster described in this embodiment.
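Since the rule table itself is not reproduced here, the record below is a hypothetical illustration of the fields the text describes: source and target cluster, database, table, synchronization time, strategy, and synchronization status. All field names and values are assumptions.

```python
# Hypothetical data disaster recovery synchronization rule record; the field
# names are assumptions based on the rule contents the text describes.
sync_rule = {
    "source_cluster": "IDC1",         # preset primary cluster
    "target_cluster": "IDC2",         # disaster recovery cluster
    "database": "db1",
    "table": "table_a",
    "sync_time": "02:00",
    "strategy": "incremental",        # e.g. incremental vs. full copy
    "sync_status": "unsynchronized",  # or "synchronization completed"
}

def is_synced(rule: dict) -> bool:
    """True once the rule's data has been fully synchronized."""
    return rule["sync_status"] == "synchronization completed"
```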
• in this embodiment, the synchronization status includes: synchronization completed and unsynchronized; accordingly, "determining, according to the synchronization status, the target node to be re-executed among the task nodes" may include:
  • Step S401 if the synchronization state of the task parameters of the first task node among the task nodes is not synchronized, then determine that the first task node is the target node to be re-executed;
• after the disaster recovery device obtains the task parameters of each task node in the workflow being executed by the preset master cluster when the disaster occurs, it further detects whether the synchronization status of each task node's task parameters is synchronization completed or unsynchronized. Thus, when it detects that the synchronization status of the task parameters of a first task node is unsynchronized, it directly determines the first task node as a target node that needs to be rerun under the disaster recovery cluster.
  • Step S402 if the parent node of the first task node is a node to be re-executed, then determine that the first task node is the target node;
• while detecting whether the synchronization status of each task node's task parameters is synchronized, the disaster recovery device also detects whether the parent node of each task node has been determined to be a target node to be rerun in the disaster recovery cluster. In this way, when it detects that the parent node of the first task node is a target node to be re-executed, it directly determines the first task node as a target node to be rerun in the disaster recovery cluster.
• specifically, the disaster recovery device traverses, in a depth-first manner, each workflow being executed by the scheduling system of the acquired preset primary cluster. If, for the first task node in the current workflow, all parent nodes have been marked as not needing a rerun (disable execute), and the synchronization status of the first task node's task parameters (input data and output data) is synchronized, the disaster recovery device determines that the first task node is not a target node that needs to be rescheduled for execution, and marks it as not needing a rerun (disable execute).
• conversely, if during the traversal a parent node of the first task node of the current workflow is marked for re-execution (enable execute), or the synchronization status of the first task node's task parameters (input data and output data) is unsynchronized, the disaster recovery device directly determines that the first task node is a target node that needs to be rescheduled for execution, and marks it for re-execution (enable execute).
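The depth-first rerun decision above can be sketched as follows: the workflow is treated as a DAG, and a node is marked for re-execution (enable execute) if any of its task parameters is unsynchronized or any parent is itself marked for rerun. The data structures are assumptions for illustration.

```python
from typing import Dict, List

def mark_rerun(parents: Dict[str, List[str]],
               synchronized: Dict[str, bool]) -> Dict[str, bool]:
    """Depth-first marking: a node gets enable execute (True) if its task
    parameters are unsynchronized, or if any parent must itself be rerun.
    Assumes the workflow is a DAG (no cycles)."""
    enable: Dict[str, bool] = {}

    def visit(node: str) -> bool:
        if node not in enable:
            parent_rerun = any(visit(p) for p in parents.get(node, []))
            enable[node] = parent_rerun or not synchronized[node]
        return enable[node]

    for node in synchronized:
        visit(node)
    return enable

# Job1's data is unsynchronized, so Job1 and its child Job3 must be rerun;
# Job2 is fully synchronized with no rerun parents, so it is skipped.
flags = mark_rerun(
    parents={"Job3": ["Job1", "Job2"]},
    synchronized={"Job1": False, "Job2": True, "Job3": True},
)
```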
  • the embodiment of the present application provides a method for data disaster recovery and recovery.
• before establishing a communication connection with the disaster recovery database of the preset primary cluster where a disaster occurs, the disaster recovery device first executes a data synchronization task, so that while the preset primary cluster provides services and performs data processing tasks, its data is synchronized to the disaster recovery database in the disaster recovery cluster for a subsequent rapid disaster recovery switchover.
• in the process of synchronizing data from the database of the preset main cluster to the disaster recovery database, the disaster recovery device first receives the data synchronization task generated based on the staff configuration, and then analyzes the task to determine the metadata to be synchronized, which needs to be read from the database of the preset main cluster and synchronized to the disaster recovery database.
• once the disaster recovery device determines that the metadata to be synchronized needs to be read from the database of the preset primary cluster and synchronized to the disaster recovery database, it executes the data synchronization task to pull the metadata to be synchronized to the corresponding storage path in the disaster recovery database for storage.
• the disaster recovery device continuously monitors the execution status of the data synchronization task, so that when each data synchronization task is completed, it further performs consistency verification on the data stored in the database of the preset main cluster and in the disaster recovery database, ensuring that the metadata to be synchronized is completely synchronized to the disaster recovery database.
• in this way, in the process of disaster recovery switching, the disaster recovery device can use the relationship chain model constructed in advance from the lineage relationships between data and data processing tasks, combined with the synchronization status of the task parameters of the task nodes in the workflow, to carry out the disaster recovery operation of switching to the disaster recovery cluster, quickly performing the switchover and quickly recovering the task nodes to be re-executed, thereby achieving fast and fine-grained disaster recovery switching and improving disaster recovery efficiency.
  • FIG. 13 is a schematic diagram of functional modules of an embodiment of the data disaster recovery system of the present application.
  • the disaster recovery system for the application data includes:
  • connection module 10 is used to establish a communication connection with the disaster recovery database of the preset main cluster
  • a workflow reading module 20 configured to read the workflow executed by the preset main cluster through the communication connection;
• an acquisition module 30, configured to acquire task parameters of each task node in the workflow according to a preset relationship chain model, wherein the relationship chain model is constructed based on the lineage relationships between data and data processing tasks;
  • the recovery module 40 is configured to detect the synchronization state of the task parameters, determine the target node to be re-executed among the task nodes according to the synchronization state, and trigger a disaster recovery mechanism to execute the target node.
  • the disaster recovery system for data in this application also includes:
• a relationship chain building module, configured to build the relationship chain model based on the lineage relationships between data and data processing tasks.
• the relationship chain building module includes:
• a first construction unit, configured to acquire lineage data from the preset main cluster and establish the first lineage relationship between the data processing execution task and the data;
• a second construction unit, configured to parse the JSON file and establish the second lineage relationship between the data processing execution task and the data processing task;
• a third construction unit, configured to fuse the first lineage relationship and the second lineage relationship to determine the lineage relationship between the data and the data processing task, so as to construct a relationship chain model.
  • the task parameters include input data and output data of the task node
  • the acquisition module 30 includes:
  • a determining unit configured to determine each of the task nodes of the workflow
  • the acquisition unit is configured to respectively construct query statements according to each of the task nodes and index the respective input data and output data of each of the task nodes from the relationship chain model.
  • the disaster recovery system for data in this application also includes:
  • the data synchronization module is configured to execute a preset data synchronization task to make the database of the preset main cluster synchronize data to the disaster recovery database.
  • the data synchronization module includes:
  • a receiving unit configured to receive the data synchronization task, and read the metadata to be synchronized pointed to by the data synchronization task from the database of the preset master cluster;
  • a task execution unit configured to execute the data synchronization task to pull the metadata to be synchronized into the disaster recovery database for storage
• a verification unit, configured to monitor the execution status of the data synchronization task and perform consistency verification on the data stored in the database of the preset primary cluster and in the disaster recovery database.
  • the task execution unit includes:
  • a path obtaining subunit configured to obtain a first storage path of the metadata to be synchronized in the database of the preset primary cluster; and determine a second storage path corresponding to the first storage path in the disaster recovery database Storage path;
  • the data storage subunit is configured to store the metadata to be synchronized in the disaster recovery database according to the second storage path.
  • the synchronization status includes: synchronization completed and unsynchronized
  • the recovery module 40 includes:
• a first rerun node determining unit, configured to determine that a first task node among the task nodes is a target node to be re-executed if the synchronization status of the first task node's task parameters is unsynchronized;
• a second rerun node determining unit, configured to determine the first task node as a target node if the parent node of the first task node is a node to be re-executed.
• each module of the above data disaster recovery system corresponds to a step in the above embodiments of the data disaster recovery method, and their functions and implementation processes are not repeated here.
• the present application also provides a computer storage medium on which a data disaster recovery program is stored; when the data disaster recovery program is executed by a processor, the steps of the data disaster recovery method described in any one of the above embodiments are implemented.
• the present application also provides a computer program product; the computer program product includes a computer program, and when the computer program is executed by a processor, the steps of the data disaster recovery method described in any one of the above embodiments are implemented.
• the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
• the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions that enable a terminal device (which may be a mobile phone, computer, server, or network device, etc.) to execute the methods described in the various embodiments of the present application.


Abstract

The present application relates to the technical field of financial technology (fintech), and discloses a data disaster recovery method and system, a terminal device, and a computer storage medium. The data disaster recovery method comprises: establishing, by means of a data disaster recovery device, a communication connection with a disaster recovery database of a preset main cluster; reading, by means of the communication connection, the workflow executed by the preset main cluster; acquiring task parameters of the task nodes in the workflow according to a preset relationship chain model, the relationship chain model being constructed on the basis of the lineage relationships between data and data processing tasks; and detecting the synchronization status of the task parameters so as to determine, according to the synchronization status, the target node to be re-executed among the task nodes, and triggering a disaster recovery mechanism to execute the target node.

Description

数据的容灾恢复方法、系统、终端设备及计算机存储介质Data disaster recovery method, system, terminal equipment and computer storage medium
优先权信息priority information
本申请要求于2021年7月30日申请的、申请号为202110874019.9的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims priority to a Chinese patent application with application number 202110874019.9 filed on July 30, 2021, the entire contents of which are incorporated herein by reference.
技术领域technical field
本申请涉及金融科技(Fintech)技术领域,尤其涉及一种数据的容灾恢复方法、系统、终端设备以及计算机存储介质。The present application relates to the technical field of financial technology (Fintech), and in particular to a data disaster recovery method, system, terminal equipment, and computer storage medium.
背景技术Background technique
With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology. However, the security, real-time performance, and stability requirements of the financial industry also place higher demands on these technologies.
At present, in the scenario of remote disaster recovery for big data, the main cluster and the backup cluster run in two different equipment rooms, each running an independent account system and using an independent operation and maintenance management system. Current remote disaster recovery solutions for big data only consider data disaster recovery on the offline side, and the basic components involved are mainly Hadoop (Apache Hadoop, an open-source software framework that supports data-intensive distributed applications, released under the Apache 2.0 license), Hive (Apache Hive, a data warehouse tool built on Hadoop), and a big data platform job scheduling system.
The existing disaster recovery strategy for big data clusters is to synchronize the data of the main cluster that changes every day to the disaster recovery cluster through a cross-equipment-room data synchronization tool, so that when the main cluster becomes unavailable, services are switched to the disaster recovery cluster. However, in existing solutions, after switching to the disaster recovery environment, the entire flow of importing business data, processing it, and exporting it back to the business system must be re-run in the disaster recovery environment before the switchover is complete. As a result, the disaster recovery switchover takes a long time and cannot be completed quickly and efficiently.
Summary
The main purpose of the present application is to provide a data disaster recovery method, system, terminal device, and computer storage medium, aiming to achieve fast and fine-grained disaster recovery switching when the main cluster suffers a disaster and can no longer provide services, thereby improving disaster recovery efficiency.
To achieve the above purpose, the present application provides a data disaster recovery method applied to a data disaster recovery device, the method comprising:
establishing a communication connection with a disaster recovery database of a preset main cluster;
reading, through the communication connection, the workflow executed by the preset main cluster;
acquiring task parameters of each task node in the workflow according to a preset relationship chain model, wherein the relationship chain model is constructed based on the lineage relationship between data and data processing tasks; and
detecting the synchronization state of the task parameters, so as to determine, according to the synchronization state, a target node to be re-executed among the task nodes, and triggering a disaster recovery mechanism to execute the target node.
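As a hedged sketch of how the four steps above fit together, the following illustration wires them into one flow. All function and variable names here are hypothetical stand-ins, not part of the claimed implementation:

```python
def recover(read_workflows, lineage_model, sync_states, schedule):
    """Orchestrate the four steps above; every collaborator is a plain
    callable or dict so the control flow stays visible."""
    workflows = read_workflows()                     # steps 1-2: connect and read
    targets = []
    for workflow in workflows:
        for node in workflow:
            inputs, outputs = lineage_model[node]    # step 3: task parameters
            # step 4: re-execute a node unless all of its data finished syncing
            if not all(sync_states.get(t) == "synced" for t in inputs + outputs):
                targets.append(node)
    for node in targets:
        schedule(node)                               # trigger the recovery mechanism
    return targets

# Minimal illustrative run with two task nodes in one workflow
ran = []
targets = recover(
    read_workflows=lambda: [["extract", "transform"]],
    lineage_model={"extract": (["src"], ["ods"]), "transform": (["ods"], ["dw"])},
    sync_states={"src": "synced", "ods": "synced", "dw": "pending"},
    schedule=ran.append,
)
print(targets)  # → ['transform']
```

Only the node whose data did not finish synchronizing is handed to the scheduler, which is the point of the fine-grained switchover.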
In addition, to achieve the above purpose, the present application further provides a data disaster recovery system, comprising:
a connection module, configured to establish a communication connection with a disaster recovery database of a preset main cluster;
a workflow reading module, configured to read, through the communication connection, the workflow executed by the preset main cluster;
an acquisition module, configured to acquire task parameters of each task node in the workflow according to a preset relationship chain model, wherein the relationship chain model is constructed based on the lineage relationship between data and data processing tasks; and
a recovery module, configured to detect the synchronization state of the task parameters, determine, according to the synchronization state, a target node to be re-executed among the task nodes, and trigger a disaster recovery mechanism to execute the target node.
Each functional module of the data disaster recovery system of the present application, when running, implements the steps of the data disaster recovery method described above.
In addition, to achieve the above purpose, the present application further provides a terminal device, comprising: a memory, a processor, and a data disaster recovery program stored in the memory and executable on the processor, wherein the data disaster recovery program, when executed by the processor, implements the steps of the data disaster recovery method described above.
In addition, to achieve the above purpose, the present application further provides a computer storage medium on which a data disaster recovery program is stored, wherein the data disaster recovery program, when executed by a processor, implements the steps of the data disaster recovery method described above.
In addition, to achieve the above purpose, the present application further provides a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the data disaster recovery method described above.
The present application provides a data disaster recovery method, system, terminal device, computer storage medium, and computer program product. A data disaster recovery device establishes a communication connection with a disaster recovery database of a preset main cluster; reads, through the communication connection, the workflow executed by the preset main cluster; acquires task parameters of each task node in the workflow according to a preset relationship chain model, wherein the relationship chain model is constructed based on the lineage relationship between data and data processing tasks; and detects the synchronization state of the task parameters, so as to determine, according to the synchronization state, a target node to be re-executed among the task nodes, and triggers a disaster recovery mechanism to execute the target node.
When a disaster occurs in the main cluster so that it can no longer provide services, and a disaster recovery switchover is required so that the disaster recovery cluster provides services in its place, the data disaster recovery device in the disaster recovery cluster establishes a communication connection with the disaster recovery database of the preset main cluster, and reads, through this connection, the workflow that the preset main cluster was executing when the disaster occurred. The data disaster recovery device then acquires the task parameters of each task node in the workflow according to the relationship chain model constructed from the lineage relationship between data and data processing tasks. Finally, it detects the synchronization state of the task parameters of each task node, determines, according to the synchronization state, the target nodes among the task nodes that need to be re-executed, and, once the target nodes are determined, triggers the disaster recovery mechanism to re-execute them.
Compared with conventional big data cluster disaster recovery solutions, the present application performs the disaster recovery switchover, when a disaster occurs in the main cluster, using a complete relationship chain model constructed in advance from the lineage relationship between data and data processing tasks, combined with the synchronization state of the task parameters of each task node in the workflow. There is no need to re-run, in the disaster recovery environment, all the business data tasks that were in progress when the disaster occurred; only the task nodes identified as requiring re-execution, based on the relationship chain model and the synchronization states, are re-run. In this way, the disaster recovery switchover and the recovery of the task nodes to be re-executed are both fast, achieving fast and fine-grained disaster recovery switching and thereby improving disaster recovery efficiency.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the hardware operating environment of a terminal device involved in the solutions of the embodiments of the present application;
FIG. 2 is a schematic flowchart of an embodiment of the data disaster recovery method of the present application;
FIG. 3 shows the lineage data acquisition and processing flow involved in an embodiment of the data disaster recovery method of the present application;
FIG. 4 shows the first lineage relationship between data processing execution tasks and data involved in an embodiment of the data disaster recovery method of the present application;
FIG. 5 shows the second lineage relationship between data processing execution tasks and data processing tasks involved in an embodiment of the data disaster recovery method of the present application;
FIG. 6 is an example of a data processing workflow involved in an embodiment of the data disaster recovery method of the present application;
FIG. 7 shows the processing flow of the second lineage relationship involved in an embodiment of the data disaster recovery method of the present application;
FIG. 8 shows the relationship between data processing tasks and task execution IDs involved in an embodiment of the data disaster recovery method of the present application;
FIG. 9 shows the lineage relationship between data and data processing tasks involved in an embodiment of the data disaster recovery method of the present application;
FIG. 10 shows the data synchronization flow involved in an embodiment of the data disaster recovery method of the present application;
FIG. 11 shows the disaster recovery processing flow involved in an embodiment of the data disaster recovery method of the present application;
FIG. 12 is a schematic diagram of a disaster recovery scenario involved in an embodiment of the data disaster recovery method of the present application;
FIG. 13 is a schematic diagram of the functional modules of an embodiment of the data disaster recovery system of the present application.
The realization of the objectives, functional features, and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are only used to explain the present application and are not intended to limit it.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of the hardware operating environment of a terminal device involved in the solutions of the embodiments of the present application.
The terminal device in the embodiments of the present application may be a data disaster recovery device configured in a disaster recovery cluster to perform disaster recovery when a disaster occurs in the main cluster and it can no longer provide services. The data disaster recovery device may be a smartphone, a PC (Personal Computer), a tablet computer, a portable computer, or the like.
As shown in FIG. 1, the terminal device may include: a processor 1001, such as a CPU; a communication bus 1002; a user interface 1003; a network interface 1004; and a memory 1005. The communication bus 1002 is used to realize connection and communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard, and optionally may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory, or a stable memory (non-volatile memory), such as a disk memory. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the terminal device structure shown in FIG. 1 does not constitute a limitation on the terminal device, which may include more or fewer components than shown, or combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a data disaster recovery program.
In the terminal shown in FIG. 1, the network interface 1004 is mainly used to connect to a backend server and communicate data with it; the user interface 1003 is mainly used to connect to a client and communicate data with it; and the processor 1001 may be used to call the data disaster recovery program stored in the memory 1005 and perform the operations described in the following embodiments of the data disaster recovery method of the present application.
Based on the above hardware structure, various embodiments of the data disaster recovery method of the present application are proposed.
It should be noted that, in the scenario of remote disaster recovery for big data, the main and backup clusters (also called the main cluster and the disaster recovery cluster) run in two different equipment rooms, each running an independent account system and using an independent operation and maintenance management system. When the clusters are delivered, the main and backup clusters are delivered separately.
A typical data processing flow of a big data platform is as follows:
First, data extraction: the data of the business system is pulled from a relational database into Hive through Sqoop (Apache Sqoop, an open-source tool mainly used to transfer data between Hadoop (Hive) and traditional databases such as MySQL, PostgreSQL, and Oracle).
Second, data processing: the data in Hive is processed programmatically through Hive SQL, Spark SQL, Python, Shell, and the like, and is finally written into another Hive table.
Third, data export: the processed Hive data (for example, daily statistics reports, single-day revenue calculations, and the like) is exported back into a relational database using Sqoop.
The entire flow of data extraction, data processing, and data export above is triggered and executed on a schedule by the task scheduling system of the big data platform.
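The extract-process-export flow can be mimicked in miniature. This is a sketch only: a real deployment would invoke Sqoop commands and Hive SQL under the scheduler, whereas the three stand-in functions below merely show how the stages chain:

```python
def extract(business_rows):
    """Stand-in for Sqoop pulling rows from a relational database into Hive."""
    return list(business_rows)

def process(hive_rows):
    """Stand-in for Hive SQL/Spark SQL processing, here a daily revenue total."""
    return {"daily_revenue": sum(row["amount"] for row in hive_rows)}

def export(report):
    """Stand-in for Sqoop exporting the processed Hive table back out."""
    return dict(report)

# The platform's scheduler would trigger these three stages on a timer;
# they are chained by hand here for illustration.
source = [{"amount": 120}, {"amount": 80}]
result = export(process(extract(source)))
print(result)  # → {'daily_revenue': 200}
```

The point of the sketch is the shape of the pipeline: each stage's output table is the next stage's input, which is exactly the lineage that the relationship chain model later records.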
The disaster recovery strategy for big data clusters is to synchronize the data of the main cluster that changes every day to the disaster recovery cluster through a cross-equipment-room data synchronization tool, and to switch to the disaster recovery cluster when the main cluster is unavailable.
In the existing strategy, however, after switching to the disaster recovery environment, the entire flow described above of importing business data, processing it, and exporting it back to the business system must be re-run, which takes a long time and prevents the disaster recovery switchover from being completed quickly and efficiently.
In view of the above, the present application provides a data disaster recovery method. Referring to FIG. 2, FIG. 2 is a schematic flowchart of a first embodiment of the data disaster recovery method of the present application. In this embodiment, the method is applied to the above data disaster recovery device configured in the disaster recovery cluster to perform disaster recovery when a disaster occurs in the main cluster and it can no longer provide services (for ease of description, hereinafter referred to as the disaster recovery device). The data disaster recovery method of the present application comprises:
Step S10: establishing a communication connection with a disaster recovery database of a preset main cluster;
In the process of disaster recovery, the disaster recovery device first establishes a communication connection with the disaster recovery database of the preset main cluster in which the disaster occurred.
It should be noted that, in this embodiment, the preset main cluster is the cluster, in the scenario of remote disaster recovery for big data, in which the big data platform performing the series of data extraction, data processing, and data export flows is located. The disaster recovery database of the preset main cluster is a remote backup of the database of the scheduling system of the preset main cluster.
Further, in a feasible embodiment, step S10 may include:
Step S101: when the service of the preset main cluster is unavailable, establishing a communication connection with the disaster recovery database of the preset main cluster.
It should be noted that, in this embodiment, the disaster recovery process performed by the disaster recovery device takes place when a disaster occurs in the preset main cluster so that it can no longer provide services for the data processing flow.
When a disaster occurs in the preset main cluster currently performing data processing, so that the preset main cluster can no longer provide services or the services it provides become unavailable, the disaster recovery device immediately establishes a communication connection with the disaster recovery database of the preset main cluster.
Specifically, for example, assume that the preset main cluster currently performing data processing is in IDC1, and the standby disaster recovery cluster is in IDC2 at a remote site. When a disaster occurs in the preset main cluster in IDC1, so that the services provided by the preset main cluster become unavailable (that is, the data processing flow cannot be completed), or the preset main cluster cannot continue to provide services at all, the disaster recovery device in the disaster recovery cluster in the remote IDC2 begins to establish a communication connection with the disaster recovery database of the preset main cluster.
Step S20: reading, through the communication connection, the workflow executed by the preset main cluster;
Immediately after establishing the communication connection with the disaster recovery database, the disaster recovery device reads, based on the connection, the workflow that the preset main cluster was executing when the disaster occurred.
Specifically, for example, referring to the disaster recovery processing flow shown in FIG. 11: after establishing the communication connection with the disaster recovery database, the disaster recovery device (assumed to be FindGap shown in FIG. 11) queries the disaster recovery database through the connection for the workflows that the main scheduling system (Scheduler) of the preset main cluster was executing when the disaster occurred, and obtains the query result as a list.
It should be noted that, in this embodiment, there may be one or more workflows being executed by the preset main cluster when the disaster occurred; alternatively, the preset main cluster may not have been providing services for the data processing flow at that time, in which case no workflow was being executed. Therefore, when the disaster recovery device queries for workflows whose status is "executing" and the query result is returned as a list, the number of such workflows in the list may be 0 or N, where N is greater than or equal to 1.
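Querying the disaster recovery database for the workflows that were executing at failure time might look as follows. This is a sketch using SQLite in place of the real replicated scheduler database; the table and column names are assumptions made for illustration:

```python
import sqlite3

# Stand-in for the replicated scheduler database of the failed main cluster.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workflow (id INTEGER, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO workflow VALUES (?, ?, ?)",
    [(1, "daily_etl", "RUNNING"),
     (2, "hourly_sync", "SUCCEEDED"),
     (3, "report_export", "RUNNING")],
)

# The recovery device lists the workflows whose status was 'RUNNING' at
# disaster time; the list may contain zero entries or N >= 1 entries.
running = [row[0] for row in conn.execute(
    "SELECT name FROM workflow WHERE status = 'RUNNING' ORDER BY id")]
print(running)  # → ['daily_etl', 'report_export']
```

An empty `running` list corresponds to the case where no workflow was executing when the disaster occurred, and no recovery re-execution is needed.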
Step S30: acquiring task parameters of each task node in the workflow according to a preset relationship chain model, wherein the relationship chain model is constructed based on the lineage relationship between data and data processing tasks;
In one implementation, after reading the workflow that the preset main cluster in which the disaster occurred was executing at the time of the disaster, the disaster recovery device acquires the task parameters of each task node in the workflow according to the relationship chain model constructed from the lineage relationship between data and data processing tasks.
In another implementation, the disaster recovery device may begin constructing the relationship chain model from the lineage relationship between data and data processing tasks before any disaster occurs in the preset main cluster. In this way, when a disaster subsequently occurs in the preset main cluster, the disaster recovery device can directly retrieve the relationship chain model and acquire the task parameters of each task node in the workflow that the preset main cluster was executing when the disaster occurred.
It should be noted that, in this embodiment, the lineage relationship between data and data processing tasks is the relationship graph shown in FIG. 9, and the disaster recovery device can determine, based on this graph, the task nodes that need to be re-executed during the disaster recovery switchover.
Specifically, for example, assume that the disaster recovery device constructs in advance, based on the lineage relationship between data and data processing tasks shown in FIG. 9, a relationship chain model used to determine the task nodes that need to be re-executed during the disaster recovery switchover. The disaster recovery device then uses the relationship chain model to determine, for each of the currently obtained workflows that the main scheduling system of the preset main cluster was executing when the disaster occurred, all the task nodes of that workflow and the task parameters of each of these task nodes.
Further, in a feasible embodiment, the task parameters include the input data and output data of the task nodes, and step S30 may include:
Step S301: determining each task node of the workflow;
After obtaining the workflows that the preset main cluster in which the disaster occurred was executing at the time of the disaster, the disaster recovery device determines all the task nodes of each of the one or more workflows.
Specifically, for example, if the disaster recovery device queries the disaster recovery database of the preset main cluster in which the disaster occurred, and the returned list shows that the main scheduling system of the preset main cluster was executing only one workflow when the disaster occurred, the disaster recovery device further determines and obtains all the task nodes of that workflow in a well-established breadth-first manner.
Step S302: constructing query statements according to each of the task nodes, and indexing, according to the query statements, the input data and output data of each of the task nodes from the relationship chain model.
The disaster recovery device constructs a corresponding query statement for each determined task node, and, based on these query statements, indexes and queries the input data and output data of every task node from the relationship chain model constructed from the lineage relationship between data and data processing tasks.
It should be noted that, in this embodiment, since the relationship chain model constructed from the lineage relationship between data and data processing tasks is the lineage graph between data and data processing tasks shown in FIG. 9, and the relationship chain model may specifically be stored in a graph database configured in the disaster recovery cluster, the query statements constructed by the disaster recovery device from the task nodes may specifically be graph data query statements.
Specifically, for example, referring to the disaster recovery processing flow shown in FIG. 11: after the disaster recovery device (FindGap) queries, as a list, the disaster recovery database of the preset main cluster in which the disaster occurred, obtains the one workflow that the main scheduling system (Scheduler) of the preset main cluster was executing when the disaster occurred, and obtains all the task nodes of that workflow in a breadth-first manner, it further calls a preset query template for index-querying relational data within the relationship graph data, such as an SQL (Structured Query Language) statement, and uses each task node in turn as the input condition of the SQL statement, thereby constructing a graph data query statement for the relationships of that task node. The disaster recovery device then immediately executes the SQL statement, and from the lineage relationship between data processing tasks and data presented by the relationship chain model stored in the graph database (Graph DB) — the application and data lineage shown in the figure — analyzes and obtains the direct input data and output data of each task node.
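Constructing one lineage query per task node can be as simple as filling a parameterized template with the node as the input condition. The sketch below uses SQLite and an assumed `lineage_edge` table layout in place of the real graph database; all names are illustrative:

```python
import sqlite3

# Stand-in lineage store: one row per edge between a dataset and a task node.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lineage_edge (dataset TEXT, direction TEXT, task_node TEXT)")
conn.executemany("INSERT INTO lineage_edge VALUES (?, ?, ?)", [
    ("Table1", "in", "Job1"), ("Table2", "out", "Job1"),
    ("Table2", "in", "Job2"), ("Table5", "out", "Job2"),
])

def node_io(node):
    """Index one task node's input and output datasets from the lineage
    store, using the node itself as the query condition."""
    rows = conn.execute(
        "SELECT dataset, direction FROM lineage_edge WHERE task_node = ?",
        (node,)).fetchall()
    inputs = [d for d, direction in rows if direction == "in"]
    outputs = [d for d, direction in rows if direction == "out"]
    return inputs, outputs

print(node_io("Job2"))  # → (['Table2'], ['Table5'])
```

Running `node_io` once per task node yields exactly the per-node input/output lists whose synchronization states step S40 goes on to check.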
Step S40: detecting the synchronization state of the task parameters, so as to determine, according to the synchronization state, a target node to be re-executed among the task nodes, and triggering a disaster recovery mechanism to execute the target node.
After obtaining the task parameters of each task node in the workflow that the preset main cluster was executing when the disaster occurred, the disaster recovery device further detects the synchronization state of the task parameters of each task node, determines, according to the detected synchronization states, the target nodes among the task nodes that must be re-executed to complete the disaster recovery switchover, and finally triggers the preset disaster recovery mechanism to re-execute the target nodes.
Specifically, for example, referring to the disaster recovery processing flow shown in FIG. 11: after analyzing and obtaining the direct input data and output data of each task node from the lineage relationship between data processing tasks and data presented by the relationship chain model stored in the graph database (Graph DB), the disaster recovery device (FindGap) further calls the database of the synchronization apparatus (Transport) for data synchronization, which is also configured in the disaster recovery cluster, and checks whether the synchronization state of the input data and output data of each task node is "synchronized", thereby obtaining the synchronization states of the input data and output data of all the task nodes.
Finally, the disaster recovery device (FindGap) once again traverses, in a well-established breadth-first manner, the DAG (Directed Acyclic Graph) of the workflow to which each task node belongs (see the workflow example in FIG. 6), and determines, based on the synchronization states of the input data and output data of the task nodes, the target nodes among all the task nodes of the workflow that need to be re-executed. It then triggers the preset disaster recovery mechanism so that the scheduling system of the disaster recovery cluster, Scheduler (Backup), schedules the target nodes for re-execution; after the re-execution is completed, a status result is fed back to the disaster recovery device for synchronization, and the disaster recovery device determines, based on the status result, that the disaster recovery switchover is complete.
Further, in another feasible embodiment, when the disaster recovery device determines, based on the synchronization status, that none of the workflow's task nodes needs to be re-executed, the switchover can be completed without triggering the disaster recovery mechanism at all.
Further, referring to the disaster recovery scenario shown in Figure 12, assume that in the workflow the primary cluster was executing when the disaster occurred, as obtained by the disaster recovery device through the relationship chain model, the task nodes are Job1, Job2 and Job3, and their input and output data are Table1, Table2, Table3, Table4, Table5 and Table6. The disaster recovery device further detects that Table1, Table2, Table3 and Table4 have completed disaster recovery synchronization and passed the consistency check. The disaster recovery device therefore determines that the only target nodes in the workflow that need to be re-executed in the disaster recovery cluster are Job2 and Job3, and triggers the disaster recovery mechanism so that the scheduling system of the disaster recovery cluster schedules only Job2 and Job3 for re-execution, thereby speeding up disaster recovery.
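The selection logic above can be sketched as follows. This is an illustrative sketch only: the exact DAG topology of Figure 12 is not given in the text, so the edges below, the function name, and the rule "a node is a target when any table it reads or writes has not finished disaster recovery synchronization" are assumptions made for the example, not the patented implementation.

```python
from collections import deque

def find_target_nodes(dag, node_io, synced):
    """Breadth-first walk of the workflow DAG; a node becomes a target for
    re-execution when any table it reads or writes is not yet synchronized."""
    # dag: adjacency list {node: [downstream nodes]}
    # node_io: {node: (input_tables, output_tables)}
    # synced: set of tables whose disaster-recovery copy is verified consistent
    indegree = {n: 0 for n in node_io}
    for succs in dag.values():
        for s in succs:
            indegree[s] += 1
    queue = deque(n for n in node_io if indegree[n] == 0)
    targets = []
    while queue:
        node = queue.popleft()
        inputs, outputs = node_io[node]
        if not set(inputs) | set(outputs) <= synced:
            targets.append(node)
        for s in dag.get(node, []):
            indegree[s] -= 1
            if indegree[s] == 0:
                queue.append(s)
    return targets

# Scenario of Figure 12 (topology assumed for illustration):
dag = {"Job1": ["Job2"], "Job2": ["Job3"], "Job3": []}
node_io = {
    "Job1": (["Table1", "Table2"], ["Table4"]),
    "Job2": (["Table3", "Table4"], ["Table5"]),
    "Job3": (["Table5"], ["Table6"]),
}
synced = {"Table1", "Table2", "Table3", "Table4"}
print(find_target_nodes(dag, node_io, synced))  # ['Job2', 'Job3']
```

With Table1 through Table4 synchronized, Job1's inputs and output are all covered and it is skipped, while Job2 and Job3 touch the unsynchronized Table5 and Table6 and are selected, matching the scenario described above.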
In this embodiment, during disaster recovery, the disaster recovery device first establishes a communication connection with the disaster recovery database of the preset primary cluster where the disaster occurred. Immediately after establishing that connection, the device reads, over the connection, the workflow the preset primary cluster was executing when the disaster occurred. Having read that workflow, the device then obtains the task parameters of each of its task nodes according to the relationship chain model constructed from the lineage relationships between data and data processing tasks. After obtaining those task parameters, the device further detects the synchronization status of each task node's task parameters, determines from that status the target nodes that must be re-executed to complete the disaster recovery switchover, and finally triggers the preset disaster recovery mechanism to re-execute those target nodes.
Compared with traditional big data cluster disaster recovery schemes, this application performs the failover from the primary cluster to the disaster recovery cluster by means of a complete relationship chain model constructed in advance from the lineage relationships between data and data processing tasks, combined with the synchronization status of the task parameters of the workflow's task nodes. This enables fast disaster recovery switchover and fast recovery of the task nodes that need re-execution, achieving fast and fine-grained switchover and thereby improving disaster recovery efficiency.
Further, based on the first embodiment of the data disaster recovery method of this application described above, a second embodiment of the method is proposed. The main difference between this embodiment and the first embodiment is that, in this embodiment, before step S10 above (establishing a communication connection with the disaster recovery database of the preset primary cluster), the data disaster recovery method of this application may further include:
Step S50: constructing a relationship chain model based on the lineage relationships between data and data processing tasks.
Before establishing the communication connection with the disaster recovery database of the preset primary cluster where the disaster occurred, the disaster recovery device begins constructing the relationship chain model based on the lineage relationships between the data and the data processing tasks being executed by the preset primary cluster.
Further, in a feasible embodiment, step S50 may include:
Step S501: obtaining lineage data from the preset primary cluster to establish a first lineage relationship between data processing tasks and data.
In constructing the relationship chain model based on the lineage relationships between the data and the data processing tasks being executed by the preset primary cluster, the disaster recovery device first obtains lineage data from the preset primary cluster that is executing the data processing tasks, establishing the first lineage relationship between the data and the data processing tasks executed by the scheduling system of the preset primary cluster.
Specifically, for example, referring to the lineage data acquisition and processing flow shown in Figure 3, through a lineage acquisition hook deployed in the preset primary cluster ("hook" being a system mechanism originally provided in Windows to replace "interrupts" under DOS, the Disk Operating System), the disaster recovery device parses lineage data out of data components (such as relational databases), obtains that lineage data, and writes it to the file system of a data integration tool. The scheduling system of the preset primary cluster then periodically triggers a lineage data integration task, causing the data integration tool to read the lineage logs from the file system to obtain the lineage data and write it further into the big data platform, Hive or Spark. Next, the scheduling system of the preset primary cluster periodically triggers a data processing task, causing Hive or Spark to process and integrate the written lineage data into the first lineage relationship between data processing tasks and data shown in Figure 4, which is then written into the graph database system as lineage graph data. Finally, the graph database system actively reports the write status of the lineage graph data to the scheduling system of the preset primary cluster, so that the scheduling system can confirm that the first lineage data has been constructed.
It should be noted that, in this embodiment, the lineage acquisition hooks in the preset primary cluster implement a corresponding Lineage Hook mechanism for each data system and data transfer tool. Every time a data system executes a SQL statement, these hooks capture the raw lineage data, wrap it into a lineage log, and write it to the log system of the data integration tool. Specifically, for example, lineage data is captured for the Hive and Spark data systems by, respectively, a Hive Lineage Hook (which asynchronously captures the SQL statements executed by Hive and calls a self-implemented Hive execution behaviour analysis API to obtain each statement's input data, output data, and associated task information), a Spark-SQL Lineage Hook (which asynchronously obtains the SQL statements executed by Spark-SQL and calls a self-implemented Spark SQL execution behaviour analysis API to obtain each statement's input data, output data, and associated task information), and a Sqoop Lineage Hook (which asynchronously captures Sqoop execution commands and analyzes their parameters to obtain the input and output data of each command and its associated task information). The Lineage Hooks for Hive and Spark-SQL obtain the lineage between data tables inside the big data platform, while the Sqoop Lineage Hook captures the lineage between big data platform tables and traditional relational database tables.
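The per-statement capture performed by such a Lineage Hook can be sketched as follows. The callback name, the log record layout, and the toy SQL analyzer below are all invented for illustration; a real hook relies on the engine's own semantic analysis of the statement, not on token matching.

```python
import json
import time

def on_statement_executed(sql, task_id, analyzer, log):
    """Invoked asynchronously after each SQL statement; `analyzer` stands in
    for the execution-behaviour analysis API described in the text."""
    inputs, outputs = analyzer(sql)
    record = {
        "task_id": task_id,   # the associated data processing task
        "inputs": inputs,     # tables the statement read
        "outputs": outputs,   # tables the statement wrote
        "sql": sql,
        "ts": int(time.time()),
    }
    # One lineage log entry, later read by the data integration tool.
    log.append(json.dumps(record))

def toy_analyzer(sql):
    """Crude token scan standing in for real semantic analysis."""
    tokens = sql.split()
    outs = [tokens[i + 2] for i, t in enumerate(tokens) if t.upper() == "INSERT"]
    ins = [tokens[i + 1] for i, t in enumerate(tokens) if t.upper() == "FROM"]
    return ins, outs

log = []
on_statement_executed("INSERT INTO Table4 SELECT * FROM Table1",
                      "Job1", toy_analyzer, log)
rec = json.loads(log[0])
print(rec["inputs"], rec["outputs"])  # ['Table1'] ['Table4']
```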
Step S502: parsing JSON (JavaScript Object Notation) files to establish a second lineage relationship between data processing execution tasks and data processing tasks.
It should be noted that, in this embodiment, as shown in the data processing workflow example of Figure 6, in big data platform task scheduling systems (such as Azkaban or Airflow) the dependencies among data processing tasks are mainly organized in the form of a DAG and stored in the database as JSON.
After establishing the first lineage relationship between data and the data processing execution tasks executed by the scheduling system of the preset primary cluster, the disaster recovery device further establishes the second lineage relationship between those data processing execution tasks and the data processing tasks by parsing JSON files.
Specifically, for example, the disaster recovery device uses a preset task lineage parsing program to read the JSON file of the workflow of each data processing task in the scheduling system of the preset primary cluster, and then parses that JSON file to obtain the second lineage relationship between data processing execution tasks (Executed Jobs) and data processing tasks shown in Figure 5.
It should be noted that, in this embodiment, referring to the relationship between data processing tasks and task execution IDs shown in Figure 8, while the task lineage parsing program reads and parses the workflow's JSON file, every execution of every data processing task is recorded in the database, and each execution is associated with an Executed Job ID. In this way, the second lineage relationship between data processing tasks and data processing execution tasks (Executed Jobs) is established.
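A minimal sketch of this parsing step follows, under an assumed JSON layout: the field names `jobs`, `dependsOn` and `executed_job_id` are hypothetical, since the actual schema depends on the scheduling system in use.

```python
import json

def parse_workflow_json(workflow_json, execution_records):
    """Extract the job dependency edges from a workflow JSON document and
    associate each job with its Executed Job IDs (the second lineage)."""
    wf = json.loads(workflow_json)
    edges = [(job["name"], dep)
             for job in wf["jobs"]
             for dep in job.get("dependsOn", [])]
    job_to_exec = {}
    for rec in execution_records:  # one row per execution, as in Figure 8
        job_to_exec.setdefault(rec["job"], []).append(rec["executed_job_id"])
    return edges, job_to_exec

workflow = json.dumps({
    "flow": "daily_etl",
    "jobs": [
        {"name": "Job1", "dependsOn": []},
        {"name": "Job2", "dependsOn": ["Job1"]},
    ],
})
records = [{"job": "Job1", "executed_job_id": 1001},
           {"job": "Job2", "executed_job_id": 1002}]
edges, mapping = parse_workflow_json(workflow, records)
print(edges)    # [('Job2', 'Job1')]
print(mapping)  # {'Job1': [1001], 'Job2': [1002]}
```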
Further, in this embodiment, referring to the flow shown in Figure 7, while the disaster recovery device parses the JSON files of the workflows in the scheduling system through the task lineage parsing program, the big data task scheduling system first reads the task relationship JSON files and the task execution records from the data integration tool, and the data integration tool directly writes the lineage data obtained by parsing them into the big data platform, Hive or Spark. The scheduling system of the preset primary cluster then periodically triggers a data processing task, causing Hive or Spark to process and integrate the written lineage data into the second lineage relationship between data processing execution tasks (Executed Jobs) and data processing tasks shown in Figure 5, which is then written into the graph database system as lineage graph data. Finally, the graph database system actively reports the write status of the lineage graph data to the scheduling system of the preset primary cluster, so that the scheduling system can confirm that the second lineage data has been constructed.
Step S503: fusing the first lineage relationship and the second lineage relationship to determine the lineage relationships between the data and the data processing tasks, thereby constructing the relationship chain model.
After establishing both the first lineage relationship, between data processing execution tasks and data, and the second lineage relationship, between data processing execution tasks and data processing tasks, the disaster recovery device fuses the two to determine the lineage relationships between the data processing tasks and the data, thereby constructing the relationship chain model.
Specifically, for example, the disaster recovery device constructs the first lineage relationship between data processing execution tasks and data shown in Figure 4, and the second lineage relationship between data processing execution tasks and data processing tasks shown in Figure 5. The scheduling system of the preset primary cluster then periodically triggers a data fusion task, which parses the relationship graphs of the first and second lineage relationships and, based on the correspondence between each data processing execution task (Executed Job) and its data processing task (Job), replaces every Executed Job in the first lineage relationship with its corresponding Job. The two relationship graphs are thereby fused into the lineage relationships between data and data processing tasks shown in Figure 9. The disaster recovery device then builds a relationship chain model in the form of graph data from these lineage relationships and stores it in the graph database for subsequent use.
It should be noted that, in this embodiment, the disaster recovery device can determine the input and output data of each task node in a workflow based on the relationship chain model. As shown in Figure 9, the input data of task node Job1 are Table1 and Table2, and its output data is Table4. In this way, when performing a disaster recovery switchover, the disaster recovery device can use the relationship chain model to let the scheduling system of the disaster recovery cluster determine from which task node the workflow should be rerun.
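The fusion of step S503 can be sketched as a vertex substitution over edge lists, under the assumption that each lineage graph is held as a set of directed edges and that Executed Job vertices carry resolvable IDs; all names are illustrative.

```python
def fuse_lineage(first_lineage, exec_to_job):
    """Replace every Executed Job vertex in the data-side lineage with the
    logical Job it belongs to, yielding edges between Jobs and tables."""
    fused = set()
    for src, dst in first_lineage:
        fused.add((exec_to_job.get(src, src), exec_to_job.get(dst, dst)))
    return fused

# First lineage (Executed Job <-> data) and the Executed Job -> Job mapping
# derived from the second lineage:
first = [("Table1", "ExecJob#1001"), ("Table2", "ExecJob#1001"),
         ("ExecJob#1001", "Table4")]
exec_to_job = {"ExecJob#1001": "Job1"}
print(sorted(fuse_lineage(first, exec_to_job)))
# [('Job1', 'Table4'), ('Table1', 'Job1'), ('Table2', 'Job1')]
```

The result reproduces the Figure 9 example: Job1 reads Table1 and Table2 and writes Table4.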
In this embodiment, in constructing the relationship chain model based on the lineage relationships between the data and the data processing tasks being executed by the preset primary cluster, the disaster recovery device first obtains lineage data from the preset primary cluster that is executing the data processing tasks, establishing the first lineage relationship between the data and the data processing tasks executed by the cluster's scheduling system; it then establishes, by parsing JSON files, the second lineage relationship between data processing execution tasks and data processing tasks; finally, it fuses the first and second lineage relationships to determine the lineage relationships between the data processing tasks and the data, thereby constructing the relationship chain model.
In this way, during a disaster recovery switchover, the disaster recovery device can perform the failover from the primary cluster to the disaster recovery cluster by means of the relationship chain model constructed in advance from the lineage relationships between data and data processing tasks, combined with the synchronization status of the task parameters of the workflow's task nodes. This enables fast switchover and fast recovery of the task nodes that need re-execution, achieving fast and fine-grained disaster recovery switchover and thereby improving disaster recovery efficiency.
Further, based on the first embodiment of the data disaster recovery method of this application described above, a third embodiment of the method is proposed. The main difference between this embodiment and the first embodiment is that, in this embodiment, before step S10 above (establishing a communication connection with the disaster recovery database of the preset primary cluster), the data disaster recovery method of this application may further include:
Step S60: executing a preset data synchronization task so that the database of the preset primary cluster synchronizes data to the disaster recovery database.
Before establishing the communication connection with the disaster recovery database of the preset primary cluster where the disaster occurred, the disaster recovery device first executes the data synchronization task so that, while the preset primary cluster is providing services and executing data processing tasks, its database synchronizes data to the disaster recovery database in the disaster recovery cluster, enabling a fast switchover later.
It should be noted that, in this embodiment, the data synchronization task is a data synchronization management task generated from operator configuration. It should be understood that, depending on the design requirements of the actual application, the way the data synchronization task is configured and generated, as well as its specific content, may differ across feasible implementations; the data disaster recovery method of this application does not limit the specific content of the data synchronization task.
Further, in a feasible embodiment, step S60 may include:
Step S601: receiving the data synchronization task and reading, from the database of the preset primary cluster, the metadata to be synchronized that the data synchronization task points to.
In the process of making the database of the preset primary cluster synchronize data to the disaster recovery database, the disaster recovery device first receives the data synchronization task generated from operator configuration, and parses it to determine the metadata to be synchronized that needs to be read from the database of the preset primary cluster and synchronized to the disaster recovery database.
Step S602: executing the data synchronization task to pull the metadata to be synchronized into the disaster recovery database for storage.
After determining the metadata to be synchronized that needs to be read from the database of the preset primary cluster and synchronized to the disaster recovery database, the disaster recovery device executes the data synchronization task to pull that metadata into the corresponding storage path in the disaster recovery database.
Further, in a feasible embodiment, step S602 may include:
Step S6021: obtaining the first storage path of the metadata to be synchronized in the database of the preset primary cluster.
In synchronizing the metadata to be synchronized from the database of the preset primary cluster to the disaster recovery database, the disaster recovery device, while parsing the data synchronization task to determine the metadata to be synchronized, also obtains that metadata's first storage path in the database of the preset primary cluster.
Specifically, for example, the disaster recovery device may obtain the first storage path of the metadata to be synchronized by inspecting, through the resource manager YARN (Yet Another Resource Negotiator, also known as Apache Hadoop YARN, a Hadoop resource manager) of the database of the preset primary cluster, the data information it manages for the data to be synchronized (which may be the data of a Hive table), such as its size, storage time, update time, and storage path.
Step S6022: determining, in the disaster recovery database, a second storage path corresponding to the first storage path.
After obtaining the first storage path of the metadata to be synchronized in the database of the preset primary cluster, the disaster recovery device can determine, based on that first storage path, a corresponding second storage path in the disaster recovery database.
Specifically, for example, based on a pre-built association between the database of the preset primary cluster and the disaster recovery database for synchronizing data (the association may take the form of a relation table built specifically for this purpose), the disaster recovery device can look up the first storage path directly in that relation table, detecting and determining the second storage path in the disaster recovery database that corresponds to the first storage path and stores the metadata to be synchronized under it.
In another feasible embodiment, after obtaining the first storage path of the metadata to be synchronized in the database of the preset primary cluster, the disaster recovery device may instead generate a storage path on the fly, based on the currently free storage space of the disaster recovery database, establish the correspondence between that path and the first storage path, and then use the generated path as the second storage path under which the metadata to be synchronized from the first storage path is stored in the disaster recovery database.
Step S6023: storing the metadata to be synchronized in the disaster recovery database according to the second storage path.
After determining the second storage path in the disaster recovery database that corresponds to the first storage path, the disaster recovery device can pull the data stored under the first storage path in the database of the preset primary cluster and store it, according to the second storage path, under that path in the disaster recovery database.
Specifically, for example, after determining the second storage path in the disaster recovery database that corresponds to the first storage path, the disaster recovery device can first pull the metadata to be synchronized out of the database of the preset primary cluster according to the first storage path, and then feed it into the resource manager YARN of the disaster recovery database, so that YARN stores the metadata to be synchronized according to the second storage path.
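The path mapping of steps S6021 through S6023 can be sketched as follows. The lookup-then-derive fallback mirrors the two embodiments above (a pre-built relation table, or a path generated on the fly); the `/dr` prefix rule and all path names are assumptions made for the example.

```python
def resolve_target_path(first_path, path_map, dr_root="/dr"):
    """Return the DR-side (second) storage path for a primary-side (first)
    path: use the pre-built association if one exists, otherwise derive a
    new path and record the association, as in the second embodiment."""
    if first_path in path_map:
        return path_map[first_path]
    second_path = dr_root + first_path
    path_map[first_path] = second_path  # persist the new correspondence
    return second_path

path_map = {"/warehouse/db1/table1": "/dr/warehouse/db1/table1"}
print(resolve_target_path("/warehouse/db1/table1", path_map))
# /dr/warehouse/db1/table1  (found in the relation table)
print(resolve_target_path("/warehouse/db1/table2", path_map))
# /dr/warehouse/db1/table2  (derived and recorded on the fly)
```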
Step S603: monitoring the execution status of the data synchronization task and performing consistency verification on the data stored in the database of the preset primary cluster and in the disaster recovery database.
The disaster recovery device continuously monitors the execution status of the data synchronization task so that, each time a data synchronization task finishes, it further verifies the consistency of the data stored in the database of the preset primary cluster and in the disaster recovery database, ensuring that the metadata to be synchronized is completely synchronized from the former to the latter.
Specifically, referring to the data synchronization flow shown in FIG. 10, the disaster recovery device generates a data synchronization task from configuration data supplied by operations staff. Then, based on the scheduling system Transports, the device schedules the task for execution, using multiple threads to read metadata from MySQL — the database in which the preset primary cluster (IDC1) stores metadata — and writing that metadata, as the to-be-synchronized metadata targeted by the task, into the disaster recovery database in which the disaster recovery cluster (IDC2) stores metadata. Next, again via Transports, the device submits a disaster recovery data synchronization task — a DistCp job — to YARN on the disaster recovery cluster (IDC2) and executes it on schedule, thereby pulling the data of the Hive tables that require synchronization on the primary cluster (IDC1) into the storage paths (HDFS directories) of the corresponding Hive tables on the disaster recovery cluster (IDC2). Finally, the device polls, via Transports, the execution status of the scheduled synchronization tasks and collects the synchronization statistics stored on src-HDFS of the preset primary cluster (IDC1) and on dest-HDFS of the disaster recovery cluster, thereby verifying whether the data on the two sides — the preset primary cluster (IDC1) and the disaster recovery cluster (IDC2) — are consistent.
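The flow above can be sketched in a few lines; this is a minimal illustration only, with in-memory dicts standing in for MySQL, HDFS, Transports, and the DistCp job (all names and structures here are assumptions, not taken from the patent):

```python
from concurrent.futures import ThreadPoolExecutor

def sync_metadata(src_db, dest_db, table_names, workers=4):
    """Copy metadata rows for each table from the primary store to the
    disaster recovery store using a thread pool, then compare per-table
    row counts on both sides as a simple consistency check."""
    def copy_table(name):
        rows = src_db[name]            # read from the primary cluster (IDC1)
        dest_db[name] = list(rows)     # write to the DR database (IDC2)
        return name, len(rows)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        stats = dict(pool.map(copy_table, table_names))

    # Consistency verification: statistics on the two sides must match.
    mismatches = [n for n in table_names
                  if len(src_db[n]) != len(dest_db.get(n, []))]
    return stats, mismatches

# Hypothetical metadata tables standing in for the real stores.
src = {"hive_tbl_a": [{"col": "x"}, {"col": "y"}],
       "hive_tbl_b": [{"col": "z"}]}
dest = {}
stats, bad = sync_metadata(src, dest, ["hive_tbl_a", "hive_tbl_b"])
```

In the real system the copy step is a DistCp job submitted to YARN rather than an in-process write; only the monitor-then-verify shape is the point here.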
Further, in another feasible embodiment, before executing the data synchronization task that causes the database of the preset primary cluster to synchronize data to the disaster recovery database, the disaster recovery device also predefines data disaster recovery synchronization rules. Thus, when the device executes the data synchronization task via the scheduling system to synchronize data between the preset primary cluster and the disaster recovery cluster, it executes the task according to those rules.
It should be noted that, in this embodiment, the disaster recovery device forms the data disaster recovery synchronization rules by defining the clusters, databases, and data tables of the preset primary cluster and the disaster recovery cluster that require data synchronization, together with the timing and strategy of that synchronization. Specifically, for example, the data disaster recovery synchronization rules defined by the disaster recovery device are shown in the following table:
Figure PCTCN2021132314-appb-000001
In the above table, the source cluster and the target cluster are, respectively, the preset primary cluster and the disaster recovery cluster described in this embodiment.
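The table itself is only available as an image in this record, but a synchronization rule of the kind described — source and target cluster, database, tables, timing, and strategy — might be represented as follows. Every field name and value below is an assumed illustration, not content recovered from the table:

```python
# Hypothetical shape of one data disaster recovery synchronization rule.
rule = {
    "source_cluster": "IDC1",          # preset primary cluster
    "target_cluster": "IDC2",          # disaster recovery cluster
    "database": "ods_db",              # database to synchronize (assumed name)
    "tables": ["user_tbl", "order_tbl"],
    "schedule": "0 2 * * *",           # synchronization time (cron-style, assumed)
    "strategy": "incremental",         # or "full"
}

def matches(rule, cluster, db, table):
    """Return True when a (cluster, database, table) triple is covered
    by the rule, i.e. the scheduler should synchronize it."""
    return (cluster == rule["source_cluster"]
            and db == rule["database"]
            and table in rule["tables"])
```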
Further, in a feasible embodiment, in step S40 of the first embodiment above, the synchronization state includes synchronized and unsynchronized, and the step of "determining, according to the synchronization state, the target nodes to be re-executed among the task nodes" may include:
Step S401: if the synchronization state of the task parameters of a first task node among the task nodes is unsynchronized, determining that the first task node is a target node to be re-executed.
After obtaining the task parameters of each task node in the workflows that the preset primary cluster was executing when the disaster occurred, the disaster recovery device further checks whether the synchronization state of each node's task parameters is synchronized or unsynchronized. Thus, upon detecting that the synchronization state of the task parameters of a first task node is unsynchronized, it directly determines that the first task node must be re-run on the disaster recovery cluster, i.e., that it is a target node to be re-executed.
Step S402: if the parent node of the first task node is a node to be re-executed, determining that the first task node is the target node.
While determining each task node by checking whether the synchronization state of its task parameters is synchronized or unsynchronized, the disaster recovery device also checks whether the node's parent has already been determined to be a target node that must be re-run on the disaster recovery cluster. Thus, upon detecting that the parent of the first task node is a target node to be re-executed, it directly determines that the first task node must likewise be re-run on the disaster recovery cluster as a target node to be re-executed.
Specifically, for example, the disaster recovery device traverses, in a depth-first manner, every workflow that the scheduling system of the preset primary cluster was executing when the disaster occurred. If the device finds that all parent nodes of the first task node in the current workflow have been marked as not requiring a re-run (disable execute), and that the task parameters of the first task node — its input data and output data — are in the synchronized state, it determines that the first task node is not a target node requiring a re-run and rescheduled execution, and likewise marks it as disable execute. If, however, the device finds that some parent of the first task node has been marked as requiring re-execution (enable execute), or that the synchronization state of the first task node's task parameters — its input data and output data — is unsynchronized, it directly determines that the first task node is a target node requiring a re-run and rescheduled execution, and marks it as enable execute.
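The marking rule above can be sketched as a small depth-first walk. This is a schematic reading of the embodiment, assuming each node carries only a parent list and a synchronized flag; the real task parameters are richer:

```python
def mark_targets(workflow):
    """Depth-first walk over a workflow DAG. A node is marked 'enable
    execute' (needs a re-run on the DR cluster) if any of its task
    parameters are unsynchronized, or if any parent is itself marked for
    re-execution; otherwise it is marked 'disable execute'.

    `workflow` maps node name -> {"parents": [...], "synced": bool}."""
    marks = {}

    def visit(name):
        if name in marks:                 # already decided on this path
            return marks[name]
        node = workflow[name]
        parent_needs_rerun = any(visit(p) for p in node["parents"])
        marks[name] = parent_needs_rerun or not node["synced"]
        return marks[name]

    for name in workflow:
        visit(name)
    return {n: ("enable execute" if v else "disable execute")
            for n, v in marks.items()}

# Hypothetical three-node workflow: load -> clean -> agg.
flow = {
    "load":  {"parents": [],        "synced": True},
    "clean": {"parents": ["load"],  "synced": False},  # unsynchronized
    "agg":   {"parents": ["clean"], "synced": True},   # parent re-runs
}
marks = mark_targets(flow)
```

Note how `agg` is re-run only because its parent is, exactly the propagation that step S402 describes.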
An embodiment of the present application provides a data disaster recovery method. Before the disaster recovery device establishes a communication connection with the disaster recovery database for a preset primary cluster in which a disaster occurs, it first executes a data synchronization task so that, while the preset primary cluster is providing services and executing data processing tasks, the cluster's database synchronizes data to the disaster recovery database in the disaster recovery cluster, enabling subsequent rapid disaster recovery switching. In the course of making the primary cluster's database synchronize data to the disaster recovery database, the device first receives a data synchronization task generated from staff configuration and parses it to determine the to-be-synchronized metadata that must be read from the primary cluster's database and synchronized to the disaster recovery database. Having determined that metadata, the device executes the synchronization task to pull it into the corresponding storage path in the disaster recovery database for storage. The device then continuously monitors the execution status of the data synchronization task so that, when each task completes, it further verifies the consistency of the data stored in the primary cluster's database against the data stored in the disaster recovery database, thereby ensuring that the to-be-synchronized metadata in the primary cluster's database is synchronized to the disaster recovery database in full.
In this way, during disaster recovery switching, the disaster recovery device can use the relationship chain model built in advance from the lineage relationship between data and data processing tasks, combined with the synchronization state of the task parameters of the task nodes in each workflow, to perform the recovery operation of switching to the disaster recovery cluster when a disaster strikes the primary cluster. This enables fast disaster recovery switching and fast recovery of the task nodes to be re-executed, achieving fast and fine-grained switching and thereby improving disaster recovery efficiency.
Further, the present application also provides a data disaster recovery system. Please refer to FIG. 13, which is a schematic diagram of the functional modules of an embodiment of the data disaster recovery system of the present application. As shown in FIG. 13, the data disaster recovery system of the present application includes:
a connection module 10, configured to establish a communication connection with the disaster recovery database of the preset primary cluster;
a workflow reading module 20, configured to read, through the communication connection, the workflows executed by the preset primary cluster;
an acquisition module 30, configured to acquire the task parameters of each task node in the workflows according to a preset relationship chain model, wherein the relationship chain model is constructed based on the lineage relationship between data and data processing tasks; and
a recovery module 40, configured to detect the synchronization state of the task parameters, determine, according to the synchronization state, the target nodes to be re-executed among the task nodes, and trigger a disaster recovery mechanism to execute the target nodes.
Further, the data disaster recovery system of the present application also includes:
a relationship chain construction module, configured to construct a relationship chain model based on the lineage relationship between data and data processing tasks.
Further, the relationship chain construction module includes:
a first construction unit, configured to acquire lineage data from the preset primary cluster to establish a first lineage relationship between data processing execution tasks and data;
a second construction unit, configured to parse JSON (object notation) files to establish a second lineage relationship between the data processing execution tasks and data processing tasks; and
a third construction unit, configured to fuse the first lineage relationship and the second lineage relationship to determine the lineage relationship between the data and the data processing tasks, so as to construct the relationship chain model.
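The fusion step performed by the three construction units can be sketched as joining the two lineage relations on the execution-task identifier. The record shapes and identifiers below are illustrative assumptions, not the patent's actual schema:

```python
def build_chain_model(exec_to_data, exec_to_task):
    """Fuse two lineage relations into a relationship chain model:
    - exec_to_data: execution task -> {"inputs": [...], "outputs": [...]}
      (first lineage relationship, taken from the primary cluster)
    - exec_to_task: execution task -> logical data processing task
      (second lineage relationship, e.g. parsed from job definition files)
    Returns: task -> {"inputs": set, "outputs": set}."""
    model = {}
    for exec_id, io in exec_to_data.items():
        task = exec_to_task.get(exec_id)
        if task is None:
            continue                      # execution with no known task
        entry = model.setdefault(task, {"inputs": set(), "outputs": set()})
        entry["inputs"].update(io["inputs"])
        entry["outputs"].update(io["outputs"])
    return model

# Hypothetical lineage records.
exec_to_data = {
    "run-1": {"inputs": ["db.raw"], "outputs": ["db.clean"]},
    "run-2": {"inputs": ["db.clean"], "outputs": ["db.report"]},
}
exec_to_task = {"run-1": "clean_task", "run-2": "report_task"}
model = build_chain_model(exec_to_data, exec_to_task)
```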
Further, the task parameters include the input data and output data of the task nodes, and the acquisition module 30 includes:
a determining unit, configured to determine each task node of the workflow; and
an acquisition unit, configured to construct a query statement for each task node and to index, from the relationship chain model, the input data and output data of each task node.
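Given a chain model of the shape sketched earlier, the acquisition unit's lookup reduces to a trivial indexed query; this is a schematic stand-in for whatever real query language the system uses:

```python
def query_task_io(model, task_node):
    """'Query' the relationship chain model for one task node and return
    its (input data, output data), or empty sets for an unknown node."""
    entry = model.get(task_node, {"inputs": set(), "outputs": set()})
    return entry["inputs"], entry["outputs"]

# Hypothetical single-entry model.
model = {"clean_task": {"inputs": {"db.raw"}, "outputs": {"db.clean"}}}
ins, outs = query_task_io(model, "clean_task")
```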
Further, the data disaster recovery system of the present application also includes:
a data synchronization module, configured to execute a preset data synchronization task to make the database of the preset primary cluster synchronize data to the disaster recovery database.
Further, the data synchronization module includes:
a receiving unit, configured to receive the data synchronization task and read, from the database of the preset primary cluster, the to-be-synchronized metadata targeted by the data synchronization task;
a task execution unit, configured to execute the data synchronization task to pull the to-be-synchronized metadata into the disaster recovery database for storage; and
a verification unit, configured to monitor the execution status of the data synchronization task and to verify the consistency of the data stored in the database of the preset primary cluster against the data stored in the disaster recovery database.
Further, the task execution unit includes:
a path acquisition subunit, configured to acquire the first storage path of the to-be-synchronized metadata in the database of the preset primary cluster, and to determine the second storage path in the disaster recovery database corresponding to the first storage path; and
a data storage subunit, configured to store the to-be-synchronized metadata in the disaster recovery database according to the second storage path.
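The path mapping the two subunits perform might look like a prefix swap between the two clusters' storage roots. The root paths here are invented for illustration and are not taken from the patent:

```python
def map_storage_path(first_path,
                     src_root="/idc1/warehouse",
                     dest_root="/idc2/warehouse"):
    """Derive the second storage path (in the DR database) from the first
    storage path (in the primary cluster's database) by swapping the
    cluster-specific root prefix."""
    if not first_path.startswith(src_root):
        raise ValueError(f"unexpected source path: {first_path}")
    return dest_root + first_path[len(src_root):]

second = map_storage_path("/idc1/warehouse/ods_db/user_tbl")
```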
Further, the synchronization state includes synchronized and unsynchronized, and the recovery module 40 includes:
a first re-run node determining unit, configured to determine, if the synchronization state of the task parameters of a first task node among the task nodes is unsynchronized, that the first task node is a target node to be re-executed; and
a second re-run node determining unit, configured to determine, if the parent node of the first task node is a node to be re-executed, that the first task node is the target node.
The functional implementation of each module of the task scheduling node in the above data disaster recovery system corresponds to the steps of the above embodiments of the data disaster recovery method; their functions and implementation processes are not repeated here.
The present application also provides a computer storage medium on which a data disaster recovery program is stored; when the data disaster recovery program is executed by a processor, the steps of the data disaster recovery method described in any of the above embodiments are implemented.
The specific embodiments of the computer storage medium of the present application are substantially the same as the embodiments of the above data disaster recovery method and are not repeated here.
The present application also provides a computer program product comprising a computer program; when the computer program is executed by a processor, the steps of the data disaster recovery method described in any of the above embodiments are implemented.
The specific embodiments of the computer program product of the present application are substantially the same as the embodiments of the above data disaster recovery method and are not repeated here.
It should be noted that, as used herein, the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a..." does not preclude the existence of additional identical elements in the process, method, article, or system that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not represent the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application — in essence, or the part that contributes to the prior art — can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, magnetic disk, or optical disc) and includes instructions for causing a terminal device (which may be a mobile phone, computer, server, network device, or the like) to execute the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims (10)

  1. A data disaster recovery method, wherein the data disaster recovery method is applied to a data disaster recovery device and comprises:
    establishing a communication connection with a disaster recovery database of a preset primary cluster;
    reading, through the communication connection, a workflow executed by the preset primary cluster;
    acquiring task parameters of each task node in the workflow according to a preset relationship chain model, wherein the relationship chain model is constructed based on a lineage relationship between data and data processing tasks; and
    detecting a synchronization state of the task parameters, so as to determine, according to the synchronization state, a target node to be re-executed among the task nodes, and triggering a disaster recovery mechanism to execute the target node.
  2. The data disaster recovery method according to claim 1, wherein the method further comprises:
    constructing a relationship chain model based on the lineage relationship between data and data processing tasks;
    wherein the step of constructing the relationship chain model based on the lineage relationship between data and data processing tasks comprises:
    acquiring lineage data from the preset primary cluster to establish a first lineage relationship between data processing execution tasks and data;
    parsing a JSON (object notation) file to establish a second lineage relationship between the data processing execution tasks and the data processing tasks; and
    fusing the first lineage relationship and the second lineage relationship to determine the lineage relationship between the data and the data processing tasks, so as to construct the relationship chain model.
  3. The data disaster recovery method according to claim 1, wherein the task parameters comprise input data and output data of the task nodes, and the step of acquiring the task parameters of each task node in the workflow according to the preset relationship chain model comprises:
    determining each task node of the workflow; and
    constructing a query statement for each task node, and indexing, from the relationship chain model according to the query statement, the input data and output data of each task node.
  4. The data disaster recovery method according to claim 1, wherein, before the step of establishing the communication connection with the disaster recovery database of the preset primary cluster, the method further comprises:
    executing a preset data synchronization task to make the database of the preset primary cluster synchronize data to the disaster recovery database.
  5. The data disaster recovery method according to claim 4, wherein the step of executing the preset data synchronization task to make the database of the preset primary cluster synchronize data to the disaster recovery database comprises:
    receiving the data synchronization task, and reading, from the database of the preset primary cluster, the to-be-synchronized metadata targeted by the data synchronization task;
    executing the data synchronization task to pull the to-be-synchronized metadata into the disaster recovery database for storage; and
    monitoring the execution status of the data synchronization task, and verifying the consistency of the data stored in the database of the preset primary cluster against the data stored in the disaster recovery database.
  6. The data disaster recovery method according to claim 5, wherein the step of pulling the to-be-synchronized metadata into the disaster recovery database for storage comprises:
    acquiring a first storage path of the to-be-synchronized metadata in the database of the preset primary cluster;
    determining a second storage path in the disaster recovery database corresponding to the first storage path; and
    storing the to-be-synchronized metadata in the disaster recovery database according to the second storage path.
  7. The data disaster recovery method according to any one of claims 1 to 6, wherein the synchronization state comprises synchronized and unsynchronized, and the step of determining, according to the synchronization state, the target node to be re-executed among the task nodes comprises:
    if the synchronization state of the task parameters of a first task node among the task nodes is unsynchronized, determining that the first task node is the target node to be re-executed; and/or
    if the parent node of the first task node is a node to be re-executed, determining that the first task node is the target node.
  8. A data disaster recovery system, wherein the data disaster recovery system comprises:
    a connection module, configured to establish a communication connection with a disaster recovery database of a preset primary cluster;
    a workflow reading module, configured to read, through the communication connection, a workflow executed by the preset primary cluster;
    an acquisition module, configured to acquire task parameters of each task node in the workflow according to a preset relationship chain model, wherein the relationship chain model is constructed based on a lineage relationship between data and data processing tasks; and
    a recovery module, configured to detect a synchronization state of the task parameters, determine, according to the synchronization state, a target node to be re-executed among the task nodes, and trigger a disaster recovery mechanism to execute the target node.
  9. A terminal device, wherein the terminal device comprises: a memory, a processor, and a data disaster recovery program stored on the memory and executable on the processor, wherein the data disaster recovery program, when executed by the processor, implements the steps of the data disaster recovery method according to any one of claims 1 to 7.
  10. A computer storage medium, wherein a data disaster recovery program is stored on the computer storage medium, and the data disaster recovery program, when executed by a processor, implements the steps of the data disaster recovery method according to any one of claims 1 to 7.
PCT/CN2021/132314 2021-07-30 2021-11-23 Disaster recovery method and system for data, terminal device and computer storage medium WO2023005075A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110874019.9 2021-07-30
CN202110874019.9A CN113590386B (en) 2021-07-30 2021-07-30 Disaster recovery method, system, terminal device and computer storage medium for data

Publications (1)

Publication Number Publication Date
WO2023005075A1 true WO2023005075A1 (en) 2023-02-02

Family

ID=78252890

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/132314 WO2023005075A1 (en) 2021-07-30 2021-11-23 Disaster recovery method and system for data, terminal device and computer storage medium

Country Status (2)

Country Link
CN (1) CN113590386B (en)
WO (1) WO2023005075A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113590386B (en) * 2021-07-30 2023-03-03 深圳前海微众银行股份有限公司 Disaster recovery method, system, terminal device and computer storage medium for data
CN114584458B (en) * 2022-03-03 2023-06-06 平安科技(深圳)有限公司 Cluster disaster recovery management method, system, equipment and storage medium based on ETCD
CN114546731B (en) * 2022-03-09 2024-04-05 北京有生博大软件股份有限公司 Workflow data recovery method and data recovery system
CN115174364A (en) * 2022-06-30 2022-10-11 济南浪潮数据技术有限公司 Data recovery method, device and medium in disaster tolerance scene
CN117170983B (en) * 2023-11-02 2024-03-01 卓望数码技术(深圳)有限公司 Disaster recovery switching method, system, computer equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102411520A (en) * 2011-09-21 2012-04-11 电子科技大学 Data-unit-based disaster recovery method for seismic data
CN106776153A (en) * 2015-11-25 2017-05-31 华为技术有限公司 job control method and server
US20190220342A1 (en) * 2018-01-12 2019-07-18 International Business Machines Corporation Traffic and geography based cognitive disaster recovery
CN111026568A (en) * 2019-12-04 2020-04-17 深圳前海环融联易信息科技服务有限公司 Data and task relation construction method and device, computer equipment and storage medium
CN111858065A (en) * 2020-07-28 2020-10-30 中国平安财产保险股份有限公司 Data processing method, device, storage medium and device
CN112527484A (en) * 2020-12-17 2021-03-19 平安银行股份有限公司 Workflow breakpoint continuous running method and device, computer equipment and readable storage medium
CN113157491A (en) * 2021-04-01 2021-07-23 深圳依时货拉拉科技有限公司 Data backup method and device, communication equipment and storage medium
CN113590386A (en) * 2021-07-30 2021-11-02 深圳前海微众银行股份有限公司 Disaster recovery method, system, terminal device and computer storage medium for data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101414277B (en) * 2008-11-06 2010-06-09 清华大学 Need-based increment recovery disaster-tolerable system and method based on virtual machine
CN110196888B (en) * 2019-05-27 2024-05-10 深圳前海微众银行股份有限公司 Hadoop-based data updating method, device, system and medium
CN112463451B (en) * 2020-12-02 2024-01-26 中国工商银行股份有限公司 Buffer disaster recovery cluster switching method and soft load balancing cluster device

Also Published As

Publication number Publication date
CN113590386B (en) 2023-03-03
CN113590386A (en) 2021-11-02

Similar Documents

Publication Publication Date Title
WO2023005075A1 (en) Disaster recovery method and system for data, terminal device and computer storage medium
US11829360B2 (en) Database workload capture and replay
US10554771B2 (en) Parallelized replay of captured database workload
US10817501B1 (en) Systems and methods for using a reaction-based approach to managing shared state storage associated with a distributed database
US7702741B2 (en) Configuring or reconfiguring a multi-master information sharing environment
US8938421B2 (en) Method and a system for synchronizing data
US10621049B1 (en) Consistent backups based on local node clock
WO2021169272A1 (en) Database table changing method and apparatus, computer device, and storage medium
JP6266630B2 (en) Managing continuous queries with archived relations
WO2018233364A1 (en) Index updating method and system, and related device
WO2017063520A1 (en) Method and apparatus for operating database
EP3722973B1 (en) Data processing method and device for distributed database, storage medium, and electronic device
CN104657497A (en) Mass electricity information concurrent computation system and method based on distributed computation
CN115374102A (en) Data processing method and system
CN111177254B (en) Method and device for data synchronization between heterogeneous relational databases
US9489423B1 (en) Query data acquisition and analysis
CN112084206A (en) Database transaction request processing method, related device and storage medium
CN110908793A (en) Long-time task execution method, device, equipment and readable storage medium
US7899785B2 (en) Reconfiguring propagation streams in distributed information sharing
CN112685499A (en) Method, device and equipment for synchronizing process data of work service flow
WO2023082681A1 (en) Data processing method and apparatus based on batch-stream integration, computer device, and medium
WO2017157111A1 (en) Method, device and system for preventing memory data loss
WO2023109286A1 (en) Data synchronization method and apparatus
CN112818021B (en) Data request processing method, device, computer equipment and storage medium
CN114020368A (en) Information processing method and device based on state machine and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21951647

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE