CN117473019B - Data synchronization method, system, computer equipment and storage medium

Info

Publication number
CN117473019B
Authority
CN
China
Prior art keywords
data
synchronization
node
synchronous
data synchronization
Prior art date
Legal status
Active
Application number
CN202311785695.4A
Other languages
Chinese (zh)
Other versions
CN117473019A (en
Inventor
倪犀子
袁康
付大超
Current Assignee
Alibaba Cloud Computing Ltd
Original Assignee
Alibaba Cloud Computing Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Cloud Computing Ltd
Priority to CN202311785695.4A
Publication of CN117473019A
Application granted
Publication of CN117473019B
Legal status: Active
Anticipated expiration


Abstract

The present disclosure provides a data synchronization method, system, computer device, and storage medium, wherein the method includes: responding to a first synchronization operation of a data synchronization node, and acquiring first synchronization data in a source database; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database; storing the first synchronization data to a cloud server; the cloud server comprises a cloud storage space for storing data to be synchronized in the source database; and responding to a second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage.

Description

Data synchronization method, system, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a data synchronization method, system, computer device, and storage medium.
Background
The data synchronization system can synchronize data in the source database to the target database for storage. In the data synchronization system, the data in the source database can be cached into the local hard disk through the storage module in the data synchronization system, and then the data in the local hard disk is synchronized into the target database for storage. When a storage module in the data synchronization system fails, a large amount of data needs to be pulled from the source database again. Thus, when the data amount to be pulled is large, the recovery time of the data synchronization system is prolonged.
Disclosure of Invention
The embodiment of the disclosure at least provides a data synchronization method, a data synchronization system, computer equipment and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a data synchronization method, applied to a client, including:
responding to a first synchronization operation of a data synchronization node, and acquiring first synchronization data in a source database; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database;
storing the first synchronization data to a cloud server; the cloud server comprises a cloud storage space for storing data to be synchronized in the source database;
and responding to a second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage.
In an alternative embodiment, the synchronizing the first synchronization data in the cloud storage space to the target database for storage in response to the second synchronization operation of the data synchronization node includes:
responding to the second synchronization operation, and searching for the first synchronization data in a local storage space; the data stored in the local storage space is pre-synchronized data obtained from the cloud storage space in advance;
and synchronizing the searched first synchronization data to the target database for storage.
In an alternative embodiment, the method further comprises:
responding to a disaster recovery instruction of the data synchronization node, and acquiring second synchronization data from the cloud storage space; the second synchronous data are data which are not synchronized to the target database in the source database;
sending the second synchronous data to the restarted data synchronous node; the restarted data synchronization node is used for synchronizing the second synchronous data to the target database.
In an alternative embodiment, the method further comprises:
acquiring the running state of the data synchronization node;
sending the running state to the cloud server; the running state is used for controlling the task execution state of the data synchronization task executed by the data synchronization node.
In a second aspect, an embodiment of the present disclosure provides a data synchronization method, applied to a cloud server, including:
responding to a first synchronization request of a client to acquire first synchronization data sent by the client; the first synchronization data are data to be synchronized to a corresponding target database in a source database;
storing the first synchronization data in a cloud storage space;
and responding to a second synchronous request of the client, acquiring first synchronous data from the data stored in the cloud storage space, and returning the first synchronous data to the client.
In an alternative embodiment, the method further comprises:
acquiring the running state of a data synchronization node sent by the client; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database, and the running state is used for indicating whether the data synchronization node has faults or not;
and controlling the task execution state of the data synchronization task executed by the data synchronization node based on the running state.
In an alternative embodiment, the number of the data synchronization nodes is a plurality;
the controlling the task execution state of the data synchronization task executed by the data synchronization node based on the running state includes:
determining a multi-level synchronization node group based on a plurality of the data synchronization nodes; each hierarchy comprises at least one synchronous node group, and data synchronous nodes in the synchronous node groups under the same hierarchy are different;
determining a target running state of the data synchronization nodes in each synchronization node group;
and controlling the task execution state of the data synchronization task executed by the data synchronization node in the synchronization node group based on the target running state.
In an alternative embodiment, the controlling the task execution state of the data synchronization task performed by the data synchronization node in the synchronization node group based on the target running state includes:
if a first synchronization node group in which all data synchronization nodes have failed is determined among the synchronization node groups based on the target running state, determining a hierarchy label of the first synchronization node group; wherein the hierarchy label is used for indicating a service area of the first synchronization node group;
determining a second synchronization node group in a normal running state in the service area;
and scheduling the data synchronization task executed by the data synchronization node in the first synchronization node group to the second synchronization node group.
In an alternative embodiment, the controlling the task execution state of the data synchronization task executed by the data synchronization node in the synchronization node group based on the target running state includes:
if a third synchronization node group in which some data synchronization nodes have failed is determined among the synchronization node groups based on the target running state, determining a second data synchronization node in a normal running state in the third synchronization node group;
and scheduling the data synchronization task executed by the failed data synchronization node in the third synchronization node group to the second data synchronization node.
In an alternative embodiment, after storing the first synchronization data in the cloud storage space, the method further comprises:
determining third synchronization data whose storage time exceeds a preset time in the first synchronization data;
and moving the third synchronization data from the current storage area to the low-frequency storage area.
In a third aspect, an embodiment of the present disclosure provides a data synchronization system, including: a data synchronization node, a client, and a cloud server;
the client is used for responding to the first synchronous operation of the data synchronous node and acquiring first synchronous data in the source database; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database; responding to the second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage;
the cloud server is used for responding to the first synchronization request of the client and acquiring the first synchronization data sent by the client; the first synchronization data are data to be synchronized to a corresponding target database in a source database; storing the first synchronization data in the cloud storage space; and responding to a second synchronization request of the client, acquiring the first synchronization data from the data stored in the cloud storage space, and returning the first synchronization data to the client.
In a fourth aspect, embodiments of the present disclosure further provide a computer device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory in communication via the bus when the computer device is running, the machine-readable instructions when executed by the processor performing the steps of the first aspect, or any of the possible implementations of the first aspect.
In a fifth aspect, the presently disclosed embodiments also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the first aspect, or any of the possible implementations of the first aspect.
In the embodiment of the application, a client side responds to a first synchronous operation of a data synchronous node to acquire first synchronous data in a source database; then, storing the first synchronization data to a cloud server, wherein the cloud server comprises a cloud storage space for storing data to be synchronized in the source database; and responding to a second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage.
As can be seen from the above description, separation of computation and storage for the data synchronization node can be realized by having the data storage module store the first synchronization data to the cloud server. In a disaster recovery scenario, the data synchronization node can read the data to be recovered from the cloud server, and reading the recovery data from the cloud server can shorten the recovery time of the data, thereby meeting the real-time requirement of the data synchronization scenario.
The foregoing objects, features and advantages of the disclosure will be more readily apparent from the following detailed description of the preferred embodiments taken in conjunction with the accompanying drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for the embodiments are briefly described below; these drawings are incorporated in and constitute a part of the specification, illustrate embodiments consistent with the present disclosure, and together with the description serve to explain the technical solutions of the present disclosure. It is to be understood that the following drawings illustrate only certain embodiments of the present disclosure and are therefore not to be considered limiting of its scope, since a person of ordinary skill in the art may derive other related drawings from them without inventive effort.
FIG. 1 illustrates a flow chart of a data synchronization method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a data synchronization cluster according to an embodiment of the disclosure;
FIG. 3 illustrates a flow chart of another data synchronization method provided by an embodiment of the present disclosure;
FIG. 4 shows a schematic diagram of a data synchronization system provided by an embodiment of the present disclosure;
FIG. 5 illustrates a schematic distribution diagram of a data synchronization cluster provided by an embodiment of the present disclosure;
FIG. 6 illustrates a schematic diagram of a data slicing and data carousel provided by an embodiment of the present disclosure;
FIG. 7 illustrates a schematic diagram of a computer device provided by an embodiment of the present disclosure;
fig. 8 shows a schematic diagram of a computer-readable storage medium provided by an embodiment of the present disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, but not all embodiments. The components of the embodiments of the present disclosure, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be made by those skilled in the art based on the embodiments of this disclosure without making any inventive effort, are intended to be within the scope of this disclosure.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
In the related art, the data synchronization system may synchronize data in the source database to the target database for storage. A data synchronization system typically comprises the following modules: the device comprises a data analysis module, a data storage module and a data writing module. The data analysis module can pull and analyze the log of the source database, the analyzed data are delivered to the data storage module for caching, and the data writing module can subscribe the data in the data storage module and write the data into the target database in real time.
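As a rough illustration of this related-art pipeline, the following Python sketch wires the three modules around a local cache file; all class names, the record format, and the database interfaces are illustrative assumptions, not taken from the patent.

```python
import json

class DataParsingModule:
    """Pulls the log from the source database and parses it into records."""
    def __init__(self, source_db):
        self.source_db = source_db

    def pull_and_parse(self):
        for line in self.source_db.read_log():
            yield json.loads(line)

class DataStorageModule:
    """Caches parsed records on the local hard disk (compute coupled to storage)."""
    def __init__(self, cache_path):
        self.cache_path = cache_path

    def cache(self, record):
        with open(self.cache_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def subscribe(self):
        with open(self.cache_path, encoding="utf-8") as f:
            for line in f:
                yield json.loads(line)

class DataWriteModule:
    """Subscribes to the cached records and writes them into the target database."""
    def __init__(self, target_db):
        self.target_db = target_db

    def run(self, storage):
        for record in storage.subscribe():
            self.target_db.insert(record)
```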
In the data synchronization system, a data storage module caches a large amount of data which is analyzed based on a log pulled by a source database in a local hard disk; then, the data writing module synchronizes the data in the local hard disk to the target database for storage. At this point, the computing program of the data storage module and the data store are coupled together. If a disaster recovery scene needs to reconstruct a data storage module, a large amount of data needs to be pulled from a source database again, the recovery time of the operation is generally in the order of minutes to hours, and the specific recovery time is related to the data amount needing to be recovered. However, the data synchronization system needs that the data delay of the target database and the source database is in the second level, and the too long disaster recovery time can bring about a larger influence. That is, when the amount of data to be pulled is large, the recovery period of the data synchronization system will be prolonged.
In an alternative approach, a distributed file system (Hadoop Distributed File System, HDFS) may be deployed separately for the data storage module to address this; for example, the data storage module may write cache data to the HDFS through an HDFS API instead of to a local hard disk. In this way, the computation and storage of the data storage module can be separated while throughput is preserved; that is, when the data storage module needs to be rebuilt in scenarios such as disaster recovery, there is no need to pull a large amount of data from the source database again, and the remote HDFS is read directly to recover the data, which can greatly shorten the recovery time.
Although the above approach solves the problem of computational and storage coupling, it also introduces new problems:
1. The system modification cost is high. The data synchronization system needs to be modified: operations originally implemented directly with file reads and writes must be reimplemented against the HDFS API. The modification cost is high, and in specific scenarios such as random writes the performance loss is large.
2. The operation and maintenance cost is high. Compared with the original scheme of writing to the local hard disk, introducing HDFS greatly increases the operation and maintenance cost of the whole system.
3. The storage cost rises substantially. HDFS uses a three-replica storage scheme, so the stored data volume is three times that of the original single copy, and the storage cost rises threefold accordingly.
Based on the above study, the present disclosure provides a data synchronization method, system, computer device, and storage medium. In the embodiment of the application, a client side responds to a first synchronous operation of a data synchronous node to acquire first synchronous data in a source database; then, storing the first synchronization data to a cloud server, wherein the cloud server comprises a cloud storage space for storing data to be synchronized in the source database; and responding to a second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage.
In the related art, if an abnormality occurs in the data synchronization node, a large amount of data needs to be re-pulled from the source database during data recovery. In the technical solution of the present disclosure, the first synchronization data to be synchronized to the target database by the data synchronization node is stored in the cloud server, and the first synchronization data in the cloud server is synchronized to the target database. On this basis, if the data synchronization node becomes abnormal, the data to be recovered can be read from the cloud server during recovery; compared with re-pulling a large amount of data from the source database as in the related art, reading the recovery data from the cloud server can shorten the recovery time of the data, thereby meeting the real-time requirement of the data synchronization scenario.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
For the sake of understanding the present embodiment, first, a detailed description will be given of a data synchronization method disclosed in an embodiment of the present disclosure, where an execution body of the data synchronization method provided in the embodiment of the present disclosure is generally a computer device with a certain computing capability. In some possible implementations, the data synchronization method may be implemented by way of a processor invoking computer readable instructions stored in a memory.
The data synchronization method provided by the embodiment of the present disclosure is described below by taking an execution body as a client as an example.
Referring to fig. 1, a flowchart of a data synchronization method according to an embodiment of the disclosure is shown, where the method includes steps S101 to S105, where:
S101: responding to a first synchronization operation of a data synchronization node, and acquiring first synchronization data in a source database; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database.
Here, as shown in fig. 2, the data synchronization cluster includes a plurality of data synchronization nodes, and each data synchronization node includes a data parsing module, a data storage module, and a data writing module. The client is an application program which is preset in the data synchronization cluster and can communicate with the data synchronization node; wherein, a client can be set for each data synchronization node, and the client and the data synchronization node are deployed in the same equipment; alternatively, the client and the data synchronization node may also be deployed on different devices.
Here, the data storage module in the data synchronization node is capable of communicating with the client, wherein communication between the data storage module and the client may be via a POSIX interface. With this processing mode, the disk read/write path of the original system in the data synchronization cluster does not need to be modified, and the access cost is low.
A data parsing module in the data synchronization node pulls the log from the source database and parses the log to obtain the first synchronization data. The data parsing module may then send the first synchronization data to the data storage module in the data synchronization node. The data storage module requests the client to synchronously cache the first synchronization data; at this moment, the client detects the first synchronization operation of the data synchronization node. The client responds to the first synchronization operation and acquires the first synchronization data.
S103: storing the first synchronization data to a cloud server; the cloud server comprises a cloud storage space for storing data to be synchronized in the source database.
After the client acquires the first synchronization data, the client may store the first synchronization data to a cloud storage space of the cloud server.
Here, the cloud storage space may be divided in advance in the cloud server for the source database or the target database, and the cloud storage space may be marked in the cloud server with the identification of the source database or the identification of the target database. The cloud server may use object storage services provided by mature, high-performance cloud vendors; the type of the cloud server is not specifically limited in this application, as long as the scheme can be realized.
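As a small illustration of marking the cloud storage space with a database identifier, the sketch below derives a per-database object-key prefix; the key layout and names are assumptions for illustration only.

```python
def cloud_space_prefix(db_id: str) -> str:
    # One cloud storage space (for example, an object-storage prefix) per
    # database, marked with that database's identifier as described above.
    return f"sync-space/{db_id}"

def object_key(db_id: str, shard_id: str) -> str:
    return f"{cloud_space_prefix(db_id)}/{shard_id}"

print(object_key("source-db-001", "shard-000042"))
# prints: sync-space/source-db-001/shard-000042
```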
S105: and responding to a second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage.
The data writing module in the data synchronization node can request the first synchronization data to be written into the target database from the data storage module at regular time. After detecting the request, the data storage module requests the first synchronous data from the client, at this time, the client detects a second synchronous operation of the data synchronous node, and returns the first synchronous data in the cloud storage space to the data storage module. The data storage module sends the first synchronous data to the data writing module, and the data writing module writes the first synchronous data into the target database.
In the embodiment of the application, a client side responds to a first synchronous operation of a data synchronous node to acquire first synchronous data in a source database; then, storing the first synchronization data to a cloud server, wherein the cloud server comprises a cloud storage space for storing data to be synchronized in the source database; and responding to a second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage.
In the related art, if an abnormality occurs in the data synchronization node, a large amount of data needs to be re-pulled from the source database during data recovery. In the technical solution of the present disclosure, the first synchronization data to be synchronized to the target database by the data synchronization node is stored in the cloud server, and the first synchronization data in the cloud server is synchronized to the target database. On this basis, if the data synchronization node becomes abnormal, the data to be recovered can be read from the cloud server during recovery; compared with re-pulling a large amount of data from the source database as in the related art, reading the recovery data from the cloud server can shorten the recovery time of the data, thereby meeting the real-time requirement of the data synchronization scenario.
In an optional embodiment, the step S105, in response to the second synchronization operation of the data synchronization node, synchronizes the first synchronization data in the cloud storage space to the target database for storage, and specifically includes the following steps:
step S11: responding to the second synchronization operation, and searching for the first synchronization data in a local storage space; the data stored in the local storage space is pre-synchronized data obtained from the cloud storage space in advance;
step S12: and synchronizing the searched first synchronous data to the target database for storage.
As shown in fig. 2, in the embodiment of the present application, a local storage space, for example, a storage space in a local hard disk, is set in advance for the data synchronization node. The local storage space and the data synchronization node can be deployed on the same or different devices; alternatively, the local storage space may be deployed on the same or different devices as the client. The client can write data in the local storage space and read data from the local storage space.
In the embodiment of the present application, before detecting the second synchronization operation of the data synchronization node, the client may store, in advance, the first synchronization data to be written into the target database at the current time in the local storage space. For example, at time t-1, the first synchronization data to be written into the target database at time t may be cached in the local storage space. At time t, if a second synchronization operation of the data synchronization node is detected, the first synchronization data may be looked up in the local storage space. If the first synchronization data is found, it is fed back to the data storage module. If the first synchronization data is not found, it is fetched from the cloud server.
Here, after the first synchronization data is fed back to the data storage module, the first synchronization data may be deleted in the local storage space, and the first synchronization data to be written to the target database at a future time may be cached in the local storage space.
By the processing mode, the synchronization efficiency of data synchronization to the target database can be improved, the data synchronization speed is further increased, and the data synchronization scene with higher real-time requirements is met.
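A minimal sketch of this look-up order, assuming dict-like stores for the cloud storage space and the local cache; the class name and the pre-fetch policy are illustrative assumptions.

```python
class SyncClientCache:
    """Local storage space in front of the remote cloud storage space."""
    def __init__(self, cloud_space):
        self.cloud_space = cloud_space   # remote, dict-like
        self.local_space = {}            # local cache (e.g. a cache disk)

    def prefetch(self, key):
        """At time t-1, pre-pull the data expected to be written at time t."""
        if key in self.cloud_space:
            self.local_space[key] = self.cloud_space[key]

    def read(self, key):
        """Second synchronization operation: local first, cloud as fallback."""
        if key in self.local_space:
            # Delete from the local space once fed back to the storage module.
            return self.local_space.pop(key)
        return self.cloud_space[key]
```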
In an alternative embodiment, the method further comprises the steps of:
responding to a disaster recovery instruction of the data synchronization node, and acquiring second synchronization data from the cloud storage space; the second synchronous data are data which are not synchronized to the target database in the source database;
sending the second synchronous data to the restarted data synchronous node; the restarted data synchronization node is used for synchronizing the second synchronous data to the target database.
In the embodiment of the application, if the data storage module in the data synchronization node fails, the data storage module can be restarted on another device. The client may then determine which data in the source database has not yet been synchronized to the target database. Here, the second synchronization data may be determined by comparing the data in the source database and the target database, according to the comparison result.
After determining the second synchronization data, the client may synchronize the data to the target database through a data storage module restarted in the data synchronization node.
After the data storage module is restarted, second synchronization data that is not synchronized to the target database may be requested from the client. The client may then send the second synchronization data to a data storage module that is restarted in the data synchronization node, such that the data storage module forwards the second synchronization data to the data write module.
Through the processing mode, the data storage module can read the data to be recovered from the cloud server in the disaster recovery scene, and the recovery time of the data can be shortened through the mode of reading the recovered data from the cloud server, so that the synchronous real-time requirement of the data synchronous scene is met.
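Under the same dict-like assumptions, the recovery path can be sketched as follows: records persisted in the cloud storage space but absent from the target database are treated as the second synchronization data and forwarded for writing. The helper names are hypothetical.

```python
def recover_unsynchronized(cloud_space, target_db, forward):
    """cloud_space/target_db: dict-like stores keyed by record id. Records
    persisted in the cloud space but absent from the target database are the
    second synchronization data; forward() hands each one to the restarted
    data storage module for the data writing module to consume."""
    second_sync_data = {k: v for k, v in cloud_space.items() if k not in target_db}
    for key in sorted(second_sync_data):
        forward(key, second_sync_data[key])
    return len(second_sync_data)
```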
In an alternative embodiment, the method further comprises the steps of:
acquiring the running state of the data synchronization node;
sending the running state to the cloud server; the running state is used for controlling the task execution state of the data synchronization task executed by the data synchronization node.
In this embodiment of the present application, the client may further obtain the running state of the corresponding data synchronization node, and through the running state it may be determined whether the data synchronization node is in a normal running state or an abnormal running state.
For example, a heartbeat connection between the data synchronization node and the client may be established, and in a normal operation state of the data synchronization node, the data synchronization node may report a heartbeat signal to the client, so as to obtain the operation state of the data synchronization node according to the heartbeat signal.
If the data synchronization node normally sends a heartbeat signal to the client, the running state of the data synchronization node can be determined to be a normal running state. If the heartbeat signal sent by the data synchronization node is abnormal, for example, the heartbeat signal is stopped from being sent to the client, the running state of the data synchronization node is determined to be an abnormal running state.
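A tiny sketch of this heartbeat-based state determination; the timeout value is an assumption, since the patent does not fix a reporting interval.

```python
import time

HEARTBEAT_TIMEOUT_S = 15.0  # assumed threshold; the patent fixes no interval

def running_state(last_heartbeat_ts, now=None):
    """'normal' while heartbeats keep arriving, 'abnormal' once they stop."""
    now = time.time() if now is None else now
    if now - last_heartbeat_ts <= HEARTBEAT_TIMEOUT_S:
        return "normal"
    return "abnormal"
```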
After acquiring the running state of the data synchronization node, the client may send the running state to the cloud server. The cloud server may control a task execution state of the data synchronization task executed by the data synchronization node based on the running state.
For example, in the case that the cloud server determines that the running state is the normal running state, the data synchronization node may continue to perform the data synchronization task, that is, the task execution state is to continue to perform. For another example, if the cloud server determines that the running state is an abnormal running state, the cloud server may schedule the data synchronization task executed by the data synchronization node to other data synchronization nodes.
By means of the mode of acquiring the running state of the data synchronization node, normal execution of the data synchronization task can be guaranteed, and therefore normal running of the data synchronization system is guaranteed.
The following describes a data synchronization method provided by the embodiment of the present disclosure by taking an execution body as a cloud server as an example.
Referring to fig. 3, a flowchart of a data synchronization method according to an embodiment of the disclosure is shown, where the method includes steps S301 to S305, in which:
step S301: responding to a first synchronization request of a client to acquire first synchronization data sent by the client; the first synchronization data are data to be synchronized to a corresponding target database in a source database.
Step S303: and storing the first synchronous data in a cloud storage space.
Here, as shown in fig. 2, the data synchronization cluster includes a plurality of data synchronization nodes, and each data synchronization node includes a data parsing module, a data storage module, and a data writing module. The client is an application program which is preset in the data synchronization cluster and can communicate with the data synchronization node; wherein, a client can be set for each data synchronization node, and the client and the data synchronization node are deployed in the same equipment; alternatively, the client and the data synchronization node may also be deployed on different devices.
Here, the data storage module in the data synchronization node is capable of communicating with the client, wherein communication between the data storage module and the client may be via a POSIX interface. With this processing mode, the disk read/write path of the original system in the data synchronization cluster does not need to be modified, and the access cost is low.
A data parsing module in the data synchronization node pulls the log from the source database and parses the log to obtain the first synchronization data. The data parsing module may then send the first synchronization data to the data storage module in the data synchronization node. The data storage module requests the client to synchronously cache the first synchronization data; at this moment, the client detects the first synchronization operation of the data synchronization node. The client responds to the first synchronization operation and acquires the first synchronization data.
Thereafter, the client may send a first synchronization request to the cloud server. After the cloud server acquires the first synchronization request, the cloud server may acquire first synchronization data sent by the client, and store the first synchronization data in a cloud storage space.
Step S305: and responding to a second synchronous request of the client, acquiring first synchronous data from the data stored in the cloud storage space, and returning the first synchronous data to the client.
The data writing module in the data synchronization node can request the first synchronization data to be written into the target database from the data storage module at regular time. The data storage module requests the first synchronization data from the client after detecting the request, and at this time, the client detects a second synchronization operation of the data synchronization node and sends a second synchronization request to the cloud server.
After the cloud server acquires the second synchronization request, the cloud server determines first synchronization data in the cloud storage space and returns the first synchronization data to the client. At this time, the client may return the first synchronization data to the data storage module. The data storage module sends the first synchronous data to the data writing module, and the data writing module writes the first synchronous data into the target database.
As can be seen from the above description, separation of computation and storage for the data synchronization node can be realized by having the data storage module store the first synchronization data to the cloud server. In a disaster recovery scenario, the data synchronization node can read the data to be recovered from the cloud server, and reading the recovery data from the cloud server can shorten the recovery time of the data, thereby meeting the real-time requirement of the data synchronization scenario.
In an alternative embodiment, the method further comprises the steps of:
acquiring the running state of a data synchronization node sent by the client; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database, and the running state is used for indicating whether the data synchronization node has faults or not;
and controlling the task execution state of the data synchronization task executed by the data synchronization node based on the running state.
In this embodiment of the present application, the client may further obtain the running state of the corresponding data synchronization node, and through the running state it may be determined whether the data synchronization node is in a normal running state or an abnormal running state.
For example, a heartbeat connection between the data synchronization node and the client may be established, and in a normal operation state of the data synchronization node, the data synchronization node may report a heartbeat signal to the client, so as to obtain the operation state of the data synchronization node according to the heartbeat signal.
If the data synchronization node normally sends a heartbeat signal to the client, the running state of the data synchronization node can be determined to be a normal running state. If the heartbeat signal sent by the data synchronization node is abnormal, for example, the heartbeat signal is stopped from being sent to the client, the running state of the data synchronization node is determined to be an abnormal running state.
After acquiring the running state of the data synchronization node, the client may send the running state to the cloud server. The cloud server may control a task execution state of the data synchronization task executed by the data synchronization node based on the running state.
For example, in the case that the cloud server determines that the running state is the normal running state, the data synchronization node may continue to perform the data synchronization task, that is, the task execution state is to continue to perform. For another example, if the cloud server determines that the running state is an abnormal running state, the cloud server may schedule the data synchronization task executed by the data synchronization node to other data synchronization nodes.
By means of the mode of acquiring the running state of the data synchronization node, normal execution of the data synchronization task can be guaranteed, and therefore normal running of the data synchronization system is guaranteed.
In an optional embodiment, in the case that the number of the data synchronization nodes is plural, the step of controlling the task execution state of the data synchronization task executed by the data synchronization node based on the operation state specifically includes the steps of:
determining a multi-level synchronization node group based on a plurality of the data synchronization nodes; each hierarchy comprises at least one synchronous node group, and data synchronous nodes in the synchronous node groups under the same hierarchy are different;
determining a target running state of the data synchronization nodes in each synchronization node group;
and controlling the task execution state of the data synchronization task executed by the data synchronization node in the synchronization node group based on the target running state.
Here, in the case where the number of data synchronization nodes is plural, the data synchronization nodes may be hierarchically grouped, thereby obtaining multi-level synchronization node groups. The data synchronization nodes may be grouped according to a classification label, where the classification label includes a plurality of sub-labels with a hierarchical relationship; for example, the sub-labels may be: region, available area, cluster. A region may comprise a plurality of available areas, an available area may comprise a plurality of clusters, and a cluster may comprise a plurality of data synchronization nodes.
For example, multiple data synchronization nodes may be grouped by region to obtain region 1 and region 2; then, region 1 is divided into available areas, for example into available area 11 and available area 12, and region 2 is divided into available areas, for example into available area 21 and available area 22; next, available area 11 is divided into clusters 111 and 112, available area 12 is divided into clusters 121 and 122, available area 21 is divided into clusters 211 and 212, and available area 22 is divided into clusters 221 and 222.
Here, a cluster may be understood as a data synchronization cluster described in the embodiments of the present application.
Here, region 1 and region 2 are two synchronization node groups at the same level, and available area 11 and available area 12 are two synchronization node groups at the same level.
Here, a region, an available area, and a cluster may each be understood as a synchronization node group. The clusters are contained within the available areas, and the available areas are contained within the regions. At this time, multi-level synchronization node groups may be determined based on all the synchronization node groups.
Then, the target running state of the data synchronization node in the synchronization node group can be determined; and further controlling the task execution state of the data synchronization task executed by the data synchronization node in the synchronization node group based on the target running state.
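To make the grouping concrete, here is a sketch of the multi-level synchronization node groups using the example numbering above; the nested-dictionary layout and node names are assumptions.

```python
# region -> available area -> cluster -> data synchronization nodes
topology = {
    "region1": {
        "aa11": {"cluster111": ["node-1", "node-2"], "cluster112": ["node-3"]},
        "aa12": {"cluster121": ["node-4"], "cluster122": ["node-5"]},
    },
    "region2": {
        "aa21": {"cluster211": ["node-6"], "cluster212": ["node-7"]},
        "aa22": {"cluster221": ["node-8"], "cluster222": ["node-9"]},
    },
}

def synchronization_node_groups():
    """Yield (hierarchy_label, nodes) for every group at every level."""
    for region, areas in topology.items():
        region_nodes = []
        for area, clusters in areas.items():
            area_nodes = []
            for cluster, nodes in clusters.items():
                yield f"{region}/{area}/{cluster}", list(nodes)
                area_nodes.extend(nodes)
            yield f"{region}/{area}", area_nodes
            region_nodes.extend(area_nodes)
        yield region, region_nodes
```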
In an alternative embodiment, the step of controlling the task execution state of the data synchronization task executed by the data synchronization node in the synchronization node group based on the target running state specifically includes the following steps:
if a first synchronization node group in which all data synchronization nodes have failed is determined among the synchronization node groups based on the target running state, determining a hierarchy label of the first synchronization node group; wherein the hierarchy label is used for indicating a service area of the first synchronization node group;
determining a second synchronization node group in a normal running state in the service area;
and scheduling the data synchronization task executed by the data synchronization node in the first synchronization node group to the second synchronization node group.
In the embodiment of the present application, after the cloud server obtains the target operation state of each data synchronization node in the synchronization node group, it may determine whether all the data synchronization nodes in the synchronization node group have failed based on the target operation state.
If a first synchronization node group in which all data synchronization nodes have failed is determined among the synchronization node groups based on the target running state, the hierarchy label of the first synchronization node group is then determined; the hierarchy label may be understood as a sub-label in the classification label described in the above embodiment. Through the hierarchy label, the service area of the first synchronization node group may be indicated; for example, the service area is "Beijing region", or "available area A in the Beijing region".
After determining the service area, a second set of synchronization nodes in a normal operating state in the service area may be determined. For example, if the service area is an "available area", a second synchronization node group in a normal operation state may be determined from the synchronization node groups corresponding to the available area; and then, scheduling the data synchronization task executed by the data synchronization node in the first synchronization node group to the second synchronization node group.
In an alternative embodiment, the step of controlling the task execution state of the data synchronization task executed by the data synchronization node in the synchronization node group based on the target running state specifically includes the following steps:
if a third synchronization node group in which some data synchronization nodes have failed is determined among the synchronization node groups based on the target running state, determining a second data synchronization node in a normal running state in the third synchronization node group;
and scheduling the data synchronization task executed by the failed data synchronization node in the third synchronization node group to the second data synchronization node.
In the embodiment of the present application, after the cloud server obtains the target operation state of each data synchronization node in the synchronization node group, it may determine whether all the data synchronization nodes in the synchronization node group have failed based on the target operation state.
If it is determined, based on the target running state, that some of the data synchronization nodes in a synchronization node group have failed, that group is the third synchronization node group. For example, if cluster 111 in available area 11 fails while cluster 112 is in a normal running state, some of the data synchronization nodes in the node group corresponding to available area 11 have failed; at this time, a second data synchronization node in a normal running state in available area 11, for example cluster 112, may be determined. Thereafter, the data synchronization task performed by the failed data synchronization node in the third synchronization node group may be scheduled to the second data synchronization node. For example, data synchronization tasks performed by cluster 111 are scheduled to cluster 112.
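The two scheduling branches described above, a fully failed first synchronization node group and a partially failed third synchronization node group, can be sketched together as follows; the group and task structures are assumptions for illustration.

```python
def reschedule(group, sibling_groups):
    """Assumed shapes: group = {'name': str, 'nodes': {node_id: {'state': str,
    'tasks': [str]}}}; sibling_groups = groups in the same service area.
    Returns (task, destination) pairs describing where each task moves."""
    nodes = group["nodes"]
    failed = [n for n, v in nodes.items() if v["state"] == "abnormal"]
    healthy = [n for n, v in nodes.items() if v["state"] == "normal"]
    moves = []
    if failed and not healthy:
        # First synchronization node group: every node failed. Pick a second
        # group in a normal running state within the same service area.
        target = next((g["name"] for g in sibling_groups
                       if any(v["state"] == "normal" for v in g["nodes"].values())),
                      None)
        if target is not None:
            moves = [(task, target) for n in failed for task in nodes[n]["tasks"]]
    elif failed:
        # Third synchronization node group: partial failure. Move each failed
        # node's tasks to surviving nodes in the same group, round-robin.
        for i, n in enumerate(failed):
            dest = healthy[i % len(healthy)]
            moves.extend((task, dest) for task in nodes[n]["tasks"])
    return moves
```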
In the above embodiment, a data synchronization system generally uses a plurality of data synchronization nodes to perform the task of synchronizing data in the source database to the target database. Therefore, by processing the running states of the data synchronization nodes along the dimension of the synchronization node group, abnormally executed data synchronization tasks can be accurately located, ensuring the normal execution of the data synchronization tasks and thus the normal running of the data synchronization system.
In the embodiment of the present application, capacity expansion and capacity reduction may also be performed on each synchronization node group according to the resource utilization rate of the synchronization node group.
In an alternative embodiment, after storing the first synchronization data in the cloud storage space, the method further comprises the steps of:
determining third synchronous data with storage time exceeding preset time in the first synchronous data;
and moving the third synchronous data from the current storage area to the low-frequency storage area.
In the embodiment of the application, in the data synchronization system, the first synchronization data written into the cloud storage space may be sharded according to the data writing time, with each data shard corresponding to one data writing time. A data writing time that no longer satisfies the retention requirement can then be identified, that is, one for which the storage time between the current time and the data writing time exceeds the preset time. The data shard corresponding to that data writing time is determined, the data in that shard is taken as the third synchronization data, and the third synchronization data is then moved from the current storage area to the low-frequency storage area.
With this processing mode, the storage cost can be further compressed; the cost of storing historical data is expected to be reduced by about 30%. If the historical data needs to be traced back, the data shards in the low-frequency storage are read directly.
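A sketch of this time-based tiering, assuming shards keyed by write time and a 7-day preset duration (the patent does not fix the threshold):

```python
import time

PRESET_SECONDS = 7 * 24 * 3600  # assumed preset duration; not fixed by the patent

def move_stale_shards(shards, move_to_low_frequency, now=None):
    """shards: {write_ts: shard_key}. Any shard whose storage time exceeds the
    preset duration is moved from the current area to low-frequency storage."""
    now = time.time() if now is None else now
    for write_ts in [ts for ts in shards if now - ts > PRESET_SECONDS]:
        move_to_low_frequency(shards.pop(write_ts))
```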
Referring to fig. 4, a schematic structural diagram of a data synchronization system according to an embodiment of the disclosure includes: a data synchronization node 41, a client 42 and a cloud server 43.
A client 42, configured to obtain first synchronization data in the source database in response to a first synchronization operation of the data synchronization node; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database; and responding to a second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage.
The cloud server 43 is configured to obtain first synchronization data sent by the client in response to the first synchronization request of the client; the first synchronization data are data to be synchronized to a corresponding target database in a source database; and responding to a second synchronization request of the client, acquiring first synchronization data from the data stored in the cloud storage space, and returning the first synchronization data to the client.
Here, as shown in fig. 2, the data synchronization cluster includes a plurality of data synchronization nodes, and each data synchronization node includes a data parsing module, a data storage module, and a data writing module. The client is an application program which is preset in the data synchronization cluster and can communicate with the data synchronization node; wherein, a client can be set for each data synchronization node, and the client and the data synchronization node are deployed in the same equipment; alternatively, the client and the data synchronization node may also be deployed on different devices.
Here, the data storage module in the data synchronization node is capable of communicating with the client, wherein communication between the data storage module and the client may be via a POSIX interface. With this processing mode, the disk read/write path of the original system in the data synchronization cluster does not need to be modified, and the access cost is low.
A data parsing module in the data synchronization node pulls the log from the source database and parses the log to obtain the first synchronization data. The data parsing module may then send the first synchronization data to the data storage module in the data synchronization node. The data storage module requests the client to synchronously cache the first synchronization data; at this moment, the client detects the first synchronization operation of the data synchronization node. The client responds to the first synchronization operation and acquires the first synchronization data.
Thereafter, the client may send a first synchronization request to the cloud server. After the cloud server acquires the first synchronization request, the cloud server may acquire first synchronization data sent by the client, and store the first synchronization data in a cloud storage space.
Here, the cloud storage space may be divided in advance in the cloud server for the source database or the target database, and the cloud storage space may be marked in the cloud server with the identification of the source database or the identification of the target database. The cloud server may use object storage services provided by mature, high-performance cloud vendors; the type of the cloud server is not specifically limited in this application, as long as the scheme can be realized.
The data writing module in the data synchronization node can request the first synchronization data to be written into the target database from the data storage module at regular time. The data storage module requests the first synchronization data from the client after detecting the request, and at this time, the client detects a second synchronization operation of the data synchronization node and sends a second synchronization request to the cloud server.
After the cloud server acquires the second synchronization request, the cloud server determines first synchronization data in the cloud storage space and returns the first synchronization data to the client. At this time, the client may return the first synchronization data to the data storage module. The data storage module sends the first synchronous data to the data writing module, and the data writing module writes the first synchronous data into the target database.
As can be seen from the above description, separation of computation and storage for the data synchronization node can be realized by having the data storage module store the first synchronization data to the cloud server. In a disaster recovery scenario, the data synchronization node can read the data to be recovered from the cloud server, and reading the recovery data from the cloud server can shorten the recovery time of the data, thereby meeting the real-time requirement of the data synchronization scenario.
The above process is described below in connection with specific embodiments.
As can be seen from the above description, the data synchronization system includes a plurality of data synchronization clusters, as shown in fig. 2, where the data synchronization clusters include a plurality of data synchronization nodes, and each data synchronization node includes a data parsing module, a data storage module, and a data writing module. The client is an application program which is preset in the data synchronization cluster and can communicate with the data synchronization node; wherein, a client can be set for each data synchronization node, and the client and the data synchronization node are deployed in the same equipment; alternatively, the client and the data synchronization node may also be deployed on different devices.
As shown in fig. 2, the data synchronization system further includes a metadata engine of the storage system and a cloud storage space, where the cloud storage space may use object storage services provided by a cloud vendor, which is not specifically limited in this application. The client provides a POSIX access mode, which is the same as the interface for accessing the local hard disk. The existing data synchronization system can directly use the POSIX interface to access the object storage service, so as to reduce the modification cost.
Here, in a write scenario, the data storage module writes to the client through the POSIX interface, and the client persists the data into the remote cloud storage space, which gives the data storage module the ability to separate computation from storage. In a read scenario, the data storage module reads from the client through the POSIX interface; the client can use a local hard disk as a cache disk and implement basic caching functions such as read-ahead. When data hits the cache disk, read performance matches that of reading the local hard disk.
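The read path can be illustrated with a hedged sketch of a read-through cache with simple sequential read-ahead. The block size, the read-ahead depth, and the cloud store's get_range interface are all assumptions made for illustration.

```python
import os

CHUNK = 4 * 1024 * 1024      # assumed cache block size (4 MiB)
READ_AHEAD = 2               # assumed number of blocks prefetched on a miss

class CachedReader:
    """Serves reads from a local cache disk; falls back to cloud storage."""

    def __init__(self, cloud_store, cache_dir):
        self.cloud = cloud_store  # assumed to offer get_range(key, offset, size)
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, key, block):
        return os.path.join(self.cache_dir, f"{key}.{block}")

    def _fetch(self, key, block):
        # Pull one block from the remote cloud storage space and cache it.
        data = self.cloud.get_range(key, block * CHUNK, CHUNK)
        with open(self._path(key, block), "wb") as f:
            f.write(data)
        return data

    def read_block(self, key, block):
        path = self._path(key, block)
        if os.path.exists(path):                  # cache hit: local-disk speed
            with open(path, "rb") as f:
                return f.read()
        data = self._fetch(key, block)            # cache miss: read from cloud
        for ahead in range(1, READ_AHEAD + 1):    # read-ahead of later blocks
            if not os.path.exists(self._path(key, block + ahead)):
                self._fetch(key, block + ahead)
        return data
```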
When the data storage module fails, the stored data remains in the remote cloud storage space, so single-machine disaster recovery does not require rebuilding the data storage module: it can simply be restarted on a healthy device, after which it reads the persisted data directly from the cloud storage space. This processing shortens data recovery time, down to the second level, thereby improving data synchronization efficiency. It also reduces storage cost.
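The restart path is then trivial, which is the point of the compute-storage separation; a minimal sketch under the same assumed interfaces as above:

```python
def restart_data_storage_module(node_id, cloud_store):
    # After a single-machine failure, nothing is re-pulled from the source
    # database: the persisted synchronization data is read back directly
    # from the cloud storage space, so recovery can complete in seconds.
    return cloud_store.get(key=f"{node_id}/pending")
```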
Compared with a local hard disk failure, which affects only the data storage module, a compute-storage-separated data storage system has a larger fault radius: if the metadata engine or the cloud storage space fails, the entire data synchronization cluster becomes abnormal. A sound scheme for fault isolation, fault perception, and disaster recovery is therefore particularly important.
As shown in fig. 5, the data synchronization nodes may be deployed hierarchically, forming multi-level synchronization node groups. Here, the deployment scheme of the data synchronization nodes is described taking hierarchy labels of region, availability zone, and cluster as an example.
As shown in fig. 5, in this deployment scheme each region contains at least two availability zones, and each availability zone contains at least two data synchronization clusters. For example, data synchronization cluster 1 and data synchronization cluster 2 are two independent clusters in the same region and the same availability zone; data synchronization cluster 1 and data synchronization cluster 3 are two independent clusters in the same region but different availability zones; and data synchronization cluster 1 and data synchronization cluster 5 are two independent clusters in different regions and different availability zones.
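One plausible way to model this region / availability-zone / cluster hierarchy is sketched below; the dataclass names and the example topology are illustrative, not taken from the application.

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    name: str
    healthy: bool = True
    tasks: list = field(default_factory=list)      # data synchronization tasks

@dataclass
class AvailabilityZone:
    name: str
    clusters: list = field(default_factory=list)   # at least two per zone

@dataclass
class Region:
    name: str
    zones: list = field(default_factory=list)      # at least two per region

# Topology mirroring the examples above:
region_1 = Region("region-1", zones=[
    AvailabilityZone("az-a", clusters=[Cluster("sync-cluster-1"),
                                       Cluster("sync-cluster-2")]),
    AvailabilityZone("az-b", clusters=[Cluster("sync-cluster-3"),
                                       Cluster("sync-cluster-4")]),
])
```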
The client can send the running state of the data synchronization node to a data synchronization anomaly sensing module in the cloud server.
If the metadata engine or the cloud storage space fails, the data synchronization anomaly sensing module can detect this in real time and trigger single-cluster disaster recovery, scheduling all data synchronization tasks of the affected cluster to other healthy clusters in the same availability zone. For this purpose, a task scheduling module of the data synchronization system may be set up in the cloud server. For example, if data synchronization cluster 1 fails, the task scheduling module schedules all data synchronization tasks of data synchronization cluster 1 to other healthy clusters in the same availability zone.
If a single availability-zone-level fault occurs, the data synchronization anomaly sensing module can detect it in real time and trigger single-availability-zone disaster recovery, with the task scheduling module scheduling the data synchronization tasks of all clusters in that availability zone to a healthy availability zone in the same region. For example, if availability zone A fails, all data synchronization tasks of availability zone A are scheduled to other healthy availability zones in the same region.
If a region-level fault occurs, the data synchronization anomaly sensing module can detect it in real time and trigger region-level disaster recovery, with the task scheduling module scheduling the data synchronization tasks of all clusters in that region to other healthy regions. For example, if the Beijing region fails, the task scheduling module schedules all data synchronization tasks of the Beijing region to other healthy regions.
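The three cases share one escalation rule: move the failed scope's tasks to the nearest healthy peer, first a sibling cluster, then a sibling availability zone, then a sibling region. Below is a hedged sketch of the task scheduling module's logic, reusing the illustrative dataclasses from the previous sketch; it assumes at least one healthy peer always exists, which the at-least-two deployment rule above is meant to guarantee.

```python
import random

def failover_cluster(failed, zone):
    # Single-cluster disaster recovery: reschedule all tasks of the failed
    # cluster onto healthy clusters in the same availability zone.
    peers = [c for c in zone.clusters if c.healthy and c is not failed]
    for task in failed.tasks:
        random.choice(peers).tasks.append(task)
    failed.tasks.clear()

def failover_zone(failed_zone, region):
    # Single-availability-zone disaster recovery: reschedule the tasks of
    # every cluster in the failed zone to clusters in healthy zones of the
    # same region.
    peer_zones = [z for z in region.zones if z is not failed_zone]
    for cluster in failed_zone.clusters:
        for task in cluster.tasks:
            target_zone = random.choice(peer_zones)
            random.choice(target_zone.clusters).tasks.append(task)
        cluster.tasks.clear()

# Region-level disaster recovery applies the same pattern one level up:
# all tasks of the failed region move to clusters in another healthy region.
```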
In addition, the data synchronization anomaly sensing module can discover hotspot resources in real time, such as high CPU/IO on a single client, high load on the metadata engine, or high IO on the cloud storage space, and apply corresponding load balancing strategies.
Because the data synchronization method provided in this application may produce a large number of data synchronization clusters, a resource management module of the data synchronization system can be set up in the cloud server to automatically create and destroy data synchronization clusters. If the resource water level of a data synchronization cluster is high, the resource management module identifies this in real time and automatically expands the cluster; if resource utilization is low, it identifies this in real time and automatically contracts the cluster, keeping overall resource utilization stable.
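A minimal sketch of that water-level policy follows; the thresholds, step size, and node floor are invented for illustration, since the application does not specify concrete values.

```python
SCALE_UP_WATERLINE = 0.80     # assumed upper resource water level
SCALE_DOWN_WATERLINE = 0.30   # assumed lower resource water level
MIN_NODES = 2                 # assumed floor to keep the cluster available

def manage_capacity(cluster):
    # cluster.utilization is the aggregate resource water level in [0, 1];
    # cluster.size is the current number of data synchronization nodes.
    if cluster.utilization > SCALE_UP_WATERLINE:
        cluster.size += 1         # automatic expansion
    elif cluster.utilization < SCALE_DOWN_WATERLINE and cluster.size > MIN_NODES:
        cluster.size -= 1         # automatic contraction
```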
In an embodiment of this application, the data synchronization system can slice the first synchronization data written into the cloud storage space by data writing time, with each data slice corresponding to one data writing time. For example, as shown in fig. 6, the first synchronization data may be divided into data slice 1, data slice 2, and data slice 3.
A data writing time that no longer satisfies the storage requirement can then be identified, namely one for which the elapsed time between the current time and the data writing time exceeds a preset time. The data slice corresponding to that writing time is located, the data in that slice is determined to be the third synchronization data, and the third synchronization data is moved from the current storage area to the low-frequency storage area.
Here, a data rotation module of the data synchronization system may be set up in the cloud server in advance. The data rotation module automatically rotates written data slices according to their data writing times, switching them from standard storage to low-frequency storage and thereby compressing storage cost; the historical data storage cost is expected to fall to roughly 30% of the standard-storage cost. If historical data needs to be traced back, the data slices in low-frequency storage are read directly.
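The rotation rule reduces to a timestamp comparison per data slice. A sketch with an assumed retention threshold and an assumed storage.move interface, neither of which is specified by the application:

```python
import time

PRESET_SECONDS = 7 * 24 * 3600   # assumed "preset time"; not given in the text

def rotate_slices(slices, storage):
    """Move expired data slices (the third synchronization data) from the
    standard storage area to the low-frequency storage area."""
    now = time.time()
    for s in slices:
        if now - s["write_time"] > PRESET_SECONDS:
            storage.move(s["key"], src="standard", dst="low_frequency")
```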
In an embodiment of this application, the data synchronization system communicates with the cloud server through the POSIX interface, which reduces integration cost. In terms of operation and maintenance, using the cloud storage space improves stability and lowers operation and maintenance cost. In terms of storage, the system reduces storage cost.
Another embodiment of the present application provides a computer device. The computer device includes a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the program, the data synchronization method of any one of the foregoing embodiments is implemented.
As shown in fig. 7, the computer device 70 may include a processor 700, a memory 701, a bus 702, and a communication interface 703, where the processor 700, the communication interface 703, and the memory 701 are connected by the bus 702. The memory 701 stores a computer program executable on the processor 700; when the processor 700 executes the program, the methods provided by any of the embodiments described herein are performed.
The memory 701 may include a high-speed random access memory (RAM), and may further include a non-volatile memory, such as at least one magnetic disk memory. The communication connection between this system network element and at least one other network element is implemented via at least one communication interface 703, which may be wired or wireless; the Internet, a wide area network, a local area network, a metropolitan area network, and the like may be used.
Bus 702 may be an ISA bus, a PCI bus, an EISA bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. The memory 701 is configured to store a program; the processor 700 executes the program after receiving an execution instruction, and the method disclosed in any of the foregoing embodiments of the present application may be applied to, or implemented by, the processor 700.
The processor 700 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the methods described above may be completed by integrated logic circuits in hardware or by software instructions in the processor 700. The processor 700 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and can implement or execute the methods, steps, and logic blocks disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in the embodiments of this application may be embodied directly as being completed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 701; the processor 700 reads the information in the memory 701 and completes the steps of the above method in combination with its hardware.
The computer device provided by this embodiment shares the same inventive concept as the method provided by the embodiments of the present application, and has the same beneficial effects as the method it adopts, runs, or implements.
Another embodiment of the present application provides a computer-readable storage medium having stored thereon a computer program that is executed by a processor to implement the data synchronization method of any of the above embodiments.
Referring to fig. 8, a computer-readable storage medium is shown as an optical disc 20 on which a computer program (i.e., a program product) is stored; when executed by a processor, the computer program performs the method provided by any of the embodiments described above.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
Another embodiment of the present application provides a computer program product comprising a computer program that is executed by a processor to implement the data synchronization method of any of the above embodiments.
The computer-readable storage medium and the computer program product provided by the above embodiments share the same inventive concept as the methods provided by the embodiments of the present application, and have the same beneficial effects as the methods adopted, run, or implemented by the application program stored therein.
It should be noted that:
the term "module" is not intended to be limited to a particular physical form. Depending on the particular application, modules may be implemented as hardware, firmware, software, and/or combinations thereof. Furthermore, different modules may share common components or even be implemented by the same components. There may or may not be clear boundaries between different modules.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general-purpose devices may also be used with the examples herein, and the structure required to construct such devices is apparent from the description above. Moreover, this application is not directed to any particular programming language; the content described herein can be implemented in a variety of programming languages, and the descriptions above of specific languages are provided to disclose the embodiments of this application.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of the steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages that are not necessarily completed at the same moment but may be executed at different times; their execution order is not necessarily sequential, and they may be performed in turn or in alternation with other steps, or with at least a portion of the sub-steps or stages of other steps.
The foregoing examples merely represent embodiments of the present application, which are described in relative detail, but they are not to be construed as limiting the scope of this application. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, and all of these fall within the scope of protection of this application. Accordingly, the scope of protection of this application shall be subject to the appended claims.

Claims (13)

1. A data synchronization method, characterized in that the method is applied to a client, the client being an application program that is provided in a data synchronization cluster and can communicate with data synchronization nodes, the data synchronization cluster comprising a plurality of data synchronization nodes; the method comprises:
responding to a first synchronization operation of a data synchronization node, and acquiring first synchronization data in a source database; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database;
storing the first synchronization data to a cloud server; the cloud server comprises a cloud storage space for storing data to be synchronized in the source database;
and responding to a second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage.
2. The method of claim 1, wherein synchronizing the first synchronization data in the cloud storage space to the target database for storage in response to the second synchronization operation of the data synchronization node comprises:
responding to the second synchronous operation, and searching the first synchronous data in a local storage space; the data stored in the local storage space are presynchronized data which are obtained from the cloud storage space in advance;
And synchronizing the searched first synchronous data to the target database for storage.
3. The method according to claim 1, wherein the method further comprises:
responding to a disaster recovery instruction of the data synchronization node, and acquiring second synchronization data from the cloud storage space; the second synchronous data are data which are not synchronized to the target database in the source database;
sending the second synchronous data to the restarted data synchronous node; the restarted data synchronization node is used for synchronizing the second synchronous data to the target database.
4. The method according to claim 1, wherein the method further comprises:
acquiring the running state of the data synchronization node;
sending the running state to the cloud server; the running state is used for controlling the task execution state of the data synchronization task executed by the data synchronization node.
5. A data synchronization method, characterized in that the method is applied to a cloud server and comprises:
responding to a first synchronization request of a client to acquire first synchronization data sent by the client; the first synchronization data are data to be synchronized to a corresponding target database in a source database; the client is an application program which is arranged in a data synchronization cluster and can communicate with the data synchronization nodes, and the data synchronization cluster comprises a plurality of data synchronization nodes;
Storing the first synchronization data in a cloud storage space;
and responding to a second synchronous request of the client, acquiring first synchronous data from the data stored in the cloud storage space, and returning the first synchronous data to the client.
6. The method of claim 5, wherein the method further comprises:
acquiring the running state of a data synchronization node sent by the client; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database, and the running state is used for indicating whether the data synchronization node has faults or not;
and controlling the task execution state of the data synchronization task executed by the data synchronization node based on the running state.
7. The method of claim 6, wherein the number of data synchronization nodes is a plurality;
the controlling the task execution state of the data synchronization task executed by the data synchronization node based on the running state includes:
determining a multi-level synchronization node group based on a plurality of the data synchronization nodes; each hierarchy comprises at least one synchronous node group, and data synchronous nodes in the synchronous node groups under the same hierarchy are different;
Determining a target running state of the data synchronization nodes in each synchronization node group;
and controlling the task execution state of the data synchronization task executed by the data synchronization node in the synchronization node group based on the target running state.
8. The method of claim 7, wherein controlling the task execution state of the data synchronization task performed by the data synchronization node in the synchronization node group based on the target running state comprises:
if a first synchronization node group in which all data synchronization nodes have failed is determined among the synchronization node groups based on the target running state, determining a hierarchy label of the first synchronization node group; wherein the hierarchy label is used for indicating a service area of the first synchronization node group;
determining a second synchronous node group in a normal operation state in the service area;
and scheduling the data synchronization task executed by the data synchronization node in the first synchronization node group to the second synchronization node group.
9. The method of claim 7, wherein controlling the task execution state of the data synchronization task performed by the synchronization node group based on the target running state comprises:
if a third synchronization node group in which some data synchronization nodes have failed is determined among the synchronization node groups based on the target running state, determining a second data synchronization node in a normal running state in the third synchronization node group;
and scheduling the data synchronization task executed by the data synchronization node with the fault in the third synchronization node group to the second data synchronization node.
10. The method of claim 6, wherein after storing the first synchronization data in cloud storage space, the method further comprises:
determining third synchronous data with storage time exceeding preset time in the first synchronous data;
and moving the third synchronous data from the current storage area to the low-frequency storage area.
11. A data synchronization system, characterized by comprising: a data synchronization node, a client, and a cloud server; the client is an application program that is provided in a data synchronization cluster and can communicate with the data synchronization nodes, and the data synchronization cluster comprises a plurality of data synchronization nodes;
the client is used for responding to the first synchronous operation of the data synchronous node and acquiring first synchronous data in the source database; the data synchronization node is used for synchronizing first synchronization data from the source database to a corresponding target database; responding to the second synchronization operation of the data synchronization node, and synchronizing the first synchronization data in the cloud storage space to the target database for storage;
The cloud server is used for responding to the first synchronization request of the client and acquiring first synchronization data sent by the client; the first synchronization data are data to be synchronized to a corresponding target database in a source database; and responding to a second synchronization request of the client, acquiring first synchronization data from the data stored in the cloud storage space, and returning the first synchronization data to the client.
12. A computer device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory in communication via the bus when the computer device is running, the machine-readable instructions when executed by the processor performing the steps of the data synchronization method according to any one of claims 1 to 10.
13. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of the data synchronization method according to any of claims 1 to 10.
CN202311785695.4A 2023-12-25 2023-12-25 Data synchronization method, system, computer equipment and storage medium Active CN117473019B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311785695.4A CN117473019B (en) 2023-12-25 2023-12-25 Data synchronization method, system, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117473019A CN117473019A (en) 2024-01-30
CN117473019B (en) 2024-03-22

Family

ID=89634971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311785695.4A Active CN117473019B (en) 2023-12-25 2023-12-25 Data synchronization method, system, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117473019B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11269731B1 (en) * 2017-11-22 2022-03-08 Amazon Technologies, Inc. Continuous data protection
CN110147411A (en) * 2019-05-20 2019-08-20 平安科技(深圳)有限公司 Method of data synchronization, device, computer equipment and storage medium
CN112055064A (en) * 2020-08-26 2020-12-08 北京致医健康信息技术有限公司 Data synchronization method, device, equipment and storage medium
CN116016554A (en) * 2021-10-21 2023-04-25 上海宝信软件股份有限公司 Configurable cloud edge real-time database transmission system and method
CN115277727A (en) * 2022-06-30 2022-11-01 达闼机器人股份有限公司 Data disaster recovery method, system, device and storage medium
CN115587141A (en) * 2022-09-07 2023-01-10 阿里云计算有限公司 Database synchronization method and device
CN117061535A (en) * 2023-08-14 2023-11-14 深圳供电局有限公司 Multi-activity framework data synchronization method, device, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Ke Shang et al. R2-Based Hypervolume Contribution Approximation. IEEE Transactions on Evolutionary Computation, 2019, Vol. 24, No. 1, pp. 185-192. *
Chen Zhao. Research on Key Technologies of Secure Data Storage Based on Cloud Disaster Recovery. China Doctoral Dissertations Full-text Database, Information Science and Technology, 2013, Vol. 2013, No. 03, p. I138-8. *

Also Published As

Publication number Publication date
CN117473019A (en) 2024-01-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant