CN114090343A

CN114090343A - Cross-cluster copying system and method based on bucket granularity

Info

Publication number: CN114090343A
Application number: CN202210055993.7A
Authority: CN
Inventors: 李明强; 朱辉; 薛延波; 张涛; 赵鹏
Original assignee: Beijing Huapin Borui Network Technology Co Ltd
Current assignee: Beijing Huapin Borui Network Technology Co Ltd
Priority date: 2022-01-18
Filing date: 2022-01-18
Publication date: 2022-02-25
Anticipated expiration: 2042-01-18
Also published as: CN114090343B

Abstract

An embodiment of the invention discloses a cross-cluster replication system based on bucket granularity, which comprises: a plurality of clusters, with two groups of message queues allocated between every two clusters; a global configuration center, which determines a cross-cluster replication scheme among the plurality of clusters and monitors in real time whether an abnormality occurs during replication; a producer, which monitors the operation log on a production node of the primary cluster in real time, encapsulates bucket object information when cross-cluster replication is needed, and sends the bucket object information to a normal message queue; a consumer, which monitors the normal message queue and pulls the bucket object from the primary cluster to write it into the backup cluster; and an exception repair tool, which consumes the messages in the abnormal message queue when an abnormality occurs during replication. An embodiment of the invention also discloses a cross-cluster replication method based on bucket granularity. The invention provides multi-directional cross-cluster replication with the bucket as the granularity, which supports quasi-real-time synchronization to the greatest extent at the lowest system resource cost, and can automatically perform hot switching of services and data recovery when a cluster fails.

Description

Cross-cluster copying system and method based on bucket granularity

Technical Field

The invention relates to the technical field of distributed storage, in particular to a cross-cluster replication system and method based on bucket granularity.

Background

The object storage is deployed on an independent cluster, and in consideration of data security, the data may need to be synchronized to other clusters in real time as a backup. In the prior art, for cross-cluster replication, only unidirectional and bidirectional cross-cluster replication is supported, cross-cluster replication units cannot be reasonably selected, system resources are strained when replication granularity is too large, and quasi-real-time synchronization of data among a plurality of clusters cannot be realized so as to ensure data consistency. When a cluster fails in the cross-cluster replication process, user data is easily lost and the use is affected.

Disclosure of Invention

In order to solve the above problems, an object of the present invention is to provide a cross-cluster replication system and method, which support quasi real-time synchronization of multi-cluster data to the maximum extent with the minimum system resource cost by using multi-way cross-cluster replication with bucket as granularity, and when a cluster fails, a global configuration center can automatically perform hot-switch of services and data recovery.

Multi-directional cross-cluster replication based on bucket granularity allows the replication direction of each bucket to be configured flexibly and realizes quasi-real-time synchronization of data among the clusters in multiple directions.

The method reduces resource consumption, can efficiently perform cross-cluster replication, realizes multi-direction cross-cluster quasi-real-time synchronization, and can perform service hot switching and data recovery in case of failure.

The embodiment of the invention provides a cross-cluster replication system based on bucket granularity, which comprises:

a plurality of clusters, wherein two groups of message queues are allocated between every two clusters of the plurality of clusters, each group of message queues comprises a normal message queue and an abnormal message queue, the normal message queue is used for storing object messages generated while the two clusters replicate, and the abnormal message queue is used for storing object messages that fail to be replicated normally while the two clusters replicate;

a global configuration center, configured to determine a cross-cluster replication scheme among the plurality of clusters, wherein the cross-cluster replication scheme comprises a bucket to be replicated and a replication direction of the bucket to be replicated, the bucket to be replicated is a bucket of a producer to be replicated, the replication direction is the direction in which the bucket to be replicated is replicated from a primary cluster to a backup cluster of two clusters to be spanned, and the global configuration center is further configured to monitor in real time whether an abnormality occurs while the bucket to be replicated is replicated between the two clusters to be spanned;

a plurality of producers, each of which is configured to monitor in real time the operation log on a production node of the primary cluster, encapsulate bucket object information when cross-cluster replication is needed, and send the bucket object information to the normal message queue for replication from the primary cluster to the backup cluster;

a plurality of consumers, each of which monitors the normal message queue for replication from the primary cluster to the backup cluster, parses the messages in that queue to obtain the bucket object metadata information encapsulated by the producer to be replicated, and, according to the bucket object metadata information, pulls the bucket object from the primary cluster and writes it into the backup cluster, wherein the plurality of consumers monitor one normal message queue;

and an exception repair tool, configured to consume the messages in the abnormal message queue for replication from the primary cluster to the backup cluster when an abnormality occurs while the bucket to be replicated is replicated between the two clusters to be spanned.

As a further development of the invention, the two clusters comprise a first cluster and a second cluster, the two sets of message queues comprise a first set of message queues and a second set of message queues,

the first group of message queues comprises a first normal message queue and a first abnormal message queue, the first normal message queue is used for storing the bucket object information copied from the first cluster to the second cluster, the first abnormal message queue is used for storing the bucket object information which is not copied from the first cluster to the second cluster normally,

the second group of message queues comprise a second normal message queue and a second abnormal message queue, the second normal message queue is used for storing the bucket object information copied to the first cluster from the second cluster, and the second abnormal message queue is used for storing the bucket object information which cannot be copied to the first cluster from the second cluster normally.

As a further improvement of the present invention, when an abnormality occurs while the bucket to be replicated is replicated between the two clusters to be spanned, the bucket object information encapsulated by the producer is sent to the abnormal message queue for replication from the primary cluster to the backup cluster.

As a further improvement of the present invention, when an abnormality occurs while the consumer pulls an object from the primary cluster to the backup cluster, the abnormal bucket object information is sent to the abnormal message queue for replication from the primary cluster to the backup cluster.

The embodiment of the invention also provides a cross-cluster copying method based on bucket granularity, which comprises the following steps:

determining a cross-cluster replication scheme among a plurality of clusters, wherein the cross-cluster replication scheme comprises a bucket to be replicated and a replication direction of the bucket to be replicated, the bucket to be replicated is a bucket of a producer to be replicated, and the replication direction is the direction in which the bucket to be replicated is replicated from a primary cluster to a backup cluster of two clusters to be spanned;

the producer to be replicated monitors in real time the operation log on a production node of the primary cluster, encapsulates bucket object information when cross-cluster replication is needed, and sends the bucket object information to the normal message queue for replication from the primary cluster to the backup cluster;

each consumer monitors the normal message queue for replication from the primary cluster to the backup cluster, parses the messages in that queue to obtain the bucket object metadata information encapsulated by the producer to be replicated, and, according to the bucket object metadata information, pulls the bucket object from the primary cluster and writes it into the backup cluster, wherein a plurality of consumers monitor one normal message queue.

As a further improvement of the present invention, the method further comprises: monitoring in real time whether an abnormality occurs while the bucket to be replicated is replicated between the two clusters to be spanned, and, when an abnormality occurs, consuming the messages in the abnormal message queue for replication from the primary cluster to the backup cluster.

Embodiments of the present invention also provide an electronic device, which includes a memory and a processor, where the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method.

Embodiments of the present invention also provide a computer-readable storage medium, on which a computer program is stored, the computer program being executed by a processor to implement the method.

The invention has the beneficial effects that:

the bucket granularity-based multi-direction cross-cluster replication realizes flexible backup of multiple clusters, guarantees the safety of data and solves the problem of system resource shortage caused by overlarge replication granularity; invalid bucket objects are filtered, cross-cluster copying operation can be efficiently carried out, and multi-direction cross-cluster quasi-real-time synchronization is realized; when cluster failure occurs in the replication process, service hot switching and data recovery can be carried out so as to ensure the consistency of data.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.

FIG. 1 is a schematic diagram of a cross-cluster replication system based on bucket granularity, according to an exemplary embodiment of the invention;

FIG. 2 is a schematic diagram of a working process of a global configuration center according to an exemplary embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that, if directional indications (such as up, down, left, right, front, back, and so on) are involved in the embodiment of the present invention, the directional indications are only used to explain the relative positional relationship between the components, the movement situation, and the like in a specific posture (as shown in the drawing), and if the specific posture is changed, the directional indications are changed accordingly.

In addition, in the description of the present invention, the terms used are for illustrative purposes only and are not intended to limit the scope of the present invention. The terms "comprises" and/or "comprising" are used to specify the presence of stated elements, steps, operations, and/or components, but do not preclude the presence or addition of one or more other elements, steps, operations, and/or components. The terms "first," "second," and the like may be used to describe various elements; they do not necessarily indicate an order and do not limit the elements, and are only used to distinguish one element from another. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified. These and/or other aspects will become apparent to those of ordinary skill in the art in view of the following drawings, and the description of the embodiments of the present invention will be more readily understood by those of ordinary skill in the art. The drawings are only for purposes of illustrating the described embodiments of the invention. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated in the present application may be employed without departing from the principles described in the present application.

As shown in FIG. 1, an embodiment of the present invention provides a cross-cluster replication system based on bucket granularity.

Object storage provides flat storage based on buckets and objects, with all objects in a bucket at the same logical level. Given the importance of object data, backing the data up in real time becomes particularly important, which places strict requirements on the resource consumption and data consistency of cross-cluster replication. In view of the defects of the existing cross-cluster replication technology, the invention provides a multi-directional cross-cluster replication system with the bucket as the granularity.

The system of the invention considers that any of the existing clusters may replicate to any other, so two groups of message queues are allocated between every two clusters. Each group comprises two message queues: one queue is responsible for storing the object messages generated normally during cross-cluster replication, and the other queue is responsible for storing the object messages that fail to be replicated normally. Because two groups of message queues are allocated to every two clusters to support cross-cluster replication, the system, compared with the prior art that supports only unidirectional and bidirectional cross-cluster replication, can realize multi-directional replication, achieve flexible backup across multiple clusters, and ensure the safety of data.

The system is based on multi-directional cross-cluster replication at bucket granularity. It supports flexible configuration of the replication direction of each bucket and can reasonably select, from the plurality of clusters, the two clusters to be spanned for bucket replication. This solves the resource consumption problem caused by an overly large (coarse) replication granularity in the prior art and reduces the consumption of system resources. The system can filter precisely: it only needs to operate on buckets configured for cross-cluster replication, which ensures that no invalid bucket object information appears in the message queues, improves replication efficiency, and realizes efficient cross-cluster replication.

The system uses a fault-tolerance mechanism (such as a multi-replica strategy) to ensure that the contents of the message queues are not lost. The producer and the consumer are completely decoupled: the producer is only responsible for encapsulating bucket object information and sending it to the message queue, so no invalid bucket data appears in the message queue, which improves the efficiency with which the consumer synchronizes data. Each consumer in the system is responsible for monitoring a normal message queue. To tolerate the failure of a single consumer node, a plurality of consumers monitor one normal message queue at the same time, i.e., a many-to-one relationship, so that the consumers can still pull bucket objects from the primary cluster to the backup cluster and data consistency is ensured. When the clusters at the two ends fail during replication, the global configuration center promptly evaluates which buckets of the current service are affected and switches smoothly to the backup cluster; after the failed cluster is repaired, the exception repair tool supplements the data missed during the abnormal period, which ensures eventual consistency of the data and realizes quasi-real-time synchronization of data among the clusters in multiple directions.
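The many-to-one relationship between consumers and a normal message queue can be sketched as follows. This is only a minimal illustration in Python: the standard-library `queue.Queue` stands in for a real message queue service, and the function name `run_consumers` and the consumer labels are hypothetical.

```python
import queue
import threading


def run_consumers(messages, consumer_count=3):
    """Let several consumers drain one shared normal message queue.

    Returns a list of (consumer_name, message) pairs showing which
    consumer handled which message.
    """
    normal_queue = queue.Queue()
    for message in messages:
        normal_queue.put(message)

    handled = []
    lock = threading.Lock()

    def consume(name):
        while True:
            try:
                message = normal_queue.get_nowait()
            except queue.Empty:
                return  # nothing left to consume
            with lock:
                handled.append((name, message))

    threads = [
        threading.Thread(target=consume, args=(f"consumer-{i}",))
        for i in range(consumer_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled


handled = run_consumers([f"object-{i}" for i in range(10)])
print(len(handled))  # 10: every message is handled exactly once
```

Because every consumer reads from the same queue, the remaining consumers keep draining it if one consumer stops, which is the motivation the description gives for the many-to-one arrangement.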

In one embodiment, the two clusters include a first cluster and a second cluster, and the two groups of message queues include a first group of message queues and a second group of message queues.

For two clusters between which cross-cluster replication is performed, the first cluster may be the primary cluster or the backup cluster, and correspondingly the second cluster may be the backup cluster or the primary cluster: the second cluster is the backup cluster when the first cluster is the primary cluster, and the first cluster is the backup cluster when the second cluster is the primary cluster. For replication from the first cluster to the second cluster, two message queues are allocated, a normal message queue and an abnormal message queue. Likewise, for replication from the second cluster to the first cluster, two message queues are allocated, a normal message queue and an abnormal message queue.

In one embodiment, when an abnormality occurs while the bucket to be replicated is replicated between the two clusters to be spanned, the bucket object information encapsulated by the producer is sent to the abnormal message queue for replication from the primary cluster to the backup cluster.

In one embodiment, when an abnormality occurs while the consumer pulls an object from the primary cluster to the backup cluster, the abnormal bucket object information is sent to the abnormal message queue for replication from the primary cluster to the backup cluster.

As shown in FIG. 1, there are two clusters, cluster A and cluster B, and two groups of message queues are allocated between cluster A and cluster B.

The first group of message queues is:

A.B.Normal (A -> B normal message queue): responsible for storing the bucket object information replicated from cluster A to cluster B under normal conditions.

A.B.Abnormal (A -> B abnormal message queue): responsible for storing the bucket object information for which, because of a failure, data could not be pulled from cluster A and written into cluster B.

The second group of message queues is:

B.A.Normal (B -> A normal message queue): responsible for storing the bucket object information replicated from cluster B to cluster A under normal conditions.

B.A.Abnormal (B -> A abnormal message queue): responsible for storing the bucket object information for which, because of a failure, data could not be pulled from cluster B and written into cluster A.
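As an illustration of this allocation, the following minimal Python sketch enumerates one queue group per replication direction for an arbitrary set of clusters. The `<source>.<target>.Normal` and `<source>.<target>.Abnormal` names are an assumption modeled on the example queue names in this description, and the function name `allocate_queue_groups` is hypothetical.

```python
from itertools import permutations


def allocate_queue_groups(clusters):
    """Return one queue group per replication direction (ordered cluster pair).

    Each group holds a normal queue and an abnormal queue, so every
    unordered pair of clusters ends up with two groups (four queues).
    """
    groups = {}
    for source, target in permutations(clusters, 2):
        groups[(source, target)] = {
            "normal": f"{source}.{target}.Normal",
            "abnormal": f"{source}.{target}.Abnormal",
        }
    return groups


groups = allocate_queue_groups(["A", "B", "C"])
print(groups[("A", "B")]["normal"])    # A.B.Normal
print(groups[("B", "A")]["abnormal"])  # B.A.Abnormal
print(len(groups))                     # 6 directions for 3 clusters
```

With n clusters this yields n(n-1) directions and 2n(n-1) queues in total, which is what allows any bucket to be replicated in any direction.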

The system includes three roles: the global configuration center, the producer, and the consumer. The producer and the consumer are unaware of the direction of upper-layer service traffic and of the cluster state. The global configuration center notifies them of the bucket object information, and the exception repair tool repairs data, so that failures of the underlying storage can be switched over smoothly, no data is lost, and the real users who store data in the upper-layer object storage system are not affected.

The bucket object information with which the global configuration center notifies the producer and the consumer includes:

1. whether the current bucket needs to be replicated across clusters, that is, which bucket is the bucket to be replicated;

2. the replication direction of the bucket to be replicated, for example, whether the bucket is replicated from the first cluster to the second cluster of the two clusters, or from the second cluster to the first cluster;

3. whether the clusters at the two ends of the bucket to be replicated have failed, that is, whether the two clusters to be spanned have failed, the two clusters to be spanned being two of the plurality of clusters.
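The three pieces of information listed above can be modeled as a small lookup service. The sketch below is an assumption-laden illustration, not the patented implementation: the class names `GlobalConfigCenter` and `BucketReplicationInfo` and all method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class BucketReplicationInfo:
    """What the configuration center reports about one bucket."""
    bucket: str
    source_cluster: str     # primary cluster (replication source)
    target_cluster: str     # backup cluster (replication target)
    endpoint_failed: bool   # True if a cluster at either end has failed


class GlobalConfigCenter:
    def __init__(self) -> None:
        self._rules: Dict[str, Tuple[str, str]] = {}
        self._failed: Set[str] = set()

    def add_rule(self, bucket: str, source: str, target: str) -> None:
        self._rules[bucket] = (source, target)

    def mark_failed(self, cluster: str) -> None:
        self._failed.add(cluster)

    def mark_recovered(self, cluster: str) -> None:
        self._failed.discard(cluster)

    def lookup(self, bucket: str) -> Optional[BucketReplicationInfo]:
        rule = self._rules.get(bucket)
        if rule is None:
            return None  # bucket is not configured for cross-cluster replication
        source, target = rule
        failed = source in self._failed or target in self._failed
        return BucketReplicationInfo(bucket, source, target, failed)


center = GlobalConfigCenter()
center.add_rule("photos", "A", "B")
print(center.lookup("logs"))                      # None
print(center.lookup("photos").endpoint_failed)    # False
center.mark_failed("B")
print(center.lookup("photos").endpoint_failed)    # True
```

Returning `None` for unconfigured buckets is what lets a producer skip them, so that only buckets configured for replication ever reach a message queue.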

The system is provided with an exception repair tool. After the two clusters to be spanned recover from the failure, the tool consumes the abnormal message queue in the replication direction, that is, the messages in the abnormal message queue configured for replication from the primary cluster to the backup cluster of the two clusters to be spanned, so as to make up for the data lost while the clusters at the two ends were failing and to ensure data consistency.
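The repair step can be sketched as follows, assuming, as in the system summary, that the tool drains the abnormal message queue for the configured replication direction. Clusters are modeled as plain nested dictionaries (bucket name to object key to data), and the function name `repair` is hypothetical.

```python
from collections import deque


def repair(abnormal_queue, primary, backup):
    """Drain the abnormal queue after the clusters have recovered.

    Each message names a bucket and an object key. Objects that can be
    pulled from the primary are written to the backup; messages that
    still fail are put back on the queue. Returns the number repaired.
    """
    repaired = 0
    still_failing = deque()
    while abnormal_queue:
        message = abnormal_queue.popleft()
        try:
            data = primary[message["bucket"]][message["key"]]
        except KeyError:  # stand-in for any pull failure
            still_failing.append(message)
            continue
        backup.setdefault(message["bucket"], {})[message["key"]] = data
        repaired += 1
    abnormal_queue.extend(still_failing)
    return repaired


primary = {"photos": {"cat.jpg": b"\x01\x02"}}
backup = {}
pending = deque([
    {"bucket": "photos", "key": "cat.jpg"},
    {"bucket": "photos", "key": "missing.jpg"},
])
print(repair(pending, primary, backup))  # 1
print(len(pending))                      # 1 message still waiting
```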

As shown in FIG. 2, the global configuration center selects cluster A and cluster B from the plurality of clusters as the two clusters to be spanned and adds a cross-cluster replication rule (the replication direction), here replication from cluster A to cluster B. It also adds a service configuration that determines which cluster carries the service traffic (for example, cluster A). When cluster A fails during replication, the global configuration center modifies the service configuration to switch the service traffic to the other end of the replication direction, namely cluster B, so that cluster B temporarily takes over the work of cluster A. After cluster A recovers, the service configuration is modified again to switch the service traffic from cluster B back to cluster A.
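The switching behavior described for FIG. 2 can be illustrated with a small state holder. This is a sketch under assumptions: the class name `ServiceConfig` and its method names are hypothetical, and failure detection itself is out of scope.

```python
class ServiceConfig:
    """Tracks which cluster of a replication pair carries service traffic."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.active = primary  # the primary carries traffic by default

    def on_cluster_failed(self, cluster):
        # Switch traffic to the other end of the replication direction.
        if cluster == self.primary and self.active == self.primary:
            self.active = self.backup

    def on_cluster_recovered(self, cluster):
        # Switch traffic back once the primary is healthy again.
        if cluster == self.primary:
            self.active = self.primary


config = ServiceConfig(primary="A", backup="B")
print(config.active)            # A
config.on_cluster_failed("A")
print(config.active)            # B
config.on_cluster_recovered("A")
print(config.active)            # A
```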

The producer is responsible for monitoring in real time the operation log on the production node of the primary cluster, and its next action is determined by the global configuration center. When the global configuration center informs the producer that cross-cluster replication is needed, the producer acts as the producer to be replicated: it encapsulates the bucket object information and sends it to the normal message queue, that is, the normal message queue for replication from the primary cluster to the backup cluster. For example, in FIG. 1, cluster A is the primary cluster and cluster B is the backup cluster, and when replication is normal the bucket object information is sent to A.B.Normal. When the global configuration center informs the producer that the clusters at the two ends have failed and replication cannot proceed normally, the bucket object information encapsulated by the producer to be replicated is sent to the abnormal message queue, that is, the abnormal message queue for replication from the primary cluster to the backup cluster. For example, in FIG. 1, cluster A is the primary cluster and cluster B is the backup cluster, and when an abnormality occurs the bucket object information is sent to A.B.Abnormal.
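The producer's routing decision can be sketched as below. The example is only illustrative: the log entry fields (`bucket`, `key`, `op`), the function name `route_log_entry`, and the in-memory queues are assumptions, and the queue names follow the `<source>.<target>.Normal` / `<source>.<target>.Abnormal` pattern assumed from the example above.

```python
from collections import defaultdict, deque


def route_log_entry(entry, rules, failed_clusters, queues):
    """Route one operation-log entry to the right message queue.

    rules maps a bucket name to its (source, target) replication
    direction. Entries for buckets without a rule are filtered out.
    Returns the name of the queue used, or None if the entry was skipped.
    """
    rule = rules.get(entry["bucket"])
    if rule is None:
        return None  # bucket is not configured for replication
    source, target = rule
    abnormal = source in failed_clusters or target in failed_clusters
    name = f"{source}.{target}.{'Abnormal' if abnormal else 'Normal'}"
    # Encapsulate only the bucket object metadata, not the object data.
    queues[name].append(
        {"bucket": entry["bucket"], "key": entry["key"], "op": entry["op"]}
    )
    return name


queues = defaultdict(deque)
rules = {"photos": ("A", "B")}
entry = {"bucket": "photos", "key": "cat.jpg", "op": "PUT"}
print(route_log_entry(entry, rules, set(), queues))   # A.B.Normal
print(route_log_entry(entry, rules, {"B"}, queues))   # A.B.Abnormal
print(route_log_entry({"bucket": "tmp", "key": "x", "op": "PUT"},
                      rules, set(), queues))          # None
```

Sending only metadata keeps the messages small; the consumer later pulls the object data from the primary cluster itself.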

The consumer is responsible for monitoring a normal message queue and pulling objects from the primary cluster to write them into the backup cluster, so as to keep the data of the backup cluster consistent with that of the primary cluster. If an abnormality occurs during the pull-and-write process, the bucket object information is sent to the abnormal message queue configured for the current two clusters to be spanned, that is, the abnormal message queue for replication from the primary cluster to the backup cluster. For example, in FIG. 1, cluster A is the primary cluster and cluster B is the backup cluster, and when an abnormality occurs the bucket object information is sent to A.B.Abnormal.
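A minimal sketch of the consumer's pull-and-write step follows. It assumes clusters can be modeled as nested dictionaries (bucket name to object key to data) and uses `KeyError` as a stand-in for any pull or write failure; the function name `replicate_object` is hypothetical.

```python
from collections import deque


def replicate_object(message, primary, backup, abnormal_queue):
    """Handle one message taken from the normal message queue.

    Pulls the object named in the message from the primary cluster and
    writes it into the backup cluster. On failure the message is pushed
    to the abnormal queue for later repair. Returns True on success.
    """
    bucket, key = message["bucket"], message["key"]
    try:
        data = primary[bucket][key]                 # pull from the primary
        backup.setdefault(bucket, {})[key] = data   # write to the backup
    except KeyError:
        abnormal_queue.append(message)
        return False
    return True


primary = {"photos": {"cat.jpg": b"\x01\x02"}}
backup = {}
abnormal = deque()
print(replicate_object({"bucket": "photos", "key": "cat.jpg"},
                       primary, backup, abnormal))   # True
print(replicate_object({"bucket": "photos", "key": "dog.jpg"},
                       primary, backup, abnormal))   # False
print(len(abnormal))                                 # 1
```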

The embodiment of the invention discloses a cross-cluster replication method based on bucket granularity, which comprises the following steps:

each consumer monitors the normal message queue for replication from the primary cluster to the backup cluster, parses the messages in that queue to obtain the bucket object metadata information encapsulated by the producer to be replicated, and, according to the bucket object metadata information, pulls the object from the primary cluster and writes it into the backup cluster, wherein a plurality of consumers monitor one normal message queue.

In one embodiment, the method further comprises: monitoring in real time whether an abnormality occurs while the bucket to be replicated is replicated between the two clusters to be spanned, and, when an abnormality occurs, consuming the messages in the abnormal message queue for replication from the primary cluster to the backup cluster.

The disclosure also relates to an electronic device comprising a server, a terminal and the like. The electronic device includes: at least one processor; a memory communicatively coupled to the at least one processor; and a communication component communicatively coupled to the storage medium, the communication component receiving and transmitting data under control of the processor; wherein the memory stores instructions executable by the at least one processor to implement the method of the above embodiments.

In an alternative embodiment, the memory is used as a non-volatile computer-readable storage medium for storing non-volatile software programs, non-volatile computer-executable programs, and modules. The processor executes various functional applications of the device and data processing, i.e., implements the method, by executing nonvolatile software programs, instructions, and modules stored in the memory.

The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store a list of options, etc. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be connected to the external device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

One or more modules are stored in the memory and, when executed by the one or more processors, perform the methods of any of the method embodiments described above.

The product can execute the method provided by the embodiments of the application and has the functional modules and beneficial effects corresponding to the executed method. For technical details not described in this embodiment, refer to the method provided by the embodiments of the application.

The present disclosure also relates to a computer-readable storage medium for storing a computer-readable program for causing a computer to perform some or all of the above-described method embodiments.

That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware, where the program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other various media capable of storing program code.

In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Furthermore, those of ordinary skill in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.

It will be understood by those skilled in the art that while the present invention has been described with reference to exemplary embodiments, various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.

Claims

1. A bucket granularity-based cross-cluster replication system, the system comprising:

2. The system of claim 1, wherein the two clusters comprise a first cluster and a second cluster, the two sets of message queues comprise a first set of message queues and a second set of message queues,

3. The system of claim 1, wherein when an abnormality occurs while the bucket to be replicated is replicated between the two clusters to be spanned, the bucket object information encapsulated by the producer is sent to the abnormal message queue for replication from the primary cluster to the backup cluster.

4. The system of claim 1, wherein when an abnormality occurs while the consumer pulls an object from the primary cluster to the backup cluster, the abnormal bucket object information is sent to the abnormal message queue for replication from the primary cluster to the backup cluster.

5. A method of cross-cluster replication based on bucket granularity, the method comprising:

6. The method of claim 5, wherein the method further comprises:

monitoring in real time whether an abnormality occurs while the bucket to be replicated is replicated between the two clusters to be spanned, and, when an abnormality occurs, consuming the messages in the abnormal message queue for replication from the primary cluster to the backup cluster.

7. The method of claim 6, wherein when an abnormality occurs while the bucket to be replicated is replicated between the two clusters to be spanned, the bucket object information encapsulated by the producer is sent to the abnormal message queue for replication from the primary cluster to the backup cluster.

8. The method of claim 6, wherein when an abnormality occurs while the consumer pulls an object from the primary cluster to the backup cluster, the abnormal bucket object information is sent to the abnormal message queue for replication from the primary cluster to the backup cluster.

9. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement the method of any of claims 5-8.

10. A computer-readable storage medium, on which a computer program is stored, the computer program being executable by a processor for implementing the method according to any of claims 5-8.