CN115599747A - Metadata synchronization method, system and equipment of distributed storage system - Google Patents


Publication number
CN115599747A
Authority
CN
China
Prior art keywords
metadata
node
change operation
metadata service
segment
Prior art date
Legal status
Granted
Application number
CN202210432189.6A
Other languages
Chinese (zh)
Other versions
CN115599747B (en)
Inventor
罗杰彬
徐文豪
王弘毅
张凯
Current Assignee
SmartX Inc
Original Assignee
SmartX Inc
Priority date
Filing date
Publication date
Application filed by SmartX Inc
Priority to CN202210432189.6A
Publication of CN115599747A
Application granted
Publication of CN115599747B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a metadata synchronization method, system and device for a distributed storage system. A metadata service master node and metadata service slave nodes are determined through a consensus protocol cluster; when a metadata change occurs, the metadata service master node packages the metadata change operation into a change operation log; the change operation log is written into a segment of the consensus protocol cluster in sequence; after the write succeeds, the change operation log and the corresponding metadata are updated into the local storage engine of the metadata service master node; and when a new segment is created or at preset time intervals, the change operation log and the corresponding metadata are synchronized into the local storage engines of the metadata service slave nodes according to a preset synchronization rule. The metadata service can read metadata directly from the local storage engine without network calls or a consensus process, which reduces latency and improves synchronization efficiency.

Description

Metadata synchronization method, system and equipment of distributed storage system
Technical Field
The present application relates to the field of data storage technologies, and in particular, to a metadata synchronization method, system and device for a distributed storage system.
Background
A distributed storage system connects a number of independent servers over a network to form a distributed cluster, and pools the storage resources of those servers, such as mechanical disks and solid state disks, into a resource pool that is managed and served as a whole. A distributed storage system typically allocates storage objects such as virtual volumes, iSCSI LUNs and files from the storage resource pool to storage consumers, and the data capacity of a virtual volume or file may exceed the total storage capacity of a single server. For example, a virtual volume may be 64 TB while the physical disk capacity of a single server in the cluster is only 32 TB. To support virtual volumes whose data volume exceeds the storage capacity of a single server, the distributed storage system subdivides virtual volumes or files into finer-grained data fragments, for example splitting a 64 TB volume into many small fixed-size data fragments of 256 MB, 4 MB or 1 MB, and places the data fragments on multiple servers in the cluster, so that one storage object can use the storage resources of multiple servers. For data safety and better read performance, the distributed storage system usually also adds data redundancy at the level of data fragments, typically using replication or erasure coding. Taking replication as an example and assuming 3 copies, the distributed storage system allocates a large storage object from the unified resource pool, divides it into a number of finer-grained data fragments, and places the 3 copies of each data fragment on 3 different servers in the cluster according to some policy.
To read and write a data object such as a volume or a file correctly, the system needs to know which data fragment of the data object the requested data belongs to, which copies of that data fragment exist, and on which server each copy is located. This location information is an important kind of metadata in a distributed storage system. In addition, the metadata of a distributed storage system includes file and directory attributes, information about the data nodes that make up the cluster, and the like.
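As a minimal sketch of the location metadata described above, the following Python fragment maps a byte offset in a volume to its data fragment and records which servers hold that fragment's copies. The 64 TB volume, 256 MB fragment size and 3 copies come from the example in the text; the names `FRAGMENT_SIZE`, `fragment_index` and `place_replicas`, the server names, and the round-robin placement policy are illustrative assumptions and are not taken from the application.

```python
FRAGMENT_SIZE = 256 * 1024**2   # 256 MB fragments, as in the example above
REPLICAS = 3                    # number of copies per fragment


def fragment_index(offset: int) -> int:
    """Return the index of the data fragment that contains a byte offset."""
    return offset // FRAGMENT_SIZE


def place_replicas(index: int, servers: list[str]) -> list[str]:
    """Hypothetical policy: pick REPLICAS distinct servers round-robin."""
    if len(servers) < REPLICAS:
        raise ValueError("need at least as many servers as copies")
    return [servers[(index + i) % len(servers)] for i in range(REPLICAS)]


servers = ["server-a", "server-b", "server-c", "server-d"]
volume_size = 64 * 1024**4                       # a 64 TB virtual volume
fragment_count = volume_size // FRAGMENT_SIZE    # 262144 fragments

# Location metadata: fragment index -> servers holding its copies.
# Only a few entries are materialised to keep the example small.
location = {i: place_replicas(i, servers) for i in range(4)}
idx = fragment_index(600 * 1024**2)              # offset 600 MB lies in fragment 2
```

A read at offset 600 MB would therefore consult `location[2]` to find the servers that hold the copies of that fragment.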
Metadata is crucial to a distributed storage system: if the metadata is lost, the service data of the distributed storage system can no longer be accessed, which severely affects the user's services. Such metadata is therefore also typically persisted in a clustered fashion (multiple copies, etc.). In addition, a distributed storage system has very strict requirements on metadata consistency and cannot tolerate inconsistent data, so whenever metadata is updated, the metadata stored by each server in the cluster must remain strongly consistent.
In the existing approach, metadata is stored and managed directly in a cluster built on a distributed consensus protocol, and every data access has to go through a consensus process: when metadata is updated, it can only be written to the master node first, and the update succeeds only after it has been synchronized to a majority of the slave nodes; when metadata is read, it also has to be served through the Leader of the Raft module.
The main problem with a metadata synchronization mechanism implemented directly on a distributed consensus protocol is that metadata queries are relatively expensive. A consensus protocol cluster usually offers only single-object queries at key-value granularity; each object query is a relatively independent action that has to pass through a consensus confirmation process. As a result, a range query, or a more complex conditional query that depends on data semantics, has to fetch a larger result set at higher cost and then filter it in a second pass. Depending on the specific consensus algorithm, the query of every small object goes through a consensus process, which is time-consuming. In a distributed storage system, however, metadata read requests are typically far more frequent than write requests, so the performance of metadata read requests is critical to the performance of the distributed storage system.
Disclosure of Invention
An object of the embodiments of the present application is to provide a metadata synchronization method, system and device for a distributed storage system, so as to solve the problems of low metadata synchronization efficiency and low metadata read request performance at present. The specific technical scheme is as follows:
In a first aspect, a metadata synchronization method for a distributed storage system is provided, the method including:
determining a metadata service master node and metadata service slave nodes through a consensus protocol cluster;
when a metadata change occurs, packaging the metadata change operation into a change operation log by using the metadata service master node;
writing the change operation log into a segment of the consensus protocol cluster in sequence;
after the write succeeds, updating the change operation log and the corresponding metadata into the local storage engine of the metadata service master node;
and when a new segment is created or at preset time intervals, synchronizing the change operation log and the corresponding metadata to the local storage engine of each metadata service slave node according to a preset synchronization rule.
Optionally, determining the metadata service master node and the metadata service slave nodes through the consensus protocol cluster includes:
creating a node for each metadata service node in the same directory of the consensus protocol cluster, and sorting the nodes by creation time;
and determining the metadata service node corresponding to the first-ranked node as the metadata service master node, and determining the other metadata service nodes as metadata service slave nodes.
Optionally, the method further comprises:
when the metadata service master node fails or is cut off by a network partition, deleting the node that represents the metadata service master node;
and determining the metadata service node corresponding to the node currently ranked first as the new metadata service master node.
Optionally, the preset synchronization rule is:
acquiring the latest change operation log sequence number from the local storage engine of the metadata service slave node;
pulling all segment information from the consensus protocol cluster;
sorting all segments by the sequence number of the first change operation log in each segment;
finding the first segment whose first change operation log sequence number is not less than the latest change operation log sequence number;
judging whether that segment is the last one and the sequence number of its first log is greater than the latest change operation log sequence number;
if yes, taking the segment preceding that segment as the target segment;
if not, taking that segment as the target segment;
starting from the target segment, synchronizing the change operation logs of the target segment and all subsequent segments.
Optionally, after synchronizing the change operation log and the corresponding metadata thereof to the local storage engine of the metadata service slave node according to a preset synchronization rule, the method further includes:
recycling, by the metadata service slave node corresponding to the first-ranked node, the change operation logs in the consensus protocol cluster that have already been synchronized to every metadata service slave node.
Optionally, the metadata service slave node corresponding to the first-ranked node performs the change operation log recycling operation periodically.
Optionally, the method further comprises:
when a new metadata service node joins the metadata service cluster, synchronizing the full set of metadata from the local storage engine of another metadata service node.
In a second aspect, the present application provides a metadata synchronization system for a distributed storage system, the system comprising:
a determining unit, configured to determine the metadata service master node and the metadata service slave nodes through the consensus protocol cluster;
a packaging unit, configured to package the metadata change operation into a change operation log by using the metadata service master node when a metadata change occurs;
a writing unit, configured to write the change operation log into a segment of the consensus protocol cluster in sequence;
an updating unit, configured to update the change operation log and the corresponding metadata into the local storage engine of the metadata service master node after the write succeeds;
and a synchronization unit, configured to synchronize the change operation log and the corresponding metadata into the local storage engine of each metadata service slave node according to a preset synchronization rule when a new segment is created or at preset time intervals.
In a third aspect, the present application provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete mutual communication through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of the first aspect when executing a program stored in the memory.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method steps of any one of the first aspects.
In a fifth aspect, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to perform a method of metadata synchronization for a distributed storage system as described in any of the above.
The embodiment of the application has the following beneficial effects:
The embodiments of the application provide a metadata synchronization method, system and device for a distributed storage system. A metadata service master node and metadata service slave nodes are determined through a consensus protocol cluster; when a metadata change occurs, the metadata service master node packages the metadata change operation into a change operation log; the change operation log is written into a segment of the consensus protocol cluster in sequence; after the write succeeds, the change operation log and the corresponding metadata are updated into the local storage engine of the metadata service master node; and when a new segment is created in the consensus protocol cluster or at preset time intervals, the change operation log and the corresponding metadata are synchronized into the local storage engines of the metadata service slave nodes according to a preset synchronization rule. In this method, metadata is not stored directly in the consensus protocol cluster; the consensus protocol cluster is used only for master node election and for synchronizing metadata change operation logs, and the metadata itself is ultimately stored in the local storage engines. The metadata service can read metadata directly from the local storage engine without network calls or a consensus process, which reduces latency. The local storage engine in each service node handles local metadata without considering the data state of other nodes, so various caching mechanisms and data organizations can be adopted as needed to further improve performance. In addition, the existing mechanisms of the consensus protocol cluster are used to guarantee strong consistency of the metadata.
All metadata do not need to be loaded into a memory, so that the resource consumption of the system is reduced, and the system can process a larger amount of metadata.
Of course, not all advantages described above need to be achieved at the same time in the practice of any one product or method of the present application.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly described below. A person skilled in the art can obtain other drawings from these drawings without inventive effort.
Fig. 1 is a flowchart of a metadata synchronization method for a distributed storage system according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a metadata service cluster according to an embodiment of the present application;
Fig. 3 is a schematic view of a master node election process according to an embodiment of the present application;
Fig. 4 is a flowchart of metadata service slave node synchronization according to an embodiment of the present application;
Fig. 5 is a flowchart of metadata full synchronization according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a metadata synchronization system of a distributed storage system according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the present application provides a metadata synchronization method for a distributed storage system, which is described in detail below with reference to specific embodiments, and as shown in fig. 1, the metadata synchronization method for a distributed storage system provided in the embodiment of the present application includes the following specific steps:
Step S101: determining the metadata service master node and the metadata service slave nodes through the consensus protocol cluster.
In this step, the consensus protocol cluster may be ZooKeeper, etcd, etc.
As shown in Fig. 2, there is exactly one metadata service master node and there may be several metadata service slave nodes; the consensus protocol cluster, the metadata service master node and the metadata service slave nodes together form the metadata service cluster of the distributed storage system. Each metadata service node contains a local storage engine, such as MySQL or LevelDB, and metadata is stored in the local storage engine of the metadata service node. Metadata change operation logs are synchronized by means of the consensus protocol cluster. The read and conditional query performance of these local storage engines is typically much higher than that of a consensus protocol cluster.
In the embodiment of the application, the consensus protocol cluster has two functions: one is to provide an election service for the metadata service cluster based on a consensus algorithm and to determine a unique metadata service master node and several metadata service slave nodes; the other is to keep the metadata change operation logs, specifically in key-value form.
Metadata read and write requests are served only by the metadata service master node. When metadata changes, the metadata service master node first writes a change operation log into the consensus protocol cluster and, once the write is confirmed to have succeeded, applies the change in its local storage engine. On a read, because the local storage engine already holds the complete metadata, the consensus protocol process can be skipped and the metadata is read directly from the local storage engine. Compared with the existing scheme of using the consensus protocol cluster directly to store and manage metadata, the embodiment of the application does not store metadata in the consensus protocol cluster; it uses the consensus protocol cluster only for master node election and for synchronizing metadata change operation logs, and the metadata ultimately resides in the local storage engines. The metadata service can read metadata from the local storage engine without network calls or consensus, which reduces latency. At the same time, the local storage engine of each node does not need to consider the data state of other nodes when processing local metadata, so various caching mechanisms and data organizations can be adopted as needed to further improve read performance.
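The write-then-apply order and the local read path described above can be sketched as follows. This is only an illustration under stated assumptions: `ConsensusLog` is an in-memory stand-in for the consensus protocol cluster, a Python dict stands in for the local storage engine, and the class names, the `last_seq` field and the key format are hypothetical.

```python
class ConsensusLog:
    """In-memory stand-in for the change operation log kept in the consensus protocol cluster."""

    def __init__(self):
        self.entries = []                 # change operation logs, in write order

    def append(self, entry: dict) -> bool:
        self.entries.append(entry)        # a real cluster would reach consensus here
        return True


class MasterNode:
    """Hypothetical metadata service master node."""

    def __init__(self, log: ConsensusLog):
        self.log = log
        self.local_store = {}             # stand-in for the local storage engine
        self.last_seq = 0                 # sequence number of the last written log

    def update(self, key: str, value: str) -> None:
        entry = {"seq": self.last_seq + 1, "key": key, "value": value}
        if not self.log.append(entry):    # 1. write the change operation log first
            raise RuntimeError("log write failed; local metadata left unchanged")
        self.last_seq = entry["seq"]
        self.local_store[key] = value     # 2. then update the local storage engine

    def read(self, key: str):
        return self.local_store.get(key)  # served locally, no consensus round trip


log = ConsensusLog()
master = MasterNode(log)
master.update("volume-1/fragment-0", "server-a,server-b,server-c")
locations = master.read("volume-1/fragment-0")
```

The point of the sketch is the asymmetry: `update` touches the consensus log before the local store, while `read` never leaves the node.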
Optionally, the determining, by the consensus protocol cluster, the metadata service master node and the metadata service slave node includes:
creating a node for each metadata service node in the same directory of the consensus protocol cluster, and sorting the nodes by creation time;
and determining the metadata service node corresponding to the first-ranked node as the metadata service master node, and determining the other metadata service nodes as metadata service slave nodes.
Optionally, the method further comprises:
when the metadata service master node fails or is cut off by a network partition, deleting the node that represents the metadata service master node;
the metadata service slave nodes receive a notification of this change, check whether their own node is now ranked first in the node list, and the metadata service node corresponding to the node currently ranked first is determined as the new metadata service master node. After a network partition, the former metadata service master node loses its master identity because it can no longer connect to the consensus protocol cluster normally, which prevents two metadata service master nodes from existing at the same time during a network partition.
In another embodiment, when a metadata service slave node fails or is cut off by a network partition, the node that represents it in the consensus protocol cluster is deleted automatically. The other metadata service nodes receive a notification of this change, but it has no significant effect on them.
In addition, even when no metadata service master node or slave node has failed or been partitioned, every metadata service node keeps monitoring the directory in the consensus protocol cluster. Each time the nodes in the directory change, the metadata service node receives a notification from the consensus protocol cluster and checks whether the node that represents it is ranked first. If so, it promotes itself to master node and serves external requests; otherwise it continues, as a slave node, to synchronize the metadata changes of the master node.
As shown in Fig. 3, a specific master node election process is provided, which includes the following steps:
Step S301: starting the node;
Step S302: creating a node in the consensus protocol cluster;
Step S303: determining whether the node number is 0; if yes, performing step S304, and if not, performing step S305;
Step S304: becoming the metadata service master node;
Step S305: becoming a metadata service slave node;
Step S306: receiving a member change notification sent by the consensus protocol cluster;
Step S307: determining whether the node number of the node is 0; if yes, returning to step S304, and if not, returning to step S305.
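The election flow above can be illustrated with a small in-memory simulation. It is a sketch under assumptions, not a client for ZooKeeper or etcd: `ElectionDirectory` is a hypothetical stand-in for the shared directory, creation order is modelled with a counter, and the check for "node number 0" is read as "ranked first among the nodes that still exist".

```python
import itertools


class ElectionDirectory:
    """In-memory stand-in for one directory of the consensus protocol cluster."""

    def __init__(self):
        self._counter = itertools.count()   # models creation order
        self.nodes = {}                     # creation number -> service name

    def create_node(self, service: str) -> int:
        number = next(self._counter)
        self.nodes[number] = service
        return number

    def delete_node(self, number: int) -> None:
        self.nodes.pop(number, None)        # e.g. after a failure or partition

    def master(self) -> str:
        return self.nodes[min(self.nodes)]  # earliest-created node ranks first


def role(directory: ElectionDirectory, service: str) -> str:
    """Re-evaluated by each service after every membership change notification."""
    return "master" if directory.master() == service else "slave"


directory = ElectionDirectory()
a = directory.create_node("meta-a")
b = directory.create_node("meta-b")
c = directory.create_node("meta-c")
roles_before = [role(directory, s) for s in ("meta-a", "meta-b", "meta-c")]

directory.delete_node(a)                    # the master fails or is partitioned
roles_after = [role(directory, s) for s in ("meta-b", "meta-c")]
```

Because every service derives its role from the same ordered directory, at most one of them sees itself ranked first at any time.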
Step S102: when a metadata change occurs, packaging the metadata change operation into a change operation log by using the metadata service master node.
The metadata service master node is the interface through which metadata changes are offered externally; a metadata change operation can only be initiated from the metadata service master node. The metadata service master node maintains two local integer values, commit_op_seq and replay_op_seq, and each metadata service slave node maintains the local integer value replay_op_seq. commit_op_seq is the sequence number of the latest change operation log that the metadata service master node has stored in the consensus protocol cluster, and replay_op_seq is the sequence number of the latest change operation log that a metadata service node has synchronized and applied to its local storage. When the metadata service master node starts, it obtains the sequence number replay_op_seq of the latest locally synchronized log from its local storage engine, and sets commit_op_seq, the sequence number of the latest log successfully written into the consensus protocol cluster, to replay_op_seq. Before it starts serving external requests, the metadata service master node synchronizes all of the latest change operation logs from the consensus protocol cluster and updates replay_op_seq and commit_op_seq.
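A minimal sketch of the start-up sequence just described: replay_op_seq is read from local storage, commit_op_seq is initialised to it, and both are advanced while the master catches up from the consensus protocol cluster. The function name `start_master`, the dict-based log entries and the dict used as the local storage engine are assumptions made for illustration.

```python
def start_master(local_replay_op_seq: int, cluster_logs: list[dict], local_store: dict):
    """Return (replay_op_seq, commit_op_seq) after the start-up catch-up."""
    replay_op_seq = local_replay_op_seq
    commit_op_seq = replay_op_seq              # initial value, per the description
    for entry in sorted(cluster_logs, key=lambda e: e["seq"]):
        if entry["seq"] <= replay_op_seq:
            continue                           # already applied locally
        local_store[entry["key"]] = entry["value"]
        replay_op_seq = entry["seq"]
    commit_op_seq = max(commit_op_seq, replay_op_seq)
    return replay_op_seq, commit_op_seq


store = {"k1": "v1", "k2": "v2", "k3": "v3"}   # logs 1..3 were applied earlier
logs = [{"seq": i, "key": f"k{i}", "value": f"v{i}"} for i in range(1, 6)]
replay_op_seq, commit_op_seq = start_master(3, logs, store)
```

Only after this catch-up, with both counters equal to the newest log in the cluster, would the node begin to accept external requests.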
Step S103: writing the change operation log into a segment of the consensus protocol cluster in sequence.
The metadata change operation logs are stored in a fixed directory of the consensus protocol cluster, referred to as the data directory. The change operation logs in the data directory are divided into segments; each segment stores at most a certain number of logs, each change operation log has a sequence number, and the logs are arranged in the order in which they were written.
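The segment layout can be sketched as below. The limit of 4 logs per segment is a hypothetical value chosen to keep the example short (the text does not state a number), and `SegmentedLog` is an in-memory stand-in for the data directory.

```python
MAX_LOGS_PER_SEGMENT = 4      # hypothetical limit; the text gives no number


class SegmentedLog:
    """In-memory stand-in for the data directory of the consensus protocol cluster."""

    def __init__(self):
        self.segments = []    # each segment: list of (seq, payload) in write order
        self.next_seq = 1

    def append(self, payload: str) -> bool:
        """Append one log; return True when a new segment had to be created."""
        created = not self.segments or len(self.segments[-1]) >= MAX_LOGS_PER_SEGMENT
        if created:
            self.segments.append([])
        self.segments[-1].append((self.next_seq, payload))
        self.next_seq += 1
        return created

    def first_seqs(self) -> list[int]:
        """Sequence number of the first log in each segment."""
        return [segment[0][0] for segment in self.segments]


segmented = SegmentedLog()
created_flags = [segmented.append(f"op-{i}") for i in range(10)]
```

Ten appended logs end up in three segments whose first sequence numbers are 1, 5 and 9; the moments where `append` returns True are the segment-creation events that slave nodes are notified about.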
Step S104: after the write succeeds, updating the change operation log and the corresponding metadata into the local storage engine of the metadata service master node.
Step S105: when a new segment is created in the consensus protocol cluster or at preset time intervals, synchronizing the change operation log and the corresponding metadata to the local storage engine of each metadata service slave node according to a preset synchronization rule.
When a metadata service slave node starts, it obtains the sequence number replay_op_seq of the latest locally synchronized log from its local storage engine. When a new segment is created in the consensus protocol cluster, the metadata service slave node receives a notification and synchronizes the previous segment from the consensus protocol cluster. The metadata service slave node can also synchronize change operation logs from the consensus protocol cluster every few seconds.
In the embodiment of the application, metadata synchronization uses the segment as its basic unit: while a segment is not yet full, the metadata service slave nodes do not synchronize immediately even if there are new metadata changes. When a new segment is created, the metadata service slave node receives an event notification, synchronizes the metadata change operation logs of the previous segment from the consensus protocol cluster, and updates its local metadata accordingly. Metadata is synchronized at segment granularity to avoid a broadcast storm: if every change operation log were synchronized individually, every update would trigger an event notification to, and a read by, every service node in the metadata service cluster, which would greatly affect cluster performance.
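To make the broadcast-storm argument concrete, the following arithmetic sketch counts slave notifications when one event fires per created segment. The figures (10,000 logs, 4 slave nodes, 500 logs per segment) are hypothetical, and `notifications` is an illustrative helper rather than anything defined in the application.

```python
import math


def notifications(log_count: int, slave_count: int, logs_per_segment: int = 1) -> int:
    """Count slave notifications if one event fires per created segment."""
    return math.ceil(log_count / logs_per_segment) * slave_count


per_log = notifications(10_000, slave_count=4)                            # one event per log
per_segment = notifications(10_000, slave_count=4, logs_per_segment=500)  # one event per segment
```

With these assumed figures, per-log synchronization would produce 40,000 notifications, while segment-granularity synchronization produces 80.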
Optionally, the preset synchronization rule is:
acquiring the latest change operation log sequence number from the local storage engine of the metadata service slave node;
pulling all segment information from the consensus protocol cluster;
sorting all segments by the sequence number of the first change operation log in each segment;
finding the first segment whose first change operation log sequence number is not less than the latest change operation log sequence number;
judging whether that segment is the last one and the sequence number of its first log is greater than the latest change operation log sequence number;
if yes, taking the segment preceding that segment as the target segment;
if not, taking that segment as the target segment;
starting from the target segment, synchronizing the change operation logs of the target segment and all subsequent segments.
As shown in Fig. 4, a specific implementation process of metadata service slave node synchronization is provided, which includes the following steps:
Step S401: after the metadata service slave node starts, obtaining the sequence number replay_op_seq of the latest locally applied log from the local storage engine;
Step S402: judging whether a new segment has been created or 10 seconds have elapsed; if so, executing step S403, otherwise repeating step S402;
Step S403: pulling all segment information from the consensus protocol cluster;
Step S404: sorting all segments by the sequence number of the first log in each segment;
Step S405: finding the first segment whose first log sequence number is not smaller than replay_op_seq, and returning the last segment when no such segment exists;
Step S406: judging whether that segment is the last one and the sequence number of its first log is greater than replay_op_seq; if so, executing step S407, otherwise executing step S408;
Step S407: taking the segment preceding that segment as the target segment;
Step S408: taking that segment as the target segment;
Step S409: starting from the target segment, synchronizing the change operation logs of the target segment and all subsequent segments.
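Steps S404 to S408 can be sketched as a small selection function. The sketch follows the translated wording literally, including the condition "the segment is the last one" in step S406, and reads the target in step S407 as the segment immediately before the one found; both readings are assumptions about an ambiguous translation. The function name `choose_target` and the guard for a missing predecessor are illustrative additions.

```python
def choose_target(first_seqs: list[int], replay_op_seq: int) -> int:
    """Return the index of the target segment in the list sorted by first sequence number."""
    if not first_seqs:
        raise ValueError("no segments in the consensus protocol cluster")
    ordered = sorted(first_seqs)                                   # S404
    last = len(ordered) - 1
    # S405: first segment whose first sequence number is >= replay_op_seq,
    # falling back to the last segment when there is none.
    idx = next((i for i, s in enumerate(ordered) if s >= replay_op_seq), last)
    # S406/S407: literal reading of "is the last one and starts after replay_op_seq".
    # The idx > 0 guard (no preceding segment exists) is an added assumption.
    if idx == last and ordered[idx] > replay_op_seq and idx > 0:
        return idx - 1
    return idx                                                     # S408


target_mid = choose_target([1, 5, 9], replay_op_seq=6)    # found segment starts at 9
target_exact = choose_target([9, 1, 5], replay_op_seq=9)  # found segment starts at 9
target_none = choose_target([1, 5, 9], replay_op_seq=12)  # no match, fall back to last
```

In the first call, sequence number 6 lies inside the segment that starts at 5, so the sketch steps back one segment; in the other two calls the segment that starts at 9 is used directly. Synchronization (step S409) would then replay every log with a sequence number greater than replay_op_seq from the target segment onward.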
Because the consensus protocol cluster loads all metadata change operation logs into memory, and memory is limited and cannot hold a large volume of metadata, the size of the metadata change operation logs kept in the consensus protocol cluster has to be limited. A recycling mechanism for the consensus protocol cluster is therefore introduced: metadata changes in the consensus protocol cluster are deleted once every metadata service node has applied them locally. The metadata service slave node with the smallest node sequence number in the metadata service cluster is responsible for clearing change operation logs that are no longer needed from the consensus protocol cluster. After the change operation log and the corresponding metadata are synchronized to the local storage engine of the metadata service slave node according to the preset synchronization rule, the method further comprises:
recycling, by the metadata service slave node corresponding to the first-ranked node, the change operation logs in the consensus protocol cluster that have already been synchronized to every metadata service slave node.
In this embodiment, each metadata service node monitors the same data directory in the consensus protocol cluster, and receives a notification when there is a change in the number of segments. When the metadata service master node creates a new segment in the consensus protocol cluster data directory, other metadata service slave nodes receive notification of the consensus protocol cluster. The metadata service slave node corresponding to the node arranged at the head can execute the operation of recovering the metadata log, and can delete the segments before the segments synchronized to the local in the consensus protocol cluster.
Optionally, the metadata service slave node also starts a timed task, and the metadata service slave node corresponding to the node arranged at the head carries out the change operation log recovery operation at regular time.
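The reclamation rule can be illustrated with a short Python sketch. It is only an illustration under assumptions: the function names are hypothetical, segments are identified by the sequence number of their first log, and "the segment synchronized locally" is taken to be the last segment whose first sequence number does not exceed the local replay_op_seq.

```python
from bisect import bisect_right


def pick_reclaimer(slave_node_seqs):
    """Return the slave node sequence number responsible for reclamation.

    Assumption: the first-ranked slave is the one with the smallest
    node sequence number, as described in the text.
    """
    return min(slave_node_seqs)


def segments_to_reclaim(first_seqs, replay_op_seq):
    """Return first-log sequence numbers of segments that can be deleted.

    Only segments strictly before the segment that holds replay_op_seq
    are returned; the segment currently being applied is kept.
    """
    ordered = sorted(first_seqs)
    # Index of the last segment starting at or before replay_op_seq.
    current = bisect_right(ordered, replay_op_seq) - 1
    if current <= 0:
        return []  # nothing precedes the current segment
    return ordered[:current]
```

With segments starting at 1, 101 and 201 and a local replay_op_seq of 150, only the segment starting at 1 is returned, because the segment starting at 101 still holds the log that was applied last.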
Because change operation logs in the consensus protocol cluster are cleared, a new metadata service node may be unable to synchronize a complete metadata change operation log from the consensus protocol cluster after joining the metadata service cluster, and can only copy data from an existing metadata service node. A metadata full synchronization mechanism is therefore introduced in the embodiment of the present application. Specifically, while synchronizing change operation logs from the consensus protocol cluster, if the smallest sequence number among the change operation logs in the consensus protocol cluster is found to be greater than replay_op_seq + 1 of the local storage engine, the complete change operation logs and metadata can no longer be synchronized from the consensus protocol cluster, and the metadata and change operation logs need to be fully synchronized from another metadata service node. In addition, if an individual change operation log is found to be corrupted while synchronizing from the consensus protocol cluster, so that it cannot be parsed as protobuf or cannot be written to the local storage engine, full synchronization is also needed. A metadata service node newly added to the cluster first synchronizes the full metadata from another metadata service node and then applies incremental changes. Optionally, the method further comprises:
when a new metadata service node joins the consensus protocol cluster, synchronizing the full metadata from the local storage engines of other metadata service nodes.
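The two conditions that trigger full synchronization can be sketched as a small decision function. This is a hedged sketch: `choose_sync_mode` is a hypothetical name, the logs are modelled as a dictionary from sequence number to raw bytes, and `json.loads` stands in for protobuf parsing purely so that the example runs without third-party packages.

```python
import json


def choose_sync_mode(cluster_logs, replay_op_seq, parse=json.loads):
    """Return "full" or "incremental" for a metadata service node.

    cluster_logs: dict mapping sequence number -> raw log bytes (assumed shape).
    parse: callable that raises ValueError on a corrupted log; json.loads is
           only a stand-in for the protobuf parser mentioned in the text.
    """
    if not cluster_logs:
        return "incremental"  # nothing to apply yet

    # Gap check: the oldest log still in the cluster must be replay_op_seq + 1
    # or older, otherwise some logs were already reclaimed.
    if min(cluster_logs) > replay_op_seq + 1:
        return "full"

    # Corruption check: every log that still has to be applied must parse.
    for seq in sorted(cluster_logs):
        if seq <= replay_op_seq:
            continue
        try:
            parse(cluster_logs[seq])
        except ValueError:
            return "full"
    return "incremental"
```

For example, a node with replay_op_seq = 2 facing a cluster whose oldest log has sequence number 5 has missed logs 3 and 4, so the function returns "full".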
Fig. 5 shows a specific flow of metadata full synchronization, with the following steps:
step S501: acquiring the addresses of all metadata service nodes from the consensus protocol cluster;
step S502: requesting the metadata version number version of one unprocessed metadata service node;
step S503: judging whether version is equal to the latest version number known locally; if so, executing step S504; if not, returning to step S502;
step S504: creating two temporary directories, sync and old;
step S505: pulling the full data from the metadata service node selected in step S502 and placing it in the local sync directory;
step S506: renaming the local metadata directory to backup, and renaming the sync directory to the name of the local metadata directory;
step S507: restarting and initializing the local metadata service.
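The source selection and the directory swap in the full synchronization flow can be sketched as follows. This is a sketch under assumptions rather than the actual implementation: `choose_source` and `full_sync` are hypothetical names, the transfer is abstracted as a caller-supplied `pull_full_data` callback, the metadata directory is assumed to be called `meta`, and only the sync and backup directories named in steps S505 and S506 are modelled.

```python
import os
import shutil


def choose_source(node_versions, latest_known_version):
    """S502/S503: pick the first node whose version equals the latest known one.

    node_versions: dict mapping node address -> metadata version (assumed shape).
    Returns None when no node qualifies.
    """
    for address, version in node_versions.items():
        if version == latest_known_version:
            return address
    return None


def full_sync(base_dir, pull_full_data, meta_name="meta"):
    """S504 to S506: pull into a temporary directory, then swap directories."""
    sync_dir = os.path.join(base_dir, "sync")
    backup_dir = os.path.join(base_dir, "backup")
    meta_dir = os.path.join(base_dir, meta_name)

    # Start from clean temporary directories (leftovers of a failed attempt).
    shutil.rmtree(sync_dir, ignore_errors=True)
    shutil.rmtree(backup_dir, ignore_errors=True)
    os.makedirs(sync_dir)

    pull_full_data(sync_dir)             # S505: fetch the full data

    if os.path.exists(meta_dir):
        os.rename(meta_dir, backup_dir)  # S506: keep the old data as backup
    os.rename(sync_dir, meta_dir)        # S506: sync becomes the metadata dir
    return meta_dir                      # S507 (restart) is left to the caller
```

Pulling into a separate directory and renaming afterwards means the live metadata directory is replaced only once the full copy is complete, so an interrupted transfer leaves the old data untouched.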
In a second aspect, based on the same technical concept, the present application provides a metadata synchronization system of a distributed storage system, as shown in fig. 6, the system including:
a determining unit 601, configured to determine a metadata service master node and a metadata service slave node through a consensus protocol cluster;
a packaging unit 602, configured to package, when a metadata change occurs, the metadata change operation into a change operation log by using the metadata service master node;
a writing unit 603, configured to write the change operation logs into the segments of the consensus protocol cluster in sequence;
an updating unit 604, configured to update the change operation log and the corresponding metadata into the local storage engine of the metadata service master node after the writing succeeds;
a synchronizing unit 605, configured to synchronize the change operation log and the corresponding metadata to the local storage engine of the metadata service slave node according to a preset synchronization rule when a new segment is created in the consensus protocol cluster or a preset duration elapses.
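The ordering enforced by units 602 to 604 (package, write to the consensus protocol cluster, then update locally only after the write succeeds) can be sketched with in-memory stand-ins. All class and attribute names below are hypothetical, and the fixed segment size used to roll over to a new segment is an assumption made only for this illustration.

```python
class ConsensusCluster:
    """In-memory stand-in for the consensus protocol cluster (hypothetical)."""

    def __init__(self, segment_size=2):
        self.segment_size = segment_size  # assumed roll-over threshold
        self.segments = [[]]

    def append(self, log):
        """Append a log to the newest segment, opening a new one when full."""
        if len(self.segments[-1]) >= self.segment_size:
            self.segments.append([])
        self.segments[-1].append(log)
        return True  # a real cluster could report failure here


class MasterNode:
    """Sketch of the master-side write path (units 602 to 604)."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.next_seq = 1
        self.engine = {}        # stand-in for the local storage engine
        self.replay_op_seq = 0  # sequence number of the last applied log

    def change(self, key, value):
        # Unit 602: package the change into a change operation log.
        log = {"seq": self.next_seq, "key": key, "value": value}
        # Unit 603: write the log into the cluster's segments in sequence.
        if not self.cluster.append(log):
            return False
        # Unit 604: only after a successful write, update the local engine.
        self.engine[key] = value
        self.replay_op_seq = log["seq"]
        self.next_seq += 1
        return True
```

Applying the change locally only after the cluster write succeeds keeps the master's storage engine from getting ahead of the log that the slave nodes replay.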
Based on the same technical concept, an embodiment of the present invention further provides an electronic device, as shown in fig. 7, including a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702, and the memory 703 complete mutual communication through the communication bus 704,
a memory 703 for storing a computer program;
the processor 701 is configured to implement the steps of the metadata synchronization method of the distributed storage system when executing the program stored in the memory 703.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the metadata synchronization method of any of the above-mentioned distributed storage systems.
In yet another embodiment, a computer program product containing instructions is provided, which when run on a computer causes the computer to perform the metadata synchronization method of any of the above embodiments.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, wireless, microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)).
It is noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional like elements in the process, method, article, or apparatus that comprises the element.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for synchronizing metadata of a distributed storage system, the method comprising:
determining a metadata service master node and a metadata service slave node through a consensus protocol cluster;
when a metadata change occurs, packaging the metadata change operation into a change operation log by using the metadata service master node;
writing the change operation logs into the segments of the consensus protocol cluster in sequence;
after the writing succeeds, updating the change operation log and the corresponding metadata into a local storage engine of the metadata service master node;
and when a new segment is created in the consensus protocol cluster or a preset time interval elapses, synchronizing the change operation log and the corresponding metadata to a local storage engine of the metadata service slave node according to a preset synchronization rule.
2. The method of claim 1, wherein determining the metadata service master node and the metadata service slave node through a consensus protocol cluster comprises:
creating a node for each metadata service node in the same directory of the consensus protocol cluster, and sorting the nodes by creation time;
and determining the metadata service node corresponding to the first-ranked node as the metadata service master node, and determining the other metadata service nodes as metadata service slave nodes.
3. The method of claim 2, further comprising:
when the metadata service master node fails or a network partition occurs, deleting the node representing the metadata service master node;
and determining the metadata service node corresponding to the node that is currently ranked first as the new metadata service master node.
4. The method according to claim 1, wherein the preset synchronization rule is:
acquiring the sequence number of the latest change operation log from the local storage engine of the metadata service slave node;
pulling all segment information from the consensus protocol cluster;
sorting all the segments by the sequence number of the first change operation log in each segment;
finding the first segment that is not smaller than the sequence number of the latest change operation log;
judging whether the segment is the last one and the sequence number of the first log of the segment is greater than the sequence number of the latest change operation log;
if so, taking the segment preceding the segment as the target segment;
if not, taking the segment as the target segment;
starting from the target segment, synchronizing the change operation logs of all segments.
5. The method of claim 1, wherein after synchronizing the change operation log and the corresponding metadata to the local storage engine of the metadata service slave node according to the preset synchronization rule, the method further comprises:
reclaiming, through the metadata service slave node corresponding to the first-ranked node, the change operation logs in the consensus protocol cluster that have been synchronized to the metadata service slave nodes.
6. The method of claim 5, wherein the metadata service slave node corresponding to the first-ranked node performs the change operation log reclamation operation periodically.
7. The method of claim 5, further comprising:
when a new metadata service node joins the consensus protocol cluster, synchronizing the full metadata from the local storage engines of other metadata service nodes.
8. A metadata synchronization system for a distributed storage system, the system comprising:
a determining unit, configured to determine a metadata service master node and a metadata service slave node through a consensus protocol cluster;
a packaging unit, configured to package, when a metadata change occurs, the metadata change operation into a change operation log by using the metadata service master node;
a writing unit, configured to write the change operation logs into the segments of the consensus protocol cluster in sequence;
an updating unit, configured to update the change operation log and the corresponding metadata into a local storage engine of the metadata service master node after the writing succeeds;
and a synchronization unit, configured to synchronize the change operation log and the corresponding metadata to a local storage engine of the metadata service slave node according to a preset synchronization rule when a new segment is created in the consensus protocol cluster or a preset duration elapses.
9. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1 to 7 when executing the program stored in the memory.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 7.
CN202210432189.6A 2022-04-22 2022-04-22 Metadata synchronization method, system and equipment of distributed storage system Active CN115599747B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210432189.6A CN115599747B (en) 2022-04-22 2022-04-22 Metadata synchronization method, system and equipment of distributed storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210432189.6A CN115599747B (en) 2022-04-22 2022-04-22 Metadata synchronization method, system and equipment of distributed storage system

Publications (2)

Publication Number Publication Date
CN115599747A true CN115599747A (en) 2023-01-13
CN115599747B CN115599747B (en) 2023-06-06

Family

ID=84842075

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210432189.6A Active CN115599747B (en) 2022-04-22 2022-04-22 Metadata synchronization method, system and equipment of distributed storage system

Country Status (1)

Country Link
CN (1) CN115599747B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant